JobExitCodes
(revision 90)
---++ IMPORTANT: Any change made here must also be made in: https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/WMExceptions.py
   * Range(1 - 512)
      * Standard Unix exit codes; they indicate a CMSSW abort that cmsRun did not catch as an exception
   * Range(7000 - 9000)
      * cmsRun (CMSSW) exit codes. These codes may depend on the specific CMSSW version: https://github.com/cms-sw/cmssw/blob/CMSSW_5_0_1/FWCore/Utilities/interface/EDMException.h#L26
   * Range(10000 - 19999)
      * Failures related to the environment setup
   * Range(50000 - 59999)
      * Failures related to the executable file
   * Range(60000 - 69999)
      * Failures related to stage-out
   * Range(70000 - 79999)
      * Failures specific to WMAgent (which do not fit into the ranges above)
   * Range(80000 - 89999)
      * Failures specific to CRAB3 (which do not fit into the ranges above)
   * Range(90000 - 99999)
      * Other problems that do not fit into any range above

---++ Error codes currently sent from CMS jobs to the dashboard

%X% indicates site error

   * *Error exit code of the cmsRun application itself - range 0-10000*
      * *Exit codes in 1-255 are standard ones in Unix and indicate a CMSSW abort that cmsRun did not catch as an exception*
         * 1 - 255: these exit codes are usually returned by =bash= when the command it runs is terminated by a fatal signal. The exit code is set to =128+signal number= as per https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html . The list of standard Linux signals can be read from =man -s 7 signal=
         * For convenience we have collected a list of the most common such exit codes: StandardExitCodes
      * *cmsRun (CMSSW) exit codes - range 7000-9000*
         * *These codes may depend on the specific CMSSW version. The list is maintained [[https://github.com/cms-sw/cmssw/blob/CMSSW_5_0_1/FWCore/Utilities/interface/EDMException.h#L26][here]]; look at the tags there to find out what is appropriate for a given CMSSW release.* The situation as of 5_0_X is below.
         * // The first three are specific categories of CMS Exceptions.
         * 7000 - Exception from command line processing
         * 7001 - Configuration File Not Found
         * 7002 - Configuration File Read Error
         * 8001 - Other CMS Exception
         * 8002 - std::exception (other than bad_alloc)
         * 8003 - Unknown Exception
         * 8004 - std::bad_alloc (memory exhaustion)
         * 8005 - Bad Exception Type (e.g. throwing a string)
         * // The rest are specific categories of CMS Exceptions.
         * 8006 - !ProductNotFound
         * 8007 - !DictionaryNotFound
         * 8008 - !InsertFailure
         * 8009 - !Configuration
         * 8010 - !LogicError
         * 8011 - !UnimplementedFeature
         * 8012 - !InvalidReference
         * 8013 - !NullPointerError
         * 8014 - !NoProductSpecified
         * 8015 - !EventTimeout
         * 8016 - !EventCorruption
         * 8017 - !ScheduleExecutionFailure
         * 8018 - !EventProcessorFailure
         * 8019 - !FileInPathError
         * %X% 8020 - !FileOpenError (likely a site error) (see [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCrabFaq#Exit_code_8020][HERE]])
         * %X% 8021 - !FileReadError (may be a site error)
            * The following link can help, please check: https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookWhichRelease#DifferentReleases
         * 8022 - !FatalRootError
         * 8023 - !MismatchedInputFiles
         * 8024 - !ProductDoesNotSupportViews
         * 8025 - !ProductDoesNotSupportPtr
         * 8026 - !NotFound (something other than a product or dictionary not found)
         * 8027 - !FormatIncompatibility
         * %X% 8028 - !FileOpenError with fallback
         * 8030 - Exceeded maximum allowed VSize (!ExceededResourceVSize)
         * 8031 - Exceeded maximum allowed RSS (!ExceededResourceRSS)
         * 8032 - Exceeded maximum allowed time (!ExceededResourceTime)
         * %X% 8033 - Could not write output file (!FileWriteError) (usually a local disk problem)
         * 8034 - !FileNameInconsistentWithGUID
         * 9000 - cmsRun caught (SIGINT and SIGUSR2) signal
   * %X% *Failures related to the environment setup - range 10000-19999*
      * 10001 - Connectivity problems
      * 10002 - CPU load is too high
      * 10003 - CMS software initialisation script cmsset_default.sh failed
      * 10004 - CMS_PATH not defined
      * 10005 - CMS_PATH directory does not exist
      * 10006 - scramv1 command not found
      * 10007 - Some CMSSW files are corrupted/non-readable
      * 10008 - Scratch dir was not found
      * 10009 - Less than 5 GB/core of free space in scratch dir
      * 10010 - Could not find X509 certificate directory
      * 10011 - Could not find X509 proxy certificate
      * 10012 - Unable to locate the glidein configuration file
      * 10013 - No sitename detected! Invalid SITECONF file
      * 10014 - No PhEDEx node name found for local or fallback stageout
      * 10015 - No LOCAL_STAGEOUT section in site-local-config.xml
      * 10016 - No frontier-connect section in site-local-config.xml
      * 10017 - No calib-data section in site-local-config.xml
      * 10018 - site-local-config.xml was not found
      * 10019 - TrivialFileCatalog string missing
      * 10020 - event_data section is missing
      * 10021 - No proxy string in site-local-config.xml
      * 10022 - Squid test failed
      * 10023 - Clock skew is bigger than 60 sec
      * 10031 - Directory VO_CMS_SW_DIR not found
      * %X% 10032 - Failed to source CMS environment setup script such as cmsset_default.sh, grid system or site equivalent script
      * %X% 10034 - Required application version is not found at the site (see [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCrabFaq#Exit_code_10034][HERE]])
      * %X% 10040 - Failed to generate cmsRun cfg file at runtime
      * %X% 10042 - Unable to stage-in wrapper tarball
      * %X% 10043 - Unable to bootstrap WMCore libraries (most likely site python is broken)
      * %X% 10050 - WARNING test_squid.py: One of the load balance Squid proxies
      * %X% 10051 - WARNING less than 20 GB of free space per core in scratch dir
      * %X% 10052 - WARNING less than 10MB free in /tmp
      * %X% 10053 - WARNING CPU load of last minutes + pilot cores is higher than number of physical CPUs
      * %X% 10054 - WARNING proxy shorter than 6 hours
   * *Executable file related failures - range 50000-59999*
      * 50110 - Executable file is not found
      * 50111 - Executable file has no execute permissions
      * 50113 - Executable did not get enough arguments
      * 50115 - cmsRun did not produce a valid job report at runtime (often means cmsRun segfaulted)
      * 50116 - Could not determine exit code of cmsRun executable at runtime
      * 50513 - Failure to run SCRAM setup scripts
      * 50660 - Application terminated by wrapper for using too much RAM (RSS)
      * 50661 - Application terminated by wrapper for using too much Virtual Memory (VSIZE)
      * 50662 - Application terminated by wrapper for using too much disk
      * 50663 - Application terminated by wrapper for using too much CPU time
      * 50664 - Application terminated by wrapper for using too much Wall Clock time
      * 50665 - Application terminated by wrapper because it stayed idle too long
      * 50669 - Application terminated by wrapper for an undefined reason
   * *Staging-out related failures - range 60000-69999*
      * 60302 - Output file(s) not found (see [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCrabFaq#Exit_code_60302][HERE]])
      * 60307 - Failed to copy an output file to the SE (sometimes caused by a [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCrabFaq#SE_Getoutput_time_out_issue][timeout issue]], or by the issues mentioned [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCrabFaq#Exit_code_60307][HERE]])
      * %X% 60311 - Local Stage Out Failure using site specific plugin
      * 60312 - Failed to get file TURL via lcg-lr command
      * %X% 60315 - ProdAgent !StageOut initialisation error (due to TFC, SITECONF, etc.)
      * 60316 - Failed to create a directory on the SE
      * 60317 - Forced timeout for stuck stage out
      * 60318 - Internal error in Crab cmscp.py stageout script
      * 60319 - Failure to do AlcaHarvest stageout (WMAgent)
      * 60320 - Failure to communicate with ASO server
      * %X% 60321 - Site related issue: no space, SE down, refused connection
      * 60322 - User is not authorized to write to destination site
      * 60323 - User quota exceeded
      * 60324 - Other stageout exception
      * 60401 - Failure to assemble LFN in direct-to-merge by size (WMAgent)
      * 60402 - Failure to assemble LFN in direct-to-merge by event (WMAgent)
      * 60403 - Timeout during attempted file transfer - status unknown (WMAgent)
      * 60404 - Timeout during staging of log archives - status unknown (WMAgent)
      * 60405 - General failure to stage out log archives (WMAgent)
      * 60407 - Timeout in staging in log files during log collection (WMAgent)
      * 60408 - Failure to stage out log files during log collection (WMAgent)
      * 60409 - Timeout in stage out of log files during log collection (WMAgent)
      * 60450 - No output files present in the report
      * 60451 - Output file lacked adler32 checksum (WMAgent)
   * *Failures specific to WMAgent - range 70000-79999*
      * 71101 - No sites are available to submit the job because the location of its input(s) does not pass the site whitelist/blacklist restrictions (WMAgent) (was 61101)
      * 71102 - The job can only run at a site that is currently in Aborted state (WMAgent)
      * 71103 - The !JobSubmitter component could not load the job pickle (WMAgent)
      * 71104 - The job can run only at a site that is currently in Draining state (WMAgent)
      * 71300 - The job was killed by the WMAgent, reason is unknown (WMAgent)
      * 71301 - The job was killed by the WMAgent because the site it was running at was set to Aborted (WMAgent)
      * 71302 - The job was killed by the WMAgent because the site it was running at was set to Draining (WMAgent)
      * 71303 - The job was killed by the WMAgent because the site it was running at was set to Down (WMAgent)
      * 71304 - The job was killed by the WMAgent for using too much wallclock time (WMAgent). Job status was Running.
      * 71305 - The job was killed by the WMAgent for using too much wallclock time (WMAgent). Job status was Pending.
      * 71306 - The job was killed by the WMAgent for using too much wallclock time (WMAgent). Job status was Error.
      * 71307 - The job was killed by the WMAgent for using too much wallclock time (WMAgent). Job status was Unknown.
      * 70318 - Failure in DQM upload
      * 70452 - No run/lumi information in file (WMAgent)
   * *Failures specific to CRAB3 - range 80000-89999*
      * 80000 - Internal error in CRAB job wrapper
      * 80001 - No exit code set by job wrapper
      * 80453 - Unable to determine pset hash from output file (CRAB3)
   * *Other problems that do not fit into any range above - range 90000-99999*
      * 90000 - Error in CRAB3 post-processing step (currently this mainly covers errors in stage-out and file metadata upload)
      * 99109 - Uncaught exception in WMAgent step executor
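
As an illustration of the range layout and of the =128+signal number= rule described on this page, here is a minimal Python sketch. It is not taken from WMCore: the names =EXIT_CODE_RANGES=, =classify_exit_code= and =signal_from_exit_code= are hypothetical helpers introduced only for this example, and the range labels are shortened paraphrases of the overview at the top of the page. The signal lookup relies on the signal numbers of the machine where the script runs, which is assumed to be a Linux host.

```python
import signal

# (low, high, label) triples paraphrased from the range overview on this page.
# Hypothetical table for illustration; the authoritative list lives in WMExceptions.py.
EXIT_CODE_RANGES = [
    (1, 512, "Unix standard exit code / CMSSW abort not caught by cmsRun"),
    (7000, 9000, "cmsRun (CMSSW) exit code"),
    (10000, 19999, "Environment setup failure"),
    (50000, 59999, "Executable file failure"),
    (60000, 69999, "Stage-out failure"),
    (70000, 79999, "WMAgent-specific failure"),
    (80000, 89999, "CRAB3-specific failure"),
    (90000, 99999, "Other problem"),
]


def classify_exit_code(code):
    """Return the label of the range that contains `code`."""
    for low, high, label in EXIT_CODE_RANGES:
        if low <= code <= high:
            return label
    return "Outside the documented ranges"


def signal_from_exit_code(code):
    """Return the signal name if `code` looks like 128+N, otherwise None."""
    number = code - 128
    if number <= 0:
        return None
    try:
        # signal.Signals raises ValueError for numbers that are not known signals.
        return signal.Signals(number).name
    except ValueError:
        return None


if __name__ == "__main__":
    for code in (137, 139, 8020, 60307):
        print(code, classify_exit_code(code), signal_from_exit_code(code))
```

For example, an exit code of 137 falls in the first range and decodes as signal 9 (=SIGKILL=), while 8020 is classified as a cmsRun exit code and has no signal interpretation.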
Topic revision: r90 - 2019-12-09 - MattiKortelainen