PanDA Pilot 2 Error Codes

Introduction

The following table lists the currently implemented error codes together with their (internally used) acronyms and meanings.

Note: the pilot stores all errors that have occurred and reports the errors at the end of the log. Only the first error (usually the highest priority) is reported to the server.
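
As a rough illustration of the note above, the following sketch keeps every error code in the order it occurred and reports only the first one. The class and method names (ErrorLog, add, report_to_server, summary) are hypothetical and are not the Pilot's actual API; the two codes used are taken from the error code table.

```python
# Hypothetical sketch, not the Pilot's actual implementation.
class ErrorLog:
    """Keeps every error code seen during a job, in order of occurrence."""

    def __init__(self):
        self.codes = []

    def add(self, code):
        # Store each error as it occurs.
        self.codes.append(code)

    def report_to_server(self):
        # Only the first stored error is reported (assumption: 0 means no error).
        return self.codes[0] if self.codes else 0

    def summary(self):
        # All stored errors, as they could be listed at the end of the log.
        return ", ".join(str(code) for code in self.codes)


log = ErrorLog()
log.add(1099)  # STAGEINFAILED
log.add(1150)  # LOOPINGJOB
print(log.report_to_server())
print(log.summary())
```

Here the later LOOPINGJOB error stays visible in the summary, but only STAGEINFAILED would reach the server.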

Error codes

Error code Acronym Meaning Note
1008 GENERALERROR General pilot error, consult batch log ..
1098 NOLOCALSPACE Not enough local space Error code is set e.g. by job monitoring, also if copytool command fails (if "No space left on device" is found in command output).
1099 STAGEINFAILED Failed to stage-in file ..
1100 REPLICANOTFOUND Replica not found The rucio API function list_replicas() did not return any replicas. Check log for details.
1103 NOSUCHFILE No such file or directory Error thrown by open_file() function. Also set if copytool fails and "No such file or directory" is found in command output.
1104 USERDIRTOOLARGE User work directory too large The error is set if the user work directory exceeds the maximum allowed limit, as defined by schedconfig.maxwdir (default: 14 GB).
1106 STDOUTTOOBIG Payload log or stdout file too big Set if stdout exceeds maximum allowed limit of 2 GB, set in the Pilot's default config file.
1110 SETUPFAILURE Failed during payload setup ..
1115 NFSSQLITE NFS SQLite locking problems Pilot identifies this error by doing a grep on the strings "prepare 5 database is locked" and "Error SQLiteStatement" in the payload stdout.
1116 QUEUEDATA Pilot could not download queuedata ..
1117 QUEUEDATANOTOK Pilot found non-valid queuedata ..
1124 OUTPUTFILETOOLARGE Output file too large ..
1133 NOSTORAGE Fetching default storage failed: no activity related storage defined ..
1137 STAGEOUTFAILED Failed to stage-out file ..
1141 PUTMD5MISMATCH md5sum mismatch on output file Error acronym to be renamed.
1143 CHMODTRF Failed to chmod trf After downloading a trf, the pilot tries to do a chmod 0755 on it. If this fails, the pilot will set this error.
1144 PANDAKILL This job was killed by panda server The pilot assigns this error code when it receives a 'tobekilled' instruction from the PanDA server via the backchannel in the updateJob command.
1145 GETMD5MISMATCH md5sum mismatch on input file Error acronym to be renamed.
1149 TRFDOWNLOADFAILURE Transform could not be downloaded Relevant for transforms used with user jobs only. The error means that the curl command failed to download the transform from the source. Details will be in the job log (pilotlog.txt).
1150 LOOPINGJOB Looping job killed by pilot The pilot will kill the payload (or stop stage-in/out) if there is no activity (i.e. files touched in the work directory or if the file transfer is stuck) within the allowed time. The default looping job time limit is 12*3600 s for production jobs and 3*3600 s for user analysis jobs. The limit can be overridden in the pilot's config file (or set by the user using the maxCPUCount variable).
1151 STAGEINTIMEOUT File transfer timed out during stage-in Currently only identified for rucio file transfer (unless "Operation timed out" is in stderr)
1152 STAGEOUTTIMEOUT File transfer timed out during stage-out Currently only identified for rucio file transfer (unless "Operation timed out" is in stderr)
1163 NOPROXY Grid proxy not valid Set if grid-proxy-info fails or if "Could not establish context" is found in copytool command output.
1165 MISSINGOUTPUTFILE Local output file is missing ..
1168 SIZETOOLARGE Total file size too large Before stage-in, the pilot verifies that the sum of the input file sizes does not exceed maxwdir (set in schedconfig or in pilot config file). Any files that are to be accessed directly/remotely are excluded.
1171 GETADMISMATCH adler32 mismatch on input file Error acronym to be renamed.
1172 PUTADMISMATCH adler32 mismatch on output file Error acronym to be renamed.
1177 NOVOMSPROXY Voms proxy not valid Set if arcproxy fails
1180 GETGLOBUSSYSERR Globus system error during stage-in Pilot identifies this error if "globus_xio:" is found in command output.
1181 PUTGLOBUSSYSERR Globus system error during stage-out Pilot identifies this error if "globus_xio:" is found in command output.
1186 NOSOFTWAREDIR Software directory does not exist ..
1187 NOPAYLOADMETADATA Payload metadata does not exist ..
1190 LFNTOOLONG LFN too long (exceeding limit of 255 characters) When validating a job definition, before executing the payload, the Pilot makes sure that no output file has an LFN that is longer than 255 characters.
1191 ZEROFILESIZE File size cannot be zero Before executing the stage-out command, the Pilot verifies that the size of the file is not zero.
1199 MKDIR Failed to create local directory ..
1200 KILLSIGNAL Job terminated by unknown kill signal ..
1201 SIGTERM Job killed by signal: SIGTERM ..
1202 SIGQUIT Job killed by signal: SIGQUIT ..
1203 SIGSEGV Job killed by signal: SIGSEGV ..
1204 SIGXCPU Job killed by signal: SIGXCPU ..
1205 USERKILL Job killed by user Reserved error code for user defined kill instructions. Currently not implemented.
1206 SIGBUS Job killed by signal: SIGBUS ..
1207 SIGUSR1 Job killed by signal: SIGUSR1 ..
1211 MISSINGINSTALLATION Missing installation Assigned error code if the payload fails to execute the transform.
1212 PAYLOADOUTOFMEMORY Payload ran out of memory Assigned error code if the pilot finds the string "FATAL out of memory: taking the application down" in the stderr and "St9bad_alloc", "std::bad_alloc" in the stdout.
1213 REACHEDMAXTIME Reached batch system time limit Pilot aborts automatically when 10 minutes remain of the maximum allowed running time, as set by 1) schedconfig.maxtime or 2) Pilot option -l <maxtime> (both values are in seconds).
1220 UNKNOWNPAYLOADFAILURE Job failed due to unknown reason (consult log file) ..
1221 FILEEXISTS File already exists Error code is set if "File exists", "SRM_FILE_BUSY" or "file already exists" is found in copytool command output.
1223 BADALLOC Transform failed due to bad_alloc Assigned error code if the pilot finds "badalloc" among the job report errors.
1224 ESRECOVERABLE Event service: recoverable error ..
1228 ESFATAL Event service: fatal error ..
1234 EXECUTEDCLONEJOB Clone job is already executed ..
1235 PAYLOADEXCEEDMAXMEM Payload exceeded maximum allowed memory ..
1236 KILLEDBYSERVER Killed by server. This error is not set by the pilot. It is currently only set by Harvester.
1238 ESNOEVENTS Event service: no events ..
1240 MESSAGEHANDLINGFAILURE Failed to handle message from payload ..
1242 CHKSUMNOTSUP Query checksum is not supported The error code is set if Pilot finds "query chksum is not supported" or "Unable to checksum" in command output.
1244 NORELEASEFOUND No release candidates found ..
1246 NOUSERTARBALL User tarball could not be downloaded from PanDA server ..
1247 BADXML Badly formed XML Parsing of metadata failed most likely due to presence of illegal character.
1300 NOTIMPLEMENTED The class or function is not implemented ..
1301 UNKNOWNEXCEPTION An unknown pilot exception has occurred Developers should be contacted
1302 CONVERSIONFAILURE Failed to convert object data E.g. if a JSON dictionary can't be converted from unicode to utf-8
1303 FILEHANDLINGFAILURE Failed during file handling E.g. if a file can't be opened or a dictionary can't be loaded from file
1305 PAYLOADEXECUTIONFAILURE Failed to execute payload
1306 SINGULARITYGENERALFAILURE Singularity: general failure Site issue; set if the Pilot finds "Operation not permitted" in stderr.
1307 SINGULARITYNOLOOPDEVICES Singularity: No more available loop devices Site issue; set if Pilot finds "No more available loop devices" in stderr.
1308 SINGULARITYBINDPOINTFAILURE Singularity: Not mounting requested bind point Site issue; set if Pilot finds "Not mounting requested bind point" in stderr.
1309 SINGULARITYIMAGEMOUNTFAILURE Singularity: Failed to mount image Site issue; set if Pilot finds "Failed to mount image" in stderr.
1310 PAYLOADEXECUTIONEXCEPTION Exception caught during payload execution Internal pilot problem
1311 NOTDEFINED Not defined A general - internally used - error that is explained in the corresponding exception (NotDefined) error diagnostics; e.g. the analytics package throws this exception if a fit has not been defined; or if a math function fails to convert a string to an integer.
1312 NOTSAMELENGTH Not same length Internally used error corresponding to exception NotSameLength, which is thrown if input data are not of same length in a fit.
1313 NOSTORAGEPROTOCOL No protocol defined for storage endpoint ..
1314 UNKNOWNCHECKSUMTYPE Unknown checksum type ..
1315 UNKNOWNTRFFAILURE Unknown TRF failure ..
1316 RUCIOSERVICEUNAVAILABLE Rucio: Service unavailable Set if corresponding Rucio error details (reg.exp. or "service_unavailable") are found in copytool command output
1317 EXCEEDEDMAXWAITTIME Exceeded maximum waiting time Internally used exception/error code. The exception is thrown by pilot monitoring when the abort_job wait time has been exceeded (and other threads have not finished cleaning up in time). abort_job is set when the pilot has received a kill signal.
1318 COMMUNICATIONFAILURE Failed to communicate with server ..
1319 INTERNALPILOTPROBLEM An internal Pilot problem has occurred (consult Pilot log) Error code used for internal debugging. A more precise error message should be written to the log.
1320 LOGFILECREATIONFAILURE Failed during creation of log file In case tarfile.open() or the archive.add() fails, the pilot will set this error code.
1321 RUCIOLOCATIONFAILED Failed to get client location for Rucio ..
1322 RUCIOLISTREPLICASFAILED Failed to get replicas from Rucio ..
1323 UNKNOWNCOPYTOOL Unknown copy tool Set if the requested copy tool has no implementation (e.g. copytool=storm).
1324 SERVICENOTAVAILABLE Service not available at the moment Rucio server not available.
1325 SINGULARITYNOTINSTALLED Singularity: not installed Identified by trf exit code 64 and the string "Singularity is not installed" present in the stderr.
1326 NOREPLICAS No matching replicas were found in list_replicas() output list_replicas() returned replicas but no local matching replica was found.
1327 UNREACHABLENETWORK Unable to stage-in file since network is unreachable Problem seen with xrdcp command during stage-in.
1328 PAYLOADSIGSEGV SIGSEGV: Invalid memory reference or a segmentation fault Special payload error extracted from job report. A SIGSEGV is an error (signal) caused by an invalid memory reference or a segmentation fault. The payload is probably trying to access an array element out of bounds or trying to use too much memory.
1329 NONDETERMINISTICDDM Failed to construct SURL for non-deterministic ddm (update AGIS) While Pilot 1 ignored the is_deterministic endpoint field if the storage path ended in /rucio, Pilot 2 will instead fail the job if the endpoint is not deterministic. The endpoint should be fixed in AGIS.
1330 JSONRETRIEVALTIMEOUT JSON retrieval timed out Error is assigned if the pilot fails to download JSON. [To be revised now that fail-over has been implemented, February 2019].
1331 MISSINGINPUTFILE Input file is missing in storage element ..
1332 BLACKHOLE Black hole detected in file system (consult Pilot log) This error is assigned if a pilot module goes missing. Typically this would mean that it cannot be imported.
1333 NOREMOTESPACE No space left on device ..
1334 ATLASSETUPFATAL AtlasSetup failed with a fatal exception (consult Payload log) ..
1335 MISSINGUSERCODE User code not available on PanDA server (resubmit task with --useNewCode) Error occurs when user tarball has been deleted from the server and the pilot tries to download it. User must resubmit task with prun/pathena option --useNewCode.
1336 JOBALREADYRUNNING Job is already running elsewhere ..
1337 BADMEMORYMONITORJSON Memory monitor produced bad output Failure to parse JSON file from Memory monitor.
1338 STAGEINAUTHENTICATIONFAILURE Authentication failure during stage-in ..
1339 DBRELEASEFAILURE Local DBRelease handling failed (consult Pilot log) ..
1340 SINGULARITYNEWUSERNAMESPACE Singularity: Failed invoking the NEWUSER namespace runtime ..
1341 BADQUEUECONFIGURATION Bad queue configuration detected ..
1342 MIDDLEWAREIMPORTFAILURE Failed to import middleware (consult Pilot log) ..
1343 NOOUTPUTINJOBREPORT Found no output in job report Set when output=[] in job report.
1344 RESOURCEUNAVAILABLE Resource temporarily unavailable Set when get_current_cpu_consumption_time() fails due to OSError exception raised in subprocess module (failed os.fork()). To be extended in v 2.1.22+.
1345 SINGULARITYFAILEDUSERNAMESPACE Singularity: Failed to create user namespace Detected in stderr when the transform has a non-zero exit code.
1346 TRANSFORMNOTFOUND Transform not found Detected in stderr when the transform has a non-zero exit code.
1347 UNSUPPORTEDSL5OS Unsupported SL5 OS Detected in stderr when the transform has a non-zero exit code.
1348 SINGULARITYRESOURCEUNAVAILABLE Singularity: Resource temporarily unavailable Detected in stderr when the transform has a non-zero exit code.
1349 UNRECOGNIZEDTRFARGUMENTS Unrecognized transform arguments Detected in stderr when the transform has a non-zero exit code.
1350 EMPTYOUTPUTFILE Empty output file detected Detected in stderr when the transform has a non-zero exit code.
1351 UNRECOGNIZEDTRFSTDERR Unrecognized fatal error in transform stderr (dev pilot v 2.1.25). Detected in stderr when the transform has a non-zero exit code.
1352 STATFILEPROBLEM Failed to stat proc file for CPU consumption calculation The pilot sets this error during the CPU consumption calculation if reading /proc/pid/stat fails with "No such file or directory".
1353 NOSUCHPROCESS CPU consumption calculation failed: No such process The pilot sets this error during the CPU consumption calculation if reading /proc/pid/stat fails with "No such process".
1354 GENERALCPUCALCPROBLEM General CPU consumption calculation problem (consult Pilot log) If there is a problem accessing the /proc/pid/stat file that is not recognised, this error will be set.
1355 COREDUMP Core dump detected Set if a core dump is found for a failed job in the payload work dir (during the initial payload error analysis). The core dump is removed. Note: currently the file name must be "core" (i.e. not "core.*").
1356 PREPROCESSFAILURE Pre-process command failed ..
1357 POSTPROCESSFAILURE Post-process command failed ..
1358 MISSINGRELEASEUNPACKED Missing release setup in unpacked container Pilot requires that /release_setup.sh is present in unpacked containers. It is not present in older containers.
1359 PANDAQUEUENOTACTIVE PanDA queue is not active The error is set as soon as the pilot has downloaded queue data if the queue is not active.
1360 IMAGENOTFOUND Image not found The error is set if the pilot cannot find an image whose path is known.
1361 REMOTEFILECOULDNOTBEOPENED Remote file could not be opened For direct access jobs, the pilot attempts to open (and close) all input root files to avoid wasting CPU with the payload.
1362 XRDCPERROR Xrdcp was unable to open file ..
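
Several of the codes above are assigned by searching the copytool command output for known strings. A minimal sketch of that idea follows, using only strings and codes quoted in the table. The function name classify_copytool_output, the matching order, and the fallback to STAGEINFAILED/STAGEOUTFAILED are assumptions made for illustration, not the Pilot's actual logic.

```python
# Hypothetical sketch; matching order and fallback are assumptions.
PATTERNS = [
    ("No space left on device", 1098),        # NOLOCALSPACE
    ("No such file or directory", 1103),      # NOSUCHFILE
    ("Could not establish context", 1163),    # NOPROXY
    ("File exists", 1221),                    # FILEEXISTS
    ("SRM_FILE_BUSY", 1221),                  # FILEEXISTS
    ("file already exists", 1221),            # FILEEXISTS
    ("query chksum is not supported", 1242),  # CHKSUMNOTSUP
    ("Unable to checksum", 1242),             # CHKSUMNOTSUP
    ("service_unavailable", 1316),            # RUCIOSERVICEUNAVAILABLE
]


def classify_copytool_output(output, stage_in=True):
    """Map copytool output to an error code by substring search."""
    for pattern, code in PATTERNS:
        if pattern in output:
            return code
    if "Operation timed out" in output:
        # STAGEINTIMEOUT or STAGEOUTTIMEOUT, depending on direction.
        return 1151 if stage_in else 1152
    # Assumed fallback: generic STAGEINFAILED or STAGEOUTFAILED.
    return 1099 if stage_in else 1137


print(classify_copytool_output("cp: write error: No space left on device"))
print(classify_copytool_output("Operation timed out", stage_in=False))
```

Keeping the patterns in an ordered list makes the precedence explicit when an output contains more than one known string.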

- PaulNilsson - 2018-01-12
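
The limit checks described in the Pilot 2 table (LOOPINGJOB, REACHEDMAXTIME, LFNTOOLONG, SIZETOOLARGE) can be sketched as simple comparisons. The numeric limits are the defaults quoted in the table; the function names and argument layouts are hypothetical, and the real Pilot may read these limits from its config file or from schedconfig instead.

```python
# Hypothetical sketch of the limit checks; names are illustrative only.
LOOPING_LIMIT_PRODUCTION = 12 * 3600  # seconds, default for production jobs
LOOPING_LIMIT_USER = 3 * 3600         # seconds, default for user analysis jobs
ABORT_MARGIN = 10 * 60                # seconds left when the pilot aborts
MAX_LFN_LENGTH = 255                  # characters


def is_looping(seconds_since_last_activity, is_user_job):
    # LOOPINGJOB (1150): no activity within the allowed time.
    limit = LOOPING_LIMIT_USER if is_user_job else LOOPING_LIMIT_PRODUCTION
    return seconds_since_last_activity > limit


def reached_max_time(elapsed, maxtime):
    # REACHEDMAXTIME (1213): abort when 10 minutes (or less) remain.
    return maxtime - elapsed <= ABORT_MARGIN


def lfn_too_long(lfn):
    # LFNTOOLONG (1190): output LFN longer than 255 characters.
    return len(lfn) > MAX_LFN_LENGTH


def input_size_too_large(files, maxwdir_bytes):
    # SIZETOOLARGE (1168): files is a list of (size_in_bytes, direct_access)
    # tuples; files accessed directly/remotely are excluded from the sum.
    total = sum(size for size, direct_access in files if not direct_access)
    return total > maxwdir_bytes


print(is_looping(4 * 3600, is_user_job=True))
print(reached_max_time(elapsed=1000, maxtime=3600))
```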

Error codes in MONIT infrastructure

This table is based on information from this file: https://gitlab.cern.ch/monitoring/spark-atlasjm-aggregation/-/raw/master/src/main/scala/ch/cern/monitoring/transformation/ErrorDiagnostics.scala

Error code Error type Meaning MONIT Error category
1 TRANSEXIT Athena release is not installed in the CE; or trf failed due to 'Unknown Problem' (see checklog.txt) Transformation Error: not installed in CE
2 TRANSEXIT Athena core dump Athena/Exec Error: Athena
6 TRANSEXIT TRF_SEGVIO - Segmentation violation Transformation Error
10 TRANSEXIT ATH_FAILURE - Athena non-zero exit Athena/Exec Error: Athena
26 TRANSEXIT TRF_ATHENACRASH - Athena crash Athena/Exec Error: Athena
30 TRANSEXIT TRF_PYT - transformation python error Transformation Error
31 TRANSEXIT TRF_ARG - transformation argument error Transformation Error
32 TRANSEXIT TRF_DEF - transformation definition error Transformation Error
33 TRANSEXIT TRF_ENV - transformation environment error Transformation Error
34 TRANSEXIT TRF_EXC - transformation exception Transformation Error
40 TRANSEXIT Athena crash - consult log file Athena/Exec Error: Athena
41 TRANSEXIT TRF_OUTFILE - output file error Transformation Error
42 TRANSEXIT TRF_CONFIG - transform config file error Transformation Error
50 TRANSEXIT Athena crash-consult log file (can be 'VKalVrtPrim ERROR Primary vertex not found') Athena/Exec Error: Athena
51 TRANSEXIT TRF_DBREL_TARFILE - Problems with the DBRelease tarfile Transformation Error
60 TRANSEXIT TRF_GBB_TIME - GriBB - output limit exceeded (time; memory; CPU) Transformation Error
79 TRANSEXIT Copying input file failed (Can't open source file : Invalid file name) Transformation Error
80 TRANSEXIT file in trf definition not found; using the expandable syntax Transformation Error
81 TRANSEXIT file in trf definition not found; using the expandable syntax -- pileup case Transformation Error
85 TRANSEXIT analysis output merge crash - consult log file Transformation Error
98 TRANSEXIT Oracle error - session limit reached Transformation Error
99 TRANSEXIT Unknown transform error (69999; TRF_UNKNOWN) -- consult log file Transformation Error
102 TRANSEXIT One of the output files did not get produced by the job Transformation Error
104 TRANSEXIT Copying the output file from the worker node to the local SE failed (md5sum mismatch; or size mismatch; or LFNnonunique) Transformation Error
126 TRANSEXIT trf is not executable - consult log file Transformation Error
127 TRANSEXIT trf is not installed in the CE Transformation Error: not installed in CE
134 TRANSEXIT Athena core dump; or Athena time out; or ConditionsDB exception caught: MySQL error (database load problem); or Error ORA-03114: not connected to ORACLE Athena/Exec Error: Athena
141 TRANSEXIT No input file is available - input dataset is broken or doesn't exist at WN's site Transformation Error
200 TRANSEXIT no Athena log file produced Transformation Error
220 TRANSEXIT Proot: An exception occurred in the user analysis code Athena/Exec Error: Proot
221 TRANSEXIT Proot: Framework decided to abort the job due to an internal problem Athena/Exec Error: Proot
222 TRANSEXIT Proot: Job completed without reading all input files Athena/Exec Error: Proot
223 TRANSEXIT Proot: Input files cannot be opened Athena/Exec Error: Proot
2100 TRANSEXIT MyProxyError: server name not specified (not really trf error) Transformation Error
2101 TRANSEXIT MyProxyError: voms attributes not specified (not really trf error) Transformation Error
2102 TRANSEXIT MyProxyError: user DN not specified (not really trf error) Transformation Error
2103 TRANSEXIT MyProxyError: pilot owner DN not specified (not really trf error) Transformation Error
2104 TRANSEXIT MyProxyError: invalid path for the delegated proxy (not really trf error) Transformation Error
2105 TRANSEXIT MyProxyError: invalid pilot proxy path (not really trf error) Transformation Error
2106 TRANSEXIT MyProxyError: no path to delegated proxy specified (not really trf error) Transformation Error
2200 TRANSEXIT MyProxyError: myproxy-init not available in PATH (not really trf error) Transformation Error
2201 TRANSEXIT MyProxyError: myproxy-logon not available in PATH (not really trf error) Transformation Error
2202 TRANSEXIT MyProxyError: myproxy-init version not valid (not really trf error) Transformation Error
2203 TRANSEXIT MyProxyError: myproxy-logon version not valid (not really trf error) Transformation Error
2300 TRANSEXIT MyProxyError: proxy delegation failed (not really trf error) Transformation Error
2301 TRANSEXIT MyProxyError: proxy retrieval failed (not really trf error) Transformation Error
2999 TRANSEXIT Unknown transExitCode error code (most likely a pilot script error; consult batch log) Transformation Error
.. TRANSEXIT Undocumented error Transformation Error
100 JOBDISPATCHER lost heartbeat Job Dispatcher Error: lost heartbeat
101 JOBDISPATCHER job recovery failed for three days Job Dispatcher Error
.. JOBDISPATCHER Undocumented error Job Dispatcher Error
100 DDM DQ2 server error DDM Error
200 DDM Adder could not add files to the output datasets DDM Error
.. DDM Undocumented error DDM Error
1008 PILOT General pilot error; consult batch log Pilot/DDM Error
1097 PILOT Get function can't be called for staging input file Pilot Error
1098 PILOT No space left on local disk Pilot/DDM Error: get error
1099 PILOT Get error: Staging input file failed Pilot/DDM Error: get error
1100 PILOT Get error: Replica not found Pilot/DDM Error: get error
1101 PILOT LRC registration error: Connection refused Pilot Error
1103 PILOT Get error: No such file or directory Pilot/DDM Error: get error
1104 PILOT User work directory too large Pilot Error
1105 PILOT Put error: Failed to add file size and checksum to LFC Pilot Error
1106 PILOT Payload stdout file too big Pilot/DDM Error
1107 PILOT Get error: Missing DBRelease file Pilot Error
1108 PILOT Put error: LCG registration failed Pilot Error
1109 PILOT Required CMTCONFIG incompatible with WN Pilot Error
1110 PILOT Failed during setup Pilot/DDM Error
1111 PILOT Exception caught by runJob Pilot/DDM Error
1112 PILOT Exception caught by pilot Pilot/DDM Error
1113 PILOT Get error: Failed to import LFC python module Pilot Error
1114 PILOT Put error: Failed to import LFC python module Pilot Error
1115 PILOT NFS SQLite locking problems Pilot Error
1116 PILOT Pilot could not download queuedata Pilot Error
1117 PILOT Pilot found non-valid queuedata Pilot Error
1118 PILOT Pilot could not curl space report Pilot Error
1119 PILOT Pilot aborted due to DDM space shortage Pilot Error
1122 PILOT Bad replica entry returned by lfc_getreplicas(): SFN not set in LFC for this guid Pilot Error
1123 PILOT Missing guid in output file list Pilot/DDM Error: put error
1124 PILOT Output file too large Pilot Error
1130 PILOT Get error: Failed to get POOL file catalog Pilot Error
1131 PILOT Put function can not be called for staging out Pilot/DDM Error: put error
1132 PILOT LRC registration error (consult log file) Pilot Error
1133 PILOT Put error: Fetching default storage URL failed Pilot/DDM Error: put error
1134 PILOT Put error: Error in mkdir on localSE; not allowed or no available space Pilot Error
1135 PILOT Could not get file size in job workdir Pilot Error
1136 PILOT Error running md5sum on the file in job workdir Pilot Error
1137 PILOT Put error: Error in copying the file from job workdir to localSE Pilot/DDM Error: put error
1138 PILOT Put error: could not get the file size on localSE Pilot Error
1139 PILOT Put error: Problem with copying from job workdir to local SE: size mismatch Pilot Error
1140 PILOT Put error: Error running md5sum on the file on localSE Pilot Error
1141 PILOT Put error: Problem with copying from job workdir to local SE: md5sum mismatch Pilot Error
1143 PILOT Failed to chmod trf Pilot Error
1144 PILOT This job was killed by panda server Pilot/PanDA Error: killed by panda server
1145 PILOT Get error: md5sum mismatch on input file Pilot Error
1146 PILOT Trf installation dir does not exist and could not be installed Pilot Error
1148 PILOT Put error: Failed to remove readOnly file in dCache Pilot Error
1149 PILOT wget command failed to download trf Pilot/PanDA Error
1150 PILOT Looping job killed by pilot Pilot/PanDA Error
1151 PILOT Get error: Input file staging timed out Pilot/PanDA Error: Get error
1152 PILOT Put error: File copy timed out Pilot/PanDA Error: Put error
1153 PILOT Lost job was not finished Pilot/PanDA Error
1154 PILOT Failed to register log file Pilot Error
1155 PILOT Failed to move output files for lost job Pilot Error
1156 PILOT Pilot could not recover job Pilot/PanDA Error
1158 PILOT Reached maximum number of recovery trials Pilot/PanDA Error
1159 PILOT Job recovery could not read PoolFileCatalog.xml file (guids lost) Pilot Error
1160 PILOT LRC registration error: file name string size exceeded limit of 250 Pilot Error
1161 PILOT Job recovery could not generate xml for remaining output files Pilot Error
1162 PILOT LRC registration error: Non-unique LFN Pilot Error
1163 PILOT Grid proxy not valid Pilot/PanDA Error
1164 PILOT Get error: Local input file missing Pilot Error
1165 PILOT Put error: Local output file missing Pilot/DDM Error: put error
1166 PILOT Put error: File copy broken by SIGPIPE Pilot Error
1167 PILOT Get error: Input file missing in PoolFileCatalog.xml Pilot Error
1168 PILOT Get error: Total file size too large Pilot/DDM Error: get error
1169 PILOT Put error: LFC registration failed Pilot Error
1170 PILOT Error running adler32 on the file in job workdir Pilot Error
1171 PILOT Get error: adler32 mismatch on input file Pilot/DDM Error: get error
1172 PILOT Put error: adler32 mismatch on output file Pilot/DDM Error: put error
1173 PILOT PandaMover staging error: File is not cached Pilot Error
1174 PILOT PandaMover transfer failure Pilot Error
1175 PILOT Get error: Problem with copying from local SE to job workdir: size mismatch Pilot Error
1176 PILOT Pilot has no child processes (job wrapper has either crashed or did not send final status) Pilot/PanDA Error
1177 PILOT Voms proxy not valid Pilot Error
1178 PILOT Get error: No input files are staged Pilot Error
1179 PILOT Get error: Failed to get LFC replicas Pilot/PanDA Error: Get error
1180 PILOT Get error: Globus system error Pilot/PanDA Error: Get error
1181 PILOT Put error: Globus system error Pilot Error
1182 PILOT Get error: Failed to get LFC replica Pilot Error
1183 PILOT LRC registration error: Guid-metadata entry already exists Pilot Error
1184 PILOT Put error: PoolFileCatalog could not be found in workdir Pilot/PanDA Error: Put error
1186 PILOT Software directory does not exist Pilot Error
1187 PILOT Athena metadata is not available Pilot/PanDA Error
1188 PILOT lcg-getturls failed Pilot Error
1189 PILOT lcg-getturls timed out Pilot Error
1190 PILOT LFN too long (exceeding limit of 150 characters) Pilot Error
1191 PILOT File size cannot be zero Pilot/PanDA Error
1194 PILOT .. Pilot/PanDA Error
1199 PILOT Could not create directory Pilot Error
1200 PILOT Job terminated by unknown kill signal Pilot/Runtime Error
1201 PILOT Job killed by signal: SIGTERM Pilot/Runtime Error
1202 PILOT Job killed by signal: SIGQUIT Pilot/Runtime Error
1203 PILOT Job killed by signal: SIGSEGV Pilot Error
1204 PILOT Job killed by signal: SIGXCPU Pilot/Runtime Error
1206 PILOT Job killed by signal: SIGBUS Pilot/Runtime Error
1207 PILOT Job killed by signal: SIGUSR1 Pilot/Runtime Error
1210 PILOT No athena output Pilot/Runtime Error
1211 PILOT Missing installation Pilot Error
1212 PILOT Payload ran out of memory Pilot/Runtime Error
1213 PILOT Reached batch system time limit Pilot/Runtime Error
1214 PILOT Site does not allow requested direct access or file stager Pilot Error
1215 PILOT Failed to open TCP connection to localhost (worker node network problem) Pilot Error
1216 PILOT Pilot TCP server has died Pilot Error
1217 PILOT Mismatch between core count in job and queue definition Pilot Error
1218 PILOT Exception caught by RunJobEvent Pilot Error
1219 PILOT uuidgen failed to produce a guid Pilot Error
1220 PILOT Job failed due to unknown reason (consult log file) Pilot/Runtime Error
1221 PILOT File already exists Pilot/Runtime Error
1222 PILOT Failed to get security key pair Pilot Error
1223 PILOT TRF failed due to bad_alloc Pilot Error
1224 PILOT Recoverable Event Service Merge error Pilot/Runtime Error
1225 PILOT Recoverable Event Service error Pilot Error
1226 PILOT gLExec related error Pilot Error
1227 PILOT AthenaMP ended Event Service job prematurely Pilot Error
1228 PILOT Fatal Event Service error Pilot Error
1229 PILOT Fatal Token Extractor error Pilot Error
1230 PILOT Token Extractor error: Host name could not be resolved Pilot Error
1231 PILOT Token Extractor error: Bad URL Pilot Error
1232 PILOT Token Extractor error: Invalid GUID length Pilot Error
1233 PILOT Token Extractor error: No tokens for this GUID Pilot Error
1234 PILOT Already executed clone job Pilot Error
1235 PILOT Payload exceeded maximum allowed memory Pilot/Runtime Error
1236 PILOT Failed by server Pilot/Runtime Error
1237 PILOT Event Service job killed by server Pilot Error
.. PILOT Undocumented error Pilot Error
100 TASKBUFFER Job expired and killed three days after submission (or killed by user) TaskBuffer Error: Timeout
101 TASKBUFFER Transfer timeout (2 weeks) TaskBuffer Error: Timeout
102 TASKBUFFER Expired three days after submission TaskBuffer Error: Timeout
103 TASKBUFFER Aborted by ExtIF TaskBuffer Error
300 TASKBUFFER .. TaskBuffer Error 300
.. TASKBUFFER Undocumented error TaskBuffer Error
65 EXEERROR .. Execution Error 65
68 EXEERROR .. Execution Error 68
.. EXEERROR .. Execution Error
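
Because the same numeric code appears under several error types (for example, 100 is listed for JOBDISPATCHER, DDM and TASKBUFFER), a lookup has to be keyed on the pair of error type and code. The sketch below illustrates this with a few rows copied from the table. It is not a translation of ErrorDiagnostics.scala; the names DIAGNOSTICS, FALLBACK and lookup are hypothetical, and reading the ".." rows as the default for unlisted codes is an assumption.

```python
# Hypothetical sketch; only a few rows of the table are reproduced.
DIAGNOSTICS = {
    ("JOBDISPATCHER", 100): ("lost heartbeat",
                             "Job Dispatcher Error: lost heartbeat"),
    ("DDM", 100): ("DQ2 server error", "DDM Error"),
    ("TASKBUFFER", 101): ("Transfer timeout (2 weeks)",
                          "TaskBuffer Error: Timeout"),
    ("PILOT", 1150): ("Looping job killed by pilot", "Pilot/PanDA Error"),
}

# Assumed meaning of the ".." rows: default per error type.
FALLBACK = {
    "TRANSEXIT": ("Undocumented error", "Transformation Error"),
    "JOBDISPATCHER": ("Undocumented error", "Job Dispatcher Error"),
    "DDM": ("Undocumented error", "DDM Error"),
    "PILOT": ("Undocumented error", "Pilot Error"),
    "TASKBUFFER": ("Undocumented error", "TaskBuffer Error"),
}


def lookup(error_type, code):
    """Return (meaning, MONIT category) for an error type and code."""
    return DIAGNOSTICS.get((error_type, code), FALLBACK[error_type])


print(lookup("JOBDISPATCHER", 100))
print(lookup("DDM", 100))
print(lookup("PILOT", 9999))
```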
Topic revision: r40 - 2021-03-05 - PaulNilsson
 