Info that can be extracted from LHCb Dirac jobs

Basically, there are two sets of information for Dirac jobs: "parameters" and "attributes". I extracted the ones I found to be possibly useful; there are several more.

In addition, there is a "Logging Info" record, which marks the state changes the job went through in the DIRAC system from submission until its final state.

| *LoggingInfo* | *Description* | *Example* |
| Source | the overall status of the job | JobManager, InputData, ... |
| Status | the status of the job as reported in the job monitor, i.e. following a "state machine" | Received, Checking, Running, Done, Failed |
| Minor Status | a minor status inside the Status above | Job accepted, JobScheduling, ... |
| Application Status | the status of the payload application, if it has started | Successful, Executing DaVinci Step X |
| DateTime | the date and time of the status report in UTC | 2014-11-11 15:15 |
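
As a rough illustration of how the logging info can be used, the sketch below sums the time a job spent in each Status by taking the difference between consecutive DateTime entries. The `LoggingRecord` class, the `time_per_status` helper and all record values are made up for this example; they are not part of DIRAC and not taken from a real job.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LoggingRecord:
    """One logging info entry; field names follow the table above (hypothetical class)."""
    source: str
    status: str
    minor_status: str
    application_status: str
    date_time: datetime  # UTC


def time_per_status(records):
    """Sum the time between consecutive records, keyed by the earlier record's Status."""
    ordered = sorted(records, key=lambda r: r.date_time)
    totals = {}
    for current, following in zip(ordered, ordered[1:]):
        elapsed = following.date_time - current.date_time
        totals[current.status] = totals.get(current.status, timedelta()) + elapsed
    return totals


FMT = "%Y-%m-%d %H:%M"

# Illustrative records only; sources, statuses and times are invented.
records = [
    LoggingRecord("JobManager", "Received", "Job accepted", "",
                  datetime.strptime("2014-11-11 15:15", FMT)),
    LoggingRecord("InputData", "Checking", "JobScheduling", "",
                  datetime.strptime("2014-11-11 15:16", FMT)),
    LoggingRecord("JobWrapper", "Running", "", "Executing DaVinci Step X",
                  datetime.strptime("2014-11-11 15:20", FMT)),
    LoggingRecord("JobWrapper", "Done", "", "Successful",
                  datetime.strptime("2014-11-11 16:50", FMT)),
]

totals = time_per_status(records)
for status, elapsed in totals.items():
    print(status, elapsed)
```

The final state has no following record, so it gets no duration; only the states the job passed through appear in the result.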

| *Parameter Name* | *Description* | *Example(s)* |
| JobID | Dirac Job ID | 82141213 |
| Status | final status of the job | Done / Failed |
| StartExecTime | time stamp when the payload started on the worker node (WN) | 2014-07-07 13:30:51 |
| RescheduleCounter | number of times the same job was retried, e.g. because input data was not retrievable | 0 |
| Minor Status | internal fine-grained Dirac state | Requests done |
| ApplicationStatus | payload status | Job Finished Successfully |
| JobType | major job types, about a dozen | DataProcessing, MonteCarlo, User, .... |
| SubmissionTime | time stamp when the job was submitted to Dirac | 2014-07-07 12:30:51 |
| Site | LHCb site name where the job was executed |  |
| EndExecTime | same as StartExecTime, for the payload ending on the WN | 2014-07-07 14:30:51 |
| UserPriority | internal Dirac priority of the job | 2 |
| CPUTime | might be useful, but the ones I checked are all 0.0 |  |
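
The three time stamps above are enough to derive how long a job waited before its payload started and how long the payload ran. A minimal sketch, assuming the example values belong to one job and that the time stamps always use the format shown; the `job` dictionary and the `seconds_between` helper are hand-written for illustration and are not a DIRAC API.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Example values copied from the table above (assumed to describe one job).
job = {
    "JobID": 82141213,
    "SubmissionTime": "2014-07-07 12:30:51",
    "StartExecTime": "2014-07-07 13:30:51",
    "EndExecTime": "2014-07-07 14:30:51",
}


def seconds_between(start, end):
    """Return the number of seconds from time stamp `start` to time stamp `end`."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Time between submission to Dirac and payload start on the worker node.
waiting_s = seconds_between(job["SubmissionTime"], job["StartExecTime"])
# Time the payload spent executing on the worker node.
running_s = seconds_between(job["StartExecTime"], job["EndExecTime"])

print(f"job {job['JobID']}: waited {waiting_s:.0f} s, ran {running_s:.0f} s")
```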

| *Attribute Name* | *Description* | *Example* |
| Pilot Reference | WLCG name | e.g. |
| CPUScalingFactor | the scaling factor as posted in the BDII | 4.0 |
| LoadAverage | last (?) load of the WN reported | 24.14 |
| DownloadInputData | the input data for this job, if any | Successfully downloaded LFN(s): /lhcb/LHCb/Collision12/CHARMTOBESWUM.DST/00020349/0002/00020349_00024058_1.CharmToBeSwum.dst Downloaded 1 / 1 files from local Storage Elements on first attempt. |
| WallClockTime(s) | seconds it took to execute the payload | 18965.2357981 |
| CacheSize(kB) | cache used on the WN | 12288KB |
| LastUpdateCPU(s) | last CPU seconds reported by the watchdog | 15695.0 |
| DiskSpace(MB) | scratch space available | 6076.0 |
| HostName | host where the payload was executed |  |
| TotalCPUTime(s) | total CPU seconds used to execute the payload | 16251.38 |
| CPUNormalizationFactor | the CPU scaling as measured by the LHCb pilot | 9.3 |
| NormCPUTime(s) | CPUNormalizationFactor * TotalCPUTime(s) | 151137.834 |
| ScaledCPUTime | .... | 117392.0 |
| LocalJobID | local batch system job ID | 543051900 |
| ModelName | CPU type as reported by the WN | Intel(R)Xeon(R)CPUL5640@2.27GHz |
| PayloadPID | process ID of the payload on the WN | 2200 |
| AgentLocalSE | what the job thinks are its local SEs to download/upload data | IN2P3-RAW,IN2P3-DST,IN2P3_M-DST,IN2P3-USER,IN2P3-FAILOVER,IN2P3-RDST,IN2P3_MC_M-DST,IN2P3_MC-DST,IN2P3-ARCHIVE,IN2P3-BUFFER |
| UploadedOutputData | the output data produced and successfully uploaded to (some) storage | 00039801_00022431_4.swimstrippingd02kskk.mdst |
| OK | but what? | True |
| LocalAccount | WN unix ID executing the payload | lhbplt01 |
| Memory(kB) | max memory consumed by the payload | 2450756kB |
| MemoryUsed(kB) | total memory in use on the WN | 6140024.0 |
| CPU(MHz) | CPU clock speed | 2268.000 |
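
Some of the example values carry a unit suffix (e.g. 12288KB) while others are plain numbers, so a bit of parsing is needed before doing arithmetic on them. The sketch below, assuming the example values come from the same job, strips such suffixes and reproduces NormCPUTime(s) from CPUNormalizationFactor and TotalCPUTime(s). The `to_float` helper is hypothetical, and the CPU/wall-clock ratio at the end is an extra illustration that is not one of the listed fields.

```python
import re

# Example values copied from the table above, kept as strings as they appear there.
raw = {
    "TotalCPUTime(s)": "16251.38",
    "WallClockTime(s)": "18965.2357981",
    "CPUNormalizationFactor": "9.3",
    "NormCPUTime(s)": "151137.834",
    "CacheSize(kB)": "12288KB",
    "Memory(kB)": "2450756kB",
}


def to_float(value):
    """Parse the leading number of a string, ignoring a trailing unit such as 'kB'."""
    match = re.match(r"\s*([0-9]+(?:\.[0-9]+)?)", value)
    if match is None:
        raise ValueError(f"no leading number in {value!r}")
    return float(match.group(1))


values = {name: to_float(text) for name, text in raw.items()}

# NormCPUTime(s) = CPUNormalizationFactor * TotalCPUTime(s), as stated in the table.
norm_cpu = values["CPUNormalizationFactor"] * values["TotalCPUTime(s)"]

# Not a listed field: fraction of the wall-clock time spent on the CPU.
cpu_efficiency = values["TotalCPUTime(s)"] / values["WallClockTime(s)"]

print(f"NormCPUTime recomputed: {norm_cpu:.3f} s")
print(f"NormCPUTime reported:   {values['NormCPUTime(s)']:.3f} s")
print(f"CPU / wall-clock ratio: {cpu_efficiency:.3f}")
```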

-- StefanRoiser - 15 Jul 2014

Topic revision: r4 - 2014-11-18 - StefanRoiser