ATLAS
In order to optimize resources for I/O-intensive jobs, I am collecting job information for all job types in all the experiments. The metrics of interest are listed below,
for the different job types: MC SIM (different scenarios, e.g. ALICE PbPb or pp), MC RECO, DATA RECO, ANALYSIS, data repack, skim, merge, stripping:
1) Data in and out [bytes/event]
2) Bandwidth requirement [bytes/sec]
3) CPU times per event [sec/event, optimally on a 10 HEPSPEC06 CPU]
4) Memory consumption [bytes]
5) Total minbias input data size [bytes]
6) Efficiencies [%]
7) Sensitivity to latency on efficiency (for remote data access)
8) Are there jobs that have more than 1 concurrent data access?
1) Data in and out [bytes/event]
The dashboard provides the size of the input and output (processed data and produced data) as well as the number of processed events. Per event this gives (a small sketch of the conversion follows the numbers):
Simulation: 856.89 [kB/evt] processed data, 690.10 [kB/evt] produced data
Reconstruction: 10030.66 [kB/evt] processed data, 937.03 [kB/evt] produced data
Analysis: 1333.43 [kB/evt] processed data, 21.83 [kB/evt] produced data
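As a cross-check, the per-event sizes above can be recomputed from the dashboard totals by dividing the total processed/produced volume by the number of processed events. The following is only a minimal Python sketch of that division; the totals and event counts in it are placeholders, not the actual dashboard values.
<verbatim>
# Sketch: derive per-event input/output sizes from dashboard totals.
# The totals and event counts below are placeholders, NOT the real dashboard values.

def per_event_kb(total_bytes, n_events):
    """Data volume per event in kB."""
    return total_bytes / n_events / 1e3

jobtypes = {
    # jobtype: (processed bytes, produced bytes, processed events)
    "Simulation":     (8.6e14, 6.9e14, 1.0e9),
    "Reconstruction": (1.0e16, 9.4e14, 1.0e9),
    "Analysis":       (1.3e15, 2.2e13, 1.0e9),
}

for name, (in_bytes, out_bytes, n_evt) in jobtypes.items():
    print(f"{name}: {per_event_kb(in_bytes, n_evt):.2f} kB/evt processed, "
          f"{per_event_kb(out_bytes, n_evt):.2f} kB/evt produced")
</verbatim>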
2) Bandwidth requirement [bytes/sec]
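No dashboard numbers are collected here yet. As a first, rough estimate one can assume a job reads its input at a constant rate while processing and divide the data per event from 1) by the (unnormalized) CPU time per event from 3); the Python sketch below does exactly that and nothing more, so it ignores bursty reads at file open and any output bandwidth.
<verbatim>
# Sketch: sustained input bandwidth per running job, estimated as
# (processed data per event) / (CPU time per event).
# Numbers are the per-event figures quoted in sections 1) and 3).

per_event = {
    # jobtype: (processed kB/evt, CPU s/evt)
    "Simulation":     (856.89,   270.30),
    "Reconstruction": (10030.66,  40.02),
    "Analysis":       (1333.43,   10.05),
}

for name, (kb_per_evt, s_per_evt) in per_event.items():
    mb_per_s = kb_per_evt / s_per_evt / 1e3  # kB/s -> MB/s
    print(f"{name}: ~{mb_per_s:.3f} MB/s per running job")
</verbatim>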
3) CPU times per event [sec/event optimally on a 10 HEPSPEC06 CPU]
According to the ATLAS dashboard, the CPU usage over the last year (01.08.2014 - 01.08.2015) is shown below.

These pie charts are taken from the dashboard and show the CPU utilisation by job type in absolute and relative numbers. To make the CPU times comparable, the numbers have been normalized with the HEPSPEC06 benchmark. This should change them by only a factor of ~10, yet they differ from the CPU consumption shown in the plot next to it by roughly a factor of 400 (why?).
Using the previous event numbers (see the plot in answer 1) and breaking those CPU numbers down to the event level, the results normalized to a 10 HEPSPEC06 CPU are (is the mistake in the dashboard, in my logic, or in the units? a sketch of the normalization follows these numbers):
Simulation: 0.062 [s/evt]
Reconstruction: 0.013 [s/evt]
Analysis: 0.00024 [s/evt]
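For reference, a minimal sketch of the normalization applied above: the total CPU consumption as read off the dashboard (in HS06-hours) is divided by the number of events and by 10 HS06 to obtain seconds per event on a 10 HEPSPEC06 CPU. The dashboard total and event count in the example are placeholders, not the real values.
<verbatim>
# Sketch of the per-event normalization:
# dashboard CPU consumption in HS06-hours -> s/evt on a 10 HEPSPEC06 CPU.

def sec_per_event_on_10hs06(hs06_hours, n_events, reference_hs06=10.0):
    """Convert a total of HS06-hours into seconds per event on a reference CPU."""
    hs06_seconds = hs06_hours * 3600.0
    return hs06_seconds / n_events / reference_hs06

# Placeholder example values, NOT the real dashboard numbers:
print(f"{sec_per_event_on_10hs06(1.7e5, 1.0e9):.4f} s/evt")
</verbatim>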
When comparing this with the numbers for the average CPU time spent on one good event, the expectation would be that they are at least in the same order of magnitude (an average CPU delivers slightly below 10 HEPSPEC06), but the same disparity as in the previous two plots can be observed. It is probably better to stick to the following numbers, even though they are not normalized:
Simulation: 270.30 [s/evt]
Reconstruction: 40.02 [s/evt]
Analysis: 10.05 [s/evt]
Looking only at the average is not very meaningful, so these categories are split further, starting with the MC simulation (due to the inconsistency in the dashboard seen above, the following numbers do not take the power of the CPU into account
*to be fixed*):
MC15_13TeV (Full Sim): 249 [s/evt]
MC15_13TeV (Fast Sim): 62 [s/evt]
MC15_13TeV (MC Simulation): 48 [s/evt]
Analysis can be split further by input data type into AOD and DAOD, resulting in:
AOD: average CPU time spent on one good event 7.8139 s/evt (0.000915651 HEPSPEC/evt),
516 kB/evt processed data, 4.07 kB/evt produced data
DAOD: average CPU time spent on one good event 0.0619 s/evt (0.000397506 HEPSPEC/evt),
250 kB/evt processed data, 2.11 kB/evt produced data
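Relating the two figures for each input type (processed kB/evt divided by s/evt) gives the input rate a single analysis job would need, under the simplistic assumption of steady reading; a short sketch of that arithmetic:
<verbatim>
# Sketch: input rate per analysis job for AOD vs. DAOD input,
# assuming the input is read steadily while events are processed.
samples = {
    # input type: (processed kB/evt, CPU s/evt) from the numbers above
    "AOD":  (516.0, 7.8139),
    "DAOD": (250.0, 0.0619),
}

for name, (kb_per_evt, s_per_evt) in samples.items():
    print(f"{name}: ~{kb_per_evt / s_per_evt / 1e3:.2f} MB/s per job")
</verbatim>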
4) memory consumption [bytes]
5) total minbias input data size [bytes]
A 2000-event reconstruction job requires about 30 GB of input, amounting to about 10 TB of reconstruction input data (minbias) in total.
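The arithmetic behind these figures is straightforward; the small sketch below only restates it (30 GB per 2000-event job and 10 TB in total are the numbers quoted above):
<verbatim>
# Sketch: minbias input arithmetic from the figures quoted above.
GB = 1e9
TB = 1e12

input_per_job  = 30 * GB   # input of one 2000-event reconstruction job
events_per_job = 2000
total_minbias  = 10 * TB   # total minbias reconstruction input

print(f"minbias input per event: {input_per_job / events_per_job / 1e6:.0f} MB/evt")
print(f"30 GB jobs covered by the full 10 TB sample: ~{total_minbias / input_per_job:.0f}")
</verbatim>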
6) efficiencies [%]
7) sensitivity to latency on efficiency (for remote data access)
8) are there jobs that have more than 1 concurrent data access?
Summary
Why is the data IN of MC SIM so big?
How can the data OUT of MC SIM be so much bigger than the data IN of MC RECO? Shouldn't they be the same?
--
GerhardFerdinandRzehorz - 2015-08-20
--
GerhardFerdinandRzehorz - 2015-09-17