ATLAS

To optimize resources for I/O-intensive jobs, I am collecting job information for all job types in all experiments. The metrics of interest are listed below.

For the different job types (MC SIM in different scenarios, e.g. ALICE PbPb or pp; MC RECO; DATA RECO; ANALYSIS; data repack; skim; merge; stripping):

1) Data in and out [bytes/event]
2) Bandwidth requirement [bytes/sec]
3) CPU times per event [sec/event optimally on a 10 HEPSPEC06 CPU]
4) memory consumption [bytes]
5) total minbias input data size [bytes]
6) efficiencies [%]
7) sensitivity to latency on efficiency (for remote data access)
8) are there jobs that have more than 1 concurrent data access?

1) Data in and out [bytes/event]

The dashboard provides the size of the input and output (processed data and produced data):

processed_1year_jobtype.png
produced_1year_jobtype.png

and the number of processed events:

1year_events_jobtype.png

Simulation: 856.89 [kB/evt] processed data, 690.10 [kB/evt] produced data

Reconstruction: 10030.66 [kB/evt] processed data, 937.03 [kB/evt] produced data

Analysis: 1333.43 [kB/evt] processed data, 21.83 [kB/evt] produced data
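
As a quick sanity check on these numbers, the ratio of produced to processed data per event can be computed for each job type. The sketch below only restates the values listed above; the dictionary and function names are illustrative and not part of any dashboard API.

```python
# Per-event data volumes in kB/evt, copied from the dashboard numbers above.
DATA_KB_PER_EVT = {
    "Simulation": {"processed": 856.89, "produced": 690.10},
    "Reconstruction": {"processed": 10030.66, "produced": 937.03},
    "Analysis": {"processed": 1333.43, "produced": 21.83},
}


def output_input_ratio(volumes):
    """Return produced/processed data per event for one job type."""
    return volumes["produced"] / volumes["processed"]


ratios = {job: output_input_ratio(v) for job, v in DATA_KB_PER_EVT.items()}

for job, ratio in ratios.items():
    print(f"{job}: produced/processed = {ratio:.3f}")
```

With these inputs, simulation writes out about 80% of what it reads, while reconstruction and analysis reduce the data volume by roughly one and two orders of magnitude respectively.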

2) Bandwidth requirement [bytes/sec]
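
One rough way to estimate this, assuming a job streams its input and output evenly over its CPU time, is to divide the per-event data volume from 1) by the unnormalized average CPU time per event from 3). This is only a sketch under that assumption: it ignores idle time, staging and caching, and the names below are illustrative.

```python
# Values copied from sections 1) and 3) of this page.
KB_PER_EVT = {  # (processed, produced) in kB/evt
    "Simulation": (856.89, 690.10),
    "Reconstruction": (10030.66, 937.03),
    "Analysis": (1333.43, 21.83),
}
CPU_S_PER_EVT = {  # unnormalized average CPU time in s/evt
    "Simulation": 270.30,
    "Reconstruction": 40.02,
    "Analysis": 10.05,
}


def bandwidth_kb_per_s(job):
    """Estimated average I/O rate (read + write) per job slot in kB/s."""
    processed, produced = KB_PER_EVT[job]
    return (processed + produced) / CPU_S_PER_EVT[job]


bandwidth = {job: bandwidth_kb_per_s(job) for job in KB_PER_EVT}

for job, rate in bandwidth.items():
    print(f"{job}: ~{rate:.1f} kB/s")
```

The result is an average rate per running job; the requirement on a site would scale with the number of concurrent job slots of each type.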

3) CPU times per event [sec/event optimally on a 10 HEPSPEC06 CPU]

According to the ATLAS dashboard, the CPU usage from last year (01.08.2014 - 01.08.2015) can be seen below.

hepcpec_CPU_jobtype_1year.png
1year_CPU_jobtype_alljob.png

These pie charts are taken from the dashboard and show the CPU utilisation by job type in absolute and relative numbers. To make the CPU time comparable, the numbers have been normalized with the HEPSPEC06 benchmark. The normalized result should deviate from the plot next to it by only a factor of ~10, but the orders of magnitude of the CPU consumption do not match (why? The factor is ~400).

Using the event numbers from above (see the plot in 1)) and breaking the CPU numbers down to the event level, the results normalized for a 10 HEPSPEC06 CPU are as follows (is there a mistake in the dashboard, in my logic, or in the units?):

Simulation: 0.062 [s/evt]

Reconstruction: 0.013 [s/evt]

Analysis: 0.00024 [s/evt]

When comparing this with the numbers from "Average CPU time spent on one good event", the expectation would be that they are at least of the same order of magnitude (an average CPU is slightly below 10 HEPSPEC06), but the same disparity as in the previous two plots can be observed. It is probably better to stick to the following numbers, even though they are not normalized.

1year_jobtypes_avgCPU.png

Simulation: 270.30 [s/evt]

Reconstruction: 40.02 [s/evt]

Analysis: 10.05 [s/evt]
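
To quantify the disparity, the unnormalized averages can be divided by the normalized values listed earlier. If the only difference were the HEPSPEC06 normalization, the factor should be small and roughly the same for every job type. The sketch below just performs this division; the names are illustrative.

```python
# s/evt normalized to a 10 HEPSPEC06 CPU (derived from the dashboard pie charts)
NORMALIZED_S_PER_EVT = {
    "Simulation": 0.062,
    "Reconstruction": 0.013,
    "Analysis": 0.00024,
}
# s/evt from "Average CPU time spent on one good event" (not normalized)
RAW_S_PER_EVT = {
    "Simulation": 270.30,
    "Reconstruction": 40.02,
    "Analysis": 10.05,
}

factors = {
    job: RAW_S_PER_EVT[job] / NORMALIZED_S_PER_EVT[job] for job in RAW_S_PER_EVT
}

for job, factor in factors.items():
    print(f"{job}: raw/normalized = {factor:.0f}")
```

The factors come out in the thousands to tens of thousands and differ by about an order of magnitude between job types, which suggests the mismatch is not a single constant unit conversion.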

Looking only at the average is not very meaningful, so these categories are split further, starting with the MC simulation. Because of the dashboard inconsistency seen above, the following numbers do not take the power of the CPU into account (*to be fixed*):

MC15_13TeV (Full Sim): 249 [s/evt]

MC15_13TeV (Fast Sim): 62 [s/evt]

MC15_13TeV (MC Simulation): 48 [s/evt]

Analysis can be split further by input data type into AOD and DAOD, resulting in:

AOD: average CPU time spent on one good event 7.8139 s/evt (0.000915651 hepspec/evt)

516 kB/evt processed data, 4.07 kB/evt produced data

DAOD: average CPU time spent on one good event 0.0619 s/evt (0.000397506 hepspec/evt)

250 kB/evt processed data, 2.11 kB/evt produced data

4) memory consumption [bytes]


5) total minbias input data size [bytes]

A 2000-event reconstruction job requires about 30 GB of input, amounting to about 10 TB of reconstruction input data (minbias).


6) efficiencies [%]


7) sensitivity to latency on efficiency (for remote data access)


8) are there jobs that have more than 1 concurrent data access?

Summary

table1.png

Why is "data IN" for MC SIM so big?

How can "data OUT" of MC SIM be so much bigger than "data IN" of MC RECO? Shouldn't it be the same?

-- GerhardFerdinandRzehorz - 2015-08-20

-- GerhardFerdinandRzehorz - 2015-09-17
