Timing performance of CMSSW modules
In CMSSW there is a service which is measuring execution time
of all modules in use for every event. It can be used to detect
which module is most time consuming.
Here is step by step instruction how to perform such measurement:
Preparation of workspace
Create CMSSW workspace, put you code into
src
directory and compile it.
Choose configuration which you would like to use for the measurement.
Configuration file
Add to configuration file (let us say
test1_cfg.py
) following statement:
process.Timing = cms.Service("Timing")
Now you should run your configuration file through
cmsRun
, but check
first where goes the verbosity of modules. Normally all messages should be
printed on stdout and stderr. In such case redirect them to some files:
cmsRun test1_cfg.py 1>1.log 2>2.log
In case of CMSSW Logger, output could be written to some *.txt files.
Timing service output
Module Timing will print to the file information which looks like that:
TimeModule> 9 1 RawToDigi T2XMLDataDigiProducer 0.076854
TimeModule> 9 1 T2MCl T2MakeCluster2 0.0226469
TimeModule> 9 1 T2Hits T2RecHit2 0.000355959
TimeModule> 9 1 T2RoadColl2 T2RoadProducer2 0.000463009
TimeModule> 9 1 T2TrackColl2 T2TrackProducer2 0.000464916
TimeModule> 9 1 TriggerResults TriggerResultInserter 6.00815e-05
TimeModule> 9 1 o1 PoolOutputModule 0.05107
TimeEvent> 9 1 0.157636 0.156976 6.36203
First there is a
TimeModule>
prefix, then event number (here 9), then run number (here 1),
then called module labels and finally time needed to execute module for given event in seconds.
Find out the file in which such information is saved (2.log or debugmessages.txt or different one)
and filter it out to the
TimingInfo.txt
file:
grep TimeModule\> 2.log > TimingInfo.txt
Averaging over number of events
Having now all necessary information in
TimingInfo.txt file
we could use it to average execution time over all events.
In CMSSW there is piece of C++ code which does that. Let us take it:
cp /afs/cern.ch/cms/sw/slc4_ia32_gcc345/cms/cmssw/CMSSW_3_1_1/src/FastSimulation/Validation/test/timing.cpp .
g++ timing.cpp
By default this program will analyze "TimingInfo.txt" file and will print its results to stdout.
Let us call it:
> ./a.out
Analyzing time report from TimingInfo.txt with CPU fraction correction 0.995
Timing per module
=================
1 T2MakeCluster2 : 269 ms/event
2 PoolOutputModule : 203 ms/event
3 T2RoadProducer2 : 119 ms/event
4 T2XMLDataDigiProducer : 86.4 ms/event
5 T2TrackProducer2 : 25.5 ms/event
6 T2RecHit2 : 11.3 ms/event
7 TriggerResultInserter : 0.0643 ms/event
Total time = 714 ms/event
=================
Timing per label
=================
1 T2MakeCluster2:T2MCl : 269 ms/event
2 PoolOutputModule:o1 : 203 ms/event
3 T2RoadProducer2:T2RoadColl2 : 119 ms/event
4 T2XMLDataDigiProducer:RawToDigi : 86.4 ms/event
5 T2TrackProducer2:T2TrackColl2 : 25.5 ms/event
6 T2RecHit2:T2Hits : 11.3 ms/event
7 TriggerResultInserter:TriggerResults : 0.0643 ms/event
=================
Total time = 714 ms/event
Total time = 714 +/- 480 ms/event
Min. time = 134 ms/event
Max. time = 3.51e+03 ms/event
This summary could help us to identify most time consuming module.
Remarks
Repeat described procedure several times - results might heavily depend on machine load
on which you are executing
cmsRun
command.
Usually input/output is most time consuming part, so be careful when working with AFS or CASTOR.
timing.cpp tool wasn't yet tested with files containg data from many runs.
--
LeszekGrzanka - 17-Aug-2010