Procedures for physics validation of the tracking in a production release
The procedure uses the Z->mumu ESD dataset from the physics validation Sample A. InDetRecStatistics is used to create an ntuple, from which a number of plots are made. These can be compared, by eye or with dCube, with plots created from a reference sample. I currently run everything at CERN on lxplus and lxbatch.
This example is based on the 14.1.0.2 evgen validation, which was the first validation requested in this post and presented at the following Physics Validation Meeting. The background for the tracking validation procedures is discussed here.
Setting up your Account
The normal Workbook account setup instructions should be OK, i.e.
% mkdir ~/cmthome
% cd ~/cmthome
% source /afs/cern.ch/sw/contrib/CMT/v1r20p20080222/mgr/setup.sh # or .csh
Then create your requirements file. I used:
set CMTSITE CERN
set SITEROOT /afs/cern.ch
macro ATLAS_DIST_AREA /afs/cern.ch/atlas/software/dist
macro ATLAS_TEST_AREA ${HOME}/testarea
apply_tag setup
apply_tag simpleTest
apply_tag 32
use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
which doesn't specify an explicit release (I don't know why the Workbook's example requirements includes one). So far we have only validated 32-bit builds, so I put that in as a default, having been burned by accidentally getting a 64-bit build in the past.
Then create the setup.sh and setup.csh scripts:
% cmt config
Setting up the ntuple-creation release
This only has to be updated when there is a change or fix in the ESD reading or InDetRecStatistics code. Here, as a test, I set up 14.2.0, although the above validation was done with 14.1.0 and InDetRecStatistics-00-01-82.
% mkdir ~/testarea/14.2.0
% cd ~/testarea/14.2.0
% source ~/cmthome/setup.sh -tag=14.2.0
See the AtlasLogin package for details of the different tags that can be specified.
If necessary, check out any required package updates. Currently we need fixes in InDetRecStatistics-00-01-85:
% cmt co -r InDetRecStatistics-00-01-85 InnerDetector/InDetValidation/InDetRecStatistics
The InnerDetector/InDetExample/InDetRecExample package also sometimes needs updates.
Use cvs update to update to another tag. Note that if you check out or update a package in an existing release, you should rerun the following setupWorkArea.py step, plus the steps in the WorkArea/cmt directory. In addition, if an InDetRecStatistics update includes changes to the ntuple variables (see the ChangeLog), then you should first clean out any old libraries (I don't know why or how much of this is necessary, but if it isn't done, the job seems to crash at the end). To do this, use
% cmt bro gmake clean
and perhaps also
% find ~/testarea/14.2.0 -name "$CMTCONFIG" -type d | xargs -n100 rm -rf
(or, if you prefer, start a new release under ~/testarea).
% setupWorkArea.py
WorkAreaMgr : INFO ################################################################################
WorkAreaMgr : INFO Creating a WorkArea CMT package under: [/afs/cern.ch/user/a/adye/testarea/14.2.0]
WorkAreaMgr : INFO Scanning [/afs/cern.ch/user/a/adye/testarea/14.2.0]
WorkAreaMgr : INFO Found 0 packages in WorkArea
WorkAreaMgr : INFO => 0 package(s) in suppression list
WorkAreaMgr : INFO Generation of WorkArea/cmt/requirements done [OK]
WorkAreaMgr : INFO ################################################################################
% cd WorkArea/cmt
% cmt config
Creating setup scripts.
Creating cleanup scripts.
Installing the run directory
Installing the python directory
% source setup.sh # or .csh
% cmt bro gmake
#--------------------------------------------------------------
# Now trying [gmake] in /afs/cern.ch/user/a/adye/testarea/14.2.0/WorkArea/cmt (166/166)
#--------------------------------------------------------------
...
------> (constituents.make) install_python_modules done
all ok.
% cd ../run
% cp -p ~adye/testarea/14.2.0/WorkArea/run/esd.sh .
% get_files -symlink -jo ReadInDet_jobOptions.py
% mv ReadInDet_jobOptions.py esd.py.orig
% cp -p esd.py.orig esd.py
Edit esd.sh to specify the correct tag in setup.sh, i.e. -tag=14.2.0 here.
Edit esd.py:
- remove the setting of DetDescrVersion
- set InDet__InDetRecStatisticsAlg.NtupleSaveHits=True after InDetRecExample/InDetRec_all.py
- set theApp.EvtMax from os.environ["EVTMAX"] (instead of the hardcoded 10)
- remove all settings of ServiceMgr.EventSelector.InputCollections
- change InDetFlags.doStatNtuple to True (for release 14.2.20 and later)
You can see how I did this by comparing my esd.py.orig and esd.py in ~adye/testarea/14.2.20/WorkArea/run.
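For orientation, the resulting structure of esd.py might look something like the sketch below. This is not the actual file (see my copies above for that), only an illustration of the listed edits; the exact property names and their placement depend on the release:

```python
# Sketch of esd.py after the edits above (illustrative only; compare
# with ReadInDet_jobOptions.py from your release for the real content).
import os

# DetDescrVersion is deliberately NOT set here: the per-dataset
# jobOptions file that includes this one provides it.

# For release 14.2.20 and later, enable the statistics ntuple:
InDetFlags.doStatNtuple = True

include("InDetRecExample/InDetRec_all.py")

# Save hits so that the residual plots can be filled:
InDet__InDetRecStatisticsAlg.NtupleSaveHits = True

# Event count from the environment instead of the hardcoded 10:
theApp.EvtMax = int(os.environ["EVTMAX"])

# All settings of ServiceMgr.EventSelector.InputCollections are
# removed; the per-dataset jobOptions file provides them.
```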
Locating the input ESD dataset
This step should be done for the relevant validation sample specified in the request from Iacopo. It may also have to be done for the reference sample, unless the ntuple is still available from a previous validation.
Find the dataset name using the BNL Panda Monitor Task and Tag Query Form.
Fill in the tag name e339_s435_r432 in the "Output Task Name" field, and select the "Query Tags" radio button. Click "Continue" and then "QuerySubmit". This should show the statuses of all the samples' tasks. Find the PythiaZmumu.recon task (valid1.005145.PythiaZmumu.recon.e339_s435_r432 in this case). Hopefully it will be "done" or "finished", but there may be some output even if it is still "running" (sometimes a few jobs get stuck). Select the "Task ID" number to go to the task status page (the "Task name" link takes you to the request page, which isn't so useful).
Check that the details of the job are as expected.
- Trf Version: 14.1.0.2 is the reco release version (you can select "Task Input" for the simu version, and again for the evgen version).
- Vparams: ATLAS-CSC-05-00-00,NoTrackSlimming.py,... (NoTrackSlimming.py is a special option for the Track Validation, so we can check the hit residuals; if it is missing, the residuals plots will be nearly empty).
- Events/file: 250. This is useful for checking how many events are available if not all jobs are complete.
Click on the ESD dataset (valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432) under "Datasets for task" at the top and scroll down to the content section. Check that enough files are available (>=5000 events should be fine, though the more the better).
Check whether there is a replica available in CASTOR at CERN (e.g. CERN-PROD_MCDISK). If so, select the replica link for one of the files and find the PFN (at the bottom of the page). Copy the directory part of the URL, e.g. /castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid1/ESD/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426_tid022268 (leaving off srm://srm-atlas.cern.ch at the beginning and /ESD.022268._00001.pool.root.2 at the end), and list the directory using the lxplus command:
% nsls -l /castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid1/ESD/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426_tid022268
If those files look OK, then they can be read directly from that location (skip the next section). If not, then we have to make a local copy.
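When reading the files directly from this location, each one must appear as a castor: TUrl in the jobOptions InputCollections list (see the ntuple-creation section below). A minimal Python sketch of the conversion, using a hypothetical helper castor_turls (not part of any release, just an illustration of the TUrl format):

```python
def castor_turls(castor_dir, filenames):
    """Turn a CASTOR directory plus file names (e.g. from `nsls`) into
    castor: TUrls suitable for EventSelector.InputCollections."""
    return ["castor:%s/%s" % (castor_dir.rstrip("/"), f) for f in filenames]

# For example, for the first file of the directory listed above:
urls = castor_turls(
    "/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid1/ESD/"
    "valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426_tid022268",
    ["ESD.022268._00001.pool.root.2"])
```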
Copying ESD dataset to CERN CASTOR
Set up DQ2 and get a Grid proxy. I have a little script to do both, though it is Bourne-shell only; you can make your own csh version if you like.
% source ~adye/bin/dq2setup
Enter GRID pass phrase: ****
Your identity: /C=UK/O=eScience/OU=CLRC/L=RAL/CN=tim adye
Creating temporary proxy ................................................................ Done
Contacting lcg-voms.cern.ch:15001 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "atlas" Done
Creating proxy ...................................... Done
Your proxy is valid until Sat Jun 14 19:29:18 2008
Find the dataset with DQ2.
% dq2-ls -f valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432
valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432
[ ] ESD.022659._00036.pool.root.2 524F4742-3D36-DD11-B569-00A0D1E505A1 md5:69fc965e72433da4092f1e8f20f01469 221954941
[ ] ESD.022659._00027.pool.root.1 820FCC93-1D35-DD11-B554-00A0D1E70CB8 md5:1815556b996dd0987cce95e7631bc1c1 224329304
...
[ ] ESD.022659._00003.pool.root.1 BCAE0093-1C33-DD11-8D94-00A0D1E70C54 md5:c43c2100e146ce6f02139248a6c36ffd 222784433
total files: 38
local files: 0
total size: 8563698219
date: 2008-06-10 18:31:19
If the files aren't there, add a * to the end of the dataset name. Sometimes, when the dataset is incomplete, the files are in the task dataset with a _tid022659 suffix. If that is the case, use that dataset name in what follows.
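The fallbacks just described can be summarised in a small sketch (candidate_names is a hypothetical helper, for illustration only; in practice you simply retype the dq2-ls argument):

```python
def candidate_names(dataset, tid=None):
    """Dataset names to try with dq2-ls, in order: the plain name, a
    trailing wildcard, and (if known) the task dataset's _tid suffix."""
    names = [dataset, dataset + "*"]
    if tid is not None:
        names.append("%s_tid%s" % (dataset, tid))
    return names
```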
Copy the dataset to your $CASTOR_HOME area. I have a script to do this. It should be run from a temporary area, e.g. $TMPDIR (/tmp/adye in my case). It may take some time, so you probably want to run it in the background (or in a batch job, though I haven't tried that), perhaps with my job script:
% cd $TMPDIR
% ~adye/bin/job -N ~adye/bin/dq2castor valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432
The job script will put the logfile in ~/job.log* and will mail you when it is done (assuming you have a ~/.forward set up).
If that is successful (check the logfile: it should say something like Done, Total:33 - Failed:0 near the top), you should have the dataset in $CASTOR_HOME/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432:
% nsls -l valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432
mrw-r--r-- 1 adye zp 224914351 Jun 09 19:00 ESD.022659._00001.pool.root.1
mrw-r--r-- 1 adye zp 233560703 Jun 09 19:00 ESD.022659._00002.pool.root.1
...
Creating the InDetRecStatistics ntuple using ESD files in CASTOR
Create a jobOptions file, named the same as the dataset with a .py suffix (it doesn't have to be, but this helps keep track of things), to:
- specify the same DetDescrVersion as was used in the reco job (ATLAS-CSC-05-00-00), shown on the task status page.
- include esd.py.
- set ServiceMgr.EventSelector.InputCollections to the list of TUrls of the files in CASTOR (i.e. castor:/castor/cern.ch/...). These are the files you found or copied in the previous sections.
See ~adye/testarea/14.2.0/WorkArea/run/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.py for an example.
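Put together, the per-dataset jobOptions file might look like the following sketch (illustrative only, assuming files copied to my $CASTOR_HOME; the real example is the file referenced above, and the InputCollections list must name every file you intend to read):

```python
# valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.py (sketch)

# Same geometry tag as the reco job, from the task status page:
DetDescrVersion = "ATLAS-CSC-05-00-00"

include("esd.py")

# castor: TUrls of the ESD files found or copied above (one entry per
# file; only the first two are shown here):
ServiceMgr.EventSelector.InputCollections = [
    "castor:/castor/cern.ch/user/a/adye/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432/ESD.022659._00001.pool.root.1",
    "castor:/castor/cern.ch/user/a/adye/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432/ESD.022659._00002.pool.root.1",
    # ...
]
```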
This can be tested interactively:
% cd $TMPDIR
% ~/testarea/14.2.0/WorkArea/run/esd.sh valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432 100
where esd.sh takes the dataset name (without the .py suffix) and the maximum number of events (100 takes a few minutes, so is OK to run interactively).
Assuming that works OK, you will have an ntuple file (InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432-100.root in this case) in your directory and also in CASTOR. You can use ROOT's TBrowser to check a few quantities (in the Tracks tree) for gross screw-ups.
% root -l InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432-100.root
root [0]
Attaching file InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432-100.root as _file0...
root [1] TBrowser b
You can now submit a batch job to run over the full dataset:
% cd ~/testarea/14.2.0/WorkArea/run
% bsub -q 8nh esd.sh valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432
(NB no event limit is specified here.) 10000 events usually take about 2 hours of normalised CPU time (KSI2K) or 1:45 of real time (assuming you get a job slot reasonably quickly).
When the job is done, check the bottom of the logfile for crashes and that the ntuple file is present. The logfile also includes the InDetRecStatistics summary table (see that web page for details), which can be checked.
The ntuple should be in CASTOR:
% nsls -l InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.root
mrw-r--r-- 1 adye zp 379462300 Jun 09 20:58 /castor/cern.ch/user/a/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.root
This should be accessible directly from ROOT with a castor: or rfio: TUrl; however, it is faster and more reliable to copy the file to $TMPDIR (some complex ntuple operations seem to have problems with files in CASTOR).
% rfcp $CASTOR_HOME/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.root $TMPDIR
(this is done automatically by docmp.cc in the next step).
Creating and comparing tracking plots
I have some ROOT scripts to analyse the InDetRecStatistics ntuple in my CVS area (some are copied to InnerDetector/InDetValidation/InDetRecStatistics/scripts and/or InnerDetector/InDetValidation/InDetRTT/scripts, but the full set is currently only in my CVS repository).
% cd ~
% cvs -d /afs/rl.ac.uk/user/a/adye/cvsroot co sven_scripts
Unfortunately CVS doesn't allow you to check out from a repository where you don't have write access (needed for locking). If you don't have write access to /afs/rl.ac.uk/user/a/adye/cvsroot/atlas/sven_scripts (ask me if you have a RAL AFS account), then you should copy from ~adye/public/sven_scripts instead. I'll try to remember to keep that checked out at the head.
% cd sven_scripts
% root -l -b
root [0] .L docmp.cc+
Info in <TUnixSystem::ACLiC>: creating shared library /afs/cern.ch/user/a/adye/sven_scripts/./docmp_cc.so
Then run docmp, specifying the reference and validation datasets as arguments:
root [1] docmp("valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426","valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432")
+ ./docmp_pre valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
+ '[' '!' -r /tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root ']'
+ rfcp /castor/cern.ch/user/a/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root /tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
453680541 bytes in 10 seconds through eth0 (in) and local (out) (44304 KB/sec)
453680541 bytes in remote file
+ '[' '!' -r /tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root ']'
+ test -h valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ test -d valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ test -d /tmp/adye/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ mkdir /tmp/adye/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ ln -nfs /tmp/adye/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ '[' '!' -d valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 ']'
Error in <TChain::LoadTree>: Cannot find tree with name ConvertedIPatTracks in file $TMPDIR/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
Error in <TChain::LoadTree>: Cannot find tree with name ConvertedXKalmanTracks in file $TMPDIR/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
InDetRecStatistics trees:-
$TMPDIR/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root tree:Tracks entries=9701
======================= eff =======================
mc_cut: (((sqrt(pow(mctrack_beginvertexx,2)+pow(mctrack_beginvertexy,2)) < 25.0))&&((abs(mctrack_beginvertexz) < 200.0)))&&(((abs(mctrack_endvertexz) > 2300.0))||((sqrt(pow(mctrack_endvertexx,2)+pow(mctrack_endvertexy,2)) > 400.0)))
data_cut: ((((sqrt(pow(mctrack_beginvertexx,2)+pow(mctrack_beginvertexy,2)) < 25.0))&&((abs(mctrack_beginvertexz) < 200.0)))&&(((abs(mctrack_endvertexz) > 2300.0))||((sqrt(pow(mctrack_endvertexx,2)+pow(mctrack_endvertexy,2)) > 400.0))))&&((mctrack_truth_prob>0.5))
Info in <TCanvas::Print>: eps file valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426/efficiency_eta.eps has been created
Info in <TCanvas::Print>: eps file valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426/efficiency_phi.eps has been created
Info in <TCanvas::Print>: eps file valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426/efficiency_pt.eps has been created
======================= fake =======================
...
You can even run it all in the background:
% ~adye/bin/job -N root -l -b \
'docmp.cc+("valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426","valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432")'
Don't worry about the missing ConvertedIPatTracks and ConvertedXKalmanTracks trees. Those are not in the ESD; we just look at the Tracks tree.
This assumes it can find InDetRecStatistics-REFDATASET.root and InDetRecStatistics-MONDATASET.root in $TMPDIR or $CASTOR_HOME (if the latter, it copies the file to the former). It will create directories for each dataset in your ~/sven_scripts directory and run dCube to compare them (the result is in MONDATASET/dcube, e.g. valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432/dcube).
The comparison should probably use the same number of events from both ntuples. By default, docmp limits the number of events to the smaller of the validation and reference samples. This can be overridden by specifying a different number of events as the third parameter (nentries), or -1 to use all events in both files.
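The default just described amounts to logic like the following sketch (pick_nentries is a hypothetical illustration of the behaviour, not docmp's actual code):

```python
def pick_nentries(ref_entries, mon_entries, nentries=0):
    """Number of events to compare from each ntuple.

    By default use the smaller of the two samples; a positive nentries
    (docmp's third parameter) overrides this, and -1 means all events
    in both files."""
    if nentries == 0:
        return min(ref_entries, mon_entries)
    return nentries
```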
Alternatively, the number of events can be determined from the Athena logfile, or using:
% root -l
root [0] TFile::Open("/tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root")
(class TFile*)0x8b42ee8
root [1] Tracks->GetEntries()
(const Long64_t)9701
It's then most convenient to access these from the web, so move the two directories to a web-accessible area, e.g.
% mkdir -p ~/www/val/080610a
% mv valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 \
valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432 \
~/www/val/080610a
If you haven't set up a CERN AFS web area (I put mine in ~adye/www), you can do so using the CERN web admin pages.
You can see my results here: reference and validation, with the dCube comparison linked from the latter.
--
TimAdye - 16 Jun 2008