Procedures for physics validation of the tracking in a production release

The procedure uses the Z->mumu ESD dataset from the physics validation Sample A. InDetRecStatistics is used to create an ntuple, from which a number of plots are made. These can be compared, by eye or with dCube, against plots created from a reference sample. I currently run everything at CERN on lxplus and lxbatch.

This example is based on the 14.1.0.2 evgen validation, which was the first validation requested in this post and presented at the following Physics Validation Meeting. The background for the tracking validation procedures is discussed here.

Setting up your Account

The normal Workbook account setup instructions should be OK, ie.

% mkdir ~/cmthome
% cd ~/cmthome
% source /afs/cern.ch/sw/contrib/CMT/v1r20p20080222/mgr/setup.sh    # or .csh

Then create your requirements file. I used:

set CMTSITE CERN
set SITEROOT /afs/cern.ch
macro ATLAS_DIST_AREA /afs/cern.ch/atlas/software/dist
macro ATLAS_TEST_AREA ${HOME}/testarea
apply_tag setup
apply_tag simpleTest
apply_tag 32
use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)

which doesn't specify an explicit release (I don't know why they put the release in the Workbook's example requirements). So far we have only validated 32-bit builds, so I put that in as a default, having been burned by accidentally getting a 64-bit build in the past.

Then create the setup.sh and .csh.

% cmt config

Setting up the ntuple-creation release

This only has to be updated when there is a change or fix in the ESD reading or InDetRecStatistics code. Here, as a test, I set up 14.2.0, although the above validation was done with 14.1.0 and InDetRecStatistics-00-01-82.

% mkdir ~/testarea/14.2.0
% cd ~/testarea/14.2.0
% source ~/cmthome/setup.sh -tag=14.2.0

See the AtlasLogin package for details of the different tags that can be specified.

If necessary, check out any required package updates. Currently we need the fixes in InDetRecStatistics-00-01-85:

% cmt co -r InDetRecStatistics-00-01-85 InnerDetector/InDetValidation/InDetRecStatistics

The InnerDetector/InDetExample/InDetRecExample package also sometimes needs updates.

Use cvs update to switch to another tag. Note that if you check out or update a package in an existing release, you should rerun the setupWorkArea.py step below, as well as the steps in the WorkArea/cmt directory. In addition, if an InDetRecStatistics update includes changes to the ntuple variables (see the ChangeLog), then you should first clean out any old libraries (I don't know why, or how much of this is necessary, but if it isn't done the job seems to crash at the end). To do this, use

% cmt bro gmake clean

and perhaps also

% find ~/testarea/14.2.0 -name "$CMTCONFIG" -type d | xargs -n100 rm -rf

(or, if you prefer, start a fresh release under ~/testarea).

% setupWorkArea.py
WorkAreaMgr : INFO     ################################################################################
WorkAreaMgr : INFO     Creating a WorkArea CMT package under: [/afs/cern.ch/user/a/adye/testarea/14.2.0] 
WorkAreaMgr : INFO     Scanning [/afs/cern.ch/user/a/adye/testarea/14.2.0]
WorkAreaMgr : INFO     Found 0 packages in WorkArea
WorkAreaMgr : INFO     => 0 package(s) in suppression list
WorkAreaMgr : INFO     Generation of WorkArea/cmt/requirements done [OK]
WorkAreaMgr : INFO     ################################################################################
% cd WorkArea/cmt
% cmt config
Creating setup scripts.
Creating cleanup scripts.
Installing the run directory
Installing the python directory
% source setup.sh    # or .csh
% cmt bro gmake
#--------------------------------------------------------------
# Now trying [gmake] in /afs/cern.ch/user/a/adye/testarea/14.2.0/WorkArea/cmt (166/166)
#--------------------------------------------------------------
...
------> (constituents.make) install_python_modules done
 all ok.
% cd ../run
% cp -p ~adye/testarea/14.2.0/WorkArea/run/esd.sh .
% get_files -symlink -jo ReadInDet_jobOptions.py
% mv ReadInDet_jobOptions.py esd.py.orig
% cp -p esd.py.orig esd.py

Edit esd.sh to specify the correct tag in setup.sh, ie. -tag=14.2.0 here.
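The line in question is the setup call; after editing, it should match the setup command used above, ie. something like:

source ~/cmthome/setup.sh -tag=14.2.0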

Edit esd.py:

  1. remove setting of DetDescrVersion
  2. set InDet__InDetRecStatisticsAlg.NtupleSaveHits=True after InDetRecExample/InDetRec_all.py
  3. set theApp.EvtMax from os.environ["EVTMAX"] (instead of hardcoded 10)
  4. remove all settings of ServiceMgr.EventSelector.InputCollections
  5. set InDetFlags.doStatNtuple=True (for release 14.2.20 and later)

You can see how I did this by comparing my esd.py.orig and esd.py in ~adye/testarea/14.2.20/WorkArea/run.
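For orientation, here is a minimal sketch of what the edited esd.py ends up doing. This is not a drop-in copy of my file: the way the InDetRecStatistics algorithm instance is fetched, and its handle name, are assumptions; only the settings listed above come from the actual edits. (include(), theApp and ServiceMgr are provided by the Athena jobOptions environment; depending on the release, InDetFlags may first need to be imported from InDetRecExample.InDetJobProperties.)

# esd.py (sketch)
import os

# (1) DetDescrVersion is no longer set here; the per-dataset jobOptions file
#     (see the next section) sets it to match the reco job before including
#     this file.

# (5) For release 14.2.20 and later, enable the statistics ntuple:
InDetFlags.doStatNtuple = True

include("InDetRecExample/InDetRec_all.py")

# (2) After InDetRec_all.py, save the hits so the residual plots get filled.
#     Fetching the instance from the topSequence, and the handle name
#     "InDetRecStatistics", are assumptions:
from AthenaCommon.AlgSequence import AlgSequence
topSequence = AlgSequence()
topSequence.InDetRecStatistics.NtupleSaveHits = True

# (3) Take the event count from the environment (set by esd.sh) instead of
#     hardcoding 10:
theApp.EvtMax = int(os.environ["EVTMAX"])

# (4) All settings of ServiceMgr.EventSelector.InputCollections have been
#     removed; the per-dataset jobOptions file supplies them.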

Locating the input ESD dataset

This step should be done for the relevant validation sample specified in the request from Iacopo. It may also have to be done for the reference sample, unless the ntuple is still available from a previous validation.

Find the dataset name using the BNL Panda Monitor Task and Tag Query Form.

Enter the tag name e339_s435_r432 in the "Output Task Name" field, and select the "Query Tags" radio button. Click "Continue" and then "QuerySubmit". This should show the statuses of all the samples' tasks. Find the PythiaZmumu.recon task (valid1.005145.PythiaZmumu.recon.e339_s435_r432 in this case). Hopefully it will be "done" or "finished", but there may be some output even if it is still "running" (sometimes a few jobs get stuck). Select the "Task ID" number to go to the task status page (the "Task name" link takes you to the request page, which isn't so useful).

Check that the details of the job are as expected.

  1. Trf Version: 14.1.0.2 is the reco release version (you can select "Task Input" for the simu version, and again for the evgen version).
  2. Vparams: ATLAS-CSC-05-00-00,NoTrackSlimming.py,... (NoTrackSlimming.py is a special option for the Track Validation, so we can check the hit residuals - if this is missing the residuals plots will be nearly empty).
  3. Events/file: 250. This is useful to check how many events are available if not all jobs are complete.

Click on the ESD dataset (valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432) under "Datasets for task" at the top, and scroll down to the content section. Check that enough files are available (>=5000 events should be fine, though the more the better).

Check whether there is a replica available in CASTOR at CERN (eg. CERN-PROD_MCDISK). If so, select the replica link for one of the files and find the PFN (at the bottom of the page). Copy the directory part of the URL, eg. /castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid1/ESD/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426_tid022268

(leaving off srm://srm-atlas.cern.ch at the beginning and /ESD.022268._00001.pool.root.2 at the end) and list the directory using the lxplus command:

% nsls -l /castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid1/ESD/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426_tid022268

If those files look OK, then they can be read directly from that location (skip the next section). If not, then we have to make a local copy.

Copying ESD dataset to CERN CASTOR

Set up DQ2 and get a Grid proxy. I have a little script to do both, though it is for Bourne shell only - you can make your own csh version if you like.

% source ~adye/bin/dq2setup
Enter GRID pass phrase: ****
Your identity: /C=UK/O=eScience/OU=CLRC/L=RAL/CN=tim adye
Creating temporary proxy ................................................................ Done
Contacting  lcg-voms.cern.ch:15001 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "atlas" Done
Creating proxy ...................................... Done
Your proxy is valid until Sat Jun 14 19:29:18 2008

Find the dataset with DQ2.

% dq2-ls -f valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432

valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432
[ ]     ESD.022659._00036.pool.root.2   524F4742-3D36-DD11-B569-00A0D1E505A1    md5:69fc965e72433da4092f1e8f20f01469    221954941
[ ]     ESD.022659._00027.pool.root.1   820FCC93-1D35-DD11-B554-00A0D1E70CB8    md5:1815556b996dd0987cce95e7631bc1c1    224329304
...
[ ]     ESD.022659._00003.pool.root.1   BCAE0093-1C33-DD11-8D94-00A0D1E70C54    md5:c43c2100e146ce6f02139248a6c36ffd    222784433
total files: 38
local files: 0
total size: 8563698219
date: 2008-06-10 18:31:19

If the files aren't there, add a * to the end of the dataset name. Sometimes when the dataset is incomplete, the files are instead in the task dataset, with a _tid022659 suffix. If that is the case, then use that dataset name in what follows.
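For example:

% dq2-ls -f valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432*

(the wildcard will also match any _tid-suffixed task dataset).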

Copy the dataset to your $CASTOR_HOME area. I have a script to do this. It should be run from a temporary area, eg. $TMPDIR (/tmp/adye in my case). It may take some time, so you probably want to run it in the background (or in a batch job, though I haven't tried that), perhaps with my job script:

% cd $TMPDIR
% ~adye/bin/job -N ~adye/bin/dq2castor valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432

The job script will put the logfile in ~/job.log* and will mail you when it is done (assuming you have a ~/.forward set up).

If that is successful (check the logfile: it should say something like Done, Total:33 - Failed:0 near the top), you should have the dataset in $CASTOR_HOME/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432:

% nsls -l valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432
mrw-r--r--   1 adye     zp                224914351 Jun 09 19:00 ESD.022659._00001.pool.root.1
mrw-r--r--   1 adye     zp                233560703 Jun 09 19:00 ESD.022659._00002.pool.root.1
...

Creating the InDetRecStatistics ntuple using ESD files in CASTOR

Create a jobOptions file, named after the dataset with a .py suffix (it doesn't have to be, but this helps keep track of things), to:

  1. specify the same DetDescrVersion as was used in the reco job (ATLAS-CSC-05-00-00), shown on the task status page.
  2. include esd.py.
  3. set ServiceMgr.EventSelector.InputCollections to the list of TUrls of the files in CASTOR (ie. castor:/castor/cern.ch/...). These are the files you found or copied in the previous sections.

See ~adye/testarea/14.2.0/WorkArea/run/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.py for an example.
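A minimal sketch of such a file, assuming the dataset was copied to $CASTOR_HOME as above (the file list is illustrative - add one entry per file in the dataset; esd.py brings ServiceMgr into scope):

# valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.py (sketch)

# (1) Same geometry as the reco job, from the task status page:
DetDescrVersion = "ATLAS-CSC-05-00-00"

# (2) Pull in the common jobOptions prepared earlier:
include("esd.py")

# (3) Input ESD files as CASTOR TUrls:
ServiceMgr.EventSelector.InputCollections = [
    "castor:/castor/cern.ch/user/a/adye/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432/ESD.022659._00001.pool.root.1",
    "castor:/castor/cern.ch/user/a/adye/valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432/ESD.022659._00002.pool.root.1",
    # ... and so on for the rest of the dataset
]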

This can be tested interactively:

% cd $TMPDIR
% ~/testarea/14.2.0/WorkArea/run/esd.sh valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432 100

where esd.sh takes the dataset name (without the .py suffix) and the maximum number of events (100 takes a few minutes, so is OK to run interactively).

Assuming that works OK, you will have an ntuple file (InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432-100.root in this case) in your directory and also in CASTOR. You can use ROOT's TBrowser to check a few quantities (in the Tracks tree) for gross screw-ups.

% root -l InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432-100.root
root [0] 
Attaching file InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432-100.root as _file0...
root [1] TBrowser b

You can now submit a batch job to run over the full dataset:

% cd ~/testarea/14.2.0/WorkArea/run
% bsub -q 8nh esd.sh valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432

(NB. no event limit is specified here). 10000 events usually take about 2 hours of normalised CPU time (KSI2K) or 1:45 of real time (assuming you get a job slot reasonably quickly).

When the job is done, check the bottom of the logfile for crashes and that the ntuple file is present. The logfile also includes the InDetRecStatistics summary table (see that web page for details), which can be checked.

The ntuple should be in CASTOR:

% nsls -l InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.root
mrw-r--r--   1 adye     zp                379462300 Jun 09 20:58 /castor/cern.ch/user/a/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.root

This should be accessible directly from ROOT with a castor: or rfio: TUrl; however, it is faster and more reliable to copy the file to $TMPDIR (some complex ntuple operations seem to have problems with files in CASTOR).

% rfcp $CASTOR_HOME/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432.root $TMPDIR

(this is done automatically by docmp.cc in the next step).

Creating and comparing tracking plots

I have some ROOT scripts to analyse the InDetRecStatistics ntuple in my CVS area (some copied to InnerDetector/InDetValidation/InDetRecStatistics/scripts and/or InnerDetector/InDetValidation/InDetRTT/scripts, but the full set is currently only in my CVS repository).

% cd ~
% cvs -d /afs/rl.ac.uk/user/a/adye/cvsroot co sven_scripts

Unfortunately CVS doesn't allow you to check out from a repository where you don't have write access (needed for locking). If you don't have write access to /afs/rl.ac.uk/user/a/adye/cvsroot/atlas/sven_scripts (ask me if you have a RAL AFS account), then you should copy from ~adye/public/sven_scripts instead. I'll try to remember to keep that checked out at the head.
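In that case, something like this should give you a working copy (the destination path is your choice; the following steps assume ~/sven_scripts):

% cp -rp ~adye/public/sven_scripts ~/sven_scripts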

% cd sven_scripts
% root -l -b
root [0] .L docmp.cc+
Info in <TUnixSystem::ACLiC>: creating shared library /afs/cern.ch/user/a/adye/sven_scripts/./docmp_cc.so

Then run docmp, specifying the reference and validation datasets as arguments:

root [1] docmp("valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426","valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432")
+ ./docmp_pre valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
+ '[' '!' -r /tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root ']'
+ rfcp /castor/cern.ch/user/a/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root /tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
453680541 bytes in 10 seconds through eth0 (in) and local (out) (44304 KB/sec)
453680541 bytes in remote file
+ '[' '!' -r /tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root ']'
+ test -h valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ test -d valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ test -d /tmp/adye/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ mkdir /tmp/adye/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ ln -nfs /tmp/adye/valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426
+ '[' '!' -d valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 ']'
Error in <TChain::LoadTree>: Cannot find tree with name ConvertedIPatTracks in file $TMPDIR/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
Error in <TChain::LoadTree>: Cannot find tree with name ConvertedXKalmanTracks in file $TMPDIR/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root
InDetRecStatistics trees:-
$TMPDIR/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root tree:Tracks entries=9701
======================= eff =======================
mc_cut:   (((sqrt(pow(mctrack_beginvertexx,2)+pow(mctrack_beginvertexy,2)) < 25.0))&&((abs(mctrack_beginvertexz) < 200.0)))&&(((abs(mctrack_endvertexz) > 2300.0))||((sqrt(pow(mctrack_endvertexx,2)+pow(mctrack_endvertexy,2)) > 400.0)))
data_cut: ((((sqrt(pow(mctrack_beginvertexx,2)+pow(mctrack_beginvertexy,2)) < 25.0))&&((abs(mctrack_beginvertexz) < 200.0)))&&(((abs(mctrack_endvertexz) > 2300.0))||((sqrt(pow(mctrack_endvertexx,2)+pow(mctrack_endvertexy,2)) > 400.0))))&&((mctrack_truth_prob>0.5))
Info in <TCanvas::Print>: eps file valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426/efficiency_eta.eps has been created
Info in <TCanvas::Print>: eps file valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426/efficiency_phi.eps has been created
Info in <TCanvas::Print>: eps file valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426/efficiency_pt.eps has been created
======================= fake =======================
...

You can even run it all in the background:

% ~adye/bin/job -N root -l -b \
'docmp.cc+("valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426","valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432")'

Don't worry about the missing ConvertedIPatTracks and ConvertedXKalmanTracks trees. Those are not in the ESD - we just look at the Tracks tree.

docmp assumes it can find InDetRecStatistics-REFDATASET.root and InDetRecStatistics-MONDATASET.root in $TMPDIR or $CASTOR_HOME (if only the latter, it copies the file to the former). It will create a directory for each dataset in your ~/sven_scripts and run dCube to compare them (the result is in MONDATASET/dcube, eg. valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432/dcube).

The comparison should probably use the same number of events from both ntuples. By default, docmp limits the number of events to the smaller of the validation and reference samples. This can be overridden by specifying a different number of events as the third parameter (nentries), or -1 to use all events in both files.
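For example, to use all events in both files:

root [2] docmp("valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426","valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432",-1)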

Alternatively, the number of events can be determined from the Athena logfile, or using:

% root -l
root [0] TFile::Open("/tmp/adye/InDetRecStatistics-valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426.root")
(class TFile*)0x8b42ee8
root [1] Tracks->GetEntries()
(const Long64_t)9701

It's then most convenient to access these from the web, so move those two directories to a web-accessible directory, eg.

% mkdir -p ~/www/val/080610a
% mv valid1.005145.PythiaZmumu.recon.ESD.e322_s429_r426 \
     valid1.005145.PythiaZmumu.recon.ESD.e339_s435_r432 \
     ~/www/val/080610a

If you haven't set up a CERN AFS web area (I put it in ~adye/www), you can do it using the CERN web admin pages.

You can see my results here: reference and validation, with dCube comparison linked from the latter.

-- TimAdye - 16 Jun 2008
