Procedures for physics validation of the primary tracking in a production release

The procedure uses the ESD datasets from the physics validation 10 TeV Sample A. In the example given here, we use the Z→μμ sample. A full validation usually also includes 1 GeV and 100 GeV pT single muon samples. InDetPerformanceMonitoring is used to create plots which can be compared with plots created from a reference sample using DCube. I currently run everything at CERN on lxplus and lxbatch.

In what follows, I have marked sections that can usually be skipped in green. These are there to give alternatives in case there are problems or if people want to do things differently.

This example is based on the 15.6.7.5 MC10 reconstruction validation, which was the second validation requested in this post and presented at the following Physics Validation Meeting. The background for the tracking validation procedures is discussed here. A summary of past validations can be found here.

The 15.6.7.5 validation was performed using a 15.6.7 analysis release to read the ESDs. This release needs to be run on SLC5, so log in to lxplus5.cern.ch.

[See revision 8 of this page for the previous recipe using the InDetRecStatistics n-tuple]

Creating the analysis release (my method)

A new analysis release is required whenever we validate ESD produced with a newer base release. It is OK to read ESD from a production cache using that cache's base release (eg. reading 15.6.6.5 or 15.6.7.5 with 15.6.7) but not with a previous base release (eg. reading 15.6.7.5 with 15.6.6 may not work).
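
The compatibility rule above can be written down as a small check. This is only a sketch: the helper names base_release and can_read_esd are hypothetical, and it assumes that release numbers are dot-separated integers whose first three components give the base release.

```python
def base_release(release):
    """Return the base release (first three components) of a release or cache number."""
    parts = release.split(".")
    if len(parts) < 3:
        raise ValueError("expected at least three components: %r" % release)
    return ".".join(parts[:3])


def can_read_esd(analysis_release, esd_release):
    """True if the analysis base release is at least as new as the ESD's base release."""
    analysis = tuple(int(p) for p in base_release(analysis_release).split("."))
    esd = tuple(int(p) for p in base_release(esd_release).split("."))
    return analysis >= esd
```

With these definitions, can_read_esd("15.6.7", "15.6.7.5") and can_read_esd("15.6.7", "15.6.6.5") are both true, while can_read_esd("15.6.6", "15.6.7.5") is false, which corresponds to the "may not work" case.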

I have a general-purpose script to automatically set up the release in your testarea. It also uses a slightly unconventional directory organisation, with a separate cmthome for each release, as a subdirectory. You may also find this method more convenient elsewhere, eg., analysis.

To set up a release on lxplus5.cern.ch, use

% mkdir ~/testarea/15.6.7
% cd ~/testarea/15.6.7
% ~adye/bin/atlnewrel 15.6.7
% source cmthome/setup.sh

The generated requirements file includes the release tag by default, so you don't need to specify that on the setup.sh.

If, for some reason, you need a Production Cache (pcache release, eg. 15.6.7.5), use

% ~adye/bin/atlnewrel 15.6.7 15.6.7.5,AtlasProduction

Once again, -tag=15.6.7.5,AtlasProduction is saved in the generated requirements file, so you don't need to remember that on the setup.sh.

Creating the analysis release (conventional method)

If you prefer to follow the official method, the normal Workbook account setup instructions should be OK, ie.

% mkdir ~/cmthome
% cd ~/cmthome
% source /afs/cern.ch/sw/contrib/CMT/v1r20p20090520/mgr/setup.sh    # or .csh

Then create your requirements file, eg.

set CMTSITE STANDALONE
set SITEROOT /afs/cern.ch/atlas/software/releases/15.6.7
macro ATLAS_TEST_AREA ${HOME}/testarea/15.6.7
macro ATLAS_DIST_AREA ${SITEROOT}
apply_tag projectArea
macro SITE_PROJECT_AREA ${SITEROOT}
macro EXTERNAL_PROJECT_AREA ${SITEROOT}
apply_tag opt
apply_tag setup
apply_tag oneTest
apply_tag 32
use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)

This uses the oneTest setup, so the ATLAS_TEST_AREA includes the release number. That is a clearer configuration for when we use multiple releases. So far we have only validated 32-bit builds, so I put that in as a default having been burned by accidentally getting a 64-bit build in the past.

Then create the setup.sh and .csh.

% cmt config

and set up the release directory.

% mkdir ~/testarea/15.6.7
% cd ~/testarea/15.6.7
% source ~/cmthome/setup.sh -tag=15.6.7

See the AtlasLogin package for details of the different tags that can be specified. In particular, note that if, for some reason, you need a Production Cache (pcache release, eg. 15.6.7.5), then you need to specify AtlasProduction, eg.

% source ~/cmthome/setup.sh -tag=15.6.7.5,AtlasProduction

though SITEROOT in the requirements file should still refer to the base release, 15.6.7. It's probably clearer to use 15.6.7.5 in the directory name, so ATLAS_TEST_AREA should reflect that.

Setting up the analysis release

Release 15.6.7 has all the latest InnerDetector tags. In the past, fixes have been needed to InDetRecExample, InDetRecStatistics, and InDetPerformanceRTT. InDetPerformanceRTT contains the bulk of the histogramming code.

We have one enhancement to DCubeClient (allowing fully normalised histograms to be compared) which must be applied if you want to use this release to run DCube later.

% cmt co Tools/DCubeClient
% patch -p0 < ~adye/save/DCubeClient-enorm.patch

To compile this (and any other) package, you can use my script

~adye/bin/atlmake

This runs the setupWorkArea.py, cmt config, and cmt bro gmake, all without having to change directory.

Now copy the run script, top jobOptions, and other needed stuff from my area to your WorkArea/run directory.

% cd ~adye/testarea/15.6.7/WorkArea/run
% cp -p esd.py esd.sh template.py MyCounter.py goodruns*.xml ~/testarea/15.6.7/WorkArea/run/
% cd ~/testarea/15.6.7/WorkArea/run

(MyCounter.py and goodruns*.xml are only needed for validating real data.)

If you used the conventional method to create your release, edit esd.sh to specify the correct location and tag in setup.sh. If you used my atlnewrel script, then esd.sh should work without changes release-to-release, unless an updated DBRelease is required (see below).

The above assumes that my top jobOptions, esd.py, still work with your release (as of course they do with 15.6.7). If they don't, then you can use ReadInDet_jobOptions.py from InDetRecExample as a template:

% get_files -symlink -jo ReadInDet_jobOptions.py
% mv ReadInDet_jobOptions.py esd.py.orig
% cp -p esd.py.orig esd.py

Edit esd.py:

  1. set InDetFlags.doStandardPlots=True
  2. set InDetKeys.StandardPlotHistName from os.environ["STDPLOTS_FILE"]
  3. remove setting of DetDescrVersion
  4. set theApp.EvtMax from os.environ["EVTMAX"] (instead of hardcoded 10)
  5. remove all settings of ServiceMgr.EventSelector.InputCollections
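
Edits 2 and 4 both read a value from the environment. A minimal sketch of that part, written so it can be tried outside Athena, might look like this. The function name settings_from_environment is hypothetical, and the fallback to -1 when EVTMAX is unset is an assumption (Athena treats an EvtMax of -1 as "all events"); it is not necessarily what my esd.py does.

```python
import os


def settings_from_environment(environ=None):
    """Read the plot file name and event limit from the environment."""
    if environ is None:
        environ = os.environ
    return {
        # STDPLOTS_FILE must be set; a missing value raises KeyError.
        "hist_name": environ["STDPLOTS_FILE"],
        # Assumption: an unset or empty EVTMAX means "run over all events" (-1).
        "evt_max": int(environ.get("EVTMAX") or -1),
    }


# Inside esd.py the values would then be assigned along these lines:
#   InDetKeys.StandardPlotHistName = settings["hist_name"]
#   theApp.EvtMax = settings["evt_max"]
```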

There are some additional changes needed for real data, so it's probably simplest to see how I did this by comparing my esd.py.orig and esd.py in ~adye/testarea/15.6.7/WorkArea/run.

Checking the production status and locating the output

This step should be done for the relevant validation sample specified in the request from Iacopo. It may also have to be done for the reference sample, unless the histogram root file is still available from a previous validation (and the previous run used a compatible release with a similar number of events).

Find the dataset name using the Panda Monitor Task and Tag Query Form.

Fill the tag name e380_s764_r1185 in the "Output Task Name" field, and select the "Query Tags" radio button. Click "Continue" and then "QuerySubmit". This should show the statuses of all the samples' tasks. Find the PythiaZmumu.recon task (valid1.105145.PythiaZmumu.recon.e380_s764_r1185 in this case). Hopefully it should be "done" or "finished", but there may be some output even if it is still "running" (sometimes a few jobs get stuck). Then copy the "Task ID" number (122425) into the "Task status" quick search box on the left (the links take you to less useful places).

Check the details of the job are as expected.

  1. Trf Version: 15.6.7.5 is the reco release version (you can select "Task Input" for the simulation version, and again for the event generation version).
  2. Vparams DBRelease value: 9.4.1.
  3. Vparams Geometry value: ATLAS-GEO-10-00-00
  4. Vparams preInclude value: NoTrackSlimming.py (NoTrackSlimming.py is a special option for the Track Validation, so we can check the hit residuals - if this is missing the residuals plots will be nearly empty and the hole search won't be correct).
  5. Events/file: 250. This is useful to check how many events are available if not all jobs are complete.

Check that the DBRelease value is not greater than the default version provided by your release (use echo $DBRELEASE to check). If it is, then you will have to find a newer version. Other DBReleases can be found under the release's DBRelease subdirectory (eg. /afs/cern.ch/atlas/software/releases/15.6.7/DBRelease). The default can be overridden by setting the DBRELEASE_OVERRIDE, and perhaps ATLAS_DB_AREA, environment variables, eg. adding

export ATLAS_DB_AREA=/afs/cern.ch/atlas/software/releases/15.6.5
export DBRELEASE_OVERRIDE=9.6.1

to esd.sh (maybe renamed to esd-db961.sh). ATLAS_DB_AREA is only required if the DBRelease is only in another release.
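
When comparing DBRelease numbers by eye it is easy to be misled by string ordering (9.10.0 is newer than 9.4.1). A small sketch of the comparison, assuming the version is always a dot-separated list of integers; the function names are hypothetical:

```python
def parse_version(version):
    """Turn a version string such as '9.4.1' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def needs_newer_dbrelease(task_dbrelease, default_dbrelease):
    """True if the task used a DBRelease newer than the release's default."""
    return parse_version(task_dbrelease) > parse_version(default_dbrelease)
```

Here task_dbrelease is the Vparams DBRelease value from the task page and default_dbrelease is whatever echo $DBRELEASE prints. If the function returns True, an override such as the one shown above is needed.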

If the DBRelease isn't available anywhere, it can be installed in a private area (requires ~700 MB). cd to that directory and:

% source /afs/cern.ch/atlas/software/pacman/pacman-latest/setup.sh
% pacman -allow trust-all-caches -get http://atlas.web.cern.ch/Atlas/GROUPS/DATABASE/pacman4/DBRelease:DBRelease-9.4.1

but it's probably simpler to request (on the hn-atlas-physics-software-validation list) the installation of the required DBRelease.

Locating the input ESD dataset using PANDA

Click on the ESD dataset (valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185/) under "Datasets for task" at the top, then select a sub-task dataset and scroll down to the file content section. Check enough files are available (>=5000 events should be fine for the multi-particle sample, though the more the better).
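
The number of events available can be estimated from the file count and the Events/file value on the task page. A rough sketch, assuming every file is full (the last file of a task may hold fewer events); the function names are hypothetical:

```python
def events_available(n_files, events_per_file=250):
    """Estimate the number of events, assuming every file is full."""
    return n_files * events_per_file


def enough_events(n_files, events_per_file=250, minimum=5000):
    """True if the estimated event count reaches the suggested minimum."""
    return events_available(n_files, events_per_file) >= minimum
```

For example, 39 files of 250 events give an estimate of 9750 events, comfortably above the 5000 suggested above.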

Check whether there is a replica available in CASTOR at CERN (eg. CERN-PROD_MCDISK). If so, select the replica link for one of the files and find the PFN (at the bottom of the page). Copy the directory part of the URL, eg. /castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185_tid122425_00

(leaving off srm://srm-atlas.cern.ch at the beginning and /ESD.122425._000002.pool.root.2 at the end) and list the directory using the lxplus command:

% nsls -l /castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185_tid122425_00
-rw-r--r--   1 atlas003 zp                390109819 Mar 28 19:09 ESD.122425._000002.pool.root.2
-rw-r--r--   1 atlas003 zp                413042976 Mar 25 22:53 ESD.122425._000003.pool.root.1
...
-rw-r--r--   1 atlas003 zp                444042654 Mar 26 06:53 ESD.122425._000040.pool.root.1
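
Trimming the PFN by hand is error-prone, so here is a minimal sketch of the same operation. The function name castor_dir_from_pfn is hypothetical, and it assumes the PFN always starts with the srm://srm-atlas.cern.ch prefix seen in this example.

```python
import posixpath

SRM_PREFIX = "srm://srm-atlas.cern.ch"


def castor_dir_from_pfn(pfn):
    """Strip the SRM prefix and the file name, leaving the CASTOR directory."""
    if not pfn.startswith(SRM_PREFIX):
        raise ValueError("unexpected PFN prefix: %r" % pfn)
    path = pfn[len(SRM_PREFIX):]
    return posixpath.dirname(path)
```

The returned string is what should be passed to nsls -l.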

Locating the input ESD dataset using DQ2

Sometimes PANDA takes some time to hear about completed jobs and/or replication. Instead, you can use DQ2 to query the Grid directly.

Set up DQ2 and get a Grid proxy.

% source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
... [lots of tedious announcements]

% voms-proxy-init -voms atlas
Cannot find file or dir: /afs/cern.ch/user/a/adye/.glite/vomses
Enter GRID pass phrase: ****
Your identity: /C=UK/O=eScience/OU=CLRC/L=RAL/CN=tim adye
Creating temporary proxy ........................................... Done
Contacting  lcg-voms.cern.ch:15001 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "atlas" Done
Creating proxy .................................................... Done

Then list the "container" dataset

% dq2-ls -L CERN-PROD_MCDISK -f -p valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185/

If the files are present, then their physical addresses, eg. srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185_tid122425_00/ESD.122425._000002.pool.root.2 will be printed. If they are not, then they will be listed as absent "[ ]".

My script ~adye/bin/dq2-ls-castor simplifies the output, listing just the CASTOR directory. Specify it with just the dataset name.

If the files aren't there, you might want to look for the task dataset.

% dq2-ls valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185\*

The task dataset has a suffix of _tid122425_00 (where the number is the task id - ignore datasets with an additional suffix such as _sub06234844). Try the dq2-ls -L CERN-PROD_MCDISK -f -p command on this dataset (without the trailing "/").
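
Picking the task dataset out of a long dq2-ls listing can be sketched as follows. The function name task_datasets is hypothetical, and the pattern assumes the suffix always has the form _tid<task id>_<digits> seen in this example.

```python
import re

# Matches names that END in _tid<task id>_<nn>; names with a further
# suffix (eg. _sub06234844) or a trailing "/" (containers) do not match.
TASK_DATASET = re.compile(r"_tid(\d+)_\d+$")


def task_datasets(names):
    """Return (name, task_id) pairs for the task datasets in a listing."""
    result = []
    for name in names:
        match = TASK_DATASET.search(name)
        if match:
            result.append((name, int(match.group(1))))
    return result
```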

Locating the input ESD dataset by knowing where to look

If that all seems rather cumbersome, you could just check the usual place in CASTOR for the output. This might not work if the DDM puts things somewhere else, so check with DQ2 if it doesn't find what you expect. My script ~adye/bin/dq2ls simplifies this a little by searching for all datasets that contain the given components in their name and printing the CASTOR directories it finds (run without arguments for a quick example - use the -l option to list the files), eg.

% ~adye/bin/dq2ls valid1/ESD PythiaZmumu,singlepart_mu1,singlepart_mu100 e380_s764_r1185
/castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185_tid122425_00
/castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.107199.singlepart_mu1.recon.ESD.e380_s764_r1185_tid122406_00
/castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.107233.singlepart_mu100.recon.ESD.e380_s764_r1185_tid122404_00

where the directory valid1 is the first word of the dataset name and the _tid122425 suffix comes from the task id (who knows about the _00 suffix).
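
The layout of those directories can be sketched as below. This is inferred only from the three example paths above, so treat it as an assumption rather than a DDM rule; the function name usual_castor_dir and the sequence parameter (for the unexplained _00) are hypothetical.

```python
MCDISK_ROOT = "/castor/cern.ch/grid/atlas/atlasmcdisk"


def usual_castor_dir(dataset, task_id, sequence="00"):
    """Guess the CASTOR directory for a dataset name and task id."""
    dataset = dataset.rstrip("/")
    fields = dataset.split(".")
    if len(fields) < 3:
        raise ValueError("unexpected dataset name: %r" % dataset)
    # First field is the project (valid1); the last two are the data
    # type (ESD) and the tag (e380_s764_r1185).
    project, data_type, tag = fields[0], fields[-2], fields[-1]
    return "%s/%s/%s/%s/%s_tid%d_%s" % (
        MCDISK_ROOT, project, data_type, tag, dataset, task_id, sequence)
```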

You can list all the files with nsls:

% nsls -l /castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185_tid122425_00

Copying ESD dataset to CERN CASTOR

If the dataset is not available at CERN, it may be best to wait until it is copied by the automatic replication system. However this only works for certain datasets (notably ones named valid*) and sometimes it gets stuck. In that case, you can copy the dataset to your local CASTOR directory by hand.

Find the dataset with DQ2.

% dq2-ls -f valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185/

valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185/
[ ]     ESD.122425._000009.pool.root.1  3C64C8DA-5238-DF11-BEB8-001D0930AD5E    ad:dc0f1573     460757546
[ ]     ESD.122425._000031.pool.root.1  7C194EC3-6238-DF11-9AE3-0019B9EACAEB    ad:72d985de     435311707
...
[ ]     ESD.122425._000011.pool.root.1  BE0E4C0D-7A38-DF11-BD5F-0019B9E84B24    ad:286a37db     440205819
total files: 39
local files: 0
total size: 16982567571
date: 2010-03-28 14:48:53

If that doesn't work, try the task dataset as described above.

Copy the dataset to your $CASTOR_HOME area. I have a script to do this. It should be run from a temporary area, eg. $TMPDIR (/tmp/adye in my case). It may take some time, so you probably want to run in the background (or in a batch job, though I haven't tried that), perhaps with my job script:

% cd $TMPDIR
% ~adye/bin/job -N ~adye/bin/dq2castor valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185/

The job script will put the logfile in ~/job.log* and will mail you when it is done (assuming you have a ~/.forward set up).

If that is successful (check the logfile: should say something like Done, Total:33 - Failed:0 near the top), you should have the dataset in $CASTOR_HOME/valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185/:

% nsls -l valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185
mrw-r--r--   1 adye     zp                390109819 Mar 28 19:29 ESD.122425._000002.pool.root.2
mrw-r--r--   1 adye     zp                413042976 Mar 28 19:29 ESD.122425._000003.pool.root.1
...
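
The logfile check can also be scripted. The sketch below assumes the summary line has the "Total:N - Failed:M" form quoted above; the function name copy_succeeded is hypothetical.

```python
import re

SUMMARY = re.compile(r"Total:\s*(\d+)\s*-\s*Failed:\s*(\d+)")


def copy_succeeded(log_text):
    """True if the log has a summary line with some files and no failures."""
    match = SUMMARY.search(log_text)
    if match is None:
        return False  # no summary found, so the copy cannot be trusted
    total, failed = int(match.group(1)), int(match.group(2))
    return total > 0 and failed == 0
```

It could be applied to the text of the ~/job.log* file for the copy job.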

Running the InDetPerformanceMonitoring analysis job over ESD files in CASTOR

Create a jobOptions file with the same name as the dataset (it doesn't have to be, but this helps keep track of things) and a .py suffix, to:

  1. specify the same DetDescrVersion as was used in the reco job (ATLAS-GEO-10-00-00), shown on the task status page.
  2. include esd.py.
  3. set ServiceMgr.EventSelector.InputCollections with the list of TUrls of the files in CASTOR (ie. castor:/castor/cern.ch/...). These are the files you found or copied in the previous sections.

This can be done automatically using ~adye/bin/mkjo with the dataset name, CASTOR directory, and geometry version on the command line, eg.

% mkjo valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185 /castor/cern.ch/grid/atlas/atlasmcdisk/valid1/ESD/e380_s764_r1185/valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185_tid122425_00 ATLAS-GEO-10-00-00

mkjo uses the template.py file you copied to your WorkArea/run directory and creates a jobOptions file with the name of the dataset (or whatever you specified as the first parameter) and a .py suffix.
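
For illustration, the three steps listed above could be automated roughly as follows. This is not the mkjo script or template.py, whose contents are not reproduced here. make_joboptions is a hypothetical helper, and writing DetDescrVersion as a plain variable ahead of the include is an assumption about how esd.py picks it up.

```python
def make_joboptions(castor_dir, file_names, geometry):
    """Return the text of a jobOptions file reading the given CASTOR files."""
    directory = castor_dir.rstrip("/")
    # TUrls take the form castor:/castor/cern.ch/...
    urls = ["castor:%s/%s" % (directory, name) for name in file_names]
    lines = [
        'DetDescrVersion = "%s"' % geometry,
        'include("esd.py")',
        "ServiceMgr.EventSelector.InputCollections = [",
    ]
    lines += ['    "%s",' % url for url in urls]
    lines.append("]")
    return "\n".join(lines) + "\n"
```

The file names would come from the nsls listing of the CASTOR directory, and the returned text would be written to a file named after the dataset with a .py suffix.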

For real data, you need to edit the valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185.py file and change DataSource according to the comments. You may also need to change (or update) the good runs list, eg. for field-off data add:

goodRunsList = "goodruns_data09_900GeV_nosol"

Now everything should be set up to run over the ESD files. This can be tested interactively:

% cd $TMPDIR
% ~/testarea/15.6.7/WorkArea/run/esd.sh valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185 100

where esd.sh takes the dataset name (without .py suffix) and maximum number of events (100 takes a few minutes, so is OK to run interactively). Remember to use esd-db961.sh if you made a special version to specify a non-default DBRelease.

If it complains about being unable to read the files from CASTOR because of lack of privilege, you should check that you have a Kerberos ticket with the klist command. Normally that should be set up when you give a password to login to lxplus, but if that doesn't work, use kinit user@CERN.CH to get one explicitly.

Assuming that works OK, you will have a histogram file, with a name like InDetStandardPlots-valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185-100-lxplus312-32270.root in your tmp directory (it has a suffix with the machine name and PID to ensure uniqueness). You can use ROOT's TBrowser to check a few quantities (in the /IDPerformanceMon/Tracks/SelectedGoodTracks tree) for gross screw-ups.

% root -l InDetStandardPlots-valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185-100-lxplus312-32270.root
root [0]
Attaching file InDetStandardPlots-valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185-100-lxplus312-32270.root as _file0...
root [1] TBrowser b
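
If you end up with many interactive output files, the file name can be split back into its parts. The sketch below assumes the interactive naming pattern shown above (dataset, event limit, machine name, PID); batch output uses an LSF job id suffix instead and is not handled. The function name is hypothetical.

```python
import re

PLOT_FILE = re.compile(
    r"^InDetStandardPlots-(?P<dataset>.+)-(?P<events>\d+)"
    r"-(?P<host>[^-]+)-(?P<pid>\d+)\.root$"
)


def parse_plot_file_name(name):
    """Split an interactive histogram file name into its components."""
    match = PLOT_FILE.match(name)
    if match is None:
        raise ValueError("not an interactive plot file name: %r" % name)
    return {
        "dataset": match.group("dataset"),
        "events": int(match.group("events")),
        "host": match.group("host"),
        "pid": int(match.group("pid")),
    }
```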

You can now submit a batch job to run over the full dataset:

% cd ~/testarea/15.6.7/WorkArea/run
% bsub -q 8nh esd.sh valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185

Note that no event limit was specified here. You might want to specify an event limit to match sample sizes if the test or reference sample has fewer events than the other (eg. because the production or replication is not yet finished). Most InDetPerformanceMonitoring histograms are independent of the number of events (except for the errors), but a few are not normalised so can be harder to compare if there is a large disparity in the sample sizes.

10000 Z→μμ events currently take about 2:45 hours of normalised CPU time (KSI2K) or 2:36 of real time (assuming you get a job slot reasonably quickly).

When the job is done, check the bottom of the logfile for crashes and that the histogram file is present. When run in batch, esd.sh writes the output to your run directory and gives the file a suffix of the LSF job id. The logfile also includes the InDetRecStatistics summary table (see that web page for details), which can be checked.

You might find my ~adye/bin/mvlog script useful: it matches LSFJOB_*/STDOUT logfiles with root files and gives them the same name. Use mvlog -h for help.

Comparing InDetPerformanceMonitoring plots using DCube

It is usually convenient to check the results on the web. It is also useful to keep these in a standard location for archival purposes. The AFS directory /afs/cern.ch/atlas/groups/validation/Tracking, which is accessible via the URL http://atlas-computing.web.cern.ch/atlas-computing/links/PhysValDir/Tracking/, can be used for this purpose.

Create a subdirectory for this validation (I use the date and a sequence letter YYMMDDX, eg. 090818a) and copy the ROOT and log files for test and reference jobs there.
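
The YYMMDDX naming can be generated rather than worked out by hand. A small sketch, assuming the sequence letter runs from a to z and that the existing names are those already present in the Tracking directory; the function name is hypothetical.

```python
import datetime
import string


def next_validation_dir(day, existing):
    """Return the first unused YYMMDDX name for the given date."""
    prefix = day.strftime("%y%m%d")
    for letter in string.ascii_lowercase:
        name = prefix + letter
        if name not in existing:
            return name
    raise RuntimeError("all sequence letters used for %s" % prefix)
```

For example, with 090818a already present, next_validation_dir(datetime.date(2009, 8, 18), ["090818a"]) gives 090818b.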

The DCube comparison can be made using another script:

~adye/bin/dcube_val.sh valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1193 valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185

where the reference dataset (valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1193) and test dataset (valid1.105145.PythiaZmumu.recon.ESD.e380_s764_r1185) are specified (in that order) on the command line. dcube_val.sh will search for ROOT files containing those names, run DCube, and set up a subdirectory named for the test dataset (this can be overridden by the -o DIR option, eg. if comparing multiple references to the same test dataset in the same dated directory).

If looking at real data, use ~adye/bin/dcube_val.sh -c ~adye/bin/dcube_val_data.xml instead. This excludes plots that require MC truth from the comparison.

The DCube output can be viewed in the subdirectory using a web browser. The default configuration displays all the plots (apart from some intermediate plots) for SelectedGoodTracks. That's 436 plots, though usually only the first 60 (corresponding to the RTT plots) need be checked in detail (the others are useful for tracking down problems when unexpected differences appear).

An example of some of these comparisons can be seen here.

-- TimAdye - 6 Apr 2010

Topic revision: r18 - 2010-04-06 - TimAdye
 