ARATopAnalysis

Introduction

ARATopAnalysis is a physics analysis package that uses AthenaROOTAccess (ARA) to read ESD/AOD input as a transient ROOT tree and performs a basic $t \bar{t}$ candidate selection. The offline objects are selected by a set of selector classes. The transient tree itself is packaged in the ARATree object, and the event information required to reconstruct top pairs is held in an EventData object. The selectors access only EventData, so they are independent of how the branches of the transient tree are read: EventData is the interface between the data (ARATree) and its accessors (the selectors).

Intended Users

This package is intended for users starting out on top physics analysis using ARA, although it should be relatively easy to tailor this package to suit any other analysis. The users are required to write their own steering class which is able to access all the information from the transient tree. Once this is done the user has a choice of using/modifying the selector classes already provided or writing their own object selection routines.

List of Tags

The newest tag always contains the latest updates; please use it. The package is hosted in SVN (see "Checkout the package from SVN"). Tags for specific Athena releases are listed in the tables below:

| Athena | AthenaROOTAccess | ARATopAnalysis | AZAnalysis | Comment |
| 15.5.2 | AthenaROOTAccess-00-05-57 | ARATopAnalysis-00-03-02 | AZAnalysis-00-00-05 | Added TopoVariables, TopoTreeMaker |

| Athena | AthenaROOTAccess | ARATopAnalysis | Comment |
| 15.4.0 | AthenaROOTAccess-00-05-55 | ARATopAnalysis-00-02-06 | fixes to use P4Helpers |
| 14.4.0 | AthenaROOTAccess-00-05-31 | ARATopAnalysis-00-00-00 | First import |
| 14.5.0 | AthenaROOTAccess-00-05-33 | ARATopAnalysis-00-00-01 | No updates |
|        |                           | ARATopAnalysis-00-00-02 | added dummy version.cmt |
|        |                           | ARATopAnalysis-00-00-03 | added TrigStudy class |
|        |                           | ARATopAnalysis-00-00-04 | added TopFitter class |
| 15.0.0 | AthenaROOTAccess-00-05-43 | ARATopAnalysis-00-01-03 | added TruthSelector class; modifications to compile with gcc4.3 |
| 15.1.0 | AthenaROOTAccess-00-05-46-01 | ARATopAnalysis-00-01-04 | added TagAndProbe class |
| 15.2.0 | AthenaROOTAccess-00-05-49 | ARATopAnalysis-00-01-05 | verified cutflow for PUB note, merged e,mu |
|        |                           | ARATopAnalysis-00-01-06 | added an example showing the use of EventData |
| 15.2.0 | AthenaROOTAccess-00-05-49 | ARATopAnalysis-00-02-00 | Using TriggerDecisionTool, added Top/W containers, first SVN version |
|        |                           | ARATopAnalysis-00-02-02 | fixed crash, added myMain.cxx |
|        |                           | ARATopAnalysis-00-02-04 | added runARA.py module, NTupleHandler |
|        |                           | ARATopAnalysis-00-02-05 | added JESSystematic |

Analysis Steering

Steering the analysis involves the following steps:
  • Access all the required branches from ARATree
  • Store this information in EventData
  • Loop over all events and perform the selection
  • Save the resulting subset of events, histograms etc.

An analysis base class (myAnalysis) declares methods for these steps. The methods are pure virtual: every analysis steering class (TopAnalysis is one of them) inherits from this base class and implements them. The base class is declared a friend of ARATree so that the steering classes can carry out the steps above.
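
The steering pattern described above can be sketched in a few lines of Python. This is only an illustration of the base-class idea, not the package's C++ code: the class names Analysis and ToyTopAnalysis, the method names, the list-of-dicts "tree" and the four-jet requirement are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Analysis(ABC):
    """Hypothetical stand-in for the myAnalysis base class."""

    def __init__(self, tree):
        # 'tree' stands in for ARATree; here it is simply a list of per-event dicts.
        self.tree = tree
        self.selected = []

    @abstractmethod
    def fill_event_data(self, entry):
        """Collect the information the selection needs into an event record."""

    @abstractmethod
    def select(self, event_data):
        """Return True if the event passes the selection."""

    def run(self, n_events=-1):
        # Loop over the requested number of events (-1 means all of them).
        entries = self.tree if n_events < 0 else self.tree[:n_events]
        for entry in entries:
            event_data = self.fill_event_data(entry)
            if self.select(event_data):
                self.selected.append(event_data)
        return self.selected


class ToyTopAnalysis(Analysis):
    """Toy steering class: keeps events with at least four jets (illustrative cut)."""

    def fill_event_data(self, entry):
        return {"n_jets": len(entry["jets"])}

    def select(self, event_data):
        return event_data["n_jets"] >= 4


toy_tree = [{"jets": [1, 2, 3, 4]}, {"jets": [1]}, {"jets": [1, 2, 3, 4, 5]}]
selected = ToyTopAnalysis(toy_tree).run()
print(len(selected), "events selected")
```

The base class owns the event loop, so a new analysis only has to supply the two abstract methods.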

Configuration

User analysis usually involves selecting physics objects with cuts. So that the selection criteria can be changed without recompiling or rebuilding the package, the selectors take their settings from a configuration file (an ASCII text file). A Config class reads the user configuration at run time and passes it to the selectors. Typical settings include acceptance regions in $ \eta-\phi $, $ p_T $ cuts, and the names of the transient-tree branches to read.

Extras

Lastly, it is possible to store and retrieve a subset of selected objects or new objects; this is done by the ObjCollection class. Message service is provided by SLogger (originally from the SFrame analysis package), which lets the user write messages to the standard output and error streams at different levels of detail.

Class Overview

Main Classes

  • ARATree: Has the ARA transient tree and methods to access the branches
  • EventData: Contains only relevant transient tree branches to reconstruct $ t \bar{t} $ pairs from their semi-leptonic decay
  • myAnalysis: Analysis base class. All steering classes (analysers) must inherit from this class
  • TopAnalysis: Analyzer for top selection based on top cross section CSC note (T6)
  • ObjCollection: Stores collection of user selected objects. The objects can be stored / retrieved by their unique names
  • myAnalysisTools: A helper class used to sort collections of objects by one of their properties (descending order in $ p_T $, for example)
  • Config: Reads the user configuration (cuts) from an ASCII file
  • SLogger: Holds the message stream objects
  • SLogWriter: Writes the formatted messages to standard output with various levels of detail
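
As a rough illustration of what ObjCollection and myAnalysisTools are described as doing (storing collections under unique names and sorting by $ p_T $), here is a minimal Python sketch. The method names store and retrieve, the dict-based objects and the collection name "GoodJets" are assumptions made for the example; the real classes are C++ and their interfaces may differ.

```python
class ObjCollection:
    """Hypothetical name-keyed store, loosely modelled on the description above."""

    def __init__(self):
        self._store = {}

    def store(self, name, objects):
        # Names must be unique, so refuse to overwrite an existing collection.
        if name in self._store:
            raise KeyError(f"collection '{name}' is already stored")
        self._store[name] = list(objects)

    def retrieve(self, name):
        return self._store[name]


def sort_by_pt(objects):
    """Return the objects in descending order of transverse momentum."""
    return sorted(objects, key=lambda obj: obj["pt"], reverse=True)


jets = [{"pt": 35.0}, {"pt": 82.5}, {"pt": 51.0}]
collection = ObjCollection()
collection.store("GoodJets", sort_by_pt(jets))
leading = collection.retrieve("GoodJets")[0]
print(leading["pt"])  # leading (highest-pT) jet: 82.5
```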

Helper Classes

  • Histogram: Plot kinematic properties of objects
  • TopoKineVars: Calculate some event shape / topological quantities specific to top decay and store them in a rootuple
  • JESSystematic: Scales the 4-momenta of all the jets in an event and re-computes the missing ET
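
The jet energy scale variation can be illustrated with a short Python sketch. It assumes that the missing ET is re-computed by subtracting the change in the summed jet transverse momentum from the original missing-ET components; how JESSystematic actually does this is not stated here, so treat the formula, the function name and the numbers as assumptions.

```python
import math


def scale_jets_and_met(jets, met_x, met_y, scale):
    """Scale jet 4-momenta (px, py, pz, E) and propagate the change to the MET.

    Assumption: MET_new = MET_old - sum(scaled jet pT - original jet pT),
    applied separately to the x and y components.
    """
    scaled = [(px * scale, py * scale, pz * scale, e * scale)
              for (px, py, pz, e) in jets]
    delta_px = sum(s[0] - j[0] for s, j in zip(scaled, jets))
    delta_py = sum(s[1] - j[1] for s, j in zip(scaled, jets))
    new_met_x = met_x - delta_px
    new_met_y = met_y - delta_py
    return scaled, new_met_x, new_met_y, math.hypot(new_met_x, new_met_y)


# Two toy jets (GeV) and a MET that balances them exactly before scaling.
jets = [(100.0, 0.0, 0.0, 100.0), (0.0, 50.0, 0.0, 50.0)]
scaled_jets, met_x, met_y, met = scale_jets_and_met(jets, -100.0, -50.0, 1.1)
print(met_x, met_y, met)
```

In this toy event the scaled jets and the re-computed MET still balance, which is the point of propagating the shift.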

Container Classes

  • TopContainer: To hold leptonic $ t \rightarrow (W \rightarrow \ell \nu) b$ and hadronic top $ t \rightarrow (W \rightarrow qq) b$

Selector Classes

  • ElectronSelector: Select reconstructed electrons
  • JetSelector: Select reconstructed jets
  • METSelector: Select missing $ E_T $
  • MuonSelector: Select reconstructed muons
  • TopSelector: Reconstruct a top pair from their semi-leptonic decay
  • TriggerSelector: Select events based on certain triggers
  • TruthSelector: Select MC truth information for truth-matching
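
A selector of this kind boils down to applying configured cuts to a collection of objects. The Python sketch below shows the idea for muons; the function name, the dict-based muon records and the cut values (20 GeV, |eta| < 2.5) are illustrative assumptions, not the cuts used by MuonSelector.

```python
def select_muons(muons, pt_min, abs_eta_max):
    """Keep muons above the pT threshold and inside the |eta| acceptance."""
    return [mu for mu in muons
            if mu["pt"] > pt_min and abs(mu["eta"]) < abs_eta_max]


# Toy muons: pT in GeV, pseudorapidity eta.
muons = [
    {"pt": 45.0, "eta": 0.3},
    {"pt": 12.0, "eta": -1.1},   # fails the pT cut
    {"pt": 30.0, "eta": 2.7},    # outside the eta acceptance
]

# Hypothetical cut values; in the package these would come from the configuration file.
good_muons = select_muons(muons, pt_min=20.0, abs_eta_max=2.5)
print(len(good_muons))
```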

Example 1. Running TopAnalysis

For running the top analysis out of the box, one has to check out the package (either from head or a specific tag) and compile it first. The default analysis can be run out of the box with tag ARATopAnalysis-00-02-00 or later. A python module (runARA.py aliased to runARA) is defined to facilitate running the analysis.

    • compile and create a link to config file
      cd $TestArea/ARATopAnalysis/cmt/ && gmake && cd $TestArea
      source $TestArea/ARATopAnalysis/cmt/setup.sh
      ln -s ARATopAnalysis/test/config/analysis.ttbar.config analysis.config

    • show runARA options:
      runARA -h or runARA --help or runARA.py -h
            Options:
              -h, --help            show this help message and exit
              -f FILENAME, --file=FILENAME
                                    read a set of AOD/ESD/DPD's from inputfile
              -n EVENTS,   --nevts=EVENTS
                                    number of events to analyze
            
    • run over 100 events, with input files listed in test.ttbar.list:
      runARA -n 100 -f /afs/cern.ch/user/v/venkat/public/test.ttbar.list
    • run over all events
      runARA -f /afs/cern.ch/user/v/venkat/public/test.ttbar.list
    • run over all events from two input files. Notice the comma after each input file in the first command and no commas in the second:
      echo /raid01/venkat/dataset/check/AOD.026357._00511.pool.root.1, /raid01/venkat/dataset/check/AOD.026357._00540.pool.root.1 | runARA
      runARA /raid01/venkat/dataset/check/AOD.026357._00511.pool.root.1 /raid01/venkat/dataset/check/AOD.026357._00540.pool.root.1
    • Using runARA.exe directly. Use with caution or be prepared for consequences!
      runARA.exe 100 /afs/cern.ch/user/v/venkat/public/test.ttbar.list
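
For orientation, the options listed above could be handled along the following lines. This is a sketch written with argparse, not the actual runARA.py: the function names, the default of -1 for "all events" and the handling of piped, comma-separated input are assumptions made for the example.

```python
import argparse


def build_parser():
    """Option parser mirroring the -f/--file and -n/--nevts options shown above."""
    parser = argparse.ArgumentParser(prog="runARA")
    parser.add_argument("-f", "--file", dest="filename", metavar="FILENAME",
                        help="read a set of AOD/ESD/DPD's from inputfile")
    # Assumption: -1 stands for "run over all events" when -n is not given.
    parser.add_argument("-n", "--nevts", dest="events", metavar="EVENTS",
                        type=int, default=-1, help="number of events to analyze")
    parser.add_argument("inputs", nargs="*",
                        help="input files given directly on the command line")
    return parser


def split_piped_inputs(text):
    """Split a comma-separated list of file names, as piped in via echo."""
    return [name.strip() for name in text.split(",") if name.strip()]


args = build_parser().parse_args(["-n", "100", "-f", "test.ttbar.list"])
piped = split_piped_inputs("first.pool.root.1, second.pool.root.1\n")
print(args.events, args.filename, piped)
```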

Example 2. Standalone test

Setup Athena Release

Refer to WorkBookSetAccount for how to set up your Athena account for a particular release. This sets up your release and a work area pointed to by the $TestArea environment variable. You should also have $CMTROOT set up properly and $CMTPATH set to include your working area. Check that these are set properly.

Checkout the package from CVS

Assuming you have set up your release and are using the bash shell, execute the following commands. Refer to the list of tags for a given Athena release. You will also need the usual AthenaROOTAccess tags for the same release. In the commands below, my top-level working directory is called work.

cd $TestArea
work> export CVSROOT=isscvs.cern.ch:/local/reps/atlas
work> export CVS_RSH=ssh
work> cvs co -d ARATopAnalysis -r ARATopAnalysis-xx-yy-zz atlas/groups/Arizona/ARATopAnalysis
Alternatively, if you are using cmt you can do the following
cd $TestArea
work> export CMTCVSOFFSET=groups/Arizona
work> cmt co ARATopAnalysis      #from head OR
work> cmt co -r ARATopAnalysis-xx-yy-zz ARATopAnalysis     #specific tag
work> export CMTCVSOFFSET=offline

Checkout the package from SVN

cd $TestArea
export SVNGRP=svn+ssh://svn.cern.ch/reps/atlasgrp/Institutes/Arizona
work> svn co $SVNGRP/ARATopAnalysis/tags/ARATopAnalysis-xx-yy-zz ARATopAnalysis  (specific tag)
work> svn co $SVNGRP/ARATopAnalysis/trunk ARATopAnalysis  (from trunk [HEAD])

Alternatively, if you are using cmt you need to use the following version when setting up cmt

cmthome> source /afs/cern.ch/sw/contrib/CMT/v1r20p20090520/mgr/setup.sh
cmthome> cmt config

Now you can do the usual setup of Athena and then:

cd $TestArea
work> export SVNROOT=svn+ssh://svn.cern.ch/reps/atlasgrp/Institutes/Arizona
work> cmt co ARATopAnalysis      #from head OR
work> cmt co -r ARATopAnalysis-xx-yy-zz ARATopAnalysis     #specific tag
work> export SVNROOT=svn+ssh://svn.cern.ch/reps/atlasoff

Some useful SVN commands

Updating, adding/removing files to a package

work> cd $TestArea/ARATopAnalysis
ARATopAnalysis> export SVNGRP=svn+ssh://svn.cern.ch/reps/atlasgrp
ARATopAnalysis> svn add ARATopAnalysis/TopContainer.h  (add)
ARATopAnalysis> svn rm ARATopAnalysis/TruthContainer.h  (remove) 
ARATopAnalysis> svn update  (update) 
ARATopAnalysis> svn ci -m "See ChangeLog"  (commit) 
ARATopAnalysis> svn cp . $SVNGRP/Institutes/Arizona/ARATopAnalysis/tags/ARATopAnalysis-00-02-00  (tag)

Creating and importing a package (first time)

MyDir> ls 
HLTAnalysis

MyDir> svn ls $SVNGRP/Institutes/Arizona/AZAnalysis
MyDir> svn mkdir $SVNGRP/Institutes/Arizona/HLTAnalysis  -m "creating new package"
MyDir> svn mkdir $SVNGRP/Institutes/Arizona/HLTAnalysis/trunk -m "adding trunk"
MyDir> svn mkdir $SVNGRP/Institutes/Arizona/HLTAnalysis/tags -m "adding tags"
MyDir> svn mkdir $SVNGRP/Institutes/Arizona/HLTAnalysis/branches -m "adding branches"
MyDir> cd HLTAnalysis/
HLTAnalysis> ls 
ChangeLog  HLTAnalysis  cmt  config  exec  src

HLTAnalysis> svn import -m "HLTAnalysis first import" . $SVNGRP/Institutes/Arizona/HLTAnalysis/trunk  (importing first time)

Prepare a configuration file

The program reads its input parameters from a plain-text configuration file. An example, ARATopAnalysis/test/config/testARA.config, is provided. In this example the reconstructed muon branch "StacoMuonCollection" is accessed and, for all events with one or more muons, the $p_T$ spectrum and multiplicity are plotted. You can either create a soft link to that config file or set an environment variable that points to it. The only requirement is that the configuration file the program picks up is named analysis.config. If there is more than one analysis.config file in the current search path (starting from your work area), the most recently modified one is picked up.

work> cat ARATopAnalysis/test/config/testARA.config
##############  myTestARA Configuration ################

# This is a comment line -- will be ignored

#set message level
myTestARA.DebugLevel: INFO

#select muons from this branch
myTestARA.MuonsFromBranch: StacoMuonCollection
myTestARA.ElectronsFromBranch: ElectronAODCollection

## End of configuration
There are two options: either create a symbolic link to the configuration file or set the environment variable ANALYSIS_CONFIG to point to it.
work> ln -s ARATopAnalysis/test/config/testARA.config analysis.config
                           OR equivalently
work> export ANALYSIS_CONFIG=ARATopAnalysis/test/config/testARA.config
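
To make the file format concrete, the sketch below parses "key: value" lines of the kind shown in testARA.config, skipping blank lines and lines starting with #. It is a minimal Python illustration, not the package's Config class, and the function name parse_config is an assumption.

```python
def parse_config(text):
    """Parse 'Key: value' lines; blank lines and '#' comment lines are ignored."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only, so values may themselves contain colons.
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip()
    return settings


example = """\
##############  myTestARA Configuration ################

# This is a comment line -- will be ignored

#set message level
myTestARA.DebugLevel: INFO

#select muons from this branch
myTestARA.MuonsFromBranch: StacoMuonCollection
myTestARA.ElectronsFromBranch: ElectronAODCollection

## End of configuration
"""

settings = parse_config(example)
print(settings["myTestARA.MuonsFromBranch"])  # StacoMuonCollection
```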

Compile and run a test

Almost all of the requirements for this package are similar to those of the UserAnalysis package. To compile, do the following.

work> cd ARATopAnalysis/cmt
cmt> cmt config
cmt> source setup.sh
cmt> cmt bro gmake
cmt> cd $TestArea
Alternatively, you can use the provided script, which compiles all packages in $TestArea. This is especially useful if you have a large number of packages checked out and need to compile all of them. The script uses cmt broadcast gmake and checks that all packages compile correctly.
cd $TestArea
work> ln -s ARATopAnalysis/share/compileARA.sh
work> ./compileARA.sh -f         # without prompting the user OR
work> ./compileARA.sh      # user is prompted before compiling each package

This will create a directory $TestArea/InstallArea with links to the executables, libraries etc. required to run the example. If everything goes well during compilation, you should be able to run the test program, called runTest.exe. To test, you will need a set of input files (AODs or ESDs) in some location. The test program reads a list of input files (listed in a text file, one per line) and creates a transient tree (or chain of trees). It then loops over the number of events (N > 0) specified on the command line, or over all events if you specify -1. You can get a sample AOD/ESD file and specify its location in a text file.

work> ls ./ARATopAnalysis/$CMTCONFIG/runTest.exe
work> runTest.exe
--> Usage: runTest.exe <nEvents> <sampleName>
work> ls /path/to/some.AOD.pool.root.1 > myList.txt
work> cat myList.txt
/path/to/some.AOD.pool.root.1
work> runTest.exe 10 myList.txt

Program Output

The user should see the following output:
INFO runTest.exe : Initializing ARA transient tree. Could take a while, please wait...
--> reading 1 files from /path/to
--> some.AOD.pool.root.1
type/key : HLT / TrigRoiDescriptorCollection to be aliased to: HLT_Roi
type/key : HLT_TrigT2CaloEgamma / TrigRoiDescriptorCollection to be aliased to: HLT_TrigT2CaloEgamma_Roi
type/key : HLT_TrigT2CaloJet / TrigRoiDescriptorCollection to be aliased to: HLT_TrigT2CaloJet_Roi
type/key : HLT_TrigT2CaloTau / TrigRoiDescriptorCollection to be aliased to: HLT_TrigT2CaloTau_Roi
type/key : HLT / MuonFeatureContainer to be aliased to: HLT_Mu
type/key : HLT / IsoMuonFeatureContainer to be aliased to: HLT_IsoMu
type/key : HLT / CombinedMuonFeatureContainer to be aliased to: HLT_CombMu
type/key : HLT_TrigT2CaloJet / TrigT2JetContainer to be aliased to: HLT_TrigT2CaloJet_T2Jet
type/key : HLT / TrigTauContainer to be aliased to: HLT_tau
type/key : HLT / TileMuFeatureContainer to be aliased to: HLT_TileMu
type/key : HLT / TileTrackMuFeatureContainer to be aliased to: HLT_TileTrackMu
type/key : HLT / TrigEMClusterContainer to be aliased to: HLT_EMCl
type/key : HLT_TrigT2CaloEgamma / TrigEMClusterContainer to be aliased to: HLT_TrigT2CaloEgamma_EMCl
type/key : HLT_TrigT2CaloTau / TrigTauClusterContainer to be aliased to: HLT_TrigT2CaloTau_TauCl
type/key : HLT / TrigTauTracksInfoCollection to be aliased to: HLT_TauTrInfo
type/key : HLT / TrigL2BphysContainer to be aliased to: HLT_L2Bphys
type/key : HLT / TrigInDetTrackCollection to be aliased to: HLT_InDetTrk
type/key : HLT / CaloClusterContainer to be aliased to: HLT_CaloCl
type/key : HLT / CaloShowerContainer to be aliased to: HLT_CaloSh
type/key : HLT_egamma / egammaContainer to be aliased to: HLT_egamma_eg
type/key : HLT / TrigTrackCountsCollection to be aliased to: HLT_TrkCount
type/key : HLT / JetCollection to be aliased to: HLT_JetColl
type/key : HLT_egamma / egDetailContainer to be aliased to: HLT_egamma_egDet
JobIDSvc.Servic...VERBOSE ServiceLocatorHelper::createService: found service JobIDSvc
AthenaHitsVector DEBUG initialized.

INFO runTest.exe : ARA Initialization: RealTime: 21.3409 s CpuTime: 21.04 s
INFO runTest.exe : Looping over events
INFO Config : Reading configuration file: ./analysis.config
INFO myTestARA : ----------------- Configuration ------------------------------
INFO myTestARA : Electrons From Branch : ElectronAODCollection
INFO myTestARA : Muons From Branch : StacoMuonCollection
INFO myTestARA : DebugLevel : INFO
INFO myTestARA : --------------------------------------------------------------
INFO myTestARA : Event #: 0 6 kB
INFO myTestARA : Event #: 1 2 kB
INFO myTestARA : Event #: 2 4 kB
INFO myTestARA : Event #: 3 10 kB
INFO myTestARA : Event #: 4 7 kB
INFO myTestARA : Event #: 5 4 kB
INFO myTestARA : Event #: 6 2 kB
INFO myTestARA : Event #: 7 10 kB
INFO myTestARA : Event #: 8 2 kB
INFO myTestARA : Event #: 9 3 kB
INFO runTest.exe : Analysed events: 10
INFO runTest.exe : RealTime: 0.405415 seconds. CpuTime: 0.09 seconds
INFO runTest.exe : Execution time: 24.6661 events / RealTime second. 111.111 events / CpuTime second.


It will also create a ROOT file called myTestARA.hists.root in the working area, which contains two histograms - the muon multiplicity and muon $p_T$ histograms. The histograms obtained by running this example are shown below.

[Figures: m_muon_all_n.gif (muon multiplicity, m_muon_all_n) and m_muon_pt.gif (muon $p_T$, m_muon_pt)]




Example 3. Running on the grid

Submitting a grid job using prun

The PandaRun twiki describes how to submit general jobs on the grid. The user can submit ROOT (CINT, C++, pyROOT), ARA, Python, user executables or even shell scripts. Using this flexibility, we submit a general job that runs a user executable, in our case gridrunARA.exe. It is a "grid-aware" version of the test program explained in example 1. The submission command is shown below. Before you issue prun, make sure you have a valid membership of a VO (Virtual Organization; here ATLAS) and have obtained and installed the PandaRun package. The output dataset (which will hold all your output) must be named following the convention userNN.FirstNameLastName.XYZ, where NN is the last two digits of the current year, FirstNameLastName is the user's first and last name, and XYZ is a description of your output data. For more options refer to the examples.
work> prun --exec "source gridcompileARA.sh 14.4.0 AtlasOffline; echo %IN | gridrunARA.exe" \
--athenaTag 14.4.0 --inDS mc08.105200.T1_McAtNlo_Jimmy.recon.AOD.e357_s462_r579 \
--outDS user09.VenkateshKaushik.try25 --nFiles 5 --cloud FR --outputs "my*.root"
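
The output dataset naming convention is easy to get wrong by hand, so a small helper can build and check the name. The Python sketch below is only an illustration: the function names and the regular expression are one reading of the userNN.FirstNameLastName.XYZ convention, not something provided by prun.

```python
import re


def make_out_ds(first_name, last_name, description, year):
    """Build an output dataset name of the form userNN.FirstNameLastName.XYZ."""
    return f"user{year % 100:02d}.{first_name}{last_name}.{description}"


# Assumed pattern: 'user' + two digits, a name made of letters, then a free description.
OUT_DS_PATTERN = re.compile(r"^user\d{2}\.[A-Za-z]+\.\S+$")


def follows_convention(name):
    """Return True if the name matches the assumed pattern above."""
    return OUT_DS_PATTERN.match(name) is not None


out_ds = make_out_ds("Venkatesh", "Kaushik", "try25", 2009)
print(out_ds)  # user09.VenkateshKaushik.try25
print(follows_convention(out_ds))
```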

Checking Job Status

Using the command line interface

Once you issue the above command, you will see the following output:
query files in dataset:mc08.105200.T1_McAtNlo_Jimmy.recon.AOD.e357_s462_r579
gathering files under /raid01/venkat/testarea/TopAnalysis/14.5.0/TrigARA
upload sources
submit
===================
JobID : 31
Status : 0
> build
PandaID=23480737
> run
PandaID=23480738

To check the status of your job, you can use either the web interface or the command line interface. On the command line, start pbook, a book-keeping tool, and call its show command with the JobID.
work> pbook
Any general job is split into two parts: the "build" stage and the "run" stage that follows it. If the build job fails, the run job is killed. Once submitted, the build job shows the status "running", whereas the run job shows "defined". You might see a job status set to "holding" between stages. Once a job is finished, its status changes to "finished".
>>> show(31)

...
...
jobStatus : running
defined : 1
holding : 1
>>> show(31)

...
...
jobStatus : running
finished : 1
running : 1
>>> show(31)

INFO : Getting status for JobID=31 ...
INFO : Updated JobID=31
======================================
JobID : 31
type : prun
PandaID : 23480737-23480738
nJobs : 1 + 1(build)
site : ANALY_TOKYO
cloud : FR
inDS : mc08.105200.T1_McAtNlo_Jimmy.recon.AOD.e357_s462_r579
outDS : user09.VenkateshKaushik.try25
libDS : user09.VenkateshKaushik.lib._1232292662.66.lib.tgz
retryID : 0
provenanceID : 0
creationTime : 2009-01-18 15:31:05
lastUpdate : 2009-01-18 16:28:00
params : --exec "source gridcompileARA.sh 14.4.0 AtlasOffline; echo %IN gridrunARA.exe" --athenaTag 14.4.0 --inDS mc08.105200.T1_McAtNlo_Jimmy.recon.AOD.e357_s462_r579 --outDS user09.VenkateshKaushik.try25 --nFiles 5 --cloud FR --outputs "my*.root"
jobStatus : frozen
finished : 2

Using the web interface

The panda monitor web interface can be used to find out the status of jobs. The book-keeping and the debug-jobs sections are especially useful. Given below are the screen-shots of the different stages of job execution for the user analysis job above. Once the job finishes (or fails) an e-mail message will be sent to the user notifying the completion of the job.


Build Stage:
runningbuild.jpg

Run Stage:
runningjob.jpg


Finished stage:
runfinished.jpg


Bugs/Known Issues

  • TriggerDecisionTool cannot find valid IOV for a given trigger configuration - (mostly harmless)
Did not find valid IOV for _TRIGGER_LVL1_Menu
Did not find valid IOV for _TRIGGER_HLT_Menu
Did not find valid IOV for _TRIGGER_LVL1_Lvl1ConfigKey
Did not find valid IOV for _TRIGGER_HLT_HltConfigKeys
Did not find valid IOV for _TRIGGER_LVL1_Prescales
TrigDecisionToo...  FATAL ERROR: Loading of new trigger configuration for run/lb = 0/0 failed

ToDo List

  • Use HitFit for constrained fit and compute DONE
  • Single muon trigger efficiency from Z sample DONE
  • Use trigger decision tool instead of trigger bit decoder DONE
  • Add provision for storing JES correction (nominal, plus minus) jets DONE
  • Add Ntuple making capability -- Thanks Caleb!
  • Cleanup access to ObjCollection -- Thanks Caleb!
    • Removed ObjCollection and EventData classes.
    • Replaced with DataStore to store user/ARATree objects

  • Add muon isolation study
  • Muon+Jet trigger efficiency
  • Estimate QCD from template / matrix method
  • Add TauContainer to user objects


Responsible: Venkat Kaushik

Major updates: --Main.VenkateshKaushik 16 Jan 2009


Review: Never reviewed

Topic revision: r31 - 2010-12-08 - VenkateshKaushik