Chapter 4: CMSSW Framework in Data Analysis
4.1 Data Analysis in the Full Framework
Goals of this page
This page provides an up-to-date overview of the role of the CMSSW Framework (also known as the Full Framework) in a user's analysis.
Introduction
The Full Framework is CMS's main tool for data processing and is thus intimately connected with a user's analysis in a number of ways:
- it is used to produce data `upstream' of the user's analysis: in HLT, Reconstruction, and Skimming
- it is used in PAT-tuple production in group and user skims
- it can also be used for making histograms and plots
- it can be used by the user to further adjust the content of the PAT files used in the analysis
The objective of this whole chapter is to demonstrate the key applications of the Full Framework in the last two points above.
Making plots in the Full Framework, and Interactive Analysis
The Full Framework is capable of creating and filling histograms, and is able to manage many histograms produced by a number of EDAnalyzers. There are two ways to utilize this capability:
- monitoring and validation of Full Framework jobs
- interactive use
The CMS Framework is perfectly suited for creating and filling lots of histograms that can be used to keep track of what is happening in the event processing. The histograms produced this way can be used to quickly identify whether the job is working correctly and whether the output file makes sense (without analyzing the output file more thoroughly in FW Lite). This can be extended to a more detailed validation as well. In fact, the Full Framework is a great tool for making many well-known plots.
However, this approach suffers from reduced interactivity. In FW Lite, one can have rapid iterations in the think-click-plot-think cycle. In cmsRun, the same cycle takes a bit longer (sometimes many times longer, depending on the application), which ultimately slows down the progress of the analysis as well.
That being said, if the analysis requires the use of detailed Geometry, Alignment, and other kinds of Calibration Constants (possibly requiring database access), the only way to achieve this is in the Full Framework.
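As an illustration, here is a minimal sketch of booking and filling a histogram inside an EDAnalyzer via the TFileService; the class name TrackPlotter, the histogram name, and its binning are assumptions for illustration, not part of any central package:

#include "FWCore/Framework/interface/EDAnalyzer.h"
#include "FWCore/Framework/interface/Event.h"
#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include "FWCore/ServiceRegistry/interface/Service.h"
#include "CommonTools/UtilAlgos/interface/TFileService.h"
#include "DataFormats/TrackReco/interface/Track.h"
#include "DataFormats/TrackReco/interface/TrackFwd.h"
#include "TH1D.h"

class TrackPlotter : public edm::EDAnalyzer {
public:
  explicit TrackPlotter(const edm::ParameterSet&) {
    // The TFileService owns the output ROOT file and the booked histogram.
    edm::Service<TFileService> fs;
    hNTracks_ = fs->make<TH1D>("nTracks", "number of tracks", 100, 0., 2000.);
  }
  void analyze(const edm::Event& iEvent, const edm::EventSetup&) override {
    edm::Handle<reco::TrackCollection> tracks;
    iEvent.getByLabel("generalTracks", tracks);
    hNTracks_->Fill(tracks->size());  // one entry per event
  }
private:
  TH1D* hNTracks_;  // owned by the TFileService, do not delete
};

The TFileService itself has to be enabled in the job's configuration; it then collects all managed histograms into a single output file.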
Adjusting the content of the user's PAT-tuple
The users will naturally want to control what is in their analysis data sample (e.g., PAT-tuples), as that defines what is possible in the interactive stage of the analysis. In addition, a data sample which is too large also slows down the interactive analysis, simply because, due to the larger I/O, it takes longer to process the same number of events. So the key is to learn how to
- add the necessary information (in terms of other ED products, or UserData attached to PAT objects)
- remove the unnecessary information (in terms of making PAT objects which are `just right' in size)
These topics will be covered in detail in the PAT workbook sections below.
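As a taste of what is to come, here is a hedged sketch of attaching user data to a PAT object with the addUserFloat mechanism; the helper function, the "myIsolation" label, and the isolation definition are illustrative assumptions, and the real recipes are in the PAT workbook:

#include "DataFormats/PatCandidates/interface/Muon.h"

// Sketch: inside a producer that clones pat::Muon objects, extra
// per-object quantities can be attached and read back later with
// mu.userFloat("myIsolation").
void attachUserData(pat::Muon& mu) {
  float myIso = mu.trackIso() + mu.caloIso();  // hypothetical combined isolation
  mu.addUserFloat("myIsolation", myIso);
}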
-- AltanCakir - 09 Oct 2017
4.1.1 More on CMSSW Framework
Goals of this page
When you finish this page, you should understand:
- the modular architecture of the CMSSW framework and the Event Data Model (EDM)
- how data are uniquely identified in an Event
- how Event data are processed - AOD and miniAOD structures
- the Framework Services, including the EventSetup
Introduction
The overall collection of software, referred to as CMSSW, is built around a Framework, an Event Data Model (EDM), and Services needed by the simulation, calibration and alignment, and reconstruction modules that process event data so that physicists can perform analysis. The primary goal of the Framework and
EDM is to facilitate the development and deployment of reconstruction and analysis software.
Modular Event Content
It is important to emphasize that the event data architecture is modular, just like the framework. Different data layers (using different data formats) can be configured, and a given application can use any layer or layers. The branches (which map one-to-one onto event data objects) can be loaded or dropped on demand by the application.
You can reprocess event data at virtually any stage. For instance, if the available AOD doesn't contain exactly what you want, you might want to reprocess the RECO (e.g., to apply a new calibration) to produce the desired AOD.
Custom quantities (data produced by a user or analysis group) can be added to an event and associated with existing objects at any processing stage (RECO/AOD -> candidates -> user data). Thus the distinction between "CMS data" and "user data" may change during the lifetime of the experiment.
Identifying Data in the Event
Data within the Event are uniquely identified by four quantities:
- C++ class type of the data
- E.g., edm::PSimHitContainer or reco::TrackCollection.
- module label
- the label that was assigned to the module that created the data. E.g., "SimG4Objects" or "TrackProducer".
- product instance label
- the label assigned to the object from within the module (defaults to an empty string). This is convenient if many objects of the same C++ type are being put into the edm::Event from within a single module.
- process name
- the process name as set in the job that created the data
For example, if you run (you can find the file MYCOPY.root here):
edmDumpEventContent MYCOPY.root
you get this output:
vector<reco::TrackExtra> "electronGsfTracks" "" "RECO"
vector<reco::TrackExtra> "generalTracks" "" "RECO"
vector<reco::TrackExtra> "globalMuons" "" "RECO"
vector<reco::TrackExtra> "globalSETMuons" "" "RECO"
vector<reco::TrackExtra> "pixelTracks" "" "RECO"
vector<reco::TrackExtra> "standAloneMuons" "" "RECO"
vector<reco::TrackExtra> "standAloneSETMuons" "" "RECO"
vector<reco::TrackExtra> "tevMuons" "default" "RECO"
vector<reco::TrackExtra> "tevMuons" "firstHit" "RECO"
vector<reco::TrackExtra> "tevMuons" "picky" "RECO"
In the above output:
- vector<reco::TrackExtra> is the C++ class type of the data
- "globalMuons" is the module label
- "firstHit" is the product instance label
- "RECO" is the process name
Getting data from the Event
All Event data access methods use an edm::Handle<type>, where type is the C++ type of the datum, to hold the result of an access.
To request data from an Event, in your module, use a form of one of the following:
- get which either returns one object or throws a C++ exception.
- getMany which returns a list of zero or more matches to the data request.
After get or getMany, indicate how to identify the data, e.g., getByLabel or getManyByType, and then use the name associated with the handle type, as shown in the example below.
Sample EDAnalyzer Code
Here is a snippet from the EDAnalyzer code called DemoAnalyzer.cc (used in the next section) showing how data are identified and accessed by a module. Notes follow:
void DemoAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)
{
// This declaration creates a handle called "tracks" for the type "reco::TrackCollection"
// that you want to retrieve from the event "iEvent".
using namespace edm;
edm::Handle<reco::TrackCollection> tracks;
// Pass the handle "tracks" to the method "getByLabel", which is used to
// retrieve one and only one instance of the type in question with
// the label specified out of event "iEvent". If more than one instance
// exists in the event, then an exception is thrown immediately when
// "getByLabel" is called. If zero instances exist which pass
// the search criteria, then an exception is thrown when the handle
// is used to access the data. (You can use the "failedToGet" function
// of the handle to determine whether the "get" found its data before
// using the handle)
iEvent.getByLabel("generalTracks", tracks);
.....................
.....................
}
Notes:
- Line 1: The method analyze receives a reference iEvent to the edm::Event object, which contains all event data.
- Middle section: Containers are provided for each type of event data and can be obtained by using the object edm::Handle.
- Last section: iEvent.getByLabel (given the handle to the type of event data) retrieves the data from the event and stores them in a container in memory.
No matter which way you request the data, the results of the request will be returned in a smart pointer (C++ handle) of type edm::Handle<>.
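As the comment in the snippet above notes, one can test the handle with failedToGet before dereferencing it. A small sketch of such a guard (the warning text is illustrative):

// Sketch: guard against a missing product before using the handle.
iEvent.getByLabel("generalTracks", tracks);
if (tracks.failedToGet()) {
  edm::LogWarning("DemoAnalyzer") << "generalTracks not found in this event";
  return;
}
unsigned int nTracks = tracks->size();  // safe to dereference from here on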
You may refer to the DemoAnalyzer.cc code in Section 4.1.2 to see a use case.
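Note that recent CMSSW releases favor a token-based variant of this interface, in which the module declares up front what it will consume. A minimal sketch (the member name tracksToken_ is an assumption):

// In the class declaration:
edm::EDGetTokenT<reco::TrackCollection> tracksToken_;

// In the constructor, register the dependency with the framework:
tracksToken_ = consumes<reco::TrackCollection>(edm::InputTag("generalTracks"));

// In analyze(), fetch the product through the token:
edm::Handle<reco::TrackCollection> tracks;
iEvent.getByToken(tracksToken_, tracks);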
The Processing Model
Events are processed by passing the Event through a sequence of modules. The exact sequence of modules is specified by the user via a
path
statement in a configuration file. A
path
is an ordered list of Producer/Filter/Analyzer modules which sets the exact execution order of all the modules. When an Event is passed to a module, that module can get data from the Event and put data back into the Event. When data is put into the Event, the provenance information about the module that created the data will be stored with the data in the Event.
The standard input source uses ROOT I/O. The Event is then passed to the execution paths. The paths can then be ordered into a list that makes up the schedule for the process. Note that the same module may appear in multiple paths, but the framework guarantees that a module is only executed once per Event: since it will ask for exactly the same products from the event and produce the same result independent of which path it is in, it makes no sense to execute it twice. On the other hand, a user designing a trigger path should not have to worry about the full schedule (which could involve hundreds of modules). Each path should be executable by itself, in that modules within the path only ask for things they know have been produced by a previous module in the same path or by the input source. In a perfect world, the order of execution of the paths should not matter. However, due to the existence of bugs, it is always possible that there is an order dependence. Such dependencies should be removed during validation of the job.
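To make the get/put pattern concrete, here is a hedged sketch of an EDProducer's produce method; the class TrackCounter and its int product are illustrative assumptions:

#include <memory>

// Read one product, derive another, and put it back into the Event.
// Provenance for the new product is recorded automatically.
void TrackCounter::produce(edm::Event& iEvent, const edm::EventSetup&) {
  edm::Handle<reco::TrackCollection> tracks;
  iEvent.getByLabel("generalTracks", tracks);
  auto nTracks = std::make_unique<int>(tracks->size());
  iEvent.put(std::move(nTracks));  // declared with produces<int>() in the constructor
}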
Framework Services
ServiceRegistry System
The ServiceRegistry is used to deliver services such as the error logger or a debugging service which provides feedback
about the state of the Framework (e.g., what module is presently running). Services are informed about the present state of the Framework, e.g., the start of a new Event or the completion of a certain module. Such information is useful for producing meaningful error messages from the error logger or for debugging. The services to be used in a job and the exact configuration of those services are set in the user's configuration file via a ParameterSet. For further information look
here.
Event Setup
To be able to fully process an event, one has to take into account potentially changing and periodically updated information about the detector environment and status. This information (non-event data) is not tied to a given event, but rather to the time period for which it is valid. This time period is called its
interval of validity or IOV, and an IOV typically spans many events. Examples of this type of non-event data include calibrations, alignments, geometry descriptions, magnetic field and run conditions recorded during data acquisition. The IOV of one piece of non-event data is not necessarily related to that of another. The EventSetup system handles this type of non-event data for which the IOV is longer than one Event. (Note that non-Event data initiated by the DAQ, such as the Event or a Run transition, are handled by the Event system.)
Different non-event data (calibrations and alignments) have varying IOVs, and the EventSetup system reads the values that are valid at the time of a given event.
The EventSetup system design uses two categories
of modules to do its work:
ESSource and
ESProducer. These components are
configured using the same configuration mechanism as their Event counterparts, i.e., via a
ParameterSet.
- ESSource
- is responsible for determining the IOV of a Record (or a set of Records). (A Record is an EventSetup construct that holds data and services which have identical IOVs.) The ESSource may also deliver data/services. For example, a user can request the ECAL pedestals via an ESSource that reads the appropriate values from a database.
- ESProducer
- an ESProducer is, conceptually, an algorithm whose inputs are dependent on data with IOVs. The ESProducer's algorithm is run whenever there is an IOV change for the Record to which the ESProducer is bound. For example, an ESProducer is used to read the ideal geometry of the tracker as well as the alignment corrections and then create the aligned tracker geometry from those two pieces of information. This ESProducer is told by the EventSetup system to create a new aligned tracker geometry whenever the alignment changes.
For further information look
here.
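To make this concrete, here is a hedged sketch of fetching one conditions object (the magnetic field) from the EventSetup inside an analyze method; the EventSetup only re-fetches the product from its ESSource/ESProducer when the IOV changes:

#include "FWCore/Framework/interface/ESHandle.h"
#include "MagneticField/Engine/interface/MagneticField.h"
#include "MagneticField/Records/interface/IdealMagneticFieldRecord.h"
#include "DataFormats/GeometryVector/interface/GlobalPoint.h"

// Sketch: retrieve the magnetic field valid for the current event's IOV.
edm::ESHandle<MagneticField> bField;
iSetup.get<IdealMagneticFieldRecord>().get(bField);
double bz = bField->inTesla(GlobalPoint(0., 0., 0.)).z();  // field at the origin, in tesla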
Provenance Tracking
The CMS Offline framework stores provenance information within CMS's standard ROOT event data files. The provenance information is used to track how every data product was constructed, including what other data products were read in order to do the construction. We record information to understand the history of how data were produced and chosen. Provenance information does not have to be sufficient to allow an exact replay of a process. Storing provenance in output files is crucial to ensure trust in the data, given the large-scale, highly distributed nature of production, especially for physicists' personal skims which are not centrally managed. Using provenance information one can track the source of a problem seen in one file but not another, guarantee compatibility when reading multiple files in a job, confirm that an analysis was done using the proper data, track why two analyses get different results, etc. A good source of information is a
talk
by Chris Jones given at CHEP09. Also refer to
WorkBook 2.3. Also see
http://iopscience.iop.org/1742-6596/219/3/032011.
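In practice, the stored provenance can be inspected from the command line; for instance, the edmProvDump tool prints, for each product in a file, the configuration of the module that produced it (here using the file from the earlier example):

edmProvDump MYCOPY.root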
Review status
Responsible: SudhirMalik
Last reviewed by: SudhirMalik - 26 Nov 2009
-- AltanCakir - 09 Oct 2017
4.1.2 Writing your own EDAnalyzer
Goals of this page:
You will learn the first steps of interacting with the CMS framework and how to write a module where you can put your analysis code.
Introduction
First, a few general words about analysis in the CMSSW framework. Physics analysis proceeds via a series of subsequent steps. Building blocks are identified, and more complex objects are built on top of them. For instance, the Higgs search
H -> ZZ -> µµµµ requires:
- identifying muon candidates;
- reconstructing Z candidates starting from muon candidates;
- reconstructing Higgs candidates starting from Z candidates.
This process clearly identifies three products: muon candidates, Z candidates, and a Higgs candidate, as well as three processes to reconstruct them. These map well onto three Framework modules (EDProducers) that add three different products (the candidate collections) into the Event.
Taking advantage of the modularity provided by the Framework for analysis enhances flexibility and allows separation of the analysis processes into single units that can be reused for different applications.
In this tutorial you will create (and later modify) a new EDAnalyzer module
and a configuration file which will run it. You will create a configuration
fragment include (cfi) file for the module, to contain the default values
for its parameters. You will use the tracer to watch the processing.
Set up your Environment
If you are working at Fermilab's cmslpc cluster you need to execute the following command before you start
#%LOCALSHELL% users
source /cvmfs/cms.cern.ch/cmsset_default.%LOCALCSH%sh
At CERN, you can log in directly to
lxplus.cern.ch
# make your working directory
mkdir MYDEMOANALYZER
cd MYDEMOANALYZER
# if output of echo $0 is csh or tcsh
setenv SCRAM_ARCH slc7_amd64_gcc820
# if output of echo $0 is bash/sh
export SCRAM_ARCH=slc7_amd64_gcc820
# check your arch - it should give an output as slc7_amd64_gcc820
echo $SCRAM_ARCH
# create a new project area
cmsrel CMSSW_10_2_18
cd CMSSW_10_2_18/src/
cmsenv
Write a Framework Module
First, create a subsystem area. The actual name used for the directory is not important, we'll use
Demo
. From the
src
directory, make and change to the
Demo
area:
mkdir Demo
cd Demo
Note that if you do not create the subsystem area and create your module directly under the
src
directory, your code will not compile. Create the "skeleton" of an EDAnalyzer module (see
SWGuideSkeletonCodeGenerator for more information):
mkedanlzr DemoAnalyzer
Compile the code:
cd DemoAnalyzer
scram b
We'll use a ROOT file as the data source when we run with this module.
The data source is defined in the configuration file. For your convenience, there is already a data file processed in CMSSW_5_3_4, which is fully compatible with the CMSSW_10_2_18 release, containing 100 events. It is located at
/afs/cern.ch/cms/Tutorials/TWIKI_DATA/TTJets_8TeV_53X.root
The
mkedanlzr
script has generated an example python configuration file
demoanalyzer_cfg.py
in the
Demo/DemoAnalyzer
directory. Open the file using your favorite text editor and change the data source file as shown below. The configuration file should then read like this:
import FWCore.ParameterSet.Config as cms
process = cms.Process("Demo")
process.load("FWCore.MessageService.MessageLogger_cfi")
process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(-1) )
process.source = cms.Source("PoolSource",
# replace 'myfile.root' with the source file you want to use
fileNames = cms.untracked.vstring(
'file:/afs/cern.ch/cms/Tutorials/TWIKI_DATA/TTJets_8TeV_53X.root'
)
)
process.demo = cms.EDAnalyzer('DemoAnalyzer'
)
process.p = cms.Path(process.demo)
Full documentation about the configuration language is in
SWGuideAboutPythonConfigFile.
Run the job
cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
You should see something like this:
[lxplus402 @ ~/workbook/MYDEMOANALYZER/CMSSW_5_3_5/src]$ cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
11-Mar-2013 02:11:38 CET Initiating request to open file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
11-Mar-2013 02:11:45 CET Successfully opened file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
Begin processing the 1st record. Run 1, Event 261746003, LumiSection 872662 at 11-Mar-2013 02:11:46.878 CET
Begin processing the 2nd record. Run 1, Event 261746009, LumiSection 872662 at 11-Mar-2013 02:11:46.879 CET
Begin processing the 3rd record. Run 1, Event 261746010, LumiSection 872662 at 11-Mar-2013 02:11:46.880 CET
...
Begin processing the 48th record. Run 1, Event 261746140, LumiSection 872662 at 11-Mar-2013 02:11:47.320 CET
Begin processing the 49th record. Run 1, Event 261746141, LumiSection 872662 at 11-Mar-2013 02:11:47.320 CET
Begin processing the 50th record. Run 1, Event 261746142, LumiSection 872662 at 11-Mar-2013 02:11:47.321 CET
11-Mar-2013 02:11:47 CET Closed file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
=============================================
MessageLogger Summary
type category sev module subroutine count total
---- -------------------- -- ---------------- ---------------- ----- -----
1 fileAction -s file_close 1 1
2 fileAction -s file_open 2 2
type category Examples: run/evt run/evt run/evt
---- -------------------- ---------------- ---------------- ----------------
1 fileAction PostEndRun
2 fileAction pre-events pre-events
Severity # Occurrences Total Occurrences
-------- ------------- -----------------
System 3 3
Take notice that at this point, no action has yet been performed by the new framework module - no output or ROOT files will be produced.
NOTE: You may want to run your analyzer on a data file of your choice instead of the one above.
If you are using CMSSW version X and you are using the data file that is processed with CMSSW version Y, make sure that Y < X. For more info on how to look for data in DAS look at
WorkBookLocatingDataSamples. Also BEFORE you run, make sure the data file you want to read actually exists. Every file in mass storage has a unique Logical File Name (LFN). The list of LFNs for all files in a specified dataset can be obtained from DAS. An example would be
/store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
. There is a mapping between the LFN of a file and its Physical File Name (PFN). The PFN of a file differs from site to site, unlike the LFN. To obtain the PFN of a file from its LFN, use the
edmFileUtil
command.
On lxplus:
[lxplus402 @ ~/workbook/MYDEMOANALYZER/7_4_15/src/Demo/DemoAnalyzer]$ edmFileUtil -d /store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
root://eoscms.cern.ch//eos/cms/store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
On cmslpc:
[jstupak@cmslpc35 DemoAnalyzer]$ edmFileUtil -d /store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
root://cmsxrootd-site.fnal.gov//store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
On lxplus, use the
eos ls
command and the LFN to verify that a file exists in mass storage:
[lxplus402 @ ~/workbook/MYDEMOANALYZER/CMSSW_10_2_18/src/Demo/DemoAnalyzer]$ eos ls -l /store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
-rw-r--r-- 2 phedex zh 418309738 Oct 16 15:52 6CA1C627-246C-E511-8A6A-02163E014147.root
On cmslpc, use the
ls -l /eos/uscms
command and the LFN to verify that a file exists in mass storage:
[jstupak@cmslpc35 DemoAnalyzer]$ ls -l /eos/uscms/store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
-rw-r--r-- 1 cmsprod us_cms 418309738 Dec 2 16:41 /eos/uscms/store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
From any site with xrootd installed (via cvmfs), use
xrdfs
with the option
stat
: in between, put the server site (the output of edmFileUtil), and as the last argument give the LFN, to verify that a file exists in a given storage:
-bash-4.1$ xrdfs root://eoscms.cern.ch stat /store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
Path: /store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
Id: 5764607523034286847
Size: 418309738
Flags: 16 (IsReadable)
-bash-4.1$ xrdfs root://cmsxrootd-site.fnal.gov stat /store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
Path: /store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root
Id: -3458764513820540908
Size: 418309738
Flags: 16 (IsReadable)
Sometimes one may want to copy the entire file into the local
Demo/DemoAnalyzer/
directory. Remember these files are big, so this is not recommended. But, just as a reminder, before you copy the file make sure that the data file exists. If the data file is absent, try another one by searching in
DAS
. To copy a data file on lxplus or cmslpc to your local working directory, in both cases the PFN is used. On lxplus, use the
xrdcp
command:
xrdcp root://eoscms.cern.ch//eos/cms/store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root .
On cmslpc, use the
cp /eos/uscms
command and LFN:
cp /eos/uscms/store/data/Run2015D/SingleMuon/MINIAOD/PromptReco-v4/000/258/159/00000/6CA1C627-246C-E511-8A6A-02163E014147.root .
If you copy the file, you will have to change the
fileNames
parameter in the configuration file to point to the local copy.
Since the data files are big, you may want to copy only a few events from the file, as explained in
WorkBookDataSamples. If you DO NOT copy the data file to your working area, you can read it directly from mass storage.
Get tracks from the Event
Demo/DemoAnalyzer/plugins/DemoAnalyzer.cc
is the place to put your analysis code. In this example, we only add a
very simple statement printing the number of tracks.
Edit
Demo/DemoAnalyzer/plugins/BuildFile.xml
so that it looks like this:
<use name="FWCore/Framework"/>
<use name="FWCore/PluginManager"/>
<use name="FWCore/ParameterSet"/>
<use name="DataFormats/TrackReco"/>
<use name="CommonTools/UtilAlgos"/>
<flags EDM_PLUGIN="1"/>
More information on the structure of a typical
BuildFile.xml
can be found in
WorkBookBuildFilesIntro.
Edit
Demo/DemoAnalyzer/plugins/DemoAnalyzer.cc
:
- Add the following include statements (together with the other include statements):
#include "DataFormats/TrackReco/interface/Track.h"
#include "DataFormats/TrackReco/interface/TrackFwd.h"
#include "FWCore/MessageLogger/interface/MessageLogger.h"
- Edit the method
analyze
, which starts with
DemoAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)
, and put the following lines below
using namespace edm;
Handle<reco::TrackCollection> tracks;
iEvent.getByLabel("generalTracks", tracks);
LogInfo ("Demo") << "number of tracks " << tracks->size();
To see how to access data from a triggered or simulated physics event look at
SWGuideEDMGetDataFromEvent.
To know what other collections (besides
TrackCollection
) of reconstructed objects are available, you can have a look at
SWGuideRecoDataTable.
Hold on, I have a question: if I want to add my own stuff in the analyzer, how am I supposed to know
what header file to include and what to add to the BuildFile.xml?
In this case, we want to add tracks. The tracks are part of the event content. The event content can be found in
SWGuideRecoDataTable. If you are looking for tracks, you will find
generalTracks
in the table, together with its collection name, which you will need in the code as shown above. If you follow the link to the collection
name, you will find the class documentation of the object. You will see that its header file is
"DataFormats/TrackReco/interface/Track.h" and you will need to include it in your analyzer.
As it resides in the
DataFormats/TrackReco
package, you will need to add that package to your BuildFile.xml.

Note: Many links in
SWGuideRecoDataTable point to a non-existing version of the class documentation. We are working to fix it.
To print out the information that we added in the analyzer, we need to replace the line
process.load("FWCore.MessageService.MessageLogger_cfi")
in the file
demoanalyzer_cfg.py
with the segment below:
# initialize MessageLogger and output report
process.load("FWCore.MessageService.MessageLogger_cfi")
process.MessageLogger.cerr.threshold = 'INFO'
process.MessageLogger.categories.append('Demo')
process.MessageLogger.cerr.INFO = cms.untracked.PSet(
    limit = cms.untracked.int32(-1)
)
process.options = cms.untracked.PSet( wantSummary = cms.untracked.bool(True) )
More information on the
MessageLogger
can be found on
SWGuideMessageLogger.
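On the C++ side, messages are emitted to a named category at a chosen severity; a small sketch of the three commonly used streams (the category "Demo" matches the one used above, and the message texts are illustrative):

#include "FWCore/MessageLogger/interface/MessageLogger.h"

edm::LogInfo("Demo")    << "informational message, visible once the INFO threshold is set";
edm::LogWarning("Demo") << "something looks suspicious";
edm::LogError("Demo")   << "something went wrong";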
Now, compile the code and run the job again:
scram b
cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
The output should look something like this:
[lxplus404 @ ~/workbook/MYDEMOANALYZER/CMSSW_5_3_5/src]$ cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
12-Mar-2013 18:59:31 CET Initiating request to open file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
12-Mar-2013 18:59:36 CET Successfully opened file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 18:59:36 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 18:59:36 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunLumiKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 18:59:36 CET pre-events
no dictionary for class pair<edm::BranchKey,edm::ConstBranchDescription> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 18:59:36 CET pre-events
no dictionary for class pair<edm::BranchID,unsigned int> is available
%MSG
Begin processing the 1st record. Run 1, Event 261746003, LumiSection 872662 at 12-Mar-2013 18:59:37.193 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 18:59:37 CET Run: 1 Event: 261746003
number of tracks 1211
%MSG
Begin processing the 2nd record. Run 1, Event 261746009, LumiSection 872662 at 12-Mar-2013 18:59:37.203 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 18:59:37 CET Run: 1 Event: 261746009
number of tracks 781
%MSG
Begin processing the 3rd record. Run 1, Event 261746010, LumiSection 872662 at 12-Mar-2013 18:59:37.206 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 18:59:37 CET Run: 1 Event: 261746010
number of tracks 1535
...
%MSG
Begin processing the 48th record. Run 1, Event 261746140, LumiSection 872662 at 12-Mar-2013 18:59:38.129 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 18:59:38 CET Run: 1 Event: 261746140
number of tracks 544
%MSG
Begin processing the 49th record. Run 1, Event 261746141, LumiSection 872662 at 12-Mar-2013 18:59:38.134 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 18:59:38 CET Run: 1 Event: 261746141
number of tracks 662
%MSG
Begin processing the 50th record. Run 1, Event 261746142, LumiSection 872662 at 12-Mar-2013 18:59:38.135 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 18:59:38 CET Run: 1 Event: 261746142
number of tracks 947
%MSG
12-Mar-2013 18:59:38 CET Closed file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
TrigReport ---------- Event Summary ------------
TrigReport Events total = 50 passed = 50 failed = 0
TrigReport ---------- Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport 1 0 50 50 0 0 p
TrigReport -------End-Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport ---------- Modules in Path: p ------------
TrigReport Trig Bit# Visited Passed Failed Error Name
TrigReport 1 0 50 50 0 0 demo
TrigReport ---------- Module Summary ------------
TrigReport Visited Run Passed Failed Error Name
TrigReport 50 50 50 0 0 demo
TrigReport 50 50 50 0 0 TriggerResults
TimeReport ---------- Event Summary ---[sec]----
TimeReport CPU/event = 0.002820 Real/event = 0.002912
TimeReport ---------- Path Summary ---[sec]----
TimeReport per event per path-run
TimeReport CPU Real CPU Real Name
TimeReport 0.002800 0.002883 0.002800 0.002883 p
TimeReport CPU Real CPU Real Name
TimeReport per event per path-run
TimeReport -------End-Path Summary ---[sec]----
TimeReport per event per endpath-run
TimeReport CPU Real CPU Real Name
TimeReport CPU Real CPU Real Name
TimeReport per event per endpath-run
TimeReport ---------- Modules in Path: p ---[sec]----
TimeReport per event per module-visit
TimeReport CPU Real CPU Real Name
TimeReport 0.002800 0.002881 0.002800 0.002881 demo
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport ---------- Module Summary ---[sec]----
TimeReport per event per module-run per module-visit
TimeReport CPU Real CPU Real CPU Real Name
TimeReport 0.002800 0.002881 0.002800 0.002881 0.002800 0.002881 demo
TimeReport 0.000020 0.000025 0.000020 0.000025 0.000020 0.000025 TriggerResults
TimeReport CPU Real CPU Real CPU Real Name
TimeReport per event per module-run per module-visit
T---Report end!
=============================================
MessageLogger Summary
type category sev module subroutine count total
---- -------------------- -- ---------------- ---------------- ----- -----
1 fileAction -s file_close 1 1
2 fileAction -s file_open 2 2
type category Examples: run/evt run/evt run/evt
---- -------------------- ---------------- ---------------- ----------------
1 fileAction PostEndRun
2 fileAction pre-events pre-events
Severity # Occurrences Total Occurrences
-------- ------------- -----------------
System 3 3
Add parameters to our module
In this section, we introduce the minimum number of tracks an event must have in order to be displayed, and we make it possible to change this number in the configuration without recompiling.
- Edit the
Demo/DemoAnalyzer/plugins/DemoAnalyzer.cc
file. Add a new member data line to the DemoAnalyzer class:
private:
// ----------member data ---------------------------
unsigned int minTracks_;
- Modify the constructor
DemoAnalyzer::DemoAnalyzer(const edm::ParameterSet& iConfig)
{
//now do what ever initialization is needed
}
to set the value of minTracks_ from a parameter. It should look like this (note the ":" colon at the end of
.....iConfig)
):
DemoAnalyzer::DemoAnalyzer(const edm::ParameterSet& iConfig) :
minTracks_(iConfig.getUntrackedParameter<unsigned int>("minTracks",0))
{
//now do what ever initialization is needed
}
- Modify the
DemoAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)
method to use minTracks_ to decide when to print the number of tracks:
if( minTracks_ <= tracks->size() ) {
LogInfo ("Demo") << "number of tracks " << tracks->size();
}
So now this segment will look like this (note: the first
LogInfo("Demo")....
has been commented out, since now we want
"number of tracks"
to be printed only if the number of tracks is at least
minTracks_
, as you will see below):
DemoAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)
{
using namespace edm;
Handle<reco::TrackCollection> tracks;
iEvent.getByLabel("generalTracks", tracks);
//LogInfo("Demo") << "number of tracks " << tracks->size();
if( minTracks_ <= tracks->size() ) {
LogInfo ("Demo") << "number of tracks " << tracks->size();
}
#ifdef THIS_IS_AN_EVENT_EXAMPLE
Handle<ExampleData> pIn;
iEvent.getByLabel("example",pIn);
#endif
#ifdef THIS_IS_AN_EVENTSETUP_EXAMPLE
ESHandle<SetupData> pSetup;
iSetup.get<SetupRecord>().get(pSetup);
#endif
}
Also, to check that
minTracks_
actually gets used, replace the line
process.demo = cms.EDAnalyzer('DemoAnalyzer'
)
in
demoanalyzer_cfg.py
with the segment below, to use a value of, say,
minTracks = 1000
.
process.demo = cms.EDAnalyzer('DemoAnalyzer',
minTracks = cms.untracked.uint32(1000)
)
- Compile the code and run the job again.
scram b
cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
NOTE: whenever you make changes in the code of your analyzer you have to run
scram b
again.
The output should be something like this (only events with at least 1000 tracks get printed):
[lxplus404 @ ~/workbook/MYDEMOANALYZER/CMSSW_5_3_5/src]$ cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
12-Mar-2013 19:12:58 CET Initiating request to open file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
12-Mar-2013 19:13:03 CET Successfully opened file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:13:03 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:13:03 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunLumiKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:13:03 CET pre-events
no dictionary for class pair<edm::BranchKey,edm::ConstBranchDescription> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:13:03 CET pre-events
no dictionary for class pair<edm::BranchID,unsigned int> is available
%MSG
Begin processing the 1st record. Run 1, Event 261746003, LumiSection 872662 at 12-Mar-2013 19:13:05.970 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:13:05 CET Run: 1 Event: 261746003
number of tracks 1211
%MSG
Begin processing the 2nd record. Run 1, Event 261746009, LumiSection 872662 at 12-Mar-2013 19:13:05.982 CET
Begin processing the 3rd record. Run 1, Event 261746010, LumiSection 872662 at 12-Mar-2013 19:13:05.985 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:13:05 CET Run: 1 Event: 261746010
number of tracks 1535
%MSG
Begin processing the 4th record. Run 1, Event 261746019, LumiSection 872662 at 12-Mar-2013 19:13:05.990 CET
Begin processing the 5th record. Run 1, Event 261746021, LumiSection 872662 at 12-Mar-2013 19:13:05.993 CET
Begin processing the 6th record. Run 1, Event 261746029, LumiSection 872662 at 12-Mar-2013 19:13:05.996 CET
Begin processing the 7th record. Run 1, Event 261746030, LumiSection 872662 at 12-Mar-2013 19:13:05.998 CET
Begin processing the 8th record. Run 1, Event 261746031, LumiSection 872662 at 12-Mar-2013 19:13:06.001 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:13:06 CET Run: 1 Event: 261746031
number of tracks 1350
...
A look at the demoanalyzer_cfi.py file
There is a file named
demoanalyzer_cfi.py
in the
Demo/DemoAnalyzer/python
directory. A
_cfi.py
file contains the default values of all the required parameters of a package. Rather than setting all the analyzer (we also call it module) parameters in your
demoanalyzer_cfg.py
file, one can define all of them in the
demoanalyzer_cfi.py
file. These parameters can then be reset (or overridden) by giving them the value you want in
demoanalyzer_cfg.py
. However, you will have to include
demoanalyzer_cfi.py
in
demoanalyzer_cfg.py
.
To explain it more clearly: I can put the default value of minTracks=0 in
demoanalyzer_cfi.py
and override it to, say,
minTracks=50
in
demoanalyzer_cfg.py
. Thus if there are, say, 10 parameters defined with their default values in
demoanalyzer_cfi.py
and I want to change two of those to some other value, I can do that in
demoanalyzer_cfg.py
WITHOUT changing the file
demoanalyzer_cfi.py
that has the default values.
The
mkedanlzr
script has created a
demoanalyzer_cfi.py
file in the
python
directory as mentioned above.
- Add the newly created parameter
minTracks
and its default value to this demoanalyzer_cfi.py
file so that its content looks like this:
import FWCore.ParameterSet.Config as cms
demo = cms.EDAnalyzer('DemoAnalyzer',
minTracks=cms.untracked.uint32(0)
)
- Edit the
demoanalyzer_cfg.py
file and REPLACE
process.demo = cms.EDAnalyzer('DemoAnalyzer',
minTracks=cms.untracked.uint32(1000)
)
WITH (these are 2 different lines, so each should be on a separate line):
process.load("Demo.DemoAnalyzer.demoanalyzer_cfi")
process.demo.minTracks=1000
Here we use the default parameter values, and reset only the minTracks parameter to a different value.
cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
You will see the same output as before EXCEPT that now you are overriding the default value of minTracks=0 defined in
demoanalyzer_cfi.py
with a value you choose (in this case
minTracks=1000
) in
demoanalyzer_cfg.py
. The file with the default parameters,
demoanalyzer_cfi.py
, remains untouched.
See what is available in the Event
1. Edit the configuration file
demoanalyzer_cfg.py
and add the EventContentAnalyzer module. The EventContentAnalyzer module dumps all products stored in an event to the screen. One can add it to the path as follows:
process.dump=cms.EDAnalyzer('EventContentAnalyzer')
Also
REPLACE the line (which should still be the last line)
process.p = cms.Path(process.demo)
WITH
process.p = cms.Path(process.demo*process.dump)
In this case you probably want to run over only one event, so change the number of events to 1 (the default is -1, which means all events):
process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(1) )
2. Run the job.
cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
The output should be something like this:
[lxplus404 @ ~/workbook/MYDEMOANALYZER/CMSSW_5_3_5/src]$ cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
12-Mar-2013 19:40:13 CET Initiating request to open file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
12-Mar-2013 19:40:18 CET Successfully opened file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:40:18 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:40:18 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunLumiKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:40:18 CET pre-events
no dictionary for class pair<edm::BranchKey,edm::ConstBranchDescription> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:40:18 CET pre-events
no dictionary for class pair<edm::BranchID,unsigned int> is available
%MSG
Begin processing the 1st record. Run 1, Event 261746003, LumiSection 872662 at 12-Mar-2013 19:40:19.560 CET
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:40:19 CET Run: 1 Event: 261746003
number of tracks 1211
%MSG
++Event 1 contains 695 products with friendlyClassName, moduleLabel, productInstanceName and processName:
++BeamSpotOnlines "scalersRawToDigi" "" "RECO" (productId = 4:1)
++CSCDetIdCSCALCTDigiMuonDigiCollection "hltMuonCSCDigis" "MuonCSCALCTDigi" "HLT" (productId = 3:2)
++CSCDetIdCSCCLCTDigiMuonDigiCollection "hltMuonCSCDigis" "MuonCSCCLCTDigi" "HLT" (productId = 3:4)
...
++triggerTriggerFilterObjectWithRefs "hltPFTau35Track" "" "HLT" (productId = 3:2722)
++triggerTriggerFilterObjectWithRefs "hltPFTau35TrackPt20" "" "HLT" (productId = 3:2723)
++triggerTriggerFilterObjectWithRefs "hltPFTau35TrackPt20LooseIso" "" "HLT" (productId = 3:2724)
++uintedmValueMap "muons" "cosmicsVeto" "RECO" (productId = 4:1021)
12-Mar-2013 19:40:19 CET Closed file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
Summary for key being the concatenation of friendlyClassName, moduleLabel, productInstanceName and processName
1 occurrences of key BeamSpotOnlines + "scalersRawToDigi" + "" "RECO"
1 occurrences of key CSCDetIdCSCALCTDigiMuonDigiCollection + "hltMuonCSCDigis" + "MuonCSCALCTDigi" "HLT"
1 occurrences of key CSCDetIdCSCCLCTDigiMuonDigiCollection + "hltMuonCSCDigis" + "MuonCSCCLCTDigi" "HLT"
...
1 occurrences of key triggerTriggerFilterObjectWithRefs + "hltPFTau35TrackPt20" + "" "HLT"
1 occurrences of key triggerTriggerFilterObjectWithRefs + "hltPFTau35TrackPt20LooseIso" + "" "HLT"
1 occurrences of key uintedmValueMap + "muons" + "cosmicsVeto" "RECO"
TrigReport ---------- Event Summary ------------
TrigReport Events total = 1 passed = 1 failed = 0
TrigReport ---------- Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport 1 0 1 1 0 0 p
TrigReport -------End-Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport ---------- Modules in Path: p ------------
TrigReport Trig Bit# Visited Passed Failed Error Name
TrigReport 1 0 1 1 0 0 demo
TrigReport 1 0 1 1 0 0 dump
TrigReport ---------- Module Summary ------------
TrigReport Visited Run Passed Failed Error Name
TrigReport 1 1 1 0 0 demo
TrigReport 1 1 1 0 0 dump
TrigReport 1 1 1 0 0 TriggerResults
TimeReport ---------- Event Summary ---[sec]----
TimeReport CPU/event = 0.044994 Real/event = 0.048325
TimeReport ---------- Path Summary ---[sec]----
TimeReport per event per path-run
TimeReport CPU Real CPU Real Name
TimeReport 0.044994 0.048223 0.044994 0.048223 p
TimeReport CPU Real CPU Real Name
TimeReport per event per path-run
TimeReport -------End-Path Summary ---[sec]----
TimeReport per event per endpath-run
TimeReport CPU Real CPU Real Name
TimeReport CPU Real CPU Real Name
TimeReport per event per endpath-run
TimeReport ---------- Modules in Path: p ---[sec]----
TimeReport per event per module-visit
TimeReport CPU Real CPU Real Name
TimeReport 0.008999 0.009337 0.008999 0.009337 demo
TimeReport 0.035995 0.038878 0.035995 0.038878 dump
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport ---------- Module Summary ---[sec]----
TimeReport per event per module-run per module-visit
TimeReport CPU Real CPU Real CPU Real Name
TimeReport 0.008999 0.009337 0.008999 0.009337 0.008999 0.009337 demo
TimeReport 0.035995 0.038878 0.035995 0.038878 0.035995 0.038878 dump
TimeReport 0.000000 0.000093 0.000000 0.000093 0.000000 0.000093 TriggerResults
TimeReport CPU Real CPU Real CPU Real Name
TimeReport per event per module-run per module-visit
T---Report end!
=============================================
MessageLogger Summary
type category sev module subroutine count total
---- -------------------- -- ---------------- ---------------- ----- -----
1 EventContent -s EventContentAnal 696 696
2 EventContent -s EventContentAnal 696 696
3 fileAction -s file_close 1 1
4 fileAction -s file_open 2 2
type category Examples: run/evt run/evt run/evt
---- -------------------- ---------------- ---------------- ----------------
1 EventContent 1/261746003 1/261746003 1/261746003
2 EventContent PostEndRun PostEndRun PostEndRun
3 fileAction PostEndRun
4 fileAction pre-events pre-events
Severity # Occurrences Total Occurrences
-------- ------------- -----------------
System 1395 1395
3. A different way of looking at the content of a
ROOT file is to use the command-line tool
edmDumpEventContent
. Compare its output with the result of the above
EventContentAnalyzer
example, so that you know what is available in the ROOT file you want to access.
To do this, run:
edmDumpEventContent /afs/cern.ch/cms/Tutorials/TWIKI_DATA/TTJets_8TeV_53X.root
For example, your
edmDumpEventContent
output may look like this:
[lxplus404 @ ~/workbook/MYDEMOANALYZER/CMSSW_5_3_5/src]$ edmDumpEventContent /afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
Type Module Label Process
----------------------------------------------------------------------------------------------
LHEEventProduct "source" "" "LHE"
GenEventInfoProduct "generator" "" "SIM"
edm::HepMCProduct "generator" "" "SIM"
edm::TriggerResults "TriggerResults" "" "SIM"
vector<SimTrack> "g4SimHits" "" "SIM"
vector<SimVertex> "g4SimHits" "" "SIM"
vector<int> "genParticles" "" "SIM"
vector<reco::GenJet> "ak5GenJets" "" "SIM"
vector<reco::GenJet> "ak7GenJets" "" "SIM"
vector<reco::GenJet> "iterativeCone5GenJets" "" "SIM"
vector<reco::GenJet> "kt4GenJets" "" "SIM"
vector<reco::GenJet> "kt6GenJets" "" "SIM"
vector<reco::GenMET> "genMetCalo" "" "SIM"
vector<reco::GenMET> "genMetCaloAndNonPrompt" "" "SIM"
vector<reco::GenMET> "genMetTrue" "" "SIM"
vector<reco::GenParticle> "genParticles" "" "SIM"
FEDRawDataCollection "rawDataCollector" "" "HLT"
L1GlobalTriggerObjectMapRecord "hltL1GtObjectMap" "" "HLT"
L1GlobalTriggerReadoutRecord "hltGtDigis" "" "HLT"
L1MuGMTReadoutCollection "hltGtDigis" "" "HLT"
MuonDigiCollection<CSCDetId,CSCALCTDigi> "hltMuonCSCDigis" "MuonCSCALCTDigi" "HLT"
MuonDigiCollection<CSCDetId,CSCCLCTDigi> "hltMuonCSCDigis" "MuonCSCCLCTDigi" "HLT"
MuonDigiCollection<CSCDetId,CSCComparatorDigi> "hltMuonCSCDigis" "MuonCSCComparatorDigi" "HLT"
MuonDigiCollection<CSCDetId,CSCCorrelatedLCTDigi> "hltMuonCSCDigis" "MuonCSCCorrelatedLCTDigi" "HLT"
MuonDigiCollection<CSCDetId,CSCDCCFormatStatusDigi> "hltMuonCSCDigis" "MuonCSCDCCFormatStatusDigi" "HLT"
MuonDigiCollection<CSCDetId,CSCRPCDigi> "hltMuonCSCDigis" "MuonCSCRPCDigi" "HLT"
MuonDigiCollection<CSCDetId,CSCStripDigi> "hltMuonCSCDigis" "MuonCSCStripDigi" "HLT"
MuonDigiCollection<CSCDetId,CSCWireDigi> "hltMuonCSCDigis" "MuonCSCWireDigi" "HLT"
MuonDigiCollection<DTChamberId,DTLocalTrigger> "hltMuonDTDigis" "" "HLT"
MuonDigiCollection<DTLayerId,DTDigi> "hltMuonDTDigis" "" "HLT"
MuonDigiCollection<DTLayerId,DTDigiSimLink> "simMuonDTDigis" "" "HLT"
MuonDigiCollection<RPCDetId,RPCDigi> "hltMuonRPCDigis" "" "HLT"
RPCRawDataCounts "hltMuonRPCDigis" "" "HLT"
double "hltAntiKT5CaloJets" "rho" "HLT"
double "hltAntiKT5PFJets" "rho" "HLT"
double "hltAntiKT5CaloJets" "sigma" "HLT"
double "hltAntiKT5PFJets" "sigma" "HLT"
edm::AssociationMap<edm::OneToMany<vector<L2MuonTrajectorySeed>,vector<L2MuonTrajectorySeed>,unsigned int> > "hltL2Muons" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL2Muons" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3MuonsIOHit" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3MuonsOIHit" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3MuonsOIState" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3TkTracksFromL2IOHit" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3TkTracksFromL2OIHit" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3TkTracksFromL2OIState" "" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3MuonsIOHit" "L2Seeded" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3MuonsOIHit" "L2Seeded" "HLT"
edm::AssociationMap<edm::OneToOne<vector<Trajectory>,vector<reco::Track>,unsigned short> > "hltL3MuonsOIState" "L2Seeded" "HLT"
edm::AssociationMap<edm::OneToOne<vector<reco::Track>,vector<reco::Track>,unsigned int> > "hltL2Muons" "" "HLT"
edm::AssociationVector<edm::RefToBaseProd<reco::Jet>,vector<edm::RefVector<vector<reco::Track>,reco::Track,edm::refhelper::FindUsingAdvance<vector<reco::Track>,reco::Track> > >,edm::RefToBase<reco::Jet>,unsigned int,edm::helper::AssociationIdenticalKeyReference> "hltPFTauJetTracksAssociator" "" "HLT"
edm::DetSetVector<RPCDigiSimLink> "simMuonRPCDigis" "RPCDigiSimLink" "HLT"
edm::DetSetVector<StripDigiSimLink> "simMuonCSCDigis" "MuonCSCStripDigiSimLinks" "HLT"
edm::DetSetVector<StripDigiSimLink> "simMuonCSCDigis" "MuonCSCWireDigiSimLinks" "HLT"
edm::LazyGetter<SiStripCluster> "hltSiStripRawToClustersFacility" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL2Muons" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3MuonsIOHit" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3MuonsOIHit" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3MuonsOIState" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3TkTracksFromL2IOHit" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3TkTracksFromL2OIHit" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3TkTracksFromL2OIState" "" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3MuonsIOHit" "L2Seeded" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3MuonsOIHit" "L2Seeded" "HLT"
edm::OwnVector<TrackingRecHit,edm::ClonePolicy<TrackingRecHit> > "hltL3MuonsOIState" "L2Seeded" "HLT"
edm::RangeMap<CSCDetId,edm::OwnVector<CSCRecHit2D,edm::ClonePolicy<CSCRecHit2D> >,edm::ClonePolicy<CSCRecHit2D> > "hltCsc2DRecHits" "" "HLT"
edm::RangeMap<CSCDetId,edm::OwnVector<CSCSegment,edm::ClonePolicy<CSCSegment> >,edm::ClonePolicy<CSCSegment> > "hltCscSegments" "" "HLT"
edm::RangeMap<DTChamberId,edm::OwnVector<DTRecSegment4D,edm::ClonePolicy<DTRecSegment4D> >,edm::ClonePolicy<DTRecSegment4D> > "hltDt4DSegments" "" "HLT"
edm::RangeMap<RPCDetId,edm::OwnVector<RPCRecHit,edm::ClonePolicy<RPCRecHit> >,edm::ClonePolicy<RPCRecHit> > "hltRpcRecHits" "" "HLT"
edm::RefVector<vector<reco::Track>,reco::Track,edm::refhelper::FindUsingAdvance<vector<reco::Track>,reco::Track> > "hltBSoftMuonMu5L3" "" "HLT"
edm::SortedCollection<CaloTower,edm::StrictWeakOrdering<CaloTower> > "hltTowerMakerForAll" "" "HLT"
edm::SortedCollection<CaloTower,edm::StrictWeakOrdering<CaloTower> > "hltTowerMakerForMuons" "" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltEcalRecHitAll" "EcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltEcalRecHitAll" "EcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaEBUncalibrator" "etaEcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaEEUncalibrator" "etaEcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaRecHitsFilterEBonly" "etaEcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaEBUncalibrator" "etaEcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaEEUncalibrator" "etaEcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaRecHitsFilterEEonly" "etaEcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaRecHitsFilterEBonly" "etaEcalRecHitsES" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaEtaRecHitsFilterEEonly" "etaEcalRecHitsES" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPhiSymStream" "phiSymEcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPhiSymUncalibrator" "phiSymEcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPhiSymStream" "phiSymEcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPhiSymUncalibrator" "phiSymEcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0EBUncalibrator" "pi0EcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0EEUncalibrator" "pi0EcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0RecHitsFilterEBonly" "pi0EcalRecHitsEB" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0EBUncalibrator" "pi0EcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0EEUncalibrator" "pi0EcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0RecHitsFilterEEonly" "pi0EcalRecHitsEE" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0RecHitsFilterEBonly" "pi0EcalRecHitsES" "HLT"
edm::SortedCollection<EcalRecHit,edm::StrictWeakOrdering<EcalRecHit> > "hltAlCaPi0RecHitsFilterEEonly" "pi0EcalRecHitsES" "HLT"
edm::SortedCollection<ZDCDataFrame,edm::StrictWeakOrdering<ZDCDataFrame> > "simHcalUnsuppressedDigis" "" "HLT"
edm::TriggerResults "TriggerResults" "" "HLT"
edmNew::DetSetVector<SiPixelCluster> "hltSiPixelClusters" "" "HLT"
reco::BeamSpot "hltOnlineBeamSpot" "" "HLT"
vector<DcsStatus> "hltScalersRawToDigi" "" "HLT"
vector<L1MuGMTCand> "hltGtDigis" "" "HLT"
vector<L2MuonTrajectorySeed> "hltL2MuonSeeds" "" "HLT"
vector<L3MuonTrajectorySeed> "hltL3TrajSeedIOHit" "" "HLT"
vector<L3MuonTrajectorySeed> "hltL3TrajSeedOIHit" "" "HLT"
vector<L3MuonTrajectorySeed> "hltL3TrajSeedOIState" "" "HLT"
vector<L3MuonTrajectorySeed> "hltL3TrajectorySeed" "" "HLT"
vector<LumiScalers> "hltScalersRawToDigi" "" "HLT"
vector<PileupSummaryInfo> "addPileupInfo" "" "HLT"
vector<TrackCandidate> "hltL3TrackCandidateFromL2IOHit" "" "HLT"
vector<TrackCandidate> "hltL3TrackCandidateFromL2OIHit" "" "HLT"
vector<TrackCandidate> "hltL3TrackCandidateFromL2OIState" "" "HLT"
vector<double> "hltAntiKT5CaloJets" "rhos" "HLT"
vector<double> "hltAntiKT5PFJets" "rhos" "HLT"
vector<double> "hltAntiKT5CaloJets" "sigmas" "HLT"
vector<double> "hltAntiKT5PFJets" "sigmas" "HLT"
vector<l1extra::L1EmParticle> "hltL1extraParticles" "Isolated" "HLT"
vector<l1extra::L1EmParticle> "hltL1extraParticles" "NonIsolated" "HLT"
vector<l1extra::L1EtMissParticle> "hltL1extraParticles" "MET" "HLT"
vector<l1extra::L1EtMissParticle> "hltL1extraParticles" "MHT" "HLT"
vector<l1extra::L1HFRings> "hltL1extraParticles" "" "HLT"
vector<l1extra::L1JetParticle> "hltL1extraParticles" "Central" "HLT"
vector<l1extra::L1JetParticle> "hltL1extraParticles" "Forward" "HLT"
vector<l1extra::L1JetParticle> "hltL1extraParticles" "Tau" "HLT"
vector<l1extra::L1MuonParticle> "hltL1extraParticles" "" "HLT"
vector<reco::CaloJet> "hltAntiKT5CaloJets" "" "HLT"
vector<reco::CaloJet> "hltCaloJetCorrected" "" "HLT"
vector<reco::CaloJet> "hltCaloJetCorrectedRegional" "" "HLT"
vector<reco::CaloJet> "hltL2TauJets" "" "HLT"
vector<reco::CaloMET> "hltMet" "" "HLT"
vector<reco::Electron> "hltPixelMatch3HitElectronsActivity" "" "HLT"
vector<reco::Electron> "hltPixelMatch3HitElectronsL1Seeded" "" "HLT"
vector<reco::Electron> "hltPixelMatchElectronsActivity" "" "HLT"
vector<reco::Electron> "hltPixelMatchElectronsL1Seeded" "" "HLT"
vector<reco::IsolatedPixelTrackCandidate> "hltHITIPTCorrectorHB" "" "HLT"
vector<reco::IsolatedPixelTrackCandidate> "hltHITIPTCorrectorHE" "" "HLT"
vector<reco::IsolatedPixelTrackCandidate> "hltIsolPixelTrackProdHB" "" "HLT"
vector<reco::IsolatedPixelTrackCandidate> "hltIsolPixelTrackProdHE" "" "HLT"
vector<reco::MuonTrackLinks> "hltL3MuonsIOHit" "" "HLT"
vector<reco::MuonTrackLinks> "hltL3MuonsLinksCombination" "" "HLT"
vector<reco::MuonTrackLinks> "hltL3MuonsOIHit" "" "HLT"
vector<reco::MuonTrackLinks> "hltL3MuonsOIState" "" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "AddedMuonsAndHadrons" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "CleanedCosmicsMuons" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "CleanedFakeMuons" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "CleanedHF" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "CleanedPunchThroughMuons" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "CleanedPunchThroughNeutralHadrons" "HLT"
vector<reco::PFCandidate> "hltParticleFlow" "CleanedTrackerAndGlobalMuons" "HLT"
vector<reco::PFJet> "hltAntiKT5PFJets" "" "HLT"
vector<reco::PFTauTagInfo> "hltPFTauTagInfo" "" "HLT"
vector<reco::RecoChargedCandidate> "hltL2MuonCandidates" "" "HLT"
vector<reco::RecoChargedCandidate> "hltL2MuonCandidatesNoVtx" "" "HLT"
vector<reco::RecoChargedCandidate> "hltL3MuonCandidates" "" "HLT"
vector<reco::RecoChargedCandidate> "hltMuTrackJpsiCtfTrackCands" "" "HLT"
vector<reco::RecoChargedCandidate> "hltMuTrackJpsiPixelTrackCands" "" "HLT"
vector<reco::RecoEcalCandidate> "hltL1SeededRecoEcalCandidate" "" "HLT"
vector<reco::RecoEcalCandidate> "hltRecoEcalSuperClusterActivityCandidate" "" "HLT"
vector<reco::RecoEcalCandidate> "hltRecoEcalSuperClusterActivityCandidateSC4" "" "HLT"
vector<reco::Track> "hltL2Muons" "" "HLT"
vector<reco::Track> "hltL3Muons" "" "HLT"
vector<reco::Track> "hltL3MuonsIOHit" "" "HLT"
vector<reco::Track> "hltL3MuonsOIHit" "" "HLT"
vector<reco::Track> "hltL3MuonsOIState" "" "HLT"
vector<reco::Track> "hltL3TkFromL2OICombination" "" "HLT"
vector<reco::Track> "hltL3TkTracksFromL2" "" "HLT"
vector<reco::Track> "hltL3TkTracksFromL2IOHit" "" "HLT"
vector<reco::Track> "hltL3TkTracksFromL2OIHit" "" "HLT"
vector<reco::Track> "hltL3TkTracksFromL2OIState" "" "HLT"
vector<reco::Track> "hltL3MuonsIOHit" "L2Seeded" "HLT"
vector<reco::Track> "hltL3MuonsOIHit" "L2Seeded" "HLT"
vector<reco::Track> "hltL3MuonsOIState" "L2Seeded" "HLT"
vector<reco::Track> "hltL2Muons" "UpdatedAtVtx" "HLT"
vector<reco::TrackExtra> "hltL2Muons" "" "HLT"
vector<reco::TrackExtra> "hltL3MuonsIOHit" "" "HLT"
vector<reco::TrackExtra> "hltL3MuonsOIHit" "" "HLT"
vector<reco::TrackExtra> "hltL3MuonsOIState" "" "HLT"
vector<reco::TrackExtra> "hltL3TkTracksFromL2IOHit" "" "HLT"
vector<reco::TrackExtra> "hltL3TkTracksFromL2OIHit" "" "HLT"
vector<reco::TrackExtra> "hltL3TkTracksFromL2OIState" "" "HLT"
vector<reco::TrackExtra> "hltL3MuonsIOHit" "L2Seeded" "HLT"
vector<reco::TrackExtra> "hltL3MuonsOIHit" "L2Seeded" "HLT"
vector<reco::TrackExtra> "hltL3MuonsOIState" "L2Seeded" "HLT"
trigger::TriggerEvent "hltTriggerSummaryAOD" "" "HLT"
trigger::TriggerEventWithRefs "hltTriggerSummaryRAW" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltL1MatchedLooseIsoPFTau20" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltL1sDoubleTauJet44erorDoubleJetC64" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltL1sL1ETM36or40" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltL1sL1SingleEG12" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltL1sMu12Eta2p1ETM20" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltMu8Ele17CaloIdTCaloIsoVLPixelMatchFilter" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltOverlapFilterIsoEle20LooseIsoPFTau20L1Jet" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltPFTau20" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltPFTau20Track" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltPFTau20TrackLooseIso" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltPFTau35" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltPFTau35Track" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltPFTau35TrackPt20" "" "HLT"
trigger::TriggerFilterObjectWithRefs "hltPFTau35TrackPt20LooseIso" "" "HLT"
EBDigiCollection "selectDigi" "selectedEcalEBDigiCollection" "RECO"
EEDigiCollection "selectDigi" "selectedEcalEEDigiCollection" "RECO"
EcalTrigPrimCompactColl "ecalCompactTrigPrim" "" "RECO"
HcalNoiseSummary "hcalnoise" "" "RECO"
L1GlobalTriggerObjectMaps "l1L1GtObjectMap" "" "RECO"
L1GlobalTriggerReadoutRecord "gtDigis" "" "RECO"
L1MuGMTReadoutCollection "gtDigis" "" "RECO"
double "fixedGridRhoAll" "" "RECO"
double "fixedGridRhoFastjetAll" "" "RECO"
double "ak5CaloJets" "rho" "RECO"
double "ak5PFJets" "rho" "RECO"
double "ak5TrackJets" "rho" "RECO"
double "ak7BasicJets" "rho" "RECO"
double "ak7CaloJets" "rho" "RECO"
double "ak7PFJets" "rho" "RECO"
double "iterativeCone5CaloJets" "rho" "RECO"
double "iterativeCone5PFJets" "rho" "RECO"
double "kt4CaloJets" "rho" "RECO"
double "kt4PFJets" "rho" "RECO"
double "kt4TrackJets" "rho" "RECO"
double "kt6CaloJets" "rho" "RECO"
double "kt6CaloJetsCentral" "rho" "RECO"
double "kt6PFJets" "rho" "RECO"
double "kt6PFJetsCentralChargedPileUp" "rho" "RECO"
double "kt6PFJetsCentralNeutral" "rho" "RECO"
double "kt6PFJetsCentralNeutralTight" "rho" "RECO"
double "ak5CaloJets" "sigma" "RECO"
double "ak5PFJets" "sigma" "RECO"
double "ak5TrackJets" "sigma" "RECO"
double "ak7BasicJets" "sigma" "RECO"
double "ak7CaloJets" "sigma" "RECO"
double "ak7PFJets" "sigma" "RECO"
double "iterativeCone5CaloJets" "sigma" "RECO"
double "iterativeCone5PFJets" "sigma" "RECO"
double "kt4CaloJets" "sigma" "RECO"
double "kt4PFJets" "sigma" "RECO"
double "kt4TrackJets" "sigma" "RECO"
double "kt6CaloJets" "sigma" "RECO"
double "kt6CaloJetsCentral" "sigma" "RECO"
double "kt6PFJets" "sigma" "RECO"
double "kt6PFJetsCentralChargedPileUp" "sigma" "RECO"
double "kt6PFJetsCentralNeutral" "sigma" "RECO"
double "kt6PFJetsCentralNeutralTight" "sigma" "RECO"
edm::AssociationMap<edm::OneToOne<vector<reco::SuperCluster>,vector<reco::HFEMClusterShape>,unsigned int> > "hfEMClusters" "" "RECO"
...
vector<reco::Track> "ckfInOutTracksFromConversions" "" "RECO"
vector<reco::Track> "ckfOutInTracksFromConversions" "" "RECO"
vector<reco::Track> "conversionStepTracks" "" "RECO"
vector<reco::Track> "cosmicMuons" "" "RECO"
vector<reco::Track> "cosmicMuons1Leg" "" "RECO"
vector<reco::Track> "cosmicsVetoTracks" "" "RECO"
vector<reco::Track> "generalTracks" "" "RECO"
vector<reco::Track> "globalCosmicMuons" "" "RECO"
vector<reco::Track> "globalCosmicMuons1Leg" "" "RECO"
vector<reco::Track> "globalMuons" "" "RECO"
vector<reco::Track> "globalSETMuons" "" "RECO"
vector<reco::Track> "pixelTracks" "" "RECO"
vector<reco::Track> "refittedStandAloneMuons" "" "RECO"
vector<reco::Track> "regionalCosmicTracks" "" "RECO"
vector<reco::Track> "standAloneMuons" "" "RECO"
vector<reco::Track> "standAloneSETMuons" "" "RECO"
vector<reco::Track> "uncleanedOnlyCkfInOutTracksFromConversions" "" "RECO"
vector<reco::Track> "uncleanedOnlyCkfOutInTracksFromConversions" "" "RECO"
vector<reco::Track> "refittedStandAloneMuons" "UpdatedAtVtx" "RECO"
vector<reco::Track> "standAloneMuons" "UpdatedAtVtx" "RECO"
vector<reco::Track> "standAloneSETMuons" "UpdatedAtVtx" "RECO"
vector<reco::Track> "tevMuons" "default" "RECO"
vector<reco::Track> "tevMuons" "dyt" "RECO"
vector<reco::Track> "tevMuons" "firstHit" "RECO"
vector<reco::Track> "impactParameterTagInfos" "ghostTracks" "RECO"
vector<reco::Track> "tevMuons" "picky" "RECO"
vector<reco::TrackExtra> "ckfInOutTracksFromConversions" "" "RECO"
vector<reco::TrackExtra> "ckfOutInTracksFromConversions" "" "RECO"
vector<reco::TrackExtra> "conversionStepTracks" "" "RECO"
vector<reco::TrackExtra> "cosmicMuons" "" "RECO"
vector<reco::TrackExtra> "cosmicMuons1Leg" "" "RECO"
vector<reco::TrackExtra> "electronGsfTracks" "" "RECO"
vector<reco::TrackExtra> "generalTracks" "" "RECO"
vector<reco::TrackExtra> "globalCosmicMuons" "" "RECO"
vector<reco::TrackExtra> "globalCosmicMuons1Leg" "" "RECO"
vector<reco::TrackExtra> "globalMuons" "" "RECO"
vector<reco::TrackExtra> "globalSETMuons" "" "RECO"
vector<reco::TrackExtra> "pixelTracks" "" "RECO"
vector<reco::TrackExtra> "refittedStandAloneMuons" "" "RECO"
vector<reco::TrackExtra> "regionalCosmicTracks" "" "RECO"
vector<reco::TrackExtra> "standAloneMuons" "" "RECO"
vector<reco::TrackExtra> "standAloneSETMuons" "" "RECO"
vector<reco::TrackExtra> "uncleanedOnlyCkfInOutTracksFromConversions" "" "RECO"
vector<reco::TrackExtra> "uncleanedOnlyCkfOutInTracksFromConversions" "" "RECO"
vector<reco::TrackExtra> "tevMuons" "default" "RECO"
vector<reco::TrackExtra> "tevMuons" "dyt" "RECO"
vector<reco::TrackExtra> "tevMuons" "firstHit" "RECO"
vector<reco::TrackExtra> "tevMuons" "picky" "RECO"
vector<reco::TrackExtrapolation> "trackExtrapolator" "" "RECO"
vector<reco::TrackIPTagInfo> "impactParameterTagInfos" "" "RECO"
vector<reco::TrackJet> "ak5TrackJets" "" "RECO"
vector<reco::TrackJet> "kt4TrackJets" "" "RECO"
vector<reco::Vertex> "offlinePrimaryVertices" "" "RECO"
vector<reco::Vertex> "offlinePrimaryVerticesWithBS" "" "RECO"
vector<reco::Vertex> "pixelVertices" "" "RECO"
vector<reco::VertexCompositeCandidate> "generalV0Candidates" "Kshort" "RECO"
vector<reco::VertexCompositeCandidate> "generalV0Candidates" "Lambda" "RECO"
If you want to use "generalTracks" in your code (as we already did in the DemoAnalyzer.cc code above), you would do the following:
// retrieve the track collection labelled "generalTracks" from the Event
Handle<reco::TrackCollection> tracks;
iEvent.getByLabel("generalTracks", tracks);
Watch the processing of a job
1. Edit the configuration file, %LOCALCONFFILE%, and add the Tracer service just above the line process.p = cms.Path(process.demo*process.dump). This service reports which module is called and when.
process.Tracer = cms.Service("Tracer")
You can remove the module dump if you added it as explained above, and set maxEvents to -1 to run over all events in the file; the relevant lines are sketched below.
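Put together, the relevant part of the config then looks roughly like this (a sketch assuming the demoanalyzer_cfg.py layout from the earlier exercises):
process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(-1) )  # run over all events
process.Tracer = cms.Service("Tracer")  # report each module as it is constructed and called
process.p = cms.Path(process.demo)      # the 'dump' module has been removed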
2. Run the job
cmsRun Demo/%LOCALCONFFILEDIR%/%LOCALCONFFILE%
The output should be something like this:
[lxplus404 @ ~/workbook/MYDEMOANALYZER/CMSSW_5_3_5/src]$ cmsRun Demo/DemoAnalyzer/demoanalyzer_cfg.py
++ constructing source:PoolSource
++++open input file
12-Mar-2013 19:52:44 CET Initiating request to open file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
12-Mar-2013 19:52:49 CET Successfully opened file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
++++finished: open input file
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:52:49 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:52:49 CET pre-events
no dictionary for class pair<edm::IndexIntoFile::IndexRunLumiKey,Long64_t> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:52:49 CET pre-events
no dictionary for class pair<edm::BranchKey,edm::ConstBranchDescription> is available
%MSG
%MSG-i Root_Information: AfterFile TClass::TClass() 12-Mar-2013 19:52:49 CET pre-events
no dictionary for class pair<edm::BranchID,unsigned int> is available
%MSG
++ construction finished:PoolSource
++ constructing module:demo
++ construction finished:demo
++ constructing module:TriggerResults
++ construction finished:TriggerResults
%MSG-i path: AfterModConstruction 12-Mar-2013 19:52:49 CET pre-events
The following module labels are not assigned to any path:
'dump'
%MSG
++ beginJob module:demo
++ beginJob finished:demo
++ beginJob module:TriggerResults
++ beginJob finished:TriggerResults
++ Job started
++++source run
++++finished: source run
++++ processing begin run:run: 1 time:1
++++++ processing path for begin run:p
++++++++ module for begin run:demo
++++++++ finished for begin run:demo
++++++ finished path for begin run:p
++++++++ module for begin run:TriggerResults
++++++++ finished for begin run:TriggerResults
++++ finished begin run:
++++source lumi
++++finished: source lumi
++++ processing begin lumi:run: 1 luminosityBlock: 872662 time:1230005000001
++++++ processing path for begin lumi:p
++++++++ module for begin lumi:demo
++++++++ finished for begin lumi:demo
++++++ finished path for begin lumi:p
++++++++ module for begin lumi:TriggerResults
++++++++ finished for begin lumi:TriggerResults
++++ finished begin lumi:
++++source event
++++finished: source event
Begin processing the 1st record. Run 1, Event 261746003, LumiSection 872662 at 12-Mar-2013 19:52:50.041 CET
++++ processing event:run: 1 lumi: 872662 event: 261746003 time:1230015000001
++++++ processing path for event:p
++++++++ module for event:demo
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:52:50 CET Run: 1 Event: 261746003
number of tracks 1211
%MSG
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 2nd record. Run 1, Event 261746009, LumiSection 872662 at 12-Mar-2013 19:52:50.052 CET
++++ processing event:run: 1 lumi: 872662 event: 261746009 time:1230045000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 3rd record. Run 1, Event 261746010, LumiSection 872662 at 12-Mar-2013 19:52:50.055 CET
++++ processing event:run: 1 lumi: 872662 event: 261746010 time:1230050000001
++++++ processing path for event:p
++++++++ module for event:demo
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:52:50 CET Run: 1 Event: 261746010
number of tracks 1535
%MSG
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 4th record. Run 1, Event 261746019, LumiSection 872662 at 12-Mar-2013 19:52:50.060 CET
++++ processing event:run: 1 lumi: 872662 event: 261746019 time:1230095000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 5th record. Run 1, Event 261746021, LumiSection 872662 at 12-Mar-2013 19:52:50.063 CET
++++ processing event:run: 1 lumi: 872662 event: 261746021 time:1230105000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 6th record. Run 1, Event 261746029, LumiSection 872662 at 12-Mar-2013 19:52:50.066 CET
++++ processing event:run: 1 lumi: 872662 event: 261746029 time:1230145000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 7th record. Run 1, Event 261746030, LumiSection 872662 at 12-Mar-2013 19:52:50.068 CET
++++ processing event:run: 1 lumi: 872662 event: 261746030 time:1230150000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 8th record. Run 1, Event 261746031, LumiSection 872662 at 12-Mar-2013 19:52:50.071 CET
++++ processing event:run: 1 lumi: 872662 event: 261746031 time:1230155000001
++++++ processing path for event:p
++++++++ module for event:demo
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:52:50 CET Run: 1 Event: 261746031
number of tracks 1350
%MSG
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 9th record. Run 1, Event 261746034, LumiSection 872662 at 12-Mar-2013 19:52:50.075 CET
++++ processing event:run: 1 lumi: 872662 event: 261746034 time:1230170000001
++++++ processing path for event:p
++++++++ module for event:demo
%MSG-i Demo: DemoAnalyzer:demo 12-Mar-2013 19:52:50 CET Run: 1 Event: 261746034
number of tracks 1084
%MSG
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 10th record. Run 1, Event 261746035, LumiSection 872662 at 12-Mar-2013 19:52:50.079 CET
++++ processing event:run: 1 lumi: 872662 event: 261746035 time:1230175000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++source event
++++finished: source event
Begin processing the 11th record. Run 1, Event 261746036, LumiSection 872662 at 12-Mar-2013 19:52:50.082 CET
++++ processing event:run: 1 lumi: 872662 event: 261746036 time:1230180000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
...
++++source event
++++finished: source event
Begin processing the 50th record. Run 1, Event 261746142, LumiSection 872662 at 12-Mar-2013 19:52:50.251 CET
++++ processing event:run: 1 lumi: 872662 event: 261746142 time:1230710000001
++++++ processing path for event:p
++++++++ module for event:demo
++++++++ finished for event:demo
++++++ finished path for event:p
++++++++ module for event:TriggerResults
++++++++ finished for event:TriggerResults
++++ finished event:
++++ processing end lumi:run: 1 luminosityBlock: 872662 time:1230005000001
++++++ processing path for end lumi:p
++++++++ module for end lumi:demo
++++++++ finished for end lumi:demo
++++++ finished path for end lumi:p
++++++++ module for end lumi:TriggerResults
++++++++ finished for end lumi:TriggerResults
++++ finished end lumi:
++++ processing end run:run: 1 time:2500000000001
++++++ processing path for end run:p
++++++++ module for end run:demo
++++++++ finished for end run:demo
++++++ finished path for end run:p
++++++++ module for end run:TriggerResults
++++++++ finished for end run:TriggerResults
++++ finished end run:
++++close input file
12-Mar-2013 19:52:50 CET Closed file file:/afs/cern.ch/cms/Tutorials/TTJets_RECO_5_3_4.root
++++finished: close input file
++ endJob module:demo
++ endJob finished:demo
++ endJob module:TriggerResults
++ endJob finished:TriggerResults
TrigReport ---------- Event Summary ------------
TrigReport Events total = 50 passed = 50 failed = 0
TrigReport ---------- Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport 1 0 50 50 0 0 p
TrigReport -------End-Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport ---------- Modules in Path: p ------------
TrigReport Trig Bit# Visited Passed Failed Error Name
TrigReport 1 0 50 50 0 0 demo
TrigReport ---------- Module Summary ------------
TrigReport Visited Run Passed Failed Error Name
TrigReport 50 50 50 0 0 demo
TrigReport 50 50 50 0 0 TriggerResults
TimeReport ---------- Event Summary ---[sec]----
TimeReport CPU/event = 0.003100 Real/event = 0.003062
TimeReport ---------- Path Summary ---[sec]----
TimeReport per event per path-run
TimeReport CPU Real CPU Real Name
TimeReport 0.003040 0.003005 0.003040 0.003005 p
TimeReport CPU Real CPU Real Name
TimeReport per event per path-run
TimeReport -------End-Path Summary ---[sec]----
TimeReport per event per endpath-run
TimeReport CPU Real CPU Real Name
TimeReport CPU Real CPU Real Name
TimeReport per event per endpath-run
TimeReport ---------- Modules in Path: p ---[sec]----
TimeReport per event per module-visit
TimeReport CPU Real CPU Real Name
TimeReport 0.003040 0.003002 0.003040 0.003002 demo
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport ---------- Module Summary ---[sec]----
TimeReport per event per module-run per module-visit
TimeReport CPU Real CPU Real CPU Real Name
TimeReport 0.003040 0.003002 0.003040 0.003002 0.003040 0.003002 demo
TimeReport 0.000060 0.000044 0.000060 0.000044 0.000060 0.000044 TriggerResults
TimeReport CPU Real CPU Real CPU Real Name
TimeReport per event per module-run per module-visit
T---Report end!
++ Job ended
=============================================
MessageLogger Summary
type category sev module subroutine count total
---- -------------------- -- ---------------- ---------------- ----- -----
1 fileAction -s file_close 1 1
2 fileAction -s file_open 2 2
type category Examples: run/evt run/evt run/evt
---- -------------------- ---------------- ---------------- ----------------
1 fileAction PostEndRun
2 fileAction pre-events pre-events
Severity # Occurrences Total Occurrences
-------- ------------- -----------------
System 3 3
Add a histogram
I will not go into detail about how to add histograms. Below are the three files that you have to change to plot a histogram. Compare them to what you already have and make the changes by seeing what is missing; if you do not want to work out the changes yourself, just copy and paste the files as they are. Here are the steps; the order in which you do them does not matter.
1. Make changes in your demo analyzer DemoAnalyzer.cc so that it looks like this: DemoAnalyzer.cc
2. Make changes in the BuildFile.xml so that it looks like this: BuildFile.xml
3. Make changes in the config file %LOCALCONFFILE% so that it looks like this: %LOCALCONFFILE%
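For reference, the config-file side of this change is the TFileService, which manages the output histogram file; a minimal sketch (the file name matches the histodemo.root mentioned below):
process.TFileService = cms.Service("TFileService",
    fileName = cms.string("histodemo.root")
)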
Then do
scram b
and run it by doing
cmsRun Demo/%LOCALCONFFILEDIR%/%LOCALCONFFILE%
You will see that now you have a root file called
histodemo.root
.
Open this root file and you will see a histogram of the number of tracks in the 200 events over which you ran.
If you want to look at other collections, as mentioned above you have to check what is available in the Reco event content documentation.
Histo Analyzer
This package is similar to the DemoAnalyzer package. Having understood how to create, modify and run your own demo analyzer (or module), you can also have a look at this package, which shows how to add histograms to your analyzer, as already briefly demonstrated above. It has been kept here for historical reasons; after all, it does not hurt to show once more how to include a histogram in your analyzer.
Eventually, you may want your analyzer to record something rather than just print out track numbers. The mkedanlzr script has an option %LOCALHISTOEX%histo to generate analyzer code which writes a histogram into a root file. If you use it together with the option %LOCALHISTOEX%track, it will access the tracks, find their charge and fill a histogram.
To use this option, go to the
CMSSW_%LOCALWBRELEASE%/src
directory and create a new subdirectory
mkdir Histo
cd Histo
Create a new analyzer module
mkedanlzr %LOCALHISTOCMD%
Go to the
HistoAnalyzer
directory and compile the code
cd HistoAnalyzer
scram b
In %LOCALHISTODIR%%LOCALCONFHISTO%, change the data file to 'file:/afs/cern.ch/cms/Tutorials/TWIKI_DATA/TTJets_8TeV_53X.root' (the file we were already using in %LOCALCONFFILE%), and change 'ctfWithMaterialTracks' to 'generalTracks' by commenting it out as follows:
process.demo = cms.EDAnalyzer('HistoAnalyzer'
# , tracks = cms.untracked.InputTag('ctfWithMaterialTracks')
, tracks = cms.untracked.InputTag('generalTracks')
)
Run the analyzer
cmsRun %LOCALHISTODIR%%LOCALCONFHISTO%
A root file histo.root has been created; you can inspect the histogram in root:
root histo.root
new TBrowser
Double-click on ROOT Files, then on histo.root, then on demo, and you will find the histogram "charge", which is specified in Histo/HistoAnalyzer/src/HistoAnalyzer.cc.
Detailed instructions on how to write out a histogram file can be found on the page TFileService: a ROOT Histogram Service.
Review status
4.1.3 Introduction to CMS Configuration Files
Complete:
Detailed Review status
Goals of this page
After reading this page you will understand the general structure and use of configuration files and configuration file fragments for running CMS analysis jobs.
This discussion-only tutorial provides an introduction to configuration files which are used to configure
cmsRun jobs. A complete working example is given.
Contents
Introduction
The CMS software framework uses a “software bus” model, where data are stored in the event, which is passed to a series of modules. A single executable,
cmsRun, is used, and the modules are loaded at runtime. A configuration file defines which modules are loaded, in which order they are run, and with which configurable parameters they are run. Note that this is not an interactive system. The entire configuration is defined once, at the beginning of the job, and cannot be changed during running. This design facilitates the tracking of
event provenance, that is, the processing history of the event.
Full details about configuration files are given in
SWGuideAboutPythonConfigFile.
Configuration files in CMS
All CMS code is run by passing a config file (
_cfg.py
) to the CMSSW executable,
cmsRun.
cmsRun <Configuration File>
for example:
cmsRun MyConfig_cfg.py
Configurations are written using the Python language.
Using the Python interpreter, one can quickly check the Python syntax of
the configuration and run many (but not all) of the checks in the
CMS Python module
FWCore.ParameterSet.Config
.
python MyConfig_cfg.py
(Python interpreter)
After Python finishes importing and executing the configuration file,
all components will have been loaded into the program.
Contents of a typical configuration file
A config file consists (typically) of the following parts as data members of a "cms.Process" object (of your naming):
- A source (which might read Events from a file or create new empty events)
- A collection of modules (e.g. EDAnalyzer, EDProducer, EDFilter) which you wish to run, along with customised settings for parameters you wish to change from default values
- An output module to create a ROOT file which stores all the event data. (When running an Analyzer module, the histograms produced are not event data, so an output module is not needed in that case)
- A path which will list in order the modules to be run
Each config file is created from discrete building blocks which specify a component of the
cmsRun program and configure it via its parameter set.
A configuration file written using the Python language can be created as:
- a top level file, which is a full process definition (naming convention is
_cfg.py
) which might import other configuration files
- external Python file fragments, which are of two types:
- those used for module initialization (naming convention is
_cfi.py
)
- those used as configuration fragment (naming convention is
_cff.py
)
The fragments are often imported into the top level configuration file
using the "
process.load()
" method, which also attaches the imported
objects to the process. Usually fragments are imported into other fragments
using one of the import statements of the Python language.
Note: All imports create references, not copies. If a module is imported in two different places in a configuration, the imported symbols (variables) will reference the same objects: changing an object in one place changes it in every other place.
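A minimal sketch of this behaviour (the package, fragment and parameter names are hypothetical):
from MyPackage.MySubPackage.myProducer_cfi import myProducer
alias = myProducer      # a reference to the same object, not a copy
alias.threshold = 10.0  # this change is also seen by every other importer of 'myProducer'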
Most python config files will start with the line
import FWCore.ParameterSet.Config as cms
which imports our CMS-specific Python classes and functions.
Standard fragments are available in the CMSSW release's
Configuration/StandardSequences/python/
area.
They can be read in using syntax like
process.load("Configuration.StandardSequences.Geometry_cff")
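For illustration, process.load is roughly equivalent to importing the fragment and attaching its contents to the process by hand, using the cms.Process.extend method:
import Configuration.StandardSequences.Geometry_cff as geometry_cff
process.extend(geometry_cff)  # attaches all objects defined in the fragment to the process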
In the Python language, a line that starts with "
#
" is a comment.
The word "module" has two meanings. A Python module is a file containing
Python code and the word also refers to the object created by importing a Python file.
In the other meaning, EDProducers, EDFilters, EDAnalyzers,
and OutputModules are called modules.
In order to make sure that your Python module can be imported by
other Python modules:
- Place it in the
python
subdirectory of a package
- Be sure your SCRAM environment is set up
- Go to your package and do
scram b
or scram b python
The above steps are needed only once. The correctness of a Python config is checked at a basic level every time the scram command is used.
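As an illustration of the resulting import path (the package and file names are hypothetical): a fragment saved as MyPackage/MySubPackage/python/myFragment_cfi.py becomes importable, after scram b python, as
from MyPackage.MySubPackage.myFragment_cfi import *
Note that the python subdirectory does not appear in the dotted module path.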
Examples of configuration files in Python
In the example here, the configuration will open the files test.root and anotherTest.root. It will run a producer that creates a track collection and adds it to the data. The combined set of events from the two input files, along with the new collection of tracks, will be saved in the file test2.root. For each event, the MyRandomAnalyzer module will run on the event.
_cfg file
import FWCore.ParameterSet.Config as cms
#set up a process named RECO
processName = "RECO"
process = cms.Process(processName)
# configures the source that reads the input files
process.source = cms.Source ("PoolSource",
fileNames=cms.untracked.vstring(
'file:test.root',
'file:anotherTest.root'
)
)
# configure the producer TrackFinderProducer
process.tracker = cms.EDProducer('TrackFinderProducer',
threshold=cms.untracked.double(5.6)
)
# configure your analyzer
process.MyModule = cms.EDAnalyzer('MyRandomAnalyzer',
numBins = cms.untracked.int32(100),
minBin = cms.untracked.double(0),
maxBin = cms.untracked.double(100)
)
# configure the output module
process.out = cms.OutputModule("PoolOutputModule",
fileName = cms.untracked.string("test2.root")
)
# Defines which modules and sequences to run
process.mypath = cms.Path(process.tracker*process.MyModule)
# A list of analyzers or output modules to be run after all paths have been run.
process.outpath = cms.EndPath(process.out)
Many examples of
_cfg.py
can be found in
/CMSSW/PhysicsTools/PatAlgos/test
and
/CMSSW/PhysicsTools/PatExamples/test
.
_cfi.py file
This example is taken from /CMSSW/PhysicsTools/PatAlgos/python/cleaningLayer1/electronCleaner_cfi.py.
import FWCore.ParameterSet.Config as cms
cleanPatElectrons = cms.EDProducer("PATElectronCleaner",
## pat electron input source
src = cms.InputTag("selectedPatElectrons"),
# preselection (any string-based cut for pat::Electron)
preselection = cms.string(''),
# overlap checking configurables
checkOverlaps = cms.PSet(
muons = cms.PSet(
src = cms.InputTag("cleanPatMuons"),
algorithm = cms.string("byDeltaR"),
preselection = cms.string(""), # don't preselect the muons
deltaR = cms.double(0.3),
checkRecoComponents = cms.bool(False), # don't check if they share some AOD object ref
pairCut = cms.string(""),
requireNoOverlaps = cms.bool(False), # overlaps don't cause the electron to be discarded
)
),
# finalCut (any string-based cut for pat::Electron)
finalCut = cms.string(''),
)
_cff.py
This example is taken from /CMSSW/PhysicsTools/PatAlgos/python/cleaningLayer1/cleanPatCandidates_cff.py.
The module initialization file above is included using the line
from PhysicsTools.PatAlgos.cleaningLayer1.electronCleaner_cfi import *
in the _cff.py fragment file below:
import FWCore.ParameterSet.Config as cms
from PhysicsTools.PatAlgos.cleaningLayer1.electronCleaner_cfi import *
from PhysicsTools.PatAlgos.cleaningLayer1.muonCleaner_cfi import *
from PhysicsTools.PatAlgos.cleaningLayer1.tauCleaner_cfi import *
from PhysicsTools.PatAlgos.cleaningLayer1.photonCleaner_cfi import *
from PhysicsTools.PatAlgos.cleaningLayer1.jetCleaner_cfi import *
from PhysicsTools.PatAlgos.producersLayer1.hemisphereProducer_cfi import *
#FIXME ADD MHT
# One module to count objects
cleanPatCandidateSummary = cms.EDAnalyzer("CandidateSummaryTable",
logName = cms.untracked.string("cleanPatCandidates|PATSummaryTables"),
candidates = cms.VInputTag(
cms.InputTag("cleanPatElectrons"),
cms.InputTag("cleanPatMuons"),
cms.InputTag("cleanPatTaus"),
cms.InputTag("cleanPatPhotons"),
cms.InputTag("cleanPatJets"),
)
)
cleanPatCandidates = cms.Sequence(
cleanPatMuons *
cleanPatElectrons *
cleanPatPhotons *
cleanPatTaus *
cleanPatJets *
cleanPatCandidateSummary
)
Using Python Interactively
One can use Python interactively to understand the config files a bit more.
python -i MyConfig_cfg.py
and then it takes you to the Python prompt (
>>>
) where you can type
Python statements interactively. Many things are possible. Two examples
follow.
- Print the entire configuration in Python format with all the imported objects expanded. This might be much larger than the top-level configuration file if many objects are imported.
>>> print process.dumpPython()
- Print one particular attribute of the process. For example, if the process contains a path labeled "p", then print the path.
>>> process.p
OR
>>> print process.p.dumpPython()
When you are done, type CONTROL-D to quit the Python interpreter.
Cloning of Python Process
As mentioned above, all imports create references, not copies: changing an object in one place changes it everywhere. Thus, if standard module configurations are imported, replace statements should be used with care; parameter changes happen globally, so other configs could be affected. The standard solution to this problem is to clone the module, changing parameters in the process.
The standard syntax for cloning is
from aPackage import oldName
newName = oldName.clone(changedParameter = 42)
or
from aPackage import oldName as _oldName
newName = _oldName.clone(changedParameter = 42)
The second form is better if the symbol oldName is not needed and
this occurs in a fragment that might be imported with the process
load function or a "from aModule import *" statement. Symbols starting
with an underscore are not imported in these cases.
An example is below. Here we are NOT importing the module but defining it right here, called patMuonBenchmarkGeneric. You can see that by cloning it we avoid the possible problem mentioned at the beginning of this section, save a lot of repetition, and change only the input parameters that we need to; in this case InputTruthLabel, BenchmarkLabel and InputRecoLabel.
import FWCore.ParameterSet.Config as cms
patMuonBenchmarkGeneric = cms.EDAnalyzer("GenericBenchmarkAnalyzer",
OutputFile = cms.untracked.string('benchmark.root'),
InputTruthLabel = cms.InputTag('muons'),
minEta = cms.double(-1),
maxEta = cms.double(2.8),
recPt = cms.double(0.0),
deltaRMax = cms.double(0.3),
PlotAgainstRecoQuantities = cms.bool(True),
OnlyTwoJets = cms.bool(False),
BenchmarkLabel = cms.string( 'selectedPatMuons' ),
InputRecoLabel = cms.InputTag('selectedLayer1Muons')
)
patElectronBenchmarkGeneric = patMuonBenchmarkGeneric.clone(
    InputTruthLabel = 'pixelMatchGsfElectrons',
    BenchmarkLabel = 'selectedPatElectrons',
    InputRecoLabel = 'selectedPatElectrons',
)
patJetBenchmarkGeneric = patMuonBenchmarkGeneric.clone(
    InputTruthLabel = 'iterativeCone5CaloJets',
    BenchmarkLabel = 'selectedPatJets',
    InputRecoLabel = 'selectedPatJets',
)
patPhotonBenchmarkGeneric = patMuonBenchmarkGeneric.clone(
    InputTruthLabel = 'photons',
    BenchmarkLabel = 'selectedPatPhotons',
    InputRecoLabel = 'selectedPatPhotons',
)
patTauBenchmarkGeneric = patMuonBenchmarkGeneric.clone(
    InputTruthLabel = 'pfRecoTauProducer',
    BenchmarkLabel = 'selectedPatTaus',
    InputRecoLabel = 'selectedPatTaus',
)
Information Sources
These information sources were used when the original versions of this TWiki page were written. They are out of date and contain much obsolete information; the second one requires a password.
Further Documentation
A complete description of the configuration file language, parameters used and some python tips are provided in the following software guide pages:
Review status
Responsible:
SudhirMalik
Last reviewed by:
DavidDagenhart - 15 Dec 2016
4.1.4 Config Editor
Detailed Review status
Introduction
The ConfigEditor is a tool for browsing and editing Python configuration files in CMSSW. It allows
- Browsing:
- Visualise the complete structure of a Config File and all included config files (via import)
- Inspect the parameters of modules
- Track which modules use input from which other modules
- Track in which file certain modules can be found
- Open the definition of certain modules in the user's favourite editor
- Editing:
- Create user configuration files which start from some standard configuration (e.g. PAT) and contain all changes to it.
- Modify parameters of modules
- Apply tools (e.g. PAT tools)
The manual of the ConfigEditor can be found at SWGuideConfigEditor#Manual. If you have problems using the ConfigEditor, please also look at SWGuideConfigEditor#Troubleshooting. Note that the ConfigEditor is not designed to debug Python-related errors in Python config files; for those, have a look at SWGuidePythonTips or at SWGuideAboutPythonConfigFile. Feel free to contact AndreasHinzmann with feedback and suggestions!
Getting started
Set up CMSSW and run ConfigEditor
- If you are running remotely (via ssh -Y):
edmConfigEditorSSH
Older releases:
- CMSSW >=3_8_2:
cmsrel CMSSW_3_8_2
cd CMSSW_3_8_2/src
cmsenv
edmConfigEditor
- CMSSW >=3_5_4
cmsrel CMSSW_3_5_4
cd CMSSW_3_5_4/src
cmsenv
edmConfigEditor
If you have freeze problems using edmConfigEditor over ssh, use this recipe:
addpkg FWCore/GuiBrowsers V00-00-38
scram b
- CMSSW 3_1_1 - 3_5_3 PAT tools are not yet supported. All other functionality is in place.
cmsrel CMSSW_3_1_1
cd CMSSW_3_1_1/src
cmsenv
addpkg FWCore/GuiBrowsers V00-00-38
scram b
edmConfigEditor
- CMSSW 2_1_X - 3_1_1 Only browsing functionality in place. No editing of configuration possible. The name of the tool is
edmConfigBrowser
.
- Set up CMSSW(>=2_1_X) environment:
cmsrel CMSSW_2_1_17
cd CMSSW_2_1_17/src
cmsenv
- Download the latest version (0.3.2) of ConfigBrowser from AFS (tgz-file, 20 MB):
cp /afs/cern.ch/user/h/hinzmann/www/web/ConfigBrowser.tgz .
or from the website http://cern.ch/test-configbrowser/ConfigBrowser.tgz.
- Extract (total size 54 MB) and run ConfigBrowser:
tar -xzf ConfigBrowser.tgz
cd ConfigBrowser
./edmConfigBrowser
Browsing configuration files
- Inspect configuration files using your favorite editor
- Select your favorite editor in the menu Config -> Choose editor... (e.g. type emacs)
- Select an object and open the config file in which the selected object is defined via menu Config -> Open editor...
If you edit configuration files, you modify the default settings in your project CMSSW area! Be aware of what you are doing. If you want to create a user-defined configuration, follow the instructions in the next section.
Creating user configuration files
- Create a new user configuration
- Apply changes to the standard configuration and save them into the user configuration
- Edit parameters of selected modules using the Property View. Your changes will appear in the user configuration code generated by ConfigEditor.
- You can apply tools (e.g. PAT tools) via the menu Config -> Apply tool... Select a tool, choose its parameters and press apply.
- Save the user configuration via the menu File -> Save as...
Tutorials
Browsing PAT with ConfigEditor (2010)
Writing PAT user configuration with ConfigEditor (2010)
Review status
4.2 Physics Analysis Toolkit (PAT)
Detailed Review status
Contents
Introduction
The
Physics Analysis Toolkit (PAT) is a high-level analysis layer providing the Physics Analysis Groups (PAGs) with easy access to the algorithms developed by Physics Objects Groups (POGs) in the framework of the CMSSW offline software. It aims at fulfilling the needs of most CMS analyses, providing both ease-of-use for beginners and flexibility for advanced users. PAT is fully integrated into CMSSW and an integral part of any release of CMSSW. Below you will find the most important links into the documentation of PAT:
- You will find the basic documentation of PAT as part of the WorkBookPAT subpages,
- You will find the documentation of more advanced or more technical details as part of the SWGuidePAT subpages.
PAT Documentation
You can find all the basic documentation of PAT, which is part of the WorkBook, listed below:
You can find more detailed documentation on the
SWGuidePAT subpages:
Review status
Responsible:
RogerWolf
Review: reviewed
4.2.1 Physics Analysis Toolkit (PAT): Data Formats
Detailed Review status
Contents
Introduction
In this section you will find a description of the main Data Formats of PAT
(1) . The hierarchy of pat::Candidates is illustrated in the figure
below. For each candidate you will find the following information:
- A common description of what information is available from the pat::Candidate (1) .
- A list of extra information that may be stored in the pat::Candidate with respect to the reco::Candidate.
- A list of all collections that you can expect to be present in a standard pat::Tuple
(1) .
- A link into the most recent Doxygen documentation, providing access to all(!) available member functions of the corresponding pat::Candidate (1) .
(1) To learn more about regularly used expressions (like pat::Candidate, pat::Tuple, ...) have a look to the WorkBookPATGlossary.
The figure is clickable. By clicking on the collection of interest you will be led to the corresponding description in the text.
All pat::Candidates are derived from their corresponding reco::Candidates. They are equivalent to a reco::Candidate but carry extra features and extra information to facilitate analysis work. The difference between a pat::Candidate and a reco::Candidate can be summarised in the following points:
- More Information: A pat::Candidate can carry more information than a reco::Candidate. This extra information ranges from object resolutions and correction factors for the jet energy scale to parameterised reconstruction efficiencies and more.
- Facilitated Access: The access to all relevant information is facilitated via the pat::Candidate. You might be familiar with the procedure to retrieve electron identification information via edm::Association maps, or the b-tag information associated with a given reco::Jet. This is non-trivial and, in addition, very error prone. Via a pat::Candidate all this information is accessible via simple member functions. The parameters, algorithms and procedures used to arrive at this information represent the finest certified knowledge of the corresponding POGs.
- Event Size Reduction: The creation of a pat::Tuple supports you in reducing the enormous amount of information stored on RECO/AOD to the amount of information that you really need for your analysis. Sensible default configurations give you a very good starting point for first iterations. The event provenance is maintained. A pat::Tuple is well defined by the configuration file with which it was produced and a set of parameters. A pat::Tuple can very easily be exchanged between analysis groups for cross checks.
Note:
For a description how to access pat::Candidates via
edm::Handles
in the full framework or FWLite have a look at the
WorkBookPATTutorial.
pat::Photon
Description:
Data type to describe reconstructed photons, which is derived from the reco::Photon. In addition to the reco::Photon it may contain the following extra information:
- isolation calculated in a pre-defined cone in different sub-detectors of CMS. You can choose between the isolation recommended by the corresponding POG or a user-defined isolation accessible via the member function userIsolation(pat::IsolationKeys key), where the available IsolationKeys are defined in Isolation.h.
Note: user defined isolation needs to be configured during the pat::Candidate creation step as described in WorkBookPATConfiguration.
- isolation deposits for user defined isolation and more detailed studies
- match on generator level
- match on trigger level
- reconstruction efficiency
- object resolution
- Id variables
- shower shape variables
For a description of the algorithms to determine this extra information have a look at
WorkBookPATWorkflow.
Access:
The labels of all photon collections, which are created during the standard PAT workflow(s) are given below:
For a description of each listed module have a look at
WorkBookPATConfiguration.
Member Functions:
You can find a list of all member functions at
pat::Photon
.
pat::Lepton
Description:
The pat::Electron, pat::Muon and pat::Tau inherit some common functionalities from the common base class pat::Lepton. Almost all of these features are related to isolation. The information held in these objects is:
- isolation calculated in a pre-defined cone in different sub-detectors of CMS. You can choose between the isolation recommended by the corresponding POG or a user-defined isolation accessible via the member function userIsolation(pat::IsolationKeys key), where the available IsolationKeys are defined in Isolation.h.
Note: user defined isolation needs to be configured during the pat::Candidate creation step as described in WorkBookPATConfiguration#UserIsolation.
- isolation deposits in different regions of the detector. To study the details of isolation deposits see SWGuideIsoDeposits and the corresponding reference
.
Member Functions:
You can find a list of all member functions at
pat::Lepton
.
pat::Electron
Description:
Data type to describe reconstructed electrons, which is derived from the reco::Electron. In addition to the reco::Electron it may contain the following extra information:
- isolation in a predefined cone
- isolation deposits for user defined isolation and more detailed studies
- match on generator level
- match on trigger level
- reconstruction efficiency
- object resolution
- Id variables
- shower shape variables
The electrons are sorted in descending order in pT. For a description of the algorithms used to determine this extra information have a look at
WorkBookPATWorkflow.
Access:
The labels of all electron collections, which are created during the standard PAT workflow(s) are given below:
For a description of each listed module have a look at
WorkBookPATConfiguration.
Member Functions:
You can find a list of all member functions at
pat::Electron
.
pat::Muon
Description:
Data type to describe reconstructed muons, which is derived from the reco::Muon. In addition to the reco::Muon it may contain the following extra information:
- tracks from a TeV refit (pat::Muon::pickyMuon(), pat::Muon::tpfmsMuon())
- isolation in a predefined cone
- isolation deposits for user defined isolation and more detailed studies
- match on generator level
- match on trigger level
- reconstruction efficiency
- object resolution
- two MuonMETCorrectionData objects. These objects allow one to correct/uncorrect the caloMET and tcMET on the fly for the pat::Muon. They are obtained from the Muon-MET correction value maps that are described on WorkBookMetAnalysis#HeadingThree and WorkBookMetAnalysis#HeadingFour.
The muons are sorted in descending order in pT. For a description of the algorithms used to determine this extra information have a look at
WorkBookPATWorkflow.
Access:
The labels of all muon collections, which are created during the standard PAT workflow(s) are given below:
For a description of each listed module have a look at
WorkBookPATConfiguration.
Member Functions:
You can find a list of all member functions at
pat::Muon
.
pat::Tau
Description:
Data type to describe reconstructed taus, which is derived from the reco::Tau. In addition to the reco::Tau it may contain the following extra information:
- isolation in a predefined cone
- match on generator level
- match on trigger level
- reconstruction efficiency
- object resolution
- tau discrimination variables
The taus are sorted in descending order in pT. By default particle-flow taus are used, though this may be switched to calorimeter taus during the step of pat::Tuple creation. For a description of the algorithms used to determine this extra information have a look at
WorkBookPATWorkflow.
Access:
The labels of all tau collections, which are created during the standard PAT workflow(s) are given below:
For a description of each listed module have a look at
WorkBookPATConfiguration.
By default, the collections
patTaus
and
selectedPatTaus
correspond to particle flow taus, with no further selection applied.
Note: since reco::PFTaus are in one-to-one correspondence with reco::PFJets,
one pat::Tau object will be added to the patTaus and selectedPatTaus collections for each particle flow jet.
As a consequence, most pat::Taus contained in the patTaus and selectedPatTaus collections are actually fakes.
The collection
cleanPatTaus
is the collection intended for analysis.
It has the following selection applied in order to reduce the fake-rate:
- ≥ 1 PFChargedHadron of pT > 1.0 GeV within matching cone of size dR = 0.1 around jet-axis (leading charged hadron exists)
- ≥ 1 PFGamma or PFChargedHadron of pT > 5.0 GeV within matching cone of size dR = 0.1 around jet-axis (pT of leading hadron greater than 5 GeV)
- no PFChargedHadron of pT > 1.0 GeV and no PFGamma of pT > 1.5 GeV outside of the signal cone of size dR = 5.0/jetET within the jet (tau is isolated)
- discriminator against muons passed
- discriminator against electrons passed
- either 1 or 3 PFChargedHadrons within signal cone of size dR = 5.0/jetET (one or three prong tau)
The above selection criteria represent a compromise between tau id efficiency and fake-rate that is reasonable for many analyses.
In case you find a non-negligible background of electrons remaining in the cleanPatTaus collection, you may want to cut away the region between the ECAL barrel and endcap, in which the probability for electrons to be misidentified as taus is particularly high; a sketch is given below.
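A hypothetical sketch of such a cut, using the cleaner's string-based preselection (the boundary values are illustrative of the barrel-endcap transition region, not a PAT recommendation; note that overwriting the preselection replaces any selection already configured there):
process.cleanPatTaus.preselection = cms.string(
    'abs(eta) < 1.4442 || abs(eta) > 1.566'  # veto the ECAL transition region
)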
Member Functions:
You can find a list of all member functions at
pat::Tau
.
pat::Jet
Description:
Data type to describe reconstructed jets, which is derived from the reco::Jet. In addition to the reco::Jet it may contain the following extra information:
- b tag information
- jet energy corrections
- jet charge
- associated tracks
- jet flavour
- match on generator level (reco::GenJet and parton)
- match on trigger level
- reconstruction efficiency
- object resolution
By default anti-kT jets with R=0.5 are used, though this may be switched to any other kind of jet collection during the step of pat::Tuple creation. Alternative jet collections may also be added to the standard pat::Tuple content depending on the analysis purposes. The energy scale of the pat::Jets is corrected to L3Absolute, while the correction factors for L2Relative, L5Flavour and L7Parton are also stored within the jet. They may be used to produce a clone of the jet, which is corrected to the specified level or fully uncorrected ('raw'). For more details on the arguments to be used have a look at the description of the pat::JetCorrFactors below. The jets are sorted in descending order in (L3Absolute-calibrated) pT. For a description of the algorithms used to determine this extra information have a look at
WorkBookPATWorkflow.
Access:
The labels of all jet collections, which are created during the standard PAT workflow(s) are given below:
(1) Note that when adding jet collections to the event content of the standard pat::Tuple, the additional collections will be accompanied by a postfix identifying them. This postfix could typically be of the form cleanPatJetsAK5Calo
.
For a description of each listed module have a look at
WorkBookPATConfiguration.
Member Functions:
You can find a list of all member functions at
pat::Jet
.
pat::MET
Description:
Data type to describe reconstructed MET, which is derived from the reco::MET. In addition to the reco::MET it may contain the following extra information:
- type 1 MET corrections
- muon MET corrections
- match on generator level
- match on trigger level
- reconstruction efficiency
- object resolution
The type 1 MET correction corresponds to the standard jet selection in use. By default pat::MET is type-1 corrected and muon corrected. The muon corrections may be altered on the fly depending on the selection criteria for muons. For more details have a look at the description of the
WorkBookPATWorkflow.
Access:
The labels of the possible MET collections produced in a PAT workflow are given below:
The patMETs collection is produced in the standard workflow, but the track-corrected MET collection (patMETsTC) and the particle-flow MET collection (labeled patMETsPF) may be added to the content of the standard pat::Tuple at the user's choice (see patTuple_addJets_cfg.py); a sketch is given below. For a description of each listed module have a look at
WorkBookPATConfiguration.
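A hedged sketch of adding these two collections, assuming the metTools helpers shipped with the PAT version in use:
from PhysicsTools.PatAlgos.tools.metTools import addTcMET, addPfMET
addTcMET(process, 'TC')  # adds the patMETsTC collection
addPfMET(process, 'PF')  # adds the patMETsPF collection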
Member Functions:
You can find a list of all member functions at
pat::MET
.
pat::Trigger
For a complete description of all trigger related data formats within PAT have a look at
SWGuidePATTrigger#Data_Formats.
Helper Classes
In this section you will find a description of the most important helper classes of PAT. Direct access to these classes might not be necessary, though it can give you more insight into the mechanisms behind the generation of, and access to, the main data formats of PAT.
pat::JetCorrFactors
Description:
Data type to host all potential correction factors of the factorised calibration ansatz (L1 to L7) to be embedded in the pat::Jet. More than one set of jetCorrFactors can be embedded in one pat::Jet, which is useful for comparing different sets of calibrations or for systematic studies. The pat::JetCorrFactors class is used internally when creating clones of pat::Jets with different calibration levels; this requires the correction step and, where applicable, the flavour as arguments. The following correction levels (and flavours) are supported:
| correction step | flavour | correction type |
| Uncorrected | - | uncalibrated jet |
| L1Offset | - | offset correction |
| L2Relative | - | relative inter-eta correction |
| L3Absolute | - | absolute pt correction |
| L2L3Residual | - | residual corrections between data and MC after the L3Absolute correction |
| L4Emf | - | correction as a function of the jet emf |
| L5Flavor | gluon, uds, charm, bottom | hadron level correction for gluons, light quarks, charm, bottom |
| L6UE | gluon, uds, charm, bottom | underlying event correction for gluons, light quarks, charm, bottom |
| L7Parton | glu, uds, charm, bottom | parton level correction for gluons, light quarks, charm, bottom |
Note: the flavour strings are case insensitive. By default the set of jetCorrFactors embedded in the pat::Jet corresponds to the JetMET default, and the corrections for L1Offset, L2Relative, L3Absolute, L5Flavor and L7Parton are stored in addition. All other correction factors will be transparent. By default the pat::Jet will be corrected to the following level (if available from the list of correction levels), with the following priorities:
- to L2L3Residual, if available in the list of correction levels
- to L3Absolute, if available in the list of correction levels
- to Uncorrected (i.e. no correction will be applied) if neither L2L3Residual nor L3Absolute is available in the list of correction levels
In the latter case, the PatJetProducer will issue a warning during patTuple production. For a description of the algorithms used to determine this extra information have a look at
WorkBookPATWorkflow.
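As an illustration of how the embedded correction levels are steered, a hedged sketch assuming a PAT version in which the jetCorrFactors producer exposes its correction levels as a vstring:
process.patJetCorrFactors.levels = cms.vstring(
    'L1Offset', 'L2Relative', 'L3Absolute'  # add 'L2L3Residual' when running on data
)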
Access:
The labels of all jet collections, which are created during the standard PAT workflow(s) are given below:
For a description of the module have a look at
WorkBookPATConfiguration.
Member Functions:
You can find a list of all member functions at
pat::JetCorrFactors
.
pat::CandKinResolution
Description:
Data type to host potential object resolutions. It can be embedded into each pat::Object. For a detailed description have a look at
SWGuidePATResolutions.
Access:
Note that adding object resolutions and embedding them into a pat::Candidate is OPTIONAL. They are not part of the default PAT event content. To add resolutions to an arbitrary pat::Candidate, add a service to your cfg file for patTuple production and switch the embedding of the resolution information for the corresponding pat::Candidate to True. An example for b jets (in one central bin in eta) is given below:
## define the service ('stringResolution' is provided by a resolutions cff fragment, see below)
bjetResolution = stringResolution.clone(parametrization = 'EtEtaPhi',
functions = cms.VPSet(
cms.PSet(
bin = cms.string('0.000<=abs(eta) && abs(eta)<0.087'),
et = cms.string('et * (sqrt(0.0901^2 + (1.035/sqrt(et))^2 + (6.2/et)^2))'),
eta = cms.string('sqrt(0.00516^2 + (1.683/et)^2)'),
phi = cms.string('sqrt(0.0024^2 + (3.159/et)^2)'),
),
# ... in reality you will have more bins in eta here of course ...
),
constraints = cms.vdouble(0)
)
# embed the resolutions inot the patJets collection
patJets.addResolutions = True
patJets.resolutions = bjetResolution
You can find an example cff file containing resolutions for muons, jets and MET, as recently derived within the TopPAG based on MC truth information in a typical madgraph ttbar sample, in stringResolutions_etEtaPhi_cff.py in the python/recoLayer0 directory. Please note that this is ONLY AN EXAMPLE file to demonstrate the technique. It does not relieve you of the burden to derive and test the resolutions which are applicable and adequate for your analysis.
Member Functions:
You can find a list of all member functions at pat::CandKinResolution.
Review status
4.2.2 Physics Analysis Toolkit (PAT): Workflow
Detailed Review status
Contents
Introduction
In this section you will find a description of the standard workflow of PAT. The creation of pat::Candidates starts from the AOD (or RECO) tier. The PAT workflow (1) is organised in a main production sequence (called patDefaultSequence) and a parallel sequence for associating trigger information. An overview of the sequences is given in the figure below. The production of a standard pat::Tuple (1) consists of the following phases:
- Pre-Creation: this phase includes steps, which can or have to be performed before the creation of pat::Candidates starts.
- Candidate Creation: in this phase all extra information will be contracted into the pat::Candidates.
- Candidate Selection: in this phase an extra selection step on the newly created pat::Candidate collections is supported. By default no selection is applied though.
- Candidate Disambiguation: in this phase fully configurable extra information is added to support the disambiguation of reconstructed objects.
- PAT Trigger Event: in this phase all trigger information available in the AOD (RECO) event content is re-keyed and made available to the end-user.
(1) To learn more about regularly used expressions (like pat::Candidate, pat::Tuple, ...) have a look at the WorkBookPATGlossary.
The figure is clickable. By clicking on the collection of interest you will be led to the corresponding description in the text.
You can browse the workflow for the creation of pat::Candidates exploiting the edmConfigEditor as explained in WorkBookConfigEditor. The result of the pat::Candidate creation is a pat::Tuple, which in the default configuration contains all supported pat::Candidate collections and may be used transiently (i.e. on the fly) or be made persistent at your home institute. Have a look at WorkBookPATDataFormats to learn more about the data format and content of each corresponding pat::Candidate. The pat::Tuple should NOT be viewed as a replacement of the AOD (RECO) event content but rather as a replacement of the flat user ntuple, which is commonly used in analyses. The Analysis Tools (AT) group of CMS strongly recommends the use of pat::Tuples instead of flat ntuples for the following reasons:
- Performance: pat::Tuples and ntuples will have the same I/O performance, so there is no I/O advantage of a flat ntuple over a pat::Tuple. Additionally, the error-prone programming overhead of defining and addressing root branches drops out when using pat::Tuples.
- Compliance: The pat::Tuple is fully compliant with CMSSW. You can further process it with CMSSW (full framework) on a batch system or via crab, with FWLite executables, or with bare Root macros, however you like. The pat::Tuple can have a flexible event content that may range between the size of the AOD format (of ~100 kB/evt) and an extremely slim ntuple (of ~5 kB/evt). This change can be achieved by python configuration alone. The PAT team supports you with tools, sensible defaults and experience.
- Provenance: The pat::Tuple maintains full event provenance. Even if very flexible in its content, the objects are always well defined by the configuration file they have been created with and a manageable number of configuration parameters. It is therefore very easy to interchange pat::Tuples between analysis groups for cross checks. You can still transform the pat::Tuple into a flat ntuple at a later point in your analysis if you really think that you will gain from that; you can use tools provided by the Analysis Tools (AT) group for this. The PAT team is also working on improved support to further slim down a pat::Tuple after it has been created.
- Support: PAT channels the finest certified knowledge of all POGs. With the standard pat::Tuple you start from a sensible default, which you can easily use for first iterations to define the event content you really need for your specific analysis, and PAT supports the coherent reduction of information with many features and tools. The PAT support team organises bi-monthly tutorials, of which every second one is a full-week in-person tutorial. You will profit from a wealth of well maintained documentation TWikis and the support of the PAT team.
Note:
Several tools exist to facilitate the customisation and adaptation of the PAT workflow, for the creation and the content of a pat::Tuple, to the analyst's needs. To learn more about these tools have a look at SWGuidePATTools. All these tools may also be applied via pull-down menus in the edmConfigEditor. When made persistent, a reasonably configured pat::Tuple has a size between 6 kB/evt and 50 kB/evt, with the largest space consumption usually associated to the pat::Jet collections. In the current default configuration the pat::Tuple has a size of ~15 kB/evt. To learn more about the space consumption of a pat::Tuple in the current default configuration have a look here. Note that these size estimates have been performed on 100 events of an inclusive pythia ttbar sample in CMSSW_3_5_7. To learn more about size estimates of pat::Tuples have a look at SWGuidePATEventSize.
Pre-Creation
The main sequence starts with a pre-creation phase of the AOD or RECO event contents, if necessary(!), and the Monte Carlo matching, which has to be provided before being folded into the pat::Candidates. During the pre-creation phase useful information or transient data types may be created (if not yet available), which include the following information:
- A set of isolation variables using information from several detectors, which are non-standard and therefore not in the AOD (RECO) event content.
- The association of non-standard POG supported object identification variables, which are not yet part of the AOD (RECO) event content definition.
- A set of tau discriminator variables, which are just too numerous and analysis specific to be all part of the AOD (RECO) event content.
- A compilation of JEC factors for jets, which are retrieved from the event setup and prepared to be folded into the pat::Jet.
- A set of b-discriminator values, which might not be all available in the AOD (RECO) event content.
- The association of charged tracks.
- The corresponding jet-charge.
The complete information is provided in cff files, which collect all objects needed to create a certain pat::Candidate collection as defined in the cfi file that contains the module definition of the corresponding pat::Candidate EDProducer. You can find the corresponding cff and cfi files in the producersLayer1 directory of the PatAlgos package. You can find the corresponding interfaces to the AOD (RECO) event content in the recoLayer0 directory of the PatAlgos package. They may change from CMSSW release to CMSSW release, as parts of the objects created during this pre-creation phase might migrate into later definitions of the AOD (RECO) event content. Other parts might need to be added when running on input files which had been produced at times when the release that you are using did not yet exist.
Note:
You can view this pre-creation phase as an interface between what is naturally available in the AOD (RECO) of your input files (at the time the files were produced) and the full feature catalogue of the current CMSSW release you want to make use of for your analysis, which might not be in sync with the input files you are planning to use.
Note:
Another important step is the matching to Monte Carlo truth information, which is only applicable to simulated events. Many matching options exist, which are based on the objects' similarity in their spatial and kinematic distributions. To find out more about these options have a look at SWGuidePATMCMatching. The corresponding module definitions as used in the standard configuration of PAT can be found in the mcMatchLayer0 directory of the PatAlgos package. This step is part of the standard configuration of PAT. It can easily be switched off by the use of PAT tools as described on SWGuidePATTools; a sketch is given below.
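A minimal sketch of switching the matching off with the removeMCMatching tool (check SWGuidePATTools for the exact signature in your release):
## switch off all Monte Carlo matching, e.g. when running on real data
from PhysicsTools.PatAlgos.tools.coreTools import removeMCMatching
removeMCMatching(process, ['All'])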
Candidate Creation
After the pre-creation phase all extra information, which has been translated into a proper format, is collapsed into the pat::Candidates. In the default configuration the following collections are produced during this step:
- patPhotons
- patElectrons
- patMuons
- patTaus
- patJets
- patMET
You can find the corresponding module definitions in the producersLayer1 directory of the PatAlgos package. The production of all supported pat::Candidate collections is steered by the sequence patCandidates in the patCandidates_cff.py file. These collections are the inputs for the further operations of candidate selection and candidate disambiguation. In the default configuration of the PAT workflow they will not be made persistent and should therefore be viewed as transient object collections. To learn more about how to change the configuration of the content of a single pat::Candidate have a look at WorkBookPATConfiguration.
Candidate Selection
PAT supports intuitive candidate selection via the SWGuidePhysicsCutParser. With the help of this tool you may apply selections on any available member function of a given object via intuitive selection strings (e.g. "pt>30 & abs(eta)<3"). This will provide you with a new collection of the objects that passed the applied selection. During the phase of pat::Candidate selection such selections are applied to the pat::Candidate collections, resulting in the following new collections, which form the first potentially persistent layer of pat::Candidates in the default configuration of the PAT workflow:
- selectedPatPhotons
- selectedPatElectrons
- selectedPatMuons
- selectedPatTaus
- selectedPatJets
- patMET
You can find the corresponding cfi files in the selectionLayer1 directory of the PatAlgos package. The production of all supported selected pat::Candidate collections is steered by the sequence selectedPatCandidates in the selectedPatCandidates_cff.py file. The selection modules can have any pat::Candidate collection as input, so you can also refine a collection with a selectedPatCandidate collection as input, which will provide you with an additional collection of the candidates that passed this refined selection.
Note:
In the default configuration of the PAT workflow all selection strings are empty: NO SELECTION IS APPLIED. So you will get object collections which, apart from their module label, are equivalent to the previously produced pat::Candidate collections. Note that physically there is no selection on MET; therefore there is no selection module provided for MET. (This might change in the future.) You might want to apply any object or event selection suitable for your analysis already during pat::Tuple production, or at any later step of your analysis, by replacing the dummy selections in the corresponding configuration files or corresponding clones of them (see the example below). Note that this is always possible with any input collection of pat::Candidate type, even after the pat::Tuple has been made persistent.
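For example, replacing the dummy selection of the muon selector in your cfg file could look as follows (the cut values are of course analysis dependent):
## apply a simple kinematic selection to the selectedPatMuons collection
process.selectedPatMuons.cut = 'pt > 20. && abs(eta) < 2.1'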
Note:
The selection modules only act on object collections. To apply an event selection based on a collection of selected objects you might want to make use of the candidateCountFilter modules provided in the same directory of the package; a sketch is given below. Have a look at WorkBookPATConfiguration to find out more about the configuration of selectedPatCandidate collections.
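A minimal sketch of such an event filter, assuming the PATCandViewCountFilter module as used in the selectionLayer1 cfi files (check the parameter names in your release):
## reject events without at least one selected muon
process.countPatMuons = cms.EDFilter("PATCandViewCountFilter",
    src       = cms.InputTag("selectedPatMuons"),
    minNumber = cms.uint32(1),
    maxNumber = cms.uint32(999999),
)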
Candidate Disambiguation
Due to the way objects are reconstructed in CMS, some of the physics measurements, like energy deposits in the calorimeter, might be reconstructed several times as different physics objects. For example, a cluster in the electromagnetic calorimeter can be interpreted as a photon, an electron or a jet and might therefore be present in several different candidate collections at the same time. Depending on the analysis purpose this might be of interest, harmless, or harmful. For more complex analyses, which aim at the interpretation of a whole event containing candidates of several types, a disambiguation step has to take place. Removing objects from collections should remain a very analysis dependent, high level analysis step though, which has to be performed with care and viewed with caution. A common frame, as a well defined basis of object disambiguation that combines the finest knowledge and expertise of all analysis groups within CMS, is clearly desirable.
PAT supports such a common frame of object disambiguation in a user configurable and well defined way. In the default configuration of the PAT workflow no objects are removed from the collections, but overlapping objects from other collections are marked by edm::Ptrs. These pointers are stored in a new collection of pat::Candidates. In the default configuration of the PAT workflow the collections are:
- cleanPatPhotons
- cleanPatElectrons
- cleanPatMuons
- cleanPatTaus
- cleanPatJets
- patMET
You can find the corresponding cfi files in the cleaningLayer1 directory of the PatAlgos package. The production of all supported clean pat::Candidate collections is steered by the sequence cleanPatCandidates in the cleanPatCandidates_cff.py file. In later analysis steps the stored edm::Ptrs can be used to make an assessment of the overlapping candidate(s) of other collections. An overlapping candidate can still be dropped from further considerations, or its energy can be used to modify the energy of the object of interest, as for the example of a jet and an overlapping electron.
Note:
PAT supports selection and preselection steps for the overlap checking, and overlap checking based on the deltaR distance or on common seeds to super clusters, as used for the disambiguation of electrons and photons. Note that in the default configuration of the PAT workflow NO OBJECT IS REMOVED from the pat::Candidate collections. Only the extra information on overlapping objects from other collections is added to the pat::Candidates. Also note that physically there is no cleaning on MET; therefore there is no cleaning module provided for MET.
In the default configuration of the PAT workflow overlap checking is performed in the following way:
- cleanPatMuons: are taken as selectedPatMuons (no overlap checking).
- cleanPatElectrons: are checked for overlaps with cleanPatMuons.
- cleanPatPhotons: are checked for overlaps with cleanPatElectrons.
- cleanPatTaus: are checked for overlaps with cleanPatMuons and cleanPatElectrons.
- cleanPatJets: are checked for overlaps with cleanPatMuons, cleanPatPhotons, cleanPatElectrons, cleanPatTaus and isolated cleanPatElectrons.
The latter overlap check on isolated cleanPatElectrons was added as an example of a preselection on overlapping objects; a configuration sketch is given below.
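A minimal configuration sketch of such an overlap check, assuming the parameter layout of the jet cleaner cfi file in the cleaningLayer1 directory of the PatAlgos package (check the exact names in your release):
## mark jets overlapping with clean muons within deltaR < 0.5 (no jets are dropped)
process.cleanPatJets.checkOverlaps.muons = cms.PSet(
    src                 = cms.InputTag("cleanPatMuons"),
    algorithm           = cms.string("byDeltaR"),
    preselection        = cms.string(""),
    deltaR              = cms.double(0.5),
    checkRecoComponents = cms.bool(False),
    pairCut             = cms.string(""),
    requireNoOverlaps   = cms.bool(False),
)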
Note:
The stored edm::Ptrs point to the object in the other candidate collection. It is therefore easily possible at any later analysis stage to check, e.g., whether an electron that is overlapping with a jet itself has an overlap with a photon, and so on. All flexibility to judge is therefore still in the hands of the analyst. The clean pat::Candidate collections should be viewed as the default output of the standard pat::Tuple. If you already know that you won't need any object disambiguation, it can be completely removed from the default PAT workflow by the use of PAT tools. You can view this step of pat::Candidate processing as OPTIONAL.
Note:
It is always possible to add object disambiguation information to any input collection of pat::Candidate type, even after the pat::Tuple has been made persistent. So you have the option to create a persistent pat::Tuple at your institute and apply a user-defined object disambiguation on the fly at a later point in your analysis. To learn more about the way PAT supports object disambiguation have a look at SWGuidePATCrossCleaning. To learn more about the tools to configure the default PAT workflow have a look at SWGuidePATWorkflow.
PAT Trigger Event
Besides the main PAT production sequence, trigger information is re-keyed into a human readable form in the pat::TriggerEvent, and the PAT trigger matching provides the opportunity to connect PAT objects with trigger objects for further studies. Thus the user can easily figure out exactly which object(s) fired a given trigger. The production sequence starts from the PAT trigger producers and folds the information into the pat::TriggerObject, pat::TriggerFilter and pat::TriggerPath classes. These classes are contracted into the pat::TriggerEvent, which should be viewed as the central entry point to all trigger information. A detailed description can be found at SWGuidePATTrigger. The association of trigger objects and the pat::Candidates (produced in the main sequence) is provided by the pat::TriggerMatching and is described in SWGuidePATTrigger#PATTriggerMatcher. A detailed description of the pat::TriggerEvent can be found at SWGuidePATTrigger#TriggerEvent. A minimal configuration sketch is given below.
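A minimal sketch of adding the trigger information to your own pat::Tuple with the trigger tools of PatAlgos (see SWGuidePATTrigger for the options supported in your release):
## add the pat::TriggerEvent and the PAT trigger producer sequence to the process
from PhysicsTools.PatAlgos.tools.trigTools import switchOnTrigger
switchOnTrigger( process )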
Review status
4.2.3 Physics Analysis Toolkit (PAT): Configuration
Detailed Review status
Contents
Introduction
In this section you will find a description of the configuration of the workflow and content of the pat::Candidates (1). You may access this content from the pat::Candidate member functions. To learn more about the content of each pat::Candidate have a look at WorkBookPATDataFormats. The following groups of modules are described below:
The modules for the creation of the pat::TriggerEvent and the matching of trigger candidates to pat::Candidates are described in SWGuidePATTrigger#Producers. You can easily browse each configuration file exploiting the edmConfigEditor as explained on WorkBookConfigEditor.
(1) To learn more about regularly used expressions (like pat::Candidate, pat::Tuple, ...) have a look at the WorkBookPATGlossary.
Candidate Configuration
In this section you will find a description of how to (re-)configure the content of each pat::Candidate and the corresponding default configurations. In the standard workflow the pat::Candidate collections are the first object collections to be produced after a pre-creation phase. The basic pat::Candidate collections are further processed in a selection phase and finally in an optional phase of candidate disambiguation. To learn more about the PAT workflow have a look at WorkBookPATWorkflow. You will find the configuration files discussed on this page in the python/producersLayer1 directory of the PatAlgos package.
pat::Photon
Description:
The standard module to produce pat::Photons is the PATPhotonProducer module. It produces the patPhotons collection. You can get an automated description of the module and its parameters using the edmPluginHelp tool described on SWGuideConfigurationValidationAndHelp. You can find the module implementation here.
Configurables:
You can find the full list of configuration parameters and the standard configuration of the PATPhotonProducer module here.
Algorithms:
You can find links to the algorithms used for the implementation of the pat::Photons below. Note that the default implementations of algorithms as recommended and used by the corresponding POG or the Analysis Tools (AT) group are used. Therefore the following links will mostly point to the documentation provided by these groups.
pat::Electron
Description:
The standard module to produce pat::Electrons is the PATElectronProducer module. It produces the patElectrons collection. You can get an automated description of the module and its parameters using the edmPluginHelp tool described on SWGuideConfigurationValidationAndHelp. You can find the module implementation here.
Configurables:
You can find the full list of configuration parameters and the standard configuration of the patElectrons module here.
Algorithms:
You can find links to the algorithms used for the implementation of the pat::Electrons below. Note that the default implementations of algorithms as recommended and used by the corresponding POG or the Analysis Tools (AT) group are used. Therefore the following links will mostly point to the documentation provided by these groups.
pat::Muon
Description:
The standard module to produce pat::Muons is the PATMuonProducer module. It produces the patMuons collection. You can get an automated description of the module and its parameters using the edmPluginHelp tool described on SWGuideConfigurationValidationAndHelp. You can find the module implementation here.
Configurables:
You can find the full list of configuration parameters and the standard configuration of the patMuons module here.
Algorithms:
You can find links to the algorithms used for the implementation of the pat::Muons below. Note that the default implementations of algorithms as recommended and used by the corresponding POG or the Analysis Tools (AT) group are used. Therefore the following links will mostly point to the documentation provided by these groups.
- WorkBookMCTruthMatch: a workbook description of MC truth matching tools provided by the AT group.
- SWGuidePATMCMatching: a description of the PAT implementation of the MC truth matching tools described above.
- SWGuidePATTrigger: a complete description of the pat::Trigger.
- SWGuidePATTriggerMatching: a description of the matching of trigger information to offline reconstructed objects.
- SWGuideMuonIsolation: a description of the standard isolation algorithms and of the standard content of isoDeposits.
- WorkBookMuonID: a description of the object identification as recommended by the Muon POG. You can find a detailed study on muon identification in the note CMS-AN2008/098.
pat::Tau
Description:
The standard module to produce pat::Taus is the PATTauProducer module. It produces the patTaus collection. You can get an automated description of the module and its parameters using the edmPluginHelp tool described on SWGuideConfigurationValidationAndHelp. You can find the module implementation here.
Configurables:
You can find the full list of configuration parameters and the standard configuration of the patTaus module here.
Algorithms:
You can find links to the algorithms used for the implementation of the pat::Taus below. Note that the default implementations of algorithms as recommended and used by the corresponding POG or the Analysis Tools (AT) group are used. Therefore the following links will mostly point to the documentation provided by these groups.
pat::Jet
Description:
The standard module to produce pat::Jets is the PATJetProducer module. It produces the patJets collection. You can get an automated description of the module and its parameters using the edmPluginHelp tool described on SWGuideConfigurationValidationAndHelp. You can find the module implementation here.
Configurables:
You can find the full list of configuration parameters and the standard configuration of the patJets module here.
Algorithms:
You can find links to the algorithms used for the implementation of the pat::Jets below. Note that the default implementations of algorithms as recommended and used by the corresponding POG or the Analysis Tools (AT) group are used. Therefore the following links will mostly point to the documentation provided by these groups.
pat::MET
Description:
The standard module to produce pat::MET is the PATMETProducer module. It produces the patMETs collection. You can get an automated description of the module and its parameters using the edmPluginHelp tool described on SWGuideConfigurationValidationAndHelp. You can find the module implementation here.
Configurables:
You can find the full list of configuration parameters and the standard configuration of the patMETs module here.
Algorithms:
You can find links to the algorithms used for the implementation of the pat::MET below. Note that the default implementations of algorithms as recommended and used by the corresponding POG or the Analysis Tools (AT) group are used. Therefore the following links will mostly point to the documentation provided by these groups.
User-Defined Isolation for Leptons
reco::Photons, reco::Electrons and reco::Muons contain pre-defined isolation values recommended by the corresponding POGs. Via the pat::Candidate you can access the most common isolation values via the following member functions:
- trackIso(): isolation in the tracker.
- ecalIso(): isolation in the ECAL.
- hcalIso(): isolation in the HCAL.
- caloIso(): combined ECAL and HCAL isolation.
Apart from that you may have access to more detailed isolation information (especially for the reco::Muon) via the corresponding reco::Candidate member functions. In parallel, PAT offers the possibility to exploit a flexible user-defined isolation for photons, electrons, muons, taus and generic particles (i.e. tracks). You can access it via the member function userIsolation(pat::IsolationKey key) as defined under WorkBookPATDataFormats#UserIsolation. The corresponding isolation values need to be filled during the creation process of the pat::Candidate collection in the PSet userIsolation in the cfi file of the corresponding pat::Candidate collection. This PSet may consist of one or more additional PSets with the following names:
PSet | IsolationKey |
tracker | pat::TrackerIso |
ecal | pat::EcalIso |
hcal | pat::HcalIso |
calo | pat::CaloIso |
user | pat::User1Iso, pat::User2Iso, pat::User3Iso, pat::User4Iso, pat::User5Iso |
where user stands for a std::vector of PSets, which in the current implementation can have a maximum length of 5. The PSets can be of the following two types:
SimpleIsolator:
cms.PSet(
src = cms.InputTag("edm::ValueMap"),
)
where edm::ValueMap is expected to be an edm::ValueMap associating a predefined isolation value to the original reco::Candidate.
IsoDepositIsolator:
cms.PSet(
src = cms.InputTag("edm::ValueMap"),
deltaR = cms.double(0.3),
mode = cms.string("mode"),
veto = cms.double(0.01),
threshold = cms.double(0.05)
)
where edm::ValueMap is expected to be an edm::ValueMap associating a predefined set of isoDeposits to the original reco::Candidate, and the other parameters have the following meaning:
parameter | type | meaning |
deltaR | double | size of the isolation cone |
veto | double | size of a potential veto cone |
threshold | double | size of a given threshold value (in GeV) |
mode | string | mode to sum up the isoDeposits |
The parameter deltaR is a key parameter for calculating user-defined isolation values from isoDeposits, while the other parameters are optional. The parameter mode allows for the following configurations:
- sum: absolute sum of isoDeposits in cone.
- sumRelative: relative sum of isoDeposits in cone.
- max: absolute maximum of isoDeposits in cone.
- maxRelative: relative maximum of isoDeposits in cone.
- sum2: absolute sum of isoDeposits in cone squared.
- sum2Relative: relative sum of isoDeposits in cone squared.
- count: number of isoDeposits in cone.
Note:
User-defined isolation is also available for pat::GenericParticles, e.g. for tracks. A configuration sketch is given below.
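A minimal sketch of filling the userIsolation PSet in the cfi file of a pat::Candidate collection, combining an IsoDepositIsolator and a SimpleIsolator as described above (the input labels muIsoDepositTk and myIsolationValueMap are hypothetical placeholders):
patMuons.userIsolation = cms.PSet(
    ## IsoDepositIsolator, accessible via userIsolation(pat::TrackerIso)
    tracker = cms.PSet(
        src    = cms.InputTag("muIsoDepositTk"),
        deltaR = cms.double(0.3),
    ),
    ## SimpleIsolator, accessible via userIsolation(pat::User1Iso)
    user = cms.VPSet(
        cms.PSet(
            src = cms.InputTag("myIsolationValueMap"),
        ),
    ),
)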
Selection Configuration
There is a string based PATCandidateSelector to apply selections to the pat::Candidate collections in an intuitive way. The selection string can make use of all kinds of functions and of all member functions of the corresponding pat::Candidate. For a description of the string parser have a look at SWGuidePhysicsCutParser.
The PATCandidateSelector produces the selectedPatCandidate collections, which form the standard pat::Candidate collections if object disambiguation is not applied. By default the input to the PATCandidateSelector are the patCandidates, but it can be any kind of pat::Candidate collection. In the standard workflow of pat::Candidate creation the selectedPatCandidate collections are produced directly after the creation of the basic pat::Candidates. For a detailed description of the standard workflow of pat::Candidate production have a look at WorkBookPATWorkflow.
You will find the configuration files for the selection of pat::Candidates in the python/selectionLayer1 directory of the PatAlgos package.
Links to the default selection strings for the most important pat::Candidate collections are listed below:
Note: there is no PATCandidateSelector for pat::MET (this might change in the future). A configuration sketch for a refined selection is given below.
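As an illustration, a refined selection can be defined by cloning an existing selector module in your cfg file (the module name highPtPatMuons is hypothetical):
## refine the standard muon selection with a tighter kinematic cut
process.highPtPatMuons = process.selectedPatMuons.clone(
    src = cms.InputTag('selectedPatMuons'),
    cut = cms.string('pt > 20. && abs(eta) < 2.1'),
)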
Disambiguation Configuration
Due to the organisation of the physics object reconstruction in CMS, an energy deposition in the CMS detector may be reconstructed as different objects at the same time. E.g. an energy deposit in the electromagnetic calorimeter may be reconstructed as an electron, a photon and a jet with electromagnetic fraction close to 1 at the same time. There is a configurable PATCandidateCleaner to add extra information for candidate disambiguation to the most important pat::Candidate collections for use during later analysis steps. For a more detailed description of the PATCandidateCleaner have a look at WorkBookPATWorkflow.
The PATCandidateCleaner produces the cleanPatCandidate collections, which form the standard pat::Candidate collections including object cross cleaning. By default the input to the PATCandidateCleaner are the selectedPatCandidates, but it can be any kind of pat::Candidate collection. In the standard workflow of pat::Candidate production the cleanPatCandidate collections are produced after the selectedPatCandidate collections have been created. For a detailed description of the standard workflow of pat::Candidate creation have a look at WorkBookPATWorkflow.
You will find the configuration files for candidate disambiguation in the python/cleaningLayer1 directory of the PatAlgos package.
Links to the default configuration for the most important pat::Candidate collections are listed below:
Note:
There is no PATCandidateCleaner for pat::MET.
Review status
4.2.4 Physics Analysis Toolkit (PAT): Tutorial
Contents
Introduction
There exists a series of PAT Tutorials, which are held at CERN. You can find the next PAT Tutorial announced on SWGuidePAT. Have a look at indico and the CMS e-learning area to learn more about the tutorials that have already taken place. On this page you can find a set of lectures, documentation and exercises for your personal studies. The following pages might be of help before getting started:
If you have more questions or need more support have a look at SWGuidePAT#Support.
Lectures
You can find a list of recent lectures and documentation pages about PAT below:
- Lecture 1.1: PAT introduction. (Jul 2013)
- Lecture 1.2: Learn how to access pat::Candidates with an EDAnalyzer or FWLite executable. (Jul 2013)
- Lecture 2.1: Learn more about how PAT supports object cross cleaning. (Jul 2013)
- Lecture 2.2: Learn more about how to configure the workflow and event content of a pat::Tuple. (Jul 2013)
- Lecture 3.2: Learn more about how PAT supports matching to trigger objects or Monte Carlo truth information. (Jul 2013)
- Lecture 4.1: Learn something about triggers, especially the HLT, in CMS from the software point of view. (Jul 2013)
- Lecture 4.2: Learn more about the production, configuration and use of the pat::TriggerEvent. (Jul 2013)
- Lecture 4.3: Learn more about trigger matches to pat::Candidates. (Jul 2013)
- Lecture 5: Learn more about access to particle flow objects in PAT (PFBRECO). (Jul 2013)
These lectures are updated for each PAT Tutorial, roughly every six months. The date of the last update is indicated in brackets.
Exercises
You can find a set of TWiki based tutorial exercises below. All exercises require:
Requirements
- A working CERN account.
- At least 2 GB of free space in your AFS directory. In case you need more space, increase your quota at the CERN Resources Portal: AFS Workspaces.
- Some knowledge of C++. Have a look at WorkBookBasicCPlusPlus for a quick overview of the C++ you need.
- Some minimal knowledge of CMSSW.
Note:
The web course presented below is part of the PAT Tutorial, which takes place regularly at CERN and in other places. When following the PAT Tutorial, the answers to questions marked in RED on these TWiki pages should be filled into the exercise form that has been introduced at the beginning of the tutorial. Also the solutions to the exercises found on these TWiki pages should be filled into the form. The exercises are marked in three colours, indicating whether the exercise is basic (obligatory), continuative (recommended) or optional (free). The colour coding is summarized in the table below:
Color Code | Explanation |
(colour marker) | Basic exercise, which is obligatory for the PAT Tutorial. |
(colour marker) | Continuative exercise, which is recommended for the PAT Tutorial to deepen what has been learned. |
(colour marker) | Optional exercise, which shows interesting applications of what has been learned. |
Basic exercises are obligatory, and the solutions to these exercises should be filled into the exercise form during the PAT Tutorial.
Pre-exercises:
If you are feeling unsure about the preparation required for the exercises indicated below, you can first follow the pre-exercises.
Week exercise
For the Tutorial there is an overarching exercise from the area of top physics that the participants should work on during the week. You can find this example on WorkBookPATWeekExercise (responsible: Felix Hoehle) (Jul 2014).
PAT Tutorial Exercises:
- Week exercise: In this exercise you apply what you have learned to a simplified physics problem: WorkBookPATWeekExercise.
- Monday:
- Tuesday:
- Wednesday:
- Thursday:
- Friday:
The WorkBookPAT exercises form a basic set of what you have to know about the production of pat::Tuples and the access to pat::Candidates. The SWGuidePAT exercises are more comprehensive demonstrations and examples of the features and the use of PAT.
More Exercises
You can find some more comprehensive examples of the use of PAT within special POG and PAG applications below:
PAG Applications of PAT:
In this section you can find a set of links to tutorials illustrating the typical use of PAT within your Physics Analysis Group (PAG):
POG Applications of PAT:
Note:
These tutorials cover the typical but only basic use of PAT in a simple POG/PAG specific application. They provide you with the following information:
- A small number of EDAnalyzer(s) or EDProducer(s) of manageable complexity.
- A workflow which highlights one or two PAG specific analysis aspects.
- The typical content of the pat::Candidate which your PAG has agreed upon.
- A short summary of how the pat::Candidates were configured (when different from the PAT default).
- The PAG specific use of pat::Candidates and tools (e.g. certain aspects of cleaning, or a special use of isolation variables).
- Links to further PAG specific information.
The examples may refer to other parts of official code.
Review status
4.2.4.1 PAT Exercise 01: How to use the PAT Documentation
Contents
Objectives
- Learn how to use the CMSSW and EDM documentation resources.
- Learn how to use the PAT documentation resources.
Introduction
This tutorial will explain how to efficiently use the PAT Documentation. Entry points for typical questions arising about PAT will be given. Finally you may want to go through a small set of exercises to get used to the PAT Documentation. While basic questions are addressed in the WorkBook Documentation, advanced problems are discussed in the SWGuide Documentation.
Note:
This web course is part of the PAT Tutorial, which takes place regularly at CERN and in other places. When following the PAT Tutorial, the answers to questions marked in RED should be filled into the exercise form that has been introduced at the beginning of the tutorial. Also the solutions to the Exercises should be filled into the form. The exercises are marked in three colours, indicating whether the exercise is basic (obligatory), continuative (recommended) or optional (free). The colour coding is summarized in the table below:
Color Code | Explanation |
(colour marker) | Basic exercise, which is obligatory for the PAT Tutorial. |
(colour marker) | Continuative exercise, which is recommended for the PAT Tutorial to deepen what has been learned. |
(colour marker) | Optional exercise, which shows interesting applications of what has been learned. |
Basic exercises are obligatory, and the solutions to these exercises should be filled into the exercise form during the PAT Tutorial.
WorkBook Documentation
The main entry page to the lightweight PAT documentation is the WorkBookPAT. The following basic questions are addressed there:
You can find the basic information about configuring CMSSW jobs in WorkBookConfigFileIntro and WorkBookConfigEditor.
SWGuide Documentation and more
The main entry page to the full PAT documentation is the SWGuidePAT. Here you can find links to all available PAT documentation. This section will discuss how you can find information to address the following two main questions:
- What are the parameters of a specific PAT object?
The primary source of information is the WorkBookPATDataFormats. Every PAT object is described there in detail. Please read through this page to get familiar with the functionality available in the PAT objects. You may want to know, for example, how b-tag information is accessible for jets within PAT. A link to the Reference Manual for the pat::Jet is provided. Note that you may have to choose the appropriate CMSSW version here. In case you want to study the source code you may look into the DataFormats definition on GitHub or search the code using the LXR cross reference. You will find that pat::Jet::bDiscriminator and pat::Jet::getPairDiscri return the b-discriminators stored in the pat::Jet; a short FWLite sketch is given below.
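A minimal FWLite sketch in python, assuming a pat::Tuple file called patTuple.root that contains a selectedPatJets collection:
## list the b-discriminators stored in the pat::Jets of the first event
from DataFormats.FWLite import Events, Handle
events = Events('patTuple.root')
handle = Handle('std::vector<pat::Jet>')
for event in events:
    event.getByLabel('selectedPatJets', handle)
    for jet in handle.product():
        for pairDiscri in jet.getPairDiscri():
            print pairDiscri.first, pairDiscri.second
    break  # only the first event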
Question 1 a): In addition to the 4-momentum what extra information is available in a PAT muon (you can summarize in categories)?
- How can I configure a specific PAT algorithm?
The primary sources of information are WorkBookPATWorkflow and WorkBookPATConfiguration. There the configuration of the object content, selection and disambiguation is described. Please read through these pages to get familiar with the algorithms available in PAT and how they can be configured. You may want to know, for example, how to change the jet algorithm and apply a cut on the jet pt within PAT. Changing the jet algorithm is a matter of the object (content) configuration itself. Applying a jet pt cut is a matter of the object selection. The object configuration section for the pat::Jet provides a link to the Reference Manual for the pat::Jet producer. The parameter jetSource is responsible for changing the jet algorithm. The section about the configuration of the object selection provides a link to the PatAlgos package on GitHub, which contains the configuration file jetSelector_cfi.py. The parameter cut is responsible for applying a cut on the jet pt.
Before configuring PAT you should have a look at the available SWGuidePATTools, which provide a set of common methods to configure PAT. There you will find the tool switchJetCollection, which allows you to change the jet algorithm via the parameter jetCollection.
A much more intuitive way to configure PAT algorithms is by using the WorkBookConfigEditor. It allows you to browse the available parameters and tools, and to automatically create user configuration files with a customised configuration of PAT.
Question 1 b): What are the three main steps of the PAT workflow and what is the purpose of each step?
Question 1 c): What tools are available to configure the content of a PAT electron?
Exercises
Before leaving this page try to do the following exercises:
Exercise 1 a):
Find out how to access the JetID in PAT. You will find the solution here:
Exercise 1 b):
Find out what jet algorithm is used for PAT jets by default. You will find the solution here:
Exercise 1 c):
Find out whether there is a selection applied to pat::Jets by default. You will find the solution here:
Exercise 1 d):
Find out how to add an additional jet collection to the PAT event content. What PAT tool can be used? You will find the solution here:
Solution in SWGuidePATTools:
## load the standard PAT config
process.load("PhysicsTools.PatAlgos.patSequences_cff")

## load the jetTools of PAT
from PhysicsTools.PatAlgos.tools.jetTools import *

labelAK4PFCHS   = 'AK4PFCHS'
postfixAK4PFCHS = 'Copy'
addJetCollection(
    process,
    postfix   = postfixAK4PFCHS,
    labelName = labelAK4PFCHS,
    jetSource = cms.InputTag('ak4PFJetsCHS'),
    jetCorrections = ('AK5PFchs', cms.vstring(['L1FastJet', 'L2Relative', 'L3Absolute']), 'Type-2') # FIXME: Use proper JECs, as soon as available
)
process.out.outputCommands.append(
    'drop *_selectedPatJets%s%s_caloTowers_*'%( labelAK4PFCHS, postfixAK4PFCHS )
)
Note:
In case of problems don't hesitate to contact the SWGuidePAT#Support. Having successfully finished Exercise 1 you might want to proceed to Exercise 2 of the WorkBookPATTutorial to learn more about how to create a pat::Tuple from a RECO input file. For an overview you can go back to the WorkBookPATTutorial entry page.
Review status
4.2.4.2 Exercise 02: How to create a PAT Tuple
Contents
Objectives
From this example you will learn how to create a PAT Tuple on your own. The PAT team provides standard configuration files to create PAT Tuples, which already contain sensible default values for many parameters. The default output of a standard PAT Tuple consists of the most important collections of high-level analysis objects like photons, electrons, muons, tau leptons, jets and MET. We will go through the different production steps and change a few basic parameters. This example will provide you with the following information:
- How to run PAT with a standard configuration.
- A basic understanding of the configuration.
- A basic understanding of the output.
We will first run the configuration file to create a default patTuple.root file as provided by the PAT team and then go through this configuration file step by step to learn more about its internals and about tools to control what it is doing. A few questions will help you to check your learning success. After going through these explanations you should be well equipped to solve the Exercises at the end of this page.
Note:
This web course is part of the PAT Tutorial, which takes place regularly at CERN and in other places. When following the PAT Tutorial, the answers to questions marked in RED should be filled into the exercise form that has been introduced at the beginning of the tutorial. Also the solutions to the Exercises should be filled into the form. The exercises are marked in three colours, indicating whether the exercise is basic (obligatory), continuative (recommended) or optional (free). The colour coding is summarized in the table below:
Color Code | Explanation |
(colour marker) | Basic exercise, which is obligatory for the PAT Tutorial. |
(colour marker) | Continuative exercise, which is recommended for the PAT Tutorial to deepen what has been learned. |
(colour marker) | Optional exercise, which shows interesting applications of what has been learned. |
Basic exercises are obligatory, and the solutions to these exercises should be filled into the exercise form during the PAT Tutorial.
Setting up of the environment
First of all connect to lxplus and go to some work directory. You can choose any directory, provided that you have enough space. You need ~500 MB of free disk space for this exercise. We recommend you to use your ~/scratch0 space. In case you don't have this (or do not even know what it is), check your quota by typing fs lq and follow this link, or change your quota here (click manage near "Linux and AFS" -> Increase quota to 2 GB). If you don't have enough space, you may instead use the temporary space (/tmp/your_user_name), but be aware that this is lost once you log out of lxplus (or within something like a day). We will assume in the following that you have such a ~/scratch0 directory.
ssh your_lxplus_Name@lxplus6.cern.ch
[ ... enter password ... ]
cd scratch0/
Create a directory for this exercise (to avoid interference with code from the other exercises).
mkdir exercise02
cd exercise02
Create a local CMSSW release area and enter it.
cmsrel CMSSW_7_4_1_patch4
cd CMSSW_7_4_1_patch4/src
The first command creates all directories needed in a local CMSSW release area. Setting up the CMSSW environment is done by invoking the following script:
cmsenv
Having a look at the PAT configuration file
To check out the recommended modules and configuration files we want to use, do the following:
git cms-addpkg PhysicsTools/PatAlgos
git cms-addpkg FWCore/GuiBrowsers
git cms-merge-topic -u CMS-PAT-Tutorial:CMSSW_7_4_1_patTutorial
scram b -j 4
This will check out and compile the standard PAT, its workflow and the necessary classes. The standard PAT sequence is run on standard AOD/RECO input files, specified in PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py. Have a look at this file with your favourite editor, for example xemacs:
xemacs PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
You will see that this configuration file does not contain too much code:
## import skeleton process
from PhysicsTools.PatAlgos.patTemplate_cfg import *

## let it run
process.p = cms.Path( process.patDefaultSequence )

##
from PhysicsTools.PatAlgos.patInputFiles_cff import filesRelValProdTTbarAODSIM
process.source.fileNames = filesRelValProdTTbarAODSIM
# process.maxEvents.input = 100

## output file
process.out.fileName = 'patTuple_standard.root'
We left out the commented regions on the screen. Most of the information you might have expected to see in a typical cmsRun cfg file (like the cms.OutputModule to write output to an edm output file) is imported from another file: the patTemplate_cfg.py, which we import in the very first line. You may want to have a closer look into the patTemplate_cfg.py file (which is indeed imported by all example files in the test directory of the PatAlgos package). The patTemplate_cfg.py file provides defaults so that any python script (in the test directory) in which it is included runs fine. But of course you can override these defaults by parameter replacements at any time. We will do this later during this exercise. It is indeed the recommended way to change parameters by replacements instead of editing the files the parameters were defined in. The reason is obvious: editing the source file would change the parameter for any further cfg file which imports the source file, with no chance to control what you did.
Try the following:
cmsRun PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
In case cmsRun gives a message that the file patTemplate_cfg.py does not exist, check the following: Did you set up your environment properly? Did you forget to type cmsenv? You can test this by checking the environment variable CMSSW_RELEASE_BASE (echo $CMSSW_RELEASE_BASE) and seeing whether it points to the release you are expecting to work with (which is /afs/cern.ch/cms/slc5_amd64_gcc462/cms/cmssw/CMSSW_7_4_1_patch4 in our case).
To have a look into the patTemplate_cfg.py file, open it with your favourite editor:
xemacs PhysicsTools/PatAlgos/python/patTemplate_cfg.py &
We now come to the content of the patTemplate_cfg.py file. At the top level the process is defined:
import FWCore.ParameterSet.Config as cms
process = cms.Process("PAT")
This is followed by the configuration of the SWGuideMessageLogger and the output report:
## MessageLogger
process.load("FWCore.MessageLogger.MessageLogger_cfi")

## Options and Output Report
process.options = cms.untracked.PSet( wantSummary = cms.untracked.bool(True) )
This will just print a timing and trigger summary at the end of the job.
For this example we will only loop over 100 events from this test input file. The tool pickRelValInputFiles will just pick up an up-to-date RelVal file and return a string corresponding to the I/O protocol and the logical file name (LFN). You could replace it so that the code fragment looks like this:
## Source
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring(
        '/store/relval/CMSSW_7_4_0_pre8/RelValProdTTbar_13/AODSIM/MCRUN2_74_V7-v1/00000/44E1E4BA-50BD-E411-A57A-002618943949.root'
    )
)
## Maximal Number of Events
process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(100) )
For more information on how to configure the input source, have a look at the relevant section of SWGuidePoolInputSources. If you wish to only change the input files, you can start from the top level configuration (patTuple_standard_cfg.py) by adding the line:
process.source.fileNames = ['/store/relval/CMSSW_7_4_0_pre8/RelValProdTTbar_13/AODSIM/MCRUN2_74_V7-v1/00000/44E1E4BA-50BD-E411-A57A-002618943949.root']
and replace the example file by the file(s) of your choice. Note that the RECO or AOD content may have changed from release to release, so there is no guarantee that you will still be able to use an input file which has been produced with an older release than the one you are using.
In the following section of the patTemplate_cfg.py configuration file additional information is loaded into the process, which is needed by some components during the pat::Candidate production steps as explained in WorkBookPATWorkflow:
## Geometry and Detector Conditions (needed for a few patTuple production steps)
process.load("Configuration.Geometry.GeometryIdeal_cff")
process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")
from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, 'auto:startup')
process.load("Configuration.StandardSequences.MagneticField_cff")
Note: In this section the geometry information, the calibration and alignment information (note the global tag) and the magnetic field information are loaded into the process. They are needed, for instance, for the calculation of Jet/MET corrections. For simulation, the global tag should always correspond to the global tag used for the reconstruction of these events. This can be inferred from the dataset name, or by looking at the "Configuration files" linked in DBS. For data, the latest set of calibration constants and frontier conditions can be found at SWGuideFrontierConditions, which is maintained by the calibration and alignment group.
Typically the line that defines the global tag to be used would look like this:
process.GlobalTag.globaltag = cms.string('auto:run2_mc')
We are using the so-called autoCond struct to do this automatically for us here. In the following part of the configuration file the sequence for the production of pat::Candidates is loaded and made known to the process:
## Standard PAT Configuration File
process.load("PhysicsTools.PatAlgos.patSequences_cff")
The last part of the configuration file is concerned with the configuration of the output file:
process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string('patTuple.root'),
    ## save only events passing the full path
    #SelectEvents = cms.untracked.PSet( SelectEvents = cms.vstring('p') ),
    ## save PAT output; you need a '*' to unpack the list of commands
    ## 'patEventContent'
    outputCommands = cms.untracked.vstring('drop *', *patEventContentNoCleaning )
)
process.outpath = cms.EndPath(process.out)
We want to save the produced layer of pat::Candidates in the output file. We will only save those events which passed the whole path, and by default we drop everything else. Finally, we add the output module to the end path of our process.
Having a deeper look at the PAT configuration
The edmConfigEditor will allow you a detailed and interactive look into the configuration. You can start it and load your config file like this:
edmConfigEditor PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
You can take a look at the modules scheduled to run by choosing paths and then scheduled in the left tab.
You can choose the single modules by clicking on them in the middle tab. Choose selectedPatElectrons.
This is one of the last steps of pat::Candidate production: the final acceptance cuts. It is fed by the result of patElectrons.
Have a look at WorkBookPATConfiguration to learn more about how to configure pat::Candidates during their production. You can also have a look at the WorkBookPATDocNavigationExercise.
Question 2 a):
The selectedPatElectrons module has the patElectrons collection as input. How would you verify this? As the parameter embedGenMatch for the creation of pat::Electrons in the patElectrons collection module is switched to True, will the generator match also be embedded in the pat::Electrons of the selectedPatElectrons collection?
Note: you can exit the interactive python mode by hitting CTRL-d. For now, let us just make a small modification to the acceptance cuts above. We'll restrict ourselves to electrons with pt > 10 GeV/c. Open the file patTuple_standard_cfg.py in your favourite editor:
[edit] PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
[... go to the end ...]
and add the following line at the end:
process.selectedPatElectrons.cut = 'pt > 10. && abs(eta) < 12.'
Note: The syntax is not exactly the same as in cut = cms.string('') above, where the type is specified with cms.string(...). That's because we are replacing an existing attribute of selectedPatElectrons, so python already knows the type. Both forms are shown below.
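For illustration, both forms set the same parameter:
## replacing an existing attribute: python already knows the type
process.selectedPatElectrons.cut = 'pt > 10. && abs(eta) < 12.'
## equivalent, with the type given explicitly
process.selectedPatElectrons.cut = cms.string('pt > 10. && abs(eta) < 12.')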
Question 2 b):
You applied a selection on the transverse momentum of the electrons in the selectedPatElectrons collection. Will it also be applied to the electrons in the cleanPatElectrons collection?
Running pat::Candidate production on top of RECO/AOD
To start the production of pat::Candidates do the following:
cmsRun PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py >& output.log
Note: In this example the log output is redirected to the file output.log. You may want to have a look into this file after the processing is done. Open it in your favourite text editor; you will see that a message is printed for the input file opening and for each event. To learn how to configure these log messages further, have a look at SWGuideMessageLogger:
globaltag = PRE_STA71_V4::All
==> using COND/Services/RelationalAuthenticationService for auth, sys 1
==> using COND/Services/RelationalAuthenticationService for auth, sys 1
12-Jun-2014 15:32:18 CEST Initiating request to open file root://eoscms//eos/cms/store/relval/CMSSW_7_1_0_pre4_AK4/RelValProdTTbar/AODSIM/START71_V1-v2/00000/7A3637AA-28B5-E311-BC25-003048678B94.root?svcClass=default
12-Jun-2014 15:32:20 CEST Successfully opened file root://eoscms//eos/cms/store/relval/CMSSW_7_1_0_pre4_AK4/RelValProdTTbar/AODSIM/START71_V1-v2/00000/7A3637AA-28B5-E311-BC25-003048678B94.root?svcClass=default
Begin processing the 1st record. Run 1, Event 1, LumiSection 1 at 12-Jun-2014 15:32:29.202 CEST
Begin processing the 2nd record. Run 1, Event 2, LumiSection 1 at 12-Jun-2014 15:32:54.961 CEST
Begin processing the 3rd record. Run 1, Event 3, LumiSection 1 at 12-Jun-2014 15:32:55.008 CEST
Begin processing the 4th record. Run 1, Event 4, LumiSection 1 at 12-Jun-2014 15:32:55.038 CEST
Begin processing the 5th record. Run 1, Event 5, LumiSection 1 at 12-Jun-2014 15:32:55.109 CEST
[...]
Jump to the end of the processing, where you can find the timing and trigger report summary:
TrigReport ---------- Event Summary ------------
TrigReport Events total = 100 passed = 100 failed = 0
TrigReport ---------- Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport -------End-Path Summary ------------
TrigReport Trig Bit# Run Passed Failed Error Name
TrigReport 0 0 100 100 0 0 outpath
TrigReport ------ Modules in End-Path: outpath ------------
TrigReport Trig Bit# Visited Passed Failed Error Name
TrigReport 0 0 100 100 0 0 out
TrigReport ---------- Module Summary ------------
TrigReport Visited Run Passed Failed Error Name
TrigReport 100 100 100 0 0 out
TrigReport 0 0 0 0 0 ak4PFJetsPtrs
TrigReport 0 0 0 0 0 caloJetMETcorr
TrigReport 0 0 0 0 0 caloType1CorrectedMet
TrigReport 0 0 0 0 0 caloType1p2CorrectedMet
TrigReport 100 100 100 0 0 elPFIsoDepositCharged
TrigReport 100 100 100 0 0 elPFIsoDepositChargedAll
TrigReport 100 100 100 0 0 elPFIsoDepositGamma
TrigReport 100 100 100 0 0 elPFIsoDepositNeutral
TrigReport 100 100 100 0 0 elPFIsoDepositPU
TrigReport 0 0 0 0 0 elPFIsoValueCharged03NoPFId
TrigReport 0 0 0 0 0 elPFIsoValueCharged03PFId
TrigReport 100 100 100 0 0 elPFIsoValueCharged04NoPFId
TrigReport 100 100 100 0 0 elPFIsoValueCharged04PFId
TrigReport 0 0 0 0 0 elPFIsoValueChargedAll03NoPFId
TrigReport 0 0 0 0 0 elPFIsoValueChargedAll03PFId
TrigReport 100 100 100 0 0 elPFIsoValueChargedAll04NoPFId
TrigReport 100 100 100 0 0 elPFIsoValueChargedAll04PFId
TrigReport 0 0 0 0 0 elPFIsoValueGamma03NoPFId
TrigReport 0 0 0 0 0 elPFIsoValueGamma03PFId
TrigReport 100 100 100 0 0 elPFIsoValueGamma04NoPFId
TrigReport 100 100 100 0 0 elPFIsoValueGamma04PFId
[...]
In our case, it doesn't convey much information because none of the modules actually rejects events. Nevertheless, it confirms that 100 events were processed.
Note: In the context of analyses the name trigger report does not have anything to do with the L1 Trigger or HLT for physics events. It just tells you which modules were visited and whether events passed or failed when visiting the module. The fact that some electrons are rejected due to the object selection cuts that we defined above is not reflected here, since in this step only objects (i.e. electrons) within an object collection are rejected, not whole events. An object count filter, as defined in the PhysicsTools/PatAlgos/python/selectionLayer1 directory, would have an effect on whole events and hence be reflected in the summary report; a hedged sketch of such a filter follows.
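For illustration, a sketch of such an object count filter, modeled on the modules in that directory (the module label and the PATCandViewCountFilter parameters below follow the usual convention; check your release for the exact names):
process.countPatElectrons = cms.EDFilter("PATCandViewCountFilter",
    src = cms.InputTag("selectedPatElectrons"),
    minNumber = cms.uint32(1),
    maxNumber = cms.uint32(999999)
)
## once placed on a path, events without any selected electron fail the path
## and would show up as Failed in the trigger report
process.p = cms.Path(process.countPatElectrons)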
Finally, a timing report for each module is printed at the end of the job:
TimeReport ---------- Event Summary ---[sec]----
TimeReport CPU/event = 0.000000 Real/event = 0.000000
TimeReport ---------- Path Summary ---[sec]----
TimeReport per event per path-run
TimeReport CPU Real CPU Real Name
TimeReport CPU Real CPU Real Name
TimeReport per event per path-run
TimeReport -------End-Path Summary ---[sec]----
TimeReport per event per endpath-run
TimeReport CPU Real CPU Real Name
TimeReport 0.296235 0.304723 0.296235 0.304723 outpath
TimeReport CPU Real CPU Real Name
TimeReport per event per endpath-run
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport ------ Modules in End-Path: outpath ---[sec]----
TimeReport per event per module-visit
TimeReport CPU Real CPU Real Name
TimeReport 0.296235 0.000000 0.296235 0.000000 out
TimeReport CPU Real CPU Real Name
TimeReport per event per module-visit
TimeReport ---------- Module Summary ---[sec]----
TimeReport per event per module-run per module-visit
TimeReport CPU Real CPU Real CPU Real Name
TimeReport 0.296235 0.304719 0.296235 0.304719 0.296235 0.304719 out
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ak4PFJetsPtrs
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 caloJetMETcorr
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 caloType1CorrectedMet
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 caloType1p2CorrectedMet
TimeReport 0.000790 0.000765 0.000790 0.000765 0.000790 0.000765 elPFIsoDepositCharged
TimeReport 0.000380 0.000325 0.000380 0.000325 0.000380 0.000325 elPFIsoDepositChargedAll
TimeReport 0.000470 0.000473 0.000470 0.000473 0.000470 0.000473 elPFIsoDepositGamma
TimeReport 0.000050 0.000095 0.000050 0.000095 0.000050 0.000095 elPFIsoDepositNeutral
TimeReport 0.000050 0.000051 0.000050 0.000051 0.000050 0.000051 elPFIsoDepositPU
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 elPFIsoValueCharged03NoPFId
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 elPFIsoValueCharged03PFId
TimeReport 0.000000 0.000022 0.000000 0.000022 0.000000 0.000022 elPFIsoValueCharged04NoPFId
TimeReport 0.000260 0.000266 0.000260 0.000266 0.000260 0.000266 elPFIsoValueCharged04PFId
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 elPFIsoValueChargedAll03NoPFId
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 elPFIsoValueChargedAll03PFId
TimeReport 0.000040 0.000026 0.000040 0.000026 0.000040 0.000026 elPFIsoValueChargedAll04NoPFId
TimeReport 0.000010 0.000027 0.000010 0.000027 0.000010 0.000027 elPFIsoValueChargedAll04PFId
TimeReport 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 elPFIsoValueGamma03NoPFId
[...]
Question 2 c): Can you tell from the timing report of your output what the most time-consuming part during the PAT tuple production is?
Inspecting the output of pat::Candidate production
You might want to have a quick look at the event content of the original AOD/RECO input file before inspecting the event content of the newly created PAT tuple. You can do this by invoking the edm tool edmDumpEventContent or by using the TBrowser in an interactive ROOT session.
Note that the AOD/RECO file is located on castor. A way to access it is given below.
Note: Due to the large file size it might take a short while to open the file.
Note: If you would like to view the event content of the copied AOD, and later of the PAT tuple, with the ROOT browser, make sure that you have loaded the FWLite libraries in advance; they allow us to easily plot physical quantities in ROOT. Have a look at SWGuideFWLite to learn more about the meaning of FWLite.
root -l root://eoscms//eos/cms/store/relval/CMSSW_7_1_0_pre4_AK4/RelValProdTTbar_13/AODSIM/POSTLS171_V1-v2/00000/0AD19371-15B6-E311-9DB6-0025905A6110.root
gSystem->Load("libFWCoreFWLite.so");
AutoLibraryLoader::enable();
TBrowser b
You might have added these lines to your .rootrc or rootLogon.C file to have these libraries loaded automatically when starting ROOT.
Then do the following:
- double-click on ROOT files on the left sidebar;
- double-click on the ROOT file name (root://eoscms//eos/cms/store/relval/CMSSW_5_3_6/RelValTTbar/GEN-SIM-RECO/PU_START53_V14-v1/0003/3E3EDF4A-E92C-E211-A1BF-003048D2BD66.root) in the left sidebar;
- double-click on the Events;1 directory in the left sidebar.
This will list the (many) collections that are present in this file:
Those collections ending with _HLT have been produced (or processed) by the High Level Trigger, which performs some fast reconstruction; those ending with _RECO have been produced by standard reconstruction algorithms.
You can now exit root and go back to your working directory to inspect the PAT tuple that you created before. Open root again in your working directory and proceed as above:
- double-click on ROOT files on the left sidebar;
- double-click on the ROOT file name in the main window (on the right side);
- double-click on the Events;1 directory in the main window.
And we see that PAT has collapsed the information into a much smaller number of collections of high-level analysis objects:
Those collections ending with _PAT are the pat::Candidate collections created by the process above. Other collections (if any) are copied over from the input file. Have a look at SWGuidePATEventSize to learn more about event size management and how to configure the event content with PAT; a small sketch follows below.
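As a rough illustration of what such event content configuration looks like (a hedged sketch; the kept collections are just examples):
process.out.outputCommands = cms.untracked.vstring(
    'drop *',                           ## start from an empty event
    'keep *_selectedPatElectrons_*_*',  ## then keep only what the analysis needs
    'keep *_selectedPatMuons_*_*'
)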
To do a raw plot of the transverse momentum distribution of the pat::Electrons do the following:
- double-click on the directory patElectrons_selectedPatElectrons__PAT (this opens the PAT electron collection);
- double-click on patElectrons_selectedPatElectrons__PAT.obj;
- scroll down to pt() and double-click on it.
If everything went smoothly, you should obtain the electron pt spectrum, with a cut at 10 GeV, as expected. You can exit root by typing .q at the prompt.
Exercises
Now that you have run your first PAT job, and before leaving this page, try to do the following exercises:
Exercise 2 a):
How can you increase the number of events from 100 to 1000? (You might also want to change the default dataset to your favourite dataset, but note that you will get slightly different results then.)
You will find the solution here:
Add the following line to your patTuple_standard_cfg.py file:
process.maxEvents.input = 1000
Exercise 2 b): Add a clone of the selectedPatJets collection (the default pat::Jet collection without object disambiguation) to the event content, with the collection label goodPatJets. You can achieve this by cloning the selectedPatJets module in your configuration file. Have a look at WorkBookConfigFileIntro if you don't know how to clone a module. Change the selection string for your goodPatJets collection to "pt>30 & abs(eta)<3". What is the mean jet multiplicity in your event sample for this new jet collection?
Note: Make sure that your new collection will be written to the event output. You can use the feature of cloning modules together with the string selection feature to apply object and event selections in your analysis in a very elegant and human-readable way.
You will find the solution here:
Add the following lines to your patTuple_standard_cfg.py file:
## jet selector
from PhysicsTools.PatAlgos.selectionLayer1.jetSelector_cfi import *
process.goodPatJets = selectedPatJets.clone(src = 'selectedPatJets', cut = 'pt > 30 & abs(eta) < 3')
## make sure to write it to the event content of your pat tuple; this is needed to run the process (unscheduled mode)
from PhysicsTools.PatAlgos.patEventContent_cff import *
process.out.outputCommands += ["keep *_goodPatJets*_*_*"]
Exercise 2 c): Add the ak4PFJets collection to your event content (have a look at patTuple_addJets_cfg.py to find out how to do that) and inspect the event content of the produced file with edmDumpEventContent (type edmDumpEventContent --help in your shell to find out how to use it). Can you see what has changed in the event content with respect to the standard output you saw before? (Copy and paste the additional line into the form.)
Note: You may observe more than one additional collection due to what is called embedding in PAT. You will learn about this in Exercise 3.
You will find the solution here (see also SWGuidePATTools):
Add the following lines to your patTuple_standard_cfg.py file:
## add a different jet collection to the event content
from PhysicsTools.PatAlgos.tools.jetTools import addJetCollection
labelAK4PF = 'AK4PF'
addJetCollection(
process,
labelName = labelAK4PF,
jetSource = cms.InputTag('ak4PFJets'),
jetCorrections = ('AK5PF', cms.vstring(['L1FastJet', 'L2Relative', 'L3Absolute']), 'Type-1'),
)
process.out.outputCommands.append( 'drop *_selectedPatJets%s_caloTowers_*'%( labelAK4PF ) )
Exercise 2 d): In the PAT section of the SWGuide you find a section that explains how to add customized, user-defined data to a pat::Candidate. Have a look at SWGuidePATUserData to learn more about this feature. Follow the instructions given there to add the relative isolation variable (relIso) to the pat::Muon in your pat::Tuple; a hedged sketch is given below.
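A sketch of what such a configuration could look like (the isolation definition below is purely illustrative; follow the recipe in SWGuidePATUserData for the recommended definition):
## attach a string-parser defined userFloat to the pat::Muon producer
process.patMuons.userData.userFunctions = cms.vstring(
    '(isolationR03().sumPt + isolationR03().emEt + isolationR03().hadEt)/pt'
)
process.patMuons.userData.userFunctionLabels = cms.vstring('relIso')
The new variable could then be retrieved from each muon with muon.userFloat('relIso').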
Note: In case of problems don't hesitate to contact the SWGuidePAT#Support. Having successfully finished Exercise 2 you might want to proceed to the other exercises of the WorkBookPATTutorial to learn more about PAT.
Review status
4.2.4.3 Exercise 04: How to create a PAT Tuple via crab
This exercise runs a CRAB job to create a pat::Tuple from the AOD data.
BEWARE: THIS PAGE IS WAY OBSOLETE.
BEWARE: THIS PAGE NEEDS TO BE UPDATED. DO NOT BLINDLY FOLLOW THE INSTRUCTIONS BELOW.
IN PARTICULAR THE FOLLOWING TOOLS OR PROCEDURES USED BELOW ARE DEPRECATED OR SIMPLY NOT WORKING:
- CRAB2: USE CRAB3 INSTEAD
- SL5: USE SL6 AND AN UP-TO-DATE SCRAM ARCH AND CMSSW RELEASE
- DBS CLI: USE DAS INSTEAD
- DBS2 URLS: USE DBS3
- CASTOR: USE EOS
Introduction
This exercise does not teach the details of the Grid, nor is it a CRAB tutorial. For a CRAB tutorial you must refer to WorkBookCRAB2Tutorial. A complete guide to CRAB is at SWGuideCrab. This exercise runs a CRAB job to create a PAT tuple from (skimmed) data and is part of the PAT tutorial exercises. The purpose of these exercises is to show PAT users how they can use Grid tools in creating their PAT tuple for CMS analysis.
Having your storage space set up may take several days, Grid jobs run with some latency, and there can be problems. You should set aside about a week to complete these exercises. The actual effort required is not the whole week but a few hours. For CRAB questions unrelated to this twiki, please use the links CRAB2 FAQ and CRAB3.
Pre-requisites for this exercise
To perform this set of exercises you need a Grid Certificate, CMS VO membership and SiteDB registration. It is also assumed that you have a CERN account. You must have a space to write your output files to, say castor. All lxplus users have an account on castor where they can write their CRAB job output.
Obtain a CERN account
- Use the following link for a CMS CERN account: CMS CERN account
- A CERN account is needed, for example, to log in to any e-learning web site or to obtain a file from the afs area. A CERN account will be needed for future exercises.
- Obtaining a CERN account can be time-consuming. To expedite the process, please ask the relevant institutional team leader to perform the necessary "signing" after the online form has been submitted and received for initial processing by the secretariat.
Obtain a Grid Certificate, CMS VO membership and SiteDB registration
- A Grid Certificate, CMS VO membership and SiteDB registration will be needed for the next set of exercises. The registration process can be time-consuming (actions by several people are required), so it is important to start it as soon as possible. There are three main requirements, which can be simply summarized: a certificate ensures that you are who you claim to be; a registration in the VO recognizes you (identified by your certificate) as a member of CMS; SiteDB is a database and web interface that CMS uses to track sites, and it is also used by the CRAB publication step to find out the hypernews username of a person from their Grid Certificate's DN (Distinguished Name). Please look at Get Your Grid Certificate, CMSVO and SiteDB registration to complete these three steps. All three steps need to be completed before you can successfully submit jobs on the Grid.
NOTE:
Legend of colors for this tutorial:
GRAY background for the commands to execute (cut&paste)
GREEN background for the output sample of the executed commands
BLUE background for the configuration files (cut&paste)
Step 1 - Setup a release, setup CRAB environment and verify your grid certificate is OK
To set up a CMSSW release for our purpose, we will execute the following instructions:
Login to lxplus.cern.ch and go to your working area (preferably the scratch0 area, for example /afs/cern.ch/user/m/malik/scratch0/). Then execute the following commands. We will refer to /afs/cern.ch/user/m/malik/scratch0/ as YOURWORKINGAREA.
Create a directory for this exercise (to avoid interference with code from the other exercises).
mkdir exercise04
cd exercise04
Create a local release area and enter it.
setenv SCRAM_ARCH slc5_amd64_gcc462
cmsrel CMSSW_7_4_1_patch4
cd CMSSW_7_4_1_patch4/src
Set up the environment.
source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.csh
cmsenv
More on setting up the local environment and preparing user analysis code is given here.
To set up CRAB execute the following command. The explanation of this command is given here.
source /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.csh
Verify your certificate: it is assumed that you have successfully installed and followed the above instructions for the setup. Now verify that your GRID certificate has all the information needed.
Initialize your proxy
voms-proxy-init -voms cms
You should see the following output:
Enter GRID pass phrase:
Your identity: /DC=org/DC=doegrids/OU=People/CN=Sudhir Malik 503653
Creating temporary proxy .......................... Done
Contacting lcg-voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "cms" Done
Creating proxy ................... Done
Your proxy is valid until Fri Jul 1 22:41:34 2011
Now run the following command
voms-proxy-info -all | grep -Ei "role|subject"
The output should look like this:
subject : /DC=org/DC=doegrids/OU=People/CN=Sudhir Malik 503653/CN=proxy
subject : /DC=org/DC=doegrids/OU=People/CN=Sudhir Malik 503653
attribute : /cms/Role=NULL/Capability=NULL
attribute : /cms/uscms/Role=NULL/Capability=NULL
Step 2 - Know your data set name in the DBS
The data set name is needed to run the CRAB job. In this exercise, we will use the data set /Jet/Run2011B-PromptReco-v1/AOD.
I picked up this dataset from the twiki PhysicsPrimaryDatasets. Further, this dataset is also used in the script PhysicsTools/PatExamples/test/patTuple_data_cfg.py. This script is a standard script provided by the PAT group to work with collision data, hence I picked it.
Before moving further, it would be interesting to see how to find this dataset in DAS. To do this open DAS in a browser. In the ADVANCED KEYWORD SEARCH type the following:
dataset dataset=/Jet/Run2011B*
A page pops up (after a few seconds). There are several datasets listed on this page. The one we use is listed as
/Jet/Run2011B-PromptReco-v1/AOD
You can also make use of the DBS Command Line Interface (CLI):
- Having set your CMSSW environment, invoke the dbs command:
cd exercise04/CMSSW_7_4_1_patch4/src
cmsenv
dbs search --query "find dataset where dataset=/Jet/Run2011B*"
The result below confirms what you already obtained with the web interface:
Using DBS instance at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet
-------------------------------------------------------
dataset
/Jet/Run2011B-v1/RAW
/Jet/Run2011B-raw-19Oct2011-HLTTest-v1/USER
/Jet/Run2011B-PromptReco-v1/RECO
/Jet/Run2011B-PromptReco-v1/DQM
/Jet/Run2011B-PromptReco-v1/AOD
/Jet/Run2011B-LogError-PromptSkim-v1/RAW-RECO
/Jet/Run2011B-HighMET-PromptSkim-v1/RAW-RECO
/Jet/Run2011B-19Oct2011-HLTTest-v1/USER
/Jet/Run2011B-19Oct2011-HLTTest-v1/DQM
/Jet/Run2011B-15Nov2011-HiggsCert-v1/DQM
Step 3 - Run your PATtuple config file locally to make sure it works
First copy the python script patTuple_data_cfg.py to your local area. To do this, execute the following command:
cvs co -r CMSSW_7_4_1_patch4 PhysicsTools/PatExamples/test/patTuple_data_cfg.py
Before executing the python script patTuple_data_cfg.py, we need to edit it and make three modifications. If you are daring, try to run the config file without and with the modifications one by one and see the error messages you might get. These modifications are:
1. Change the line for the global tag
process.GlobalTag.globaltag = 'GR_R_42_V12::All'
to
process.GlobalTag.globaltag = 'GR_R_44_V11::All'
because we need to use the appropriate global tag. More on global tags can be found in SWGuideFrontierConditions.
2. Change
inputJetCorrLabel = ('AK5PF', ['L1Offset', 'L2Relative', 'L3Absolute', 'L2L3Residual'])
to
inputJetCorrLabel = ('AK5PF', []) # no jet energy corrections yet
because I chose not to use any jet corrections.
3. Replace all the input data files with the following:
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/786/CA31E47B-1AA2-E011-B432-003048F1C58C.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/786/48B98498-02A2-E011-96B6-BCAEC5329732.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/785/6E3DC514-C4A1-E011-BBC5-0030487CD6F2.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/784/B01C8DF5-03A2-E011-A186-BCAEC518FF68.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/754/8027445A-D6A1-E011-91DE-0030487CD77E.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/746/C2ADFBCB-A3A1-E011-97DA-003048D37560.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/746/A49A3BEE-E0A1-E011-AB60-BCAEC532970F.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/746/90E492BD-DEA1-E011-8D48-003048F110BE.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/746/2E7786CA-A3A1-E011-9241-BCAEC532970D.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/746/1A134FCA-A3A1-E011-BAC3-BCAEC5329702.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/740/F6DB1AFB-7AA1-E011-95F6-003048F11DE2.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/740/E46112AF-74A1-E011-ABD5-BCAEC5329721.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/715/1852D1D3-26A1-E011-899C-001D09F28EC1.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/696/548BA9E0-26A1-E011-AD09-003048F118C4.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/676/AA128CD7-3DA1-E011-938E-001D09F34488.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/676/A4562376-47A1-E011-AF79-0030487CBD0A.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/676/809361D8-3DA1-E011-ACCF-001D09F2423B.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/676/383D553B-3FA1-E011-BA14-003048D2BC42.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/676/04437321-50A1-E011-B51A-001D09F29169.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/F82CB1ED-EAA0-E011-9CA9-001D09F28EC1.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/ECD4DADB-DCA0-E011-980E-0019B9F72D71.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/EC4F6380-16A1-E011-8A26-003048D2BDD8.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/CE809180-16A1-E011-A5DF-0030486780A8.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/90F2050B-DAA0-E011-A767-001D09F2441B.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/74536EF9-DEA0-E011-892A-001D09F2905B.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/601C1E07-F9A0-E011-AA76-001D09F295FB.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/5C09D510-3BA1-E011-A20F-001D09F254CE.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/4A48B4DB-DCA0-E011-8026-001D09F244BB.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/24B51FC9-FBA0-E011-AAE0-003048F1182E.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/22D01B9C-18A1-E011-9F0B-0030486780A8.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/1AA90303-F9A0-E011-98E5-003048F11C5C.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/675/0C6593C3-E1A0-E011-9D48-001D09F28EC1.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/674/EADA254E-A7A1-E011-AE17-BCAEC518FF44.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/674/940EB3F6-CBA0-E011-B453-0019B9F72BFF.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/674/80BBE492-ABA0-E011-942D-001D09F34488.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/674/602510F6-A5A0-E011-82E5-001D09F29114.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/167/673/A02BB63F-E3A0-E011-9A58-0030487C7E18.root'
because these correspond to the dataset we chose.
Now, from your directory /YOURWORKINGAREA/exercise04/CMSSW_7_4_1_patch4/src, execute the following command:
cmsRun PhysicsTools/PatExamples/test/patTuple_data_cfg.py
If successful, this will create an output file jet2011A_aod.root of size about 79 MB with 1000 events.
Now you prepare the CRAB configuration file.
Step 4 - Prepare your crab.cfg file for CRAB job submission
Your crab config file to submit the CRAB job should look like this: crab.cfg.
In this crab.cfg file, the name of the output file is jet2011A_aod.root. Make sure that this name is the same as defined in the line
process.out.fileName = cms.untracked.string('jet2011A_aod.root')
in the file patTuple_data_cfg.py.
To look at what each line in crab.cfg means, refer to the CRAB Tutorial.
This crab.cfg file contains the storage_path /srm/managerv2?SFN=/castor/cern.ch/user/m/malik/pattutorial/ that we create in the next step. In this exercise we run over the data set /Jet/Run2011A-PromptReco-v4/AOD. This crab.cfg also uses a file called Cert_160404-167151_7TeV_PromptReco_Collisions11_JSON.txt. This file describes which luminosity sections in which runs are considered good and should be processed. In CMS, this kind of file is in the JSON format (JSON stands for JavaScript Object Notation); a tiny illustration follows.
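For illustration, such a file simply maps run numbers to lists of good luminosity-section ranges (the run numbers below are made up):
{"163870": [[1, 85]], "163871": [[1, 112], [114, 420]]}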
To find the most current good luminosity section files in JSON format, please visit https://cms-service-dqm.web.cern.ch/cms-service-dqm/CAF/certification/Collisions11/.
To learn how to work with files for Good Luminosity Sections in JSON format, please look at the twiki SWGuideGoodLumiSectionsJSONFile. Instructions on how to set up your CRAB jobs over selected lumi sections are given in Running over selected luminosity blocks; a sketch of the relevant crab.cfg lines is shown after this paragraph.
You can find the official (centrally produced) JSON format files here or on afs in the CAF area /afs/cern.ch/cms/CAF/CMSCOMM/COMM_DQM/certification. At the same locations as the JSON format file you can also find the txt file reporting the history of the quality and DCS selection and, at the bottom, the snippet to configure any CMSSW application for running on the selected good Lumi Sections.
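A hedged sketch of the crab.cfg lines relevant for lumi-based job splitting (the values are illustrative; refer to the CRAB documentation linked above for the authoritative syntax):
[CMSSW]
datasetpath           = /Jet/Run2011A-PromptReco-v4/AOD
pset                  = patTuple_data_cfg.py
lumi_mask             = Cert_160404-167151_7TeV_PromptReco_Collisions11_JSON.txt
total_number_of_lumis = -1
lumis_per_job         = 75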
Step 5 - Get the JSON format file, create a writable directory on castor, and set up the CRAB environment
Get the JSON file into your WORKINGDIRECTORY/exercise04/CMSSW_7_4_1_patch4/src area:
wget http://cms-service-dqm.web.cern.ch/cms-service-dqm/CAF/certification/Collisions11/7TeV/Prompt/Cert_160404-167151_7TeV_PromptReco_Collisions11_JSON.txt
Notice how the character "s" was removed from "https"; the wget command somehow does not like "https" here.
Create a directory on castor called pattutorial:
rfmkdir /castor/cern.ch/user/m/malik/pattutorial
Create a directory on castor called jet2011A_AOD:
rfmkdir /castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD
This directory has to be group writable, so change the permissions accordingly:
rfchmod 775 /castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD
To see your created directory do:
nsls /castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD
Set up the CRAB environment:
source /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.csh
Step 6 - Create CRAB job
To create your CRAB job, execute the following command:
crab -create
You should see the following output on the screen:
crab: Version 2.7.8 running on Sat Jul 2 18:18:27 2011 CET (16:18:27 UTC)
crab. Working options:
scheduler glite
job type CMSSW
server ON (use_server)
working directory /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110702_181826/
Enter GRID pass phrase:
Your identity: /DC=org/DC=doegrids/OU=People/CN=Sudhir Malik 503653
Creating temporary proxy ............................................................................ Done
Contacting voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch] "cms" Done
Creating proxy ............................................................................................. Done
Your proxy is valid until Sun Jul 10 18:18:35 2011
crab: Contacting Data Discovery Services ...
crab: Accessing DBS at: http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet
crab: Requested (A)DS /Jet/Run2011A-PromptReco-v4/AOD has 194 block(s).
crab: 495 jobs created to run on 36743 lumis
************** MC dependence removal ************
removing MC dependencies for photons
removing MC dependencies for electrons
removing MC dependencies for muons
removing MC dependencies for taus
removing MC dependencies for jets
WARNING: called applyPostfix for module/sequence patJetPartonMatch which is not in patDefaultSequence!
WARNING: called applyPostfix for module/sequence patJetGenJetMatch which is not in patDefaultSequence!
WARNING: called applyPostfix for module/sequence patJetGenJetMatch which is not in patDefaultSequence!
WARNING: called applyPostfix for module/sequence patJetPartonAssociation which is not in patDefaultSequence!
=================================================
Type1MET corrections are switched off for other
jet types but CaloJets. Users are recommened to
use pfMET together with PFJets & tcMET together
with JPT jets.
=================================================
WARNING: called applyPostfix for module/sequence patDefaultSequence which is not in patDefaultSequence!
WARNING: called applyPostfix for module/sequence patDefaultSequence which is not in patDefaultSequence!
WARNING: called applyPostfix for module/sequence patDefaultSequence which is not in patDefaultSequence!
removed from lepton counter: taus
---------------------------------------------------------------------
INFO : some objects have been removed from the sequence. Switching
off PAT cross collection cleaning, as it might be of limited
sense now. If you still want to keep object collection cross
cleaning within PAT you need to run and configure it by hand
WARNING: called applyPostfix for module/sequence countPatCandidates which is not in patDefaultSequence!
---------------------------------------------------------------------
INFO : cleaning has been removed. Switch output from clean PAT
candidates to selected PAT candidates.
switchOnTrigger():
PATTriggerProducer module patTrigger exists already in sequence patDefaultSequence
==> entry re-used
---------------------------------------------------------------------
switchOnTrigger():
PATTriggerEventProducer module patTriggerEvent exists already in sequence patDefaultSequence
==> entry re-used
---------------------------------------------------------------------
crab: Checking remote location
crab: WARNING: The stageout directory already exists. Be careful not to accidentally mix outputs from different tasks
crab: Creating 495 jobs, please wait...
crab: Total of 495 jobs created.
Log file is /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110702_181826/log/crab.log
Step 7 - Submit CRAB job
To submit the CRAB job you should execute the following command:
crab -submit
You should see the following output on the screen:
crab: Version 2.7.8 running on Sat Jul 2 18:22:20 2011 CET (16:22:20 UTC)
crab. Working options:
scheduler glite
job type CMSSW
server ON (default)
working directory /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110702_181826/
crab: Registering credential to the server : vocms58.cern.ch
crab: Credential successfully delegated to the server.
crab: Starting sending the project to the storage vocms58.cern.ch...
crab: Task crab_0_110702_181826 successfully submitted to server vocms58.cern.ch
crab: Total of 495 jobs submitted
Log file is /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110702_181826/log/crab.log
Step 8 - Status of CRAB job
To see the status of your CRAB job, execute the following command:
crab -status
You should see the following output on the screen:
crab: Version 2.7.8 running on Sat Jul 2 18:23:19 2011 CET (16:23:19 UTC)
crab. Working options:
scheduler glite
job type CMSSW
server ON (default)
working directory /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110702_181826/
crab:
ID END STATUS ACTION ExeExitCode JobExitCode E_HOST
----- --- ----------------- ------------ ---------- ----------- ---------
1 N Submitting SubRequested
2 N Submitting SubRequested
3 N Submitting SubRequested
4 N Submitting SubRequested
5 N Submitting SubRequested
6 N Submitting SubRequested
7 N Submitting SubRequested
8 N Submitting SubRequested
9 N Submitting SubRequested
10 N Submitting SubRequested
--------------------------------------------------------------------------------
11 N Submitting SubRequested
12 N Submitting SubRequested
13 N Submitting SubRequested
14 N Submitting SubRequested
15 N Submitting SubRequested
16 N Submitting SubRequested
17 N Submitting SubRequested
18 N Submitting SubRequested
19 N Submitting SubRequested
20 N Submitting SubRequested
--------------------------------------------------------------------------------
21 N Submitting SubRequested
22 N Submitting SubRequested
23 N Submitting SubRequested
24 N Submitting SubRequested
25 N Submitting SubRequested
26 N Submitting SubRequested
27 N Submitting SubRequested
28 N Submitting SubRequested
29 N Submitting SubRequested
30 N Submitting SubRequested
--------------------------------------------------------------------------------
.................................................................
.................................................................
--------------------------------------------------------------------------------
471 N Submitting SubRequested
472 N Submitting SubRequested
473 N Submitting SubRequested
474 N Submitting SubRequested
475 N Submitting SubRequested
476 N Submitting SubRequested
477 N Submitting SubRequested
478 N Submitting SubRequested
479 N Submitting SubRequested
480 N Submitting SubRequested
--------------------------------------------------------------------------------
481 N Submitting SubRequested
482 N Submitting SubRequested
483 N Submitting SubRequested
484 N Submitting SubRequested
485 N Submitting SubRequested
486 N Submitting SubRequested
487 N Submitting SubRequested
488 N Submitting SubRequested
489 N Submitting SubRequested
490 N Submitting SubRequested
--------------------------------------------------------------------------------
491 N Submitting SubRequested
492 N Submitting SubRequested
493 N Submitting SubRequested
494 N Submitting SubRequested
495 N Submitting SubRequested
crab: 495 Total Jobs
>>>>>>>>> 495 Jobs Submitting
crab: You can also follow the status of this task on :
CMS Dashboard: http://dashb-cms-job-task.cern.ch/taskmon.html#task=malik_crab_0_110702_181826_2c05hd
Server page: http://vocms58.cern.ch:8888/logginfo
Your task name is: malik_crab_0_110702_181826_2c05hd
Log file is /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110702_181826/log/crab.log
You can also see the status of your CRAB job on the dashboard URL listed at the end of the screen output of the above step, i.e. http://dashb-cms-job-task.cern.ch/taskmon.html (here your task). The contents of this URL will depend on the status of your jobs, i.e. running, pending, successful, failed etc. An example screen shot of this URL is here:
Another view of the URL, obtained by clicking on "TaskMonitorId", is:
NOTE: There could be a mismatch between the status of your jobs as shown by the command crab -status and the dashboard URL.
If you see any jobs with status successful, you should see the output root files on /castor in the directory /castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD. To look at this directory, do:
rfdir /castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD
If there are any output files in this directory, it should look like this:
-rw-r--r-- 1 cms003 zh 1245918093 Jun 30 17:33 jet2011A_aod_100_0_I5h.root
-rw-r--r-- 1 cms003 zh 1274219226 Jun 30 17:39 jet2011A_aod_101_0_CRr.root
-rw-r--r-- 1 cms003 zh 972050289 Jun 30 17:20 jet2011A_aod_102_0_3DV.root
-rw-r--r-- 1 cms003 zh 1089918123 Jun 30 17:56 jet2011A_aod_103_0_tFV.root
-rw-r--r-- 1 cms003 zh 903472800 Jun 30 17:49 jet2011A_aod_105_0_wDk.root
-rw-r--r-- 1 cms003 zh 1193558045 Jun 30 17:34 jet2011A_aod_106_0_KpW.root
-rw-r--r-- 1 cms003 zh 647739659 Jun 30 17:03 jet2011A_aod_107_0_MGr.root
-rw-r--r-- 1 cms003 zh 1025885141 Jun 30 17:47 jet2011A_aod_108_0_9Rj.root
-rw-r--r-- 1 cms003 zh 1284572326 Jun 30 17:57 jet2011A_aod_109_0_iyZ.root
-rw-r--r-- 1 cms003 zh 1311061481 Jun 30 11:53 jet2011A_aod_10_1_Sj5.root
-rw-r--r-- 1 cms003 zh 1280795427 Jul 01 07:32 jet2011A_aod_110_2_EhQ.root
-rw-r--r-- 1 cms003 zh 1294175267 Jun 30 18:15 jet2011A_aod_113_0_esC.root
-rw-r--r-- 1 cms003 zh 1277053249 Jun 30 18:19 jet2011A_aod_114_0_pq3.root
-rw-r--r-- 1 cms003 zh 1285075447 Jun 30 18:17 jet2011A_aod_115_0_fOy.root
-rw-r--r-- 1 cms003 zh 1259237238 Jun 30 18:25 jet2011A_aod_116_0_RXi.root
-rw-r--r-- 1 cms003 zh 1245886124 Jun 30 18:22 jet2011A_aod_117_0_DIT.root
-rw-r--r-- 1 cms003 zh 1215675870 Jun 30 13:55 jet2011A_aod_11_1_kT4.root
-rw-r--r-- 1 cms003 zh 1267111659 Jun 30 18:17 jet2011A_aod_120_0_BNj.root
-rw-r--r-- 1 cms003 zh 1249178192 Jun 30 18:20 jet2011A_aod_121_0_Pw9.root
-rw-r--r-- 1 cms003 zh 1067304425 Jun 30 18:14 jet2011A_aod_122_0_GDS.root
-rw-r--r-- 1 cms003 zh 1249161461 Jun 30 18:18 jet2011A_aod_123_0_4Ku.root
-rw-r--r-- 1 cms003 zh 1011336451 Jun 30 17:25 jet2011A_aod_124_0_FaR.root
-rw-r--r-- 1 cms003 zh 1100894887 Jun 30 17:30 jet2011A_aod_125_0_w6E.root
-rw-r--r-- 1 cms003 zh 1303136732 Jun 30 17:32 jet2011A_aod_127_0_PCd.root
-rw-r--r-- 1 cms003 zh 1292889530 Jun 30 17:37 jet2011A_aod_128_0_UGc.root
-rw-r--r-- 1 cms003 zh 843154097 Jun 30 13:18 jet2011A_aod_12_1_jS0.root
-rw-r--r-- 1 cms003 zh 1288254355 Jun 30 17:40 jet2011A_aod_130_0_SI7.root
-rw-r--r-- 1 cms003 zh 1288040088 Jun 30 17:31 jet2011A_aod_131_0_BNS.root
-rw-r--r-- 1 cms003 zh 1286172683 Jun 30 17:34 jet2011A_aod_132_0_gby.root
-rw-r--r-- 1 cms003 zh 1261984151 Jun 30 17:31 jet2011A_aod_133_0_xeu.root
-rw-r--r-- 1 cms003 zh 1929437920 Jun 30 17:59 jet2011A_aod_134_0_mzS.root
-rw-r--r-- 1 cms003 zh 1291247174 Jun 30 17:31 jet2011A_aod_135_0_5lU.root
-rw-r--r-- 1 cms003 zh 1277940439 Jun 30 17:34 jet2011A_aod_136_0_B6g.root
-rw-r--r-- 1 cms003 zh 1257422925 Jun 30 17:30 jet2011A_aod_137_0_0cf.root
-rw-r--r-- 1 cms003 zh 1278923610 Jun 30 17:42 jet2011A_aod_138_0_9Vn.root
-rw-r--r-- 1 cms003 zh 1282749623 Jun 30 17:31 jet2011A_aod_139_0_yFP.root
-rw-r--r-- 1 cms003 zh 1320252214 Jun 30 15:36 jet2011A_aod_13_1_MYg.root
-rw-r--r-- 1 cms003 zh 1281544555 Jun 30 17:43 jet2011A_aod_140_0_EIG.root
-rw-r--r-- 1 cms003 zh 1255142335 Jun 30 17:40 jet2011A_aod_142_0_j7Q.root
-rw-r--r-- 1 cms003 zh 1260618421 Jun 30 17:36 jet2011A_aod_143_0_1wD.root
-rw-r--r-- 1 cms003 zh 1169947714 Jun 30 18:14 jet2011A_aod_145_0_nF8.root
Step 9 - Job output retrieval: check for output files on /castor
For the jobs which are in the "Done" state it is possible to retrieve the log files of the jobs (just the log files), because the output files are copied to the Storage Element associated with the T2 specified in the crab.cfg, which is /castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD in our case; in fact return_data is 0, and we are not publishing the data to the DBS. The following command retrieves the log files of all "Done" jobs of the last created CRAB project. The job results will be copied into the res subdirectory of your crab project (for example crab_0_110702_181826):
crab -getoutput
When you execute this command you should see output that looks like:
crab: Version 2.7.8 running on Fri Jul 1 10:06:14 2011 CET (08:06:14 UTC)
crab. Working options:
scheduler glite
job type CMSSW
server ON (default)
working directory /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/
crab: Only 494 jobs will be retrieved from 495 requested.
(for details: crab -status)
crab: Starting retrieving output from server vocms58.cern.ch...
crab: Results of Jobs # 1 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 2 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 3 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 4 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 5 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 6 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 7 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 8 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 9 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 10 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 11 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 12 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 13 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 14 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 15 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 16 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 17 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 18 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 19 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 20 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 21 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 22 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 23 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 24 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 25 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
crab: Results of Jobs # 26 are in /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/
................................
................................
................................
It may happen that while doing crab -getoutput your disk quota gets exceeded. In that case you will not be able to look at the log files of all the jobs.
If you succeed in getting crab -getoutput for however many jobs got done, your crab_0_110702_181826/res directory should have all the log files and look like the output below.
Note: Let us say jobs 1 to 50, 60 to 70 and 95 are done and you do not want to wait till all the jobs are done; you can get their output by doing
crab -getoutput 1-50,60-70,95
...................
...................
...................
-rw------- 1 malik zh 48K Jun 30 11:30 crab_fjr_2.xml
-rw------- 1 malik zh 4.5M Jun 30 11:30 CMSSW_2.stdout
-rw------- 1 malik zh 49K Jun 30 11:32 crab_fjr_4.xml
-rw------- 1 malik zh 4.4M Jun 30 11:32 CMSSW_4.stdout
-rw------- 1 malik zh 48K Jun 30 11:32 crab_fjr_14.xml
-rw------- 1 malik zh 4.8M Jun 30 11:32 CMSSW_14.stdout
-rw------- 1 malik zh 50K Jun 30 11:33 crab_fjr_24.xml
-rw------- 1 malik zh 5.0M Jun 30 11:33 CMSSW_24.stdout
-rw------- 1 malik zh 49K Jun 30 11:33 crab_fjr_16.xml
-rw------- 1 malik zh 5.0M Jun 30 11:33 CMSSW_16.stdout
-rw------- 1 malik zh 49K Jun 30 11:34 crab_fjr_9.xml
-rw------- 1 malik zh 4.6M Jun 30 11:34 CMSSW_9.stdout
-rw------- 1 malik zh 50K Jun 30 11:34 crab_fjr_23.xml
-rw------- 1 malik zh 5.0M Jun 30 11:34 CMSSW_23.stdout
-rw------- 1 malik zh 49K Jun 30 11:37 crab_fjr_21.xml
-rw------- 1 malik zh 5.0M Jun 30 11:37 CMSSW_21.stdout
-rw------- 1 malik zh 48K Jun 30 11:38 crab_fjr_1.xml
-rw------- 1 malik zh 4.4M Jun 30 11:38 CMSSW_1.stdout
-rw------- 1 malik zh 49K Jun 30 11:38 crab_fjr_6.xml
-rw------- 1 malik zh 4.6M Jun 30 11:38 CMSSW_6.stdout
-rw------- 1 malik zh 48K Jun 30 11:41 crab_fjr_7.xml
-rw------- 1 malik zh 4.8M Jun 30 11:42 CMSSW_7.stdout
-rw------- 1 malik zh 49K Jun 30 11:49 crab_fjr_20.xml
-rw------- 1 malik zh 5.0M Jun 30 11:49 CMSSW_20.stdout
-rw------- 1 malik zh 49K Jun 30 11:50 crab_fjr_5.xml
-rw------- 1 malik zh 4.6M Jun 30 11:50 CMSSW_5.stdout
-rw------- 1 malik zh 48K Jun 30 11:53 crab_fjr_10.xml
Step 10 - Check log files to trace problems, if any
You can look at the log files in the crab_0_110702_181826/res directory after executing the above step to see the details in case a job fails.
You can also print a short report about the task, namely the total number of events and files processed/requested/available, the name of the dataset path, a summary of the status of the jobs, and so on. A summary file of the runs and luminosity sections processed is written to res/. In principle, -report should generate all the info needed for an analysis. To get a report execute the following command. Note that in this case 1 job out of 495 did not execute.
crab -report
The output of this command should look something like this (this is an old cut and paste, but it conveys the message):
crab: Version 2.7.8 running on Fri Jul 1 09:59:55 2011 CET (07:59:55 UTC)
crab. Working options:
scheduler glite
job type CMSSW
server ON (default)
working directory /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/
crab: --------------------
Dataset: /Jet/Run2011A-PromptReco-v4/AOD
Remote output :
SE: srm-cms.cern.ch srm-cms.cern.ch srmPath: srm://srm-cms.cern.ch:8443/srm/managerv2?SFN=/castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD/
------------------
------------------
Total Jobs : 495
Luminosity section summary file: /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/res/lumiSummary.json
# Jobs: Cleared:494
----------------------------
Log file is /afs/cern.ch/user/m/malik/scratch0/PAT2011JULY/CMSSW_4_2_4/src/crab_0_110630_103547/log/crab.log
If you want to publish your output to DBS, you need to re-run your CRAB job with some options modified. Please see here for more details.
Step 11 - Open an output root file to make sure you see the plots of variables
Make sure to open a root file to see if it contains what you wanted. Once you have your output root files, you are ready to analyze them.
To open a root file from your castor storage area, do as follows (as an example; replace the path name and root file name with what you actually have in your area):
root -l /castor/cern.ch/user/m/malik/pattutorial/jet2011A_AOD/jet2011A_aod_89_0_Afy.root
Note: If you are doing this exercise in the context of the PAT Tutorial course, in case of problems don't hesitate to contact the SWGuidePAT#Support. Having successfully finished Exercise 4 you might want to proceed to Exercise 5 to learn how to access the pat::Candidate collections that you just produced within an EDAnalyzer or within an FWLite executable. For an overview you can go back to the WorkBookPATTutorial entry page.
-- SudhirMalik - 2-July-2010
StefanoBelforte - 2015-05-24: put BIG warning about most of the things here being obsolete
Main.Sudhir - 30 June 2011: update to CMSSW_4_2_4
Main.Sudhir - 19 May 2010: update to CMSSW_3_8_3, last screen shot on "crab -report" still old
Main.Sudhir - 13 May 2010: update to CMSSW_3_6_1, screen shots still for CMSSW_3_5_7
4.2.4.4 Exercise 04: How to analyse PAT Candidates
Contents
Objectives
In this example you will learn how to access the information from a PAT candidate with a simple piece of code that can easily be transformed into an EDAnalyzer or an FWLite executable. We also show you how you can very easily dump selected information of a pat::Tuple or a RECO or AOD file into a customized and EDM-compliant user n-tuple, without writing a single line of C++ code. You can find all examples presented on this TWiki page in the PatExamples package. They will provide you with the following information:
- How to create a plugin or a bare executable within a CMSSW package.
- How to book a histogram using the TFileService within full CMSSW or FWLite.
- How to access the information of a PAT candidate within a common EDAnalyzer.
For this example we picked the pat::Muon collection. You can transform everything you learn to any other pat::Candidate by replacing the reco::Muon data format and labels by the corresponding data format and labels of the collection of your interest. An example of how to do such a replacement is given at the end of the page. A few questions will help you to check your learning success. After going through these explanations you should be well equipped to solve the Exercises at the end of the page.
Note: This web course is part of the PAT Tutorial, which takes place regularly at CERN and other places. When following the PAT Tutorial, the answers to questions marked in RED should be filled into the exercise form that has been introduced at the beginning of the tutorial. The solutions to the Exercises should also be filled into the form. The exercises are marked in three colours, indicating whether the exercise is basic (obligatory), continuative (recommended), or optional (free). The colour-coding is summarized below:
- Basic exercise, which is obligatory for the PAT Tutorial.
- Continuative exercise, which is recommended for the PAT Tutorial, to deepen what has been learned.
- Optional exercise, which shows interesting applications of what has been learned.
Basic exercises are mandatory, and the solutions to the exercises should be filled into the exercise form during the PAT Tutorial.
Setting up the environment
First of all, connect to lxplus and go to some work directory. You can choose any directory, provided that you have enough space; you need about 5 MB of free disk space for this exercise. We recommend that you use your ~/scratch0 space. In case you don't have this (or don't even know what it is), check your quota by typing fs lq and following this link. If you don't have enough space, you may instead use the temporary space (/tmp/your_user_name), but be aware that this is lost once you log out of lxplus (or within something like a day). We assume in the following tutorial that you are using your ~/scratch0 directory.
ssh your_lxplus_name@lxplus6.cern.ch
[ ... enter password ... ]
cd scratch0/
Create a new directory for this exercise (in order to avoid interference with code from the other exercises) and enter it.
mkdir exercise04
cd exercise04
Create a local release area and enter it.
cmsrel CMSSW_7_4_1_patch4
cd CMSSW_7_4_1_patch4/src
The first command creates all directories needed in a local release area. Setting up the environment is done by invoking the following script:
cmsenv
How to get the code
The PatExamples package is part of all CMSSW releases. To be able to inspect and work on the examples described below, you should add the package to your src directory and compile it. To achieve this do the following:
git cms-addpkg PhysicsTools/PatAlgos
git cms-addpkg PhysicsTools/PatExamples
git cms-merge-topic -u CMS-PAT-Tutorial:CMSSW_7_4_1_patTutorial
scram b -j 4
Note: Compiling is accomplished with the scram b command. To make use of more than one core when compiling packages, we use the scram flag -j followed by the number of cores we would like to use.
This will check out the PatAlgos and PatExamples packages from the central release area for CMSSW_7_4_1_patch4.
Note: For the following examples you need to produce a PAT tuple with the name patTuple.root beforehand. It can be generated using patTuple_standard_cfg.py, either from the PatExamples or from the PatAlgos package. We will make use of the file in the PatAlgos package:
cmsRun PhysicsTools/PatAlgos/test/patTuple_standard_cfg.py
Have a look at the WorkBookPATTupleCreationExercise to learn more about the creation of PAT tuples.
How to access information from a pat::Candidate with an EDAnalyzer or FWLite
We will first show how to access the information of a pat::Candidate with a full framework EDAnalyzer or an FWLite executable. We show how this can be done very conveniently using the principle of the BasicAnalyzer as defined in the PhysicsTools/UtilAlgos package.
How to run the EDAnalyzer example
To run the EDAnalyzer example on the newly created PAT tuple do the following:
cmsRun PhysicsTools/PatExamples/test/analyzePatMuons_edm_cfg.py
You will receive a root file with the name analyzePatMuons.root which will contain a set of histograms in a directory patMuonAnalyzer. Open the root file that has been produced by typing:
root -l analyzePatMuons.root
Question 4 a): When you produced the patTuple.root file from the patTuple_standard_cfg.py as recommended above, you were using a standard pythia ttbar MC file as input. What is the mean muon pt of the muons that you find in this sample?
Note: The file and directory names are defined by a parameter of the TFileService and by the name of the module in the configuration file that is described below. The histograms are registered in the beginJob function of the PatMuonAnalyzer class that is described below. They contain the pt, eta and phi of all muons in the event.
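To answer questions like 4 a), a quick way to read the mean off a histogram is PyROOT (a minimal sketch, assuming the muon transverse momentum histogram is booked under the name pt, as described in the note above):
import ROOT
f = ROOT.TFile.Open("analyzePatMuons.root")
h = f.Get("patMuonAnalyzer/pt")   ## directory = module label; histogram name assumed
print h.GetMean()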
Note: You would do an analysis within the full framework when running complex, large scale analyses which make full use of the EDM event content (e.g. when reading objects from, but also writing objects into, the event) and when running with the help of crab or local batch systems.
Let us have a look at the configuration file:
import FWCore.ParameterSet.Config as cms

process = cms.Process("Test")

process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("file:patTuple_standard.root")
)
process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(100) )

## This is an example of the use of the BasicAnalyzer concept used to exploit C++ classes to do analysis
## in the full framework or FWLite using the same class. You can find the implementation of this module in
## PhysicsTools/PatExamples/plugins/PatMuonEDAnalyzer.cc. You can find the EDAnalyzerWrapper.h class in
## PhysicsTools/UtilAlgos/interface/EDAnalyzerWrapper.h. You can find the implementation of the
## PatMuonAnalyzer class in PhysicsTools/PatExamples/interface/PatMuonAnalyzer.h. You will also find
## there the input parameters to the module.
process.patMuonAnalyzer = cms.EDAnalyzer("PatMuonEDAnalyzer",
    muons = cms.InputTag("selectedPatMuons"),
)

process.TFileService = cms.Service("TFileService",
    fileName = cms.string('analyzePatMuons.root')
)

process.p = cms.Path(process.patMuonAnalyzer)
Note: You can find the definition of the input file (patTuple_standard.root), the definition of the full framework module of our BasicAnalyzer (patMuonAnalyzer), and the definition of the TFileService, which is configured to book the histograms of all registered EDAnalyzers in a histogram file with the name analyzePatMuons.root. This histogram file will contain a folder with the name of our EDAnalyzer module, in which you find the set of histograms that we have booked in the implementation of the PatMuonAnalyzer. In the module definition the edm::InputTag for the pat::Muon collection is given. Have a look into the constructor of the module implementation to see how this edm::InputTag is passed on to the analyzer code. Have a look at WorkBookPATDataFormats to find out what collection labels are available (after running the default workflow). Have a look at WorkBookPATWorkflow to find out how they are produced and what they mean.
Question 4 b): For more complex cfg files the definition of the module could be kept in an extra cfi file, which could be loaded into the main configuration file instead of having the module definition be part of the cfg file. How would you do that? (Have a look at WorkBookConfigFileIntro#ConfigContents for some hints; a sketch is given below.)
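A hedged sketch of how that could look (the cfi file name and location below are hypothetical):
## PhysicsTools/PatExamples/python/patMuonAnalyzer_cfi.py (hypothetical file)
import FWCore.ParameterSet.Config as cms

patMuonAnalyzer = cms.EDAnalyzer("PatMuonEDAnalyzer",
    muons = cms.InputTag("selectedPatMuons")
)
In the main cfg you would then replace the inline module definition with process.load("PhysicsTools.PatExamples.patMuonAnalyzer_cfi").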
How to run the FWLite example
To fill the same histograms using an FWLite executable, do the following:
PatMuonFWLiteAnalyzer PhysicsTools/PatExamples/test/analyzePatMuons_fwlite_cfg.py
You will get the same output as for the full framework example demonstrated above. Note the difference in speed. The FWLite executable can also be configured using the CMS python configuration language, in analogy to the full framework example. The configuration file in our example is given below:
import FWCore.ParameterSet.Config as cms

process = cms.PSet()

process.fwliteInput = cms.PSet(
    fileNames = cms.vstring('file:patTuple_standard.root'), ## mandatory
    maxEvents = cms.int32(100),                             ## optional
    outputEvery = cms.uint32(10),                           ## optional
)
process.fwliteOutput = cms.PSet(
    fileName = cms.string('analyzePatMuons.root'),          ## mandatory
)
process.patMuonAnalyzer = cms.PSet(
    ## input specific for this analyzer
    muons = cms.InputTag('selectedPatMuons')
)
Note: The key element of the configuration is the edm::ParameterSet process. This should contain another edm::ParameterSet, process.fwliteInput, to define the input file, the number of events to be processed, and whether (and how often) output should be printed to the screen. The latter two parameters are optional; if omitted, default values will be picked up. Another mandatory edm::ParameterSet in the configuration file is process.fwliteOutput, where the name of the output file is defined. The edm::ParameterSet with the name process.patMuonAnalyzer defines the parameters which are important for the actual implementation of our PatMuonAnalyzer class. Have a look at the instantiation of the FWLite executable below and the explanations there to learn how the name of the process.patMuonAnalyzer PSet is passed on to the actual implementation.
Note: In FWLite there is no other interaction with the EDM event content but reading. You can view FWLite executables as a safer (and more convenient) way to write compiled interactive root macros. You would do an analysis within FWLite when performing smaller studies on the basis of (compiled) root macros. As a rule of thumb, it makes sense to use an FWLite executable if its complexity is not significantly larger than that of a typical single EDAnalyzer.
The implementation of the PatMuonAnalyzer
In the example above we were using the same physical piece of code for the plugin that you ran with cmsRun and for the FWLite executable. This concept gives you a lot of flexibility in deciding whether you want to use a full framework module for your analysis, which you might want to run with a batch system or via the grid, or whether you want to do a quick study using FWLite. You can find the class declaration in PhysicsTools/PatExamples/interface/PatMuonAnalyzer.h:
#include <map>
#include <string>

#include "TH1.h"
#include "PhysicsTools/UtilAlgos/interface/BasicAnalyzer.h"
#include "DataFormats/PatCandidates/interface/Muon.h"

/**
   \class PatMuonAnalyzer PatMuonAnalyzer.h "PhysicsTools/PatExamples/interface/PatMuonAnalyzer.h"
   \brief Example class that can be used to analyze pat::Muons both within FWLite and within the full framework

   This is an example for keeping classes that can be used both within FWLite and within the full
   framework. The class is derived from the BasicAnalyzer base class, which is an interface for
   the two wrapper classes EDAnalyzerWrapper and FWLiteAnalyzerWrapper. You can find more
   information on this on WorkBookFWLiteExamples#ExampleFive.
*/

class PatMuonAnalyzer : public edm::BasicAnalyzer {
public:
  /// default constructor
  PatMuonAnalyzer(const edm::ParameterSet& cfg, TFileDirectory& fs);
  PatMuonAnalyzer(const edm::ParameterSet& cfg, TFileDirectory& fs, edm::ConsumesCollector&& iC);
  /// default destructor
  virtual ~PatMuonAnalyzer(){};
  /// everything that needs to be done before the event loop
  void beginJob(){};
  /// everything that needs to be done after the event loop
  void endJob(){};
  /// everything that needs to be done during the event loop
  void analyze(const edm::EventBase& event);

private:
  /// input tag for muons
  edm::InputTag muons_;
  edm::EDGetTokenT<std::vector<pat::Muon> > muonsToken_;
  /// histograms
  std::map<std::string, TH1*> hists_;
};
Note: If you feel unfamiliar with the code snippets above, have a look at WorkBookBasicCPlusPlus to learn more about the basics of C++ that you will regularly encounter when writing or reading the code of EDAnalyzers. The class that you see above is derived from the abstract edm::BasicAnalyzer class, which provides our class with the basic features to set up an EDAnalyzer or a FWLite executable within the CMS EDM (Event Data Model), such that you can concentrate on the physics part of your work.
As for a normal EDAnalyzer, you find a beginJob(), an endJob() and an analyze(...) function. These functions have to be provided with an implementation to make a concrete class out of the abstract edm::BasicAnalyzer class. In our example we also added an edm::InputTag, which we will use to pass the product label of the muon collection on to the analyzer code, and a std::map, which we will use for histogram management. If you want to learn more about the edm::BasicAnalyzer class, have a look at WorkBookFWLiteExamples#ExampleFive.
Question 4 c):
What does it mean that some parts of the declaration are private and others are public? What does private mean in C++?
You can find the implementation of the class in PatExamples/src/PatMuonAnalyzer.cc:
/// default constructor
PatMuonAnalyzer::PatMuonAnalyzer(const edm::ParameterSet& cfg, TFileDirectory& fs) :
  edm::BasicAnalyzer::BasicAnalyzer(cfg, fs),
  muons_(cfg.getParameter<edm::InputTag>("muons"))
{
  hists_["muonPt" ] = fs.make<TH1F>("muonPt" , "pt" , 100,  0., 300.);
  hists_["muonEta"] = fs.make<TH1F>("muonEta", "eta", 100, -3.,   3.);
  hists_["muonPhi"] = fs.make<TH1F>("muonPhi", "phi", 100, -5.,   5.);
}

PatMuonAnalyzer::PatMuonAnalyzer(const edm::ParameterSet& cfg, TFileDirectory& fs, edm::ConsumesCollector&& iC) :
  edm::BasicAnalyzer::BasicAnalyzer(cfg, fs),
  muons_(cfg.getParameter<edm::InputTag>("muons")),
  muonsToken_(iC.consumes<std::vector<pat::Muon> >(muons_))
{
  hists_["muonPt" ] = fs.make<TH1F>("muonPt" , "pt" , 100,  0., 300.);
  hists_["muonEta"] = fs.make<TH1F>("muonEta", "eta", 100, -3.,   3.);
  hists_["muonPhi"] = fs.make<TH1F>("muonPhi", "phi", 100, -5.,   5.);
}

/// everything that needs to be done during the event loop
void PatMuonAnalyzer::analyze(const edm::EventBase& event)
{
  // define what muon you are using; this is necessary as FWLite is not
  // capable of reading edm::Views
  using pat::Muon;

  // Handle to the muon collection
  edm::Handle<std::vector<Muon> > muons;
  event.getByLabel(muons_, muons);

  // loop muon collection and fill histograms
  for(std::vector<Muon>::const_iterator mu1 = muons->begin(); mu1 != muons->end(); ++mu1){
    hists_["muonPt" ]->Fill(mu1->pt() );
    hists_["muonEta"]->Fill(mu1->eta());
    hists_["muonPhi"]->Fill(mu1->phi());
  }
}
Note: In the constructor the muon collection is passed on as a parameter. This gives you the possibility to analyze different muon collections with the same piece of code. Then we book a small set of histograms that we want to fill with some basic quantities: the pt, eta and phi of each reconstructed muon in the analyzed data. We add these histograms to the std::map that we have declared in the class and register them with the TFileService. This is an EDM service that facilitates your histogram management; have a look at SWGuideTFileService to learn more about it. The combination of the TFileService with the std::map that manages access to the histograms within our class is a very convenient way to perform your histogram management: just add a line to the constructor if you want to add a histogram, and fill it in the analyze(...) function. Thus you only have to edit the implementation.
In the analyze(...) function the muon collection is read from the event using an edm::Handle. Within the EDM this is always done the same way. The following loop over the muons is also a typical piece of code that you will find again whenever you loop over an object collection within the EDM. We fill the booked histograms for all muons in the collection, for each event.
Question 4 d):
You can view the edm::InputTag as a kind of string that keeps the module label of a collection in the event content. But it keeps more information. How would you look for the data members and member functions of the edm::InputTag class to find out what extra information it keeps? Have a look at the WorkBookPATDocNavigationExercise documentation for some hints.
In PatExamples/plugins/PatMuonEDAnalyzer.cc you can find the piece of code that 'wraps' our class implementation into a full EDAnalyzer plugin that can be used with cmsRun:
#include "FWCore/Framework/interface/MakerMacros.h"

/*
  This is an example of using the PatMuonAnalyzer class to do a simple analysis of muons using
  the full framework and cmsRun. You can find the example to use this code in
  PhysicsTools/PatExamples/test/... .
*/
#include "PhysicsTools/UtilAlgos/interface/EDAnalyzerWrapper.h"
#include "PhysicsTools/PatExamples/interface/PatMuonAnalyzer.h"

typedef edm::AnalyzerWrapper<PatMuonAnalyzer> PatMuonEDAnalyzer;
DEFINE_FWK_MODULE(PatMuonEDAnalyzer);
Note: You can take this as a template for any other 'wrap-up' of a class that has been derived from the edm::BasicAnalyzer. You can use it for 90% of your analysis purposes. In case you need the edm::EventSetup, though, you will have to fall back on the implementation of a full EDAnalyzer. You can find a full description of how to do this on WorkBookWriteFrameworkModule.
In PatExamples/bin/PatMuonFWLiteAnalyzer you can find the piece of code that 'wraps' our class implementation into a FWLite executable:
int main(int argc, char* argv[])
{
  // load framework libraries
  gSystem->Load("libFWCoreFWLite");
  AutoLibraryLoader::enable();

  // only allow one argument for this simple example, which should be the
  // python cfg file
  if( argc < 2 ){
    std::cout << "Usage : " << argv[0] << " [parameters.py]" << std::endl;
    return 0;
  }
  if( !edm::readPSetsFrom(argv[1])->existsAs<edm::ParameterSet>("process") ){
    std::cout << " ERROR: ParameterSet 'process' is missing in your configuration file" << std::endl;
    exit(0);
  }

  PatMuonFWLiteAnalyzer ana(edm::readPSetsFrom(argv[1])->getParameter<edm::ParameterSet>("process"),
                            std::string("patMuonAnalyzer"), std::string("patMuonAnalyzer"));
  ana.beginJob();
  ana.analyze();
  ana.endJob();
  return 0;
}
To make sure that the executable is compiled, we had to add a corresponding line to the BuildFile.xml in the bin directory of the package:
<use name="root"/>
<use name="boost"/>
<use name="rootcintex"/>
<use name="FWCore/FWLite"/>
<use name="DataFormats/FWLite"/>
<use name="FWCore/PythonParameterSet"/>
<use name="DataFormats/PatCandidates"/>
<use name="CommonTools/Utils"/>
<use name="PhysicsTools/FWLite"/>
<use name="PhysicsTools/Utilities"/>
<use name="PhysicsTools/PatUtils"/>
<use name="PhysicsTools/PatExamples"/>
<use name="PhysicsTools/SelectorUtils"/>
<environment>
<bin file="PatMuonEDMAnalyzer.cc"></bin>
<bin file="PatMuonFWLiteAnalyzer.cc"></bin>
<bin file="PatBasicFWLiteAnalyzer.cc"></bin>
<bin file="PatBasicFWLiteJetAnalyzer.cc"></bin>
<bin file="PatBasicFWLiteJetAnalyzer_Selector.cc"></bin>
<bin file="PatBasicFWLiteJetUnitTest.cc"></bin>
<bin file="PatCleaningExercise.cc"></bin>
<bin file="PatAnalysisTasksExercise.cc"></bin>
</environment>
The line <bin file="PatMuonFWLiteAnalyzer.cc"></bin> is the line in question here.
Note: You see the implementation of a normal C++ main function. The function edm::readPSetsFrom(...) allows you to read edm::ParameterSets from configuration files, similar to the cfi files in the full framework. We instantiate the PatMuonFWLiteAnalyzer, passing on the edm::ParameterSet, the name of the edm::ParameterSet that holds the parameters to be passed on to the executable, and the name of the directory in which the histograms are to be saved. If the last argument is left empty, the histograms will be saved directly into the top-level directory of the histogram file. The configuration that we use will give a similar output to the one you got for the plugin version described above. If you want to learn more about the implementation of FWLite executables, have a look at WorkBookFWLiteExamples.
Question 4 e):
How would you add a statement to compile another executable with name MyPatExecutable.cc to this BuildFile.xml file?
How to access information from a pat::Candidate via an EDM-Tuple
You can also create an EDM-Tuple from pat::Candidates, which you can easily and intuitively access again via the member functions of FWLite or just via plain root. The AT (Analysis Tools) group provides you with the tools to create flat n-tuples that automatically keep the event provenance information (as required by the CMS Physics management for any CMS publication) without having to write a single line of C++ code! We also give guidelines in case you still want to write your own n-tupleizers. To learn more about these tools have a look at SWGuideEDMNtuples. We are making use of a generic n-tuple dumper as described here. The following configuration file shows how to create an EDM-tuple from the patTuple.root that we have created above.
import FWCore.ParameterSet.Config as cms

process = cms.Process("Test")

process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("file:patTuple_standard.root")
)
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(100))

process.MessageLogger = cms.Service("MessageLogger")

## ---
## This is an example of the use of the plain edm::Tuple dumper to analyze pat::Muons
## ---
process.patMuonAnalyzer = cms.EDProducer("CandViewNtpProducer",
    src = cms.InputTag("selectedPatMuons"),
    lazyParser = cms.untracked.bool(True),
    prefix = cms.untracked.string(""),
    eventInfo = cms.untracked.bool(True),
    variables = cms.VPSet(
        cms.PSet(tag = cms.untracked.string("pt"),  quantity = cms.untracked.string("pt") ),
        cms.PSet(tag = cms.untracked.string("eta"), quantity = cms.untracked.string("eta")),
        cms.PSet(tag = cms.untracked.string("phi"), quantity = cms.untracked.string("phi")),
    )
)

process.p = cms.Path(process.patMuonAnalyzer)

process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string('edmTuple.root'),
    # save only events passing the full path
    SelectEvents = cms.untracked.PSet(SelectEvents = cms.vstring('p')),
    # drop everything but the flat tuple products of the patMuonAnalyzer module
    outputCommands = cms.untracked.vstring('drop *', 'keep *_patMuonAnalyzer_*_*')
)

process.outpath = cms.EndPath(process.out)
Note: We name the instance of the CandViewNtpProducer patMuonAnalyzer, as in the examples before. We create three flat n-tuple variables from the selectedPatMuons collection: the pt, eta and phi of each muon. You can easily run a selection based on the StringCutParser beforehand, as sketched below; you can learn more about the StringCutParser on SWGuidePhysicsCutParser.
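For illustration, a minimal sketch of such a pre-selection is given below. It uses the string-cut based PATMuonSelector module from the selectionLayer1 configuration of PatAlgos; the module label highPtMuons and the cut values are illustrative assumptions:

## select high-pt central muons with the string cut parser before
## feeding them into the ntuple producer
process.highPtMuons = cms.EDFilter("PATMuonSelector",
    src = cms.InputTag("selectedPatMuons"),
    cut = cms.string("pt > 20. & abs(eta) < 2.1")
)

## point the ntuple producer to the selected collection and extend the path
process.patMuonAnalyzer.src = cms.InputTag("highPtMuons")
process.p = cms.Path(process.highPtMuons * process.patMuonAnalyzer)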
Finally, we configure an output module for which we drop all information but the three flat n-tuple products produced with the patMuonAnalyzer module. The output will be written into an EDM file with the name edmTuple.root. To run this example do the following:
cmsRun PhysicsTools/PatExamples/test/analyzePatMuons_tuple1_cfg.py
The content of the edmTuple.root file is shown below:
Type Module Label Process
-----------------------------------------------------------------
vector<float> "patMuonAnalyzer" "eta" "PAT"
vector<float> "patMuonAnalyzer" "phi" "PAT"
vector<float> "patMuonAnalyzer" "pt" "PAT"
unsigned int "patMuonAnalyzer" "EventNumber" "PAT"
unsigned int "patMuonAnalyzer" "LumiBlock" "PAT"
unsigned int "patMuonAnalyzer" "RunNumber" "PAT"
Note: As you can see, apart from pt, eta and phi, also the EventNumber, LumiBlock and RunNumber of each event are written into the tuple by default. This file has the typical structure of an EDM file. You can investigate it and find the TTrees in the Events folder. You can also access the flat tuples making use of the features of FWLite. An example for a compiled executable is given below:
int main(int argc, char* argv[])
{
  // ----------------------------------------------------------------------
  // First Part:
  //
  //  * enable the AutoLibraryLoader
  //  * book the histograms of interest
  //  * open the input file
  // ----------------------------------------------------------------------

  // load framework libraries
  gSystem->Load("libFWCoreFWLite");
  AutoLibraryLoader::enable();

  // initialize command line parser
  optutl::CommandLineParser parser("Analyze FWLite Histograms");

  // set defaults
  parser.integerValue("maxEvents"  ) = 1000;
  parser.integerValue("outputEvery") = 10;
  parser.stringValue ("outputFile" ) = "analyzeEdmTuple.root";

  // parse arguments
  parser.parseArguments(argc, argv);
  int maxEvents_ = parser.integerValue("maxEvents");
  unsigned int outputEvery_ = parser.integerValue("outputEvery");
  std::string outputFile_ = parser.stringValue("outputFile");
  std::vector<std::string> inputFiles_ = parser.stringVector("inputFiles");

  // book a set of histograms
  fwlite::TFileService fs = fwlite::TFileService(outputFile_.c_str());
  TFileDirectory dir = fs.mkdir("analyzePatMuon");
  TH1F* muonPt_  = dir.make<TH1F>("muonPt" , "pt" , 100,  0., 300.);
  TH1F* muonEta_ = dir.make<TH1F>("muonEta", "eta", 100, -3.,   3.);
  TH1F* muonPhi_ = dir.make<TH1F>("muonPhi", "phi", 100, -5.,   5.);

  // loop the events
  int ievt = 0;
  for(unsigned int iFile = 0; iFile < inputFiles_.size(); ++iFile){
    // open input file (can be located on castor)
    TFile* inFile = TFile::Open(inputFiles_[iFile].c_str());
    if( inFile ){
      // ----------------------------------------------------------------------
      // Second Part:
      //
      //  * loop the events in the input file
      //  * receive the collections of interest via fwlite::Handle
      //  * fill the histograms
      //  * after the loop close the input file
      // ----------------------------------------------------------------------
      fwlite::Event ev(inFile);
      for(ev.toBegin(); !ev.atEnd(); ++ev, ++ievt){
        edm::EventBase const & event = ev;
        // break loop if maximal number of events is reached
        if(maxEvents_ > 0 ? ievt + 1 > maxEvents_ : false) break;
        // simple event counter
        if(outputEvery_ != 0 ? (ievt > 0 && ievt % outputEvery_ == 0) : false)
          std::cout << "  processing event: " << ievt << std::endl;

        // Handle to the muon pt
        edm::Handle<std::vector<float> > muonPt;
        event.getByLabel(std::string("patMuonAnalyzer:pt"), muonPt);
        // loop muon collection and fill histograms
        for(std::vector<float>::const_iterator mu1 = muonPt->begin(); mu1 != muonPt->end(); ++mu1){
          muonPt_->Fill(*mu1);
        }
        // Handle to the muon eta
        edm::Handle<std::vector<float> > muonEta;
        event.getByLabel(std::string("patMuonAnalyzer:eta"), muonEta);
        for(std::vector<float>::const_iterator mu1 = muonEta->begin(); mu1 != muonEta->end(); ++mu1){
          muonEta_->Fill(*mu1);
        }
        // Handle to the muon phi
        edm::Handle<std::vector<float> > muonPhi;
        event.getByLabel(std::string("patMuonAnalyzer:phi"), muonPhi);
        for(std::vector<float>::const_iterator mu1 = muonPhi->begin(); mu1 != muonPhi->end(); ++mu1){
          muonPhi_->Fill(*mu1);
        }
      }
      // close input file
      inFile->Close();
    }
    // break loop if maximal number of events is reached:
    // this has to be done twice to stop the file loop as well
    if(maxEvents_ > 0 ? ievt + 1 > maxEvents_ : false) break;
  }
  return 0;
}
Note: You can find the full example in PhysicsTools/PatExamples/bin/PatMuonEDMAnalyzer.cc. It follows the implementation of a typical compiled FWLite example as described in WorkBookFWLiteExamples. You can run the example as follows:
PatMuonEDMAnalyzer inputFiles=edmTuple.root outputFile=analyzePatMuons.root
Have a look into the implementation of the executable to find out what to expect as output. After the explanations above it should be easy for you to figure it out.
Note: It is not necessary to produce the EDM-tuple from a persistent set of pat::Candidates (a PAT tuple). You can create pat::Candidates 'on the fly' without making them persistent. An example for such an EDM-tuple produced directly from a RECO file, without making the pat::Candidate collections persistent, is given below:
## import skeleton process
from PhysicsTools.PatAlgos.patTemplate_cfg import *

## ---
## This is an example of the use of the plain edm::Tuple dumper to analyze pat::Muons
## ---
process.patMuonAnalyzer = cms.EDProducer("CandViewNtpProducer",
    src = cms.InputTag("selectedPatMuons"),
    lazyParser = cms.untracked.bool(True),
    prefix = cms.untracked.string(""),
    eventInfo = cms.untracked.bool(True),
    variables = cms.VPSet(
        cms.PSet(tag = cms.untracked.string("pt"),  quantity = cms.untracked.string("pt") ),
        cms.PSet(tag = cms.untracked.string("eta"), quantity = cms.untracked.string("eta")),
        cms.PSet(tag = cms.untracked.string("phi"), quantity = cms.untracked.string("phi")),
    )
)

## let it run
process.p = cms.Path(
    process.patDefaultSequence *
    process.patMuonAnalyzer
)

process.out.fileName = "edmTuple.root"
process.out.outputCommands = ['drop *', 'keep *_patMuonAnalyzer_*_*']
Exercises
Before leaving this page try to do the following exercises:
Exercise 4 a): Write your own MyPatJetAnalyzer following the example of the PatMuonAnalyzer above. Choose the implementation as a (full framework) EDAnalyzer. Read in PAT jets and electrons and answer the following questions:
- What is the mean pt of the leading jet?
- What is the mean of the closest distance (in deltaR) between the leading electron and the closest pat default jet above 30 GeV?
Note: Do you still remember which jet algorithm you are using? To what jet energy correction (JEC) level is this jet corrected? What is the way to find out?
Exercise 4 b): Move the definition of the PatJetEDAnalyzer module from the configuration (cfg) file in the test directory to a cfi file in the python directory. Import the module and create two clones of it in your cfg file. Then import the jetSelector_cfi module from the selectionLayer1 directory of the PatAlgos package and create a collection of jets with a pt larger than 60 GeV. Replace the default input for the jet collection of one of the analyzer clones by this new collection. Now answer the following questions:
- What is the mean jet multiplicity of all jets and of all high pt jets?
- What is the difference in eta of all jets and of all high pt jets?
Exercise 4 c): Extend the PatMuonAnalyzer class by the UserVariable relIso as defined in Exercise 2 d) of Exercise 2. Have a look at SWGuidePATUserData to learn how to read this information from the pat::Muons. Fill a histogram with 50 bins from 0. to 100. Run over the ttbar RelVal input sample that is used in the patTuple_standard_cfg.py file.
- What is the mean of this variable?
Exercise 4 d): Create an EDM-tuple with PAT on the fly, as described on this TWiki, by customizing patTuple_standard_cfg.py.
Exercise 4 e): In the above example we made use of the BasicAnalyzer base class. It shows that, when restricted to the abilities of FWLite, the full framework and FWLite are literally the same. Have a look at WorkBookFWLiteExamples#ExampleFive to learn more about this. Investigate the necessary files in the PhysicsTools/PatUtils package and try to understand how the FWLiteAnalyzerWrapper and the EDAnalyzerWrapper work.
In case of problems don't hesitate to contact the PAT support team (see SWGuidePAT#Support). Having successfully finished Exercise 4, you are well prepared to do an analysis using PAT.
CONGRATULATIONS! You might still want to proceed to other exercises of the WorkBookPATTutorial to go deeper into the material and to learn more about the many possibilities to configure and customise your personal pat::Tuple.
Review status
4.2.5. Using PAT On Data
Contents
This is an example of how to run on data in the Jet PD.
4.2.5.1 Prescription
To learn more about what data are available for analysis and the conditions under which they have been taken, have a look at the PdmV2012Analysis page of the PPD.
A prescription to run on the first data in CMSSW_7_4_15 is available on SWGuidePATReleaseNotes41X. Please follow this link to keep up to date with the latest updates. Due to the rapidly changing start-up conditions in the software, it is beneficial to check this page often during the next few weeks.
Input files
The triggers for the Jet PD are listed here. Let's take all of them:
Jet = cms.vstring( 'HLT_DiJetAve100U_v4',
'HLT_DiJetAve140U_v4',
'HLT_DiJetAve15U_v4',
'HLT_DiJetAve180U_v4',
'HLT_DiJetAve300U_v4',
'HLT_DiJetAve30U_v4',
'HLT_DiJetAve50U_v4',
'HLT_DiJetAve70U_v4',
'HLT_Jet110_v1',
'HLT_Jet150_v1',
'HLT_Jet190_v1',
'HLT_Jet240_v1',
'HLT_Jet30_v1',
'HLT_Jet370_NoJetID_v1',
'HLT_Jet370_v1',
'HLT_Jet60_v1',
'HLT_Jet80_v1' ),
The first few files are given below:
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/E4D2CB53-9881-E011-8D99-003048F024DC.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/DC453887-7481-E011-89B9-001617E30D0A.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/DABFB9E8-9B81-E011-9FF7-0030487CD812.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/D4E6F338-9B81-E011-A8DE-003048F110BE.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/CEC22BF3-9681-E011-AA91-003048CFB40C.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/BC737791-9581-E011-90E2-0030487CD6DA.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/B6E2792E-8881-E011-9C34-000423D9A212.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/A47B3EF5-9D81-E011-AC11-003048F024FA.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/84253FEF-9B81-E011-9DB7-001617DBD230.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/80749EFD-AB81-E011-A6E3-0030487CD76A.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/749CCE90-9581-E011-862A-000423D9A212.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/6CDB388A-9681-E011-B324-0030487CD178.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/464C7F95-9581-E011-BAEC-0030487CAEAC.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/42B01B46-9D81-E011-AF75-003048F1C58C.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/36C0F1A9-1382-E011-A5B5-003048F11942.root',
'/store/data/Run2011A/Jet/AOD/PromptReco-v4/000/165/121/10691D45-9D81-E011-9A9A-003048F118C2.root',
These files are used in the configuration file in the next section.
Making PAT-tuples
The PAT-tuples are made (without DCSTRONLY for the moment) with this configuration file. This file will produce pat::Jets from reco::PFJets and use the PFJetIDSelectionFunctor in the full framework to write out only the jets that pass jet ID. Thus it is unnecessary to apply jet ID in FWLite.
To run it do the following:
cmsRun PhysicsTools/PatExamples/test/patTuple_data_cfg.py
This will create a file called "jet2011A_aod.root" out of the first 1000 events. The event content is:
cmslpc06:$ edmDumpEventContent jet2011A_aod.root
Type Module Label Process
---------------------------------------------------------------------------------------------
edm::TriggerResults "TriggerResults" "" "HLT"
trigger::TriggerEvent "hltTriggerSummaryAOD" "" "HLT"
L1GlobalTriggerReadoutRecord "gtDigis" "" "RECO"
edm::ConditionsInEventBlock "conditionsInEdm" "" "RECO"
edm::SortedCollection<CaloTower,edm::StrictWeakOrdering<CaloTower> > "towerMaker" "" "RECO"
edm::TriggerResults "TriggerResults" "" "RECO"
reco::BeamSpot "offlineBeamSpot" "" "RECO"
vector<reco::Track> "generalTracks" "" "RECO"
vector<reco::Vertex> "offlinePrimaryVertices" "" "RECO"
vector<reco::Vertex> "offlinePrimaryVerticesWithBS" "" "RECO"
edm::OwnVector<reco::BaseTagInfo,edm::ClonePolicy<reco::BaseTagInfo> > "selectedPatJets" "tagInfos" "PAT"
edm::TriggerResults "TriggerResults" "" "PAT"
vector<CaloTower> "selectedPatJets" "caloTowers" "PAT"
vector<pat::Electron> "selectedPatElectrons" "" "PAT"
vector<pat::Jet> "goodPatJets" "" "PAT"
vector<pat::MET> "patMETs" "" "PAT"
vector<pat::MET> "patMETsPF" "" "PAT"
vector<pat::Muon> "selectedPatMuons" "" "PAT"
vector<pat::Photon> "selectedPatPhotons" "" "PAT"
vector<reco::GenJet> "selectedPatJets" "genJets" "PAT"
vector<reco::PFCandidate> "selectedPatJets" "pfCandidates" "PAT"
Running with FWLite
In order to analyse this in FWLite, use this executable and this configuration file:
PatBasicFWLiteJetAnalyzer PhysicsTools/PatExamples/bin/analyzePatJetFWLite_cfg.py
This will make some distributions of the Jets in the event and write them to the file analyzePatBasics.root. The histograms are:
KEY: TH1F jetPt;1 pt
KEY: TH1F jetEta;1 eta
KEY: TH1F jetPhi;1 phi
KEY: TH1F disc;1 Discriminant
KEY: TH1F constituentPt;1 Constituent pT
Have fun!!!
--
SalvatoreRoccoRappoccio - 21-Mar-2011
4.2.6 Physics Analysis Toolkit (PAT): Glossary
Detailed Review status
Contents
A
addJetCollection:
pat::Tool to add an arbitrary jet collection to the pat::EventContent and the PAT Workflow. The source code is located in the tools directory of the PatAlgos package. You can find a full description of the tool at SWGuidePATTools#Jet_Tools; a rough usage sketch is given below.
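For orientation, a rough sketch of its use follows; the exact argument list varies between CMSSW releases, so treat the labels shown here as illustrative assumptions:

## add a second jet collection, made from anti-kt R=0.5 particle-flow jets,
## to the PAT workflow ('AK5'/'PF' are illustrative algorithm/type labels)
from PhysicsTools.PatAlgos.tools.jetTools import addJetCollection
addJetCollection(process, cms.InputTag('ak5PFJets'), 'AK5', 'PF')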
B
C
pat::Candidate:
The CMSSW Event Data Model (EDM) is optimised for space consumption. The Data Formats of reconstructed objects contain only very basic information, while additional information might be added in the form of extra Data Formats, which themselves might be distributed over the whole event content. This makes it difficult for the end user to access all information that is necessary and available for analysis. The pat::Candidate is the common Data Format of PAT. Each pat::Candidate is derived from a corresponding reco::Candidate, including a user configurable set of extra information. This extra information might already be part of the event content (only being re-keyed) or newly created before being folded into the pat::Candidate Data Format. Examples for such information are:
- isoDeposits
- electronId
- b-tag information
- Monte Carlo truth matching
- object resolutions
- jet energy correction factors
Apart from the generic support of Monte Carlo truth matching and object disambiguation, one of the key features of PAT is object embedding. At the moment the reco::Candidate Data Formats listed in the corresponding glossary entries below (electrons, jets, MET, muons, photons and taus) are canonically supported.
The pat::Candidate collections are produced during the PAT Workflow. The main collections are the selectedPatCandidate and the cleanPatCandidate collections. Due to the feature of embedding, the size of a pat::Candidate collection, as well as that of the whole pat::EventContent, is configurable, ranging between 6 kB/evt and 36 kB/evt for reasonable configurations. The persistent output of the PAT Workflow is often referred to as a pat::Tuple. To learn more about the size estimate of a pat::Tuple have a look at SWGuidePATEventSize. To learn more about the Data Formats have a look at WorkBookPATDataFormats.
cleanPatCandidates:
This is a common reference to the collection labels of the PAT Candidate collections that contain extra information about object disambiguation. This information is fully user configurable. You can find all currently supported cleanPatCandidate collection labels under the description of the pat::Candidate. A suggested configuration exists in the cleaningLayer1 directory of the PatAlgos package. To learn more about the support of object disambiguation by PAT have a look at SWGuidePATCrossCleaning.
Cross Cleaning:
By construction, energy deposits in the CMS detector may be interpreted differently depending on the analyst's view or the corresponding analysis purpose. For several analyses, which mostly aim at the interpretation of a combination of different objects, it is necessary to resolve such ambiguities.
The disambiguation of overlapping objects is often referred to as object Cross Cleaning. This name is a bit misleading, as the disambiguation of objects does not necessarily mean that elements will be erased from object collections. PAT supports object disambiguation by adding extra information about overlapping objects, which is well defined and completely user configurable. No elements are removed from the object collections in the default configuration. A typical example of overlapping objects is an electron, which may be interpreted as a photon or a jet at the same time. To learn more about the support of object disambiguation by PAT have a look at SWGuidePATCrossCleaning; a configuration sketch is given below.
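For illustration, a minimal sketch of how overlap information is configured is given below. It follows the checkOverlaps parameter structure of the cleaning modules in the cleaningLayer1 directory of PatAlgos; the concrete values are illustrative assumptions:

## mark jets that overlap with a clean muon within deltaR < 0.5;
## nothing is removed unless requireNoOverlaps is set to True
process.cleanPatJets.checkOverlaps.muons = cms.PSet(
    src                 = cms.InputTag("cleanPatMuons"),
    algorithm           = cms.string("byDeltaR"),
    preselection        = cms.string(""),
    deltaR              = cms.double(0.5),
    checkRecoComponents = cms.bool(False),
    pairCut             = cms.string(""),
    requireNoOverlaps   = cms.bool(False),
)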
D
E
pat::Electron:
pat::Candidate corresponding to the reco::GsfElectron. You can learn more about the pat::Electron at WorkBookPATDataFormats#PatElectron.
Embedding:
The Event Data Model (EDM) is optimised for disc space consumption. High level analysis objects like electrons are reconstructed from basic reconstruction objects like super clusters or tracks, which might again consist of base clusters or reconstructed hits. All higher level analysis objects consist of smart links (pointers) to the lower level reconstruction objects they are made up from. In addition, extra information like track extras might be removed from the track object and kept in an extra data format in order to keep the track data format small. There is no correlation between different object collections other than via such smart pointers. As parts of these collections might be dropped from the event content at later steps of the event processing, these smart pointer correlations might be broken (dangling) and point nowhere. This architecture makes it extremely complicated for standard users to keep track of where the information of the high level analysis objects is localised, as it might be distributed all over the event content; this makes it difficult to drop information that is not needed for the analysis. Moreover it reduces the flexibility of reducing the event size, which also has an influence on the runtime performance of the analysis: the user might, for instance, be interested only in the calorimeter towers of which the jets in a certain jet collection are made up, but would have to keep the whole calorimeter tower collection in the event content, which requires a sizable amount of disc space even if the fraction of calorimeter towers of interest is much smaller.
Embedding is PAT's answer to issues of this kind. Via configuration the user may choose to embed certain object information into the pat::Candidate during the production step. This information will be hard copied into the pat::Candidate and may then be dropped from the event content. In the example of the jet collection, only the calorimeter towers that are part of a jet will be kept. This process is fully transparent to the user: all member functions of the pat::Candidate are used the same way, independent of whether the corresponding information has been embedded before or is internally still accessed by reference. To learn more about object embedding within PAT have a look at the WorkBookPATWorkflow or the WorkBookPATConfiguration. To learn more about how embedding works have a look at the SWGuidePATEmbeddingExercise. A configuration sketch is given below.
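A minimal sketch of how embedding is steered is given below; it assumes the embedCaloTowers switch of the patJets module as defined in the 5_3_X-era patJets_cfi (treat the parameter name as illustrative for other releases):

## embed the constituent calo towers into the pat::Jets, so that the
## original calo tower collection can be dropped from the event content
process.patJets.embedCaloTowers = cms.bool(True)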
pat::EventContent:
Event content of a pat::Tuple as defined in the file https://github.com/cms-sw/cmssw/blob/CMSSW_5_3_X/PhysicsTools/PatAlgos/python/patEventContent_cff.py when making the event content persistent. In the default configuration all cleanPatCandidates are written to the event. When removing pat::Cleaning from the workflow, the selectedPatCandidates will be made persistent instead. To get an impression of the size of a typical pat::Tuple as derived from the standard ttbar input files, have a look here. The patEventContent_cff.py file also pre-defines other vectors of useful event information. Of course the user is free to further customize the event content to his/her needs, as sketched below. To learn more about the customization of the event content of your personal pat::Tuple have a look at SWGuidePATConfigExercise. To learn more about tools to estimate the size of your private pat::Tuple have a look at SWGuidePATEventSize.
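For orientation, a minimal sketch of how the pre-defined event content vectors can be used in an output module is given below (process.out is assumed to be a PoolOutputModule as in the examples above, and the extra keep statement is an illustrative assumption):

## start from the pre-defined PAT event content and extend it by hand
from PhysicsTools.PatAlgos.patEventContent_cff import patEventContent
process.out.outputCommands += patEventContent
process.out.outputCommands += ['keep *_offlinePrimaryVertices_*_*']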
F
G
H
I
J
pat::Jet:
pat::Candidate corresponding to the reco::Jet. The pat::Jet may carry the extra information of a reco::CaloJet, a reco::PFJet or a reco::JPTJet. It can carry much more information, depending on the choice of the user. You can learn more about the pat::Jet at WorkBookPATDataFormats#PatJet.
K
L
M
Monte Carlo truth matching:
Association of generated event information to reconstructed object information for simulated events. Note that generator information (generator particles or generator particle based jets) is matched to reconstructed objects and not vice versa. The Analysis Tools packages provide sophisticated tools to cover all issues of matching. Have a look at WorkBookMCTruthMatch to learn more about these. To learn more about how PAT exploits the analysis tools of MC matching have a look at SWGuidePATMCMatching. You can find the files that provide the default configuration of MC matching in the mcMatchLayer0 directory of the PatAlgos package.
pat::MET:
pat::Candidate corresponding to the reco::MET. You can learn more about the pat::MET at WorkBookPATDataFormats#PatMET.
pat::Muon:
pat::Candidate corresponding to the reco::Muon. You can learn more about the pat::Muon at WorkBookPATDataFormats#PatMuon.
N
O
P
PAT:
The Physics Analysis Toolkit (PAT) is a high-level analysis layer providing the Physics Analysis Groups (PAGs) with easy access to the algorithms developed by Physics Objects Groups (POGs) in the framework of the CMSSW offline software. It aims at fulfilling the needs of most CMS analyses, providing both ease-of-use for beginners and flexibility for advanced users. PAT is fully integrated into CMSSW and an integral part of any release of CMSSW. You can find more information about PAT in the
WorkBookPAT and the complete documentation in the
SWGuidePAT.
pat:
The lower-case abbreviation pat is the common namespace label for PAT. You will mostly find it as the namespace label for the pat::Candidates, but we also use it sometimes for more abstract expressions like pat::Tuple, pat::EventContent or pat::Workflow.
Patification:
The act of creating a pat::Tuple or running the PAT Workflow on the fly. You can call this a true slang expression that should not be used. Nevertheless you might stumble over it from time to time.
PF2PAT:
Common abbreviation for a tool that makes pat::Candidates from particle flow objects only. Even without the use of PF2PAT you have the possibility to add jet collections made from particle flow constituents or to make use of particle flow MET (pfMET); have a look at the pat::Tools to see how to do this. But PF2PAT gives you the most consistent interface of the particle flow algorithms to PAT. To learn more about the use of PF2PAT have a look at the SWGuidePF2PAT; a sketch is given below.
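For orientation, a minimal sketch of switching a PAT configuration to PF2PAT is given below; it assumes the usePF2PAT helper from the 5_3_X-era pfTools, and the optional argument values shown are illustrative:

## replace the standard PAT input by particle-flow based objects
from PhysicsTools.PatAlgos.tools.pfTools import usePF2PAT
usePF2PAT(process, runPF2PAT=True, jetAlgo='AK5', runOnMC=True, postfix='PFlow')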
pat::Photon:
pat::Candidate corresponding to the reco::Photon. You can learn more about the pat::Photon at WorkBookPATDataFormats#PatPhoton.
PU:
Pile Up.
Q
R
S
selectedPatCandidates:
This is a common reference to the collection labels of the PAT Candidate collections that have passed the selection step of the PAT Workflow but do not contain any extra information about object disambiguation. In the default configuration no selection is applied to the reconstructed objects. To learn more about this phase of the PAT Workflow have a look at WorkBookPATWorkflow#SelectedCandidate. You can find the configuration in the selectionLayer1 directory of the PatAlgos package. To learn more about the tool to switch from the PAT default configuration including information about object disambiguation (cleanPatCandidates) to the selectedPatCandidates, have a look at SWGuidePATTools#CMS.Core_Tools; a sketch is given below.
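A minimal sketch of this switch is given below; it assumes the removeCleaning helper from the 5_3_X-era coreTools (treat the exact import path and name as an assumption for other releases):

## drop the cleaning step, so that the selectedPatCandidates
## are made persistent instead of the cleanPatCandidates
from PhysicsTools.PatAlgos.tools.coreTools import removeCleaning
removeCleaning(process)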
switchJetCollection:
pat::Tool to switch the default jet collection (ak5Calo) to an arbitrary jet collection in the PAT Workflow. The source code is located in the tools directory of the PatAlgos package. You can find a full description of the tool at SWGuidePATTools#Jet_Tools; a sketch is given below.
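For orientation, a rough sketch is shown below; as with addJetCollection, the argument list differs between releases, so treat this call as illustrative:

## replace the default calo jets by anti-kt R=0.5 particle-flow jets
from PhysicsTools.PatAlgos.tools.jetTools import switchJetCollection
switchJetCollection(process, cms.InputTag('ak5PFJets'))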
T
pat::Tau:
pat::Candidate corresponding to the reco::PFTau. You can learn more about the pat::Tau at WorkBookPATDataFormats#PatTau.
pat::Tools:
A common expression for a large set of tools to modify and customize the PAT Workflow and/or event content. You can apply these tools directly in your config file or, in a more intuitive way, via the edmConfigEditor. To learn more about the available pat::Tools have a look at SWGuidePATTools. They also include tools to process input files from earlier versions of CMSSW.
pat::TriggerEvent:
In addition to the trigger::TriggerEvent, PAT provides a pat::TriggerEvent to facilitate access to L1 and HLT trigger information. The pat::TriggerEvent is the common entry point to all trigger information in PAT. To learn more about it have a look at SWGuidePATTrigger; a sketch of how to switch it on is given below.
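A minimal sketch of adding the PAT trigger information to a configuration is given below; it assumes the switchOnTrigger helper from trigTools (check your release for its exact behaviour):

## add the pat::TriggerEvent and the pat::TriggerObjects to the workflow
from PhysicsTools.PatAlgos.tools.trigTools import switchOnTrigger
switchOnTrigger(process)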
pat::Tuple:
This is the common expression for the persistent layer of PAT, characterized by the pat::EventContent. Whenever you create an EDM file containing pat::Candidates during the workbook exercises we refer to it as a pat::Tuple. In real life the pat::Tuple is an EDM file that contains pat::Candidates and is tailored to the user's needs. It sustains full event provenance information and can be used with full CMSSW, FWLite, or plain root. As the content of a pat::Tuple can be very different from configuration to configuration (even down to the content of a pat::Candidate within a certain collection), it is not comparable to a data tier like RECO or AOD. You should rather view it as a better replacement of a flat user ntuple, which keeps all features of the EDM and can still be transformed into a flat ntuple at later stages of the analysis. Studies show that this is not necessary though: containing the same event information, there is hardly any difference in data access and runtime performance compared to a flat ntuple. Have a look here to get an idea of the current size of a pat::Tuple in the default configuration. It is of course not necessary to make the pat::Tuple persistent to work with pat::Candidate collections. At your choice you can produce the pat::Tuple also on the fly.
U
V
W
pat::Workflow:
This is the common workflow for the creation of a pat::Tuple. It consists of a preparation phase followed by three further phases. When adding PAT trigger information to the pat::Candidates it might go through another phase. Have a look at WorkBookPATWorkflow to learn more about the details.
X
Y
Z
Review status
4.3 Particle Candidates Utilities and Framework Modules
Complete:
Detailed Review status
Goals of this page:
This page is intended to familiarize you with the common set of classes and tools used to develop modular Physics Analysis software using the Framework and Event Data Model.
Contents
Candidate Framework Modules
Generic framework modules to manipulate Particle Candidates are provided.
In particular:
Candidate Selectors
The module CandSelector selects Particle Candidates with cuts that can be specified by the user via a configurable string. An example configuration is:
process.goodMuons = cms.EDFilter("CandSelector",
src = cms.InputTag("selectedLayer1Muons"),
cut = cms.string("pt > 5.0")
)
This will take the PAT muons as described here and select those which have a transverse momentum larger than 5 GeV/c.
More details on available candidate selectors can be found in:
Candidate Combiners
Combiner modules compose other particles to create CompositeCandidates. The daughter particles' kinematics are copied into the composite object, and links to the original "master" particles can be stored using ShallowCloneCandidate.
An example of usage of such modules is the following:
process.zToMuMu = cms.EDProducer("CandViewShallowCloneCombiner",
decay = cms.string('selectedLayer1Muons@+ selectedLayer1Muons@-'),
cut = cms.string('50 < mass < 120'),
)
This will take the PAT layer 1 muons that have opposite sign and combine them into Z candidates, throwing away those candidates outside of the mass range from 50 to 120 GeV/c2.
More details on available candidate selectors can be found in the following document:
It is also possible to specify an optional name and daughter roles for these tools, like:
process.zToMuMu = cms.EDProducer("CandViewShallowCloneCombiner",
decay = cms.string('selectedLayer1Muons@+ selectedLayer1Muons@-'),
cut = cms.string('50.0 < mass < 120.0'),
name = cms.string('zToMuMu'),
roles = cms.vstring('muon1', 'muon2')
)
This will automatically assign names and roles as described here. The rest of the functionality is the same as in the previous example.
Other Modules
A more complete list of the available modules to manipulate
collections of candidates can be found in:
Candidate Utilities
Utilities are provided to perform the most common operations on
Particle Candidates.
Overlap Checking
Overlap between two candidates occurs if the two candidates,
or any of their daughters, share one of the components
(a track, a super-cluster, etc.). The utility
OverlapChecker
checks for overlap occurrences:
#include "DataFormats/Candidate/interface/OverlapChecker.h"
OverlapChecker overlap;
const Candidate & c1 = ..., & c2 = ...;
if (overlap( c1, c2 ) ) { ... }
Note: this overlap checking mechanism only looks for identical components in the candidate decay chain, but has no way to check, for instance, if two candidates are made from tracks or clusters that share a common set of hits. More advanced utilities should be used for such more refined overlap checking.
Candidate Setup Utilities
"Setup" utilities allow you to modify the candidate content (momentum, vertex, ...).
The simplest provided setup utility is:
- AddFourMomenta: sets a candidate's 4-momentum by adding the 4-momenta of its daughters. In the following example, a CompositeCandidate is created, its two daughters are added to it, and its momentum is set as the sum of the daughters' four-momenta:
CompositeCandidate comp;
comp.addDaughter( dau1 );
comp.addDaughter( dau2 );
AddFourMomenta addP4;
addP4.set( comp );
Boosting Candidates
If you want to boost a candidate, you should make sure you can modify it. Candidates taken from an event collection are immutable, so you need to clone them before boosting. If you want to boost a candidate to another candidate's center-of-mass frame, you can do the following:
Candidate * c1clone = c1->clone();
Booster booster(c2->boostToCM());
booster.set(*c1clone);
Once boosted, if c1clone is a CompositeCandidate, all the daughters stored internally in it will also be boosted.
If you want to boost a CompositeCandidate and its daughters to its own center of mass, you can use the following example:
// create booster object
CenterOfMassBooster boost(h);
// clone object and update its kinematics
Candidate * higgs = h.clone();
boost.set(*higgs);
// get boosted Z as Higgs daughters
const Candidate * Z1 = higgs->daughter(0);
const Candidate * Z2 = higgs->daughter(1);
In the above example, the Z daughters (leptons) will also be boosted. Booster utilities are defined in CMS.PhysicsTools/CandUtils.
Common Vertex Fitter
The common vertex fitter is a "setup" operator using the Vertex Fitter tools for track collections. It requires the magnetic field map to be passed to the algorithm, which can be obtained from the EventSetup:
ESHandle<CMS.MagneticField> B;
es.get<IdealMagneticFieldRecord>().get(B);
CandCommonVertexFitter<KalmanVertexFitter> fitter;
fitter.set(B.product());
const Candidate & zCand = ...; // get a Z candidate
VertexCompositeCandidate fittedZ(zCand);
fitter.set(fittedZ);
More details on:
Candidates and Monte Carlo Truth
Candidates used to represent Monte Carlo truth from the generator output can be matched to RECO objects using a set of common tools described in the document below:
Review status
Responsible:
LucaLista
Last reviewed by:
PetarMaksimovic - 28 Feb 2008
4.4 Generator event format in AOD
Complete:
Detailed Review status
Goals of this page:
This page documents event generator format stored in AOD.
Contents
Introduction
The generator event in AOD takes about half the size of a native HepMC event. This is achieved using a particle representation that inherits from reco::Candidate.
GenParticle: Generator Particle Candidate
Generator particles are represented by the class GenParticle, which inherits from reco::Candidate. It contains a four-momentum, charge, vertex and:
- a PDG identifier (pdg_id() in HepMC::GenParticle)
- a status code (status() in HepMC::GenParticle). Standard status codes are described in the HepMC manual, and have the following convention in Pythia6 (see below):

0       | null entry |
1       | particle not decayed or fragmented, represents the final state as given by the generator |
2       | decayed or fragmented entry (i.e. decayed particle or parton produced in a shower) |
3       | identifies the "hard part" of the interaction, i.e. the partons that are used in the matrix element calculation, including immediate decays of resonances (documentation entry, defined separately from the event history; "This includes the two incoming colliding particles and partons produced in hard interaction." [ * ]) |
4-10    | undefined, reserved for future standards |
11-200  | at the disposal of each model builder, equivalent to a null line |
201-... | at the disposal of the user, in particular for event tracking in the detector |
IMPORTANT: Other generators have other conventions (e.g. those for Pythia8 are described in the documentation of the statusHepMC function in http://home.thep.lu.se/~torbjorn/pythia81html/EventRecord.html; they mainly differ from the Pythia6 ones in that status = 3 is no longer used). The only thing you can really rely on is that 'stable' particles (i.e. those handed over to Geant or FastSim in the simulation, which are those with a decay length > 2 cm) have status = 1.
For some other generators' conventions, see for instance:
Generator Particles Collections
The default generator particle collection is reco::GenParticleCollection, which is a typedef for std::vector<reco::GenParticle>. Generator particles contain mother and daughter links to particles in the same collection, as sketched below.
An example to access this collection is the following analyzer code fragment:
#include "DataFormats/HepMCCandidate/interface/GenParticle.h"
using namespace reco;
void MyModule::analyze(const edm::Event & iEvent, ...) {
Handle<GenParticleCollection> genParticles;
iEvent.getByLabel("genParticles", genParticles);
for(size_t i = 0; i < genParticles->size(); ++ i) {
const GenParticle & p = (*genParticles)[i];
int id = p.pdgId();
int st = p.status();
const Candidate * mom = p.mother();
double pt = p.pt(), eta = p.eta(), phi = p.phi(), mass = p.mass();
double vx = p.vx(), vy = p.vy(), vz = p.vz();
int charge = p.charge();
size_t n = p.numberOfDaughters();
for(size_t j = 0; j < n; ++ j) {
const Candidate * d = p.daughter( j );
int dauId = d->pdgId();
// . . .
}
// . . .
}
}
GenParticle Conversion from HepMCProduct
GenParticles are saved by default in FEVT, RECO and AOD. If you run the generator yourself, you may need to convert the output of the generator in HepMC format to the AOD format using the module GenParticleProducer. You need to add the following configuration files to your script:
include "SimGeneral/HepPDTESSource/data/pythiapdt.cfi"
include "CMS.PhysicsTools/HepMCCandAlgos/data/genParticles.cfi"
and remember to add the genParticles module at the beginning of your path, as sketched below.
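In the python configuration language the same setup would look roughly as below; the cfi paths follow the python layout of later releases and should be treated as assumptions, and myAnalyzer stands for whatever module consumes the genParticles:

## load the particle data table and the GenParticle producer,
## then run the conversion at the beginning of the path
process.load("SimGeneral.HepPDTESSource.pythiapdt_cfi")
process.load("PhysicsTools.HepMCCandAlgos.genParticles_cfi")
process.p = cms.Path(process.genParticles * process.myAnalyzer)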
Decay Tree Drawing Utilities
The modules ParticleTreeDrawer and ParticleDecayDrawer print the generated decay tree to allow for a visual inspection. The modules can be configured with different options to print different levels of detail. More information in the page below:
Particle List Utility
The module ParticleListDrawer dumps the full generated event in a way similar to the command PYLIST in Pythia (a configuration sketch is given below). More information in the page below:
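A minimal configuration sketch follows; the module label printList is an illustrative choice, and only the most common parameters are shown:

## dump the generator particle listing of the first event to the log
process.printList = cms.EDAnalyzer("ParticleListDrawer",
    src = cms.InputTag("genParticles"),
    maxEventsToPrint = cms.untracked.int32(1)
)
process.printPath = cms.Path(process.printList)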
Related Documents
Review status
Responsible:
LucaLista
Last reviewed by:
PetarMaksimovic - 28 Feb 2008
4.5 MC Truth Matching Tools
Complete:
Detailed Review status
Goals of this page:
This page is intended to familiarize you with the tools
to match the reconstructed object to generated particle.
Contents
Purpose
Provide common tools to make writing Monte Carlo generator truth matching easier.
Introduction
The matching tools assume we use the new format for the HepMCProduct, based on Particle Candidates.
The Monte Carlo truth matching of a composite object (for instance, a reconstructed
Z→μ+μ-) is done in two steps:
- matching of final state particles to final state generator particles (in this case the muons)
- automatic matching of reconstructed composite objects to composite MC parents (in this case the Z)
The output product of the first stage is a one-to-one
AssociationMap
that can be stored in the event, and works as input for the second step.
These matching tools are based on:
Matching Format
Using AssociationMap
Object matchings are stored in the event in the format of a one-to-one AssociationMap. For each matched object of a collection A, a unique object in a collection B is stored. For convenience, the following typedef is defined in DataFormats/Candidate/interface/CandMatchMap.h:
namespace reco {
typedef edm::AssociationMap<
edm::OneToOne<reco::CandidateCollection, reco::CandidateCollection>
> CandMatchMap;
}
An example of code accessing an association is the following:
Handle<CandMatchMap> match;
event.getByLabel( "zToMuMuGenParticlesMatch", match );
CandidateRef cand = ...; // get your reference to a candidate
CandidateRef mcMatch = (*match)[ cand ];
You can access those maps in FWLite. Internally, the association map stores the indices of the matched objects in the two source collections in the data member map_. For instance, to plot the reconstructed muon pt versus the true pt from generator particles, you can use the following interactive ROOT command (all on a single line):
Events.Draw("allMuons.data_[allMuonsGenParticlesMatch.map_.first].pt():genParticleCandidates.data_[allMuonsGenParticlesMatch.map_.second].pt()")
In the above example, the following branches have been used:
- allMuons: the collection of muon candidates
- genParticleCandidates: the collection of generator level particles
- allMuonsGenParticlesMatch: the name of the module used to create the match map of muon candidates to generator particles. The ROOT branch containing the map will have an alias identical to the module name.
Some problems with ROOT managing dictionaries properly have been experienced, so some expected interactive use patterns may give problems. Please report them to the Analysis Tools group.
Using Association
It is also possible to use object matchings in the format of a one-to-one Association. For each matched object of any type, a unique object in a collection of GenParticle objects is stored. For convenience, the following typedef is defined in DataFormats/HepMCCandidate/interface/GenParticleFwd.h:
namespace reco {
typedef edm::Association<GenParticleCollection> GenParticleMatch;
}
An example of code accessing an association is the following:
Handle<GenParticleMatch> match;
event.getByLabel( "zToMuMuGenParticlesMatch", match );
CandidateRef cand = ...; // get your reference to a candidate
GenParticleRef mcMatch = (*match)[cand];
Accessing those kinds of maps in FWLite is possible. It is trivial when objects from a single collection are matched (internally you basically store a vector of matched indices), but interpreting the data may require non-trivial unpacking if more than one collection is matched. Work is ongoing on the EDM side to review association maps in general.
ΔR Matching Modules
Using MCTruthDeltaRMatcher
The module MCTruthDeltaRMatcher defined in CMS.PhysicsTools/HepMCCandAlgos matches a candidate collection to its MC truth parents based on a maximum ΔR. Optionally, only particles with a given PDG id are matched. One example of configuration is the following:
process.selectedMuonsGenParticlesMatch = cms.EDProducer( "MCTruthDeltaRMatcher",
src = cms.InputTag("selectedLayer1Muons"),
matched = cms.InputTag("genParticleCandidates"),
distMin = cms.double(0.15),
matchPDGId = cms.vint32(13)
)
This example will create a map between the selected PAT layer 1 muons and the generator level muons that are matched within a cone of 0.15.
The following cfi.py files are provided in the CMS.PhysicsTools/HepMCCandAlgos directory to match pre-defined candidate sequences:
Using MCTruthDeltaRMatcherNew
The module MCTruthDeltaRMatcherNew defined in CMS.PhysicsTools/HepMCCandAlgos matches any collection matching edm::View<Candidate> to its MC truth parents based on a maximum ΔR. Optionally, only particles with a given PDG id are matched. One example of configuration is the following:
process.selectedMuonsGenParticlesMatchNew = cms.EDProducer( "MCTruthDeltaRMatcherNew",
src = cms.InputTag("selectedLayer1Muons"),
matched = cms.InputTag("genParticleCandidates"),
distMin = cms.double(0.15),
matchPDGId = cms.vint32(13)
)
Composite Object Matching
Using MCTruthCompositeMatcher
Once you have done RECO-MC truth matching for final state particles, you may want to reconstruct composite objects, like Z→μ+μ- or H→ZZ→μ+μ-e+e-, and then find the corresponding parent match (if the Z or Higgs is correctly reconstructed) in the Monte Carlo truth. The module MCTruthCompositeMatcher creates an association map of reconstructed composite objects to their corresponding generator parent, based on the association of their daughters. The following example, taken from the Electro-Weak Z→μ+μ- skim, matches Z→μ+μ- candidates to MC truth based on the matching of the muon daughters:
process.zToMuMuGenParticlesMatch = cms.EDProducer( "MCTruthCompositeMatcher",
src = cms.InputTag("zToMuMu"),
matchMaps = cms.VInputTag("selectedLayer1MuonsGenParticlesMatch")
)
Using MCTruthCompositeMatcherNew
The module MCTruthCompositeMatcherNew creates an association map of reconstructed composite objects to their corresponding generator parent, based on the association of their daughters. One example of matching Z→μ+μ-, given a match map of muons to generator particles, is the following:
process.zToMuMuMCMatch = cms.EDProducer( "MCTruthCompositeMatcherNew",
src = cms.InputTag("zToMuMu"),
matchMaps = cms.VInputTag("selectedMuonsGenParticlesMatchNew")
)
Merging MC match maps
The new map type allows a very simple way to merge maps, whatever the type of the matched input collections. A single merged map can be saved instead of many individual maps, if needed. Merging is then trivial, following the example below:
process.mergedMCMatch = cms.EDProducer( "GenParticleMatchMerger",
src = cms.VInputTag("muonMCMatch", "electronMCMatch",
"trackMCMatch", "zToMuMuMCMatch", "zToEEMCMatch",
"HTo4lMCMatch")
)
*Warning*: merged maps are non-trivial to inspect via FWLite. Work is ongoing on the EDM side to review association maps.
Composite Matching Utility
Using MCCandMatcher<C1, C2>
The utility MCCandMatcher<C1, C2>, defined in CMS.PhysicsTools/HepMCCandAlgos, does the matching within your analyzer, if you prefer not to run a framework module. C1 and C2 can be reco::CandidateCollection, any collection of objects inheriting from reco::Candidate (like jets, electrons, muons, etc.), or edm::View<reco::Candidate>. The utility takes as input the one-to-one match map containing the final state matches, e.g. the one produced with the module MCTruthDeltaRMatcher described above.
The usage is the following:
// get the previously produced final state match
Handle<CandMatchMap> mcMatchMap;
evt.getByLabel( matchMap_, mcMatchMap );
// create the extended matcher that includes automatic parent matching
MCCandMatcher<reco::CandidateCollection, reco::CandidateCollection> match( *mcMatchMap );
// get your candidate
const Candidate & cand = ...
// find the match reference
CandidateRef mc = match( cand );
// access the matched parent if non-null
if ( mc.isNonnull() ) {
  int pdgId = mc->pdgId();
  double mass = mc->mass();
}
The matcher can also find matches for final state particles, simply by looking up the input final state match map.
Using utilsNew::CandMatcher<GenParticleCollection>
The utility MCCandMatcher should be replaced by utilsNew::CandMatcher<GenParticleCollection>, defined in CMS.PhysicsTools/CandUtils.
An example of usage is reported below:
using namespace edm;
using namespace std;
using namespace reco;
// get your collection of composite objects
Handle<CandidateView> cands;
evt.getByLabel(src_, cands);
// get your match maps for final state
// (electrons, muons, tracks, ...)
size_t nMaps = matchMaps_.size();
std::vector<const GenParticleMatch *> maps;
maps.reserve( nMaps );
for( size_t i = 0; i != nMaps; ++ i ) {
Handle<reco::GenParticleMatch> matchMap;
evt.getByLabel(matchMaps_[i], matchMap);
maps.push_back(& * matchMap);
}
// create cand matcher utility passing the input maps
utilsNew::CandMatcher<GenParticleCollection> match(maps);
int size = cands->size();
for( int i = 0; i != size; ++ i ) {
const Candidate & cand = (* cands)[i];
// get MC match for specific candidate
GenParticleRef mc = match[cand];
}
Complete Running Examples
An example of a complete analysis using the MC matching tools is the Z reconstruction skim from the EWK Analysis Group. See:
Generic Candidate Matching
Final state MC truth matching has been written in such a way that it can be extended to different types of matching, not necessarily MC truth matching. The generality could be further extended, if needed. A generic match module is defined in the package CMS.PhysicsTools/CandAlgos by the following template, defined in the namespace reco::modules:
template<typename S, typename D = DeltaR<reco::Candidate> >
class CandMatcher;
where S is a (typically, but not only, RECO-MC) pair selector type (see below), and D is a utility to measure the match "distance" between two Candidates. By default, the distance is measured as ΔR (the utility DeltaR is defined in CMS.PhysicsTools/CandUtils), but other criteria can be adopted and easily plugged in by the user.
An example of a RECO-MC pair selector S is defined in the package CMS.PhysicsTools/HepMCCandAlgos; it checks that the Monte Carlo particles belong to the final state (status = 1) and have the same charge as the particle to be matched:
struct MCTruthPairSelector {
explicit MCTruthPairSelector( const edm::ParameterSet & ) { }
bool operator()( const reco::Candidate & c, const reco::Candidate & mc ) const {
if ( reco::status( mc ) != 1 ) return false;
if ( c.charge() != mc.charge() ) return false;
return true;
}
};
This selection is applied before the ΔR cut.
The actual matcher module is defined in CMS.PhysicsTools/HepMCCandAlgos as:
typedef reco::modules::CandMatcher<
helpers::MCTruthPairSelector
> MCTruthDeltaRMatcher;
More details and usage examples can be found in:
Physics Object Matching
An extended version of the CandMatcher has been created for the physics object toolkit. The PhysObjectMatcher allows more general matching and ambiguity resolution for multiple matches. It can be configured using 5 template arguments:
C1 | The collection to be matched (e.g., a CandidateView of reconstructed objects) |
C2 | The target collection (e.g., a collection of GenParticles) |
S | A preselector for the match (e.g., a selection on PDG id or status) |
D | The class determining a match between two objects (default: deltaR) |
Q | The ranking of matches (default: by increasing deltaR) |
The module produces an Association from C1 to C2. Configuration parameters are
src | InputTag for C1 |
matched | InputTag for C2 |
resolveAmbiguities | bool to enable / disable the resolution of ambiguities. If false, each object in C1 is associated to the best match in C2, but several objects in C1 can point to the same object in C2. |
resolveByMatchQuality | bool to choose the type of ambiguity resolution. If true, multiple associations to the same object in C2 are resolved by choosing the best match; otherwise the match with the lowest index in C1 is chosen. |
Options for the helper classes used by PhysObjectMatcher are:
- S :
  - MCMatchSelector : preselection for MC matches; parameters: checkCharge (bool: use / ignore electrical charge), mcPdgId (vint32: MC particle codes), mcStatus (vint32: MC status codes)
  - DummyMatchSelector : no preselection
- D :
  - MatchByDR : deltaR match; parameter: maxDeltaR (cut on deltaR)
  - MatchByDRDPt : match by deltaR and relative deltaPt; parameters: maxDeltaR (cut on deltaR), maxDPtRel (cut on fabs(pt2-pt1)/pt2)
- Q :
  - reco::helper::LessByMatchDistance<D,C1,C2> : ranking by distance (e.g., DeltaR)
  - MatchLessByDPt : ranking by relative deltaPt
Some concrete matching modules are defined in PhysicsTools/HepMCCandAlgos/plugins/MCTruthMatchers.cc:
- MCMatcher : deltaR + deltaPt match between a CandidateView and a GenParticleCollection; ranking by deltaR
- MCMatcherByPt : as above, but ranking by deltaPt
- GenJetMatcher : deltaR match between a CandidateView and a GenJetCollection; ranking by deltaR
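As an illustration, one of these modules could be configured along the following lines (a sketch: the label muonMatch and the input tags are example choices, while the parameter names follow the tables above):
process.muonMatch = cms.EDProducer( "MCMatcher",
    src = cms.InputTag( "muons" ),            # C1: collection to be matched
    matched = cms.InputTag( "genParticles" ), # C2: target collection
    mcPdgId = cms.vint32( 13 ),               # S: preselect muons ...
    mcStatus = cms.vint32( 1 ),               # ... in the final state
    checkCharge = cms.bool( True ),
    maxDeltaR = cms.double( 0.5 ),            # D: deltaR + deltaPt cuts
    maxDPtRel = cms.double( 0.5 ),
    resolveAmbiguities = cms.bool( True ),    # Q: ambiguity resolution
    resolveByMatchQuality = cms.bool( False )
)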
Warning: Matching in Dense Environments
Matching by ΔR may not work reliably in dense environments, such as jets. For studies needing high-quality matching of reconstructed tracks to true tracks, it is possible to base the matching either on the number of hits they share or on a comparison of the 5 helix parameters describing the track. How to do this is described here, but it can unfortunately only be done on FEVT data, since it requires the presence of TrackingParticles, which are not stored in RECO. (These are truth tracks, which contain links to the GEANT-produced SimTracks and generator-produced GenParticles that they correspond to.)
Review status
Responsible:
LucaLista
Last reviewed by:
PetarMaksimovic - 28 Feb 2008
4.6 HLT Tutorial
Complete:
Detailed Review status
Newsbox |
This page has been updated for CMSSW 52X and the latest 2012 HLT studies |
Goals of this page:
This page is intended to familiarize you with HLT ideas, software and utilities. In particular, you will learn:
- how to run various trigger paths,
- how to analyze HLT related information.
Contents
Introduction
Why a trigger ?
- Most events/processes produced at a high-energy hadron collider are not interesting : an early decision, i.e. an online selection, has to be made to avoid running the offline code on millions of uninteresting events
High Level Trigger (HLT) :
- Takes events accepted by the L1, the first level of the CMS Trigger
- Decides, based on more elaborate algorithms, whether the event should be kept
The HLT is therefore a crucial part of the CMS data flow, since it is the HLT algorithms and filters that decide whether an event is kept for offline analysis : any offline analysis depends on the outcome of the HLT, i.e. on the HLT efficiency.
Important benchmark ideas :
- Rates (at 2E33) :
  - 13 MHz -> L1 -> 50 kHz
  - 50 kHz -> HLT -> 150 Hz
- Conceive and run approx. 200 triggers, each as efficient as possible
- Reconstruction :
  - Seeded by L1
  - As close as possible to offline reconstruction
  - Regional reconstruction : saves CPU
- HLT levels, typically :
  - L2 : calorimeter and muon information
  - L3 : tracking information
Example of HLT reconstruction : muons :
- L2 :
  - Uses L1 seeds : up to 4 muons provided by the Global Muon Trigger (GMT); information : (pT, charge, phi, eta)
  - "StandAlone" muons : using muon segments and charge clusters, then Outside->In fitting
  - Filters : pT, invariant mass
  - Filters : calorimeter-based isolation
- L3 :
  - Uses L2 seeds
  - "Global regional muon reconstruction" : using tracker information in a muon window
  - Filters : pT, invariant mass, impact parameter, track quality
  - Filters : tracker-based isolation
Please do visit the Trigger and HLT pages :
Subscribe to the hypernews :
Producing events with Triggers
Triggers are defined as path blocks : sequences of modules and operands :
path MyTrigger = {doL1Reco & doL1Seeding & ApplyPrescale & doHltReco, HltCondition}
Modules :
- Reconstruction : doL1Reco, doHltReco, etc.
- Prescale : ApplyPrescale
- Filter : HltCondition
Operands :
- the "," or dependency operator : the operand on the right depends on the operand on the left (i.e. the right accesses data produced by the left)
- the "&" or sequencing operator : the operand on the left is executed first, followed by the operand on the right, but they do not depend on each other
Consequences :
- The result of each operand is a boolean : the final outcome of MyTrigger is "accept" or "reject"
- For a given path, the overall answer is the "AND" of all operands : if any operand fails, the event is rejected AND processing stops, saving CPU time!
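For comparison, in the Python configuration language used by later releases the same path would be written roughly as follows (a sketch only: the module names are the placeholders from the fragment above, and the '*' and '+' operators take over the roles of the old ',' and '&'):
import FWCore.ParameterSet.Config as cms
# '+' sequences modules without implying dependency (old '&');
# '*' makes the right-hand operand depend on the left one (old ',')
process.MyTrigger = cms.Path(
    ( process.doL1Reco + process.doL1Seeding + process.ApplyPrescale + process.doHltReco )
    * process.HltCondition
)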
Recommended reading :
PathTriggerBits
The HLT configuration in CMS is stored in a dedicated database system, ConfDB. Use the dedicated GUI to create, manipulate and store trigger path configurations. Use the ConfDB web browser to browse the content of the configuration database and inspect the available HLT menus. The command-line tool edmConfigFromDB allows you to retrieve (complete or partial) configurations as either ASCII or Python configuration files to be fed to cmsRun.
A quick look at the code
Please visit :
HLT code in
HLTrigger
Running Trigger paths
Follow instructions from
HLTtable
For input files : go to the Data Aggregation System page DAS
Analyzing Trigger/Offline information
Motivation : get in one place, on an event-by-event basis :
- L1- and HLT-related information
- Offline reconstructed information
- HLT information without filters : "Open HLT" mode
This enables you to :
- study trigger efficiencies as a function of offline reconstructed quantities...
- get trigger rejections, overlaps, rates...
- ...for L1 and HLT conditions
Code :
- Package : HLTrigger/HLTanalyzers
- Driving code : HLTAnalyzer .h .cc : an EDAnalyzer
- Configuration file to run : HLTrigger/HLTanalyzers/test/HLTAnalysis.cfg
More details in HLTAnaManual
Getting the L1 information
Motivation : have the information about the HLT seeds at your disposal.
Using L1Extra objects, from MC truth or from the L1 Emulator. In the HLTAnalysis.cfg file :
module HLTAnalyzer = {
...
string l1extramc = l1extraParticles
...
}
Physics objects | Variables stored | L1Extra Class | Instances |
Muons | E, pT, phi, eta, isolation, mip | L1MuonParticle | |
EM particles | E, ET, phi, eta | L1EmParticle | "Isolated" and "NonIsolated" |
Jets | E, ET, phi, eta | L1JetParticle | "Forward" and "Central" |
Taus | E, ET, phi, eta | L1JetParticle | "Tau" |
MET | ET, ET(tot), ET(had), phi | L1EtMissParticle | |
Branches of variables created per instance.
Code in :
HLTInfo .h .cc
Getting the HLT information
How do I get information about trigger results?
- From the TriggerResults class, which associates a trigger path to its decision. HLTAnalyzer dynamically creates as many branches as there are triggers present and fills them with the corresponding trigger decisions. Trigger branches appear in the form "TRIGG_".
Code in :
HLTInfo .h .cc
If you want to get the names of the paths (e.g. for accessing them by path name rather than by bit number), see the example code in HLTrigger/HLTanalyzers/src/HLTrigReport.cc.
See also the hypernews discussion here.
Getting the Offline-reconstructed and other information
Specify in HLTAnalysis.cfg the instances of the reconstructed objects that you want :
module HLTAnalyzer = {
...
string muon = "muons"
string Electron = "pixelMatchGsfElectrons"
string Photon = "correctedPhotons"
string recjets = "iterativeCone5CMS.CaloJets"
string genjets = "iterativeCone5GenJets"
string recmet = met
string genmet = genMet
string calotowers = towerMaker
...
}
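In the Python configuration syntax of recent releases, the same fragment would look roughly like this (a sketch: the parameter names and string types are carried over from the old-style fragment above and may differ in current HLTrigger/HLTanalyzers releases):
process.HLTAnalyzer = cms.EDAnalyzer( "HLTAnalyzer",
    # instance labels of the reconstructed objects to be read,
    # copied from the old-style configuration fragment above
    muon = cms.string( "muons" ),
    Electron = cms.string( "pixelMatchGsfElectrons" ),
    Photon = cms.string( "correctedPhotons" ),
    recjets = cms.string( "iterativeCone5CMS.CaloJets" ),
    genjets = cms.string( "iterativeCone5GenJets" ),
    recmet = cms.string( "met" ),
    genmet = cms.string( "genMet" ),
    calotowers = cms.string( "towerMaker" )
)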
Physics objects | Variables stored | Instances |
Muon | E, ET, pT, eta, phi | muons |
Electron | E, ET, pT, eta, phi | pixelMatchGsfElectrons |
Gamma | E, ET, pT, eta, phi | correctedPhotons |
Jet | E, ET, pT, eta, phi | iterativeCone5CMS.CaloJets, iterativeCone5GenJets |
MET | pT, phi, ET(sum) | met, genMet |
Calo Towers | E, ET, E(em), E(had), E(out), eta, phi | towerMaker |
MC truth | Id, Vtx(X,Y,Z), pT | |
Code in : HLTEgamma, HLTMuon, HLTJets (deals with MET as well)
- Information about the MC truth of the generated particles (pT, identity, vertex...) is also retrieved. Code in HLTMCtruth
Offline collections can be accessed when running simultaneously on RAW+RECO samples. This can be achieved using CRAB with the use_parent=1 option. Running on a RECO dataset then allows access to its parent dataset, namely the corresponding RAW dataset.
Getting the MC truth information
Motivation : have the generator-level information at your disposal.
Object | Variables stored | Instances |
Generated particle | PDG identity, particle status, [x,y,z]-component of primary vertex | genParticles |
Generated particle | pT, phi, eta | genParticles |
 | Event pT-hat, i.e. scale | genEventScale |
Calculate the rate of a new trigger:
Please have a look at:
How to run OpenHLT
You may also want to take a look at the recipes from the STEAM group.
Recipes for producing ntuples
These versions of the analyzer have the OpenHLT capability included.
For the additional tags needed to run on real data consistent with online releases, consult this page.
For the correct GlobalTag to use in a given release, consult this page.
Review status
Responsible: Pedrame Bargassa
Last reviewed by:
ChristosLeonidopoulos - 22 Feb 2008
PAT Examples: Trigger Example
Contents
Introduction
Trigger information stored in the CMS EDM is optimized for space, which comes with a lack of user-friendliness in accessing this information. The PAT provides the opportunity to access this information easily and especially to deal with the cross links between the different items (objects, filters/modules and paths).
The aim of this PAT Workbook example is to complete the information given during the PAT tutorial exercises. It is therefore necessary for the understanding of the exercises to at least follow the tutorial first:
- Necessary: A tutorial about PAT trigger information is found here.
- Recommended: The PAT trigger information is described in detail in the corresponding Software Guide.
How to get the code
Find out more about the details
The production of PAT trigger information consists of up to four steps:
- Production of PAT trigger objects, filters and paths:
- Matching between PAT and PAT trigger objects:
- The corresponding configuration file is PhysicsTools/PatAlgos/python/triggerLayer1/triggerMatcher_cfi.py
.
- The provided default configurations cannot cover all use cases due to their wide variety. They rather serve as configuration examples. The user is expected to define her/his own matches and to incorporate them into the work flow.
- Creation of the PAT trigger event:
- Embedding of matched PAT trigger objects into the PAT objects:
The production of trigger information is not part of the default PAT work flow. To switch it on easily, Python tools are provided in
PhysicsTools/PatAlgos/python/tools/trigTools.py
.
Exercises
Get the code skeletons
In your work area:
This is not needed for exercise 1.
Exercise 1 - trigger information production
The problem to solve
Modify an existing PAT configuration file so that it
- contains the full PAT trigger information;
- contains no trigger matches.
Hints:
- Use an existing configuration that works.
- Python tools can and should be used.
The solution
- To switch on the default PAT trigger and trigger event production, append these lines to your PAT configuration file:
from CMS.PhysicsTools.PatAlgos.tools.trigTools import switchOnTrigger
switchOnTrigger( process )
- To remove the default (example) trigger matches from the work flow, also append these lines:
process.patTriggerSequence.remove( process.patTriggerMatcher )
process.patTriggerEvent.patTriggerMatches = []
Exercise 2 - trigger information production with matching
The problem to solve
Modify an existing PAT configuration file so that it
- contains the selected PAT objects
- contains the full PAT trigger information;
- contains the (and only this) match of PAT muons to HLT muons with:
- match by ΔR,
- best match in ΔR only.
Hints:
- Use the configuration skeleton file PhysicsTools/PatExamples/test/producePatTrigger_cfg.py only:
- Append all needed changes to this file.
- Do not change any file in the CMS.PhysicsTools/PatAlgos package itself.
- Python tools can and should be used.
- Reduce the number of events for testing via
process.maxEvents.input = 1000
- The output file of this exercise also serves as the input file for exercise 3.
The solution
- To switch to selected PAT objects, add these lines:
from CMS.PhysicsTools.PatAlgos.tools.coreTools import removeCleaning
removeCleaning( process )
- The described match is defined as follows:
- Match by ΔR only with ranking also by ΔR:
process.muonTriggerMatchHLTMuons = cms.EDFilter( "PATTriggerMatcherDRLessByR",
- Match selected PAT muons now:
src = cms.InputTag( "selectedPatMuons" ),
This is the input tag of the corresponding muon selection module.
- Match to PAT trigger objects:
matched = cms.InputTag( "patTrigger" ),
This is the default input tag of the corresponding trigger producer.
- Use the AND of the selector configurables:
andOr = cms.bool( False ),
This allows for wild cards.
filterIdsEnum = cms.vstring( 'TriggerMuon' ),
filterIds = cms.vint32( 0 ),
or
filterIdsEnum = cms.vstring( '*' ),
filterIds = cms.vint32( 83 ),
The corresponding codes are found in
DataFormats/HLTReco/interface/TriggerTypeDefs.h
.
- Skip further selection criteria with wild cards:
filterLabels = cms.vstring( '*' ),
pathNames = cms.vstring( '*' ),
collectionTags = cms.vstring( '*' ),
- Use the default selection limits for muons:
maxDPtRel = cms.double( 0.5 ),
maxDeltaR = cms.double( 0.5 ),
These are described in the
Software Guide - PATTriggerMatcher Module Configuration
resolveAmbiguities = cms.bool( True ),
- Select the match by the defined ranking:
resolveByMatchQuality = cms.bool( True )
)
The ranking has already been defined by the choice of the concrete matcher module (see above). The fully assembled module is shown below.
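A sketch of the complete matcher module, assembled from the pieces above (using the 'TriggerMuon' filter-ID variant; the wild-card variant shown above works equally well):
process.muonTriggerMatchHLTMuons = cms.EDFilter( "PATTriggerMatcherDRLessByR",
    src = cms.InputTag( "selectedPatMuons" ),
    matched = cms.InputTag( "patTrigger" ),
    andOr = cms.bool( False ),
    filterIdsEnum = cms.vstring( 'TriggerMuon' ),
    filterIds = cms.vint32( 0 ),
    filterLabels = cms.vstring( '*' ),
    pathNames = cms.vstring( '*' ),
    collectionTags = cms.vstring( '*' ),
    maxDPtRel = cms.double( 0.5 ),
    maxDeltaR = cms.double( 0.5 ),
    resolveAmbiguities = cms.bool( True ),
    resolveByMatchQuality = cms.bool( True )
)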
- The new match (and only this one) is included in the work flow by:
- Load the PAT trigger information into your process:
process.load( "CMS.PhysicsTools.PatAlgos.triggerLayer1.triggerProducer_cff" )
Modify the trigger matcher sequence, which is found in
PhysicsTools/PatAlgos/python/triggerLayer1/triggerMatcher_cfi.py
:
- Add the new matcher module first:
process.patTriggerMatcher += process.muonTriggerMatchHLTMuons
- Remove anything else present in the matcher sequence:
process.patTriggerMatcher.remove( process.patTriggerElectronMatcher )
process.patTriggerMatcher.remove( process.patTriggerMuonMatcher )
process.patTriggerMatcher.remove( process.patTriggerTauMatcher )
- Make the PAT trigger event aware of this (and only this) new match:
process.patTriggerEvent.patTriggerMatches = [ "muonTriggerMatchHLTMuons" ]
- To finally switch on the PAT trigger information:
from CMS.PhysicsTools.PatAlgos.tools.trigTools import switchOnTrigger
switchOnTrigger( process )
Since the Python tool switchOnTrigger( process ) needs the final list of trigger matches to be included, it has to come last in the whole procedure. This is also the reason why the PAT trigger information has to be loaded explicitly before being modified (otherwise the Python tool does this). This differs from the approach in exercise 1.
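For reference, the complete set of additions from this exercise in the required order, assembled from the steps above:
# load the PAT trigger information explicitly, so that it can be modified
process.load( "CMS.PhysicsTools.PatAlgos.triggerLayer1.triggerProducer_cff" )
# add the new matcher module and remove the default ones
process.patTriggerMatcher += process.muonTriggerMatchHLTMuons
process.patTriggerMatcher.remove( process.patTriggerElectronMatcher )
process.patTriggerMatcher.remove( process.patTriggerMuonMatcher )
process.patTriggerMatcher.remove( process.patTriggerTauMatcher )
# make the PAT trigger event aware of the (and only this) new match
process.patTriggerEvent.patTriggerMatches = [ "muonTriggerMatchHLTMuons" ]
# switch on the PAT trigger information last
from CMS.PhysicsTools.PatAlgos.tools.trigTools import switchOnTrigger
switchOnTrigger( process )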
Exercise 3 - trigger information analysis
The problem to solve
Modify an existing CMSSW analyzer skeleton so that the analyzer:
Hints:
- Use the checked-out analyzer code and configuration skeleton files:
- The code skeleton does not compile as it comes.
- The output file of exercise 2 serves as input file for this exercise.
- The analyzer uses the TFileService. Histograms can be defined in the beginJob() method. An example is given.
- Handles to access all PAT trigger collections, the PAT trigger event and the PAT muon collection are pre-defined in the analyze() method. Not all of them are necessarily needed.
- The PAT trigger match helper used to access cross links between different PAT trigger items is also predefined:
const TriggerMatchHelper matchHelper;
- The loop over the PAT muons and the access to them in the needed format is pre-defined:
for ( size_t iMuon = 0; iMuon < muons->size(); ++iMuon ) {
// loop over muon references (PAT muons have been used in the matcher in task 3)
const reco::CandidateBaseRef candBaseRef( MuonRef( muons, iMuon ) );
} // iMuon
- An empty endJob() method to deal with the pt mean values is already present.
The solution
- To compare PAT and trigger objects:
- The required histograms can be defined in the beginJob() method, e.g. like this:
histos2D_[ "ptTrigCand" ] = fileService->make< TH2D >
( "ptTrigCand", "Object vs. candidate p_{T} (GeV)", 60, 0., 300., 60, 0., 300. );
histos2D_[ "ptTrigCand" ]->SetXTitle( "candidate p_{T} (GeV)" );
histos2D_[ "ptTrigCand" ]->SetYTitle( "object p_{T} (GeV)" );
histos2D_[ "etaTrigCand" ] = fileService->make< TH2D >
( "etaTrigCand", "Object vs. candidate #eta", 50, -2.5, 2.5, 50, -2.5, 2.5 );
histos2D_[ "etaTrigCand" ]->SetXTitle( "candidate #eta" );
histos2D_[ "etaTrigCand" ]->SetYTitle( "object #eta" );
histos2D_[ "phiTrigCand" ] = fileService->make< TH2D >(
"phiTrigCand", "Object vs. candidate #phi", 60, -Pi(), Pi(), 60, -Pi(), Pi() );
histos2D_[ "phiTrigCand" ]->SetXTitle( "candidate #phi" );
histos2D_[ "phiTrigCand" ]->SetYTitle( "object #phi" );
- To fill the histograms, add the following inside the loop over the PAT muons in the analyze() method to
- access the trigger muon matched to the given PAT muon using the PAT trigger match helper:
const TriggerObjectRef trigRef(
matchHelper.triggerMatchObject(candBaseRef,triggerMatch,iEvent,*triggerEvent)
);
- fill the histograms, including a necessary check of the validity of the retrieved objects:
if ( trigRef.isAvailable() ) { // check references (necessary!)
histos2D_[ "ptTrigCand" ]->Fill( candBaseRef->pt(), trigRef->pt() );
histos2D_[ "etaTrigCand" ]->Fill( candBaseRef->eta(), trigRef->eta() );
histos2D_[ "phiTrigCand" ]->Fill( candBaseRef->phi(), trigRef->phi() );
}
- To analyze the mean pt of all trigger objects depending on their filter ID,
- Define the min. and max. filter ID in:
- the analyzer's class definition in PhysicsTools/PatExamples/plugins/PatTriggerAnalyzer.h
unsigned minID_;
unsigned maxID_;
- the analyzer's constructor in PhysicsTools/PatExamples/plugins/PatTriggerAnalyzer.cc
minID_( iConfig.getParameter< unsigned >( "minID" ) ),
maxID_( iConfig.getParameter< unsigned >( "maxID" ) ),
- the analyzer's configuration in PhysicsTools/PatExamples/test/analyzePatTrigger_cfg.py
minID = cms.uint32( 81 ),
maxID = cms.uint32( 96 ),
This is where the exact numbers, as found in DataFormats/HLTReco/interface/TriggerTypeDefs.h, go.
- Define maps to sum up the objects' counts and pt depending on the filter ID in PhysicsTools/PatExamples/plugins/PatTriggerAnalyzer.cc:
std::map< unsigned, unsigned > sumN_;
std::map< unsigned, double > sumPt_;
- In the beginJob() method of the analyzer in PhysicsTools/PatExamples/plugins/PatTriggerAnalyzer.cc:
histos1D_[ "ptMean" ] = fileService->make< TH1D >
( "ptMean", "Mean p_{T} (GeV) per filter ID",
maxID_ - minID_ + 1, minID_ - 0.5, maxID_ + 0.5);
histos1D_[ "ptMean" ]->SetXTitle( "filter ID" );
histos1D_[ "ptMean" ]->SetYTitle( "mean p_{T} (GeV)" );
and
for ( unsigned id = minID_; id <= maxID_; ++id ) {
sumN_[ id ] = 0;
sumPt_[ id ] = 0.;
}
- In the analyze() method of the analyzer in PhysicsTools/PatExamples/plugins/PatTriggerAnalyzer.cc, accumulate the sums depending on the objects' filter IDs:
for ( unsigned id = minID_; id <= maxID_; ++id ) {
const TriggerObjectRefVector objRefs( triggerEvent->objects( id ) );
sumN_[ id ] += objRefs.size();
for ( TriggerObjectRefVector::const_iterator iRef = objRefs.begin();
iRef != objRefs.end(); ++iRef ) {
sumPt_[ id ] += ( *iRef )->pt();
}
}
- In the endJob() method of the analyzer in PhysicsTools/PatExamples/plugins/PatTriggerAnalyzer.cc, fill the histogram with the mean values:
for ( unsigned id = minID_; id <= maxID_; ++id ) {
if ( sumN_[ id ] != 0 ) histos1D_[ "ptMean" ]->Fill( id, sumPt_[ id ]/sumN_[ id ] );
}
Get the code solutions
In your work area:
Review status
Responsible:
VolkerAdler
Last reviewed by:
VolkerAdler - 18 Jan 2010
--
RogerWolf - 11 Jun 2009