MultiTrackValidator
Goal of this page
The MultiTrackValidator is a tool that produces a set of histograms useful to test, validate and debug the track reconstruction chain.
This page describes the plots produced by the validator and briefly shows how to configure and use it.
A more detailed description of the tools that can be used to produce performance plots, and to compare them between releases or configurations, can be found on the
TrackingValidationMC page.
Where to find it
You can find the MultiTrackValidator in the
Validation/RecoTrack
package.
An example configuration file for running the MultiTrackValidator is test/MultiTrackValidator_cfg.py. Check out the package from git:
git cms-addpkg Validation/RecoTrack/
How to run it
MultiTrackValidator takes as input one or more ROOT files containing previously produced tracks (edit
fileNames = cms.untracked.vstring in the
PoolSource module).
The default configuration is in
Validation/RecoTrack/python/MultiTrackValidator_cfi.py
.
Configuration parameters
The main configuration parameters are the following:
- label
- the vector of input collections. It can contain the name of any module producing a collection of objects inheriting from the Track class
- ignoremissingtrackcollection
- a flag to avoid stopping the program execution in case of missing input track collection.
- beamSpot
- the beam spot to which the tracks are referred
- dEdx1Tag, dEdx2Tag
- the dEdx products
- label_tp_effic
- the collection of TrackingParticles used for efficiency studies.
- label_tp_fake
- the collection of TrackingParticles used for fake rate studies.
- UseAssociators
- flag to associate tracks and TrackingParticles within MultiTrackValidator (
True
), or to retrieve the association map from the event (False
). In the first case, the associators
parameter must point to the associators, while in the second case the parameter must point to the association map.
- associators
- the list of the associators or association maps to be used (which of the two is controlled by the
UseAssociators
parameter)
- sim
- vector of SimHit collections
- useGsf
- use Gsf specific methods for Gsf tracks
- doSimPlots
- flag for whether or not to do plots of all TrackingParticles passing
TrackingParticleSelectionForEfficiency
(under "simulation" directory)
- doSimTrackPlots
- flag for whether or not to do plots from TrackingParticles, e.g. efficiencies
- doRecoTrackPlots
- flag for whether or not to do plots from tracks, e.g. fake rates
- dodEdxPlots
- flag for whether or not to do plots from dEdx products (if false, dEdx products are not read from Event)
- dirName
- the base directory of the histograms in the output DQM file.
- parametersDefiner
- defines where the tracking particle parameters are evaluated. Use LhcParametersDefinerForTP for LHC tracks or CosmicParametersDefinerForTP for cosmics.
- TrackingParticleSelectionForEfficiency
- set of cuts to select the TrackingParticles for efficiency studies (i.e. the simulated tracks that are expected to be reconstructed). The cuts are defined in Validation/RecoTrack/python/TrackingParticleSelectionForEfficiency_cfi.py
.
- histoProducerAlgoBlock
- set of parameters for histograms
- minX, maxX, nintX
- minimum, maximum, and number of bins for histograms for quantity X
- useFabsEta
- a flag to fill plots vs the absolute value of the pseudorapidity instead of the signed value.
- useInvPt
- a flag to fill plots vs the inverse of the transverse momentum.
- useLogPt
- use logarithmic scale in plots vs pt
- TpSelectorForEfficiencyVsX
- TrackingParticle selector for efficiencies vs. quantity X (X=Eta, Phi, Pt, VTXR, VTXZ)
- generalTpSelector
- TrackingParticle selector for efficiencies vs. other quantities than any of the X above
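The binning parameters above can be illustrated with a small, standalone sketch. It is not taken from the MultiTrackValidator code: the function name `bin_edges` and the numerical values are hypothetical, and it assumes that `useLogPt` means bins of equal width in log10(pT).

```python
import math

def bin_edges(min_x, max_x, nint_x, log_scale=False):
    """Return the nint_x + 1 edges of nint_x equal-width bins between min_x and max_x.

    With log_scale=True the bins have equal width in log10(x), which is
    assumed here to be the effect of useLogPt on the pT axis.
    """
    if nint_x <= 0:
        raise ValueError("nint_x must be positive")
    if log_scale:
        if min_x <= 0:
            raise ValueError("a logarithmic axis needs min_x > 0")
        lo, hi = math.log10(min_x), math.log10(max_x)
        step = (hi - lo) / nint_x
        return [10 ** (lo + i * step) for i in range(nint_x + 1)]
    step = (max_x - min_x) / nint_x
    return [min_x + i * step for i in range(nint_x + 1)]

# Hypothetical values, not the defaults of any release
eta_edges = bin_edges(-2.5, 2.5, 50)                    # like minEta, maxEta, nintEta
pt_edges = bin_edges(0.1, 1000.0, 40, log_scale=True)   # like minPt, maxPt, nintPt with useLogPt
```

With these values each pT decade is covered by ten bins, which is the usual reason for choosing a logarithmic axis when the spectrum spans several orders of magnitude.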
Example configurations
You can look at the example configurations to see something simple, but they are badly out of date. You are likely better served by running either the standard, standalone, or trackingOnly setup (more details below), using the `runTheMatrix.py` workflow of your choice as the base. This way the geometry, era, and
GlobalTag are guaranteed to be consistent and correct. The downside is that these setups are somewhat complex to understand.
Edit the configuration file as you prefer (you can change the default parameters directly in
MultiTrackValidator_cfg.py
, for example:
process.multiTrackValidator.out = 'myFile.root'
) and run it, e.g.:
cmsenv
cmsRun Validation/RecoTrack/test/MultiTrackValidator_cfg.py
By default the above step produces a ROOT file in the DQMIO format. It can be converted to a ROOT file with histograms by running the harvesting step, e.g.
cmsRun Validation/RecoTrack/test/MTV_HARVESTING.py
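As an illustration of the kind of configuration edit mentioned above, the following fragment sketches how a few of the listed parameters might be overridden. It only runs inside a CMSSW environment (after `cmsenv`). It assumes that `MultiTrackValidator_cfi.py` defines a module named `multiTrackValidator` and that the parameters accept the values shown; `myTrackValidator` is a hypothetical label. Check the cfi file of your release before relying on it.

```python
import FWCore.ParameterSet.Config as cms
from Validation.RecoTrack.MultiTrackValidator_cfi import multiTrackValidator

# 'myTrackValidator' is a hypothetical module label chosen for this example.
# The parameter names are the ones listed under "Configuration parameters".
myTrackValidator = multiTrackValidator.clone(
    label = ["generalTracks"],            # input track collection(s)
    ignoremissingtrackcollection = True,  # do not stop if a collection is missing
    dodEdxPlots = False,                  # skip dEdx plots; dEdx products are then not read
)
```

Cloning the default module rather than editing the cfi file keeps the defaults intact and makes the differences explicit.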
cmsDriver (as part of standard sequences)
MultiTrackValidator can also be run as part of the standard configurations generated with
cmsDriver.py
, e.g.
cmsDriver.py step3 --conditions auto:run2_mc -n -1 --eventcontent RECOSIM,DQM -s RAW2DIGI,L1Reco,RECO,EI,VALIDATION,DQM --datatier GEN-SIM-RECO,DQMIO --customise SLHCUpgradeSimulations/Configuration/postLS1Customs.customisePostLS1 --magField 38T_PostLS1 --no_exec
cmsDriver.py step4 --conditions auto:run2_mc -n -1 -s HARVESTING:validationHarvesting+dqmHarvesting --filetype DQM --customise SLHCUpgradeSimulations/Configuration/postLS1Customs.customisePostLS1 --mc --magField 38T_PostLS1 --filein file:step3_RECO_EI_VALIDATION_DQM_inDQM.root --no_exec
But please use e.g.
runTheMatrix.py
to generate the example configuration: its output is guaranteed to be up to date, while the lines above are already obsolete.
For input files, add the
--filein ...
parameter to the "step3" line, or edit the generated configuration file.
Note: If you do not run full reconstruction, you should use
trackingOnlyMode. This is because the default configuration makes plots e.g. for tracks from
AK4PFJets (via PFCandidates), and if the jet collection is missing, the default machinery will not work.
cmsDriver (MTV alone, i.e. standalone mode)
Starting from 7_5_0_pre6, cmsDriver can be used to generate configurations for running
MultiTrackValidator only, e.g.
cmsDriver.py step3 ... --eventcontent DQM --datatier DQMIO -s VALIDATION:tracksValidationStandalone --filein <RECOFILE(S)> [--secondfilein <DIGIFILE(S)>] --fileout step3_inDQM.root
cmsDriver.py step4 ... -s HARVESTING:postProcessorTrackSequence --filetype DQM --filein file:step3_inDQM.root
Here
...
means the usual parameters for the steps. Use e.g.
runTheMatrix.py
to generate an example configuration.
The "step3" configuration needs (GEN-SIM-)RECO as the primary input files, and GEN-SIM-DIGI-RAW(-HLTDEBUG) (i.e. something containing the
TrackingParticles) as secondary input files. See also
Validation/RecoVertex/README.md
for similar instructions for running the
vertex validation package using the two-file solution.
Important: Generating the "step3" configuration needs at least the
--filein ...
parameter, otherwise
cmsDriver.py
will give an error (it is convenient to give
--secondfilein ...
at the same time). Also note that the secondary files need to contain all the events in the primary files. The easiest approach is probably to query all DIGI-RAW files with DAS into a file
das_client --limit 0 --query "file dataset=/..." > files_digi.txt
and then use
--secondfilein filelist:files_digi.txt
.
Note: If you re-run the reconstruction only partly, you should use
trackingOnlyMode. This is because the default configuration makes plots e.g. for tracks from
AK4PFJets (via PFCandidates), and if the jet collection is not re-done, the Refs via PFCandidates do not point to the freshly-made tracks.
cmsDriver (tracking-only reconstruction, validation, and DQM; i.e. trackingOnly mode) or runTheMatrix
Starting from 8_0_0_pre3, cmsDriver can be used to generate configurations for running tracking-only reconstruction, validation, and DQM, e.g.
cmsDriver.py step3 ... -s RAW2DIGI,RECO:reconstruction_trackingOnly,VALIDATION:@trackingOnlyValidation,DQM:@trackingOnlyDQM
cmsDriver.py step4 ... -s HARVESTING:@trackingOnlyValidation+@trackingOnlyDQM
Here
...
means the usual parameters for the steps. Use e.g.
runTheMatrix.py
to generate an example configuration.
Since 8_1_0_pre7 some trackingOnly-workflows have been added to
runTheMatrix.py
. The exact list of available workflows depends on the release, but they can be found with e.g.
runTheMatrix.py -n [-w upgrade] | fgrep trackingOnly
The
-w upgrade
is needed to view all phase2 workflows, as only a subset of them is imported in the default matrix. Note that while the trackingOnly variants have generally been added only for ttbar workflows without pileup, the very same RECO and HARVESTING step configurations can be used for other samples as well, including those with pileup. Pileup is possible because nothing in the trackingOnly validation requires running the
MixingModule in the
playback mode that is part of the standard VALIDATION procedure. The complication of the playback mode is that one needs access to the very same MinBias files that were used to mix the pileup events.
Command-line utility for harvesting validation histograms
There is a command-line utility for harvesting the validation histograms:
harvestTrackValidationPlots.py
(tracking DQM histograms are not included). For the impatient:
harvestTrackValidationPlots.py step3_inDQM.root # produces harvest.root
For more options, see
harvestTrackValidationPlots.py -h
Plotting
Tools to make plots from the DQM root files are discussed on
TrackingValidationMC page.
Differences between the sequences
The differences between the "standard", "standalone", and "tracking-only" modes are explained below.
| Sequence | Description |
| tracksValidation (default) | The default sequence includes the MTV for `generalTracks` and for the tracks of each iteration as selected from `generalTracks`. MTV variants for tracks from the PV (with respect to TrackingParticles from the signal simulated vertex and with respect to all TrackingParticles), as well as a variant using all TrackingParticles for efficiencies, are included, but only for `generalTracks` and their high-purity subset. |
| tracksValidationStandalone | In addition to tracksValidation, includes per-iteration plots for tracks from the PV, and efficiencies with all TrackingParticles. |
| @trackingOnlyValidation (tracksValidationTrackingOnly) | In addition to tracksValidationStandalone, includes MTV variants for built tracks (as opposed to selected tracks) and seeds. Must be run in the same job as the reconstruction. |
Filter input collections
The TrackingParticles used for efficiency studies are already filtered according to
TrackingParticleSelectionForEfficiency_cfi.py
.
Nevertheless, the user may want to apply a custom filter to the track or TrackingParticle collections fed to the MultiTrackValidator.
This can be easily done using the filters defined in
Validation/RecoTrack/python/cuts_cff.py
.
Please remember to change the input collection labels for the MultiTrackValidator and to add the filter modules to the path.
The output file
The output file (
DQM_V0001_R000000001__Global__CMSSW_X_Y_Z__RECO.root
or similar) contains several directories according to the names in the
label and in the
associators vectors that are set in the configuration file.
Every directory contains the same set of histograms, but filled using a different track collection and a different track associator. For example, a directory named
general_AssociatorByHits
contains the validation plots obtained with the tracks produced in the
generalTracks module and the
TrackAssociatorByHits.
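The directory layout described above can be sketched in a few lines. The naming rule used here is an assumption inferred from the single example in the text (drop "Tracks" from the track label and "Track" from the associator name). The function names are hypothetical, and the actual rule in your release may differ.

```python
def mtv_directory(label, associator):
    """Build a per-collection directory name from a track label and an associator name.

    Assumed rule, inferred from the example in the text: drop the first
    "Tracks" from the label and the first "Track" from the associator name,
    then join the two with an underscore.
    """
    short_label = label.replace("Tracks", "", 1)
    short_assoc = associator.replace("Track", "", 1)
    return f"{short_label}_{short_assoc}"

def expected_directories(labels, associators):
    """One directory per (track collection, associator) pair."""
    return [mtv_directory(lab, assoc) for lab in labels for assoc in associators]

dirs = expected_directories(
    ["generalTracks"],
    ["TrackAssociatorByHits", "TrackAssociatorByChi2"],
)
print(dirs)  # prints ['general_AssociatorByHits', 'general_AssociatorByChi2']
```

The number of directories is the product of the lengths of the two vectors, so long `label` and `associators` lists quickly lead to large output files.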
Here is the list of the implemented histograms and plots.
Note that some of the details changed during the development of MTV (mostly in the 75X and 76X cycles); the documentation below reflects the current state.
Definition of efficiency, fake rate, duplicate rate, pileup rate
Since these quantities are used repeatedly, they are defined here once and for all.
- Efficiency
- denominator: TrackingParticles (passing some selection); numerator: same TrackingParticles that are associated to track(s)
- Fake rate
- denominator: tracks; numerator: tracks that are not associated to any TrackingParticle
- Duplicate rate
- denominator: tracks; numerator: tracks that are associated to a TrackingParticle that is itself associated to at least two tracks
- Pileup rate
- denominator: tracks; numerator: tracks that are associated to a TrackingParticle from in-time pileup (eventId: bunchCrossing = 0, event != 0)
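The four definitions can be made concrete with a toy calculation. This is a minimal sketch, not the MultiTrackValidator implementation: the association maps are plain dictionaries, and all names (`tracking_rates`, `pileup_tps`, the particle ids) are hypothetical. It assumes a non-empty event; the real validator accumulates numerators and denominators in histograms over many events.

```python
def tracking_rates(n_tracks, reco_to_sim, sim_to_reco, selected_tps, pileup_tps):
    """Compute efficiency, fake, duplicate and pileup rates for one toy event.

    n_tracks     -- number of reconstructed tracks (indices 0..n_tracks-1)
    reco_to_sim  -- dict: track index -> matched TrackingParticle id
                    (unmatched tracks are absent)
    sim_to_reco  -- dict: TrackingParticle id -> list of matched track indices
    selected_tps -- TrackingParticle ids passing the efficiency selection
    pileup_tps   -- TrackingParticle ids that come from in-time pileup
    """
    # Efficiency: selected TrackingParticles matched to at least one track
    matched_tps = [tp for tp in selected_tps if sim_to_reco.get(tp)]
    # Fakes: tracks not matched to any TrackingParticle
    fakes = [t for t in range(n_tracks) if t not in reco_to_sim]
    # Duplicates: tracks whose TrackingParticle is matched to two or more tracks
    duplicates = [t for t, tp in reco_to_sim.items()
                  if len(sim_to_reco.get(tp, [])) >= 2]
    # Pileup: tracks matched to a TrackingParticle from in-time pileup
    pileup = [t for t, tp in reco_to_sim.items() if tp in pileup_tps]
    return {
        "efficiency": len(matched_tps) / len(selected_tps),
        "fake_rate": len(fakes) / n_tracks,
        "duplicate_rate": len(duplicates) / n_tracks,
        "pileup_rate": len(pileup) / n_tracks,
    }

# Toy event: 5 tracks, 4 selected TrackingParticles ("A".."D"), one pileup particle "P"
rates = tracking_rates(
    n_tracks=5,
    reco_to_sim={0: "A", 1: "A", 2: "B", 3: "P"},  # track 4 is unmatched
    sim_to_reco={"A": [0, 1], "B": [2], "P": [3]},
    selected_tps=["A", "B", "C", "D"],
    pileup_tps={"P"},
)
```

In this toy event both tracks matched to "A" count as duplicates, so the duplicate rate is 2/5, while the efficiency counts "A" only once in its numerator (2 of 4 selected particles are found).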
Plots for cross checking with simulation (dirName/simulation
)
- ptSIM
- pT of all in-time TrackingParticles
- etaSIM
- eta of all in-time TrackingParticles
- vertposSIM
- transverse position of the production vertices of all in-time TrackingParticles
- tracksSIM
- number of all in-time TrackingParticles per event
- bunchxSIM
- bunch crossing of all (in-time and out-of-time) TrackingParticles
Validation summary plots (dirName
)
- effic_vs_coll
- Average efficiency per input track collection (for TrackingParticles passing
TpSelectorForEfficiencyVsEta
)
- effic_vs_coll_allPt
- Average efficiency per input track collection (for TrackingParticles passing
TpSelectorForEfficiencyVsEta
excluding the pT cut)
- fakerate_vs_coll
- Average fake rate per input track collection
- duplicatesRate_coll
- Average duplicate rate per input track collection
- pileuprate_coll
- Average pileup rate per input track collection
- num_reco_coll
- Number of tracks per collection
- num_simul_coll
- Number of TrackingParticles (passing
TpSelectorForEfficiencyVsEta
) per collection (should have same value for all bins)
- num_simul_coll_allPt
- Number of TrackingParticles (passing
TpSelectorForEfficiencyVsEta
excluding the pT cut) per collection (should have same value for all bins)
- num_assoc(recoToSim)_coll
- Number of tracks associated to TrackingParticles per collection
- num_assoc(simToReco)_coll
- Number of TrackingParticles (passing
TpSelectorForEfficiencyVsEta
) associated to tracks per collection
- num_assoc(simToReco)_coll_allPt
- Number of TrackingParticles (passing
TpSelectorForEfficiencyVsEta
excluding the pT cut) associated to tracks per collection
- num_duplicate_coll
- Number of duplicate tracks per collection
- num_pileup_coll
- Number of pileup tracks per collection
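Following the definitions given earlier, the rate plots are presumably bin-by-bin ratios of the counting histograms listed here, for example `effic_vs_coll` from `num_assoc(simToReco)_coll` and `num_simul_coll`. The sketch below shows such a division on plain Python lists. The function name, the bin contents and the simple binomial uncertainty are assumptions of this example, not a description of the harvesting code.

```python
import math

def ratio_with_error(numerator, denominator):
    """Divide two binned counts bin by bin.

    Returns a list of (ratio, error) pairs. The simple binomial error
    sqrt(r * (1 - r) / n) is an assumption of this sketch; bins with an
    empty denominator yield (0.0, 0.0).
    """
    if len(numerator) != len(denominator):
        raise ValueError("histograms must have the same binning")
    result = []
    for num, den in zip(numerator, denominator):
        if den == 0:
            result.append((0.0, 0.0))
            continue
        r = num / den
        result.append((r, math.sqrt(r * (1.0 - r) / den)))
    return result

# Hypothetical bin contents for two input track collections
num_assoc_simToReco_coll = [90, 45]
num_simul_coll = [100, 100]
effic_vs_coll = ratio_with_error(num_assoc_simToReco_coll, num_simul_coll)
```

By the same definitions, the fake rate of a collection is one minus the ratio of `num_assoc(recoToSim)_coll` to `num_reco_coll`, since a fake track is one that is not associated to any TrackingParticle.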
Validation plots per input track collection (dirName/label_associator
)
Global tracking performances
- tracks
- Number of reconstructed tracks (associated to TrackingParticles)
- fakes
- Number of fake tracks (i.e. tracks not associated to any TrackingParticle)
- effic
- Efficiency vs eta
- efficPt
- Efficiency vs pT
- effic_vs_X
- Efficiency vs X (see below for possible values of X)
- fakerate
- Fake rate vs eta
- fakeratePt
- Fake rate vs pT
- fakerate_vs_X
- Fake rate vs X
- duplicatesRate
- Duplicate rate vs eta
- duplicatesRate_Pt
- Duplicate rate vs pT
- duplicatesRate_X
- Duplicate rate vs X
- pileuprate
- Pileup rate vs eta
- pileuprate_Pt
- Pileup rate vs pT
- pileuprate_X
- Pileup rate vs X
- chargeMisIdRate
- Charge mis-ID rate vs eta
- chargeMisIdRate_Pt
- Charge mis-ID rate vs pT
- chargeMisIdRate_X
- Charge mis-ID rate vs X
Above, the quantity
X can be
- phi
- Track phi parameter
- dxy
- Track dxy parameter
- dz
- Track dz parameter
- hit
- Number of hits
- layer
- Number of layers
- 3Dlayer
- Number of 3D layers (pixel + matched strips)
- pixellayer
- Number of pixel layers
- dr
- Minimum DeltaR between tracks
- pu
- Pileup
- vertpos
- TrackingParticle production vertex xy position (only for efficiency)
- zpos
- TrackingParticle production vertex z position (only for efficiency)
- chi2
- Track chi2/ndof (only for fake rates etc)
num_reco_eta Number of reco tracks vs eta
num_assoc(simToReco)_eta Number of associated tracks (simToReco) vs eta
num_assoc(recoToSim)_eta Number of associated (recoToSim) tracks vs eta
num_simul_eta Number of simulated tracks vs eta
num_reco_pT Number of reco tracks vs pT
num_assoc(simToReco)_pT Number of associated tracks (simToReco) vs pT
num_assoc(recoToSim)_pT Number of associated (recoToSim) tracks vs pT
num_simul_pT Number of simulated tracks vs pT
nrec_vs_nsim number of reconstructed vs number of simulated tracks (2D plot)
Number of hits, chi2 and charge distributions
chi2 track normalized chi2
chi2_prob normalized chi2 probability
chi2_vs_eta track chi2 vs eta (2D plot)
chi2mean mean track chi2 vs eta
chi2_vs_nhits track chi2 vs number of hits (2D plot)
hits number of hits per track
losthits number of lost hits per track
hits_eta mean number of hits vs eta
losthits_eta mean number of lost hits vs eta
nhits_vs_eta number of hits vs eta (2D plot)
nlosthits_vs_eta number of lost hits vs eta (2D plot)
charge charge distribution
Pulls and residuals of track parameters
(see
TrackBase.h
for details on track parameters)
eta pseudorapidity residual
pullPt pull of pT
pullTheta pull of theta parameter
pullPhi pull of phi parameter
pullDxy pull of dxy parameter
pullDz pull of dz parameter
pullQoverp pull of qoverp parameter
h_dxypulleta sigma of dxy pull vs eta
h_ptpulleta sigma of pT pull vs eta
h_dzpulleta sigma of dz pull vs eta
h_phipulleta sigma of phi pull vs eta
h_thetapulleta sigma of theta pull vs eta
dxypull_vs_eta dxy pull vs eta (2D plot)
ptpull_vs_eta pt pull vs eta (2D plot)
dzpull_vs_eta dz pull vs eta (2D plot)
phipull_vs_eta phi pull vs eta (2D plot)
thetapull_vs_eta theta pull vs eta (2D plot)
h_ptshifteta mean delta_pT/pT vs eta
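For readers unfamiliar with the terms, the quantities behind these histograms can be sketched as follows. The function names and numbers are hypothetical, and the choice of the simulated value as the denominator of the relative residual is an assumption of this example.

```python
def residual(reco, sim):
    """Difference between the reconstructed and the simulated value of a parameter."""
    return reco - sim

def pull(reco, sim, reco_error):
    """Residual divided by the uncertainty estimated by the track fit."""
    if reco_error <= 0:
        raise ValueError("the uncertainty must be positive")
    return residual(reco, sim) / reco_error

def relative_residual(reco, sim):
    """Residual normalised to the simulated value, as in delta_pT/pT.

    Using the simulated value in the denominator is an assumption of this sketch.
    """
    return residual(reco, sim) / sim

# Hypothetical dxy values in cm: reconstructed, simulated, fit uncertainty
dxy_pull = pull(0.012, 0.010, 0.004)
# Hypothetical pT values in GeV: reconstructed, simulated
pt_rel = relative_residual(10.2, 10.0)
```

If the fit uncertainties are well calibrated, the pull distributions have unit width, so the "sigma of ... pull vs eta" histograms are expected to stay close to 1. The width of the residual distributions is what the resolution plots report.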
Resolution of track parameters
sigmapt sigma(delta_pT/pT) vs eta
sigmaptPt sigma(delta_pT/pT) vs pT
sigmacotTheta sigma(delta_cot(theta)) vs eta
sigmacotThetaPt sigma(delta_cot(theta)) vs pT
sigmaphi sigma(delta_phi) vs eta
sigmaphiPt sigma(delta_phi) vs pT
sigmadxy sigma(delta_dxy) vs eta
sigmadxyPt sigma(delta_dxy) vs pT
sigmadz sigma(delta_dz) vs eta
sigmadzPt sigma(delta_dz) vs pT
etares_vs_eta eta residual vs eta (2D plot)
ptres_vs_eta pt residual vs eta (2D plot)
ptres_vs_pt pt residual vs pt (2D plot)
cotThetares_vs_eta cotTheta residual vs eta (2D plot)
cotThetares_vs_pt cotTheta residual vs pt (2D plot)
phires_vs_eta phi residual vs eta (2D plot)
phires_vs_pt phi residual vs pt (2D plot)
dxyres_vs_eta dxy residual vs eta (2D plot)
dxyres_vs_pt dxy residual vs pt (2D plot)
dzres_vs_eta dz residual vs eta (2D plot)
dzres_vs_pt dz residual vs pt (2D plot)
Track association plots
assocFraction fraction of shared hits (TrackAssociatorByHits only)
assocSharedHit number of shared hits (TrackAssociatorByHits only)
assocChi2 track association chi2 (TrackAssociatorByChi2 only)
assocChi2_prob probability distribution of association chi2 (TrackAssociatorByChi2 only)
Track Associators
The MultiTrackValidator analyzes the tracking performance by comparing every reconstructed Track with the corresponding TrackingParticle. Reconstructed Tracks are matched to TrackingParticles using the
Track Associator.
MultiTrackValidatorBase and TrackerSeedValidator
The MultiTrackValidator class inherits from a base class, called MultiTrackValidatorBase. It implements all the functionalities that are common with the
TrackerSeedValidator, like the instance of DQM services, getting common parameters from the cfg file, common functions, etc.
Use of View
The MultiTrackValidator takes from the event an
edm::Handle<edm::View<reco::Track> >
instead of a simple reco::TrackCollection.
This makes it possible to retrieve from the event any collection of objects inheriting from the Track class. Therefore, the MultiTrackValidator produces validation plots not only for standard tracks but also for GsfTracks.
Review status
Responsible:
GiuseppeCerati