TauID Validation How-To


  • This page explains how to run the Tau ID validation package.


The tau validation package is currently undergoing major changes.

  • Last working tag: V06-06-15

Known issues

This is a collection of known issues and bugs in V7x. We are currently testing the release and fixing the bugs and broken files. If you find a problem that is not in the list, please send a mail to mauro.verzetti_AT_cern.ch

  • Broken default plotting

Software structure

The package resides in the Validation/RecoTau directory. Like all CMSSW modules, it consists of a core written in C++, residing in the src/, interface/ and plugins/ directories, and of some Python configuration scripts residing in the python/ directory. This page focuses mostly on the configuration part, since it is the most useful for the end user who wants to run the package. The content of the major files is briefly described below.


This file contains the information of the collections that we are validating. It also contains the definitions of the efficiency producer that will be used in the validation procedure and, finally, all the modules necessary to plot the result from the Validation.


In version 7-x-x of the software the code becomes self-configuring, using the information from the RecoTauTag package currently installed in the scram area. An example of how a module is built is kept commented out, to make it easier to hardcode changes if needed. Another new feature of V7xx in this file is that the sequences and modules produced there are templates: they are not run in the final cfg but cloned to produce more suitable ones. In principle there should be no side effects, since any change made here should propagate to the final modules.
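
The template-and-clone idea can be illustrated with a minimal, CMSSW-free sketch. Plain dictionaries stand in for the real modules, and the parameter names (label, cuts, minPt) are hypothetical; this is not the package's actual code.

```python
import copy

def clone(template, **overrides):
    """Return an independent copy of template with some parameters replaced."""
    module = copy.deepcopy(template)
    module.update(overrides)
    return module

# Hypothetical template parameters, for illustration only.
template = {"label": "template", "cuts": {"minPt": 5.0}}

# A change made to the template before cloning reaches every clone.
template["cuts"]["minPt"] = 10.0

ztt = clone(template, label="ZTT")
qcd = clone(template, label="QCD")

# A change made to one clone stays local to that clone.
qcd["cuts"]["minPt"] = 20.0

print(ztt)
print(qcd)
```

The deep copy is what keeps the clones independent of each other, while edits to the template are picked up as long as they happen before the clones are made.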

As of 07 Aug 2011 the standard plotting tool is broken. It may not be recovered, and may instead be replaced by the MultipleCompare tool.


These files contain the specific settings needed to run the validation on these samples.


In version 7-x-x these files produce the modules, sequences and paths that are actually run in the validation.


This file defines the sequences that will run in the DQMMCValidation. This sequence is run for every release on the RelVal samples and on every main MC dataset produced by CMS. The only weak point is that this sequence is run with the default RecoTauTag package that comes with the release.


This file contains all the command-line options that can be given to RunValidation_cfg when running locally or on the grid (not in DQM).


These files contain the location of the files to be analyzed by RunValidation_cfg in local/batch mode. An example crab.cfg can be found in the test directory.


Main executable of the package; it performs the analysis. A modified version of this file (RunValidation_VtXCutScan.py) is kept as an example of how to adapt it for algorithm commissioning purposes.


Plotting macro that lets the user draw several single efficiency plots on the same TCanvas and compare them with plots of the same name coming from a different release. If no reference file is passed, it works as a plain drawing tool.


In version 7xx the introduction of argparse makes it easier to find the suitable option:

MultipleCompare -h
Usage: MultipleCompare.py -T testFile [options] [search strings that you want to apply "*" is supported as special character]

Script to plot the content of a Validation .root file and compare it to a
different file:

  -h, --help            show this help message and exit
  -T testFile, --TestFile=testFile
                        Sets the test file
  -R refFile, --RefFile=refFile
                        Sets the reference file
  -o outputFile, --output=outputFile
                        Sets the output file
  --logScale            Sets the log scale in the plot
  -f, --fakeRate        Sets the fake rate options and put the correct label
                        (implies --logScale)
  -t testLabel, --testLabel=testLabel
                        Sets the label to put in the plots for test file
  -r refLabel, --refLabel=refLabel
                        Sets the label to put in the plots for ref file
  --maxLog=number       Sets the maximum of the scale in log scale (requires
                        --logScale or -f to work)
  --minDiv=number       Sets the minimum of the scale in the ratio pad
  --maxDiv=number       Sets the maximum of the scale in the ratio pad
  --logDiv              Sets the log scale in the plot
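
As an illustration of how "*" search strings select plots, here is a small sketch built on Python's standard fnmatch module. It is not MultipleCompare's own implementation, and the plot names are hypothetical. Note that fnmatch also treats "?" and "[...]" as special characters, which the help text above does not mention.

```python
from fnmatch import fnmatchcase

def select_plots(plot_names, patterns):
    """Return the names matching at least one pattern, in input order."""
    return [name for name in plot_names
            if any(fnmatchcase(name, pattern) for pattern in patterns)]

# Hypothetical histogram names, for illustration only.
plots = [
    "hpsTau_ByDecayModeFinding_vs_pt",
    "hpsTau_ByLooseIsolation_vs_pt",
    "hpsTau_ByLooseIsolation_vs_eta",
    "shrinkingCone_ByLeadingPionPt_vs_pt",
]

# Keep only the pt plots of the (hypothetical) hpsTau collection.
selected = select_plots(plots, ["hpsTau_*_vs_pt"])
print(selected)
```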


CMSSW_3_11_X, CMSSW_4_1_X, CMSSW_4_2_X

cvs co -r <tag> Validation/RecoTau   # Check out from cvs
scramv1 b                            # Compile
cd Validation/RecoTau/test
source UtilityCommands.csh           # Set some environment variables / aliases


Edit Validation/RecoTau/test/EventSource_ZTT_RECO_cff.py to point to the correct data files (from DBS, etc.). In the test/ directory, execute the command ./RunValidation_cfg.py. The folder TauID/ZTT_recoFiles/ will be created and the validation production will be run. To produce the summary plots and build a webpage, execute the following commands:

cd TauID/ZTT_recoFiles

An index.html will be generated with the summary plots for the TauID validation.

Version 6xx: Since the names of the discriminators are changing very quickly, check that python/RecoTauValidation_cfi.py contains the correct names for producers and algorithms.

Version 7xx: The introduction of the self-configuring algorithm removes the need to edit python/RecoTauValidation_cfi.py, but breaks the commands Summarize and BuildWebpage. Since the duty of release-by-release validation has been passed to the DQM, these tools will probably not be provided anymore (discussion still ongoing). MultipleCompare, which is more flexible and easier to customize, will replace the plotting tools.

Comparing to another release

Follow the instructions given in the Quickstart above (note that this procedure must be followed for both releases). For example, the following commands will compare 310pre7 to 310pre6.

# in your pre7 Validation/RecoTau/test directory
cd TauID/ZTT_recoFiles
Compare compareTo=[your pre6 Validation Dir]/TauID/ZTT_recoFiles referenceLabel="310pre6" testLabel="310pre7"

A directory ComparedTo310pre6 will be produced automatically. To add this to the webpage, re-run BuildWebpage.

Note that sourcing UtilityCommands.csh sets the environment variable $PastResults to the PF validation webpage, which you can use to compare to past results. For example, you could replace [your pre6 Validation Dir] with $PastResults/CMSSW_3_1_0_pre6/TauID/ZTT_recoFiles to compare your results to the official ones.

Version 7xx: Like the other plotting tools, the comparison tool is broken too. MultipleCompare, when given a reference file, does the same job.

Submitting the results

First, contact Colin to get the necessary permissions. Then run the following commands:

# in your Validation/RecoTau/test directory
cd TauID

You can add a label to the submission using the -e EXTENSION, --extension=EXTENSION option of submit.py.


Since the duty of constantly checking the validation has been moved to DQM and RelMon, the web page is not needed anymore. It is still used to keep plots from the development of the algorithm, but it will probably be deleted in the near future.

RunValidation_cfg options

The options are passed to RunValidation_cfg on the command line, using the CMSSW VarParsing framework (SWGuideAboutPythonConfigFile#VarParsing_Example). Some options can take multiple values, e.g.:

# in your Validation/RecoTau/test directory
./RunValidation_cfg.py [optionName]=MyOptionValue [multiOption]=MyMultiOption1 [multiOption]=MyMultiOption2 
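
To make the key=value convention concrete, the following sketch collects such arguments into a dictionary, with repeated keys accumulating into a list. It only mimics the command-line shape shown above and is not the VarParsing implementation; parse_options and multi_keys are hypothetical names introduced for this illustration.

```python
def parse_options(argv, multi_keys=()):
    """Collect key=value arguments; keys listed in multi_keys accumulate into lists."""
    options = {}
    for arg in argv:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % arg)
        if key in multi_keys:
            # Repeated occurrences of a multi-valued option are all kept.
            options.setdefault(key, []).append(value)
        else:
            # For single-valued options the last occurrence wins.
            options[key] = value
    return options

opts = parse_options(
    ["eventType=QCD", "myModifications=A.py", "myModifications=B.py"],
    multi_keys=("myModifications",),
)
print(opts)
```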

The default option is given in bold.


Specify the type of event to validate.
  • ZTT standard Z to hadronically decaying taus validation
  • QCD validate QCD fake rate
  • ZEE validate electron to hadronic tau fake rate coming from Zee events
  • ZMM validate mu to hadronic tau fake rate coming from Zmumu events
  • RealData validate basic fakerates on collision data


Specify the source of the data. Options:
  • recoFiles Get data from the corresponding EventSource_[eventType]_RECO_cff.py
  • recoFiles+PFTau Get data as above, but rerun the PFTau sequence (useful for testing tags)
  • digiFiles Get data from EventSource_[eventType]_DIGI_cff.py and run reconstruction sequence, then validate
  • fastsim Generate [eventType] events using the fast sim and validate them
  • fullsim Generate [eventType] events using the full sim and validate them (not fully tested)
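
The relation between the two options and the event source file can be summarized in a short sketch. The helper name default_source_file is hypothetical and the logic is only inferred from the descriptions above (with the DIGI file assumed to follow the same EventSource_ naming as the RECO one); it is not taken from RunValidation_cfg itself.

```python
EVENT_TYPES = ("ZTT", "QCD", "ZEE", "ZMM", "RealData")

# Data sources that read existing files, mapped to the tier used in the file name.
FILE_TIERS = {"recoFiles": "RECO", "recoFiles+PFTau": "RECO", "digiFiles": "DIGI"}

# Data sources that generate their own events and need no source file.
GENERATED = ("fastsim", "fullsim")

def default_source_file(event_type, data_source):
    """Return the event source file name, or None when events are generated."""
    if event_type not in EVENT_TYPES:
        raise ValueError("unknown eventType: %r" % event_type)
    if data_source in GENERATED:
        return None
    if data_source not in FILE_TIERS:
        raise ValueError("unknown dataSource: %r" % data_source)
    return "EventSource_%s_%s_cff.py" % (event_type, FILE_TIERS[data_source])

print(default_source_file("ZTT", "recoFiles"))
print(default_source_file("QCD", "fastsim"))
```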


Specify the global tag to use. If left as none, will default to loading whatever is defined in Configuration/StandardSequences/python/FrontierConditions_GlobalTag_cff.py. See SWGuideFrontierConditions for a current listing of appropriate GlobalTags.


Append a label to the directory produced by the validation. Defaults to no label. Example:
./RunValidation_cfg.py label="MyLabel"
will store the output in TauID/ZTT_recoFiles_MyLabel (as opposed to ZTT_recoFiles).


Specify the number of events to process. Default is -1 (all).


Override the event source file when [dataSource] uses the recoFiles or digiFiles option. Example:
./RunValidation_cfg.py label="MyFiles" dataSource=recoFiles+PFTau sourceFile=MyRootFiles_cff.py
will take the .root files in MyRootFiles_cff.py, rerun PFTau, then store them in TauID/ZTT_recoFiles_MyFiles.


This parameter is optional. If a modification file is specified, it will be loaded in the RunValidation_cfg at the end of the file. You can add more than one modification file. For example, suppose you want to modify the lead pion pt cut for the shrinking cone taus.

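Create a modification file (TestModify.py in this example):

import FWCore.ParameterSet.Config as cms
# Import the discriminator whose cut is to be changed
from RecoTauTag.Configuration.RecoPFTauTag_cff import shrinkingConePFTauDiscriminationByLeadingPionPtCut
# Parenthesized form works with both Python 2 and Python 3
print("Modifying lead pion requirement.")
# Raise the minimum pt of the leading pion to 20
shrinkingConePFTauDiscriminationByLeadingPionPtCut.MinPtLeadingPion = cms.double(20.)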

Run the validation:

./RunValidation_cfg.py myModifications=TestModify.py label=IncreaseLeadPionReq dataSource=recoFiles+PFTau conditions="STARTUP_31X_V1::All"


Boolean (default False) that turns on the grid compatibility mode. To be set in your crab.cfg.


# produce 500 ZTT (always default eventType) fastsim events, and run the validation on them.
./RunValidation_cfg.py dataSource=fastsim maxEvents=500

# produce 500 QCD fastsim events, and run the validation
./RunValidation_cfg.py dataSource=fastsim eventType=QCD maxEvents=500

# Add a label to the output
./RunValidation_cfg.py label="MyChanges"

# with the RECO files defined in EventSource_ZTT_RECO_cff.py; rerun the PFTau sequences, then do the validation
#  this is good for checking new tags
./RunValidation_cfg.py conditions=<GlobalTag> dataSource=recoFiles+PFTau

# with the DIGI files defined in EventSource_ZTT_DIGI_cff.py; rerun reconstruction, then do the validation
./RunValidation_cfg.py dataSource=digiFiles

# run the full sim, and set the conditions
./RunValidation_cfg.py dataSource=fullsim conditions="IDEAL_31X::All"

Automatic batch submission

Validation/RecoTau/test/LXBatchValidation.py takes the same options as RunValidation_cfg.py, but will submit nJobs with maxEvents each to LXBatch. You must then merge the resulting batch files (which individually only contain the numerator/denominator histograms). Example:

# submit batch jobs to produce and validate 10k QCD events

cd Validation/RecoTau/test
source UtilityCommands.csh
./LXBatchValidation.py maxEvents=1000 nJobs=10 eventType=QCD dataSource=fastsim

# have a coffee

cd TauID/QCD_fastsim/

# continue as above...
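
Why the batch outputs only carry numerator and denominator histograms can be seen from a small sketch: the counts have to be summed over all jobs before the ratio is taken, because averaging per-job efficiencies would weight small and large jobs equally. Plain lists stand in for histograms here, and all numbers are hypothetical; this is not the merging code of the package.

```python
def merge_counts(jobs):
    """Sum per-bin (numerator, denominator) counts over all jobs."""
    n_bins = len(jobs[0][0])
    numerator = [0] * n_bins
    denominator = [0] * n_bins
    for job_num, job_den in jobs:
        for i in range(n_bins):
            numerator[i] += job_num[i]
            denominator[i] += job_den[i]
    return numerator, denominator

def efficiency(numerator, denominator):
    """Per-bin ratio; bins with an empty denominator are set to 0.0."""
    return [n / float(d) if d else 0.0 for n, d in zip(numerator, denominator)]

# Hypothetical counts from two batch jobs, three bins each.
jobs = [
    ([1, 40, 9], [10, 50, 10]),
    ([30, 8, 0], [90, 10, 0]),
]

num, den = merge_counts(jobs)
merged = efficiency(num, den)
print(merged)
```

In the first bin the merged efficiency is 31/100, whereas the plain average of the two per-job efficiencies (1/10 and 30/90) would be noticeably lower.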
Topic revision: r28 - 2013-06-06 - MauroVerzetti