4.2.4.3 PAT Exercise 03: How to configure your PAT Tuple

Objectives

  • Learn how to configure a patTuple according to your personal needs.
  • Learn how to use the edmConfigEditor to do so.
  • Learn more about the patTools.

Introduction

You have already learnt how to produce a pat::Tuple using the standard configuration of PAT in Exercise 2. A standard pat::Tuple might not be exactly what you need. For this reason PAT provides several tools (PAT tools) designed to facilitate the customisation and configuration of the standard PAT workflow and output. For instance, to run your analysis on data rather than on simulated events, you have to remove the matching to generator particles for the pat::Candidates. A tool for this has already been implemented (we will look at it in more detail later): you can call it as a simple function!

In this tutorial you will learn how to modify the workflow for pat::Candidate production according to your analysis requirements, and how to adapt the event content of the pat::Tuple you want to produce using PAT tools. You will see how to investigate the PAT configuration and how to edit it, but also how to explore the content of the created pat::Tuple. Several tools exist in order to simplify these operations. In the following sections you will see how to make use of the following tools:

  • edmConfigEditor
  • PAT tools
  • edmDumpEventContent

Note:

This web course is part of the PAT Tutorial, which takes place regularly at CERN and in other places. When following the PAT Tutorial, the answers to questions marked in RED should be filled into the exercise form that was introduced at the beginning of the tutorial. The solutions to the exercises should also be filled into the form. The exercises are marked in three colours, indicating whether an exercise is basic (obligatory), continuative (recommended) or optional (free). The colour coding is summarised in the table below:

Color Code   Explanation
red          Basic exercise, which is obligatory for the PAT Tutorial.
yellow       Continuative exercise, which is recommended for the PAT Tutorial to deepen what has been learned.
green        Optional exercise, which shows interesting applications of what has been learned.

Basic exercises (red) are obligatory and their solutions should be filled into the exercise form during the PAT Tutorial.

Setting up the environment

We assume that you are logged in on lxplus and are in your work directory. If not, you can follow the instructions given here.

ssh -X your_lxplus_name@lxplus6.cern.ch 
cd scratch0/
mkdir exercise03
cd exercise03
cmsrel CMSSW_7_4_1_patch4
cd CMSSW_7_4_1_patch4/src 
cmsenv
git cms-addpkg PhysicsTools/PatAlgos
git cms-addpkg FWCore/GuiBrowsers
git cms-merge-topic -u CMS-PAT-Tutorial:CMSSW_7_1_0_patTutorial
scram b -j 4

If you are running remotely (via ssh -Y) use edmConfigEditorSSH instead of edmConfigEditor in the following.

Note that you need a reasonably good network connection to use the graphical tools via ssh -Y. If your connection is not sufficient, you can do this exercise using interactive python and text editors only.

How to browse the configuration of the PAT workflow

The first step is to learn more about PAT by looking at the production of the standard pat::Tuple and by inspecting the configuration file used to produce it. The tools which will be used to go more into detail are:

  • The edmConfigEditor, a graphical tool to visualise the workflow of all kinds of configuration files within the CMSSW framework and to edit their configurations. Find more details at SWGuideConfigEditor.

  • The edmDumpEventContent tool, which shows the products that exist in the event content of the produced pat::Tuple. For each product it lists the object type and the module and instance labels. Find more details at WorkBookEdmInfoOnDataFile#EdmDumpEventContent. You might also find the PAT Tutorial Pre-Exercises 6, which is dedicated to the EDM utilities, useful for better understanding how to use this tool.

Start with a basic exercise:

  • Produce a pat::Tuple using the standard configuration file ( patTuple_standard_cfg.py).
  • List the products in the produced file using edmDumpEventContent.
  • Look at the collection of pat objects produced by the module named cleanPatJets.
  • Explore the configuration file with the ConfigEditor, looking for that module.
  • Find out which jet algorithm PAT uses by default.

Open patTuple_standard_cfg.py, which you can find in PhysicsTools/PatAlgos/test:
cd PhysicsTools/PatAlgos/test

For our purposes it is enough to run over only a few events; therefore the maximal number of events is set to 100 at the end of the file. If you want to change it, edit the corresponding line in patTuple_standard_cfg.py, for example to set it to 120:

process.maxEvents.input = 120 

Run the example:

cmsRun patTuple_standard_cfg.py

Look at the content of the produced PAT tuple:

edmDumpEventContent patTuple_standard.root

Type                                  Module                   Label          Process   
----------------------------------------------------------------------------------------
edm::OwnVector<reco::BaseTagInfo,edm::ClonePolicy<reco::BaseTagInfo> >    "selectedPatJets"        "tagInfos"     "PAT"     
vector<CaloTower>                     "selectedPatJets"        "caloTowers"   "PAT"     
vector<pat::Electron>                 "selectedPatElectrons"   ""             "PAT"     
vector<pat::Jet>                      "selectedPatJets"        ""             "PAT"     
vector<pat::MET>                      "patMETs"                ""             "PAT"     
vector<pat::Muon>                     "selectedPatMuons"       ""             "PAT"     
vector<pat::Photon>                   "selectedPatPhotons"     ""             "PAT"     
vector<pat::Tau>                      "selectedPatTaus"        ""             "PAT"     
vector<reco::GenJet>                  "selectedPatJets"        "genJets"      "PAT"       

The output shows which are the products of the pat::Tuple and gives information about the type, module label, instance label and the process they were produced in.
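
If you want to process this listing in a script rather than read it by eye, the four columns can be split with a small parser. The following is a minimal sketch and not part of PAT or of edmDumpEventContent itself: the function name parse_event_content is hypothetical, and the sketch assumes that the module, label and process columns are always enclosed in double quotes, as in the output shown above.

```python
import re

# One product per line: a C++ type followed by three double-quoted fields.
# Assumption: the layout matches the edmDumpEventContent output shown above.
PRODUCT_LINE = re.compile(
    r'^(?P<type>\S.*?)\s+"(?P<module>[^"]*)"\s+"(?P<label>[^"]*)"\s+"(?P<process>[^"]*)"\s*$'
)


def parse_event_content(text):
    """Return a list of dicts with the keys type, module, label and process."""
    products = []
    for line in text.splitlines():
        match = PRODUCT_LINE.match(line)
        if match:  # header and separator lines have no quoted fields and are skipped
            products.append(match.groupdict())
    return products


# A few lines copied from the listing above.
sample = '''
Type                                  Module                   Label          Process
----------------------------------------------------------------------------------------
vector<pat::Jet>                      "selectedPatJets"        ""             "PAT"
vector<pat::Muon>                     "selectedPatMuons"       ""             "PAT"
vector<reco::GenJet>                  "selectedPatJets"        "genJets"      "PAT"
'''

for product in parse_event_content(sample):
    print(product["module"], repr(product["label"]), product["type"])
```

To apply it to a real file, you could redirect the output of edmDumpEventContent to a text file and pass its content to the function.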

Please modify patTuple_standard_cfg.py to avoid a temporary problem when browsing the test samples from the release (so-called RelVal samples). Replace the following:

from PhysicsTools.PatAlgos.patInputFiles_cff import filesRelValProdTTbarAODSIM
process.source.fileNames = filesRelValProdTTbarAODSIM

by

process.source.fileNames = cms.untracked.vstring('/store/relval/CMSSW_7_1_0_pre4_AK4/RelValProdTTbar_13/AODSIM/POSTLS171_V1-v2/00000/0AD19371-15B6-E311-9DB6-0025905A6110.root')

Now look at the configuration using the ConfigEditor:

edmConfigEditor patTuple_standard_cfg.py &

Question 3 a) Which jet algorithm does PAT use by default? Try to find out by browsing the cfg file. (Suggestion: you can browse the process path in the TreeView, in the bottom-left column, looking for the patJets module, or you can use Find in the Edit menu.)

Jetsource.png

Do not forget to exit the ConfigEditor once you have completed your exercises (File --> Exit).

You can also look for the default jet algorithm by using interactive python:

ipython patTuple_standard_cfg.py
>>>process.patJets

Note: There is a tab completion feature: Type process.p and the tab key. Press ctrl+D to quit interactive python.

How to edit the configuration of the PAT workflow

In order to produce a user-defined workflow, PAT provides a set of common tools which simplify the configuration of your analysis. A list of all available PAT tools, with a detailed explanation of their function and of how to apply them in your analysis, is available at SWGuidePATTools.

To produce your own user-defined PAT analysis follow these steps:

  • Start from the standard configuration file.
  • Apply the PAT tools to change the standard configuration file according to the specific needs of your analysis.
  • Replace remaining parameter values according to the needs of your analysis.

We will now go through this scheme step by step:

Step 1: Create your analysis configuration file starting from PAT default configuration.

Open the edmConfigEditor.

edmConfigEditor

From the File menu choose New configuration file. Import a standard configuration file by clicking on Import Configuration, choose the patTuple_standard_cfg.py file as the configuration to start from:

ImportConfigFile.png

Step 2: Apply PAT tools.

Remove the MC matching from the PAT default sequence to avoid keeping generator level information. Click on Apply Tool to visualise the list of all available PAT tools. For each tool a description is provided, which appears to the right when clicking on the tool name. Choose coreTools.removeMCMatching by clicking on Apply. Look at coreTools to get more information about the coreTools and at removeMCMatching for more details about the usage of the tool.

RemoveMCmatching.png

Now look at the resulting code in the top left corner of the ConfigEditor window:

ResultingCode.png

This corresponds to writing the following configuration code by hand in a text editor:

from patTuple_standard_cfg import *
from PhysicsTools.PatAlgos.tools.coreTools import *
removeMCMatching(process, ['All'])

Question 3 b) Check whether the MC matching has really been switched off by looking at the patJets and patMuons modules. To which value is addGen(Jet)Match set?

You can navigate through the path in the TreeView or use the Edit-->Find menu. You can also have a look at patMuons and patJets using interactive python.

addGenMatch.png

Save your cfg file (File-->SaveAs) as patTuple_standard_without_MC_match_cfg.py, exit the ConfigEditor (File-->Exit) and return to the terminal. You can also look at the patMuons module with ipython. Do so both before and after applying the removeMCMatching tool:

ipython patTuple_standard_cfg.py
>>>process.patMuons.embedGenMatch

Before removing MC matching:
cms.bool(True)

python -i  patTuple_standard_without_MC_match_cfg.py
>>>process.patMuons.embedGenMatch

After removing MC matching:
cms.bool(False)

Again you can see that the value has changed. You can also look at the whole patMuons configuration by typing process.patMuons. Note: use CTRL+D to quit interactive python.

Step 3: Replace remaining parameter values.

Open your file with the ConfigEditor:

edmConfigEditor patTuple_standard_without_MC_match_cfg.py

Wait until the configuration file is fully loaded. Restrict the output to muons and jets by limiting the kept collections: select the out module and change the outputCommands to ['keep *_selectedPatMuons*_*_*', 'keep *_selectedPatJets*_*_*']. This is a Python list of strings naming the objects to be kept.

keptCollections.png

For test purposes set the maximum number of events to 200. Click on Find in the Edit menu and look for maxEvents. In this way the right module will be selected and you can change the default value in the PropertyView column:

maxEvents.png

This corresponds to writing the following configuration code by hand in a text editor:

from patTuple_standard_cfg import *

process.out.outputCommands = ["keep *_selectedPatMuons*_*_*","keep *_selectedPatJets*_*_*"]

process.maxEvents.input = 200 
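
To see why these two keep statements retain only the muon and jet collections, it can help to emulate the selection in plain Python. The sketch below is an illustration only and not the framework's implementation. It assumes that each branch is named friendlyType_moduleLabel_instanceLabel_processName, that every branch starts out dropped, and that the last matching command decides. The function is_kept and the branch names in the example are hypothetical stand-ins introduced for this illustration.

```python
from fnmatch import fnmatchcase


def is_kept(branch, output_commands):
    """Emulate the keep/drop selection for one branch name (illustrative only)."""
    fields = branch.split("_")  # type, module label, instance label, process
    kept = False  # assumption: nothing is kept until a command says so
    for command in output_commands:
        action, pattern = command.split()
        parts = pattern.split("_")
        if pattern == "*":
            matches = True  # 'keep *' or 'drop *' applies to every branch
        else:
            # Compare the pattern field by field, with * as a wildcard.
            matches = len(parts) == len(fields) and all(
                fnmatchcase(field, part) for field, part in zip(fields, parts)
            )
        if matches:
            kept = action == "keep"  # a later command overrides an earlier one
    return kept


commands = ["keep *_selectedPatMuons*_*_*", "keep *_selectedPatJets*_*_*"]
branches = [
    "patMuons_selectedPatMuons__PAT",
    "patJets_selectedPatJets__PAT",
    "recoGenJets_selectedPatJets_genJets_PAT",
    "patElectrons_selectedPatElectrons__PAT",
]
for branch in branches:
    print(branch, "->", "kept" if is_kept(branch, commands) else "dropped")
```

Under these assumptions the trailing * after selectedPatJets also matches products that carry an instance label, such as the genJets stored by the same module, while the electron collection is dropped.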

Question 3 c) Change the output file name in the out module to patTuple_analysis.root. Which line appears in the resulting code box in the top-left corner? If you are not using the ConfigEditor to edit your configuration, which line of code do you have to add to your config file to change the output fileName?

Suggestion: look at the out module in the TreeView under endpath, or use Edit-->Find.

Save your config file (File-->Save) and close the editor (File-->Exit).

Question 3 d) Run your configuration and inspect the produced file. What can you find in the generated pat::Tuple?

Run your configuration:
cmsRun patTuple_standard_without_MC_match_cfg.py

Inspect the pat::Tuple with the edmDumpEventContent:

edmDumpEventContent patTuple_analysis.root 

Look at the edmDumpEventContent outcome.

Exercises

Before leaving this page try to do the following exercises:

yellow Exercise 3 a) Write your own configuration file starting from the standard configuration patTuple_standard_cfg.py and add ak4PFJetsCHS and ca8PFJetsCHSPruned to your PAT output using one of the jetTools. Question: What changes in the event content with respect to the output of the standard configuration?

You will find the solution here:

Open a new configuration file and call it AddJetColl_cfg.py. Edit it using the addJetCollection tool to add the desired jet collections (more information can be found at addJetCollection), set the maximum number of events to 100 and change the output file name; you can choose, for instance, addjetColl_patTuple.root:

# import standard configuration
from patTuple_standard_cfg import *
# import the jetTools of PAT
from PhysicsTools.PatAlgos.tools.jetTools import * 
# apply the addJetCollection tool defined in jetTools.py
labelAK4PFCHS = 'AK4PFCHS'
postfixAK4PFCHS = 'Copy'
addJetCollection(
   process,
   postfix = postfixAK4PFCHS,
   labelName = labelAK4PFCHS,
   jetSource = cms.InputTag('ak4PFJetsCHS'),
   jetCorrections = ('AK5PFchs', cms.vstring(['L1FastJet', 'L2Relative', 'L3Absolute']), 'Type-2')
   )
process.out.outputCommands.append( 'drop *_selectedPatJets%s%s_caloTowers_*'%( labelAK4PFCHS, postfixAK4PFCHS ) )

labelCA8PFCHSPruned = 'CA8PFCHSPruned'
addJetCollection(
   process,
   labelName = labelCA8PFCHSPruned,
   jetSource = cms.InputTag('ca8PFJetsCHSPruned',''),
   algo = 'CA8',
   rParam = 0.8,
   #genJetCollection = cms.InputTag('ak8GenJets'), # not in used SIM yet
   genJetCollection = cms.InputTag('ak5GenJets'),
   jetCorrections = ('AK5PFchs', cms.vstring(['L1FastJet', 'L2Relative', 'L3Absolute']), 'None'),
   )
process.out.outputCommands.append( 'drop *_selectedPatJets%s_caloTowers_*'%( labelCA8PFCHSPruned ) )
process.maxEvents.input = 100
process.out.fileName= 'addjetColl_patTuple.root'
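
The two drop statements above are built with Python string formatting from the label and postfix variables. As a quick check that needs no CMSSW setup, the lines below reproduce only that string construction. The helper drop_calo_towers is a hypothetical name introduced for this illustration and is not part of the jetTools.

```python
def drop_calo_towers(label, postfix=""):
    """Build the drop statement for the caloTowers of one added jet collection."""
    return "drop *_selectedPatJets%s%s_caloTowers_*" % (label, postfix)


# Same label and postfix values as in the configuration above.
print(drop_calo_towers("AK4PFCHS", "Copy"))
print(drop_calo_towers("CA8PFCHSPruned"))
```

The first call yields drop *_selectedPatJetsAK4PFCHSCopy_caloTowers_* and the second drop *_selectedPatJetsCA8PFCHSPruned_caloTowers_*, so each statement targets only the collection that was just added.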

You can produce your config file using ConfigEditor as well. Open patTuple_standard_cfg.py with ConfigEditor. Click on Edit using ConfigEditor. Click on Apply Tool and choose jetTools.addJetCollection. Set the parameter values according to the code above.

Find the maxEvents module and set the number of events to run on to 100. Change the default name of the output file: look for "out" using Find in the Edit menu.

Save file as AddJetColl_cfg.py.

Run your configuration file then look at the event content with edmDumpEventContent tool.

In the event content the extra jet collections have been added.

green Exercise 3 b) Starting from the configuration file created in the previous exercise, switch the default jet collection to ak4CaloJets using jetTools.

You will find the solution here:

Open a new file and call it SwitchJetColl_cfg.py.

# import standard configuration
from patTuple_standard_cfg import *
# import the jetTools of PAT
from PhysicsTools.PatAlgos.tools.jetTools import *

switchJetCollection(
   process,
   jetSource = cms.InputTag('ak4CaloJets'),
   jetCorrections = ('AK5Calo', cms.vstring(['L1FastJet', 'L2Relative', 'L3Absolute']), 'Type-1'), 
   )

process.maxEvents.input = 100
process.out.fileName= 'switchjetColl_patTuple.root'

The same can also be done with the ConfigEditor.

Note:
In case of problems don't hesitate to contact the SWGuidePAT#Support. Having successfully finished Exercise 3 you might want to proceed to Exercise 4 of the WorkBookPATTutorial to learn more about PAT.

Review status

Reviewer/Editor and Date (copy from screen) Comments
RogerWolf - 17 March 2012 Added color coding.

Responsible: AnnapaolaDeCosa

Topic revision: r105 - 2014-06-30 - TillArndt
 