HLT/Btag Validation

[[mailto:marfin@mail.desy.de][]][[mailto:marfin@cern.ch][]]

Dedication

For Btag POG/ TSG

Abstract

This document is a brief introduction into the technical details of HLTriggerOffline/Btag. It covers a few topics related to 'how-to start', 'how-to run', 'how-to plot and test' etc

That's only a short introduction. Please, look at the reffered twikis for more details

   HOW-TO start

The package HLTriggerOffline/Btag is designed to do a validation in offline DQM So far, there is only support of RelVal MC.

The structure of package is

BuildFile.xml
doc/
interface/
src/
python/
test/

   CVS repository

First, we have to get the code from CVS repo.

cmsrel CMSSW_5_3_3
cd CMSSW_5_3_3/src
cmsenv
cvs co HLTriggerOffline/Btag
Then we have to update the package(s) which are provides the configurations of the validation

cvs co HLTriggerOffline/Common
checkdeps -a
scram b -j4
hash -r

To test the code, we have to fix the files HLTriggerOffline/Common/python/HLTValidation_cff.py and HLTriggerOffline/Common/test/hltHarvesting_cff.py

cp HLTriggerOffline/Btag/doc/HLTValidation_cff.py HLTriggerOffline/Common/python/
cp HLTriggerOffline/Btag/doc/hltHarvesting_cfg.py HLTriggerOffline/Common/test
cp HLTriggerOffline/Btag/doc/HLTValidationHarvest_cff.py HLTriggerOffline/Common/python/

scram b -j4

   Getting Relval samples

   Producing local RelVal samples

I've tested the code w/o the access to central produced relval mc files. I've produced 10 events of TTbar for testing purpose at desy-cms013.desy.de Higgs group server @ DESY.

  • Production RelVal TTbar, so called stage-i * GEN-SIM-DIGI-etc, step 1

cd HLTriggerOffline/Btag/test
cmsDriver.py TTbar_7TeV_cfi.py -s GEN,SIM,DIGI,L1,DIGI2RAW,HLT:GRun -n 10 --eventcontent FEVTDEBUGHLT --conditions auto:startup_GRun --mc

  • RAW2DIGI,RECO,VALIDATION:hltvalidation, step 2

cmsDriver.py step2 -s RAW2DIGI,RECO,VALIDATION:hltvalidation --eventcontent FEVTDEBUGHLT --conditions auto:startup_GRun -n 10 --mc --filein file:TTbar_7TeV_cfi_py_GEN_SIM_DIGI_L1_DIGI2RAW_HLT.root

After step 1 and step 2, you should have two files

TTbar_7TeV_cfi_bla_bla.root
step2_bla_bla_VALIDATION.root

   Getting central produced RelVal samples

The central produced samples can usually be used only at lxplus.cern.ch

Of course, the code should be tested on the number of events >10 . First, we have to find relval samples. There are a few ways to get the list of the datasets and the needed files. You can visit the page 1. But there is a more sophisticated way to get not only names of datasets but and the names of root files

Here is an example how to use DAS query 2

dataset dataset=/RelValTT*/*/*HLTDEBUG | grep dataset.name

This is an example of getting the same info using das CLI das_client.py

das_client.py --query="dataset dataset=/RelValTT*/*/*HLTDEBUG  | grep dataset.name "  --idx=0 --limit=100000

I don't know why but das_client.py doesn't support plain format anymore. It outputs in JSON format.

You can find a very useful and simple parser JSON.sh in the package folder scripts/ . To get the same with parsing JSON, please do

das_client.py --query="dataset dataset=/RelValTT*/*/*HLTDEBUG  | grep dataset.name "  --idx=0 --limit=100000  | ./JSON.sh -b  |  egrep '\["data.*"name.*"\]' | awk '{print $2}'

To get the list of all root files for datasets

_dataset="/RelValTT*/*/*HLTDEBUG"
das_client.py --query="dataset dataset=$_dataset  | grep dataset.name "  --idx=0 --limit=100000  | ./JSON.sh -b  |  egrep '\["data.*"name.*"\]' | awk '{print $2}' | tr -d "\"" > list && cat list | xargs -I {} echo "echo {}; echo; das_client.py --query=\" file dataset={} \" --idx=0 --limit=100000 |   ./JSON.sh -b  |  egrep '\[\"data.*\"name.*\"\]' | awk '{print \$2}'; echo ;echo  " | sh

Ok, if we've chose some root file file.root provided by the previous command, then the steps 1,2 are not needed any more.

   The test of DQM harvesting

Ok. Before we'll continue, we have to be sure that we either `` step2_bla_bla_VALIDATION.root `` or `` file.root`` under hands. Harvesting in DQM is a production of a set of TH1 objects stored somehow in the files with names like

DQM_V0001_R00000001_CMSSW_5_3_3_Relval_TrigVal.root

If you used to get step2_bla_bla_VALIDATION.root, you can use fixed hltHarvesting_cfg.py . If you use the central RelVal file, please replace the PoolSource attribute fileNames value by file.root

Harvesting is done via

cmsRun ../../Common/test/hltHarvesting_cfg.py

   Plotting DQM_V_R_CMSSW____Relval_TrigValroot

If we've done all correctly, we can produce a few plots. Later, we'll start DQM GUI server for the operation. But now, it has a sense, to produce plots manually with some script. I've prepared in scripts/ folder a few useful tools.

cuy.py
v_root_lsRoot

All plottings can be done via running the command

$CMSSW_BASE/src/HLTriggerOffline/Btag/scripts/cuy.py -x HLTBtagPlot.xml --verbose

where HLTBtagPlot.xml is a configuration file of the form

<cuy>
<validation type=REFREL_SAMPLE_Val file=DQM_V0001_R000000001__CMSSW_3_1_0__RelVal__TrigVal.root release=REFREL_SAMPLE_Val>
<TH1 name=REFREL_JetTag_L3 source=/DQMData/Run 1/HLT/Run summary/BTag/HLT_DiJet40Eta2p6_BTagIP3DFastPV/JetTag_L3/>
<TH1 name=REFREL_JetTag_Offline source=/DQMData/Run 1/HLT/Run summary/BTag/HLT_DiJet40Eta2p6_BTagIP3DFastPV/JetTag_Offline/>
<!--
JetTag_L3
JetTag_Offline
JetTag_OffvsL3
Vertex_HLT_x
Vertex_HLT_y
Vertex_HLT_z
Vertex_Off_x
Vertex_Off_y
Vertex_Off_z
Vertex_OffvsHLT_x
Vertex_OffvsHLT_y
Vertex_OffvsHLT_z
-->
</validation>
<superimpose name=SAMPLE_Val title=JetTag SetLogy=false SetGrid=true XTitle=Discriminator YTitle=Nevents   >
<superimposeItem name=REFREL_JetTag_L3 color=1 MarkerStyle=22 legend=JetTag_L3/>
<superimposeItem name=REFREL_JetTag_Offline color=2 MarkerStyle=22 legend=JetTag_Offline/>
</superimpose>
</cuy>

Also here is a commented list of TH1 objects. How did I obtained it? You can use scripts/v_root_lsRoot described here 3The commands below do compilation of the tool and extracting the list of TH1(2) objects.

cd ../scripts/v_root_lsRoot
make; cd-
# create a map
$CMSSW_BASE/src/HLTriggerOffline/Btag/scripts/v_root_lsRoot/lsRoot -l -m   DQM_V0001_R000000001__CMSSW_3_1_0__RelVal__TrigVal.root
# get a list of histograms
$CMSSW_BASE/src/HLTriggerOffline/Btag/scripts/v_root_lsRoot/lsRoot    "DQM_V0001_R000000001__CMSSW_3_1_0__RelVal__TrigVal.root:/DQMData/Run 1/HLT/Run summary/BTag/HLT_DiJet40Eta2p6_BTagIP3DFastPV" | awk '{if(NR>5) print $1}'

The output of cuy.py is a nice plot `$ref{fig1}$`

SAMPLE_Val.png

L3/Offline discriminator distribution for 10 events `label{fig1}`

Harvesting HLT Btagging for CMSSW_5_2_7

The example of testing HLT Btagging at L25 and L3 is shown below. In order to start, first one needs to get list of input RelVal files to put into the hltHarvesting_cfg.py. Let's go to 1, take the name of /RelVal/TTbar dataset and get a list of files.

 _dataset="/RelVal/TTbar/CMSSW_5_2_7/..."
 cd ../scripts
 das_client.py --query="dataset dataset=$_dataset  | grep dataset.name "
 --idx=0 --limit=100000  | ./JSON.sh -b  |  egrep '\["data.*"name.*"\]'
 | awk '{print $2}' | tr -d "\"" > list && cat list | xargs -I {} echo "echo {};
 echo; das_client.py --query=\" file dataset={} \" --idx=0 --limit=100000 |
./JSON.sh -b  |  egrep '\[\"data.*\"name.*\"\]' | awk '{print \$2}'; echo ;echo  " | sh

Then, the obtained list is being used directly in a local run of cmsRun. Hovewer, it is now possible to start CRAB jobs in $CMSSW_BASE/src/HLTriggerOffline/Btag/test/ folder.

After running the

cmsRun ../../Comon/test/hltHarvesting_cfg.py

in the test folder, the file DQM_R00001_V00001_CMSSW_5_2_7__RelTrig.root is produced. Then a template of the xml file for plotting can be created if one does the following

$CMSSW_BASE/src/HLTriggerOffline/Btag/scripts/cuy.py -c  DQM_R0001_V00001_CMSSW_5_2_7__RelTrig.root > HTLBTagPlot_CMSSW_5_2_7_MC.xml

The file mentioned abore requres additional efforts to clean unnecessery lines with references to TH1 objects which originate from other parts of DQM like ECAL, HCAL etc. Finally, the plots are created

$CMSSW_BASE/src/HLTriggerOffline/Btag/scripts/cuy.py -x  HTLBTagPlot_CMSSW_5_2_7_MC.xml  --verbose

The full set of the validation plots is shown below

Performance plots

From top-left side to bottom-right sides: JetTag distribution; efficiency vs JetTag for b/c jets; efficiency vs pT; Mistag-rates vs efficiency(b) for L25/L3

   Additional HOW-TOs

  • HOW-TO get a list of hltBtag collections needed for feeding HLTBtagPerformanceAnalyzer.cc

edmDumpEventContent step2_RAW2DIGI_RECO_VALIDATION.root | grep "hltBLifetime

or go to https://savannah.cern.ch/support/index.php?130651

and get

hltCaloJetL1FastJetCorrected
hltSelector4JetsL1FastJet
hltSelectorJets20L1FastJet
hltPixelTracks
hltFastPVPixelTracks
hltFastPVPixelTracksRecover
hltFastPVPixelTracksMerger
hltBLifetimeRegionalCtfWithMaterialTracksbbPhiL1FastJetFastPV
hltPixelVertices
hltFastPVPixelVertices
hltFastPVPixelVertices3D
hltBLifetimeL25AssociatorbbPhiL1FastJetFastPV
hltBLifetimeL3AssociatorbbPhiL1FastJetFastPV
hltBLifetimeL25TagInfosbbPhiL1FastJetFastPV
hltBLifetimeL3TagInfosbbPhiL1FastJetFastPV
hltBLifetimeL25BJetTagsbbPhiL1FastJetFastPV
hltBLifetimeL3BJetTagsbbPhiL1FastJetFastPV

  • HOW-TO submit to CVS

# go to subpackage folder
cd HLTriggerOffline/Btag/
# add all sub folders and files
find . -type d -print | grep -v CVS | grep -v "admin" | sed -ne 's/.\///p' | xargs -I {}  echo "cvs add {} " | sh
find . -type f -print | grep -v CVS | grep -v "admin" | sed -ne 's/.\///p' | xargs -I {}  echo "cvs add {} " | sh
# commit changes
cvs ci -m "Working version of HLTriggerOffline/Btag/"

   What next?

We have to test the code with DQM GUI service. We can repeat all steps done by Dmitry (DESY summer student) sometimes ago 4.

   References

The online list of RelVal datasets 1

1http://cms-project-relval.web.cern.ch/cms-project-relval/relval_stats/

The DAS system 2

2https://cmsweb.cern.ch/das

The tool for dumping content of any root file 3

3http://root.cern.ch/phpBB3/viewtopic.php?f=12&t=10786

How to start DQM GUI 4

4https://twiki.cern.ch/twiki/bin/view/Sandbox/DQMonlineDOliinychenko

Other useful twiki 5

5https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideTriggerSoftwareValidationhttps://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideHLTriggerOfflinehttps://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookDQM#How_to_visualize_the_DQM_histogrhttps://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideValidationTableHLThttps://twiki.cern.ch/twiki/bin/view/Atlas/ATLASStorageAtCERN#How_to_list_a_directory_in_EOS

   RsT

The documentation was written in RsT markup.

   More details

you can find more details here 6 and here 7

6http://docutils.sourceforge.net/docs/user/rst/demo.txt http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.txt http://docutils.sourceforge.net/rst.txt

7http://docutils.sourceforge.net/docs/user/rst/demo.html#inline-markup

   HOW to obtain from RST different outputs

  • Obtain HTML

rst2html.py HOW-TO.rst --syntax-highlight="long" > HOW-TO.html

  • Obtain PDF

rst2pdf   HOW-TO.rst

rst2latex   HOW-TO.rst  > HOW-TO.tex

  • Obtain twiki (for CERN twiki)

cat   HOW-TO.rst | rst2twiki.py  > HOW-TO.twiki

  • Obtain wiki

cat   HOW-TO.rst | rst3wiki.py  > HOW-TO.wiki

Good Luck!

Topic attachments
I Attachment History Action Size Date Who Comment
PNGpng CMSSW_5_2_7_plot1.png r1 manage 18.4 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng CMSSW_5_2_7_plot2.png r1 manage 18.0 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng CMSSW_5_2_7_plot3.png r1 manage 17.0 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng CMSSW_5_2_7_plot4.png r1 manage 17.1 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng CMSSW_5_2_7_plot5.png r1 manage 27.4 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng CMSSW_5_2_7_plot6.png r1 manage 28.3 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng CMSSW_5_2_7_plot7.png r1 manage 23.9 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng CMSSW_5_2_7_plot8.png r1 manage 23.9 K 2013-06-18 - 11:18 IgorMarfin  
PNGpng SAMPLE_Val.png r1 manage 17.7 K 2013-04-25 - 17:52 IgorMarfin  
Edit | Attach | Watch | Print version | History: r3 < r2 < r1 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r3 - 2013-07-10 - AnneCatherineLeBihan
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    Sandbox All webs login

This site is powered by the TWiki collaboration platform Powered by PerlCopyright &© 2008-2023 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
or Ideas, requests, problems regarding TWiki? use Discourse or Send feedback