Simple BTagging

Basic Idea Behind BTagging

BTagging refers to the process of identifying b-jets. Most of the time, b-jets have to be distinguished from c-jets, light-quark jets, and gluon jets. To do this, the BTagging team provides various algorithms. (See this link for the basic ideas behind BTagging: WorkBookBTagging.) Every algorithm assigns each jet a numeric value called the discriminator, which indicates how likely the jet is to be a b-jet. Although the range of the discriminator differs from one algorithm to another, a higher discriminator means a higher probability that the jet is a b-jet.

In this document, I'm going to introduce a simple process in which one can test how BTagging works on a given dataset.

Documents

First, this is the config file that creates a ROOT file containing the events needed for BTagging: btagginganalyzer_cfg.py

After that ROOT file is created, one can run the following config file on it to create the "final root file", which holds the BTagging information: btaggingedmanalyzer_cfg.py

The analyzer .cc file used to create the ROOT tree with the BTagging information is: BTaggingAnalyzer.cc

The header file referenced by the .cc file is: BTaggingAnalyzer.h

Finally, the BuildFile for the EDAnalyzer is: BuildFile

Simple Descriptions About The Codes

The configuration file btagginganalyzer_cfg.py keeps the event content relevant to BTagging: not only the jet information, but also, for every BTagging algorithm, the float association vector that maps each jet to a number (its discriminator). The other configuration file, btaggingedmanalyzer_cfg.py, runs BTaggingAnalyzer.cc on the ROOT file created by the first config file.

BTaggingAnalyzer.cc creates bTree, which contains jet_discriminator, jet_flavours, jet_pt, jet_eta, and number_of_jets. The important lines that one can modify each time are:

edm::Handle<reco::JetTagCollection> bTagHandle;

iEvent.getByLabel("Algorithm_You_Want_to_Use", bTagHandle);

const reco::JetTagCollection & bTags = *(bTagHandle.product());

Here, Algorithm_You_Want_to_Use is the label of the BTagging algorithm one wants to use. The available names are shown here.

In addition, pt and eta cuts can be modified in the analyzer.
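As an illustration of what such kinematic cuts do, here is a minimal standalone sketch. It does not reproduce the code in BTaggingAnalyzer.cc: the struct JetInfo, the function selectJets, and the threshold values used when calling it are all hypothetical names and numbers chosen for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the per-jet quantities stored in bTree.
struct JetInfo {
    double pt;             // transverse momentum (GeV)
    double eta;            // pseudorapidity
    double discriminator;  // b-tag discriminator of the chosen algorithm
};

// Keep only jets with pt above minPt and |eta| below maxAbsEta.
// The thresholds are parameters so that they can be changed in one place.
std::vector<JetInfo> selectJets(const std::vector<JetInfo>& jets,
                                double minPt, double maxAbsEta) {
    assert(minPt >= 0.0 && maxAbsEta > 0.0);  // guard against mistyped cuts
    std::vector<JetInfo> selected;
    for (const JetInfo& jet : jets) {
        if (jet.pt > minPt && std::fabs(jet.eta) < maxAbsEta) {
            selected.push_back(jet);
        }
    }
    return selected;
}
```

For example, selectJets(jets, 30.0, 2.4) would keep jets with pt > 30 GeV and |eta| < 2.4; these two numbers are only placeholders, and the actual values should be taken from the analyzer.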

After the analyzer is run, one obtains a ROOT file (named "wow.root" by default) that contains bTree.

The following are the branches inside bTree:

bTree.jpg

Looking into jet_discriminator, one can see its minimum and maximum values, which will be used later in the macro files to make plots.

bTree_disc.png

Macro Files To Make Plots

First, we can plot the discriminator distributions for different kinds of jets. This can be done with the following macro: plotter.C

The max and min values should be adjusted for different BTagging algorithms.

Now, what actually matters is where the discriminator cut should be placed. For this, one can consider the so-called b-tag efficiency and mistag rate. That is, for a fixed discriminator cut, the b-tag efficiency is the fraction of b-jets whose discriminator is higher than the cut, and the mistag rate is the fraction of other jets that are 'mistakenly' included in the region above the cut.
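The two quantities can be sketched for a single cut as follows. This is an illustrative standalone example and not the code of eff_maker.C. It assumes that the jet flavour is stored as a PDG id (so that |flavour| = 5 marks a b-jet); TaggedJet, TagRates, and computeTagRates are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical per-jet record; assumes flavour is a PDG id (5 = b quark).
struct TaggedJet {
    int flavour;
    double discriminator;
};

struct TagRates {
    double bEfficiency;  // fraction of b-jets above the cut
    double mistagRate;   // fraction of non-b jets above the cut
};

// Count b-jets and other jets above a fixed discriminator cut.
TagRates computeTagRates(const std::vector<TaggedJet>& jets, double cut) {
    long nB = 0, nBPass = 0, nOther = 0, nOtherPass = 0;
    for (const TaggedJet& jet : jets) {
        const bool isB = std::abs(jet.flavour) == 5;
        const bool pass = jet.discriminator > cut;
        if (isB) {
            ++nB;
            if (pass) ++nBPass;
        } else {
            ++nOther;
            if (pass) ++nOtherPass;
        }
    }
    assert(nBPass <= nB && nOtherPass <= nOther);  // rates stay within [0, 1]
    TagRates rates;
    // An empty category yields a rate of 0 instead of dividing by zero.
    rates.bEfficiency = nB > 0 ? static_cast<double>(nBPass) / nB : 0.0;
    rates.mistagRate = nOther > 0 ? static_cast<double>(nOtherPass) / nOther : 0.0;
    return rates;
}
```

Raising the cut lowers both numbers, so the choice of cut is a trade-off between keeping b-jets and rejecting the other jets.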

This b-tag efficiency vs other jet mistag rate plot can be drawn by the following macro: eff_maker.C

The template needed by the macro eff_maker is: d_cut.h

Again, the max and min values in d_cut.h should be adjusted for different BTagging algorithms. Some algorithms assign large negative values to jets that they cannot evaluate (e.g. for Secondary Vertex methods, there are jets for which no secondary vertex can be reconstructed).
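One possible way to pick the min and max while ignoring such default values is sketched below. This is an assumption about how one might do it, not what d_cut.h does: DiscRange, discriminatorRange, and the defaultBelow threshold are hypothetical, and the threshold has to be chosen per algorithm by looking at the jet_discriminator branch.

```cpp
#include <cassert>
#include <vector>

// Result of the range search; valid is false if no usable value was found.
struct DiscRange {
    bool valid;
    double min;
    double max;
};

// Find the smallest and largest discriminator, skipping every value at or
// below defaultBelow (treated here as "algorithm could not evaluate the jet").
DiscRange discriminatorRange(const std::vector<double>& values,
                             double defaultBelow) {
    DiscRange range = {false, 0.0, 0.0};
    for (double v : values) {
        if (v <= defaultBelow) continue;  // skip default entries
        if (!range.valid) {
            range.valid = true;
            range.min = v;
            range.max = v;
        } else {
            if (v < range.min) range.min = v;
            if (v > range.max) range.max = v;
        }
    }
    assert(!range.valid || range.min <= range.max);
    return range;
}
```

With a threshold such as -5.0 (a placeholder), entries like -10 would be excluded and the returned range would cover only the jets the algorithm actually evaluated.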

When running on a huge number of events, eff_maker.C takes quite a long time. It is therefore a good idea to reduce num_plots to shorten the running time.
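To see why the running time can grow with num_plots, consider the following standalone sketch of a cut scan. It assumes that num_plots is the number of discriminator cuts at which the efficiency and mistag rate are evaluated, which is a guess about eff_maker.C and not taken from its source; ScanJet, ScanPoint, and scanCuts are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal jet record for the scan.
struct ScanJet {
    bool isB;
    double discriminator;
};

// One point of the efficiency vs mistag rate curve.
struct ScanPoint {
    double cut;
    double bEfficiency;
    double mistagRate;
};

// Evaluate numPlots evenly spaced cuts between minDisc and maxDisc.
// Every cut loops over all jets, so the cost is numPlots * jets.size().
std::vector<ScanPoint> scanCuts(const std::vector<ScanJet>& jets,
                                double minDisc, double maxDisc, int numPlots) {
    assert(numPlots >= 2 && maxDisc > minDisc);
    const double step = (maxDisc - minDisc) / (numPlots - 1);
    std::vector<ScanPoint> points;
    for (int i = 0; i < numPlots; ++i) {
        const double cut = minDisc + i * step;
        long nB = 0, nBPass = 0, nOther = 0, nOtherPass = 0;
        for (const ScanJet& jet : jets) {
            const bool pass = jet.discriminator > cut;
            if (jet.isB) {
                ++nB;
                if (pass) ++nBPass;
            } else {
                ++nOther;
                if (pass) ++nOtherPass;
            }
        }
        ScanPoint p;
        p.cut = cut;
        p.bEfficiency = nB > 0 ? static_cast<double>(nBPass) / nB : 0.0;
        p.mistagRate = nOther > 0 ? static_cast<double>(nOtherPass) / nOther : 0.0;
        points.push_back(p);
    }
    return points;
}
```

In a scan of this shape, halving the number of cuts halves the work, at the price of a coarser efficiency vs mistag rate curve.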

Locations Of The Files

To make this simple BTagging template work, one should put the files into the following locations:

CMSSW_BASE/src/BTag/BTaggingAnalyzer/btagginganalyzer_cfg.py

CMSSW_BASE/src/BTag/BTaggingAnalyzer/btaggingedmanalyzer_cfg.py

CMSSW_BASE/src/BTag/BTaggingAnalyzer/src/BTaggingAnalyzer.cc

CMSSW_BASE/src/BTag/BTaggingAnalyzer/interface/BTaggingAnalyzer.h

CMSSW_BASE/src/BTag/BTaggingAnalyzer/plotter.C

CMSSW_BASE/src/BTag/BTaggingAnalyzer/eff_maker.C

CMSSW_BASE/src/BTag/BTaggingAnalyzer/d_cut.h

Performances Of BTagging On Sample Datasets

I ran this simple BTagging analyzer on ~7 million events of the /WJets-madgraph/Spring10-START3X_V26_S09-v1/GEN-SIM-RECO sample. The .cc file specified the algorithm as CombinedSecondaryVertexMVABJetTags. The following are the results I obtained:

  • Discriminator distribution:

discrim_dist.jpg

The discriminator values tend to be higher for b-jets. Jets with a discriminator value of -10 are not shown in the figure. With a log scale, the distribution looks like the following:

b_discrim_log.png

  • B-jet tag efficiency vs mistag rate:

eff.jpg


-- MinjaeCho - 07-Jul-2010