b-Tagging Offline Guide


b-tagging associates a single real number, the discriminator, with each jet. Jets initiated by b quarks (light quarks) tend to have higher (lower) values of the discriminator, but details such as the possible range depend on the specific algorithm. For an introduction to b tagging and its use in physics analysis, see the b Tag Section of the CMSSW Workbook.

For an up-to-date (2012) description of the b-tagging procedure, please follow the 2012 developer tutorial. The tutorial contains detailed instructions on how to measure the b-tagging performance in MC using the standard validation package. For more advanced users and developers, it also provides instructions on how to edit the existing algorithms and measure the resulting performance changes.


Several b tag algorithms have been implemented in CMSSW. Some exploit the long B hadron lifetime, others its semi-leptonic decay mode and others use kinematic variables related to the high B hadron mass and hard b fragmentation function. Some of them have several variants for extracting the discriminator value from a given set of input variables.

  • "Track Counting" algorithm: This is a very simple tag, exploiting the long lifetime of B hadrons. It calculates the signed impact parameter significance of all good tracks and orders them by decreasing significance. Its b tag discriminator is defined as the significance of the N-th track. It comes in two variations: N = 2 (high efficiency) and N = 3 (high purity).
  • "Jet Probability" algorithm: This is a more sophisticated algorithm, also exploiting the long lifetime of B hadrons. Its b tag discriminator is equal to the negative logarithm of the confidence level that all the tracks in the jet are consistent with originating from the primary vertex. This confidence level is calculated from the signed impact parameter significances of all good tracks. The resolution function of these significances is read from a database (DB). There are two versions of this tagger, JetProbabilityBJetTags and JetBProbabilityBJetTags; the latter uses only the four most displaced tracks, matching the typical reconstructed multiplicity of a B decay vertex.
  • "Soft Muon" and "Soft Electron" algorithms: These two algorithms tag b jets by searching for the lepton from a semi-leptonic B decay, which typically has a large Pt_rel with respect to the jet axis. Their b tag discriminators are the output of neural nets based on the lepton's Pt_rel, impact parameter significance and a few other variables. For each of these taggers, the aim is to have a simple version (cutting essentially only on the presence and pT of the lepton) and a complex version that also uses jet quantities in a multivariate analysis (MVA). This is already the case for muons; the electron taggers need refinement and are not really usable up to 2.1.X.
  • "Simple Secondary Vertex" algorithms: This class of algorithms reconstructs the B decay vertex using an adaptive vertex finder, and then uses variables related to it, such as the decay length significance, to calculate the b tag discriminator. It has been found to be more robust to Tracker misalignment than the other lifetime-based tags. CMSSW releases <= 35X contain one version of the algorithm (simpleSecondaryVertexBJetTags). Starting from 36X, two versions are provided: simpleSecondaryVertexHighEffBJetTags (equivalent to the previous version) and simpleSecondaryVertexHighPurBJetTags (with increased purity due to a cut on the track multiplicity at the secondary vertex).
  • "Combined Secondary Vertex" algorithm: This sophisticated and complex tag exploits all known variables that can distinguish b from non-b jets. Its goal is to provide optimal b tag performance by combining information about impact parameter significance, the secondary vertex and jet kinematics. (Currently, lepton information is not included.) The variables are combined using a likelihood ratio technique to compute the b tag discriminator. A variant of this tagger combines the variables using the Multivariate Analysis (MVA) tool.
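
As an illustration of the two impact-parameter taggers described above, the following is a minimal pure-Python sketch, not the CMSSW implementation. The function names and inputs are hypothetical: track_counting_discriminator takes the signed impact parameter significances of the selected tracks, and jet_probability_discriminator takes per-track probabilities of being compatible with the primary vertex (in CMSSW these would be derived from the resolution functions read from the DB). The formula used to combine the track probabilities into a jet confidence level is an assumption, chosen as a common way to combine N independent probabilities; the actual tagger may apply further track selections and a different normalisation of the logarithm. Returning None when there are too few tracks is also a choice made for this sketch.

```python
import math


def track_counting_discriminator(ip_significances, n):
    """Return the n-th largest signed IP significance (n = 2 or 3 in the text).

    Returns None if the jet has fewer than n tracks (a choice made for
    this sketch, not necessarily what CMSSW does).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    ordered = sorted(ip_significances, reverse=True)  # decreasing significance
    if len(ordered) < n:
        return None
    return ordered[n - 1]


def jet_probability_discriminator(track_probabilities):
    """Return -ln(P_jet) for a list of per-track probabilities in (0, 1].

    Assumed combination (hypothetical for this sketch):
        P_jet = Pi * sum_{j=0}^{N-1} (-ln Pi)^j / j!,  Pi = product of P_i
    """
    if not track_probabilities:
        return None
    for p in track_probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError("track probabilities must lie in (0, 1]")
    n_tracks = len(track_probabilities)
    log_pi = sum(math.log(p) for p in track_probabilities)  # ln of the product
    series = sum((-log_pi) ** j / math.factorial(j) for j in range(n_tracks))
    p_jet = math.exp(log_pi) * series
    return -math.log(p_jet)


# Usage with made-up numbers
significances = [0.5, 7.2, -1.1, 3.4, 2.0]
high_eff = track_counting_discriminator(significances, 2)  # second track: 3.4
high_pur = track_counting_discriminator(significances, 3)  # third track: 2.0
displaced = jet_probability_discriminator([0.1, 0.1])      # b-like jet
prompt = jet_probability_discriminator([0.9, 0.8])         # light-like jet
```

With these made-up inputs, the jet with low track probabilities (tracks unlikely to come from the primary vertex) gets the larger discriminator, which is the behaviour described for b jets above.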

Reconstruction sequence


All b-tag algorithms need some common inputs:

  • the primary vertex: the collection is supposed to be sorted, and the first element is used as the signal vertex
  • the jets to be tagged and their associated charged tracks: the modules creating associations of tracks to jets are defined in the RecoJetsAssociation sequence
Some algorithms need additional information, e.g., leptons for the soft lepton tagging. All the corresponding sequences have to be run before the b-tagging modules, unless the products are already available in the input file.

The b-tagging sequence

(Re-)reconstruction of the b-tagging results is done by loading the b-tagging configuration fragment in your configuration and including the b-tag sequence in your path:

process.path = cms.Path(process.btagging)

The sequence will generate the intermediate products used by the tagging algorithms (TagInfo objects) and run the taggers. If you only want to (re-)run some of the b tag algorithms, you can look inside the file and use parts of the sequence. The corresponding TagInfo producers have to be kept in the sequence! For example, to reconstruct the jet probability information, with a change in the impact parameter cut, you need:

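# Change the transverse impact parameter cut of the TagInfo producer
process.impactParameterTagInfos.maximumTransverseImpactParameter = 0.1
# Run the impact parameter TagInfo producer, then the jet probability tagger
myBTag = cms.Sequence(process.impactParameterTagInfos*process.jetProbabilityBJetTags)
process.path = cms.Path(myBTag)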

The original proposal for this design can be found here.

b-tagging at HLT

The present HLT trigger menus include two types of b-tagging paths:

  • BTagMu_JetNNN, BTagMu_DiJetNNN and BTagMu_DiJetNNN_Mu5 : selecting events with muons close to jets and providing a calibration sample for the measurement of the b-tag efficiency
  • BTagIP_JetNNN : events with at least one b-tagged jet, based on impact parameters

The BTagIP_JetNNN and BTagMu_JetNNN b-tag paths have been enabled since May 2010. The evolution of the HLT b-tag paths with the HLT menus in 2010 is documented in this talk.

b Tag RECO/AOD Data Format

For each algorithm, a final b-tag discriminator value is associated with each input jet. The results are stored as AssociationVectors in RECO and AOD, together with collections of intermediate results (TagInfos), which can be used to (re-)calculate the tags. A detailed list of the products can be found here.

MC tools and b-tag Performance Analysis

A necessary step in the measurement of b-tag performance on MC is the association between reconstructed jets and partons. The corresponding tools are described in SWGuideBTagMCTools.
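
To make the idea of jet-parton association concrete, here is a minimal pure-Python sketch. It is not the implementation described in SWGuideBTagMCTools: the function names, the tuple layouts, the nearest-parton criterion and the cone size of 0.3 are all assumptions made for illustration. Each jet is assigned the flavour of the closest parton within a cone in (eta, phi), and a tagging efficiency is then computed for jets of a given flavour at a given discriminator cut.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    d = phi1 - phi2
    while d > math.pi:
        d -= 2.0 * math.pi
    while d <= -math.pi:
        d += 2.0 * math.pi
    return d


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def match_jets_to_partons(jets, partons, max_dr=0.3):
    """Assign to each jet the |PDG id| of the nearest parton within max_dr.

    jets:    list of (eta, phi) tuples (hypothetical layout)
    partons: list of (eta, phi, pdg_id) tuples (hypothetical layout)
    Unmatched jets get None. max_dr = 0.3 is an assumed cone size.
    """
    flavours = []
    for jet_eta, jet_phi in jets:
        best_flavour, best_dr = None, max_dr
        for p_eta, p_phi, pdg_id in partons:
            dr = delta_r(jet_eta, jet_phi, p_eta, p_phi)
            if dr < best_dr:
                best_flavour, best_dr = abs(pdg_id), dr
        flavours.append(best_flavour)
    return flavours


def tag_efficiency(discriminators, flavours, flavour, cut):
    """Fraction of jets of the given flavour with discriminator > cut.

    Returns None if no jet has that flavour.
    """
    selected = [d for d, f in zip(discriminators, flavours) if f == flavour]
    if not selected:
        return None
    return sum(1 for d in selected if d > cut) / len(selected)


# Usage with made-up numbers: one b jet, one light jet, one unmatched jet
jets = [(0.1, 0.2), (-1.5, 3.1), (2.0, -1.0)]
partons = [(0.15, 0.25, 5), (-1.45, -3.1, -1)]
flavours = match_jets_to_partons(jets, partons)
b_efficiency = tag_efficiency([3.4, 0.5, 1.0], flavours, 5, cut=2.0)
mistag_rate = tag_efficiency([3.4, 0.5, 1.0], flavours, 1, cut=2.0)
```

The second jet is matched across the phi = ±pi boundary, which is why the azimuthal difference is wrapped before computing the distance.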

Information on how to associate and read MC information for tracks and vertices can be found in the following links:

Performance analysis and validation

  • Validation plots for recent releases are published on this page

Access to performance numbers and recommended taggers

The latest recommendations for the operating points and scale factors can be found here.

Software setup for b tagging in boosted event topologies

Please refer to BoostedBTagSWSetup


  • Links to the documentation can be found here


Review status

| Reviewer/Editor and Date | Comments |
| JyothsnaK - 14-Oct-2010 | Updated HLT information and added BTV-10-001 link |
| JyothsnaK - 06-Jun-2010 | Updated HLT information |
| Main.tomalini - 22 April 2007 | Moved task lists to b tag POG twiki |
| KatiLassilaPerini - 23 Jan 2007 | Created the page based on the current BTau page |

Responsible: Wolfgang Adam
Last reviewed by:

Topic attachments
| Attachment | Version | Size | Date | Who | Comment |
| reco_validation_CSVIVFV2_cfg.py.txt | r1 | 16.7 K | 2014-06-20 - 12:55 | PetraVanMulders | Example of how to use the new CSVIVFV2 |
Topic revision: r68 - 2015-12-30 - PetraVanMulders
