The determination of the lepton fake rates proceeds through four steps:
- Production of trees from the CMGTuple, with a setup similar to that of the main analysis trees but with one entry per probe lepton. The same set of trees contains both control regions, lepton + b-jet and lepton + b-muon (a flag in each entry specifies which region it corresponds to).
- From each tree, and for each control region definition and choice of lepton ID, histograms of the ETmiss distribution are made for the numerator and denominator of the fake rate.
- Then, the different histograms are combined, and the fake rate is determined.
- Finally, measurements from different control regions are combined, and plots are produced.
The determination of the FR from MC and from the Z+lepton control region uses a different procedure, which will be explained separately.
FR Tree Production
The main cfg file for the production of the FR trees is run_ttHLep8TeV_FR_cfg.py.
The configuration is similar to the main analysis one up to the point of the physics objects:
- The starting event analyzers are all in common: skimAnalyzer (log the number of processed events), jsonAna (apply JSON to data), ttHGenAna (define MC-truth leptons), ttHVertexAna (define good vertices)
- The trigger bit selection, triggerAna, is done in the same way but with a different selection of triggers.
- The lepton analyzer, ttHLepAna, is then run, saving the list of probe leptons in event.selectedLeptons and the other leptons that pass just the basic ID but not the preselection in event.looseLeptons (these will be used e.g. to tag events with muons from b decays). Events are accepted if they contain at least one probe lepton.
- The lepton MC match analyzer, ttHLepMCAna, follows, attaching MC match information to the selected leptons
- The ttHJetAna and ttHJetMCAna follow, defining the jets after removing the probe leptons, applying JEC if needed, and smearing the jets in MC to match the resolution in data
- Finally, the ttHLepFRAnalyzer module searches for a suitable tag for each probe lepton and fills the output tree
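The event flow above (reject events without a probe, then write one entry per probe lepton) can be pictured with a minimal Python sketch. It is not the ttHLepFRAnalyzer code: events are plain dictionaries, `find_tag` is a hypothetical callable standing in for the tag search, and skipping probes for which no tag is found is an assumption.

```python
from typing import Callable, Iterable, Iterator, Optional


def probe_entries(
    events: Iterable[dict],
    find_tag: Callable[[dict, dict], Optional[dict]],
) -> Iterator[dict]:
    """Yield one flat entry per probe lepton (hypothetical layout)."""
    for event in events:
        probes = event.get("selectedLeptons", [])
        if not probes:
            # Event rejected: it contains no probe lepton.
            continue
        for probe in probes:
            tag = find_tag(event, probe)
            if tag is None:
                # Assumption: probes without a suitable tag are skipped.
                continue
            # tagType follows the tree convention: 1 for a jet, 13 for a muon.
            yield {"probe": probe, "tag": tag, "tagType": tag["tagType"]}
```

An event with two probe leptons therefore contributes two entries, which is what distinguishes these trees from the main analysis trees.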
The ttHLepFRAnalyzer produces trees in the following format:
- tagType identifies what the tag is: a jet (1) or a muon (13)
- Probe_* are the variables of the probe lepton
- TagLepton_* are the variables of the tag lepton (filled with something sensible only if tagType is 13)
- TagJet_* are the variables of the tag jet (filled with something sensible only if tagType is 1)
- hasSecondB identifies whether the event has a second jet in addition to the tag and the probe
- SecondJet_* are the variables of the second jet (filled if hasSecondB is true); the second jet is chosen as the one with the highest b-tagging discriminator among those passing the basic jet selection (|η| < 2.4, pT > 25 GeV)
- dr_tp, dphi_tp are the ΔR and Δφ between tag and probe
- met, metPhi are the ETmiss and its azimuthal angle
- mtw_probe and mtw_tag are the transverse masses of the lepton plus met system for the probe and tag lepton (the latter only for tagType 13)
Several other variables, e.g. related to trigger matching, are also present; see the code for details.
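For reference, the kinematic quantities stored in dr_tp, dphi_tp and mtw_* follow the usual definitions, sketched below in Python. The function names are hypothetical and not taken from the analyzer, and whether the tree stores Δφ signed or as an absolute value is not stated here; the sketch returns a signed value wrapped into (-π, π].

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference phi1 - phi2, wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Angular distance: sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def transverse_mass(lep_pt: float, lep_phi: float, met: float, met_phi: float) -> float:
    """Transverse mass of the lepton + ETmiss system:
    mT = sqrt(2 * pT * ETmiss * (1 - cos(d_phi)))."""
    dphi = delta_phi(lep_phi, met_phi)
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))
```

The explicit pT factor in the transverse mass is what ties mT to the lepton pT, whereas ETmiss alone carries no such factor.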
Filling ETmiss distributions
This part of the code would benefit from a clean rewrite (maybe in python?)
The filling of the FR distributions is performed with the fillFRSimple.cxx macro, which takes as input:
- The name of a tree, with possibly some postfix specifying that non-default trigger conditions should be used
- A number identifying the definition of the lepton selection (including possibly the request of an additional b-jet)
- Possibly also an integer identifying the individual bin to consider (used only for debugging)
For each bin in pT, |η|, and pdg id of the probe, the macro fills up to 2x2x2 = 8 distributions according to:
- the tag type (jet or lepton)
- the loose and tight working point of the specified id (however some ids have only the tight or only the loose wp)
- whether the probe qualifies for the denominator and for the numerator of the FR
The binning depends on the id selected.
Plots are also produced for each distribution.
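The 2x2x2 counting can be made explicit with a short Python sketch that enumerates the distributions belonging to one bin. The naming scheme and the bin label are hypothetical and are not the names used by fillFRSimple.cxx; the `working_points` argument covers ids that define only one of the two working points.

```python
from itertools import product

TAG_TYPES = ("jet", "lepton")
WORKING_POINTS = ("loose", "tight")
CATEGORIES = ("den", "num")  # denominator and numerator of the fake rate


def histogram_names(bin_label, working_points=WORKING_POINTS):
    """List the distribution names for one (pT, |eta|, pdg id) bin.

    Hypothetical naming: <bin>_<tag type>_<working point>_<den|num>.
    """
    return [
        f"{bin_label}_{tag}_{wp}_{cat}"
        for tag, wp, cat in product(TAG_TYPES, working_points, CATEGORIES)
    ]
```

With both working points this yields 8 names per bin; an id with only a tight working point yields 4.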
Extracting the fake rate
This part of the code would also benefit from a clean rewrite (maybe in python?)
The macro that performs the fit of the FR is fitFRDistsSimple.cxx, which takes as input the id of the selection to use.
What happens inside the macro is:
- at the beginning, appropriate input files are opened according to the tag type and the control region (when an additional b-tag is required, it is necessary to include the TTbar background and to use the W and Z samples in jet bins)
- then, each bin is processed individually and a fake rate is computed for the tight and loose WPs of the selection in the processOneBin method:
- histograms are read from the files, and MCs are added up according to their cross sections in readAndAdd.
- then, the FR is determined and plotted as a function of ETmiss in the fitFRMT function (so named because the transverse mass was originally used instead of ETmiss; that was not a good idea, since mT is correlated with the lepton pT, and the FR depends on pT).
- the determination of the FR for non-prompt leptons from the combination of high and low ETmiss is done in the fitFRMTCorr function. This function is called first from within fitFRMT just to plot the result of the correction, and is called again at the end of processOneBin if the corrected FR is to be used as the nominal answer (i.e. for high pT leptons).
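The first two steps inside processOneBin can be illustrated with a minimal Python sketch, using plain lists of bin contents instead of ROOT histograms. It is not the code of readAndAdd or fitFRMT: the function names are hypothetical, the per-sample weight cross section x luminosity / number of generated events is the usual MC normalization and is assumed here, and the high/low ETmiss correction of fitFRMTCorr is not reproduced.

```python
def add_weighted(histograms, cross_sections, n_generated, luminosity):
    """Sum MC histograms bin by bin, each scaled by xsec * lumi / N_gen.

    histograms: dict mapping sample name -> list of bin contents.
    The normalization is an assumption, not taken from readAndAdd.
    """
    n_bins = len(next(iter(histograms.values())))
    total = [0.0] * n_bins
    for sample, contents in histograms.items():
        weight = cross_sections[sample] * luminosity / n_generated[sample]
        for i, content in enumerate(contents):
            total[i] += weight * content
    return total


def fake_rate_per_bin(numerator, denominator):
    """Fake rate in each ETmiss bin: numerator / denominator.

    Bins with an empty denominator are returned as None.
    """
    return [n / d if d > 0 else None for n, d in zip(numerator, denominator)]
```

Applying `fake_rate_per_bin` to the numerator and denominator ETmiss histograms gives the FR as a function of ETmiss.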
--
GiovanniPetrucciani - 21 Feb 2014