The 8 TeV Ratio Method for Photon Fake Rates


The Problem

Photon processes typically suffer from large backgrounds arising from jets; in addition, photons deposit the least information of any physics object in the CMS detector. Jets can be misidentified as (or "fake") photons if they fluctuate to one or two leading pi0s, resulting in an electromagnetic object indistinguishable from a single, highly energetic photon. Furthermore, the likelihood that an apparent photon came from a jet depends strongly on which portion of the ECAL the jet is in (barrel vs endcap) and on the pt of that jet. It may also depend on whether the jet originates from a quark or a gluon. Lastly, since we cannot expect simulation to accurately predict the jet-faking-photon rate, we typically want to measure the "photon fake rate" in a data-driven manner.

The Strategy

To fight against the large jet backgrounds, we typically apply "tight" photon identification requirements on our photon candidates. We define a "tight photon" to describe our signal according to the VGamma photon definition listed below. We expect this selection to be efficient for real photons, with moderate jet contamination. It is tighter than many HLT photon triggers, so in every photon data set there are quite a few photon candidates from jets that fail the tight photon selection. These not-quite photons are useful to us for two reasons. First, we know what they are: if they don't have a charged track, they have to be jets. Second, by looking at their distribution we can get a good idea of how many jets happen to pass the tight photon selection.

To make this systematic, we define a "Photon-Like Jet" or "plJet" as listed below, chosen so that the definition won't interfere with HLT cuts and so that real photons virtually never end up as photon-like jets. Photon-like jets are required to fail the tight photon selection in some way but still be moderately photon-like. Armed with these two definitions we can define the ratio f/e as the probability of jets appearing as tight photons over the probability of jets appearing as photon-like jets. This ratio depends on the pt of the jet and on the section of the detector it lands in (barrel / end cap). Once we have it, we can just count photon-like jets and multiply by the ratio to get the amount of QCD contamination of tight photons.

What is the f/e Ratio?

The f/e ratio = (the probability of jets appearing as tight photons) / (the probability of jets appearing as photon-like jets)

Tight Photon Definition

This is implemented for you in the is_tight function in cut_definitions.h. Cut values are listed for the barrel, EB, with the end cap (EE) values in parentheses.
  • Pixel Seed Veto
  • H/E < 0.05 (0.05)
  • Sigma_IEtaIEta < 0.011 (0.03)
  • hollow track Iso dR04 < 2.0 + 0.001 x pt + 0.0167 x rho25 (2.0 + 0.001 x pt + 0.032 x rho25)
  • Ecal Iso dR04 < 4.2 + 0.006 x pt + 0.183 x rho25 (4.2 + 0.006 x pt + 0.090 x rho25 )
  • Hcal Iso dR04 < 2.2 + 0.0025 x pt + 0.062 x rho25 (2.2 + 0.0025 x pt + 0.180 x rho25)

Photon-like Jet Definition

This is implemented for you in the is_photonLikeJet function in cut_definitions.h. The object must pass
  • Pixel Seed Veto
  • H/E < 0.05 (0.05)
  • Sigma_IEtaIEta < 0.014 (0.035)
  • hollow track Iso dR04 < min( 5*(3.5 + 0.001 * pt + 0.0167*rho25), 0.2*pt) (EC: min( 5*(3.5 + 0.001 * pt + 0.032*rho25), 0.2*pt) )
  • Ecal Iso dR04 < min( 5*(4.2+0.006*pt + 0.183*rho25 ), 0.2*pt) (EC: min( 5*(4.2+0.006*pt + 0.090*rho25 ), 0.2*pt) )
  • Hcal Iso dR04 < min( 5*(2.2 + 0.0025 * pt + 0.062*rho25), 0.2*pt) (EC: min( 5*(2.2+0.0025*pt + 0.180*rho25), 0.2*pt) )
And the object must fail the photon selection in at least one of the following ways:
  • Sigma IEtaIEta > 0.011 (0.030)
  • TrkIsoHollowDR04 > 3.5 + 0.001 * pt + 0.0167*rho25 (3.5 + 0.001 * pt + 0.032*rho25)
  • Ecal Iso dR04 > 4.2+ 0.006 *pt + 0.183*rho25 (4.2+ 0.006 *pt + 0.090*rho25)
  • Hcal Iso dR04 > 2.2 + 0.0025 * pt + 0.062*rho25 (2.2 + 0.0025 * pt + 0.180*rho25)
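The two definitions above can be sketched in code as follows. This is only an illustration of the cut logic as listed on this page, not the contents of cut_definitions.h: the PhotonCandidate struct, the IsoThresholds helper, and the function names isTightSketch and isPhotonLikeJetSketch are hypothetical, and the real is_tight and is_photonLikeJet functions may have a different interface.

```cpp
#include <algorithm>

// Hypothetical container for the quantities the cuts use.
struct PhotonCandidate {
    double pt;                // transverse momentum [GeV]
    bool   inBarrel;          // true for EB, false for EE
    bool   hasPixelSeed;      // candidate fails the pixel seed veto if true
    double hOverE;            // H/E
    double sigmaIEtaIEta;
    double trkIsoHollowDR04;
    double ecalIsoDR04;
    double hcalIsoDR04;
};

// pt- and rho25-dependent isolation thresholds taken from the lists above.
struct IsoThresholds { double trkTight, trkFail, ecal, hcal; };

inline IsoThresholds isoThresholds(const PhotonCandidate& c, double rho25) {
    const double kTrk  = c.inBarrel ? 0.0167 : 0.032;
    const double kEcal = c.inBarrel ? 0.183  : 0.090;
    const double kHcal = c.inBarrel ? 0.062  : 0.180;
    return { 2.0 + 0.001  * c.pt + kTrk  * rho25,    // tight track isolation
             3.5 + 0.001  * c.pt + kTrk  * rho25,    // plJet track isolation (1.5 GeV gap)
             4.2 + 0.006  * c.pt + kEcal * rho25,
             2.2 + 0.0025 * c.pt + kHcal * rho25 };
}

inline bool isTightSketch(const PhotonCandidate& c, double rho25) {
    const IsoThresholds t = isoThresholds(c, rho25);
    return !c.hasPixelSeed
        && c.hOverE < 0.05
        && c.sigmaIEtaIEta < (c.inBarrel ? 0.011 : 0.03)
        && c.trkIsoHollowDR04 < t.trkTight
        && c.ecalIsoDR04 < t.ecal
        && c.hcalIsoDR04 < t.hcal;
}

inline bool isPhotonLikeJetSketch(const PhotonCandidate& c, double rho25) {
    const IsoThresholds t = isoThresholds(c, rho25);
    const double cap = 0.2 * c.pt;   // the 0.2*pt term in each min(...)
    // Loose requirements every photon-like jet must pass.
    const bool passesLoose = !c.hasPixelSeed
        && c.hOverE < 0.05
        && c.sigmaIEtaIEta < (c.inBarrel ? 0.014 : 0.035)
        && c.trkIsoHollowDR04 < std::min(5.0 * t.trkFail, cap)
        && c.ecalIsoDR04 < std::min(5.0 * t.ecal, cap)
        && c.hcalIsoDR04 < std::min(5.0 * t.hcal, cap);
    // The object must fail the photon selection in at least one way.
    const bool failsTight = c.sigmaIEtaIEta > (c.inBarrel ? 0.011 : 0.030)
        || c.trkIsoHollowDR04 > t.trkFail
        || c.ecalIsoDR04 > t.ecal
        || c.hcalIsoDR04 > t.hcal;
    return passesLoose && failsTight;
}
```

Note that a candidate whose track isolation falls between the tight threshold (2.0 + ...) and the photon-like jet threshold (3.5 + ...), and which passes everything else, is neither tight nor a photon-like jet in this sketch.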

The Ratio Method

Our handle for separating jets from photons in the tight photon region will be sigma_IEtaIEta. For those who don't know, sigma_IEtaIEta is a measure of the spread of the photon candidate's super cluster in the eta direction.


We see here that real photons don't venture above a sigma_ieta_ieta of 0.011, where the tight photon cut is, while photon-like jets continue well above that. So photon candidates with sigma_ieta_ieta > 0.011 can be unambiguously identified as jets; we'll call this region the "sigma_ieta_ieta side band". We'll also call the area with sigma_ieta_ieta < 0.011 the "tight region". The side band normally gets cut off by the HLT at about 0.014. Please note that the above plot is for the barrel; the end cap has a very different distribution, but with a similar structure and a side band.

We can fit templates to the sigma_ieta_ieta distribution, and particularly to its side band, to determine the number of jets contaminating the tight region. This is most clearly demonstrated in the single particle case.

The Single Particle Case

This is not what is actually done in the ratio method, but it serves as a good introduction to how things are done. Consider the leading photon candidate in some photon data. If we apply a pixel veto to the photon candidates we can expect to exclude lepton contamination. We can then express the possibilities for leading photon candidates in the following table:

Table 1


The left represents what kind of object underlies a photon candidate, and the top is the type of particle we observe. The entries of the matrix are the probabilities of each outcome. "f" is the probability that a jet appears as a tight photon and "e" is the probability that it appears as a photon-like jet. Of course, jets usually don't appear as either and just look like jets, but we don't care about those for this analysis; it just means that f + e ≠ 1. We can re-express this in terms of pure numbers in Table 2.

Table 2


Here, N_T and N_plJ are the numbers of tight objects and photon-like jets, respectively, that we observe in our leading photon candidates. n_gamma and n_J are the true numbers of photons and jets that feed into our tight photons; we do not get to observe these numbers directly. We do get to observe directly the number of jets that appear as photon-like jets, just by counting them. But N_T is a mixture of real photons and jets: N_T = n_gamma + n_J-to-T.

To separate the two matrix elements in the Tight column we fit sigma_ieta_ieta templates to the sigma_ieta_ieta distribution of the tight photon. We take the template of real photons from Monte Carlo and we use the photon-like jets for the jet sigma_ieta_ieta template. When we make the sigma_ieta_ieta distribution we have to remove the sigma_ieta_ieta cut from the tight photon selection in order to preserve the sigma_ieta_ieta side band. Similarly, for the jet template we have to exclude photon-like jets that only fail the sigma_ieta_ieta cut. The fit gives the number of jets in the data distribution. We then integrate the fitted jet component over the tight region to get the number of jets there, n_J-to-T. Now we've found all the matrix elements of the table, including the number of real photons. We can now calculate f/e: f/e = n_J-to-T / N_plJ

That's nice, but we want to know how f/e depends on pt and how it changes for barrel and end cap photon candidates. To get sensitivity to pt and detector section, we can draw one of these tables for each slice (bin) of pt, separately for barrel photons and end cap photons. For single objects there are plenty of statistics, so we can get fine pt resolution. We would also like to be sensitive to possible differences in f/e due to the quark-gluon fraction of the jets. The single object case isn't up to this task, but adding a second object will help.

The Full Ratio Method

This is the two particle version of the single particle case: now we look at the two leading photon candidates. Table 3 lists probabilities in the same format as Table 1. Again, we only get to directly observe the column sums.

Table 3


There should be different underlying physics going into the gamma+jet and jet+jet states. There should be more quark jets in the gamma+jet case than in the dijet case. So in principle jets from gamma+jet should appear as tight photons and photon-like jets at a different rate than jets from jet+jet. Therefore, in the table we use f' and e' for the gamma+jet row and f and e for the jet+jet row. When we go to build f/e ratios we will only use elements from the same row.

Table 4


Table 4 re-expresses Table 3 in terms of real numbers. We can observe the number of dijets that appear as two photon-like jets directly, since there are no competing entries in the plJet+plJet column. We can separate the tight + photon-like jet column into its elements using a one dimensional sigma_ieta_ieta template fit on the tight photon, exactly as we did in the single particle case. As in the 1D case, the jet sigma_ieta_ieta templates are constructed out of the photon-like jets in data and the photon templates come from Monte Carlo. Separating the tight+tight column into its elements can be done using a two dimensional sigma_ieta_ieta template fit. Two dimensional template fitting isn't as reliable as 1D template fitting and it's particularly bad at estimating n_JJ-to-TT. We can construct a ratio that targets the fake rate from dijets without using 2D template fitting:


This is what is currently used for the ratio method and is the type of ratio reported in ratio_method_results_v2.h.

Now we want to know how the f/e ratio depends on pt and whether the particle is in the barrel or the end cap. In the single particle case we did this by writing two tables like this for each pt bin, one table for the barrel and one table for the end cap. But now we have two particles, so which object's pt do we choose? To avoid ambiguity we insist that both particles be in the same pt bin. We also have to insist that they are either both in the barrel or both in the end cap. This damages our statistics but allows us to unambiguously identify the region and pt for which the ratio applies. This requirement makes the ratio method fairly data intensive and requires us to use coarser pt binning than we could use for the single particle case.
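The same-bin, same-region requirement on the two leading candidates can be sketched as below. The Object struct and the function names are hypothetical, and the bin edges are whatever the analyst chooses; the binning actually used for the published ratios is not listed on this page.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal description of one of the two leading candidates.
struct Object { double pt; bool inBarrel; };

// Index of the pt bin containing pt, or -1 if pt is outside all bins.
// edges must be sorted in increasing order; bin i covers [edges[i], edges[i+1]).
inline int ptBin(double pt, const std::vector<double>& edges) {
    for (std::size_t i = 0; i + 1 < edges.size(); ++i)
        if (pt >= edges[i] && pt < edges[i + 1]) return static_cast<int>(i);
    return -1;
}

// Returns the pt bin shared by both objects, or -1 if the pair cannot be
// used: different detector regions, different pt bins, or pt out of range.
inline int sharedPtBin(const Object& a, const Object& b,
                       const std::vector<double>& edges) {
    if (a.inBarrel != b.inBarrel) return -1;
    const int binA = ptBin(a.pt, edges);
    return (binA >= 0 && binA == ptBin(b.pt, edges)) ? binA : -1;
}
```

Pairs that return -1 are simply dropped, which is where the loss of statistics mentioned above comes from: the narrower the bins, the fewer pairs land in a common bin.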


The results of the ratio method depend on our choice of the tight and fake-able object templates. We can estimate the systematic uncertainty of this choice by perturbing the templates in various ways and seeing how the results vary. We consider two tight photon templates and seven photon-like jet templates. We then repeat the ratio method study for each combination of tight photon and photon-like jet templates to see the distribution of results. We then take the RMS of the results as the systematic uncertainty.

[plot of a distribution from the curve, and the plot of the percentage difference]

Tight Templates Considered:

  • The main template applies a small (translational) shift to the distribution to correct for the differences between data and MC. The shift is -0.00008 in the barrel and -0.00024 in the end cap.

  • The other photon template considered is the same distribution as the main template without the corrective shift.

Photon-like jet templates:

  • The main photon-like jet template uses the usual photon-like jet definition and is derived from photon-like jets in both tight+plJet and plJet+plJet events.
We also consider:
  • Templates that use the main photon-like jet definition and are constructed using only photon-like jets from plJet+plJet events.

  • Templates that use the main photon-like jet definition and are constructed using only photon-like jets from tight+plJet events.

  • Templates using tighter isolation requirements. Each isolation cut in the usual photon-like jet definition has the form iso < Min( 5*(A + epsilon*pt + delta*rho25), 0.20*pt), which is dominated by the 0.20*pt term. Here we change the isolation cuts to the form: iso < Min( 5*(A + epsilon*pt + delta*rho25), 0.15*pt)

  • Templates using photon-like jet definitions that include objects only if they fail the tight photon track isolation requirement by 2.5 GeV instead of 1.5 GeV. The tight photon cut on track isolation is TrkIsoHollowDR04 < 2.0 + 0.001*pt + 0.0167*rho25. In the normal photon-like jet definition we include objects that fail the tight photon track isolation cut by 1.5 GeV: TrkIsoHollowDR04 > (3.5 + 0.001*pt + 0.0167*rho25). This leaves a gap. Here we widen that gap so that we include objects as photon-like jets if TrkIsoHollowDR04 > (4.5 + 0.001*pt + 0.0167*rho25)

  • Templates using photon-like jet definitions that apply both the wider track isolation gap and the 0.15*pt limit on the isolation cuts.

  • Templates using photon-like jet definitions that require that photon-like jets fail at least two of the tight photon cuts instead of just one.
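With two tight templates and seven photon-like jet templates there are 14 f/e results per bin. A minimal sketch of turning those into a systematic uncertainty is shown below. It assumes "RMS" means the root-mean-square spread of the results about their mean; the page does not say whether the spread is taken about the mean or about the nominal result, so treat that choice as an assumption. The function names are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Arithmetic mean of the f/e results from all template combinations.
inline double meanOf(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum / static_cast<double>(values.size());
}

// RMS spread of the results about their mean (assumed definition of the
// systematic uncertainty; see the caveat in the text above).
inline double rmsSpread(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    const double m = meanOf(values);
    double sumSq = 0.0;
    for (double v : values) sumSq += (v - m) * (v - m);
    return std::sqrt(sumSq / static_cast<double>(values.size()));
}
```

In use, one would fill the vector with the f/e obtained in a given pt bin and detector region for each of the template combinations and call rmsSpread once per bin.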

How to Apply the Photon Fake Rate to Get the QCD Contamination

The f/e ratio functions for 2011A and 2011B have been compiled and are ready to be used. They are listed in ratio_method_results_v2.h as both TF1's and C++ functions. Use these to compute the f/e for individual objects.

Method 1:

The simplest way is as follows: say you want the amount of QCD contamination in a sample of barrel tight photons with pt between 40 and 60 GeV. Then just count up all the barrel photon-like jets in that pt range and multiply the count by the f/e corresponding to the bin center. The product is the number of jet contaminants. This is particularly nice if the f/e ratios are derived with the same pt binning as their application, as was the case for the VGamma aTGC analysis.

Method 2:

In the sample where you look for tight photons, look for photon-like jets instead. For every photon-like jet we find, we expect a fraction (f/e) of a QCD object with similar kinematics contaminating the tight photon sample. So you can then make weighted histograms of the expected QCD contamination.

1. Loop over the events from which you get your tight photons and look for photon-like jets.

2. For every photon-like jet we find, compute the ratio f/e from the plJet's pt. Remember that the ratio functions are different for plJets in the barrel and plJets in the end cap.

3. For every photon-like jet we find, we expect a fraction of an object of QCD contamination with the same pt as the plJet and in the same part of the detector. That fraction is the f/e we just calculated. What to do next depends on what you want.

  • If you just want an absolute number for the QCD contamination, count the photon-like jets and weight each one by its f/e. This is basically Method 1 done object-by-object.


  • If you want to plot some kinematic variable q of your tight photons (such as pt) and you want a plot of the QCD contamination as a function of q, fill a histogram with q for the photon-like jets and weight each entry by the plJet's f/e:
for the ith plJet in all plJets:
     QCD_contamination_dist->Fill( q_i, (f/e)_i )

Confidence intervals in the f/e Ratios

In ratio_method_results_v2.h there are three sets of functions: central value functions, and upper and lower envelopes. The central value functions describe the expected f/e ratios. The upper and lower envelope indicate the corresponding 90% confidence limits of the f/e ratio around that central value. These confidence intervals take into account both systematic and statistical uncertainties.

Other Stuff:

File Descriptions:

  • ratio_method_results_v2.h: Results of the ratio method in functional form, both as C++ functions and as TF1's. There are three sets of functions: the central values, and upper and lower envelope functions that describe the 90% confidence interval around the central values. The C++ functions take a boolean use_2011A, which makes the function report 2011A results when true and 2011B results when false. For pt outside the investigated range the functions return -1.
  • cut_definitions.h: This file implements the VGamma tight photon definition and photon-like jet definition as boolean functions for easy implementation.

-- AnthonyBarker - 25-Jan-2012

Topic revision: r9 - 2017-03-30 - AnthonyBarker