2015 CMS Data Analysis School Extended Z'-to-dimuons Exercise



Link to the slides we will go over as motivation for this exercise, and to the analysis note AN2012_422:

This TWiki page presents one of the long exercises prepared for the 2015 CMS Data Analysis School, held at the LPC, FNAL, Jan. 12-16, 2015 and at Bari, Italy, Jan. 19-23, 2015.
We assume that attendees are familiar with basic C++ and Python and have completed the pre-exercises: https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookExercisesCMSDataAnalysisSchool#PreExercises.

Physics Motivation/Goals

This extended exercise is intended to familiarize you with two dimuon analysis efforts in CMS: Drell-Yan-to-dimuon and Z'-to-dimuon.

Briefly, the Drell-Yan dimuon process is a rich source of physics at the LHC. Precise measurements of the Drell-Yan distribution are an important input to both SM and BSM physics.

Briefly, many models of new physics predict a new narrow resonance, here referred to generically as Z', that decays into a pair of oppositely-charged muons. We look for a Z' bump in the dimuon invariant mass spectrum with a shape-based search, exploiting the difference between the exponentially falling background and the resonance peak that would appear should a Z' exist.

In this exercise, you will work through the main ingredients of the analysis and produce the dimuon mass spectra for data and MC.

In doing this exercise you can learn (among other things):

  • how to create Z'-to-dimuon-oriented ntuples having needed/useful quantities embedded;
  • how to evaluate acceptance, trigger efficiency, mass resolution;
  • how to choose cut values for event and object selection (e.g. muon isolation);
  • how to estimate some of the backgrounds using the data itself.

Contacts

  • Facilitators: Jason Lee (Univ. of Seoul), Youngdo Oh (Korea)

Z'->mu mu analysis in a nutshell:

Introductory slides are here and the paper (insert links)

  • Data: integrated luminosity of 19.8 fb-1 at 8 TeV in 2012
  • Signatures: 2mu final state
  • Backgrounds:
    • irreducible DY
    • reducible with leptons from b/c hadron decays
      • ttbar, Z->tau tau
      • diboson production WW, WZ, ZZ
      • single top tW
    • reducible with fake leptons
      • W+jets, QCD
  • Selection strategy and observables:
    • 2 well reconstructed and isolated high pt muons
    • muons coming from the primary vertex
  • Results:
    • 2mu invariant mass
    • T&P method
    • background estimation from data
    • sigma95/sigma SM for exclusion

Pre-requisites

  • Basic knowledge of python, C++ and CMSSW
  • Basic knowledge of physics objects and the identification techniques: muons
  • Knowledge of ROOT and ability to write, compile and execute macros
  • Short Exercises suggested:
    • Muons at link: here
Some things are best learned by just reading and copy/pasting example code; we do some of that, but we leave some places in the code to be filled in by you.

Getting started

Since we are using MiniAOD, familiarize yourself with the MiniAOD workbook: https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookMiniAOD2015.

Log into "cluster905.knu.ac.kr" and set up CMSSW:

ssh cluster905.knu.ac.kr

cmsrel CMSSW_7_4_7_patch2
cd CMSSW_7_4_7_patch2/src
cmsenv
git cms-merge-topic ikrav:egm_id_747_v2
git clone https://github.com/jshlee/cmsdas 
scram b -j6
cd cmsdas/ZprimeAnalyser/test
cmsRun runAnalysis.py

To run on data, use runAnalysisRD.py.

Location of the MiniAOD samples:

/cmsdas/scratch/jlee/ZprimeToMuMu_M-5000_TuneCUETP8M1_13TeV-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM
/cmsdas/scratch/jlee/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v3/MINIAODSIM
/cmsdas/scratch/jlee/TT_TuneCUETP8M1_13TeV-powheg-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v2/MINIAODSIM
/cmsdas/scratch/jlee/ZZ_TuneCUETP8M1_13TeV-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v3/MINIAODSIM
/cmsdas/scratch/jlee/WW_TuneCUETP8M1_13TeV-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM
/cmsdas/scratch/jlee/WZ_TuneCUETP8M1_13TeV-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM
/cmsdas/scratch/jlee/WJetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM
/cmsdas/scratch/jlee/SingleMuon/Run2015B-PromptReco-v1/MINIAOD
/cmsdas/scratch/jlee/DoubleMuon/Run2015B-PromptReco-v1/MINIAOD

Step 1: Ntuple creation

root -l out_tree.root
TBrowser b

Look at the content of this ROOT output file. Let's start with some basic quantities:

How many events does it contain? (Answer: 49980)

How many muons are there in total? (Check Mu_nbMuon. Answer: 151440)

What does the muon pT spectrum look like? (Plot Mu_ptBestTrack. Answer: 871.9 GeV)
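You can check the same quantities from Python. Here is a minimal PyROOT sketch; it assumes the tree inside out_tree.root is named "tree", as in the mass-resolution example later on:

import ROOT

f = ROOT.TFile.Open("out_tree.root")
tree = f.Get("tree")              # assumed tree name

# Number of events in the ntuple
print("events: %d" % tree.GetEntries())

# Muon pT spectrum: for a per-muon branch, Draw fills one entry per muon,
# so the histogram entry count is the total number of muons
h = ROOT.TH1F("hpt", "muon pT;pT [GeV];muons", 100, 0., 3000.)
tree.Draw("Mu_ptBestTrack >> hpt", "", "goff")
print("muons: %d, mean pT: %.1f GeV" % (int(h.GetEntries()), h.GetMean()))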

Step 2: The analysis code

The C++ analysis code is in

cmsdas/ZprimeAnalyser/plugins/ZprimeAnalyser.cc

You don't need to know this code in detail; just try to run it. If you are interested, you can look at it further in your spare time.

Use the provided MiniAOD as input and run ZprimeAnalyser. It runs the full selection for the 2mu final state and saves histograms and reduced ROOT trees in output files in ROOT format.

The full selection is:

  • Both muons must pass this selection:
    • muon must be a global muon and a tracker muon (isGlobalMuon && isTrackerMuon)
    • pT > 45 GeV
    • |dxy wrt beamspot| < 0.2 cm (abs(dB) < 0.2)
    • relative tracker isolation less than 10% (isolationR03.sumPt / innerTrack.pt < 0.10)
    • number of tracker layers with hits > 5 (globalTrack.hitPattern.trackerLayersWithMeasurement > 5)
    • at least one pixel hit (globalTrack.hitPattern.numberOfValidPixelHits >= 1)
    • at least two muon stations in the fit; this implies trackerMuon (numberOfMatchedStations > 1)
    • dpT/pT < 0.3
  • Then at least one muon must be trigger-matched to the single muon HLT path.
  • A cut on the 3D angle between the muons, to suppress cosmics
  • A cut on the dimuon vertex chi2
  • A cut on dpT/pT, to reject grossly mismeasured tracks
  • The dimuons must be opposite-sign

In CMS, the muon reconstruction software uses tracks reconstructed in the inner silicon tracker matched to tracks reconstructed in the outer muon system to produce the global muon track, giving an estimate for the muon charge, momentum, and production vertex. In addition, several different strategies for including information from the muon system have been developed for high-pT muon reconstruction and momentum assignment. For the results in this analysis, we use the “Tune P” ("cocktail") algorithm, which chooses, on a muon-by-muon basis, between the results of a few such algorithms using the tail probability of the χ2/d.o.f. of the muon track fits.

These cuts are implemented in ZprimeAnalyser.cc; for example, lines 229-249 define a method for the muon selection: https://github.com/jshlee/cmsdas/blob/master/ZprimeAnalyser/plugins/ZprimeAnalyser.cc#L229-L249

Since we are looking for two muons that come from a single decay (DY, Z, Z'), both muons should come from the same vertex; we apply a vertex cut based on the probability that the two muons originate from a common vertex. Moreover, a cosmic-ray muon can mimic two muons coming from a single DY event; we reject most of the cosmic background by excluding back-to-back muon pairs. We keep only one dimuon candidate per event, the one with the highest pT.
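The selection itself is implemented in C++ in ZprimeAnalyser.cc; to make the logic concrete, here is a minimal Python sketch of the per-muon ID cuts listed above, written against a pat::Muon-like interface. This is a sketch only: the accessor spellings follow the cut expressions quoted above and may need adjusting to the actual code.

def pass_high_pt_muon_id(mu):
    """Per-muon selection, following the cut list above (a sketch)."""
    if not (mu.isGlobalMuon() and mu.isTrackerMuon()):
        return False
    if mu.pt() <= 45.:                                          # pT > 45 GeV
        return False
    if abs(mu.dB()) >= 0.2:                                     # |dxy| wrt beamspot < 0.2 cm
        return False
    if mu.isolationR03().sumPt / mu.innerTrack().pt() >= 0.10:  # tracker rel. iso < 10%
        return False
    hits = mu.globalTrack().hitPattern()
    if hits.trackerLayersWithMeasurement() <= 5:                # > 5 tracker layers with hits
        return False
    if hits.numberOfValidPixelHits() < 1:                       # at least one pixel hit
        return False
    if mu.numberOfMatchedStations() <= 1:                       # >= 2 matched stations
        return False
    best = mu.tunePMuonBestTrack()                              # "Tune P" track, see above
    if best.ptError() / best.pt() >= 0.3:                       # dpT/pT < 0.3
        return False
    return True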

Exercise 1) Physics observables and final plot of invariant mass with miniAOD Z' 5 TeV sample

In this exercise we will look at the results of running the previous executable on /ZprimeToMuMu_M-5000_TuneCUETP8M1_13TeV-pythia8/RunIISpring15DR74-Asympt25ns_MCRUN2_74_V9-v1/MINIAODSIM

root -l out_tree.root
TBrowser b

Reminder of the selection steps:

  • Selection step: "Initial number of events" - No cuts
  • Selection step: "HLT selection" - SingleMu Trigger HLT_Mu40_eta2p1 (ignoring triggers for this exercise)
  • Selection step: "Muon. cuts" - High Pt muon ID and Isolation
  • Selection step: "DiMuon cuts"

The out_tree.root file contains the following "N-1" plots, i.e. the displayed quantity is excluded from the cuts. The value of the cut is indicated by a vertical dashed line.

  • The top row shows distributions of quantities related to the number of hits included in the muon track fit in the different subdetectors: the number of silicon tracker layers containing hits, the number of hits in the pixel detector, the number of DT and CSC segments and RPC hits in the muon chambers, and the number of tracker-muon segment matches per muon.
  • The bottom row shows distributions of the relative pT error δ(pT)/pT, the transverse impact parameter dxy, and the tracker relative isolation.

You will find plots like these:

[Figures: the seven N-1 distributions listed above, from the attached MC-Zprime5000-CMSSW720-RecoMuon-*.png files (trackerLayers, validPixelHits, validMuonHits, matchedStations, errorPt, dxy, trackiso).]

In this file you will see the reconstructed and generated invariant mass spectra. The low mass tail is due to FSR and PDF effects.

MC-Zprime5000-CMSSW720-RecoMuon-Mass.png MC-Zprime5000-CMSSW720-GenMuon-Mass.png

Many distributions at generator level will also be produced (pT, eta, phi, ...).

Exercise: Try to plot the same distributions at reco level.

Exercise: Try to fill gen distributions before the reco-gen matching. Do you see any differences?

Mass resolution

The invariant mass resolution is evaluated from simulation, using the signal sample. The reconstructed invariant mass of each dimuon passing our selection criteria is compared to its true mass. The resolution is extracted by fitting the core of the distribution with a Gaussian.

In Python, try:

import ROOT

rootfilename = "out_tree.root"
tt = ROOT.TFile(rootfilename)
tree = tt.Get("tree")   # retrieve the TTree from the file

# Relative difference between reconstructed and generated dimuon mass,
# for events in a window around the 5 TeV Z' peak
hist = ROOT.TH1F("massResolution", "massResolution", 100, -0.5, 0.5)
tree.Project("massResolution", "(reco_diMu_m - gen_diMu_m)/gen_diMu_m",
             "reco_diMu_m > 4000 && reco_diMu_m < 6000")
hist.Draw()
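To extract the resolution, fit the core of this distribution with a Gaussian; continuing the snippet above (the fit range is an assumption - adjust it to the core of your distribution):

# Gaussian fit to the core of the relative mass-residual distribution
hist.Fit("gaus", "", "", -0.1, 0.1)
sigma = hist.GetFunction("gaus").GetParameter(2)
print("relative mass resolution: %.3f" % sigma)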

Exercise: What is the mass resolution at the Z' peak?

MC-Zprime5000-CMSSW720-MassResolution-4000_M_6000.png MC-Zprime5000-CMSSW720-MassResolution-fit_All_points.png

Exercise: Try to extract the invariant mass resolution as a function of the invariant mass.

Exercise: Make several mass bins (0<M<250, 250<M<750, 750<M<1250, 1250<M<1750, 1750<M<2000, 2000<M<4000 and 4000<M<6000), plot the corresponding mass-residual distribution in each bin and fit it with a Gaussian. Then plot the mass resolution as a function of the mass bin (as a hint, refer to the right-hand plot above). A sketch of such a loop follows.
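Here is a minimal sketch of such a loop, assuming the same file, tree and branch names as in the snippet above (the residual range and fit option are choices you should tune):

import ROOT

f = ROOT.TFile.Open("out_tree.root")
tree = f.Get("tree")

# Mass bins from the exercise text
bins = [(0, 250), (250, 750), (750, 1250), (1250, 1750),
        (1750, 2000), (2000, 4000), (4000, 6000)]

points = []
for lo, hi in bins:
    name = "res_%d_%d" % (lo, hi)
    h = ROOT.TH1F(name, name, 100, -0.5, 0.5)
    tree.Project(name, "(reco_diMu_m - gen_diMu_m)/gen_diMu_m",
                 "reco_diMu_m > %d && reco_diMu_m < %d" % (lo, hi))
    h.Fit("gaus", "Q")                      # quiet Gaussian fit
    fitf = h.GetFunction("gaus")
    if fitf:
        points.append((0.5 * (lo + hi), fitf.GetParameter(2)))

# Plot the fitted width versus the mass-bin centre
g = ROOT.TGraph(len(points))
for i, (m, s) in enumerate(points):
    g.SetPoint(i, m, s)
g.SetTitle("mass resolution;M [GeV];#sigma(M)/M")
g.Draw("AP")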

Exercise 2): Final plot of invariant mass with Run2 data and MC

Calculating weights for the Monte Carlo samples

The number of Monte Carlo (MC) events is arbitrary, usually picked so that the statistical error on the quantity one wishes to explore is minimized; but there is always a trade-off between this goal and the reality of computing power and storage. (The more events, the more time it takes to simulate them, and the more time it takes to run analysis code on them.)

When looking at distributions from just one physics process, e.g. Drell-Yan Z/γ* to dimuons, the overall normalization of the histograms is arbitrary. When comparing or overlaying the distributions from different processes, e.g. adding the background from ttbar into dimuons, the relative normalization between the two matters. And if one wants to compare these distributions to those obtained in data, one has to use the absolute normalization given by the integrated luminosity of the data sample. Here we have a quick exercise on calculating the MC weights.

Each physics process has a cross section σ, expressed in units of barns, often with metric prefixes such as millibarn (mb) or picobarn (pb). (These cross-section values come from e.g. PYTHIA, or from our theorist friends who know how to use higher-order codes properly to calculate them.) For a given integrated luminosity L (equivalent to the amount of data collected, in e.g. pb-1), the expected number of events from that process is N = σL.

Weights question 1: If we have an arbitrary number of events N_MC in the MC sample, what weight w should the histograms for this process be scaled by if we're going to eventually use it in a data/MC comparison?

Hint: What if you have a histogram containing all N_MC events? You want it to represent N = σL events instead.

Answer: w = σL / N_MC

Be aware that there may be a filter efficiency for some processes.

Note that the N_MC that matters is the original number of events you pushed through simulation; beware if you lose a batch job somewhere!

One can also think in terms of the partial weight, which does not include the integrated luminosity. It is sometimes useful to calculate the weight in two steps like this so that the scale of the integrated luminosity can be included later in the analysis chain, important when e.g. one loses the output of a few jobs and ends up not running on the full data sample.

Calculate the partial weights for the MC signal and background histograms we have provided (assume integrated luminosity = 1 for now). We will add the data later.

The datasets used, the cross sections and filter efficiencies used to simulate each MC dataset, and the number of events in each dataset (which can be found using the CMS DBS) are collected in the table below.

Sample | Cross-section [pb]
ZprimeToMuMu_M-5000_TuneCUETP8M1_13TeV-pythia8 | 5.48E-5
TT_TuneCUETP8M1_13TeV-powheg-pythia8 | 831.8
DYJetsToLL_M-50_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8 | 6025.2
ST_tW_top_5f_inclusiveDecays_13TeV-powheg-pythia8_TuneCUETP8M1 | 35.6
ST_tW_antitop_5f_inclusiveDecays_13TeV-powheg-pythia8_TuneCUETP8M1 | 35.6
WJetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8 | 61526.7
WW_TuneCUETP8M1_13TeV-pythia8 | 118.7
WZ_TuneCUETP8M1_13TeV-pythia8 | 65.9
ZZ_TuneCUETP8M1_13TeV-pythia8 | 31.8

Scale the histograms with the weight you calculated. You can use histogram->Scale(weight); a sketch of the weight calculation follows.
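As a sketch of the weight calculation in Python (the cross section is the DY value from the table above; the event count n_mc is a placeholder - look up the real one in DBS):

def mc_weight(sigma_pb, n_mc, lumi_pb=1.0, filter_eff=1.0):
    # w = sigma * filter_eff * L / N_MC; lumi_pb = 1 gives the partial weight
    return sigma_pb * filter_eff * lumi_pb / n_mc

w_dy = mc_weight(6025.2, n_mc=1.0e6)   # 1.0e6 is a placeholder, not the real count
# hist.Scale(w_dy)                     # then scale the corresponding histogram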

Efficiency

After a physics event occurs, every step towards its detection is potentially inefficient - we may fail to "see" it. Generally the steps the event takes toward us are "acceptance", "trigger", "reconstruction" and "selection". What matters in the end is the combined "efficiency", but it is much easier to understand what is going on if one can look at a single step at a time, i.e. "parametrize" the overall efficiency. Very often we are free to choose the parametrization as long as we use it consistently. At the end we need to correct the observed number of events by the efficiencies (1/eff).

"Acceptance" is the part of events that we regard as "detectable". You may think of it as a "geometrical efficiency" (the detector coverage is limited) and "kinematic efficiency" (not all "kinematic" properties of a given particle are reliably detectable by the detector). Given the DY (or Z', if we get lucky) production of a dimuon, how often do both the muons even have a chance of being reconstructed? I.e., what is the value of the fraction

acceptance = (Number of dimuons where both muons are in the nominal detector geometry) / (Number of dimuons)

CMS covers almost all of the solid angle around the interaction point. The coverage is given approximately by the edges of the muon chambers, which extend down to roughly 10° from the beamline; this corresponds to a pseudorapidity of |eta| < 2.4. On the other hand, we have to apply a minimum cut on the muon transverse momentum to ensure that the charged particle makes it into the muon system and can in principle be reconstructed as a global muon. This cut is normally determined by the global-muon reconstruction efficiency turn-on and is roughly 6 GeV.

Keep in mind that the acceptance rises as a function of mass: at lower masses the dimuon is boosted more with respect to the lab frame and has an increased chance to be boosted along the z-axis, giving lower acceptance; at higher masses the dimuon is produced closer to rest, so the lab frame is closer to the dimuon rest frame and the decay muons are more central.
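A counting sketch of the acceptance at generator level; the gen_mu1_*/gen_mu2_* branch names are hypothetical placeholders (substitute the ones in your ntuple), while the |eta| < 2.4 and 6 GeV values follow the text above:

import ROOT

f = ROOT.TFile.Open("out_tree.root")
tree = f.Get("tree")

n_all = n_acc = 0
for ev in tree:
    n_all += 1
    # hypothetical per-muon generator-level branches
    if (abs(ev.gen_mu1_eta) < 2.4 and abs(ev.gen_mu2_eta) < 2.4 and
            ev.gen_mu1_pt > 6. and ev.gen_mu2_pt > 6.):
        n_acc += 1

print("acceptance = %.3f" % (float(n_acc) / n_all))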

Then "trigger" efficiency is the efficiency to have the event recorded for analysis by the trigger system. This is the "on-line" efficiency and it is unrecoverable - if something is lost there it is lost forever. Since in the offline analysis we are dealing with events that were triggered, one has to take into account the online transverse momentum threshold and the trigger efficiency turn-on. The pt thresholds are different for each trigger.

"Reconstruction" efficiency is the efficiency to reconstruct the event (i.e. all of its components).

"Selection" efficiency is the efficiency to actually pick up the event by the event selection

There are two common ways to correct for efficiency once measured:

  • Measure everything in data and apply it to data; ideally you don't need the MC for the efficiency correction at all.
    • This method is appealing, but the efficiencies have various "dependences" and you are not necessarily able to account for them even if you know about them (for instance, statistics may be too low to allow fine binning in a specific projection to accurately map the dependence).
  • Measure the efficiency in data and in MC, take the data/MC ratio and apply it to the data (thus the data are "matched" to MC). Then you can continue by treating data and MC in the same way - the differences in efficiencies are taken into account.
    • This relies on the "dependences" of the efficiencies being well modeled by the MC (remember, there are 3000+ people working on the experiment and the MC is cross-checked many, many times; still, that doesn't make it perfect).
It is also important to keep in mind that any method used for efficiency estimation is never perfect - it is "biased". Very often, to estimate the bias in data one uses the MC alone, but this mixes the bias of the method with the quality of the MC. It is less demanding to compare the method directly between data and MC - this gives a more precise estimate of the data/MC differences. Even if the method is biased, to first approximation the bias is the same in data and MC, so no special correction is needed (of course, in a real analysis this should be checked). In any case, "normalizing" to MC first is arguably better and makes it easier to develop the overall analysis technique.

Another related point is that one should think carefully about which variable(s) are actually used in the efficiency parametrization. When the event is born there is the "true" variable, say pT. Very often a model you want to compare to does not include the whole physics of the process - typically no QED corrections. This means that if you want to compare to it you need the "true" pT before final-state radiation (FSR), but what reaches the detector is not the "true" pT - it is the "post-FSR" pT. Then, because of the resolution, you don't measure the "post-FSR" pT but the "RECO" pT.

Your analysis technique should correct properly for these effects and estimate the precision of that correction. If you think about it, you'll find that there is no way to exactly separate "acceptance" and "RECO" efficiency - there will always be border/resolution effects; however, "acceptance" times "RECO" efficiency can be defined such that it is always <= 1.

Then all the efficiencies can be estimated using the "post-FSR" variables from MC, all the data/MC efficiency corrections using the "RECO" variables (of course), and the systematic effects studied on this basis.

Maybe the most important aspect of an "efficiency" is the denominator, i.e. efficiency with respect to what. By construction, every efficiency is relative! To get the efficiency right, one should have a very good understanding of what is in the denominator.

  • The quality of the efficiency estimation is at most as good as the quality of the "probe" (muon, event, etc.; object, generally) defining the denominator.
    • We have some dimuon events selected in our samples.
    • The efficiency per event is parametrized by single-muon efficiencies - i.e. reconstruction and trigger efficiencies; to first approximation the two muons are uncorrelated efficiency-wise.
    • Then one also needs to account for the efficiency of the dimuon-specific selection - the vertex and angle requirements.
This exercise reflects what is done in the analysis.
  • For the "data-driven" estimation we apply the tag-and-probe technique. Its idea is simple: using a dimuon (dilepton) resonance, the Z, we fully reconstruct one of the muons (the "tag") and use only "part" of the other muon, e.g. its tracker track, to identify the Z peak. Such events have very small background, which means we know "for sure" that we are dealing with a muon. We can then estimate all the single-muon efficiencies with respect to the part we used - the tracker track, our "probe". One can do the same with respect to the standalone muon and estimate the tracking efficiency, and so on.

Edit ZprimeAnalyser.cc to find the ε × acc (efficiency × acceptance = number of selected events / N_MC) of our selection. You can use this number to scale the histograms with a weight calculated as w' = σL (ε × acc) / N_entries.

For the signal sample that you run, the printed value is Acc×eff = 0.880562. A sketch of the scaling follows.
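Putting the numbers together for the signal sample, as a sketch (cross section and Acc×eff from above; hist stands for whichever histogram you want to scale):

acc_eff  = 0.880562     # Acc x eff printed by ZprimeAnalyser for the Z' sample
sigma_pb = 5.48e-5      # Z' cross section from the table above [pb]
lumi_pb  = 1.0          # integrated luminosity; 1 gives the partial weight

# w' = sigma * L * (eff x acc) / N_entries, as in the text above
w_prime = sigma_pb * lumi_pb * acc_eff / hist.GetEntries()
hist.Scale(w_prime)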

Comparisons between data and simulations

Now we will compare the distributions of invariant mass in data and simulation. The distribution from each simulated sample is scaled by the weight derived from the information in the table above and the value of the integrated luminosity.

The entire summed distribution is scaled to the data using the ratio of the observed to predicted numbers of events in the region of 60 < Mμμ < 120 GeV around the Z peak in the sample of events passing a prescaled muon trigger.

A THStack object plots two or more histograms on the same canvas, stacking the entries on top of each other instead of overlaying the histograms, so all contributions remain visible. Put all the MC histograms into a single THStack:

THStack *hs = new THStack("hs", "");  // container for the stacked MC histograms
hs->Add(histo1);                      // add each weighted MC histogram in turn
hs->Add(histo2);
hs->Draw();

Try to produce the final invariant mass spectrum considering all the background MC samples. ZprimeRecomas_BG.png

Create the comparison of the observed opposite-sign dimuon invariant mass spectrum with that expected from standard-model processes. All contributions to the expected spectrum are estimated using simulation. The observed and expected spectra are in good agreement. The signal invariant mass shape for mZ' = 5 TeV is also superimposed.

DY-MC-Data2012-mass60.png

“Other prompt leptons” includes the contributions from Z → ττ, diboson production (WW, WZ, ZZ), and single top (tW).

The plot obtained can be compared to the one reported in AN2012_422, where the contributions to the expected spectrum are estimated using simulation, except for the contributions from multi-jet and W+jets events, which are evaluated from data.

AN_Spectrum.png

Exercise 3) Tag & probe

The Tag-and-Probe method utilizes the properties of a well-known resonance (e.g. Z, J/ψ) to extract a sample of muon probes suitable for efficiency estimation. All efficiencies extracted with the Tag-and-Probe method are per muon.

In the particular case of the Drell-Yan analysis, the intention is to skim an unbiased sample of Z candidates. The tag muon is required to pass a set of selection criteria designed to identify the required particle type. The other (candidate) muon is then used to probe the property under investigation.

Extend ZprimeAnalyser.cc to select:

  • The tag muon is a muon candidate passing the high-pT muon ID, matched to a generated muon and also matched to one of the muons that fired the trigger;
  • the probe muon is just a tracker muon with a very loose pT cut;
  • the tag and probe muons must have opposite charge.

Add in three quantities (nbTP, nbTF and Eff):

  • nbTP is the number of pairs where one object is a tag while the other is a probe failing the tag criteria but passing the probe criteria;
  • nbTF is the number of pairs where one object is a tag while the other is a probe failing the tag criteria and failing the probe criteria;
  • Eff is computed as nbTP/(nbTP+nbTF).
Extend the code in ZprimeAnalyser.cc so that the T&P efficiency is printed to the screen as:

output eff= 96%

**************************************
* Tag & Probe Eff at Z pole is 0.959 *
**************************************

Exercise: try to write a macro to plot the following:

  • Eff vs eta of the probe muon
  • Eff vs phi of the probe muon
  • Eff vs nbPV of the probe muon
  • Eff vs M(t,p)
The efficiency obtained as a function of the pseudorapidity should look like this:

TPeff_vs_eta.png
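One convenient way to make such plots is ROOT's TEfficiency class, which handles the binomial uncertainties for you. A minimal PyROOT sketch, assuming your extended ntuple stores per-probe branches probe_pass and probe_eta (hypothetical names):

import ROOT

f = ROOT.TFile.Open("out_tree.root")
tree = f.Get("tree")

eff = ROOT.TEfficiency("tp_eff_eta", "T&P efficiency;#eta(probe);efficiency",
                       24, -2.4, 2.4)
for ev in tree:
    eff.Fill(bool(ev.probe_pass), ev.probe_eta)   # (passed?, eta of the probe)

c = ROOT.TCanvas()
eff.Draw("AP")            # drawn with proper (Clopper-Pearson) uncertainties
c.SaveAs("TPeff_vs_eta.png")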

Exercise 4) Background estimation

The dominant irreducible background to our signal comes from Drell-Yan dimuon production, while the largest background after Drell-Yan comes from ttbar decays.

All background components are estimated using simulations except for the background from jets, which is estimated from data. In order to demonstrate that the Monte Carlo simulation describes the reducible backgrounds well, we also derive them from data as a cross check. We use the following methods:

  • For tt and tt-like lepton-flavor symmetric backgrounds, we use the “eμ” method, in which the number of dimuon events is estimated from the electron-muon dilepton spectrum.
  • For dibosons and other charge-symmetric sources of prompt muons, we check the agreement between data and simulation by examining the same-sign dimuon spectrum.
  • For the background from jets, the rate is estimated from data, using events that fail isolation cuts.
    • For dimuons from cosmic-ray events, we verify that the contribution is negligible by examining events rejected by the anti-cosmics cuts, such as the transverse impact parameter cut.

Estimating background from jets: Fake rate technique

After the ttbar background, the other main background for the Z' search is QCD multijet production. Most of the QCD background is removed by an efficient isolation cut, but there is still a rate at which jets fake muons. Monte Carlo does not model well the events that eventually pass this cut, so we rely mostly on data for the fake rate.

  • You could use the anti-isolated dimuon spectrum (events that pass our selection criteria but fail isolation) to predict the rate at which jets fake dimuon events in data.
  • Then you could build a re-weighting map from the simplest cases, for closure and to understand the procedure, and then apply it to data; a sketch of the re-weighting idea follows.
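As an illustration of the re-weighting idea (a sketch of one common implementation, not the exact analysis recipe): each anti-isolated event can be weighted into the signal region by fr/(1-fr) per muon failing isolation, where fr is the measured probability for a jet-like muon to pass the isolation cut. The fr value below is a made-up placeholder; in practice it is measured in a jet-enriched data sample, typically as a function of muon pT.

def fake_weight(n_failing_iso, fr=0.05):
    # fr = 0.05 is a placeholder; use the measured fake rate
    w = 1.0
    for _ in range(n_failing_iso):
        w *= fr / (1.0 - fr)
    return w

# e.g. an event with both muons anti-isolated:
w = fake_weight(2)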

Estimating ttbar background from data: e&mu method

For ttbar->dilepton decays, same-sign muons would not be a good handle on the background, since ttbar does not decay into two same-sign leptons. However, ttbar is twice as likely to decay to eμ as to μμ. So if the ttbar->emu rate can be measured, the ttbar->mumu rate can be estimated.

If we assume that the MC accurately represents the data, then the ratio of ttbar->mumu to ttbar->emu decays should be the same in MC and data, i.e. (MC_mumu/MC_emu) = (data_mumu/data_emu). We can measure the MC mumu and emu background distributions and the data emu distribution; from these we can estimate the data mumu background on a bin-by-bin basis (see the sketch below).
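Bin by bin this is just histogram arithmetic; a minimal ROOT sketch, where data_emu, mc_mumu and mc_emu are placeholders for the invariant-mass histograms you fill:

# data_mumu(i) = data_emu(i) x MC_mumu(i) / MC_emu(i), bin by bin
est = data_emu.Clone("ttbar_mumu_estimate")
est.Multiply(mc_mumu)    # multiply by the MC mumu spectrum
est.Divide(mc_emu)       # divide by the MC emu spectrum
est.Draw()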

For the emu method, the 2012 "MuEG" datasets, which are rich in both muons and electrons, were used, together with the "HLT_Mu22_Photon22_CaloIdL" trigger. The first reconstructed object is a muon passing the high-pT muon ID (2012), while the second object is an electron passing HEEP V4.1 (2012) "TableHEEPID.pdf".

Extending ZprimeAnalyser.cc, plot the invariant mass spectrum of the emu events selected from the background MC samples, and then superimpose the emu invariant mass distribution from data.

e_mu.png

Exercise: now try to adapt this macro to also plot the cumulative distribution for both data and MC.

Limits

The figure below shows the observed and expected upper limits on the ratio Rσ of the production cross section times branching fraction of a Z′ boson relative to that of a Z boson. The observed and expected limits agree within the uncertainties. The figure also shows the predicted cross section times branching fraction ratios for the benchmark Z′SSM [50] and Z′ψ [51] models. See https://twiki.cern.ch/twiki/bin/viewauth/CMS/LimitsToolDocumentation for the limit-setting tools.

For the muon channel alone, the 95% CL lower limits on the mass of a Z′ resonance are 2770 GeV for Z′SSM and 2430 GeV for Z′ψ.

https://twiki.cern.ch/twiki/pub/CMS/SWGuideCMSDataAnalysisSchool2015DimuonExercise/Limit.png

The μ+μ− invariant mass spectrum has been found to be in agreement with expectations from the standard model. The analysis excludes, at 95% confidence level, a Sequential Standard Model Z′SSM resonance lighter than 2770 GeV and a superstring-inspired Z′ψ lighter than 2430 GeV.
