H -> ZZ -> 2l2j / 2l1J analysis (Full 2016 Run 2 data)



  • HZZ meetings (Friday, 14:00)
  • Working meetings are called when needed on Monday at 16:00, CERN time.


  • Analysis Note: AN-17-019
  • PAS (2016, full dataset): HIG-17-012, together with other channels

Samples, Cross sections


The data to be used is the September 23rd ReReco for Runs B to G and the Prompt Reco for Run H.





These are the data for the ttbar control region:


JSON: Cert_271036-284044_13TeV_23Sep2016ReReco_Collisions16_JSON.txt (36.814 /fb)


  • ICHEP dataset (Runs B, C, D): 12.9 /fb
  • Run E: 4.32 /fb
  • Run F: 3.37 /fb
  • Run G: 8.02 /fb
  • Run H: 9.20 /fb

Grand total is 36.8 /fb
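As a quick consistency check (a sketch; the per-period values are the ones quoted above), the per-period luminosities can be summed and compared to the JSON total:

```python
# Consistency check on the per-period luminosities quoted above (in /fb).
lumi = {"B-D (ICHEP)": 12.9, "E": 4.32, "F": 3.37, "G": 8.02, "H": 9.20}
total = sum(lumi.values())
# The sum comes out to ~37.8 /fb, slightly above the 36.814 /fb certified
# by the JSON, so the per-period numbers probably come from a different
# normalization (e.g. a different brilcalc normtag) and are worth rechecking.
print(f"sum of periods: {total:.2f} /fb vs JSON: 36.814 /fb")
```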


Dataset Decay σ*BR*filter eff (pb) Comments
/BulkGravToZZToZlepZhad_narrow_M-800_13TeV-madgraph/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM ZZ -> 2l2q (l = e, mu, tau) - sigma unknown, BR = 0.0924
/GluGluHToZZTo2L2Q_M750_13TeV_powheg2_JHUgenV698_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM ZZ -> 2l2q (l = e, mu, tau) - sigma unknown, BR = 0.0924
/DYJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6_ext1-v2/MINIAODSIM - 6025.2 (CHECK ME) inclusive
/DY1JetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 1016.0  
/DY2JetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 331.4  
/DY3JetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 96.36  
/DY4JetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 51.4  
/DYBJetsToLL_M-50_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 88.2771  
/DYJetsToLL_M-50_HT-100to200_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 147.4 HT-binned
/DYJetsToLL_M-50_HT-200to400_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 40.99 HT-binned
/DYJetsToLL_M-50_HT-400to600_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 5.678 HT-binned
/DYJetsToLL_M-50_HT-600to800_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v2/MINIAODSIM - 2.198 HT-binned
/DYJetsToLL_M-50_HT-800to1200_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 2.198 (CHECK ME: same σ as HT-600to800) HT-binned
/DYJetsToLL_M-50_HT-1200to2500_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 2.198 (CHECK ME: same σ as HT-600to800) HT-binned
/DYJetsToLL_M-50_HT-2500toInf_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 2.198 (CHECK ME: same σ as HT-600to800) HT-binned
/TTJets_DiLept_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 57.35 The POWHEG sample disappeared
/WZ_TuneCUETP8M1_13TeV-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 47.13 inclusive LO
/ZZ_TuneCUETP8M1_13TeV-pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM - 16.52 inclusive LO
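The σ·BR·filter-efficiency column above is what normalizes each simulated sample to the data luminosity. A minimal sketch of the per-event MC weight (the generated-event count is a hypothetical input; in practice it comes from the sample bookkeeping):

```python
def mc_event_weight(xsec_pb, lumi_fb, n_generated):
    """Per-event weight normalizing an MC sample to the data luminosity.

    xsec_pb: sigma * BR * filter efficiency in pb (the table column above).
    lumi_fb: integrated luminosity in /fb.
    n_generated: number of generated events in the sample (hypothetical input).
    """
    lumi_pb = lumi_fb * 1000.0  # 1 /fb = 1000 /pb
    return xsec_pb * lumi_pb / n_generated

# e.g. inclusive DY (6025.2 pb) with a hypothetical 10M generated events:
w = mc_event_weight(6025.2, 36.8, 10_000_000)
```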

The trick to merge the DYbjets correctly:


It must be run between job creation and submission, so you do

batch.py -i xxx.py -o <outDir> samples.csv
source setGenericRedirAndFilters.sh <outDir>

This is not critical, but it gives you some b-tagged events at high mass; if you run the alpha factor on the regular jet-binned samples, there are really few.

Pileup Reweighting (Needs to be updated)



V3 Ntuples

Location: /eos/cms/store/caf/user/sudha/ZZ2l2q/Moriond2017/V3

V2 Ntuples

Location: /eos/cms/store/caf/user/sudha/ZZ2l2q/Moriond2017/V2

V1 Ntuples

Location of samples: https://github.com/trtomei/ZZAnalysis/blob/2l2q_80X/AnalysisStep/test/Plotter/goodDatasets_Moriond2017_V1.txt

Signal samples: /afs/cern.ch/user/t/tomei/public/HZZ2L2Q/Moriond2017/V1/
Data, backgrounds and high mass VBF signal: /store/caf/user/sudha/ZZ2l2q/Moriond2017/V1/

V0 Ntuples

Location in: /afs/cern.ch/user/t/tomei/public/HZZ2L2Q/Moriond2017/V0/

Instructions to make ntuples and plots

Framework - https://github.com/sudhaahuja/ZZAnalysis/tree/2l2q_80X
Instructions - https://github.com/CJLST/ZZAnalysis/wiki/SubmittingJobs
List of datasets to be used - https://github.com/sudhaahuja/ZZAnalysis/blob/2l2q_80X/AnalysisStep/test/prod/samples2l2q_Moriond2017.csv
Plotting script - https://github.com/sudhaahuja/ZZAnalysis/blob/2l2q_80X/AnalysisStep/test/Plotter/plotDataVsMC_2l2q.C

Temporary note (before submitting jobs):

For Signal: In ZZAnalysis/AnalysisStep/prod/analyzer_20152l2q.py, add the following line

process.ZZTree.skipEmptyEvents = False

For Data: Make the following changes
In ZZAnalysis/AnalysisStep/test/MasterPy/ZZ2l2qAnalysis.py - change the global tags for Runs B-G (SeptRepro) & Run H (Prompt) data accordingly:

        process.GlobalTag = GlobalTag(process.GlobalTag, '80X_dataRun2_2016SeptRepro_v7', '')
        #process.GlobalTag = GlobalTag(process.GlobalTag, '80X_dataRun2_Prompt_v16', '') # For RunH            

Objects, methods (Needs to be cross-checked and updated)

Except where mentioned otherwise, content is taken directly from the miniAODs; for documentation, please refer to WorkBookMiniAOD



  • As in the 4-lepton analysis:
    • Loose Muons: pT > 5, |eta| < 2.4, dxy< 0.5, dz < 1, (isGlobalMuon || (isTrackerMuon && numberOfMatches>0)) && muonBestTrackType!=2
      • dxy and dz are defined w.r.t. the PV and using the muonBestTrack, eg:
        dxy = fabs(l.muonBestTrack()->dxy(PV->position()))
      • Note that non-global tracker muons must be arbitrated (numberOfMatches>0) and that muons with muonBestTrackType == 2 (standalone) are discarded even if they are marked as global or tracker muons.
    • Tight Muons: as Loose Muons + PF Muon Isolation: iso/pT < 0.35, using PF combined relative isolation with cone size R=0.3 and Δβ correction. The cut is applied after recovered FSR photons are subtracted from the isolation cone (see below).

Ghost cleaning:

process.cleanedMu = cms.EDProducer("PATMuonCleanerBySegments",
   src = cms.InputTag("calibratedMuons"),
   preselection = cms.string("track.isNonnull"),
   passthrough = cms.string("isGlobalMuon && numberOfMatches >= 2"),
   fractionOfSharedSegments = cms.double(0.499))



  • As in 4-lepton analysis:
    • Loose Electrons: pT > 7, |eta| < 2.5, dxy< 0.5, dz < 1,
      The conversion rejection cut gsfTrack.hitPattern().numberOfHits(HitPattern::MISSING_INNER_HITS)<=1 was used before we moved to the Spring15 BDT; we don't use it anymore since this variable is now part of the BDT inputs.
    • Tight Electrons: Loose Electrons + non triggering MVA ID, using the following recipe:
      • In CMSSW_7_6_X , we don't recompute the ID and we directly retrieve the variable from the pat::Electron as userFloat("ElectronMVAEstimatorRun2Spring15NonTrig25nsV1Values")
      • The working point is:
        float fSCeta = fabs(l.superCluster()->eta());
        bool isBDT = (pt<=10 && ((fSCeta<0.8                  && BDT > -0.265) ||
                                 (fSCeta>=0.8 && fSCeta<1.479 && BDT > -0.556) ||
                                 (fSCeta>=1.479               && BDT > -0.551))) 
                  || (pt>10  && ((fSCeta<0.8                  && BDT > -0.072) ||
                                 (fSCeta>=0.8 && fSCeta<1.479 && BDT > -0.286) || 
                                 (fSCeta>=1.479               && BDT > -0.267)));
    • Lepton cross cleaning: Remove electrons which are within ΔR(eta,phi)<0.05 of a muon passing tight ID && SIP<4
    • Isolation: New recipe, as of 7_6_X samples: The isolation cut is applied after recovered FSR photons are subtracted from the isolation cone (see below).
      • iso/pT < 0.35, using PF combined relative isolation with cone size R=0.3:
        double Ana::pfIso03(pat::Electron elec, double Rho) {
            // rho x effective-area PU correction
            double PUCorr = Rho*ElecEffArea(elec.superCluster()->eta());
            double iso = (elec.pfIsolationVariables().sumChargedHadronPt +
                          std::max(elec.pfIsolationVariables().sumPhotonEt +
                                   elec.pfIsolationVariables().sumNeutralHadronEt - PUCorr, 0.0)) / elec.pt();
            return iso;
        }
      • using rho correction with Spring15-25ns-based effective areas from EGamma POG. Note that these EA are binned in eta of the electron's supercluster, not the electron eta.
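The ElecEffArea lookup used above can be sketched as follows. The effective-area values are quoted from memory of the EGamma POG Spring15-25ns recommendation and should be verified against the official file before use; note that the binning is in the supercluster |η|, as the bullet above stresses:

```python
import bisect

# Spring15-25ns effective areas (cone 0.3), binned in supercluster |eta|.
# Values quoted from memory of the EGamma POG recommendation -- verify
# against the official effective-area text file before use.
_EA_EDGES = [1.0, 1.479, 2.0, 2.2, 2.3, 2.4]
_EA_VALUES = [0.1752, 0.1862, 0.1411, 0.1534, 0.1903, 0.2243, 0.2687]

def elec_eff_area(sc_eta):
    """Effective area binned in the electron's supercluster eta
    (not the electron eta)."""
    return _EA_VALUES[bisect.bisect_right(_EA_EDGES, abs(sc_eta))]
```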

Photons for FSR

Start from PF photons from the particleFlow collection.

  • Preselection: pT > 2 GeV, |η| < 2.4, photon PF relative isolation less than 1.8.
    The PF isolation is computed using a cone of 0.3, a threshold of 0.2 GeV on charged hadrons with a veto cone of 0.0001, and 0.5 GeV on neutral hadrons and photons with a veto cone of 0.01, including also the contribution from PU vertices (same radius and threshold as for the charged isolation).
  • Supercluster veto: remove all PF photons that match with any electron passing loose ID and SIP cuts; matching is according to (|Δφ| < 2, |Δη| < 0.05) OR (ΔR < 0.15), with respect to the electron's supercluster.
  • Photons are associated to the closest lepton in the event among all those passing loose ID + SIP cut.
  • Discard photons that do not satisfy the cuts ΔR(γ,l)/ETγ2 < 0.012, and ΔR(γ,l)<0.5
  • If more than one photon is associated to the same lepton, the lowest-ΔR(γ,l)/ETγ2 is selected.
  • For each selected FSR photon, exclude that photon from the isolation cone of all leptons in the event passing the loose ID + SIP cut, if it was inside the isolation cone and outside the isolation veto (ΔR > 0.01 for muons, and (ele->superCluster()->eta() < 1.479 || ΔR > 0.08) for electrons; note: these requirements should probably be rechecked for consistency with the isolation algorithms).

Lepton efficiency scale factors

Muon scale factors

The histogram with overall data-to-simulation scale factors (for tracking, reconstruction, identification, impact parameter and isolation requirements) is available here.

Electron scale factors

Electron efficiencies are measured for ID|Reco, ID+ISO+SIP|Reco and SIP|Reco in 6 electron pT bins (7, 10, 20, 30, 40, 50, 1000) and 4 electron supercluster |η| bins (0.0, 0.8, 1.479, 2.0, 2.5). A novelty with respect to Run 1 is a separate set of scale factors for crack electrons; GsfElectron::isGap() is used to determine whether an electron is a crack electron. These scale factors are officially approved by the EGamma POG (presentation), derived with 76X data and ready for use.

  • Scale factors for new MVA ID working point with respect to the reconstruction with full systematics :
  • Scale factors for ID+ISO+SIP, for |SIP| < 4 and iso/pT < 0.35 with cone size R=0.3 working points with respect to the reconstruction with full systematics:
  • Scale factors for SIP with respect to the reconstruction with only central values provided:
Please note that the scale factors are measured up to 1000 GeV and should therefore be applied for electrons up to 1000 GeV, but for simplicity the upper bound in the provided ROOT files is always 200 GeV. All the fits and plots can be found here.
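The 200 GeV truncation above is easy to get wrong when applying the scale factors; a sketch of a bin lookup that clamps pT into the last bin (the table layout here is illustrative, not the official TH2):

```python
import bisect

# Binning quoted above: 6 pT bins and 4 supercluster-|eta| bins.
# The SF values live in the official ROOT files; sf_table here is any
# 6x4 nested list in the same bin order (illustrative layout).
PT_EDGES = [7, 10, 20, 30, 40, 50, 200]   # provided files stop at 200 GeV
ETA_EDGES = [0.0, 0.8, 1.479, 2.0, 2.5]

def lookup_sf(sf_table, pt, sc_eta):
    """Clamp pT into the last bin (SFs are valid up to 1000 GeV but the
    histograms end at 200 GeV), then pick the (pT, |eta_SC|) bin.
    pT below the first edge (7 GeV) is not handled here."""
    pt = min(pt, PT_EDGES[-1] - 1e-3)
    i = bisect.bisect_right(PT_EDGES, pt) - 1
    j = bisect.bisect_right(ETA_EDGES, abs(sc_eta)) - 1
    return sf_table[i][j]
```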

Lepton momentum scale and resolution, event-by-event mass error calibration

Muon scale and resolution corrections

Muon momentum scale corrections are applied using the KalmanMuonCorrector class.

The corrections can be downloaded by doing, under $CMSSW_BASE/src, the following

 git clone https://github.com/bachtis/Analysis.git -b KaMuCa_V2 KaMuCa 

A KalmanMuonCalibrator object can be created using as input string "DATA_76X_13TeV" or "MC_76X_13TeV". Then:

/// ====== ON DATA (correction only) =====
    double corrPt = calibrator.getCorrectedPt(mu.pt(), mu.eta(), mu.phi(), mu.charge());
    double corrPtError = corrPt * calibrator.getCorrectedError(corrPt, mu.eta(), mu.bestTrack()->ptError()/corrPt );

/// ====== ON MC (correction plus smearing) =====
    double corrPt = calibrator.getCorrectedPt(mu.pt(), mu.eta(), mu.phi(), mu.charge());
    double corrPtError = corrPt * calibrator.getCorrectedError(corrPt, mu.eta(), mu.bestTrack()->ptError()/corrPt );
    double smearedPt = calibrator.smear(corrPt, mu.eta());
    double smearedPtError = smearedPt * calibrator.getCorrectedErrorAfterSmearing(smearedPt, mu.eta(), corrPtError /smearedPt );

Electron scale and resolution corrections

WARNING: EGamma corrections should be applied only to 25 ns data, not to 50 ns data.

Use 7_6_X branch of the EGamma code from https://twiki.cern.ch/twiki/bin/view/CMS/EGMSmearer

Apply the correction before all cuts and before the computation of the relative isolation, but do not recompute the MVA ID discriminator.

ak4 Jets

Start from the slimmedJets collection.

In 76X, reapply Jet energy corrections:

  • MC: Fall15_25nsV2_MC, apply 'L1FastJet','L2Relative','L3Absolute'
  • data: Fall15_25nsV2_DATA, apply 'L1FastJet','L2Relative','L3Absolute','L2L3Residual'

looseJetID: follow instructions in https://twiki.cern.ch/twiki/bin/viewauth/CMS/JetID#Recommendations_for_13_TeV_data

pileupJetId: PU jet ID is currently buggy; a fix will arrive soon. In the meantime, it should not be applied. Still true? Previously used cut:

      // NB: this fragment was garbled on the page; only the |eta| boundaries
      // and two thresholds survived. The two-region structure below is a
      // guess -- recover the exact cuts from the analysis code before use.
      float jpumva = 0.;  // pileup-jet-ID MVA discriminant
      bool passPU = true;
      if (...) {                          // first region (condition lost)
        if      (jeta > 2.75) { /* threshold lost */ }
        else if (jeta > 2.5)  { /* threshold lost */ }
        else if (jpumva <= -0.63) passPU = false;
      } else {                            // second region (condition lost)
        if      (jeta > 2.75) { /* threshold lost */ }
        else if (jeta > 2.5)  { /* threshold lost */ }
        else if (jpumva <= -0.95) passPU = false;
      }

The jets are required to have Pt>30 and |eta|<2.4. They must be cleaned with a DeltaR>0.4 cut wrt all tight leptons in the event (cf. ID definition at HiggsZZ4l2015#Muons and HiggsZZ4l2015#Electrons) passing the SIP and isolation cut computed after FSR correction, as well as with all FSR collected photons attached to these leptons.

For extra jets in the event (not forming ZZ candidates) the |eta| cut is relaxed to 4.7.

ak8 (merged) Jets

Start from the slimmedJetsAK8 collection.

In 80X, reapply Jet energy corrections:

  • MC: Fall15_25nsV2_MC, apply 'L1FastJet','L2Relative','L3Absolute' (CHECK ME: Fall15 payloads with 80X samples?)
  • data: Fall15_25nsV2_DATA, apply 'L1FastJet','L2Relative','L3Absolute','L2L3Residual' (CHECK ME: as above)

Additionally, apply the L2 and L3 corrections only to the jet pruned mass.

looseJetID: follow instructions in https://twiki.cern.ch/twiki/bin/viewauth/CMS/JetID#Recommendations_for_13_TeV_data

The jets are required to have Pt>170, |eta|<2.4, tau21 < 0.6. They must be cleaned with a DeltaR>0.8 cut wrt all tight leptons in the event (cf. ID definition at HiggsZZ4l2015#Muons and HiggsZZ4l2015#Electrons) passing the SIP and isolation cut computed after FSR correction, as well as with all FSR collected photons attached to these leptons.

The scale factors for the working points can be found at the dedicated jet W-tagging twiki: https://twiki.cern.ch/twiki/bin/viewauth/CMS/JetWtagging#Working_points_and_scale_factors

Trigger requirements

  • The paths are now:
    • HLT_Ele23_Ele12_CaloIdL_TrackIdL_IsoVL_DZ_v* || HLT_DoubleEle33_CaloIdL_GsfTrkIdVL_v* || HLT_Ele27_WPTight_Gsf_v* || HLT_Ele25_eta2p1_WPTight_Gsf_v* || HLT_Ele27_eta2p1_WPLoose_Gsf_v*
    • HLT_Mu17_TrkIsoVVL_Mu8_TrkIsoVVL_DZ_v* || HLT_Mu17_TrkIsoVVL_TkMu8_TrkIsoVVL_DZ_v* || HLT_IsoMu24_v* || HLT_IsoTkMu24_v*

Do we really need / want such a complicated trigger selection?
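The OR of versioned paths above reduces to a simple prefix match, since the `_v*` suffix stands for any version number. A minimal sketch:

```python
def passes_trigger(fired_paths, required_prefixes):
    """True if any fired HLT path matches any of the required patterns;
    the trailing '_v*' is treated as a version wildcard."""
    return any(fired.startswith(prefix.rstrip("*"))
               for fired in fired_paths
               for prefix in required_prefixes)

MU_PATHS = [
    "HLT_Mu17_TrkIsoVVL_Mu8_TrkIsoVVL_DZ_v*",
    "HLT_Mu17_TrkIsoVVL_TkMu8_TrkIsoVVL_DZ_v*",
    "HLT_IsoMu24_v*",
    "HLT_IsoTkMu24_v*",
]
```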

Analysis flow (Needs to be updated)

  • requiring at least one good vertex: !isFake && ndof > 4 && |z| <= 24 && position.Rho <= 2
        # e.g. (module label "goodPrimaryVertices" is illustrative):
        process.goodPrimaryVertices = cms.EDFilter("VertexSelector",
            src = cms.InputTag("offlinePrimaryVertices"),
            cut = cms.string('!isFake && ndof > 4 && abs(z) <= 24 && position.Rho <= 2'),
            filter = cms.bool(True),
        )

  • IP and isolation requirement: all leptons should satisfy cuts described above (on FSR-subtracted isolation)
  • Z to lepton candidates made of OSSF lepton pairs passing the above requirements.
    • In addition, require: 55 < mll < 120 GeV and ptll > 100 GeV
  • Z to jet candidates are either a merged jet passing the above requirements for ak8 or a jet pair passing the above requirements for ak4
    • In addition require: 40 < mjj (or mJ) < 180 GeV and ptjj > 100 GeV (ptJ is already cut at 170 GeV at miniAOD level); the mass cuts are very loose to keep both signal-region and sideband events, the latter being used for background estimation.
  • Among all ZZ pairs, require that:
    • ΔR(eta,phi)>0.02 between each of the leptons (to remove ghosts)
    • the two highest-pT leptons pass pT > 40 and 24 GeV
    • define the Z1 as the one with jets (two ak4 jets or one ak8 jet); the Z with leptons is the Z2.
    • m(2l2j / 2l1J) > 300 GeV
  • If more than one ZZ candidate survives, choose the one with the Z2 closest in mass to nominal Z. If two or more candidates include the same Z2 and differ just by the Z1, choose the one with the highest-pT (of the merged jets or the vector sum of jet pTs in case of resolved jets). Double check this

K factors

  • For the time being we only apply a FLAT k-factor = 1.231 for the DY+jets background.

Kinematic Discriminants

The latest construction of the spin-0/spin-2 discriminants and the corresponding templates is described in the talk:

Event Categorization

  • VBF-tagged category: At least two extra jets and vbfmela > 1.043 - 460./(ZZMass + 634.)
  • B-tagged category: At least two b-tags among the two jets (or subjets) that make up the resolved (merged) Z1. We use CSVv2 with a b-tagging threshold of 0.46 (this should be the Loose working point, not Medium — check it)


Using the Z mass constraint to fit jet momenta; more details in: kinZfitter

To get the code:

cd $CMSSW_BASE/src 
git clone https://github.com/tocheng/KinZfitter.git
cd KinZfitter
git checkout -b from-Zhadv1.1 Zhadv1.1


Combination Cards



Efficiency calculation (notes from Candice)

Main repo: https://github.com/CandiceYou/HighMassCombInputs

For running the efficiency, resolution and 2D templates, it is sufficient to uncomment the relevant block in this bash script and run it:

When the new trees arrive, the input files will be changed here and here.

  • The inputfiles_spin0ggH and inputfiles_spin0VBF arrays list the sample masses, and the files are picked up in the following format:
    • /ggHiggs/ZZ2l2qAnalysis.root
    • /VBFHiggs/ZZ2l2qAnalysis.root
  • The background samples and data files are in selection.cc.

Since the current samples are in different directories with different naming, I temporarily added an inputDir2 and hacked selection.cc a bit to add the high-mass samples. This will be simplified once the new samples arrive.

These are the 2D templates I got with the latest samples:

I think the efficiency will be the easier test run, since the resolution and templates involve many scripts. If you run the default run2l2q_all.sh, the efficiencies will be produced.
The code will make a new set of trees that apply the selections and label each event into 12 categories. This step usually takes some time.
To speed it up for a quick test, you can select only a few samples by reducing the arrays here:

Review page

-- ThiagoTomei - 2017-02-17
-- SudhaAhuja - 2017-01-12
Topic revision: r15 - 2017-04-25 - SudhaAhuja