• at line 55 there is a cut of 20 GeV on the jets, while it is 40 GeV at line 49, any reason for that?

    • 20 GeV is a threshold for muons, which are checked for presence inside the jet.
  • for the cuts on lines 66,67 for the discriminants, is there any study, justification how they were chosen?

  • I am not sure that I understand the data-MC comparison in Fig. 9b and in Fig 10b. In Fig 9b the data agree with the overall sum of MCs well. In Fig. 10b they agree less and the figure caption does not help. Is Fig. 10b after applying the kMC factors? Because the agreement looks worse.

    • In fact 9b and 10b are two different plots: 9b shows the comparison between all data and MC, while 10b shows (data - Top/Dibosons) compared with Drell-Yan. However, the plots in figure 9 were produced with the wrong Drell-Yan normalization: for these two plots I used the NNLO Drell-Yan cross sections (4578 pb, 851 pb and 335 pb for DY + 0, 1 and 2 jets), because I was trying to reproduce Duong's results with NNLO and forgot to switch back to NLO. The rest of the plots in the AN use the standard NLO cross sections (4754 pb, 888 pb and 348 pb for DY + 0, 1 and 2 jets). I'll replace the plots in figure 9 in the new version. Two versions of Ystar, with no tag and with c-tag, made with NLO and NNLO cross sections are attached.
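To gauge the size of the normalization shift described in the answer above, the NNLO/NLO ratio can be computed per jet multiplicity from the quoted cross sections. This is only a quick arithmetic sketch; the dictionary keys and variable names are illustrative and not taken from the analysis code.

```python
# Drell-Yan cross sections in pb, as quoted in the answer above.
xsec_nlo = {"DY+0j": 4754.0, "DY+1j": 888.0, "DY+2j": 348.0}
xsec_nnlo = {"DY+0j": 4578.0, "DY+1j": 851.0, "DY+2j": 335.0}

# Ratio by which the DY prediction is scaled when NNLO values replace NLO ones.
ratios = {sample: xsec_nnlo[sample] / xsec_nlo[sample] for sample in xsec_nlo}

for sample, ratio in ratios.items():
    print(f"{sample}: NNLO/NLO = {ratio:.3f}")
```

All three ratios are about 4% below one, so the plots made with NNLO values have a correspondingly lower Drell-Yan prediction.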
  • In general, the method to extract the kMC factors is based only on number of events. It would be much better to take a distribution which is sensitive to c- and b-tagging and fit that as sum of the 3 components to extract these kMC. I would recommend to try it, it should not be very complicated.

    • We used RooFit to obtain the k_MC factors from a shape fit (see kFactorsFit.c in the attachments). It was done as a simultaneous fit of two distributions, Ystar with b- and c-tags, with the k_MC factor for the light component fixed to 1. The resulting k_MC factors for the b and c components are 0.78 and 1.03 respectively, which is consistent with the results obtained by solving the equations with numbers of events.
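For comparison with the shape fit, the counting method mentioned above can be sketched as a 2x2 linear system: one equation for the c-tagged selection and one for the b-tagged selection, each of the form N(data - Top/Dibosons) = N_light + k_c N_c + k_b N_b. The sketch below assumes the light factor is fixed to 1, as in the fit; the function name, dictionary keys and all yields are hypothetical and chosen only so the arithmetic is easy to check.

```python
def solve_k_factors(ctag, btag):
    """Solve for (k_c, k_b) with k_light fixed to 1 (assumption).

    Each argument holds yields for one tagged selection:
    'data_minus_bkg' (data minus Top/Dibosons) and the DY MC yields
    'light', 'c' and 'b' split by jet flavour.
    """
    # Move the fixed light contribution to the right-hand side.
    r1 = ctag["data_minus_bkg"] - ctag["light"]
    r2 = btag["data_minus_bkg"] - btag["light"]
    # Cramer's rule for the 2x2 system in (k_c, k_b).
    det = ctag["c"] * btag["b"] - ctag["b"] * btag["c"]
    if det == 0:
        raise ValueError("the two tagged selections give degenerate equations")
    k_c = (r1 * btag["b"] - ctag["b"] * r2) / det
    k_b = (ctag["c"] * r2 - btag["c"] * r1) / det
    return k_c, k_b


# Hypothetical yields, not taken from the analysis.
ctag_yields = {"data_minus_bkg": 1380.0, "light": 100.0, "c": 1000.0, "b": 200.0}
btag_yields = {"data_minus_bkg": 2180.0, "light": 50.0, "c": 300.0, "b": 2000.0}
k_c, k_b = solve_k_factors(ctag_yields, btag_yields)
print(f"k_c = {k_c:.2f}, k_b = {k_b:.2f}")
```

With these made-up inputs the solution is k_c = 1.10 and k_b = 0.90; the real yields would of course come from the tagged data and MC samples.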

  • in the closure test, are the same events used or which events are used?
    • Yes, the same sample was used to calculate the response matrix, background and acceptance and to perform the closure test. Applying the unfolding procedure to another sample would a priori not reproduce the generator-level distribution of the original sample, so it would be impossible to say whether the difference is caused only by statistics or by errors in the unfolding procedure.

  • I am surprised that the pileup has such a large effect on the last 2 bins for the c-jets, unless it is just statistics.

    • This effect is seen for most of the uncertainties, not only pileup, because of the limited statistics.

  • - what would happen if at page 14, first formula, you would take
    the N_data-Top/Dibosons-light tagged= k_light*NDY,light,light-tagged+
    • Light tagging requires anti-b and anti-c tagging. However, there are no anti-tag SFs, so the number of events in this modified equation could be incorrect and the solution would not be reliable.

  • it would be good to have comparisons before and after k-factors, similar to Figs
    14-16 of the note, for all the possible cases, for the moment for instance I do not
    understand what Fig. 19 is, are k-factors applied?
    • These figures show SVM before applying k-factors; additional plots and descriptions have been added to the AN.

  • It is indeed not good that using the SVfit mass the k-factors come out so
    different including or not SV jets. Concerning the tagger, did you get any feedback
    from BTV on the best tagger to use for c-jets?
    • There is feedback from Juan Pablo, who suggested that there are problems with the modeling of SV reconstruction; he also has no objections to method 1, the equation-solving method.

  • make the selection similar to Duong's selection and compare to their k-factors

  • at line 209 you write that the light fraction has a normalization fixed to 1, I can't recall what Juan Pablo does. What happens if you leave it free, is it not possible to constrain it, maybe due to the different shape? Do you have plots in the note showing results of the fit?

    • The fitted value for the light component is close to 1 if it is left free, so this component was kept fixed. Other analyses do the same; as I understand, Duong also finds k-factors only for charm and bottom. Figures 27 and 28 show the agreement between data and MC after applying these k-factors. Plots 25 and 26 show the measured k-factors as functions of the pt of the Z boson or of the c-tagged jet.

  • I understand from the answer to Juan Pablo that you do not have a pt cut on the generated leptons. First of all I guess now you are correcting back to dressed leptons. Then it is always better to correct back to a fiducial region which is close to the experimental one, not to have huge unknown acceptance corrections. It would be good to add in the note the exact fiducial region and in principle genjets and genleptons should have kinematic pt, eta cuts similar to the reco ones, + the invariant mass Mll gen cut is needed.

    • We have added lepton pt and eta cuts close to those used at reco level: leading pt > 26 GeV and subleading pt > 10 GeV. The new AN version states that we measure a fiducial cross section.
  • It would be good to understand some of your systematics like:

    - Fig. 44 (c) - why pt c-jet high at high pt and only in muon channel

    • It seems that the last bins may have large statistical fluctuations: with only a few events, a change of one parameter can lead to a large change in the distribution. The attached plots data_mu and data_el show how these fluctuations appear in the ratio of the varied distribution to the central one.
  • - Fig. 45 (b) - why pileup high only in electron channel

    • To be understood.
  • - Figs 46 (b) and (d) what is happening in the eID so bad compared to the muon ID?

    • It must be an error: for electron pt~65 and eta~-1.5 the efficiency and its error (GetBinError) are both equal to 1! So the weight changes by 100%. Will ask the electron POG. Update: the electron SFs depend not on the electron eta but on the electron supercluster eta, so that bin should be skipped.

Juan Pablo

  • +) about the c-tag/mistag efficiency. You explain in eq. 5 how you apply the weight for the SFc as recommended. Let me explain what I do in my code.

    Let's imagine that the SFc for your ctagger-T is 0.92 +/- 0.06 +/- 0.01 (overall, then there is the "file" in bins of pt of the jet but for simplicity let's get the single number here). What I do is weightMC *= 0.92; This lowers the MC and improves my data/MC agreement. Could you please check that this is equivalent to the procedure you describe and follow ?

    • Eq. 5 looks a bit complicated because there are two weights that improve data/MC agreement: one for the B-F samples and another that accounts for the data/MC difference in G-H. Since there is only one set of MC samples, the weight applied to weightMC is a combination of these two weights, each taken in proportion to the luminosity of its subset; that is what eq. 5 expresses. In the tag case (when a c-jet passes the c-tag) there is no such partition, so the MC weight is simply multiplied by the corresponding tag efficiency SF.
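The luminosity-proportional combination described in the answer above can be sketched as a weighted average of the two period-specific scale factors. The exact form of eq. 5 is not reproduced on this page, so the expression below is an assumption; the function name, the scale-factor values and the luminosities are placeholders.

```python
def lumi_weighted_sf(sf_bf, sf_gh, lumi_bf, lumi_gh):
    """Combine scale factors for periods B-F and G-H in proportion to luminosity."""
    total = lumi_bf + lumi_gh
    if total <= 0:
        raise ValueError("total luminosity must be positive")
    return (lumi_bf * sf_bf + lumi_gh * sf_gh) / total


# Placeholder inputs (not the analysis values): SFs per period and luminosities in fb^-1.
combined = lumi_weighted_sf(sf_bf=0.95, sf_gh=0.90, lumi_bf=20.0, lumi_gh=16.0)
weight_mc = 1.0
weight_mc *= combined  # applied once per event on the single set of MC samples
print(f"combined SF = {combined:.4f}")
```

Using a single combined weight is what allows one set of MC samples to stand in for both data-taking periods.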

  • You use "HLT Ele27 WPTight Gsf" for the Z->ee. When I use the W-> e for charm tagging purposes I go up to the "HLT Ele32" because I do not have to deal with unprescaling (may be you do not have to either at "HLT Ele27", did you cross check ?)

    • According to

      HLT Ele27 is unprescaled. As I understand it, this means that every time the trigger fires the event is recorded, so no SFs are needed to account for a reduced event recording rate.

  • Your offline cut is 28 GeV. The usual is to leave 2 GeV to make sure you are far from the trigger turn on. Beware you might be asked during the publication process to move your offline cut to the standard "trigger_cut + 2" GeV.

    • The electron part of the analysis will be done with a 29 GeV cut, to follow the usual way of selecting electrons with this trigger.

L120: "signal muons or muons, which ..."
Regarding the purpose and details of the muon-jet cleaning procedure: is this to remove Tight-ISO-muons in a delta_R (jet,Tight-ISO-muon) < 0.4 ? That is fine, but just to make sure this is not to remove Tight-NONISO-muons in a delta_R (jet,Tight-NONISO-muon) < 0.4. If you remove also NONISO, it is up to you but you must already know that in the range 15 GeV < pt_NONISO_muon < 25 GeV you have a chunk of signal. Nothing to worry though.

    • We remove only jets, which overlap with isolated muons. In this case we don't remove signal events.

Can both results be combined even when having different pt cuts? Which is your fiducial region (I mean, is your cross section defined for leptons with pt > XX GeV or for Z with pt > XX GeV)? I must have missed it.

    • We have switched to a new signal definition, which includes pt and eta cuts for the leptons (leading pt > 26 GeV and subleading pt > 10 GeV) to match the reco-level selection. Thus we measure the fiducial cross section of the process.


  • Do you apply any matching between the trigger object (the one that fired the single muon trigger) and the reconstructed muons ?
    • No, there is no matching between the trigger object and the muons. To take into account the efficiency for two muons, a combinatorial formula was used.
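The combinatorial formula itself is not written out in the answer above. A common form for a single-muon trigger in a dimuon event, which is assumed here, is the probability that at least one of the two muons fires the trigger. The function names and the efficiency values are illustrative only.

```python
def event_trigger_efficiency(eff_mu1, eff_mu2):
    """Probability that at least one of two muons fires a single-muon trigger."""
    return 1.0 - (1.0 - eff_mu1) * (1.0 - eff_mu2)


def event_trigger_sf(data_effs, mc_effs):
    """Per-event data/MC scale factor built from per-muon efficiencies."""
    return event_trigger_efficiency(*data_effs) / event_trigger_efficiency(*mc_effs)


# Illustrative per-muon efficiencies (leading, subleading); not analysis values.
eff_event = event_trigger_efficiency(0.90, 0.80)
sf_event = event_trigger_sf(data_effs=(0.90, 0.80), mc_effs=(0.92, 0.85))
print(f"event efficiency = {eff_event:.3f}, event SF = {sf_event:.4f}")
```

This form treats the two muons as independent, which is the usual approximation when no trigger-object matching is applied.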

  • What is your definition of a c (b) jet at generator level (L 73)?
    • We use hadron flavor for gen jets, using the same algorithm, used for reco level jets. In .py file :
import FWCore.ParameterSet.Config as cms

# Hadron-based flavour information for AK4 gen jets
from PhysicsTools.JetMCAlgos.AK4PFJetsMCFlavourInfos_cfi import ak4JetFlavourInfos
process.genJetFlavourInfos = ak4JetFlavourInfos.clone(
    jets = cms.InputTag("ak4GenJets")
)

# Identify b hadrons and associate them with the gen jets
from PhysicsTools.JetMCAlgos.GenHFHadronMatcher_cff import matchGenBHadron
process.matchGenBHadron = matchGenBHadron.clone(
    genParticles = cms.InputTag("ak4GenJets"),
    jetFlavourInfos = "genJetFlavourInfos"
)

# Identify c hadrons and associate them with the gen jets
from PhysicsTools.JetMCAlgos.GenHFHadronMatcher_cff import matchGenCHadron
process.matchGenCHadron = matchGenCHadron.clone(
    genParticles = cms.InputTag("ak4GenJets"),
    jetFlavourInfos = "genJetFlavourInfos"
)

and inside .cc file:


  • In the case of b(c)-tagging you do not correct data and MC separately but keep the analysis at the (let's say) uncorrected data level and apply the corresponding b(c) tagging SF. I find this treatment quite asymmetric. I would treat lepton and b(c) tagging efficiencies on the same footing.
    • Muon ID, isolation and trigger efficiencies depend on the data samples, so the data were reweighted to account for this dependence. The c- and b-tag efficiencies and mistag rates also depend on the data samples; however, it is impossible to calculate them separately for data and MC (the hadron flavour of a jet cannot be defined in data), so the scale factors for MC were composed of two scale factors corresponding to the two sets of data samples.

  • Are b(c) tagging SF applied in Figs. 4 to 8 ? Specify it in the text/caption.
    • Yes, c-tag/mistag SFs are taken into account for these plots, will specify it in the text.

  • Maybe you can test the ttbar MC description with a control sample in the emu channel.
    • We didn't save muons in tuples, so this can't be done soon.

  • Can you describe in detail how you are treating systematic uncertainties in the c(b) tagging (light mistagging) scaling factors? You are following the recommendations of the b-tagging group, but it would be good to have it explained also here. How do you treat correlations among the different SF ?
    • All necessary formulas and conditions for pt, eta and discriminator can be found here. The systematics are taken into account by switching all formulas used in the calculation of the tag/mistag SFs to the formulas corresponding to the up/down uncertainties. For example, to get distributions with the SF uncertainty up, the weight for an event with a c-jet is calculated with the formula for the "comb" measurement type and tight working point, selected according to the pt of the c-jet. We don't take into account correlations between different SFs.
      Update: I found an error in my code: the scale factors for c-mistag of b-jets were equal to 0 in some cases. I have added a detailed description of how the SFs are calculated to the AN.

  • Fig. 19 (acceptance) it has a funny shape, can you give more details of which cut(s) are most relevant in the different pT regions ?
    • This shape is similar to the shape of the c-tag efficiency, so it seems that the largest contribution comes from c-tagging.

  • Same for fig. 18. I am a bit surprised for the first point in the left plot. Does it come from unmatched reco dileptons with gen dileptons or from unmatched reco c-jets to gen c-jets ? May I assume a ~100% correlation between fig. 18 left and right ?
    • This shape comes from the pt migration of the Z / c-jet, so this form is expected for the pt of any object, regardless of correlation.

  • As already suggested at the SMP-COM meeting, the sensitivity to pdf should be assessed. Probably a study, similar to the one [1] can be tried. This can also help to define the optimal binning in terms of Yb, Ystar.
    • We can try to do this study. The current Yb and Ystar binning is optimal for the differential cross section as a function of Yb and Ystar: the partition was chosen so that the number of events is of the same order, and the statistical errors are similar, in the different bins.
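One simple way to obtain bins with comparable event counts, as described in the answer above, is to place the bin edges at quantiles of the observed distribution. The sketch below is a one-dimensional illustration under that assumption; it is not the procedure used in the analysis, and the function name and input values are hypothetical.

```python
def equal_population_edges(values, n_bins):
    """Return n_bins + 1 edges so that each bin holds a similar number of entries."""
    if n_bins < 1 or len(values) < n_bins:
        raise ValueError("need at least one entry per bin")
    ordered = sorted(values)
    n = len(ordered)
    edges = [ordered[0]]
    # Interior edges sit at the i/n_bins quantiles of the sorted sample.
    for i in range(1, n_bins):
        edges.append(ordered[(i * n) // n_bins])
    edges.append(ordered[-1])
    return edges


# Hypothetical sample: 100 evenly spread values standing in for a rapidity variable.
sample = [float(i) for i in range(100)]
edges = equal_population_edges(sample, 4)
print(edges)
```

For a steeply falling spectrum the same function produces wide bins in the tails, which keeps the statistical uncertainty per bin roughly constant.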


  • I am still a bit unhappy that you use Sep2016 and promptreco… Sorry can you remind me here again the details of what prevents you from using a more recent reprocessing of data ?

    • There are two main reasons why we use the Sep2016 data. First, the official jet energy corrections are for 23Sep2016, and JetMET confirmed that they are OK for my analysis. Second, the WPs and SFs for b- and c-tagging were also derived using the 23Sep2016 data.

  • If the event has a c-flavour genjet with pT>40 but pt(Z)<40 GeV is the event classified as Z+light ?? Or is it classified as a bkg event ?

    • The pt > 40 GeV cut is applied to both the Z and the jet, so events in which either the Z or the jet has pt < 40 GeV are not taken into account.

  • N is the total number of MC events in the sample ? It should be total N(positive)- total N(negative) to rescale correctly to the lumi.

    • Yes, the number of events used for rescaling is calculated as the number of positive-weight events minus the number of negative-weight events.
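The normalization described in the answer above can be written as a per-event weight: cross section times luminosity divided by the effective number of generated events, with the sign of the generator weight kept. A minimal sketch; the function name, the luminosity and the event counts are placeholders, and only the 888 pb cross section is taken from this page.

```python
def mc_event_weight(xsec_pb, lumi_pb, n_positive, n_negative, gen_weight):
    """Weight that scales a signed-weight MC sample to the data luminosity."""
    n_effective = n_positive - n_negative
    if n_effective <= 0:
        raise ValueError("effective number of events must be positive")
    # Negative-weight events enter histograms with a negative sign.
    sign = 1.0 if gen_weight >= 0 else -1.0
    return sign * xsec_pb * lumi_pb / n_effective


# Placeholder luminosity (pb^-1) and event counts; 888 pb is the NLO DY + 1 jet value quoted above.
w = mc_event_weight(888.0, 35900.0, n_positive=1_200_000, n_negative=200_000, gen_weight=1.0)
print(f"per-event weight = {w:.4f}")
```

Dividing by the raw event count instead would underestimate the normalization whenever the sample contains negative-weight events.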

  • Fig 11 - Can you comment on the shape differences shown in some of these plots ?
    • The difference between data and MC after cuts on the b/c-tag discriminators is taken into account by applying SFs. But the SFs are calculated for fixed parameters / WPs, so in this case no SFs are applied, since the discriminator distribution itself doesn't correspond to any WP.

  • why DeltaR <0.5 and not 0.4 as customary now with run2 0.4 jets ?
    • This parameter will be changed to 0.4 in next AN version (a remnant from another analysis).

  • 62-68 Explain how you classify the selected events in Z+b, Z+c and Z+light. Here you only specify the c-tagging but I think you first apply b-tagging criteria to classify Z+b events.

    • Events are classified according to central jet hadron flavour. There can be only one jet-tag at a time, c-tagging isn't applied after b-tagging.

  • If the event has a c-flavour genjet with pT>40 but pt(Z)<40 GeV is the event classified as Z+light ?? Or is it classified as a bkg event ?

    • In this case the event is not taken into account at generator level. If there is a Z+c-jet at reco level but either pt(gen Z) < 40 GeV or pt(gen jet) < 40 GeV, the event goes to background. This can be seen in the background plots (figure 18 in AN v6), where events with Z/jet pt close to the threshold are above the threshold at reco level but do not exceed it at gen level.
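The bookkeeping described in the answer above can be sketched as a small classification function. The 40 GeV threshold comes from the answer; the category labels, the function names, and the treatment of events that pass only at generator level (assumed here to feed the acceptance correction) are assumptions made for illustration.

```python
PT_CUT = 40.0  # GeV, applied to both the Z and the jet


def passes(z_pt, jet_pt, cut=PT_CUT):
    """True if both the Z and the jet are above the pt threshold."""
    return z_pt > cut and jet_pt > cut


def classify(gen_z_pt, gen_jet_pt, reco_z_pt, reco_jet_pt):
    """Assign an MC event to a hypothetical unfolding category."""
    gen_ok = passes(gen_z_pt, gen_jet_pt)
    reco_ok = passes(reco_z_pt, reco_jet_pt)
    if gen_ok and reco_ok:
        return "response"    # fills the response matrix
    if reco_ok:
        return "background"  # reconstructed above threshold, generated below it
    if gen_ok:
        return "acceptance"  # generated in the fiducial region but not reconstructed
    return "ignored"


# A Z just below threshold at gen level that migrates above it at reco level.
print(classify(gen_z_pt=38.0, gen_jet_pt=45.0, reco_z_pt=42.0, reco_jet_pt=45.0))
```

The example prints "background", which is the migration case visible near the threshold in the background plots.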

-- AntonStepennov - 2018-10-12

Topic attachments
| Attachment | Revision | Size | Date | Who | Comment |
| data_el.pdf | r1 | 13.7 K | 2019-10-16 - 14:43 | AntonStepennov | |
| data_mu.pdf | r1 | 13.6 K | 2019-10-16 - 14:43 | AntonStepennov | |
| hYstar.pdf | r1 | 18.3 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, c-tag applied, NLO cross sections used |
| hYstarNNLO.pdf | r1 | 18.3 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, c-tag applied, NNLO cross sections used |
| hYstarNoTag.pdf | r1 | 18.2 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, no tag applied, NLO cross sections used |
| hYstarNoTagNNLO.pdf | r1 | 18.2 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, no tag applied, NNLO cross sections used |
| kFactorsFit.c | r1 | 16.8 K | 2018-10-16 - 12:14 | AntonStepennov | K-factors obtained with shapes fit for Ystar with c- and b- tags. |
Topic revision: r19 - 2019-10-16 - AntonStepennov