## Answers (paper draft v1)

### Juan Pablo

• Have you applied the c-tagging SFs you mention on L123 to figure 2?
I understand that this SF is applied to the MC before the unfolding procedure.
• Yes, the c-tagging/mistagging SFs are calculated separately for each event, and the total event weight is multiplied by them before any histogram is filled.

• L10: decays into neutrinos -> decays invisibly into neutrinos
• Fixed

• L21 : There is no need to say that measuring the differential cross section is the main goal (the abstract is already there to mention it). So, I would change
--> The goal of this analysis is the measurement of the differential cross section of Z+c jet production as a function of pT of the Z boson and c jet. This is done in several steps
by
The measurement of the differential cross section of Z+c jet production as a function of pT of the Z boson and c jet is done in several steps.
• Fixed

• L64 : ppinteractions -> pp interactions

• L127 : are neutrinos excluded in the gen level jets ? If that is the case , may I suggest to mention it in the paper draft ? Here is an example of the way I would mention it
" Generator level jets are built from all showered particles after fragmentation and hadronization (all stable particles except neutrinos) and clustered with the same algorithm that is used to reconstruct jets in data "
• I have to check this. In the analysis I don't check the overlap of gen jets with generator-level neutrinos, so if neutrinos are excluded, this is done by the clustering algorithm, not manually in the analysis. This should be checked before it is mentioned in the text.

• L 137: which corrects normalization of bottom -> which corrects for the normalization of the bottom
• Fixed

• L 140 -> along with normalization of charm -> along with the normalization of the charm
• Fixed

• L156 : last sentence is the same as in L133-134. I suggest not to repeat it.
• Fixed

• L169: Efficiency of selections is taken into account by acceptance -> The efficiency of the selection is taken into account by the acceptance
• Fixed

• L170: Nominator distributions is -> The nominator distributions corresponds to the
• Fixed: -> The numerator stands for the ...

• L171: For denominator stands generator... -> The denominator corresponds to the generator...
• Fixed

• L172: Fig 5 shows acceptance ... as a function of ...-> Fig 5 shows the acceptance ... as a function of the ...
• Fixed

• L173 : Efficiency of c-tagging -> The efficiency of c-tagging
• Fixed

• L177 : and then repeating the unfolding procedure, acceptance... -> and then repeating the unfolding procedure. The acceptance ...
• Fixed

• L 191-196 sound quite general to me and I am not sure this is what we are looking for in this particular paragraph but I understand this goes along Joel's comments/suggestions on https://twiki.cern.ch/twiki/bin/view/Sandbox/DifferentialZcJet ( Joel comments : "Described as it is done on correspondign btagging twiki page: methods used for measuring SFs for different types of tag/mistag" ).
• Maybe this part, which describes the methods for measuring SFs, should be moved to the object reconstruction and event selection section? There is a paragraph there that describes the deep CSV algorithm; we can mention that there are scale factors which take the efficiency etc. into account. Then in the uncertainties section we just mention that the efficiency scale factors are varied within their systematic uncertainties.

• L 191: Depending of the type of jet ... -> [again I am not sure this is appropriate for the paper draft]
Different measurements (each of them enriched in the particular flavor of interest) were performed to estimate the data and MC efficiency difference for each flavor of jet passing the c-tagging requirement: for b-quarks a tag-and-probe technique was used on ttbar events, a W+jets sample was used for c-quarks, and an inclusive jet measurement for light jets. Depending on the jet flavor, the corresponding tag/mistag scale factor was varied with respect to the nominal value within the recommended range given by each performance measurement.

• L 212 missing table number
• Fixed

• L 236 Obtained results -> The obtained results
• Fixed

### Elisabetta

• The abstract should be written in proper LaTeX (fix fb-1, mll). In the first line, would it be better to say "a Z boson and at least a jet..."?
• Fixed

• swap references [3] and [4] in References (both in time and sqrt(s) it would fit better)
• Fixed

• line 27: I would not mention here details on Convino, just replace the last line "and to compare with predictions from QCD".
• Fixed, added "...and to compare with predictions from different MC generators"

• 64: fix space in pp interactions
• Not sure what is wrong here, Joel added special character \pp.

• 155: "overlapping": is there a DeltaR cut, or how is it done?
• Fixed: Jets overlapping with one of the two signal leptons from the Z boson within a cone of $\Delta R < 0.4$ are not taken into account.
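
The overlap removal described above can be sketched as follows. This is only an illustration, not the analysis code: the `(eta, phi)` tuple layout and the function names are assumptions made for the example.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance DeltaR = sqrt(deta^2 + dphi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def remove_overlapping_jets(jets, leptons, cone=0.4):
    """Keep only jets with DeltaR >= cone to every signal lepton.

    jets and leptons are lists of (eta, phi) tuples (hypothetical layout).
    """
    return [
        jet for jet in jets
        if all(delta_r(jet[0], jet[1], lep[0], lep[1]) >= cone for lep in leptons)
    ]
```

The wrap of the azimuthal difference matters for objects close to phi = ±pi, which would otherwise look far apart.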

• 170: "The nominator distribution is the generator..."
• changed to

Numerator stands for generator level Z-boson or c-jet $p_T$ distribution ...

• Table 4: too many digits, do you need the 3rd digit after the comma?
• Fixed

• 191: I do not see an "uncertainty" here, only a description of the method.

• 193: fix ttbar
• Fixed

• 196: also here I do not see an uncertainty, but just a description of
the correction

• 204: how large are these uncertainties?
• Added the uncertainty values: 5% for electrons and 2% and 1% for muons. Maybe we should add another table with an uncertainty summary that shows the maximum and minimum deviations up and down (the previous version of the table)?

• 212: fix Table number
• Fixed

• 219: add reference to Convino here
• Done

• 223-224: it still needs more physics and comparison. The PDF used in the MC
should be quoted. Is there a problem to add MCFM with different PDFs at least at
parton level, like in SMP-19-004 (Duong's paper)?

• Somewhere in the figure 6 there should be the kinematical cuts, but we can rediscuss this
at the pre-approval, as everybody has different opinions on this. I still did not understand
from the paper and from your explanation in the twiki to which gen jets you correct,
for instance do they have some kinematic cuts in eta or not? And the gen leptons, do they
have eta cuts? Your acceptance around 20% makes me think that you have some more kinematic cuts on
both jets and leptons than what written at lines 151-157. I think this should be clear, both in the
pre-approval presentation and in the paper, to what you correct to.
• Yes, the cut on eta for gen jets wasn't mentioned in the draft; now fixed. The kinematic cuts for leptons and jets are close to those used for the detector-level selection. The small acceptance is caused by the small fraction of c jets passing the tight c-tagging requirement.

• Figure 6: caption should be more extensive and explain better lines, uncertainties. kin cuts, etc.

• ref. [10] still authors name written in different style.
• Fixed

• fix Sj\"ostrand name in ref. [19]
• Fixed

## Answers (paper draft v0)

### Elisabetta

• Abstract: it has to be longer. I suggest that you start like the first paragraph that you have in the conclusion now, and you end saying that the resulting differential cross sections are compared to predictions from various Monte Carlo models.
• updated

• page 1: something happened to Fig.1, last time I saw it it was ok.
• compiled without problems, could be some temporary bug

• line 53: is alphas(mZ) really 0.130? Also what do I learn from the two matching scales of 19 GeV and 30 GeV
• It seems that the central value of alphas(mZ) in NNPDF 2.3 [21] is 0.119. Fixed.

• Section 5: I would leave lines 132-138 as they are now, but change: 6 Background subtraction in 5.1 Background subtraction 7 Unfolding procedure in 5.2 Unfolding procedure
• These 3 sections (5, 6 and 7) were changed a bit: chapter 5 was small and became part of the introduction, while background subtraction and unfolding are now chapters 5 and 6 respectively. In my opinion background subtraction is a separate step, independent of the unfolding procedure, and therefore belongs in a separate chapter. What do you think?

• 178: I think that there is some confusion between acceptance and efficiency. Acceptance for me would mean correct to the overall kinematic region, while efficiency is the correction inside the kinematic region, but it is a question of taste. I think you mean the correction to your kinematic region at gen level.
Anyway:

1) here it is signal at reco/signal at gen

2) In the note at lines 250, 251, it is:
signal reco+gen/signal gen

3) and in Figure 34 of the note again another definition which
I do not quite understand

So what was done exactly? Independently of how you call
it, it should be correct. So the unfolding takes into
account resolution effects from one bin to another one
and so migrations. But still I would say that rec/gen
is still the correct definition, or not?

• There are 4 possible pt distributions: 1) signal gen pt, without any reco-level requirements; 2) signal gen pt for events matched to the corresponding reco-level objects that pass our reco-level selection criteria; 3) reco-level pt of the Z or c-tagged jet passing the reco-level criteria, without any gen-level requirements; 4) reco-level pt of the Z or c-tagged jet for events that are not matched to signal events at gen level. The ratio 2)/1) is defined as the acceptance. The ratio 4)/3) is defined as the background. Reco-level selected events are multiplied by (1 - background), then unfolded using the response matrix, then divided by the acceptance.
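
The chain of corrections described above can be illustrated with a two-bin toy. This is a sketch under stated assumptions: all numbers are invented, and plain matrix inversion stands in for the actual unfolding algorithm, which is not specified here.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def correct_to_gen_level(reco, background, response, acceptance):
    """Apply (1 - background), unfold, then divide by the acceptance."""
    signal_reco = [n * (1.0 - b) for n, b in zip(reco, background)]
    # Toy unfolding: invert the response matrix (assumption, for illustration).
    unfolded = mat_vec(invert_2x2(response), signal_reco)
    return [n / a for n, a in zip(unfolded, acceptance)]

# Invented toy yields for the four distributions, two pt bins each.
gen_all = [100.0, 50.0]        # 1) signal gen pt, no reco requirements
gen_matched = [20.0, 15.0]     # 2) signal gen pt, matched to selected reco objects
reco_all = [28.0, 20.0]        # 3) reco pt passing the reco selection
reco_unmatched = [7.0, 6.0]    # 4) reco pt, not matched to gen-level signal

acceptance = [m / g for m, g in zip(gen_matched, gen_all)]       # 2)/1)
background = [u / r for u, r in zip(reco_unmatched, reco_all)]   # 4)/3)
# response[i][j]: probability for a matched event in gen bin j to land in reco bin i.
response = [[0.9, 0.2], [0.1, 0.8]]

result = correct_to_gen_level(reco_all, background, response, acceptance)
```

In this toy the corrected yields in `result` recover `gen_all`, which is the same kind of check a closure test performs.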

• Figure 5: I think this is all simulation, so it should be marked as CMS simulation
• Fixed

• 184-185: if I understood from the note, you not only repeat the unfolding, but before also the extraction of the scale factors for charm and beauty and this should be written.
• Fixed
• Table 5: I am also confused by table 5, I thought I understood it but now I am not sure. Could it be that you vary something up or down and the numbers indicate the range of variations in the bins of that variable? Whatever it is, it should be clarified and maybe reformatted, for instance exchanging columns with rows and putting as rows i.e. QCD down variation, QCD up etc....
• changed to integral difference from central value

• 216: here N_i is the number of corrected events, but where is the acceptance correction here? Why not defining an A_i symbol and add it in the formula?
• Definition of N_i fixed: now it is the number of events in bin i of unfolded distribution. This already takes into account acceptance.

• 219: "The results are extracted separately for the muon and electron channels..." maybe add that they are compatible? ".. and combined by a fit using the Convino [33] tool..., taking into account the statistical and correlated/uncorrelated systematic uncertainties..."

• Being not familiar with the Convino tool, it would be good to have some details in the note. For instance some uncertainties are correlated (i.e. c-jet energy scale), some not (leptons,..) and I guess that this has been taken into account.

• 225: Of course more discussion is needed to complete this part.

• 227: "production" misspelled
• fixed

• 237: It would be good to have a final sentence that these data will be useful to constrain charm PDFs or something like this.
• Added a general sentence saying that the existing constraints can be improved.

• Acknowledgments missing

• References: [9] some problems in the names, not clear why the first names appear

### Joel

• - (l.119) I have guessed a bit here: still needs some work e.g. do you actually check the quark originator or just go by hadron flavour?
• Yes, we use the hadron flavor definition; that's what was recommended at the SMP VJ meetings.

• - (l.131) Could this go as a paragraph in the introduction?
• That's a good idea, this section is very small, I put this to the end of introduction section.

• - (l.140) It is not clear what you do with $SF_c$. Is it just for display purposes e.g. Fig 3?
• The charm SF is used to show the agreement between data and MC after applying it, along with SFb, to the corresponding flavor components. For the unfolding, only SFb is used, to normalize the bottom component.

• (l.162) I think you probably need to mention the matching here

• Fixed.
• - (l.163) What happens if you have multiple c-tagged jets?
• We take into account only the leading central c-tagged jet at detector level, and only leading central c-jet at generator level.

• - (l.170) Is this (background) calculated from data or just from simulation?
• Background definition uses generator level information, so it can be calculated only from simulation (MC).

• (l.171) The next two paragraphs still need work.

• (l.176) Is efficiency incorporated into the response matrix?
• Not in our case. The response matrix describes how a spectrum is changed by the detector resolution: it maps one distribution onto another while keeping the integral unchanged. Efficiencies are taken into account in the acceptance.
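
A minimal sketch of why such a response matrix leaves the integral unchanged, assuming it is built from matched events only and each gen-bin column is normalised to 1 (the helper names and toy counts are hypothetical):

```python
def normalise_columns(migration_counts):
    """Turn counts[i][j] (reco bin i, gen bin j) into P(reco bin i | gen bin j)."""
    n_rows = len(migration_counts)
    n_cols = len(migration_counts[0])
    col_sums = [sum(migration_counts[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [
        [migration_counts[i][j] / col_sums[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]

def fold(response, gen):
    """Smear a gen-level spectrum into reco bins."""
    return [sum(r * g for r, g in zip(row, gen)) for row in response]

# Toy migration counts: rows are reco bins, columns are gen bins.
counts = [[90.0, 20.0], [10.0, 80.0]]
response = normalise_columns(counts)
smeared = fold(response, [20.0, 15.0])
```

Because every column sums to 1, events only move between bins; losses from selection and c-tagging have to be handled separately, here through the acceptance.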

• - (l.177) Do we really need to define both acceptance and efficiency?
• Acceptance is the part which takes into account different efficiencies (selection, c-tagging, etc.)
• (l.189)Should mention how much the scales are varied

• Fixed: mu_r and mu_f varied within 0.5 - 2
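
A sketch of how such a variation is commonly turned into an uncertainty, assuming the conventional envelope over the factor-2 variations with the two opposite extremes excluded. That convention is an assumption here; the text above only states the 0.5 - 2 range.

```python
# (mu_r, mu_f) multipliers: all combinations of 0.5, 1, 2 except the nominal
# point and the two cases where the scales are varied in opposite directions.
SCALE_POINTS = [
    (mr, mf)
    for mr in (0.5, 1.0, 2.0)
    for mf in (0.5, 1.0, 2.0)
    if (mr, mf) != (1.0, 1.0) and mr / mf not in (0.25, 4.0)
]

def scale_envelope(nominal, varied):
    """Per-bin (up, down) shifts from the envelope of the scale variations.

    nominal: list of bin contents.
    varied: dict mapping each (mu_r, mu_f) point to its list of bin contents.
    """
    up, down = [], []
    for i, nom in enumerate(nominal):
        shifts = [varied[point][i] - nom for point in SCALE_POINTS]
        up.append(max(0.0, max(shifts)))
        down.append(min(0.0, min(shifts)))
    return up, down
```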

• (l.190)Please complete - the important thing is what prescription is used, not the technical detail that this is done via weights

• Described the way it is done in other papers:

The PDFs are determined using data from multiple experiments. The PDFs therefore have uncertainties from the experimental measurements, modeling, and parameterization assumptions. The resulting uncertainty is calculated according to the prescription of CT14 at the 90% confidence level and then scaled to the 68.3% confidence level.
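
As an illustration of the rescaling mentioned above, the sketch below uses the symmetric Hessian formula and a Gaussian conversion factor of about 1.645 between the 90% and 68.3% confidence levels. The eigenvector-pair input format is an assumption made for the example.

```python
import math
from statistics import NormalDist

# A two-sided 90% CL interval corresponds to about +-1.645 sigma for a Gaussian.
CL90_TO_CL68 = NormalDist().inv_cdf(0.95)

def hessian_pdf_uncertainty(eigenvector_pairs):
    """Symmetric Hessian uncertainty: 0.5 * sqrt(sum_k (X_k^+ - X_k^-)^2).

    eigenvector_pairs: list of (plus, minus) predictions, one pair per eigenvector.
    """
    return 0.5 * math.sqrt(sum((plus - minus) ** 2 for plus, minus in eigenvector_pairs))

def pdf_uncertainty_68(eigenvector_pairs):
    """Rescale a 90% CL Hessian uncertainty to the 68.3% CL."""
    return hessian_pdf_uncertainty(eigenvector_pairs) / CL90_TO_CL68
```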

• (l.193) Again, needs a description of how the values were estimated, not the technical detail of weighting

• Described as it is done on the corresponding btagging twiki page: methods used for measuring SFs for different types of tag/mistag.

• (l.212) I have to admit I can't work out what is going on with the table - perhaps a different format is needed?

• New table added, as suggested by Elisabetta and Juan Pablo; it shows the integral deviation from the central value in %.

• - (l.225) Needs a discussion of the results/comparisons

• We're still waiting for Sherpa sample to be added, maybe we should add this discussion after all 3 signal models are there

• - (l.232) Isn't there a different cut on the lower lepton pt?
• Yes, that's a mistake: the subleading lepton has pt > 10 GeV.

## Answers (paper draft)

### Elisabetta

• Title and abstract are missing

• Introduction, in my opinion it should be structured in 3 paragraphs:
- why Z+c is interesting, you have it already
- previous measurements
- this measurement, what is new. Also I am not sure that you need all details of all
kinematic cuts at this point.
• Fixed

• Fig. 1, there is a strange gray background. I like this diagram when it is drawn more "rectangular" .
• Fixed

• Lines 108-109. when you talk about c- b-tagging, this part should be expanded and the "tight" point should be defined, usually this is given in terms of the fake rate.
• Fixed

• 113-121 I would move this part on the generator level later, when you talk about unfolding. Please also specify that the leptons are dressed and if parton or particle level jets.
• Fixed. It is now specified that the generator lepton pt is corrected to take into account photons radiated in a cone of radius dR = ...

• I would make 5.1 its own section and describe how you extract the c-component more in detail, i.e. from a fit to the M_SV distribution, which btw has also to be defined precisely. The name k-factor at line 134 is confusing and probably you also do not need it, you can avoid it or call it with another name.
• Fixed. K-factors were replaced by SF_c and SF_b.

• Fig. 2: you show the pre-fit distribution I guess. Why not the one after the fit? Also:
- use less bins
- use CMS style for figures (see guidelines), all labels must be bigger, CMS is missing on the plot, same
for lumi and sqrt(s), i.e. follow guidelines.
• Fixed

• Then after explaining the fit to extract the c-jet contribution, you can go back to the beginning of Section 5 and explain the cross section that you want to extract, line 123-126 and that you do everything in bins of ptZ, pt-cjet.
• This is now explained twice: there is a short chapter which gives an overview of the analysis strategy, then the following chapters describe the process of subtracting backgrounds, unfolding, and measurement of the cross section using the unfolded distribution.

• Lines 126-129 I would move them to a new section and there also explain the gen cuts you have now at lines 113-120.
• Fixed

• In summary:
- section: first explain the fit
- section: explain which cross section you measure in bins, eventually other backgrounds like top etc.
- section: then section on unfolding to gen level and explain what gen level is
• Fixed

• Your captions are also all not CMS style, there should be only 1 caption explaining all (a) (b) (c)...
• Fixed

• Figure 3: do you need it? Can these numbers and uncert. be in a table?
• Plot removed, k-factors presented in table

• Figure 4: too many bins, please reduce - use CMS style etc.
• Fixed

• Figure 5: do you need it? why not put the numbers in a table? It is clear also that the background shoots up at low ptZ, it needs some explanation in the text. Figure 6: do you need or can the numbers be in a table, i.e. a combined table with k-factors, background and acceptance? The shape of the acceptance needs an explanation in the text.
• There are too many bins for a table; showing them in a plot may be more compact than a table in this case.

• Section 5.3, make it its own section. Do not make a subsubsection for each systematic, just paragraphs. For the c-tagging efficiency, scale factors are mentioned, but they were not mentioned before; this must first be mentioned in the selection. The same goes for lepton and b-jet scale factors, and it must also be written how they are determined (you can find it in many other papers). The ttbar background is mentioned here for the first time; it should also appear earlier.
• Fixed

• Result, should be its own section. Formula (1) should have N(p_tbin) and not dN/dpt in the numerator and all the formula could be written better. What about distributions also in eta, no intention to produce them?
• Fixed

• Physics is missing! Comparison to MC with a couple of PDFs, with details on them, especially on the HF scheme.
• PDFs uncertainties will be added to two madgraph models. Sherpa event generator is to be added.
• Figure 7, again not in CMS style, it has to be redone. In addition I find it confusing that in the ratio the dots indicate MC
• Fixed

### Juan Pablo

• Fig 6. I think you will be asked to add some uncertainty to the predictions ( typical are statistical , PDF and scale variations [but Z+c at 8 TeV has no scale in the LO calculations] ) so start working on it (whenever you have spare time ... not priority for now... the priority is just adding/modifying the text).
• In progress

• Fig. 6 again: Do you guys have an idea why the LO gives a better normalization (may be not in shape) while we see that NLO gives always better performance in our Z+jets (you do not have to know the answer of course, just open question) ? Is this something we do not understand at GEN level, gluon-splitting related may be?
• In progress

• L16. Jets with charm quark content are identified using (standard?) charm tagging methods developed in CMS [reference] where the presence of c quarks is inferred from the characteristics of jets (denoted as c jets) that originate from their hadronization products and subsequent decays.
• fixed

• L 50. This generator calculates LO matrix elements for five processes: pp -> Z + Njets with N = 0...4.
• Fixed

• Section 3. Forgot to mention that the predictions use PYTHIA for the hadronization.
• Fixed

• L113 : may be add a reference ( see reference 37 in Dan's paper above) : CMS Collaboration, “Measurement of the Inclusive W and Z Production Cross Sections in pp Collisions at $\sqrt(s)$ = 7 TeV ”, JHEP 10 (2011) 132, doi:10.1007/JHEP10(2011)132, arXiv:1107.4789.
• Fixed

• L116: "algorithm [25], using tight working point, which ... passing this criteria" . Working point is jargon. Remove and instead put -> algorithm [25]. The threshold applied to discriminate c-jets from b-jets and light-jets gives a c tagging efficiency of about 30% and a misidentification probability of 1.2% for light jets and 20% for b jets.
• Fixed

• L128. Please mention that a generator level leptons are dressed.
• Fixed

• L 218 :feducial
• Fixed

• L219: comment a bit about agreement disagreement seen in shape/normalization with different predictions. I think NLO is better in shape than LO but LO is better in normalization than NLO, right ?
• In progress. There will be also sherpa event generator, once all 3 generators are compared, we'll add conclusion, which one describes data better. We're also checking predictions of number of jets at gen level for different flavors to find out what could cause the difference. Will be added to AN soon.

• Did you evaluate the LO cross section to next-to-next-to-leading order (NNLO) calculation computed with FEWZ[*]? If so , mention it .
• NNLO xsection value was used for both generators (5765 pb). Fixed.

• L17-19: you define you fiducial region here, can Z-ee and Z-mumu be combined when having different pt_lepton cut ? I guess so because in L120 the fiducial pt cut is 26 GeV
• Same cuts were used for leptons at generator level in both channels.

• L111 , 112 : different properties of the jet, such as secondary vertex and tracks -> put here that it accounts for displacement and long lifetime of particles w.r.t. light but no so long as b (I might come with a suggestion if I do not forget about it)

• Fig 4. Too course binning here. Use less bins ( in fact I would just use the same number of bins as in fig. 3).
• Fixed

• L161: at detector level -> I would say at reconstruction level (sometimes I use detector level to refer to gen level but may be it is just me)
• Fixed

• L195: this is the first time you talk about lepton scale factors ( mention in section 4 what they are: lepton identification, isolation, trigger etc with, mention how they are computed :tag and probe with the Z,and mention how this is used your analysis: via weights and add a reference etc)
• Fixed

• L208: channels were combined by a fit -> which fit ? I guess convino as you mention in L129. Can you put convino as the reference there in L208?
• Fixed

• L209: taking into account statistical and theoretical uncertainties -> you should also consider systematic uncertainties in the combination (as recommended by the stats. committee, if I am not wrong). Did you get in contact with the stats. committee already? In other words, did you fill in the stats. questionnaire? Ask them in case you have doubts. We talked about this in your last presentation.
• The stats. committee recommended using Convino, which takes into account both stat. and syst. uncertainties. The stats. questionnaire will be filled in soon.

• Fig 7.: I do not like the fact that your k-factor binning is not the same as the final binning. The bottom k-factor does not seem to be flat with pt(c-jet) on figure 3 (b, top plot)
• K-factors can't have binning as fine as the pt distributions, because fitting SVM for the k-factors requires more statistics.

### Elisabetta

• at line 55 there is a cut of 20 GeV on the jets, while it is 40 GeV at line 49, any reason for that?

• 20 GeV is a threshold for muons, which are checked for presence inside the jet.
• for the cuts on lines 66,67 for the discriminants, is there any study, justification how they were chosen?

• I am not sure that I understand the data-MC comparison in Fig. 9b and in Fig 10b. In Fig 9b the data agree with the overall sum of MCs well. In Fig. 10b they agree less and the figure caption does not help. Is Fig. 10b after applying the kMC factors? Because the agreement looks worse.

• In fact 9b and 10b are two different plots: 9b shows the comparison between all data and MC, while 10b shows (data - Top/Dibosons) and Drell-Yan. However, the plots in figure 9 were produced with the wrong Drell-Yan normalization: for these two figures I used the NNLO Drell-Yan cross section values, 4578 pb, 851 pb and 335 pb for DY 0, 1 and 2 jets (I was trying to reproduce Duong's results with NNLO and forgot to change the cross sections back to NLO while making these two plots), while for the rest of the plots in the AN the standard (NLO) cross section values were used: 4754 pb, 888 pb and 348 pb for DY 0, 1 and 2 jets. I'll replace the plots in figure 9 in the new version. Two versions of Ystar with no tag and c-tag, with NLO and NNLO cross sections, are in the attachment.
• In general, the method to extract the kMC factors is based only on number of events. It would be much better to take a distribution which is sensitive to c- and b-tagging and fit that as sum of the 3 components to extract these kMC. I would recommend to try it, it should not be very complicated.

• We used RooFit to obtain the k_MC factors from a shape fit (see kFactosFit.c in attachment). It was done as a simultaneous fit of two distributions, Ystar with b- and c-tags, with the k_MC factor for the light component fixed to 1. As a result, the k_MC factors for the b and c components were equal to 0.78 and 1.03 respectively, which is consistent with the results obtained by solving the equations with numbers of events.
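
For reference, the counting method mentioned above can be sketched as a 2x2 linear system, assuming one equation for the c-tagged and one for the b-tagged yields with the light normalization fixed to 1. The function name, dictionary layout and toy yields are invented for illustration and are not the numbers from the AN.

```python
def solve_k_factors(data, light, charm, bottom):
    """Solve for (k_c, k_b) with the light k-factor fixed to 1.

    Each argument is a dict with 'ctag' and 'btag' event yields; 'data' is
    assumed to already have the Top/Dibosons contribution subtracted.
    The equations are, for each tag category:
        data - light = k_c * charm + k_b * bottom
    """
    rhs_c = data["ctag"] - light["ctag"]
    rhs_b = data["btag"] - light["btag"]
    det = charm["ctag"] * bottom["btag"] - bottom["ctag"] * charm["btag"]
    # Cramer's rule for the 2x2 system.
    k_c = (rhs_c * bottom["btag"] - bottom["ctag"] * rhs_b) / det
    k_b = (charm["ctag"] * rhs_b - charm["btag"] * rhs_c) / det
    return k_c, k_b

# Toy yields built so that the exact solution is k_c = 1.1, k_b = 0.8.
toy_charm = {"ctag": 1000.0, "btag": 200.0}
toy_bottom = {"ctag": 400.0, "btag": 900.0}
toy_light = {"ctag": 300.0, "btag": 50.0}
toy_data = {"ctag": 1720.0, "btag": 990.0}
k_c, k_b = solve_k_factors(toy_data, toy_light, toy_charm, toy_bottom)
```

Unlike the shape fit, this version uses only the integrated yields in each tag category, so it carries no information from the Ystar shape.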

• in the closure test, are the same events used or which events are used?
• Yes, one sample was used to calculate the response matrix, background and acceptance and also for the closure test. The result of applying the unfolding procedure to another sample in the closure test would a priori not coincide with the generator-level distribution from the original sample, so it would be impossible to say whether the difference is caused only by statistics or by errors in the unfolding procedure.

• I am surprised that the pileup has such a large effect on the last 2 bins for the c-jets, unless it is just statistics.

• This effect is seen for most of the uncertainties, not only pileup, because of the limited statistics.

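• What would happen if, at page 14, in the first formula, you took
$N_{\text{data} - \text{Top/Dibosons}}^{\text{light-tagged}} = k_{\text{light}} \, N_{\text{DY,light}}^{\text{light-tagged}} + k_c \, N_{\text{DY},c}^{\text{light-tagged}} + k_b \, N_{\text{DY},b}^{\text{light-tagged}}$ ?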
• Light tagging requires anti-b and anti-c tagging. However, there are no anti-tag SFs, so the numbers of events in these modified equations may be incorrect, and the solution of the equations would then not be correct.

• it would be good to have comparisons before and after k-factors, similar to Figs
14-16 of the note, for all the possible cases, for the moment for instance I do not
understand what Fig. 19 is, are k-factors applied?
• These figures show SVM before applying k-factors, additional plots and descriptions added in AN.

• It is indeed not good that using the SVfit mass the k-factors come out so
different including or not SV jets. Concerning the tagger, did you get any feedback
from BTV on the best tagger to use for c-jets?
• There is feedback from Juan Pablo, who suggested, that there are problems with modeling of reconstruction of SV, and he also has no objections to method 1 - equations solving method.

• make the selection similar to Duong's selection and compare to their k-factors

• at line 209 you write that the light fraction has a normalization fixed to 1, I can't recall what Juan Pablo does. What happens if you leave it free, is it not possible to constrain it, maybe due to the different shape? Do you have plots in the note showing results of the fit?

• The result of the fit for the light component is close to 1 (if one doesn't fix it), so this component was kept fixed. In other analyses it was done the same way; as I understand, Duong also finds k-factors only for charm and bottom. Figures 27 and 28 show the agreement between data and MC after applying these k-factors. There are also plots 25 and 26, which show the measured k-factors as functions of the pt of the Z boson or the c-tagged jet.

• I understand from the answer to Juan Pablo that you do not have a pt cut on the generated leptons. First of all I guess now you are correcting back to dressed leptons. Then it is always better to correct back to a fiducial region which is close to the experimental one, not to have huge unknown acceptance corrections. It would be good to add in the note the exact fiducial region and in principle genjets and genleptons should have kinematic pt, eta cuts similar to the reco ones, + the invariant mass Mll gen cut is needed.

• We have added lepton pt and eta cuts close to those used at reco level: leading pt > 26 and subleading pt > 10. In the new AN version it is stated that we measure the fiducial cross section.
• It would be good to understand some of your systematics like:

- Fig. 44 (c) - why pt c-jet high at high pt and only in muon channel

• It seems that in the last bins there may be large statistical fluctuations: if there are few events, a change of one parameter can lead to a large change in the distribution. The plots data_mu and data_el in the attachment show how these fluctuations appear in the ratio of the varied distribution to the central distribution.
• - Fig. 45 (b) - why pileup high only in electron channel

• To be understood.
• - Figs 46 (b) and (d) what is happening in the eID so bad compared to the muon ID?

• It must be an error: for electron pt~65 and eta~-1.5 the efficiency and error (GetBinError) are equal to 1! So the weight changes by 100%. Will ask electron pog. Update: For electrons SFs depend not on electron eta, but on electron supercluster eta, so that bin should be skipped.

### Juan Pablo

• +) about the c-tag/mistag efficiency. You explain in eq. 5 how you apply the weight for the SFc as recommended. Let me explain what I do on my code.

Let's imagine that the SFc for your ctagger-T is 0.92 +/- 0.06 +/- 0.01 (overall, then there is the "file" in bins of pt of the jet but for simplicity let's get the single number here). What I do is weightMC *= 0.92; This lowers the MC and improves my data/MC agreement. Could you please check that this is equivalent to the procedure you describe and follow ?

• Eq. 5 looks a bit complicated because there are 2 weights that improve the data/MC agreement: one improves the agreement for the B-F samples and the other takes into account the difference between data and MC in G-H. Since there is only one set of MC samples, the weight applied to weightMC is built from these two weights, in proportion to the luminosity of each subset; that is what eq. 5 says. In the case of a tag (when a c jet passes the c-tag) there is no such partition, so the MC weight is simply multiplied by the corresponding tag efficiency SF.
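
The luminosity-proportional combination described above can be sketched as follows. The function name and arguments are hypothetical; the actual per-period luminosities and scale factors are not given here and must be supplied by the caller.

```python
def lumi_weighted_sf(sf_bf, sf_gh, lumi_bf, lumi_gh):
    """Combine the B-F and G-H scale factors in proportion to luminosity.

    sf_bf, sf_gh: scale factors measured for the two data-taking periods.
    lumi_bf, lumi_gh: integrated luminosities of the two periods (same units).
    """
    return (lumi_bf * sf_bf + lumi_gh * sf_gh) / (lumi_bf + lumi_gh)

def apply_sf(weight_mc, sf_bf, sf_gh, lumi_bf, lumi_gh):
    """Multiply the MC event weight by the combined scale factor."""
    return weight_mc * lumi_weighted_sf(sf_bf, sf_gh, lumi_bf, lumi_gh)
```

With a single period, or identical scale factors in both, this reduces to the plain `weightMC *= SF` described in the question.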

• You use "HLT Ele27 WPTight Gsf" for the Z->ee. When I use the W-> e for charm tagging purposes I go up to the "HLT Ele32" because I do not have to deal with unprescaling (may be you do not have to either at "HLT Ele27", did you cross check ?)

• According to

https://www.epj-conferences.org/articles/epjconf/pdf/2018/17/epjconf_icnfp2018_02037.pdf

HLT ele27 is unprescaled. As I understand it, that means that each time the trigger fires the event is recorded, so there are no SFs to be applied to account for a lowered event recording rate.

• Your offline cut is 28 GeV. The usual practice is to leave 2 GeV to make sure you are far from the trigger turn-on. Beware you might be asked during the publication process to move your offline cut to the standard "trigger_cut + 2" GeV.

• The electrons part of the analysis is to be done with 29 GeV cut, to fit the usual way of selecting electrons with this trigger.

L120: "signal muons or muons, which ..."
Regarding the purpose and details of the muon-jet cleaning procedure: is this to remove Tight-ISO-muons in a delta_R (jet,Tight-ISO-muon) < 0.4 ? That is fine, but just to make sure: this is not to remove Tight-NONISO-muons in a delta_R (jet,Tight-NONISO-muon) < 0.4? If you also remove NONISO, it is up to you, but you must already know that in the range 15 GeV < pt_NONISO_muon < 25 GeV you have a chunk of signal. Nothing to worry about though.

• We remove only jets, which overlap with isolated muons. In this case we don't remove signal events.
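
A cleaning step of this kind can be sketched as below. This is an illustration only: the dictionary keys ("eta", "phi", "isolated"), the function names and the default cone size of 0.4 (taken from the question above) are assumptions, not the analysis implementation.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def clean_jets(jets, muons, max_dr=0.4):
    """Drop jets that overlap with an isolated muon.

    Non-isolated muons are ignored, so jets containing them are kept.
    """
    isolated = [m for m in muons if m["isolated"]]
    return [
        jet
        for jet in jets
        if all(
            delta_r(jet["eta"], jet["phi"], m["eta"], m["phi"]) >= max_dr
            for m in isolated
        )
    ]
```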

Can both results be combined even when having different pt cuts ? What is your fiducial region (I mean, is your cross section defined for leptons with pt > XX GeV or for Z with pt > XX GeV) ? I must have missed it.

• We have switched to a new signal definition, which includes pt and eta cuts on the leptons (leading pt > 26 and subleading pt > 10) to match the reco-level selections. Thus we measure the fiducial cross section of the process.

### Isabel

• Do you apply any matching between the trigger object (the one that fired the single muon trigger) and the reconstructed muons ?
• No, there is no matching between the trigger object and the muons. To take into account the trigger efficiency for two muons, a combinatorial formula was used.
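
The answer does not spell out the combinatorial formula. A common choice for a single-muon trigger in a dimuon event, shown here purely as an assumption, is the probability that at least one of the two muons fires the trigger, treating the two muons as independent. The function name is hypothetical.

```python
def dimuon_trigger_efficiency(eff_mu1, eff_mu2):
    """Probability that at least one of two muons fires a single-muon trigger.

    Assumes the per-muon trigger efficiencies are independent.
    """
    for eff in (eff_mu1, eff_mu2):
        if not 0.0 <= eff <= 1.0:
            raise ValueError("efficiencies must lie in [0, 1]")
    return 1.0 - (1.0 - eff_mu1) * (1.0 - eff_mu2)
```

A data/MC scale factor would then be the ratio of this event-level efficiency evaluated with the data and with the MC per-muon efficiencies.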

• What is your definition of a c (b) jet at generator level (L 73)?
• We use the hadron flavour for gen jets, obtained with the same algorithm as for reco-level jets. In the .py file:
from PhysicsTools.JetMCAlgos.AK4PFJetsMCFlavourInfos_cfi import ak4JetFlavourInfos
process.genJetFlavourInfos = ak4JetFlavourInfos.clone(
jets = cms.InputTag("ak4GenJets")
)

genParticles = cms.InputTag("ak4GenJets"),
jetFlavourInfos = "genJetFlavourInfos"
)

and inside .cc file:

(*genJetFlavourInfos)[genjetref->refAt(ijet)].getHadronFlavour()

• In the case of b(c)-tagging you do not correct data and MC separately but keep the analysis at the (let's say) uncorrected data level and apply the corresponding b(c) tagging SF. I find this treatment quite asymmetric. I would treat lepton and b(c) tagging efficiencies on the same footing.
• Muon ID, isolation and trigger efficiencies depend on the data samples, so the data were reweighted according to this dependence. The c- and b-tag efficiencies and mistag rates also depend on the data samples; however, it is impossible to calculate these efficiencies separately for data and MC (we can't define the hadron jet flavour for data), so the scale factors for MC were composed of two scale factors corresponding to the two sets of data samples.

• Are b(c) tagging SF applied in Figs. 4 to 8 ? Specify it in the text/caption.
• Yes, the c-tag/mistag SFs are applied in these plots; this will be specified in the text.

• Maybe you can test the ttbar MC description with a control sample in the emu channel.
• We didn't save muons in tuples, so this can't be done soon.

• Can you describe in detail how you are treating systematic uncertainties in the c(b) tagging (light mistagging) scaling factors ? You are following the recommendations of the b-tagging group, but it would be good to have it explained also here. How do you treat correlations among the different SF ?
• All the necessary formulas and the conditions on pt, eta and discriminator can be found here: https://twiki.cern.ch/twiki/bin/viewauth/CMS/BtagRecommendation80XReReco. The systematic uncertainties are evaluated by replacing every formula used in the calculation of the tag/mistag SFs with the formula corresponding to the up or down variation. For example, to get the distributions with the SF uncertainty varied up, the weight for an event with a c jet is calculated with the formula from https://twiki.cern.ch/twiki/pub/CMS/BtagRecommendation80XReReco/ctagger_Moriond17_B_H.csv for the "comb" measurement type and the tight working point, selected according to the pt of the c jet. We don't take into account correlations between different SFs.
Update: I found an error in my code: the c-mistag scale factors for b jets were equal to 0 in some cases. A detailed description of how the SFs are calculated has been added to the AN.
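
The up/down procedure described in this answer can be sketched as follows. Everything in the block is illustrative: the table rows, the pt bin edges and the SF values are invented placeholders, not entries from the ctagger CSV file, and the function names are hypothetical. The sketch only shows the mechanism of picking a formula by variation label and jet pt.

```python
# Placeholder table: one formula per (variation, pt bin). Numbers are invented.
EXAMPLE_ROWS = [
    {"sys": "central", "pt_min": 20.0, "pt_max": 100.0, "formula": lambda pt: 0.95},
    {"sys": "up", "pt_min": 20.0, "pt_max": 100.0, "formula": lambda pt: 1.00},
    {"sys": "down", "pt_min": 20.0, "pt_max": 100.0, "formula": lambda pt: 0.90},
    {"sys": "central", "pt_min": 100.0, "pt_max": 1000.0, "formula": lambda pt: 0.90},
    {"sys": "up", "pt_min": 100.0, "pt_max": 1000.0, "formula": lambda pt: 0.98},
    {"sys": "down", "pt_min": 100.0, "pt_max": 1000.0, "formula": lambda pt: 0.82},
]


def select_sf(rows, sys_type, jet_pt):
    """Evaluate the formula whose variation label and pt range match the jet."""
    for row in rows:
        if row["sys"] == sys_type and row["pt_min"] <= jet_pt < row["pt_max"]:
            return row["formula"](jet_pt)
    raise LookupError(f"no '{sys_type}' formula covers pt = {jet_pt}")


def event_weights(rows, base_weight, jet_pt):
    """Return the event weight for the central value and both variations."""
    return {
        sys_type: base_weight * select_sf(rows, sys_type, jet_pt)
        for sys_type in ("central", "up", "down")
    }
```

Filling one set of histograms per key of the returned dictionary gives the central, up and down distributions, with no correlation between different SFs taken into account.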

• Fig. 19 (acceptance) it has a funny shape, can you give more details of which cut(s) are most relevant in the different pT regions ?
• This shape is similar to the shape of the c-tag efficiency, so it seems that the largest contribution comes from c-tagging.

• Same for fig. 18. I am a bit surprised for the first point in the left plot. Does it come from unmatched reco dileptons with gen dileptons or from unmatched reco c-jets to gen c-jets ? May I assume a ~100% correlation between fig. 18 left and right ?
• This shape comes from pt migration of the Z / c-jet, so it is expected for the pt of any object, irrespective of any correlation between the two plots.

• As already suggested at the SMP-COM meeting, the sensitivity to pdf should be assessed. Probably a study, similar to the one [1] can be tried. This can also help to define the optimal binning in terms of Yb, Ystar.
• We can try to do this study. The current Yb and Ystar binning is optimal for the differential cross section as a function of Yb and Ystar: the partition was chosen so that the numbers of events are of the same order, and hence the statistical uncertainties are similar, in the different bins.

### Paolo

• I am still a bit unhappy that you use Sep2016 and promptreco… Sorry, can you remind me here again of the details of what prevents you from using a more recent reprocessing of data ?

• There are two main reasons why we use the Sep2016 data. First, the official version of the jet energy corrections is for 23Sep2016, and JetMET confirmed that it is fine for my analysis. Second, the WPs and SFs for b- and c-tagging were also derived using the 23Sep2016 data.

• If the event has a c-flavour genjet with pT>40 but pt(Z)<40 GeV is the event classified as Z+light ?? Or is it classified as a bkg event ?

• The pt > 40 GeV cut is applied to both the Z and the jet, so events in which either the Z or the jet has pt < 40 GeV are not taken into account.

• N is the total number of MC events in the sample ? It should be total N(positive)- total N(negative) to rescale correctly to the lumi.

• Yes, the number of events used for the rescaling is calculated as the number of positive-weight events minus the number of negative-weight events.
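
The normalisation described here can be sketched as below. The per-event scale cross section x luminosity / N is the standard form and is assumed here, not quoted from the AN; the function names, argument names and units are hypothetical.

```python
def effective_event_count(gen_weights):
    """Number of positive-weight events minus number of negative-weight events."""
    n_positive = sum(1 for w in gen_weights if w > 0)
    n_negative = sum(1 for w in gen_weights if w < 0)
    return n_positive - n_negative


def lumi_scale(cross_section_pb, lumi_inv_pb, gen_weights):
    """Per-event factor that rescales an MC sample to the data luminosity."""
    n_effective = effective_event_count(gen_weights)
    if n_effective <= 0:
        raise ValueError("effective event count must be positive")
    return cross_section_pb * lumi_inv_pb / n_effective
```

Each MC event is then filled with this factor multiplied by the sign of its generator weight.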

• Fig 11 - Can you comment on the shape differences shown in some of these plots ?
• The difference between data and MC after cuts on the b/c-tag discriminators is taken into account by applying SFs. But the SFs are calculated for fixed parameters (WPs), so no SFs are applied here, since the discriminator distribution itself doesn't correspond to any WP.

• why DeltaR <0.5 and not 0.4 as customary now with run2 0.4 jets ?
• This parameter will be changed to 0.4 in the next AN version (it is a remnant from another analysis).

• 62-68 Explain how you classify the selected events in Z+b, Z+c and Z+light. Here you only specify the c-tagging but I think you first apply b-tagging criteria to classify Z+b events.

• Events are classified according to central jet hadron flavour. There can be only one jet-tag at a time, c-tagging isn't applied after b-tagging.

• If the event has a c-flavour genjet with pT>40 but pt(Z)<40 GeV is the event classified as Z+light ?? Or is it classified as a bkg event ?

• In this case the event is not taken into account at generator level. If there is a Z+c-jet at reco level, but either pt(gen Z) < 40 or pt(gen jet) < 40, the event goes to background. This can be seen in the background plots (figure 18 in AN v6), where events with Z/jet pt close to the threshold are above it at reco level but do not exceed it at gen level.
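
The threshold logic of this answer can be sketched as follows. It is a simplified illustration: jet flavour and all other selections are left out, the function name and return labels are hypothetical, and the "gen_only" label for events that pass only at generator level is an assumption, since the answer does not name that category.

```python
PT_MIN_GEV = 40.0  # threshold applied to both the Z and the jet


def classify(reco_z_pt, reco_jet_pt, gen_z_pt, gen_jet_pt, pt_min=PT_MIN_GEV):
    """Classify an MC event by its reco-level and gen-level pt thresholds."""
    reco_pass = reco_z_pt > pt_min and reco_jet_pt > pt_min
    gen_pass = gen_z_pt > pt_min and gen_jet_pt > pt_min
    if reco_pass and gen_pass:
        return "signal"
    if reco_pass:
        # Migrated above threshold at reco level only: counted as background.
        return "background"
    if gen_pass:
        return "gen_only"
    return "rejected"
```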

-- AntonStepennov - 2018-10-12

Topic attachments

| Attachment | History | Size | Date | Who | Comment |
|---|---|---|---|---|---|
| data_el.pdf | r1 | 13.7 K | 2019-10-16 - 14:43 | AntonStepennov | |
| data_mu.pdf | r1 | 13.6 K | 2019-10-16 - 14:43 | AntonStepennov | |
| hYstar.pdf | r1 | 18.3 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, c-tag applied, NLO cross sections used |
| hYstarNNLO.pdf | r1 | 18.3 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, c-tag applied, NNLO cross sections used |
| hYstarNoTag.pdf | r1 | 18.2 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, no tag applied, NLO cross sections used |
| hYstarNoTagNNLO.pdf | r1 | 18.2 K | 2018-10-16 - 19:33 | AntonStepennov | Ystar distribution, no tag applied, NNLO cross sections used |
| kFactorsFit.c | r1 | 16.8 K | 2018-10-16 - 12:14 | AntonStepennov | K-factors obtained with shapes fit for Ystar with c- and b- tags. |
Topic revision: r33 - 2020-01-09 - AntonStepennov
