Analysis Note:

ARC Comments through approval of PAS:

## Comments from Referee EPJC

1. Sec. 4: "For the signal distributions, the W+bb component of an inclusive W+jets sample is used, with the shapes of the distributions taken from a dedicated high-statistics generated sample of exclusive W+bb." It is not completely clear to me what you do here: do you take the W+bb subsample of the inclusive W+jets sample, and then reweight distributions of these events based on those extracted from the exclusive W+bb sample? If so, I would write that more explicitly then.

We had two samples: one which was W+Jets (5F) and one which was W+bb (4F). The first of these was used for the cross section and normalization and the second was used for the shapes.

I think he is right: we need to change what is written, and in any case we should not put 5F and 4F in the answer, so as not to mislead him. So I suggest: For the signal distributions, the W+bb component of an inclusive W+jets sample is used, by separating the W+jets simulated sample into three subsamples labeled as W+bb, W+cc, and W+udscg, which are defined below. .... explaining the method ... Leptons that do not originate from the primary vertex are not considered for selection. To further improve the shape description of the W+bb distributions a dedicated high-statistics generated sample of exclusive W+bb is used, while keeping the normalization of the distribution as obtained from the procedure explained above.
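The procedure described in this answer (normalization from the W+bb component of the inclusive sample, shape from the exclusive sample) can be sketched as follows. This is only an illustration: the function name and all bin contents are hypothetical and are not taken from the analysis.

```python
def scale_shape_to_yield(shape_hist, target_yield):
    """Rescale a binned shape so that its integral equals target_yield."""
    total = sum(shape_hist)
    if total <= 0:
        raise ValueError("shape histogram must have a positive integral")
    return [target_yield * content / total for content in shape_hist]


# Hypothetical bin contents, for illustration only.
inclusive_wbb = [12.0, 30.0, 18.0]      # W+bb subsample of the inclusive W+jets sample
exclusive_wbb = [400.0, 1100.0, 500.0]  # dedicated exclusive W+bb sample

# Shape from the exclusive sample, normalization from the inclusive subsample.
template = scale_shape_to_yield(exclusive_wbb, sum(inclusive_wbb))
```

The resulting template keeps the bin-to-bin ratios of the exclusive sample while its integral matches the yield of the inclusive W+bb subsample.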

2. Sec. 5 par. 6: "allowed to vary within the uncertainties" - If you add Gaussian constraint terms to the likelihood for these uncertainties, I would explicitly write so, here or in sec. 6.

Yes, not a strict cutoff.

I do not understand your answer. We can add something like: using log-normal distributions ...
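For reference, one common way to write such constraint terms is sketched below. It is not necessarily the form used in this analysis, and the symbols $n_i$, $s_i$, $b_i$, $\mu$, $\theta_j$ and $\kappa_j$ are assumptions introduced here for illustration. Each nuisance parameter $\theta_j$ is constrained by a unit Gaussian, and a log-normal uncertainty scales a yield by $\kappa_j^{\theta_j}$.

```latex
\begin{equation}
\mathcal{L}(\mu, \vec{\theta}) =
  \prod_{i} \mathrm{Poisson}\bigl(n_i \,\big|\, \mu\, s_i(\vec{\theta}) + b_i(\vec{\theta})\bigr)
  \prod_{j} \frac{1}{\sqrt{2\pi}}\, e^{-\theta_j^{2}/2},
\qquad
s_i(\vec{\theta}) = s_i^{0} \prod_{j} \kappa_{j}^{\theta_j}
\end{equation}
```

In this form a nuisance parameter can move away from its nominal value at the cost of a penalty in the likelihood, which is what "not a strict cutoff" refers to.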

3. Sec. 5 par. 8: Did you consider an option of simultaneous fitting of the signal and two control regions? Naively I would suppose this might improve the overall precision a bit; if this is not the case (or you don't want to use it for some reason), I think it's worth commenting in the text.

We did try this first, but the shapes are not controlled, etc etc...

Yes, we also performed a simultaneous fit and found that it gives worse precision, since there are components in the fit that work in different directions and thus compensate each other, which makes the fit less stable. Instead we perform a step-by-step fit in which each component gets some predefined values that in the end help the fit to converge stably. Suggest to add at the end of this paragraph: "Similar results can be obtained by performing a simultaneous fit of the signal and the two control regions. It is found that the b tagging efficiency correction and the JES correction have similar effects on the distributions and thus compensate each other's effects in a simultaneous fit, which reduces its precision. Separating these effects in steps provides a better understanding of the underlying uncertainties and therefore more precise results."

4. Fig. 2: I am not sure you have a clear message with this figure. If by this you pretend to have studied the kinematical properties of the Wbb system, at least some statement on the agreement with simulation is needed.

We see good agreement in angular separation between well separated b jets and lepton transverse momentum within 10% until 100 GeV with a general falling trend

You need to suggest some text here as I do above

5. Sec. 7 and Table 3: It is not obvious how the uncertainties listed in Table 1 are broken down into (syst) and (theo), e.g. where go the uncertainties of the sample cross-sections? I suggest to clarify what "theo" includes when quoting the final cross-section number.

Sample cross sections are in syst

## Sijin Qian, August 8

In general

(1) The Abstract in Page 0 and the Conclusions Section in Page 11. It is quite surprising that the Abstract and Conclusions Section in v9 are identical; this seems very unusual (if not never) in all 500+ CMS papers published so far. Would you please consult with other CMS papers to see whether you would like to be conformed with them (to avoid this kind of identical)?

• These were different at one point, but I was advised to change and make them the same. We can reword the conclusions, to be discussed during FR

Page 1

(2) This is similar as a previous comment for v5, i.e.

(14) L132-134 (now L13 in v9), at two places, the expression of "a b quark (or hadron)" is looked quite odd similar as "a b c d xxxx". It may be improved by changing from

L132 (now L13 in v9): "a b quark" --> "a bottom quark"

L133-134: "a b hadron" --> "a bottom hadron"

Your answers to both cases: changed

My further discussion: In v9, thank you for having changed two places; however, L13 has a new case of "a b quark", and it will be appreciated if the same change can be made.

• all instances of "a b xxxx" have been changed

(3) L37, the "eta" should be explained at its 1st appearance in text here, i.e.

"|eta| < 2.1 (2.5)." --> "pseudorapidity |eta| < 2.1 (2.5)."

• changed

Page 5

(4) L193-195, the "muR" and "muF" should be explained at their 1st appearances in text on L194 instead of L195, then L195 can be shortened, i.e.

"The renormalization and factorization scale uncertainties are estimated by simultaneously changing the renormalization and factorization scales, muR and muF, up and down by a factor of two." -->

"The uncertainties on renormalization and factorization scales (muR and muF) are estimated by simultaneously changing the muR and muF up and down by a factor of two."

or

"The uncertainties on renormalization and factorization scales (muR and muF) are estimated by simultaneously changing them up and down by a factor of two."

• explained (and merged with another comment)

Page 6

(5) Table 1, in the header row and the right-most column, it may be looked better if the extra blank line between two lines can be removed, i.e.

"Effect on the measured

cross section" -->

"Effect on the measured cross section"

• gap has been reduced

Pages 7-8

(6) This is a part of a previous comment for v5, i.e.

(17) Figs.1-4 (now Figs.1-2 in v9)

(a) In the legend of each plot, the 2nd and 7th lines, the 2nd words should be in the lower case, i.e.

"Fit Uncertainty . . . Single Top" -->

"Fit uncertainty . . . Single top"

My further discussion: In v9, thank you for having changed the 7th lines; however, the 2nd lines (i.e. "Fit Uncertainty") in all plots of Figs.1 and 2 still need to change the word of "Uncertainty" to the lower case.

• changed

(7) Fig.2's left plot, in the horizontal axis label, the extra space between "Delta" and "R" should be removed, i.e.

"Delta R(b,bbar)" --> "DeltaR(b,bbar)"

• changed

(8) This is a part of a previous comment for v5, i.e.

(22) L220-222 (now L226-228 in v9), (a) it may be shortened on L220-221, and (b) a comma at the end of L221 may should be replaced by a word of "and", i.e.

"where NDatasignal is the number of observed signal events, NMCsignal is the number of expected signal events from simulation, NMCgenerated is the number of generated events in the fiducial region, A, epsilon are the acceptance and efficiency correction factors," -->

"where NDatasignal is the number of observed signal events, NMCsignal is of expected signal events from simulation, NMCgenerated is of generated events in the fiducial region, A and epsilon are the acceptance and efficiency correction factors,"

My further discussion: In v9, thank you for having changed for the suggestion (b); however, the suggestion (a) seems have not been touched yet, i.e. two words of "number" may be saved as

"where Ndatareconstructed is the number of observed signal events, NMCreconstructed is the number of expected signal events from simulation reconstructed in the fiducial region, NMCgenerated is the number of generated events in the ...," -->

"where Ndatareconstructed is the number of observed signal events, NMCreconstructed is of expected signal events from simulation reconstructed in the fiducial region, NMCgenerated is of generated events in the ...,"

• omitting the word "number" sounds awkward, will defer for FR
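The quantities named in this comment can be combined as in the sketch below. It assumes the usual form in which the acceptance times efficiency is the ratio of reconstructed to generated simulated events in the fiducial region, and the cross section is the observed signal yield divided by that ratio and by the integrated luminosity. The function name and the input numbers are hypothetical.

```python
def fiducial_cross_section(n_data_signal, n_mc_reconstructed, n_mc_generated, luminosity):
    """Return N_data / (A * epsilon * L), with A * epsilon = N_MC_reco / N_MC_gen."""
    if n_mc_generated <= 0 or n_mc_reconstructed <= 0 or luminosity <= 0:
        raise ValueError("MC yields and luminosity must be positive")
    acceptance_times_efficiency = n_mc_reconstructed / n_mc_generated
    return n_data_signal / (acceptance_times_efficiency * luminosity)


# Hypothetical inputs: 1000 signal events, A * epsilon = 0.10, L = 20000 pb^-1.
sigma_pb = fiducial_cross_section(1000.0, 500.0, 5000.0, 20000.0)
```

With these made-up inputs the function returns a cross section in pb, because the luminosity is given in inverse pb.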

Page 10, Fig.3

(9) In the horizontal axis label, four spaces inside the brackets seem extra and should be removed, i.e.

"sigma( W(lnu) + bbbar ) [pb]" --> "sigma(W(lnu)+bbbar) [pb]"

• changed

Pages 11-14, in the References Section

(10) L313, in [2], to be consistent in this Section, (a) a space should be added before the volume index, (b) an extra index after the year number should be removed, i.e.

"Eur. Phys. J. C75 (2015), no. 5, 212," --> "Eur. Phys. J. C 75 (2015) 212,"

• changed

Other ones which also need to be changed by the similar way are for (a): [35]; for (b): [47].

• changed

(11) L370, in [22], to be consistent in this Section, a space should be added before the "TeV" at the end of article title, i.e.

"= 8TeV" --> "= 8 TeV "

• changed

(12) L401, in [35], to be consistent in this Section, all references should have only one page index instead of two, i.e. (together with the item (10a) above for the spacing)

"Eur. Phys. J. C53 (2008) 473–500," --> "Eur. Phys. J. C 53 (2008) 473,"

• changed

(13) L423, in [42], the volume and page indices as well as a comma (before the doi index) should be added, i.e.

"Phys. Rev. D (2012) doi ..." --> "Phys. Rev. D xxx (2012) yyy, doi ..."

• changed

(14) L436, in [46], to be consistent with other PAS Refs. (e.g. [14] and [16], etc.) in this Section, the city info before the year number should be removed, i.e.

"Physics Analysis Summary CMS-PAS-TOP-14-016, Geneva, 2014." --> "Physics Analysis Summary CMS-PAS-TOP-14-016, 2014."

• changed

(15) L437-438, in [47], to be consistent in this Section and this paper, (a) the fonts of "ttbar", "e", "b" and "pp" in the article title should be changed, (b) an extra space before the "-tagged" should be removed, i.e.

"Measurement of the ttbar(italic) production cross-section using e(italic)mu events with b(italic) -tagged jets in pp(italic) collisions at ..." -->

"Measurement of the ttbar(non-italic) production cross-section using e(non-italic)mu events with b(non-italic)-tagged jets in pp(non-italic) collisions at ..."

• changed

## Darien Wood, August 8

Dear SMP-14-020 authors,

I found the paper to be clear and easy to follow. My comments on draft 9 follow.

Best regards,

Darien

Type B:

1. I would have found it interesting to include the separate results for the different W charges: W+bbbar and W-bbbar. This separation has more physics interest than that between the electron and muon decay modes, which are already included.

• Unfortunately the result cannot be changed at this stage.

2. I was puzzled by the definition of the W+ccbar category given on lines 99-101. The motivation for 0, 1, and 2 charm jets is clear, but why would three charm jets be categorized as W+udscg, while four would be categorized as W+ccbar? I expect the effect is negligible, but the motivation is puzzling.

• Yes, this difference is negligible, especially given that we have a veto on third jets.

Sascha - did you still want to look at this
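The categorization that the referee describes can be written down as a short sketch. It assumes that the W+bb label takes precedence whenever a generator-level b jet is present and that W+cc requires an even, nonzero number of charm jets, which is what the comment implies. The function name and arguments are hypothetical.

```python
def classify_w_jets_event(n_b_jets, n_c_jets):
    """Assign a W+jets event to W+bb, W+cc, or W+udscg (assumed rules)."""
    if n_b_jets > 0:
        return "W+bb"
    # Assumption: only an even, nonzero number of charm jets counts as W+cc.
    if n_c_jets > 0 and n_c_jets % 2 == 0:
        return "W+cc"
    return "W+udscg"


labels = [classify_w_jets_event(0, n_c) for n_c in range(5)]
```

Under these assumed rules, three charm jets fall into W+udscg and four into W+cc, which is the asymmetry the referee points out.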

3. Line 150, "no correlation is observed between $I$ and M_T, ..." It is not clear what sample is used in this correlation study. It would be worth mentioning.

• simulation of QCD with Pythia6, added to text

4. In line 167-168, it is argued that because the signal is calculated at LO, it "therefore is not affected by the jet veto requirement." The actual W+bbbar process may have additional jets, even if they are not represented by the LO calculation. Maybe this should be stated less strongly.

• we have gone through many iterations on this statement; if we try to say "almost not affected" or "minimally affected", people ask for the size of the effect, and it is 0, therefore we left it as it is

5. Line 184: "The uncertainty due to the b-tagging efficiency and the uncertainty due to the JES are allowed in principle to have shape dependencies in this analysis, but only affect the normalizations of the samples in the MT variable in practice." The "in principle/in practice" language is a bit hard to interpret precisely. It would be better to say exactly what you did. According to Table 1, it seems that only normalization uncertainties were used for these effects.

• Correct, we have modified to make it clear "The uncertainty due to the b tagging efficiency and the uncertainty due to the JES are observed to only affect the normalizations of the samples in the \MT variable."

Does the "in principle" refer to studies of the shape uncertainties that found them to be negligible?

• yes, see above

Maybe you don't need to say anything at all about this. I don't find it surprising that the JES and b-tag efficiencies do not affect the M_T shape, but of course it is good if you verified this.

• we were asked at one point to mention this study, and no change has been made to the text on this point

6. Figure 3 caption. It would be helpful to add something more about the five uncertainties shown on the measured points. Are these the four components of the uncertainty and the total uncertainty, or are some of them partial combinations of components?

• these are the four components stat/syst/theo/lumi, caption now reads " The measured cross section is also shown with the total uncertainty in black and the luminosity, statistical, theoretical, and systematic uncertainties indicated."

Type A:

Abstract: The first sentence would be easier to read if you drop "which decays leptonically". The decay modes used are described clearly two sentences later. If you keep this phrase, "which" -> "that".

• changed "which" -> "that", was asked to include "decays leptonically"

Line 13: "Throughout this paper, jets that contain a b quark will be referred to as b jets, and b-tagged jets as the reconstructed objects either in simulation or data which have been identified as such." -> 'Throughout this paper, "b jets" refer to jets that contain a b quark and "b-tagged jets" refer to the reconstructed objects either in simulation or data which have been identified as such.'

• changed to "Throughout this paper, a hadronic shower originating from a bottom or anti-bottom quark is referred to as a b jet, and b-tagged jets are the reconstructed objects either in simulation or data that have been identified as such."

line 57: "It is combined with the pT of a muon or electron passing the identification and isolation requirements to form a W boson candidate and compute its transverse mass, MT." -> "It is combined with the pT of a muon or electron passing the identification and isolation requirements to compute the transverse mass, MT, of the W boson candidate." (I don't think you can say that the lepton pT and the missing pT "form" the W boson candidate.)

• changed
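For reference, the transverse mass discussed in this comment is conventionally computed from the lepton pT, the missing pT, and the azimuthal angle between them. The sketch below uses that standard formula; the function and argument names are hypothetical and the inputs are made up.

```python
import math


def transverse_mass(lepton_pt, missing_pt, delta_phi):
    """M_T = sqrt(2 * pT(lepton) * pT(miss) * (1 - cos(delta_phi)))."""
    return math.sqrt(2.0 * lepton_pt * missing_pt * (1.0 - math.cos(delta_phi)))


# Made-up example: lepton and missing pT of 40 GeV each, back to back.
mt_back_to_back = transverse_mass(40.0, 40.0, math.pi)
```

For back-to-back lepton and missing pT the result is the sum of the two magnitudes, and for collinear vectors it is zero.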

line 71: "b jet identification" -> "b-jet identification" ("b-jet" is used as an adjective)

• we agree, but were also explicitly told not to hyphenate

line 80: "contributing processes" -> "processes contributing"

• line changed

line 91: I think "dedicated high-statistics" can be dropped. It is a technical detail.

• we were asked specifically to include this at one point

line 145; "obtained shapes" -> "shapes obtained"

• changed to "resulting distributions"

## Kajari Mazumdar, August 8

Dear Authors and ARC members,

Congratulations for the interesting work. Please find my comments below for the FR version.

I am very sorry about the delay in sending these.

Kind regards -- Kajari Mazumdar

Type B

Throughout the paper, the order for muon and electron needs to be exchanged. Any reference to electron should appear before that of muon, to maintain the natural order.

• have been told to keep mu first, but consistently in all the paper

L73 Is the efficiency of tagging b jets after the kinematic selection always above 40%? This may be made clearer. Also there could be a statement about the dependence of the efficiency on any of the kinematic variables, e.g. jet pT.

• 40% is for our selections

L142: What do you do with the 2nd lepton which is not used to make M_T?

• We have it included in the selections, but it isn't used in making mT.

L251: CTEQ6L is not a tune for UE! Please check what you wanted to say.

• text clarified

Type A

Abstract: remove “which decays leptonically” from the first sentence

• was asked specifically to put this in at one point

4th sentence: ..via their leptonic decays to muons or electrons ….” No need to explicitly say to muons or electrons. In the next part of the sentence l can be specified as electron or muon.

• removed

Type A

I think the introduction needs good amount of polishing.

• it can be done in FR

L4: replace “production” by “experimental measurement”

• changed

L6: replace “vector boson” by “W or Z” or introduce the notation V with explanation as “vector boson V (V=W or Z)

• we assume the reader is familiar with vector bosons - when we introduce the notation "V" we define V = W or Z

L10-L12: I feel this next sentence is superfluous here. Otherwise the motivation has to be explained better.

• suggest to discuss in FR, fine with us though

I do not understand the use of the phrase “dynamics of associated jets, ….”

• The properties of the jets that are made in association with the boson, changed "dynamics" to "kinematic properties"

L13: Throughout this paper, a hadronic shower originating from a b or anti-b quark, are referred to as b jet.

• changed to " a hadronic shower originating from a b or anti-b quark is referred to as a b jet, and b-tagged jets are the reconstructed objects either in simulation or data that have been identified as such."

L20: W+bbbar ‘production” cross section

L52: The definition of isolation variable includes ..

• added "the definition of the isolation variable.."

L56: .. p_T^miss in the event is defined as the negative of the vector ..

• changed

L59: The variable M_T is …

L70: .. exploits the considerably long lifetime ..

• did not add "considerably" to statement "long lifetime"

L74: Break the sentence in 2 parts: “ … CSV discriminator value. In this analysis …”

• changed

Section 4, L80: ..detailed in section 5???

• this is where the signal selections are detailed

The overall yield after applying all the selection criteria lists the background processes, what about the signal?

L82: as well as production of dibosons .. And remove the word “production” at the end of the sentence.

• changed

L87-89: do we really need to explain the origin of the tune Z2*?

• was asked to include it at one point

L88: ..scheme has been used,

• kept in present tense, according to PubCom rules

L91-L93 : try to rewrite in a better way.

• this formulation is the result of many iterations of comments

L97: provide definition of R

• We have delta R defined, and changed "R" to "delta R"

L98: …, event, excluding the neutrinos.

L100: b jet, NOT b jets!

• changed to "If an event does not contain a b jet"

L135, L140: Are ttbar-multijet or ttbar-multilepton accepted phrases for a paper? I think a modifier should be used before using these. like: Inclusive ttbar events can pose as multijet or multilepton final states when both the W bosons decay hadronically or via leptons.

• definitions of the two regions and their naming scheme are given in the previous paragraph

L138-139: .. as the signal region; however the lepton requirement is modified. The event must contain 2 isolated leptons of different flavors, each with ..

• changed

L143: simplify the statement, may be by writing like: ... using event samples that satisfy all the relevant criteria except for the absence of demand on lepton isolation; for electron and muon, I > 0.15 and 0.20 respectively.

• changed

L161: The control as well as the signal regions both ..

• changed

L163: What type of ttbar events are being talked about multilepton final state?

• multijet - changed to "Because $\ttbar$ production may have more than two jets in the final state, the rejection of events with a third jet makes it sensitive to JES."

L180: remove "uncertainty"

• removed

L181:Start new sentence: These are included ..

• changed

L191: As a conservative estimate of the uncertainty in QCD multijet background, a 50% uncertainty has been considered. This results in an uncertainty of 2-3% in the measured cross section.

• changed

L194-L195: The reference to the renormalization and factorization scale and their variations come out of the blue. Some introductory statement is necessary that at this point you are discussing about theoretical uncertainties.

• changed to "The renormalization and factorization scales respectively are set at $\mu_{\mathrm{R}}=\mu_{\mathrm{F}}=m_\PW$, and the uncertainties on this choice are estimated by changing $\mu_{\mathrm{R}}$ and $\mu_{\mathrm{F}}$ up and down by a factor of two."

L202: presented in the two ..

• changed

L250: ..scheme, Pythia6 with CTEQ6L parameter is used. --> no need to talk about the UE tune again, i feel.

• Z2* tune kept

L268: ..where the bbar system is produced in a different partonic level interaction than the one which produced W, albeit, in the same collision.

• changed

L282: no need to mention: In this paper

• removed

## Michael Schmidt, August 7

Type-B

line 130-1: Does your consideration of these two control regions lead to a consistent correction? If so, perhaps you should state that fact since it is not automatically true.

• yes, we have checked this with closure tests, but did not modify the text

line 132-4: I would have expected a veto region (\Delta R cut) between leptons and b-jets, but none is mentioned. If you made such a cut but it is missing from the text, please include it here.

• yes, we require dR(lepton,jet)>0.5 and this was included on L98
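The separation requirement quoted in this reply can be sketched as follows, using the usual definition of the angular distance in the (eta, phi) plane with the azimuthal difference wrapped into one period. The function names and the tuple layout are hypothetical; only the 0.5 threshold comes from the reply above.

```python
import math


def delta_phi(phi1, phi2):
    """Return phi1 - phi2 wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def passes_lepton_jet_separation(lepton, jet, min_delta_r=0.5):
    """lepton and jet are (eta, phi) tuples; require dR(lepton, jet) > min_delta_r."""
    return delta_r(lepton[0], lepton[1], jet[0], jet[1]) > min_delta_r
```

The wrap in `delta_phi` matters for objects on either side of phi = ±pi, which would otherwise appear far apart.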

line 204: Why isn't the uncertainty quoted for the combined scale factor smaller than the uncertainties for the individual factors? I would expect a reduction of 40% or so, which should matter for your cross section result. If the 0.08 is correct, please consider adding a sentence explaining to the reader why the uncertainty is not reduced.

• We are taking max(unc1, unc2) and this line has been added " with the combined uncertainty taken as the maximum of the uncertainties on the individual lepton channels."
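The difference between the choice described in this reply and the reduction the reviewer expected can be illustrated with a short sketch. The inverse-variance formula below assumes fully independent uncertainties, which is the reviewer's implicit assumption and not a statement about this analysis. The function names and input values are hypothetical.

```python
import math


def combined_uncertainty_max(unc_muon, unc_electron):
    """Conservative choice: the larger of the two channel uncertainties."""
    return max(unc_muon, unc_electron)


def combined_uncertainty_uncorrelated(unc_muon, unc_electron):
    """Inverse-variance combination, valid only for independent uncertainties."""
    return 1.0 / math.sqrt(1.0 / unc_muon**2 + 1.0 / unc_electron**2)


# Hypothetical per-channel uncertainties, for illustration only.
conservative = combined_uncertainty_max(0.08, 0.07)
independent = combined_uncertainty_uncorrelated(0.08, 0.07)
```

Taking the maximum never yields a value below either input, whereas the independent combination is always smaller than both.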

Can we double-check that it is OK to quote PASes Ref [46] & [55]?

• Ref 55 has been updated, ref 46 is the combined cms/atlas result from papers ref 47,48. I couldn't find the combined result as a paper - Sascha?

========================================================= Type-A

There are several instances in which a number is separated from its unit by a page break. This makes the text slightly harder to scan. I find it worthwhile to connect a number to the unit using either tilde or \mbox. So, in the Abstract, \sqrt{s} = 8~TeV or \mbox{\sqrt{s} = 8 TeV }. You can also do this for W~boson in the title, for example. I admit this is a cosmetic consideration.

• changed

Also, on line 33: Ref.~\cite{xxx}

• changed
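A minimal sketch of the two constructions suggested in this comment, written in plain LaTeX without the CMS macros. Note that `\sqrt` has to stay in math mode inside `\mbox`. The citation key `xxx` is the placeholder used in the comment.

```latex
% A tie (~) forbids a break between the number and its unit.
$\sqrt{s} = 8$~TeV

% \mbox keeps the whole expression together; \sqrt still needs math mode.
\mbox{$\sqrt{s} = 8$ TeV}

% The same tie works between words and before citations.
W~boson
Ref.~\cite{xxx}
```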

Abstract: opening sentence is a little awkward. Here is a suggestion: "The cross section for the production of W bosons in association with two b~jets is measured ... " I don't think you need the leptonically decaying part since you state this information shortly thereafter.

• we were also asked to include leptonically, we suggest to discuss in FR

The same modification should be made for the first sentence of the Summary.

• was also asked to include leptonically, the same as above suggest to discuss in FR

Abstract, line 4: You can drop "In this paper" since it is wordy.

• changed

Abstract, line 8: I recommend you drop "W(\ell\nu)+bbbar production" since this appears inside the parentheses on the next line.

• removed

line 2: "(henceforth denoted Z)" I don't think you need this and it strikes me as a bit cumbersome. (If you feel you do need it, fine.)

• I agree but was asked to put this in at one point

line 6: I think "collaborations" is better than "experiments" since it is the people not the equipment who perform measurements.

• changed

line 35: I don't think you need "The" at the start of the sentence though it is not incorrect.

• kept so as to not start sentence with a symbol

line 35: I think "that" is better than "which"

• changed

Eq (1): The long line with just "pTell" below looks awkward to me. Can you put \frac{1}{\ptell} in front of the summation sign and a large brackets around the numerator? Yes, this is a cosmetic issue.

• changed

line 53: scalar~$\pT$ add tilde

• changed

line 56: Let me suggest the following wording: "... as the negative vector sum of the pT of all PF candidates ..."

• changed

line 124: You could put "binned" in front of likelihood, for clarity.

line 128: You could delete "thus" since it is a little wordy.

• removed

line 145: To my ears "shapes obtained" sounds better, but your LE can decide this.

• changed to "resulting distributions"

line 146: Perhaps you can add "as" before "estimated" ?

line 152-3: add tilde lepton~$\pT$

• changed

line 153-4: add tilde 30~\GeV

• changed

line 161: I think the sentence is easier to understand if you add "regions" after "control"

line 169: I think "in which" sounds better than "where" but your LE can decide.

• changed

line 170: I would delete "which is" - wordy.

• removed

line 191-2: Let me suggest the following wording: " ... and ultimately contributes an uncertainty of 2 -- 3\% to the measured cross section."

• changed

line 236-238: Perhaps you can avoid using percent signs except when you are stating relative amounts. So: 0.10 -- 0.15, 0.80, 0.40, and 10\%.

• not changed, can be discussed during FR if necessary

line 246 uses "PDF" but this is defined later, on line 249. I guess you should move "parton distribution function" from line 249 to line 246.

• definition occurs only once now, the first time mentioned

line 279: Reword as suggested for the abstract.

• changed

line 286: Please drop "measured to be". I find this construction wordy (though I know other people like it...)

• not changed, suggest to discuss in FR

## Steve Wasserbaech, August 6

> Type B

I'm not sure what to propose, but I'm wondering whether it is appropriate for us to say we are measuring the W+bbbar cross section when we don't distinguish between b and bbar jets...

• the title says "in association with two b jets"

> Type A

general:

1. Please do use macros \TeV, \GeV, \fbinv, \unit{pb}, etc. in the title and abstract. They are forbidden for PASes but they should be used for papers to avoid bad line breaks and ensure correct spacing. We also have macros \stat, \syst, \thy, and \lum in ptdr-definitions.sty that will include the needed space between the uncertainties and the parentheses.

• changed

2. "which" is used inappropriately in some places in this paper. Please see the "That vs. which" section of the Guidelines and my specific comments below.

• changed

3. The Guidelines say that we should hyphenate "b-tagged" but not "b tag" or "b tagging".

• ok, changed
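As an illustration of general comment 1 above, the macros could be used as sketched here. This assumes that `\TeV`, `\GeV`, `\PT`, `\unit`, `\stat`, `\syst`, `\thy` and `\lum` are provided by the CMS style files, as the comment states. The symbols $X$, $a$, $b$, $c$ and $d$ are placeholders, not measured values.

```latex
% Units attached with macros, so no manual spacing and no bad line breaks.
$\sqrt{s} = 8\TeV$
$\PT > 30\GeV$

% Uncertainty labels from ptdr-definitions.sty supply their own spacing.
$\sigma = X \pm a\stat \pm b\syst \pm c\thy \pm d\lum\unit{pb}$
```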

title: this formulation with "the W boson" seems slightly peculiar; proposals:

Measurement of the cross section for W boson production in association with two b jets in pp collisions at sqrt{s} = 8\TeV

or

Measurement of W boson production in association with two b jets in pp collisions at sqrt{s} = 8\TeV

• we can discuss in FR; one cannot measure production (2nd title), and the 1st is better but does not look better than the present title

abstract:

1st sentence: To me the first part of this sentence needs major repairs.

• ok, we can discuss in FR

I don't think "of" is correct with "cross section".

• cross section of a given process - what do you recommend? Can be fixed in FR

I don't think "the" works with "W boson".

• changed to "a"

"which" isn't correct in this situation, and I think we would be better off omitting the information about the decay mode of the W from this sentence because it is just too much. Proposal: The cross section for W boson production in association with two b jets is measured...

• changed to "that", and we have been told specifically to include this information by other commenters; let's discuss in FR

3rd sentence: "In this paper" is unnecessary.

• removed

3rd sentence: I suggest that we delete "to muons or electrons" to avoid repetition.

• removed

4th sentence: I don't think a region can require.

• changed "requires" to "contains"

in the text...

line 3: the Guidelines say we don't need to define the LHC, but we are supposed to say "at the CERN LHC"

• changed

lines 13-15: we need to say that the reconstructed objects are referred to as b-tagged jets, not the other way around.

• Yes, gen-level b are called b jets, and reconstructed objects are b-tagged - changed to "Throughout this paper, a hadronic shower originating from a bottom or anti-bottom quark is referred to as a b jet, and b-tagged jets are the reconstructed objects either in simulation or data that have been identified as such."

line 17: we don't need to define "pp" or "$\sqrt{s}$"

• changed

line 17: "7\TeV"

• changed

line 18: "at the Fermilab Tevatron"

• changed

line 21: "and" -> comma (we have "and" on line 20, and this "and" is confusing)

• changed

lines 32-33: "Ref. \cite{" -> "Ref.~\cite{"

• changed

line 35: "which" -> "that"

• changed

line 52: comma before "and neutral"

• changed

line 53: It's not correct to refer to "events induced by additional pp interactions". An event contains the hard interaction and many additional interactions.

• removed "events induced by"

line 86: don't capitalize "parton distribution function"

• changed (but had been told at one point that we should capitalize)

line 88: "CTEQ5L PDF set"

• changed

line 100: no hyphen in "nonzero" and no comma after "nonzero"

• changed

line 102: delete "up"

• removed

line 104: "which" -> "that"

• changed

line 104: "in simulation" seems unnecessary

• ok, removed

line 111: comma after "Z"

line 114: "pb and was determined" -> "pb, determined"

• changed

line 121: "...per bunch crossing; the simulated number of pileup interactions is reweighted to match..."

• changed
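The reweighting mentioned in the comment on line 121 is commonly implemented as a per-bin ratio of normalized pileup distributions. The sketch below assumes that approach; the function name and the bin contents are hypothetical.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights: normalized data distribution over normalized MC distribution."""
    if len(data_counts) != len(mc_counts):
        raise ValueError("data and MC histograms must have the same binning")
    data_total = sum(data_counts)
    mc_total = sum(mc_counts)
    weights = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc > 0:
            weights.append((n_data / data_total) / (n_mc / mc_total))
        else:
            # No simulated events in this bin, so there is nothing to reweight.
            weights.append(0.0)
    return weights


# Hypothetical distributions of the number of pileup interactions.
data_hist = [10.0, 30.0, 60.0]
mc_hist = [25.0, 50.0, 25.0]
weights = pileup_weights(data_hist, mc_hist)
reweighted_mc = [w * n for w, n in zip(weights, mc_hist)]
```

After reweighting, the simulated distribution has the shape of the data distribution while keeping the total simulated yield.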

line 124: "likelihood fit" -> "maximum-likelihood fit"

• changed

line 132: a region cannot require

• changed to "contains"

line 132: "{$\PT > 30$} GeV " -> "$\PT > 30\GeV$"

• changed

line 133: "{$\PT > 25$} GeV " -> "$\PT > 25\GeV$"

• changed

lines 134, 135, 137, 140, 148, 153: six more things like that (GeV -> \GeV)

• changed

line 156: comma before "which"

line 159: "uncertainties on" -> "uncertainties in"

• changed

line 166: "Because it has no veto on a third jet, the ttbar-multijet control region is less sensitive..."

• changed

line 170: "are fit" -> "are fitted" sounds better to me (both are "correct")

• changed

line 181: it's not clear what "is included as a nuisance parameter"; please reword this

• changed

line 188: "templates" is jargon; please reword (or at least define "templates")

• changed to distributions

line 188: comma before "which"

line 192: "2-3\%" -> "2--3\%"

• changed

line 197: comma before "considering"

• changed

line 197: "from CTEQ" -> "from the CTEQ"

• changed

line 225: This equation doesn't show how the cross section is defined; it does show how the measured cross section is calculated.

• changed

fig 1: plots in right column should be aligned

• changed

table 2:

extra vertical space needed above "W+bbbar"

"Drell-Yan" -> "Drell--Yan"

• changed

fig 2 caption: "$\MT<30$ GeV " -> "$\MT<30\GeV$"

• changed

lines 226-230: "L" isn't defined

line 234: "K-factor" -> "$K$ factor"

• changed

line 236: "$10\% - 15\%$" -> "10--15\%"

• was asked to change to "10\% to 15\%"

line 249: "used in the proton PDF sets."

• changed

line 254: I don't think "functioning" is the correct word. What meaning was intended?

• removed, just validity

line 267: I think we mean "take into account" rather than "account for" (explain)

• changed

line 278: "Conclusions" -> "Summary" because no new conclusions are reached

• changed

lines 279-287:

same problems as in the abstract

• changed

line 280:

"{$\sqrt{s} = 8$} TeV " -> "$\sqrt{s} = 8\TeV$"

• changed

lines 284-285: "{$...$} GeV " -> "$...\GeV$" in three places

• changed

references

[1] ATLAS and CMS Collaborations

• not sure how to include in bibtex file, presently as : collaboration = "ATLAS, CMS",

[2]

"8\TeV"

• changed
"C" should be part of the journal name, not the volume number
• changed
delete "no. 5"
• changed

[5] 7\TeV

• changed

[12-13]

ppbar should be written with the appropriate bar symbol

• changed
$\sqrt{s} = 1.96\TeV$

[18] and others

"$\sqrt{s}$ = 7 TeV " -> "$\sqrt{s} = 7\TeV$" and similarly in all other bibitems

• changed

[24] one comma too many after "Salam"

• changed

[28]

ttbar should be written with a bar symbol

• changed

[35]

"C" should be part of the journal name

• changed
give the first page number, not the range
• changed

[39, 48]

please add italics/bold/spaces in the errata, so they look like normal references

• changed

[46]

use symbols for "emu"

• changed

[47]

delete "no. 10"

• changed

[54]

use the same abbreviation for the journal name as in [41]

• changed

## Dick Loveless, July 26

Abstract:

line 1: Change "which" to "that" and delete the comma after "leptonically" since it is a restrictive clause. Maybe leave out this phrase since it is repeated on line 5.

• which->that and comma deleted, phrase kept

line 3: Delete the comma after "TeV".

• removed

Text:

line 3: Delete "at the Large Hadron Collider (LHC)" since the measurement is relevant regardless of any specific accelerator. Maybe add "CERN LHC" in lines 6,7.

• changed

line 5: Delete "exotic".

• removed

line 9: Write " . . are being compared with LHC data using . . ".

• removed

line 11: Delete the phrase "as well as of the vector boson". Do you mean W boson? What does it mean?

• removed

line 13: Change "will be" to "are".

• changed

line 14: Change "which" to "that" -- restrictive clause.

• changed

line 35: Change "which" to "that".

• changed

line 36: Delete "loosely". You define "isolated" later.

• it is looser than our offline selections, do we need to be more specific

line 52: Change "corrections" to "a correction".

• changed

line 53: Change "based on" to "proportional to".

• changed

line 54: Maybe there should be a sigma symbol before "p_T^UP" here and in equation 1.

• changed

line 58: Change "and compute it" to "with a", and maybe define transverse mass explicitly.

• words changed - was asked at one point to take out mT definition

line 62: Change "mitigate" to "minimize".

• changed

line 63: Change t subscript of "k_t" to Roman font and move the phrase "with a radius parameter of 0.5," after "algorithm [23]".

• changed

lines 66-68: Write "Jets from pileup interactions are rejected by requiring that the jets originate at the primary interaction vertex.".

• changed

line 71: Delete "optimized".

• removed

line 73: Change "tagging of a jet" to "identification of a b jet (b-tagging)".

• changed

lines 74,75: End sentence after "value" and start a new sentence "In this analysis, b-tagged jets are required to pass a threshold with an efficiency of 40% in the signal phase space and . . ".

• changed

line 80: Once the section details have been moved to section 3 the first phase can be deleted. Then start the section with "Backgrounds to the W + bbbar production are the associated production . . ".

• Nothing changed yet. Could just start with "Backgrounds to the W+bb..." and remove reference to later section

line 83: Change "The corresponding contributions" to "These background contributions".

• changed

line 89: Change t subscript of "k_t" to Roman font.

• changed

line 93: Merge sentences by writing ". . . W+udscg, which are defined below".

• changed

line 94: Delete the sentence "The separation is done at the particle list level." How else could it be done? If you need the sentence it should go earlier.

• removed

line 95: Delete the comma after "jet" and change "it falls into the W+bbbar category" into "it is categorized as W+bbbar.".

• changed

line 100: Change "non-zero" to "nonzero".

• changed

lines 100,101: Change "it falls into the W+ccbar category" to "it is categorized as W+ccbar.".

• changed

line 101: Write "The remaining events are categorized as W+udscg.".

• changed

line 104: Write "In the simulation leptons that do not originate from the primary vertex . . ".

• changed (with comma after "In the simulation")

line 106: Add a hyphen to "leading-order".

line 112: Add a hyphen to "leading-order".

line 118: Delete "as the ones".

• removed

line 124: Better to use "estimated" instead of "measured" and change "region" to "event sample".

• changed

line 125: Delete "which is defined below" and change "QCD" to "the multijet" and change "initial shapes" to "distributions".

• changed

line 126: Add "background" before "contributions".

• changed

line 127: New paragraph starting with "the dominant background . . " and change "region" to "event sample".

• changed

line 128: Write "Therefore, we compare the data and simulation . . ".

• To avoid pronoun "we", changed to "Therefore, the data and simulation are compared .."

line 129: Change "region" to "samples" and add "(ttbar-multilepton)" after "leptons".

• changed

line 130: Add "(ttbar-multijet)" after "jets".

• changed

lines 132-135: The lines should be moved to section 3.

• Not changed. The idea was to put signal and ttbar selections all in the same place and not repeat. Maybe rename Section 3 as "Event reconstruction" and Section 5 as "Event selection and analysis strategy"

line 135: Change "control region" into "sample"

• changed

line 136: Change "for the signal region" to "the signal event sample"

• changed

line 137: Change "which" to "that" -- restrictive clause again.

• changed

lines 138,139: "The ttbar-multilepton sample uses similar selection criteria as the signal event sample, but requiring two isolated leptons of different flavor. . . ".

• changed

line 141: Change "region" to "sample".

• changed

line 144: Change "corresponding region" to "signal event sample".

• changed

line 145: Change "not be" to "is not" and change the "obtained shapes" to "resulting distributions".

• changed

line 148: Change "non-QCD" to "nonQCD".

• changed

line 151: Write "so the use of an inverted isolation requirement to obtain the QCD background distribution is possible.".

• changed

line 152: Add "However," before "this".

line 157: Change "shapes" to "distributions".

• changed

line 159: Change "on" to "in" after "uncertainties".

• changed

line 161: Add "the" before "control".

line 162: Change "regions" to "event samples".

• changed

line 165: Add a comma after "moderate".

line 166: Change "control region" to "sample"

• changed

line 167: Change "control region" to "sample".

• changed

line 170: Change "regions" to "samples" and "region" to "event sample".

• changed

line 171: Add a hyphen to "well-known".

line 172: Write " . . using the ttbar-multijet sample".

• changed

line 173: Change "efficiency that" to "efficiency, which".

• changed

line 175: Change "region" to "sample".

• changed

line 177: Change "region" to "event sample".

• changed

line 182: Change "on" to "in" after "uncertainties".

• changed

line 183: Change "of" to "for" after "section".

• changed

line 186: Move "in practice" after "but" and add "they".

• changed

line 187: Change "which" to "that" and delete "those".

• changed

line 188: Add a comma after "templates".

• changed

line 192: Change "on" to "in" after "uncertainty".

• changed

line 200: Change "region" to "sample". What is a "rescaling factor"?

• changed - Scale factors are provided by the POGs and applied, but are remeasured for this analysis, hence "rescaling"

line 202: Add "in" after "presented".

line 204: Change "combined" to "averaged".

• changed

line 205: Add a comma after "fit".

line 206: Change "on" to "in" after "uncertainty".

• changed

line 207: Change "region" to "sample".

• changed

line 208: Do you mean "results in" instead of "suggests"?

• yes, changed

line 209: Add a comma after "value".

line 210,211: Change "event enhanced control region" to "sample".

• changed

line 213: Change "regions" to "samples".

• changed

Table 1: The format of the table should be improved. Put a box around the outside.

Table 1 caption: A better first sentence "The main sources of systematic uncertainty in the W+bbbar signal event sample.".

• changed

Table 1 caption, line 9: Change "on" to "in" after "luminosity [14] and".

• changed

line 218,219: Write "Based the fits the number of events of each type in the signal event sample is given in Table 2.".

• changed (to "Based on the fits...")

Figure 1 caption, line 3: Change "region" to "sample" both places.

• changed

Table 2: What is the initial distribution? Where does it come from? Format a box on the outside.

• The initial distribution is obtained after applying the results of the ttbar fits on the Wbb simulation. The box seems to be recommended against.

line 230: Change "on" to "in" after "factor".

• changed

line 234: Delete the comma after "applied" and add "that is" before "obtained".

• changed

line 235: It seems "cross section" should be singular.

• good point, changed

line 236: Add "to that" before "with MADGRAPH" and write "10 to 15%".

• changed

Table 3: Format a box around the table.

• Again, not changed

line 264: Change "as" to "because".

• changed

line 274: Add a comma after "level" and delete "and" before "including".

• changed

line 275: Change "when" to "as" and add a comma after "needed".

• changed

Figure 3 caption, line 4: End the sentence after "schemes" and start a new sentence "In the case . . ".

• changed

line 279: Change "the" to "a" before "W" and change "which" to "that" and delete the comma after "leptonically".

• changed

line 280: Delete the comma after "TeV".

• removed

lines 308, 342, 437, 441: particles should all be in Roman font.

• changed

line 313, 401: The boldface "C" is part of the journal name and should be in italic font. See other references.

• changed

line 329, 332, 333, 340, 345, 386, 416, 429: Remove hyphens.

• removed

line 341, 346, 384: Write "ppbar" using LaTeX superscript for pbar.

• changed (except 341 which is pp)

line 370: Add a space before "TeV".

line 372: The subscript "t" should be in Roman font.

• changed

line 401: Only the first page should be given.

• changed

line 423: Add the page number.

line 444: Format the erratum properly.

• changed

Type B

Title: The title really should say "fiducial". Even the abstract says "fiducial region".

• Not changed, but ok with us, let's discuss in FR

Abstract:

general: It would be good to use the branching fraction of the W leptonic decay to calculate the cross section for W+bb production - as promised by the title.

• We measured the cross section with the leptonic decay of the W, but did not measure the branching fraction ourselves. If interested, theorists can easily do this multiplication

Text:

general: When selecting a specific data sample to estimate backgrounds you are using "region". Why not call it a "control sample" because it is really a specific data sample (and not a region per se). When using multiple control samples it is good to name each one and then refer to it clearly in the text.

• changed in all the places indicated

line 2,3: This is the most boring topic sentence I've seen for a long time. "The measurement . . . is relevant . . to searches and measurements". Please write something more interesting, even relevant.

• changed to "The measurement of $\PW$ or $\PZ/\cPgg^*$ (henceforth denoted $\PZ$) production in association with b quarks in proton-proton collisions provides important input for refinement of calculations in perturbative quantum chromodynamics and is also relevant for searches and measurements."

lines 6-8: This sentence is very old and musty. The Higgs is no longer a "new boson". Please update.

• removed "new"

lines 10-13: How does this measurement relate to "b hadron production mechanism" or "dynamics of associated jets"?

• We are measuring the production cross section of a process containing two b hadrons, changed "dynamics" to "jet kinematics"

lines 132-135: This describes the selection of signal events. It should be moved to section 3 so the first sentence of section 4 does not have to refer to section 5.

• The goal was to have all of the Wbb and TTbar selections in the same place, as well as to avoid repeating them - this section is just about the samples used and all we want to indicate is that these are the relevant samples to use for our phase space

Table 1: Is "Variation" the best label or the best quantity for this column?

• Technically these are the one-sigma bounds on the input variation used in the fit

Table 2: What is the initial distribution? Where does it come from?

• These numbers correspond to the simulation after applying the corrections from the ttbar fits, but before performing the final Wbb fit

line 230: My understanding of the "signal strength" is that it is the mismatch between the theoretical calculation and the experimental cross section. In a "perfect" world it would be 1.0. Does it make sense to use "mismatch" to describe it. It doesn't seem right to claim that this is "required by the fit". It is simply the ratio of the experimental result to the theoretical calculation.

• You are correct in your understanding of what the signal strength is, changed "required" to "predicted"
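The relation discussed in this exchange can be written compactly. The symbols below are notation assumed here for illustration and are not necessarily those used in the paper.

```latex
\mu = \frac{\sigma_{\text{meas}}}{\sigma_{\text{theo}}},
\qquad
\mu = 1 \ \text{for perfect agreement between measurement and prediction}
```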

## Thomas Ferbel, July 26

Hi Authors et al,

I find this paper somewhat disappointing for several reasons. First of all, the hopes raised in the introduction of tests of the SM and H->bbbar couplings are forgotten by the time you reach the summary section: Did this paper reduce the limits on non-SM contributions? If so, what are these new results?

Let's discuss the introduction in FR; the goal of the paper is to measure the Wbb cross section, which is important in Higgs and BSM searches.

Second, in my view, the writing requires some improvement to get ready for the FR. I am surprised it has quite a few violations of our rules, and should be checked again. I have made recommendations for removing the grubby word "shape" and for using more appropriate English words in its place, such as "distribution" or "form," as indicated in my corrections in the attached Adobe Acrobat file.

tried to improve

Connected with this is the abuse of "QCD" in its usage for "multijet" events. Using "QCD multijet" is only correct when it refers to specific MC calculations rather than data from control regions.

this was one of the request during CWR to use this wording

Also, I assume that "variation" means "change," as there is no variation of the kind implied, but only changes made.

we agree and suggest to discuss a possible modification in FR

The structure of the title, abstract, introduction, and summary is not inspiring. Besides the writing (and the repeated summary), there is the change between abstract and introduction in the issue of the associations in the cross section. It is unclear what you wish to impress upon the reader. I would ask you to improve the overall English, and, with help from your LE, minimize the violations of our beloved CMS PubComm rules.

## Kajari Mazumdar, July 18

This is a nicely written paper. I have few comments only.

Type A

l81: add after ".. massive vector boson and jets" --> (V+jets, where V = W or Z)

• changed

l87: the string "... Pythia 6.4 using Z2* tune for hadronization" gives a wrong impression. I suggest: "... Pythia 6.4 for hadronization using Z2* tune for the underlying event".

• changed ( "using the Z2* tune")

After this the sentences in l107-109 may be OK.

• ok then

l250: can we improve the phrase ".. Pythia6 is run" ... ?

• changed to "... the PDF set CTEQ6L is used and interfaced with PYTHIA6 using the Z2* tune"

# Signal region fitted results

## CWR Comments (on Paper v5)

### Vassili Kachanov:

The paper is interesting from the point of view of the background estimations in the measurements of Higgs production processes. This is a new necessary step in the increase of the sensitivity of the measurements to new physics. This analysis is at 8 TeV and it is a continuation of the measurements at the lower energy of 7 TeV. In this one both muon and electron decay channels were used. The paper describes the analysis in detail with many tables and graphs as illustrative material. The results agree with the previous measurements at 7 TeV and are the first ones at 8 TeV.

It is proposed to publish this paper.

• Thanks for the support.

#### Type A: English/Style/Formatting (including figures)

Title: “with two b jets”: the jargon b jets is explained only in the text l.13 Abstract: “two b-tagged jets”: it would be good to be consistent with respect to the title

• removed "-tagged" from abstract

l 15 “also denote Z/g as Z, and Z or W generically as V.” could be reduced to ““also denote Z/g and W generically as V.”

• We use the symbols Z and V both in this text, so both are defined here.

l 16 “has been studied” > “has been measured”

• changed

l 58 “hadron calorimeters using the procedure described in Ref. [18]” > “hadron calorimeters [18]”

• changed

l 87 “inclisuve” > “inclusive”

• changed

l 117 “Since the” > “The” because “Since” is used also in the previous sentence then also “process,” > “process, therefore”

• changed

l 132 “done at the truth generator level.” > “done at the particle list level.” if MC is wrong will not give the truth

• changed

l 137 “no b jets but an even” > “no b jets, but an even”

• changed

l 175 “contributions, e.g. the uncertainty” > “contributions, e.g., the uncertainty”

• changed

l 254/255 “with a similar factor calculated in the 7 TeV Z+b analysis calculated as 0.84+/-0.03 [4].” > “with the factor 0.84+/-0.03 calculated in the 7 TeV Z+b analysis calculated as [4].”

• changed to "with the factor 0.84+-0.03 calculated in the 7 TeV Z+b analysis [4]"

#### Type B: Everything else (e.g. strategy, paper structure, emphasis, additions/subtractions, etc).

l 6: the combination of the Higgs mass measurements from ATLAS and CMS is already published and the combined value with its corresponding reference should be quoted

l 23-31 there is a “The CMS detector ” chapter that is usually used instead of the used paragraph, I would keep it as standard

• changed

l 42-43 please motivate (add a sentence) the higher pT thresholds used for offline selection of the muons and electrons than the one used at the trigger level, same for pseudorapidity

• changed to: "to ensure that the triggers are fully efficient"

l 48 “pileup” is introduced only later at l 109/112, it needs an introduction before it is used

• changed - definition moved up and removed later

l.50 The isolation is corrected for the neutral component arising from multiple pp interactions in the same event (pileup), which is based on the scalar sum of pT of charged particles scaled by a factor of 0.5. Text has been changed.

l 50 I would mention that the isolation cut I was optimised for…

• We did not optimize further than the EG POG recommendation.

l 72 “b-tagging efficiency of about” please give the precise number

l 94 the larger number of events in the simulated sample is not an argument, one could very well simulate same number of events in the five-flavour scheme, please rephrase

• removed 4F/5F discussion and added :
For the signal distributions, the $\Wbb$ component of an inclusive $\Wjets$ sample was used, with the shapes of the distributions taken from a dedicated high-statistics generated sample of exclusive $\Wbb$.

l 124 the jets are not fully reconstructed at 5.0 pseudorapidity, the agreed cut was 4.7 such that the 0.3 difference could ensure a full jet reconstruction

• changed 5.0 to 4.7

l 152 “minimal correlation is observed ”: what is meant here by “minimal ”?

• They are uncorrelated and the text now reflects this. Below is a distribution of transverse mass using QCD MC with inverted and standard isolation cuts.

l 222 “efficiency correction factors” > “corrected efficiency” otherwise is not a cross-section

• changed to "acceptance and efficiency"

Figure 5 all predictions show a mean value at the border of the one sigma band of the measured value, this indicates that some “contribution” might be missing systematically in the theoretical calculations, any hint from theorists about this?

• This is result of the measurement, and should be discussed by theorists.

l 288/290 the sentence: “The previous CMS measurement at 7 TeV also shows agreement with the SM prediction within the level of one standard deviation.” does not bring extra info related to the subject of the paper. It can be removed, or add a comparison 7 vs. 8 TeV

• removed

ref 42 is missing a report number, or link so reader could access it

• reference has been updated; it is no longer used (ttbar xc)

### Albert De Roeck

On your comment on line 276 on the 7 TeV data: I noticed that the SM prediction used at 7 TeV is from MCFM and it gave 0.55 pb at 7 TeV, i.e., the SM cross section at 7 TeV is LARGER than what you give here at 8 TeV (0.50 pb). So the statement is a bit of a cheat, since apparently the SM cross section changed quite a bit since the 7 TeV analysis. What would be the prediction of the presently used models/calculations when run for 7 TeV? I guess something along the lines of 0.45 pb, while the measurement was 0.53 pb. Still only one sigma or so too high, but the same tendency as the data in this analysis.

• There are a few differences with respect to the 7 TeV analysis, in particular, the gen jet definition at 7 TeV included neutrinos while that at 8 TeV does not. At 8 TeV, the lepton pT is 30 GeV (as opposed to 25 at 7 TeV) and at 8 TeV there is no cut on mT.

Which brings me to the next question: did we try push the systematics enough? Our measurement is within one sigma with theory predictions, but that is totally the result of the large systematic error. The biggest effect comes from the top cross section uncertainty (no way to use our measurements here? We measured the cross section to 5% or so) and the b-tagging rescale for which I think we use the full uncertainty as an error. Is there no way to do better on that uncertainty? We could have an interesting result here, but it gives the impression we are happy if we cover the deviation by the error band and stop there.

• We updated our ttbar reference from the CMS measurement to the CMS/ATLAS combination, which brings down the uncertainty. We also moved from taking the uncertainty as 100% of the fitted rescaling factor to using the uncertainty from the fit itself. The comparison of the old results and the results after these tightenings is shown at the top of the twiki as 0.64 +- 0.03 stat +- 0.10 syst +- 0.06 theo +- 0.02 lumi pb
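For orientation, the four uncertainty components of the updated result quoted above can be combined into a single total. The sketch below assumes the components are uncorrelated and can therefore be added in quadrature; this is an assumption for illustration, not a statement of how the paper quotes its total, and the function name is hypothetical.

```python
import math

def combine_in_quadrature(*uncertainties):
    """Return the quadrature sum of uncorrelated uncertainties."""
    return math.sqrt(sum(u * u for u in uncertainties))

# Components quoted in the response above, in pb: stat, syst, theo, lumi.
total_uncertainty = combine_in_quadrature(0.03, 0.10, 0.06, 0.02)
print(f"0.64 +- {total_uncertainty:.2f} pb (total, assuming no correlations)")
```

Under this assumption the systematic component dominates the total, which is the point raised in the comment about pushing the systematics.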

In arXiv:1302.2929 ATLAS shows plots showing some discrepancies for W+b (single b-jet) measurements. Was this resolved in the meantime, or could we still contribute to that? Did we not consider measuring just W+b in this analysis?

• We only measured the W+bb cross section

#### Details

- line 48: what about neutrals in the pile-up? We do not correct for these on some event level based observable? If we do you should mention it here.

• These are the standard CHS jets, and are pileup corrected. Neutrals are taken into account via the factor of 0.5
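The role of the factor of 0.5 can be illustrated with a minimal sketch of a pileup-corrected relative isolation. It assumes the commonly used form in which half of the scalar pT sum of charged pileup particles is subtracted from the neutral component, with the result floored at zero; the exact expression is the one in Eq. 1 of the paper, and the function and argument names here are hypothetical.

```python
def relative_isolation(lepton_pt, charged_hadron_pt, neutral_hadron_pt,
                       photon_pt, pileup_charged_pt):
    """Relative isolation with a charged-based estimate of the neutral pileup.

    All inputs are scalar pT sums in GeV inside the isolation cone,
    except lepton_pt, which is the pT of the lepton itself.
    """
    # Neutral pileup is estimated as 0.5 times the charged pileup sum
    # (assumed form); the corrected neutral term cannot go negative.
    neutral = max(0.0, neutral_hadron_pt + photon_pt - 0.5 * pileup_charged_pt)
    return (charged_hadron_pt + neutral) / lepton_pt

# Illustrative numbers only, not taken from the analysis.
print(relative_isolation(40.0, 2.0, 1.5, 1.0, 3.0))
```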

- line 64: Are these corrections also based on just charged particles? We do not correct for neutrals from pile-up in the jet energies?

• Neutrals are taken into account via their ratio with charged.

-line 73: what is the pt range or which these b-tagging efficiencies/fake rates are given? Is that for the typical b-jet range used in this analysis? Would be useful to add.

• added "in the signal region phase space"

- line 87: inclisuve: please use a spell checker!

• changed

- line 92-95: Do we say here that for the shape it does not matter if we use the 4 or 5 flavour scheme for the shape of the W+bb? (since you use the one which has the best statistical power). Is that indeed true: does it not matter? You should add a comment on that or this discussion/decision looks a bit non-scientific

• removed 4F/5F discussion and added :
For the signal distributions, the $\Wbb$ component of an inclusive $\Wjets$ sample was used, with the shapes of the distributions taken from a dedicated high-statistics generated sample of exclusive $\Wbb$.

- line 103: We generate single top at NLO with POWHEG, but then you normalise to cross section with a different PDF at NLO (again). Why was that done? Is it significantly different from the POWHEG one? That would in fact be interesting… I guess this needs a comment on why you do that.

- line 105: Recommend to add the value that you used here, to avoid ambiguity and the reader does not have to find yet another paper.

• updated reference and added value to paper

- line 115-117: “Since… Since” two consecutive sentences starting the same way. This does not look good. (for the LE)

• changed

- line 120: What are typical values for these weights? Are these relatively small? Suggest to add something on that in the paper.

• these are the weights being fit for in Steps 1/2 described more fully in the last paragraph of this section

- line 144: The uncertainties in table 1 for the backgrounds are not unreasonable, but they generally have a 2 digit precision, so look rather well determined. However, no motivation for these numbers is given in this paper. Do we have references for these or something we can add for the reader to hold on to, for having confidence in these numbers (e.g. if these are just predicted theoretical uncertainties on the cross sections)? (for the QCD background the motivation is given later)

• These are taken from the uncertainties on the cross sections for the given background and text has been updated

- line 162: “may be due in part” is not a good formulation for a paper. Do we have evidence for this statement? Then we can add that (otherwise we should just test it, and see if the suspicion is correct)

• changed to: The same is true for the $\Wbb$ sample, which at LO has only two jets in the final state and therefore is not affected by the jet veto requirement.

- line 166 effeciency. Spelling!

• changed

- line 187: Statement here is too vague. How did you vary the PDF sets?Which ones or set was used (remember the PDF4LHC recommendation )

• added: "following the PDF4LHC prescription" \cite{Botje:2011sn}

- line 198: 1.3 sigma is sort of large, but I guess borderline still acceptable. For my info: do we know any other analyses where we observed a similar recalibration effect? Was what is found here reported and discussed, e.g., in the JetMET group?

• We do not know of any other analyses that have seen this and did not give a dedicated presentation to the JetMET group (though we gave multiple presentations to the BTV POG). A shift of 1-2 sigma is expected from time to time, and we mention in the paper now that 1.3 sigma corresponds to 3.4% in ttbar normalization.

- line 207: I believe the statement, but what is negligible here? Less than one %?

• yes < 1%

- line 225: k-factor? usually K-factor

• changed

- pdf references: normally we give also the individual papers from NNPDF, MSTW, and CT (some instructions from the pub com a few months back)

• references added and clarified, (NNPDF=48, MSTW=43, CT=47)

- line 276: I suggest to repeat the reference to the 7 TeV data here.

- Figure 4b does not show a good agreement with the data and MC. There is a clear downward slope in the ratio plot. This is not a worry?

• We thought that this was an interesting distribution and may help theorists understand better what is going on.

- ref 42 has not enough information. There should be a report number or alike so that the reader can find it.

• reference has been updated

### Artur Apresyan

Line 77: not clear yet what the "signal enhanced dataset" means, the cuts are only defined starting Line 121. Move the definition of signal sample here.

• We want to separate the discussion of the MC generators used from the statement of signal region selections so have them in different sections. Added internal reference to section

L93: define "factorization scale mu_F"

• changed to use words instead of symbols

L109: The term "pileup" was already used in line 49 and Eq 1. Move the phrase "Events induced by additional pp interactions, referred to as pileup... " around Eq 1, and drop it from here.

• changed

L127: how do you compute MT in the multi-lepton region? there are multiple leptons, which one defines the MT? Clarify in the paper

• added: "In the multilepton region, the m_T variable is calculated with respect to the lepton named in the channel."
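As a reminder of what is being computed per lepton, here is a minimal sketch of the transverse mass. It assumes the standard definition from the lepton pT, the missing transverse momentum, and the azimuthal angle between them; the paper's own definition takes precedence, and the function and argument names are hypothetical.

```python
import math

def transverse_mass(lepton_pt, met, delta_phi):
    """Transverse mass in GeV from lepton pT, missing pT, and their
    azimuthal separation in radians (standard definition, assumed here)."""
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(delta_phi)))

# In a multilepton event the value depends on which lepton is chosen,
# which is why the text has to name it. Illustrative numbers only.
print(transverse_mass(40.0, 40.0, math.pi))
```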

L154-163: you say that JES and b-tag efficiency affects shapes and normalizations. How about theoretical uncertainty on the recoil and NJet spectra? Why is this not mentioned as a major contributor to systematic effects on shape and region migration?

• We are using a data-driven technique to estimate the JES and b-tag efficiency, and the phase spaces used to derive these corrections are chosen to be complementary: The signal region has a jet veto and a lepton veto, the TT-multijet has the jet veto relaxed and the TT-multilepton has the lepton veto relaxed, so everything is held fixed across the phase spaces with the exception of a single final state object.

We think Fig 1 and 2 really belong to Section 4. You argue here that the background modeling is understood, but the delaying of the Figures until two sections later leaves the reader confused.

• We have followed another commenter's advice and combined Figures 1,2,3

L164-171: Not really clear what the procedure is, please clarify this paragraph as it is critical for understanding the analysis. What do you exactly do, three simultaneous fits in the multijet+multilepton+signal regions? Or it is three separate fits?

• We are using the mT variable in three different phase spaces and performing sequential fits.

Not clear what is meant here "The result of this fit gives an estimation of the b-tagging efficiency rescaling factor". Do you simply derive an overall normalization scale factor in this sample?

• Technically we derive a factor which is shape dependent, but in practice, this shape is effectively identical to the original shape, leaving us with normalization changes only.

Why is that called "b-tagging efficiency rescaling factor", i.e. what is the connection to b-tagging here?

Similar question on the multijet control sample.

• Assuming that you mean multilepton control sample. Why JES: The difference between this phase space and the multijet is the lepton veto (we see that MES and EES don't have much of an effect) and the jet veto. The JES shifts jets of ~25 GeV into/out of acceptance and amplifies the effect of the JES.
Shape vs. Normalization: We are again deriving a shape based uncertainty and again the difference between the nominal and 1 sigma shifted shapes is negligible - this is fundamentally the source of the difficulty in this analysis and why we need to separate the two effects in the two control regions. What is different about the application of the JES vs. the b-tag rescales is that every event passing the selection is by definition b-tagged in the same way and so scale in the same way across samples. The JES by contrast shifts the normalizations of different samples by different amounts. For example, in the signal region phase space, the JES has an effect of <1% on the Wbb sample,~5% on ttbar, single top and wjets, and <2% on the other samples. These differences are taken into account in applying the JES rescaling.

Alternatively, maybe you are essentially doing a fit in 3 different variables ( MT, N-jets, n-leptons), where NJets and Nleptons essentially have just 2 bins ( <=2 and >= 3; 1 lepton, 2 lepton ). The shape of MT as far as we can tell is the same for ttbar and W+bb. So all of the sensitivity to determine the ttbar bkg comes from an extrapolation in NJets and N-leptons. Is this correct? If yes, then your explanation is insufficiently clear. As it is written now, many readers will be confused about how you can extract W+bb from ttbar if you are only fitting in MT, as you say in your text.

• We are using the mT variable in three different phase spaces and performing sequential fits; we have updated the text to make this clearer.

L164: Since the main tool in extracting the Wbb cross-section is the fit to MT, it is not clear how different the shapes of ttbar and Wbb are? Can you show a shape comparison between the two, to communicate to the reader where the power of the fit comes from?

• The shapes of the Wbb and ttbar samples in the mT variable in the signal region phase space differ enough to provide some discrimination power.

L181: why do you take a 50% uncertainty on the QCD? where does that estimate come from? Is it just some random "large" number?

• We take an uncertainty large enough to allow the QCD normalization to float freely. We have seen that taking larger uncertainties does not affect the final result.

Table 1: since you extrapolate in NJets to determine ttbar, we think you are not adequately addressing the systematic uncertainties from theoretical sources (missing higher order corrections?, modeling of ISR and FSR? ). These are at least not mentioned much in the paper, and it is important to discuss these.

• The important point is that we are using a data-driven technique to estimate the JES and b-tag efficiency with phase spaces that are identical to the signal region except for one specific modification in order to isolate an effect. In this way, effects from things like ISR and FSR cancel.

L201: on L184 you said that the uncertainties on JES are set to be 100% of the factor itself, while here you say they are set to 1.3sigma_JES. Which one is correct. Please clarify.

• 1.3sigma_JES was the rescale factor calculated so these statements are equivalent, but we have moved to using the uncertainty from the fit itself and have changed the text accordingly.

L196: as a result of lack of clarity earlier in the description in the fit, it is not clear to us here how the fit to MT distribution in multilepton sample changes the shape of MT in signal region. How do you propagate the result obtained in multilepton region to here?

• From the fit, we see that a rescaling is necessary and attribute it to the JES. This phase space is pure ttbar, and we know the bounds of the JES uncertainty. Because the JES affects different samples differently, we express the outcome of the fit in terms of how many sigma_JES it corresponds to. Then we rescale the samples in the W+bb region based on this number of sigma_JES.
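
The propagation described in this reply can be sketched as follows. This is a minimal illustration and not the analysis code. The per-sample 1 sigma JES effects are the approximate signal-region values quoted in the reply on shape vs. normalization above (taken at their upper bounds), and the 1.3 sigma pull is the value quoted in the L201 comment. The nominal yields are placeholders, and both the linear scaling of the yield with the pull and the sign of the shift are assumptions.

```python
# Approximate 1-sigma JES effect on the yield of each sample in the signal
# region (fractions), using the rough values quoted in the replies above.
JES_ONE_SIGMA_EFFECT = {
    "Wbb": 0.01,
    "ttbar": 0.05,
    "single_top": 0.05,
    "Wjets": 0.05,
    "other": 0.02,
}


def rescale_yields(nominal_yields, n_sigma_jes, one_sigma_effect=JES_ONE_SIGMA_EFFECT):
    """Scale each sample by (1 + n_sigma * effect), assuming a linear response."""
    return {
        sample: n * (1.0 + n_sigma_jes * one_sigma_effect[sample])
        for sample, n in nominal_yields.items()
    }


# Placeholder nominal yields (hypothetical numbers, for illustration only).
nominal = {"Wbb": 1000.0, "ttbar": 2000.0, "single_top": 500.0,
           "Wjets": 300.0, "other": 200.0}
rescaled = rescale_yields(nominal, n_sigma_jes=1.3)

for sample in nominal:
    print(f"{sample:>10s}: {nominal[sample]:8.1f} -> {rescaled[sample]:8.1f}")
```

The point of expressing the fit result in units of sigma_JES is visible here: one fitted number moves each sample by a different amount.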

L223: define mu_F and mu_R

• changed

Fig 5: the blue error bars are very hard to distinguish, can you try another scheme of differentiating them? Also, in the caption define that the observed value is shown as the pink line.

• Figure 5 has been updated.

### Louis Lyons

#### GENERAL:

It is a bit difficult to follow in detail your extraction of scale factors. For example, it is not clear why the 2 different final states each are sensitive to just one of the scale factors.

• The two ttbar final states are chosen to be identical to the signal region, with the exception of one change: the relaxation of either the lepton veto or the jet veto. It may be clearer to work backwards. In step 2, the lepton veto is relaxed but the jet veto is in place. The effect of relaxing the lepton veto is simply to ensure a pure ttbar sample, since few other processes contribute to a final state with two opposite-sign leptons and two jets. ttbar is often a multijet process, so for jets around 25 GeV the jet veto selects different events for different choices of jet energy scale. In addition, both of the jets which are present are b-tagged, so the b-tagging efficiency must be correctly modeled. In the details of this analysis, we see that while we technically allow for the possibility of the JES and b-tagging efficiency to have shape dependencies, in practice the shapes of the shifted and unshifted distributions are not meaningfully different. We therefore must separate the effects of the b-tagging efficiency from the effects of the JES choice. We achieve this in step 1, where we have relaxed the jet veto but kept the lepton veto. We see that the effect of the lepton energy scale is 2-3% and have independent justification to believe that the b-tag efficiency is not properly modeled in this phase space. In a presentation to the BTV, they accepted our summarized results, that we see a discrepancy of ~5% per b-tag in our phase space (but better agreement in the phase space of the published ttbar measurement): https://indico.cern.ch/event/436169/contributions/1921818/attachments/1134376/1622924/tmperry_2015july30_TT.pdf . These b-tag rescaling numbers are therefore what we take from the result of step 1 and apply before moving to step 2. In the step 2 fit, when we allow the b-tag rescaling to float with a 1 sigma uncertainty large enough to return to its original value, we see that it stays nicely at 1.
Additionally, after rescaling by the JES from step 2, we perform a closure test by refitting in the multijet region and again see that the scaling does not move.

It seems that the main advance on the previous 8 TeV CMS publication is the use of the electron channel. But according to Table 3, the present combined (mu + e) result is no better than the muon channel alone, so it looks as if the present result is no advance on the earlier one. Is this interpretation correct? (OK, I admit the electron result is consistent with the muon one, so that adds some confidence to the earlier measurement.)

• Do you mean the 7 TeV publication, since this is the only Wbb measurement at 8 TeV? In general you are right that adding the electron channel reduces the uncertainties very little, but it is another measurement and it also affects the final cross section value.

#### SPECIFIC POINTS:

Line 46: Which sum?

• Text has been clarified

Lines 92-5: Does this say that the 4-flavour calculation is used because it has better statistics, even though the 5-flavour calculation should in principle be better?

• removed 4F/5F discussion and added :
For the signal distributions, the $\Wbb$ component of an inclusive $\Wjets$ sample was used, with the shapes of the distributions taken from a dedicated high-statistics generated sample of exclusive $\Wbb$.

Line 183-4: Surely not to 100% of the factor itself? i.e. If the factor is 1+epsilon, is the uncertainty 1+epsilon?

• If the factor is 1+epsilon, then the uncertainty was taken to be epsilon. We are now taking the uncertainty from the fit.

Line 192: If the channels are uncorrelated, why isn't the uncertainty on the average 0.16/2 = 0.08? And even if they are correlated, the uncertainty on the weighted average cannot be larger than 0.11

• This number was a result of putting in the 100% uncertainty on the fit and is no longer in place.

Lines 193-4: Are the rescaling factors allowed to vary only within 1 sigma?

• No, this is the Gaussian 1 sigma bound. Text clarified.

Last sentence of Section 6: It is not clear what this means.

• removed

Figs 1-2 and 3: The x variable is called 'Transverse Mass', but aren't they different transverse masses?

• We have clarified that in the multilepton region, the lepton being named is the one used in calculating mT

Table 1: Please clarify the difference between 'JES rescale' and 'JES'

• This was a conservative estimate, including both the JES and the rescaling of the JES. Now we are taking just the uncertainty from the fit (Step 2).

Line 205: Specify which sources are correlated.

• The sources of uncertainty are either fully correlated or fully uncorrelated across MC samples. The fully uncorrelated ones are the uncertainties on the cross sections of the given samples. This is described in Table 1 and we added a reference in the text.

Line 213: It sounds as if different methods are used for estimating the shape of the QCD background for different variables. Is this so?

• Yes, for these other variables, the QCD shape is taken from the low mT sideband and has been validated using QCD MC. Illustrated here are comparisons above and below mT=30 GeV for QCD MC in the dR(bb) and pT_Lep variables.

Table 2 a) The numbers in the 'muon, initial' column do not add up to the data number.

• This is one of the motivations for performing a fit.

b) Is it worrying that the numbers for both electrons and muons change so much between 'Initial' and 'Fitted'?

• Not worrying, but this is what we see.

c) Can you confirm that it is correct that the estimated signal strength is larger for electrons than for muons? (The number of muons is larger than the number of electrons, and the acceptance times efficiency is larger for electrons - see line 228)

• Yes, the signal strength is estimated higher for electrons than muons. The acceptance x efficiency is not actually used in the calculation of the cross section, and is affected by the rescalings in the fits. The numbers quoted in the text showed higher precision than warranted (+-10), and we have modified the text.

Fig 3: The data does not agree too well with the MC in the peak regions of the two plots

• This is what we observe with the method we described.

Lines 222: Delete the words 'correction factors'.

• changed

Line 255: Why is the uncertainty on the hadronisation correction factor larger at 8 TeV than at 7 TeV?

• This is because of our use of no neutrinos in the gen jets at 8 TeV while at 7 TeV the gen jets had neutrinos in them.

Line 288 ('... once hadronisation and DPS effects are taken into account.'): Do they both make such a change that it is clear that both are needed in the calculation?

• Yes, they are both needed.

### Greg Landsberg

L7: add the CMS long discovery paper as an extra reference.

LL23-31: Really suggest separating this text into a detector section - this hardly belongs to the Introduction.

L37: add the second standard reference to PF: [PFT-09-001,14].

LL46-50: are you sure that the isolation correction for the PU is done the same for electrons and muons? Usually one is based on average PU density, whereas the other one uses charged PF candidates to estimate the contribution from the neutrals. Please, double check.

• Yes we use Beta Star for both leptons

Eq. (2): suggest taking a square root of both parts: $M_\mathrm{T} = \sqrt{2 E_\mathrm{T}^\mathrm{miss} p_\mathrm{T}^{\ell} (1 - \cos\Delta\phi)}$; end the equation with a comma.

• removed equation
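
For reference, the transverse mass in the square-root form suggested in this comment can be computed as in the short sketch below. The function and argument names are illustrative only and are not taken from the analysis code.

```python
import math


def transverse_mass(lepton_pt, met, delta_phi):
    """M_T = sqrt(2 * pT(lepton) * MET * (1 - cos(delta_phi))), inputs in GeV and radians."""
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(delta_phi)))


# A lepton and missing transverse energy of 40 GeV each, back to back in phi,
# correspond to M_T = 80 GeV; collinear ones give M_T = 0.
print(transverse_mass(40.0, 40.0, math.pi))
print(transverse_mass(40.0, 40.0, 0.0))
```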

L73: ... 0.1\% for light-parton (udsg) jets and ...

• changed

LL83-84: give full versions of the MadGraph and Pythia generators used. Also give a reference to the MLM matching scheme. Finally, neither Ref. [31] nor [32] contains the Z2* tune. You should use the FSQ group recommended description of the tune: "The most recent PYTHIA Z2* tune is derived from the Z1 tune [arXiv:1010.3558], which uses the CTEQ5L parton distribution set, whereas Z2* adopts CTEQ6L [JHEP 07 (2012) 012]. The Z2* tune is the result of retuning the PYTHIA parameters PARP(82) and PARP(90) by means of the automated PROFESSOR tool [Eur. Phys. J. C 65 (2010) 331], yielding PARP(82)=1.921 and PARP(90)=0.227." The last sentence could probably be dropped for the purpose of this paper.

L93: define μF as the factorization scale here; also say that mb is the b quark mass.

• changed to all words

L95: ... in the $M_\mathrm{T}$ variable used in fitting and cross section extraction.

L159: multijet nature of the tt¯ production and decay, the JES variations ...

LL161-163: the sentence is very poorly worded and unclear. Please, rephrase for clarity.

• sentence (and paragraph) has been rephrased and clarified

LL181-183: the sentence is really awkward: if the uncertainty is already conservative, what's the point of increasing it even further? Just drop the "and increasing ....the fit" part of the sentence.

• removed

L187: move the references to the PDF4LHC here, i.e.: "... PDF set, according to the LHAPDF/PDF4LHC prescriptions [44-47]."

• moved

L192: why the average of 1.17±0.12 and 1.13±0.11 has an uncertainty of 0.14, which is bigger than any of the two input uncertainties? The average must have been done incorrectly - please fix.

• This was a result of inflating the uncertainty up to 100%; we now take the uncertainty directly from the fit.
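
The referee's point can be checked with a standard inverse-variance weighted average. The sketch below assumes the two channel results quoted in the comment (1.17±0.12 and 1.13±0.11) are uncorrelated; it illustrates the arithmetic and is not part of the fit itself.

```python
import math


def weighted_average(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty for uncorrelated inputs."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


mean, sigma = weighted_average([1.17, 1.13], [0.12, 0.11])
print(f"{mean:.2f} +- {sigma:.2f}")  # prints "1.15 +- 0.08"
```

The combined uncertainty is always smaller than the smallest input uncertainty, which is why 0.14 could not have come from a plain average.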

LL192-194: your description of the fit constraints in the paper is extremely confusing. Here the sentence implies that you put rectangular constraints of ±1σ on the parameters in the fit to the signal region, and so it appears from the Table 1 caption. Yet the very fact that your JES is 1.3σ off its central value as a result of the fit clearly shows that the constraints used are not rectangular, so I assume they are Gaussian, which is fine. However, you clearly need to rephrase the text, as the way it's written implies that the parameter can't change by more than ±1σ, which is not true. Similarly, the text of the Table 1 caption should be rephrased accordingly.

• added "the uncertainty on the rescaling sets the one standard deviation bound on the rescale factor in subsequent fits." and the text has been clarified in the description of the procedure
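
The difference between a Gaussian constraint and a rectangular ±1σ bound can be illustrated with a toy one-parameter likelihood. This is a sketch under stated assumptions: the nuisance parameter is expressed in units of its prior standard deviation, the data term is taken to be Gaussian, and the numbers (`theta_data=1.6`, `sigma_data=0.5`) are hypothetical and chosen only to show a pull larger than 1σ.

```python
def constrained_nll(theta, theta_data, sigma_data):
    """Toy negative log-likelihood (up to a constant) for one nuisance parameter.

    theta is in units of its prior standard deviation, so the Gaussian
    constraint term is simply 0.5 * theta**2.
    """
    data_term = 0.5 * ((theta - theta_data) / sigma_data) ** 2
    constraint_term = 0.5 * theta ** 2
    return data_term + constraint_term


def best_fit(theta_data, sigma_data):
    """Analytic minimum of constrained_nll with respect to theta."""
    return theta_data / (1.0 + sigma_data ** 2)


theta_hat = best_fit(theta_data=1.6, sigma_data=0.5)
print(f"pull = {theta_hat:.2f} sigma")  # prints "pull = 1.28 sigma"
```

With a Gaussian constraint the parameter is penalized for moving away from its nominal value but is free to end up beyond 1σ when the data prefer it; a rectangular bound would have stopped it at 1.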

Figures 1-4: Single Top -> Single t in the legends.

• changed

LL221-222: ... A, ϵ are the acceptance and efficiency. [Those are not corrections to the acceptance and efficiency, but the efficiency itself!]

• You are correct; changed.

L227: give the FEWZ version used.

• added (FEWZ 3.1)

L233: define the μR as the renormalization scale.

• defined

LL234-238 and Table 3: was the MC b-tagging efficiency used to define the fiducial region with two b-tagged jets corrected for the data/simulation scale factors, either the standard one from Ref. [25] or from the one obtained in the fit? This is important for the theory values, but has not been discussed at all in the text.

• We start with the standard recommendation and then do the 3-step fitting procedure described to rescale it. This uncertainty is the uncertainty on the final fit (which is now equivalent to the output uncertainty of the b-tag efficiency fit in step 1).

L239: give the MCFM version used.

• MCFM7.0

L244: first of all, give a reference to the CUETP8M1 tune [GEN-14-001 paper]. Second did you really use CUETP8M1 tune even with Pythia 6, as the sentence suggest? This tune was derived for PY8, not PY6!

• We used CUETP8M1 for PY8, CTEQ6L1 for PY8 and the sentence has been clarified.

Figure 5: remove "CMS 2012" legend; use "(theo)" instead of "(th)" to match the paper text.

• new Figure 5 has been made

LL291-292: the Acknowledgement section is completely amiss; please add a standard acknowledgement for long letters.

Title: suggest "Measurement of the production cross section of the W boson in association with two b jets ..."

• changed

##### Abstract:

L7: fiducial region defined by the transverse momentum

• abstract changed

L9: add a comma before the first "and";

• abstract changed

##### Introduction:

L3: at the CERN LHC is relevant

L13: to jets that originate

• changed

L17-18: using data samples corresponding to integrated luminosity of up to 5 fb−1, and at

• changed

L19: collected with the CMS detector

• changed

LL21-22: muon decay channel of the W boson, this analysis uses both muon and electron decay channels.

• changed

##### Event selection and reconstruction:

L51: Missing transverse momentum, p⃗ missT, is defined;

• kept as MET

L53: a W boson candidate.

• changed

L55: lepton p⃗ T and p⃗ missT. The MT [can't start a sentence with a math symbol];

• changed

L57: The value of p⃗ missT is corrected;

• symbol kept

L60: anti-kT;

• We were previously asked to change this to follow the reference, which uses k_t in its title.

L67: b tagging algorithm;

• We were previously asked to hyphenate all "b-tag"

L70: likelihood ratio discriminator. The tagging;

• changed

L72: has a b tagging efficiency;

• not changed

L74: in the efficiency;

• changed

##### Simulated samples:

L77: After all the requirements for the signal-enhanced dataset are applied;

• changed

L79: single top quark, γ+ jets;

• changed

LL80-81: except for the QCD background, which is;

• changed

L82: add a comma before "or";

• changed "or" to "and"

L84: The kT-MLM matching;

• We were previously asked to make it k_t

L85: add a comma after "respectively";

• changed

L96: Single top quark events;

• changed

LL97-98: using {\sc pythia} with the Z2* tune.

• kept version number

L99: with {\sc pythia} at leading order;

• kept version number

L100: with {\sc pythia} using the Z2* tune.

• kept version number

L103: Single top quark and;

• kept version number

L105: by the ATLAS and CMS Collaborations [42].

• updated reference cross section and citations

L110: generated with {\sc pythia}. During;

• kept version number

LL111-112: adjusted to match that observed in data.

• changed

##### Analysis strategy:

LL115-116: Since initial shapes and normalizations of all contributions, except for QCD, in the fit are taken from simulations, it is;

• changed

L126: have more than two jets. The;

• changed

L132: done at the generator level.

• changed to "particle list level"

L133: A b jet at the generator level;

• changed to "bottom"

LL137-138: again either from the matrix element calculation or from parton shower, it falls;

• changed

L145: The shape of the QCD background distribution;

• changed

L147: to be nonisolated, I>0.20 (0.15).

• changed

L149: 1\% of the QCD background rate.

• changed

L150: The QCD background normalization;

• changed

L153: to obtain the QCD background shape.

• changed

LL155,156: the b tagging efficiency;

• consistent style to keep b-tag hyphenated

L166: b tagging efficiency;

• consistent style to keep b-tag hyphenated

##### Systematic uncertainties:

L174: effect on the signal yield.

• changed

L175: uncertainty in;

• changed

L178: a log-normal distribution;

• changed

L181: The 50\% uncertainty in the QCD background;

• changed

L183: The common b tagging;

• consistent style to keep b-tag hyphenated

##### Results:

L190: the b tagging efficiency;

• consistent style to keep b-tag hyphenated

LL194-195: uncertainty in the rescaling factor is included;

• description changed

Fig. 1 caption, LL1-2: in the tt¯-multijet control region after the fit to obtain the b tagging rescaling factors.

• merged Figures 1,2,3

L3: The last bin contains;

• changed

L4: uncertainty in the simulation as the output from the fit.

• changed

Table 1 caption, L1: in the W +bb¯ signal region.

• changed

L2: of the given systematic source to the;

• changed

L3: in the measured cross section.

• changed

The uncertainty labeled "b tagging rescaling" is;

• changed to b-tag eff rescaling

L4: of the b tagging efficiency. In the "variation";

• changed

L5: uncertainties that are correlated across ...affect both the shape and;

• changed

L7: The UES refers to the scale of energy;

• changed

L*: add a comma before "and MES";

• changed

L10: The uncertainty in the integrated luminosity [12] and the;

• changed

L11: in the acceptance;

• changed

Table 1 body, first column: Single t; Drell--Yan [en dash, not a hyphen];

• changed

b tagging rescaling;

• changed to "b-tag eff rescaling"

JES rescaling; Integrated luminosity;

• changed

Theory (scales and PDF);

• changed

Fig. 2 caption, L3: The last bin;

• changed

L4: uncertainty in the simulation as the output;

• changed to "after the fit"

L210: In the signal region, lepton

• changed

L212: two b-tagged jets;

• changed

Table 2 caption, L1: uncertainties in the;

• changed

Fig. 3 caption, L3: The last bin; L4: uncertainty in the simulation as the output;

• merged with Figure 1

Fig. 4 caption, L2: The QCD background shape;

• changed

LL2-3: and the muon and electron channels;

• changed

L3: The last bin;

• changed

L4: uncertainty in the simulation as the output from the fit.

• changed

L219: The cross section is defined as;

• changed

L219+1 and L22: use Ndatasignal [no capitalization on "data"] here in three places.

• changed

L225: with fiducial selections applied. Then a K factor for;

• changed

L227: {\sc fewz}; {\sc MadGraph };

• changed

L229: and b tagging efficiency;

• changed to "b-tagging efficiency"

L230: The uncertainty in this;

• changed

LL231-232: using the LHAPDF/PDF4LHC prescription considering PDF sets from CTEQ, MSTW, NNPDF, and HERA Collaborations, as well as;

• changed

L240: from {\sc MadGraph } interfaced with {\sc pythia 6};

• changed

L241: and {\sc MadGraph } interfaced with {\sc pythia 8} [48];

• changed

L242: using tune Z2*.

• changed

L244: other) using the CUETP8M1 tune.

• changed

LL251-252: Simulations of {\sc MadGraph } + {\sc pythia 8} events that include double parton scattering (DPS);

• kept because both Pythia6 and Pythia8 give similar DPS numbers

L268: using the LHAPDF/PDF4LHC prescription in which PDF sets from CTEQ, MSTW, NNPDF, and HERA Collaborations are considered.

• changed

L270: choice of scales are also;

• changed

L285: $p_\mathrm{T}^{\ell} > 30$ GeV, $|\eta^\ell| < 2.1$;

• taken from abstract

##### References:

Refs. [1,2,6,8,10,11,21,29,36-39,41,44,48]: separate parts of journal name with a space: e.g., Phys. Rev. Lett.

• changed

Ref. [3]: fix "gamma*" in the title.

• changed

Refs. [4-6,8-11,15-19,25,32,40]: fix "sqrt(s)" in the title.

• changed

Ref. [18]: JHEP {\bf 01} (2011) 080.

• changed

Ref. [22]: JHEP {\bf 04} (2008) 005.

• changed

Ref. [27]: JHEP {\bf 06} (2011) 128.

• changed

Ref. [35]: JHEP {\bf 09} (2009) 111 and add an Erratum, JHEP {\bf 02} (2010) 011.

• changed

Ref. [38]: Phys. Rev. D {\bf 86} (2012) 094034.

• changed

Ref. [42]: ATLAS and CMS Collaborations; give the CMS PAS number.

• reference removed

Refs. [45,46]: swap the order of the two references, so as to list them in the order of appearance.

• LaTeX takes care of this automatically

Ref. [47]: Nucl. Phys. B {\bf 867} (2013) 244.

• changed

Ref. [49]: JHEP {\bf 03} (2014) 032.

• changed

### Avto Kharchilava

#### Type A

Eq. (1): Instead of sum(pT^gamma)+sum(ET^neutral), use sum(ET^neutral), or sum(pT^gamma)+sum(ET^neutral hadron)

• equation not changed, updated explanation

L46: PF candidates (hadrons, electrons, photons): Replace electrons with leptons?

• changed

L64: Can you give more details or reference on the pileup rejection? Is this beta*? PU MVA?

• This is beta star

L79-80: QCD multijet. Call this background "multijet" or "QCD multijet" here and elsewhere. "QCD" appears in various contexts currently, e.g. "QCD predictions" (Fig. 5, L287).

• changed

L84: for hadronization => for parton shower and hadronization

• changed

L101-105: Missing discussion on the gamma+jet normalization.

• added

Section 4: Can you provide the pre-fitted templates/figs.? Or pre-/post-fit comparisons?

• TT-Multijet Prefit
• TT-Multijet Fitted
• TT-MultiLepton Prefit
• TT-Multilepton Fitted
• Wbb Prefit
• Wbb Fitted

L130-131 seem inconsistent with L86-87.
In L86-87, the shape of the Wbb is taken from the dedicated W+bb sample, while L130-131 says the shape is taken from W+jets samples.

• Has been clarified in text

L130-131: Would be good to illustrate the M_T shapes of these samples. Are they different? If not, then the fit cannot separate them? This is not clear from the Figs. after the fit.

• The shapes are different enough to have some discrimination power, as can be seen in a direct shape comparison here:

Paragraph at L130 describing jet flavor definition in simulation: "Jets with distance parameter smaller than R=0.5 with respect to lepton are removed from the event." This is done in order not to count prompt leptons as genjets. So leptons from b-hadrons should not be considered in the procedure, otherwise it will remove jets with semileptonic b decays. This indeed is stated in the sentence "Generated leptons originating in simulation from the decay of b-hadrons or tau-leptons are not considered." However, it's unclear why leptons from tau decays should not be considered in the procedure. Does it mean it's ok if they get reconstructed as genjets? Why not simply state that "Jets with distance parameter smaller than R=0.5 with respect to prompt leptons from V decays are removed from the event."

• changed

L143-144: Could explain better here, or in caption to Table 1, which uncertainties correspond to the cross section.

• clarified text on systematics

L164+: Procedure of fitting the ttbar samples twice needs more justification, explanation of purpose. Did the stat comm sign off?

• yes they did

L186: by varying the PDF set. Describe here which PDF sets and prescription.

• added

L192-193: The b-tagging rescale uncertainty is 0.14/1.15=12.2% which is a bit different from 12.9% in Table 1. Why?

• We now take this to be a closure test and the uncertainty is now taken directly from the uncertainty on the multijet fit.

L198: What is the posterior uncertainty on your 1.3sigma JES nuisance parameter?
In general: A table of posterior nuisance parameters would be helpful (at least on the twiki).

• We no longer adjust the uncertainty on the fit and take it from the fit itself, so 3.8% in this case.

L224-228: How does the k-factor affect the measured cross section? Is it included in the fit? Why is there no uncertainty coming from the k-factor to the measured cross section?

• The k-factor does not have a shape dependence so does not affect the fit. If it were higher, then we would measure a correspondingly lower signal strength, resulting in the same measured cross section.

L272: at the hadron level => at particle level, or particle jet level?

• We mean at the (neutrinoless) jet level. For MCFM, this is accomplished by multiplying the particle level estimation by a hadronization factor.

#### Type B

Abstract: LaTeX: 19.8 fb-1 => 19.8~fb-1

• changed

Abstract: The W bosons are reconstructed via their leptonic decays to muons (W → μν) and electrons (W → eν) => The W bosons are reconstructed via their leptonic decays, W->\ell v, where \ell = mu or e. Change the rest accordingly, e.g. drop the definition of \ell in the following sentence.

• changed

L13: which => that

• changed

L22: channels => modes

• changed

L23-31: New Chapter, e.g. "CMS Detector"

• added

L68: b jet discrimination => b jet identification

• changed

L73: Define "light jets" (u, d, s and gluon jets)

• followed another comment

L83: Define PDF

• changed

L87: inclisuve => inclusive

• changed

L93: mu_F, m_b not defined?

• defined now

L105: colleced => collected

• changed

L119: to describe => to describe data in

• changed

L191: presented => shown

• changed

L221: A,e => A and e

• changed

L275: Sentence "The results also agree within one ..." is a bit ambiguous. It seems to imply that 8 TeV results are compared with 7 TeV, which is not the case here.
• removed

Conclusions => Summary: Could adopt the text from the Abstract

• taken from abstract

If not, then L285: Need to say there are no other jets with pT > 25 and |eta| < 2.5 as in the abstract. L290: Drop the last sentence.

#### Figures

Could improve on the fill pattern: W+bb and W+cc have ~same color

• changed W+cc color

Adopt "QCD multijet", or just "multijet" in legends, instead of "QCD"

• changed to "Multijet" in legends

Fig. 4: Label dR(J1, J2) => DeltaR (b,b) to be consistent with the caption/text

• changed

Need to remake Fig. 5: Data are shown with points +- error bars in the paper (and in most HEP papers), except in this Fig. This change of nomenclature will cause confusion, especially when presented in public, as was the case at the SM@LHC Conference this week. You could show mu, e and combined results separately and pick just the MCFM prediction for comparison, with uncertainties shown as shaded bands (like you now have for data). Confusion will be further amplified when the current plot is put side-by-side with the corresponding Fig. from ATLAS.

• remade Figure 5

Fig. 5: Uncertainty on lumi is not indicated?

• remade Figure 5

Fig. 5, caption: First time the "5F" abbreviation appears. Define 4FS and 5FS schemes in the text and use it here.

• included description of four- and five-flavour schemes and updated plot/caption

#### Tables

Add outer borderlines.

Table 1: The numbers quoted for variation of normalization uncertainty are not explained, except for multijet. E.g. where does the 7.4% variation for ttbar come from?

• The reference has been updated and the uncertainty on the cross section has gone down.

Give scale and PDF uncertainties separately. The 7 TeV analysis quotes 10% just for the scale variation?

• done

Table 2 and equation on p8. From these it looks like signal strength alpha is = Nsignal_fitted/Nsignal_Initial. But the numbers in the Table do not support this: 1731.0/1322.7=1.31, not 1.21 quoted in the Table. This needs a better explanation.
• The final normalization of the sample comes from a change in cross section (signal strength) as well as from the effects of systematic uncertainties which may be correlated across different samples.

Excessive precision for expected yields, could round to an integer number

• make plots and tables consistent

### Salvatore Rappoccio

Lines 87-89: Does this procedure accurately pick up the components of W+bb from flavor creation and gluon splitting? Long ago this was checked and found to be lacking. Are your phase space cuts sufficient to suppress the gluon splitting component? It would still be good to quantify this.

• How does flavor creation make b quarks? The CKM matrix makes this a very small effect. Gluon splitting is what we are looking at.

Section 4: What is the effect of the additional jet veto on the flavor creation and gluon splitting rates?

• Flavor creation is negligible and gluon splitting is what is being studied.

Line 161: Is your cross check only valid for flavor creation, or does this also work for gluon splitting from the parton shower?

• It is valid for gluon splitting.

Figure 5: The source of the discrepancy could be either gluon splitting or flavor creation. Have you isolated which?

• Flavor creation does not play a role here.

### Theodoros Geralis

#### Type A comments (English/Style/Formatting)

Abstract: "… a fiducial region having transverse momentum …" propose to write "… a fiducial region defined by leptonic transverse momentum …"

L 85: the numbers 20 and 30 GeV are given without a reference or an explanation.

• numbers removed

L 87: "inclusive W …" is misspelled

• fixed

L 93: The parameter μF (mu_F) should be defined and not considered by default as well known.

• Now described in words

L 105: colleced -> collected

• changed

L 108: "same algorithms as are .." -> "same algorithms as the ones .."

• changed

L 119: When you say "simulation" do you mean ttbar or all simulated samples? The statement is a bit vague.
• all simulated samples

L 165: reigion -> region

• changed

L 166: effeciency -> efficiency

• changed

Table 2: Add a line for the total MC contribution

• changed

L 233: mu_R should be defined.

• now defined

#### Type B Comments (Strategy, paper structure, …)

Title: The title as it is implies an inclusive cross-section, while only W decays to (e,nu) and (mu,nu) are considered.

Abstract: The same remark for the first phrase of the abstract.

• We think this is clear from the abstract, but this can be discussed during the final reading

L 18-20: ambiguous meaning, it could also be interpreted as that what follows extends the data collected at 8 TeV.

• changed "using" to "and uses"

L 73-75: perhaps can be rephrased to give less emphasis on the pT dependence of the scale factors.

• removed, this is covered by "jet kinematics"

L 77: "After all selection … applied", what does "enhanced dataset are applied" mean? Does not make sense.

• changed

L 87: "the normalization of the distribution is taken from the W+bb component..", but later in lines 101-102 says that the normalization is taken from FEWZ?

• clarified

L 116-117: "except QCD, it is important to prove that the simulation describes the data". The phrase "except QCD" is not complete, we propose to write "except QCD processes". Also the statement "prove … data" is very strong and we propose to rephrase it.

• added "processes" and changed "prove" to "verify"

L 126: In the ttbar multilepton case it is not stated which lepton is used to calculate the M_T. Is it the highest-pT one or both?

• We describe two channels, muon and electron, and the lepton being named is the lepton being used to calculate the mT

L 130-138: It is not clear why the definition of a "W+bb" category is not similar to "W+cc". In the first case, a single b hadron originating jet is enough, while in the second case an even number of c-jets is required. Why this difference in the number of b/c jets required to characterize the event?
• W+b is CKM suppressed while W+c is not. So W+bb and W+cc are more similar (gluon splitting) than W+cc and W+c.

L 145-149: Is the shape of QCD in isolated and non-isolated regions similar in simulated events for these control regions?
• Yes. Below are the isolated and non-isolated regions using QCD MC
• And for the other variables, we also see good agreement with our choice of QCD sideband (mT<30 GeV)

L 150-153: Is the QCD normalization extracted in the MT<20 GeV region the same as the one you get when you use the anti-isolated region?
• yes

L 164-171: What is the region of the fit in the first step? Does it include the region MT<20 which was used earlier for QCD normalization?
• This is the ttbar multijet region (third jet requirement) and includes mT>0. The QCD is allowed to float in this fit.

How are the 2 scale factors for b-tagging and JES disentangled in the first fit? They are both there, aren't they? Would it be better if a region with no b jets was used to extract the JES factor first?
• First we use a phase space where just the b-tag scale is relevant, and then we fit in a region where both b-tag and JES are relevant.

L 181: The error on the scale factor for QCD would give an estimate for the uncertainty on this background. How much is it? The conservative estimate might be an overestimation of the uncertainty.
• Even with the 50% uncertainty on the fit, we see that the QCD contributes 2-4% to the overall uncertainty in the measurement

L 183: ok but this is not the uncertainty used in the cross section extraction, right? What is the uncertainty mentioned in Tab 1? Is it the propagation of the fit (3rd fit) uncertainty on the JES & b-tagging in the cross section estimation?
• That is correct, the scale and PDF are taken (as well as luminosity) separate from the fit.
The propagation of the fit (3rd fit) on the JES and b-tagging is reflected in the last column of Table 1, which indicates the effect of the given systematic on the final quoted uncertainty.

L 192: We don’t understand why you quote such a conservative error for your most important systematic uncertainty. The statistics in the two samples (e, mu) are roughly equal, so in effect it is as if you double your statistics to measure the rescale factor. One can calculate it and it is 0.08. Instead the 0.14 (half of the widest envelope?) is used in Table 1. Still, even if we take the error you quote (0.14), you should have a variation for the b tag rescale of 0.14/1.15 = 12.2% and not 12.9%? How is the uncertainty in Tab 1 estimated? If one takes 0.08 as the total error for the b tag rescale, it should be 0.08/1.15 = 7%, which would lower your total systematic from 0.14 to 0.12.
• We are now taking the uncertainty directly from the fit.

Table 1: The label of the third column (uncertainty) is not correct and leads to confusion in the caption text (e.g. The uncertainty labeled “b tag rescale” is the uncertainty … ). The label should be “uncertainty source” as the “Single top”, “Diboson” etc are really the sources and not the uncertainties themselves. Consequently the phrase: “The uncertainty labeled “b tag rescale” is the uncertainty associated with the rescaling …” should be written as “The uncertainty source labeled “b tag rescale” is associated with the rescaling …”
• clarified text

also “The uncertainty labeled as “Id/Iso/Trg” is the uncertainty associated with the efficiency of the lepton … “ should be written as “The uncertainty source labeled as “Id/Iso/Trg” is associated with the efficiency of the lepton … “.
• clarified text

L 200: Is JES a free parameter in the final fit? According to statements in the previous paragraph JES & b-tagging are free in the last fit.
• JES is allowed to float in the fit with a Gaussian constraint, the 1 sigma bounds of which are set by the fit in Step 2.
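
The Gaussian-constrained treatment of the JES described in the last response can be sketched as below. This is only a minimal illustration under stated assumptions, not the analysis code: the function name `constrained_nll`, the single nuisance parameter `theta`, the fractional background shift `bkg_shift`, and all yields are hypothetical. The point is that the nuisance is not cut off at its 1 sigma bounds; pulling it away from its central value simply costs a penalty term in the likelihood.

```python
import math

def constrained_nll(observed, signal, background, mu, theta, sigma_theta,
                    bkg_shift=0.05):
    """Binned Poisson negative log-likelihood with one Gaussian-constrained
    nuisance parameter (constant terms dropped).

    mu          : signal strength (free parameter)
    theta       : nuisance parameter, centred at 0
    sigma_theta : width of the Gaussian constraint (e.g. from an earlier fit)
    bkg_shift   : hypothetical fractional change of the background
                  normalization per unit of theta
    """
    nll = 0.0
    for n, s, b in zip(observed, signal, background):
        expected = mu * s + b * (1.0 + bkg_shift * theta)
        nll += expected - n * math.log(expected)
    # Gaussian constraint: a penalty, not a hard cutoff at +-1 sigma
    nll += 0.5 * (theta / sigma_theta) ** 2
    return nll

# Hypothetical yields in two mT bins
observed = [10, 12]
signal = [3.0, 4.0]
background = [7.0, 8.0]
for theta in (0.0, 1.0, 2.0):
    print(theta, constrained_nll(observed, signal, background, 1.0, theta, 1.0))
```

A minimizer would vary `mu` and `theta` together; the constraint width `sigma_theta` plays the role of the 1 sigma bound taken from the earlier fit step.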
Table 2: As referred to in Lines 143 – 144, because of the variation of the backgrounds, do you have background fit values at their limits? You have said that the backgrounds are free but constrained within the predefined uncertainties.
• No we do not

### Tristan Du Pree

The main comments are on the fitting procedure and its description, and a lack of clarity about the main background: ttbar. There are also some comments about the background uncertainties (JES and BTV) and the generators used. General analysis comments are given first, more detailed stylistic comments are below.

= = = General analysis comments = = =

General comment: It is surprising that the analysis ends up with a ~20% uncertainty of which a large fraction (13.0%) comes from b-tagging. It is expected that the b-tag data-MC scale factors have much smaller per-jet uncertainties in the relevant pT range. Now it appears that this analysis instead takes larger uncertainties from a control region, but there are not sufficient details in the paper draft to understand the reason behind this. (Why is it needed at all? Wouldn't it be proving the BTV results wrong if it's serious (but even this is impossible to judge from the paper)? And similarly the size of the JEC uncertainty used by all other run-1 analyses? Why not perform a joint fit of the three regions? Why can we measure the ttbar cross section with 4-5% precision but have to put a 7.4% uncertainty here?)
• The procedure is needed because we see that the b-tag efficiency scale factors do not describe the data in our signal region phase space. It does not prove the BTV results wrong because we showed good agreement in the ttbar phase space measured by CMS. The JES uncertainty is also within the recommendation and our analysis is particularly sensitive to JES because of the jet veto. We do not perform a joint fit because it is less stable. We have updated the ttbar reference and use a lower uncertainty

General comment on backgrounds.
Concerning the ttbar background estimate: why the 3-step procedure and not an all-in-one fit? The 3 steps may sound like a simplification, but the description is poor and it raises a number of questions:

--> what happens if after step 2 (multi-lepton JES fit) you go back to step 1 and fit again the b-tag scale factor? Does it converge? If you have this information please add a statement to the paper
• Yes it converges in closure tests at all steps.

--> what happens in step 3? How much do the JES and b-tag scale factor change in the final fit?
• They do not move from their new central values

--> why is the uncertainty on the JES factor fixed to 100% of the fitted deviation from unity? What was the actual precision from the fit?
• The uncertainty on the JES is now taken from the fit

General comments on JES:

--> It is not clear what 1.3 sigma_JES means: is it a uniform scale factor, or 1.3 times the eta- and pT- dependent envelope of the JES uncertainties? Or ...?
• It is the envelope, and we have added the corresponding percent normalization difference in the ttbar sample (3.3%)

--> It would be useful to indicate the approximate size of sigma_JES in the paper
• done

--> It would be interesting to give the sign of the fitted JES variation: up or down? I hope it is in the same direction as the 1-sigma variation fitted in the Top mass analysis?
• They go in opposite directions which is why the approach of doing a simultaneous fit does not work.

General comments on ttbar:

--> In general I think the impact of ttbar systematics should be checked (also top pt...). Is the hadronization correction for MCFM unreliable? Better use aMCatNLO and Powheg.
• We see no impact from top pt reweighting which is expected given that we see no shape difference even when varying the JES. The hadronization correction is reliable and we do not have a Powheg sample

--> What cross-section was used for the main background ttbar? LO, NLO, or NNLO?
This is relevant to make sense of the ttbar scale factors measured in section 6. The ATLAS-CMS recommended NNLO cross-section at 8 TeV is 253 pb, see https://twiki.cern.ch/twiki/bin/view/LHCPhysics/TtbarNNLO
• We have updated the ttbar cross section reference (239 from CMS to 241.5 from CMS/ATLAS) which is the joint CMS/ATLAS result and has a lower uncertainty

--> section 6: some comments on procedure using the ttbar. Experimental cross-section uses loose b tag, is it realistic to have this 15% discrepancy for 2 tight tags? JES "measured" only from number of events passing cuts. Recommend to check all ttbar model uncertainties (scale, matching, mt, MG vs Powheg, fragmentation, ...)
• We see good agreement with the experimental value when using loose b tags. It is only when we move to our phase space that the disagreements mount, as was presented to the BTV POG and accepted as being related to the b tagging efficiency

--> Are the fitted scale factors compatible with the measured 8 TeV ttbar cross-section and known values for b-tagging SFs ?
• Yes, they are compatible, but with a central value that better describes our phase space

General comments on b-tagging:

--> l.192 : your fit of the b-tag efficiency is basically consistent with one, which is good news since you are already correcting the simulation with data-MC btag scale factors (l 74), which, presumably, come from the BTag recommendation. But your error is very large, 100% of the correction. In the subsequent fit, would you get a different result if you were simply using the results from the BTag group ? i.e. do not apply any additional scaling to the b-tag efficiency, but treat it as a nuisance parameter in the fit, with uncertainties given by the BTag group? My point is that, by applying the additional scaling of 1.15, you may easily bias the subsequent fits - and indeed your fitted JES is a bit off.
• This is along the lines of the first approach this analysis tried, to do simultaneous fits.
The results are consistent within uncertainties, but less stable.

--> Is this after applying the standard (pT-dependent) b-tagging SFs prescribed by BTV? Please clarify this in the paper.
• yes we have applied the SFs

--> In fact, in ttH and tt+bb analyses one typically goes one step further and derives corrections to improve the data-MC agreement of the shape of the CSV discriminator distribution... has this been tried?
• Taking advantage of the pT dependence of the scale factors already derived by the BTV, we found that we only need to adjust this relative normalization.

--> l.166: here an efficiency scale factor is discussed and in line 167 a reweighted sample is discussed. Does this mean that the efficiency scale factor is actually a reweighting? Or what does "reweighted" mean in line 167?
• text has been clarified

--> l.203-205: can't we also fit simultaneously the signal region and the two control regions, with the Wbb signal strength as the fit parameter, treating the btag efficiency and the JES as nuisances that can vary within the uncertainties provided by the POGs? If the knowledge that we (the BTag group) have on the b-tag efficiencies is better than what comes out solely from the analysis of your control region, using that smaller uncertainty in the fit instead of your 13% would reduce the uncertainty of your measured cross-section.
• We have found that we need to separate the fit to isolate the effects of b-tagging and JES. We now take the uncertainty directly from our fits instead of inflating them to 100%

General comments on fit

--> General comment: suggested to perform a simultaneous fit.
• noted

--> l.169: what does "jet energy scale adjusted" mean? Does this mean that a residual jet energy scale factor is determined? How much do these scale factors differ from the official b-tag and JES factors? Could the discrepancies not be due to other reasons? How do you know that it must be the b-tag and JES that creates the discrepancies?
From Table 1 it looks like the effect is rather large (b-tagging is 13% off???). Has this been discussed with the b-tag POG?
• The text has been clarified. The JES is shifted and new templates are made and these are used in the fit. The scale factors are consistent with the official b-tag and JES factors. We are confident that this is coming from the b tag as discussed with the BTV POG, and the JES is expected (and also observed) because of our jet veto. Yes, this discrepancy has been discussed with the BTV POG.

--> l.154: it is written that b-tagging has an impact on the shape of distributions, but Table 1 lists b-tagging only as normalisation uncertainty. Which one is true? Where is the b-tagging shape uncertainty taken into account?
• b-tagging has an impact on the shape of distributions in principle, but in practice in this phase space, it only affects the normalizations.

--> l.189: is this "fit in the tt multijet region" the same that was already discussed in line 164-167? It's just 20 lines above, I don't think that the reader needs to be reminded.
• Yes, we start the Results section with a quick overview of the procedure

--> l.192: why does the relative uncertainty on the scale factor increase when combining the 2 channels?
• This was a result of our inflation to 100% and is no longer the case

Ttbar: It would be good to check the effect of the main systematic uncertainties on the ttbar prediction, to check how much of this is absorbed in the fit:
--> variation of renormalization and factorisation scales
--> top pT reweighting
• We see that the variation of renormalization and factorization scales gives a final uncertainty of 10% and top pT reweighting has no effect

General comments on generator:

--> l.259: Is c-had@LO really suitable for NLO MCFM? Better use aMC@NLO or Powheg.
• we compared aMCatNLO and Madgraph (+pythia) obtaining similar results for the hadronization factor using WBB samples

--> l.262: DPS discussion unclear.
According to Pythia8 manual, the default MPI includes 2->g->bb (s channel). So only the t-channel is missing in the 4F samples?
• The problem is not pythia8, but the hard interaction. The 4F sample directly produces a W+bb sample at the hard-interaction level. The contribution from events with WMuNu + another interaction that gives the bb pair cannot be done simply with pythia. Yes, the subsequent MPI interactions (more than 2 b) are included, but they are negligible in comparison

--> l.265 : i.e. DPS increases the Wbb xsection by 10% ??? that sounds a lot ! how exactly is this DPS correction calculated ?
• Why is this a lot? Do you have a reference in mind, showing it should be less, that you could give to us? It is a value that is cross checked both using Madgraph samples (WMunu inclusive + bbar production, explicitly done with DPS) compared to an older estimation assuming fiducial XSec_WMuNu x fiducial XSec_BB / sigma_Effective.

--> l.270: The scale uncertainties on theory predictions mentioned in line 270 are not shown in Figure 5. Or are they included in the "PDF" uncertainty? Please clarify.
• Figure 5 has been updated

--> Fig. 5: Where are aMC@NLO and Powheg? Should be more reliable than MCFM+had.
• We used the samples that were available

--> l.95+104: since you use an NLO generator (POWHEG) for single-top production, why can't you stick to the POWHEG cross-section for the NLO normalisation (l.104)?

--> l.88-95: the justification for using the four-flavour scheme sample for the shapes (larger MC statistics despite the fact that 5F is deemed more accurate) sounds poor - unless the shapes in 4F and 5F agree well. Is that the case indeed ? at GEN level the statistics should not be an issue for LO samples. Anyway, l 130-131 contradicts what was just said - since it now says that the shape is taken from the 5F W+j sample ?
• The text has been reworded - "For the signal distributions, the W+bb component of an inclusive W+jets sample is used, with the shapes of the distributions taken from a dedicated high-statistics generated sample of exclusive W+bb."

Data/sim agreement:

--> Figure 4: It is mentioned l.217 that there is an agreement between data and simulation. But there seems to be a slope in the lepton pT ratio (the data seems softer than MC). Has this difference been quantified / studied?
• We see the slope and thought that others might find this distribution interesting.

--> 217: "Agreement between data and simulation is observed." How is this quantified? By eye, it looks like there is e.g. a downward trend in the slope of the ratio of the lepton pT distributions
• Statement removed. We will leave it to the reader to decide if they would call this agreement or disagreement.

= = = Specific comments (grammar/spelling/style/etc) = = =

There were still quite some typos in the document that could have been easily caught using a spell-check. Please run this next iteration.

- Abstract: very long sentences
• changed

- Abstract, 2nd half: this tries to be a sentence but isn't. "agrees" --> "in agreement" or re-write
• changed

- Abstract: "the W boson" -> "W bosons" ? (also for consistency with line 4 in the Abstract)
• first instance refers to the process itself, second to our measurement

- l.3: "experimental searches" -> "searches"
• changed

- l.3: "searches" -> "searches for ..."
• added "and measurements"

- l.4+5: 2x "production"
• changed

- l.4+5: swap "vector boson in association with Higgs boson" -> "Higgs boson in association with vector boson"
• changed

- l.8: "an extension" -> "extensions"
• changed

- l.9: "lepton(s)" -> "leptons"

- l.8-10: what different models? It feels like a reference would be useful here.
• There are many searches ongoing in CMS in this general category and we did not want to mention a specific one at the exclusion of others.
- l.10: "jet(s)" -> "jets"
• changed

- l.11: "associated jet dynamics" -> "the dynamics of the associated jets"
• changed

- l.12: drop "to"
• changed

- l.18: "luminosity" -> "luminosity at the LHC"
• added reference to LHC

- l.19: remove comma after 8 TeV
• changed

- l.28: "where pseudorapidity is defined as..." should be moved to section 2 (l.35)
• removed

- l.34: "loosely isolated" is not clear, and the isolation is described a few lines below anyway.
• We expect that most readers will be familiar with the concept of lepton isolation so it was worth mentioning specifically that the triggers have some isolation requirement. As you point out, isolation is described more fully a few lines below.

- l.34: this seems to be the momentum threshold of the trigger (and not the muon selection...). This could be said more explicitly.
• changed "triggers with a loosely..." to "triggers which require a loosely..."

- l.36: remove "then"
• changed

- l.42: "calorimeter" --> "the calorimeter"
• changed

- l.42: "Both the muon and electron" --> "Both the muon and the electron"
• changed

- Eq.1: most symbols in this equation are not explained. Do we assume that they don't need explanation?

- Eq.1 : need to say that \Sigma p_T^charged is restricted to charged particles that come from the primary vertex. By the way, do you use only the charged hadrons, or also charged leptons in this sum ?
• Explanation clarified

- l.54: ...of the W boson...
• line removed

- l.61: "with a distance parameter of 0.5" -> "with a parameter R=0.5" BTW: Ref.20 mentions "radius parameter", maybe good to follow that convention.
• changed to "radius parameter"

- l.72: "that both jets" --> "two jets that"
• changed

- l.72: "which has a" --> "with a"
• changed

- l.73: rewrite as "probability of 0.1% (1%) for light (charm) jets"
• changed

- l.74-75 : the data/MC scale factors for the b-tagging efficiencies: besides pT, they do not depend on the eta of the jet ?
• They do not, they are the CSV Tight scale factors.
- l.77: "for the signal enhanced dataset" is not defined, so better to drop

- l.77 : "signal enhanced dataset" sounds weird. Why not simply "After all sel requirements described below are applied,.."
• changed

- l.80: comma after "simulation"
• changed

- l.82: drop "events with"
• changed

- l.82: "+jets or with ttbar" -> "+jets, and ttbar+jets events"
• changed

- l.85: qcut = 40 in ttbar
• line has been dropped

- l.86: you mention V+jets and ttbar+jets, but how about photon+jets?
• same as V+jets, added

- l.87: 'inclisuve' --> "inclusive" (please run spell-check)
• changed

- l.87: "The normalization" -> "The relative normalization"
• removed

- l.88: at this stage, you have not said yet that the analysis relies on a fit of the MT distribution, hence the reader does not know of which "shape" you are talking about, such that you could use "shapes" (plural) instead.
• removed

- l.90: "b-quark" versus "b quark" in line 247. Please make it consistent. I think "b quark" is the right spelling.
• removed

- l.93: 'mu_F' not defined
• removed

- l.95: "extraction." -> "extraction, MT."
• removed

- l.96: "Powheg 2.0" -> "Powheg~2.0" (Latex)
• changed

- l.97: "in the" -> "with the"
• changed

- l.99: Rewrite as "Diboson samples are generated and hadronized with Pythia 6.4 at leading order (LO) using the CTEQ6L PDF set and the Z2* tune"
• changed

- l.101-103: which PDF?

- l.105: "colleced" -> "collected" (spell check)
• changed

- l.105 : something like a report number is missing in the reference [42]
• reference dropped

- l.108: drop "as are"
• changed

- l.109: "additional" -> "additional simultaneous"
• changed

- l.119: "the control regions" -> "the data in the control regions"
• changed

- l.121-122: drop this whole sentence concerning the muon reconstruction/selection, it was already described before (l.40-50)
• We want to state all of the selection cuts concisely in one place.
- l.125: "in" -> "for"
• changed

- l.129: for the tt-multilepton region, do you require that both leptons have opposite charge ?
• We do not make this requirement. With the selection as is, the region is already pure ttbar.

- l.130-131: cf above, versus l 89 ?
• description moved

- l.131+138: the notation 'W+udscg' leads to confusion because of the 'c'.
• W+c is included in the category (W+cc is not).

- l.132: "done at the truth generator level" -> "done using generator-level information"
• changed

- l.132+137: why at least 1 b jet for W+bb, but 2,4,... jets for W+cc?
• W+b is CKM suppressed while W+c is not.

- l.135+139: "generated" -> "generator"
• changed

- l.136: does this include the lepton from W? The last sentence of the paragraph, l.141, should be moved upwards, in l.136.
• Yes this includes the lepton from the W.

- l.137: you demand at least 2 charm jets for an event to be considered as Wcc. While at least one b-jet is required for an event to be flagged as Wbb. Why do you make this "even" (l.137) distinction for Wcc ?
• W+c happens while W+b doesn't.

- l.139-141: I did not understand where you need this ?
• How about where it is then?

- l.139: "for the final" -> "for final"
• changed

- l.144: "cross sections as" -> "cross sections and shapes as"
• description changed

- l.145: "for each region" -> "in each control and signal region"?
• changed

- l.147: "be anti-isolated" -> "not be isolated"
• changed

- l.145-153: make one paragraph (it's all about QCD)
• changed

- l.152-153: This sentence would fit better at the end of the previous paragraph (l 149). It is not obvious that the QCD mT shapes do not depend on the lepton isolation. Can you quantify that, and/or have a systematics to account for that? By the way, how did you choose the sidebands cuts I > 0.2 (0.15) (l.147) ?
• This is covered by the 50% systematic uncertainty and was determined by looking at Isolation vs. MT.
We chose our QCD isolation sideband by inverting the cut at the loose isolation threshold.

- l.158: "on the third" -> "on a third"
• changed

- l.161: "JES" -> "JES variations"
• changed

- l.162+163: Some unclarity about W+bb region vs process. Suggestion "W+bb has" -> "W+bb process has"
• changed

- l.163: should contain jets from parton shower
• at leading order W+bb has exactly two jets

- l.164-171: why don't you fit simultaneously the two ttbar control regions ? that would allow to properly account for the correlations between the b-tagging efficiency and the JES.
• The shapes are too similar for the fit to distinguish between the systematics - both uncertainties are found to have essentially no effect on the shapes of the distributions - only normalization. We therefore use the three step process to isolate the different effects. Note that this is a different statement than that our choice of variable to be fit does not discriminate between the signal and backgrounds.

- l.165: "reigion" -> "region" (spell check)
• changed

- l.166: "effeciency" -> "efficiency" (spell check)
• changed

- l.167: "averaged" -> "combined"
• changed

- l.166-167: "rescaling factor...reweighted samples" --> is it one factor, or per-event weights?
• description updated

- l.165: "reigion" --> "region" (spell check)
• changed

- l.166: "estimation" --> "estimate"
• description changed

- l.166: "effeciency" --> "efficiency" (spell check)
• changed

- l.165-167: Here only the b-tag efficiency rescaling factor is mentioned as parameter of the fit. It is not clear if there are some other nuisance parameters considered, or if the b-tag efficiency is considered as the only source of data/MC disagreement.
• The b-tag efficiency is considered as the main source of data/MC disagreement, though the other uncertainties are applied.

- l.167-169: Same thing for the second step, is it only the JES that is fitted? In particular is the b-tag efficiency fixed here?
• The b-tag is fixed here, and we confirm with a closure test on the post fit result (applying it in both the tt-multijet and tt-multilepton regions) allowing it to float. The result of these closure tests is unity.

- l.170: "properly" -> "better"
• changed

- l.171: "a fit in" -> "a fit to MT in"
• changed

- l.173: "major" -> "main"
• changed

- l.179: "affecting only" -> "in"
• changed

- l.180: "uncertainties affecting both the shapes and normalizations." -> "uncertainties in the shapes."
• changed

- l.182: "uncertainty does" -> "uncertainty further does"?
• removed

- l.186: PDF uncertainties not defined
• defined

- l.187: give more details of what you do for the PDFs. The reference to the PDF4LHC recommendation should be moved here (l.231)
• reference added

- l.191: "Fig. 1" -> "Fig.~1"
• changed

- l.191: "The measured" -> "The central values of the"
• changed

- l.192: "averaged to" -> "combined to"
• changed

- l.193: "allows the following fits to vary the rescaling factor between 1.01 and 1.29" --> Please remove, it is trivial (assuming this is a 1-sigma interval, not a step function)
• removed

- l.193: "where the uncertainty allows the following fits to vary the rescaling factor between 1.01 and 1.29". This sounds like there are cutoffs at 1.01 and 1.29, but it is probably meant that there is a Gaussian prior with width 0.14?
• line removed, but you are correct

- l.193-194: "uncertainty allows ... 1.29." -> "uncertainty is chosen to cover the individual ranges."
• procedure changed, now taking uncertainty directly from fit

- Table 1 caption: "major" --> "main"
• changed

- l.194: "rescaled", but before it was "reweighted"? Make consistent.
• changed to reweighted

- Fig.1: Suggestion to improve the readability of the legend, possibly combine several backgrounds into "others" (for example diboson + DY + gamma/jet), and explain in the caption what "others" means.
• We show all of the independent samples which are allowed to float in the fitting procedure.
- Fig.1 Caption: "highest" -> "last" and "as output from" -> "after"
• changed

- l.196-202 and Fig 2: the e+mu events shown in Fig.2 left and Fig.2 right must be the same, the only difference being that MT(mu nu) is shown on the left, while MT(e nu) is shown on the right. I.e. you perform two separate fits, one to MT(mu nu) and the other one to MT(e nu)? And they happen to give the same result for the JES? Please clarify.
• Correct, and there is also the difference in the trigger required.

- Table 1 caption: "triggering" -> "trigger"
• changed

- Table 1 caption: "PDF and scale choices" -> "PDF uncertainties and scale choices" (it's not because of the PDF choice, or is it?)
• changed

- Table 1: capitalize "Uncertainty" and "Variation" and "Effect" and "Uncorrelated" and "Normalization" and "Norm." and "Correlated" and "Luminosity" and "Theory"
• changed

- Table 1: also line out 'Correlated' to the middle of the table and change "b tag rescale" -> "b tag eff rescale"
• changed

- Table 1: why both "JES" and "JES rescale"?
• One was for the JES uncertainty and one was for the uncertainty on the rescaling. We now are just taking the uncertainty from the fit and the table has been updated accordingly

- Tab 1: no uncertainty on the c --> b and on the light -> b mistag rates ?
• These are the b tag efficiency corrections provided by the POG and used as input in Step 1.

- Tab 1: you use an uncertainty of 7.4% on the ttbar cross-section? From TOP-14-016 (your ref 42 ?) we measure it better than that! Please check, since that is one of your dominant uncertainties in the end.
• changed source of ttbar cross section, and updated uncertainty

What are the uncertainties used in the fit of the control regions ? (since Tab 1 refers to the fit of the signal region) E.g. you account for the luminosity uncertainty here ?
• same uncertainties with the exception of the rescale factors that haven't been fit for yet. Luminosity is added separately.
- l.199: "multilepton enhanced data set" -> "multilepton-enhanced control region" ?
• changed

- l.200: "simulation with" -> "simulation and"
• changed

- l.201: "sigma_JES." -> "sigma_JES for subsequent fits".
• changed

- l.205: "The correlation between different sources of uncertainties is taken into account." Sources are not independent? Perhaps "correlations between channels"?

- l.205: "correlation between the different sources of uncertainties": you do not mean that the JES is correlated with the UES for example? You probably mean instead the "correlation across all simulated samples", as said in the caption of Tab 1.
• correlation across all simulated samples

- l.208: "boson" -> "boson would"
• changed

- l.208: "event yield." -> "event yield and are not considered."
• changed

- Fig.2: the caption refers to the "muon sample" and the "electron sample" as if they were different event samples, while I understand that these are the same events ?
• Largely the same events, but not exactly, and mT is being calculated via a different lepton.

- Fig.2 legend with 'c': 'W+cc' and 'W+udscg'
• W+c and W+cc are two kinematically different processes and we would like to distinguish them in principle since we have some W+udscg contribution but essentially no W+cc in the signal region.

- Fig.2 caption: "multilepton enhanced data set" -> "multilepton-enhanced control region"
• changed

- Fig.2 caption: "highest" -> "last"
• merged with Fig 1

- Fig.2 caption: "as output from" -> "after"
• merged with Fig 1

- l.210: "the three fits" -> "should be simultaneous"
• Do not understand this comment. Is the suggested sentence this? "...are also produced by applying the results from should be simultaneous to the simulated samples"

- l.212: "b tagged" -> "b-tagged"
• changed

- l.214: "sideband" -> "region" (except if DeltaR plot is for MT>30)
• changed

- l.215: "Table 2" -> "Table~2"
• changed

- l.216: "in the combined lepton channel" -> "combining both lepton flavors"
• changed

- l.216: "Fig. 4" -> "Fig.~4"
• changed

- l.217 and Fig.4 : in contrast to the statement made in l.217, the agreement seen in Fig 4 is not so good !
• line removed

- Table 2: Give total sum of backgrounds? Suggest to add line under W+bb (to visually separate sig&bkg).
• added total MC to bottom

- Table 2: Five uncertainties on W+udscg yields?
• The W+udscg normalization changes on the order of 5%

- Table 2: "muon" -> "Muon" and "electron" -> "Electron" and "0.0" -> "-"
• changed

- Table 2: align numbers right (not left)
• changed

- Table 2: for "signal strength / combined ", which denominator? FEWZ?
• correct

- Table 2: "total uncertainty of the measurement" is not true, since that is (or should be) stat+syst+theo
• changed to "total uncertainty of the fit"

- Fig.3 caption: "highest" -> "last" and "as output from" -> "after"
• changed

- Fig.4 caption: "highest" -> "last" and "as output from" -> "after"
• changed

- Fig 3 and 4: it is really impossible to distinguish the Fit Uncertainty band in b&w
• Updated plot colors

- l.219 and equation: "N_signal" -> "N_reconstructed" ?
• changed

- l.221: "from simulation," -> "from simulation, reconstructed in two fiducial region"
• changed

- l.222: drop "correction factors"
• changed

- l.224: "in the following manner" -> "as follows"
• changed

- l.224+227: "Madgraph" not written in style consistent with previous occurrences
• changed

- l.225: you should give here the fiducial cuts
• text has been updated

- l.225-227: there is no reason why a K-factor computed for inclusive W production would also apply to Wbb in a given phase space.
  By the way you do not say what you use this k-factor for.
  • The W+jets sample is used and the cross section is taken from there, but the normalization of the Wbb should not matter. A higher initial cross section would lead to a smaller signal strength and the final measured cross section would be the same.
- l.227: style of "FEWZ" not consistent with l.102
  • changed
- l.228: "11 (13)%" -> "11% (13%)"
  • changed
- l.229: drop comma
  • changed
- l.232: I thought that the PDF4LHC recommendation involved only the global QCD fits (i.e. CTEQ, MSTW and NNPDF)?
  • we varied all listed here, but HERA was not the largest anyway, so it does not set the bound on the uncertainty
- l.233: is the variation (2,2) and (1/2,1/2)? Unclear from text.
  • correct, added "simultaneously"
- Tab 3 : why is the syst uncertainty quite larger in the electron channel than in the muon channel?
  • Electron channel has gamma+jets and electrons aren't as clean as muons.
- Table 3: drop comma before "pb" and add line under Muon+Electron, to visually separate from Combined
  • changed
- l.242: "using TuneZ2*" -> "using the Z2* tune"
  • changed
- l.245-251: sounds weird.. It starts with some general statement and ends up saying that actually, the phase space considered here won't bring any of the "important feedback" alluded to in l.246.
  • Showing that two schemes give the same result is something we consider "important feedback".
- l.255: "analysis calculated as" -> "analysis, where it was found to be" (but if same method, fully correlated?)
  • sentence changed
- l.257: "the statistics" -> "the limited statistics"
  • changed
- l.261: what is a "CMS simulation" ?
  • changed to just "simulation"
- l.263: ", therefore" -> ". Therefore"
  • changed
- l.268-271: replace by "same way as before for A x eff" (shorter)
  • Kept for clarity
- l.275-277: How do you compare 7&8 TeV?
  • sentence removed
- Fig.5 caption: no need to refer to colors here. "blue" --> "inner" (2x) and "black" --> "outer" (2x)
  • Figure and caption updated
- Fig.5: statistical uncertainty invisible in B&W
  • Figure and caption updated
- Fig.5: Drop "CMS 2012" in legend and maybe add fiducial region cuts?
  • Figure and caption updated
- Fig.5 caption: "blue" not understandable in B&W
  • Figure and caption updated
- Fig.5 caption, last sentence, rewrite as: "effects of DPS are included in the generated samples so the DPS correction is not needed."
  • changed
- l.279: "the W boson in" -> "the W bosons in" (consistency with 281)
  • now the same as in abstract
- l.290: "within the level of 1 S.D." -> is this up or down?
  • now the same as abstract
- References: please check bibliography with titles as in Inspire
  • all references updated
- References general comments: "Phys.Lett.B 99" -> "Phys. Lett. B99" and "Phys.Rev.Lett." -> "Phys. Rev. Lett." and "gamma" -> "\gamma" and "sqrt" -> "\sqrt" and "pbar" -> "\pbar" and "Phys.Rev.D 99" -> "Phys. Rev. D99" and "Eur.Phys.J. C 99" -> "Eur. Phys. J. C99" and "Comput.Phys.Comm." -> "Comput. Phys. Comm" and "Nucl.Phys.Proc.Suppl." -> "Nucl. Phys. Proc. Suppl." and "Nucl. Instrum. Meth. A 99" -> "Nucl. Instrum. Meth. A99"
  • changed
- [18]: "JHEP (2011)" incomplete
  • reference taken from inspire
- [27]: "1106" -> "06" (11 is already in the year)
  • changed
- [42] Citing an unpublished ttbar cross-section measurement technical report in a paper?
  • removed
- [49]: "1403" -> "03" (14 is already in the year)
  • changed

### Andrea Rizzi

#### Type B

Abstract: the last sentence is excessively long, with a subject five lines (!) before the verb ! Break it, for instance as “The W+bb production cross section is measured in a fiducial region having lepton transverse momentum, Ptl> xx and eta<yy, with exactly … .
The measured cross section is found to be sigma= zzz in agreement with standard model predictions.” Note that you do not need to define \ell because you have already specified that the measurement is done with muons and electrons.
  • updated abstract

Abstract: "b-tagged" means nothing when comparing to theory, we should not measure a "b-tagged" cross section, we imagine you correct for b-tag efficiency and quote a cross section for jets associated to b-quarks or b-hadrons
  • Good point, changed

line 12-14 “Throughout this paper jets that originates … are referred to as b-jets, while b-tagged jets as …”
  • changed

lines 14-15 The sentence on Z/gamma* and V boson is not relevant for the introduction, move the definition of V and Z/gamma* where it belongs, e.g. line 79. Besides, the Z/gamma* specification is an irrelevant detail for this specific paper.
  • moved

line 16 “has been studied at the LHC,”
  • added

line 19 reference [8] is not about 8 TeV and not even [12] is
  • reference 8 is about 7 TeV and this analysis is about 8 TeV. Sentence has been reworded.

lines 54-55 please remove the sentence with the formula on the transverse mass, it is a common variable at hadron colliders. Complete the previous sentence with “to form a W candidate and compute its transverse mass, M_T”
  • removed

lines 73: are the numbers signed off by BTV?

line 87: exclusive Wbb means nothing because there are bb pairs generated by the PS. So clarify the meaning of W+bb sample here (which generation cut off for the b ?)
  • reworded

line 93 define the mu_F scale
  • changed to be stated in words

lines 105, 106 in the measurements of the ttbar cross section Wbb is a background, have you checked that the correlation is numerically irrelevant ?
  • We include the Wbb as a background and find that it makes a negligible contribution to either ttbar phase space

line 118 introduce here the nature of the two control regions (one from lepton+jets and the other from dileptons) and their names (tt-multijet and tt-multilepton, maybe better call it tt-dilepton)
  • added description

line 123 What is the effect of the jet veto on theory uncertainty?
  • The uncertainty associated with the jet veto comes from the uncertainty associated with the choice of jet energy scale and this is what is being fit for in step 2.

line 130 Is W+1b-jet associated to the W+bb jet? How can you predict that the ratio of events where both tagged jets have a real b in it to the ones having only one b associated to the two b-tagged jet is well reproduced in theory? do you have a systematic for this ?
  • Are we talking about 2 real bs in one jet, and a second light jet mistagged? Because we require both jets to pass the CSV Tight tag, this makes a negligible contribution to the signal region and we do not include a dedicated systematic since it is covered by the b-tagging uncertainty.

line 131 how is W+b (single b) treated and defined in this paper ?
  • The pure process W+(single b) is CKM suppressed (much more so than W+(single c) is) and does not make a contribution to our signal region, especially since we require both jets to be tagged (CSV Tight)

line 131 why you call it W+udsCg and not W+udsg if charm is considered separately ?
  • We consider W+c as part of W+light and W+cc in its own category. This is motivated by the difference in production mechanism between W+c and W+cc.

line 135 how about events with 3 charm? what's the point in counting them in W+udsCg rather than Wcc?
  • It does not matter to the analysis if they end up in W+light or W+cc since they constitute a negligible fraction of events.

line 150 is the QCD normalization adjusted in the inverted isolation sample ? It is not clear from the text.
  • It is adjusted in the standard isolation sample. Text has been clarified.

lines 161-163 the sentence is obscure, because the W+bb LO sample is not discussed before (in Section 3).
  • text has been modified

lines 178-180 and 181-187 The sentence at lines 178-180 is obscure, in addition there are no details in the paper on how “the fit” is performed. Is it an unbinned or a binned fit ? Are there parameters taken as nuisance parameters ? Which ones ? Are the background parameters marginalized ? Is the uncertainty on the measured cross section reduced because of the marginalization or are the systematics fully externalized ? What does "a quadratic distribution" mean in this context? What does setting b-tag and JES to 100% of itself mean?
  • Text has been clarified.

lines 184: letting a nuisance parameter (such as JES or btag) free to float doesn't really mean that you are "remeasuring it" in the signal region (there are correlations to account for, and you didn't show anything about how the various nuisance parameters, and the signal strength modifier, are correlated)
  • Statement has been removed

line 192: how come that the combination of the mu and ele produces an average with larger error?!
  • This was a result of our 100% uncertainty. We now take the uncertainty directly from the fit.

line 198 sigma_JES must be defined, where is it coming from (some words and a reference) ? The reader is not supposed to know how jet corrections and related uncertainties are defined in CMS.
  • It is now mentioned in the text that this corresponds to 3.4% for ttbar.

line 200: I do not understand the procedure of setting to 1.3sigma, what does it mean?
  • The JES affects the different samples to different extents so the fit in the ttbar is first converted from a change in normalization (normalization only because we have seen that the shapes are identical, and we do include them in principle) into the number of sigma_JES this corresponds to for ttbar.
    The other samples are then shifted relative to their own JES.

lines 209-217 is this the description of additional checks ? Probably it should be added explicitly at the beginning of the paragraph
  • This is a description of variables other than mT being shown in the postfit signal region.

line 214 in the M_T < 30 GeV sideband there is a considerable contamination of signal, how is this treated ? Is this re-weighting in shape done only to produce Fig. 4 or to make the full fit ? This paragraph is very unclear.
  • The normalization is taken from the final output of the step 3 fit (as are the normalizations for everything else). It is only the shape that is taken from the low mT sideband. Validation plots of the mT<30 sideband using QCD MC are shown below

lines 228-229 11% cannot result from 80% times 40% … what you probably have to clarify is that there are these two contributions, but also the geometrical acceptance.
  • It is 40% per jet and we have two jets so 0.40*0.40*0.80 = 0.128

line 265 and related paragraphs and figure. There is something unclear here: a specific study on DPS is done using MG+Pythia 8 and a large systematic uncertainty is added to the DPS component. Instead for MG+py6 5F the MC simulation of DPS is considered without uncertainty, simply because it is part of the default production. Why?
  • When we have to separately measure the DPS, we measure it with a large uncertainty, and when it is already included in the sample, the uncertainty is smaller. In both cases, it is important to include.

line 276 one cannot state that a cross section measured at 8 TeV is in agreement with the measurement at 7 TeV. Cross sections at different C.M. should not be compared.
  • The statement has been removed

Conclusions: same comment as for abstract on ill defined "b-tagged jets" from a theory point of view.
  • conclusions now match abstract

#### Type A

- General: the draft is full of typos, please pass it through a spell-checker, e.g.
  - line 87 inclusive
  - line 105 collected
  - line 165 region
  - line 166 efficiency
  • changed
- line 7 “its” refers to the subject that is discovery. Rewrite as “… to establish the nature of the new boson and determine its coupling to b quarks”
  • changed
- line 71 “The tagging of a jet is made by … value: both jets are required to pass a threshold, which …”
  • changed
- line 132 remove “the truth” is redundant (you say already “generator level”)
  • changed to particle list level
- line 135 “at generator level”
  • changed
- Fig. 5: "blue error bars" look black
- Fig. 5: is CMS 2012 referring to this paper? if so, I'd call it CMS 2016
  • Figure 5 has been remade. 2012 was referring to the dataset used but has been removed
- line 222: isn't mu rather than alpha the standard name in CMS for signal strength modifiers?
  • mu is already being used for renormalization and factorization scales
- lines 224, 227: Madgraph is cited with a different text style, could you check it throughout the paper and use the predefined \MADGRAPH command ? Please check also if you do the same for the other MC generators
  • changed
- line 225 drop the first “calculated”
  • changed
- line 244 add a reference for CUETP8M1
  • reference added

### Sijin Qian

#### Type B (physics)

none

#### Type A (style)

In general

(1) Throughout the paper (including in Abstract, in Figure's and Table's captions, in Tables, etc.), to be consistent with the good examples in this paper (e.g. on L255 and Table 2 (the header column and the 8th row under the header row), etc.) the spaces before and after the symbol "+" in all the expressions of "xxx + jet", "W + bbbar" and "W + ccbar", etc. should be removed, e.g.
L19: "W + bbbar" --> "W+bbbar" • changed Other places where also need to be changed by the similar way are in Abstract (the 5th line), L19, L79, L82, L86-91 (many places), L101, L114, L130-133, L138, L162, L170, Tables 1-3's and Fig.5's captions (the 1st lines), Tables 1 and 2's header columns (several rows), L203-204, L218, L225, L234, L260, L262, L276, L283 and L290, etc. • changed (2) Throughout the paper, to be consistent with the good examples for the electron "e", e.g. L33, and in Abstract (the 5th line), etc. the font of "e" at some places should be changed from "e(italic)" --> "e(non-italic)", the places where need to be changed include in Abstract (at the beginning of the 7th line) and L285 (in two superscripts), etc. • changed Pages 1-10 (3) L5, L17, L56 and L83, the "SM", "sqrt(s)", "QCD" and "PDF" may should be explained at their 1st appearances in text on these lines, i.e. (a) L5: "with standard model Higgs boson production ..." --> "with standard model (SM) Higgs boson production ..." • changed (b) L17: "at a centre-of-mass energy of 7 TeV using ..." --> "at a centre-of-mass energy (sqrt(s)) of 7 TeV using ..." • changed (c) L56: "such as QCD multijet events," --> "such as Quantum ChromoDynamics (QCD) multijet events," • changed (d) L83: "using the CTEQ6L [29] PDF set." --> "using the CTEQ6L [29] Parton Distribution Function (PDF) set. • changed (4) L16, it may be shortened from "The production of Z bosons [3–7] or W bosons [8, 9]" --> "The production of Z [3–7] or W [8, 9] bosons" • changed (5) L23-24, the "superconducting solenoid" is repeated on these lines, the latter one may be shortened from "is a superconducting solenoid of 6m internal diameter, providing a magnetic field of 3.8 T. Within the superconducting solenoid volume are a" --> "is a superconducting solenoid of 6m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a" similar as most newly published CMS papers have expressed. 
• changed (6) L27-28, L34-35 and L121 (a) The "eta" should be explained at its 1st appearance in text on L27; and according to the PubComm guidelines, the definition of eta is no longer necessary, thus L27-28 can be changed from "Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors, where pseudorapidity is defined as eta = -ln[...]." --> "Forward calorimeters extend the pseudorapidity (eta) coverage provided by the barrel and endcap detectors." • have been advised to remove the line entirely Then, L34-35 and L121 can be shortened from L34-35: "pT > 24 (27) GeV and pseudorapidity |eta| < 2.1 (2.5)." --> "pT > 24 (27) GeV and |eta| < 2.1 (2.5)." • changed L121: "pseudorapidity |eta| < 2.1, and" --> "|eta| < 2.1, and" • changed (7) L42-43, L48-49, L51-52 and L75 may be shortened from (a) L42-43: "to have pT larger than 30 GeV and ..." --> "to have pT > 30 GeV and ..." • changed (b) L48-49 and L51-52: (two places) "sum of transverse momenta of ..." --> "pT sum of ..." • changed (c) L75: "on the transverse momentum of the jet." --> "on the pT of the jet." • changed (8) L47, the phi may should be explained at its 1st appearance in text here, also as the numerical value of the angle phi has been implicitly shown on this line, etc., and an angle can be measured in either the radians or degrees, therefore, the unit of phi may should be specified as "DeltaR = sqrt(...). The" --> "DeltaR = sqrt(...), where phi is the azimuthal angle in radians. The" • changed (9) Eq.(2), to be consistent in the paper, the subscript of the 1st variable in the right-handed side should be changed from "EmissT(italic)" --> "EmissT(non-italic)" • changed (10) L57-58 and L63, as the ECAL and HCAL have been introduced earlier on L25-26, they can be used here to shorten from L57-58: "in the electromagnetic and hadron calorimeters using ..." --> "in the ECAL and HCAL using ..." • changed L63: "in the hadron calorimeter [23]." --> "in HCAL [23]." 
• changed Otherwise, if these two places would not be shortened, then the "(ECAL)" and "(HCAL)" can be removed from L25-26, since they now have not been used in whole paper. • kept abbreviation (11) L69 and L140, as the "SV" and "FSR" have not been used afterward in whole paper, they eventually can be removed, i.e. L69: "secondary vertex (SV)" --> "secondary vertex" • changed L140: "radiation (FSR) by summing up ..." --> "radiation by summing up ..." • changed (12) L93, to be consistent in this paper, the font of subscript "b" should be changed from "muF >> mb(italic)" --> "muF >> mb(non-italic)" • this discussion has been removed (13) L128-129, I'm not sure whether the word of "flavor" should be plural or not, i.e. "two isolated leptons of different flavor," --> "two isolated leptons of different flavors," • not changed - "I see two houses of different color" vs. "I see two houses of different colors" the second implies that the two house colors are different than something else (14) L132-134, at two places, the expression of "a b quark (or hadron)" is looked quite odd similar as "a b c d xxxx". It may be improved by changing from L132: "a b quark" --> "a bottom quark" • changed L133-134: "a b hadron" --> "a bottom hadron" • changed (15) L168-169, L196 and the Fig.2's caption (the 2nd line), at 3 places, it can be shortened from "jet energy scale" --> "JES" • changed (16) L191, there seems an extra space which should be removed, i.e. "are presented in Fig. 1." --> "are presented in Fig.1." • changed (17) Figs.1-4 (a) In the legend of each plot, the 2nd and 7th lines, the 2nd words should be in the lower case, i.e. "Fit Uncertainty Single Top" --> • changed "Fit uncertainty Single top" (b) In each horizontal axis label of Figs.1-3, the 2nd word should be in the lower case, or be replaced by the variable "mT", i.e. 
"Transverse Mass [GeV]" --> "Transverse mass [GeV]" or "mT [GeV]" • changed to Transverse mass [GeV] (c) For the horizontal axis labels of Fig.4, the ones in the left plot should be changed (to be consistent with the caption) and in the right plot can be shortened, i.e. "dR(J1,J2) Lepton Transverse Momentum [GeV]" --> "DeltaR(b,bbar) Lepton pT [GeV]" • changed to dR(b,b) and Lepton transverse momentum (d) The captions of Figs.1-3 are largely identical on the 1st 5 words of the 1st line and most of the lower 3 lines; thus, I'm not sure whether 3 Figures can be combined, with an extended caption of Fig.1 on the 1st two lines, plus the lower 3 lines in the captions. i.e. (together with the item (15) above for the JES in Fig.2) "Figure 1: The transverse mass distributions in the ttbar-multijet phase space after fitting to obtain the b-tag rescale factors. The lepton channels are ..." --> "Figure 1: The transverse mass distributions (upper) in the ttbar-multijet phase space after fitting to obtain the b-tag rescale factors, (middle) in the ttbar-multilepton enhanced data set after fitting to find the appropriate JES, and (lower) in the W+bbbar signal region after fitting simultaneously muon and electron decay channels. The lepton channels are ..." • changed (18) Table 1 (a) In the caption, the 10-11th lines may be shortened from "The uncertainty on the luminosity and the uncertainty on the acceptance due to ..." --> "The uncertainties on the luminosity and on the acceptance due to ..." • changed (b) In the Table, all the contents in the header row and header column should capitalize the 1st letters, i.e. " | uncertainty | variation | effect on the measured cross section =========================================================================== normalization | | uncorrelated |============= | norm. 
+ shape | correlated ======================== luminosity 2.6% theory (scale+PDF) " --> " | Uncertainty | Variation | Effect on the measured cross section =========================================================================== Normalization | | Uncorrelated |============= | Norm. + shape | Correlated ======================== Luminosity 2.6% Theory (scale+PDF) " • changed (c) In the 3rd header column and the 2nd row under the header row, the 2nd word should be in the lower case, i.e. "Single Top" --> "Single top" • changed (d) The location of Table 1 should be moved forward to Page 5 before Fig.1 since its 1st citation is on L144 in Page 4 which is earlier than the 1st citation of Fig.1 on L191. • Table 1 now appears before Figure 1 (19) L199-200, it may be shortened from "for the muon channel (left) and the electron channel (right)." --> "for the muon (left) and the electron (right) channels." • changed (20) At some places, i.e. L211-212 (two places), L214, Fig.4's caption (the 2nd line), L224, L231, L242-243 (three places), L264, Fig.5's caption (the 5th line), to be consistent with the tense in this Section, the verbs on these lines should be changed from "was" --> "is" or "were" --> "are" • changed (21) Table 2 (a) In the header row, the 1st letters should be in the upper case, i.e. "muon | electron" --> "Muon | Electron" • changed (b) In the header column, the 6th row under the header row, the 2nd word should be in the lower case, i.e. "Single Top" --> "Single top" • changed (22) L220-222, (a) it may be shortened on L220-221, and (b) a comma at the end of L221 may should be replaced by a word of "and", i.e. 
"where NDatasignal is the number of observed signal events, NMCsignal is the number of expected signal events from simulation, NMCgenerated is the number of generated events in the fiducial region, A, epsilon are the acceptance and efficiency correction factors," --> "where NDatasignal is the number of observed signal events, NMCsignal is of expected signal events from simulation, NMCgenerated is of generated events in the fiducial region, A and epsilon are the acceptance and efficiency correction factors," • changed (23) L228, I'm not sure whether the word of "channel" should be singular or not, i.e. "The product A*epsilon is 11 (13)% in the muon (electron) channels and ..." --> "The product A*epsilon is 11 (13)% in the muon (electron) channel and ..." • changed (24) L233 and L270, at two places, I'm not sure whether a comma should be replaced by a word of "and", i.e. "muF, muR simultaneously ..." --> "muF and muR simultaneously ..." • changed (25) L243, to be consistent with the expression on L214, from the pronunciation point of view, the article word may should be changed from "using a NNLO PDF set ..." --> "using an NNLO PDF set ..." • changed (26) L262, I wonder that is the "S" standing for in the "DPS", and whether the "interaction" should be replaced by a word starting with a letter "s". • changed to DPI (S was for scattering) (27) Fig.5 (a) In the legend at the upper-left corner, the 2nd words on all three lines should be in the lower case, i.e. "Total Uncertainty Systematic Uncertainty Statistical Uncertainty" --> "Total uncertainty Systematic uncertainty Statistical uncertainty" • changed (b) In the caption, the color of "blue" is mentioned, but it neither can be seen in the colored Figure nor in the black-white print-out; if so, the "blue" may be removed. • Figure and caption have been changed. 
Page 10 (28) L286-288, (a) according to the PubComm guidelines, some acronyms in the Conclusions Section may should be spelled out at their 1st appearances in the Section, (b) a word of "flavor" on L287 may be saved, i.e. (together with the item (26) above for the DPS) "with the SM prediction by MADGRAPH + PYTHIA in the four-flavor and five-flavor schemes, as well as with the NLO QCD prediction obtained with MCFM once hadronization and DPS effects are taken into account." --> "with the Standard Model prediction by MADGRAPH + PYTHIA in the four- and five-flavor schemes, as well as with the next-to-the-leading-order QCD prediction obtained with MCFM once hadronization and double parton sxxxx effects are taken into account." • conclusion now matches abstract (29) Between L291 and L292, the Acknowledgment Section is missing, it should be added without needing the Section index number. • added Pages 11-14, in the References Section, (30) L294, in [1], to be consistent in this Section, the spaces should be added between the words in the journal name, i.e. "Phys.Lett.B 716 (2012) 1," --> "Phys. Lett. B 716 (2012) 1," Other ones which also need to be changed by the similar way are [2], [6], [8], [10], [11], [21], [29], [36]-[39], [41], [44] and [48]. • changed (31) L309, in [6], to be consistent in this Section, two spaces may should be added before and after the symbol "=" in the article title, i.e. "at sqrt(s)=7 TeV with the ATLAS detector" --> "at sqrt(s) = 7 TeV with the ATLAS detector" Another one which also needs to be changed by the similar way is [16]. • changed (32) L328, in [12], to be consistent with the PAS Refs. in all other CMS papers, an extra "CMS" before the year number at the end of index should be removed, i.e. "CMS Physics Analysis Summary CMS-PAS-LUM-13-001, CMS, 2013." --> "CMS Physics Analysis Summary CMS-PAS-LUM-13-001, 2013. Another one which also needs to be changed by the similar way is [14]. 
  • changed

(33) L344, in [18], the volume and page indices as well as a comma should be added, i.e. "JHEP (2011) doi: ..." --> "JHEP 01 (2011) 080, doi: ..."
  • changed

(34) L353, in [22], to be consistent with other JHEP Refs. (e.g. [20] and [30], etc.) in this Section, the volume index should be shortened from "JHEP 804 (2008) 5," --> "JHEP 04 (2008) 5," Other ones which also need to be changed by the similar way are [27], [35] and [49].
  • changed

(35) L362, in [25], to be consistent with other PAS Refs. in this Section, the document index should be changed from "Technical Report CMS-PAS-BTV-13-001, CERN, Geneva, 2013." --> "CMS Physics Analysis Summary CMS-PAS-BTV-13-001,2013."
  • added

(36) L365, in [27], an extra space before the colon in the article title should be removed, i.e. (together with the item (34) for the volume index) "MadGraph 5 : Going Beyond”, JHEP 1106 (2011) 128," --> "MadGraph 5: Going Beyond”, JHEP 06 (2011) 128,"
  • changed

(37) The "year" number should be given for Refs.[31], [45] and [46]. If there would be problems to display the year number with the default bib file, it may be fixed by changing from "article" to "unpublished" in the bib file.
  • changed

(38) L387, in [35], to be consistent with the expression in other CMS papers, the font of "s-" and "t-" in the article title should be changed from (together with the item (34) for the volume index) "s(non-italic)- and t(non-italic)-channel contributions”, JHEP 0909 (2010) 111," --> "s(italic)- and t(italic)-channel contributions”, JHEP 09 (2010) 111,"
  • changed

(39) L408, in [42], I'm not sure whether this combination paper should have an author of "ATLAS and CMS Collaborations" and whether two document indices from both Collaborations should be listed.
  • citation removed

(40) L411, in [43], it may be looked better if the hyphen in the article title is extended to a long dash symbol, i.e. "GEANT4 - a simulation toolkit" --> "GEANT4 --- a simulation toolkit"
  • changed

(41) L419, in [47], to be consistent in this Section, a space should be added before the volume index, i.e. "Nucl. Phys. B867 (2013) 244," --> "Nucl. Phys. B 867 (2013) 244,"
  • changed

## LE Comments (on Paper v3)

Don Lincoln

### Basic questions:

In the abstract, it talks about the cross-section pp -> W(l nu) + bbbar. It is unclear to me if the cross section is really pp -> W + bbbar or W(l nu) + bbbar. In short, given that the W -> e or mu 2/9 of the time, is the 2/9 in there or not? This is just a language thing. Are you correcting for the branching fraction? Or is it truly a cross-section for the lepton and electron final state?

We are not correcting for the branching fraction, but are performing the measurement in the W(e,nu) and W(mu,nu) channels.

In the abstract, you say “with exactly two b-tagged jets.” It is unclear if that means “exactly two jets, both of which are b-tagged” or whether it means “exactly two b-tagged jets and no other jets.” Please clarify.

Changed From : ... exactly two b-tagged jets having ...

Changed To : ... exactly two b-tagged jets with $p_{\mathrm{T}}>25$ GeV and $|\eta|<2.4$ and no other jets with $p_{\mathrm{T}}>25$ GeV and $|\eta|<5.0$ ...

Is there a |Z_vertex| cut?

There is no such cut. There is a cut ensuring that the leptons come from the primary vertex, designated as the one with the highest sum pT of the tracks.

L87 – 89: You talk about the four flavor scheme as compared to the five flavor scheme. This is pretty confusing, given that bottom quarks are obviously the 5th flavor. I think this section needs a little massaging to clarify the ideas.

Had: The $\Vjets$ samples are generated using the five-flavor scheme that includes massless b quarks in the initial state. Additionally, a signal $\Wbb$ sample is generated in the four-flavor scheme which has more statistics than the $\Wbb$ component of the $\Wjets$ sample generated in the five-flavor scheme.
Therefore, in this analysis the four-flavor sample is used for the signal shape contribution, with a normalization taken from the $\Wbb$ component of the five-flavor $\Wjets$ sample.

Became: For the signal distribution, two generated samples are used. The normalization of the distribution is taken from the $\Wbb$ component of the $\Wjets$ sample and the shape of the distribution is taken from a dedicated $\Wbb$ sample. The reason for this separation is that the $\Vjets$ samples were produced using the five-flavor scheme in which the b-quark is treated as a massless parton, while the dedicated $\Wbb$ sample was produced using the four-flavor scheme in which the b-quark is treated as a massive particle which decouples from the evolution equations. The five-flavor scheme is more accurate than the four-flavor scheme in the limit $\mu_F \gg m_b$, but the four-flavor sample was generated with more events passing the signal region phase space selections, and hence has a smoother distribution in the variables used in fitting and cross section extraction.

* We could also mention that cross checks were performed using exclusively 5F and exclusively 4F shape/normalization and the results were compatible with what we show.

L127 – 139: I think that the selection cuts in this passage are at MC truth level, but I am not 100% sure.

Yes, the splitting of W+jets into W+bb, W+cc, and W+udscg is done using MC truth information.

Removed: Exactly two b jets with $\PT>25\GeV$ and $|\eta|<2.4$ are required in the signal region.

I don’t know what the last sentence implies. Those jets are “rejected.” What does “rejected” mean in this context? Is the event rejected? Are the jets removed from the event?

"rejected" means "removed from the event" and the text has been changed accordingly.

Why the asymmetry between the fact that if a bottom quark is created at ME or PS level, it is a W +bbbar event, while two distinct charm jets means W + ccbar, while charm that is not resolvable into jets is W + udsgc?

This difference comes from the difference in production mechanism. While W+c can be directly produced via the interaction of a 1st generation quark with a gluon and is detectable by CMS (see figure 1 in http://arxiv.org/pdf/1310.1138v2.pdf), the analogous W+b final state cannot and makes a negligible contribution to the signal region. In contrast, W+cc where the cc is produced via gluon splitting has very similar kinematics to the signal region: W+bb. So the lack of distinction between W+b and W+bb at the generator level was chosen so as to be as inclusive as possible, while the distinction being made between W+c and W+cc comes from the fact that both final states occur at a non-negligible rate but arise from different processes.

I found the entire paragraph L 161 – 168 a bit confusing. I also am not sure if the shapes referred to in L 161 is the eff_b or JES shapes or both?

These "shapes" referred to the shapes of the mT variable in the simulated samples (MC and QCD).

Had: First, the shapes obtained from the simulation together with the QCD shapes are fitted to $\MT$ data distributions in the $\ttbar$-multijet region. The obtained b-tagging efficiency rescaling factors are measured separately in the muon and electron control samples and averaged before being applied. The reweighted simulated samples are used in the next step, where the $\ttbar$-multilepton region is used and the jet energy scale in the simulation is adjusted. As a result of these two steps, the simulation is expected to properly describe the $\ttbar$ contribution and can be used to extract the number of $\Wbb$ events from a fit in the signal region.

Became: First, using the simulated samples detailed above, a fit was performed in the $\ttbar$-multijet region using the $\MT$ variable. The result of this fit gives an estimation of the b-tagging efficiency rescaling factor, which is measured separately in the muon and electron channels and averaged before being applied.
The reweighted samples are then used in the next step where a fit to the $\MT$ variable in the $\ttbar$-multilepton region is performed and the jet energy scale in simulation is adjusted. As a result of these two steps, the simulation is expected to properly describe the $\ttbar$ contribution and the final step is to extract the number of $\Wbb$ events from a fit in the signal region.

L178 – 184: I don’t understand how putting the uncertainties at 100% allow the fit to remeasure them. Or maybe I do. But I don’t understand what is going on there. Are these backgrounds essentially allowed to float within a 100% uncertainty?

Yes, this is correct. Because we are remeasuring these scale factors whose effects are very difficult to disentangle, our strategy was to adjust the simulation according to the fit, but allow future fits to recover the initial scaling at 1 sigma. If we saw that successive fits pulled the scale factors substantially back toward their original value, this would be evidence of bias in our method, but we see instead that the rescalings stay fixed. This is also confirmed by closure tests: re-performing a fit after extracting and applying the scale factors delivers a scale factor consistent with unity.

And I am curious about the asymmetry between allowing the b-tagging and JES uncertainties to float between 0 – 200%, while the PDF uncertainties are a factor of two (50 – 200%)? This asymmetry confuses me.

For the b-tagging and JES uncertainties, the uncertainty was chosen specifically with the intention of setting particular values as being one standard deviation away from the central value (namely, the value that "undoes" the rescaling). The uncertainty associated with the change in scale is estimated by changing the renormalization and factorization scales up and down by a factor of two, but the final value is ~1%.
The PDF uncertainty is estimated by taking the bounding window of estimations from a variety of PDF sets, and is ~10%, thus giving the combined PDF+scale = 10% given in Table 1.

I understand the equation on page 7, give or take. There is an implied “= (N^Data_signal / N^MC_signal) x (N^MC_generated / Lumi)” step, or at least so I think. The first term is taken to be the signal strength, while the second term is sigma_gen. I think? I have never heard the term “signal strength” and am confused by it. In addition, it is clear what you are substituting in when you equate the second and third terms. But if I understand that properly...and I probably don’t...N^MC_signal includes acceptance and efficiency, but it also has a background contamination. I guess N^Data_signal also has a background contamination, and it looks like you are assuming they are the same. I think?

You are correct both in the missing equality and in the problem that one encounters when trying to use this equation directly substituting N_events for the different variables, and this is what motivated us to use signal strength in the first place. In performing a fit, the final number of signal events will change as a result of two sources: the effect of uncertainties which are correlated across different samples (things like b-tagging, JES, MES, EES, ...) and the effect of a change to the cross section of the signal sample itself. Factoring the total change in signal normalization into these two components, the signal strength is just the component coming from the change in cross section, and thus is applied to the generated cross section to find the measured cross section as indicated in the equation.

L228 & L227. I don’t know how to reconcile the 11 (13)% and the 80% and 40%. (Because isn’t 80% x 40% = 32%?)

The missing piece here is that 40% is the btag efficiency per jet, so the calculation is 80% x 40% x 40% = 12.8% which is consistent with the 11/13% quoted above.
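The per-event arithmetic in this answer can be checked in a few lines. This is only a sketch: it assumes the 80% refers to the lepton requirements, that both b jets are tagged with the same 40% per-jet efficiency, and that the three factors are independent so they can be multiplied. The variable names are illustrative and not taken from the analysis code.

```python
# Efficiencies quoted in the reply above (rounded values).
lepton_eff = 0.80        # assumed: efficiency of the lepton requirements
btag_eff_per_jet = 0.40  # b-tagging efficiency for a single jet
n_btagged_jets = 2       # the selection requires exactly two b jets

# Treating the factors as independent, the per-event efficiency is the
# lepton efficiency times the per-jet b-tag efficiency once per tagged jet.
event_eff = lepton_eff * btag_eff_per_jet ** n_btagged_jets

print(f"per-event efficiency: {event_eff:.1%}")
```

Applying the per-jet factor once instead of twice gives the 32% that prompted the question.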
To get 11/13% we invert the equation on P7 and use the numbers in Table 2,

acc x eff = ( n_wbb ) / (lumi * xc * ss)

to give mu: 0.1329, ele: 0.1148.

We estimate the efficiency of the lepton requirements and (per jet) b-tagging by looking at the raw number of events passing our selections.

Added statement "per jet" to Paper

L252 – 253: you say that it agrees with the 7 TeV Z+b analysis. I know you give the reference, but it would help the lazy reader if you were to quantify your statement here. added

### Editorial rules:

In the abstract, you use many symbols, like p_T^l, eta^l, etc. The rules are that the abstract should not include user defined symbols. The question is whether things like p_T count as user defined symbols.

I do not think that we have user defined symbols in the abstract. For example, pTlep is given as $p_{\mathrm{T}}^{\ell}$. If there exists a list of allowed symbols, that would be useful, but I have not found such a list.

### Minutia

L 4 – 5. The phrase “a background to standard model Higgs boson production associated with a vector boson, where the Higgs boson decays into a bbbar pair” would be better as “a background to vector boson production in association with standard model Higgs boson production in which the Higgs boson decays into a bbbar pair.” done

L6: I’m not sure that you need to spell out the acronyms of the two experiments. While it’s not wrong, it’s ugly and jarring. done

L9: Remove “(minimal or not)” done

L10: through -> using done

L11: hadrons -> hadron done

L12: add comma: boson, done

L 13 – 14: replace “to bjets as jets which originate from hadronization of b quarks” with “to jets which originate from the hadronization of b quarks as b jets” done

L 17: “In the past, the” -> “The” done

L18: Remove “by the ATLAS and CMS collaborations,” done

L19: Remove “previously” done

L 28 & L35. Shouldn’t pseudorapidity be defined? defined

Equation 1: Why I^{rel} and not just I? changed to I

L65: that -> which done

L72: “Tight” is jargon.
Drop and rewrite as “both jets pass a threshold which has” done

L77: You say “signal region,” which seems a little jargon-y to me. Maybe “signal sample”? I’m a little iffy on this suggestion. changed

Had: After all signal region selection requirements are applied,

Became: After all selection requirements for the signal enhanced dataset are applied,

L95: You define diboson here and yet use it on L79. Maybe define it earlier? defined on L79 now

L100 – 102: “The ttbar...measurements [42].” Is a bit unclear. Maybe “The ttbar cross section was determined from data [42] using ATLAS and CMS data.” ????

how about: The ttbar cross section was determined from data collected by the ATLAS and CMS detectors [42].

L113: “fit, except QCD, are taken from simulations” -> “fit are taken from simulations, except QCD.” done

L145: You say “all other backgrounds.” That means “all backgrounds not yet mentioned,” right?

No, this means all backgrounds other than QCD. We apply the QCD selections to Data and MC, then take the difference as the QCD shape.

L151: affect significantly -> significantly affect changed

L159: you explicitly call out LO and I see how that is relevant in context. But I don’t see where in the background calculation discussion it was called out earlier. It seems to me that that should be done.

Both the 4F and 5F samples are generated with MadGraph as mentioned on L82. No change made - ok?

L161 & L166: It says three steps and two steps and I imagine that this means that the steps are broken up into 1 + (2 + 3), but I can’t see how that was done.

It isn't. This was just a description of steps 1 and 2. Step 3 is the fit in the signal region described in the Results section.

L 161 – 166: I am having trouble understanding the sentence “First, the shapes...multijet region.” Please clarify. Does this fit include the W + bbbar?

Agreed, this was awkward.
The intention was to say that everything was fit (but QCD comes from a data driven technique, so I didn't want to call it simulation).

The whole paragraph changed to: The fit procedure thus consists of three steps. First, using the simulated samples detailed above, a fit was performed in the $\ttbar$-multijet region using the $\MT$ variable. The result of this fit gives an estimation of the b-tagging efficiency rescaling factor, which is measured separately in the muon and electron channels and averaged before being applied. The reweighted samples are then used in the next step where a fit to the $\MT$ variable in the $\ttbar$-multilepton region is performed and the jet energy scale in simulation is adjusted. As a result of these two steps, the simulation is expected to properly describe the $\ttbar$ contribution and the final step is to extract the number of $\Wbb$ events from a fit in the signal region.

L173: add comma: cases, done

L174: I think the “ prior to “norm” should be . changed

L175: add comma: procedure, done

L191 – 192: I don’t understand in detail the implications of this sentence.

It means that we rescaled the simulation using the scale factors described in the previous sentence. Removed "accordingly".

Table 1, [caption]: what do you think about changing UNC for X? I spent a lot of time wondering about what uncertainty UNC was. changed

Also, sentence starting in line 4 “In the ‘variation’...UNC in the fitting procedure.” Seems to have an extra word or something that makes it not entirely clear to me.

Not sure what to do - didn't change anything

Table 1: JES rescale line & L196: I’m confused by the 1.3 & 1.6 disparity.

1.3 is correct, changed.

I’m also a little confused by how these tie into the “in the fit” on L194.

Clarified by removing the statement: The energy scale in the fit varies within its uncertainties, $\sigma_{\mathrm{JES}}$.

Figure 2 [caption] vs. figure 2 [figure]: You use different terms, specifically “ttbar- multilepton phase space” vs. “ttbar multilepton control region”. Personally, I think it might be best to use the phrase “ttbar multilepton enhanced data set.”
changed to ttbar multilepton enhanced data set in both places

L213: There is an extraneous “)”
removed

L240: add comma: scheme,

## Comments on v2

Abstract: try to use the same \bar comment for the anti-b as in the text later (here it appears twice)

• changed
Introduction: not sure that the single paragraph on the detector is enough for the paper. Probably the usual full section should be included, but the LE will check that.
• will wait for LE comment
-L43 : ... and to originate ....
• changed
-L135: summing up the four-momenta ... (remove the "vector", since I don't know what a vector four-momentum would be)
• changed
-L140: The normalizations ... are allowed ...
• changed
-L211 and L214: again a shorter bar over the anti-b
-L213: ))
• changed
- Table 3: is it understood why the average value is closer to the electron value than the muon value, despite the muon channel having slightly smaller errors? (This question might have been asked before.)
• Yes, this is understood and comes from the fact that we are not just averaging the two fit results, but are combining the two channels and performing an entirely separate fit. Before adding the G+J sample which affects only the electron channel, we had previously seen a larger bias, but now the difference from the combined result is +0.03pb (ele) and -0.04pb (mu), so the combined fit is essentially equivalent to the average as one would naively expect.
-L247: use vs.\ in order to avoid the spurious space after the full stop (2x in this line)
• changed
-L269: I would replace "volume" by "phase space"
• changed

# Guenther

Indeed, it might be interesting to hear if there are further proposals for kinematic plots? Maybe a procedure could be to produce a set of plots that are given as suppl. material on some Twiki only.

Regarding the text (and not repeating the comments sent already by Tristan and AM):

l15 : generically instead of generally? changed

l58: no comma after "detector" removed

title of section 3: I would write : Simulated samples changed

l77: I think the MLM matching is relevant for the parton shower, not for hadronization (as written here) you are correct, changed

l91: the CTEQ6M PDF set is used at which order? LO, NLO, NNLO? changed to explicitly state cross section calculated using FEWZ at NNLO

l212: ... is the simulated fiducial cross section.... changed

l248: The uncertainty in the ... changed

# Anne-Marie

l6: I have been bothered by this for a while: you spell out LHC and CMS but not ATLAS. changed

l13 note -> paper changed

l22-23 for the paper you need to add the detector section back to the text. added as directed in https://twiki.cern.ch/twiki/bin/viewauth/CMS/Internal/PubDetector

All figures: should now have CMS without Preliminary. changed

Things we can probably discuss in parallel to the LE review:

New fig 4: could rebin the lepton pT distribution above pT>100 GeV ? Or show in log scale ? agreed

For the choice of distributions to show, I don't know if we want to limit ourselves or show a representative set for the kinematics of both jets and lepton, or search variables like Mjj ? Should we have an ARC-author meeting about this in parallel with the LE review ? possible distributions are shown above

# Tristan

1) What is the motivation to compare the cross-section measurement to these 4F/5F Pythia6/8 predictions of MCFM/MadGraph? The answer is probably because these are the ones that are available... but it would be nice if you could describe some motivation (e.g. around lines 234-247).

Currently, the predicted cross-sections in the paper are 0.51/0.51/0.49/0.50, so practically indistinguishable by the measurement... the reader might wonder what we learn from this comparison, and it's also a question one can expect during CWR.

The differences in predictions that arise between modelling the b quark as massive or massless are still being understood in general, so we provide insight into this comparison within the context of W+bb. Additionally, the improvements to showering that are brought by Pythia 8 over Pythia 6 could have produced different predictions, and having interfaced P6 and P8 with the same MadGraph ME calculation allows us to isolate this difference. In general, different calculations giving similar predictions is in itself an interesting result, and we add our observed cross section to complete the picture. Added:

Comparisons between the results of calculations performed
under different assumptions provide important feedback
on the functioning and validity of the techniques employed.
Differences in predictions arising from the modelling of
b quarks as massive or massless are possible, as are
variations in predictions arising from the use of different
showering packages (\PYTHIA6 vs. \PYTHIA8) or matrix element
generators (\MADGRAPH vs. \MCFM).
In the phase space explored here, these predictions are all
very close in their central value and agree with each other
well within their respective uncertainties.

2) You describe around line 81 that "the 4F signal sample is used for the shape.. the normalization from the 5F W+jets sample". It is described quite clearly how the 5F W+bb signal is defined from the W+jets samples, but not so much how it is actually merged with the 4F sample. E.g.: in my last reading I was wondering if you apply the same gen-level selection criteria on the 4F W+bb sample to define this sample. As this merging of two samples is not trivial to do consistently, it might be helpful to give a few more details.

The 4 flavor sample is generated as W+bb, and then the same reco-level signal region selections are imposed as for any other sample. To "merge" with the 5F sample, we simply rescale the yield of the 4F (after all selections) to match the yield of the 5F (after all selections).
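The rescaling described in this reply can be sketched as follows. This is a minimal illustration and not the analysis code: the function name `rescale_to_yield` and all bin contents are hypothetical, and a histogram is represented as a plain list of post-selection bin contents.

```python
def rescale_to_yield(shape_hist, target_yield):
    """Scale a binned shape so that its integral equals target_yield.

    shape_hist:   bin contents of the 4F W+bb template after all selections
    target_yield: total 5F W+bb yield after the same selections
    """
    total = sum(shape_hist)
    if total <= 0:
        raise ValueError("shape histogram has no events to rescale")
    scale = target_yield / total
    return [scale * content for content in shape_hist]


# Hypothetical bin contents, for illustration only.
wbb_4f_mt = [120.0, 340.0, 410.0, 230.0, 100.0]  # 4F shape
wbb_5f_mt = [55.0, 160.0, 205.0, 120.0, 60.0]    # 5F W+bb component

# The merged template keeps the 4F bin-to-bin shape and the 5F total yield.
merged = rescale_to_yield(wbb_4f_mt, sum(wbb_5f_mt))
```

Only the overall normalization changes: the ratio between any two bins of `merged` is the same as in the 4F input.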

3) You have now added Fig.4, with some kinematic distributions, which is nice. However, the description is currently still quite minimal (and as AM said: we could discuss which variables to show).

The technique for obtaining QCD is slightly different for these variables than for mT, but the procedure is otherwise the same (hence not so much of a description for how we obtained the distributions). For QCD, the difference is that instead of using an inverted isolation sideband for the shape, we use an mT<30 GeV sideband. Then the normalization is set to be that used for the mT variable.
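A sketch of this QCD estimate is given below, under stated assumptions. The data-minus-simulation subtraction follows the reply to the L145 comment earlier on this page; the clipping of negative bins to zero, the function name `qcd_template`, and all numbers are assumptions made for illustration only.

```python
def qcd_template(data_sideband, mc_sideband, target_yield):
    """Build a QCD template from a sideband.

    The shape is (data - simulation) in the sideband; the normalization
    is fixed externally to target_yield.
    """
    if len(data_sideband) != len(mc_sideband):
        raise ValueError("histograms must share the same binning")
    # Negative bins after subtraction are clipped to zero
    # (a choice made for this sketch, not stated in the source).
    shape = [max(d - m, 0.0) for d, m in zip(data_sideband, mc_sideband)]
    total = sum(shape)
    if total <= 0:
        raise ValueError("no QCD-like excess in the sideband")
    return [target_yield * s / total for s in shape]


# Hypothetical lepton-pT histograms in the mT < 30 GeV sideband.
data_mt_lt_30 = [500.0, 420.0, 300.0, 150.0]
mc_mt_lt_30 = [300.0, 320.0, 250.0, 160.0]

# Hypothetical QCD yield, standing in for the one used for the mT variable.
qcd_yield_from_mt = 70.0

qcd_lep_pt = qcd_template(data_mt_lt_30, mc_mt_lt_30, qcd_yield_from_mt)
```

Swapping the sideband inputs for inverted-isolation histograms would give the variant described for the mT variable itself, with the same normalization step.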

As a suggestion, it would e.g. be nice to discuss the data/simulation difference around DeltaR (b,b)~3.5. Note that such a tension was also observed in previous CMS measurements of Z+bb. (E.g. SMP-13-004 and EWK-11-015, but unfortunately I cannot upload attachments in HN now.)

Looking at https://cms-physics.web.cern.ch/cms-physics/public/EWK-11-015-pas.pdf Figure 5 does not show a difference at dR(b,b)~3.5 while some tension is seen in AN2012_303_v7 (I couldn't find dR(bb) distributions in any of the papers) and there the comment is made that this "suggests a harder spectrum in data than expected." In general, the agreement between simulation and data in this analysis is good, and it is hard to draw a conclusion from one data point being 20% high, especially when the points to either side agree at the 10% level. This could be statistics or it could be QCD, but it is not clear what conclusion you would like to be drawn here.

Topic revision: r50 - 2016-10-03 - AlexanderSavin
