# QA on SMP-J 12-010 analysis

Authors: R. Kogler, D. Britzger, G. Grindhammer

Referees: M. Gouzevitch, D. Wegener

• We agree, implemented in analysis/text.
• We disagree, will not implement.
• Authors and/or ARC need to discuss. (Open item.)
• We agree, but need someone to do it. (Open item.)
• We agree, but no changes to the analysis/text are necessary.
The analysis is not finished if anything is still marked in lime or blue.

• Paper Draft.
• T0 talk.

### Karin Daum

#### General

Section 3.1: The reweighting in the "longitudinal momentum balance of the HFS E_h-p_{z,h}" confuses me. Also see text comments below. I assume this is the variable SIGMA, defined on page 4. On hadron level, this variable is identical to 2*Ee*y (y measured at the proton vertex).

So I think for reweighting this variable one should have a good argument why the y-distribution is not modelled correctly. For example, compare the y distribution in your non-radiative MC to the y distribution from a NLO calculation with a NLO PDF. If NLO indeed tells you that the LO MC should be reweighted in y, OK. But then this should be the argument in the paper, not the mismatch in the control plot.

Otherwise I doubt that reweighting SIGMA is the proper solution. Are you sure that the problem is a deficit of the generator and not of the detector? Or maybe it is because the beam energy in MC is fixed whereas it changes from run to run in data?

If there is a detector problem, the weighting has to be done on detector level (efficiency correction or extra smearing). If it is a problem of the beam energy, no weighting has to be applied [try to plot SIGMA/Ee(data)*Ee(MC)].

Sections 3.2 and 3.3: There is some confusion about the matrix A and the right-most "column" in figure 5, which is not present in the matrix A. Also, the matrix shown in figure 5 is transposed with respect to the matrix equation y=Ax. The generator level has to be on the x-axis and the detector level on the y-axis.

Suggestion 1: transpose figure 5 such that it corresponds to the conventional matrix multiplication. The extra "column" becomes an extra "row".

Suggestion 2: be more precise with the definition of the matrix A and the "migration matrix" (matrix of MC events). I think you could introduce the matrix M as in the TUnfold paper, which is filled with the MC events. It has the extra row. Call it "migration matrix". And the matrix "A" could be called "detector response matrix".
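To make the distinction between the two matrices concrete, here is a minimal sketch of how a response matrix A could be derived from a migration matrix M of MC event counts. The layout is an assumption for illustration: rows are detector-level bins, columns are generator-level bins, and the last row of M holds events generated in a bin but not reconstructed anywhere. The function name and the toy numbers are hypothetical and are not taken from the analysis or from TUnfold itself.

```python
import numpy as np

def response_from_migration(M):
    """Turn a migration matrix of MC counts into a matrix of probabilities.

    Assumed layout: M has shape (n_det + 1, n_gen); the extra last row
    counts events generated in bin j but not reconstructed in any bin.
    """
    M = np.asarray(M, dtype=float)
    generated = M.sum(axis=0)  # all events generated in each bin j
    if np.any(generated <= 0):
        raise ValueError("every generator-level bin needs MC events")
    A = M[:-1, :] / generated  # A[i, j] = P(detected in i | generated in j)
    efficiency = A.sum(axis=0)  # probability to be reconstructed at all
    return A, efficiency

# Toy example: 2 detector bins, 2 generator bins, plus the inefficiency row.
M = np.array([[80.0, 10.0],
              [15.0, 60.0],
              [5.0, 30.0]])
A, eff = response_from_migration(M)

x = np.array([1000.0, 500.0])  # hypothetical generator-level distribution
y = A @ x                      # folded detector-level expectation, y = Ax
```

With this orientation the generator level runs along the columns, so the folding is an ordinary matrix-vector product and the inefficiency appears as an extra row of M rather than an extra column.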

Also see my detailed text proposals below.

Section 7.1: I think there is an inconsistency in the way the correlated and the uncorrelated+statistical uncertainties are treated. The correlated uncertainties are treated as "relative" using the exponential function. The uncorrelated+statistical uncertainties are treated as "absolute" using a covariance matrix. Clearly, this is not consistent, especially for those uncertainties which are partially correlated and partially uncorrelated.

I have the following suggestions: because we give all errors in percent, all errors should be "relative"

This can be done by fitting log(t_i) to log(m_i), with residuals p_i = log(m_i) - log(t_i) - log(theta_i) = log(m_i) - log(t_i) + sum_k E_ik. The covariance V is then simply constructed from relative instead of absolute errors.
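A minimal sketch of what such a fully relative treatment could look like numerically, under stated assumptions: `E` is taken to be an array of the shift terms E_ik (one column per correlated source, nuisance parameters already folded in), `V_rel` is a covariance matrix built from relative uncertainties, and any penalty term for the nuisance parameters is left out. The function names and numbers are hypothetical.

```python
import numpy as np

def log_residuals(m, t, E):
    """p_i = log(m_i) - log(t_i) + sum_k E_ik, as written in the comment above."""
    m, t, E = (np.asarray(a, dtype=float) for a in (m, t, E))
    return np.log(m) - np.log(t) + E.sum(axis=1)

def chi2_relative(m, t, E, V_rel):
    """chi2 = p^T V_rel^{-1} p, with V_rel built from relative errors."""
    p = log_residuals(m, t, E)
    return float(p @ np.linalg.solve(np.asarray(V_rel, dtype=float), p))

# Toy inputs: two bins, one correlated source with zero shift.
m = np.array([105.0, 48.0])          # measurements
t = np.array([100.0, 50.0])          # theory predictions
E = np.zeros((2, 1))
V_rel = np.diag([0.05**2, 0.04**2])  # 5% and 4% relative uncertainties

chi2 = chi2_relative(m, t, E, V_rel)
```

Because only ratios m_i/t_i enter, rescaling data and theory by a common factor leaves the result unchanged, which is the sense in which all uncertainties are treated as relative.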

Table 3 and formula (12): please choose a different letter than \mu for the correlated and uncorrelated fraction. \mu is reserved for the scale. Maybe f^U and f^C will do?

Figure 3 and 4: I think these log plots are not very convincing. Maybe we should also give ratios here?

Figure 5: This figure should be "transposed" such that it has the same structure as the matrix A in the equation y=Ax (also see comments below)

Figure 8: I think it should be dropped (see comments below)

Figure 18: I think this can be dropped (also see text comments below)

#### Detailed proposals

11: ... GeV^2 using the H1 ...

12/13: ... of Q^2, of the ... and of the proton's ...

16: Compared to earlier work, the measurements benefit ...

20: why is the exp error (7) here but (8) in the text (line 907)? Please check all numbers, make sure they are consistent everywhere.

26/27: ... distance interactions. At larger distances they transform into collimated jets ...

27/28: ... and compared to perturbative QCD (pQCD) predictions.

30-32: In contrast to DIS, where the dominant effects of the strong interactions are the scaling violations of the proton structure functions, the production of jets allows for a direct ...

44-45: Complete next-to-next-to-leading ... are not yet available [refs]. (add here some references to incomplete NNLO calculations)

49-51: In this paper improved double-differential measurements ... trijet cross sections. The cross sections are measured as a function of Q^2 and of the transverse jet momentum P_T^jet for the case of inclusive jets. Dijet and trijet cross sections are measured as a function of Q^2 and of the average jet transverse momentum. In addition, Dijet and trijet cross sections are measured as a function of Q^2 and of the proton longitudinal momentum fraction \xi.

54-55: ... respective bins of Q^2, the normalised multijet cross sections, are also reported.

56: ... is primarily in a significant ...

59-60: ... leads to a jet energy scale uncertainty as small as $1\%$.

62-63: ... photon direction, compared to previous analyses [6]. The increase ...

64-65: ... to be measured double-differentially for the first time at HERA. (is this true?)

66-68: In order to match the improved experimental precision, the results presented here are extracted using a regularised ... which properly takes into account detector ... DIS events.

70: ... hadronisation effects. The strong coupling alpha_s is extracted as a function ... in DIS. The measurements ...

91-92: (remove the sentence about the magnet here. It is confusing, one could think that only the tracker is inside the coil. Also, the magnet is explained in 109-111.)

102: only 4500 cells? I thought it is ten times more???

114: [33] ??? The reference is wrong and is not in order (previous ref is [13], next is [14]).

125: ... $99.5\%$ [14]. (add a reference here. Is it [14]?)

127: ... the cluster is required to be associated ...

129: ... to below $0.3\%$. (if this is true)

148-149: This sentence about the MC used for background is confusing here. Later you say (line 158) that this background is only 0.2%. But then, QEDC is 1%. Is there no MC for the QEDC? Why then explain the gamma-p MC, which is non-significant, and not explain the QEDC MC? People will start to think that you do not have a MC for QEDC.

Suggestion: remove 148-149. Add after 158: The remaining background originating from the sources discussed above is modelled using a variety of Monte Carlo event generators as described in [ref]. (reference Roman or Daniel thesis)

152-153: ... the scattered lepton and any reconstructed photon. The background from lepton pair ...

153-156: this reads strange. Why is the background from lepton pair production negligible for all flavours, although only the electrons are suppressed actively? Suggestion (if this is true): Background from electron pair production processes is suppressed to a negligible level by rejecting ... activity. Background from muon or tau pair production and from other rare processes such as charged current DIS or deeply-virtual Compton scattering are found to be negligible.

168-169: ... uncertainty needed to be minimised, since it proved to be the dominant experimental uncertainty in earlier works [6].

174: ... due to the superior resolution ...

179: ... better into account [ref]. (add a reference here) (or describe CJC1 and CJC2 in the detector chapter and say here: .. multiple scattering in the detector material between CJC1 and CJC2 into account.)

194: ... allow for the in situ calibration ...

195: remove "by"

195-196: remove this sentence, see comment to line 200

195-196: (add here:) The calibration procedure described below is applied both to the data and to the Monte Carlo event simulations.

200: ... while the expected transverse momentum P_T^da is calculated ... ... which to good approximation is insensitive to ...

202: ... theta_e and of the inclusive ... [23,24] (remove "to define ...")

209-210: ... variables. In addition, a secondary calibration function for clusters associated to jets measured in the laboratory frame is derived. This function depends ...

211-212: (drop this sentence, it is too technical) (or change to:) ... functions are evaluated as a function of time, for four ...

212-215: (drop this, it is too technical) (or change to:) In summary, the calorimeter clusters receive ... shower and a secondary calibration in the case where they are associated to a jet measured in the laboratory frame

223-224: ... alternatively the anti-k_t ... algorithm. The jet finding is implemented ...

232-233: This inclusive definition ... implies that the trijet ...

237: ... and $\xi_3=...$, respectively, with $M_{123}$ being the invariant mass of the three leading jets. (use words. The formula is too confusing here. One has to digest that the P^{jet} are four-vectors whereas the P_T^{jet1} in line 235 are scalars)

243: ... helps to quantify migrations ...

Table 1: "Common jet phase space": I think it is confusing. Better say "jet polar angular range" or something.

267: (e?p something is wrong here...) $e^{-}p$

271: ... is employed. A detector response matrix ...

272: ... simultaneously [ref]. (add reference to Daniel's thesis here)

274: The procedure also takes into account the statistical ... these measurements as well as statistical correlations of several jets originating from a single event.

274-276: (replace by) QED radiation corrections are included in the unfolding procedure.

Section 3.1: this paragraph is very complicated to read. I think it could be simplified by introducing a weighting function

$w=w_\Sigma(\Sigma) \times w_{mul}(n_{jet},Q^2) \times w_{fwd}(P_T^{fwd},\eta^{fwd}) \times ...$

then explain the individual contributions.

283-284: (this variable already has a name, $\Sigma$. I suggest to repeat this here. Also, longitudinal momentum balance is not the correct term. There is nothing "balanced" unless you include the electron in the sum) Suggestion: ... distribution of the variable $\Sigma=\sum_h(E_h-p_{z,h})$ shows ...

(And see the general comment, is a gen-level reweighting procedure really correct here?)

288: remove "as expected", it is not needed.

289-290: Event weights as a function of Q^2 and the jet multiplicity are applied.

290: remove "used"

291-292: weights are applied depending on the transverse ...

293: .. in the event as well as for the jet ...

295-298: this is much too complicated. Also, it does not become clear to me what is done, even after reading it several times. "All weights" <-> "typically" this does not work. Either it is all weights or only some ("typically"). What the "second observable" does and how it influences the weighting does not become clear...

Suggestion: introduce a formula as suggested above, then it becomes very clear which weight depends on which variable. The second polynomial is a detail and should be skipped.

297-298: The argument about "big" or "small" average weights is arbitrary, because each weight will introduce its own normalisation. If the first reweighting fixes the normalisation, the other weights all have to be close to one "on average". You rather have to argue about the amplitude (max/min) for each weight. But I think this is too much for this paper. I suggest to drop this.

303-306: new text proposal: The events are counted in bins, where the bins on hadron level are arranged in a vector $x$ with dimension 1370 and the bins on detector level are arranged in a vector $y$ with dimension 4562. The vectors $x$ and $y$ are connected by a folding equation $y=Ax$, where $A$ is a matrix of probabilities, the detector response matrix. It accounts for migration effects and efficiencies. The element $A_{ij}$ of $A$ quantifies the probability to detect an event in bin $i$ of $y$, given that it was produced in bin $j$ of $x$. Given a vector of measurements $y$, the unknown hadron level distribution $x$ is estimated [ref] in a linear fit, by determining the minimum of (3) (add the reference to TUnfold)

311-313: remove this, it is not needed. Also, people will confuse "B" with B1, B2, B3 defined later.

314-318: remove this, but add the following text: The detector response matrix $A$ is constructed from another matrix $M$ [ref-TUnfold], called migration matrix throughout this paper. The migration matrix itself is obtained by counting MC events in bins of $x$ and $y$. It also contains an extra row to account for events which are not reconstructed in any bin of $x$ (inefficiency). The determination of $M$ is explained in the following.

324-325: (simplify) ... account for detector inefficiencies.

327-328: ... introduced to account for cases where a jet is reconstructed although it is absent on hadron level.

329: ... hadron level, caused by limited detector resolution ...

333-336: remove this, it is too confusing.

338: add some text: ... in the following. More details can be found in [ref-thesis]. In total, 1370 bins are unfolded, of which ??? are located in the analysis phase space. Of the ??? bins, adjacent bins at low transverse momenta are further combined to arrive at the final cross section bins. This procedure reduces model-dependent systematic effects.

340 and footnote: Remove

342: ...
Out of these 16 bins, only 6 data points are used for the determination of the normalised cross sections.

354-355: Detector-level jets which are not matched on hadron level are filled ... $B_1$.

355-356: Hadron-level jets which are not matched on detector level are filled into $\epsilon_j$ (inefficiency).

357: ... are described using 16 bins ... level and 8 bins on ...

357-359: The 8 bins in $P_T$ are finally combined to ??? bins for the cross section measurement.

366: As for the inclusive jet case ...

368: ... described using 18 bins ... level and 11 bins on ...

371-372: Similar to the case of inclusive jets, the 11 bins in $P_T$ are combined to ??? bins for the cross section measurement.

377-378: ... Due to the limited number of trijet events, the number of ... is reduced as compared to ...

380: The resulting detector response matrix $A$ has an ...

394: ... calculated by averaging the matrices ...

395-396: remove (the statement about the "nominal result")

397: ... as efficiency correction [ref-thesis].

400: ... corrections is of order $10\%$ for ... and of order $5\%$ ...

405: ... using an L-curve scan [references-in-order].

405-409: remove

412: ... L=351 pb^-1 is analysed. (remove last digit, do not quote uncertainty)

417-418: ... (4), also including QED radiative corrections.

434: ... are correlated between ... (remove propagated ... fully)

435-436: ... uncertainty cancel and many other ... cancel to a large extent.

438-439: ... unfolding process [ref-tunfold].

439-440: remove

441-447: remove this paragraph

453: ... caused by the limited number of data events, these ...

460-461: ... two components related to the two-stage calibration procedure described in section 2.3.

467: ... by $\pm1\%$ as determined ...

470: remove the footnote and write out transverse momentum: ... at low transverse momentum, where ...

500-501: ... uncertainties are varied simultaneously in the numerator ...

504: ... HFS cancel partially.

510: ... into account by correction factors.

515: ...
corrections cancel and the cross ... (remove: due ... tensors)

527: ... are known only to NLO.

540-541: ... whereas $\mu_f$ is chosen such that the same factorisation scale can be used for the calculation ...

544: ... and both $\mu_f$ and ...

548-549: ... $c_{had}$ account for long-range ...

553: ... level those partons which nominally are taken as input to the string fragmentation are taken as input to the jet algorithm. These partons may originate directly from LO matrix elements or from a parton shower. (otherwise, if really all partons are used there is double-counting. I do not think that this is the case)

554-555: ... cross sections predicted by the NLO ... and by the MC generators on parton level is small ...

556-557: ... on the hadronisation corrections. (remove: well within...)

557-558: Hadronisation corrections are computed for both the kT and ...

559-561: This does not become clear. You just explained that it is not necessary to reweight the MC to NLO, but now you seem to use the weighted MC. Suggestion: ... Rapgap prediction (see section 3.1) is used.

561: I think one expects here something on the uncertainties (how large they are). Also, why are the uncertainties not in the tables?

587: typo: hadronised

593: ... are found to be unreliable ...

596: (maybe: add a statement how the DJANGO-RAPGAP difference compares to the hadronisation errors determined with SHERPA)

605: ... labelling of the bins ...

605-608: remove (this sort of information belongs to the table caption)

609-630: I think that the description of the tables should be simplified. Also, the discussion of the figures (7 and 8) should be done separately from the tables. Maybe one can give a "table-of-tables" right here in the text.

The text describing the tables: The jet cross sections ... in table 6-10. (take from line 609) The corresponding tables for the anti-kt .. are given in table XXX-YYY.
Normalised cross sections are given in table XXX-YYY for the kt jet algorithm and XXX-YY for the anti-kt algorithms. This information is summarised in table META1. Point-to-point statistical correlations are given in table XXX-YYY in the form of correlation coefficients. Within the accuracy of this measurement, the correlation coefficients are identical no matter whether the kt or anti-kt jet algorithm is used. Similarly, the statistical correlations of the normalised and the absolute cross sections are the same. The information on correlation coefficients is summarised in table META2.

| META1 | kt | anti-kt | kt (normalised) | anti-kt (normalised) |
|---|---|---|---|---|
| incl | table 6 | ... | | |
| dijet PT | table 7 | | | |
| trijet PT | ... | | | |
| dijet xi | | | | |
| trijet xi | | | | |

| META2 | incl | dijet PT | trijet PT | dijet xi | trijet xi |
|---|---|---|---|---|---|
| incl | tab 11 | tab 16 | tab 17 | tab 19 | tab 20 |
| dijet PT | tab 16 | tab 12 | tab 18 | N.A. | N.A. |
| trijet PT | tab 17 | tab 18 | tab 14 | N.A. | N.A. |
| dijet xi | tab 19 | N.A. | N.A. | tab 13 | tab 21 |
| trijet xi | tab 20 | N.A. | N.A. | tab 21 | tab 15 |

Then, describe the figures with correlation coefficients: Figure 7 shows the correlation coefficients of the inclusive, dijet and trijet cross sections, corresponding to tables 11,12,14,16-18. Also included are the correlations to the six bins in Q^2 corresponding to the inclusive NC DIS cross section. Large positive correlations are observed between inclusive jet and dijet cross sections with the same Q^2 and similar PT. When looking at the inclusive jet, dijet or trijet cross sections alone, negative correlations up to 0.5 are observed between adjacent bins in pt.

Figure 8: I think this figure should be removed from the paper (keep the plot for conferences).

629-630: I think this can be removed. It should be made clear in the figure and table captions whether you quote bin-integrated or differential cross sections and how the bin centres are calculated for the plots.

632-633: remove: "corrected for hadronisation ... in figure 9".
this is for the figure caption

634-635: change to: The theory uncertainties from scale variations dominate over the sum of the experimental uncertainties in most bins.

636-638: I think this statement should be removed. None of your negative correlations is larger than 0.4, so this effect is rather small. For example, when adding two bins with similar errors, the error changes only by a factor sqrt(1+rho)≈0.8 as compared to the uncorrelated case.

653-656: Similar to the case of absolute cross sections, the theory uncertainty ... experimental uncertainty in almost all bins. The experimental uncertainty is dominated ...

656-659: Given the high experimental precision as compared to the absolute cross sections one observes that the normalised ... for many data points.

659-660: Move these lines to 639, I think this effect is already visible for the non-normalised cross sections.

668-669: ... in figure 17, where the error bars correspond to the anti-kt experimental uncertainties.

672-673: (perhaps) remove "whenever a ..."

694: I think the notation \delta_k^{+}m_i is confusing. One thinks that "delta" and "m" are multiplied, which is not the case. Suggestion: use \delta^{+}_{ki} instead.

701: it does not become clear how the covariance matrix is constructed if \mu^U is not equal to 1 or if the uncertainties are asymmetric. I think this has to be explained. Suggestion: The symmetrised uncorrelated uncertainties squared $\mu^U_k(\delta^{+}_{ik}-\delta^{-}_{ik})^2$ are added to the diagonals of the covariance matrix V.

702: remove "inverse": The covariance matrix $V$ thus includes ...

706-733: I think this part should be removed (give a reference to the thesis). It is sufficient to add one sentence after line 704: The correlated and uncorrelated uncertainty fractions \mu_C and \mu_u are summarised in table 3. Details can be found in [ref-thesis].

742-744: ... predictions are often determined ... . In this analysis a different approach is taken.
The theory uncertainties are determined ... ... error propagation [41]. (remove the rest "similar to ... ")

745-746: Uncertainties on alpha_s originating from a specific source of theory uncertainty are calculated as:

747-748: I have difficulties to understand the formula (13) in the form it is written. I think it should be changed to:

$... = f^C\left(\sum_i \frac{\partial\alpha_s}{\partial t_i}\Big\vert_{\alpha_0} \Delta t_i\right)^2 + f^U \sum_i \left(\frac{\partial\alpha_s}{\partial t_i}\Big\vert_{\alpha_0} \Delta t_i\right)^2$

Also, change the text below (13): ... in bin $i$, $\Delta t_i$ is the uncertainty of the theory in bin $i$ and $f^C$ ($f^U$) are the correlated (uncorrelated) fractions of the uncertainty source under investigation.

750-754: remove these sentences

754-757: ... obtained this way are found to be of comparable size to the uncertainties ... method [6,58]. Because formula (13) is linear, the theory uncertainties are symmetric.

765-775: this is almost impossible to understand. I think it will be easier to explain if (13) is rewritten as suggested above. Suggestion (after changing (13)): For the variation of a scale $\mu$ ($\mu=\mu_r\,\text{or}\,\mu_f$) a problem arises with the estimation of the uncertainty, if the contribution $\frac{\partial\alpha_s}{\partial t_i} \Delta t_i$ is small for a given bin or after summing over all bins. This may happen if $\frac{\partial t_i}{\partial \mu}$ is close to zero or has alternating sign for different cross section bins. In order to avoid this problem, a conservative approach [60] is made, using the following replacement $\frac{\partial t^C_i}{\partial \mu}\Delta \mu \to ...$ (Eq 16)

791-802: I think this should be simplified (point to your thesis). The formula is not needed. Suggestion: PDF uncertainties are estimated by propagating and symmetrising the uncertainty eigenvectors of the MSTW2008 PDF set to the results. Details are described in [ref-thesis].

807-816: remove (add [ref-thesis] in 802).
819-835: remove these lines and modify 835-837:

819: ... different groups. (remove text and continue in 835)

835: Half the difference between ... CT10 PDF sets is assigned as PDF set uncertainty [ref-thesis], denoted $\Delta_{PDFset}\alpha_s$.

844: ... together are referred to ...

845: The statistical correlations (Tables XXX-YYY) are taken into account.

847-849: I do not understand this argument. One can see from the chi**2/ndf that there are incompatibilities with the theory, so ultimately there have to be significantly different alpha_s results in different kinematic regions. Also, if you want to discuss this, it fits better after the global fits are presented (after line 865) -> Remove this statement?

851: remove "therefore" (the 30-64% argument is not very strong)

852: ... applicable, and that the ....

853: ... accounts for the unknown contributions ...

855: The $\alpha_s$ fit results, determined from the individual data sets or from the multijet using either the $k_T$ or the anti-...

859-860: I prefer to have the values chi**2/NDF as the ratio of two numbers XXX/NNN.

860: ... $1.02$ ($0.88$), respectively. For the absolute ...

861-865: suggestion for a new text: Note that the theoretical uncertainties on $\alpha_s$ are not considered in the calculation of $\chi^2/N_{dof}$. The fact that $\chi^2/N_{dof}$ degrades as more data are included (multijets as compared to individual data sets) or as the experimental precision is improved (normalised as compared to absolute cross sections), indicates a problem with the theory, believed to be related to higher order corrections which may vary from bin to bin. Similarly, the fact that $\alpha_s$ extracted from the dijet data is below the values found in inclusive jet or trijet data, is attributed to unknown higher order effects.

866: ... uncertainty when considering the variations ...

868: ... cross sections, not considering the multijet fit, the trijet ...

869-870: ... if $\alpha_s$ with the ...
because the LO trijet cross section is proportional to $\alpha_s^2$, whereas the inclusive or dijet cross sections at LO are proportional to $\alpha_s$ only.

871: The best experimental precision on $\alpha_s$ is achieved for normalised ...

873-875: ... variations are somewhat reduced ... of the scales in the numerator and the denominator.

875-878: I do not understand these statements. One problem is to locate $\Delta_{PDF}\alpha_s(M_Z)$ in the table, there it looks quite different. Please make sure that the error names exactly as used in the text appear somewhere in the table. Furthermore, for the multijet I can not see the reduction in uncertainty, so the statement on the reduction is not really convincing.

875-884: This discussion should be shortened a lot. The footnote (6) should be removed. Proposal: The uncertainties from PDFs are of similar size when comparing absolute and normalised cross sections. The residual differences are well understood [ref-thesis].

885: remove the statement "These ... equation 13."

885-886: For the $\alpha_s$ extraction using absolute cross sections, the ...

896: remove footnote 7 (reference thesis in 897).

897: ... to the dijet value [ref-thesis].

898-904: remove this paragraph.

909: ... anti-$k_T$ jets. (remove "differing ...")

914: Add a statement here: Complete next-to-next-to-leading order calculations of jet production in DIS are urgently needed to solve this mismatch in precision between experiment and theory.

918: This value of $\alpha_s(M_Z)$ is the most precise value ever derived at NLO from jet data recorded in a single experiment.

953-954: ... precision of $0.7\%$ is obtained ...

956: the equation has two dots at the end.

957-958: remove this statement, or write: A very similar result is obtained when using the anti-$k_T$ jet algorithm.

960-961: ... sections, albeit with larger experimental uncertainties. (and remove: "which is ... latter")

961: ... between the extracted value of ...

962: ... inclusive jets or trijets.
965: When restricting the measurement to regions of higher $Q^2$, where the scale uncertainties are reduced, the smallest total uncertainty on the extracted $\alpha_s(M_Z)$ is found for $Q^2>400\,\text{GeV}^2$. There, the loss in experimental precision is outweighed by a reduced theory uncertainty, yielding

967: remove the leading "x"

967-968: remove this sentence (it is a technicality).

968-969: The extracted $\alpha_s(M_Z)$ values are compatible within uncertainties ...

970-971: remove this statement

971: add the statement: Full NNLO calculations are highly desired to fully benefit from the superior experimental precision of the DIS jet data.

-- MaximeGouzevitch - 13 May 2014

Topic revision: r1 - 2014-05-13 - MaximeGouzevitch
