General comments and most important issues:
The lifetime analysis and the constraint on the width via anomalous couplings are very different analyses; keeping them in the same AN/paper seems a little hard, even if in the end the results of both analyses could be interpreted in terms of bounds on the Higgs width. A possible choice is to combine the two constraints in a "Combination" paper and keep the two analyses separated.
Dear Nicola,
Overall, we appreciated your careful reading of the analysis note. Please find below the detailed answers. However, for the comment above
and several cases below, let us stress that there is a difference between review in the Analysis Working Group and review in the Analysis
Review Committee. You are an active member of the AWG and were able to follow this analysis from its proposal about a year ago and through
all the development and review within the AWG for the past year. Once the analysis is pre-approved and goes to the ARC, it is not really the right
time to bring up comments which would have been appropriate a year ago, but not now. Having said that, it was agreed within the AWG that both parts of this
analysis, lifetime and limits on the width from off-shell with anomalous couplings, target the same physics, namely a limit on the Higgs boson width,
or equivalently the lifetime, that is as model-independent as possible, from both sides. It is in fact a novel feature of this analysis that a two-sided
limit is set. How can we argue that the two sides should be separated into different publications, especially when the same H->4l events
are used to set both limits? So, we understand your concern, but this kind of concern is a judgement call which does not really put in question
the validity of this analysis, which we review here.
There are a few important issues spotted within the two analyses reported:
- the authors reported the use of a SIP<4 cut when referring to the legacy paper and reproducing the analysis, while the correct cut used in the past legacy analyses is on the absolute value, |SIP|<4. That will change all your numbers and will give larger statistics of events at each step of the analysis when compared with the legacy paper. It is important to fix this issue asap.
We always use SIP_3D to refer to the absolute value, and the plots of SIP_3D show the maximum of |SIP_3D| over the leptons, as they should.
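For clarity, a minimal sketch of this convention (illustrative Python; the attribute names are hypothetical and not taken from the analysis framework):

  def passes_sip3d(leptons, cut=4.0):
      # SIP_3D is always the absolute impact-parameter significance;
      # the event-level quantity shown in the plots is the maximum over the leptons.
      return max(abs(lep.ip3d / lep.ip3d_error) for lep in leptons) < cut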
- the use of sip12, sip34 and chi2 does not give very good data-to-MC agreement, and the discrepancy cannot simply be quoted as a correction factor or as a systematic; it needs to be studied in detail, looking at the performance of the vertex fitting, the stability of the fit and its quality... also, we don't know if there is some physics behind that discrepancy. In addition, the use of the 3 variables could be redundant. Are those correlated? Is it not enough to use the 4l vertex and its chi2 instead of complicating the analysis by using sip12 and
sip34 with respect to Z1? I remind you that the Z1 vertex is built with just two tracks and the uncertainty on its position is larger than the uncertainty on the primary vertex (fit with hundreds of tracks). In order to understand the discrepancies and the physics-related issues, it is important to provide distributions of SIP12 and SIP34 at an earlier step of the analysis, where you should have enough MC to describe the data and understand the source of the discrepancy in the event yield and the shape.
I would suggest building those distributions as soon as the Z1 is built and trying to understand how they behave. It is also important to make a distribution of the error on the vertex position and of the chi2 of the fit and its probability. In the past those distributions were not well modeled and we decided not to use them. I know those studies very well.
It is clear that SIP_3D cannot be used for this analysis. As documented in the AN, the SIP_3D cut a) kills all events with lifetime and b) has worse consistency between data and MC than the new vertex selection criteria. As you also indicate, the variables are correlated, but not perfectly. SIP^Z1_1,2 quantify the compatibility of the Z1 vertex, and SIP^Z1_3,4 quantify the compatibility of the Z2 daughters with the reconstructed Z1. The chi2 cut requires that the reconstructed Z objects remain consistent with a 4l vertex. The correlation distributions, while not documented in AN v5, can be found in earlier talks (e.g. the 21.11.2014 HZZ meeting, backup slides 19 and 20). As we want minimal changes with respect to the SIP_3D cut scheme, we instead cut at values that correspond to the SIP_3D=4 lines. We will add the correlation plots to the AN. Our analysis does not use the Z1 vertex position directly, so in contrast to SIP^Z1, the position and its uncertainty alone are not directly relevant for a cut in this analysis. The editors would also prefer to be pointed to specific documentation when you refer to previous studies. The new studies have been extensively documented and discussed at the HZZ meetings, the HIG PAG pre-approval, and in the Analysis Note, and we find that
the new set of cuts provides as good, if not better, consistency between data and MC. In the end, one of the conservative systematic uncertainties in this analysis is based on data-to-MC
comparison of the vertex resolution, so whatever you are concerned about in the data-MC discrepancy is reflected in the errors.
- all the reweighting/rescaling/smearing procedures across the analyses are not well documented and are difficult to validate since you didn't have MC samples... I would have preferred you to make an effort to generate some samples, even privately, and compare to your guess. This would have made your assumptions more solid and well justified. We cannot simply use only SM Higgs samples to guess things beyond the SM. Also, here and there, there seem to be apparent inconsistencies from the interplay of assumptions valid for the SM and then implicitly used for BSM models.
Let's first be clear about which reweighting/rescaling/smearing you are referring to. First of all, in the width part of the analysis, re-weighting for an alternative
f_LQ value is a simple procedure, which is possible both with a generator matrix element and analytically. Both approaches are consistent, as the plots show.
Re-generating all the large samples would put serious constraints on computing resources, which we cannot afford for this task, which can be covered otherwise.
In the lifetime part of analysis, we have gg->H samples for 0, 100, 500, and 1000 um, and anything in between is trivially covered with re-weighting.
It would be impossible to generate an infinite number of samples, one for each ctau value, and it is not needed.
We have ctau=0 samples for VBF, VH, ttH, and all BSM samples. However, we have shown explicitly that the only relevant effect from this
variation of production is the pT spectrum of the Higgs boson. Therefore, simply re-weighting the pT covers all effects, as shown explicitly
with closure tests for all of the above samples. Once again, generating all ctau samples for all types of Higgs production is not feasible
from a resources point of view, and is not needed. Note that it actually took considerable effort to allocate resources to the samples that we already have.
To explain further, we always reweight a higher lifetime to a lower lifetime and not the other way around, since the event-by-event lifetime of the higher-lifetime sample fully covers that of the lower lifetime (notice also that PV pulls are covered as well, since these pulls occur on an event-by-event basis). We are also experienced enough from the spin-parity analyses to be careful about how to combine the reweighted samples. We will add a plot for the closure test using 100 um in v6.
- your comment about the model-independence of the analysis, given as an answer to the criticism raised against the Higgs width analysis, is also not understood. Most of the work reported in the AN is based on, or dependent on, SM-Higgs-based features and tools. Honestly, I would remove that comment.
We do not claim complete model-independence. What we claim is that anomalous couplings in the decay do not affect the conclusion of the width analysis,
that is, the limits remain conservative.
From Figure 1, we know that the signal contribution is enhanced off-shell for any anomalous couplings with respect to the SM. By having tested both the SM and the LambdaQ contribution (which, as Figure 1 shows, is the most extreme invariant-mass enhancement), we would be testing the two extreme scenarios for the width.
In the end, this is just an analysis note and we make a true statement, and it does not even go into any public paper. Why should we remove this statement?
It is one of the arguments to complete this analysis.
- also, it is not clear how a VBF discriminant built with matrix element techniques with leptons and partons can describe events with leptons and jets... how do you handle the effect of hadronization and jet reconstruction?
The study addressing this concern was presented during the HZZ and pre-approval meetings and is already documented in the AN.
This VBF discriminant has been shown and adopted in the AWG for two years now, and it shows as good if not better performance with
respect to other techniques used in the group. The answer to your question is that the jet carries most of the kinematics of the original quark. The idea
is the same as for the MELA discriminant constructed for the H->4l decay, where the reconstructed leptons also carry most of the kinematics of the original
leptons. In the end, the matrix element is simply a technique and it is used to calculate an observable. This observable may be more or less optimal,
but it cannot be wrong. Studies indicate that this observable is rather optimal.
- concerning the appendix about Z+X: there is not a single plot showing a data-to-MC comparison or any closure test. I think those have to be provided.
First of all, let us say that historically, long before this analysis, the study of Z+X was data-driven because MC did not perfectly reproduce
the effects. Therefore, this analysis is also data-driven in the Z+X estimation and parameterization, and an MC comparison is not as critical.
Nonetheless, the Z+1-lepton tests, which feature mostly DY and W/Z+jets events, are also meant as such a comparison.
Moreover, the data-MC comparisons presented in AN-13-108 Figure 55 already show a mismatch on the MC side,
so we only use the OS/SS ratio predicted by the MC, as done in AN-13-108.
- in the past we always asked another group within the HZZ group to validate the numbers. Is this requirement fulfilled or partially fulfilled?
You probably confuse something here: there is no rule in CMS which requires each analysis to have an independent cross-check analysis.
In CMS, we have published probably close to 1000 papers now, and the majority of those did not have such a test. This analysis is in the hands
of the ARC now and we follow the CMS review procedure for this analysis. You mention the HZZ group, but we are not in the HZZ group review but in the ARC
review. As a side comment: in the HZZ group the legacy analysis is always required to have an independent test. This is a follow-up to the legacy
analysis and the requirement is not as strict, since we are working with the same 20 events that we already discovered a couple of years ago.
Note that neither the width nor the spin-parity analysis, which were also follow-ups to the legacy analysis, had a complete independent test. However,
within the HZZ group, we instead ensure the high quality of results through closure, redundancy, consistency and various other tests.
Detailed comments:
line 69: Here you want to say that the present measurements did not explicitly use any constraint on the boson lifetime but implicitly assumed that it decays promptly... we were looking for the SM Higgs with tau=1.6x10^-22 s. In any case, the MC samples used until now have lifetime=0 in the simulation.
Change to "assume that the boson lifetime is SM-like"?
We understand that we are saying the same thing, just slightly differently.
So, it would be better to focus on the wording in the paper (which will become public) rather than in the analysis note (which remains internal).
About the zero lifetime in the MC samples: in the SM, 99.99% of events decay within 4.4e-7 um, so having the MC at 0 s or 1.6e-22 s makes no practical difference.
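For reference, the arithmetic behind this number (a back-of-the-envelope check of the value quoted above):

  c\tau_H = c \cdot 1.6\times10^{-22}\ \mathrm{s} \approx 4.8\times10^{-8}\ \mu\mathrm{m}, \qquad
  \ln(10^{4})\, c\tau_H \approx 9.2 \cdot 4.8\times10^{-8}\ \mu\mathrm{m} \approx 4.4\times10^{-7}\ \mu\mathrm{m},

i.e. 99.99% of SM decays occur within about 4.4e-7 um of the production point.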
line 104: "tested on shell boson decay"....I would say "tested with a boson with mH=125.6
GeV, mostly in the region around the resonance peak".
Again, we are saying the same thing. Once again, regarding such wording, let's focus on the paper.
Eq. 6... you have to quote this as the most general expression for the amplitude of the A->VV decay, with A of any spin and parity, or similar... it is better to quote a theoretical reference here.
The best reference is already given in the sentence above, that is [21]. This is the most comprehensive overview of
the anomalous couplings.
By the way, this equation only specifies spin-0 couplings as spin 1 and 2 are excluded at or beyond 99\% CL.
Another theoretical reference can be found in arXiv:1309.4819.
Eq. 7... please quote what sigma_1 and sigma_LQ are and what they represent physically.
For the purpose of this analysis, the ratio given just below, mH^-4, is enough.
For deeper reading, see Ref. [21] quoted above; these are also explained clearly in the HIG-14-018 analysis, in either the AN or the paper.
line 122: I'm not sure we can quote a criticism of a published CMS paper
produced by almost the same people... why didn't you raise this issue before publishing the Higgs width analysis if you knew about it? Are you sure that your analysis could not get the same criticism? From the comments below I would suggest not raising this issue and deleting that sentence from the text.
Again, you suggest erasing a statement, while it was our true motivation. Once again, let's focus on what we
write in the public paper, rather than in the internal AN.
Among the issues raised, these are only relevant for off-shell contributions from BSM quarks or scalars, neither
of which has been observed, so they are deemed minor. Internally, adding anomalous couplings from
this factor was discussed, but it was not ready in time for that publication.
line 123: "conservative" or "wrong"?? ... or maybe "model-dependent"?
The original paper was conservative and not wrong, as the results of this paper show.
Figure 1: At which step is this plot produced? What are the values of the cross section for each BSM hypothesis?
The plot is produced at generator level. The full-m4l cross sections are irrelevant for the paper (since we use the shape,
not the rate, to set constraints), and the ones other than LQ are already known from the HIG-14-018 analysis.
line 127-130: how can we validate this procedure and be sure about its correctness? How can we trust this reweighting of MCFM (which version?) for gg off-shell production... how is this contribution of gg off-shell production evaluated and matched with the on-shell contribution provided by MCFM... why did you use MCFM and not JHUGen or the calculation in the theoretical reference? What exactly did you modify in MCFM... did you get any blessing from the generator group or the authors of the generator? ... Is there any systematic uncertainty included to take into account the bias related to this "ad hoc" reweighting? Which versions of MCFM and JHUGen are you referring to?
Other alternative couplings, e.g. fa2 and fa3, are not considered in this paper; they are only included to demonstrate that fLQ is the most extreme variation. MCFM (original samples with v6.7) off-shell production agrees 100% with gg2VV under the same settings. The anomalous couplings are provided from JHUGen as a plug-in to MCFM, in collaboration with the authors. These shapes are used only as a cross-check in this analysis; fLQ is done with an analytic reweighting, and future analyses could investigate other BSM models in the off-shell region.
Figure 2: what is the binning in those two figures, and how can you say that the left plot agrees with Figure 1? I would report fa3=1 in the legend of the plot to make the comparison easier. On the x axis the units are missing... sometimes you quote m4l and sometimes you quote mZZ; the notation should be uniform. In the caption you state that the plot is obtained with MCFM, while I understood that you want to validate JHUGen vs MCFM... Are the plots normalized to the relative cross sections? Can you quote the values? Are those values certified by somebody, for example within the LHC Higgs WG? Figure 2 left: BSI means Background+Signal+Interference... can you quote that in the legend? How did you compute the interference for the pseudoscalar hypothesis and all the others? I don't think gg2VV or Phantom can do that! How can we validate the calculation of the interference?
We can find better binning, and we agree with the aesthetic changes; better plots can be made for the paper. No attempt is made to validate JHUGen vs MCFM in this figure; these validations are done privately among the JHUGen authors. The plots are normalized to their respective cross sections as indicated in the caption, which could be quoted. We stress, however, that Fig. 2 is for demonstrative purposes and has no bearing on the analysis.
line 136: "when other kinematic observables are integrated out"... what does that mean?
Just like in a PDF, it refers to not putting any cuts on the other variables. In the case of the specific plot, we do not look at a particular phase space, we lump everything together and plot the m4l distribution. For example, it could still be that angular variables could show the interference between PS and bkg.
line 157: Is JHUGen 4.8.1 used to generate the Higgs decaying to 4l with different lifetimes? Where are the samples? Are they official production samples? Are they processed with full simulation? Can you list them in a table with the related cross sections? Are they produced for each configuration listed in the legend of Figure 1? Probably not.
JHUGen 4.8.1 was used and the sample names can be provided. They are official full-simulation samples and only for the SM. BSM models can be considered in the lifetime analysis via reweighting. The list will be added in v6.
line 164-165: you quote a reweighting procedure to interpolate between ctau=0, 100, 500, 1000 micron... can you include a plot of something? What exactly did you reweight? How can we check this reweighting? Is there any systematic uncertainty included to take into account the bias related to this "ad hoc" reweighting? Also, you mention that you reweighted the samples according to the production mechanism... are you assuming that a BSM Higgs with anomalous couplings would also be produced with the same production mechanisms as the SM Higgs, i.e. with the same couplings to gauge bosons and fermions? Is that a bias? In principle that is wrong, since you want to test anomalous couplings for the decay of the Higgs but then you assume the SM Higgs couplings for the production mechanism.
We commented on this above. We have all the anomalous-coupling samples for both production and decay, generated with JHUGen
(including gg->H, qqbar->X, VBF, ZH, WH and various decays). The main relevant effect is the pT of the Higgs boson, while everything
else has little effect on the observables. Therefore, re-weighting the pT is sufficient to model lifetime samples for different production
mechanisms. There is no systematic uncertainty on the re-weighting, as it is simply an exponential function.
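As an illustration of the pT re-weighting referred to above, a minimal sketch assuming a simple binned ratio of generator-level Higgs pT spectra (function names and binning are illustrative, not the actual framework code):

  import numpy as np

  def pt_weights(pt_ref, pt_target, pt_events, bins):
      # Bin-by-bin ratio of the target to the reference Higgs pT spectrum,
      # then a per-event weight looked up from that ratio.
      h_ref, _ = np.histogram(pt_ref, bins=bins, density=True)
      h_tgt, _ = np.histogram(pt_target, bins=bins, density=True)
      ratio = np.divide(h_tgt, h_ref, out=np.ones_like(h_tgt), where=h_ref > 0)
      idx = np.clip(np.digitize(pt_events, bins) - 1, 0, len(ratio) - 1)
      return ratio[idx]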
line 168: why did you use gg2VV 3.1.5 if I produced samples with 3.1.6 and they were processed through full simulation ?
We use the version quoted in the Higgs width publication. We will fix it if the version number is incorrect.
line 186: Even if we could use the same K-factor for the background and the SM signal, nothing tells us to use the same K-factor also for the BSM signal.
The signal looks exactly the same as the SM in the tensor structure of all interactions; therefore the same K-factor applies.
fLQ only requires analytic reweighting. Besides, as long as the NLO or NNLO corrections do not contain loop corrections due to the Higgs boson, the SM and BSM
K factors would be similar.
Section 4: Lifetime analysis
line 199: concerning Figures 3, 4 and 6, you quote the new cuts in the legend even though the new cuts have not yet been discussed. Also, you mention data in the text but no data are shown in the plots. The title of the histogram mentions data but data are not shown in the plot.
The text does not specify what we do for the vertex selection, and we think it is better to present the distributions with a selection that has almost no effect on Txy than to replace the plots with the old selection. Furthermore, we prefer to leave the text quoting the data, since Fig. 3 will be replaced with an unblinded version, just like what is shown for Dbkg.
line 211: is the Dbkg the same as used before to discriminate HZZ from qqZZ in the legacy paper, or is it a new one including rejection of ggZZ and Z+X? Do you build D_bkg for each theoretical model, or is it the same as for the SM? How is it changed, if at all?
For the lifetime analysis, the definition is the same as in the legacy analysis.
Note that the strongest discriminating observable is m_4l, as part of D_bkg.
D_bkg basically combines m_4l and the MELA D_bkg^kin.
It has been shown that there is little dependence on alternative spin-parity models, see
https://twiki.cern.ch/twiki/bin/view/CMSPublic/Hig14018PaperTwiki
Also, Fig. 3 bottom right shows that the performance of the discriminant for the gg and qqbar backgrounds is very similar.
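For reference, the generic form of this combination (a schematic only; the exact parameterization follows the legacy documentation):

  D_{\rm bkg} = \left[ 1 + \frac{ \mathcal{P}^{\rm kin}_{\rm bkg}(\vec{\Omega} \mid m_{4\ell}) \, \mathcal{P}^{\rm mass}_{\rm bkg}(m_{4\ell}) }{ \mathcal{P}^{\rm kin}_{\rm sig}(\vec{\Omega} \mid m_{4\ell}) \, \mathcal{P}^{\rm mass}_{\rm sig}(m_{4\ell}) } \right]^{-1}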
line 222: the cut on SIP we used for the legacy paper is on the absolute value of SIP. Using SIP<4 you are keeping events also with SIP<-4, while before they were excluded. That also could give you more events in the final estimates. Did you compare the numbers for Z+X, signal and ZZ obtained with SIP<4 with those reported in AN-108 v9 for the legacy paper? Can you provide this table?
Would you propagate this change of the SIP cut to the full results of the analysis? I don't think that is a typo. You are convinced that we used SIP<4, and that is not true.
See response to general comments, we use absolute value.
line 228: sentence not completed.
It seems complete to the editors.
line 232-234: the SIP cuts should be on absolute values. "with respect to Z1" should be "with respect to the Z1 vertex built with the Kalman vertex fitter". Is the Kalman vertex fitter used, a kinematic fit, or something else?
Both the kinematic vertex fitter and the Kalman vertex fitter were checked, and the differences observed were ~1e-4 um in the 4l vertex coordinates, so we use the Kalman vertex fitter. We would like to refer the ARC to slide 17 of the HZZ meeting presentation on 17.10.2014.
Here the problem I see is that the uncertainty on the position of the Z1 vertex, built with only 2 tracks, can be much larger than that on the primary vertex, which is built with many more tracks. Is the SIP cut on leptons 1,2 applied to both leptons or to at least one lepton?
The cut is applied on both leptons.
The SIP cut on 3,4 is the SIP of leptons 3 and 4 with respect to the Z1 vertex. Here the uncertainty of the vertex fit will be much larger because at least one of the tracks could be non-isolated, low-pT and more difficult to reconstruct, especially in the background.
We do not change the isolation requirements.
The 4l vertex built with the Kalman vertex fitter will suffer from a large uncertainty with respect to the primary vertex. Also, the fit is believed to be unstable, and we demonstrated in the past that this was the case. Could you try to use only the 4l vertex and its chi2 instead of using sip12 and sip34 in addition to it?
The tracker-based uncertainty on the 4l vertex has been shown in previous talks (i.e. what we called delDxy). We found no issues with the 4l vertex. Please refer to slides 5 and 22 shown at the HZZ meeting on 21.11.2014 and slides 5-12 on 17.10.2014.
In general, there is little difference between using only chi2 or using sip12 and sip34 in addition, but we chose to use
sip12 and sip34 in addition mostly for consistency with the legacy analysis. Once again, nothing will change much one
way or the other; the two sets of cuts are highly correlated.
Are those cuts on the 4l vertex and SIP redundant or highly correlated? I guess so! Can you plot the correlation of those variables?
See the comment in the general comments section.
Also, the comparison with data was not so good in a wide enough phase space... that is why we decided not to use them. Are you sure that the
3 new observables and their uncertainties are well modelled in real data? Did you include an uncertainty related to that? Is the chi2 of the vertex fit a reliable observable to describe data via MC? Would it be better to use the probability of the chi2?
Please see our comment on tracker-based uncertainties.
How many times does the fit fail, and do you lose a good candidate?
What are the efficiency and the purity of the vertex fitter? We need those numbers for the Z1 vertex and the 4l vertex, and we need to compare them with those for the primary vertex.
The observed failures of the Z1 vertex fit in the MC are at the rate of ~1/1M after all other selections, and the Kalman vertex fitter is seen to fail only twice among all of the samples we have processed so far. No failures are observed in the 4l data events.
line 240-242: how did you choose those cuts? Did you optimize them by using the significance or the p-value? Are you sure we still get a 7 sigma discovery? Did you validate those cuts in some way? Can you produce a ROC curve of signal efficiency versus background rejection?
See the comment in the general comments section. We also add that a 3-dimensional ROC curve is not very illuminating to analyze.
line 249: why did you use the range 105.6-140.6 if in the past we used 121.5-130.5? Is there any justification for the choice of this range?
This is the range used in the prior width paper and the spin-parity analysis, which refers to a -20/+15 GeV window. We do not change these ranges.
121.5-130.5 was never used in any analysis; it was used only for illustration in plots. Here we want to keep the sidebands to retain background events for modeling.
Table 3: please include a table of comparison with the old selection in the legacy paper (Table 17 in AN-108 v9), and also with the SIP<4 cut without the absolute value.
We can split this table for each channel. Our framework does not look at the sign of SIP_3D, and it is also irrelevant for this analysis.
Figure 11 top: the binning is not defined. The meaning of this plot is not clear, and what is the conclusion? Can you plot the mean and the RMS of the distribution of the bottom plot and compare them with the same quantities for the position of the offline primary vertex? Can you compare the bias and the RMS and report them in a table?
The explanation of these plots is on lines 254-262 (for v5). The mean of the bottom plot is <1 um in each selection scenario (notice the plot shows the 4l vertex). We also think that the quantification of the bias is not as easy as quoting the RMS and the mean. Please note that the bias is correlated with the number of tracks used for the OPV, and comparing the RMS as a function of lifetime, in particular to the RMS of the 4l vertex or the RMS of the 0-um samples, assumes the uncertainty on the PV or the 4l vertex is perfectly Gaussian.
line 258: it is not true that all the reconstruction remains as in the legacy analysis, because the 4l vertex would require extensive validation with data in the signal and control regions, and that was not done... this comment applies even more to SIP_Z1 and chi2.
Sideband comparisons of data and MC are already presented for both 4l and 3l events, both in the AN and in previous talks. We would also like to remind you that there is a large overlap in the events that pass either the new or the old selection. It is not correct that extensive validation is needed, since almost all of the events have already been validated.
Eq. 10... what is the difference with respect to the standard likelihood used for the legacy paper? Is it only related to the different definition of Psig and Pbkg?
That is correct. Our main goal here is to emphasize that Dbkg and Txy are decoupled for a particular production mechanism.
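Schematically, the factorization being emphasized here is (a sketch only, not the exact notation of Eq. 10):

  \mathcal{P}_{\rm sig}(T_{xy}, D_{\rm bkg};\, c\tau_H) \simeq \mathcal{P}_{\rm sig}(T_{xy};\, c\tau_H) \times \mathcal{P}_{\rm sig}(D_{\rm bkg}),

so for a given production mechanism the lifetime information enters only through Txy.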
line 274: it is not true that the kinematics of the decay and the vertex position are completely independent, since the quality of the 4l vertex depends on the quality and the number of the tracks and thus on the decay. Can you plot the two observables, one vs the other, and check the correlation?
It was already shown in previous presentations in HZZ meetings that processes apart from Z+X have essentially the same SIP and chi**2 distributions, and the vertex resolution issues were discussed during our ARC meetings.
We have now shown this explicitly with various BSM samples for the Higgs boson decay and observe no difference.
line 280-281... I want to understand how the interpolation is done and to see a plot demonstrating its quality.
See Fig. 4 top. All points besides 100, 500 and 1000 um are from interpolation. Reweighting in 10-um intervals is done exactly by reweighting each event at larger ctau to smaller ctau by exp(-[t/ctau_H2 - t/ctau_H1]), since the true distribution is an exponential decay distribution.
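A minimal sketch of this re-weighting (illustrative Python; the normalization prefactor can equivalently be absorbed by renormalizing the summed weights):

  import numpy as np

  def ctau_weight(t_true, ctau_gen, ctau_target):
      # Ratio of exponential decay pdfs p_target(t)/p_gen(t); intended for
      # ctau_target <= ctau_gen so that the weights stay bounded.
      return (ctau_gen / ctau_target) * np.exp(t_true / ctau_gen - t_true / ctau_target)

  # e.g. reweight the 500 um sample to a 430 um hypothesis:
  # weights = ctau_weight(t_true, 500.0, 430.0)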
Section 4.4 Systematic uncertainties
line 301: Why did you study only Sideband 2 for Z+X?
We divide the data into 4 regions: Sideband 1 (the Z->4l region), where the events are almost exclusively qq->Z->4l; the signal region; Sideband 2 (where the relative Z+X contribution is similar to the signal region); and Sideband 3 (the region with the 2m_Z peak, where the qqZZ background is dominant over all other processes). We use Sidebands 1 and 3 for the prompt-decay systematics since the relative Z+X contribution there is negligible, and we use Sideband 2 as a statistically independent m4l window for the Z+X systematics itself.
Figure 13: the binning is not defined. Can you plot the ratio of data to the sum of MC? Can you plot the distribution of SIP3,4 with respect to the Z1 vertex? There is a big discrepancy in the chi2 distribution, as expected. Do you have an idea of where it comes from? Are you sure you can handle that only as a systematic? At which step of the selection are those plots filled? Can you plot Txy in log scale and plot the ratio?
Since we have many of these plots generated in a streamlined way, we prefer to leave the y-axis label as it is. We could change the labels as a special instance for the paper once we converge on the plots that go to the paper or the twiki.
We do have the SIP^Z1_34 distributions, but we already indicate in the legend our reason for omitting them.
As discussed on lines 305-318 (for v5), we choose not to assign additional systematic uncertainties to the new vertex selection other than those already present for the legacy analysis. As also mentioned on those lines, we disagree with the comment that the disagreement is big, since our cut is in a flat region of the tail.
Table 3 needs to be evaluated with the cut on the absolute value of the SIP variables! Is the source of the 1.6 sigma understood or guessed?
We never use the sign of SIP_3D. In Table 5, many of the channels have an uncertainty much less than 1.6 sigma. While we do not know the reason for the discrepancy, it is also a valid question for SIP_3D, for which there is no documented answer. We believe that these two discrepancies would be from the same sources.
Figure 14: can you make the plots in log scale and define the binning on the plots? There are discrepancies in the plots of Dxy and Txy as well as for
SIP3D and chi2... are those understood? It is not enough to include a correction factor like in Table 4... we need to understand the physics behind them. How does the error on the vertex constraint propagate to the final result?
Anyway, what is more important is to show that for Z+X there is good data-to-MC agreement, since that is where the vertex variables should be very useful to reject the background.
The discrepancies are within statistical uncertainties. We could provide a log-scale plot for Txy, but we do not think it is critical for the analysis.
While understanding the physics behind a discrepancy is important, since discrepancies could be caused by many factors including additional misalignment, we prefer to approach the issue more pragmatically.
The plots in Figure 15 top are messy; can you separate the final states please?
The editors opt to keep them together to allow the comparison between different channels.
line 340-342: in Sideband 1 you should have enough Z+X events, and they are the most difficult to model... did you try to check the variables in Sideband 1 for non-prompt decay events?
Z+X estimation in Sideband 1 has already been done during the spin-parity Z->4l cross checks, and comparing to the qqZZ yield contribution, we find it negligible for this analysis.
line 344-350... the parametrization does not work very well, looking at Figure 16... please include the chi2 of the fit... do you really need to parametrize the shape? Can you just add the histograms?
Please see the explanation given in the text. The parameterization works very well for the prompt processes, and it is only supposed to model non-prompt Higgs decay perfectly in the event that there is no OPV pull.
figure 17: please specify the binning.
Please see previous comment on specifying the binning.
line 393: what does "as opposed to the data sample " mean ?
We refer to samples from the main analysis selection as the data sample and those from the CR selections as the Z+ll or Z+l control regions. This distinction can be clarified in v6.
line 443 to the end... I don't think you can use the same production-mechanism reweighting also for the exotic models, since the different couplings to fermions and gauge bosons change the production cross section and the relative contribution of the different mechanisms to the cross section. Please also check why Dxy depends on the production mechanism...
This was already discussed during the ARC meeting.
figure 21: binning of middle and bottom plots is not clear.
Please see previous comment on specifying the binning.
Table 4 for systematics seems incomplete... it does not include all the systematics related to the several reweighting procedures used in the analysis, nor the impact of the fitting procedures, interpolation, extrapolation, and assumptions about the exotic models...
Re-weighting is exact, there is no uncertainty.
Fitting is a method and it propagates all uncertainties in the pdf.
Exotic models are all covered with the conservative range of pT assumption from the extremes of gg->H and ttH.
We do not assign additional systematics for MC statistics.
Figure 22: How is this plot modified in the exotic models?
This will be addressed with embedded Asimov datasets, as discussed during the ARC meeting,
but again, the exotic models are all covered with the conservative range of the pT assumption from the extremes of gg->H and ttH.
Comments about the width analysis:
- here the modifications of the approved Higgs width and high-mass paper analysis seem OK to me. Actually, the result on the Higgs width with the inclusion of the treatment of VBF full simulation samples and di-jet categories makes the analysis more accurate than in the past, hoping that it does not contradict the previous result on the Higgs width too much. In the last meeting with the ARC you said that the result is more or less the same. We need final numbers and plots to compare.
This has been presented multiple times before at HZZ/HZZ4l meetings and also in the last ARC meeting.
line 498-499: there seems to be a contradiction... the parameter a affects the sensitivity of the analysis, but then if it is varied up and down by a factor of two the sensitivity does not change?? Is something missing?
The parameter a has a very small effect on the sensitivity of the analysis; a=10 was chosen as it was expected to be near the expected sensitivity, and this was tested extensively before. There is no contradiction.
line 501-505... here is a question: the matrix element technique can only produce a discriminant to separate 4l+2 forward partons and H+2 partons... that is because MCFM and JHUGen only provide partons at generator level, and the matrix element calculations can be done only with basic matrix elements. How can the discriminant be valid once we use jets instead of partons... in other words, what is the effect of hadronization and jet reconstruction? How can we trust the VBF discriminant built with matrix element techniques if the jets cannot be modeled at generator level? How can we trust a VBF discriminant built with matrix elements with leptons and partons, if instead we need to handle leptons and jets? How do you introduce the effects of the detector and model them? What is the purity of the events selected with the MELA VBF discriminant? We don't need only the efficiency and the ROC curve. Also, what about NLO QCD effects on the discriminant? Is JHUGen as accurate as MINLO in handling events with two partons/jets? Why not use MINLO to build the discriminant?
vbfMELA is validated and parameterized with the full simulation samples, which include all effects that you mention.
The method itself does not need to be optimal to be correct, but it also happens to be very close to optimal.
This has been presented in HZZ/HZZ4l studies for over a year and a half.
The drastic difference in parton behavior between HJJ and VBF leads to very good discrimination even without any hadronization/reconstruction effects.
The performance is roughly identical in either case, as expected.
In addition, all the discriminants used for the width analysis are built using the SM expectation... how can we use them for BSM models without changes or additional studies? This seems to me a bias.
fLQ only requires an analytic reweighting of the mass and has no effect on the decay kinematics. fLQ would have identical production kinematics.
In the width part of the analysis we do not pretend to cover all BSM models; we cover only fLQ models. However, we cover them in both production
(relevant to VBF and VH) and decay.
Figure 28 right....the distribution of the number of jets is different across the generators used. Are you sure you understand that ? Are you sure about the validity of the reweighting used ?
Jet multiplicity is different between the LO and NLO generators, seemingly due to the effect of the hadronizers. The VBF WG and MC conveners are aware of this, and it is not a large concern. Reweighting can be done on the jet multiplicity to account for the differences; the only expected change would be in the categorization between Djet>0.5 and Djet<=0.5, which shows little difference in the left panel of Fig. 28 and is covered very well by the systematics assigned on the splitting.
line 563-564: the reweighting with Phantom is not clear.
We will include more explanations in v6.
line 567: the procedure to create templates and the code for it have not been made available to the HZZ group, even since the time of the Higgs width analysis... the result is that the templates could not be reproduced; this part of the analysis is not reproducible. Can Ulascan release the package used to handle the templates and make it available so that somebody else can try to use it and check the output templates?
The package has been available publicly for a while, but there may have been a miscommunication regarding its distribution.
Please also see the framework comment about validation above.
In either case, this comment is not for the ARC review, but for discussion in the AWG.
line 588-590: another example of an unclear assumption/extrapolation... what physical reasoning is it based on?
On-shell, mZZ/mH=1, so fLQ reweighting is also valid on-shell.
section 5.4: you need to add a systematic uncertainty related to the missing modeling of the impact of the detector and hadronization on the VBF discriminant.
The VBF discriminant is validated and parameterized with full simulation, so all effects are included.
The variables are calculated from reconstructed information, so systematics on the ME are irrelevant as long as data
and MC are treated the same. Comments on hadronization etc are more relevant for modeling the data with the MC.
line 618 and 629: questionable rescaling procedures... clearly rescaling does not mean understanding the issues.
Rescaling is only done as a systematic check. Please also see the comment above.
figure 32: comments come from what we discussed in the last
ARC meeting
Finally, it could be more appropriate to quote as authors only the people that did the real work. That is what we did also for the Higgs width AN and the differential cross section AN.
The editors decided to quote as authors only the people that have a direct analysis contribution.
This implies people who developed the framework for processing the samples and those who contributed to
the development of the statistical analysis.
In either case, this comment is not for the ARC review, but for discussion in the AWG.
Comments about the appendix on the Z+X estimate:
Overall, the SIP cut should be on the absolute value.
See general comments
line 672: Is the |SIP|<4 cut applied to tight leptons? If so, is it the SIP with respect to the primary vertex or to the fitted Z1 vertex?
|SIP_3D|<4 is always wrt the beam spot as it has always been in the previous analyses.
line 681: the chi2 for the three leptons itself requires full validation with data.
See Fig. 14 top for the relevant chi2 distribution.
Comparing Figures 35 and 36 with Figures 59 and 50 of AN-108 v9, I noticed a higher fake rate for muons in the first bins (the most important ones) with respect to the legacy paper. Is this related to the mistake of applying the cut on SIP instead of on the absolute value of SIP? Can you please verify that? In addition, the fake rate is slightly higher in the case of the new cuts w.r.t. the old cuts... an expanded y scale would help the comparison.
As expressed a number of times and agreed by the HZZ group, our goal is not to synchronize the frameworks entirely. Please also refer to Table 13 vs 14 for a comparison of the two schemes, where the differences are due to statistics in the Z+ll OS region.
In Figs. 35 and 36, the fake rates are the right-most plots, not the left or the middle ones, and they are almost exactly on top of each other.
Figure 40: can you use a log scale and improve the binning? It is very hard to spot the differences between the histograms.
While it is good practice to make sure there are no differences, the low-m1 regions in Figure 40 are consistent within statistics, with very little event content. Therefore, the editors do not think this plot needs further improvement.
line 721: can you plot the mZ2 in the case of Z+2l OS ? Or better m4l ?
The editors think the quantitative picture does not change much for m2 or m4l, so they will be omitted from the AN. They could certainly be added as a back-up slide.
eq 22: the transfer function was not quoted for the legacy paper. What about eq. 16 of the AN_108_v9? Is that used like that ?
The transfer function was used in the past for the spin-parity and width analyses, and we construct it in the same way as done before.
There is not a single plot showing a data-to-MC comparison or any closure test. I think those have to be provided.
Please see relevant comments above.