Search for dark photon in Higgs decay in four lepton channel at 13 TeV
Color code:
- Answered
- Open discussion
- Not answered
Questions and comments on the way to approval (17-06-2019)
- Check (at least for SM Higgs and qqZZ) the impact of allowing τ decays
*
Not answered
- Discuss in the AN the agreement for the 3P1F 2017 and 2018 plots
*
Not answered
- Result plots minor changes: zoom version of the limit focusing on the low mass, add the quarkonia veto area around the Y, m4l plot for the non-resonant part, general cosmetic improvement
*
Not answered
Questions and comments from pre-approval (17-06-2019)
- Datacards should be checked and approved by the combination group
*
Not answered
- Fill in the StatComm questionnaire
*
Not answered
Questions and comments on AN v3 from HZZ convenor (16-05-2019)
- Some of the appendix materials can definitely be moved to the main body. For example, the data/MC distribution and yield table in the SR can go to Sec 9. I think you need to remove all the data points, not just in the region between 4-12 GeV, although we already “unblind” the rest in HIG-19-001. The reason is just to be blind to your search regions. And maybe for the MC shapes in the plot, you can replace them by the real parameterization instead of MC, because that’s what you feed to the fit. The plot as it is will probably trigger the misunderstanding that you don’t have a good parameterization of the components.
*
The search region 4 < mZ2 < 120 GeV has been reblinded and the corresponding plots in the AN have been updated. Due to the addition of pp -> ZZd, we have decided not to use the shape parametrisation and to switch back to the simpler cut-and-count analysis, as the improvement from a shape analysis is small: it is limited to 5-10% in the upper limits on the various branching fractions.
- In general it would be good to quote the overall signal efficiency, better as a function of the Zd mass, to get a feeling how sensitive the analysis is, and whether there’s room for improvement.
*
A table of signal efficiency has been added to the AN. Overall the signal efficiency looks reasonable across mass points. At very low masses we have lower signal efficiency.
- In sec 10 you show some m2 parameterizations, would be good to have some numbers. Similar to the above comment, I’m wondering what’s the mass resolution of m2, and how they vary as a function of Zd mass, and how they differ in electron and muon channels.
- A question related to the above comment, is the resolution smaller or larger than 5%? Because this is the difference you allow between m1 and m2. For sure if the resolution is > 5%, you lose signals by imposing this cut.
*
Slides 2 and 3 in the attachment show the upper limit on the signal strength and the significance as a function of the window width for various Zd masses, for ZZd. From these plots, a 2% (5%) window for the muon (electron) channel gives the best sensitivity. A new short section has been added to the AN to cover this study.
- I think it’s not mentioned in the note: what’s the bin width for the counting in the H->ZdZd search? Again related to the resolution question, to justify that the choice of the bin covers the resolution well.
*
For ZdZd, the window optimisation has also been studied in the same spirit as for ZZd, and similar conclusions (2% and 5% windows for the muon and electron channels respectively) can be drawn from Slides 4 and 5 in the attachment. A new short section has been added to the AN to cover this study.
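As an illustration of the window choice discussed above, the following minimal sketch tests whether a dilepton mass falls inside a relative window around a hypothesised Zd mass. The window definition |m - mZd| < width x mZd, the function name and the example masses are assumptions made for illustration; only the 2% (muon) and 5% (electron) widths are taken from the study.

```python
# Relative window widths quoted in the optimisation study.
WINDOW_WIDTH = {"mu": 0.02, "e": 0.05}


def in_mass_window(m_ll, m_zd, flavour):
    """Return True if m_ll lies inside the relative window around m_zd.

    Assumption: the window is symmetric and defined as
    |m_ll - m_zd| < width * m_zd. The AN may define it differently.
    """
    width = WINDOW_WIDTH[flavour]
    return abs(m_ll - m_zd) < width * m_zd


# Hypothetical candidates for a 20 GeV Zd hypothesis.
print(in_mass_window(20.3, 20.0, "mu"))  # 0.3 GeV offset, 0.4 GeV half-width
print(in_mass_window(20.7, 20.0, "mu"))  # 0.7 GeV offset, 0.4 GeV half-width
print(in_mass_window(20.7, 20.0, "e"))   # 0.7 GeV offset, 1.0 GeV half-width
```

A wider window keeps more signal when the resolution is poor but also admits more background, which is why the optimum differs between muons and electrons.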
- You probably want to add a paragraph to explain why the limits look as they do now; for sure they don’t follow exactly the bkg m2 distribution. It’s probably a combination of #bkg and signal efficiency (and maybe resolution?) effects.
*
A paragraph has been added.
- I’m wondering which total xsec you used to extract the limit on BR(H->ZZd)? You used an LO MC sample, but the overall Higgs xsec should come from the N3LO prediction, right?
*
For the statistical analysis, the total normalisations of the SM Higgs and dark photon signals are allowed to float freely in the likelihood model. The normalisation of the SM Higgs is constrained by the events outside the mZ2 window. For data/MC distributions, the signal cross section is calculated as the Higgs production cross section (N3LO) multiplied by Br(H->ZZd->4l). Br(H->ZZd->4l) is obtained from this theory paper and is proportional to epsilon^2.
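The normalisation used for the data/MC distributions can be sketched as follows. This is a minimal illustration, not the analysis code: the function names are hypothetical and all numerical inputs are placeholders, not values from the AN or the theory paper. Only the two relations (cross section = Higgs cross section x branching fraction, and branching fraction proportional to epsilon^2) come from the answer above.

```python
def br_from_epsilon(br_ref, eps_ref, eps):
    """Scale a reference branching fraction using Br proportional to epsilon^2."""
    return br_ref * (eps / eps_ref) ** 2


def signal_cross_section(sigma_higgs, br_h_zzd_4l):
    """Signal cross section = Higgs production cross section x Br(H->ZZd->4l)."""
    return sigma_higgs * br_h_zzd_4l


# Placeholder inputs, NOT values from the analysis.
sigma_h = 50.0   # pb, hypothetical Higgs production cross section
br_ref = 1.0e-6  # hypothetical Br(H->ZZd->4l) at eps_ref
eps_ref = 0.01   # hypothetical reference kinetic mixing

br = br_from_epsilon(br_ref, eps_ref, eps=0.02)
print(br)                                 # doubling epsilon quadruples the Br
print(signal_cross_section(sigma_h, br))  # in pb
```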
- What is the signal interference uncertainty in Table 7?
*
We have added a short section in the AN explaining the interference effects between H->ZZd and pp->ZZd.
- Did you also vary the QCD scale and PDF uncertainty in the LO signal sample to derive the systematic uncertainty in Table 7? Was it reflected in “Common theory related uncertainties”? And in the same table, what does “Higher order correction” mean?
*
Sorry for the confusion. It is intended to be the QCD scale and PDF uncertainty for the signal samples. As the production mechanism of the signal is the same as for the Higgs in H->ZZd and H->ZdZd (and pp->ZZd is the same as qqZZ), we take the uncertainties to be the same as those used for the Higgs or qqZZ, but in the likelihood model the nuisance parameters are uncorrelated. The corresponding table in the AN has been updated.
Questions and comments on AN v2 from HZZ convenor (18-04-2019)
Z+X estimation (ZZd)
- Fig 10-12: This is data vs MC expectations. It is known since Run I that it's very hard to reproduce fakes with MC in our phase space (especially for muons). As you don't use MC expectations from DY, tt, etc... in your final estimation, please add a sentence about that. Otherwise reviewers will ask for unnecessary studies about these discrepancies.
*
A sentence has been added.
- Validation with side band: I'm afraid that, given the small contribution for Z+X and low statistics, these sidebands are not able to robustly confirm the Z+X prediction...
*
This validation is meaningful because we can inspect the contributions from Z+X in the sidebands, especially in the region 4 < mZ2 < 12 GeV. From the plots, we can see that the Z+X contributions are negligible compared to qqZZ.
- validation with the wrong-flavour-charge:
i) this looks like a nicer check to me, although you cannot check separately the correctness of the electrons or muons contribution.
ii) you say "pairs of leptons either with same charges or different flavours". Are you saying you are not selecting events with both same charge and different flavours, like e+mu+ mu-e- ?
*
The exact requirement we used is anything other than OSSF final states, i.e. "inverting" the OSSF requirement.
iii) "data agree with the predictions within 20%". Not really true for 2017.
Also, what do you finally extract from this ? Would you rescale your overall prediction to what you observe ? Would you assign some uncertainty to cover the discrepancy ? I guess that at the end, you only want to demonstrate that your prediction is well within the observed values, given the uncertainty you already assign. If so, I would make it clearer in the text and probably move this part after you discussed the systematics.
*
The validations are used to demonstrate that the observed data and the predictions agree within the assigned systematic uncertainties. A sentence has been added at the end of this section. The text for the validations has also been moved after the discussion of systematic uncertainties.
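The "anything other than OSSF" requirement used for the wrong flavour-charge validation can be sketched as below. This is a minimal illustration assuming the requirement is applied to a single lepton pair; the `Lepton` type and function names are hypothetical and not taken from the analysis code.

```python
from collections import namedtuple

# Hypothetical minimal lepton record: flavour is "e" or "mu", charge is +1 or -1.
Lepton = namedtuple("Lepton", ["flavour", "charge"])


def is_ossf(l1, l2):
    """Opposite-sign, same-flavour pair."""
    return l1.flavour == l2.flavour and l1.charge == -l2.charge


def passes_inverted_ossf(l1, l2):
    """Inverted requirement: same charge, different flavour, or both."""
    return not is_ossf(l1, l2)


pairs = {
    "mu+ mu-": (Lepton("mu", +1), Lepton("mu", -1)),  # OSSF, rejected
    "mu+ mu+": (Lepton("mu", +1), Lepton("mu", +1)),  # same charge
    "e+ mu-": (Lepton("e", +1), Lepton("mu", -1)),    # different flavour
    "e+ mu+": (Lepton("e", +1), Lepton("mu", +1)),    # both
}
for name, (l1, l2) in pairs.items():
    print(name, passes_inverted_ossf(l1, l2))
```

With this definition, pairs that have both the same charge and different flavours are kept, which addresses point ii) above.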
- Table 2:
i) the numbers do not seem to scale well with the lumi. For 2016 vs 2017/2018, they should not, as the pixel detector was changed + improvement in MVA eID, so I won't comment. But when I look at 2017 vs 2018: if I naively extrapolate 2017 to 2018 (increase of ~1.44 in lumi), I get 4.3, 4.2 and 7.2 for 4e, 2e2mu+2mu2e and 4mu respectively, instead of the 2.5, 4.6 and 4.4 you report. So overall, 2e2mu is fine. But 4e/4mu do not scale well.
Is it something you observe already in the raw numbers of the CR ?
*
The lower predictions for 4e and 4mu in 2018 are caused by an upward fluctuation in the data count for the 2P2F control region in 2018. For example, going from 2017 to 2018, the total data count in 2P2F increases from 321 (1052) to 643 (1741) for 4mu (4e), while the total data count in 3P1F increases only from 67 (100) to 107 (141) for 4mu (4e).
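The control-region counts quoted above can be compared directly with the luminosity increase. The sketch below only uses the numbers from this exchange (the counts and the reviewer's approximate lumi ratio of 1.44); the variable and function names are illustrative.

```python
LUMI_RATIO_2018_OVER_2017 = 1.44  # approximate value quoted by the reviewer

# (region, channel) -> (data count 2017, data count 2018), from the answer above.
counts = {
    ("2P2F", "4mu"): (321, 643),
    ("2P2F", "4e"): (1052, 1741),
    ("3P1F", "4mu"): (67, 107),
    ("3P1F", "4e"): (100, 141),
}


def growth(n_2017, n_2018):
    """Ratio of the 2018 to the 2017 data count."""
    return n_2018 / n_2017


for (region, channel), (n17, n18) in counts.items():
    ratio = growth(n17, n18)
    print(f"{region} {channel}: x{ratio:.2f} (lumi: x{LUMI_RATIO_2018_OVER_2017})")
```

The 2P2F counts grow by about 2.0 (4mu) and 1.65 (4e), faster than the luminosity, while the 3P1F counts grow by about 1.60 and 1.41, close to it. Assuming the usual HZZ4l fake-rate method, in which the 2P2F-based term is subtracted from the 3P1F-based estimate, a larger 2P2F count lowers the final prediction.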
- Systematic uncertainties : what is the final number you consider for each final state (after summing in quadrature the various contributions) ?
*
A table has been added following the discussion of the Z+X systematic uncertainty. The final numbers range from 50% to 100%, depending on year and final state.
Z+X estimation (ZdZd)
- Please show numbers/distributions for 2018.
*
A table containing Z+X predictions for each final state and year has been added. Data/MC distributions for 2018 with either an inverted mass ratio or an inverted m4l cut have been included.
- Fig 22-25: Could you please re-check both plots and caption ? For instance, Fig 22 and 23 have the same caption. Fig 24 and 25 are identical.
*
The typos have been fixed and the plots have been updated.
- Again, the contribution from Z+X is too small in these sidebands to be able to say that you validate the estimation of Z+X or not.
*
Validations with these sidebands demonstrate small contributions of Z+X background in the signal region.
- Please add a table similar to Table 2 .
*
A table containing Z+X predictions for each final state and year has been added.
Systematic uncertainties
- l299: lumi uncertainty for 2018 is 2.5 %
*
The text has been updated.
- I guess you will also update the paragraph as the text does not reflect fully the uncertainties in the 3 data taking periods (for instance, the numbers for lepton SF are 2016 only)
*
The text has been updated.
Yields and distributions
- Fig 51: there is a clear excess in 2mu2e channel no ? could you please quantify a bit (numbers of data vs expectations in the unblinded region,...) and investigate a little bit where it comes from ? This is not something you saw in the sidebands, right ? In HIG-18-001 we had an excess from low pT muons, especially in 2e2mu, but here it may come from low pT electrons ??
*
After some investigation, we discovered a small bug in our code: the labels for 2e2mu and 2mu2e were accidentally swapped. The excess therefore again comes from the region with low pT muons, similar to what we observed in HIG-18-001.
- Table 19: please add 2018 numbers
*
Yield tables and distributions for 2018 have been added.
- Fig 53-56: you don't expect a lot of events but still... just to be sure, did you blind all distributions?
*
Yes, the signal region is completely blinded.
Parametrization
- Fig 27-28: why do we see a long tail in the distribution for mZd=7 GeV?
*
These are outliers which contribute very little to the overall statistics of the signal sample.
- Fig 30-31: Both captions say "qqZZ", I guess one of it is ggZZ.
*
Yes they are qqZZ and ggZZ respectively. The typo has been fixed.
- No year is mentioned in the plot. Don't you derive these per year?
*
The plots are derived with the 2016 setup. We plan to derive the parametrisation for each year separately.
- Also, only ZZd is mentioned. What about ZdZd ?
*
We do not plan to do a shape analysis for ZdZd because of the limited MC statistics after the full selection.
Extraction of final results
- Fig 33: why don't you have results at 35 GeV ? You generated this mass point as well, no ?
*
Unfortunately we do not have private samples for this mass point. However, we have this mass point in our official signal samples and will add this back when we switch to the official signal samples.
Appendix A
- If I understand well, at the end, you use LO MADGRAPH samples, right ?
*
Yes LO Madgraph samples are used.
- why do we see a peak at 65 GeV for Madgraph ?
*
The bin at 65 GeV includes counts from the overflow bin.
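The effect described in this answer, where overflow entries are folded into the last visible bin and create an apparent peak, can be reproduced with a small sketch. The binning (13 bins of 5 GeV up to 65 GeV), the function name and the input masses are hypothetical and chosen only for illustration.

```python
def fill_histogram(values, n_bins, x_min, x_max, fold_overflow=True):
    """Fill a fixed-width histogram; optionally add overflow to the last bin."""
    width = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < x_min:
            continue  # underflow is ignored in this sketch
        idx = int((v - x_min) / width)
        if idx >= n_bins:
            if not fold_overflow:
                continue  # drop overflow entries
            idx = n_bins - 1  # fold overflow into the last bin
        counts[idx] += 1
    return counts


# Hypothetical masses in GeV; three of them lie above the 65 GeV upper edge.
masses = [12.0, 30.0, 61.0, 64.0, 80.0, 95.0, 110.0]
folded = fill_histogram(masses, n_bins=13, x_min=0.0, x_max=65.0)
plain = fill_histogram(masses, n_bins=13, x_min=0.0, x_max=65.0, fold_overflow=False)
print(folded[-1], plain[-1])  # last-bin content with and without overflow
```

Comparing the last bin with and without folding is a quick way to confirm that a peak at the axis edge is a plotting artefact and not a physical feature.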
- Fig 37: I guess the x-axis is wrong.
*
The x-axis has been fixed.
Questions and comments on AN v1 from HZZ convenor (13-12-2018)
Datasets:
- Table 2: A Re-Reco is already available for 2017 data (31Mar2018). We
should check the exact policy for analysis aiming at publications (ie,
whether or not we could still use the ones listed in Table 2). I guess
that unless you critically depend on some aspect that has a fix only in
the 31Mar2018 ReReco it's fine, but we should check to be 200% sure.
*
As this analysis strategy is similar to the SM HZZ analysis, it does not depend critically on changes in ReReco datasets and MC samples. There is no strong motivation for us to use the ReReco datasets. We can consider switching if the SM HZZ analysis decides to switch for paper publication.
- Trigger Efficiency: what do you do at the end ? Do you apply some SF ?
Do you consider some systematic uncertainties ? (I did not see in Table 17)
*
Trigger efficiency SFs are applied and the corresponding systematic uncertainties have been propagated. Table 17 has been updated.
- Signal: I understand you set kappa to have negligible mixing between
Higgs and Dark Sector. But you have a mixing between SM and DS. In your
analysis you consider the case where you can H->ZZd (via "hypercharge
portal"). But what about ZZd not when the Higgs decays but "directly" ?
Can you have pp->ZZd without Higgs ? If so, there are also interferences
to consider. How do you treat these cases ?
*
The effect of pp->ZZd is summarised in this talk in the HZZ meeting.
Objects:
*
[General comment for the Objects section] We follow the same object definitions as HIG-19-001 for each year. Therefore, the same set of working points and scale factors will be used. We also note that, since the publication timeline for this analysis turns out to be similar to that of the SM HZZ legacy paper, we can consider switching to new object definitions and SFs. We expect them to have a small effect on the sensitivity of our analysis. The text in the AN has been updated to explain our object definitions more clearly, and this entire section has been cleaned up to refer to HIG-19-001 to avoid any confusion.
- l136: you say that the object definition follows the same as the
previous analysis (2016 and 2017). But we actually change the definition
for electrons. We went from BDT ID + RelPFIso < 0.35 to a BDT combining
ID and Iso.
As it seems there is a lot of copy/paste from the 2016 AN, it's not
fully clear to me what you do exactly. I suggest you clean up this
part and explain exactly what you do for the different periods. There is
no need to describe in details the algorithms, variables, the discussion
about TMVA or overtraining, ... You can just refer the e/g twiki and the
HZZ4L notes.
BTW:
- a new recommended training (ID+Iso) is available. Please have a look:
https://twiki.cern.ch/twiki/bin/view/CMS/MultivariateElectronIdentificationRun2
(it's also summarized in the e/g plenary talk during CMS Week).
I suggest you move to it.
- This training is also considered to be usable for 2016 data as well
(although we are in the process of doing a dedicated retraining for 2016
for the HZZ4l analysis)
*
The BDT ID in this analysis for 2017 includes isolation. The text in the AN has been updated to explain our object definition more clearly.
- Which set of e-scale and mu-scales corrections are you using ? The
plots you show in this AN are copy/pasted from the previous HZZ4l AN and
may not correspond to the latest corrections (or dataset used)
available. For instance, you use a ReReco for 2016 while the 2016
analysis was using some PromptReco. But your plot on slide 9 is directly
extracted from AN2016_442...
In general, if you copy/paste material from a previous AN, please
cite it.
- Figure 12-16: did you rederive the SF or are they taken from previous
HZZ4L analyses ? (again, if the dataset changed, you have to rederive
the SF in principle)
*
The SFs are taken from HIG-16-041 using re-reco datasets (03Feb2017).
- Same comments for Muons: I think we have new nice measurements from
Tahir for the T&P but the SF shown seems to be copy/pasted from previous
AN. Is it the case ?
*
The SFs used are the same as in HIG-18-001. The new SFs from Tahir have not been propagated to this version of the AN yet.
Background Estimation
- Figure 37: you say l479-481 that the differences are due to:
a) photons conversions
b) and overlapping isolation cones.
Do you have any evidence for that ? For instance, photons conversions
don't affect muon in principle. So the 4mu and 2e2mu should only be
affected by b). But b) is mostly for mZ<12 GeV while it's not clear to
me that the 3P1F data vs 2P2F predictions disagreement is only located
in this region.
Also, for electrons, 2mu2e is fine. And 4e is not so bad while it should
be affected by both effects...
*
To clarify, the data/MC ratios shown in Figure 37 are calculated as data divided by the MC predictions, not by the data-driven estimate from 2P2F. The corresponding plots have been updated in the AN. For mZ2 > 12 GeV, the agreement between data and the 2P2F predictions is consistent with the HZZ analysis. For mZ2 < 12 GeV, where effects from overlapping isolation cones could be significant, reasonable closure is observed between data and the 2P2F data-driven predictions.
- What about similar plots for 2017 data ?
*
The corresponding plots have been added in the new version of the AN.
- Figure 38: is it 2016+2017 and all channels combined ?
*
The plot is made with 2016 data and all channels combined. We have updated the AN by adding the same plots for 2017.
- l520-526: do you have updated measurements for 2016 or 2017 ?
*
At the time of writing this AN, systematic uncertainties have not been updated and uncertainties derived with the 2016 analysis are used. Systematic uncertainties for 2017 and 2018 will be derived shortly.
- Could you please give tables of predictions of Z+X for both 2016 and
2017, in the different channels ?
*
We have added a table summarising the total predictions for each channel for each year.
- what is the final uncertainty on the predictions for Z+X ? How do you
combine the 30-40% uncertainty from FR composition and the problem of
overlapping isolation cone ?
*
The uncertainties from the two contributions will be added in quadrature, since they are independent, uncorrelated sources of systematic uncertainty.
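A minimal sketch of the quadrature sum for uncorrelated relative uncertainties is given below. The 40% value is the upper end of the fake-rate composition range quoted in the question; the value used for the overlapping isolation cones is a placeholder and not a number from the AN, and the function name is illustrative.

```python
import math


def combine_in_quadrature(*relative_uncertainties):
    """Combine uncorrelated relative uncertainties: sqrt(sum of squares)."""
    return math.sqrt(sum(u ** 2 for u in relative_uncertainties))


fr_composition = 0.40  # upper end of the 30-40% range quoted in the question
overlap_cones = 0.30   # hypothetical placeholder, NOT a value from the AN

total = combine_in_quadrature(fr_composition, overlap_cones)
print(f"total relative uncertainty: {total:.2f}")
```

The quadrature sum is only appropriate when the sources are uncorrelated; fully correlated sources would be added linearly instead.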
Systematic Uncertainties:
- 2.5 - 9% for lepton reco/id: this was the numbers from the 2016 AN.
Numbers from 2017 should be different.
This is actually a general comment: in your table, you have only one set
of systematics while I'm expecting that, at least the experimental ones
should differ for the various data taking periods. Please update.
*
A table for 2017 and a placeholder for 2018 have been added.
Yields:
- Please add tables with the yields (data vs breakdown of backgrounds,
for various channels + total, for 2016 and 2017 separately)
*
A table with the yields has been added to the appendix.
- l569-570: this statement does not seem to be always supported by the plots.
i) Fig 41-43: data/MC ~0.7 for muons with 2016 data
ii) Fig 45: data/MC~0.83 for muons with 2017 data
iii) Fig 48: data/MC~1.14 for electrons with 2017 data
iv) Fig 49: consequence of Fig. 41-43: ~0.81 in 4mu, 0.52 for 2e2mu (2016)
v) Fig. 50: consequences of Fig. 45 and 48: 0.66 for 2e2mu, 1.32 for 2mu2e
Did you try to investigate a bit more these deficits or excesses
(depending on the channels) ? Did you try to quantify the (dis-)agreement ?
*
The data counts that are used to calculate the data/MC ratio do not include the blinded region 4 < mZ2 < 12 GeV. For the region mZ2 > 12 GeV, there is reasonable agreement between data and MC.
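One reading of this answer is that the quoted ratios compared data with the blinded bins removed against an MC expectation that still included them. The sketch below illustrates how this biases the ratio and how restricting both to the unblinded region removes the bias. The bin contents are hypothetical and the function name is illustrative; only the blinded range 4 < mZ2 < 12 GeV is taken from the text.

```python
BLIND_LOW, BLIND_HIGH = 4.0, 12.0  # GeV, blinded mZ2 range from the text


def data_mc_ratio(bins, exclude_blinded=True):
    """Data/MC ratio from bins given as (low_edge, high_edge, n_data, n_mc).

    With exclude_blinded=True, bins overlapping the blinded range are
    dropped from both the data and the MC sums.
    """
    n_data = 0.0
    n_mc = 0.0
    for low, high, data, mc in bins:
        overlaps = low < BLIND_HIGH and high > BLIND_LOW
        if exclude_blinded and overlaps:
            continue
        n_data += data
        n_mc += mc
    return n_data / n_mc


# Hypothetical bins: the data in the blinded bin are hidden (set to 0).
bins = [
    (4.0, 12.0, 0, 10.0),
    (12.0, 20.0, 9, 10.0),
    (20.0, 28.0, 11, 10.0),
]
print(f"{data_mc_ratio(bins, exclude_blinded=False):.2f}")  # biased low
print(f"{data_mc_ratio(bins, exclude_blinded=True):.2f}")   # unblinded region only
```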