HIG-12-001
"A search using multivariate techniques for a standard model Higgs boson decaying into two photons"
This is a condensed description with plots for the analysis HIG-12-001
Abstract
A search for a Higgs boson decaying into two photons is described. The analysis is performed using a dataset recorded by the CMS experiment at the LHC from pp collisions at a centre-of-mass energy of 7 TeV, and corresponds to an integrated luminosity of 4.8/fb. Limits are set on the cross section of a standard model Higgs boson decaying to two photons. The expected exclusion limit at 95% confidence level is between 1.2 and 2.1 times the standard model cross section in the mass range 110-150 GeV. The observed limit excludes at 95% confidence level a standard model Higgs boson decaying into two photons in the mass ranges 110.0-111.0, 117.5-120.5, 128.5-132.5, 139.0-140.5 and 146.0-147.5 GeV. The largest excess of events over the expected background is observed around 125 GeV. Taking into account the look-elsewhere effect in the search range 110-150 GeV the excess has a global significance of 1.6 standard deviations. More data are required to ascertain the origin of this excess.
Further details
The search is a development of that reported previously, CMS-HIG-11-033 (published in Physics Letters B), and uses multivariate techniques to improve the search sensitivity. The expected limit is improved by 20% in terms of cross section.
To improve the sensitivity of the search, selected diphoton events are subdivided into classes according to the output value of a diphoton BDT which is designed to have the following properties:
- it assigns a high score to events with signal-like kinematic characteristics,
- it assigns a high score to events with good diphoton mass resolution,
- it assigns a high score to events whose photons score highly in the photon identification BDT,
- it is mass independent, i.e. it does not select events according to their invariant mass.
Five mutually exclusive event classes are defined, four defined by the diphoton BDT output, and a fifth class into which are put all events containing a pair of jets passing selection requirements which are designed to select Higgs bosons produced by the vector boson fusion process.
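The class assignment described above can be sketched as follows. This is only an illustration: the boundary values in `BDT_BOUNDARIES` and the function name `assign_event_class` are hypothetical, and the sketch assumes that the dijet tag is tested first and that events below the lowest boundary are discarded.

```python
from typing import Optional

# Hypothetical lower edges of the four diphoton BDT classes, best class first.
# Placeholder numbers for illustration only, not the values used in the analysis.
BDT_BOUNDARIES = (0.9, 0.7, 0.5, 0.1)
DIJET_CLASS = 4


def assign_event_class(diphoton_bdt: float, passes_dijet_tag: bool) -> Optional[int]:
    """Return the event class index (0-4), or None if the event is not used."""
    # Assumption: the dijet tag is checked first, which keeps the classes exclusive.
    if passes_dijet_tag:
        return DIJET_CLASS
    for class_index, lower_edge in enumerate(BDT_BOUNDARIES):
        if diphoton_bdt >= lower_edge:
            return class_index
    # Below the lowest boundary: the event does not enter any class.
    return None
```

Each event ends up in at most one class, so the five mass spectra can be fitted independently and then combined.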
The background model is obtained by fitting polynomials to the observed diphoton mass distributions in each of the five event classes. A cross check analysis is performed using an alternative background model where the result is extracted from a fit to the output distribution of a BDT (mass-window BDT) which has two inputs: the diphoton BDT output, and the mass. The results of the cross check analysis are consistent with those obtained from the mass fit.
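As a rough illustration of a polynomial background fit, the sketch below performs an ordinary least-squares fit of a polynomial to a binned toy spectrum. It is a simplified stand-in, not the fitting procedure of the analysis: the toy counts are invented, the helper names are hypothetical, and the mass is mapped to the interval [0, 1] purely for numerical stability.

```python
from typing import List, Sequence


def fit_polynomial(x: Sequence[float], y: Sequence[float], degree: int) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*x + ... (lowest order first)."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j) and b[i] = sum y * x^i.
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Toy spectrum: 1 GeV bins from 100 to 180 GeV with an invented falling shape.
mass_centres = [100.5 + i for i in range(80)]
scaled = [(m - 100.0) / 80.0 for m in mass_centres]
toy_counts = [500.0 - 600.0 * t + 250.0 * t * t for t in scaled]
coeffs = fit_polynomial(scaled, toy_counts, degree=2)
```

In this toy the fitted coefficients reproduce the generating polynomial; with real data one such fit would be made per event class.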
Figures from the PAS
Image |
File links |
Description |
|
pdf png |
Figure 1: Background model fit to the mgg distribution for the five event classes, together with a simulated signal (mH=120 GeV). The magnitude of the simulated signal is what would be expected if its cross section were equal to the SM expectation. The sum of the event classes together with the sum of the five fits is also given. a) The best event class defined by diphoton BDT output value. |
|
pdf png |
b) The 2nd best event class defined by diphoton BDT output value. |
|
pdf png |
c) The 3rd best event class defined by diphoton BDT output value. |
|
pdf png |
d) The 4th best event class defined by diphoton BDT output value. |
|
pdf png |
e) The dijet-tagged event class. |
|
pdf png |
f) The sum of the event classes together with the sum of the five fits. |
|
pdf png |
Figure 2: Examples of fits to determine the fraction of events in BDT output bin 0. The value of the fraction of events is shown on the y-axis. a) For a Higgs hypothesis mass of 120 GeV where three sideband windows can be used on either side of the three excluded signal windows. |
|
pdf png |
b) For a Higgs hypothesis mass of 110 GeV where only one sideband window can be used on the low side without using data below 100 GeV and hence five sideband windows are used above the three excluded signal windows. |
|
pdf png |
Figure 3: Example fit of a double power law function to the data over the mass range 100 to 180 GeV, used in the (cross-check) analysis to obtain the background normalization. The 2% signal window shown by the red vertical lines, centred on a Higgs mass of 120 GeV for this example, is excluded from the fit. The values of the three parameters from this fit are a = 4.27 +- 0.05, b = 0.02 +- 0.22 and c = 10.0 +- 2.3. The chi-squared is 158.3 for 156 degrees of freedom, corresponding to a probability of 43%. The chi-squared is calculated by binning the data as shown and integrating the fitted function over each bin. |
|
pdf png |
Figure 4: Exclusion limit on the cross section of a SM Higgs boson decaying into two photons as a function of the boson mass relative to the SM cross section, where the theoretical uncertainties on the cross section have been included in the limit setting. The limit is calculated using the frequentist CLs method. The expected limit obtained in the earlier analysis of the same dataset is shown for comparison. By popular demand: the observed and expected limits at mH = 125 GeV are 2.87 and 1.20, respectively. |
|
pdf png |
Figure 5: Observed local p-values, for the combined event classes, and also for the dijet-tagged class and the combination of the other four classes. |
|
pdf png |
Figure 6: The best fit signal strength, in terms of the standard model Higgs boson cross section, for the combined fit to the five classes (vertical line) and for the individual contributing classes (points) for the hypothesis of a SM Higgs boson mass of 125 GeV. The band corresponds to +- 1sigma uncertainties on the overall value. The horizontal bars indicate +- 1sigma uncertainties on the values for individual classes. |
|
pdf png |
Figure 7: Exclusion limit on the cross section of a SM Higgs boson decaying into two photons as a function of the boson mass relative to the SM cross section, where the theoretical uncertainties on the cross section have been included in the limit setting. The limit is calculated using the modified frequentist CLs method and uses the mass sideband background model. |
|
pdf png |
Figure 8: Observed local p-values, obtained using the mass sideband background model, for the combined event classes, and also for the dijet-tagged class and the combination of the other four classes. |
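The fit probability quoted in the caption of Figure 3 (chi-squared of 158.3 for 156 degrees of freedom) can be cross-checked with a few lines of code. The sketch below uses the Wilson-Hilferty normal approximation to the chi-squared distribution, chosen here only so that the example needs nothing beyond the standard library; the function name `chi2_probability` is hypothetical.

```python
import math


def chi2_probability(chi2: float, ndof: int) -> float:
    """Approximate P(chi-squared >= chi2) for ndof degrees of freedom.

    Uses the Wilson-Hilferty normal approximation, adequate for large ndof.
    """
    v = 2.0 / (9.0 * ndof)
    z = ((chi2 / ndof) ** (1.0 / 3.0) - (1.0 - v)) / math.sqrt(v)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p = chi2_probability(158.3, 156)
print(f"{p:.2f}")
```

The approximation gives a value close to the 43% quoted in the caption.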
Analysis flowchart
Data/MC comparison
Image |
File links |
Description |
|
100-180 90-180 90-180x5 |
Diphoton mass distribution for data (data points), and Monte Carlo simulation of SM processes which constitute the background to the search (histograms). A simulated signal for a Higgs boson with a mass of 120 GeV is shown by the red histogram. The shaded band represents the theoretical (k-factor) uncertainty on the MC prediction. The plot is available in three versions. In two versions, covering different mass ranges, the magnitude of the signal is what would be expected if its cross section were equal to the SM expectation, and in the other it is SM x 5. |
Vertex identification and correct vertex probability
Image |
File links |
Description |
|
pdf png |
Fraction of Higgs boson vertices found within 10 mm of their true location, for a Monte Carlo signal sample (mH = 120 GeV), as a function of the Higgs boson transverse momentum. The distribution of the number of interactions per bunch crossing (nPU) in the Monte Carlo is adjusted to be the same as in the data by weighting the events. |
|
pdf png |
Validation of the vertex selection BDT with Z→μμ events. |
|
pdf png |
Validation of the vertex selection BDT with photon + jet events. |
|
pdf png |
Validation with Z→μμ events of the BDT used to predict the per-event probability that the vertex selected is within 10 mm of the true vertex. The BDT is trained to predict when the wrong vertex is chosen, so a BDT output value of -1 corresponds to a high probability of correct vertex. |
|
pdf png |
Validation with γ+jet events of the BDT used to predict the per-event probability that the vertex selected is within 10 mm of the true vertex. (Includes use of photon conversion tracks). The BDT is trained to predict when the wrong vertex is chosen, so a BDT output value of -1 corresponds to a high probability of correct vertex. In this validation the jet is used to determine which is the correct vertex, and the sample is split into events where the correct vertex was chosen (according to the jet), and those where the wrong vertex was chosen. Unfortunately, the jet points to the wrong vertex quite often (the brown line, from MC, shows the vertex probability BDT output value for these cases) so there are many events with a very good probability that the correct vertex was chosen (MVA=-1) even in the sample of events where the jet indicates that the vertex chosen is wrong (curves in red).
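The pile-up reweighting mentioned in the first caption of this section (adjusting the nPU distribution in the Monte Carlo to match the data) is commonly done with per-bin weights given by the ratio of the normalised distributions. A minimal sketch, with invented histogram contents and a hypothetical function name:

```python
from typing import List


def pileup_weights(data_counts: List[float], mc_counts: List[float]) -> List[float]:
    """Per-nPU weights that reshape the MC nPU distribution to match data."""
    data_total = sum(data_counts)
    mc_total = sum(mc_counts)
    weights = []
    for d, m in zip(data_counts, mc_counts):
        # Ratio of normalised distributions; empty MC bins get weight 0.
        weights.append((d / data_total) / (m / mc_total) if m > 0 else 0.0)
    return weights


# Invented histograms indexed by nPU = 0, 1, 2, 3.
data_hist = [10.0, 30.0, 40.0, 20.0]
mc_hist = [250.0, 250.0, 250.0, 250.0]
weights = pileup_weights(data_hist, mc_hist)
reweighted_mc = [w * m for w, m in zip(weights, mc_hist)]
```

Each simulated event is then weighted by the entry for its own nPU, which preserves the total MC yield while reproducing the shape seen in data.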
Photon identification BDT
Image |
File links |
Description |
|
png pdf |
Distribution of photon ID MVA output in data and MC simulation for the leading photon in preselected diphoton events with mgg > 160 GeV, for photons in the ECAL barrel. The hatched area shows the systematic uncertainty assigned to the photon ID BDT output. |
|
png pdf |
Distribution of photon ID MVA output in data and MC simulation for the leading photon in preselected diphoton events with mgg > 160 GeV, for photons in the ECAL endcap. The hatched area shows the systematic uncertainty assigned to the photon ID BDT output.
Energy resolution and scale
Image |
File links |
Description |
|
png pdf |
The energy resolution in different regions of the detector is investigated by smearing the energies of electrons in MC simulated Z→ee and comparing the resulting Z mass distribution with what is found in data. For electrons with R9>0.94 in the central part of the barrel (η<1) and away from module boundaries, the required additional smearing is very small. This is illustrated in the plot. |
|
png pdf |
Systematic uncertainties for the photon energy scale, for separate regions of the detector and for photons with R9>0.94 and R9<0.94.
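The smearing procedure described in the first caption of this section can be illustrated with a toy. The sketch assumes a Gaussian smearing factor applied independently to each electron energy and uses the fact that, for fixed directions, the pair mass scales as sqrt(E1 * E2). The function name, the smearing width and the toy mass value are all illustrative choices, not numbers from the analysis.

```python
import math
import random
import statistics
from typing import List


def smear_masses(masses: List[float], sigma: float, seed: int = 1) -> List[float]:
    """Smear both electron energies of each pair and rescale the pair mass."""
    rng = random.Random(seed)
    smeared = []
    for m in masses:
        f1 = rng.gauss(1.0, sigma)  # relative energy shift of the first electron
        f2 = rng.gauss(1.0, sigma)  # relative energy shift of the second electron
        smeared.append(m * math.sqrt(f1 * f2))
    return smeared


# Toy input: identical masses, so any spread comes from the smearing alone.
toy_masses = [91.19] * 10000
smeared_masses = smear_masses(toy_masses, sigma=0.01)
spread = statistics.pstdev(smeared_masses)
```

In a real comparison the width `sigma` would be tuned per detector region until the simulated Z mass distribution matches the data.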
Diphoton BDT
Image |
File links |
Description |
|
png |
BDT output distribution for Z→ee events in data and MC simulation, for all selected events (left), for events with both electrons in the ECAL barrel (second from left), for events with one electron in the barrel and one in the endcaps (third from left), and for events with both electrons in the endcaps (right). The bottom row of plots shows the data/MC ratio. |
|
png pdf |
Comparison of the diphoton BDT output for data and for MC simulation in the "control region" (mgg > 160 GeV; black) and in the signal region (100 < mgg < 160 GeV; red). The data are shown as points, and the MC simulation is shown as hatched boxes whose size indicates the uncertainty. |
|
png pdf |
The effect of the systematic uncertainty assigned to the photon identification BDT output on the diphoton BDT output, for background MC simulation (100 < mgg < 180 GeV), and for data. The nominal BDT output is shown as a stacked histogram, and the variation due to the uncertainty is shown as a hatched band. This plot, and the one below, show only the systematic uncertainties which are common to both signal and background; there are additional significant uncertainties on the k-factors and background composition which are not shown here. |
|
png pdf |
The effect of the systematic uncertainty assigned to the photon energy resolution BDT output on the diphoton BDT output, for background MC simulation (100 < mgg < 180 GeV), and for data. The nominal BDT output is shown as a stacked histogram, and the variation due to the uncertainty is shown as a hatched band. This plot, and the one above, show only the systematic uncertainties which are common to both signal and background; there are additional significant uncertainties on the k-factors and background composition which are not shown here.
Relating the diphoton BDT output for MC simulated signal, and for data, to the event classes used in HIG-11-033
Image |
File links |
Description |
|
png |
Diphoton BDT output distribution for events in a simulated Higgs boson signal (mH = 120 GeV) where pT(γγ)>40 GeV. The events are shown classified according to the eta/R9 event classes used in the earlier (HIG-11-033) analysis. The event class boundaries based on the diphoton BDT output are shown as vertical blue lines, and events in the blue shaded region are not used in the statistical treatment, limit setting etc. Note that not all of the events shown in the plot would have entered the HIG-11-033 event classes since the event selection used for that analysis was tighter than the preselection used for the multivariate analysis. |
|
png |
Diphoton BDT output distribution for events in a simulated Higgs boson signal (mH = 120 GeV) where pT(γγ)<40 GeV. The events are shown classified according to the eta/R9 event classes used in the earlier (HIG-11-033) analysis. The event class boundaries based on the diphoton BDT output are shown as vertical blue lines, and events in the blue shaded region are not used in the statistical treatment, limit setting etc. Note that not all of the events shown in the plot would have entered the HIG-11-033 event classes since the event selection used for that analysis was tighter than the preselection used for the multivariate analysis. |
|
png |
Diphoton BDT output distribution for events in data (100 < mgg < 180 GeV). Both photons in the events have additionally been required to pass the HIG-11-033 selection requirements ("CiC supertight"). The events are shown classified according to the eta/R9 event classes used in the earlier (HIG-11-033) analysis. The event class boundaries based on the diphoton BDT output are shown as vertical blue lines, and events in the blue shaded region are not used in the statistical treatment, limit setting etc. |
Signal model fits to a simulated signal for the five event classes
Image |
File links |
Description |
|
pdf png |
Event class 0: best event class defined by diphoton BDT output |
|
pdf png |
Event class 1: 2nd best event class defined by diphoton BDT output |
|
pdf png |
Event class 2: 3rd best event class defined by diphoton BDT output |
|
pdf png |
Event class 3: 4th best event class defined by diphoton BDT output |
|
pdf png |
Event class 4: Dijet-tagged event class |
|
pdf png |
The five event classes combined. |
p-values
Image |
File links |
Description |
|
pdf png |
Local p-values for event classes shown individually. |
|
pdf png |
Local p-values. Also shown is the median expected p-value for a Higgs boson with SM cross section at each mass hypothesis (blue dashed line), and the full local p-value shape of the median expected p-value for a Higgs boson with SM cross section at 125 GeV (red dashed line). |
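A p-value and a significance in standard deviations are two expressions of the same tail probability. The sketch below converts between them, assuming the usual one-sided Gaussian convention; the function names are hypothetical. It is applied to the global significance of 1.6 standard deviations quoted in the abstract.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def p_value_to_significance(p_value: float) -> float:
    """One-sided Gaussian significance Z such that P(x > Z) = p_value."""
    return _STANDARD_NORMAL.inv_cdf(1.0 - p_value)


def significance_to_p_value(z: float) -> float:
    """One-sided tail probability for a significance of z standard deviations."""
    return 1.0 - _STANDARD_NORMAL.cdf(z)


# Global significance quoted in the abstract, expressed as a p-value.
global_p = significance_to_p_value(1.6)
```

Under this convention a significance of 1.6 standard deviations corresponds to a p-value of roughly 5%.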
Sideband background model
Image |
File links |
Description |
|
png |
Illustration of the definition of the signal region and sidebands for mass hypothesis mH=120 GeV. |
|
png |
Distribution of the first mass-window BDT input variable: the diphoton BDT output value, for data and MC simulation in the signal window for mH=120 GeV. |
|
png |
Distribution of the first mass-window BDT input variable: the diphoton BDT output value, for data and MC simulation in the signal window for mH=120 GeV. |
|
pdf png |
Distribution of the data, background model and signal model across the optimized bins of the mass-window BDT for mH=124 GeV. |