Measurement of 4-jet production with 2 b-jets and 2 light jets at the CMS experiment
Abstract
Due to the large parton density in proton-proton collisions at the LHC, the probability of having more than one parton interaction per event is non-negligible. In particular, double parton scattering (DPS), an interaction where each proton has two active partons giving rise to two different subprocesses, is relevant. These additional interactions may reach a hard scale comparable to the primary scattering and become experimentally distinguishable at high energies. In an interaction with DPS, pairs of jets are expected to exhibit specific angular and momentum distributions that reflect the uncorrelated nature of the pairs. We present a measurement, performed with the CMS experiment at the LHC with 2010 data corresponding to 36 pb^(-1), of observables to discriminate single and double parton scattering.
Introduction
A proton-proton collision at the LHC can be interpreted as the combination of a hard scattering, in which two partons, one from each proton, interact and eventually produce the particles observed in the detector, and an underlying event, which comprises everything else that occurs during the collision. The hard scattering can be described by perturbative QCD, taking the parton distribution functions and the matrix element as input, while the underlying event requires models of soft physics, since the scale of the process does not allow a perturbative description. The underlying event comprises initial and final state radiation (even though this component is sometimes difficult to separate from the hard scattering), beam remnants and multiple parton interactions (MPI). This analysis focuses on the last component and studies its contribution in collisions at the LHC. Due to the composite nature of each colliding hadron, multiple parton scattering is possible, where two or more distinct hard parton interactions occur simultaneously in a single collision. With increasing collision energy, the contribution of these additional interactions becomes more and more relevant, because partons with successively lower longitudinal momentum fraction x become active and can be probed. In particular, if these interactions reach relatively large values of the exchanged transverse momentum, the observation of MPI mostly relies on double parton scattering (DPS).
Theoretical investigations of DPS have a long history, with a large number of studies evaluating the DPS contribution to high energy processes. Some DPS processes have already been observed at previous accelerators: final states involving 4 jets have been studied by the AFS collaboration at the CERN ISR, and the CDF and D0 collaborations at the Fermilab Tevatron have measured the gamma+3 jets channel and the four-jet channel. At the LHC, these processes involve different scales and initial state partons, hence providing complementary information on DPS. The CMS and ATLAS collaborations have already reported an observation of DPS in the W+dijet channel. DPS is important because it can result, for example, in a larger rate for multi-jet production than normally predicted and thus produce relevant backgrounds in searches for signals of new phenomena. It is therefore important to know how large the DPS contribution may be and how it depends on the relevant kinematic variables. Specifically, the differences between final states produced in single chain processes and in double parton scatterings need to be studied in order to separate them and gain detailed experimental information on DPS. In addition to their role in general LHC phenomenology, DPS measurements will have an impact on the development of partonic models of hadrons, since they give information about the size in impact parameter space of the partonic hard core of the incident hadrons.
The applied selection
The discriminating observables
In order to separate the contributions of single parton scattering (SPS) and DPS, several discriminating variables were used. They were chosen based on the study described in arXiv:0911.5348v1 and are defined so as to exploit the different behaviour of the two processes. A final state arising from a single chain tends to have a strongly correlated configuration in the azimuthal angle and in the p_T balance between the two jet systems, while in a DPS event the two systems are uncorrelated and each of them prefers a back-to-back topology.
Thus, interesting observables to separate the two processes are:
- S_{p_T}^{bottom} = |p_T^(bottom1) + p_T^(bottom2)| / (|p_T^(bottom1)| + |p_T^(bottom2)|)
- S_{p_T}^{light} = |p_T^(light1) + p_T^(light2)| / (|p_T^(light1)| + |p_T^(light2)|)
where p_T^(bottom1) and p_T^(bottom2) are the transverse momentum vectors of the two bottom-jets and p_T^(light1) and p_T^(light2) are those of the two light-jets. S_{p_T}^{bottom} and S_{p_T}^{light} are the normalized p_T balances of the bottom- and light-jet pairs, respectively. A back-to-back topology for the two separate jet systems contributes at low values of S_{p_T}^{bottom} and S_{p_T}^{light}, while correlated pairs of jets lead to a broader distribution over the whole phase space. The information from the two sets of jets can be combined by building a new variable:
- S_{p_T}' = sqrt((S_{p_T}^{bottom})^2 + (S_{p_T}^{light})^2)
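The p_T balance variables can be illustrated with a minimal Python sketch. It assumes each jet is given by its transverse momentum magnitude and azimuthal angle, and it takes the magnitude of the vector sum in the numerator so that the observable is a scalar; the function names are hypothetical and not taken from the analysis code.

```python
import math


def pt_balance(pt1, phi1, pt2, phi2):
    """Normalized pT balance of a jet pair: |vec1 + vec2| / (|vec1| + |vec2|).

    pt1, pt2 are transverse momentum magnitudes (GeV), phi1, phi2 azimuthal angles (rad).
    """
    # Build the transverse components of the vector sum of the two jets.
    px = pt1 * math.cos(phi1) + pt2 * math.cos(phi2)
    py = pt1 * math.sin(phi1) + pt2 * math.sin(phi2)
    return math.hypot(px, py) / (pt1 + pt2)


def combined_pt_balance(s_bottom, s_light):
    """Combine the two pair balances in quadrature, as in the S_{p_T}' definition."""
    return math.sqrt(s_bottom ** 2 + s_light ** 2)
```

With this sketch, a perfectly balanced back-to-back pair gives a value close to 0 and a collinear pair gives 1, which is the behaviour the text associates with DPS-like and correlated configurations, respectively.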
S_{p_T}' is expected to behave like the single variables S_{p_T}^{bottom} and S_{p_T}^{light} for correlated and uncorrelated systems, and it gives a better overview of the process because it takes into account information from the whole final state. The different jet configurations in the final state also populate different regions of the angular variables, so it is important to study the phase space of variables defined from the azimuthal angles of the separate and combined jet pairs. In particular, the variables
- Delta_phi_{bottom}=phi_{bottom-jet 1}-phi_{bottom-jet 2}
- Delta_phi_{light}=phi_{light-jet 1}-phi_{light-jet 2}
- S_{phi} = sqrt((Delta_phi_{bottom})^2 + (Delta_phi_{light})^2)
have good distinguishing power. Delta_phi_{bottom} and Delta_phi_{light} measure the azimuthal separation between the two jets of each pair, and S_{phi} combines the two. SPS events lead to a broad distribution for these quantities, while DPS events contribute mostly at values close to pi, where the two jets of each pair are back to back, reflecting that the two pairs are produced independently. More recently, the configuration of the jet pairs in pseudorapidity has also been considered, since it may help the signal discrimination. The pseudorapidity difference between the two jets of each pair has been studied and the following observables have been defined:
- Delta_eta_{bottom}=eta_{bottom-jet 1}-eta_{bottom-jet 2}
- Delta_eta_{light}=eta_{light-jet 1}-eta_{light-jet 2}
Differences between the two processes are mainly expected for Delta_eta_{bottom}: an SPS process presents a longer tail towards high values due to the randomization introduced by the emitted radiation, while DPS should be more relevant at low values. The same behaviour may also appear for Delta_eta_{light} and is worth checking.
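The angular observables above can be sketched in Python as follows. Two choices here are assumptions rather than statements of the analysis: the azimuthal difference is folded into the range [0, pi], and the pseudorapidity difference is taken in absolute value. The function names are hypothetical.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation of two jets, folded into [0, pi] (assumed convention)."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def s_phi(dphi_bottom, dphi_light):
    """Quadrature combination of the two pair separations, as in the S_{phi} definition."""
    return math.sqrt(dphi_bottom ** 2 + dphi_light ** 2)


def delta_eta(eta1, eta2):
    """Pseudorapidity separation of two jets (absolute value, assumed convention)."""
    return abs(eta1 - eta2)
```

Folding the azimuthal difference avoids treating two jets at phi = 3.0 and phi = -3.0 as almost back to back when they are in fact nearly collinear.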
Study at the generator level
A study at the generator level for these observables is shown in
https://twiki.cern.ch/twiki/bin/view/Sandbox/AnalysisBBbarJetEventsPlusTwoLightJets
The experimental measurement
Three data samples from the 2010 data taking were used for the measurement. They are listed in the table below, along with the run ranges and the integrated luminosity for each of them. They correspond to different pile-up conditions, with a mean number of pile-up interactions ranging from 1.5 to 2.8. The treatment of the pile-up is described in section 5. Data from the early stages of the LHC give reasonable statistics for the applied selection and were chosen in order to limit the pile-up contribution to the jet spectrum.
| Data Sample | Run range | Trigger | Integrated luminosity (pb^(-1)) |
| JETMET | 141950-144114 | HLT_Jet15U | 0. |
| JETMET | 141950-144114 | HLT_Jet30U | 0.192895 |
| JETMET | 141950-144114 | HLT_Jet50U | 2.896 |
| JETMETTAU | 135821-141887 | HLT_Jet15U | 0. |
| JETMETTAU | 135821-141887 | HLT_Jet30U | 0.117223 |
| JETMETTAU | 135821-141887 | HLT_Jet50U | 0.278789 |
| JET | 146240-149711 | HLT_Jet15U | 0. |
| JET | 146240-149711 | HLT_Jet30U | 0.026783 |
| JET | 146240-149711 | HLT_Jet50U | 0.239874 |
Different MC generators were used to compare predictions at the detector level and to correct the data to the generator level. Three samples are compared: two from a central CMS production, generated with Pythia6 tune Z2 and Herwig++ respectively, and one from a private production, generated with Pythia8 tune 4C. The first two were generated with a flat distribution in p_T_hat of the outgoing interacting partons between 15 and 3000 GeV, while the third one was generated in p_T_hat slices with a generator-level cut requiring at least 4 jets in the central region (|eta|<2.5) with p_T>15 GeV, in order to increase the statistics for the applied selection. The first two include pile-up events, while the third one is generated without them. The detector response is modelled with a full simulation based on Geant4.
The details of the MC samples are given in the table below.
Data and MC have been analyzed inside the CMSSW framework with the release 4_2_4_patch2, with the recommended global tags (respectively GR_R_42_V19::All and START42_V19B::All).
Trigger selection
The CMS trigger system is designed to keep event rates consistent with the available bandwidth. It consists of two parts, the Level-1 Trigger (L1) and the High-Level Trigger (HLT): the former is mainly hardware based, whereas the latter is software based. The trigger paths used in this analysis are single jet triggers: L1SingleJet and HLTJet, which together form the HLTJet trigger path. Note that the jets used in the trigger paths are corrected AK5 calorimeter jets.
Different single jet trigger paths were used for the data samples. In particular, the phase space was divided into three regions as a function of the p_T of the leading jet selected in the central region, |eta|<2.5, and in each of these regions only one trigger was used for each data sample. The regions are defined by:
- 25 < leading jet p_T < 50 GeV
- 50 < leading jet p_T < 140 GeV
- leading jet p_T > 140 GeV
A schematic illustration of the separation of the phase space into regions can be found in the picture.
For the first region, the HLT_Jet15U trigger was used, for the second one the HLT_Jet30U trigger, and for the third one the HLT_Jet50U trigger. This strategy avoids double counting of events that fired two different triggers and gives significant statistics for the applied selection.
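The assignment of one trigger per leading-jet p_T region can be sketched in Python as follows. The function names are hypothetical, and the treatment of events with a leading jet exactly at 50 or 140 GeV is an assumption, since the text only quotes strict inequalities.

```python
def trigger_for_leading_jet(leading_pt):
    """Return the single trigger path assigned to the leading-jet pT region (GeV).

    Assumption: a leading jet exactly at 50 or 140 GeV is assigned to the lower region.
    """
    if leading_pt > 140.0:
        return "HLT_Jet50U"
    if leading_pt > 50.0:
        return "HLT_Jet30U"
    if leading_pt > 25.0:
        return "HLT_Jet15U"
    return None  # below the lowest region: the event is not used


def accept_event(leading_pt, fired_triggers):
    """Keep the event only if the trigger assigned to its region has fired."""
    path = trigger_for_leading_jet(leading_pt)
    return path is not None and path in fired_triggers
```

Because each event is tested against exactly one path, an event that fired several triggers is counted at most once.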
Trigger efficiency
The behaviour of the triggers as a function of the leading jet p_T was studied in order to correct for possible inefficiencies in the different regions of the phase space. To this end, turn-on curves were produced for each HLT trigger path. The trigger efficiency was studied in two different ways and the results were compared as a cross-check.
The first method emulates the higher-threshold trigger starting from a lower-threshold one, and the trigger efficiency for HLT_JetY is defined as:
- efficiency(HLT_JetY) = N(HLT_JetX fired and p_T(HLTObject) > Y) / N(HLT_JetX fired)
Here the denominator is the number of events in which the trigger path HLT_JetX has fired. X is chosen as the threshold immediately below Y in the p_T-ordered trigger list, so that the higher trigger condition can be emulated from the lower trigger path. The numerator is the number of events in which HLT_JetX has fired and the p_T of the HLTObject corresponding to the trigger path HLT_JetX is greater than Y. This efficiency is plotted against the corrected p_T of the inclusive leading RecoJet. For example, to obtain the turn-on curve for HLT_Jet30U, the HLT path with the next lower threshold, HLT_Jet15U, is chosen; the p_T cut on the L1Object corresponding to the trigger path HLT_Jet15U is 15 GeV. The figure below displays the trigger turn-on curves for the HLT_Jet30U and HLT_Jet50U trigger paths in the inclusive central region |eta|<2.5, as a function of the leading jet p_T. The dependence of the trigger efficiency on the pseudorapidity was also investigated and found to be negligible: in the figure below, the trigger efficiency is shown as a function of the pseudorapidity of the leading jet and appears to be flat.
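The emulation method can be sketched in Python as a binned ratio of event counts. The event representation used here, a tuple of the corrected leading jet p_T, a flag for the reference path HLT_JetX and the p_T of the associated HLT object, is a hypothetical simplification and not the CMSSW data format.

```python
def emulated_efficiency(events, threshold, bin_edges):
    """Binned efficiency of HLT_JetY emulated from a lower-threshold path HLT_JetX.

    events: iterable of (leading_reco_pt, reference_fired, hlt_object_pt) tuples.
    threshold: the value Y (GeV) applied to the HLT object pT.
    bin_edges: increasing edges of the leading-jet pT bins (GeV).
    Returns one efficiency per bin, or None for bins with no reference events.
    """
    n_bins = len(bin_edges) - 1
    passed = [0] * n_bins
    total = [0] * n_bins
    for reco_pt, reference_fired, hlt_pt in events:
        if not reference_fired:
            continue  # the denominator only counts events where HLT_JetX fired
        for i in range(n_bins):
            if bin_edges[i] <= reco_pt < bin_edges[i + 1]:
                total[i] += 1
                if hlt_pt > threshold:
                    passed[i] += 1  # numerator: the emulated HLT_JetY condition holds
                break
    return [p / t if t else None for p, t in zip(passed, total)]
```

The list returned by this function corresponds to the points of a turn-on curve, one per bin of leading jet p_T.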

At the thresholds used for each trigger, there is no need to correct for the L1 trigger efficiencies in the second and third regions, while the first region has been corrected by using the fit shown in the figure below for HLT_Jet30U, since it is not 100% efficient in that p_T range, as seen in the table below. The best fit to the curve was found to be an 8th-degree polynomial function.
Pile-up reweighting
In order to study the pile-up contribution for each data sample, a pile-up reweighting has been applied to the MC samples. A pile-up event is defined as an additional interaction inside the same bunch crossing. Pile-up has been implemented in the MC samples by adding to the hard scattering several additional interactions, recorded from data as Minimum Bias events. By reweighting the pile-up, it is possible to study its contribution to the measured observables.
The reweighting procedure is iterative: the absolute reconstructed vertex distribution is extracted from data and from the MC samples, and the ratio between the two is computed. The MC samples are then reweighted according to this ratio as a function of the number of pile-up interactions in each event. The reconstructed vertex distribution obtained after the reweighting is then considered, a new ratio with the vertex distribution in the data is evaluated and the MC sample is reweighted again as before. Four iterations were found to be sufficient and, as shown in the figure for the JETMET data sample, good agreement has been found for Pythia and Herwig++.
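The iterative procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the vertex distributions are normalized to unit area before taking the ratio, and the weight for a given number of pile-up interactions is updated with the average data/MC ratio seen by the MC events with that pile-up multiplicity. The function and variable names are hypothetical.

```python
from collections import defaultdict


def normalized(hist):
    """Scale a {bin: content} histogram to unit area."""
    total = sum(hist.values())
    return {k: v / total for k, v in hist.items()}


def iterate_pileup_weights(mc_events, data_vertex_counts, n_iterations=4):
    """Iteratively derive per-pile-up weights that match the MC vertex multiplicity to data.

    mc_events: list of (n_pileup, n_reco_vertices) pairs, one per MC event.
    data_vertex_counts: {n_reco_vertices: number of data events}.
    Returns {n_pileup: weight}.
    """
    data = normalized(data_vertex_counts)
    weights = defaultdict(lambda: 1.0)
    for _ in range(n_iterations):
        # Vertex distribution in MC with the current weights applied.
        mc_hist = defaultdict(float)
        for n_pu, n_vtx in mc_events:
            mc_hist[n_vtx] += weights[n_pu]
        mc = normalized(mc_hist)
        ratio = {k: (data.get(k, 0.0) / mc[k] if mc[k] > 0.0 else 0.0) for k in mc}
        # Assumption: update each pile-up weight with the mean ratio of its events.
        ratio_sum = defaultdict(float)
        n_events = defaultdict(int)
        for n_pu, n_vtx in mc_events:
            ratio_sum[n_pu] += ratio[n_vtx]
            n_events[n_pu] += 1
        for n_pu in n_events:
            weights[n_pu] *= ratio_sum[n_pu] / n_events[n_pu]
    return dict(weights)
```

When the number of reconstructed vertices tracks the number of pile-up interactions one to one, the weights converge after the first iteration; the further iterations matter when vertex reconstruction smears that correspondence.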
B tagging
Jet Selection
In both data and MC, at least one good-quality reconstructed vertex was required, i.e. a vertex with a number of degrees of freedom greater than 4 and a longitudinal distance to the beam spot smaller than 10 cm.
The same jet selection described in section 2 for the phenomenological study was applied at the detector level. In particular, exactly 4 jets were required in the central region, |eta|<2.5. Two of them were required to have a corrected transverse momentum above 50 GeV and the other two a corrected transverse momentum above 20 GeV. A tight selection was applied to the selected jets in order to suppress non-physical jets, i.e. jets resulting from noise in the electromagnetic and/or hadronic calorimeters. Each jet should contain at least two particles, one of which is a charged hadron, and the jet energy fraction carried by neutral hadrons, photons, muons and electrons should be less than 90%. These tight criteria have an efficiency greater than 99% for physical jets. For the jet p_T correction, the L1 (Offset) + L2 (Relative) + L3 (Absolute) + Residual jet energy corrections were applied to the data. For the Monte Carlo, the L1 (Offset) + L2 (Relative) + L3 (Absolute) corrections were applied.
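The jet and event selection can be sketched in Python as follows. The dictionary keys and function names are hypothetical, not CMSSW accessors. Two interpretations are assumptions: the 90% limit is applied to each energy fraction separately, and the event is required to have exactly four tight central jets above 20 GeV of which at least two are above 50 GeV.

```python
def make_jet(pt, eta, n_constituents=10, n_charged_hadrons=4,
             nhf=0.1, phf=0.2, muf=0.0, elf=0.0):
    """Hypothetical jet record: corrected pT (GeV), eta, multiplicities, energy fractions."""
    return {"pt": pt, "eta": eta, "n_constituents": n_constituents,
            "n_charged_hadrons": n_charged_hadrons,
            "nhf": nhf, "phf": phf, "muf": muf, "elf": elf}


def passes_tight_id(jet):
    """Tight jet ID: >= 2 particles, >= 1 charged hadron, each listed fraction < 90%."""
    fractions = (jet["nhf"], jet["phf"], jet["muf"], jet["elf"])
    return (jet["n_constituents"] >= 2
            and jet["n_charged_hadrons"] >= 1
            and all(f < 0.90 for f in fractions))


def select_event(jets):
    """Exclusive 4-jet selection in |eta| < 2.5 with two jets above 50 GeV."""
    central = [j for j in jets
               if abs(j["eta"]) < 2.5 and j["pt"] > 20.0 and passes_tight_id(j)]
    if len(central) != 4:
        return False  # exclusive requirement: exactly four selected jets
    n_hard = sum(1 for j in central if j["pt"] > 50.0)
    return n_hard >= 2
```

A usage example: `select_event([make_jet(80, 0.1), make_jet(60, -1.0), make_jet(30, 2.0), make_jet(25, -2.2)])` accepts the event, while adding a fifth central jet above 20 GeV rejects it.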
Results of the selection
The total number of events and the number of events remaining after each successive cut are given in the tables below for each data and MC sample.
Results at the detector level
The observables defined in section 3 have been measured for data and MC and are shown in this section. The plots refer to the JETMET sample, and the MC samples have been reweighted with the corresponding reconstructed vertex distribution. The trigger efficiency correction is applied as described before.
First of all, it is worth looking at the distributions of the jet multiplicity in order to check the agreement between data and MC. In the figures below, three different p_T thresholds have been applied to the jets in the central region: in (a), all the jets with p_T greater than 20 GeV have been selected; in (b), the number of hard jets (p_T > 50 GeV) is shown; in (c), only the soft jets (20 < p_T < 50 GeV) are taken into account.
The following figures show the absolute cross sections and the shapes for the same observables. Both sets of figures show data at the detector level, i.e. uncorrected data.
There is overall agreement in all the distributions, both in the absolute cross sections and in the shapes. For the cross sections, in particular, both MC samples predict a slightly larger (~10%) total number of events, which shifts the distributions slightly upwards. The comparison between data from the other samples and MC leads to the same conclusions.
Systematic uncertainties
Uncertainties due to systematic effects were also evaluated. Analyses using jets have to consider in particular the impact of the jet energy scale and of the jet energy resolution. The uncertainty on the luminosity and the model dependence also have to be studied. All the effects that can affect the measurements are described in the following sections and are evaluated both for the absolute cross section measurements and for the shape distributions.
Jet energy scale uncertainty
The applied jet energy correction carries an uncertainty whose effect has to be evaluated when dealing with jets. Since this analysis is based on the selection of a rather large number of jets, this is expected to be the dominant contribution to the total systematic uncertainty. The effect of the jet energy scale has been evaluated by varying the jet transverse momentum up and down by its uncertainty; the observables obtained with these changes are then compared to the nominal distributions by evaluating the ratios between them. The maximum discrepancy between the nominal distributions and the ones obtained with the modified jet energy scale is taken as the jet energy scale uncertainty. The results show a contribution of around 20-25% for the absolute cross section and of less than 5% for the shape distributions.
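The up and down variation can be sketched in Python as follows, assuming the observable has already been histogrammed three times: with nominal, up-shifted and down-shifted jet p_T. The function names are hypothetical, and the per-jet relative uncertainty is an input that would come from the official jet energy scale uncertainties.

```python
def shift_pt(pt, relative_uncertainty, direction):
    """Shift a jet pT by its relative JES uncertainty; direction is +1 (up) or -1 (down)."""
    return pt * (1.0 + direction * relative_uncertainty)


def jes_uncertainty(nominal, shifted_up, shifted_down):
    """Per-bin JES uncertainty: largest relative deviation of the shifted histograms.

    All three arguments are lists of bin contents with the same binning.
    Bins with no nominal content return None.
    """
    result = []
    for n, up, down in zip(nominal, shifted_up, shifted_down):
        if n == 0:
            result.append(None)
        else:
            result.append(max(abs(up / n - 1.0), abs(down / n - 1.0)))
    return result
```

In a full analysis the shifted histograms are obtained by applying `shift_pt` to every jet before the event selection, so that events can migrate in and out of the selected sample.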
Model dependence uncertainty
Different MC generators were used to compute the uncertainty due to the physics models. Pythia and Herwig use different models for MPI and hadronization, and it is important to study the effect of these differences on the measured observables. Half of the difference between the Pythia and Herwig samples at the detector level is taken as the model dependence uncertainty. The results show a contribution of around 5-10% for the absolute cross section and of around 5% for the shape distributions.
Jet energy resolution uncertainty
One of the most important detector effects on a jet measurement is the energy resolution. The measured jet energy does not correspond exactly to the true value of the physical quantity, but follows a Gaussian distribution around it. The width of this distribution is called the resolution: the wider the distribution, the less accurate the measurement. While the angular resolutions in eta and phi were found to have a negligible effect on the described measurement, the resolution in the transverse momentum (equivalent to the one in the energy) is more relevant and its effect needs to be taken into account. To this end, the p_T of each jet was smeared around its true value by matching every jet at the detector level to the closest one at the generator level. The match is performed within a cone of aperture Delta_R = sqrt((Delta_eta)^2 + (Delta_phi)^2) = 0.3. The smearing procedure is summarized by the formula:
- p_{T}^{smeared}=p_T^{true} +/- a x (p_T^{true}-p_T^{det level})
where a denotes the official detector resolution parameters for the 2010 data, and p_T^{true} and p_T^{det level} are the transverse momenta of the two matched jets at the generator and detector level, respectively. The uncertainty due to this effect was computed by taking the ratios of the samples obtained with the plus and the minus sign to the nominal sample. The values obtained for each bin are taken as uncertainties. The results show a contribution of less than 5% for both the absolute cross section and the shape distributions.
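The matching and smearing steps can be sketched in Python as follows. Jets are represented as hypothetical (p_T, eta, phi) tuples, the function names are illustrative, and the value of the parameter a must be supplied from the official resolution parameters, which are not reproduced here.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi folded into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)


def match_generator_jet(det_jet, gen_jets, max_delta_r=0.3):
    """Return the closest generator jet within max_delta_r, or None if there is none."""
    best, best_dr = None, max_delta_r
    for gen_jet in gen_jets:
        dr = delta_r(det_jet[1], det_jet[2], gen_jet[1], gen_jet[2])
        if dr < best_dr:
            best, best_dr = gen_jet, dr
    return best


def smeared_pt(pt_true, pt_det, a, sign):
    """Smearing formula from the text: pT_true +/- a * (pT_true - pT_det); sign is +1 or -1."""
    return pt_true + sign * a * (pt_true - pt_det)
```

Detector-level jets without a generator-level partner inside the cone return `None` from the matching step; how such jets are treated is not specified in the text.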
Total uncertainty
The previous uncertainties are finally combined to obtain the total systematic uncertainty. For 2010 data, the official uncertainty on the luminosity is around 4%; this value was used in this analysis and included in the combination. The single contributions are summed in quadrature, assuming no correlation among the different sources. The final results are shown in the figures in the analysis note (AN).