Analysis of final states with two tagged-b jets and two light jets

On this page I describe the status and progress of my thesis work, which studies events with two b-tagged jets (one b and one anti-b) produced together with two light jets.

This analysis serves two purposes. On one side, it tries to distinguish final states arising from a single-chain process, with hard radiation of two jets in addition to the primary scattering, from those arising from a double-chain process, where the two pairs of jets come from different interactions. On the other side, it aims to measure experimentally the Kt factorization and the effective sigma for the multiparton interaction cross section. For this purpose, an initial study at generator and hadron level has been performed through a RIVET analysis implemented in CMSSW_4_4_0 and CMSSW_5_0_1 (the latter for the CASCADE interface). The main goal of this part is to identify variables whose behaviour differs between single-chain and double-chain processes, to be studied later with collision data.

Introduction and Event Selection

The starting point of the analysis is a hadron-level study performed with RIVET, interfaced with CMSSW_4_4_0. For the generation with CASCADE, I used the software release CMSSW_5_0_1 because the previous releases have no interface for this generator. In this part, I tried to reproduce the results described in the paper by E.Berger et al. (Characteristics and estimates of Double Parton Scattering at the Large Hadron Collider, arXiv: 0911.5348v1), where kinematic distributions are compared between single-chain and double-chain processes. That paper considers only the central region (|eta| < 2.5) for all the selected jets. To reproduce this analysis, hard-scattering events producing a pair of b-quarks in the final state were generated with a threshold of 20 GeV on the pthat of the hard scattering (the transverse momentum exchanged between the two initial interacting partons). Light and bottom jets in the final state were then selected with these requirements:
  • Jets picked up from FastJets collection in RIVET
  • Anti-_Kt_ jet reconstruction algorithm with R=0.5
  • lower threshold of 25 GeV/c in pt for central and forward jets
  • lower threshold of 5 GeV/c in pt for CASTOR jets
  • central region defined as |eta| < 2.5
  • forward region defined as 3 < |eta| < 5
  • CASTOR region defined as -6.6 < eta < -5.2

For b-jets, a further requirement was applied through j.containsBottom() == true, while for light jets the same variable is required to be false. Two generators were initially used: Pythia6 Tune Z2 and CASCADE. For both generators, production of heavy b-quark pairs was set as the primary process, and different samples were studied by progressively switching on initial- and final-state radiation and multiparton interactions, to look at the effect of each new physical process. Note that the CASCADE generator does not implement multiparton interactions. For the analysis with jets in the central region, 1M events were generated, while for the analyses in the forward and CASTOR regions the statistics were increased to 4M events because such events are less frequent. The event selection required:

  • ANALYSIS IN THE CENTRAL REGION: two b-jets in the central region, two light-jets in the central region;
  • ANALYSIS IN THE FORWARD REGION: two b-jets in the central region, two light-jets in the forward region;
  • ANALYSIS IN THE CASTOR REGION: two b-jets in the central region, one light-jet in the forward region, one light-jet in the CASTOR region.

All selected jets of the same type have to be separated by DeltaR = sqrt(DeltaEta^2 + DeltaPhi^2) > 0.4 to remove overlaps.
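The region definitions and the DeltaR requirement above can be sketched in a few lines of Python. This is only an illustration, not the RIVET code used for the analysis: the function names and the representation of a jet as an (eta, phi) tuple are assumptions made here.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """DeltaR = sqrt(DeltaEta^2 + DeltaPhi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def region(eta):
    """Return the detector region of a jet, following the cuts listed above."""
    if abs(eta) < 2.5:
        return "central"
    if 3.0 < abs(eta) < 5.0:
        return "forward"
    if -6.6 < eta < -5.2:
        return "castor"
    return None  # outside all three regions

def well_separated(jets, min_dr=0.4):
    """True if every pair of jets (given as (eta, phi) tuples) has DeltaR > min_dr."""
    return all(
        delta_r(*a, *b) > min_dr
        for i, a in enumerate(jets)
        for b in jets[i + 1:]
    )
```

For example, `well_separated([(0.0, 0.0), (0.3, 0.4)])` compares a single pair with DeltaR of about 0.5, which passes the 0.4 requirement.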

For each sample, some important observables were checked in order to find different shapes and hints that separate processes occurring through a single-chain diagram from those occurring through a double-chain one. The main ones are:

  • the leading jet pt in the event (regardless of the jet type);
  • DeltaPhiBottomJets = Phi(b1) - Phi(b2);
  • DeltaPhiLightJets = Phi(j1) - Phi(j2);
  • DeltaPhiEvent: delta phi between the two planes containing the bottom and the light systems, respectively. A special case arises when the b-jets or the light jets are close to back-to-back (cos(Phi) < -0.9): the plane of that system is then defined by the direction of one of the incoming protons and the direction of one of the two selected jets, and DeltaPhiEvent is evaluated with the unit vector extracted from this plane;
  • DeltaPhiNewWay: delta phi between the vectors obtained by the sum of the two three-momenta of the selected jets of the same type;
  • PtBottomImbalance = |_pt_(b1,b2)|/(|_pt_(b1)|+|_pt_(b2)|) -> magnitude of the vector sum of the two b-jet momenta, normalized to the sum of their magnitudes;
  • PtLightImbalance = |_pt_(j1,j2)|/(|_pt_(j1)|+|_pt_(j2)|) -> magnitude of the vector sum of the two light-jet momenta, normalized to the sum of their magnitudes;
  • SPhi = (1/sqrt(2)) x sqrt(DeltaPhiBottomJets^2+DeltaPhiLightJets^2);
  • SptPrime = (1/sqrt(2)) x sqrt(PtBottomImbalance^2 + PtLightImbalance^2);
  • Momentum fractions of the initial interacting partons (not yet studied).
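As a minimal sketch of how the observables in the list above could be computed, the following Python functions implement the imbalance, SPhi, SptPrime and DeltaPhiNewWay definitions. Several choices here are assumptions and not taken from the analysis code: jets are represented as (pt, phi) tuples, the vector sums are done in the transverse plane, and the DeltaPhi values are folded into [0, pi].

```python
import math

def abs_delta_phi(phi1, phi2):
    """|DeltaPhi| folded into [0, pi] (folding is an assumption of this sketch)."""
    return abs((phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi)

def vector_sum(j1, j2):
    """Transverse-plane vector sum of two jets given as (pt, phi)."""
    px = j1[0] * math.cos(j1[1]) + j2[0] * math.cos(j2[1])
    py = j1[0] * math.sin(j1[1]) + j2[0] * math.sin(j2[1])
    return px, py

def pt_imbalance(j1, j2):
    """|pt(j1) + pt(j2)| / (|pt(j1)| + |pt(j2)|); near 0 for a balanced pair."""
    px, py = vector_sum(j1, j2)
    return math.hypot(px, py) / (j1[0] + j2[0])

def s_phi(dphi_bottom, dphi_light):
    """SPhi = (1/sqrt(2)) * sqrt(DeltaPhiBottomJets^2 + DeltaPhiLightJets^2)."""
    return math.sqrt(0.5 * (dphi_bottom ** 2 + dphi_light ** 2))

def s_pt_prime(imb_bottom, imb_light):
    """SptPrime = (1/sqrt(2)) * sqrt(PtBottomImbalance^2 + PtLightImbalance^2)."""
    return math.sqrt(0.5 * (imb_bottom ** 2 + imb_light ** 2))

def delta_phi_new_way(b1, b2, j1, j2):
    """Delta phi between the summed b-jet vector and the summed light-jet vector."""
    bx, by = vector_sum(b1, b2)
    lx, ly = vector_sum(j1, j2)
    return abs_delta_phi(math.atan2(by, bx), math.atan2(ly, lx))
```

With these definitions, two perfectly back-to-back pairs of equal pt give SptPrime close to 0 and SPhi close to Pi, which is the double-chain topology discussed in the following sections.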

Plots of these variables are shown in the following sections.

Control Plots

Some control plots are shown for the different generated samples. Particular attention is paid to understanding the new effects introduced by the radiation and the multiparton interaction, and some comparisons with the parton level are also reported. In the following plots, from left to right, the pt of the b quarks, the pt of the b jets and the eta of the b jets are shown for the samples without radiation and MPI (black curve), with radiation and without MPI (red curve), and with radiation and MPI (blue curve).

As you can see, at the parton level the introduction of radiation and MPI smears the distribution of the b transverse momentum down to low _pt_, as expected (the peak around 25 GeV for the sample without radiation and MPI is due to the lower cut of 20 GeV/c that I set on _pt_hat). This is reflected at the hadron level in a lower number of selected jets, because of the cut on the b-jet _pt_ at 25 GeV. Then, we can look at the distributions of DeltaPhi for bottom and light jets separately.

We see that for b-jets the introduction of radiation again smears the peak at DeltaPhi ~ 3.14, which corresponds to a back-to-back configuration, because the directions of the original jets are randomized by the emission of new objects. The back-to-back configuration characterizes only the primary hard scattering. For light jets this is not the case: the distribution is quite flat for the emitted jets, but a peak appears around Pi after the introduction of MPI, which adds jets coming from a different, independent interaction. Finally, it is worth noting that the sample without radiation and MPI has no light jets in the final state, because they are not produced by the primary process.

Construction of some discriminating variables

We can now describe the discriminating variables listed previously. Plots for different samples are shown: I generated one sample with radiation on and MPI off and one with radiation off and MPI on, and looked at the shapes for the sum of these two samples. This sum was then compared with a sample with both radiation and MPI switched on.

As you can see, the topology of the events coming from single-chain and double-chain processes is quite different. In general, this is because in a single-chain process the two systems (bottom jets and light jets) are correlated, since the light jets are emitted from the original b-jets, while in a double-chain process the two pairs are almost completely uncorrelated, since they come from different interactions. As already mentioned, the DeltaPhi between jets of the same type is peaked at Pi for events from MPI and is flatter for events due to the emission of radiation. Looking at the DeltaPhi between the two subsystems, we see a flat distribution for MPI events and a slight increase around Pi/2 for events from radiation. This variable is not very clear because of its ambiguous definition for back-to-back jets, so I preferred to look at the new way of evaluating it (DeltaPhi_newway), where the differences between the two types of events can be clearly seen. Differences are also clear in the other variables: the pt imbalance of the two pairs, the variables Sphi and Sptprime, which try to take into account the phi and pt topology, respectively, of both bottom and light jets, and the pt of the leading jet.

An important aspect of this analysis is that selecting the light jets in the forward or in the CASTOR region, instead of the central one, gives the same shapes for these variables. The only difference is the available statistics: events with 2 b-jets in the central region and 2 light jets at higher pseudorapidities have a lower cross section.

The message of this part is that, in principle, one could easily separate events coming from different diagrams by looking at some kinematic variables and cutting on them. Unfortunately, as I will try to show in the next section, this does not seem to be the case.

This is the webpage where you can find all the plots for these analyses:

Discussion of the results

First of all, we can look at the events obtained for each sample for every applied selection. I have also generated a sample with both radiation and MPI on for every kind of selection.
| Applied selection | Total generated events | Selected events for RAD on MPI off | Selected events for RAD off MPI on | Selected events for RAD on MPI on | Cross section for the hard process |
| Central region | 1000000 | 615 | 230 | 1025 | 1.012 microbarn |
| Forward region | 4000000 | 178 | 94 | 323 | 1.012 microbarn |
| CASTOR region | 4000000 | 144 | 16 | 422 | 1.012 microbarn |
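As a rough cross-check of the numbers in the table, the selection efficiency and the corresponding selected cross section can be estimated with the arithmetic below. This is a sketch that assumes unweighted events, so that the selected cross section is simply the hard-process cross section times the fraction of selected events; the function and dictionary names are hypothetical.

```python
SIGMA_HARD_UB = 1.012  # cross section for the hard process, in microbarn (from the table)

# (total generated events, selected events for RAD on MPI on), from the table
SAMPLES = {
    "central": (1_000_000, 1025),
    "forward": (4_000_000, 323),
    "castor": (4_000_000, 422),
}

def selection_efficiency(n_generated, n_selected):
    """Fraction of generated events passing the selection."""
    return n_selected / n_generated

def selected_xsec_nb(n_generated, n_selected, sigma_ub=SIGMA_HARD_UB):
    """Selected cross section in nanobarn, assuming unit event weights."""
    return selection_efficiency(n_generated, n_selected) * sigma_ub * 1000.0

for name, (n_gen, n_sel) in SAMPLES.items():
    print(name, selection_efficiency(n_gen, n_sel), selected_xsec_nb(n_gen, n_sel))
```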

We see a progressive decrease in the number of events when moving the light jets to higher and higher pseudorapidities. I now consider the samples with both MPI and radiation on, which correspond to a realistic set of collisions. The shapes of some discriminating observables (red curve) are compared to the distributions obtained from the sum of the two single samples (radiation on MPI off and radiation off MPI on), plotted as the blue curve.

As can be seen, there are some important differences. The parts of the distributions corresponding to the MPI processes (in particular the peak at 0 for Sptprime and the peak at Pi for Sphi) are completely absent in the mixed sample. This means that the radiation in MPI events completely washes out the primary topology. The main consequences are:

  • the mixed sample, even if it contains MPI, tends to look like the sample with radiation included but MPI off;
  • in the mixed sample there is no hope to separate MPI processes from the ones arising from a single-chain diagram, by cutting on the described variables.

What seemed at the beginning to be a good way to separate two different processes turned out not to be applicable in reality. I tried to look deeper into this problem by applying further selections to the mixed sample: the main goal is to study whether the MPI contribution can be isolated inside it. For this purpose, a very tight selection was applied to the events: I required two b-jets and two light jets inside the central region and NO additional jets above 2.5 GeV in the region |eta| < 6.6 (the measurable pseudorapidity region for CMS). In this way, it should be possible to keep only the double-chain events without radiation and to remove those with even a small emission. Two samples were generated with this selection: one with only initial-state radiation on and the other with only final-state radiation on (MPI are included in both). Results for Sptprime are shown in the following plots, left and right respectively:

The selected statistics are of course very low (11 and 2 events, respectively, out of 9 million), but some observations can be made: it seems that the initial-state radiation completely randomizes the variable Sptprime, while the applied cuts can be efficient when only final-state radiation is on. The problem with initial-state radiation needs to be understood. There are two open questions:

  • the emitted particles don't create a jet (are they of too low pt?);
  • is the initial radiation really the problem?

We were expecting to observe again the peak at 0 on top of the SPS distribution in the "mixed" sample, but this does not seem to be the case. This is quite reasonable, because MPI events are also affected by the parton shower and so they cannot be pure MPI as before. Still, if we compare the samples before and after switching on the parton shower, we see some differences that can be assigned to MPI: the MPI contribution is not washed out, but it is not as well separated as it was without the parton shower. The last check will be the generation of a sample without MPI and an "MPI enriched" sample, to check the difference between them and look at the realistic distributions for MPI. We will use PYTHIA8 because it has a better treatment of MPI.

This is crucial to understand, because we can then modify the event selection in order to isolate the MPI contribution. I tried going to higher centre-of-mass energies, because the radiation from the hard scattering might remove much of the energy available for a secondary interaction and prevent it from producing jets above threshold. The idea of going up in energy is to test whether the absence of any visible effect from MPI events is just due to the high threshold. Samples with radiation and MPI on were generated at 14, 28 and 56 TeV. The plot for the variable Sptprime is shown, but none of them shows evidence of MPI events (no expected peak at 0). The problem does not seem to be due to a lack of energy for the secondary scattering.

A final observation about the Sptprime observable is that the heavy-quark production in Pythia6 (MSUB=81,82) does not include higher-order diagrams, whereas we know that b-jet production is mainly due to gluon-splitting diagrams. I therefore tried to include them by setting a more general MinBias production. With these settings an event is much less likely to pass the selection, but gluon splitting is included and its effect on Sptprime can be studied.

As you can see, the new shape exhibits a peak around 1. This is probably due to pairs of collinear jets emitted from a boosted gluon, which are strongly unbalanced in pt.

Further studies using PYTHIA8

A more realistic study has been performed using Pythia8: the suspicion is that the distributions seen in the previous samples (especially the one without parton shower, used to study the contribution of the MPI) are strongly affected by soft radiation. To study these effects, it is better to switch from Pythia6 to Pythia8, since the latter has a better description of MPI and more reliable tools to treat them. The new strategy does not rely on generating extreme approximations (MPI on with PS off, and vice versa) but on analyzing more realistic samples in which the contributions of the different processes can be studied. The generated samples were:
  • MPI enriched sample: a secondary hard interaction was forced to exchange a transverse momentum higher than 20 GeV and to generate a dijet system
  • MPI ON: the realistic sample where MPI and parton shower are switched on without any changes
  • MPI OFF: the MPI are switched off and all the contributions are left to the parton shower
  • MPI suppressed: the secondary hard process is forced not to exchange a transverse momentum above 20 GeV.

For each sample, the contributions of initial- and final-state radiation were studied separately by switching them on and off. The parameters used for the generation were the following, starting with the hard-scattering settings:

  • 'HardQCD:gg2bbbar = on',
  • 'HardQCD:qqbar2bbbar = on',
  • 'PhaseSpace:pTHatMin = 20',
  • 'SecondHard:generate = on',
  • 'SecondHard:TwoJets = on',
  • 'PhaseSpace:sameForSecond = on',
  • 'PartonLevel:MI = off',
  • 'PartonLevel:MI = on',
  • 'PhaseSpace:sameForSecond = off',
  • 'PhaseSpace:pTHatMaxSecond = 20',
  • 'PartonLevel:FSR = on (off)',
  • 'PartonLevel:ISR = on (off)',
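The list above mixes settings that belong to different samples. One possible way to organize them is sketched below in Python. The grouping of the lines by sample is my reading of the sample descriptions and is an assumption, as are the dictionary keys and the helper name; only the setting strings themselves are taken from the list. The sketch builds lists of strings and does not call Pythia8.

```python
# Settings common to every sample (hard scattering), copied from the list above.
HARD_SCATTERING = [
    "HardQCD:gg2bbbar = on",
    "HardQCD:qqbar2bbbar = on",
    "PhaseSpace:pTHatMin = 20",
]

# Assumed grouping of the remaining lines by sample (not stated explicitly above).
SAMPLE_SETTINGS = {
    "mpi_enriched": [
        "SecondHard:generate = on",
        "SecondHard:TwoJets = on",
        "PhaseSpace:sameForSecond = on",
    ],
    "mpi_off": ["PartonLevel:MI = off"],
    "mpi_on": ["PartonLevel:MI = on"],
    "mpi_suppressed": [
        "PhaseSpace:sameForSecond = off",
        "PhaseSpace:pTHatMaxSecond = 20",
    ],
}

def build_settings(sample, isr=True, fsr=True):
    """Return the list of setting strings for one sample and radiation choice."""
    def flag(on):
        return "on" if on else "off"
    return (
        HARD_SCATTERING
        + SAMPLE_SETTINGS[sample]
        + [f"PartonLevel:FSR = {flag(fsr)}", f"PartonLevel:ISR = {flag(isr)}"]
    )

# Example: MPI-off sample with only initial-state radiation switched on.
for line in build_settings("mpi_off", isr=True, fsr=False):
    print(line)
```

Keeping the common block separate from the per-sample block makes it harder to leave two conflicting values of the same parameter, such as `PartonLevel:MI`, in one configuration.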

These samples could also be generated with HERWIG++, but not with PYTHIA6, which has no control over the secondary hard interaction.

Parton-level study

First of all, a study at the parton level was performed in order to analyze the contribution of the partons coming from separate processes: in PYTHIA8, it is possible to select partons of given flavours coming from the hard scattering, the MPI and the parton shower by requiring certain values of the status (genparticle->status()) and of the identification number (genparticle->pdg_id()). The settings used are listed below:
  • HARD SCATTERING: status 21-29
  • MPI: status 31-39
  • PARTON SHOWER: status 41-59

  • B-quarks: pdg_id 5
  • Light-quarks and gluons: pdg_id 1,2,3,4,21
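The status and pdg_id ranges above translate into a simple classification, sketched here in Python. The function names are hypothetical, and taking the absolute value of the status code and of the pdg_id (so that antiquarks and negative status codes are also matched) is an assumption of this sketch rather than something stated above.

```python
def parton_origin(status):
    """Classify a parton by its status code, using the ranges listed above."""
    s = abs(status)  # assumption: sign of the status code is ignored
    if 21 <= s <= 29:
        return "hard_scattering"
    if 31 <= s <= 39:
        return "mpi"
    if 41 <= s <= 59:
        return "parton_shower"
    return None  # not in any of the listed ranges

def parton_flavour(pdg_id):
    """Classify a parton as bottom or light from its PDG identification number."""
    a = abs(pdg_id)  # assumption: antiquarks are treated like quarks
    if a == 5:
        return "bottom"
    if a in (1, 2, 3, 4, 21):
        return "light"
    return None

# Example: a b-quark from the hard scattering and a gluon from an MPI.
print(parton_origin(23), parton_flavour(5))
print(parton_origin(33), parton_flavour(21))
```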

It is worth underlining that the jets coming from the secondary hard interaction are included in the hard scattering part (like those from the primary one), while the radiation part includes every parton after an emission (even partons coming from the hard scattering, once they have emitted). Here are some plots useful to understand the situation:

The first plot shows the pt spectrum of all the partons (bottom and light) coming from the hard scattering and MPI. It shows two different behaviours: at low pt the MPI contribution is larger but rapidly decreasing, while at high pt the hard scattering plays the major role. The hard-scattering spectrum starts at 20 GeV simply because of the generation settings. The information that can be extracted from the plot is essentially:

  • The MPI suppressed and MPI on samples are very similar: this is reasonable because MPI are in general soft, so an upper limit on them does not affect the pt spectrum much.
  • At around 20 GeV, the MPI spectrum is 2 orders of magnitude below the hard-scattering one.
  • All samples are quite similar to each other, especially at low pt, except the MPI enriched one, which has a larger contribution at high pt.

Hadron-level study

After the study at the parton level, it's useful to observe the distributions for the hadron level and the behaviour of the discriminating variables:

It is immediately clear that the distributions obtained for the MPI contribution differ strongly from those obtained when the parton shower was off: the radiation is very important even for events arising from MPI, and it randomizes and broadens the original back-to-back configuration. Furthermore, the MPI suppressed and MPI on samples are again very similar to each other, and it is evident that the initial-state radiation is more relevant than the final-state one.

In any case, the previous plots still suggest some regions of phase space where MPI could play a major role and could be better isolated and studied:

  • Low values for Spt' (~0.2)
  • Low values for Spt for bottom jets (~0.2)
  • Low values for Spt for light jets (~0.2)
  • Low values for DeltaPhi of the combined systems (<2)
  • Low values for DeltaPhi of the b-jets (<2)
  • High values for Sphi (~Pi)

Finally, it is possible to try to reproduce the distributions of the realistic sample (MPI on) by mixing the two samples with and without MPI (MPI enriched and MPI off). The plot shown below is the distribution obtained by adding 1% of the MPI enriched sample (roughly the percentage observed in the parton-level study): even if these two samples are not exactly independent, this can be considered a good approximation, and it indeed gives good agreement with the realistic case.
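The mixing described above amounts to a bin-by-bin weighted sum of two histograms. A minimal Python sketch follows; the function name, the representation of a histogram as a plain list of bin contents, and the example bin values are all hypothetical. Whether the two inputs should be normalized to a common area before mixing is not specified in the text, so the sketch leaves them as given.

```python
def mix_histograms(h_mpi_off, h_mpi_enriched, fraction=0.01):
    """Add `fraction` of the MPI-enriched histogram to the MPI-off one, bin by bin."""
    if len(h_mpi_off) != len(h_mpi_enriched):
        raise ValueError("histograms must have the same binning")
    return [off + fraction * enr for off, enr in zip(h_mpi_off, h_mpi_enriched)]

# Illustrative bin contents only (not taken from the analysis).
h_off = [10.0, 20.0, 40.0, 30.0]
h_enr = [300.0, 100.0, 50.0, 50.0]
print(mix_histograms(h_off, h_enr, fraction=0.01))
```

The same function would cover the CASCADE combination quoted later on this page by passing `fraction=0.02`.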

New applied selections

New selections were applied in order to improve the separation between MPI and single chain processes:

  • the light jets are constrained to lie in the eta range delimited by the b-jets, with the same thresholds as before: this was done to suppress the contribution of the initial-state radiation (which the previous plots have shown to be the most relevant one), which is expected to produce jets outside the eta range of the hard-scattering partons;
  • all the thresholds for the jets were set to 10 GeV;
  • with thresholds at 10 GeV, the two leading b-jets and the two leading light-jets in the event were picked up without the request of exactly 2 bottom jets and 2 light jets in every event.
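The first of the new selections, light jets inside the eta range delimited by the two b-jets, can be sketched as a simple filter. The function name and the use of plain eta values as input are assumptions of this illustration.

```python
def light_jets_between_b_jets(b_jet_etas, light_jet_etas):
    """Keep only the light jets whose eta lies strictly between the two b-jet etas."""
    eta_low, eta_high = min(b_jet_etas), max(b_jet_etas)
    return [eta for eta in light_jet_etas if eta_low < eta < eta_high]

# Example: b-jets at eta = -1.2 and 0.8; only the light jet at 0.3 is kept.
print(light_jets_between_b_jets([-1.2, 0.8], [0.3, 1.9, -2.0]))
```

An event would then pass this selection only if both selected light jets survive the filter.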

The plots for Spt' for the three selections are shown below.

From the plots, it is quite clear that not much can be gained from the new selections. The contribution of the initial-state radiation is still relevant (probably due to PYTHIA parton shower effects), and lowering the thresholds in the exclusive sample does not increase the MPI contribution much: it mainly lets more radiation pass the threshold and veto events. In the inclusive sample there is only an increase of the combinatorial contribution, that is, events where one of the two jets comes from the radiation and the other one from the MPI. This selection could still be useful to improve the statistics in data, since the MPI contribution is not spoiled by the combinatorial events.

| Applied selection | Total generated events | Selected events for MPI enriched | Selected events for MPI on | Selected events for NO MPI |
| Initial selection | 1000000 | 20784 | 987 | 635 |
| Light jets between b-jets | 1000000 | 1590 | 84 | 47 |
| 10 GeV threshold | 1000000 | 25843 | 7955 | 19383 |
| 10 GeV threshold for inclusive sample | 1000000 | 492755 | 88229 | 23449 |

Herwig study

A similar study has been performed using Herwig++ instead of Pythia8, to investigate the dependence of the distributions on the model. The plot for Spt' is shown below and the number of selected events for each sample is given in the following table.

| Applied selection | Total generated events | Selected events for MPI enriched | Selected events for MPI on | Selected events for NO MPI |
| Initial selection | 1000000 | 1281 | 568 | 451 |

The distribution for the MPIENR sample seems to be closer to the other distributions than previously, but a small excess of events in the "signal region" is still present.

Forward-region study

A further study was also done, by considering the forward region (3<|eta|<5) for the light jets. Again, the distribution for Spt' and the number of selected events are reported below.

| Applied selection | Total generated events | Selected events for MPI enriched | Selected events for MPI on | Selected events for NO MPI |
| Initial selection | 1000000 | 885 | 41 | 35 |

The behaviour of this selection is very similar to the standard one, but it suffers from low statistics in these regions. In any case, the MPI events appear quite different from the SPS ones, and this selection can still be promising given a good number of selected events.


Comparison with CASCADE generator

A generation with CASCADE has also been performed. MPI are not implemented in the CASCADE generator, but it is useful to compare with the previous distributions for the case of radiation on. Here Sptprime, Sphi, Deltaphi and the combination of the CASCADE distribution with the PYTHIA MPI enriched one, weighted by a factor of 0.02, are shown. 1M events were generated for this sample.

Here is the table with the number of selected events for each sample. It is clear that, as observed before, only initial-state radiation is important in this type of event.

| Applied selection | Total generated events | Selected events for radiation on | Selected events for ISR on | Selected events for FSR on |
| Initial selection | 1000000 | 232 | 302 | 1 |

No strange or unexpected distributions appear in these plots: they follow the previously obtained shapes quite well. It is also important to remark the presence of events with Sptprime close to 1, as seen in the inclusive MB production with Pythia. The plot for the combination is not very instructive, but it shows how different the combined distribution is from the previous one when the sample generated with CASCADE is used.

-- PaoloGunnellini - 20-Mar-2012

Topic revision: r19 - 2013-03-26 - PaoloGunnellini