First Monte-Carlo studies (November-December 2007)

$t\bar{t}$ final states

The final states of the $t\bar{t}$ pair produced in an event can be grouped by the decay channels of the two W bosons: light leptons (e, $\mu$), $\tau$ leptons and quarks (jets). The $\tau$ needs special care, because leptonic $\tau$ decays yield final states indistinguishable from those of W decays into light leptons. The Particle Data Group gives the following branching ratios:
  • $B(W\rightarrow l\nu_{l})= 0.213 \pm 0.002$
  • $B(W\rightarrow \tau\nu_{\tau})= 0.113 \pm 0.002$
    • $B(\tau\rightarrow l\nu_{l}X)= 0.373 \pm 0.003 $
    • $B(\tau\rightarrow hadrons)=0.627 \pm 0.003 $
  • $B(W\rightarrow q\bar{q}')=0.674 \pm 0.003$
We can easily build the branching ratios for the final states of the $t\bar{t}$ system using the fact that:

$\{ B(W\rightarrow l\nu_{l})+B(W\rightarrow \tau\nu_{\tau})[B(\tau\rightarrow l\nu_{l}X)+B(\tau\rightarrow hadrons)]+B(W\rightarrow q\bar{q}') \} ^{2}=1$

This yields the following final states:

Expected rates for $t\bar{t}$ final states in the SM
| Final state | Branching ratio | Value | Approx. |
| di-leptonic (DIL) | | | 5/81 |
| $\tau_{had}$ + lepton | | | 3/81 |
| lepton + jets (L+J) | | | 28/81 |
| fully hadronic (HAD) | | | 45/81 |
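
The approximate fractions above can be cross-checked from the quoted branching ratios. The following is a minimal sketch that uses central values only; the function name and the grouping of $\tau_{had}$ + jets and $\tau_{had}\tau_{had}$ events into the fully hadronic class are assumptions made for illustration, not something stated in the original.

```python
# Branching ratios quoted above (central values only, uncertainties ignored).
B_W_LEP = 0.213    # W -> l nu, l = e or mu
B_W_TAU = 0.113    # W -> tau nu
B_W_QQ = 0.674     # W -> q qbar'
B_TAU_LEP = 0.373  # tau -> l nu X
B_TAU_HAD = 0.627  # tau -> hadrons


def ttbar_final_states():
    """Return the branching fraction of each ttbar final state."""
    # Per-W probabilities of the three visible signatures.
    lep = B_W_LEP + B_W_TAU * B_TAU_LEP   # light lepton, direct or via tau
    tau_had = B_W_TAU * B_TAU_HAD         # hadronically decaying tau
    jets = B_W_QQ                         # quark pair
    return {
        "di-leptonic (DIL)": lep ** 2,
        "tau_had + lepton": 2 * lep * tau_had,
        "lepton + jets (L+J)": 2 * lep * jets,
        # Assumption: tau_had + jets and tau_had + tau_had are counted here.
        "fully hadronic (HAD)": (jets + tau_had) ** 2,
    }


if __name__ == "__main__":
    for name, value in ttbar_final_states().items():
        print(f"{name:22s} {value:.4f}  ~ {81 * value:.1f}/81")
```

With this grouping the four fractions add up to one, and each of them rounds to the numerator listed in the table when multiplied by 81.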

Monte-Carlo analysis


For the introductory study of the event kinematics we generate top quarks using PYTHIA 8. We also compare the results obtained from PYTHIA 8 with the samples generated for the release validation of CMSSW_1_6_7 (RelValTTbar). The generated samples are divided into three groups according to the branching ratio of the top decay. The table below summarizes them.

| Sample | $t\to Wb$ | $t\to Wq$ (q=s,d) | $W\to d_{i}u_{j}$ | $W\to l\nu_{l}$ (l=e,$\mu$) | $W\to \tau \nu_{\tau}$ | $N_{evts}$ | Luminosity |
| All b | 100% | 0% | 0% | 100% | 0% | $50\cdot 10^{3}$ | |
| All q | 0% | 100% | 0% | 100% | 0% | $50\cdot 10^{3}$ | |
| Mix | 90% | 10% | 0% | 100% | 0% | $40\cdot 10^{3}$ | |


In the following sections we try to extract the signal of the top decay with di-lepton+jets final states from the Monte-Carlo samples.

Top decay kinematics

Below you can find several plots related to the kinematics of the top decay products as given by the PYTHIA8 generator.

Plots (generator level):
  • Top generation and decay: top pT spectrum; t->Wq pT spectrum
  • W boson (leptonic decay): W->lnu_l pT spectrum; W->lnu_l angle of emission in the CM frame with respect to the W direction in the lab; W->lnu_l pT spectrum after y cuts; W->lnu_l ET spectrum
  • W boson (hadronic decay): W->qqbar spectrum; W->qqbar angle of emission in the CM frame with respect to the W direction in the lab

From the pT distributions of the W decay products we conclude that the particles with $I_{Z}=-1/2$ have a slightly harder spectrum than those with $I_{Z}=+1/2$, because they are preferentially emitted into the hemisphere towards which the W propagates in the lab. The angular distribution of the charged leptons is theoretically given by (see R.K. Ellis et al., QCD and Collider Physics):

$\frac{1}{N}\frac{dN \left( W \to e\nu \right) }{d\cos\theta^{\star}_{e}}= \frac{3}{4\cdot \left(m_{t}^{2}+2M_{W}^{2} \right) } \cdot \left[m_{t}^{2}\sin^{2}\theta^{\star}_{e}+M_{W}^{2} \left( 1-\cos\theta^{\star}_{e} \right)^{2} \right] $
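
As a quick numerical check of this distribution, the sketch below integrates it over $\cos\theta^{\star}_{e}$. It uses $\sin^{2}\theta^{\star}_{e}$ in the first term, which is the form that integrates to one. The mass values and function names are assumptions introduced for illustration and are not quoted in the original.

```python
import math

M_TOP = 175.0  # GeV, assumed top mass (not quoted in the text)
M_W = 80.4     # GeV, assumed W mass (not quoted in the text)


def lepton_angle_pdf(cos_theta, m_top=M_TOP, m_w=M_W):
    """(1/N) dN/dcos(theta*) for the charged lepton in the W rest frame."""
    sin2 = 1.0 - cos_theta ** 2
    norm = 3.0 / (4.0 * (m_top ** 2 + 2.0 * m_w ** 2))
    return norm * (m_top ** 2 * sin2 + m_w ** 2 * (1.0 - cos_theta) ** 2)


def integrate(f, a, b, n=2000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def longitudinal_fraction(m_top=M_TOP, m_w=M_W):
    """Weight of the sin^2 term after integration over cos(theta*)."""
    return m_top ** 2 / (m_top ** 2 + 2.0 * m_w ** 2)


if __name__ == "__main__":
    print("integral over cos(theta*):", integrate(lepton_angle_pdf, -1.0, 1.0))
    print("weight of the sin^2 term:", longitudinal_fraction())
```

The integral does not depend on the masses chosen, since $\int \sin^{2}\theta^{\star}\,d\cos\theta^{\star}=4/3$ and $\int (1-\cos\theta^{\star})^{2}\,d\cos\theta^{\star}=8/3$ over the full range.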

Event selection

To study the event selection we try to bring the Monte-Carlo generator level results closer to what is measured experimentally, by clustering high pT objects that correspond to the groups of particles flowing into the detector. This can be done with the standard clustering algorithms provided by PYTHIA. We choose the CellJet algorithm (known as PYCELL in PYTHIA 6).

The CellJet algorithm divides the $(\eta,\phi)$ space into a grid of equally sized cells and sums the energy-momentum flow in each cell. The cell with the highest pT is chosen as a cluster seed and absorbs all the neighboring cells within a range $(\eta-\eta_{seed})^2+(\phi-\phi_{seed})^2 <R^{2}$. If the cluster's total pT is above a given threshold, the cluster is considered a jet, or CellJet. The algorithm then looks for other high pT cells that can seed further jets. For more details consult the PYTHIA 6 manual or the PYTHIA 8 online documentation.
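
The clustering idea described above can be sketched in a few lines. This is a simplified illustration, not the PYTHIA implementation: the function name, the parameter names and all threshold values are placeholders, the jet axis is simply taken at the centre of the seed cell, and cells of a rejected cluster are dropped.

```python
import math
from collections import defaultdict


def cell_jets(particles, cell_size=0.1, radius=0.7, seed_pt=1.5, jet_pt_min=10.0):
    """Cluster (pt, eta, phi) particles into cone jets on an (eta, phi) grid.

    All thresholds are placeholder values, not the ones used in the study.
    Returns a list of (pt, eta, phi) jets, highest pT first.
    """
    # 1. Sum the scalar pT falling into each grid cell.
    cells = defaultdict(float)
    for pt, eta, phi in particles:
        key = (math.floor(eta / cell_size),
               math.floor((phi % (2 * math.pi)) / cell_size))
        cells[key] += pt

    jets = []
    while cells:
        # 2. The hardest remaining cell seeds the next cluster.
        seed = max(cells, key=cells.get)
        if cells[seed] < seed_pt:
            break
        seed_eta = (seed[0] + 0.5) * cell_size
        seed_phi = (seed[1] + 0.5) * cell_size
        # 3. Collect all cells whose centre lies within the cone around the seed.
        members = []
        for key in cells:
            eta = (key[0] + 0.5) * cell_size
            phi = (key[1] + 0.5) * cell_size
            dphi = math.atan2(math.sin(phi - seed_phi), math.cos(phi - seed_phi))
            if (eta - seed_eta) ** 2 + dphi ** 2 < radius ** 2:
                members.append(key)
        pt_sum = sum(cells[k] for k in members)
        # 4. Keep the cluster as a jet only if it passes the pT threshold.
        if pt_sum > jet_pt_min:
            jets.append((pt_sum, seed_eta, seed_phi))
        # Used cells are removed either way, so the loop always terminates.
        for k in members:
            del cells[k]
    return sorted(jets, reverse=True)
```

For example, `cell_jets([(50.0, 0.02, 1.03), (20.0, 0.06, 1.07), (40.0, 2.03, 4.04)])` groups the first two particles into one jet and leaves the third as a separate jet.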

To match a jet to a primordial parton we look, among the jets within $\Delta R(match) \leq 0.1$ of the parton, for the one that minimizes $\frac{\Delta E}{E_{parton} }$. If no jet is found within $\Delta R(match)$, or if $\frac{\Delta E}{E_{parton} } > 0.5$, we discard the event. Some 'event displays' are shown below.
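
A minimal sketch of this matching step follows. The dictionary layout and function names are hypothetical, and $\Delta E$ is assumed to mean $|E_{jet}-E_{parton}|$, which the text does not spell out.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the (eta, phi) plane, with delta phi wrapped into [-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def match_jet(parton, jets, dr_max=0.1, de_max=0.5):
    """Return the index of the jet matched to the parton, or None.

    parton and jets are dicts with keys "E", "eta", "phi".
    Assumption: Delta E is taken as |E_jet - E_parton|.
    """
    best, best_de = None, None
    for i, jet in enumerate(jets):
        if delta_r(parton["eta"], parton["phi"], jet["eta"], jet["phi"]) > dr_max:
            continue
        de = abs(jet["E"] - parton["E"]) / parton["E"]
        if best_de is None or de < best_de:
            best, best_de = i, de
    if best is None or best_de > de_max:
        return None  # no acceptable match: the caller discards the event
    return best
```

Returning `None` rather than raising leaves the decision to drop the event with the calling code, which mirrors the procedure described above.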

Event displays (with matched leptons and b-jets):
  • Event 5 with b-jets and leptons matched
  • Event 1027 with b-jets and leptons matched

The following plot shows the distribution of the number of jets found in each event after subtracting the jets matched to leptons. There is a non-negligible number of events in which 0 or 1 jets were found. Even when selecting only the events in which both b-quarks are matched to a jet, we count events in which only one jet was found.

Number of clusters in event

Due to the fixed size of the $(\eta,\phi)$ cells used to cluster the high pT objects and to the usage of a minimum pT threshold to initiate the clustering procedure, some errors might occur when matching the generated leptons and partons to the pT clusters (CellJets). Some examples are shown below:

  • if the pT is too low no jet will be found in the event for the generated particle;
  • if the generated particles are not well separated they will be gathered in the same cluster.

Event displays (examples of matching exceptions):
  • Event with a non-matched b parton
  • Event with 2 leptons clustered

To minimize the errors in the identification of the jets corresponding to the generator level particles we:

  • try to match the leptons to one of the jets. If a match is found we subtract the lepton momentum from the jet momentum;
  • try to match the b-quarks to the remaining jets;
  • select events in which both b-quarks were successfully matched.
This procedure introduces a bias in the event selection because events with low pT jets/leptons will be discarded.

In the following plots we compare the generator level distributions (plot on the left) with the jet level distributions. Note that for the leptons we always use the generator level values, because experimentally they will, in principle, be resolved by the detector.

Plots: object separation $\Delta R=(\Delta \eta^{2}+\Delta \phi^{2})^{1/2}$
  • Separation between lepton-lepton and b-jet-b-jet
  • Maximum and minimum separation between all high pT objects in events
Plots: kinematics $H_{T}$, Centrality, $miss(p_{T})$
  • PT sum for the objects in the events (three panels)

Event yields after selection

| MC sample | All b | | All q | | Mix | |
| $N_{evts}$ | 50000 | | 50000 | | 40000 | |
| $N_{evts}$ "reconstructed" | 50000 | 22837 | 50000 | 23252 | 40000 | 17720 |
| Selection | Generator level | Jet level | Generator level | Jet level | Generator level | Jet level |
| Event topology: $H_{T} > 100 GeV/c$, $Centrality > 0.6$ | 49750 | 22043 | 49745 | 23217 | 39796 | 17671 |
| 2 charged leptons: $\mid\eta\mid < 2.4$, $p_{T}>30 GeV/c$, $\Delta R_{min}(neighbor) >0.3 $ | 21306 | 9018 | 21091 | 9432 | 16832 | 7073 |
| 2 jets: $\mid\eta\mid < 2.4$, $p_{T}>30 GeV/c$ | 14623 | 6918 | 14437 | 7554 | 11482 | 5509 |
| MET: MET > 60 GeV | 14019 | 5282 | 13819 | 5771 | 11051 | 4171 |
| Yield = $\frac{ N_{evts} (selected) } {N_{evts} } $ (%) | $28.0 \pm 0.2$ | $10.6\pm 0.2$ | $27.6\pm 0.2$ | $11.5\pm 0.2$ | $27.6\pm 0.3$ | $10.4\pm 0.2$ |

Combining the three samples, we obtain a jet-level yield of $10.84\pm 0.09\%$.
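
The jet-level yields can be recomputed from the event counts in the table. The sketch below assumes a binomial uncertainty, which is not necessarily the convention behind the quoted errors, so only the central values are expected to agree; the variable and function names are illustrative.

```python
import math

# Jet-level event counts after the full selection, from the table above.
SELECTED = {"All b": 5282, "All q": 5771, "Mix": 4171}
GENERATED = {"All b": 50000, "All q": 50000, "Mix": 40000}


def yield_percent(n_selected, n_generated):
    """Selection yield in percent with a binomial uncertainty (assumed convention)."""
    p = n_selected / n_generated
    err = math.sqrt(p * (1.0 - p) / n_generated)
    return 100.0 * p, 100.0 * err


yields = {name: yield_percent(SELECTED[name], GENERATED[name]) for name in SELECTED}
# Unweighted mean of the three yields.
mean_yield = sum(y for y, _ in yields.values()) / len(yields)
# Alternative combination: pool all events before taking the ratio.
pooled_yield, pooled_err = yield_percent(sum(SELECTED.values()), sum(GENERATED.values()))

if __name__ == "__main__":
    for name, (y, e) in yields.items():
        print(f"{name}: {y:.1f} +- {e:.1f} %")
    print(f"mean = {mean_yield:.2f} %, pooled = {pooled_yield:.2f} +- {pooled_err:.2f} %")
```

The unweighted mean of the three yields comes out close to the combined value quoted above, while pooling the events gives a slightly higher number because the samples differ in size.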

The error presented in this yield is statistical (from counting). The measurement of this yield is however affected by several systematic effects, namely:

  • the jet energy scale;
  • the b-tagging efficiency.
We look at these effects separately.

Influence of jet energy measurements (Jet Energy Scale)

The reconstructed energy of a jet depends on several factors, namely the jet algorithm used, the resolution and granularity of the calorimeter, etc. At the generator level we can estimate how these errors propagate to the final yields. In our case the jet algorithm, which clusters the particles collected in cells within a radius $\Delta R \leq \Delta R_{in}$, introduces an error in the measurement of the pT and $\eta$. This is mainly due to the granularity of the $(\eta,\phi)$ grid used to count particles. Below we plot the error distributions for $\Delta R_{in}=0.5$, using a grid $(\Delta\eta,\Delta\phi)=(0.04,0.04)$, for the MC sample in which $t\to Wb$ in 100% of the cases.

Pt and Eta systematic errors from jet cell measurements

From this distribution we conclude that the CellJet algorithm tends to underestimate the pT of the original parton. This can be related to the fact that not all the hadrons are clustered in the same jet: some of them might be left outside the cells used to construct the jet. An improvement to this algorithm would be to iterate the clustering over the CellJets and the remaining SingleCells (not yet clustered). This second step would collect the remaining energy and would also merge jets that are very close to each other.

We turn now to the effect that the calorimeter measurement of the particles might have on the event yield. In a crude approach the calorimeter will measure the pT of the particles with a resolution that can be given by $\frac{\sigma_{E}}{E} = \frac{1}{\sqrt{E} }$ if we only take into account the fluctuations from Poisson statistics (calorimeter with linear response). We can estimate how this fluctuation propagates to the expected event yields by changing the pT of the leptons and the jets by the following amount:

$p_{T} \to p_{T} +\delta p_{T} = p_{T} + R\cdot\sqrt{p_{T}}\cdot Gaus(0,1)$

To discriminate the contribution from positive and negative smearings we try also the following changes:

$p_{T} \to p_{T} + \delta p_{T} = p_{T} \pm R\cdot\sqrt{p_{T}}\cdot |Gaus(0,1)|$
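
Both smearing prescriptions can be written as one small function. This is a minimal sketch: the function name, the `mode` argument and the `rng` parameter are hypothetical, and the result is deliberately not clamped at zero because the text does not say how negative values are handled.

```python
import math
import random


def smear_pt(pt, resolution=0.4, mode=0, rng=random):
    """Smear a transverse momentum with sigma = resolution * sqrt(pt).

    mode =  0: symmetric Gaussian shift
    mode = +1: only upward shifts
    mode = -1: only downward shifts
    """
    g = rng.gauss(0.0, 1.0)
    if mode != 0:
        # Force the sign of the shift, keeping its magnitude.
        g = math.copysign(abs(g), mode)
    # Not clamped: very soft objects can end up with a negative value.
    return pt + resolution * math.sqrt(pt) * g


if __name__ == "__main__":
    rng = random.Random(2007)
    for mode in (-1, 0, +1):
        print(mode, [round(smear_pt(30.0, mode=mode, rng=rng), 2) for _ in range(3)])
```

Passing a seeded `random.Random` instance as `rng` makes a smearing pass reproducible, which helps when comparing the three modes on the same events.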

Applying both these smearings to the "All b" sample with a resolution R=0.4 we get the results summarized in the table below:

| | Generator level | | | Jet level | | |
| $N_{evts}$ | 50000 | | | 22043 | | |
| Smear mode | $-\delta p_{T}$ | 0 | $+\delta p_{T}$ | $-\delta p_{T}$ | 0 | $+\delta p_{T}$ |
| Topology in range | 49601 | 49748 | 49834 | 21965 | 21983 | 22005 |
| Leptons in range | 19702 | 21282 | 22862 | 8350 | 9031 | 9730 |
| Jets in range | 13019 | 14571 | 16160 | 6156 | 6935 | 7717 |
| MET in range | 12452 | 13984 | 15536 | 4673 | 5283 | 5926 |
| Yield(R=0.4) | 0.249 | 0.280 | 0.311 | 0.212 | 0.240 | 0.269 |
| Yield(R=0.4)-Yield(R=0) | -0.031 | 0.000 | +0.031 | -0.028 | 0.000 | +0.029 |

From these yield differences we get an estimate of the systematic error due to the precision of the pT measurement.

Influence of b-tagging efficiency
To estimate how b-tag efficiency will propagate to this measurement we proceed as follows:

  1. Select the events in which the 2 quarks + 2 leptons from the $t\bar{t}$ decay pass all the kinematic cuts.
  2. For each jet:
    • trace back whether any of the clustered particles came from a hadron whose proper lifetime ($\tau$, measured in mm/c) was shorter than $10^{3}$ mm/c (excluding $\pi^{0}$ contributions);
    • compute an 'effective' proper time for a secondary vertex candidate as $\tau^{\star}=\frac{\sum p_{T}^{2}\tau'}{\sum p_{T}^{2} }$, where $\tau'$ is the proper time of the hadron measured in the laboratory frame (Note: the $p_{T}^{2}$ weights are used to enhance the contribution from the hadrons belonging to the '$p_{T}$ core' of the jet).
  3. Draw the decay time (or displacement) of the secondary vertex candidate from an exponential distribution $p(t)=\frac{1}{\tau^{\star}}\cdot e^{-t/\tau^{\star}}$.
  4. Correct the generated value by the rapidity of the jet: $t \to t \sqrt{1-\tanh^{2}(\eta)}$, i.e. $t \to t/\cosh(\eta)$.
  5. If $ 100\mu m < t < 25 mm$ then b-tag the jet with a secondary vertex.
  6. Count how many b-tags are attributed in each event.
  7. Count how many b-tags are attributed to the jets matched to the top decay ($t\to Wq$) to estimate the b-tagging efficiency ($\epsilon_{q}$) in $t\bar{t}$ events.
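
Steps 2 to 5 of this toy b-tag can be sketched as follows. The input layout (a list of `(pt, ctau)` pairs per jet, with decay lengths in mm), the function names and the normalized exponential with mean $\tau^{\star}$ are assumptions made for illustration.

```python
import math
import random


def effective_ctau(hadrons):
    """pT^2-weighted mean of the hadron decay lengths.

    hadrons: list of (pt, ctau) pairs, ctau in mm.
    """
    weight = sum(pt ** 2 for pt, _ in hadrons)
    return sum(pt ** 2 * ctau for pt, ctau in hadrons) / weight


def toy_b_tag(hadrons, jet_eta, rng=random, t_min=0.1, t_max=25.0, ctau_max=1.0e3):
    """Return True if the toy secondary vertex falls in the tagging window (mm)."""
    # Keep only hadrons below the lifetime cut of step 2.
    hadrons = [(pt, ctau) for pt, ctau in hadrons if ctau < ctau_max]
    if not hadrons:
        return False
    ctau_star = effective_ctau(hadrons)
    if ctau_star <= 0.0:
        return False
    # Step 3: draw the displacement from an exponential with mean ctau_star.
    t = rng.expovariate(1.0 / ctau_star)
    # Step 4: sqrt(1 - tanh^2(eta)) equals 1 / cosh(eta).
    t /= math.cosh(jet_eta)
    # Step 5: tag if the displacement lies between 100 um and 25 mm.
    return t_min < t < t_max
```

Counting the `True` results per event and per matched jet then gives the inputs for steps 6 and 7.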

Using the previous method we obtain the following result:

| MC sample | All b | All q | Mix |
| $N_{evts}(selected)$ | 5282 | 5771 | 4171 |
| N b-tags, events counted using all jets found | | | |
| 0 tags | 471 | 5134 | 517 |
| 1 tag | 2018 | 547 | 1815 |
| 2 tags | 2524 | 82 | 1677 |
| $>$ 2 tags | 269 | 8 | 162 |
| N b-tags, events counted using jets from top decay only | | | |
| 0 tags | 521 | 5630 | 570 |
| 1 tag | 2164 | 138 | 1924 |
| 2 tags | 2597 | 3 | 1677 |

If we consider the All b sample and measure the b-tag efficiency of the jets from top decay we find $\epsilon_{b}=0.697\pm 0.008$. On the other hand, if we take the All q sample and measure the probability of identifying a jet from top decay as a b-jet we get $\epsilon_{q}=0.012 \pm 0.001$. These are first estimates of the b-tag and mistag efficiencies. Next we discuss these efficiencies in more detail.
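
The central values of both efficiencies follow from the tag counts for jets from top decay, taking two such jets per selected event. The sketch below attaches a binomial error, which is an assumed convention and need not reproduce the quoted uncertainties; the names are illustrative.

```python
import math

# Events by number of b-tags on the two jets from top decay (table above).
TAGS_ALL_B = {0: 521, 1: 2164, 2: 2597}
TAGS_ALL_Q = {0: 5630, 1: 138, 2: 3}


def per_jet_tag_rate(tag_counts):
    """Fraction of top-decay jets that are b-tagged, with a binomial error."""
    n_jets = 2 * sum(tag_counts.values())
    n_tagged = sum(k * n for k, n in tag_counts.items())
    eff = n_tagged / n_jets
    return eff, math.sqrt(eff * (1.0 - eff) / n_jets)


eps_b, eps_b_err = per_jet_tag_rate(TAGS_ALL_B)
eps_q, eps_q_err = per_jet_tag_rate(TAGS_ALL_Q)

if __name__ == "__main__":
    print(f"eps_b = {eps_b:.3f} +- {eps_b_err:.3f}")
    print(f"eps_q = {eps_q:.3f} +- {eps_q_err:.3f}")
```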

R measurement

R is the ratio of $B(t\to Wb)$ to $\sum_{q=d,s,b}B(t\to Wq)$.

Using the MC samples one can attempt to reproduce the R simulated in the samples (1, 0 or 0.9 for the All b, All q and Mix samples respectively). To do that one must account for the number of events expected with 0, 1 or $\geq$ 2 b-jets and for the probability to identify them correctly.

When writing down the expected rates we make use of the following variables:

  • $N_{ev}$ - number of $t\bar{t}$ events with di-leptons in the final state;
  • $\epsilon_{b}$ - probability to identify correctly the flavor of a b-jet (b-tag efficiency);
  • $\epsilon_{q}$ - probability to identify a q-jet as a b-flavored jet (b-mistag probability);
  • $N_{i}^{meas}$ - number of events measured with i b-tags;
  • $R=\frac{B(t\to Wb)}{\sum_{q=d,s,b}B(t\to Wq)}$.

Doing so we have the following expressions for the probability of measuring 0, 1 or 2 b-tags in a di-lepton $t\bar{t}$ decay:

  • $P_{0}=(1-R)^{2}(1-\epsilon_{q})^{2} + 2R(1-R)(1-\epsilon_{q})(1-\epsilon_{b}) + R^{2}(1-\epsilon_{b})^{2} $
  • $P_{1}=2(1-R)^{2}(1-\epsilon_{q})\epsilon_{q} + 2R(1-R)[(1-\epsilon_{q})\epsilon_{b} + (1-\epsilon_{b})\epsilon_{q}] +    2R^{2}(1-\epsilon_{b})\epsilon_{b}$
  • $P_{2}=(1-R)^{2}\epsilon_{q}^{2} + 2R(1-R)\epsilon_{q}\epsilon_{b} + R^{2}\epsilon_{b}^{2} $
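
A cross-check that is not spelled out in the original: these three expressions are the terms of a binomial distribution for two jets, each tagged with an effective probability $\epsilon_{eff}$ (a symbol introduced here only for illustration):

$\epsilon_{eff}=R\,\epsilon_{b}+(1-R)\,\epsilon_{q}$

$P_{0}=(1-\epsilon_{eff})^{2}, \quad P_{1}=2\,\epsilon_{eff}(1-\epsilon_{eff}), \quad P_{2}=\epsilon_{eff}^{2}$

It follows that $P_{0}+P_{1}+P_{2}=1$ for any R, and that the mean number of b-tags per event, $P_{1}+2P_{2}=2\,\epsilon_{eff}$, is linear in R.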

To find the value of R that best fits the data we then proceed to maximize the following likelihood:

$L = \prod_{i=0}^{2} Poisson(N_{i}^{meas},N_{i}^{exp}) $         where $N_{i}^{exp}=P_{i}N_{ev}$
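
The fit can be sketched with a simple grid scan over R. This is only an illustration: the scan stands in for whatever maximization routine was actually used, $N_{ev}$ is assumed to be the sum of the measured counts, and the function names are hypothetical. The inputs are the Mix sample counts for jets from top decay and the efficiencies estimated above.

```python
import math

EPS_B, EPS_Q = 0.697, 0.012    # efficiencies quoted in the text
MIX_TAGS = (570, 1924, 1677)   # Mix sample, jets from top decay: 0, 1, 2 tags


def tag_probabilities(r, eps_b=EPS_B, eps_q=EPS_Q):
    """P0, P1, P2 as written in the text."""
    p0 = ((1 - r) ** 2 * (1 - eps_q) ** 2
          + 2 * r * (1 - r) * (1 - eps_q) * (1 - eps_b)
          + r ** 2 * (1 - eps_b) ** 2)
    p1 = (2 * (1 - r) ** 2 * (1 - eps_q) * eps_q
          + 2 * r * (1 - r) * ((1 - eps_q) * eps_b + (1 - eps_b) * eps_q)
          + 2 * r ** 2 * (1 - eps_b) * eps_b)
    p2 = ((1 - r) ** 2 * eps_q ** 2
          + 2 * r * (1 - r) * eps_q * eps_b
          + r ** 2 * eps_b ** 2)
    return p0, p1, p2


def log_likelihood(r, n_meas, n_ev):
    """Sum of log Poisson(N_i^meas; P_i * N_ev) over i = 0, 1, 2."""
    total = 0.0
    for n, p in zip(n_meas, tag_probabilities(r)):
        mu = p * n_ev
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def fit_r(n_meas, r_min=0.5, r_max=1.2, steps=700):
    """Grid scan for the R that maximizes the likelihood."""
    n_ev = sum(n_meas)  # assumption: N_ev is the total number of measured events
    grid = [r_min + (r_max - r_min) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: log_likelihood(r, n_meas, n_ev))


if __name__ == "__main__":
    print("fitted R (Mix, jets from top decay):", fit_r(MIX_TAGS))
```

For the Mix counts the scan lands near the generated value R = 0.9. The scan range is kept below the point where the expected rates would turn negative, so the logarithm is always defined.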

Applying this procedure for the All b and Mix samples, choosing only the jets generated from $t\bar{t}$ decay and using the values for $\epsilon_{b}$ and $\epsilon_{q}$ computed before, we get the following result:

R measured from top decay jets in MC events

The table below summarizes the results obtained using also all the jets and tags in an event.

| MC sample | All b | Mix |
| R expected | 1.0 | 0.9 |
| R measured from all jets in an event | $1.03^{+0.006}_{-0.007}$ | $0.944^{+0.007}_{-0.008}$ |
| R measured from jets generated from top decay only | $1.007^{+0.006}_{-0.007}$ | $0.906^{+0.007}_{-0.007} $ |

We see that when we take all the jets measured in an event R is not well estimated, especially in the Mix sample. We therefore tune $\epsilon_{b}$ in order to reproduce R=0.9 in the Mix sample when all the jets are used. This gives us $\epsilon_{b}(R=0.9)=0.73\pm 0.03$. The plot below shows the result of fitting a straight line to this scan.

Finding b-tag efficiency in the mix sample

Event yields for $100pb^{-1}$

Using $\sigma (pp\to t\bar{t})= 833\pm 52$ pb and $B(W\to l\nu_{l})=0.22$ for $l=e,\mu$, we estimate the event rates for a luminosity of $100pb^{-1}$ and the statistical error associated with the measurement of R.

Event yields for $L=100pb^{-1}$
| Yield after kinematic selection (%) | $10.39\pm 0.09$ | | |
| $N_{evts} (expected)$ | $441\pm 4$ | | |
| MC sample | All b | Mix | All q |
| $R_{generated}$ | 1.0 | 0.9 | 0 |
| $N_{evts}$ with 0 b-tags | $43\pm 2 $ | $60\pm 3 $ | $429\pm 7$ |
| $N_{evts}$ with 1 b-tag | $181\pm 4$ | $203\pm 5 $ | $11\pm 1$ |
| $N_{evts}$ with 2 b-tags | $217\pm 5$ | $177\pm 5 $ | $0.2\pm 0.1$ |
| R | $0.996\pm 0.02$ | $0.90\pm 0.02$ | $0.002^{+0.006}_{-0.002}$ |

-- PedroSilva - 11 Feb 2008

Topic revision: r3 - 2008-05-14 - PedroSilva