First Monte-Carlo studies (November-December 2007)
Event selection
To study the event selection we bring the Monte-Carlo generator level results closer to what is measured experimentally by clustering high pT objects, which correspond to the groups of particles flowing into the detector. This can be done with the standard clustering algorithms provided by PYTHIA. We choose the CellJet algorithm (known as PYCELL in PYTHIA 6).
The CellJet algorithm divides the $\eta$-$\phi$ phase space into equally spaced cells. The flux of energy-momentum is measured in each cell. The cell with the highest pT is chosen as a cluster seed and agglutinates all the neighboring cells within a range in $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$ around it. If the cluster's total pT is higher than a given threshold, the cluster is considered a jet, or CellJet. The algorithm then proceeds to find other cells with high pT that can be used as seeds for further jets. For more details consult the PYTHIA 6 manual or the PYTHIA 8 online documentation.
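The clustering steps described above can be sketched in a few lines of Python. This is only a toy illustration, not the PYCELL/CellJet implementation: the function name `cell_jets`, the input format (a list of `(pt, eta, phi)` tuples) and every default value (grid size, seed threshold, cone radius, jet threshold) are assumptions made for the example.

```python
import math
from collections import defaultdict

TWO_PI = 2.0 * math.pi

def delta_phi(a, b):
    """Signed azimuthal difference a - b, wrapped into [-pi, pi)."""
    return (a - b + math.pi) % TWO_PI - math.pi

def cell_jets(particles, eta_max=2.5, n_eta=50, n_phi=24,
              seed_pt=1.5, cone_r=0.7, jet_pt_min=7.0):
    """Toy cell clustering; all default values are illustrative assumptions."""
    d_eta = 2.0 * eta_max / n_eta
    d_phi = TWO_PI / n_phi

    # 1. Sum the pT of the particles falling in each (eta, phi) cell.
    cells = defaultdict(float)
    for pt, eta, phi in particles:
        if abs(eta) >= eta_max:
            continue
        i = min(int((eta + eta_max) / d_eta), n_eta - 1)
        j = int((phi % TWO_PI) / d_phi) % n_phi
        cells[(i, j)] += pt

    def centre(cell):
        i, j = cell
        return -eta_max + (i + 0.5) * d_eta, (j + 0.5) * d_phi

    free = dict(cells)   # cells not yet assigned to a jet
    rejected = set()     # seeds whose cluster failed the jet threshold
    jets = []
    while True:
        # 2. The highest-pT free cell above the seed threshold is the seed.
        seeds = [c for c in free if c not in rejected and free[c] >= seed_pt]
        if not seeds:
            break
        seed = max(seeds, key=free.get)
        eta0, phi0 = centre(seed)

        # 3. Gather every free cell within cone_r of the seed.
        members = [c for c in free
                   if math.hypot(centre(c)[0] - eta0,
                                 delta_phi(centre(c)[1], phi0)) <= cone_r]
        pt_sum = sum(free[c] for c in members)

        # 4. Keep the cluster as a jet only if it passes the pT threshold.
        if pt_sum < jet_pt_min:
            rejected.add(seed)
            continue
        eta_jet = sum(free[c] * centre(c)[0] for c in members) / pt_sum
        phi_off = sum(free[c] * delta_phi(centre(c)[1], phi0)
                      for c in members) / pt_sum
        jets.append({"pt": pt_sum, "eta": eta_jet,
                     "phi": (phi0 + phi_off) % TWO_PI})
        for c in members:
            del free[c]
    return jets
```

Because the jet axis is built from cell centres rather than from the particles themselves, the reconstructed direction is quantized by the grid, which is the granularity effect discussed later on this page.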
To match a jet to a primordial parton we look, within a range

, for the jet that minimizes

. If no jet is found within

or

we discard the event. Some 'event displays' are shown below.
Event display (with matched leptons and b-jets)
The following plot shows the distribution of the number of jets found in each event after subtracting the jets matched to leptons. There is a non-negligible number of events in which 0 or 1 jets were found. Even when selecting only the events in which both b-quarks are matched to a jet, we count events in which only one jet was found.
Due to the fixed size of the $\eta$-$\phi$ cells used to cluster the high pT objects, and to the minimum pT threshold required to initiate the clustering procedure, some errors may occur when matching the generated leptons and partons to the pT clusters (CellJets). Some examples are shown below:
- if the pT is too low no jet will be found in the event for the generated particle;
- if the generated particles are not well separated they will be gathered in the same cluster.
Event display (with matching exception examples)
To minimize the errors in the identification of the jets corresponding to the generator level particles we:
- try to match the leptons to one of the jets. If a match is found we subtract the lepton momentum from the jet momentum;
- try to match the b-quarks to the remaining jets;
- select events in which both b-quarks were successfully matched.
This procedure introduces a bias in the event selection because events with low pT jets/leptons will be discarded.
In the following plots we compare the generator level distributions (plot on the left) with the jet level distributions. Note that for the leptons we always use the generator level values because experimentally they will, in principle, be resolved by the detector.
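The matching procedure listed above can be sketched as follows. This is a minimal illustration under stated assumptions: objects are dictionaries with `pt`, `eta` and `phi`, the closest jet is chosen by minimizing $\Delta R$, and the values of `max_dr` and `min_jet_pt` are hypothetical placeholders, not the cuts used in this study.

```python
import math

def delta_r(a, b):
    """Distance in the eta-phi plane between two objects."""
    dphi = abs(a["phi"] - b["phi"]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a["eta"] - b["eta"], dphi)

def closest(obj, jets, max_dr):
    """Jet minimizing delta_r to obj, or None if none lies within max_dr."""
    best = min(jets, key=lambda jet: delta_r(obj, jet), default=None)
    if best is None or delta_r(obj, best) > max_dr:
        return None
    return best

def subtract(jet, lepton):
    """Subtract the lepton three-momentum from the jet (masses neglected)."""
    def cartesian(p):
        return (p["pt"] * math.cos(p["phi"]),
                p["pt"] * math.sin(p["phi"]),
                p["pt"] * math.sinh(p["eta"]))
    px, py, pz = (j - l for j, l in zip(cartesian(jet), cartesian(lepton)))
    pt = math.hypot(px, py)
    if pt == 0.0:
        return None
    return {"pt": pt, "eta": math.asinh(pz / pt), "phi": math.atan2(py, px)}

def match_event(leptons, b_quarks, jets, max_dr=0.4, min_jet_pt=10.0):
    """Return the jets matched to the b-quarks, or None to discard the event.

    max_dr and min_jet_pt are hypothetical values chosen for the example.
    """
    jets = list(jets)
    # Step 1: match leptons and remove their momentum from the matched jet.
    for lepton in leptons:
        jet = closest(lepton, jets, max_dr)
        if jet is None:
            continue
        jets.remove(jet)
        rest = subtract(jet, lepton)
        if rest is not None and rest["pt"] >= min_jet_pt:
            jets.append(rest)
    # Step 2: match each b-quark to one of the remaining jets.
    matched = []
    for quark in b_quarks:
        jet = closest(quark, jets, max_dr)
        if jet is None:
            return None  # Step 3: both b-quarks must be matched.
        matched.append(jet)
    return matched
```

In this sketch two b-quarks may be matched to the same jet, which mirrors the case noted earlier where both quarks are matched although only one jet was found.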
Object separation
Kinematics
Event yields after selection
| MC sample | | All b | | All q | | Mix | |
| | | 50000 | | 50000 | | 40000 | |
| "reconstructed" | | 50000 | 22837 | 50000 | 23252 | 40000 | 17720 |
| Selection | | Generator level | Jet level | Generator level | Jet level | Generator level | Jet level |
| Event topology | | 49750 | 22043 | 49745 | 23217 | 39796 | 17671 |
| 2 charged leptons | | 21306 | 9018 | 21091 | 9432 | 16832 | 7073 |
| 2 jets | | 14623 | 6918 | 14437 | 7554 | 11482 | 5509 |
| MET | MET > 60 GeV | 14019 | 5282 | 13819 | 5771 | 11051 | 4171 |
| Yield (%) | | | | | | | |
As a combined result we obtain for the yield

.
The error quoted for this yield is statistical (from counting). The measurement of the yield is, however, affected by several systematic effects, namely:
- the jet energy scale;
- the b-tagging efficiency.
We look at these effects separately.
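As an illustration of the statistical (counting) error mentioned above, the sketch below computes a selection yield and its binomial uncertainty. The definition used here, events passing the last cut divided by the number of generated events, is an assumption made for the example; the counts are the generator level "All b" entries of the table above.

```python
import math

def yield_with_error(n_pass, n_total):
    """Selection yield and its binomial (counting) uncertainty."""
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

# Generator level, "All b" sample: 14019 events pass the MET cut out of 50000.
eff, err = yield_with_error(14019, 50000)
print(f"yield = {100 * eff:.2f} +/- {100 * err:.2f} %")
```

The binomial form is used because the selected events are a subset of the generated ones, so the two counts are not independent.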
Influence of jet energy measurements (Jet Energy Scale)
The reconstructed energy of a jet depends on several factors, namely the jet algorithm used, the resolution and granularity of the calorimeter, etc. At the generator level we can estimate how these errors propagate to the final yields.
In our case the jet algorithm that clusters the particles collected in cells within a radius

will introduce an error in the measurement of the pT and

. This is mainly due to the granularity of the $\eta$-$\phi$ grid used to count particles.
Below we plot the error distributions for:

, using a grid

for the MC samples in which

100% of the cases.
From this distribution we conclude that the CellJet algorithm tends to underestimate the pT of the original parton. This can be related to the fact that not all the hadrons are clustered in the same jet: some of them may be left outside the cells used to construct the jet. An improvement to this algorithm would be to iterate the clustering using the CellJets and the remaining SingleCells (not yet clustered). This second step would make it possible to collect the remaining energy and would also merge jets that are very close to each other.
We turn now to the effect that the calorimeter measurement of the particles might have on the event yield. In a crude approach the calorimeter will measure the pT of the particles with a resolution that can be given by

if we only take into account the fluctuations from Poisson statistics (calorimeter with linear response). We can estimate how this fluctuation propagates to the expected event yields by changing the pT of the leptons and the jets by the following amount:
To discriminate the contribution from positive and negative smearings we try also the following changes:
Applying both these smearings to the "All b" sample with a resolution R=0.4 we get the results summarized in the table below:
| | Generator level | | | Jet level | | |
| | 50000 | | | 22043 | | |
| Smear mode | negative | 0 | positive | negative | 0 | positive |
| Topology in range | 49601 | 49748 | 49834 | 21965 | 21983 | 22005 |
| Leptons in range | 19702 | 21282 | 22862 | 8350 | 9031 | 9730 |
| Jets in range | 13019 | 14571 | 16160 | 6156 | 6935 | 7717 |
| MET in range | 12452 | 13984 | 15536 | 4673 | 5283 | 5926 |
| Yield(R=0.4) | 0.249 | 0.280 | 0.311 | 0.212 | 0.240 | 0.269 |
| Yield(R=0.4)-Yield(R=0) | -0.031 | 0.000 | +0.031 | -0.028 | 0.000 | +0.029 |
From these yield differences we get an estimate for the systematic error due to the precision in measuring the pT.
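The arithmetic behind the last two rows of the table can be checked with the short sketch below. Two assumptions are made and should be read as such: the yield is taken as the "MET in range" count divided by the sample size in the header row (50000 and 22043), and the helper `shift_pt` uses a Poisson-like shift of R times the square root of pT, which is our guess at the form of the smearing and not a formula taken from this page.

```python
import math

def shift_pt(pt, resolution, sign):
    """Shift pT by +/- resolution * sqrt(pT) (assumed Poisson-like form)."""
    return max(0.0, pt + sign * resolution * math.sqrt(pt))

# "MET in range" counts from the table, keyed by smear mode (-1, 0, +1).
met_in_range = {
    "generator": {-1: 12452, 0: 13984, +1: 15536},
    "jet": {-1: 4673, 0: 5283, +1: 5926},
}
sample_size = {"generator": 50000, "jet": 22043}

def yields(level):
    """Yield for each smear mode at the given level."""
    n = sample_size[level]
    return {mode: count / n for mode, count in met_in_range[level].items()}

def yield_shifts(level):
    """Yield difference of each smear mode with respect to no smearing."""
    y = yields(level)
    return {mode: y[mode] - y[0] for mode in y}

for level in ("generator", "jet"):
    print(level, {m: round(d, 3) for m, d in yield_shifts(level).items()})
```

The positive and negative shifts are nearly symmetric at both levels, so quoting half of their spread as the systematic uncertainty would be one reasonable choice.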
Influence of b-tagging efficiency
To estimate how the b-tagging efficiency propagates to this measurement we proceed as follows:
- Select the events in which the 2 quarks + 2 leptons from the $t\bar{t}$ decay pass all the kinematic cuts
- For each jet:
- trace back if any of the clustered particles came from a hadron whose proper lifetime (
, measured in mm/c) was shorter than
(exclude
contributions);
- compute an 'effective' proper time for a second vertex candidate as:
where
is the proper time of the hadron measured in the laboratory frame (Note: the
weights are used to enhance the contribution from the hadrons belonging to the '
core' of the jet);
- Generate the time of generation (or displacement) of the second vertex candidate from an exponential distribution
.
- Correct the generated value by the rapidity of the jet:
- If
then b-tag the jet with a secondary vertex.
- Count how many b-tags are attributed in each event.
- Count how many b-tags are attributed to the jets matched to the top decay (
) to estimate the b-tagging efficiency (
) in
events.
Using the previous method we obtain the following result:
| MC sample | | All b | All q | Mix |
| | | 5282 | 5771 | 4171 |
| Events counted using all jets found | | | | |
| N b-tags | 0 tags | 471 | 5134 | 517 |
| | 1 tag | 2018 | 547 | 1815 |
| | 2 tags | 2524 | 82 | 1677 |
| | > 2 tags | 269 | 8 | 162 |
| Events counted using jets from top decay only | | | | |
| N b-tags | 0 tags | 521 | 5630 | 570 |
| | 1 tag | 2164 | 138 | 1924 |
| | 2 tags | 2597 | 3 | 1677 |
If we consider the
All b sample and measure the efficiency on the b-tag of the jets from top decay we find

. On the other hand, if we take the All q sample and measure the probability of identifying a jet from top decay as a b-jet we get

. These are first estimates for the b-tag efficiency and the mistag probability. Next we discuss these efficiencies in more detail.
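One way to extract per-jet tagging rates from the "jets from top decay only" counts is sketched below. It assumes two top-decay jets per selected event and defines the rate as tagged jets over all jets; this definition is our assumption and may differ from the one used to obtain the numbers quoted on this page.

```python
def per_jet_tag_rate(n0, n1, n2):
    """Fraction of tagged jets, given event counts with 0, 1 and 2 tags.

    Assumes exactly two top-decay jets per event.
    """
    events = n0 + n1 + n2
    return (n1 + 2 * n2) / (2 * events)

# Counts from the table (jets from top decay only).
eff_b = per_jet_tag_rate(521, 2164, 2597)   # "All b" sample: b-tag efficiency
eff_q = per_jet_tag_rate(5630, 138, 3)      # "All q" sample: mistag probability

print(f"b-tag efficiency   ~ {eff_b:.3f}")
print(f"mistag probability ~ {eff_q:.4f}")
```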
R measurement
R is the ratio of $\mathcal{B}(t \to Wb)$ with respect to $\mathcal{B}(t \to Wq)$.
Using the MC samples one can attempt to reproduce the value of R simulated in each sample (1, 0 or 0.9 for the All b, All q and Mix samples, respectively). To do that one must account for the number of events expected with 0, 1 or

2 b-jets and for the probability to identify them correctly.
When writing down the expected rates we make use of the following variables:
-
- number of
with di-leptons in the final state
-
- probability to identify correctly the flavor of a b-jet (b-tag efficiency);
-
- probability to identify q-jet as a b-flavored jet (b-mistag probability);
-
- number of events measured with i b-tags;
-
;
Doing so we have the following expressions for the probability of measuring 0, 1 or 2 b-tags in a di-lepton

decay:
To find the value of R that best fits the data we then proceed to maximize the following likelihood:
where
Applying this procedure for the
All b and
Mix samples, choosing only the jets generated from

decay and using the values for

and

computed before, we get the following result:
The table below summarizes the results obtained using also all the jets and tags in an event.
| MC sample | All b | Mix |
| R expected | 1.0 | 0.9 |
| R measured from jets generated from top decay only | | |
| R measured from all jets in an event | | |
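A fit of this kind can be sketched as follows for the case where only the two jets from top decay are used. The model is an assumption made for the example: each top-decay jet is a b-jet with probability R and is tagged independently, so the per-jet tag probability is R times the b-tag efficiency plus (1 - R) times the mistag probability, and the number of tags per event is binomial. The efficiencies are recomputed from the tag counts tabulated earlier, and the likelihood is maximized with a simple grid scan.

```python
import math

def tag_probabilities(r, eff_b, eff_q):
    """Probabilities of 0, 1 and 2 tags for two independent top-decay jets."""
    p = r * eff_b + (1.0 - r) * eff_q
    return ((1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2)

def log_likelihood(r, counts, eff_b, eff_q):
    """Multinomial log-likelihood of the (0, 1, 2)-tag event counts."""
    probs = tag_probabilities(r, eff_b, eff_q)
    if any(q <= 0.0 for q in probs):
        return -math.inf
    return sum(n * math.log(q) for n, q in zip(counts, probs))

def fit_r(counts, eff_b, eff_q, r_max=1.2, steps=2400):
    """Grid scan for the R that maximizes the likelihood (step r_max/steps)."""
    grid = [r_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: log_likelihood(r, counts, eff_b, eff_q))

def per_jet_tag_rate(n0, n1, n2):
    return (n1 + 2 * n2) / (2 * (n0 + n1 + n2))

# Event counts with 0, 1, 2 tags (jets from top decay only).
all_b, all_q, mix = (521, 2164, 2597), (5630, 138, 3), (570, 1924, 1677)
eff_b = per_jet_tag_rate(*all_b)   # assumed definition of the b-tag efficiency
eff_q = per_jet_tag_rate(*all_q)   # assumed definition of the mistag probability

print("R (All b):", fit_r(all_b, eff_b, eff_q))
print("R (Mix):  ", fit_r(mix, eff_b, eff_q))
```

Since the b-tag efficiency is derived here from the All b sample itself, the fit on that sample returns R = 1 by construction; only the Mix sample provides a genuine closure test in this sketch.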
We see that when we take all the jets measured in an event R is not well estimated, especially in the Mix sample. We therefore tune

in order to reproduce R=0.9 in the
Mix sample when all the jets are used. This gives us

. The plot below shows the result of fitting a straight line to this scan.
Event yields for
Using

and

for

we make an estimate for the event rates with a luminosity of

and for the statistical error associated to the measurement of R.
| Event yields for | | | | |
| Yield after kinematics' selection (%) | | | | |
| MC sample | | All b | Mix | All q |
| | | 1.0 | 0.9 | 0 |
| with k b-tags | 0 | | | |
| | 1 | | | |
| | 2 | | | |
| R | | | | |
--
PedroSilva - 11 Feb 2008