Analysis Info


In top-quark physics, almost all precision measurements are nowadays limited by theoretical, systematic and modelling uncertainties. In addition, the interpretation of ttbar data with modern pQCD predictions is limited by theoretical uncertainties, some of which are even inherent to the data. This is because ttbar data are not accurate to NNLO, since some contributions have been removed from the data in an uncontrolled way. Also, top-quark pair cross section measurements are overall of moderate quality only, since they commonly have very low acceptance (~10%), or suffer from 'combinatorial' background (which is terrible), or have large QCD background, or contain two neutrinos.

The lepton+jets final state tries to improve on all of that by measuring the actual final state of events including top-quarks.
  pp -> (tt ->) WWbb -> lvbbqq 
In fact, when measuring pp -> WWbb, not all diagrams necessarily include a top-quark, and the top-quarks in these diagrams can also become off-shell.

One theoretical limitation arises from the fact that higher-order QCD diagrams alter the top-quark decay, lower-order tW diagrams contribute to the signal, and the top-quark can become off-shell.

Presently, many theoreticians work on NLO+PS Monte Carlo generators, which include a correct treatment of non-resonant mass effects.

The goal of the analysis is to measure cross sections for pp->WWbb production. Of course, at a later stage, one can do a lot of physics with it, like top-quark mass determinations, top-quark width, PDF fits, alpha_s fits, running top-quark studies, etc.

The primary goal of the analysis?
I would say, this should be, first of all, a high-quality measurement of differential WWbb cross sections:
W-pT, b-pT, W-eta, b-eta, dR(b,b), dR(l,W), m_W[?]
Maybe double-differential?
This may look a bit shallow at first sight, but in the end it is one of the most reasonable and comprehensive (and precise?) Standard Model measurements at the LHC.

The theory community once made a statement at the LHCTopWG-Meeting: "There are too few useful measurements in the top-quark sector". At the same time, the SM community complains that all present ttbar measurements exhibit some tensions when used, e.g., in PDF fits.
So, I believe that WWbb will become a very useful measurement, both for SM and for top-quark physics. And although it appears to be simple, it is the first measurement of its kind and is highly sensitive to all SM parameters: m_top, Γ_t, alpha_s, gluon PDFs (possibly even m_W). It is sensitive to higher-order QCD, higher-order EW, etc., but it is free from awkward mismodelling effects, soft-QCD, resummation, interference, QED radiation, etc., and at a very reasonable renormalization scale. Beautiful!

The sensitivity to interference effects is neat, but I would not give it high priority. Why? In WWbb we accurately measure the actual final state; at the same time the predictions are self-consistent. Why should someone try to fudge ttbar+Wt predictions together to describe these data if more complete predictions are already available? Why should someone subtract Wt predictions from WWbb data to obtain ttbar data, if our accurate WWbb data themselves are available?
Still, it is nice to have such studies: they may indeed be useful for some modelling purposes and further enhance the usability of our measurement, but I would not try to 'enhance' these effects.

Then, one can extend the measurement further:
+ Either extend the observables towards lvbbqq, where even more non-resonant effects are important, and where NLO predictions are available.
+ Or combine W+b's. This obviously gives m_lb and m_top. Both observables, measured at the same time with their correlations, should provide maximum sensitivity to m_top, with small syst. uncertainties and low modelling uncertainties. In this way, it aims to reduce the impact of the dominant uncertainties on m_top, not by shrinking these uncertainties, but by circumventing them. It also combines the advantages of di-lepton and l+jets in a single measurement, if done with care. We should see...
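
As a hedged illustration of the m_lb observable mentioned above, the following minimal Python sketch computes the invariant mass of a lepton+b system from (pT, eta, phi, m). The function names and the example kinematics are hypothetical and are not taken from the analysis code; the same helper could be applied to W+b to obtain a top-mass proxy.

```python
import math

def four_vector(pt, eta, phi, mass):
    """Return (E, px, py, pz) in GeV from collider coordinates."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def invariant_mass(*vectors):
    """Invariant mass of the sum of any number of (E, px, py, pz) tuples."""
    e, px, py, pz = (sum(component) for component in zip(*vectors))
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))

# Hypothetical lepton and b-jet kinematics, for illustration only.
lepton = four_vector(pt=70.0, eta=0.3, phi=0.1, mass=0.0)
bjet = four_vector(pt=80.0, eta=-0.5, phi=2.0, mass=4.8)
m_lb = invariant_mass(lepton, bjet)
print(f"m_lb = {m_lb:.1f} GeV")
```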

Analysis strategy

Lepton triggers


The l+jets analysis is designed such that one selects events with
  • 1 lepton
  • 1 MET
  • 1 b-jet (or better 2 or more)
  • 1 light jet (or better 2 or more)
At first, one then has large combinatorial background, large QCD background, and low acceptance.
But when additionally requiring:
  • lepton-pT > 60 GeV
  • MET > 60 GeV
  • b-jet pT > 60 GeV
and then looking for two light jets that fulfil a certain pT--delta-R requirement and have a dijet mass similar to that of a W-boson, one has > 80% acceptance. In such a topology, the correct W+b assignment is possible without ambiguity.
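
The selection described above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the analysis code: all three thresholds are assumed to be meant in GeV, the event layout (a dict with `lepton`, `met`, `bjets`, `light_jets`) is hypothetical, and the dijet mass window of ±20 GeV around m_W is an illustrative value that does not come from these notes. The pT--delta-R requirement on the jet pair is not specified here and is therefore left out.

```python
import math
from itertools import combinations

M_W = 80.4          # GeV, approximate W-boson mass
MASS_WINDOW = 20.0  # GeV, hypothetical half-width of the dijet mass window

def p4(obj):
    """(E, px, py, pz) in GeV from a dict with pt, eta, phi and optional m."""
    pt, eta, phi, m = obj["pt"], obj["eta"], obj["phi"], obj.get("m", 0.0)
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz + m * m), px, py, pz)

def dijet_mass(j1, j2):
    """Invariant mass of a jet pair in GeV."""
    e, px, py, pz = (a + b for a, b in zip(p4(j1), p4(j2)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def find_hadronic_w(light_jets):
    """Return the light-jet pair with mass closest to M_W inside the window, or None."""
    best = None
    for j1, j2 in combinations(light_jets, 2):
        diff = abs(dijet_mass(j1, j2) - M_W)
        if diff < MASS_WINDOW and (best is None or diff < best[0]):
            best = (diff, j1, j2)
    return None if best is None else (best[1], best[2])

def passes_selection(event, threshold=60.0):
    """Apply the lepton-pT, MET and b-jet pT cuts, then require a W-like jet pair."""
    if event["lepton"]["pt"] <= threshold or event["met"] <= threshold:
        return False
    if not any(b["pt"] > threshold for b in event["bjets"]):
        return False
    # The pT--delta-R requirement on the jet pair is not specified in the
    # notes and is deliberately not implemented in this sketch.
    return find_hadronic_w(event["light_jets"]) is not None

# Hypothetical event, for illustration only.
example = {
    "lepton": {"pt": 75.0, "eta": 0.2, "phi": 1.0},
    "met": 65.0,
    "bjets": [{"pt": 90.0, "eta": -0.4, "phi": 2.5}],
    "light_jets": [
        {"pt": 40.2, "eta": 0.0, "phi": 0.0},
        {"pt": 40.2, "eta": 0.0, "phi": math.pi},
    ],
}
print(passes_selection(example))
```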

Our analysis builds upon the single-top ntuples.
These are ~100 TB of data, but calibrated and including all systematic uncertainties.

What needs to be done:
We develop a new (fast) code for the event selection, the cuts and the histogramming. We also need some unfolding code (TUnfold). Then the observables need to be defined, the analysis cuts optimised, the unfolding set up, etc.

In later stages, we can incorporate also new NLO MC-generators, which are presently under development by theoreticians.

In later stages, one can do top-quark mass fits, etc...

From the point-of-view of the 'WWbb' analysis, I think, it is reasonable to have symmetric W-pT cuts, so the same for Wlep and Whad.

Because of the ljet.pT, lept.pT and MET cuts that are given by trigger, pile-up and resolution, there is very low acceptance at low W.pT. This is seen as the turn-over of the distribution at lower pT.
For a high-quality analysis, the acceptance (with kinematic correlation) should be as large as possible.
Therefore, I think we need a W.pT cut of at least 50 GeV. 70 GeV appears more reasonable to me.
Possibly, we even have to have 100 or 120 GeV !?

A pT cut of 50 GeV also sounds very reasonable.
But I was thinking more about the two decay products: at 50 GeV, the two decay products are unlikely to both exceed 25 GeV such that they are within our 'pT' acceptance. So, in this kinematic region, large correction factors need to be taken from MC, which I would try to avoid as much as possible.
The other 'W': the efficiency for the reconstruction of the hadronic W is quite low at low pT. 50 GeV would be my dream as well, but we will likely run into trouble there. Thus, I suggested starting with a somewhat higher cut, and once we understand more, we can go down. I hope that this facilitates the analysis at this stage.

Therefore, I would propose to add a pre-selection of Wlep.pT>70 GeV from now on, and also to impose that cut on Whad.

Of course, you can come to other conclusions and select any different value between 60 and 100 GeV. However, I would not go to much lower values.

A further advantage is that the W-tagging becomes much more efficient.
In addition, the dR(W,b) correlation becomes much cleaner, as you have already shown.

Whether or not to increase the bjet.pT cut needs to be studied/discussed. Maybe, one wants to choose the same pT as for the W's, maybe not...

Choice and cuts on variables

As expected: if one restricts the data to higher scales (higher W-pT), the W+jets background becomes more relevant than at lower scales. In fact, W+jets is more important than DR. Hence, if we would like to draw conclusions about 'interference', W+jets should be made considerably smaller than Wt.

From these plots we now see that W+jets is not an irreducible background, but has a distinct structure:
+ The Whad-mass has likely this second peak (probably we have to impose a cut on WHad.M anyhow...)
+ dR(b,b) is small for W+jets, since both b's likely originate from the same gluon and have a common boost
+ dM(b,b) is small for W+jet, since it comes from a massless gluon
+ dR(Whad,b) and dR(lept,b) are both large
I do not yet see a single variable that reduces W+jets with a considerable efficiency and background/signal ratio.

There are different options:
+ keep the background as it is, or cut on one of the present variables
+ combine several variables to enhance and then cut the background.
(R(b,b)<1.6 && R(l,b0)>2 && R(l,b1)>2 && M(b+b)<70) ...
+ Use an MVA

+ find more, and even better, observables.
- (b+b).pT / Wlep.pT ?
- R(b+b, l)
- R(b+b, Whad)
- ...
motivated by the fact that the 2-bjet system has a different origin in W+jet than in ttbar.

+ For good control, it would be good to define a 'background enhanced' control region and check if the background normalisation is well-described. This can be done by defining the control region 'on top' of the present preselection (like adding R(b,b)<1.6 && M(b+b)<70).
Or with a completely different pre-selection (like lept.pT, MET, 2(?)3 jets, (no-bjets?) only), then plotting n-jet, jet-pT...
-> Possibly there are good examples in previous papers.
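
The combined cut listed among the options above, R(b,b)<1.6 && R(l,b0)>2 && R(l,b1)>2 && M(b+b)<70, could be evaluated as in the following minimal Python sketch. The function names and the dict-based object layout are hypothetical, the mass is assumed to be in GeV, and jets are treated as massless unless an `m` entry is given. Events flagged by `is_wjets_like` could either be vetoed in the signal selection or kept to populate a 'background enhanced' control region.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi

def delta_r(a, b):
    """Angular distance in the (eta, phi) plane."""
    return math.hypot(a["eta"] - b["eta"], delta_phi(a["phi"], b["phi"]))

def p4(obj):
    """(E, px, py, pz) from a dict with pt, eta, phi and optional m."""
    pt, eta, phi, m = obj["pt"], obj["eta"], obj["phi"], obj.get("m", 0.0)
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz + m * m), px, py, pz)

def pair_mass(a, b):
    """Invariant mass of the two-object system."""
    e, px, py, pz = (x + y for x, y in zip(p4(a), p4(b)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def is_wjets_like(lepton, b0, b1):
    """True if the event passes the combined W+jets-enhancing cut."""
    return (delta_r(b0, b1) < 1.6
            and delta_r(lepton, b0) > 2.0
            and delta_r(lepton, b1) > 2.0
            and pair_mass(b0, b1) < 70.0)

# Hypothetical kinematics, for illustration only.
lepton = {"pt": 70.0, "eta": 0.0, "phi": 3.0}
b_lead = {"pt": 60.0, "eta": 0.0, "phi": 0.0}
b_close = {"pt": 50.0, "eta": 0.3, "phi": 0.4}  # collinear pair, gluon-splitting-like
b_far = {"pt": 50.0, "eta": 1.0, "phi": 2.5}    # well-separated pair, ttbar-like

print(is_wjets_like(lepton, b_lead, b_close))
print(is_wjets_like(lepton, b_lead, b_far))
```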

I think, I understand the reason for the two lines:
In W+jet one has three processes that produce b-jets:
+ the hard gluon (qq->W+g) decays into two b
+ a soft-gluon (qq-> W+g+g) decays into two b
+ b's from parton shower or hadronisation.

The second and third will likely not yield two high-pT b's, so the high-pT bjets originate from the recoiling gluon, which causes the distinct blob at dR~pi. (Therefore, I have proposed to plot (b+b).pT/Wlep.pT.)

Since W-pT is somewhat high, (b+b).pT is equally high and thus the two b's are boosted. They originate from a massless gluon and hence they are still 'collinear' after the gluon decay. Consequently, the two bjets are close together (dR~0.4), and therefore their dR to the Wlep is similar, up to a difference of 0.4--0.6 -> two lines with a distance of ~0.5?

From your plot R(b,b) one sees, that W+jets have always R(b,b)<1.5, but mainly R~0.6

Since we need further 1-2 hard light-jets, the b+b are not exactly back-to-back to the W, and R(W,b+b) can be different. This causes the two lines.

So, one key to suppress the W+jets background is R(b,b).

Comparison to other analyses

In the top-sector, there are a number of related measurements. Of course pp->tt->l+jets. Others include associated production tt+Z, tt+gamma, tt+bb, tt+..., and of course the single-top measurements.
Have a look: Top Public Results.
If you select the 'glance' entry, you often find a comprehensive int-note and further details on the analysis.

The ttbar measurements are further divided into 'differential' and 'total' cross section measurements. The latter measure only the total cross section.

Our measurement is indeed very (VERY) closely related to ttbar->l+jets. Some may even think, this is the same measurement, but that's not true.

They 'reconstruct' their events by evaluating a combinatorial kinematic fit, the KLFitter. They assume that all events arise from a resonant ttbar process and that the final state is lvbbjj. However:
  • there are non-resonant contributions
  • most events have low-pT decay products and the final state is not within the acceptance
  • there are interference contributions from the NLO Wt-process.
You see it from fig. 4c of that publication. The efficiency is only 6-10%!
One could say that 94% of what is published is not data, but just the MC simulation.
In addition (from other plots), these measurements have 'combinatorial' background that can be as large as 50%. So every second event is wrongly re-combined, which amounts to a huge 'noise'. Furthermore, it is assumed that two top-quarks have been present resonantly, but this is quantum-mechanically not required.

We intend to improve on these aspects. Therefore, we cannot make use of a kinematic fit, but intend to reconstruct the final state products, b, b, W and another W one after the other.
The tricky part is then the hadronic W, which after decay and hadronisation may become 1, 2 or 3 jets, or is not within the acceptance. Importantly, we cannot assume for W-tagging that the W arises from a resonant top, since we do not measure pp->tt->WWbb->l+jets but pp->WWbb->l+jets.

Therefore, we could not build upon ttbar analysis code, since it completely relies on 2 resonant tops, but that's not our intention.

Skimming idea

If we need small trees, the following skim could be applied:
  • MET > 25 GeV
  • at least 1 lepton with pT > 25 GeV
  • Leading lepton IsTight
  • Lepton |Eta| < 2.5
  • (MET+lepton).pT > 30 GeV (W-leptonic)
  • at least 1 bjet with pT > 25 GeV, dl1r >= 5 (4?)
  • at least two additional light jets with pT > 25 GeV
  • drop all jets with pT<20 GeV
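
The skim listed above could be sketched in Python as follows. This is a minimal sketch under stated assumptions: the event layout and field names (`met`, `met_phi`, `is_tight`, `dl1r`) are hypothetical and do not reflect the actual ntuple branches, the lepton pT and |eta| cuts are read as applying to the leading lepton, `dl1r` is taken to be an integer tag score with the cut at >= 5, and 'light jets' are taken to be the jets above 25 GeV that fail the b-tag.

```python
import math

def leptonic_w_pt(met, met_phi, lepton):
    """pT of the (MET + lepton) system in the transverse plane."""
    px = met * math.cos(met_phi) + lepton["pt"] * math.cos(lepton["phi"])
    py = met * math.sin(met_phi) + lepton["pt"] * math.sin(lepton["phi"])
    return math.hypot(px, py)

def skim(event, btag_min=5):
    """Return a slimmed copy of the event if it passes the skim, else None."""
    if event["met"] <= 25.0:
        return None
    leptons = sorted(event["leptons"], key=lambda l: l["pt"], reverse=True)
    if not leptons:
        return None
    lead = leptons[0]
    if lead["pt"] <= 25.0 or not lead["is_tight"] or abs(lead["eta"]) >= 2.5:
        return None
    if leptonic_w_pt(event["met"], event["met_phi"], lead) <= 30.0:
        return None
    # Drop all jets with pT < 20 GeV before counting or storing anything.
    jets = [j for j in event["jets"] if j["pt"] >= 20.0]
    hard = [j for j in jets if j["pt"] > 25.0]
    bjets = [j for j in hard if j["dl1r"] >= btag_min]
    light = [j for j in hard if j["dl1r"] < btag_min]
    if len(bjets) < 1 or len(light) < 2:
        return None
    return {**event, "leptons": leptons, "jets": jets}

# Hypothetical event, for illustration only.
example = {
    "met": 40.0,
    "met_phi": 0.0,
    "leptons": [{"pt": 30.0, "eta": 0.5, "phi": 0.2, "is_tight": True}],
    "jets": [
        {"pt": 50.0, "eta": 0.1, "phi": 2.0, "dl1r": 5},
        {"pt": 40.0, "eta": -1.0, "phi": -2.0, "dl1r": 1},
        {"pt": 30.0, "eta": 1.5, "phi": 3.0, "dl1r": 2},
        {"pt": 15.0, "eta": 0.0, "phi": 1.0, "dl1r": 1},
    ],
}
slimmed = skim(example)
print(len(slimmed["jets"]))
```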

For truth, there are O(2000) 4-vectors, but only O(10) are useful.


Possible timetable for Charlie's contributions to the lepton+jets (and di-lepton) analysis

Oct/Nov/Dec 2020:
  • get familiar with analysis codes, and 'single-top' ntuple format
  • setup analysis environment, copy ntuples to local cluster, etc...

Dec/Jan 21

  • validation of the ongoing di-lepton analysis cutflow
  • preparation of l+jet single-top ntuples (v31) [see link below]

Jan-Apr 21

  • possible (small) contribution to di-lepton analysis (cut-optimization, same-flavor lepton selection, or similar) (this is important for your possible contribution to the di-lepton paper)

Jan-Aug 21 'design' of the lepton+jets analysis.

  • phase space selection
  • analysis 'up-scaling'
  • background studies
  • resolution studies
  • observable reconstruction
  • efficiencies
  • etc...

Sept-Feb 21/22

  • channel combination (e,mu)
  • Unfolding
  • systematic uncertainties
  • Model predictions

Feb-Jun 22

  • Phenomenological interpretation (NLO, NLO+PS, approx. NNLO)
  • optional: possible extension to pp->lvqqbb
  • optional: possibly include 1-bjet, and/or 1-ljet channel (requires new ntuples)

Jun-Aug 22

  • paper preparation, conf note (ICHEP?)

Aug-Jan 22/23

  • extended phenomenological analysis (m_top, Γ_top, PDF, alpha_s, etc.)
  • paper preparation

Feb-May 23

  • extra-time

Jun-Oct 23

  • Writing thesis.

MC samples

tW samples
  • DR = Diagram Removal
  • DS = Diagram Subtraction

Analysis code

  • WWbbLoop currently used for the di-lepton analysis
  • WWbb intended by Daniel for the l+jets analysis. There are in fact two analysis packages included: one by Andrii (neowwbb) and one by Daniel, WWbb (libwwbb)

Group Twiki, mailing lists, etc.

-> v31 single-lepton (for l+jets analysis)
-> v31 multi-lepton (for di-lepton analysis)

  • WbWb twiki?
  • WbWb analysis meetings: Thursdays 16:00 (used to be Fridays 12:00)
  • WbWb analysis group mailing list:
  • Analysis contacts: Daniel Britzger (Max-Planck-Institut für Physik München), Serena Palazzo (The University of Edinburgh (GB))

Presently, our analysis group has a mandate for WbWb analyses in the di-lepton and in the l+jets channel. We had previously been in contact with the top-group and top-cross-section group conveners about this question, and it was agreed that we can aim for a combined analysis (di-lepton plus l+jets) or for two analyses, depending on the progress of the two channels and the actual physics messages.

Talks, Meetings

WbWb meetings

Top cross section meetings

Theory related

  • TOP2020, 14-18 Sep 2020 Conference timetable
  • TOP2020, 15 Sep 2020 Theoretical issues in event generation (Silvia Ferrario Ravasio) (NLO+PS Monte-Carlo generators with a full non-resonant aware matching are in development, p3-6)

Theory papers


  • arXiv:2008.11133 (hep-ph) NNLO QCD corrections to leptonic observables in top-quark pair production and decay (Michal Czakon, Alexander Mitov, Rene Poncelet) (An excellent summary on many theoretical and experimental issues in top-quark measurements. Most notably the introduction gives an excellent overview.)


  • arXiv:1012.3975 (hep-ph) NLO QCD corrections to WWbb production at hadron colliders (A. Denner, S. Dittmaier, S. Kallweit, S. Pozzorini)
  • arXiv:1207.5018 (hep-ph) NLO QCD corrections to off-shell top-antitop production with leptonic decays at hadron colliders (Ansgar Denner, Stefan Dittmaier, Stefan Kallweit, Stefano Pozzorini) (see fig 3)
  • arXiv:1412.1828 (hep-ph) Top-pair production and decay at NLO matched with parton showers (John M. Campbell, R. Keith Ellis, Paolo Nason, Emanuele Re)


  • arXiv:1711.10359 (hep-ph) Off-shell production of top-antitop pairs in the lepton+jets channel at NLO QCD (Ansgar Denner, Mathieu Pellen)

-- KenjiHamano - 2020-11-06

Topic revision: r13 - 2021-05-18 - KenjiHamano