Summary of ttH HXSWG meetings (Oct 2014-Jan 2015)

In the following we summarise some aspects that have been pointed out during the ttH HXSWG meetings and that could be pursued within the framework of the HXSWG. This document will serve as a basis for a future working group report. It will be continuously updated by the ttH conveners, including input and feedback from the whole ttH group.

Oct 20 Signal modeling in ttH ( Indico)

Impact of signal modeling on ttH searches: At present MC signal simulations are not a serious source of uncertainty in ttH searches. Nevertheless a good understanding of ttH MC systematics will become relevant in the context of Top Yukawa measurements at higher luminosity. In order to prioritize the needs for future theory developments, we urge the experimental collaborations to quantify the impact of signal modelling uncertainties on the ttH signal acceptance: does it exceed the 20% level (relevant for y_t precision)? If yes, in which ttH analysis? What are the most relevant observables?

Shape uncertainties: the tails of the pT(ttH) and pT(tt) distributions, as well as the eta(ttH), N(jets), and HT(jets) distributions, show significant shape discrepancies (20% and beyond) between NLO+PS predictions based on different matching methods, scale choices, and parton showers. Such dependencies should be significantly alleviated using NLO merging methods (FxFx in aMC@NLO/Madgraph5 and MEPS@NLO in Sherpa+OpenLoops).

Scale choices: Two conventional scale choices (a “fixed” and a “dynamic” one) are used in ttH MC simulations within ATLAS. Dynamic scale choices are preferable at large pT, and in general it might be useful to recommend (a set of?) standard scale choices and appropriate prescriptions for shape uncertainty estimates based on different scale choices.

Uncertainty estimates at NLO: methodological aspects related to theory scale choices and uncertainty estimates, especially in the framework of new NLO+PS and NLO merging methods, should be discussed in the framework of the HXSWG.

Official input parameters: Standard input parameters and PDFs for the ttH signal have been requested. A list of input parameters recommended by the HXSWG can be found here. At present Mt = 172.5 ± 2.5 GeV (and no MH value) is recommended. These recommendations might change in the future. At present there is no recommendation for the electroweak input scheme to be used in the top-mass/top-Yukawa relation. (For ttH, recommendations on the MC modeling of Higgs decays might also be useful.)

Importance of jet activity: In order to assess the possible need of theory improvements in the modelling of QCD radiation, we urge the experimental collaborations to assess the relative importance of extra jet emissions, i.e. ttH+1,2,3 jets events, in the framework of specific ttH analyses. Such events could play an important role if jets resulting from top/Higgs decays are often out of acceptance.

Minimal prerequisites for reliable ttH modeling

  • NLO+PS precision

  • spin correlated top decays (off-shell top decays through smearing of on-shell tops)

Recent and ongoing theory developments

  • NLO merging for ttH+0,1 jets is available in Madgraph5_aMC@NLO and Sherpa (in combination with OpenLoops or with code by Dawson, Reina, Wackeroth). Stefan Hoeche and collaborators offered Sherpa support to ATLAS and CMS

  • weak corrections are available at parton level in Madgraph5_aMC@NLO; extension to full EW corrections ongoing. Relevant for boosted regime (-8% correction)

Future theory developments (in tentative order of priority)

  • NLO top/Higgs decays

  • ttH signal/background interferences

Nov 3 Backgrounds and uncertainties in experimental ttH, H-->bb searches ( Indico)

General considerations: tt+jets MC modeling (especially tt+b-jets) is a dominant source of uncertainty in ttH(bb) analyses at 7+8 TeV. Run 1 analyses are based either on LO ME+PS or on inclusive NLO+PS MC, and in both cases the formal accuracy of tt+jets final states is only LO. On the one hand, tt+jets MC uncertainties should be reduced by means of state-of-the-art NLO simulations. On the other hand, given the significant impact of MC uncertainties even at NLO, their estimate requires a transparent and theoretically motivated methodology. The issue of MC uncertainties is intimately connected to the methodology employed in the experimental analysis (jet-flavour categorisation, top-pT reweighting, other data-driven procedures,...) and to the subtle interplay between various levels of MC simulation (matrix elements, shower,...). In this context, as a starting point, it is highly desirable to identify and understand all essential aspects (theoretical and experimental) that are relevant for MC uncertainty estimates in ATLAS/CMS analyses, and to document them in a precise and transparent language that could facilitate the exchange between theory and experiment.

In the following we propose a first synthesis of the tt+jets MC uncertainty issues that emerged from the meeting. This also includes a detailed description of top-reweighting (in ATLAS) and other information collected after the meeting.

tt+jets categorisation for Monte Carlo uncertainty (MCU) estimate: tt+jets MC samples are split into a certain number of independent subsamples (tt+light-flavour, ttb, ttbb, ttc,...) that are defined in terms of the numbers of b- and c-jets (Nb,Nc) and/or the total number of jets (Nj). Top-decay products are typically not considered in this categorisation, and ATLAS/CMS employ different subsamples and different definitions of Nb,Nc,Nj (see the descriptions for what was used in Run 1 here). The various subsamples can be obtained from a single inclusive tt+jets generator or using dedicated generators for certain subsamples (e.g. for tt+b-jets). It is highly desirable that both experiments adopt a common categorisation approach, based on a proposal from the theory community. This requires a precise definition of:

  • Nb, Nc, Nj: which simulation level (MEs, shower, hadronisation, detector)? Which definition of flavour jets? What are the relevant pT-thresholds and cuts?
  • a definition (in terms of Nb, Nc, Nj) of the most appropriate subsamples that require independent MCUs

This standard definition should be as simple as possible and should allow for a consistent assessment of MCUs (ideally with a clean separation of perturbative/non-perturbative effects). It should also facilitate comparisons among the various MC tools on the market.

Treatment of MCUs in experimental fits: Normalisation and shape variations for each tt+jets subsample are represented in terms of independent nuisance parameters that are fitted to data together with the signal strength. Each theory uncertainty enters the fit as a prior distribution for the related nuisance parameter, and various MCUs (like the normalisation of tt+light-jets) are strongly reduced when MC predictions are fitted to data. Typically tt+HF subsamples feature the largest post-fit uncertainties. Moreover, due to the limited shape separation between the small ttH(bb) signal and the large tt+HF background, the fit tends to constrain only their combination, which is dominated by tt+HF, while the signal component remains poorly constrained.
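
As a rough illustration of how a prior-constrained nuisance parameter enters such a fit, the toy sketch below builds a binned Poisson likelihood with a signal strength mu and a single background-normalisation nuisance parameter theta with a Gaussian prior. The yields, the 50% prior width, the grid ranges and all function names are invented for this illustration; actual analyses involve many nuisance parameters and dedicated fitting tools.

```python
import math

# Hypothetical per-bin expected yields (not taken from any analysis).
SIGNAL = [1.0, 2.0, 4.0]
BACKGROUND = [100.0, 60.0, 30.0]
PRIOR_WIDTH = 0.5  # assumed 50% prior uncertainty on the background rate


def nll(mu, theta, observed):
    """Negative log-likelihood: Poisson terms plus a Gaussian prior on theta.

    theta is in units of the prior width; theta = 0 is the nominal background.
    """
    total = 0.5 * theta ** 2  # Gaussian prior (constraint) term
    for s, b, n in zip(SIGNAL, BACKGROUND, observed):
        expected = mu * s + b * (1.0 + PRIOR_WIDTH * theta)
        if expected <= 0.0:
            return float("inf")
        total += expected - n * math.log(expected)  # Poisson term up to a constant
    return total


def grid_fit(observed, steps=201):
    """Brute-force minimisation over a (mu, theta) grid; returns (mu, theta)."""
    best = (float("inf"), 0.0, 0.0)
    for i in range(steps):
        mu = -5.0 + 15.0 * i / (steps - 1)        # scan mu in [-5, 10]
        for j in range(steps):
            theta = -2.0 + 4.0 * j / (steps - 1)  # scan theta in [-2, 2]
            value = nll(mu, theta, observed)
            if value < best[0]:
                best = (value, mu, theta)
    return best[1], best[2]
```

Calling grid_fit([101.0, 62.0, 34.0]) scans pseudo-data that equal the nominal prediction for mu = 1. Because signal and background have different shapes in this toy, the fit can separate them; with similar shapes, only their combination would be constrained.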

MCU estimates in CMS, using inclusive LO ME+PS tt+jets sample (Madgraph):

  • normalisation and uncertainty of total ttbar+X cross section from NNLO
  • ad-hoc 50% rate unc. for ttbb, ttb, ttcc subsamples (uncorrelated)
  • factor-2 ren and fact scale uncertainties for subsamples with different parton multiplicity (uncorrelated): weights of events originating from tt+n-parton matrix elements are varied as alphaS^n(Q) at LO keeping fixed the total rate of tt+X => impacts shape of Nj distribution; scalings simultaneously applied in the shower to adjust for variations in the amount of ISR/FSR
  • DATA/MC reweighting of top-pT (impacts shape of leading-jet and lepton pT): a top-pT dependent correction factor K(pT) is introduced, such that MC(x)=MC*x*K(pT) yields agreement with data at x=1 for the inclusive top-pT distribution. The nuisance parameter x is varied in the range [0,2]. This induces a 20% correction and MCU in the boosted-top regime.
  • No additional merging-scale variations are applied
  • tt+c-jets contributions: for tt+c-jets (20% of background in signal region) a dedicated NLO simulation would be desirable (not yet available)
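
The scale-variation weighting described in the list above can be sketched as follows. Each event originating from a tt+n-parton matrix element is reweighted by the ratio of alphaS^n at the varied and nominal scales, and all weights are then rescaled so that the total tt+X rate is unchanged, leaving only a shape effect in the parton multiplicity. The one-loop running coupling, its reference values, the event tuple layout and the function names are assumptions made for this illustration; this is not the CMS implementation, and the accompanying shower variations are not modelled.

```python
import math


def alpha_s(q, alpha_mz=0.118, mz=91.1876, nf=5):
    """One-loop running strong coupling (an approximation chosen for brevity)."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    return alpha_mz / (1.0 + alpha_mz * b0 * math.log(q * q / (mz * mz)))


def scale_variation_weights(events, factor):
    """Return per-event weights for a scale variation Q -> factor * Q.

    events: list of (n_partons, q, weight) tuples, where n_partons is the
    number of extra matrix-element partons and q the nominal scale in GeV.
    """
    raw = [
        w * (alpha_s(factor * q) / alpha_s(q)) ** n
        for n, q, w in events
    ]
    # Rescale so that the total rate stays fixed: only the shape changes.
    norm = sum(w for _, _, w in events) / sum(raw)
    return [r * norm for r in raw]
```

For a toy input such as [(0, 200.0, 1.0), (1, 250.0, 1.0), (2, 300.0, 1.0)] and factor 2.0, the weights of higher-multiplicity events decrease relative to the tt+0-parton events while the sum of weights is preserved.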

Dominant sources of MCU in CMS: ttbb rate, top-pT reweighting, ttb and ttcc rates, MC statistics

MCU estimate in ATLAS using NLO+PS (Powheg+Pythia), ME+PS (Madgraph) and S-MC@NLO ttbb (Sherpa+OpenLoops) samples:

  • normalisation and uncertainty of total ttbar+X cross section from NNLO
  • ad-hoc uncorrelated 50% uncertainties for inclusive tt+b and tt+c cross sections
  • DATA/MC reweighting of inclusive distributions in ttbar-pT (yields correct Njet distribution) and top-pT (to correct other shapes) is applied to all tt+jets subsamples (including tt+HF). In this context, MC is compared to unfolded data, which involve a significant dependence on Pythia (and even on the employed tool) and on the related uncertainties. See more details below.
  • tt+b-jets MC predictions and uncertainties are obtained by reweighting the inclusive NLO+PS sample with a dedicated S-MC@NLO ttbb sample in the 4F scheme; in this context, various tt+b-jets subsamples (see slides) that allow for a consistent matching of the two samples are used; an independent and differential reweighting is applied to each subsample; MCUs are taken from variations of PDFs, ren/fact scales (factor-2 and kinematical), and shower parameters in the S-MC@NLO ttbb sample.
  • the employed tt+b-jets categorisation is based on the number of reconstructed b-jets at particle level (MC truth, after hadronisation). It involves pT-thresholds for B-hadrons and b-jets. Consistent matching is ensured by removing b-jets from UE and top-decay showering.
  • comparisons of NLO+PS, ME+PS and S-MC@NLO ttbb are used as a sanity check: S-MC@NLO features an excess in subsamples with “merged HF jets” (more b-hadrons in a jet). Here one should keep in mind that, in the inclusive NLO+PS and ME+PS simulations, b-quarks in tt+b-jet subsamples originate mostly from the shower (unless a small merging scale is used).
  • All comparisons in the ATLAS talk involve reweighted Powheg/Madgraph+Pythia predictions, while Sherpa+OpenLoops is not reweighted: it is a first-principles NLO MC prediction. Top/ttbar-pT reweighting significantly improves the agreement with the S-MC@NLO ttbb prediction.
  • tt+c-jets contributions: for tt+c-jets (20% of background in signal region) a dedicated NLO simulation would be desirable (not yet available)

Dominant sources of uncertainties in ATLAS: ttbb rate, top- and ttbar-pT reweighting, ttcc rate (MC statistics is also an issue)

Top reweighting and related systematics. To compensate for the mismodeling of the top and ttbar pT distributions, MC simulations are reweighted with a pT-dependent correction factor derived from data. The reweighting is applied at the level of the unfolded top-pT distribution(s), which are derived from “data” using a migration matrix obtained from “pseudo data”. In the following, as an illustration of top-reweighting (and related systematics), we sketch the approach employed by ATLAS (see http://arxiv.org/abs/1407.0371). While ATLAS performs a double-differential reweighting of top- and ttbar-pT, here we consider only top-pT. The nominal tt+jets MC sample, generated with Powheg+Pythia, is passed through detector simulation and is used to determine a reconstructed top-pT distribution (pseudo-data). The relation between the top-pT distribution in pseudo-data and the corresponding distribution at MC truth level is encoded in the migration matrix. More precisely, MC truth corresponds to the top-pT in showered (or non showered?) parton-level ttbar events within the Powheg+Pythia sample.

The migration matrix is supposed to describe pT-distortions resulting from detector-smearing and acceptance cuts, and is also sensitive to QCD radiation effects due to the different QCD-radiation dependence of the top-pT at MC-truth and reconstruction level. The reconstructed top-pT is obtained from a kinematic likelihood fit on the events, where the jets/lepton/missing ET are fitted to the ttbar hypothesis and the different permutations of jets are checked. Events with low likelihood are cut out to remove non-ttbar background. For the events passing the cut, the permutation with the highest likelihood is taken and the hadronic top-pT and leptonic top-pT are extracted. The reconstructed top-pT is typically more sensitive to QCD radiation (and related uncertainty) than the MC-truth top-pT.
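
To make the notion of a migration matrix concrete, the following minimal sketch estimates the probability of reconstructing an event in bin i given that its MC-truth top-pT lies in bin j, using paired (truth, reconstructed) values from a simulated sample. The input layout, the use of None for events removed by acceptance or likelihood cuts, and the function names are assumptions made for this illustration and do not describe the ATLAS code.

```python
import bisect


def bin_index(value, edges):
    """Return the bin index of value for ascending edges, or None if outside."""
    if value is None or value < edges[0] or value >= edges[-1]:
        return None
    return bisect.bisect_right(edges, value) - 1


def migration_matrix(pairs, edges):
    """Estimate P(reco bin i | truth bin j) from (truth_pt, reco_pt) pairs.

    reco_pt is None for events lost to acceptance cuts, so each column sums
    to the reconstruction efficiency of that truth bin rather than to one.
    """
    nbins = len(edges) - 1
    counts = [[0.0] * nbins for _ in range(nbins)]
    truth_totals = [0.0] * nbins
    for truth_pt, reco_pt in pairs:
        j = bin_index(truth_pt, edges)
        if j is None:
            continue
        truth_totals[j] += 1.0
        i = bin_index(reco_pt, edges)
        if i is not None:
            counts[i][j] += 1.0
    return [
        [counts[i][j] / truth_totals[j] if truth_totals[j] else 0.0
         for j in range(nbins)]
        for i in range(nbins)
    ]
```

Off-diagonal entries describe bin-to-bin migrations from smearing and radiation effects, while the deficit of each column sum with respect to one reflects the events lost to the selection.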

Finally, using the Powheg+Pythia based migration matrix, the top-pT distribution reconstructed from real data is converted into an unfolded top-pT distribution. The latter is used to reweight the Powheg+Pythia pT-distribution at MC-truth parton level by a factor rw(x_i,pT)=f(x_i,pT)/MC(pT), where MC(pT) is the MC prediction, while f(x_i,pT) denotes the unfolded distribution. The variables x_i=(x_1,x_2,...) parametrise the dependence of the migration matrix on the various relevant uncertainties, and x_i=0 corresponds to the nominal prediction.
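
A minimal sketch of the nominal reweighting factor rw(0,pT)=f(0,pT)/MC(pT), evaluated bin by bin and applied to events according to their MC-truth top-pT, could look as follows. The binning, the treatment of empty bins and of events outside the measured range, and the function names are assumptions made for this illustration; the x_i dependence and the double-differential reweighting in ttbar-pT are not included.

```python
import bisect


def reweighting_factors(unfolded, mc_truth):
    """Per-bin rw = f / MC for the nominal case (all x_i = 0)."""
    if len(unfolded) != len(mc_truth):
        raise ValueError("distributions must share the same binning")
    # Bins with no MC prediction are left unweighted (an arbitrary choice).
    return [f / m if m > 0.0 else 1.0 for f, m in zip(unfolded, mc_truth)]


def event_weight(truth_pt, edges, factors):
    """Look up the weight for one event from its MC-truth top pT."""
    if truth_pt < edges[0] or truth_pt >= edges[-1]:
        return 1.0  # outside the measured range: leave the event unchanged
    return factors[bisect.bisect_right(edges, truth_pt) - 1]
```

With toy inputs such as unfolded = [102.0, 48.0, 8.0] and mc_truth = [100.0, 50.0, 10.0], the weights fall with increasing pT, which softens the MC top-pT spectrum towards the unfolded data.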

Each independent experimental uncertainty (b-tagging, jet energy scale, etc.) is described by a corresponding x_i variation and a related variation of rw(x_i,pT). All sources of uncertainty could in principle be propagated to the full simulation (MC + x_i dependent reweighting + detector simulation + x_i dependent top reconstruction) in a correlated way, such that x_i variations tend to cancel in the reconstructed top-pT distribution, and the latter always agrees with data within statistical uncertainties (note that the final ttH(bb) fit is based on reconstruction level!). However, since top-reweighting is based on 7 TeV data (and related detector calibration + uncertainties), x_i variations in top-reweighting and top-reconstruction (at 8 TeV) cannot be correlated. In practice, the nominal reconstruction (without x_i variations) is always employed. This tends to overestimate x_i uncertainties.

The MC Generator uncertainty is encoded in a modified reweighting rw’(x_i,pT)=f’(x_i,pT)/MC(pT), where unfolded data f’(x_i,pT) are based on a migration matrix obtained from an alternative generator (MC@NLO). This reweighting is defined for (and applied to) the default MC prediction, MC(pT). The uncertainty associated to the rw-rw’ difference is not correlated since the alternative generator is never used for the simulation.

The ISR/FSR systematic is evaluated in a completely different way. Pseudo-data generated with two MC simulations with ISR/FSR variations (up/down) are unfolded with the nominal migration matrix, and the relative effect with respect to the central MC prediction is used as a systematic. Such ISR/FSR variations shift the unfolded top-pT(ttbar-pT) distribution by about 5%(15%). These variations are not correlated with corresponding ISR/FSR uncertainties of the tt+jets MC sample (for which only the nominal Pythia settings are used). Thus they do not cancel out when the reweighted sample is passed through detector simulation and the tops are reconstructed.

The reweighting of the inclusive top-pT distribution (and related uncertainties) is applied to all tt+n-jet subsamples in a fully correlated way. In particular, tt+HF final states are also reweighted with the same top-pT correction factor. This procedure is supported by the observation (in ATLAS MC studies) that, in tt+b-jets subsamples, reweighted Powheg+Pythia and Madgraph+Pythia predictions for the top/ttbar-pT distributions are in better agreement with S-MC@NLO ttbb than non-reweighted ones.

Electroweak contributions to ttbb: it was pointed out that pp->ttbb might receive significant tree-level EW contributions of order alpha^2*alphaS^2. This should be checked.

Nov 10 Theory perspectives on tt+jets and tt+HF production ( Indico)

Nov 24 Backgrounds and uncertainties in ttH, H-->gamma gamma ( Indico)

Dec 1 Backgrounds and uncertainties in ttH, H-->multileptons ( Indico)

Dec 15 Signal modeling in tHq ( Indico)

Jan 12 Backgrounds and uncertainties in tHq

Feb 2 ttH Combination: Systematics and correlations

-- StefanoPozzorini - 27 Oct 2014

Topic revision: r9 - 2014-11-16 - StefanoPozzorini