In the following we summarise some aspects that have been pointed out during the ttH HXSWG meetings and that could be pursued within the framework of the HXSWG. This document will serve as a basis for a future working group report. It will be continuously updated by the ttH conveners, including input and feedback from the whole ttH group.
Impact of signal modeling on ttH searches: At present MC signal simulations are not a serious source of uncertainty in ttH searches. Nevertheless a good understanding of ttH MC systematics will become relevant in the context of Top Yukawa measurements at higher luminosity. In order to prioritize the needs for future theory developments, we urge the experimental collaborations to quantify the impact of signal modelling uncertainties on the ttH signal acceptance: does it exceed the 20% level (relevant for y_t precision)? If yes, in which ttH analysis? What are the most relevant observables?
Shape uncertainties: the tails of the pT(ttH) and pT(tt) distributions, as well as the eta(ttH), N(jets), and HT(jets) distributions, show significant shape discrepancies (20% and beyond) between NLO+PS predictions based on different matching methods, scale choices, and parton showers. Such dependencies should be significantly alleviated using NLO merging methods (FxFx in aMC@NLO/Madgraph5 and MEPS@NLO in Sherpa+OpenLoops).
Scale choices: Two conventional scale choices (a “fixed” and a “dynamic” one) are used in ttH MC simulations within ATLAS. Dynamic scale choices are preferable at large pT, and in general it might be useful to recommend (a set of?) standard scale choices and appropriate prescriptions for shape uncertainty estimates based on different scale choices.
Uncertainty estimates at NLO: methodological aspects related to theory scale choices and uncertainty estimates, especially in the framework of new NLO+PS and NLO merging methods, should be discussed in the framework of the HXSWG.
Official input parameters: Standard input parameters and PDFs for the ttH signal have been requested. A list of input parameters recommended by the HXSWG can be found here. At present Mt=172.5+-2.5 GeV (and no MH value) is recommended. These recommendations might change in the future. At present there is no recommendation for the electroweak input scheme to be used in the top-mass/top-Yukawa relation. (For ttH, recommendations on the MC modeling of Higgs decays might also be useful.)
Importance of jet activity: In order to assess the possible need for theory improvements in the modelling of QCD radiation, we urge the experimental collaborations to assess the relative importance of extra jet emissions, i.e. ttH+1,2,3 jets events, in the framework of specific ttH analyses. Such events could play an important role if jets resulting from top/Higgs decays are often out of acceptance.
Minimal prerequisites for reliable ttH modeling
NLO+PS precision
spin correlated top decays (off-shell top decays through smearing of on-shell tops)
Recent and ongoing theory developments
NLO merging for ttH+0,1 jets is available in Madgraph5_aMC@NLO and Sherpa (in combination with OpenLoops or with code by Dawson, Reina, Wackeroth). Stefan Hoeche and collaborators offered Sherpa support to ATLAS and CMS
weak corrections are available at parton level in Madgraph5_aMC@NLO; extension to full EW corrections ongoing. Relevant for boosted regime (-8% correction)
Future theory developments (in tentative order of priority)
NLO top/Higgs decays
ttH signal/background interferences
General considerations: tt+jets MC modeling (especially tt+b-jets) is a dominant source of uncertainty in ttH(bb) analyses at 7+8 TeV. Run1 analyses are either based on LO ME+PS or inclusive NLO+PS MC, and in both cases the formal accuracy of tt+jets final states is only LO. On the one hand, tt+jets MC uncertainties should be reduced by means of state-of-the-art NLO simulations. On the other hand, given the significant impact of MC uncertainties even at NLO, their estimate requires a transparent and theoretically motivated methodology. The issue of MC uncertainties is intimately connected to the methodology employed in the experimental analysis (jet-flavour categorisation, top-pT reweighting, other data-driven procedures,...) and to the subtle interplay between various levels of MC simulation (matrix elements, shower,...). In this context, as a starting point, it is highly desirable to identify and understand all essential aspects (theoretical and experimental) that are relevant for MC uncertainty estimates in ATLAS/CMS analyses, and to document them in a precise and transparent language that could facilitate the exchange between theory and experiment.
In the following we propose a first synthesis of the tt+jets MC uncertainty issues that emerged from the meeting. This also includes a detailed description of top-reweighting (in ATLAS) and other information that has been collected after the meeting.
tt+jets categorisation for Monte Carlo uncertainty (MCU) estimate: tt+jets MC samples are split into a certain number of independent subsamples (tt+light-flavour, ttb, ttbb, ttc,...) that are defined in terms of the numbers of b- and c-jets (Nb,Nc) and/or the total number of jets (Nj). Top-decay products are typically not considered in this categorisation, and ATLAS/CMS employ different subsamples and different definitions of Nb,Nc,Nj (see the descriptions for what was used in Run 1 here). The various subsamples can be obtained from a single inclusive tt+jets generator or using dedicated generators for certain subsamples (e.g. for tt+b-jets). It is highly desirable that both experiments adopt a common categorisation approach, based on a proposal from the theory community. This requires a precise definition of:
This standard definition should be as simple as possible and should allow for a consistent assessment of MCUs (ideally with a clean separation of perturbative/non-perturbative effects). It should also facilitate comparisons among the various MC tools on the market.
Treatment of MCUs in experimental fits: Normalisation and shape variations for each tt+jets subsample are represented in terms of independent nuisance parameters that are fitted to data together with the signal strength. Each theory uncertainty enters the fit as a prior distribution for the related nuisance parameter, and various MCUs (like the normalisation of tt+light-jets) are strongly reduced when MC predictions are fitted to data. Typically tt+HF subsamples feature the largest post-fit uncertainties. Moreover, due to the limited shape separation between the small ttH(bb) signal and the large tt+HF background, the fit tends to constrain only their combination, which is dominated by tt+HF, while the signal component remains poorly constrained.
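As an illustration of how a theory prior enters such a fit, the following minimal Python sketch writes down a binned Poisson likelihood with one signal strength and one background-normalisation nuisance parameter constrained by a Gaussian prior. It is a toy under stated assumptions, not the ATLAS or CMS fit model: the function name nll, the parameter sigma_bkg (relative prior width) and all numbers are hypothetical.

```python
import math

def nll(mu, theta, data, signal, background, sigma_bkg):
    """Negative log-likelihood (up to constants) of a toy binned fit.

    mu        : signal strength
    theta     : nuisance parameter in units of its prior width
    sigma_bkg : assumed relative prior uncertainty on the background rate
    """
    total = 0.5 * theta ** 2  # unit Gaussian prior on theta
    for n, s, b in zip(data, signal, background):
        expected = mu * s + (1.0 + sigma_bkg * theta) * b
        if expected <= 0.0:
            return math.inf
        total += expected - n * math.log(expected)  # Poisson term
    return total

# Hypothetical two-bin example: small signal on top of a large background.
signal = [1.0, 2.0]
background = [100.0, 50.0]
data = [101.0, 52.0]  # corresponds to mu = 1, theta = 0

best = nll(1.0, 0.0, data, signal, background, sigma_bkg=0.5)
shifted_mu = nll(3.0, 0.0, data, signal, background, sigma_bkg=0.5)
shifted_theta = nll(1.0, 1.0, data, signal, background, sigma_bkg=0.5)
```

When the signal and background templates have similar shapes, changes in mu can be largely compensated by changes in theta, which is the degeneracy described above.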
MCU estimates in CMS, using inclusive LO ME+PS tt+jets sample (Madgraph):
Dominant sources of MCU in CMS: ttbb rate, top-pT reweighting, ttb and ttcc rates, MC statistics
MCU estimate in ATLAS using NLO+PS (Powheg+Pythia), ME+PS (Madgraph) and S-MC@NLO ttbb (Sherpa+OpenLoops) samples:
Dominant sources of uncertainties in ATLAS: ttbb rate, top- and ttbar-pT reweighting, ttcc rate (MC statistics is also an issue)
Top reweighting and related systematics. To compensate for the mismodeling of the top and ttbar pT distributions, MC simulations are reweighted with a pT-dependent correction factor derived from data. The reweighting is applied at the level of the unfolded top-pT distribution(s), which are derived from “data” using a migration matrix obtained from “pseudo data”. In the following, as an illustration of top-reweighting (and related systematics) we sketch the approach employed by ATLAS (see http://arxiv.org/abs/1407.0371). While ATLAS performs a double-differential reweighting of top- and ttbar-pT, here we consider only top-pT. The nominal tt+jets MC sample, generated with Powheg+Pythia, is passed through detector simulation and is used to determine a reconstructed top-pT distribution (pseudo-data). The relation between the top-pT distribution in pseudo-data and the corresponding distribution at MC truth level is encoded in the migration matrix. More precisely, MC truth corresponds to the top-pT in showered (or non showered?) parton-level ttbar events within the Powheg+Pythia sample.
The migration matrix is supposed to describe pT-distortions resulting from detector-smearing and acceptance cuts, and is also sensitive to QCD radiation effects due to the different QCD-radiation dependence of the top-pT at MC-truth and reconstruction level. The reconstructed top-pT is obtained from a kinematic likelihood fit on the events, where the jets/lepton/missing ET are fitted to the ttbar hypothesis and the different permutations of jets are checked. Events with low likelihood are removed to suppress non-ttbar background. For the events passing the cut, the permutation with the highest likelihood is taken and the hadronic top-pT and leptonic top-pT are extracted. The reconstructed top-pT is typically more sensitive to QCD radiation (and related uncertainty) than the MC-truth top-pT.
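The role of the migration matrix can be made concrete with a small Python sketch. It assumes that each simulated event is summarised by a hypothetical pair (truth bin, reconstructed bin), with None marking events that fail the acceptance or likelihood cut. The function names and the binning are illustrative and do not correspond to the ATLAS implementation.

```python
def migration_matrix(pairs, n_bins):
    """Estimate M[r][t] = P(reco bin r | truth bin t) from simulated events.

    pairs: iterable of (truth_bin, reco_bin); reco_bin is None when the
    event is lost to acceptance or to the likelihood cut.
    """
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    truth_totals = [0.0] * n_bins
    for truth_bin, reco_bin in pairs:
        truth_totals[truth_bin] += 1.0
        if reco_bin is not None:
            counts[reco_bin][truth_bin] += 1.0
    return [
        [counts[r][t] / truth_totals[t] if truth_totals[t] else 0.0
         for t in range(n_bins)]
        for r in range(n_bins)
    ]

def fold(matrix, truth_hist):
    """Map a truth-level histogram to the expected reconstructed histogram."""
    return [
        sum(row[t] * truth_hist[t] for t in range(len(truth_hist)))
        for row in matrix
    ]

# Hypothetical toy sample with two top-pT bins.
pairs = [(0, 0), (0, 0), (0, 1), (0, None), (1, 1), (1, 1)]
matrix = migration_matrix(pairs, n_bins=2)
reco = fold(matrix, [4.0, 2.0])
```

Columns that sum to less than one encode the acceptance loss, while off-diagonal entries encode bin-to-bin migration. Unfolding amounts to inverting this mapping, which in practice requires a regularised procedure.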
Finally, using the Powheg+Pythia based migration matrix, the top-pT distribution reconstructed from real data is converted into an unfolded top-pT distribution. The latter is used to reweight the Powheg+Pythia pT-distribution at MC-truth parton level by a factor rw(x_i,pT)=f(x_i,pT)/MC(pT), where MC(pT) is the MC prediction, while f(x_i,pT) denotes the unfolded distribution. The variables x_i=(x_1,x_2,...) parametrise the dependence of the migration matrix on the various relevant uncertainties, and x_i=0 corresponds to the nominal prediction.
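The nominal reweighting factor rw(0,pT)=f(0,pT)/MC(pT) can be sketched in Python as a bin-wise ratio that is then looked up per event. The bin edges, histogram contents and function names below are hypothetical. Falling back to a weight of 1 in empty MC bins and clamping out-of-range pT values to the nearest bin are assumptions of this sketch, not part of the procedure described above.

```python
from bisect import bisect_right

def reweighting_factors(unfolded, mc_truth):
    """Bin-wise rw = f / MC; falls back to 1 where the MC bin is empty."""
    return [f / m if m > 0.0 else 1.0 for f, m in zip(unfolded, mc_truth)]

def event_weight(top_pt, edges, factors):
    """Look up the weight for a truth-level top pT (same units as edges)."""
    index = bisect_right(edges, top_pt) - 1
    index = max(0, min(index, len(factors) - 1))  # clamp under/overflow
    return factors[index]

# Hypothetical binning in GeV and toy histogram contents.
edges = [0.0, 100.0, 200.0, 400.0]
unfolded_data = [110.0, 90.0, 40.0]  # f(0, pT)
mc_truth = [100.0, 100.0, 50.0]      # MC(pT)

factors = reweighting_factors(unfolded_data, mc_truth)
low_pt_weight = event_weight(50.0, edges, factors)
high_pt_weight = event_weight(250.0, edges, factors)
```

In this toy example the weights fall below 1 at high pT, so the reweighting softens the simulated top-pT spectrum.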
Each independent experimental uncertainty (b-tagging, jet energy scale, etc.) is described by a corresponding x_i variation and a related variation of rw(x_i,pT). All sources of uncertainty could in principle be propagated to the full simulation (MC + x_i dependent reweighting + detector simulation + x_i dependent top reconstruction) in a correlated way, such that x_i variations tend to cancel in the reconstructed top-pT distribution, and the latter always agrees with data within statistical uncertainties (note that the final ttH(bb) fit is based on reconstruction level!). However, since top-reweighting is based on 7 TeV data (and related detector calibration + uncertainties), x_i variations in top-reweighting and top-reconstruction (at 8 TeV) cannot be correlated. In practice, the nominal reconstruction (without x_i variations) is always employed. This tends to overestimate x_i uncertainties.
The MC generator uncertainty is encoded in a modified reweighting rw’(x_i,pT)=f’(x_i,pT)/MC(pT), where the unfolded data f’(x_i,pT) are based on a migration matrix obtained from an alternative generator (MC@NLO). This reweighting is defined for (and applied to) the default MC prediction, MC(pT). The uncertainty associated with the rw-rw’ difference is not correlated since the alternative generator is never used for the simulation.
The ISR/FSR systematic is evaluated in a completely different way. Pseudo-data generated with two MC simulations with ISR/FSR variations (up/down) are unfolded with the nominal migration matrix, and the relative effect with respect to the central MC prediction is used as a systematic. Such ISR/FSR variations shift the unfolded top-pT(ttbar-pT) distribution by about 5%(15%). These variations are not correlated with corresponding ISR/FSR uncertainties of the tt+jets MC sample (for which only the nominal Pythia settings are used). Thus they do not cancel out when the reweighted sample is passed through detector simulation and the tops are reconstructed.
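The relative effect of the ISR/FSR variations can be sketched in Python as a bin-wise relative shift of the unfolded distributions with respect to the nominal one. The inputs are assumed to be already unfolded with the nominal migration matrix. Taking the larger of the up and down shifts as a symmetric systematic is an assumption of this sketch, and all names and numbers are hypothetical.

```python
def relative_shift(varied, nominal):
    """Bin-wise (varied - nominal) / nominal; 0 where the nominal bin is empty."""
    return [(v - n) / n if n else 0.0 for v, n in zip(varied, nominal)]

def isr_fsr_systematic(unfolded_up, unfolded_down, unfolded_nominal):
    """Symmetrised relative systematic per bin (assumed envelope prescription)."""
    up = relative_shift(unfolded_up, unfolded_nominal)
    down = relative_shift(unfolded_down, unfolded_nominal)
    return [max(abs(u), abs(d)) for u, d in zip(up, down)]

# Hypothetical unfolded top-pT histograms (two bins).
nominal = [100.0, 50.0]
isr_fsr_up = [105.0, 55.0]
isr_fsr_down = [96.0, 47.0]

systematic = isr_fsr_systematic(isr_fsr_up, isr_fsr_down, nominal)
```

Because these shifts are derived independently of the tt+jets sample settings, they are added as a separate uncertainty rather than cancelling against the sample's own ISR/FSR variations.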
The reweighting of the inclusive top-pT distribution (and related uncertainties) is applied to all tt+n-jet subsamples in a fully correlated way. In particular, tt+HF final states are also reweighted with the same top-pT correction factor. This procedure is supported by the observation (in ATLAS MC studies) that, in tt+b-jets subsamples, reweighted Powheg+Pythia and Madgraph+Pythia predictions for the top/ttbar-pT distributions are in better agreement with S-MC@NLO ttbb than non-reweighted ones.
Electroweak contributions to ttbb: it was pointed out that pp->ttbb might receive significant tree-level EW contributions of order alpha^2*alphaS^2. This should be checked.
-- StefanoPozzorini - 27 Oct 2014