UnfoldingTopChargeAsymmetry (revision 28)

Review of TOP -21-XXX


  • CADI Line:
  • AN Note: AN-2021-069
  • HN Forum:
  • Stat Questionnaire: Link to statistics questionnaire
  • Combine workspace and data cards location (please include tag/branch if applicable and make sure it's readable CMS-wide):
  • Analysis code repository (please include tag/branch if applicable and make sure it's readable CMS-wide): here.

(Analysts: Please do not use .png format for plots on the twiki-- they do not show up on Safari browsers.)

Color code for answers to reviewer questions:

  • Green -- we agree, changes to analysis/documentation implemented.
  • Lime -- we agree, but the item hasn't been done yet. (Open item.)
  • Red -- we disagree, changes to analysis/documentation are not implemented.
  • Teal -- we agree, but we don't think any change to analysis/documentation is needed.
  • Blue -- authors/ARC/conveners need to discuss. (Open item.)

Explicit green-lights from experts

Category Name Status
Conveners   Not done.
PPD   Not done.
MUO Not done.
BTV   Not done.

Object Review

Comments from Conveners

comments from Jan and Matteo (Oct 12th)

  • In your selection, can a jet be both t-tagged and W-tagged? Or do you first check for t-tags and subsequently for W-tags?
    • The conditions for an AK8 jet to be t-tagged or W-tagged are exclusive of each other and a jet cannot fulfill both. We have added more information in the selection section to hopefully make this clearer.
Hugo to correct the definition of W and t tag

  • For clarification: for leptons you use an ID not requiring isolation (as it should be). Both are cut based; are there more powerful IDs around that you could use in case the analysis would profit (e.g. the ttbar reconstruction)?
    • We compared with the MVA-based ID a while ago, and the cut-based ID performed better. Titas will check the exact MVA cuts now. Cecilia thinks the MVA-based selection always includes isolation, so we do not use it.

  • (303): MET: not too important now, but please don’t forget to change it to “missing transverse momentum” for the paper
    • Titas will add this

  • (315): HEM: is the PU distribution affected by this? Do you have control plots showing that? (This might affect more analyses, so please feel free to just refer to a check here)
    • We haven't checked the PU distribution, will point to a B2G paper (Titas).

  • Section 4 in general: having in mind the upcoming review process, it would be good if you could motivate the choices, categories, and strategy etc a bit better right in the beginning; right now, it is hard to follow. Some comments below are related to that

  • L 348: How do the AK8 jets and the AK4 selection relate? When do you pick which combinations?
    • We have changed the description of the event selection and categorization and hope this is clearer now. Only AK4 jets that are at dR>0.8 from a top or W tag are considered for jet assignment.
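The AK4 cleaning described in this answer can be sketched as follows. This is a minimal illustration, not the analysis code: the function names and the (eta, phi) tuple representation are hypothetical, and only the dR > 0.8 threshold comes from the text.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance dR = sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    d_phi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, d_phi)

def clean_ak4_jets(ak4_jets, tagged_ak8_jets, min_dr=0.8):
    """Keep AK4 jets farther than min_dr from every t- or W-tagged AK8 jet.

    Jets are (eta, phi) tuples; this representation is an assumption of the sketch.
    """
    return [
        jet for jet in ak4_jets
        if all(delta_r(jet[0], jet[1], tag[0], tag[1]) > min_dr
               for tag in tagged_ak8_jets)
    ]

# Hypothetical example: two of the three AK4 jets overlap with a tagged AK8 jet.
ak4 = [(0.1, 0.2), (1.5, -2.0), (0.3, 3.1)]
tagged = [(0.0, 0.0), (0.3, -3.1)]
print(clean_ak4_jets(ak4, tagged))
```

The third AK4 jet is removed only because the phi difference is wrapped across the +pi/-pi boundary, which is the detail most easily missed in this kind of overlap removal.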

  • L357: Lepton 2D cuts. You claim that the QCD background is reduced strongly. Do you have control plots (in general a few more control plots would help)
    • Titas will add the reference to the 2D lepton cut section.

  • Fig.1: could you include more details in the relevant range (around 30, where you cut)?
    • Hugo will add plot to show below 30.

  • All figures: keep in mind that for the paper version of all plots you will need to increase the legend sizes (not needed for AN imho). The rest already looks good in terms of sizes etc. I would also recommend choosing different colours. A great way to do this is using .
    • Hugo will change: ttbar and others- two reds, green for Wjets and blue for everything else

  • L398: defining these categories (even if you don’t use them explicitly) in the beginning of Section 4 would help the flow of the whole section
    • We have moved them to the end of the event selection (Sec 4.1) before the kinematic reconstruction.

  • L401: in the boosted jet category, is it strictly necessary to require 0 W-tagged jets? Maybe you can check (for signal and main backgrounds) how many events have 1 t-tagged jet and one or more W-tagged jet. Maybe it’s totally negligible, but you may gain some events by being inclusive on the number of W-tagged jets in this category.
    • The t-tag and W-tag conditions are mutually exclusive (the softdrop mass windows cover different ranges), so a jet cannot satisfy both. We also do not expect more than one hadronic top in our sample, and it would be reconstructed as either boosted (t-tag) or semi-resolved (W-tag), not both. We would not expect to gain anything except some all-hadronic candidates, which we do not want in our sample.

  • Section 5: Also here, starting by stating that you are measuring in the full and fiducial phase space defined by XYZ would improve clarity
    • TBD

  • L442: you say that the priors for the nuisance parameters follow a log-normal distribution. This is correct for normalisation parameters, but in all other cases I believe one should use a normal distribution. Is there any specific reason for this choice?
    • Titas will fix this in the AN
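The distinction raised in this question can be illustrated with a small sketch of how a nuisance parameter theta scales a yield under each choice. The functional forms are the conventional ones for a normalisation effect (multiplicative kappa**theta for a log-normal, linear 1 + sigma*theta for a normal); the function names and the 30% example size are illustrative assumptions and are not taken from the AN or the data cards.

```python
def lognormal_yield(nominal, kappa, theta):
    """Yield under a log-normal constraint: scales as kappa**theta, always positive."""
    return nominal * kappa ** theta

def normal_yield(nominal, sigma, theta):
    """Yield under a normal constraint: scales linearly in theta and can go negative."""
    return nominal * (1.0 + sigma * theta)

# Hypothetical 30% effect on a nominal yield of 100 events.
nominal = 100.0
for theta in (-4.0, -1.0, 0.0, 1.0):
    print(theta,
          lognormal_yield(nominal, 1.30, theta),
          normal_yield(nominal, 0.30, theta))
```

For a normalisation parameter the log-normal form keeps the predicted yield positive for any theta, while the linear form turns negative at theta = -4 in this example.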

  • L448: in the formula, where do you account for correlations between nuisance parameters (e.g. from years)?
    • We take the correlations across the years into account in the combine data card as well. Not sure how we will do this in the formula.

  • About the previous questions, have you had your data cards reviewed yet?
    • We will submit them for review.

  • L455: In the next presentation in the TMP meeting, I think we should have a discussion about visible and full phase space again, and how exactly the extrapolation etc. is done.
    • yes, that will be helpful

  • Section 5.1: it is not quite clear why this is necessary. In the end you don’t have a choice w.r.t. ‘forward’ versus ‘backward’ in terms of binning, and also the binning is very coarse. This is also reflected in the high and continuous purity and stability, and the fact that you can easily do it without regularisation for the forward-backward unfolding. It is not clear from this section though, what happens to migrations between the mttbar bins, and how they relate. This would be the more important question and should be described.
    • Since we are calculating the Ac for both the "forward" and "backward" as one number, we are taking the migrations between the two into account. We initially included the section as all the work was done here, we can put it in an appendix if it is more suitable.
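For reference, a charge asymmetry built from the two delta|y| bins can be computed as below. The sketch assumes the conventional definition Ac = (N+ - N-)/(N+ + N-), with N+ (N-) the yield at delta|y| > 0 (< 0), and a statistical uncertainty for independent Poisson counts. The function name and the example yields are hypothetical, and the unfolding itself is not shown.

```python
import math

def charge_asymmetry(n_forward, n_backward):
    """Return (Ac, statistical uncertainty) from yields with delta|y| > 0 and < 0.

    Assumes independent Poisson counts, so that error propagation gives
    sigma_Ac = 2 * sqrt(N+ * N-) / (N+ + N-)**1.5.
    """
    total = n_forward + n_backward
    if total <= 0:
        raise ValueError("total yield must be positive")
    ac = (n_forward - n_backward) / total
    sigma = 2.0 * math.sqrt(n_forward * n_backward) / total ** 1.5
    return ac, sigma

# Hypothetical yields, for illustration only.
ac, sigma = charge_asymmetry(5100.0, 4900.0)
print(ac, sigma)
```

Because both bins enter a single ratio, migrations between "forward" and "backward" change Ac directly, which is why they have to be corrected for before the asymmetry is quoted.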

  • Figure 7: it looks to me like purity and stability are almost the same in every bin. I am not saying this is wrong, but it would be great if you could double-check this result
    • We have double checked this already and this is correct.
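A minimal sketch of how purity and stability can be obtained from a migration matrix is given below. Conventions differ between analyses; here purity is taken as the fraction of events reconstructed in a bin that were also generated there, and stability as the fraction of events generated in a bin that are also reconstructed there. The index convention, function name, and numbers are assumptions of the sketch, not the AN's.

```python
def purity_and_stability(migration):
    """Per-bin purity and stability from a square migration matrix.

    migration[g][r] = number of events generated in bin g and reconstructed in bin r
    (this index convention is an assumption of the sketch).
    purity[i]    = N(gen i and reco i) / N(reco i)
    stability[i] = N(gen i and reco i) / N(gen i)
    """
    n = len(migration)
    purity, stability = [], []
    for i in range(n):
        n_reco = sum(migration[g][i] for g in range(n))
        n_gen = sum(migration[i][r] for r in range(n))
        purity.append(migration[i][i] / n_reco if n_reco else 0.0)
        stability.append(migration[i][i] / n_gen if n_gen else 0.0)
    return purity, stability

# Hypothetical two-bin example (delta|y| < 0 and > 0) with symmetric migrations.
migration = [[800.0, 200.0],
             [200.0, 800.0]]
print(purity_and_stability(migration))
```

When the migration matrix is symmetric, row and column sums coincide and purity equals stability in every bin, which is one way the pattern noted in the question can arise.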

  • Table 13: when you say “Shape” you mean that the normalisation component is removed? I would advise against doing this in the visible phase space: the uncertainty should not have a normalisation effect in the full phase space, but due to acceptance effect you will have a genuine normalisation effect in your visible phase space
    • Hugo will fix this table
  • In the same table, hdamp is classified as a “Normalisation” uncertainty. Is it a mistake in the table?
    • Hugo will fix this

  • Do I understand correctly that the JES are still not split in this version of the fit? What else is missing?
    • We are running the JEC uncertainty based on the different sources - there are more than 10, so it will take a few more weeks. JER is missing from the note as well, but we already have it and it will be added.

  • Which distributions exactly are used as input to Combine? These should be in the main body of the AN
    • the delta|y| is used as an input. Hugo: The plots will be added, and we will be mentioning it after Sec 4.3

  • It looks to me like the MC stat uncertainty is quite large in your fit. This is clear from the impact plots, and most likely is also responsible for the strong constraint on hdamp. Would that be possible to a) See the effect of hdamp, on your fit distributions (nominal vs up vs down), including the MC statistical uncertainty b) Re-bin some of the distribution in order to reduce the effect
    • hdamp up/down plots will be added. We have only two bins, so rebinning is not an option.

  • Why is the effect of the top pt re-weighting two-sided? With one-sided uncertainties it’s common practice to set the down (or, equivalently, up) variation equal to the nominal. In this way the impact will be one sided, and one side of the uncertainty will be unconstrained
    • Yes, we will fix this.
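The convention described in the question can be sketched as follows, assuming histograms are plain lists of bin contents. The function names and the numbers are hypothetical, and the sketch does not say which histogram plays the role of the variation in the actual top pT reweighting setup.

```python
def one_sided_templates(nominal, varied):
    """Build (up, down) templates for a one-sided systematic.

    The 'up' template is the varied histogram; the 'down' template is set equal
    to the nominal one, so only one side of the nuisance changes the prediction.
    """
    if len(nominal) != len(varied):
        raise ValueError("histograms must have the same binning")
    return list(varied), list(nominal)

def symmetrized_templates(nominal, varied):
    """For comparison: mirror the variation around the nominal (two-sided)."""
    if len(nominal) != len(varied):
        raise ValueError("histograms must have the same binning")
    up = list(varied)
    down = [2.0 * n - v for n, v in zip(nominal, varied)]
    return up, down

# Hypothetical two-bin histograms.
nominal = [100.0, 80.0]
varied = [110.0, 72.0]
print(one_sided_templates(nominal, varied))
print(symmetrized_templates(nominal, varied))
```

With the one-sided templates the prediction does not move for negative values of the nuisance parameter, so that side of the impact vanishes and cannot be constrained by the fit.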

comments from Robert Schoefbeck

  • L73 Ref to the theory predictions?
    • added

  • L104 13 TeV I hope ...
    • fixed

  • Sec 4.3.2. / 4.3.3 - Please explain if/how the measurement regions for the top and W SF differ from the analysis selection and whether or not it is a concern.
    • The description of the control regions used for the mistag measurement is available in Appendix C. The selection is the same as for the signal region except for an inverted cut on the leptonic term of the chisq discriminator and a veto on b-tagged jets. We believe that the regions are similar enough in phase space to be relevant, and at the same time exclusive of the signal region because of the inverted chisq cut and the b-tag veto. We have added some text in the systematics section to make it easier to follow. This method has been used before and we are confident that it is of no concern.

  • Sec. 4.4 I assume the 2017 Met EE fix is applied?
    • yes our analysis selection uses jets with pT>50 GeV and in range [-2.5,2.5] eta. So this is taken care of.

  • L338 What motivates the dR>1.2 cut (as opposed to 0.8)?
    • this is a typo, now fixed to 0.8

  • L402 please fix
    • fixed

  • Unfolding: Are weight-based inefficiencies accounted for in TUnfold? Please see AN-19-227 Sec. 10 for a detailed explanation. (Please note that a perfect closure when unfolding the MC to itself is NOT a check if the weight-based inefficiencies related to reco level objects are incorrectly applied at the parton level where the closure is checked; please clarify what is done in Fig. 6 in this respect)
    • We are following the procedure used in AN-17-130 and believe it is taking correct care of the weight-based inefficiencies related to reco level objects as we never cut any MC event, the weights are just carried along and might be very small but all events are kept.

  • Please compute the condition number of the unfolding matrix (see Statcom twiki).
    • Done. As recommended by the Statistics Committee, we use the singular values instead of the Condition() method. The condition number is ~10 for all the response matrices, which, together with the very small tau values, allows us to proceed without regularization. We updated the AN.
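A minimal sketch of a condition number computed from singular values is shown below, restricted to a 2x2 matrix so that it needs no linear-algebra library. The 2x2 restriction, the function name, and the example matrix are assumptions for illustration; the response matrices of the AN and the quoted value of about 10 are not reproduced here.

```python
import math

def condition_number_2x2(matrix):
    """Condition number sigma_max / sigma_min of a 2x2 matrix.

    The singular values are the square roots of the eigenvalues of A^T A.
    For a 2x2 matrix these follow from trace(A^T A) = sum of squared entries
    and det(A^T A) = det(A)**2.
    """
    (a, b), (c, d) = matrix
    trace = a * a + b * b + c * c + d * d
    det = (a * d - b * c) ** 2
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    sigma_max = math.sqrt((trace + disc) / 2.0)
    sigma_min = math.sqrt(max((trace - disc) / 2.0, 0.0))
    if sigma_min == 0.0:
        return math.inf  # singular matrix: unfolding by inversion is ill-defined
    return sigma_max / sigma_min

# Hypothetical normalised response with 20% migration between the two bins.
response = [[0.8, 0.2],
            [0.2, 0.8]]
print(condition_number_2x2(response))
```

A ratio close to 1 indicates a nearly diagonal response, while a very large ratio signals that small fluctuations in the measured spectrum would be strongly amplified by the inversion.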

  • Figure 4 - please add statistical error bars.
    • done

  • Is Fig. 4d from 2017? The lumi value says otherwise.

    • Please remember that the 2017 electron channel does not include Run 2B and therefore has less integrated luminosity. It just happens to be very close to the 2016 luminosity, but not exactly the same. We have added a note in the section to remind the readers of this and hopefully avoid confusion.

  • Please indicate in the caption which row is which. In Fig. 4b,d the stability is asymmetrical, but in opposite ways. Why is it so? Is it significant? Is it numerically relevant for the result?
    • We have improved the caption. We do not know why the up/down variations are not symmetric in the last bins but we have very few events there and also, given the statistical errors, the impact is likely not significant.

  • L471 please fix
    • done

  • L480 Numbers should be justified, i.e., based on previous measurements and, maybe, inflated with extrapolation uncertainties to your measurement regions. Alternatively, can you consider measuring the background normalization in-situ by letting the nuisance float?
    • We use the same prescription as used in AN-2017-130 and assign a 30% rate uncertainty to all MC-derived backgrounds (all in our case, including the ttbar dilepton and all hadronic). We have added a reference to the note. We are not fitting for the background rates in this analysis.

  • L485 Please update to current recommendations with partial correlations.
    • done and AN updated

  • L496, 503. Afaik the Muon SF are partially correlated, e.g. the statistical component of the SF uncertainty is uncorrelated but other sources are not. Please specify whether this is so for the high pt muon ID. Please provide pointers to the exact SFs you are using.

  • We either need to see the impacts for the missing uncertainties or a study that shows they are negligible. Otherwise, we can't know whether the strategy is feasible.
    • We show here the impacts for the entire sample, and separately for boosted, semi-resolved and resolved. As you can see, top pT dominates in all, but top tag is important in the boosted one, where you expect it to be, and not in the resolved.

* Impacts for all combined

* Impacts for resolved

* Impacts for boosted

* Impacts for semi-resolved

  • L567, Fig. 9 As said during the talk, the constraint on top pt should be understood. For example, if there is a mismodelling of the acceptance in one of the categories, it could easily lead to a bias/constraint because your measurement regions roughly correspond to top-pt. I think a dedicated study of the effect of top-pt reweighting is needed. It looks like your pre-fit top_rew uncertainty is already substantially different between e+jets and mu+jets. This should be explained, I see no reason for that. Looking at the top pt shapes in the backup (thank you for adding these), I do not find the pulls so surprising. Can you show a reweighted top pt spectrum and e.g. also bin the reweighted shapes in the analysis categories?

* We have plotted the top pT for the leptonic and the hadronic top in the muon and electron channels in 2018 to compare data with the reweighted MC and its uncertainty, which is taken as the symmetrized difference between the MC ttbar pT with and without the pT reweighting correction. As can be seen, the description and the uncertainty appear adequate in our signal selection (chisq<30 and Mttbar>900 GeV); however, at high top pT there is a residual trend where the correction is smaller than the data prefer.

  • Hadronic top pT for the 2018 muon signal sample:

  • Leptonic Top pT for the 2018 muon signal sample:

  • Hadronic Top pT for the 2018 electron signal sample:

  • Leptonic Top pT for the 2018 electron signal sample:

* We show below a comparison between the data and MC with and without the top pT reweighting in the three regimes: resolved, boosted and semi-resolved. As you can see, the correction helps in all cases and the uncertainty covers the difference between the corrected MC and the data. Keep in mind that in the combined sample the main contribution comes from the resolved regime, then the boosted, with a small contribution from the semi-resolved.

  • ttbar pT Boosted:

  • ttbar pT Resolved:

  • ttbar pT Semi-resolved:

  • I suspect hdamp could provide a substantial uncertainty, will be interesting to add it. Once the missing systematics are there, it will be good to show the systematic correlation plot to learn more about the important players.
    • It does not seem to be very important as you can see in the file below that shows the main systematics for the ttbar signal. We also include the correlation plots
* Systematics for signal

  • hdamp and toppT systematics:

  • correlations:

  • Fig. 71p How is it possible that toptagUp has zero effect (Does it mean the plot is dominated by dileptonic top)?
    • Our candidate sample is dominated by the resolved top sample. We treat ttbar dilepton and all-hadronic events as backgrounds.

  • Fig. 72p Does the size of the variations make sense with the top tag SF uncertainties? Please comment on that in the main body of the AN.
    • We believe the answer to your question is the same as above: our candidate sample is dominated by the resolved top sample.
  • Please comment on ultra-legacy usage for later / paper
    • We have been approved to stay with the current samples for this publication, will move to ultra-legacy and extend to lower Mttbar regions later

  • before Eq.1 - a short discussion of the BSM models predicting modifications of Ac should be added.

    • Done

MC concerns

comment from Enrique Palencia Cortezón:

  • I suggest that for 2016 ttbar and single top, you use the CP5 samples.
    • we switched to the CP5 samples for 2016 and updated the AN accordingly

Trigger concerns

comment from Charis and Nicolò:

  • In order to ease our review and the book-keeping of all the analyses reviews, we would ask you to fill the questionnaire in the TOP trigger TWiki, in particular listing the trigger paths you are considering, the scale factors you are applying, and the relevant AN where we can find the details. fill out this questionnaire:
    • Done

Follow up from questionnaire:

  • For the muon triggers/trigger SFs we agree that the strategy is fine since you are following the recommendations and using the appropriate centrally provided SFs.

  • For the electron cross trigger SF derivation we are a bit confused by the approach you are following, since you are referring to it as 'Tag-and-Probe'. What we would like to understand is : - Are you indeed implementing a T&P approach? If this is the case we would like to understand which is the tag/probe/passing probe selection and what peak do you reconstruct in eμ events?
    • In our method that uses the ttbar e mu channel, the tag is the muon and the probe is the electron. Both leptons pass the tight cuts we use in our candidate sample selection. We do not look at the peak in the sense of the Z to dilepton tag & probe method, but we show with the plots in the appendix that the resulting sample is a very pure ttbar e, mu dilepton sample, which means that the "probe" electron should pass the electron trigger.

  • Could you be using the orthogonal dataset approach? We understand that you are using an orthogonal, to your analysis region, set of events (eμ di-lepton events) and a reference trigger path (HLTMu50) to determine the trigger efficiency. Is this the case? If so, what dataset are you using to determine the trigger efficiency, SingleMuon or SingleElectron?
    • We use the SingleMuon dataset and the muon trigger to measure the electron trigger efficiency. We changed the description in the AN to orthogonal sample to make it clearer and we added the information that we use the Single Muon dataset.
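The orthogonal-sample approach described in this answer amounts to counting, among events selected with the reference (muon) trigger, how many also fire the electron trigger. A minimal sketch is given below; the function names and event counts are hypothetical, and the simple binomial uncertainty is a simplification chosen for brevity rather than the treatment used in the AN.

```python
import math

def trigger_efficiency(n_reference, n_reference_and_probe):
    """Efficiency of the probed trigger in events selected by a reference trigger.

    Returns (efficiency, simple binomial uncertainty). The normal-approximation
    uncertainty is unreliable for small counts or efficiencies close to 0 or 1.
    """
    if n_reference <= 0:
        raise ValueError("need at least one event passing the reference trigger")
    eff = n_reference_and_probe / n_reference
    err = math.sqrt(eff * (1.0 - eff) / n_reference)
    return eff, err

def scale_factor(eff_data, err_data, eff_mc, err_mc):
    """Data/MC scale factor with uncorrelated error propagation."""
    sf = eff_data / eff_mc
    rel = math.hypot(err_data / eff_data, err_mc / eff_mc)
    return sf, sf * rel

# Hypothetical counts in one (pT, eta) bin of the e-mu selection.
eff_data, err_data = trigger_efficiency(1000, 900)
eff_mc, err_mc = trigger_efficiency(10000, 9500)
print(scale_factor(eff_data, err_data, eff_mc, err_mc))
```

Since the uncertainty scales as one over the square root of the number of reference events, sparsely populated bins give large scale-factor uncertainties, and merging bins reduces them.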

  • Also, taking a look at the trigger efficiency and SF plots we observe that there are many bins with very large uncertainties.
    • Is this a result of low stats? Have you tried further optimizing the SF binning?
    • For events that fall in the empty bins what is the SF and corresponding uncertainty that you apply?
    • Do these large electron trigger SF uncertainties have a large/ high ranking impact in your final fit?
      • We received this comment from the eGamma POG, and they suggested merging bins and then checking whether we need to add systematic uncertainties. We are in the process of doing this, but we do not expect the result to be affected: at most the trigger uncertainty will be smaller, but the SF central value will not change.

JET/MET concerns

comments from Ashley and Mikael

  • You need to fill out the survey on jet/MET use before we can begin the review [1], to save yourself time please check that what you did is consistent with the recommendations for JER [2] and JEC [3] and explain if there is any variation, as that is what I will ask about.
    • We filled out the survey and confirm that we have no departures from the recommendations.




  • L 184 : “pileup-hadrons” -> “pileup candidate” hadrons to be consistent with previous description on L 180 (same comment for L 254)
    • made the correction and updated AN

  • L 263 : Please explicitly state the JER version used as you did for JEC on L 261
    • Summer16_25nsV1*(2016v3), Fall17_17Nov2017_V3_* (2017), Autumn18_V7_* (2018). AN has been updated

BTV concerns

questions/comments from Jan and Denise:

  • Please make sure that you also apply nJets/HT-based corrections to your phase-space (omitting b-tagging requirements) as explained in our TWiki for the discriminator reshaping method [1]. Side note: If you are using UL samples, that effect might be negligible, we have not investigated this with UL. In the case of the Re-reco samples, these corrections have to be applied
    • Yes, we are applying the 2D correction (nJets/HT) to all events that pass our event selection and reconstruction, as instructed. This has been documented in an appendix.

  • When using dedicated tagging SFs for your AK8 jets, remove the AK4 subjets of these jets from the collection of jets that is considered for deriving your AK4 b-tagging SFs.
    • Yes, we don't take into account the AK4 subjets for deriving AK4 b-tagging SF's.

  • In l. 194, you say you are using the btag shape reweighting method. The description of your systematic uncertainties in l. 370 then does not match your chosen reweighting method. For this method, there are in total 8 uncertainties that are applied to jets of all flavors. The recommended correlation scheme is detailed here [2].
    • Yes, we are applying the btag shape reweighting method with all 8 uncertainties (cferr1, cferr2, hf, hfstats1, hfstats2, jes, lf, lfstats1, lfstats2) and taking into account the correlation. The AN has been updated with more details.

  • As soon as you obtained your results, we would like to have a look at the impact plot as well, to make sure that no unexpected behavior is observed for the b tagging nuisance parameters. Please consider filling in the TOP BTV questionnaire by then as well [3].
    • We added an appendix with all the details and filled out the questionnaire in [3]




EGamma concerns

questions comments from Alessia and Mohsen

  • Can you confirm that the ID that you are using is indeed the recommended Fall17v2?
    • Yes, we are using the recommended Fall17v2 ID.

  • Can you comment on the procedure that you use to remove the isolation cut from the ID definition?
    • Isolation can be applied separately from the other ID cuts and we do not apply any isolation requirement (track or calo-based) offline. We also use a trigger that does not include any isolation requirement. However, we realized that the SF is not centrally provided without isolation and remeasured it ourselves. AN was updated with the study and the SF were shown at the EGamma POG meeting on July 2nd.

  • Are you applying the reconstruction scale factors? You only mention the trigger and ID scale factors, but you should also apply the reco ones as specified here :
    • yes, we are applying the reconstruction scale factor. AN has been updated to make this clear.

  • About the electron trigger SFs, I see that you derive them yourselves, but they should be approved by the EGM POG first. In this case, if not already done, you should present the SFs in the Egamma Reco/Comm/HLT meeting and get green light from them.
    • We presented our results on June 11th and they were approved. They did comment, though, that we need to measure the ID SF without isolation ourselves, which we did and presented at the July 2nd meeting. They have been documented in the AN.

MUON concerns

  • you are using the high Pt ID, but this selection does not use the Particle-Flow algorithm. From this twiki page you can read: "The High-pT selection does not use the Particle-Flow algorithm. Please consider this option ONLY if you do not use the Particle-Flow event description in your analysis. If you do, start from the Loose (or Tight) ID and then consider possible addition (or removal) of further quality cuts." It is also true that your pT range is not that high, so using PF might be fine in your case, but we just would like to know whether you considered this or not.
    • We use Particle-Flow for all objects except for the muon. This has been made clear in the object ID section of the AN

  • It seems to us that you are not applying any ISO SFs. Since you don't have a standard ISO cut, the official ISO SFs are not suitable for your analysis and you should compute these SFs yourselves, then write this to the MUO POG.
    • Our 2D cut is really a topological cut and not an isolation cut. It has been studied with 2016 data using a ttbar dilepton (e, mu) sample and the SF was consistent with 1. See AN-2015-107, Section 5.2

  • Is there a reason why you are not considering reconstruction SFs? Are they negligible?
    • Reconstruction SFs are negligible but we are applying them. The AN has been updated.

  • Are you applying any additional uncertainty to cover the phase space extrapolation (Zs-to-ttbar) in the SF computation? You can apply 0.5% per muon on the ISO component, following results in
    • We looked at the note you suggest and the 0.5% for muons is specifically for the extrapolation of the isolation component of the SF. Because we do not apply any isolation cut on our muon, we think this is not applicable.

CWR comments to the paper (TOP-21-XXX-paper-vxx)

Comments to the pre-approval talk (AN-2021-069)

Topic attachments
I Attachment History Action Size Date Who Comment
PNGpng correlations.png r1 manage 1229.3 K 2021-07-11 - 23:06 UnknownUser hdamp and toppT systematics
PNGpng impacts_all.png r1 manage 244.5 K 2021-07-12 - 19:01 UnknownUser  
PNGpng impacts_boosted.png r1 manage 242.9 K 2021-07-12 - 19:17 UnknownUser  
PNGpng impacts_resolved.png r1 manage 244.6 K 2021-07-12 - 19:17 UnknownUser  
PNGpng impacts_semiresolved.png r1 manage 251.3 K 2021-07-12 - 19:08 UnknownUser  
PNGpng pT-had-electron-2018.png r1 manage 65.7 K 2021-07-09 - 01:18 UnknownUser Hadronic Top pT for the 2018 electron signal sample
PNGpng pT-had-muon-2018.png r1 manage 64.3 K 2021-07-09 - 01:14 UnknownUser Hadronic top pT in the 2018 muon signal sample
PNGpng pT-lep-electron-2018.png r1 manage 60.9 K 2021-07-09 - 01:19 UnknownUser Leptonic Top pT for the 2018 electron signal sample
PNGpng pT-lep-muon-2018.png r1 manage 64.4 K 2021-07-09 - 01:16 UnknownUser Leptonic Top pT for 2018 muons signal sample
JPEGjpg systematics.jpg r1 manage 832.8 K 2021-07-11 - 22:52 UnknownUser files
JPEGjpeg systemtics2.jpeg r1 manage 263.3 K 2021-07-11 - 23:06 UnknownUser hdamp and toppT systematics
PNGpng ttbarpTBoosted.png r1 manage 140.4 K 2021-07-11 - 23:17 UnknownUser ttbar pT
PNGpng ttbarpTResolved.png r1 manage 256.0 K 2021-07-11 - 23:17 UnknownUser ttbar pT
PNGpng ttbarpTSemiresolved.png r1 manage 181.3 K 2021-07-11 - 23:18 UnknownUser ttbar pT
Topic revision: r28 - 2021-10-13 - unknown