Review of B2G-AN17-247

Documentation

Color code for answers to reviewer questions:

  • Green -- we agree, changes to analysis/documentation are implemented.
  • Lime -- we agree, but the item hasn't been done yet. (Open item.)
  • Red -- we disagree, changes to analysis/documentation are not implemented.
  • Teal -- we agree, but we don't think any change to analysis/documentation is needed.
  • Blue -- authors/ARC/conveners need to discuss. (Open item.)

AN-2017-247 (Di-leptonic)

Documentation

Explicit green-lights from experts

Category Name Status
Conveners   Not done.
PPD   Not done.
GEN   Not done.
TRIG   Not done.
EGM   Not done.
MUO   Not done.
TAU   Not done.
JME   Not done.
BTV   Not done.
TRK   Not done.
STAT   Not done.

Group Review

Comments from Devdatta (12/18/17) on AN-2017-247 V4


Is the ANv4 updated with the responses? I do not see Table 8 in the new AN.

Yes, the analysis note is updated with the responses. Table 7 shows the same information as Table 8 (below in the response), except that the test results use statistical errors only and therefore present the most pessimistic outcome. This is now explicitly stated in the Table 7 caption. We can, of course, include Table 8 in the next version of the AN if desired.
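To illustrate why a test with statistical errors only is the most pessimistic case, here is a minimal sketch of a bin-by-bin chi2 between data and the background prediction, with an optional systematic uncertainty added in quadrature. The function name, the bin contents and the flat 10% systematic are all invented for illustration; the tests in the AN may use a different statistic or implementation.

```python
def chi2_data_vs_pred(data, pred, pred_stat_err, pred_syst_frac=0.0):
    """Return (chi2, number of bins used) for data vs. prediction.

    Assumption: the per-bin variance is the Poisson variance of the data
    (approximated by the observed count) plus the MC statistical error
    squared plus an optional flat fractional systematic on the prediction.
    """
    chi2 = 0.0
    nbins = 0
    for n_obs, n_exp, stat in zip(data, pred, pred_stat_err):
        syst = pred_syst_frac * n_exp
        variance = n_obs + stat ** 2 + syst ** 2
        if variance <= 0.0:
            continue  # skip empty bins with no uncertainty
        chi2 += (n_obs - n_exp) ** 2 / variance
        nbins += 1
    return chi2, nbins


# Hypothetical histogram contents, for illustration only.
data = [120.0, 80.0, 45.0, 20.0]
pred = [100.0, 85.0, 40.0, 22.0]
pred_stat_err = [3.0, 3.0, 2.0, 1.5]

chi2_stat, nbins = chi2_data_vs_pred(data, pred, pred_stat_err)
chi2_full, _ = chi2_data_vs_pred(data, pred, pred_stat_err, pred_syst_frac=0.10)
```

Because the systematic term only enlarges the denominator, `chi2_full` can never exceed `chi2_stat` for the same histograms.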

On using ST vs mass, it should be driven by your sensitivity, evaluated using the full stat+syst errors. Did you check that?

The sensitivity study, the results of which are shown in Figure 17, already includes full stat+syst errors.

I did not understand the response "Our approach is to include background region together with the signal region in the limit derivation procedure. This way, background control region helps to constraint both, background normalization and shape." --> Do you mean to say that both regions are signal regions, or are you saying that you will use the DR > 2 region to constrain your data/MC in the signal region?

The expected limits are obtained using six signal and three background categories. The six signal categories are [Boosted (DR<1), Non-boosted (1<DR<2)] x [mm, ee, em]. The three background categories are [DR>2] x [mm, ee, em]. All nine categories are used in the fit, the results of which are shown in Section 10. Including the background categories helps to constrain data/MC in the signal region.
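The categorization described above can be sketched as follows. This is only an illustration: it assumes that "DR" here is the sumDeltaR variable, and the category names and the handling of values exactly at DR = 1 or DR = 2 are choices made for this example.

```python
CHANNELS = ("mm", "ee", "em")
REGIONS = ("boosted", "nonboosted", "background")


def region(sum_delta_r):
    """Map the DR value to a region (boundary treatment is an assumption)."""
    if sum_delta_r < 1.0:
        return "boosted"      # signal, DR < 1
    if sum_delta_r < 2.0:
        return "nonboosted"   # signal, 1 < DR < 2
    return "background"       # control region, DR > 2


def category(channel, sum_delta_r):
    """Return the fit category name for one event."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return f"{region(sum_delta_r)}_{channel}"


# All nine categories entering the fit: six signal and three background.
ALL_CATEGORIES = [f"{r}_{c}" for r in REGIONS for c in CHANNELS]
SIGNAL_CATEGORIES = [c for c in ALL_CATEGORIES if not c.startswith("background")]
```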

Comments from Jim (12/03/17) on AN-2017-247 V3


The object section (Section 4) generally requires more detailed information in order to be reviewed quickly by the object experts. Some examples:

Section 4.1: Reference is made to the general muon ID twiki but no reference is made to the SF twiki or to the specific SF file version numbers:
https://twiki.cern.ch/twiki/bin/view/CMS/MuonReferenceEffsRun2
https://twiki.cern.ch/twiki/bin/view/CMS/MuonWorkInProgressAndPagResults
https://gaperrin.web.cern.ch/gaperrin/tnp/TnP2016/2016Data_Moriond2017_6_12_16/JSON/RunBCDEF/EfficienciesAndSF_BCDEF.root
https://gaperrin.web.cern.ch/gaperrin/tnp/TnP2016/2016Data_Moriond2017_6_12_16/JSON/RunGH/EfficienciesAndSF_GH.root

Section 4.2: L176 The electron SF are updated on occasion. The specific SF used should be identified or defined.

Section 4.3: L185 Jet ID definitions also change, so perhaps reference the specific twiki revision used (https://twiki.cern.ch/twiki/bin/view/CMS/JetID13TeVRun2016?rev=9). L198 I would also give the specific global tag or db file names for the JEC.

Section 4.4 Please give the b-tag SF file name (CSVv2_Moriond17_B_H.csv) L217. Which b-tag SF method listed here was used https://twiki.cern.ch/twiki/bin/viewauth/CMS/BTagSFMethods? Please provide some details.

Done. Detailed information has been added for all object ID and SFs to the AN v4.

Comments from Devdatta (12/04/17) on AN-2017-247 V3


- On the background strategy, like we discussed in the meeting, I am still slightly worried about the closure. For, e.g. 18 (upper left and right) shows some discrepancy, which could be because of the DY component, or single top component? Same can be seen in your numbers in Table 8. Also, can you confirm that the ttjets/ DY ratio is the same for DR < 2 and DR > 2 regions?

The test results shown in Table 8 were obtained using histograms with statistical errors only. If we include both statistical and systematic errors, the numbers improve considerably and indicate that the deviations are covered by systematic uncertainties:

Here is the TTjets / DY / Stop composition (in %) of the background in the signal (DR<2) and background (DR>2) regions.

          mumu                  ee                    emu
DR < 2    90.4 / 6.8 / 2.7      91.4 / 5.2 / 1.4      95.8 / 0.2 / 3.7
DR > 2    80.6 / 12.6 / 6.1     79.7 / 13.3 / 6.3     92.0 / 0.5 / 6.9
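For a direct comparison of the TTjets/DY ratio between the two regions, the ratio can be computed from the composition numbers quoted above. The dictionary layout and function name below are only for illustration.

```python
# Background composition in %, copied from the table above:
# (TTjets, DY, Stop) per region and channel.
composition = {
    "DR < 2": {"mumu": (90.4, 6.8, 2.7), "ee": (91.4, 5.2, 1.4), "emu": (95.8, 0.2, 3.7)},
    "DR > 2": {"mumu": (80.6, 12.6, 6.1), "ee": (79.7, 13.3, 6.3), "emu": (92.0, 0.5, 6.9)},
}


def ttjets_over_dy(region, channel):
    """Ratio of the TTjets fraction to the DY fraction."""
    ttjets, dy, _stop = composition[region][channel]
    return ttjets / dy


for channel in ("mumu", "ee", "emu"):
    low = ttjets_over_dy("DR < 2", channel)
    high = ttjets_over_dy("DR > 2", channel)
    print(f"{channel}: TTjets/DY = {low:.1f} (DR < 2), {high:.1f} (DR > 2)")
```

With these inputs the ratio is roughly 13.3 vs 6.4 in mumu and 17.6 vs 6.0 in ee, so the DY fraction is relatively larger in the DR > 2 region. The emu ratios rest on sub-percent DY fractions and are correspondingly less precise.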

- Fig 8 (upper left) is lepton and not jet?

Fixed.

- Line 322: "We use ST variable to extract limits on heavy tt resonance production cross section. " If the mass variable performs as well, why choose ST? The ST has more dependence on the top pt reweighting than the mass.

Data-MC agreement improves with top pt reweighting for the mass variable too. However, it's true the reweighting benefits ST more than it benefits the mass variable. We are open to using the latter in the fits.

- Fig 17, 21: Why does the Z' of 10% width lose sensitivity faster than the other models at high masses?

The faster loss in sensitivity observed for the Z' of 10% width can be explained by the lineshape of the resonance mass, shown below. The Z' of 1% width retains its resonance structure for all mass values up to 5 TeV. For the Z' of 30% width, the loss of resonant structure already occurs at, e.g., 4 TeV, whose lineshape is similar to the 5 TeV one. For the Z' of 10% width, by contrast, there is a quantitatively large change from 4 TeV to 5 TeV.

- On the statistical treatment, we discussed in the meeting the possibility to constrain the background normalization using the DR > 2 region. Did you try this out? This may improve your background prediction and make you less reliant on the Monte Carlo.

Our approach is to include the background region together with the signal region in the limit derivation procedure. This way, the background control region helps to constrain both the background normalization and the shape.

Comments at B2G RES meeting 11/24/17 (talk)


How are three ttbar MC samples combined? Is care taken to correctly normalize each mass region?

We use three ttbar POWHEG samples listed on p5 of the presentation: 1) Inclusive TT, 2) TT_Mtt_700to1000, 3) TT_Mtt_1000toInf. To cover Mttbar<700 GeV region, we use events from sample 1) applying Mttbar<700 GeV cut at the parton level. The cross-section used to normalize the obtained sample is 831.8-80.5-21.3=730 pb. To cover 700 GeV < Mttbar < 1000 GeV region, we use combination of sample 2) and events from sample 1) passing 700 GeV < Mttbar < 1000 GeV cut at the parton level. The cross-section used to normalize this mass region is 80.5 pb. To cover Mttbar > 1000 GeV region, we use combination of sample 3) and events from 1) passing Mttbar>1000 GeV cut at the parton level. The cross-section used to normalize this mass region is 21.3 pb.
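The normalization described above can be sketched as a per-event weight. This is a minimal illustration under stated assumptions: every event in a given parton-level Mttbar region, whichever sample it came from, gets the weight sigma_region x luminosity / (total generated events in that region from all contributing samples). The function names, the treatment of events exactly at 700 or 1000 GeV, and the event counts in the test are hypothetical.

```python
# Cross sections in pb, as quoted in the response above.
SIGMA_INCLUSIVE = 831.8
SIGMA_700_1000 = 80.5
SIGMA_1000_INF = 21.3


def region_cross_sections():
    """Cross section assigned to each parton-level Mttbar region, in pb."""
    low = SIGMA_INCLUSIVE - SIGMA_700_1000 - SIGMA_1000_INF  # about 730 pb
    return {"low": low, "mid": SIGMA_700_1000, "high": SIGMA_1000_INF}


def mtt_region(mtt_gen):
    """Classify an event by its parton-level Mttbar in GeV."""
    if mtt_gen < 700.0:
        return "low"
    if mtt_gen < 1000.0:
        return "mid"
    return "high"


def event_weight(mtt_gen, n_generated, lumi_pb):
    """Weight for one MC event.

    n_generated maps each region to the total number of generated events
    in that region, summed over the inclusive sample and, where used,
    the mass-binned sample.
    """
    reg = mtt_region(mtt_gen)
    return region_cross_sections()[reg] * lumi_pb / n_generated[reg]
```

With this choice, the summed weights in each region reproduce sigma_region x luminosity by construction, regardless of how many events each sample contributes.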

Show distributions of pt^rel to illustrate choice of pt^rel>15 GeV cuts on leptons

The plots below show pt^rel distributions for the leading and subleading leptons in the three channels. With the pt^rel>10 GeV cut, we see a slight excess of data over background at the low pt^rel end, indicative of a small QCD contamination. Therefore pt^rel>15 GeV seems a safer choice against QCD background.

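[Plots: pt^rel distributions with the pt^rel > 10 GeV cut. Columns: mumu, ee, emu. Rows: leading lepton, subleading lepton.]

[Plots: pt^rel distributions with the pt^rel > 15 GeV cut. Columns: mumu, ee, emu. Rows: leading lepton, subleading lepton.]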

Including original Control Regions in the note.

Done. AN17-247 v3 now includes the original background Control Regions, CR1 and CR2, described in the Appendices.

Is top pt reweighting necessary?

Our observation is that top pt reweighting does have a sizable impact on data/MC agreement. This is discussed in the dedicated Section 5 of the AN. We have also performed statistical tests for the distributions before and after reweighting. The results of these tests, which support our observation, are shown below.

Perform tests with additional relative normalization factor between background and signal region in the statistical interpretation.

Done. The results of the test are shown below. The expected limits are stable, and the nuisance-parameter behavior with Asimov data does not reveal any special features. We will, of course, have to see how the data behave after unblinding.

No additional relative normalization factor between signal and background regions.

With additional relative normalization factor (revnorm) between signal and background regions.

Comments from Annapaola (On AN-17-047_v0)


l 61-69: The recipe from TOP PAG is valid only for top pt < . Do you apply the reweighting beyond this point? If so, how do you do it? From the text I assume that the reweighting is applied to nominal distributions and used for this search. However I could not find plots comparing the behaviour of MC vs data with and without this correction. Please, can you add this information to the text and explain why you want to apply it for your search, instead of considering an uncertainty on it?

Yes, the recipe is valid for top pt < 800 GeV. We apply the reweighting above this value too; however, less than 0.5% of ttbar events have top pt > 800 GeV. The top pt reweighting is our default, and it is now described in the dedicated Section 5, which shows distributions with and without the reweighting applied. We observe that the reweighting improves data/MC agreement.
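A minimal sketch of such an event weight is given below. It assumes the commonly used exponential form SF(pt) = exp(a + b*pt) applied to the generator-level top and antitop, with the event weight taken as the geometric mean of the two SFs. The default parameter values are an assumption not taken from this page and should be checked against the TOP PAG recipe actually used in the AN. In line with the response above, no special treatment is applied above 800 GeV.

```python
import math

# Assumed parameters of SF(pt) = exp(A + B * pt), with pt in GeV.
# These values are NOT quoted on this page; verify before use.
A = 0.0615
B = -0.0005


def top_pt_sf(pt, a=A, b=B):
    """Scale factor for a single generator-level top quark."""
    return math.exp(a + b * pt)


def event_weight(pt_top, pt_antitop, a=A, b=B):
    """Event weight: geometric mean of the top and antitop scale factors.

    The same functional form is used for all pt, including pt > 800 GeV.
    """
    return math.sqrt(top_pt_sf(pt_top, a, b) * top_pt_sf(pt_antitop, a, b))
```

Since the weight falls with pt, the reweighting softens the simulated top pt spectrum and everything correlated with it, such as ST.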

l 81-82: in which steps of mass signal samples are produced?

For Z' signal with 1% width and 10% width, as well as for KK gluon signal, samples are produced in mass range of 500 GeV to 5 TeV with 500 GeV steps. There are also additional samples at 750 GeV and 1250 GeV mass points. For 30% width Z', samples are produced at mass points of 1 TeV, 2 TeV, 4 TeV and 5 TeV. Signal MC samples are listed in Table 3 of the AN.
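For bookkeeping, the mass grid described above can be written out explicitly. The names are illustrative only, and Table 3 of the AN remains the reference for the samples actually used.

```python
def narrow_signal_mass_points_gev():
    """Mass points for Z' (1% and 10% width) and KK gluon, in GeV."""
    grid = list(range(500, 5001, 500))  # 500 GeV to 5 TeV in 500 GeV steps
    extra = [750, 1250]                 # additional mass points
    return sorted(grid + extra)


# Mass points for the Z' with 30% width, in GeV.
WIDE_ZPRIME_MASS_POINTS_GEV = [1000, 2000, 4000, 5000]

print(narrow_signal_mass_points_gev())
```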

l 102: “at least one reconstructed good primary vertex”

Done.

Figure 2: Please, can you show the distribution before reweighting is applied for comparison?

The distributions below show nPV in the preselection sample for the emu channel before and after PU reweighting. The ee and mumu channels show a similar trend: reweighting improves data/MC agreement.

No PU reweighting
With PU reweighting, using sigma(MB)=69.2 mb
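The idea behind the PU reweighting can be sketched as a ratio of normalized pileup profiles. The sketch assumes the data and MC pileup profiles are already available as binned lists with identical binning. In practice the data profile is derived from the luminosity and the quoted minimum-bias cross section, which is not shown here. The function name and the numbers in the test are illustrative.

```python
def pileup_weights(data_profile, mc_profile):
    """Per-bin weights: normalized data profile over normalized MC profile."""
    if len(data_profile) != len(mc_profile):
        raise ValueError("profiles must have the same binning")
    data_total = float(sum(data_profile))
    mc_total = float(sum(mc_profile))
    weights = []
    for d, m in zip(data_profile, mc_profile):
        if m > 0.0:
            weights.append((d / data_total) / (m / mc_total))
        else:
            weights.append(0.0)  # no MC events in this bin to reweight
    return weights
```

Each MC event then receives the weight of the bin corresponding to its number of pileup interactions, which by construction makes the weighted MC profile match the data profile.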

l155-156: How was the 2D cut optimised?

The 2D cut is chosen such that the sample is free of QCD background. Going below pt^rel=15 GeV lets QCD events into our preselection sample.

l159: In which way is this uncertainty assigned? Does it come from POG recommendations?

Muon trigger and ID uncertainties of 0.5% and 1% are taken from the MUON POG twiki. We don't apply the "standard" isolation requirement, but rather use a 2D requirement: deltaR(l,jet)>0.4 or ptrel>15 GeV. To this 2D requirement we associate a 1%/lepton uncertainty, which is larger than the typical 0.5% recommended for the "standard" isolation requirement.
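The 2D requirement can be sketched as follows. The definitions are assumptions made for this illustration: ptrel is taken as the component of the lepton momentum perpendicular to the axis of the nearest jet, the nearest jet is chosen by deltaR, and leptons and jets are passed as (pt, eta, phi) tuples. The analysis code may differ in details, for example in which jets are considered.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def momentum3(pt, eta, phi):
    """Cartesian 3-momentum from (pt, eta, phi)."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


def pt_rel(lepton, jet):
    """Lepton momentum component perpendicular to the jet axis."""
    lx, ly, lz = momentum3(*lepton)
    jx, jy, jz = momentum3(*jet)
    cx = ly * jz - lz * jy
    cy = lz * jx - lx * jz
    cz = lx * jy - ly * jx
    return math.sqrt(cx * cx + cy * cy + cz * cz) / math.sqrt(jx * jx + jy * jy + jz * jz)


def passes_2d_cut(lepton, jets, dr_min=0.4, ptrel_min=15.0):
    """True if deltaR(l, nearest jet) > dr_min or ptrel > ptrel_min."""
    if not jets:
        return True  # no jet nearby, nothing to be isolated from
    nearest = min(jets, key=lambda j: delta_r(lepton[1], lepton[2], j[1], j[2]))
    dr = delta_r(lepton[1], lepton[2], nearest[1], nearest[2])
    return dr > dr_min or pt_rel(lepton, nearest) > ptrel_min
```

A lepton inside a jet cone therefore survives only if it carries enough momentum transverse to the jet axis, which is the signature expected for leptons from boosted top decays rather than from QCD jets.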

l 195: please, can you explicitly add the list of cuts employed? This will help out with the review

Done. Section 4 lists preselection cuts in more concise manner.

l 253: It is not clear from the text what the reason is for applying such a criterion to assign the event to one category or the other. Please, can you elaborate more on this?

Events with three leptons of mixed flavor, mme (mee), can pass the selection of both the mm (ee) and em channels. In order not to double count such events, we have to make a decision and assign them to only one channel. We choose to assign them to the em channel, since this channel has a larger branching fraction than the mm or ee channels separately.
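The overlap removal amounts to a fixed priority order among the channels. The sketch below assumes that the set of channel selections passed by an event is already known. The function name is hypothetical, and the ordering of mm before ee for events passing both is an arbitrary choice for this example, since the response above only fixes the priority of em.

```python
CHANNEL_PRIORITY = ("em", "mm", "ee")  # em first, as stated above


def assign_channel(passed_channels):
    """Return the single channel an event is assigned to, or None."""
    for channel in CHANNEL_PRIORITY:
        if channel in passed_channels:
            return channel
    return None
```

For example, an mme event passing both the mm and em selections is counted once, in the em channel.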

l 270-276: Please, can you expand this section a bit more? For instance, how exactly is the Mass variable calculated?

Done. The mass variable is described more explicitly in Section 5.

Section 4.1: Can you, please, explain better how the agreement between data and MC in these CRs can assure us that we have good modelling of backgrounds? The phase space is actually quite different wrt signal region.

We have redefined our background CR. Now we start from the same preselection for signal and background but employ sumDeltaR variable to define signal-enriched and background-control regions. This is described in Sections 6 and 7 of the updated AN.

l 303: Where do these numbers come from? Please, can you add a reference or an explanation?

The 16% and 15% uncertainties on the cross sections for single top and diboson production are based on CMS measurements of these processes. The corresponding references have been added to the description in Section 8.

Are you using muon and electron trigger, ID and isolation uncertainties provided by POGs?

Muon trigger and ID uncertainties of 0.5% and 1% are taken from the MUON POG twiki. The efficiency SF for the electron trigger HLT_DoubleEle33_CaloIdL is derived in this analysis (Section of the AN), and the associated systematic uncertainty of 2% (1% per electron leg) is extracted based on the observed statistical errors and on the SF variation vs. deltaR(l, jet). The electron ID uncertainty of 1% is larger than the statistical errors on the SFs provided by the EGamma POG.

l251: what do you mean by “taking RMS deviations in the acceptance and physics observables?”

The PDF uncertainty procedure uses the +/- RMS (68% CL envelope) of the 100 MC replica weights as the up and down shifts, per the recommendation in the PPD talk by J. Bendavid.
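A minimal sketch of the RMS prescription for a single observable (for example, the yield in one bin) is shown below. It assumes the RMS is taken around the nominal value rather than around the replica mean, and the function names are hypothetical. The recommendation referenced above is the authority on the exact definition.

```python
import math


def pdf_rms(nominal, replica_values):
    """RMS deviation of the replica values from the nominal value."""
    if not replica_values:
        raise ValueError("need at least one replica")
    squared = sum((value - nominal) ** 2 for value in replica_values)
    return math.sqrt(squared / len(replica_values))


def pdf_up_down(nominal, replica_values):
    """Up and down shifted values: nominal +/- RMS."""
    rms = pdf_rms(nominal, replica_values)
    return nominal + rms, nominal - rms
```

Here `replica_values` would hold the observable recomputed with each of the 100 replica weights in turn.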

l 376: As above, I do not understand if the reweighting is applied as default. Also, SFs are defined by the POG only in a certain pt range, what do you do beyond?

The text now explicitly states (in Sections 5 and 8 of the AN) that the reweighting is the default. As explained above, less than 0.5% of events fall in the region above pT=800 GeV. We apply the reweighting to these events too.

l 407: In the event selection section it was not clear that you were defining Delta R as the sum of two delta R, please, can you elaborate on this in the text? Also, I think there is a typo here, because the two addends are identical.

This is fixed now. sumDeltaR is defined as the sum of two deltaRs: the minimum deltaR between the leading lepton and its closest jet plus the minimum deltaR between the subleading lepton and its closest jet. The variable is described in Section 6.
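The definition above translates directly into code. In this sketch, leptons and jets are passed as (eta, phi) tuples, which is a layout chosen for the example, and at least one jet is assumed to be present.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def min_delta_r(lepton, jets):
    """deltaR between a lepton and its closest jet."""
    if not jets:
        raise ValueError("at least one jet is required")
    return min(delta_r(lepton[0], lepton[1], jet[0], jet[1]) for jet in jets)


def sum_delta_r(leading_lepton, subleading_lepton, jets):
    """sumDeltaR: sum of the two lepton-to-closest-jet distances."""
    return min_delta_r(leading_lepton, jets) + min_delta_r(subleading_lepton, jets)
```

The two leptons may share the same closest jet. Nothing in the definition above requires the jets to be distinct.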

Now that you have decided which is the best variable for the final statistical analysis, how do you plan to exploit the other variables for the selection?

In the updated version of the AN, we apply a cut on the sumDeltaR variable and study the sensitivity of the ST and mass variables.

A general comment: as discussed during the presentations of the analysis at the RES meetings, it would be good to think about a data-driven estimation of backgrounds. You could, for instance, use your CRs and additional regions to perform a simultaneous fit.

We have now redefined the background control region. We also use the background CR in a simultaneous fit together with the signal region. This is described in Section 9 of the updated AN.

-- IaIashvili - 2017-11-21

Topic attachments
Attachment  History  Size  Date  Who
bkgd_KsChi2_StatSys.png  r1  48.7 K  2017-12-08 - 23:56  IaIashvili
brazil10.png  r1  176.9 K  2017-12-03 - 03:30  IaIashvili
brazil10_revnorm.png  r1  176.8 K  2017-12-03 - 03:43  IaIashvili
btag_eff_Pt_b_ll_individual.png  r1  161.6 K  2018-01-03 - 20:20  IaIashvili
btag_eff_Pt_b_ll_individual_stop.png  r1  164.7 K  2018-01-03 - 20:20  IaIashvili
correlation_asv.png  r1  273.6 K  2017-12-03 - 03:30  IaIashvili
correlation_asv_revnorm.png  r1  274.0 K  2017-12-03 - 03:43  IaIashvili
ee_lep0perp_7ptrel10.png  r1  158.2 K  2017-12-01 - 02:03  IaIashvili
ee_lep0perp_7ptrel15.png  r1  160.0 K  2017-12-01 - 02:16  IaIashvili
ee_lep1perp_7ptrel10.png  r1  139.1 K  2017-12-01 - 02:04  IaIashvili
ee_lep1perp_7ptrel15.png  r1  140.8 K  2017-12-01 - 02:16  IaIashvili
em_lep0perp_7ptrel10.png  r1  143.7 K  2017-12-01 - 02:03  IaIashvili
em_lep0perp_7ptrel15.png  r1  144.0 K  2017-12-01 - 02:16  IaIashvili
em_lep1perp_7ptrel10.png  r1  131.7 K  2017-12-01 - 02:04  IaIashvili
em_lep1perp_7ptrel15.png  r1  130.5 K  2017-12-01 - 02:16  IaIashvili
em_nPV_noPUreweighting_noPUuncert.png  r1  141.2 K  2017-12-01 - 20:49  IaIashvili
em_nPV_noPUuncert.png  r1  143.4 K  2017-12-01 - 20:49  IaIashvili
gkk.png  r1  160.8 K  2017-12-08 - 23:18  IaIashvili
mm_lep0perp_7ptrel10.png  r1  159.6 K  2017-12-01 - 01:19  IaIashvili
mm_lep0perp_7ptrel15.png  r1  155.9 K  2017-12-01 - 02:16  IaIashvili
mm_lep1perp_7ptrel10.png  r1  136.5 K  2017-12-01 - 02:04  IaIashvili
mm_lep1perp_7ptrel15.png  r1  135.9 K  2017-12-01 - 02:16  IaIashvili
nuisance_asv.png  r1  96.5 K  2017-12-03 - 03:30  IaIashvili
nuisance_asv_revnorm.png  r1  101.1 K  2017-12-03 - 03:43  IaIashvili
toppt_reweighting_tests.png  r1  77.3 K  2017-12-02 - 23:57  IaIashvili
zp1.png  r1  111.7 K  2017-12-08 - 23:18  IaIashvili
zp10.png  r1  150.9 K  2017-12-08 - 23:18  IaIashvili
zp30.png  r1  148.1 K  2017-12-08 - 23:18  IaIashvili
Topic revision: r19 - 2018-01-03 - IaIashvili