# Difference: EXO19010 (1 vs. 66)

#### Revision 66: 2019-09-10 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

>
>
• AN version under discussion in the September 10th ARC meeting: here

Line: 36 to 37
Current version of the AN: here
>
>
AN version under discussion in the September 10th ARC meeting: here

### ARC action items from July 25 meeting

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 1262 to 1265

 META FILEATTACHMENT attachment="tfGaus_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfGaus_ZtoMuMu_NLayers5.png" path="tfGaus_ZtoMuMu_NLayers5.png" size="18835" user="bfrancis" version="1" attachment="tfFlat_ZtoMuMu_NLayers5_v2.png" attr="" comment="" date="1567010202" name="tfFlat_ZtoMuMu_NLayers5_v2.png" path="tfFlat_ZtoMuMu_NLayers5_v2.png" size="16846" user="bfrancis" version="1" attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567515552" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14257372" user="bfrancis" version="2"
>
>
 META FILEATTACHMENT attachment="AN_18_311-3.pdf" attr="" comment="" date="1568122430" name="AN_18_311-3.pdf" path="AN_18_311-3.pdf" size="14600392" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 65: 2019-09-05 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 128 to 128
AN Table 22 question. I do not understand the answer. It seems like you are agreeing with me. Maybe I don't understand Table 22. Let's just take the muon case for now. You want to measure the probability that a muon passes the lepton veto. You basically just measure the fraction of probe tracks that pass the lepton veto. According to Table 22, you measure the fraction of probe tracks that pass the criteria minDeltaR_track,muon > 0.15 and missing outer hits > 2. This is P_veto. However, the actual veto that you apply in the signal region is the 5 requirements in Tables 20-21. So, why don't you measure P_veto for muons by measuring the fraction of probe tracks that pass the criteria of all 5 requirements in Tables 20-21.
Changed:
<
<
We are including all of the signal requirements in the background selections.
>
>
It seems as if you might be correct that you don't understand Table 22, which we agree is probably misleading. We apply all of the signal selection criteria (including those for the leptons you were asking about) except for those criteria listed in Table 22, which are inverted to allow the calculation of the conditional probability for the specific lepton whose veto probability is being calculated. The AN has been clarified to reflect this.
Figure 19 and Tables 29-31 questions. If Figure 19 and Tables 29-31 both utilize all probe tracks then I would expect them to give the same result. So I'm not sure how that affects anything. In principle, the same sign subtraction could have an effect. Hopefully once the tables I asked for are made I will be able to understand if that is the reason.

#### Revision 64: 2019-09-04 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 26 to 26

 31 May 2019 LL EXO WG Pre-approval Agenda slides
 7 June 2019 LL EXO WG Followup to pre-approval Agenda slides
 19 July 2019 LL EXO WG 2018 ABC background estimates Agenda slides
>
>
 30 Aug 2019 LL EXO WG 2018 background estimates Agenda slides

#### Revision 63: 2019-09-03 - ChristopherHill

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 131 to 131
Figure 19 and Tables 29-31 questions. If Figure 19 and Tables 29-31 both utilize all probe tracks then I would expect them to give the same result. So I'm not sure how that affects anything. In principle, the same sign subtraction could have an effect. Hopefully once the tables I asked for are made I will be able to understand if that is the reason.
Changed:
<
<
These are "all probe tracks"; there has been no Z -> T&P requirement yet. This is a technical necessity in creating a pool of tags and probes in which to find all possible pairs. The table you requested has been made, and for this channel:

P(veto) := (35 - 2) / (1213660 - 1437) = 2.72e-5

>
>
The table you requested has been made and is in Table 25 of the updated AN linked from this page.
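For reference, the same-sign-subtracted veto probability quoted for this channel elsewhere on this page, P(veto) = (35 - 2) / (1213660 - 1437), can be sketched in a few lines of Python (variable names are illustrative, not taken from the analysis code):

```python
# Same-sign (SS) subtraction for the lepton veto probability P(veto),
# using the muon-channel tag-and-probe counts quoted on this page.
# Variable names are illustrative, not taken from the analysis code.
n_os_pass = 35        # opposite-sign T&P pairs whose probe passes the veto
n_ss_pass = 2         # same-sign pairs passing the veto (background proxy)
n_os_all = 1213660    # all opposite-sign T&P pairs
n_ss_all = 1437       # all same-sign T&P pairs

p_veto = (n_os_pass - n_ss_pass) / (n_os_all - n_ss_all)
print(f"P(veto) = {p_veto:.2e}")  # 2.72e-05
```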
Section 6.1.5 question. I'm not sure this makes sense. You say that you only use ttbar because that contributes the most. But the analysis includes a tag-and-probe assuming Z production. So, while Z->ll may not contribute to the overall background, it will certainly contribute a great deal to the measurement of P_veto. I think you should use all of the MC samples you have if you are trying to imitate the data.
Changed:
<
<
The closure test doesn't necessarily need to mimic the data, but should demonstrate that the method closes within a given sample. The other samples just did not have the statistics available to provide a reasonable observation and estimate. We are now running over additional samples, but are confident the method will still close because after the signal selection the tracks in other samples will be very similar to the ones selected in ttbar.
>
>
An MC closure test is not intended to mimic the data, but to provide confidence that a data-driven background estimation method is sound. The method involves calculating the probability of isolated lepton identification, and ttbar was selected as the largest available sample of isolated leptons with which to demonstrate that the method closes. We are running over additional samples, but are confident the method will also close in these, because after the signal selection the tracks in other samples are very similar to the ones selected in ttbar.
Section 8.2.2 question. You write that the chargino behaves like a muon and so muons should be used as a proxy for measuring the efficiency. It is true that the chargino does not undergo hadronic interactions so it is more like a muon than a hadron. However, the chargino of interest for this analysis does NOT get reconstructed as a muon in the muon system. So it is unlike a muon in that sense. The track reconstruction explicitly uses information from the muon detectors to find all possible muon tracks and to associate as many hits as possible with the track. This is possible for muons but NOT for charginos that decay before the muon system. Therefore, using muons may overestimate the hit efficiency. If you found the same result for hadrons, then I would not be concerned. Or, if you just used muon tracks that were not found by the special muon-finding steps, then I would be happy. I know you have already used this but for reference, this shows the overall tracking efficiency for muons with and without the muon-specific steps: https://cds.cern.ch/record/2666648/files/DP2019_004.pdf As you can see, there is a significant reduction in efficiency when the muon-specific steps are removed and a much larger disagreement between data and MC. I have no idea how this translates (or not) to the hit finding efficiency.
Changed:
<
<
The issue in hit requirement efficiencies being different between data and simulation is in the simulation's handling of non-functional channels. This difference will be the same for both simulated muons and charginos. These differences as well can only occur when a track has been reconstructed in the first place, an issue covered by the systematic we take from DP2019_004. So while the issues you mention do affect tracking, they do not affect the discrepancy in non-functional channels handled by the systematics in question.
>
>
The issue in hit requirement efficiencies being different between data and simulation is in the simulation's handling of non-functional channels. This difference will be the same for both simulated muons and charginos. These differences can only occur when a track has been reconstructed in the first place, an issue covered by the systematic we take from DP2019_004. So while the issues you mention do affect tracking, they do not affect the discrepancy in non-functional channels handled by the systematics in question.
Section 8.2.5 question. This answer misses the point of my comment. Your estimate of the systematic uncertainty is simply a measure of MC statistics. It does not address the question of the systematic uncertainty associated with the method itself.
Line: 294 to 292
Table 19: More clarification on dxy and sigma of the track is needed. If dxy is truly with respect to the origin, that is a terrible idea. The beamspot is significantly displaced from the origin (by several mm). So dxy should be measured with respect to either the beamspot or the primary vertex. Regarding sigma, I guess you are saying that it only includes the calculated uncertainty on the track parameters. Can you provide a plot of dxy and sigma for the tracks. Preferably for all tracks. These are applied in the signal selection, so why not here?
Changed:
<
<
See the above answer from the ARC action items. Regarding sigma that is correct.
>
>
See the above answer from the ARC action items.
Table 22: I don't understand your response. Regarding Table 22, my specific questions are: for electrons, why is there no veto on minDeltaR_track,muon>0.15 or minDeltaR_track,had_tau>0.15 or DeltaR_track,jet>0.5?

#### Revision 62: 2019-09-03 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 127 to 127
AN Table 22 question. I do not understand the answer. It seems like you are agreeing with me. Maybe I don't understand Table 22. Let's just take the muon case for now. You want to measure the probability that a muon passes the lepton veto. You basically just measure the fraction of probe tracks that pass the lepton veto. According to Table 22, you measure the fraction of probe tracks that pass the criteria minDeltaR_track,muon > 0.15 and missing outer hits > 2. This is P_veto. However, the actual veto that you apply in the signal region is the 5 requirements in Tables 20-21. So, why don't you measure P_veto for muons by measuring the fraction of probe tracks that pass the criteria of all 5 requirements in Tables 20-21.
Changed:
<
<
It seems as if you might be correct that you don't understand Table 22, which we agree is probably misleading. We apply all of the signal selection criteria (including those for the leptons you were asking about) except for those criteria listed in Table 22, which are inverted to allow the calculation of the conditional probability for the specific lepton whose veto probability is being calculated. The AN has been clarified to reflect this.
>
>
We are including all of the signal requirements in the background selections.
Figure 19 and Tables 29-31 questions. If Figure 19 and Tables 29-31 both utilize all probe tracks then I would expect them to give the same result. So I'm not sure how that affects anything. In principle, the same sign subtraction could have an effect. Hopefully once the tables I asked for are made I will be able to understand if that is the reason.
Changed:
<
<
These are "all probe tracks"; there has been no Z -> T&P requirement yet. This is a technical necessity in creating a pool of tags and probes in which to find all possible pairs. The table you request has been made for the coming AN update, and for this channel:
>
>
These are "all probe tracks"; there has been no Z -> T&P requirement yet. This is a technical necessity in creating a pool of tags and probes in which to find all possible pairs. The table you requested has been made, and for this channel:
P(veto) := (35 - 2) / (1213660 - 1437) = 2.72e-5

#### Revision 61: 2019-09-03 - ChristopherHill

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 127 to 127
AN Table 22 question. I do not understand the answer. It seems like you are agreeing with me. Maybe I don't understand Table 22. Let's just take the muon case for now. You want to measure the probability that a muon passes the lepton veto. You basically just measure the fraction of probe tracks that pass the lepton veto. According to Table 22, you measure the fraction of probe tracks that pass the criteria minDeltaR_track,muon > 0.15 and missing outer hits > 2. This is P_veto. However, the actual veto that you apply in the signal region is the 5 requirements in Tables 20-21. So, why don't you measure P_veto for muons by measuring the fraction of probe tracks that pass the criteria of all 5 requirements in Tables 20-21.
Changed:
<
<
We are including all of the signal requirements in the background selections.
>
>
It seems as if you might be correct that you don't understand Table 22, which we agree is probably misleading. We apply all of the signal selection criteria (including those for the leptons you were asking about) except for those criteria listed in Table 22, which are inverted to allow the calculation of the conditional probability for the specific lepton whose veto probability is being calculated. The AN has been clarified to reflect this.
Figure 19 and Tables 29-31 questions. If Figure 19 and Tables 29-31 both utilize all probe tracks then I would expect them to give the same result. So I'm not sure how that affects anything. In principle, the same sign subtraction could have an effect. Hopefully once the tables I asked for are made I will be able to understand if that is the reason.

#### Revision 60: 2019-09-03 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 127 to 127
AN Table 22 question. I do not understand the answer. It seems like you are agreeing with me. Maybe I don't understand Table 22. Let's just take the muon case for now. You want to measure the probability that a muon passes the lepton veto. You basically just measure the fraction of probe tracks that pass the lepton veto. According to Table 22, you measure the fraction of probe tracks that pass the criteria minDeltaR_track,muon > 0.15 and missing outer hits > 2. This is P_veto. However, the actual veto that you apply in the signal region is the 5 requirements in Tables 20-21. So, why don't you measure P_veto for muons by measuring the fraction of probe tracks that pass the criteria of all 5 requirements in Tables 20-21.
Changed:
<
<
In the muon case, Table 16 details the muon tag-and-probe selection, which includes the requirements dR(track, jet)>0.5, ECalo<10 GeV, min(dR(track, hadronic tau))>0.15, and min(dR(track, electron)) -- the "missing" four from Table 22. If such a muon probe track were then to pass the two requirements in Table 22, all six requirements would be satisfied. Similar statements can be made for the electron/tau cases -- it is a matter of which side of the conditional probability each cut is on, i.e. P(1,3 | 2,4,5,6). The cuts in Table 22 are chosen because they are the most powerful at rejecting the given flavor, so they are the correct choices to study when defining a veto for that flavor.

We require a pure sample of each flavor, and using all five requirements would reduce that purity. The listed requirements are the most powerful at rejecting the given flavor, so they are the correct choices to study.

>
>
We are including all of the signal requirements in the background selections.
Figure 19 and Tables 29-31 questions. If Figure 19 and Tables 29-31 both utilize all probe tracks then I would expect them to give the same result. So I'm not sure how that affects anything. In principle, the same sign subtraction could have an effect. Hopefully once the tables I asked for are made I will be able to understand if that is the reason.

#### Revision 59: 2019-09-03 - ChristopherHill

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 59 to 59
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.
Changed:
<
<
As it turns out the AN is incorrect, and actually describes two versions of this isolation sum. AN lines 355-361 and equation 4 describe the method used, which is to use dz with respect to the candidate track's vertex and include both tracks' parameter uncertainties in the calculated sigma. Lines 352-355 and Table 19 describe a very old version which is not used. These have been removed from the AN.
>
>
After closer examination of the code, we confirm dxy is not calculated with respect to the origin for the track isolation, i.e. the text in the previous version of the AN was incorrect. The updated AN contains the correct description of the track isolation calculation on lines 383--390.
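The dz-based selection described for the isolation sum (dz with respect to the candidate track's vertex, with both tracks' parameter uncertainties combined in the calculated sigma) could look something like this minimal sketch, assuming uncorrelated Gaussian uncertainties added in quadrature; the function name and the 3-sigma threshold are illustrative assumptions, not taken from the analysis code:

```python
import math

def consistent_with_candidate(dz_track, sigma_dz_track, sigma_dz_cand,
                              n_sigma=3.0):
    """Decide whether a nearby track should enter the isolation sum.

    dz_track is the track's dz with respect to the candidate track's
    vertex; it is compared against the combined parameter uncertainties
    of both tracks. Illustrative sketch; n_sigma is an assumption.
    """
    sigma = math.hypot(sigma_dz_track, sigma_dz_cand)
    return abs(dz_track) < n_sigma * sigma
```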
Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.
Line: 1264 to 1264

 META FILEATTACHMENT attachment="sidebandProjections.png" attr="" comment="" date="1565712492" name="sidebandProjections.png" path="sidebandProjections.png" size="70993" user="bfrancis" version="1" attachment="tfGaus_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfGaus_ZtoMuMu_NLayers5.png" path="tfGaus_ZtoMuMu_NLayers5.png" size="18835" user="bfrancis" version="1" attachment="tfFlat_ZtoMuMu_NLayers5_v2.png" attr="" comment="" date="1567010202" name="tfFlat_ZtoMuMu_NLayers5_v2.png" path="tfFlat_ZtoMuMu_NLayers5_v2.png" size="16846" user="bfrancis" version="1"
Changed:
<
<
 META FILEATTACHMENT attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567520138" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14257379" user="bfrancis" version="3"
>
>
 META FILEATTACHMENT attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567515552" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14257372" user="bfrancis" version="2"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 58: 2019-09-03 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 1264 to 1264

 META FILEATTACHMENT attachment="sidebandProjections.png" attr="" comment="" date="1565712492" name="sidebandProjections.png" path="sidebandProjections.png" size="70993" user="bfrancis" version="1" attachment="tfGaus_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfGaus_ZtoMuMu_NLayers5.png" path="tfGaus_ZtoMuMu_NLayers5.png" size="18835" user="bfrancis" version="1" attachment="tfFlat_ZtoMuMu_NLayers5_v2.png" attr="" comment="" date="1567010202" name="tfFlat_ZtoMuMu_NLayers5_v2.png" path="tfFlat_ZtoMuMu_NLayers5_v2.png" size="16846" user="bfrancis" version="1"
Changed:
<
<
 META FILEATTACHMENT attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567515552" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14257372" user="bfrancis" version="2"
>
>
 META FILEATTACHMENT attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567520138" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14257379" user="bfrancis" version="3"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 57: 2019-09-03 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 127 to 127
AN Table 22 question. I do not understand the answer. It seems like you are agreeing with me. Maybe I don't understand Table 22. Let's just take the muon case for now. You want to measure the probability that a muon passes the lepton veto. You basically just measure the fraction of probe tracks that pass the lepton veto. According to Table 22, you measure the fraction of probe tracks that pass the criteria minDeltaR_track,muon > 0.15 and missing outer hits > 2. This is P_veto. However, the actual veto that you apply in the signal region is the 5 requirements in Tables 20-21. So, why don't you measure P_veto for muons by measuring the fraction of probe tracks that pass the criteria of all 5 requirements in Tables 20-21.
>
>
In the muon case, Table 16 details the muon tag-and-probe selection, which includes the requirements dR(track, jet)>0.5, ECalo<10 GeV, min(dR(track, hadronic tau))>0.15, and min(dR(track, electron)) -- the "missing" four from Table 22. If such a muon probe track were then to pass the two requirements in Table 22, all six requirements would be satisfied. Similar statements can be made for the electron/tau cases -- it is a matter of which side of the conditional probability each cut is on, i.e. P(1,3 | 2,4,5,6). The cuts in Table 22 are chosen because they are the most powerful at rejecting the given flavor, so they are the correct choices to study when defining a veto for that flavor.
We require a pure sample of each flavor, and using all five requirements would reduce that purity. The listed requirements are the most powerful at rejecting the given flavor, so they are the correct choices to study.
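The conditional-probability structure P(1,3 | 2,4,5,6) can be illustrated with a small sketch (criterion indices are schematic and the function name is invented for illustration): among probe tracks already satisfying criteria 2, 4, 5, and 6, one measures the fraction that also pass the two inverted criteria 1 and 3.

```python
# Schematic illustration of P(1,3 | 2,4,5,6): each probe is a dict mapping
# a criterion index (1-6) to whether the probe passes that criterion.
def p_veto_conditional(probes):
    passing_condition = [p for p in probes
                         if all(p[i] for i in (2, 4, 5, 6))]
    passing_all = [p for p in passing_condition if p[1] and p[3]]
    if not passing_condition:
        return float("nan")
    return len(passing_all) / len(passing_condition)
```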

Figure 19 and Tables 29-31 questions. If Figure 19 and Tables 29-31 both utilize all probe tracks then I would expect them to give the same result. So I'm not sure how that affects anything. In principle, the same sign subtraction could have an effect. Hopefully once the tables I asked for are made I will be able to understand if that is the reason.

Line: 250 to 252
This was normalized incorrectly and will be corrected to one in the next AN version.
Changed:
<
<
L458: Can you add plots (at least a few examples) to the AN of the Z->ll mass distributions in the OS and SS categories used for the T&P method?
>
>
L458: Can you add plots (at least a few examples) to the AN of the Z->ll mass distributions in the OS and SS categories used for the T&P method?

Changed:
<
<
We will add these plots. The 2018 data analysis was prioritized over this.
>
>
Plots similar to Figure 19 are added to the AN for electrons.
L490: I don't understand how this test shows that P_veto does not depend on track pT. How significant is the KS test for distributions with so few events?
Line: 1262 to 1264

 META FILEATTACHMENT attachment="sidebandProjections.png" attr="" comment="" date="1565712492" name="sidebandProjections.png" path="sidebandProjections.png" size="70993" user="bfrancis" version="1" attachment="tfGaus_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfGaus_ZtoMuMu_NLayers5.png" path="tfGaus_ZtoMuMu_NLayers5.png" size="18835" user="bfrancis" version="1" attachment="tfFlat_ZtoMuMu_NLayers5_v2.png" attr="" comment="" date="1567010202" name="tfFlat_ZtoMuMu_NLayers5_v2.png" path="tfFlat_ZtoMuMu_NLayers5_v2.png" size="16846" user="bfrancis" version="1"
Changed:
<
<
 META FILEATTACHMENT attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567013753" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14030592" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567515552" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14257372" user="bfrancis" version="2"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 56: 2019-08-29 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 145 to 145
Section 8.2.5 question. This answer misses the point of my comment. Your estimate of the systematic uncertainty is simply a measure of MC statistics. It does not address the question of the systematic uncertainty associated with the method itself.
Changed:
<
<
As discussed in the first ARC meeting, this is a very common procedure used by many analyses. There is an observed difference between simulation and data in the hadronic recoil of di-muon events which must be accounted for, and the only uncertainty to apply is the statistical uncertainty of the observed difference.
>
>
As discussed in the first ARC meeting, this is a very common procedure used by many analyses. For example, the SUSY diphoton+MET analysis (SUS-17-011) uses the sum-pT of photon/electron pairs as a handle on the hadronic activity. The MET performance itself is measured (e.g. JME-17-001) using Z events; from their PAS: "A well-measured Z/γ boson provides a unique event axis and a precise momentum scale."
Finally, I see in the answer for the ARC meeting that the calculation of dxy for the track isolation may not have been done in the intended fashion. Can you specify what information you have on what was done? Is there any way to see the dxy distribution as calculated? Is there any way at this point to change to use dxy relative to the primary vertex (like dxy<0.02cm as is done for the signal tracks)? I see that this cut has a pretty big effect for the short lifetime case as shown in Figure 16.
Line: 252 to 252
L458: Can you add plots (at least a few examples) to the AN of the Z->ll mass distributions in the OS and SS categories used for the T&P method?
Changed:
<
<
We will add these plots.
>
>
We will add these plots. The 2018 data analysis was prioritized over this.
L490: I don't understand how this test shows that P_veto does not depend on track pT. How significant is the KS test for distributions with so few events?
Line: 374 to 374
L338-341: Would be nice to show the signal and background distributions for these two variables so we can evaluate for ourselves the effectiveness of the cut. Also, it would be helpful to report the background rejection and signal efficiency for these cuts.
Changed:
<
<
We are producing N-1 plots to show this.
>
>
We will produce N-1 plots to show this. The 2018 data analysis was prioritized over this.
Table 18: Are there no standard Tracking POG quality requirements on the tracks? I think they still have "loose" and "highPurity" requirements based on an MVA. Do you require either of these?
Line: 390 to 390
L368-370 and Table 20: Would be nice to see plots of Delta R to see why 0.15 is chosen. I would have expected a smaller value for muons and a larger value for electrons.
Changed:
<
<
We are producing N-1 plots to show this.
>
>
We will produce N-1 plots to show this. The 2018 data analysis was prioritized over this.
L363-370: Just to be clear, there are no requirements on the pT of the leptons? So, if there is a 4 GeV muon within Delta R of 0.15 of a 100 GeV track, then you reject the track?
Line: 412 to 412
This method was suggested by you in the review of EXO-16-044. The language of the AN regarding continuum DY has been updated (here) for clarity as you suggest. It is not relevant to the measurement of P(veto) to estimate the non-Z backgrounds in our tag-and-probe samples, so we do not -- the purpose of the same-sign subtraction is to increase the purity of the lepton flavor under study.
Changed:
<
<
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.
>
>
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.

Changed:
<
<
In making this table we found some bugs in the P(veto) script. Firstly, the N_{SS T&P} was not being subtracted in the denominator of Equation 6; for nlayers>=6 this is a negligible effect, but it is relevant in the newer, shorter categories. Secondly, in cases where the numerator of Equation 6 is negative, the N_{SS T&P}^{veto} subtraction was ignored when it should be assumed to be 0 +1.1 -0. Both of these issues are resolved and we are currently making the table. This slightly changes some estimates.
>
>
In making this table we found some bugs in the P(veto) script. Firstly, the N_{SS T&P} was not being subtracted in the denominator of Equation 6; for nlayers>=6 this is a negligible effect, but it is relevant in the newer, shorter categories. Secondly, in cases where the numerator of Equation 6 is negative, the N_{SS T&P}^{veto} subtraction was ignored when it should be assumed to be 0 +1.1 -0. This slightly changes some estimates.
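The corrected treatment of a negative numerator might be sketched as follows: the clamp at zero with a one-sided +1.1 uncertainty follows the text above (1.1 being the one-sided 68% Poisson interval for zero observed events); the function and variable names are illustrative, not from the analysis code.

```python
# When the same-sign subtraction drives the count of veto-passing probes
# negative, take it to be 0 with an asymmetric uncertainty of +1.1 / -0.
def ss_subtracted_numerator(n_os_veto, n_ss_veto):
    central = n_os_veto - n_ss_veto
    if central < 0:
        return 0.0, 1.1, 0.0  # central value, +uncertainty, -uncertainty
    return float(central), None, None  # uncertainties computed elsewhere
```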
L472-5: Are these recHits from the tracker or the calorimeter? I don't really understand what you are describing here. Are the electron seeds from the ECAL or pixel detector?
Line: 493 to 493
c) The purpose of the transfer factor is only to normalize the sideband rates to the signal region. We must describe this normalization in a way that does not depend on observing the signal region count, because in the nlayers=5 and >=6 categories the statistics do not allow for that. That is what the fit in Eq. 14 does.
Changed:
<
<
d) L629-630 quotes the transfer factor for the baseline sideband (0.05, 0.10) cm, so only one. The authors felt that Table 35 was large enough already, but we now provide an additional table listing the P^raw_fake and transfer factors.
>
>
d) L629-630 quotes the transfer factor for the baseline sideband (0.05, 0.10) cm, so only one. The authors felt that Table 35 was large enough already, but we now provide an additional table listing the P^raw_fake and transfer factors.
- What best describes your assumption of the fake track rate as a function of d0? Is it uniform (flat), Gaussian, Gaussian+flat, or something else?
Line: 534 to 534
Figure 31: Would be good to have a plot for nlayers=5 as well.
Changed:
<
<
We are producing this plot.
>
>
We will produce this plot. The 2018 data processing was prioritized over this.
Figure 35: Please include the ratio of the two since this provides the scale factors that are used. It may be better to simply include the region of 50-300 GeV on a linear scale.
Changed:
<
<
We are producing this plot.
>
>
We will produce this plot. The 2018 data processing was prioritized over this.
L752-754: While this signal yield reduction is interesting, just as interesting would be the change after all cuts are applied (with nlayers>=4). Can you provide this as well?
Changed:
<
<
Producing this.
>
>
We will produce this. The 2018 data processing was prioritized over this.

Changed:
<
<
L760-762: How sure are we that the Z can be used to measure ISR difference for the signal model? I generally agree with the statement that both recoil off ISR but it would be nice if this could be confirmed somehow. Does the pT distribution for a 100 GeV chargino look similar to a Z in Pythia8? Does the ISR reweighting work for ttbar events or diboson events?
>
>
L760-762: How sure are we that the Z can be used to measure ISR difference for the signal model? I generally agree with the statement that both recoil off ISR but it would be nice if this could be confirmed somehow. Does the pT distribution for a 100 GeV chargino look similar to a Z in Pythia8? Does the ISR reweighting work for ttbar events or diboson events?

Changed:
<
<
We are proceeding through the SIM/RECO steps for a small portion of the Pythia8 Drell-Yan sample generated for Figure 38. Applying the ISR weights and checking against the reconstructed MadGraph sample will be an effective check of the method. We are producing the pT distribution for 100 GeV; for 1000 GeV, for example, it will surely be different. The relevant issue is not that the Z's distribution looks like the electroweak-ino pair's, but that the simulation underestimates the ISR shape compared to data for the same event content.

>
>
See the above answer from the Email questions from Kevin Stenson August 27.
L772-773: Please expand on "is applied to the simulated signal samples". Do you reweight the events using the ISR jet in the event or the net momentum of the produced SUSY particles or something else.
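A minimal sketch of how such an ISR reweighting could be applied, assuming (as the text describes for the Z pT measurement) a binned data/MC weight looked up at the pT of the electroweak-ino pair system. The bin edges and weight values below are placeholders, not the measured ones:

```python
# Hypothetical sketch: look up a data/MC ISR weight (derived from the Z pT
# measurement) at the vector-summed pT of the produced SUSY pair.
# ISR_BIN_EDGES and ISR_WEIGHTS are illustrative placeholders only.
import bisect

ISR_BIN_EDGES = [0, 50, 100, 150, 200, 300, 13000]    # GeV, illustrative
ISR_WEIGHTS = [1.05, 1.10, 1.15, 1.20, 1.25, 1.30]    # data/MC, illustrative

def isr_weight(pair_pt):
    """Return the weight for the pT of the chargino-pair system."""
    i = bisect.bisect_right(ISR_BIN_EDGES, pair_pt) - 1
    i = min(max(i, 0), len(ISR_WEIGHTS) - 1)  # clamp under/overflow
    return ISR_WEIGHTS[i]

print(isr_weight(120.0))  # falls in the illustrative 100-150 GeV bin
```

The per-event weight would multiply the simulated signal event's weight; the key design choice is whether `pair_pt` is built from the generated SUSY particles or from the recoiling ISR jet, which is exactly the question above.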

#### Revision 552019-08-29 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 58 to 57
Despite the plots above, these average weights are very consistent with one, i.e. the estimate does not depend on this.
Changed:
<
<
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.
>
>
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.

Changed:
<
<
We could not find a justification and have concluded that, regardless of what inspired it, calculating the track isolation with respect to the origin was/is a mistake. However, as noted in the ARC meeting, this mistake is confined to the selection of tracks to be included in the track isolation sum, the effect of which is to reduce the efficacy of the track isolation requirement. Redoing this sum would require prohibitively large reprocessing, but fortunately this cut is redundant with the calorimeter isolation requirement for charged hadrons and electrons and with the muon delta R requirement for muons. The effect on the analysis is not significant, as will be seen in an in-progress plot (to be shown below) showing the observed events without the track isolation cut (N-1 plot).
>
>
As it turns out the AN is incorrect, and actually describes two versions of this isolation sum. AN lines 355-361 and Equation 4 describe the method used, which is to use dz with respect to the candidate track's vertex and to include both tracks' parameter uncertainties in the calculated sigma. Lines 352-355 and Table 19 describe a very old version which is not used; these have been removed from the AN.
Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.

#### Revision 542019-08-28 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Changed:
<
<
>
>

Line: 34 to 34

## ARC review

>
>
Current version of the AN: here

### ARC action items from July 25 meeting

<!--/twistyPlugin twikiMakeVisibleInline-->

Combine the nine dxy sideband regions in the fake estimate into one larger sideband.

Changed:
<
<
Done. The fake estimates change slightly but are all within 0.5 sigma (statistical) of the previous estimates. The draft AN linked above is updated to reflect this.
>
>
Done. The fake estimates change slightly but are all within 0.5sigma (statistical) of the previous estimates. The AN is updated (here).
Compare the pileup distributions in ZtoMuMu, ZtoEE, and BasicSelection events. If there is a big difference, try reweighting and see how much it changes the estimate.

 nPV ratios to BasicSelection
Line: 57 to 58
Despite the plots above, these average weights are very consistent with one, i.e. the estimate does not depend on this.
Changed:
<
<
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.

We could not find a justification and have concluded that regardless of what inspired it, calculating the track isolation with respect to the origin was/is a mistake. However, as noted in the ARC meeting, this mistake is confined to the selection of tracks to be included in track isolation sum. The effect of which is to reduce the efficacy of the track isolation requirement. Redoing this sum would require prohibitively large reprocessing but fortunately, due to the redundancy of this cut with the calorimeter isolation requirement for charged hadrons and electrons, and the muon delta R requirement for muons, the effect on the analysis is not significant as seen in the below plot showing the observed events without the track isolation cut (N-1 plot).

>
>
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.

Changed:
<
<
>
>
We could not find a justification and have concluded that, regardless of what inspired it, calculating the track isolation with respect to the origin was/is a mistake. However, as noted in the ARC meeting, this mistake is confined to the selection of tracks to be included in the track isolation sum, the effect of which is to reduce the efficacy of the track isolation requirement. Redoing this sum would require prohibitively large reprocessing, but fortunately this cut is redundant with the calorimeter isolation requirement for charged hadrons and electrons and with the muon delta R requirement for muons. The effect on the analysis is not significant, as will be seen in an in-progress plot (to be shown below) showing the observed events without the track isolation cut (N-1 plot).
Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.
Changed:
<
<
 ZtoMuMu NLayers5 (gaussian fit from NLayers4) ZtoMuMu NLayers5 (pol0 fit, finer binning)
| | | | | | |

With a flat line fit, the 5-layer fake estimate is 0.81 events with a chi2/dof of 7.0 (c.f. the Gaussian fit estimate of 1.00 with a chi2/dof of 1.5).

>
>
 ZtoMuMu NLayers5 (gaussian fit) ZtoMuMu NLayers5 (pol0 fit)

With a flat line, the 5-layer fake estimate is 0.81 events compared to the nominal 1.00.
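The flat-line (pol0) cross-check amounts to fitting a constant to the dxy-sideband bin counts and comparing chi2/dof against the Gaussian fit. A toy sketch of that comparison, with made-up bin counts (the analysis numbers are the 0.81 vs. 1.00 events quoted above):

```python
# Toy sketch of the pol0 check: fit a constant to dxy-sideband bin counts
# by unweighted least squares (the mean) and evaluate chi2 with Poisson
# errors sqrt(n). The counts below are illustrative, not analysis data.

def flat_fit_chi2(counts):
    """Return (best-fit constant, chi2/dof) for a pol0 fit."""
    mean = sum(counts) / len(counts)
    chi2 = sum((n - mean) ** 2 / n for n in counts if n > 0)
    dof = len(counts) - 1  # one fitted parameter
    return mean, chi2 / dof

counts = [12, 9, 15, 8, 14, 10, 11, 13]  # illustrative sideband bins
level, chi2_ndf = flat_fit_chi2(counts)
print(level, chi2_ndf)
```

A flat level consistent with the sideband counts but a poor chi2/dof (as the 7.0 vs. 1.5 above) indicates the dxy shape is not flat and the Gaussian model is preferred.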

<!--/twistyPlugin-->

### On-going 2018 estimate updates (last updated July 9)

Line: 142 to 138
Section 6.1.5 question. I'm not sure this makes sense. You say that you only use ttbar because that contributes the most. But the analysis includes a tag-and-probe assuming Z production. So, while Z->ll may not contribute to the overall background, it will certainly contribute a great deal to the measurement of P_veto. I think you should use all of the MC samples you have if you are trying to imitate the data.
Changed:
<
<
The closure test doesn't necessarily need to mimic the data, but should demonstrate that the method closes within a given sample. The other samples just did not have the statistics available to provide a reasonable observation and estimate.
>
>
The closure test doesn't necessarily need to mimic the data, but should demonstrate that the method closes within a given sample. The other samples just did not have the statistics available to provide a reasonable observation and estimate. We are now running over additional samples, but are confident the method will still close because after the signal selection the tracks in other samples will be very similar to the ones selected in ttbar.
Section 8.2.2 question. You write that the chargino behaves like a muon and so muons should be used as a proxy for measuring the efficiency. It is true that the chargino does not undergo hadronic interactions so it is more like a muon than a hadron. However, the chargino of interest for this analysis does NOT get reconstructed as a muon in the muon system. So it is unlike a muon in that sense. The track reconstruction explicitly uses information from the muon detectors to find all possible muon tracks and to associate as many hits as possible with the track. This is possible for muons but NOT for charginos that decay before the muon system. Therefore, using muons may overestimate the hit efficiency. If you found the same result for hadrons, then I would not be concerned. Or, if you just used muon tracks that were not found by the special muon-finding steps, then I would be happy. I know you have already used this but for reference, this shows the overall tracking efficiency for muons with and without the muon-specific steps: https://cds.cern.ch/record/2666648/files/DP2019_004.pdf As you can see there is a significant reduction in efficiency when the muon-specific steps are removed and a much larger disagreement between data and MC. I have no idea how this translates (or not) to the hit finding efficiency.
Changed:
<
<
We do, however, include the data/simulation difference in DP2019_004 as a systematic uncertainty, so the overall track reconstruction efficiency for charginos is well covered. Comparing the 0.02% hit systematic to DP2019_004's 2.1%, the hit systematic is negligible even if the extra iterations increase the hit-association efficiency several-fold.
>
>
The difference in hit-requirement efficiencies between data and simulation comes from the simulation's handling of non-functional channels, and this difference will be the same for simulated muons and charginos. Moreover, these differences can only occur once a track has been reconstructed in the first place, an issue covered by the systematic we take from DP2019_004. So while the issues you mention do affect tracking, they do not affect the discrepancy in non-functional channels handled by the systematics in question.
Section 8.2.5 question. This answer misses the point of my comment. Your estimate of the systematic uncertainty is simply a measure of MC statistics. It does not address the question of the systematic uncertainty associated with the method itself.
Line: 154 to 150
Finally, I see in the answer for the ARC meeting that the calculation of dxy for the track isolation may not have been done in the intended fashion. Can you specify what information you have on what was done. Is there any way to see the dxy distribution as calculated? Is there any way at this point to change to use dxy relative to the primary vertex (like dxy<0.02cm as is done for the signal tracks)? I see that this cut has a pretty big effect for the short lifetime case as shown in Figure 16.
Changed:
<
<
See the above answer from the ARC action items.
>
>
See the above answer from the ARC action items.

<!--/twistyPlugin-->
Line: 299 to 295
Table 19: More clarification on dxy and sigma of the track is needed. If dxy is truly with respect to the origin, that is a terrible idea. The beamspot is significantly displaced from the origin (by several mm). So dxy should be measured with respect to either the beamspot or the primary vertex. Regarding sigma, I guess you are saying that it only includes the calculated uncertainty on the track parameters. Can you provide a plot of dxy and sigma for the tracks. Preferably for all tracks. These are applied in the signal selection, so why not here?
Changed:
<
<
See the above answer from the ARC action items. Regarding sigma that is correct.
>
>
See the above answer from the ARC action items. Regarding sigma that is correct.
Table 22: I don't understand your response. Regarding Table 22, my specific questions are:
for electrons, why is there no veto on minDeltaR_track,muon>0.15 or minDeltaR_track,had_tau>0.15 or DeltaR_track,jet>0.5;
for muons, why is there no veto on minDeltaR_track,electron>0.15 or E_calo<10 GeV or minDeltaR_track,had_tau>0.15 or DeltaR_track,jet>0.5;
for taus, why is there no veto on minDeltaR_track,electron>0.15 or minDeltaR_track,muon>0.15.
Changed:
<
<
See the above answer from the ARC action items.
>
>
See the above answer from the ARC action items.
Figure 19 and Tables 29-31: I'm still not sure I understand. Is Figure 19 the plot for all probe tracks, regardless of whether there is a matching tag? If it is just all probe tracks that survive the tag-and-probe requirements, then the fact that they are all plotted and that you use all combinations should mean we get the same answer. On the other hand, it could be the same-sign subtraction is the cause of the difference. In addition to providing the four numbers that go into Equation 6, can you also provide the integrals of the blue and red points in the three plots of Figure 19? I'm hoping that N_T&P and N^veto_T&P will be similar to the integrals of the blue and red points, respectively.
Changed:
<
<
See: above.
>
>
See: above.
L514-518 and Tables 29-31: So my understanding is that when you write "tau background" you are really intending to identify the sum of taus and single hadronic track backgrounds. I think this approach is fine but there may still be some issues with the implementation. The measurement of P_veto is going to be dominated by taus as it is measured from Z events and has the same-sign background subtracted. If you are trying to apply P_veto to the sum of taus and single hadronic track background it seems like it is necessary to show that P_veto is the same for taus and single hadronic tracks. For P_offline, the single-tau control sample will clearly be a mix of taus and single hadronic tracks as the selection is relatively loose. This is probably good. However, it will also include fake tracks as there is no same-sign subtraction in this measurement. So I think there is still the possibility that you are including fake tracks here. I guess the fact that there are basically no 4 or 5 layer tracks suggests that the fake track contribution is negligible.
Line: 322 to 318
Section 6.1.5: Your response keeps mentioning that P_veto is small and therefore other sample are not useful. It may be true that other samples will not contribute when you select the signal region and compare to the background estimate. However, I would think the non-ttbar background will contribute to the measurement of P_offline and P_trigger, since these simply come from single-lepton samples. So, again, if you want this to mimic the data, I think you need to include all of the background MC samples.
Changed:
<
<
See: above.
>
>
See: above.
Section 6.2: Thanks for the info. I think you are correct that fake tracks are the main contributors to the Gaussian and flat portion of the dxy plot. I don't think there is any bias in the final fit that is done. However, the pattern recognition does have a bias for finding tracks that originate from the beamline. So that could be the reason. I am concerned about one statement. You write that you label a track as fake if it is not "matched to any hard interaction truth particle". Can you clarify what you mean? I am worried that you only check the tracks coming from the "hard interaction" rather than all the truth tracks (including from those from pileup). I think it would be wrong to classify pileup tracks as fake tracks.
Line: 330 to 326
Section 8.2.2: It is good to know that the hit efficiencies seem to be accurate. However, you also write that charginos behave like muons and so using muons is the correct way to evaluate the hit efficiency. You also write that "the reconstruction is done only with the tracker information". As I wrote earlier, for real muons, there are two additional tracking iterations that use muon detector information to improve the track reconstruction. This won't be the case for your charginos of interest because they decay before reaching the muon stations. So that is why I worry that muons are not a good substitute for your charginos.
Changed:
<
<
See: above.
>
>
See: above.
Section 8.2.5: This response does not really address the heart of the question. Suppose you did have infinite statistics in data and MC. Would we then be comfortable quoting no systematic uncertainty? Are we 100% sure that taking the Z pT and using that to reweight the pT spectrum of the electroweak-ino pair gives the correct distribution? Has anyone looked at applying this procedure to diboson production or ttbar production to confirm that it works?
Changed:
<
<
See: above.
>
>
See: above.
Section 8.2.11: Are you saying that for your 700 GeV chargino signal, the increase in trigger efficiency when going from HLT_MET120 to HLT_MET120 || HLT_MET105_IsoTrk50 is only 1%? I'm not sure this is relevant to my concern though. Suppose you have a signal that produces PFMET of 125 GeV. When you measure the efficiency for nlayers=4,5,6 tracks, you will get the same efficiency because the MC includes a HLT_PFMET120_PFMHT120_IDTight trigger in it. So, you say great, there is no systematic because the efficiencies are the same. However, in some fraction of the data this trigger path is disabled and so the efficiency would be quite different for nlayers=4 tracks (which won't get triggered) and nlayers=6+ tracks which will get triggered by HLT_MET105_IsoTrk50. So, my suggestion was to do the same study but only include triggers that are never disabled or prescaled. Basically, you are currently using HLT_PFMET120_PFMHT120_IDTight || HLT_PFMETNoMu120_PFMHTNoMu120_IDTight || HLT_MET105_IsoTrack50 to make the calculation. My suggestion is to use HLT_PFMET140_PFMHT140_IDTight || HLT_PFMETNoMu140_PFMHTNoMu140_IDTight || HLT_MET105_IsoTrack50. Alternatively, you could correctly weight the MC to account for the different luminosity periods when each trigger is active.
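The reviewer's alternative of weighting the MC by luminosity periods can be sketched as follows. The lumi fractions and per-period efficiencies below are placeholders (not measured values); the point is only the bookkeeping of averaging the trigger-OR efficiency over periods in which different paths were active:

```python
# Sketch: luminosity-weighted trigger efficiency when a path (e.g. the
# PFMET120 legs) is disabled for part of the data. All numbers are
# illustrative placeholders, not measured efficiencies.

def weighted_or_efficiency(periods):
    """periods: list of (lumi_fraction, efficiency of the active OR)."""
    assert abs(sum(f for f, _ in periods) - 1.0) < 1e-9
    return sum(f * eff for f, eff in periods)

# e.g. suppose the PFMET120 paths were disabled in 30% of the data,
# leaving only the never-prescaled paths in the OR for that fraction:
periods = [(0.7, 0.95),   # full OR active
           (0.3, 0.80)]   # PFMET120 paths disabled
print(weighted_or_efficiency(periods))
```

This makes the nlayers dependence visible: for short tracks that rely on HLT_MET105_IsoTrk50, the second period's efficiency would drop much further than for nlayers>=6 tracks.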
Line: 375 to 371
L332-334: It is claimed that a jet pT cut of >110 GeV removes a lot of background and not signal. The plots in Figure 11 don't seem to back this up. It seems like about the same percentage of signal and background events are removed by the cut. Can you quantify the effect of this cut (background rejection and signal efficiency)?
Changed:
<
<
Figure 11 has been updated to show jet pt >30 GeV instead of >55 GeV, as the issue is seen at lower pt. The efficiency of the >110 GeV cut is 84.6% in data and 87.2% for the signal sample shown.
>
>
Figure 11 has been updated (here) to show jet pt >30 GeV instead of >55 GeV, as the issue is seen at lower pt. The efficiency of the >110 GeV cut is 84.6% in data and 87.2% for the signal sample shown.
L338-341: Would be nice to show the signal and background distributions for these two variables so we can evaluate for ourselves the effectiveness of the cut. Also, it would be helpful to report the background rejection and signal efficiency for these cuts.
Line: 415 to 411
L456-460: Your assumption here is that the same-sign sample has the same production rate as the background. Have you verified this? You could verify it with a high statistics sample of dilepton events (or come up with a scaling factor if it is not exactly true). Also, in L457-458 you list three sources of background: DY, non-DY, fake tracks. I don't see how a same-sign sample can be used to estimate the DY background? I would suggest calling DY part of the signal. For di-electron and di-muon events, you also have the possibility of using the sidebands around the Z mass to estimate the background. You could check that this gives consistent results.
Changed:
<
<
This method was suggested by you in the review of EXO-16-044. The language of the AN regarding continuum DY has been updated for clarity as you suggest. It is not relevant to the measurement of P(veto) to estimate the non-Z backgrounds in our tag-and-probe samples, so we do not -- the purpose of the same-sign subtraction is to increase the purity of the lepton flavor under study.
>
>
This method was suggested by you in the review of EXO-16-044. The language of the AN regarding continuum DY has been updated (here) for clarity as you suggest. It is not relevant to the measurement of P(veto) to estimate the non-Z backgrounds in our tag-and-probe samples, so we do not -- the purpose of the same-sign subtraction is to increase the purity of the lepton flavor under study.
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.
Line: 483 to 479
Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.
Changed:
<
<
All samples are used, the caption is updated. The cleaning cuts are only relevant for the nlayers=3 category which is only used in an appendix (after the edits suggested below). Showing this for nlayers=4 would provide the same information as Figure 31.
>
>
All samples are used, the caption is updated (here). The cleaning cuts are only relevant for the nlayers=3 category which is only used in an appendix (after the edits suggested below). Showing this for nlayers=4 would provide the same information as Figure 31.
Section 6.2: This needs to be cleaned up and explained better. Here are some specific comments/suggestions - I think that L566-569, Figures 26-28, and L602-615 can all be removed. It seems like they have nothing to do with the analysis that is done. They just lead to confusion. If you want to move this material to Appendix C, that is fine. But don't clutter up this section.
Line: 522 to 518
- Figure 29: Why do you not fit the region |dxy|<0.1cm? If you fit out to |dxy|=1.0cm, please show the entire range in the plots. It would be nice to see the results for nlayers=5 and 6 as well so we can evaluate the extent to which a fit may or may not be possible and whether the shape is consistent with nlayers=4.
Changed:
<
<
The AN has been updated to correctly reflect the fit extending to |dxy|<0.5cm, the range of the plots. Should the d0 peak actually contain real tracks, it would peak more narrowly than observed in the sidebands; so |dxy|<0.1cm is excluded from the fit, and the count of nlayers=4 tracks in the signal region is checked against the fit prediction, and agrees.
>
>
The AN has been updated (here) to correctly reflect the fit extending to |dxy|<0.5cm, the range of the plots. Should the d0 peak actually contain real tracks, it would peak more narrowly than observed in the sidebands; so |dxy|<0.1cm is excluded from the fit, and the count of nlayers=4 tracks in the signal region is checked against the fit prediction, and agrees.
Shown below are the nlayers=5 d0 distributions, with the fit from nlayers=4 overlaid. The nlayers>=6 samples have one (three) events in ZtoMuMu (ZtoEE), so no fit is possible.
Line: 997 to 993
finish some systematic uncertainties implement trigger scale factors (L615)
Changed:
<
<
The trigger scale factors are actually implemented in the signal yield; it seems we saw "XX%" and overlooked them, thinking they were systematic uncertainties -- these numbers on L615 are now updated. The remaining analysis tasks are to investigate possible optimizations as mentioned above and to finish the signal systematics.
>
>
The trigger scale factors are actually implemented in the signal yield; it seems we saw "XX%" and overlooked them, thinking they were systematic uncertainties -- these numbers on L615 are now updated (here). The remaining analysis tasks are to investigate possible optimizations as mentioned above and to finish the signal systematics.
Please add what you told me in your answer about the Ecalo selection to the note, namely that it primarily removes electrons and optimizing this criterion to increase the search sensitivity will be ineffective because it will be primarily be cutting into pileup calo deposits rather than real backgrounds. Adding this to the note will help anyone else who reads it and could have the same question that I did.
Line: 1267 to 1263

 META FILEATTACHMENT attachment="comparePU_fakeCRs.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs.jpg" path="comparePU_fakeCRs.jpg" size="116660" user="bfrancis" version="1" attachment="comparePU_fakeCRs_ratio.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs_ratio.jpg" path="comparePU_fakeCRs_ratio.jpg" size="128471" user="bfrancis" version="1" attachment="sidebandProjections.png" attr="" comment="" date="1565712492" name="sidebandProjections.png" path="sidebandProjections.png" size="70993" user="bfrancis" version="1"
Deleted:
<
<
 META FILEATTACHMENT attachment="tfFlat_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfFlat_ZtoMuMu_NLayers5.png" path="tfFlat_ZtoMuMu_NLayers5.png" size="16846" user="bfrancis" version="1"

 META FILEATTACHMENT attachment="tfGaus_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfGaus_ZtoMuMu_NLayers5.png" path="tfGaus_ZtoMuMu_NLayers5.png" size="18835" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="tfFlat_ZtoMuMu_NLayers5_v2.png" attr="" comment="" date="1567010202" name="tfFlat_ZtoMuMu_NLayers5_v2.png" path="tfFlat_ZtoMuMu_NLayers5_v2.png" size="16846" user="bfrancis" version="1" attachment="AN_18_311_currentVersion.pdf" attr="" comment="" date="1567013753" name="AN_18_311_currentVersion.pdf" path="AN_18_311_currentVersion.pdf" size="14030592" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 532019-08-28 - ChristopherHill

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 40 to 40
Combine the nine dxy sideband regions in the fake estimate into one larger sideband.
Changed:
<
<
Done. The fake estimates change slightly but are all within 0.5sigma (statistical) of the previous estimates. The AN is updated.
>
>
Done. The fake estimates change slightly but are all within 0.5 sigma (statistical) of the previous estimates. The draft AN linked above is updated to reflect this.
Compare the pileup distributions in ZtoMuMu, ZtoEE, and BasicSelection events. If there is a big difference, try reweighting and see how much it changes the estimate.

 nPV ratios to BasicSelection
Line: 65 to 66
Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.

 ZtoMuMu NLayers5 (gaussian fit from NLayers4) ZtoMuMu NLayers5 (pol0 fit, finer binning)
Changed:
<
<

With a flat line, the 5-layer fake estimate is 0.81 events compared to the nominal 1.00.

>
>
| | | | | | |

With a flat line fit, the 5-layer fake estimate is 0.81 events with a chi2/dof of 7.0 (c.f. the Gaussian fit estimate of 1.00 with a chi2/dof of 1.5).

</>
<!--/twistyPlugin-->

### On-going 2018 estimate updates (last updated July 9)

#### Revision 522019-08-28 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 1263 to 1263

 META FILEATTACHMENT attachment="comparePU_fakeCRs.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs.jpg" path="comparePU_fakeCRs.jpg" size="116660" user="bfrancis" version="1" attachment="comparePU_fakeCRs_ratio.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs_ratio.jpg" path="comparePU_fakeCRs_ratio.jpg" size="128471" user="bfrancis" version="1" attachment="sidebandProjections.png" attr="" comment="" date="1565712492" name="sidebandProjections.png" path="sidebandProjections.png" size="70993" user="bfrancis" version="1"
Changed:
<
<
 META FILEATTACHMENT attachment="tfFlat_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567002070" name="tfFlat_ZtoMuMu_NLayers5.png" path="tfFlat_ZtoMuMu_NLayers5.png" size="13474" user="bfrancis" version="1" attachment="tfFlat_ZtoMuMu_NLayers5_v2.png" attr="" comment="" date="1567002102" name="tfFlat_ZtoMuMu_NLayers5_v2.png" path="tfFlat_ZtoMuMu_NLayers5_v2.png" size="13474" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="tfFlat_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfFlat_ZtoMuMu_NLayers5.png" path="tfFlat_ZtoMuMu_NLayers5.png" size="16846" user="bfrancis" version="1" attachment="tfGaus_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567007928" name="tfGaus_ZtoMuMu_NLayers5.png" path="tfGaus_ZtoMuMu_NLayers5.png" size="18835" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 512019-08-28 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 32 to 32
NOTE: Questions are in Red (Unanswered), Green (Answered), or Purple (In Progress), while answers are in Blue.
Changed:
<
<

>
>

## ARC review

### ARC action items from July 25 meeting

<!--/twistyPlugin twikiMakeVisibleInline-->

Combine the nine dxy sideband regions in the fake estimate into one larger sideband.

Done. The fake estimates change slightly but are all within 0.5sigma (statistical) of the previous estimates. The AN is updated.

Compare the pileup distributions in ZtoMuMu, ZtoEE, and BasicSelection events. If there is a big difference, try reweighting and see how much it changes the estimate.

 nPV ratios to BasicSelection

See the above plots. Using the ratios as weights to the fake selection (ZtoMuMu/EE + DisTrk (no d0 cut)), the overall weights applied to P_fake^raw would be:

| | ZtoMuMu | ZtoEE |
| nLayers = 4 | 0.994 +- 0.064 | 1.01 +- 0.34 |
| nLayers = 5 | 1.013 +- 0.088 | 1.0 +- 1.4 |
| nLayers >= 6 | 1.0 +- 0.21 | 1.02 +- 0.83 |

Despite the plots above, these average weights are very consistent with one, i.e. the estimate does not depend on this.
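The cross-check above reduces to computing the event-weighted mean of the per-nPV ratio weights, which then multiplies P_fake^raw. A minimal sketch with made-up inputs (the analysis values are the numbers in the table above):

```python
# Sketch (our own construction): average nPV-ratio weight that would
# multiply P_fake^raw after reweighting the fake-selection events to the
# BasicSelection pileup profile. Counts and ratios are illustrative.

def average_weight(npv_counts, npv_ratios):
    """Event-weighted mean of the per-nPV-bin ratio weights."""
    total = sum(npv_counts)
    return sum(n * w for n, w in zip(npv_counts, npv_ratios)) / total

counts = [100, 250, 300, 200, 80]         # events per nPV bin (made up)
ratios = [0.90, 0.97, 1.00, 1.05, 1.12]   # BasicSelection/CR ratio (made up)
print(average_weight(counts, ratios))
```

An average close to one, as in the table, means the pileup difference between the control regions and BasicSelection does not move the estimate.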

If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.

We could not find a justification and have concluded that regardless of what inspired it, calculating the track isolation with respect to the origin was/is a mistake. However, as noted in the ARC meeting, this mistake is confined to the selection of tracks to be included in track isolation sum. The effect of which is to reduce the efficacy of the track isolation requirement. Redoing this sum would require prohibitively large reprocessing but fortunately, due to the redundancy of this cut with the calorimeter isolation requirement for charged hadrons and electrons, and the muon delta R requirement for muons, the effect on the analysis is not significant as seen in the below plot showing the observed events without the track isolation cut (N-1 plot).

Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.

 ZtoMuMu NLayers5 (gaussian fit from NLayers4) ZtoMuMu NLayers5 (pol0 fit, finer binning)

With a flat line, the 5-layer fake estimate is 0.81 events compared to the nominal 1.00.

<!--/twistyPlugin-->

### On-going 2018 estimate updates (last updated July 9)

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 81 to 120

<!--/twistyPlugin-->
Changed:
<
<

>
>

## Questions from ARC

### Email questions from Kevin Stenson August 27

<!--/twistyPlugin twikiMakeVisibleInline-->

AN Table 22 question. I do not understand the answer. It seems like you are agreeing with me. Maybe I don't understand Table 22. Let's just take the muon case for now. You want to measure the probability that a muon passes the lepton veto. You basically just measure the fraction of probe tracks that pass the lepton veto. According to Table 22, you measure the fraction of probe tracks that pass the criteria minDeltaR_track,muon > 0.15 and missing outer hits > 2. This is P_veto. However, the actual veto that you apply in the signal region is the 5 requirements in Tables 20-21. So, why don't you measure P_veto for muons by measuring the fraction of probe tracks that pass the criteria of all 5 requirements in Tables 20-21.

We require a pure sample of each flavor, and using all five requirements would reduce that purity. The listed requirements are the most powerful at rejecting the given flavor, so they are the correct choices to study.

Figure 19 and Tables 29-31 questions. If Figure 19 and Tables 29-31 both utilize all probe tracks then I would expect them to give the same result. So I'm not sure how that affects anything. In principle, the same sign subtraction could have an effect. Hopefully once the tables I asked for are made I will be able to understand if that is the reason.

These are "all probe tracks"; no Z -> T&P requirement has been applied yet. This is a technical necessity in creating a pool of tags and probes in which to find all possible pairs. The table you requested has been made for the coming AN update, and for this channel:

P(veto) := (35 - 2) / (1213660 - 1437) = 2.72e-5
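The arithmetic of the same-sign subtraction behind this number can be checked directly; a minimal sketch using the counts quoted above:

```python
# P(veto) with same-sign (SS) background subtraction: SS counts are
# subtracted from opposite-sign (OS) counts in numerator and denominator.
n_pass_os, n_pass_ss = 35, 2        # probe tracks passing the veto
n_all_os, n_all_ss = 1213660, 1437  # all probe tracks

p_veto = (n_pass_os - n_pass_ss) / (n_all_os - n_all_ss)
print(f"{p_veto:.2e}")  # 2.72e-05
```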

Section 6.1.5 question. I'm not sure this makes sense. You say that you only use ttbar because that contributes the most. But the analysis includes a tag-and-probe assuming Z production. So, while Z->ll may not contribute to the overall background, it will certainly contribute a great deal to the measurement of P_veto. I think you should use all of the MC samples you have if you are trying to imitate the data.

The closure test doesn't necessarily need to mimic the data, but should demonstrate that the method closes within a given sample. The other samples just did not have the statistics available to provide a reasonable observation and estimate.

Section 8.2.2 question. You write that the chargino behaves like a muon and so muons should be used as proxy for measuring the efficiency. It is true that the chargino does not undergo hadronic interactions so it is more like a muon than a hadron. However, the chargino of interest for this analysis does NOT get reconstructed as a muon in the muon system. So it is unlike a muon in that sense. The track reconstruction explicitly uses information from the muon detectors to find all possible muon tracks and to associated as many hits as possible with the track. This is possible for muons but NOT for charginos that decay before the muon system. Therefore, using muons may overestimate the hit efficiency. If you found the same result for hadrons, then I would not be concerned. Or, if you just used muon tracks that were not found by the special muon-finding steps, then I would be happy. I know you have already used this but for reference, this shows the overall tracking efficiency for muons with and without the muon-specific steps: https://cds.cern.ch/record/2666648/files/DP2019_004.pdf As you can see there is a significant reduction in efficiency when the muon-specific steps are removed and a much larger disagreement between data and MC. I have no idea how this translates (or not) to the hit finding efficiency.

We do, however, include the data/simulation difference in DP2019_004 as a systematic uncertainty, so the overall track reconstruction efficiency for charginos is well covered. Comparing the 0.02% hit systematic to DP2019_004's 2.1%, the hit systematic is negligible even if the extra iterations increase the hit-association efficiency several-fold.

Section 8.2.5 question. This answer misses the point of my comment. Your estimate of the systematic uncertainty is simply a measure of MC statistics. It does not address the question of the systematic uncertainty associated with the method itself.

As discussed in the first ARC meeting, this is a very common procedure used by many analyses. There is an observed difference between simulation and data in the hadronic recoil of di-muon events which must be accounted for, and the only uncertainty to apply is the statistical uncertainty of the observed difference.

Finally, I see in the answer for the ARC meeting that the calculation of dxy for the track isolation may not have been done in the intended fashion. Can you specify what information you have on what was done. Is there any way to see the dxy distribution as calculated? Is there any way at this point to change to use dxy relative to the primary vertex (like dxy<0.02cm as is done for the signal tracks)? I see that this cut has a pretty big effect for the short lifetime case as shown in Figure 16.

See the above answer from the ARC action items.

<!--/twistyPlugin-->

### Comments from Giacomo Sguazzoni HN August 13

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 146 to 219
The efficiency change with/without the correction would be extremely large (~70% for some example points) and not relevant, since differences between Pythia and Madgraph are well known. Removing the Madgraph/Pythia correction, we do not have a comparison of data/Pythia with which to make the correction -- an entirely Pythia-based SM background campaign would be required. In the limit of infinite statistics in the data/Madgraph/Pythia samples, we would essentially have a perfect tune for Z ISR and there would be no uncertainty. Another possibility we are pursuing is to generate a sample of our signal in Madgraph, the expectation being an ISR distribution equal to that of the 10M DY+jets events we generated to form the Madgraph/Pythia weights.
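Schematically, the Madgraph/Pythia correction is a binned ratio of Z pT spectra applied as a per-event weight to the Pythia-generated signal. The binning and spectrum values below are illustrative assumptions, not the derived weights:

```python
import bisect

# Illustrative Z pT binning (GeV) and normalized spectra from the two generators.
edges = [0, 50, 100, 200, 500]
madgraph = [0.50, 0.30, 0.15, 0.05]  # fraction of events per bin
pythia   = [0.60, 0.25, 0.10, 0.05]

# Per-bin weight: Madgraph/Pythia ratio, applied to Pythia-generated events.
weights = [m / p for m, p in zip(madgraph, pythia)]

def isr_weight(z_pt):
    """Look up the reweighting factor for a given system pT."""
    i = bisect.bisect_right(edges, z_pt) - 1
    i = min(max(i, 0), len(weights) - 1)  # clamp under/overflow to edge bins
    return weights[i]
```

In the analysis the data/Madgraph comparison supplies a second, analogous weight on top of this one.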
<!--/twistyPlugin-->
Changed:
<
<

## Questions from Joe Pastika HN August 7

>
>

### Questions from Joe Pastika HN August 7

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 215 to 288

<!--/twistyPlugin-->
Changed:
<
<

## ARC action items from July 25 meeting

>
>

### "Some followup questions" from Kevin Stenson HN July 24

<!--/twistyPlugin twikiMakeVisibleInline-->
Changed:
<
<
Combine the nine dxy sideband regions in the fake estimate into one larger sideband.
>
>
Table 19: More clarification on dxy and sigma of the track is needed. If dxy is truly with respect to the origin, that is a terrible idea. The beamspot is significantly displaced from the origin (by several mm). So dxy should be measured with respect to either the beamspot or the primary vertex. Regarding sigma, I guess you are saying that it only includes the calculated uncertainty on the track parameters. Can you provide a plot of dxy and sigma for the tracks. Preferably for all tracks. These are applied in the signal selection, so why not here?

Changed:
<
<
Done. The fake estimates change slightly but are all within 0.5sigma (statistical) of the previous estimates. The AN is updated.
>
>
See the above answer from the ARC action items. Regarding sigma that is correct.

Changed:
<
<
Compare the pileup distributions in ZtoMuMu, ZtoEE, and BasicSelection events. If there is a big difference, try reweighting and see how much it changes the estimate.
>
>
Table 22: I don't understand your response. Regarding Table 22, my specific questions are:
- for electrons, why is there no veto on minDeltaR_track,muon > 0.15 or minDeltaR_track,had_tau > 0.15 or DeltaR_track,jet > 0.5
- for muons, why is there no veto on minDeltaR_track,electron > 0.15 or E_calo < 10 GeV or minDeltaR_track,had_tau > 0.15 or DeltaR_track,jet > 0.5
- for taus, why is there no veto on minDeltaR_track,electron > 0.15 or minDeltaR_track,muon > 0.15

Changed:
<
<
 nPV ratios to BasicSelection
>
>
See the above answer from the ARC action items.

Changed:
<
<
See the above plots. Using the ratios as weights to the fake selection (ZtoMuMu/EE + DisTrk (no d0 cut)), the overall weights applied to P_fake^raw would be:
>
>
Figure 19 and Tables 29-31: I'm still not sure I understand. Is Figure 19 the plot for all probe tracks, regardless of whether there is a matching tag? If it is just all probe tracks that survive the tag-and-probe requirements, then the fact that they are all plotted and that you use all combinations should mean we get the same answer. On the other hand, it could be the same-sign subtraction is the cause of the difference. In addition to providing the four numbers that go into Equation 6, can you also provide the integrals of the blue and red points in the three plots of Figure 19? I'm hoping that N_T&P and N^veto_T&P will be similar to the integrals of the blue and red points, respectively.

Changed:
<
<
| | ZtoMuMu | ZtoEE |
| nLayers = 4 | 0.994 +- 0.064 | 1.01 +- 0.34 |
| nLayers = 5 | 1.013 +- 0.088 | 1.0 +- 1.4 |
| nLayers >= 6 | 1.0 +- 0.21 | 1.02 +- 0.83 |
>
>
See: above.

Changed:
<
<
Despite the plots above, these average weights are very consistent with one, i.e. the estimate does not depend on this.
>
>
L514-518 and Tables 29-31: So my understanding is that when you write "tau background" you are really intending to identify the sum of taus and single hadronic track backgrounds. I think this approach is fine but there may still be some issues with the implementation. The measurement of P_veto is going to be dominated by taus as it is measured from Z events and has the same-sign background subtracted. If you are trying to apply P_veto to the sum of taus and single hadronic track background it seems like it is necessary to show that P_veto is the same for taus and single hadronic tracks. For P_offline, the single-tau control sample will clearly be a mix of taus and single hadronic tracks as the selection is relatively loose. This is probably good. However, it will also include fake tracks as there is no same-sign subtraction in this measurement. So I think there is still the possibility that you are including fake tracks here. I guess the fact that there are basically no 4 or 5 layer tracks suggests that the fake track contribution is negligible.

Changed:
<
<
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.
>
>
We agree. The tau control region is not sufficient to study the composition of real taus versus single hadronic tracks, so there is no feasible study of fake contamination here. Even if, for example, the contamination were 50-100%, the estimates would still be statistically consistent with what we have now.

Changed:
<
<
We've found where we initially took this from several years ago; however, it seems it was adopted incorrectly. We still observe no pileup dependence of the track isolation requirement.
>
>
Figure 25 and Tables 29 and 31: The Figure 25 caption seems to suggest (to me) that the plots show the projection of the events in the upper-right of the red lines. In actuality, the plots show the full distribution. I would suggest changing the plots to only include the events from the upper-right of the red lines in Figures 20-22.

Changed:
<
<
Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.
>
>
Instead the caption has been changed to be more accurate.

Changed:
<
<
With a flat line the transfer factor is purely a normalization issue and has no uncertainty; it is always 0.02 / 0.45 = 0.0444. The table below is added to the AN:
>
>
Section 6.1.5: Your response keeps mentioning that P_veto is small and therefore other sample are not useful. It may be true that other samples will not contribute when you select the signal region and compare to the background estimate. However, I would think the non-ttbar background will contribute to the measurement of P_offline and P_trigger, since these simply come from single-lepton samples. So, again, if you want this to mimic the data, I think you need to include all of the background MC samples.

Changed:
<
<
>
>
See: above.

Changed:
<
<
 ZtoMuMu NLayers5 (gaussian fit from NLayers4) ZtoMuMu NLayers5 (pol0 fit, finer binning)
>
>
Section 6.2: Thanks for the info. I think you are correct that fake tracks are the main contributors to the Gaussian and flat portion of the dxy plot. I don't think there is any bias in the final fit that is done. However, the pattern recognition does have a bias for finding tracks that originate from the beamline. So that could be the reason. I am concerned about one statement. You write that you label a track as fake if it is not "matched to any hard interaction truth particle". Can you clarify what you mean? I am worried that you only check the tracks coming from the "hard interaction" rather than all the truth tracks (including from those from pileup). I think it would be wrong to classify pileup tracks as fake tracks.

Technically this means there is no "packedGenParticles" (i.e. status = 1) object with pt > 10 GeV within deltaR < 0.1 of the selected track. This designation is not used for any measurement in the analysis, and whatever the source of the fake tracks, one would need to treat them in data as we've done; we just do not separate them by source.
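The matching criterion described here can be written compactly; a sketch assuming simple dict-based track and gen-particle records (the field names are illustrative, not the analysis code):

```python
import math

def is_fake_track(track, gen_particles, min_pt=10.0, max_dr=0.1):
    """Label a track as fake if no status-1 generator particle with
    pt > 10 GeV lies within deltaR < 0.1 of it."""
    for gp in gen_particles:
        if gp["status"] != 1 or gp["pt"] < min_pt:
            continue
        dphi = (track["phi"] - gp["phi"] + math.pi) % (2 * math.pi) - math.pi
        if math.hypot(track["eta"] - gp["eta"], dphi) < max_dr:
            return False  # matched to a truth particle: not fake
    return True
```

Note that any status-1 particle in the collection can satisfy the match, so the label does not depend on whether the particle came from the hard interaction.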

Section 8.2.2: It is good to know that the hit efficiencies seem to be accurate. However, you also write that charginos behave like muons and so using muons is the correct way to evaluate the hit efficiency. You also write that "the reconstruction is done only with the tracker information". As I wrote earlier, for real muons, there are two additional tracking iterations that use muon detector information to improve the track reconstruction. This won't be the case for your charginos of interest because they decay before reaching the muon stations. So that is why I worry that muons are not a good substitute for your charginos.

See: above.

Section 8.2.5: This response does not really address the heart of the question. Suppose you did have infinite statistics in data and MC. Would we then be comfortable quoting no systematic uncertainty? Are we 100% sure that taking the Z pT and using that to reweight the pT spectrum of the electroweak-ino pair gives the correct distribution? Has anyone looked at applying this procedure to diboson production or ttbar production to confirm that it works?

Changed:
<
<
The flat assumption reduces the estimates by ~2/3 and the agreement is worse, especially as there would be no 40-50% fit uncertainty.
>
>
See: above.

Section 8.2.11: Are you saying that for your 700 GeV chargino signal, the increase in trigger efficiency when going from HLT_MET120 to HLT_MET120 || HLT_MET105_IsoTrk50 is only 1%? I'm not sure this is relevant to my concern though. Suppose you have a signal that produces PFMET of 125 GeV. When you measure the efficiency for nlayers=4,5,6 tracks, you will get the same efficiency because the MC includes a HLT_PFMET120_PFMHT120_IDTight trigger in it. So, you say great, there is no systematic because the efficiencies are the same. However, in some fraction of the data this trigger path is disabled and so the efficiency would be quite different for nlayers=4 tracks (which won't get triggered) and nlayers=6+ tracks which will get triggered by HLT_MET105_IsoTrk50. So, my suggestion was to do the same study but only include triggers that are never disabled or prescaled. Basically, you are currently using HLT_PFMET120_PFMHT120_IDTight || HLT_PFMETNoMu120_PFMHTNoMu120_IDTight || HLT_MET105_IsoTrack50 to make the calculation. My suggestion is to use HLT_PFMET140_PFMHT140_IDTight || HLT_PFMETNoMu140_PFMHTNoMu140_IDTight || HLT_MET105_IsoTrack50. Alternatively, you could correctly weight the MC to account for the different luminosity periods when each trigger is active.

The increase in signal acceptance x efficiency is only 1%; the increase in trigger efficiency is as shown in Figure 44.

What we've done with the trigger efficiency scale factors is to average the data efficiency over the entire data set, which includes the histories of every path. As you suggest, we could alternatively measure a different scale factor for every data period and then take a lumi-weighted average over the periods -- but these methods are equivalent. The loss of efficiency in 2017B is included in the data efficiency and thus in the scale factor on signal efficiency.
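The claimed equivalence holds whenever the MC efficiency is common to all periods, since the division by it commutes with the lumi-weighted sum. A numerical check with made-up period luminosities and efficiencies (all values below are illustrative, not measured):

```python
# Illustrative per-period integrated luminosities (fb^-1) and data trigger
# efficiencies; the first period is lower, mimicking a disabled path.
lumi = [4.8, 9.6, 4.2, 9.3, 13.5]
eff_data = [0.62, 0.71, 0.72, 0.73, 0.74]
eff_mc = 0.75  # assume a single, period-independent MC efficiency

total_lumi = sum(lumi)

# Method 1: average the data efficiency over the full dataset, then divide once.
sf_average_first = sum(l * e for l, e in zip(lumi, eff_data)) / total_lumi / eff_mc

# Method 2: a scale factor per period, lumi-weighted afterwards.
sf_per_period = sum(l * (e / eff_mc) for l, e in zip(lumi, eff_data)) / total_lumi

# The two agree because eff_mc factors out of the weighted sum.
assert abs(sf_average_first - sf_per_period) < 1e-12
```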

<!--/twistyPlugin-->
Changed:
<
<

## First set of comments from Kevin Stenson HN July 14

>
>

### First set of comments from Kevin Stenson HN July 14

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 587 to 674

<!--/twistyPlugin-->
Changed:
<
<

## Questions from Juan Alcaraz (July 5)

>
>

### Questions from Juan Alcaraz (July 5)

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 612 to 699

<!--/twistyPlugin-->
Changed:
<
<

>
>

## Pre-approval

### Additional pre-approval followup Ivan Mikulec HN June 21

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 635 to 724

<!--/twistyPlugin-->
Changed:
<
<

## Questions from pre-approval EXO June 1

>
>

### Questions from pre-approval EXO June 1

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 1173 to 1262

 META FILEATTACHMENT attachment="tf_ZtoMuMu_NLayers5.png" attr="" comment="" date="1563389063" name="tf_ZtoMuMu_NLayers5.png" path="tf_ZtoMuMu_NLayers5.png" size="17113" user="bfrancis" version="1" attachment="comparePU_fakeCRs.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs.jpg" path="comparePU_fakeCRs.jpg" size="116660" user="bfrancis" version="1" attachment="comparePU_fakeCRs_ratio.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs_ratio.jpg" path="comparePU_fakeCRs_ratio.jpg" size="128471" user="bfrancis" version="1"
Deleted:
<
<
 META FILEATTACHMENT attachment="tfFlat_ZtoMuMu_NLayers5.png" attr="" comment="" date="1565462840" name="tfFlat_ZtoMuMu_NLayers5.png" path="tfFlat_ZtoMuMu_NLayers5.png" size="14955" user="bfrancis" version="1"

 META FILEATTACHMENT attachment="sidebandProjections.png" attr="" comment="" date="1565712492" name="sidebandProjections.png" path="sidebandProjections.png" size="70993" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="tfFlat_ZtoMuMu_NLayers5.png" attr="" comment="" date="1567002070" name="tfFlat_ZtoMuMu_NLayers5.png" path="tfFlat_ZtoMuMu_NLayers5.png" size="13474" user="bfrancis" version="1" attachment="tfFlat_ZtoMuMu_NLayers5_v2.png" attr="" comment="" date="1567002102" name="tfFlat_ZtoMuMu_NLayers5_v2.png" path="tfFlat_ZtoMuMu_NLayers5_v2.png" size="13474" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 502019-08-19 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 142 to 142
It would be more appropriate to say that P_fake^raw,i is independent, correct. However with the use of one larger sideband this sentence is no longer relevant.
Changed:
<
<
l 326: among the systematic uncertainties, the one associated to the ISR modelling is the largest; nevertheless is it sufficient? You just consider 1sigma of statistical fluctuation (according to the AN, Section 8.2.5), a quantity that, in principle, you can reduce by increasing the statistics. Is there no systematic associated to the reweighing method itself? I think it is needed given the large correction factors you end up with (350%). A possibility (indeed extreme but for sure conservative) would be to evaluate the efficiency change with and without correction factors. Which would be the systematic in this case?
>
>
l 326: among the systematic uncertainties, the one associated to the ISR modelling is the largest; nevertheless is it sufficient? You just consider 1sigma of statistical fluctuation (according to the AN, Section 8.2.5), a quantity that, in principle, you can reduce by increasing the statistics. Is there no systematic associated to the reweighing method itself? I think it is needed given the large correction factors you end up with (350%). A possibility (indeed extreme but for sure conservative) would be to evaluate the efficiency change with and without correction factors. Which would be the systematic in this case?

Changed:
<
<
<!--/twistyPlugin-->
>
>
The efficiency change with/without the correction would be extremely large (~70% for some example points) and not relevant, since differences between Pythia and Madgraph are well known. Removing the Madgraph/Pythia correction, we do not have a comparison of data/Pythia with which to make the correction -- an entirely Pythia-based SM background campaign would be required. In the limit of infinite statistics in the data/Madgraph/Pythia samples, we would essentially have a perfect tune for Z ISR and there would be no uncertainty. Another possibility we are pursuing is to generate a sample of our signal in Madgraph, the expectation being an ISR distribution equal to that of the 10M DY+jets events we generated to form the Madgraph/Pythia weights.
<!--/twistyPlugin-->

## Questions from Joe Pastika HN August 7

#### Revision 492019-08-16 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 87 to 87
Comments are for the paper (v0 2019/05/16) with references to the AN (v7) when applicable.
Changed:
<
<
L 15-16: "Decaying to a weakly-interacting, stable neutralino and
>
>
L 15-16: "Decaying to a weakly-interacting, stable neutralino and

16 unreconstructed pion, a chargino decay often leaves a disappearing track in the AMSB model." -->
it seems the disappearing track is left in the model!
Changed:
<
<
l 25: exclude -->
*excludes*
>
>
Reworded slightly.

Changed:
<
<
l 31-32: the interpretation ... are -->
the interpretation ... *is*
>
>
l 25: exclude -->
*excludes*

Changed:
<
<
The CMS detector: As already noted, a more detailed description of the tracker is needed. Here the geometry of the tracker is not described at all, while this is important to clarify the concept of 'measurement layer' and their numbers. I'm wondering if a picture of the tracker layout would be appropriate in this paper.
>
>
Fixed.

Changed:
<
<
l 75: blank missing between chi^0_1 and 'mass'
>
>
l 31-32: the interpretation ... are -->
the interpretation ... *is*

Changed:
<
<
l 87: these correction factors are huge (350%); with these correction factors involved, how can you trust the simulation? Moreover, the description in the paper is misleading, I think. You give the impression that the 350% factor derives from the ISR mismodelling in the Z->mumu events. But, reading the AN section 7.4, I understand that the big factor derives from AN Fig. 38, i.e. Madgraph vs. Pythia for Z->mumu, which you need since your signal is simulated with Pythia. I think this point has to be better explained in the paper, because the reader could be surprised to learn that we are 350% off in Z->mumu simulation. See below also the discussion on the systematic uncertainty associated with this correction.
>
>
Fixed.

Changed:
<
<
l 112: I think PF as an acronym of particle flow has never been defined
>
>
The CMS detector: As already noted, a more detailed description of the tracker is needed. Here the geometry of the tracker is not described at all, while this is important to clarify the concept of 'measurement layer' and their numbers. I'm wondering if a picture of the tracker layout would be appropriate in this paper.

Changed:
<
<
l 133-155: concept is clear but the description is complicated and I think there is room for improvement; in case they are missing, 'hits' are counted as 'layers' (what about using a different nomenclature? e.g. 'layer measurement'); but what about not missing hits, the one on which you apply the cut? what about the 4 hits you require? This is relevant with respect to overlaps. Clarify (you may need to introduce a discussion on overlaps here and in the tracker description).
>
>
A sentence has been added to specify the position of each layer of the Phase 1 upgrade. L 186-190 should clarify "nLayers". We likely have room for another figure, but we can discuss what would be best to include.

Changed:
<
<
l 184-185: track coordinates? Do you mean track parameters?
>
>
l 75: blank missing between chi^0_1 and 'mass'

Changed:
<
<
l 253-255: The phrasing could be improved.
>
>
Fixed.

Changed:
<
<
l 262-271 the d0 fit description could be improved (need to check the AN to better understand what's going on). In particular:
>
>
l 87: these correction factors are huge (350%); with these correction factors involved, how can you trust the simulation? Moreover, the description in the paper is misleading, I think. You give the impression that the 350% factor derives from the ISR mismodelling in the Z->mumu events. But, reading the AN section 7.4, I understand that the big factor derives from AN Fig. 38, i.e. Madgraph vs. Pythia for Z->mumu, which you need since your signal is simulated with Pythia. I think this point has to be better explained in the paper, because the reader could be surprised to learn that we are 350% off in Z->mumu simulation. See below also the discussion on the systematic uncertainty associated with this correction.

Changed:
<
<
l 263: when you say 'we first fit the observed d0 of selected tracks to a Gaussian distribution in the range 0.1cm < |d0| < 1.0cm', you mean that you fit d0 excluding, from the fit, the range -0.1cm < d0 < 0.1cm. Is that correct? If this is the case, I think the way you describe the fit is misleading.
>
>
A brief comment has been added, saying that most of this value is due to the ISR modeling in Pythia.

Changed:
<
<
l 265: |d0 ==> |d0|
>
>
l 112: I think PF as an acronym of particle flow has never been defined

Changed:
<
<
l 272: 'independent': is that true? The transfer factor derives from the same fit, i.e. the same Gaussian. How could N_est^i,fake be independent?
>
>
Particle flow is first mentioned on L 95 and the "(PF)" is now defined there.

l 133-155: concept is clear but the description is complicated and I think there is room for improvement; in case they are missing, 'hits' are counted as 'layers' (what about using a different nomenclature? e.g. 'layer measurement'); but what about not missing hits, the one on which you apply the cut? what about the 4 hits you require? This is relevant with respect to overlaps. Clarify (you may need to introduce a discussion on overlaps here and in the tracker description).

Until L 186-190, the actual quantity of nLayers isn't used so the distinction wasn't made clear. Before that point we make frequent reference to "layers", but as physical layers of the detector itself in which hits can exist.

l 184-185: track coordinates? Do you mean track parameters?

That is a better phrasing and we use that now.

l 253-255: The phrasing could be improved.

This section has been rewritten.

l 262-271 the d0 fit description could be improved (need to check the AN to better understand what's going on). In particular:

l 263: when you say 'we first fit the observed d0 of selected tracks to a Gaussian distribution in the range 0.1cm < |d0| < 1.0cm', you mean that you fit d0 excluding, from the fit, the range -0.1cm < d0 < 0.1cm. Is that correct? If this is the case, I think the way you describe the fit is misleading.

This section requires some rewriting, as during the review we've moved to a single sideband instead of the nine. This rewording makes the fit range and process clearer.

l 265: |d0 ==> |d0|

Fixed.

l 272: 'independent': is that true? The transfer factor derives from the same fit, i.e. the same Gaussian. How could N_est^i,fake be independent?

It would be more appropriate to say that P_fake^raw,i is independent, correct. However with the use of one larger sideband this sentence is no longer relevant.

l 326: among the systematic uncertainties, the one associated with the ISR modelling is the largest; nevertheless, is it sufficient? You just consider 1 sigma of statistical fluctuation (according to the AN, Section 8.2.5), a quantity that, in principle, you can reduce by increasing the statistics. Is there no systematic associated with the reweighting method itself? I think it is needed given the large correction factors you end up with (350%). A possibility (indeed extreme but surely conservative) would be to evaluate the efficiency change with and without correction factors. What would the systematic be in this case?

#### Revision 482019-08-13 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 81 to 81
</>
<!--/twistyPlugin-->
>
>
 Comments from Giacomo Sguazzoni HN August 13

<!--/twistyPlugin twikiMakeVisibleInline-->

Comments are for the paper (v0 2019/05/16) with references to the AN (v7) when applicable.

L 15-16: "Decaying to a weakly-interacting, stable neutralino and
16 unreconstructed pion, a chargino decay often leaves a disappearing track in the AMSB model." --> it seems the disappearing track is left in the model!

l 25: exclude -->
*excludes*

l 31-32: the interpretation ... are --> the interpretation ... *is*

The CMS detector: As already noted, a more detailed description of the tracker is needed. Here the geometry of the tracker is not described at all, while this is important to clarify the concept of 'measurement layer' and their numbers. I'm wondering if a picture of the tracker layout would be appropriate in this paper.

l 75: blank missing between chi^0_1 and 'mass'

l 87: these correction factors are huge (350%); with these correction factors involved, how can you trust the simulation? Moreover, the description in the paper is misleading, I think. You give the impression that the 350% factor derives from the ISR mismodelling in the Z->mumu events. But, reading the AN section 7.4, I understand that the big factor derives from AN Fig. 38, i.e. Madgraph vs. Pythia for Z->mumu, which you need since your signal is simulated with Pythia. I think this point has to be better explained in the paper, because the reader could be surprised to learn that we are 350% off in Z->mumu simulation. See below also the discussion on the systematic uncertainty associated with this correction.

l 112: I think PF as an acronym of particle flow has never been defined

l 133-155: concept is clear but the description is complicated and I think there is room for improvement; in case they are missing, 'hits' are counted as 'layers' (what about using a different nomenclature? e.g. 'layer measurement'); but what about not missing hits, the one on which you apply the cut? what about the 4 hits you require? This is relevant with respect to overlaps. Clarify (you may need to introduce a discussion on overlaps here and in the tracker description).

l 184-185: track coordinates? Do you mean track parameters?

l 253-255: The phrasing could be improved.

l 262-271 the d0 fit description could be improved (need to check the AN to better understand what's going on). In particular:

l 263: when you say 'we first fit the observed d0 of selected tracks to a Gaussian distribution in the range 0.1cm < |d0| < 1.0cm', you mean that you fit d0 excluding, from the fit, the range -0.1cm < d0 < 0.1cm. Is that correct? If this is the case, I think the way you describe the fit is misleading.

l 265: |d0 ==> |d0|

l 272: 'independent': is that true? The transfer factor derives from the same fit, i.e. the same Gaussian. How could N_est^i,fake be independent?

l 326: among the systematic uncertainties, the one associated with the ISR modelling is the largest; nevertheless, is it sufficient? You just consider 1 sigma of statistical fluctuation (according to the AN, Section 8.2.5), a quantity that, in principle, you can reduce by increasing the statistics. Is there no systematic associated with the reweighting method itself? I think it is needed given the large correction factors you end up with (350%). A possibility (indeed extreme but surely conservative) would be to evaluate the efficiency change with and without correction factors. What would the systematic be in this case?

<!--/twistyPlugin-->

## Questions from Joe Pastika HN August 7

<!--/twistyPlugin twikiMakeVisibleInline-->

L254: "bad charged hadron filter" is listed as "not recommended" on the JetMET twiki. Is there a reason this is still included in your filter list?

Changed:
<
<
In the next version "(2017 only)" is added to this filter. The recommendation was changed between our analysis of 2017 and 2018 data, and it wasn't feasible to re-process 2017 for this; all of the MET filters listed remove only 2% of the /MET/ dataset, so it is a small issue.
>
>
The recommendation was changed between our analysis of 2017 and 2018 data, and it wasn't feasible to re-process 2017 for this; all of the MET filters listed remove only 2% of the /MET/ dataset, so it is a small issue. In the next AN version "(2017 only)" is added here.
L268: Could you use a different symbol in the text/tables for when you use dxy/dz referenced from (0,0,0) (maybe dz_000 or something else reasonable) to differentiate it clearly from the measurement w.r.t. the beamspot?
Line: 113 to 152
This was normalized incorrectly and will be corrected to one in the next AN version.
Changed:
<
<
L458: Can you add plots (at least a few examples) to the AN of the Z->ll mass distributions in the OS and SS categories used for the T&P method?
>
>
L458: Can you add plots (at least a few examples) to the AN of the Z->ll mass distributions in the OS and SS categories used for the T&P method?

We will add these plots.

L490: I don't understand how this test shows that P_veto does not depend on track pT. How significant is the KS test for distributions with so few events?
Changed:
<
<
This question was asked in the pre-approval (see below in the responses). Investigating we found that the estimate as presented is statistically consistent with several other hypotheses of P_veto's pt-dependence, for example a linear dependence. In short we do not have the statistics to determine a pt-dependence or for any potential dependence to matter.
>
>
This question was asked in the pre-approval (see below in the responses). Investigating we found that the estimate as presented is statistically consistent with several other hypotheses of P_veto's pt-dependence, for example a linear dependence. In short we do not have the statistics to determine a pt-dependence or for any potential dependence to affect the estimate.
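The limited power of a KS test at these sample sizes can be illustrated with a small pure-Python sketch (the pT values below are illustrative, not the actual probe-track spectra):

```python
# Two-sample Kolmogorov-Smirnov statistic, implemented directly as the maximum
# distance between the two empirical CDFs. With only a handful of events per
# sample, even visibly different pT spectra give a distance far below the
# critical value for significance, so the test cannot establish or exclude a
# pT dependence.

def ks_statistic(a, b):
    """Maximum |ECDF_a(t) - ECDF_b(t)| over all observed points."""
    def ecdf(xs, t):
        return sum(1 for x in xs if x <= t) / len(xs)
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, t) - ecdf(b, t)) for t in points)

flat_like = [60, 75, 90, 120, 150]  # illustrative track pT values (GeV)
rising = [80, 110, 140, 170, 200]   # an illustrative pT-dependent shape

print(ks_statistic(flat_like, rising))  # 0.4: not significant for 5-vs-5 samples
```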
L523: Can you say a few more words about how the trigger efficiency is calculated?
Line: 168 to 209

| | ZtoMuMu | ZtoEE |
| nLayers = 5 | 1.013 +- 0.088 | 1.0 +- 1.4 |
| nLayers >= 6 | 1.0 +- 0.21 | 1.02 +- 0.83 |
Changed:
<
<
To use those weights in the estimate would need a more careful treatment of the statistical uncertainties, but even ignoring them these do not change the estimates at all.
>
>
Despite the plots above these average weights are very consistent with one, e.g. the estimate does not depend on this.
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.
Line: 176 to 217
Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.
Changed:
<
<
The transfer factor for |d0| ~ (0.05, 0.50) is 0.14 (0.12 ZtoEE), but with a flat line the transfer factor is purely geometric and needs no fit; it would always be 0.02 / 0.45 = 0.0444. To do the same comparison as the "second assumption" (L 646-658 in the AN):

| | ZtoMuMu NLayers5 | ZtoEE NLayers5 |
| sideband d0 tracks | 25 | 9 |
| signal d0 tracks | 2 | 5 |
| t.f. * sideband count (from fit) | 3.50 ± 1.85 | 1.08 ± 0.69 |
| t.f. * sideband count (flat) | 1.11 ± 0.02 | 0.40 ± 0.13 |
>
>
With a flat line the transfer factor is purely a normalization issue and has no uncertainty; it is always 0.02 / 0.45 = 0.0444. The table below is added to the AN:

Changed:
<
<
Due to the ~ 40-50% uncertainty from the fit, the projection to the peak using the fit from NLayers4 ends up being more compatible with the observation of the peak.
>
>

Changed:
<
<
 ZtoMuMu NLayers5 (gaussian fit from NLayers4) ZtoMuMu NLayers5 (pol0 fit)
>
>
 ZtoMuMu NLayers5 (gaussian fit from NLayers4) ZtoMuMu NLayers5 (pol0 fit, finer binning)

>
>
The flat assumption reduces the estimates by ~2/3 and the agreement is worse, especially as there would be no 40-50% fit uncertainty.
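The arithmetic behind the two projections can be sketched as follows (a minimal illustration using the central values quoted in the table above; the fit transfer factors are taken as given, and the function name is illustrative, not the analysis code):

```python
# Project the d0 sideband counts to the signal (peak) window under the two
# transfer-factor hypotheses: the Gaussian-fit values versus the flat (pol0)
# assumption, where the factor is just the ratio of window widths.

def project_to_peak(n_sideband, transfer_factor):
    """Estimated peak count from the sideband count."""
    return n_sideband * transfer_factor

tf_fit = {"ZtoMuMu": 0.14, "ZtoEE": 0.12}  # Gaussian-fit transfer factors
tf_flat = 0.02 / 0.45                      # ~0.0444: 0.02 cm signal window / 0.45 cm sideband

sideband = {"ZtoMuMu": 25, "ZtoEE": 9}
for sample, n in sideband.items():
    print(sample,
          round(project_to_peak(n, tf_fit[sample]), 2),   # fit-based estimate
          round(project_to_peak(n, tf_flat), 2))          # flat estimate
```

Running the numbers reproduces the table: 25 × 0.14 = 3.50 versus 25 × 0.0444 = 1.11 for ZtoMuMu, and 9 × 0.12 = 1.08 versus 9 × 0.0444 = 0.40 for ZtoEE.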

<!--/twistyPlugin-->

## First set of comments from Kevin Stenson HN July 14

Line: 1112 to 1148

 META FILEATTACHMENT attachment="comparePU_fakeCRs.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs.jpg" path="comparePU_fakeCRs.jpg" size="116660" user="bfrancis" version="1" attachment="comparePU_fakeCRs_ratio.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs_ratio.jpg" path="comparePU_fakeCRs_ratio.jpg" size="128471" user="bfrancis" version="1" attachment="tfFlat_ZtoMuMu_NLayers5.png" attr="" comment="" date="1565462840" name="tfFlat_ZtoMuMu_NLayers5.png" path="tfFlat_ZtoMuMu_NLayers5.png" size="14955" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="sidebandProjections.png" attr="" comment="" date="1565712492" name="sidebandProjections.png" path="sidebandProjections.png" size="70993" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 472019-08-10 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 174 to 174
We've found where we initially took this from several years ago, however it seems like it was taken incorrectly. We still however observe no pileup-dependence on the track isolation requirement.
>
>
Suggestion: compare the nominal Gaussian fit to a flat line for NLayers5, as there's a concern the bias towards the PV changes as nLayers increases.

The transfer factor for |d0| ~ (0.05, 0.50) is 0.14 (0.12 ZtoEE), but with a flat line the transfer factor is purely geometric and needs no fit; it would always be 0.02 / 0.45 = 0.0444. To do the same comparison as the "second assumption" (L 646-658 in the AN):

| | ZtoMuMu NLayers5 | ZtoEE NLayers5 |
| sideband d0 tracks | 25 | 9 |
| signal d0 tracks | 2 | 5 |
| t.f. * sideband count (from fit) | 3.50 ± 1.85 | 1.08 ± 0.69 |
| t.f. * sideband count (flat) | 1.11 ± 0.02 | 0.40 ± 0.13 |

Due to the ~ 40-50% uncertainty from the fit, the projection to the peak using the fit from NLayers4 ends up being more compatible with the observation of the peak.

 ZtoMuMu NLayers5 (gaussian fit from NLayers4) ZtoMuMu NLayers5 (pol0 fit)
</>
<!--/twistyPlugin-->

## First set of comments from Kevin Stenson HN July 14

Line: 1095 to 1111

 META FILEATTACHMENT attachment="tf_ZtoMuMu_NLayers5.png" attr="" comment="" date="1563389063" name="tf_ZtoMuMu_NLayers5.png" path="tf_ZtoMuMu_NLayers5.png" size="17113" user="bfrancis" version="1" attachment="comparePU_fakeCRs.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs.jpg" path="comparePU_fakeCRs.jpg" size="116660" user="bfrancis" version="1" attachment="comparePU_fakeCRs_ratio.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs_ratio.jpg" path="comparePU_fakeCRs_ratio.jpg" size="128471" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="tfFlat_ZtoMuMu_NLayers5.png" attr="" comment="" date="1565462840" name="tfFlat_ZtoMuMu_NLayers5.png" path="tfFlat_ZtoMuMu_NLayers5.png" size="14955" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 462019-08-08 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 81 to 81
</>
<!--/twistyPlugin-->
>
>

## Questions from Joe Pastika HN August 7

<!--/twistyPlugin twikiMakeVisibleInline-->

L254: "bad charged hadron filter" is listed as "not recommended" on the JetMET twiki. Is there a reason this is still included in your filter list?

In the next version "(2017 only)" is added to this filter. The recommendation was changed between our analysis of 2017 and 2018 data, and it wasn't feasible to re-process 2017 for this; all of the MET filters listed remove only 2% of the /MET/ dataset, so it is a small issue.

L268: Could you use a different symbol in the text/tables for when you use dxy/dz referenced from 0,0,0 (maybe dz_000 or something else reasonable) to differentiate it clearly from the measurement w.r.t. the beamspot?

Now $d_{z}^{0}$ and $d_{xy}^{0}$ are used where relevant.

L311: What effect does the choice of "2 sigma" on the inefficiency of tracks have on the analysis? How is "sigma" defined here? Is it an uncertainty or is it related to the standard deviation of the distribution?

The sigma here is the standard deviation of the sample mean of the veto inefficiency for each flavor. This procedure is taken as a conservative measure so the value of 2 isn't rigorously optimized. For an example of the effect, in a sample of 900 GeV, ctau = 100 cm charginos in the NLayers6plus category, 745 tracks are selected with the <2sigma requirement and 730 tracks are selected with <1sigma required instead.

L322: What is the signal efficiency for your benchmark points for the fiducial cuts?

The fiducial cuts all together remove roughly 20% of signal tracks.

L334: Can you help me understand what the jet pT > 110 GeV cut is achieving? Do you have the ratio of each passing this cut? It's hard to tell how effective it is from the 2D plots in Figure 11.

Figure 11 will be extended to jet pt > 30 GeV on the y-axis to better show the issue the text mentions. The pt > 110 GeV cut is used to be consistent with the online requirement of MET from an ISR jet. The efficiency of this cut is 84.6% in data and 87.2% for the signal sample shown.

L375: Does rho include neutral or charged + neutral average energy?

Rho is from the "fixedGridRhoFastjetCentralCalo" collection, which uses all PF candidates within |eta| < 2.5.

Figure 16: Is the difference from one in the first bin labeled "total" from acceptance?

This was normalized incorrectly and will be corrected to one in the next AN version.

L458: Can you add plots (at least a few examples) to the AN of the Z->ll mass distributions in the OS and SS categories used for the T&P method?

L490: I don't understand how this test shows that P_veto does not depend on track pT. How significant is the KS test for distributions with so few events?

This question was asked in the pre-approval (see below in the responses). Investigating we found that the estimate as presented is statistically consistent with several other hypotheses of P_veto's pt-dependence, for example a linear dependence. In short we do not have the statistics to determine a pt-dependence or for any potential dependence to matter.

L523: Can you say a few more words about how the trigger efficiency is calculated?

This is simply the efficiency to pass the signal trigger requirement. As in Section 7.3 we require lepton pt > 55 GeV, so that the efficiency is measured on the IsoTrk50 track leg plateau. An additional sentence to this effect is now added to the AN here.
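A minimal sketch of such a plateau measurement, assuming the efficiency is simply the fraction of plateau probes that fire the trigger (the function and field names are illustrative, not the analysis code):

```python
# Trigger efficiency on the plateau: restrict to events with lepton pT > 55 GeV
# (the IsoTrk50 track-leg plateau) and count the fraction passing the trigger.

def trigger_efficiency(events, pt_threshold=55.0):
    plateau = [e for e in events if e["lepton_pt"] > pt_threshold]
    passed = sum(1 for e in plateau if e["fired_trigger"])
    return passed / len(plateau)

# toy events (illustrative only)
events = [
    {"lepton_pt": 60.0, "fired_trigger": True},
    {"lepton_pt": 70.0, "fired_trigger": True},
    {"lepton_pt": 80.0, "fired_trigger": False},
    {"lepton_pt": 90.0, "fired_trigger": True},
    {"lepton_pt": 40.0, "fired_trigger": True},  # below plateau, excluded
]

print(trigger_efficiency(events))  # 0.75: 3 of the 4 plateau probes pass
```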

Table 29-31: Are the uncertainties statistical only?

Statistical only. A comment has been added to these captions.

L590: What level of signal track contamination in the ee/mumu CR would be required before it would affect the background estimate significantly?

To have signal contamination here, there would need to be sources of ee/mumu pairs (a Z or otherwise) in the signal, which does not occur in any of our samples; we have 0% contamination now. Even if signal did contain Z->ee/mumu candidates, a track would need to have |dxy| >= 0.05 to contaminate the fake estimate. The efficiency of the |dxy| < 0.02 cut on one sample (900 GeV, 100 cm, NLayers6plus) is 99.8% -- so the contamination would be even less than 0.2% times the transfer factor.

L646: Do I understand correctly that the cross check here is to simply take the ratio of integrals of the sideband vs. signal region instead of using the fit to determine this ratio?

Not quite. The cross check takes the normal estimate from the sideband (count/integrate the events and scale by the fitted transfer factor), and compares its estimation of the events in the peak to the actual observation in the peak.

L742: Did you compare the trigger efficiency calculated with other reference triggers than SingleMuon?

From a pre-approval question we measured the trigger efficiency using the SingleElectron dataset as well. It was very similar, although electrons introduce hit efficiency effects like conversions so we do not use it in the analysis.

L837: Is there a good argument why the source of differences in the electron case really is applicable to the muon and tau cases?

The online/offline MET isn't expected to be strongly dependent on the nLayers of a mis-reconstructed lepton, since they've failed to be reconstructed, and so naively one should expect P_offline and P_trigger are all the same. But there is a small difference for electrons, and we assume muons/taus have a similar difference. The statistical uncertainties for these muon/tau estimates are already 100% or more so even a much larger systematic would not make a difference.

<!--/twistyPlugin-->

## ARC action items from July 25 meeting

<!--/twistyPlugin twikiMakeVisibleInline-->

#### Revision 452019-08-06 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 105 to 105
If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.
Changed:
<
<
We've been unsuccessful finding a reference for this. This is most likely just incorrect, although we do not observe a pileup-dependence on the track isolation requirement.
>
>
We've found where we initially took this from several years ago, however it seems like it was taken incorrectly. We still however observe no pileup-dependence on the track isolation requirement.

<!--/twistyPlugin-->
Line: 183 to 183
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.
Changed:
<
<
In making this table we found some bugs in the P(veto) script. Firstly the N_{SS T&P} was not being subtracted in the denominator of Equation 6; in nlayers>=6 this is an extremely trivial issue but is relevant in the newer, shorter categories. Secondly in cases where the numerator of Equation 6 is negative, the N_{SS T&P}^{veto} subtraction was ignored when it should be assumed to be 0 + 1.1 -0. Both these issues are resolved and we supply the table. This slightly changes some estimates.
>
>
In making this table we found some bugs in the P(veto) script. Firstly, the N_{SS T&P} was not being subtracted in the denominator of Equation 6; in nlayers>=6 this is an extremely trivial issue but is relevant in the newer, shorter categories. Secondly, in cases where the numerator of Equation 6 is negative, the N_{SS T&P}^{veto} subtraction was ignored when it should be assumed to be 0 +1.1 -0. Both these issues are resolved and we are currently making the table. This slightly changes some estimates.
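A sketch of the corrected computation, assuming Equation 6 has the form (N_OS^veto - N_SS^veto) / (N_OS - N_SS); the names are illustrative, and the handling of a negative numerator follows the description above:

```python
# Same-sign-subtracted P(veto). Both bugs noted above are handled here:
# the SS subtraction is applied in the denominator as well as the numerator,
# and a negative SS-subtracted numerator is treated as 0 (+1.1 -0), i.e. a
# central value of zero.

def p_veto(n_os_veto, n_ss_veto, n_os, n_ss):
    numerator = max(0.0, n_os_veto - n_ss_veto)  # unphysical negatives -> 0
    denominator = n_os - n_ss                    # SS subtraction here too
    return numerator / denominator

print(p_veto(10, 2, 1000, 100))  # 8 / 900
print(p_veto(2, 5, 1000, 100))   # 0.0: negative numerator clamped
```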
L472-5: Are these recHits from the tracker or the calorimeter? I don't really understand what you are describing here. Are the electron seeds from the ECAL or pixel detector?
Line: 349 to 349
We do not have the entire hitPattern contents histogrammed so this is not quickly answerable as suggested. However in the electron control region (pt > 55 GeV) with nlayers=4, 11% of tracks have more than 4 pixel hits whereas signal ranges from 8-12%. This comparison however would only describe the offline association of hits to tracks, which is known to be better than the online association -- so one would still need to examine the trigger fires as Section 8.2.11 does to get a clear view of the difference in trigger efficiencies.
Changed:
<
<
Here are also some brief comments on the paper:
>
>
Here are also some brief comments on the paper:

Changed:
<
<
Whenever you have a range, it should be written in regular (not math) mode with a double hyphen in LaTeX and no spaces. That is, "1--2". Done correctly in L44, . Incorrect in L186, L285, L321, L326, L327, L331
>
>
Whenever you have a range, it should be written in regular (not math) mode with a double hyphen in LaTeX and no spaces. That is, "1--2". Done correctly in L44, . Incorrect in L186, L285, L321, L326, L327, L331

Changed:
<
<
In Section 2, I think it would be good to give more information about the tracker, especially the Phase 1 pixel detector. It is pretty important to know that we expect particles to pass through 4 pixel layers.
>
>
Fixed.

In Section 2, I think it would be good to give more information about the tracker, especially the Phase 1 pixel detector. It is pretty important to know that we expect particles to pass through 4 pixel layers.

Lines 28-31 should establish the extra categories that are possible thanks to the upgrade. A sentence listing the positions of the layers/disks has been added to Section 2.

Should mention the difference between number of hits and number of layers with a hit.

Lines 185-187 have been expanded to mention this.

L60-67: At the end you talk about physics quantities like tan beta, mu, and the chargino-neutralino mass difference. In principle, I believe the lifetime is set by the masses (mainly mass difference) of the chargino and neutralino. I think you need to be clear that the lifetimes are changed arbitrarily and also give the mass difference (could just say 0.2 GeV).

Reworded to mention that more clearly. Typically the mass values would be included in the HEPdata entry because they vary, and space in a letter is too limited to include a full table.

Changed:
<
<
Should mention the difference between number of hits and number of layers with a hit.
>
>
L113: pTmiss and pTmiss no mu should be vectors

Changed:
<
<
L60-67: At the end you talk about physics quantities like tan beta, mu, and the chargino-neutralino mass difference. In principle, I believe the lifetime is set by the masses (mainly mass difference) of the chargino and neutralino. I think you need to be clear that the lifetimes are changed arbitrarily and also give the mass difference (could just say 0.2 GeV).
>
>
Fixed.

L128: Should say why |eta|<2.1 is used.

Changed:
<
<
L113: pTmiss and pTmiss no mu should be vectors
>
>

Changed:
<
<
L128: Should say why |eta|<2.1 is used.
>
>
L131: need to specify the eta and pT requirements on the jets, perhaps in L103-107.

Changed:
<
<
L131: need to specify the eta and pT requirements on the jets, perhaps in L103-107.
>
>
The pT is specified now. The jet eta requirement is |eta| < 4.5, which for tracks within |eta| < 2.1 includes all of them, so it is left out.

Changed:
<
<
L157: Should describe hadronic tau reconstruction. Could be at the end of L91-102 where electrons, muons, and charged hadron reconstruction is described.
>
>
L157: Should describe hadronic tau reconstruction. Could be at the end of L91-102 where electrons, muons, and charged hadron reconstruction is described.

Changed:
<
<
L168,L177: Given that your special procedure removes 4% of the signal tracks, it is natural to wonder what fraction of the signal tracks are removed by the requirements of L159-168.
>
>
The PubComm does not give a recommendation for this, and typically hadronically decaying taus are included in the mention of charged hadron reconstruction.

Changed:
<
<
L179: Commas after "Firstly" and "Secondly"
>
>
L168,L177: Given that your special procedure removes 4% of the signal tracks, it is natural to wonder what fraction of the signal tracks are removed by the requirements of L159-168.

Changed:
<
<
L190: Should make it clear that leptons here refers to electrons, muons, and taus.
>
>

Changed:
<
<
L194, 196, 222, 231: The ordering of P_offline and P_trigger in L194,196 is different than in L222,231. Better to be consistent.
>
>
L179: Commas after "Firstly" and "Secondly"

Changed:
<
<
L204: I think you mean "excepting" rather than "expecting"
>
>
Fixed.

Changed:
<
<
L214: I don't think you need the subscript "invmass" given that you define it that way in L213.
>
>
L190: Should make it clear that leptons here refers to electrons, muons, and taus.

Changed:
<
<
L222: Change "condition" to "conditional"
>
>
Okay.

Changed:
<
<
L227: p_T^l should be a vector
>
>
L194, 196, 222, 231: The ordering of P_offline and P_trigger in L194,196 is different than in L222,231. Better to be consistent.

Changed:
<
<
L234-238: This will need to be expanded to make it clear
>
>
Fixed in L193-197.

L204: I think you mean "excepting" rather than "expecting"

Fixed.

Changed:
<
<
L247: I don't think it is useful to mention a closure test with 2% of the data. I mean a 2% test may reveal something that is horribly wrong but it is not going to convince anyone that you know what you are doing.
>
>
L214: I don't think you need the subscript "invmass" given that you define it that way in L213.

Changed:
<
<
L339 and Table 3 caption: Suggest changing "signal yields" to "signal efficiencies"
>
>
Removed.

Changed:
<
<
Table 4: I guess to match the text it should be "spurious tracks" instead of "fake tracks"
>
>
L222: Change "condition" to "conditional"

Changed:
<
<
Lots of the references need minor fixing. The main problems are
>
>
Fixed.

L227: p_T^l should be a vector

Vectorized.

L234-238: This will need to be expanded to make it clear

Reworded somewhat to improve clarity.

L247: I don't think it is useful to mention a closure test with 2% of the data. I mean a 2% test may reveal something that is horribly wrong but it is not going to convince anyone that you know what you are doing.

Removed.

L339 and Table 3 caption: Suggest changing "signal yields" to "signal efficiencies"

Changed.

Table 4: I guess to match the text it should be "spurious tracks" instead of "fake tracks"

Changed.

Lots of the references need minor fixing. The main problems are

- volume letter needs to go with title and not volume number: refs 2, 8, 26, 27, 30, 39, 40, 41
- only the first page number should be given: refs 2, 19, 30
- no issue number should be given: refs 8, 13, 31
- PDG should use the Bibtex entry given here: https://twiki.cern.ch/twiki/bin/view/CMS/Internal/PubGuidelines
- ref 40 needs help
>
>
</>
<!--/twistyPlugin-->

## Questions from Juan Alcaraz (July 5)

#### Revision 442019-08-03 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 81 to 81
</>
<!--/twistyPlugin-->
>
>

## ARC action items from July 25 meeting

<!--/twistyPlugin twikiMakeVisibleInline-->

Combine the nine dxy sideband regions in the fake estimate into one larger sideband.

Done. The fake estimates change slightly but are all within 0.5sigma (statistical) of the previous estimates. The AN is updated.

Compare the pileup distributions in ZtoMuMu, ZtoEE, and BasicSelection events. If there is a big difference, try reweighting and see how much it changes the estimate.

 nPV ratios to BasicSelection

See the above plots. Using the ratios as weights to the fake selection (ZtoMuMu/EE + DisTrk (no d0 cut)), the overall weights applied to P_fake^raw would be:

| | ZtoMuMu | ZtoEE |
| nLayers = 4 | 0.994 +- 0.064 | 1.01 +- 0.34 |
| nLayers = 5 | 1.013 +- 0.088 | 1.0 +- 1.4 |
| nLayers >= 6 | 1.0 +- 0.21 | 1.02 +- 0.83 |

To use those weights in the estimate would require a more careful treatment of the statistical uncertainties, but even ignoring them, these weights do not change the estimates at all.
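The check described above can be sketched as a weighted average of the probe tracks entering P_fake^raw (the per-event weights below are illustrative, not the actual nPV ratios from the plots):

```python
# Apply nPV-ratio weights to the events entering P_fake^raw and compare the
# average weight to unity; an average consistent with one means the pileup
# reweighting leaves the fake estimate essentially unchanged.

def average_weight(weights):
    """Mean reweighting factor over the control-region events."""
    return sum(weights) / len(weights)

# hypothetical per-event nPV weights for a control region
weights = [0.95, 1.05, 1.01, 0.99, 1.07, 0.93]
print(average_weight(weights))  # close to 1 -> estimate unchanged
```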

If possible, find the justification for using dxy with respect to the origin for the track isolation pileup subtraction.

We've been unsuccessful finding a reference for this. This is most likely just incorrect, although we do not observe a pileup-dependence on the track isolation requirement.

<!--/twistyPlugin-->

## First set of comments from Kevin Stenson HN July 14

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 109 to 137
See above; jets are considered if pt>30 and |eta|<4.5. So |Delta phi_jet,jet| would apply only if there are two or more jets, the minimal case being a ~110 GeV and a second ~30 GeV jet. A clarifying sentence has been added.
Changed:
<
<
L332-334: It is claimed that a jet pT cut of >110 GeV removes a lot of background and not signal. The plots in Figure 11 don't seem to back this up. It seems like about the same percentage of signal and background events are removed by the cut. Can you quantify the effect of this cut (background rejection and signal efficiency)?
>
>
L332-334: It is claimed that a jet pT cut of >110 GeV removes a lot of background and not signal. The plots in Figure 11 don't seem to back this up. It seems like about the same percentage of signal and background events are removed by the cut. Can you quantify the effect of this cut (background rejection and signal efficiency)?

Changed:
<
<
>
>
Figure 11 has been updated to show jet pt >30 GeV instead of >55 GeV, as the issue is seen at lower pt. The efficiency of the >110 GeV cut is 84.6% in data and 87.2% for the signal sample shown.

Changed:
<
<
L338-341: Would be nice to show the signal and background distributions for these two variables so we can evaluate for ourselves the effectiveness of the cut. Also, it would be helpful to report the background rejection and signal efficiency for these cuts.
>
>
L338-341: Would be nice to show the signal and background distributions for these two variables so we can evaluate for ourselves the effectiveness of the cut. Also, it would be helpful to report the background rejection and signal efficiency for these cuts.

We are producing N-1 plots to show this.

Table 18: Are there no standard Tracking POG quality requirements on the tracks? I think they still have "loose" and "highPurity" requirements based on an MVA. Do you require either of these?
Line: 127 to 157
Text describing this around line 365 has been added to explain that "all" means all available in MINIAOD, which has a minimal set of slimming requirements which are now provided in the text.
Changed:
<
<
L368-370 and Table 20: Would be nice to see plots of Delta R to see why 0.15 is chosen. I would have expected a smaller value for muons and a larger value for electrons.
>
>
L368-370 and Table 20: Would be nice to see plots of Delta R to see why 0.15 is chosen. I would have expected a smaller value for muons and a larger value for electrons.

We are producing N-1 plots to show this.

L363-370: Just to be clear, there are no requirements on the pT of the leptons? So, if there is a 4 GeV muon within Delta R of 0.15 of a 100 GeV track, then you reject the track?
Line: 149 to 181
This method was suggested by you in the review of EXO-16-044. The language of the AN regarding continuum DY has been updated for clarity as you suggest. It is not relevant to the measurement of P(veto) to estimate the non-Z backgrounds in our tag-and-probe samples, so we do not -- the purpose of the same-sign subtraction is to increase the purity of the lepton flavor under study.
Changed:
<
<
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.
>
>
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.

In making this table we found some bugs in the P(veto) script. Firstly, the N_{SS T&P} was not being subtracted in the denominator of Equation 6; in nlayers>=6 this is an extremely trivial issue but is relevant in the newer, shorter categories. Secondly, in cases where the numerator of Equation 6 is negative, the N_{SS T&P}^{veto} subtraction was ignored when it should be assumed to be 0 +1.1 -0. Both these issues are resolved and we supply the table. This slightly changes some estimates.

L472-5: Are these recHits from the tracker or the calorimeter? I don't really understand what you are describing here. Are the electron seeds from the ECAL or pixel detector?
Line: 234 to 268
Gaussian + flat.
Changed:
<
<
- I don't see the advantage of having 9 different sideband regions. Simply take the sum of events from 0.05-0.5 and multiply by the overall transfer factor. This should minimize the statistical uncertainty. In fact, I would suggest combining the Z->mumu and Z->ee samples as well. Also, remember to use the correct Poisson uncertainties (as discussed for Table 31) when you only have a handful of events. If you somehow think it is a good idea to have 18 different measurements instead of 1 and you are using a transfer factor with an uncertainty, make sure to properly account for the fact that this uncertainty is correlated for different bins.
>
>
- I don't see the advantage of having 9 different sideband regions. Simply take the sum of events from 0.05-0.5 and multiply by the overall transfer factor. This should minimize the statistical uncertainty. In fact, I would suggest combining the Z->mumu and Z->ee samples as well. Also, remember to use the correct Poisson uncertainties (as discussed for Table 31) when you only have a handful of events. If you somehow think it is a good idea to have 18 different measurements instead of 1 and you are using a transfer factor with an uncertainty, make sure to properly account for the fact that this uncertainty is correlated for different bins.

As discussed in the first ARC meeting, we've used a single larger sideband. The fit uncertainty is employed as a nuisance parameter 100% correlated between bins.

Changed:
<
<
- L635-638: As mentioned above, I would suggest combining the Z->mumu and Z->ee results to get the final estimate, seeing as you are statistics limited. You can still use the difference between the two as a systematic uncertainty (but see below).
>
>
- L635-638: As mentioned above, I would suggest combining the Z->mumu and Z->ee results to get the final estimate, seeing as you are statistics limited. You can still use the difference between the two as a systematic uncertainty (but see below).

Changed:
<
<
- L640-645: It is obvious that Z->mumu and Z->ee events are quite similar. They have the same production mechanism, they are selected by single lepton triggers, etc. So, it is not much of a test to show that they give the same result. On the other hand, your signal region requires large missing ET, a high pT jet, and a high pT isolated track that is neither a muon or electron. One might worry that the fake track rate depends on the amount of hadronic activity in an event, which is likely higher in the signal region than in Z events. One might also worry that the fake track rate depends on pileup, and the signal trigger/selection may be more susceptible to pileup than the single lepton trigger/selection. Ideally, I would suggest that you perform the same measurement on a QCD dominated region (like requiring a dijet or quadjet trigger or just high HT). You can require pTmiss,no mu < 100 GeV to ensure no signal contamination. If this is not possible, then you could consider taking what you have and either reweighting the pileup and HT distribution to match the signal region or checking that the fake rate is independent of these quantities.
>
>
Doing as you suggest is acceptable, but will not change the estimate very much. This was discussed in the first ARC meeting and will be revisited.

- L640-645: It is obvious that Z->mumu and Z->ee events are quite similar. They have the same production mechanism, they are selected by single lepton triggers, etc. So, it is not much of a test to show that they give the same result. On the other hand, your signal region requires large missing ET, a high pT jet, and a high pT isolated track that is neither a muon or electron. One might worry that the fake track rate depends on the amount of hadronic activity in an event, which is likely higher in the signal region than in Z events. One might also worry that the fake track rate depends on pileup, and the signal trigger/selection may be more susceptible to pileup than the single lepton trigger/selection. Ideally, I would suggest that you perform the same measurement on a QCD dominated region (like requiring a dijet or quadjet trigger or just high HT). You can require pTmiss,no mu < 100 GeV to ensure no signal contamination. If this is not possible, then you could consider taking what you have and either reweighting the pileup and HT distribution to match the signal region or checking that the fake rate is independent of these quantities.

See the above response (from "ARC action items from July 25 meeting"). Small differences exist between pileup in these different samples, but reweighting for those differences doesn't change the estimate.

- L649-653: I don't understand how these numbers are consistent with Figure 29. In Figure 29 (left) it seems there are about 9 events with |dxy|<0.02cm and about 15 with 0.05<|dxy|<0.10cm to be compared with 32 events and 68 events. There is a similar discrepancy for electrons. I guess the plots have been scaled for some reason as the entries are not integers. Please fix the plots and verify the results are consistent.
Line: 253 to 293

 ZtoMuMu NLayers5 ZtoEE NLayers5
Section 6.2.2: In Table 36, it would be enlightening to show the same results as in Table 35. That is, I am curious as to how P_fake compares between data and MC. Are these results normalized to 41 fb^-1? If so, then it seems like the MC predicts about 1/5 as many fake tracks as data. It is hard to be confident that the MC tells us anything if that is so.

We felt that Table 35 was distractingly large, and the low MC statistics make things worse for Table 36. Since we are moving to a single sideband, Table 35 is also less relevant. Table 36 is normalized to 41/fb. The value of Table 36 is the test of closure in the fake-estimate method rather than the absolute rate of fake tracks in simulation; that rate has always been an issue, which is why the data-driven estimate is a must.

Section 6.2.2: Your hypothesis is that the fake track rate is independent of selection so you can use the Z data to estimate the fake track rate in your signal region. I have suggested that you could also measure the fake track rate in QCD events to verify this. You can also check the effect in MC. I guess in Section 6.2.2 you apply the same criteria to MC as you do for data (selecting Z events). However, if your hypothesis is true, then you should also get the same fake rate if you use any MC sample. What happens if you use all the samples in Section 3.3 but remove the Table 33 and 34 requirements so you are using all events? If P_fake changes significantly, this is cause for concern. If not, then that is good. In either case, it still may not prove anything if the MC is really predicting 1/5 the amount of fake tracks.

As above, the absolute rate of fake tracks in simulation is not well trusted, so the comparison of 1/5 to data does not concern us. Certainly one could use additional MC samples and change the selection, but this then deviates from the treatment in data and in principle is not the same closure test. If a third selection or sample were used, then the closure test in MC would also need to use it.

Figure 31: Would be good to have a plot for nlayers=5 as well.

We are producing this plot.

Figure 35: Please include the ratio of the two since this provides the scale factors that are used. It may be better to simply include the region of 50-300 GeV on a linear scale.

We are producing this plot.

L752-754: While this signal yield reduction is interesting, just as interesting would be the change after all cuts are applied (with nlayers>=4). Can you provide this as well?

We are producing this.

L760-762: How sure are we that the Z can be used to measure ISR difference for the signal model? I generally agree with the statement that both recoil off ISR but it would be nice if this could be confirmed somehow. Does the pT distribution for a 100 GeV chargino look similar to a Z in Pythia8? Does the ISR reweighting work for ttbar events or diboson events?

We are proceeding through the SIM/RECO steps for a small portion of the Pythia8 Drell-Yan sample generated for Figure 38. Applying the ISR weights and checking against the reconstructed MadGraph sample will be an effective check of the method. We are producing the pT distribution for 100 GeV; for 1000 GeV, for example, it will surely be different. The relevant issue is not that the Z's pT distribution looks like the electroweak-ino pair's, but that the simulation underestimates the ISR spectrum compared to data for the same event content.
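The weight application itself can be sketched as a simple binned lookup (the binning and weight values below are invented placeholders, not the AN's actual weights): each signal event is weighted by the data/MC ratio, derived from the Z pT spectrum, evaluated at the generator-level pT of the recoiling system.

```python
# Sketch (binning and weights are hypothetical, for illustration only):
# ISR reweighting of signal MC using data/MC weights derived from the
# Z pT spectrum, looked up at the generator-level pT of the system
# recoiling against the ISR.
import numpy as np

# assumed bin edges in system pT [GeV] and data/MC weight per bin
isr_bins    = np.array([0., 50., 100., 150., 200., 300., 10000.])
isr_weights = np.array([1.05, 1.00, 0.95, 0.90, 0.85, 0.80])  # illustrative

def isr_weight(system_pt):
    """Look up the data/MC ISR weight for the generated system pT."""
    ibin = np.searchsorted(isr_bins, system_pt, side="right") - 1
    ibin = min(max(ibin, 0), len(isr_weights) - 1)  # clamp over/underflow
    return float(isr_weights[ibin])
```

The per-event weight would multiply the usual event weight before filling any signal histogram.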

L772-773: Please expand on "is applied to the simulated signal samples". Do you reweight the events using the ISR jet in the event or the net momentum of the produced SUSY particles or something else.
Line: 283 to 337
Section 8.2.11: Please explain this better. My understanding is that various prescales were in place. In the original measurement of the trigger efficiency in Section 7.3, you rely on being above the track requirement plateau to measure the trigger efficiency versus pTmiss in data (which has all of the various prescales naturally included) and compare to MC (which just has an OR of all trigger paths, I think). This is the main reason why the data efficiency is lower than the MC efficiency. Is this correct? Now, in this section, you are measuring the trigger efficiency solely with MC, which is an OR of all trigger paths. So if the trigger path with the track requirement fails, the MC might still find that an MHT only trigger will fire, while in data the MHT-only trigger may be prescaled. Isn't this a problem? Can you perhaps repeat the exercise using only triggers in MC that were never prescaled (as the opposite extreme to assuming there was never any prescale)? Also, why would you average over the chargino lifetimes? Shouldn't this systematic uncertainty depend very strongly on chargino lifetime?
Several triggers were disabled in portions of 2017, so not precisely "prescaled", but we understand the concern. The main difference between data and MC in Section 7.3 is that this operational history was not in the simulation; it is averaged over in data and applied to the simulation with these weights.
Line: 291 to 345
Lastly if the charginos are reconstructed at all they would be reconstructed as muons and will not be included in metNoMu. The only way they can contribute to the metNoMu is by affecting the recoil of the ISR jet, which is why we average over chargino lifetime but measure this systematic separately for each chargino mass.
Section 8.2.11: Per my discussion of pixel issues in 2017. It is relatively easy to get 5 pixel hits with only 4 pixel layers as in order to make a hermetic cylindrical detector with flat sensors, you need to have overlaps. These overlaps are largest in the first layer, which is where there were significant issues with the Phase 1 pixel detector. So I am concerned that if the MC is optimistic about layer 1 hits, then relying on the MC may not be wise. Maybe you can check the following. Take good tracks (not muons but large number of hits with pT>50 GeV). Check the fraction of tracks that have two layer 1 pixel hits compared to one layer 1 pixel hits between MC and data. Or, more generally, the average number of pixel hits. If they differ, then you could see how many times you would go from 5 pixel hits to 4 pixel hits in data vs MC and use this difference as another estimate of the difference in trigger efficiency.

We do not have the entire hitPattern contents histogrammed, so this is not quickly answerable as suggested. However, in the electron control region (pT > 55 GeV) with nlayers = 4, 11% of tracks have more than 4 pixel hits, whereas signal ranges from 8-12%. This comparison, however, would only describe the offline association of hits to tracks, which is known to be better than the online association -- so one would still need to examine the trigger decisions as Section 8.2.11 does to get a clear view of the difference in trigger efficiencies.

Here are also some brief comments on the paper:
Line: 928 to 984

 META FILEATTACHMENT attachment="compareWithWithout700GeV100cmNLayers6plus.jpg" attr="" comment="" date="1562699529" name="compareWithWithout700GeV100cmNLayers6plus.jpg" path="compareWithWithout700GeV100cmNLayers6plus.jpg" size="105856" user="bfrancis" version="2" attachment="tf_ZtoEE_NLayers5.png" attr="" comment="" date="1563389063" name="tf_ZtoEE_NLayers5.png" path="tf_ZtoEE_NLayers5.png" size="15491" user="bfrancis" version="1" attachment="tf_ZtoMuMu_NLayers5.png" attr="" comment="" date="1563389063" name="tf_ZtoMuMu_NLayers5.png" path="tf_ZtoMuMu_NLayers5.png" size="17113" user="bfrancis" version="1"
 META FILEATTACHMENT attachment="comparePU_fakeCRs.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs.jpg" path="comparePU_fakeCRs.jpg" size="116660" user="bfrancis" version="1" attachment="comparePU_fakeCRs_ratio.jpg" attr="" comment="" date="1564787415" name="comparePU_fakeCRs_ratio.jpg" path="comparePU_fakeCRs_ratio.jpg" size="128471" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 432019-07-24 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 111 to 111
L332-334: It is claimed that a jet pT cut of >110 GeV removes a lot of background and not signal. The plots in Figure 11 don't seem to back this up. It seems like about the same percentage of signal and background events are removed by the cut. Can you quantify the effect of this cut (background rejection and signal efficiency)?
L338-341: Would be nice to show the signal and background distributions for these two variables so we can evaluate for ourselves the effectiveness of the cut. Also, it would be helpful to report the background rejection and signal efficiency for these cuts.

Table 18: Are there no standard Tracking POG quality requirements on the tracks? I think they still have "loose" and "highPurity" requirements based on an MVA. Do you require either of these?

#### Revision 422019-07-24 - ChristopherHill

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 205 to 205
All three ttbar samples are used, but the di-leptonic ttbar sample contributes the most. Keep in mind that P(veto) is a tag-and-probe selection, and samples such as W->lnu and Z->invisible will not significantly contribute. The Z->ll sample should contribute, but the size of those samples is considerably smaller than the ttbar samples.
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.

Yes, the results are properly normalized, but there is no observation in Fig. 26 -- all the entries are MC. If we were to include data, we would expect the fake contribution to the 3-layer tracks to be orders of magnitude larger, making any exclusion difficult without a dedicated analysis.

Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.

#### Revision 412019-07-23 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 205 to 205
All three ttbar samples are used, but the di-leptonic ttbar sample contributes the most. Keep in mind that P(veto) is a tag-and-probe selection, and samples such as W->lnu and Z->invisible will not significantly contribute. The Z->ll sample should contribute, but the size of those samples is considerably smaller than the ttbar samples.
Changed:
<
<
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.

Yes. The challenge in including a 3-layer category would be in describing the extra Gaussian peak for 3-layer fakes shown in Figure 28.

>
>
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.
Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.

#### Revision 402019-07-22 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 166 to 165
Recall in the review of EXO-16-044 you recommended we utilize all possible tag-and-probe pairs in every event, on top of performing the same-sign subtraction. Figure 19 shows all probe tracks in all events; often there are multiple probes that can be chosen as the tag-and-probe combination. As you say these figures are related to the value of P(veto), but are not precisely equal.
Figures 19, 21, 25, and 40 and page 72: I would suggest removing footnote #17 on page 72 and adding that information into the captions for figures 19, 21, 25, and 40. You should also add a similar explanation to the caption of Table 31 indicating that N_ctrl is scaled to the signal region luminosity.

We have added the mention to Table 31's caption, but the authors feel the luminosity label in those figures is sufficient.

Figure 22: Would be good to show the results for nlayers=4 and nlayers=5 (unless there are no entries in which case you should note that in the caption), similar to the way you show the results for tau for nlayers=5 even though it is not used.

There are 1 and 2 events for nlayers=4 and nlayers=5, respectively, so these plots were found unhelpful. A comment has been added to the caption of Figure 22 to mention this.

L514-518 and Tables 29-31: I note that P_offline for electrons and muons is very similar, around 80%, while for taus it is much lower, around 20%. Do you understand the difference and do you think it is OK for the method. I can imagine two effects that could cause this. First, it could be that since the pion from the tau decay does not carry all of the tau momentum, the tau candidates from W decays will have a lower pT than the muon or electron candidates from W decays. So when the tau pT gets added to pTmiss, it will get shifted less than when the electron pT gets added and so more will fail the Ecalo cut. Based on Figure 25, this seems to be true and is probably innocuous. But I think there is more to it. Comparing Figures 20 and 21, the modified pTmiss for the tau case has a large contribution at the bottom left corner that is not present in the electron (or muon) case. It seems that the electrons and muon are very consistent with the topology of a W recoiling from an ISR jet so delta phi is ~pi. However, for the tau case, there seems to be many events where the "tau" is part of the leading jet. I guess that since there is a Delta R cut of 0.5, the "tau" must differ from the jet a bit in eta. Given this evidence and the fact that we know that the tau purity is much worse than electron and muon purity, it seems likely that many of the events in Figure 21 do not contain taus. I would guess the events are multijet QCD events with either an isolated track by chance or a fake track. So, my hypothesis is that the single tau control region has a large contamination of non-tau events. You use the same sample for measuring P_offline and the multiplying by P_offline as part of estimating the tau background. So we could consider this estimate as being the tau+single hadronic track+fake track contribution. But there are two problems with that. First, P_veto is measured on a much purer sample of taus as it uses Z decays and subtracts the same-sign contribution. 
So P_veto is really measuring taus, not the sum of tau+single hadronic track+fake track. And P_veto may not be the same if the other contributions were included. Second, you have a separate measurement of the fake track contribution so you would be double counting. Please let me know what you think.

Our interpretation of the lower modified-MET distribution for taus is the same as yours here, and we too feel it is innocuous.

We intentionally capture the "single hadronic track" component as part of the tau estimate. These do contribute as a background and are included as part of our "tau background"; the analyzers feel that calling this the "tau and single hadronic track background" would be a distraction for the reader, beyond the one mention in L454-455. We capture this contribution in two ways: firstly, as other reviewers have noticed, our hadronic tau ID is fairly loose, and secondly, we remove the requirement deltaR(track, jet) > 0.5. Thus the tau P(veto) considers all of these contributions, and you see in Figure 21 that some of them have a lower probability to pass the deltaPhi(jet, MET) requirement, which is then included in P(offline). For clarity we have added a reminder of the deltaR(track, jet) cut removal to L455 in the AN.

Lastly the "fake track" contribution will not survive the same-sign subtraction, whereas the "single hadronic track" contribution will. So we are not concerned about double-counting the fake tracks.
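The same-sign subtraction logic can be written out in one line (the event counts below are invented for illustration; they are not the AN's numbers): the opposite-sign tag-and-probe sample minus the same-sign sample isolates the genuine-lepton component, both in the numerator and denominator of P(veto).

```python
# Sketch of the same-sign subtraction in the tag-and-probe P(veto)
# measurement (counts are hypothetical, for illustration only):
# OS pairs contain signal leptons plus charge-symmetric contamination;
# SS pairs estimate the contamination, so OS - SS isolates the leptons.
def p_veto(os_pass, os_total, ss_pass, ss_total):
    """P(veto) with same-sign subtraction applied to numerator and denominator."""
    num = os_pass - ss_pass
    den = os_total - ss_total
    return num / den if den > 0 else 0.0

# e.g. 8 of 1000 OS probes pass the veto, 2 of 100 SS probes do
estimate = p_veto(8, 1000, 2, 100)
```

Fake tracks, being charge-symmetric with respect to the tag, cancel in this difference, which is why they are not double-counted against the dedicated fake-track estimate.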

Section 6.1.3: I think you need to be a little more clear here. I want to confirm that I understand. First, the figures mention HLT efficiency but I think this is really the full trigger efficiency (L1+HLT). Do you agree? Second, the trigger efficiencies shown in Figure 23 and 24 are the actual results from the L1+HLT that was run and the x-axis refers to the actual pTmiss,nomu of the event. That is, the x-axis is not the modified pTmiss,nomu where the electron pT is added back in. Is that correct? Then, the x-axis of Figure 25 shows the modified pTmiss,nomu with the electron or tau pT added back in. Is that correct? Try to make the text and figures a bit clearer.

Figures 23 and 24 now are labeled as just "trigger efficiency", and the caption for Figure 25 has been made more clear.

Figure 25 and Tables 29 and 31: Figure 25 seems to show that the electron distribution is shifted higher than the tau distribution. Therefore, once you convolute this distribution with the trigger efficiency, I would expect the electron trigger efficiency to be higher than the tau trigger efficiency. However, the opposite is true. If I naively take the trigger efficiency as a step function which is 0% for pTmiss<200 GeV and 100% for pTmiss>200 GeV, I think I get about 30% for electrons and 13% for the tau, compared to 46% and 52%. Can you check the results and if correct, try to explain what I am missing?

Your numbers are correct if you integrate Figure 25 across the entire MET range. However, P(trigger) is a conditional probability applied after P(offline) has already required metNoMu > 120 GeV, so that requirement must be imposed before integrating.
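The effect of the conditioning can be seen with a toy calculation (the MET values and step-function turn-on below are invented for illustration): averaging the trigger-efficiency curve over all events gives a smaller number than averaging only over events that already pass the offline metNoMu cut.

```python
# Sketch (toy values, not from the AN): P(trigger) as a conditional
# probability -- the trigger efficiency curve is averaged only over
# events passing the P(offline) requirement metNoMu > 120 GeV.
def p_trigger(met_values, eff_curve, offline_cut=120.0):
    """Average trigger efficiency over events passing the offline MET cut."""
    passing = [m for m in met_values if m > offline_cut]
    if not passing:
        return 0.0
    return sum(eff_curve(m) for m in passing) / len(passing)

# toy step-function efficiency: 0% below 200 GeV, 100% above,
# as in the reviewer's naive estimate
step = lambda met: 1.0 if met > 200.0 else 0.0

mets = [100.0, 130.0, 150.0, 180.0, 250.0, 300.0]
unconditional = p_trigger(mets, step, offline_cut=0.0)    # 2/6
conditional   = p_trigger(mets, step, offline_cut=120.0)  # 2/5
```

Dropping events below the offline cut removes entries that could only fail the trigger, which is why the conditional probability is larger.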

Table 31: How did you determine the uncertainties for nlayers=4? The upper uncertainty of 0 for N^l_ctrl and the upper uncertainty of 0.0058 for estimate seem too small. Actually nlayers = 4 and nlayers = 5 for both muons and taus (Tables 30 and 31) have yields that are too small to assume Gaussian uncertainties. You should use Poisson uncertainties. You can ask the statistics committee for better advice but I think more correct uncertainties would be: 0 +1.15 -0
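The suggested intervals can be computed from chi-square quantiles (this is a generic sketch of frequentist Poisson intervals, not code from the AN): a central Garwood interval for n > 0, and a one-sided 68.27% upper limit for n = 0, which reproduces the quoted 0 +1.15 -0.

```python
# Sketch: frequentist Poisson intervals for small counts, as suggested in
# place of Gaussian sqrt(N) errors. Central (Garwood) interval from
# chi-square quantiles for n > 0; one-sided upper limit for n = 0, where
# the upper error is -ln(1 - 0.6827) ~ 1.15, i.e. the quoted 0 +1.15 -0.
from scipy.stats import chi2

CL = 0.6827  # one-sigma equivalent confidence level

def poisson_interval(n):
    """Return (lower, upper) interval bounds for an observed count n."""
    if n == 0:
        return 0.0, chi2.ppf(CL, 2) / 2.0  # one-sided: 0 +1.15 -0
    lo = chi2.ppf((1 - CL) / 2, 2 * n) / 2.0
    hi = chi2.ppf(1 - (1 - CL) / 2, 2 * (n + 1)) / 2.0
    return lo, hi

for n in (0, 1, 2, 5):
    lo, hi = poisson_interval(n)
    print(f"n={n}: -{n - lo:.2f} +{hi - n:.2f}")
```

For n = 1 this gives roughly [0.17, 3.30], i.e. 1 +2.30 -0.83, noticeably asymmetric compared to the Gaussian 1 +/- 1.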
Line: 274 to 283
Section 8.2.11: Please explain this better. My understanding is that various prescales were in place. In the original measurement of the trigger efficiency in Section 7.3, you rely on being above the track requirement plateau to measure the trigger efficiency versus pTmiss in data (which has all of the various prescales naturally included) and compare to MC (which just has an OR of all trigger paths, I think). This is the main reason why the data efficiency is lower than the MC efficiency. Is this correct? Now, in this section, you are measuring the trigger efficiency solely with MC, which is an OR of all trigger paths. So if the trigger path with the track requirement fails, the MC might still find that an MHT only trigger will fire, while in data the MHT-only trigger may be prescaled. Isn't this a problem? Can you perhaps repeat the exercise using only triggers in MC that were never prescaled (as the opposite extreme to assuming there was never any prescale)? Also, why would you average over the chargino lifetimes? Shouldn't this systematic uncertainty depend very strongly on chargino lifetime?
Several triggers were disabled in portions of 2017, so not precisely "prescaled" but we understand. The main difference between data and MC in Section 7.3 is that this history was not in the simulation; this operational history is averaged over in data and applied to the simulation with these weights.

For this section, consider a simple worst-case scenario: the HLT_MET105_IsoTrk50 path, which was disabled for 2017B (10% of the luminosity), and a 100% enabled path HLT_MET120. In 2017B conditions the efficiency is solely that of HLT_MET120, and one over-estimates the trigger efficiency by the difference in efficiency between HLT_MET120 and the OR of the two. Figure 44 and the systematic in Table 45 show this difference to be very small (~1%), and it would only apply to the 10% of the data in 2017B, so it is well contained by this systematic. Ignoring the IsoTrk50 path, the triggers dominating the MET turn-on in Table 9 are HLT_PFMET(noMu)120_PFMHT(noMu)120_IDTight, which is very similar to this simple worst-case example.
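This worst-case arithmetic can be made explicit (the efficiency values below are illustrative placeholders consistent with the ~1% difference quoted above, not measured numbers): the inclusive efficiency is the luminosity-weighted average over eras, and the bias from assuming the full OR everywhere is that average's distance from the OR efficiency.

```python
# Sketch of the worst-case estimate above (illustrative numbers only):
# one path disabled for ~10% of the luminosity (2017B), so the inclusive
# trigger efficiency is the lumi-weighted average over the two eras.
def lumi_weighted_eff(eff_by_era):
    """eff_by_era: list of (lumi_fraction, efficiency) pairs."""
    return sum(f * e for f, e in eff_by_era)

eff_or     = 0.90  # OR of MET120 and MET105_IsoTrk50 (illustrative)
eff_met120 = 0.89  # MET120 alone, ~1% lower as in Fig. 44 (illustrative)

eff  = lumi_weighted_eff([(0.10, eff_met120), (0.90, eff_or)])
bias = eff_or - eff  # over-estimate from assuming the full OR everywhere
```

With a ~1% path difference applied to only 10% of the data, the bias is at the per-mille level, well inside the quoted systematic.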

Lastly if the charginos are reconstructed at all they would be reconstructed as muons and will not be included in metNoMu. The only way they can contribute to the metNoMu is by affecting the recoil of the ISR jet, which is why we average over chargino lifetime but measure this systematic separately for each chargino mass.

Section 8.2.11: Per my discussion of pixel issues in 2017. It is relatively easy to get 5 pixel hits with only 4 pixel layers as in order to make a hermetic cylindrical detector with flat sensors, you need to have overlaps. These overlaps are largest in the first layer, which is where there were significant issues with the Phase 1 pixel detector. So I am concerned that if the MC is optimistic about layer 1 hits, then relying on the MC may not be wise. Maybe you can check the following. Take good tracks (not muons but large number of hits with pT>50 GeV). Check the fraction of tracks that have two layer 1 pixel hits compared to one layer 1 pixel hits between MC and data. Or, more generally, the average number of pixel hits. If they differ, then you could see how many times you would go from 5 pixel hits to 4 pixel hits in data vs MC and use this difference as another estimate of the difference in trigger efficiency.

Here are also some brief comments on the paper:

#### Revision 392019-07-22 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 119 to 119
Table 19: You need to define what sigma is in the last line. Is it the beamspot width, is it the primary vertex transverse position uncertainty, is it the uncertainty on the track position at the distance of closest approach to the beamline or primary vertex, or some combination of these?
Changed:
<
<
Both dxy and sigma here refer to the dxy measurement with respect to the origin. This is made more clear.
>
>
Both dxy and sigma here refer to the dxy measurement with respect to the origin. This is made more clear in the AN.
L365-367: You write that "all" muons and electrons are used. I would like to have a more complete description of this. It may help if you add some text around L260 describing muon and electron reconstruction. For muons, Table 13 defines tight and loose ID. Is it simply the OR of these two that you use? Or do you include tracker muons? I think there is also a soft muon ID. Are these included? What about standalone muons? For electrons, Table 12 defines tight and loose ID. Is it simply the OR of these two categories that you use? Or do you loosen up the requirements further? If so, what are the requirements?
Line: 129 to 129
L363-370: Just to be clear, there are no requirements on the pT of the leptons? So, if there is a 4 GeV muon within Delta R of 0.15 of a 100 GeV track, then you reject the track?
Changed:
<
<
As above these are all available in MINIAOD, which has a very minimal set of slimming requirements. For example muons passing the PF ID have no pt requirement whatsoever. If such a muon as you write is near our track, yes we reject it.
>
>
As above these are all those available in MINIAOD, which has a very minimal set of slimming requirements. For example muons passing the PF ID have no pt requirement whatsoever. If such a muon as you write is near our track, yes we reject it.
L389-91 and Figure 14: It would be good to plot Figure 14 with finer bins (at least between 0 and 10 GeV) to back up this statement.
Line: 141 to 141
AN Table 22: Why don't you apply the full set of vetos for all lepton flavors? Is it an attempt to increases statistics? Can you perform the test with all vetos applied to see if the results are consistent?
Changed:
<
<
Recall these are the definitions of the vetoes used to study P(veto) for each flavor; all three sets of requirements are applied in the signal. Each flavor of P(veto) is measured by removing that flavor's veto -- the other flavors' vetoes are still applied. This is instead tighter than removing all the cuts listed, to increase the purity of each flavor in each study. For example when studying muons one would not remove the cut on ECalo in the denominator of P(veto).
>
>
All three sets of requirements are applied in the signal region. In measuring each flavor's P(veto), the vetoes against the other flavors are retained to maintain the purity of the flavor under study. For example, when studying muons one would still require ECalo < 10 GeV. So this is tighter than what you suggest, to achieve better purity for each flavor.
L456-460: Your assumption here is that the same-sign sample has the same production rate as the background. Have you verified this? You could verify it with a high statistics sample of dilepton events (or come up with a scaling factor if it is not exactly true). Also, in L457-458 you list three sources of background: DY, non-DY, fake tracks. I don't see how a same-sign sample can be used to estimate the DY background? I would suggest calling DY part of the signal. For di-electron and di-muon events, you also have the possibility of using the sidebands around the Z mass to estimate the background. You could check that this gives consistent results.
Changed:
<
<
Recall that this method was suggested by you in the review of EXO-16-044. The intent of the same-sign subtraction is to increase the purity of T&P events actually containing the lepton flavor under study. To the suggestion that the continuum DY is more correctly signal and isn't equal in rate to the opposite-sign sample, we agree and have reworded the AN's description here. We have also included plots of the invariant mass distributions of the T&P samples. Lastly it is not our intent to estimate the non-Z backgrounds here, but to increase the purity of the lepton flavor under study.
>
>
This method was suggested by you in the review of EXO-16-044. The language of the AN regarding continuum DY has been updated for clarity as you suggest. It is not relevant to the measurement of P(veto) to estimate the non-Z backgrounds in our tag-and-probe samples, so we do not -- the purpose of the same-sign subtraction is to increase the purity of the lepton flavor under study.
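As a minimal sketch of the same-sign subtraction described above (the function name and all counts are hypothetical, for illustration only), P(veto) is the subtracted pass fraction of probe tracks:

```python
# Minimal sketch of the same-sign (SS) subtraction used when measuring
# P(veto). The function name and all counts here are hypothetical
# examples, not numbers from the AN.

def p_veto(os_pass, os_total, ss_pass, ss_total):
    """P(veto) as the fraction of probe tracks passing the lepton veto,
    with the SS sample subtracted from the opposite-sign (OS) sample
    to increase the purity of the lepton flavor under study."""
    numerator = os_pass - ss_pass
    denominator = os_total - ss_total
    if denominator <= 0:
        raise ValueError("SS subtraction left a non-positive denominator")
    return numerator / denominator

# Illustrative counts only:
print(p_veto(os_pass=12, os_total=100000, ss_pass=2, ss_total=5000))
```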
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.

L472-5: Are these recHits from the tracker or the calorimeter? I don't really understand what you are describing here. Are the electron seeds from the ECAL or pixel detector?

Changed:
<
<
recHits will always refer to calorimeter recHits. Electron seeds are ECAL superclusters matched to seed tracks in or close to the pixels -- so electron seeds are from both. We've reworded this section for increased clarity.
>
>
Throughout the documentation, recHits refers to calorimeter recHits. Electron seeds are ECAL superclusters matched to seed tracks in or close to the pixels -- so electron seeds are from both. We've reworded this section for increased clarity.
L482: It looks like the probe tracks already have a pT cut > 30 GeV. So going down to 20 GeV is just being extra safe. Is that right?
Line: 160 to 160
L478-487: It is not clear to me. Is this re-reconstruction needed for the signal region or not? Was it done for the signal region?
Changed:
<
<
The signal region requires track pt > 55 GeV, so the reconstruction already keeps all nearby recHits and considers all of them as potential electron seeds. This process is needed only for the P(veto) measurement because below 50 GeV the reconstruction is changed and different from the signal region.
>
>
It is not needed for the signal region and is not done, as track pt > 55 GeV is above the threshold of 50 GeV.
Figure 19 and Tables 29-31. If I try to integrate the plots in Figure 19, I would estimate that the integral of red/blue is roughly 10^-6, 10^-7, and 10^-4, for electrons, muons, and taus, respectively. I would expect this to be approximately equal to P_veto. But in Tables 29-31, I find P_veto numbers of 10^-5, 10^-6, and 10^-3 for electrons, muons, and taus for nlayers>=6. So roughly a factor of 10 off. Can you explain this?
Changed:
<
<
Figure 19 shows all probe tracks; this includes events where the tag + probe do not pass the Z cuts. Recall too that in the review of EXO-16-044 you recommended we utilize all possible tag-and-probe pairs in every event, so the number of pairs is further different from the number of events. Lastly there is the same-sign subtraction. As you say these figures are related to the value of P(veto) but are not precisely equal.
>
>
Recall in the review of EXO-16-044 you recommended we utilize all possible tag-and-probe pairs in every event, on top of performing the same-sign subtraction. Figure 19 shows all probe tracks in all events; often there are multiple probes that can be chosen as the tag-and-probe combination. As you say these figures are related to the value of P(veto), but are not precisely equal.
Figures 19, 21, 25, and 40 and page 72: I would suggest removing footnote #17 on page 72 and adding that information into the captions for figures 19, 21, 25, and 40. You should also add a similar explanation to the caption of Table 31 indicating that N_ctrl is scaled to the signal region luminosity.

Figure 22: Would be good to show the results for nlayers=4 and nlayers=5 (unless there are no entries in which case you should note that in the caption), similar to the way you show the results for tau for nlayers=5 even though it is not used.

Changed:
<
<
There are 1 and 2 events for nlayers=4 and =5 respectively. Thus these plots are even less helpful than the top two plots in Figure 21. We've added a comment to the caption of Figure 22 to explain this.
>
>
There are 1 and 2 events for nlayers=4 and nlayers=5 respectively, so these plots were found unhelpful. A comment has been added to the caption of Figure 22 to note this.
L514-518 and Tables 29-31: I note that P_offline for electrons and muons is very similar, around 80%, while for taus it is much lower, around 20%. Do you understand the difference and do you think it is OK for the method. I can imagine two effects that could cause this. First, it could be that since the pion from the tau decay does not carry all of the tau momentum, the tau candidates from W decays will have a lower pT than the muon or electron candidates from W decays. So when the tau pT gets added to pTmiss, it will get shifted less than when the electron pT gets added and so more will fail the Ecalo cut. Based on Figure 25, this seems to be true and is probably innocuous. But I think there is more to it. Comparing Figures 20 and 21, the modified pTmiss for the tau case has a large contribution at the bottom left corner that is not present in the electron (or muon) case. It seems that the electrons and muon are very consistent with the topology of a W recoiling from an ISR jet so delta phi is ~pi. However, for the tau case, there seems to be many events where the "tau" is part of the leading jet. I guess that since there is a Delta R cut of 0.5, the "tau" must differ from the jet a bit in eta. Given this evidence and the fact that we know that the tau purity is much worse than electron and muon purity, it seems likely that many of the events in Figure 21 do not contain taus. I would guess the events are multijet QCD events with either an isolated track by chance or a fake track. So, my hypothesis is that the single tau control region has a large contamination of non-tau events. You use the same sample for measuring P_offline and the multiplying by P_offline as part of estimating the tau background. So we could consider this estimate as being the tau+single hadronic track+fake track contribution. But there are two problems with that. First, P_veto is measured on a much purer sample of taus as it uses Z decays and subtracts the same-sign contribution. 
So P_veto is really measuring taus, not the sum of tau+single hadronic track+fake track. And P_veto may not be the same if the other contributions were included. Second, you have a separate measurement of the fake track contribution so you would be double counting. Please let me know what you think.

Section 6.1.3: I think you need to be a little more clear here. I want to confirm that I understand. First, the figures mention HLT efficiency but I think this is really the full trigger efficiency (L1+HLT). Do you agree? Second, the trigger efficiencies shown in Figure 23 and 24 are the actual results from the L1+HLT that was run and the x-axis refers to the actual pTmiss,nomu of the event. That is, the x-axis is not the modified pTmiss,nomu where the electron pT is added back in. Is that correct? Then, the x-axis of Figure 25 shows the modified pTmiss,nomu with the electron or tau pT added back in. Is that correct? Try to make the text and figures a bit clearer.

Changed:
<
<
You are correct. Figures 23 and 24 now are labeled as just "trigger efficiency", and the caption for Figure 25 has been made more clear.
>
>
Figures 23 and 24 now are labeled as just "trigger efficiency", and the caption for Figure 25 has been made more clear.

Changed:
<
<
Figure 25 and Tables 29 and 31: Figure 25 seems to show that the electron distribution is shifted higher than the tau distribution. Therefore, once you convolute this distribution with the trigger efficiency, I would expect the electron trigger efficiency to be higher than the tau trigger efficiency. However, the opposite is true. If I naively take the trigger efficiency as a step function which is 0% for pTmiss<200 GeV and 100% for pTmiss>200 GeV, I think I get about 30% for electrons and 13% for the tau, compared to 46% and 52%. Can you check the results and if correct, try to explain what I am missing?
>
>
Figure 25 and Tables 29 and 31: Figure 25 seems to show that the electron distribution is shifted higher than the tau distribution. Therefore, once you convolute this distribution with the trigger efficiency, I would expect the electron trigger efficiency to be higher than the tau trigger efficiency. However, the opposite is true. If I naively take the trigger efficiency as a step function which is 0% for pTmiss<200 GeV and 100% for pTmiss>200 GeV, I think I get about 30% for electrons and 13% for the tau, compared to 46% and 52%. Can you check the results and if correct, try to explain what I am missing?
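The naive cross-check described in this question can be reproduced numerically: fold a trigger turn-on curve with a binned modified-pTmiss spectrum; a step function at 200 GeV reduces to the fraction of the spectrum above threshold. A sketch, with placeholder bin contents rather than the Figure 25 yields:

```python
# Sketch of folding a trigger turn-on curve with a binned
# modified-pTmiss spectrum. Bin centers and contents are placeholders,
# not values read off Figure 25.

def folded_efficiency(bin_centers, bin_contents, turn_on):
    """Spectrum-weighted average of the turn-on curve."""
    num = sum(n * turn_on(x) for x, n in zip(bin_centers, bin_contents))
    den = sum(bin_contents)
    return num / den

# Step-function approximation: 0% below 200 GeV, 100% above.
step = lambda x: 1.0 if x > 200.0 else 0.0
centers = [50.0, 150.0, 250.0, 350.0]   # GeV, placeholder binning
contents = [10.0, 20.0, 15.0, 5.0]      # placeholder yields

print(folded_efficiency(centers, contents, step))  # fraction above 200 GeV
```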
Table 31: How did you determine the uncertainties for nlayers=4? The upper uncertainty of 0 for N^l_ctrl and the upper uncertainty of 0.0058 for estimate seem too small. Actually nlayers = 4 and nlayers = 5 for both muons and taus (Tables 30 and 31) have yields that are too small to assume Gaussian uncertainties. You should use Poisson uncertainties. You can ask the statistics committee for better advice but I think more correct uncertainties would be: 0 +1.15 -0
Line: 186 to 186
2 +1.52 -0.86 This comes from the prescription on page 32 of http://pdg.lbl.gov/2019/reviews/rpp2018-rev-statistics.pdf but again, the statistics committee may have another prescription. Note that I think this results in an estimate for the tau background for nlayers=4 to be 0 +1.9 -0.0 rather than 0 +0.0058 -0.
Changed:
<
<
The table values were incorrect because of the "0 \pm 0". This has been corrected to "0_{-0}^{+8.2}" using the poisson 1.15; it must however be multiplied by the tau trigger prescale of ~7.2. Following equation 39.42 in that link yields an estimate of 0 - 0 + 0.18, which is now the quoted estimate. This results in a roughly 2.5% increase in the total lepton statistical uncertainty and 1% increase in the total background statistical uncertainty. Do note that this was only the table values -- the upper limits used the correct gamma distributed 0 with alpha 0.0222 \pm 0.0051.
>
>
This has been corrected to "0_{-0}^{+8.2}" using Poisson errors; the +1.15 must be multiplied by the tau trigger prescale of ~7.2. The total lepton and total background statistical uncertainties are corrected accordingly. Only the table values needed correction; the upper limits already used the correct values.
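The PDG prescription referenced above (eq. 39.42) can be checked with a short script: the upper edge for n observed events solves P(N <= n; mu_up) = 1 - CL, which for n = 0 at 68.27% CL gives the 1.15 quoted. This is a verification sketch, not analysis code:

```python
import math

# Sketch of the PDG eq. 39.42 upper edge: mu_up solves
# P(N <= n; mu_up) = 1 - CL. For n = 0 this reduces to -ln(1 - CL),
# i.e. about 1.15 at 68.27% CL.

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(n + 1))

def poisson_upper(n, cl=0.6827):
    """68.27% CL (by default) upper edge on the Poisson mean, by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n, mid) > 1.0 - cl:
            lo = mid   # CDF still too large: mean must be bigger
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(poisson_upper(0), 2))  # 1.15
```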
Section 6.1.5: There needs to be more information. Do you calculate P_veto, P_offline, and P_trigger for each of the leptons using simulated samples following the same recipe as for data? If so, what simulated samples? Do you just use Z->ll for P_veto and W->lnu for the others? Does the single lepton control region come from just W->lnu events? Or do you include all the background samples in Tables 7-8? I think using all of the samples from Tables 7-8 for every calculation would make the most sense.
Line: 194 to 194
Section 6.1.5: It seems like even with the relaxed selection criteria, you are still quite lacking in MC statistics for this test. You mention that you only include ttbar events. Is this just the ttbar semileptonic sample? Although this may be the largest single sample, it seems like you could also include other samples. Most importantly would be the W->lnu and Z-> invisible (including the HT-binned samples) as these seem to be the largest source of background in Figures 14 and 15. Is there some reason you didn't include these? If not, I suggest you go ahead and do this.
Changed:
<
<
All three ttbar samples are used; the other samples do not affect the study due to the small P(veto). Figures 14 and 15 compare the disappearing track selection, but the lepton background estimates do not use this. The (here modified) single-lepton control region selections and tag-and-probe selections are used. W->lnu and Z->invisible do not contribute to the tag-and-probe study of P(veto); the Z->ll sample should contribute however it is considerably smaller in statistics than the ttbar samples.
>
>
All three ttbar samples are used, but the di-leptonic ttbar sample contributes the most. Keep in mind that P(veto) is measured with a tag-and-probe selection, so samples such as W->lnu and Z->invisible will not contribute significantly. The Z->ll sample should contribute, but its statistics are considerably smaller than those of the ttbar samples.

Changed:
<
<
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.
>
>
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.
Yes. The challenge in including a 3-layer category would be describing the extra Gaussian peak for 3-layer fakes shown in Figure 28.

Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.

Changed:
<
<
All samples are used and the caption now says this. The cleaning cuts are only relevant for the nlayers=3 category (which is otherwise not used in the analysis), and showing this for nlayers=4 would provide the same information as Figure 31.
>
>
All samples are used, and the caption has been updated. The cleaning cuts are only relevant for the nlayers=3 category, which is used only in an appendix (after the edits suggested below). Showing this for nlayers=4 would provide the same information as Figure 31.
Section 6.2: This needs to be cleaned up and explained better. Here are some specific comments/suggestions - I think that L566-569, Figures 26-28, and L602-615 can all be removed. It seems like they have nothing to do with the analysis that is done. They just lead to confusion. If you want to move this material to Appendix C, that is fine. But don't clutter up this section.
Line: 213 to 213
a) Correct, the fit is a Gaussian + constant. The AN has been clarified.
Changed:
<
<
b) Our hypothesis is that this is a bias in the track-fitting algorithm, where short tracks with very few hits have the importance of the primary vertex inflated, drawing tracks closer to the PV. Figure 27 shows this issue increased further for 3-layer (even shorter) tracks, and also occuring in SM background MC -- it is not a data-exclusive feature. Also none of these tracks in background MC are near to any hard interaction truth particle, which is our consideration of "fake" in MC truth.
>
>
b) Our hypothesis is that this is a bias in the track-fitting algorithm, where short tracks with very few hits have the importance of the primary vertex inflated, drawing tracks closer to the PV. Figure 27 shows this also occurs in SM background MC, and in those MC samples none of the tracks are near any hard-interaction truth particle; this is precisely our definition of "fake" in MC truth.

Changed:
<
<
c) The purpose of the transfer factor is purely the normalization of the sideband rates to the signal region; if we observed a flat distribution the transfer factor would be 0.04cm / 0.10cm = 0.25 everywhere. We do not observe a flat distribution and must describe this normalization, which is the purpose of the fit in equation 14. The uncertainty in the fit is also important in this normalization, because the issue of the statistical uncertainty in our observed P^raw_fake is separate from the issue of the uncertainty in the relative rates between the sidebands and the signal region.
>
>
c) The purpose of the transfer factor is only to normalize the sideband rates to the signal region. We must describe this normalization in a way that does not depend on observing the signal region count, because in nlayers=5 and >=6 the statistics do not allow for that. That is what the fit in Eq. 14 does.
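For illustration, the normalization in c) can be sketched as the ratio of the fitted d0 shape integrated over the signal window to its integral over the sideband. The window boundaries and shape parameters below are placeholders, not the AN's fit results:

```python
import math

# Sketch of a sideband-to-signal-region transfer factor for a
# Gaussian + constant d0 shape. Window boundaries and shape parameters
# are placeholders, not the AN's values.

def shape_integral(lo, hi, norm_g, sigma, norm_f):
    """Integral over lo < |d0| < hi (both signs, in cm) of
    norm_g * Gaussian(0, sigma) + norm_f (flat)."""
    gauss = norm_g * (math.erf(hi / (sigma * math.sqrt(2.0)))
                      - math.erf(lo / (sigma * math.sqrt(2.0))))
    flat = norm_f * 2.0 * (hi - lo)
    return gauss + flat

def transfer_factor(norm_g, sigma, norm_f,
                    sig_window=(0.00, 0.02), sideband=(0.05, 0.10)):
    sig = shape_integral(*sig_window, norm_g, sigma, norm_f)
    side = shape_integral(*sideband, norm_g, sigma, norm_f)
    return sig / side

# A purely flat shape gives the ratio of the window widths (both signs):
print(transfer_factor(norm_g=0.0, sigma=0.02, norm_f=1.0))
# A peaked shape concentrates weight in the signal window:
print(transfer_factor(norm_g=1.0, sigma=0.02, norm_f=0.0))
```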

Changed:
<
<
d) L629-630 quotes the transfer factor for the baseline sideband (0.05, 0.10) cm, so only one. The authors felt that Table 35 was large enough already, but we now provide an additional table listing the P^raw_fake and transfer factors.
>
>
d) L629-630 quotes the transfer factor for the baseline sideband (0.05, 0.10) cm, so only one. The authors felt that Table 35 was large enough already, but we now provide an additional table listing the P^raw_fake and transfer factors.
- What best describes your assumption of the fake track rate as a function of d0. Is it uniform (flat), Gaussian, Gaussian+flat, or something else?
Line: 231 to 231
- L649-653: I don't understand how these numbers are consistent with Figure 29. In Figure 29 (left) it seems there are about 9 events with |dxy|<0.02cm and about 15 with 0.05<|dxy|<0.10cm to be compared with 32 events and 68 events. There is a similar discrepancy for electrons. I guess the plots have been scaled for some reason as the entries are not integers. Please fix the plots and verify the results are consistent.
Changed:
<
<
These were scaled incorrectly; the plots are now fixed and agree with the text.
>
>
The scaling of the plots is now fixed, and agrees with the text.
- Figure 29: Why do you not fit the region |dxy|<0.1cm? If you fit out to |dxy|=1.0cm, please show the entire range in the plots. It would be nice to see the results for nlayers=5 and 6 as well so we can evaluate the extent to which a fit may or may not be possible and whether the shape is consistent with nlayers=4.
Changed:
<
<
The plots being fit end at |dxy|<0.5cm so really the 1.0cm is incorrect and the full range is indeed shown; this has been updated in the AN. The NLayers5 plots are shown below with the fit -from the NLayers4- overlaid, scaled to the integral of NLayers5 -- no new fit is shown. The NLayers6plus category has one event in ZtoMuMu (dxy ~ 0.06cm) and three events in ZtoEE (dxy ~ 0.05 0.05 and 0.17 cm), so no fit is possible there.
>
>
The AN has been updated to correctly state that the fit extends to |dxy|<0.5cm, the full range of the plots. Should the d0 peak actually contain real tracks, it would peak more narrowly than observed in the sidebands, so |dxy|<0.1cm is excluded from the fit; the count of nlayers=4 tracks in the signal region is checked against the fit prediction and agrees.

Shown below are the nlayers=5 d0 distributions, with the fit from nlayers=4 overlaid. The nlayers>=6 samples have one (three) events in ZtoMuMu (ZtoEE), so no fit is possible there.

 ZtoMuMu NLayers5 ZtoEE NLayers5
Line: 258 to 260
Section 8.2.2: Please expand on this. I am very surprised that this is such a small effect. Given the problems encountered with the Phase 1 pixel detector (problems with timing levels 1 and 3, way more noise than expected in layer 1 causing high thresholds, DC-DC converter failures, etc.) I would have expected big differences between data and simulation on quantities requiring hits in the pixel detector. I know the tracking reconstruction was changed at HLT and offline to keep track reconstruction efficiency high but this doesn't remove the problem of missing pixel hits. So please expand on how you measure these uncertainties. Do you just use tracks with pT>55 GeV that are associated with a muon? One problem with using muons to evaluate tracking efficiency is that there are special track reconstruction techniques developed to recover muons missed by the standard tracking. These tend to use wider windows to discover silicon hits and so may not reflect the track reconstruction of "standard" charged particles. You could perhaps remove the electron and tau vetos to see what you get in those cases.
Changed:
<
<
The process is as described: the muon veto is removed from candidate track selection, and further the missing inner/middle hits requirements are removed to form an N-2 distribution of missing inner/middle hits. The differences between data and simulation is mitigated by the global tag, which here was produced in the re-reco campaign after data-taking as completed. Further the efficiency for muon tracks to have zero middle hits is already very high (this cut is more for rejecting fakes); the difference between data/MC for the middle inner hits cut -inefficiency- is 4.5%, but the passing events are so much larger the difference in efficiency is only 0.02%.
>
>
The AN describes the process correctly. The global tag used for the signal samples was produced well after 2017 data-taking was completed and includes updated hit efficiencies. Further, the efficiency itself is very high, which sets the scale of this value: for missing middle hits, the inefficiency differs between data and MC by 4.5%, but the efficiency differs by only 0.02%.

Changed:
<
<
The muon control region is our signal region with the muon veto, ECalo, and missing outer hits cuts removed. So it is still a track selection (pt > 55, MET > 120, jet pt > 110), the tracks dominantly being from muons. Before the chargino decay, our signal tracks are muon-like in the reconstruction's treatment; it is in fact electrons/taus that behave differently than signal tracks as concerns missing inner/middle hits, so we need to keep those vetoes.
>
>
Before the chargino decays, signal tracks are muon-like and are treated the same way by the reconstruction. The muon control region is still a track selection (pT > 55 GeV, MET > 120 GeV, jet pT > 110 GeV), and the reconstruction uses only tracker information. The electron/tau vetoes must remain so that this sample is dominated by muon tracks and is comparable to signal tracks before they decay.
Section 8.2.5: This seems like an underestimate of the systematic uncertainty. If you had infinite data and MC statistics, your systematic uncertainty would be 0. As mentioned above, this doesn't address whether measuring the ISR using Z->mumu decays translates exactly into the ISR for the signal process. The paper mentions this is up to a 350% correction, so it is a big effect. I am very worried that the systematic uncertainty does not cover all that we don't know. I note that Figure 37 shows results with pT and pTmiss. Why did you use pT? Perhaps pTmiss could also be used as a systematic check.
Changed:
<
<
The MadGraph/Pythia correction is the larger correction and the uncertainties in it are reasonably small due to the 10M event sample size of the Pythia8 sample. The largest uncertainties are from the data/MC correction which is statistically limited due to the recorded data. Moreover the uncertainties are largest where the signal populates least, which lowers the impact on the total yield. The pTmiss is a useful cross-check, but the sum-pT of a diMuon system has a much better resolution than pTmiss; it is a very common tool in characterizing the hadronic recoil in many analyses.
>
>
Most of this uncertainty comes from the data/MC correction at lower sum-pT, where the data statistics are limited. Moreover, these statistical uncertainties are largest where our signal populates least, which lowers this systematic uncertainty. The pTmiss is a useful cross-check, which was requested by the conveners, but the sum-pT of a di-muon system has much better resolution than pTmiss; it is a very common tool for characterizing the hadronic recoil in many analyses.
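As a hypothetical sketch of how a binned ISR reweighting is applied (bin edges and ratios are invented for illustration; the AN's derived weights are not reproduced here), each simulated event receives the data/MC ratio of the bin containing its di-muon sum-pT:

```python
import bisect

# Hypothetical sketch of applying a binned ISR reweighting: each event
# gets a weight from the ratio of the data and MC di-muon sum-pT
# spectra. Bin edges and ratios below are placeholders, not the
# analysis values.

EDGES = [0.0, 50.0, 100.0, 200.0, 400.0]   # GeV, placeholder binning
RATIOS = [1.0, 1.2, 1.5, 2.0]              # data/MC per bin, placeholder

def isr_weight(sum_pt):
    """Look up the per-event weight for a given di-muon sum-pT (GeV)."""
    i = bisect.bisect_right(EDGES, sum_pt) - 1
    i = min(max(i, 0), len(RATIOS) - 1)     # clamp under/overflow to edge bins
    return RATIOS[i]

print(isr_weight(120.0))  # falls in the 100-200 GeV bin -> 1.5
```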
Figure 43: I would suggest including the same comparison vs pT from the document you reference. This shows that for pT>55 GeV, the differences are similar as for lower pT.

#### Revision 38 2019-07-22 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 25 to 25

 24 May 2019 LL EXO WG First 2018 plots Agenda slides 31 May 2019 LL EXO WG Pre-approval Agenda slides 7 June 2019 LL EXO WG Followup to pre-approval Agenda slides
>
>
 19 July 2019 LL EXO WG 2018 ABC background estimates Agenda slides

Line: 47 to 49

• Produce skimmed ntuples with CRAB
• MET
Changed:
<
<
• EGamma (D in progress)
>
>
• EGamma

• SingleMuon
• Tau
• Create fiducial maps * Muon
Changed:
<
<
* Electron (A, B in progress)
>
>
* Electron (D in progress)

• Run channels without fiducial track selections
• basicSelection
Changed:
<
<
• ZtoEE (BC complete, A in progress)
>
>
• ZtoEE (D in progress)

• ZtoMuMu
• Trigger efficiencies with muons
Changed:
<
<
• Background estimates and systematics (C complete, requires fiducial maps)
>
>
• Background estimates and systematics (ABC complete)
* Electron * Muon * Tau
Line: 196 to 198
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.
>
>
Yes. The challenge in including a 3-layer category would be in describing the extra gaussian peak for 3-layer fakes shown in Figure 28.
Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.

All samples are used and the caption now says this. The cleaning cuts are only relevant for the nlayers=3 category (which is otherwise not used in the analysis), and showing this for nlayers=4 would provide the same information as Figure 31.

Line: 238 to 242
Section 6.2.2: In Table 36, it would be enlightening to show the same results as in Table 35. That is, I am curious as to how P_fake compares between data and MC. Are these results normalized to 41 fb^-1? If so, then it seems like the MC predicts about 1/5 as many fake tracks as data. It is hard to be confident that the MC tells us anything if that is so.
Changed:
<
<
Section 6.2.2: Your hypothesis is that the fake track rate is independent of selection so you can use the Z data to estimate the fake track rate in your signal region. I have suggested that you could also measure the fake track rate in QCD events to verify this. You can also check the effect in MC. I guess in Section 6.2.2 you apply the same criteria to MC as you do for data (selecting Z events). However, if your hypothesis is true, then you should also get the same fake rate if you use any MC sample. What happens if you use all the samples in Section 3.3 but remove the Table 33 and 34 requirements so you are using all events? If P_fake changes significantly, this is cause for concern. If not, then that is good. In either case, it still may not prove anything if the MC is really predicting 1/5 the amount of fake tracks.
>
>
Section 6.2.2: Your hypothesis is that the fake track rate is independent of selection so you can use the Z data to estimate the fake track rate in your signal region. I have suggested that you could also measure the fake track rate in QCD events to verify this. You can also check the effect in MC. I guess in Section 6.2.2 you apply the same criteria to MC as you do for data (selecting Z events). However, if your hypothesis is true, then you should also get the same fake rate if you use any MC sample. What happens if you use all the samples in Section 3.3 but remove the Table 33 and 34 requirements so you are using all events? If P_fake changes significantly, this is cause for concern. If not, then that is good. In either case, it still may not prove anything if the MC is really predicting 1/5 the amount of fake tracks.
Figure 31: Would be good to have a plot for nlayers=5 as well.
Changed:
<
<
Figure 35: Please include the ratio of the two since this provides the scale factors that are used. It may be better to simply include the region of 50-300 GeV on a linear scale.
>
>
Figure 35: Please include the ratio of the two since this provides the scale factors that are used. It may be better to simply include the region of 50-300 GeV on a linear scale.

Changed:
<
<
L752-754: While this signal yield reduction is interesting, just as interesting would be the change after all cuts are applied (with nlayers>=4). Can you provide this as well?
>
>
L752-754: While this signal yield reduction is interesting, just as interesting would be the change after all cuts are applied (with nlayers>=4). Can you provide this as well?

Changed:
<
<
L760-762: How sure are we that the Z can be used to measure ISR difference for the signal model? I generally agree with the statement that both recoil off ISR but it would be nice if this could be confirmed somehow. Does the pT distribution for a 100 GeV chargino look similar to a Z in Pythia8? Does the ISR reweighting work for ttbar events or diboson events?
>
>
L760-762: How sure are we that the Z can be used to measure ISR difference for the signal model? I generally agree with the statement that both recoil off ISR but it would be nice if this could be confirmed somehow. Does the pT distribution for a 100 GeV chargino look similar to a Z in Pythia8? Does the ISR reweighting work for ttbar events or diboson events?
L772-773: Please expand on "is applied to the simulated signal samples". Do you reweight the events using the ISR jet in the event or the net momentum of the produced SUSY particles or something else.
Line: 266 to 270
Changed:
<
<
Section 8.2.11: Please explain this better. My understanding is that various prescales were in place. In the original measurement of the trigger efficiency in Section 7.3, you rely on being above the track requirement plateau to measure the trigger efficiency versus pTmiss in data (which has all of the various prescales naturally included) and compare to MC (which just has an OR of all trigger paths, I think). This is the main reason why the data efficiency is lower than the MC efficiency. Is this correct? Now, in this section, you are measuring the trigger efficiency solely with MC, which is an OR of all trigger paths. So if the trigger path with the track requirement fails, the MC might still find that an MHT only trigger will fire, while in data the MHT-only trigger may be prescaled. Isn't this a problem? Can you perhaps repeat the exercise using only triggers in MC that were never prescaled (as the opposite extreme to assuming there was never any prescale)? Also, why would you average over the chargino lifetimes? Shouldn't this systematic uncertainty depend very strongly on chargino lifetime?
>
>
Section 8.2.11: Please explain this better. My understanding is that various prescales were in place. In the original measurement of the trigger efficiency in Section 7.3, you rely on being above the track requirement plateau to measure the trigger efficiency versus pTmiss in data (which has all of the various prescales naturally included) and compare to MC (which just has an OR of all trigger paths, I think). This is the main reason why the data efficiency is lower than the MC efficiency. Is this correct? Now, in this section, you are measuring the trigger efficiency solely with MC, which is an OR of all trigger paths. So if the trigger path with the track requirement fails, the MC might still find that an MHT only trigger will fire, while in data the MHT-only trigger may be prescaled. Isn't this a problem? Can you perhaps repeat the exercise using only triggers in MC that were never prescaled (as the opposite extreme to assuming there was never any prescale)? Also, why would you average over the chargino lifetimes? Shouldn't this systematic uncertainty depend very strongly on chargino lifetime?

Changed:
<
<
Section 8.2.11: Per my discussion of pixel issues in 2017. It is relatively easy to get 5 pixel hits with only 4 pixel layers since, in order to make a hermetic cylindrical detector with flat sensors, you need to have overlaps. These overlaps are largest in the first layer, which is where there were significant issues with the Phase 1 pixel detector. So I am concerned that if the MC is optimistic about layer 1 hits, then relying on the MC may not be wise. Maybe you can check the following. Take good tracks (not muons but large number of hits with pT>50 GeV). Check the fraction of tracks that have two layer 1 pixel hits compared to one layer 1 pixel hits between MC and data. Or, more generally, the average number of pixel hits. If they differ, then you could see how many times you would go from 5 pixel hits to 4 pixel hits in data vs MC and use this difference as another estimate of the difference in trigger efficiency.
>
>
Section 8.2.11: Per my discussion of pixel issues in 2017. It is relatively easy to get 5 pixel hits with only 4 pixel layers since, in order to make a hermetic cylindrical detector with flat sensors, you need to have overlaps. These overlaps are largest in the first layer, which is where there were significant issues with the Phase 1 pixel detector. So I am concerned that if the MC is optimistic about layer 1 hits, then relying on the MC may not be wise. Maybe you can check the following. Take good tracks (not muons but large number of hits with pT>50 GeV). Check the fraction of tracks that have two layer 1 pixel hits compared to one layer 1 pixel hits between MC and data. Or, more generally, the average number of pixel hits. If they differ, then you could see how many times you would go from 5 pixel hits to 4 pixel hits in data vs MC and use this difference as another estimate of the difference in trigger efficiency.
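The suggested data/MC comparison could be sketched as follows; the track records and fractions here are purely illustrative placeholders, not actual analysis values:

```python
def frac_two_layer1_hits(tracks):
    """Among tracks with at least one layer-1 pixel hit, the fraction with
    two hits (i.e. an overlap-module double hit)."""
    one = sum(1 for t in tracks if t["layer1_hits"] == 1)
    two = sum(1 for t in tracks if t["layer1_hits"] == 2)
    total = one + two
    return two / total if total else 0.0

# Hypothetical samples of good high-pT tracks (pT > 50 GeV, many hits)
data_tracks = [{"layer1_hits": 1}] * 90 + [{"layer1_hits": 2}] * 10
mc_tracks = [{"layer1_hits": 1}] * 85 + [{"layer1_hits": 2}] * 15

# A data/MC difference here would suggest the MC is optimistic about
# overlap hits, which feeds the 5-hits-from-4-layers trigger efficiency.
print(frac_two_layer1_hits(data_tracks), frac_two_layer1_hits(mc_tracks))
```

The same counting could then be repeated for the average number of pixel hits, as suggested above.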

Changed:
<
<
Here are also some brief comments on the paper:
>
>
Here are also some brief comments on the paper:

Changed:
<
<
Whenever you have a range, it should be written in regular (not math) mode with a double hyphen in LaTeX and no spaces. That is, "1--2". Done correctly in L44. Incorrect in L186, L285, L321, L326, L327, L331
>
>
Whenever you have a range, it should be written in regular (not math) mode with a double hyphen in LaTeX and no spaces. That is, "1--2". Done correctly in L44. Incorrect in L186, L285, L321, L326, L327, L331

Changed:
<
<
In Section 2, I think it would be good to give more information about the tracker, especially the Phase 1 pixel detector. It is pretty important to know that we expect particles to pass through 4 pixel layers.
>
>
In Section 2, I think it would be good to give more information about the tracker, especially the Phase 1 pixel detector. It is pretty important to know that we expect particles to pass through 4 pixel layers.

Changed:
<
<
Should mention the difference between number of hits and number of layers with a hit.
>
>
Should mention the difference between number of hits and number of layers with a hit.

Changed:
<
<
L60-67: At the end you talk about physics quantities like tan beta, mu, and the chargino-neutralino mass difference. In principle, I believe the lifetime is set by the masses (mainly mass difference) of the chargino and neutralino. I think you need to be clear that the lifetimes are changed arbitrarily and also give the mass difference (could just say 0.2 GeV).
>
>
L60-67: At the end you talk about physics quantities like tan beta, mu, and the chargino-neutralino mass difference. In principle, I believe the lifetime is set by the masses (mainly mass difference) of the chargino and neutralino. I think you need to be clear that the lifetimes are changed arbitrarily and also give the mass difference (could just say 0.2 GeV).

Changed:
<
<
L113: pTmiss and pTmiss no mu should be vectors
>
>
L113: pTmiss and pTmiss no mu should be vectors

Changed:
<
<
L128: Should say why |eta|<2.1 is used.
>
>
L128: Should say why |eta|<2.1 is used.

Changed:
<
<
L131: need to specify the eta and pT requirements on the jets, perhaps in L103-107.
>
>
L131: need to specify the eta and pT requirements on the jets, perhaps in L103-107.

Changed:
<
<
L157: Should describe hadronic tau reconstruction. Could be at the end of L91-102 where electrons, muons, and charged hadron reconstruction is described.
>
>
L157: Should describe hadronic tau reconstruction. Could be at the end of L91-102 where electrons, muons, and charged hadron reconstruction is described.

Changed:
<
<
L168,L177: Given that your special procedure removes 4% of the signal tracks, it is natural to wonder what fraction of the signal tracks are removed by the requirements of L159-168.
>
>
L168,L177: Given that your special procedure removes 4% of the signal tracks, it is natural to wonder what fraction of the signal tracks are removed by the requirements of L159-168.

Changed:
<
<
L179: Commas after "Firstly" and "Secondly"
>
>
L179: Commas after "Firstly" and "Secondly"

Changed:
<
<
L190: Should make it clear that leptons here refers to electrons, muons, and taus.
>
>
L190: Should make it clear that leptons here refers to electrons, muons, and taus.

Changed:
<
<
L194, 196, 222, 231: The ordering of P_offline and P_trigger in L194,196 is different than in L222,231. Better to be consistent.
>
>
L194, 196, 222, 231: The ordering of P_offline and P_trigger in L194,196 is different than in L222,231. Better to be consistent.

Changed:
<
<
L204: I think you mean "excepting" rather than "expecting"
>
>
L204: I think you mean "excepting" rather than "expecting"

Changed:
<
<
L214: I don't think you need the subscript "invmass" given that you define it that way in L213.
>
>
L214: I don't think you need the subscript "invmass" given that you define it that way in L213.

Changed:
<
<
L222: Change "condition" to "conditional"
>
>
L222: Change "condition" to "conditional"

Changed:
<
<
L227: p_T^l should be a vector
>
>
L227: p_T^l should be a vector

Changed:
<
<
L234-238: This will need to be expanded to make it clear
>
>
L234-238: This will need to be expanded to make it clear

Changed:
<
<
L247: I don't think it is useful to mention a closure test with 2% of the data. I mean a 2% test may reveal something that is horribly wrong but it is not going to convince anyone that you know what you are doing.
>
>
L247: I don't think it is useful to mention a closure test with 2% of the data. I mean a 2% test may reveal something that is horribly wrong but it is not going to convince anyone that you know what you are doing.

Changed:
<
<
L339 and Table 3 caption: Suggest changing "signal yields" to "signal efficiencies"
>
>
L339 and Table 3 caption: Suggest changing "signal yields" to "signal efficiencies"

Changed:
<
<
Table 4: I guess to match the text it should be "spurious tracks" instead of "fake tracks"
>
>
Table 4: I guess to match the text it should be "spurious tracks" instead of "fake tracks"

Changed:
<
<
Lots of the references need minor fixing. The main problems are
>
>
Lots of the references need minor fixing. The main problems are
- volume letter needs to go with title and not volume number: refs 2, 8, 26, 27, 30, 39, 40, 41
- only the first page number should be given: refs 2, 19, 30
- no issue number should be given: refs 8, 13, 31
- PDG should use the Bibtex entry given here: https://twiki.cern.ch/twiki/bin/view/CMS/Internal/PubGuidelines
Changed:
<
<
- ref 40 needs help
>
>
- ref 40 needs help

<!--/twistyPlugin-->
Line: 327 to 331
Regarding the Z (or ewkino pair) recoil correction, are you really performing the following two steps for the signal: 1) reweight from Pythia8 to MG as a function of the recoil pt; 2) reweight again the resulting signal MC according to the data/MC observed recoil spectrum in Z->mumu events? Also, let me ask again (probably you answered this at the pre-approval meeting, but I forgot): the data/MC discrepancy at low dimuon pt was just due to the lack of MC dimuon events at low invariant mass? (this should be irrelevant given the ISR jet cut used in the analysis, but just to understand).
Changed:
<
<
This is correct, we apply both weights. In the ARC review of EXO-16-044 (in which Kevin Stenson was chair and will recall), it was noted that only applying the data/(MG MC) correction would only correct the MG distribution to that seen in data. As our signal is generated in Pythia, we need to correct Pythia's distribution to that of MG's first, otherwise the first correction is not applicable.
>
>
This is correct, we apply both weights. In the ARC review of EXO-16-044 (in which Kevin Stenson was chair and will recall), it was noted that only applying the data/(MG MC) correction would only correct the MG distribution to that seen in data. As our signal is generated in Pythia, we need to correct Pythia's distribution to that of MG's first, otherwise the first correction is not applicable.
Yes, the discrepancy at low dimuon pt is driven by the drell-yan samples; in 2017 the samples available were M > 5 GeV, and M > 10 GeV in 2018. Yes, this is irrelevant given the ISR cut.
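A minimal sketch of this two-step weighting, assuming binned recoil-pT shape histograms (all histogram contents and bin edges here are hypothetical): the per-event weight is (MG/Pythia8) x (data/MG), both ratios evaluated at the same recoil pT.

```python
import numpy as np

def make_weights(numerator_counts, denominator_counts):
    """Bin-by-bin ratio histogram, guarding against empty denominator bins."""
    num = np.asarray(numerator_counts, dtype=float)
    den = np.asarray(denominator_counts, dtype=float)
    return np.divide(num, den, out=np.ones_like(num), where=den > 0)

def isr_weight(recoil_pt, bin_edges, w_mg_over_pythia, w_data_over_mg):
    """Total per-event ISR weight: first correct Pythia8 -> MadGraph,
    then MadGraph -> data, both evaluated at the same recoil pT."""
    i = np.clip(np.digitize(recoil_pt, bin_edges) - 1, 0,
                len(w_mg_over_pythia) - 1)
    return w_mg_over_pythia[i] * w_data_over_mg[i]

# Toy, hypothetical normalized shapes in four recoil-pT bins
edges = np.array([0., 100., 200., 300., 1000.])
w1 = make_weights([3., 4., 2., 1.], [4., 4., 1., 1.])   # MG / Pythia8
w2 = make_weights([5., 5., 2., 1.], [4., 4., 2., 1.])   # data / MG
print(isr_weight(250., edges, w1, w2))  # both corrections, multiplicatively
```

Only the product of both weights corrects the Pythia8 signal shape all the way to data, which is the point of the two-step procedure above.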

If one of the trigger paths has a tighter cut (5 hits) than the offline cut, why did you not redefine the offline cuts and require >=5 hits when ONLY that trigger path is fired? I do not see any right to assume that we can count on an extra efficiency that does not really exist, even if it is small. Am I missing anything?

Changed:
<
<
There is a non-zero probability that a track has multiple hits associated to it in the same pixel (9% in one signal sample for example). This allows tracks with only 4 layers to have >=5 hits and fire the IsoTrk50 leg.
>
>
There is a non-zero probability that a track has multiple hits associated to it in the same pixel (9% in one signal sample for example). This allows tracks with only 4 layers to have >=5 hits and fire the IsoTrk50 leg.

Changed:
<
<
As you suspect, the addition of this trigger has a small effect on the efficiency for the nlayer = 4 bin as shown in the left plot below (of course for the nlayers > 6 bin the effect is much larger, as shown in the right plot).
>
>
As you suspect, the addition of this trigger has a small effect on the efficiency for the nlayer = 4 bin as shown in the left plot below (of course for the nlayers > 6 bin the effect is much larger, as shown in the right plot).

Line: 340 to 344

Deleted:
<
<
On a tangent matter: when can we expect to have any kind of 2018 results? Despite the suggestion from the EXO conveners, I am a bit uncomfortable with considering this step as a trivial top-up operation in an analysis like this one. We know by experience that each new year can give rise to new features and then change significantly the rate of pathological background events that we have to consider...
Changed:
<
<
The above section will for now provide immediate updates for 2018 results. We will also deliver them in an updated analysis note when available. Currently all background estimates are available in 2018 C. The nTuples are now complete in 2018 ABC and we are processing them quickly; we expect background estimates in ABC this week. The nTuples for 2018 D will progress quickly because we still have global CRAB priority.
>
>
The above section will for now provide immediate updates for 2018 results. See also this recent update covering 2018 ABC.
</>
<!--/twistyPlugin-->
Line: 359 to 362
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn-on in the selection? Anyway, according to Fig. 36 in the AN, quite some part of the signal is in the turn-on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn-on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn-on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.
Changed:
<
<
We do not cut away the turn-on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7 that describes an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region for those, which is roughly a 10% effect. However, since only about 10% of the signal is on the turn-on, this uncertainty results in yield uncertainties of 1.1% and 0.5% for ==4 and ==5 layers respectively. This is combined with the statistical uncertainties to reach a total trigger systematic uncertainty, which is presented in the updated AN in Table ???.
>
>
Since MET from ISR is only needed for the trigger strategy, as a search for the disappearing-tracks signature we wish to keep as much acceptance as possible. The small uncertainties you mention are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. We recently added Section 8.2.11 and Figure 44 to the AN in version 7, which introduce a signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region. Only about 10% of the signal is on the turn-on, so even a 10% uncertainty in the turn-on region results in only a ~1% yield systematic -- the new AN section arrives at 1.1% and 0.5% systematics for ==4 and ==5 layers respectively. In the next version of the AN we will combine all of the trigger signal systematics into one section to make it easier to read.
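The dilution argument above (an uncertainty confined to the turn-on affects the total yield only in proportion to the signal fraction sitting there) reduces to a two-line check; both fractions below are the round numbers from the text:

```python
# Fraction of signal events sitting on the trigger turn-on (~10%, per the text)
frac_on_turnon = 0.10
# Relative uncertainty assigned to the trigger efficiency in the turn-on region
turnon_uncertainty = 0.10

# Plateau events carry no extra uncertainty, so the effect on the total
# yield is diluted by the turn-on fraction.
yield_uncertainty = frac_on_turnon * turnon_uncertainty
print(f"{yield_uncertainty:.1%}")  # -> 1.0%
```

The quoted 1.1% and 0.5% per layer category come from the actual per-category turn-on fractions rather than this round-number estimate.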

#### Revision 372019-07-19 - ChristopherHill

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 329 to 329
This is correct, we apply both weights. In the ARC review of EXO-16-044 (in which Kevin Stenson was chair and will recall), it was noted that only applying the data/(MG MC) correction would only correct the MG distribution to that seen in data. As our signal is generated in Pythia, we need to correct Pythia's distribution to that of MG's first, otherwise the first correction is not applicable.
Changed:
<
<
The discrepancy at low dimuon pt is driven by the drell-yan samples; in 2017 the samples available were M > 5 GeV, and M > 10 GeV in 2018. Recall we also require a jet here with pt > 110 GeV since our signal requires that, and that also restricts the low-pt events we select. But due also to this jet requirement, in signal the electroweak-ino pair has high sum pt. For example for 700 GeV charginos with ctau = 100cm, the lowest event selected is at ~100 GeV and the mean pt is ~310 GeV -- where the data/MC ISR weights are closer to unity.
>
>
Yes, the discrepancy at low dimuon pt is driven by the drell-yan samples; in 2017 the samples available were M > 5 GeV, and M > 10 GeV in 2018. Yes, this is irrelevant given the ISR cut.
If one of the trigger paths has a tighter cut (5 hits) than the offline cut, why did you not redefine the offline cuts and require >=5 hits when ONLY that trigger path is fired? I do not see any right to assume that we can count on an extra efficiency that does not really exist, even if it is small. Am I missing anything?
Changed:
<
<
There is a very small but non-zero probability that a track has multiple hits associated to it in the same pixel layer (9% in one signal sample for example). This allows tracks with only 4 layers to have >=5 hits and fire the IsoTrk50 leg. But to your question, this is a small probability on top of the small probability of having MET on the turn-on, so it makes little difference. Adding that cut is statistically consistent with not adding it and relying on only the inclusive MET paths, as is shown in the left plot below.
>
>
There is a non-zero probability that a track has multiple hits associated to it in the same pixel (9% in one signal sample for example). This allows tracks with only 4 layers to have >=5 hits and fire the IsoTrk50 leg.

As you suspect, the addition of this trigger has a small effect on the efficiency for the nlayer = 4 bin as shown in the left plot below (of course for the nlayers > 6 bin the effect of is much more larger, as shown in the right plot).

#### Revision 362019-07-17 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 231 to 231
- Figure 29: Why do you not fit the region |dxy|<0.1cm? If you fit out to |dxy|=1.0cm, please show the entire range in the plots. It would be nice to see the results for nlayers=5 and 6 as well so we can evaluate the extent to which a fit may or may not be possible and whether the shape is consistent with nlayers=4.
Changed:
<
<
The plots being fit end at |dxy|<0.5cm so really the 1.0cm is incorrect and the full range is indeed shown; this has been updated in the AN. The NLayers5 plots are shown below with the fit from the NLayers4 overlaid, scaled to the integral of NLayers5 -- no new fit is shown.
>
>
The plots being fit end at |dxy|<0.5cm so really the 1.0cm is incorrect and the full range is indeed shown; this has been updated in the AN. The NLayers5 plots are shown below with the fit from NLayers4 overlaid, scaled to the integral of NLayers5 -- no new fit is shown. The NLayers6plus category has one event in ZtoMuMu (dxy ~ 0.06cm) and three events in ZtoEE (dxy ~ 0.05, 0.05, and 0.17 cm), so no fit is possible there.

>
>
 ZtoMuMu NLayers5 ZtoEE NLayers5

Changed:
<
<
Section 6.2.2: In Table 36, it would be enlightening to show the same results as in Table 35. That is, I am curious as to how P_fake compares between data and MC. Are these results normalized to 41 fb^-1? If so, then it seems like the MC predicts about 1/5 as many fake tracks as data. It is hard to be confident that the MC tells us anything if that is so.
>
>
Section 6.2.2: In Table 36, it would be enlightening to show the same results as in Table 35. That is, I am curious as to how P_fake compares between data and MC. Are these results normalized to 41 fb^-1? If so, then it seems like the MC predicts about 1/5 as many fake tracks as data. It is hard to be confident that the MC tells us anything if that is so.
Section 6.2.2: Your hypothesis is that the fake track rate is independent of selection so you can use the Z data to estimate the fake track rate in your signal region. I have suggested that you could also measure the fake track rate in QCD events to verify this. You can also check the effect in MC. I guess in Section 6.2.2 you apply the same criteria to MC as you do for data (selecting Z events). However, if your hypothesis is true, then you should also get the same fake rate if you use any MC sample. What happens if you use all the samples in Section 3.3 but remove the Table 33 and 34 requirements so you are using all events? If P_fake changes significantly, this is cause for concern. If not, then that is good. In either case, it still may not prove anything if the MC is really predicting 1/5 the amount of fake tracks.
Changed:
<
<
Figure 31: Would be good to have a plot for nlayers=5 as well.
>
>
Figure 31: Would be good to have a plot for nlayers=5 as well.
Figure 35: Please include the ratio of the two since this provides the scale factors that are used. It may be better to simply include the region of 50-300 GeV on a linear scale.
Line: 247 to 248
L760-762: How sure are we that the Z can be used to measure ISR difference for the signal model? I generally agree with the statement that both recoil off ISR but it would be nice if this could be confirmed somehow. Does the pT distribution for a 100 GeV chargino look similar to a Z in Pythia8? Does the ISR reweighting work for ttbar events or diboson events?
Changed:
<
<
L772-773: Please expand on "is applied to the simulated signal samples". Do you reweight the events using the ISR jet in the event or the net momentum of the produced SUSY particles or something else.
>
>
L772-773: Please expand on "is applied to the simulated signal samples". Do you reweight the events using the ISR jet in the event or the net momentum of the produced SUSY particles or something else.

Changed:
<
<
Section 8.2.2: Please expand on this. I am very surprised that this is such a small effect. Given the problems encountered with the Phase 1 pixel detector (problems with timing levels 1 and 3, way more noise than expected in layer 1 causing high thresholds, DC-DC converter failures, etc.) I would have expected big differences between data and simulation on quantities requiring hits in the pixel detector. I know the tracking reconstruction was changed at HLT and offline to keep track reconstruction efficiency high but this doesn't remove the problem of missing pixel hits. So please expand on how you measure these uncertainties. Do you just use tracks with pT>55 GeV that are associated with a muon? One problem with using muons to evaluate tracking efficiency is that there are special track reconstruction techniques developed to recover muons missed by the standard tracking. These tend to use wider windows to discover silicon hits and so may not reflect the track reconstruction of "standard" charged particles. You could perhaps remove the electron and tau vetos to see what you get in those cases.
>
>
The vector sum pt of the gen-level electroweak-ino pair is used to evaluate the weights. The AN is clarified.

Changed:
<
<
Section 8.2.5: This seems like an underestimate of the systematic uncertainty. If you had infinite data and MC statistics, your systematic uncertainty would be 0. As mentioned above, this doesn't address whether measuring the ISR using Z->mumu decays translates exactly into the ISR for the signal process. The paper mentions this is up to a 350% correction, so it is a big effect. I am very worried that the systematic uncertainty does not cover all that we don't know. I note that Figure 37 shows results with pT and pTmiss. Why did you use pT? Perhaps pTmiss could also be used as a systematic check.
>
>
Section 8.2.2: Please expand on this. I am very surprised that this is such a small effect. Given the problems encountered with the Phase 1 pixel detector (problems with timing levels 1 and 3, way more noise than expected in layer 1 causing high thresholds, DC-DC converter failures, etc.) I would have expected big differences between data and simulation on quantities requiring hits in the pixel detector. I know the tracking reconstruction was changed at HLT and offline to keep track reconstruction efficiency high but this doesn't remove the problem of missing pixel hits. So please expand on how you measure these uncertainties. Do you just use tracks with pT>55 GeV that are associated with a muon? One problem with using muons to evaluate tracking efficiency is that there are special track reconstruction techniques developed to recover muons missed by the standard tracking. These tend to use wider windows to discover silicon hits and so may not reflect the track reconstruction of "standard" charged particles. You could perhaps remove the electron and tau vetos to see what you get in those cases.

Changed:
<
<
Figure 43: I would suggest including the same comparison vs pT from the document you reference. This shows that for pT>55 GeV, the differences are similar as for lower pT.
>
>
The process is as described: the muon veto is removed from the candidate track selection, and further the missing inner/middle hits requirements are removed to form an N-2 distribution of missing inner/middle hits. The differences between data and simulation are mitigated by the global tag, which here was produced in the re-reco campaign after data-taking was completed. Further, the efficiency for muon tracks to have zero middle hits is already very high (this cut is more for rejecting fakes); the data/MC difference in the inefficiency of the missing inner/middle hits cut is 4.5%, but since the passing fraction is so much larger, the difference in efficiency is only 0.02%.
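To illustrate the last point with arithmetic (the inefficiency values below are hypothetical, chosen only to reproduce the scale of the quoted 4.5% and 0.02%): a sizable relative difference in a small inefficiency is a tiny difference in efficiency.

```python
# Hypothetical cut inefficiencies for data and MC
ineff_data = 0.00465
ineff_mc = 0.00445

# Relative data/MC difference in the inefficiency: ~4.3%, the order quoted
rel_diff_ineff = (ineff_data - ineff_mc) / ineff_data

# The corresponding efficiencies differ only at the 0.02% (absolute) level
eff_data = 1.0 - ineff_data
eff_mc = 1.0 - ineff_mc
diff_eff = abs(eff_data - eff_mc)
print(rel_diff_ineff, diff_eff)
```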

The muon control region is our signal region with the muon veto, ECalo, and missing outer hits cuts removed. So it is still a track selection (pt > 55, MET > 120, jet pt > 110), the tracks dominantly being from muons. Before the chargino decay, our signal tracks are muon-like in the reconstruction's treatment; it is in fact electrons/taus that behave differently than signal tracks as concerns missing inner/middle hits, so we need to keep those vetoes.

Section 8.2.5: This seems like an underestimate of the systematic uncertainty. If you had infinite data and MC statistics, your systematic uncertainty would be 0. As mentioned above, this doesn't address whether measuring the ISR using Z->mumu decays translates exactly into the ISR for the signal process. The paper mentions this is up to a 350% correction, so it is a big effect. I am very worried that the systematic uncertainty does not cover all that we don't know. I note that Figure 37 shows results with pT and pTmiss. Why did you use pT? Perhaps pTmiss could also be used as a systematic check.

The MadGraph/Pythia correction is the larger correction and the uncertainties in it are reasonably small due to the 10M-event sample size of the Pythia8 sample. The largest uncertainties are from the data/MC correction, which is statistically limited by the size of the recorded dataset. Moreover the uncertainties are largest where the signal populates least, which lowers the impact on the total yield. The pTmiss is a useful cross-check, but the sum-pT of a dimuon system has a much better resolution than pTmiss; it is a very common tool in characterizing the hadronic recoil in many analyses.

Figure 43: I would suggest including the same comparison vs pT from the document you reference. This shows that for pT>55 GeV, the differences are similar as for lower pT.

Section 8.2.11: Please explain this better. My understanding is that various prescales were in place. In the original measurement of the trigger efficiency in Section 7.3, you rely on being above the track requirement plateau to measure the trigger efficiency versus pTmiss in data (which has all of the various prescales naturally included) and compare to MC (which just has an OR of all trigger paths, I think). This is the main reason why the data efficiency is lower than the MC efficiency. Is this correct? Now, in this section, you are measuring the trigger efficiency solely with MC, which is an OR of all trigger paths. So if the trigger path with the track requirement fails, the MC might still find that an MHT only trigger will fire, while in data the MHT-only trigger may be prescaled. Isn't this a problem? Can you perhaps repeat the exercise using only triggers in MC that were never prescaled (as the opposite extreme to assuming there was never any prescale)? Also, why would you average over the chargino lifetimes? Shouldn't this systematic uncertainty depend very strongly on chargino lifetime?
Line: 344 to 355
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn-on in the selection? Anyway, according to Fig. 36 in the AN, quite some part of the signal is in the turn-on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn-on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn-on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.
Changed:
<
<
We do not cut away the turn-on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7 that describes an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region for those, which is roughly a 10% effect. However, since only about 10% of the signal is on the turn-on, this uncertainty results in yield uncertainties of 1.1% and 0.5% for ==4 and ==5 layers respectively. This is combined with the statistical uncertainties to reach a total trigger systematic uncertainty, which is presented in the updated AN in Table .
>
>
We do not cut away the turn-on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7 that describes an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region for those, which is roughly a 10% effect. However, since only about 10% of the signal is on the turn-on, this uncertainty results in yield uncertainties of 1.1% and 0.5% for ==4 and ==5 layers respectively. This is combined with the statistical uncertainties to reach a total trigger systematic uncertainty, which is presented in the updated AN in Table ???.

#### Revision 352019-07-17 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 186 to 186
The table values were incorrect because of the "0 \pm 0". This has been corrected to "0_{-0}^{+8.2}" using the Poisson 1.15; it must, however, be multiplied by the tau trigger prescale of ~7.2. Following equation 39.42 in that link yields an estimate of 0 -0 +0.18, which is now the quoted estimate. This results in a roughly 2.5% increase in the total lepton statistical uncertainty and a 1% increase in the total background statistical uncertainty. Do note that this affected only the table values -- the upper limits used the correct gamma-distributed 0 with alpha 0.0222 \pm 0.0051.
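The "Poisson 1.15" above is the one-sided 68% CL upper limit on a Poisson mean given zero observed events; scaling it by the ~7.2 prescale from the text gives the quoted ~8.2. A sketch of that arithmetic (the gamma-distribution treatment used in the limits is not reproduced here):

```python
import math

def poisson_zero_upper_limit(cl=0.6827):
    """One-sided upper limit on the Poisson mean with zero observed events:
    solve exp(-mu) = 1 - CL for mu."""
    return -math.log(1.0 - cl)

ul = poisson_zero_upper_limit()  # ~1.15 events before the prescale
prescale = 7.2                   # approximate tau-trigger prescale from the text
print(ul, ul * prescale)         # the product is close to the quoted 8.2
```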
Changed:
<
<
Section 6.1.5: There needs to be more information. Do you calculate P_veto, P_offline, and P_trigger for each of the leptons using simulated samples following the same recipe as for data? If so, what simulated samples? Do you just use Z->ll for P_veto and W->lnu for the others? Does the single lepton control region come from just W->lnu events? Or do you include all the background samples in Tables 7-8? I think using all of the samples from Tables 7-8 for every calculation would make the most sense.
>
>
Section 6.1.5: There needs to be more information. Do you calculate P_veto, P_offline, and P_trigger for each of the leptons using simulated samples following the same recipe as for data? If so, what simulated samples? Do you just use Z->ll for P_veto and W->lnu for the others? Does the single lepton control region come from just W->lnu events? Or do you include all the background samples in Tables 7-8? I think using all of the samples from Tables 7-8 for every calculation would make the most sense.

Changed:
<
<
Section 6.1.5: It seems like even with the relaxed selection criteria, you are still quite lacking in MC statistics for this test. You mention that you only include ttbar events. Is this just the ttbar semileptonic sample? Although this may be the largest single sample, it seems like you could also include other samples. Most importantly would be the W->lnu and Z-> invisible (including the HT-binned samples) as these seem to be the largest source of background in Figures 14 and 15. Is there some reason you didn't include these? If not, I suggest you go ahead and do this.
>
>
With the modifications given, we calculate the background estimates in precisely the same way as in data. The AN has been clarified on that point. For the lepton closure we use only the ttbar samples because, given the small P(veto), the other samples contribute negligible statistics to this study.

Changed:
<
<
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.
>
>
Section 6.1.5: It seems like even with the relaxed selection criteria, you are still quite lacking in MC statistics for this test. You mention that you only include ttbar events. Is this just the ttbar semileptonic sample? Although this may be the largest single sample, it seems like you could also include other samples. Most importantly would be the W->lnu and Z-> invisible (including the HT-binned samples) as these seem to be the largest source of background in Figures 14 and 15. Is there some reason you didn't include these? If not, I suggest you go ahead and do this.

Changed:
<
<
Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.
>
>
All three ttbar samples are used; the other samples do not affect the study because of the small P(veto). Figures 14 and 15 compare the disappearing-track selection, but the lepton background estimates do not use it; the (here modified) single-lepton control region selections and tag-and-probe selections are used instead. W->lnu and Z->invisible do not contribute to the tag-and-probe study of P(veto); the Z->ll sample should contribute, however it has considerably smaller statistics than the ttbar samples.

Changed:
<
<
Section 6.2: This needs to be cleaned up and explained better. Here are some specific comments/suggestions:

- I think that L566-569, Figures 26-28, and L602-615 can all be removed. It seems like they have nothing to do with the analysis that is done. They just lead to confusion. If you want to move this material to Appendix C, that is fine. But don't clutter up this section.

- I'm confused by the transfer factor. I assume that the fit in Figure 29 is actually a Gaussian + flat line. Is that correct? What is your hypothesis about what is contained in the Gaussian area and what is contained in the flat line area? I would have assumed that the Gaussian contribution indicates real tracks (since they peak at d0=0) and the flat line contribution indicates fake tracks. But this doesn't seem to match your hypothesis. Can you say exactly what the fit in Eq. 14 is doing? In L629-630 you quote a single transfer factor for each Z mode. Shouldn't there be a different transfer factor for each of the 9 sideband regions?

- What best describes your assumption of the fake track rate as a function of d0. Is it uniform (flat), Gaussian, Gaussian+flat, or something else?

- I don't see the advantage of having 9 different sideband regions. Simply take the sum of events from 0.05-0.5 and multiply by the overall transfer factor. This should minimize the statistical uncertainty. In fact, I would suggest combining the Z->mumu and Z->ee samples as well. Also, remember to use the correct Poisson uncertainties (as discussed for Table 31) when you only have a handful of events. If you somehow think it is a good idea to have 18 different measurements instead of 1 and you are using a transfer factor with an uncertainty, make sure to properly account for the fact that this uncertainty is correlated for different bins.

- L635-638: As mentioned above, I would suggest combining the Z->mumu and Z->ee results to get the final estimate, seeing as you are statistics limited. You can still use the difference between the two as a systematic uncertainty (but see below).

- L640-645: It is obvious that Z->mumu and Z->ee events are quite similar. They have the same production mechanism, they are selected by single lepton triggers, etc. So, it is not much of a test to show that they give the same result. On the other hand, your signal region requires large missing ET, a high pT jet, and a high pT isolated track that is neither a muon or electron. One might worry that the fake track rate depends on the amount of hadronic activity in an event, which is likely higher in the signal region than in Z events. One might also worry that the fake track rate depends on pileup, and the signal trigger/selection may be more susceptible to pileup than the single lepton trigger/selection. Ideally, I would suggest that you perform the same measurement on a QCD dominated region (like requiring a dijet or quadjet trigger or just high HT). You can require pTmiss,no mu < 100 GeV to ensure no signal contamination. If this is not possible, then you could consider taking what you have and either reweighting the pileup and HT distribution to match the signal region or checking that the fake rate is independent of these quantities.

- L649-653: I don't understand how these numbers are consistent with Figure 29. In Figure 29 (left) it seems there are about 9 events with |dxy|<0.02cm and about 15 with 0.05<|dxy|<0.10cm to be compared with 32 events and 68 events. There is a similar discrepancy for electrons. I guess the plots have been scaled for some reason as the entries are not integers. Please fix the plots and verify the results are consistent.

- Figure 29: Why do you not fit the region |dxy|<0.1cm? If you fit out to |dxy|=1.0cm, please show the entire range in the plots. It would be nice to see the results for nlayers=5 and 6 as well so we can evaluate the extent to which a fit may or may not be possible and whether the shape is consistent with nlayers=4.
>
>
Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.

Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.

All samples are used and the caption now says this. The cleaning cuts are only relevant for the nlayers=3 category (which is otherwise not used in the analysis), and showing this for nlayers=4 would provide the same information as Figure 31.

Section 6.2: This needs to be cleaned up and explained better. Here are some specific comments/suggestions - I think that L566-569, Figures 26-28, and L602-615 can all be removed. It seems like they have nothing to do with the analysis that is done. They just lead to confusion. If you want to move this material to Appendix C, that is fine. But don't clutter up this section.

These have been moved to Appendix C.

- I'm confused by the transfer factor. I assume that the fit in Figure 29 is actually a Gaussian + flat line. Is that correct? What is your hypothesis about what is contained in the Gaussian area and what is contained in the flat line area? I would have assumed that Gaussian contribution indicates real tracks (since they peak at d0=0) and the flat line contribution indicates fake tracks. But this doesn't seem to match your hypothesis. Can you say exactly what the fit in Eq. 14 is doing? In L629-630 you quote a single transfer factor for each Z mode. Shouldn't there be a different transfer factor for each of the 9 sideband regions?

a) Correct, the fit is a Gaussian + constant. The AN has been clarified.

b) Our hypothesis is that this is a bias in the track-fitting algorithm, where short tracks with very few hits have the importance of the primary vertex inflated, drawing tracks closer to the PV. Figure 27 shows this issue is even stronger for 3-layer (even shorter) tracks, and that it also occurs in SM background MC -- it is not a data-exclusive feature. Also, none of these tracks in background MC are near any hard-interaction truth particle, which is our definition of "fake" in MC truth.

c) The purpose of the transfer factor is purely the normalization of the sideband rates to the signal region; if we observed a flat distribution the transfer factor would be the ratio of window widths, 0.04 cm / 0.10 cm = 0.4, everywhere. We do not observe a flat distribution and must describe this normalization, which is the purpose of the fit in Equation 14. The uncertainty in the fit is also important in this normalization, because the statistical uncertainty in our observed P^raw_fake is separate from the uncertainty in the relative rates between the sidebands and the signal region.

d) L629-630 quotes the transfer factor for the baseline sideband (0.05, 0.10) cm, so only one. The authors felt that Table 35 was large enough already, but we now provide an additional table listing the P^raw_fake and transfer factors.
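The normalization logic described in (c) and (d) can be sketched as follows; this is a hedged illustration with hypothetical fit parameters, not the fitted values from the AN:

```python
import math

# Hedged sketch of the transfer-factor logic: a Gaussian + flat shape
# f(x) = A*exp(-x^2/(2*sigma^2)) + C fitted to |dxy|, then integrated over
# the signal window and a sideband. All parameter values are hypothetical.

def gauss_plus_flat_integral(a, sigma, c, lo, hi):
    """Integral of A*exp(-x^2/(2 sigma^2)) + C over [lo, hi]."""
    gauss = a * sigma * math.sqrt(math.pi / 2.0) * (
        math.erf(hi / (math.sqrt(2.0) * sigma))
        - math.erf(lo / (math.sqrt(2.0) * sigma))
    )
    return gauss + c * (hi - lo)

def transfer_factor(a, sigma, c, sig_hi=0.02, sb_lo=0.05, sb_hi=0.10):
    """Normalization of the sideband rate to the signal region:
    ratio of the fitted-shape integrals (windows in cm, from the text)."""
    n_sig = gauss_plus_flat_integral(a, sigma, c, 0.0, sig_hi)
    n_sb = gauss_plus_flat_integral(a, sigma, c, sb_lo, sb_hi)
    return n_sig / n_sb

# For a purely flat shape (A = 0) the transfer factor reduces to the
# ratio of the window widths used in the integrals, 0.02 / 0.05 = 0.4.
print(transfer_factor(a=0.0, sigma=0.01, c=1.0))
```

A non-zero Gaussian component concentrated near dxy = 0 pushes the transfer factor above the flat-shape value, which is why the fit (and its uncertainty) matters for the normalization.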

- What best describes your assumption of the fake track rate as a function of d0. Is it uniform (flat), Gaussian, Gaussian+flat, or something else?

Gaussian + flat.

- I don't see the advantage of having 9 different sideband regions. Simply take the sum of events from 0.05-0.5 and multiply by the overall transfer factor. This should minimize the statistical uncertainty. In fact, I would suggest combining the Z->mumu and Z->ee samples as well. Also, remember to use the correct Poisson uncertainties (as discussed for Table 31) when you only have a handful of events. If you somehow think it is a good idea to have 18 different measurements instead of 1 and you are using a transfer factor with an uncertainty, make sure to properly account for the fact that this uncertainty is correlated for different bins.

- L635-638: As mentioned above, I would suggest combining the Z->mumu and Z->ee results to get the final estimate, seeing as you are statistics limited. You can still use the difference between the two as a systematic uncertainty (but see below).

- L640-645: It is obvious that Z->mumu and Z->ee events are quite similar. They have the same production mechanism, they are selected by single lepton triggers, etc. So, it is not much of a test to show that they give the same result. On the other hand, your signal region requires large missing ET, a high pT jet, and a high pT isolated track that is neither a muon or electron. One might worry that the fake track rate depends on the amount of hadronic activity in an event, which is likely higher in the signal region than in Z events. One might also worry that the fake track rate depends on pileup, and the signal trigger/selection may be more susceptible to pileup than the single lepton trigger/selection. Ideally, I would suggest that you perform the same measurement on a QCD dominated region (like requiring a dijet or quadjet trigger or just high HT). You can require pTmiss,no mu < 100 GeV to ensure no signal contamination. If this is not possible, then you could consider taking what you have and either reweighting the pileup and HT distribution to match the signal region or checking that the fake rate is independent of these quantities.

- L649-653: I don't understand how these numbers are consistent with Figure 29. In Figure 29 (left) it seems there are about 9 events with |dxy|<0.02cm and about 15 with 0.05<|dxy|<0.10cm to be compared with 32 events and 68 events. There is a similar discrepancy for electrons. I guess the plots have been scaled for some reason as the entries are not integers. Please fix the plots and verify the results are consistent.

These were scaled incorrectly; the plots are now fixed and agree with the text.

- Figure 29: Why do you not fit the region |dxy|<0.1cm? If you fit out to |dxy|=1.0cm, please show the entire range in the plots. It would be nice to see the results for nlayers=5 and 6 as well so we can evaluate the extent to which a fit may or may not be possible and whether the shape is consistent with nlayers=4.

The fits extend only to |dxy| < 0.5 cm, so the quoted 1.0 cm was incorrect and the full range is indeed shown; this has been updated in the AN. The NLayers5 plots are shown below with the NLayers4 fit overlaid, scaled to the NLayers5 integral -- no new fit is performed.

Section 6.2.2: In Table 36, it would be enlightening to show the same results as in Table 35. That is, I am curious as to how P_fake compares between data and MC. Are these results normalized to 41 fb^-1? If so, then it seems like the MC predicts about 1/5 as many fake tracks as data. It is hard to be confident that the MC tells us anything if that is so.
Line: 313 to 344
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn-on in the selection? Anyway, according to Fig. 36 in the AN, quite a large part of the signal is in the turn-on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn-on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn-on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.
Changed:
<
<
We do not cutaway the turn on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7 that describes an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region for those, which is roughly a 10% effect. However, since only about 10% of the signal is on the turn-on, this uncertainty results in yield uncertainties of 1.1% and 0.5% for ==4 and ==5 layers respectively. This is combined with the statistical uncertainties to reach a total trigger systematic uncertainty, which is presented in the updated AN in Table .
>
>
We do not cutaway the turn on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7 that describes an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region for those, which is roughly a 10% effect. However, since only about 10% of the signal is on the turn-on, this uncertainty results in yield uncertainties of 1.1% and 0.5% for ==4 and ==5 layers respectively. This is combined with the statistical uncertainties to reach a total trigger systematic uncertainty, which is presented in the updated AN in Table .

Line: 858 to 889

 META FILEATTACHMENT attachment="ratio_of_efficiencies.png" attr="" comment="" date="1562621624" name="ratio_of_efficiencies.png" path="ratio_of_efficiencies.png" size="111310" user="bfrancis" version="1" attachment="compareWithWithout700GeV100cmNLayers4.jpg" attr="" comment="" date="1562699528" name="compareWithWithout700GeV100cmNLayers4.jpg" path="compareWithWithout700GeV100cmNLayers4.jpg" size="107239" user="bfrancis" version="2" attachment="compareWithWithout700GeV100cmNLayers6plus.jpg" attr="" comment="" date="1562699529" name="compareWithWithout700GeV100cmNLayers6plus.jpg" path="compareWithWithout700GeV100cmNLayers6plus.jpg" size="105856" user="bfrancis" version="2"
>
>
 META FILEATTACHMENT attachment="tf_ZtoEE_NLayers5.png" attr="" comment="" date="1563389063" name="tf_ZtoEE_NLayers5.png" path="tf_ZtoEE_NLayers5.png" size="15491" user="bfrancis" version="1" attachment="tf_ZtoMuMu_NLayers5.png" attr="" comment="" date="1563389063" name="tf_ZtoMuMu_NLayers5.png" path="tf_ZtoMuMu_NLayers5.png" size="17113" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 342019-07-15 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 111 to 111
L338-341: Would be nice to show the signal and background distributions for these two variables so we can evaluate for ourselves the effectiveness of the cut. Also, it would be helpful to report the background rejection and signal efficiency for these cuts.
Deleted:
<
<
# he wants N-1 plots
Table 18: Are there no standard Tracking POG quality requirements on the tracks? I think they still have "loose" and "highPurity" requirements based on an MVA. Do you require either of these?

These exist but we do not use them; the standard quality flags do not make requirements on the hit pattern for example.

Line: 127 to 125
L368-370 and Table 20: Would be nice to see plots of Delta R to see why 0.15 is chosen. I would have expected a smaller value for muons and a larger value for electrons.
Deleted:
<
<
# n-1
L363-370: Just to be clear, there are no requirements on the pT of the leptons? So, if there is a 4 GeV muon within Delta R of 0.15 of a 100 GeV track, then you reject the track?

As above, these are all available in MINIAOD, which has a very minimal set of slimming requirements; for example, muons passing the PF ID have no pT requirement whatsoever. If such a muon as you describe is near our track, then yes, we reject the track.

Line: 149 to 145
Recall that this method was suggested by you in the review of EXO-16-044. The intent of the same-sign subtraction is to increase the purity of T&P events actually containing the lepton flavor under study. To the suggestion that the continuum DY is more correctly signal and isn't equal in rate to the opposite-sign sample, we agree and have reworded the AN's description here. We have also included plots of the invariant mass distributions of the T&P samples. Lastly it is not our intent to estimate the non-Z backgrounds here, but to increase the purity of the lepton flavor under study.
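The same-sign subtraction described above amounts to a simple background-subtracted ratio; a minimal sketch, with hypothetical event counts (Equation 6 in the AN defines the actual inputs):

```python
# Hedged sketch of the same-sign subtraction used in the tag-and-probe
# measurement of P(veto). Opposite-sign (OS) pairs contain the Z signal
# plus charge-symmetric backgrounds; same-sign (SS) pairs estimate the
# latter. All counts below are hypothetical illustration values.

def p_veto(n_os_pass, n_ss_pass, n_os_all, n_ss_all):
    """Probability for a probe lepton track to survive the veto,
    with the charge-symmetric background subtracted."""
    return (n_os_pass - n_ss_pass) / (n_os_all - n_ss_all)

# Hypothetical: 14 OS and 2 SS probes pass the veto, out of 101000 OS
# and 1000 SS probes total -> 12 / 100000 = 1.2e-4.
print(p_veto(n_os_pass=14, n_ss_pass=2, n_os_all=101000, n_ss_all=1000))
```

The subtraction raises the purity of the lepton flavor under study without attempting to estimate the non-Z backgrounds themselves, which is the point made in the answer above.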
Changed:
<
<
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.
>
>
L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.

Deleted:
<
<
# remind him he wanted this # maybe add the invmass plots

>
>
L472-5: Are these recHits from the tracker or the calorimeter? I don't really understand what you are describing here. Are the electron seeds from the ECAL or pixel detector?

>
>
recHits will always refer to calorimeter recHits. Electron seeds are ECAL superclusters matched to seed tracks in or close to the pixels -- so electron seeds are from both. We've reworded this section for increased clarity.

Changed:
<
<
L472-5: Are these recHits from the tracker or the calorimeter? I don't really understand what you are describing here. Are the electron seeds from the ECAL or pixel detector?
>
>
L482: It looks like the probe tracks already have a pT cut > 30 GeV. So going down to 20 GeV is just being extra safe. Is that right?

>
>
Correct.

>
>
L478-487: It is not clear to me. Is this re-reconstruction needed for the signal region or not? Was it done for the signal region?

Changed:
<
<
L482: It looks like the probe tracks already have a pT cut > 30 GeV. So going down to 20 GeV is just being extra safe. Is that right?
>
>
The signal region requires track pT > 55 GeV, where the reconstruction already keeps all nearby recHits and considers all of them as potential electron seeds. This re-reconstruction is needed only for the P(veto) measurement, because below 50 GeV the reconstruction changes and differs from that of the signal region.

Changed:
<
<
L478-487: It is not clear to me. Is this re-reconstruction needed for the signal region or not? Was it done for the signal region?
>
>
Figure 19 and Tables 29-31. If I try to integrate the plots in Figure 19, I would estimate that the integral of red/blue is roughly 10^-6, 10^-7, and 10^-4, for electrons, muons, and taus, respectively. I would expect this to be approximately equal to P_veto. But in Tables 29-31, I find P_veto numbers of 10^-5, 10^-6, and 10^-3 for electrons, muons, and taus for nlayers>=6. So roughly a factor of 10 off. Can you explain this?

Changed:
<
<
# signal is > 55, it's not needed and not done.
>
>
Figure 19 shows all probe tracks; this includes events where the tag + probe do not pass the Z cuts. Recall too that in the review of EXO-16-044 you recommended we utilize all possible tag-and-probe pairs in every event, so the number of pairs also differs from the number of events. Lastly, there is the same-sign subtraction. As you say, these figures are related to the value of P(veto) but are not precisely equal to it.

>
>
Figures 19, 21, 25, and 40 and page 72: I would suggest removing footnote #17 on page 72 and adding that information into the captions for figures 19, 21, 25, and 40. You should also add a similar explanation to the caption of Table 31 indicating that N_ctrl is scaled to the signal region luminosity.

Changed:
<
<
Figure 19 and Tables 29-31. If I try to integrate the plots in Figure 19, I would estimate that the integral of red/blue is roughly 10^-6, 10^-7, and 10^-4, for electrons, muons, and taus, respectively. I would expect this to be approximately equal to P_veto. But in Tables 29-31, I find P_veto numbers of 10^-5, 10^-6, and 10^-3 for electrons, muons, and taus for nlayers>=6. So roughly a factor of 10 off. Can you explain this?
>
>
Figure 22: Would be good to show the results for nlayers=4 and nlayers=5 (unless there are no entries in which case you should note that in the caption), similar to the way you show the results for tau for nlayers=5 even though it is not used.

Changed:
<
<
# you told us to do the all-possible-combos thing
>
>
There are 1 and 2 events for nlayers=4 and nlayers=5, respectively. Thus these plots are even less helpful than the top two plots in Figure 21. We've added a comment to the caption of Figure 22 to explain this.

Changed:
<
<
Figures 19, 21, 25, and 40 and page 72: I would suggest removing footnote #17 on page 72 and adding that information into the captions for figures 19, 21, 25, and 40. You should also add a similar explanation to the caption of Table 31 indicating that N_ctrl is scaled to the signal region luminosity.
>
>
L514-518 and Tables 29-31: I note that P_offline for electrons and muons is very similar, around 80%, while for taus it is much lower, around 20%. Do you understand the difference and do you think it is OK for the method. I can imagine two effects that could cause this. First, it could be that since the pion from the tau decay does not carry all of the tau momentum, the tau candidates from W decays will have a lower pT than the muon or electron candidates from W decays. So when the tau pT gets added to pTmiss, it will get shifted less than when the electron pT gets added and so more will fail the Ecalo cut. Based on Figure 25, this seems to be true and is probably innocuous. But I think there is more to it. Comparing Figures 20 and 21, the modified pTmiss for the tau case has a large contribution at the bottom left corner that is not present in the electron (or muon) case. It seems that the electrons and muon are very consistent with the topology of a W recoiling from an ISR jet so delta phi is ~pi. However, for the tau case, there seems to be many events where the "tau" is part of the leading jet. I guess that since there is a Delta R cut of 0.5, the "tau" must differ from the jet a bit in eta. Given this evidence and the fact that we know that the tau purity is much worse than electron and muon purity, it seems likely that many of the events in Figure 21 do not contain taus. I would guess the events are multijet QCD events with either an isolated track by chance or a fake track. So, my hypothesis is that the single tau control region has a large contamination of non-tau events. You use the same sample for measuring P_offline and the multiplying by P_offline as part of estimating the tau background. So we could consider this estimate as being the tau+single hadronic track+fake track contribution. But there are two problems with that. First, P_veto is measured on a much purer sample of taus as it uses Z decays and subtracts the same-sign contribution. 
So P_veto is really measuring taus, not the sum of tau+single hadronic track+fake track. And P_veto may not be the same if the other contributions were included. Second, you have a separate measurement of the fake track contribution so you would be double counting. Please let me know what you think.

Changed:
<
<
Figure 22: Would be good to show the results for nlayers=4 and nlayers=5 (unless there are no entries in which case you should note that in the caption), similar to the way you show the results for tau for nlayers=5 even though it is not used.
>
>
Section 6.1.3: I think you need to be a little more clear here. I want to confirm that I understand. First, the figures mention HLT efficiency but I think this is really the full trigger efficiency (L1+HLT). Do you agree? Second, the trigger efficiencies shown in Figure 23 and 24 are the actual results from the L1+HLT that was run and the x-axis refers to the actual pTmiss,nomu of the event. That is, the x-axis is not the modified pTmiss,nomu where the electron pT is added back in. Is that correct? Then, the x-axis of Figure 25 shows the modified pTmiss,nomu with the electron or tau pT added back in. Is that correct? Try to make the text and figures a bit clearer.

Changed:
<
<
L514-518 and Tables 29-31: I note that P_offline for electrons and muons is very similar, around 80%, while for taus it is much lower, around 20%. Do you understand the difference and do you think it is OK for the method. I can imagine two effects that could cause this. First, it could be that since the pion from the tau decay does not carry all of the tau momentum, the tau candidates from W decays will have a lower pT than the muon or electron candidates from W decays. So when the tau pT gets added to pTmiss, it will get shifted less than when the electron pT gets added and so more will fail the Ecalo cut. Based on Figure 25, this seems to be true and is probably innocuous. But I think there is more to it. Comparing Figures 20 and 21, the modified pTmiss for the tau case has a large contribution at the bottom left corner that is not present in the electron (or muon) case. It seems that the electrons and muon are very consistent with the topology of a W recoiling from an ISR jet so delta phi is ~pi. However, for the tau case, there seems to be many events where the "tau" is part of the leading jet. I guess that since there is a Delta R cut of 0.5, the "tau" must differ from the jet a bit in eta. Given this evidence and the fact that we know that the tau purity is much worse than electron and muon purity, it seems likely that many of the events in Figure 21 do not contain taus. I would guess the events are multijet QCD events with either an isolated track by chance or a fake track. So, my hypothesis is that the single tau control region has a large contamination of non-tau events. You use the same sample for measuring P_offline and the multiplying by P_offline as part of estimating the tau background. So we could consider this estimate as being the tau+single hadronic track+fake track contribution. But there are two problems with that. First, P_veto is measured on a much purer sample of taus as it uses Z decays and subtracts the same-sign contribution. 
So P_veto is really measuring taus, not the sum of tau+single hadronic track+fake track. And P_veto may not be the same if the other contributions were included. Second, you have a separate measurement of the fake track contribution so you would be double counting. Please let me know what you think.

Section 6.1.3: I think you need to be a little more clear here. I want to confirm that I understand. First, the figures mention HLT efficiency but I think this is really the full trigger efficiency (L1+HLT). Do you agree? Second, the trigger efficiencies shown in Figure 23 and 24 are the actual results from the L1+HLT that was run and the x-axis refers to the actual pTmiss,nomu of the event. That is, the x-axis is not the modified pTmiss,nomu where the electron pT is added back in. Is that correct? Then, the x-axis of Figure 25 shows the modified pTmiss,nomu with the electron or tau pT added back in. Is that correct? Try to make the text and figures a bit clearer.

>
>
You are correct. Figures 23 and 24 are now labeled as just "trigger efficiency", and the caption for Figure 25 has been made clearer.
Figure 25 and Tables 29 and 31: Figure 25 seems to show that the electron distribution is shifted higher than the tau distribution. Therefore, once you convolute this distribution with the trigger efficiency, I would expect the electron trigger efficiency to be higher than the tau trigger efficiency. However, the opposite is true. If I naively take the trigger efficiency as a step function which is 0% for pTmiss<200 GeV and 100% for pTmiss>200 GeV, I think I get about 30% for electrons and 13% for the tau, compared to 46% and 52%. Can you check the results and if correct, try to explain what I am missing?
Table 31: How did you determine the uncertainties for nlayers=4? The upper uncertainty of 0 for N^l_ctrl and the upper uncertainty of 0.0058 for estimate seem too small. Actually nlayers = 4 and nlayers = 5 for both muons and taus (Tables 30 and 31) have yields that are too small to assume Gaussian uncertainties. You should use Poisson uncertainties. You can ask the statistics committee for better advice but I think more correct uncertainties would be:

n = 0: +1.15 -0
n = 1: +1.36 -0.62
n = 2: +1.52 -0.86

This comes from the prescription on page 32 of http://pdg.lbl.gov/2019/reviews/rpp2018-rev-statistics.pdf but again, the statistics committee may have another prescription. Note that I think this results in an estimate for the tau background for nlayers=4 to be 0 +1.9 -0.0 rather than 0 +0.0058 -0.

The table values were incorrect because of the "0 \pm 0". This has been corrected to "0_{-0}^{+8.2}" using the Poisson +1.15, which must however be multiplied by the tau trigger prescale of ~7.2. Following Equation 39.42 in that link yields an estimate of 0_{-0}^{+0.18}, which is now the quoted estimate. This results in a roughly 2.5% increase in the total lepton statistical uncertainty and a 1% increase in the total background statistical uncertainty. Note that this affected only the table values -- the upper limits used the correct gamma-distributed 0 with alpha 0.0222 \pm 0.0051.
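For reference, the intervals quoted in the comment (+1.15, +1.36/-0.62, +1.52/-0.86) are consistent with the one-sided-tail Poisson convention, where the upper (lower) bound leaves a tail probability of 1 - 0.6827 below (above) the observation. A minimal stdlib-only sketch, assuming that convention (all function names are ours):

```python
from math import exp

def pois_cdf(n, lam):
    """P(X <= n) for a Poisson distribution with mean lam (integer n >= 0)."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= lam / k
        total += term
    return exp(-lam) * total

def _bisect_decreasing(f, target, lo=0.0, hi=100.0, iters=200):
    """Solve f(x) = target by bisection for a monotonically decreasing f."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def poisson_interval(n, cl=0.6827):
    """Bounds solving P(X <= n; up) = 1 - cl and P(X >= n; lo) = 1 - cl."""
    tail = 1.0 - cl
    up = _bisect_decreasing(lambda lam: pois_cdf(n, lam), tail)
    # P(X >= n; lam) = 1 - P(X <= n-1; lam), so invert the decreasing CDF:
    lo = _bisect_decreasing(lambda lam: pois_cdf(n - 1, lam), cl) if n > 0 else 0.0
    return lo, up
```

For n = 0, 1, 2 this reproduces the quoted +1.15/-0, +1.36/-0.62, and +1.52/-0.86 to the stated precision.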

Section 6.1.5: There needs to be more information. Do you calculate P_veto, P_offline, and P_trigger for each of the leptons using simulated samples following the same recipe as for data? If so, what simulated samples? Do you just use Z->ll for P_veto and W->lnu for the others? Does the single lepton control region come from just W->lnu events? Or do you include all the background samples in Tables 7-8? I think using all of the samples from Tables 7-8 for every calculation would make the most sense.

#### Revision 332019-07-15 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 79 to 79
</>
<!--/twistyPlugin-->

## First set of comments from Kevin Stenson HN July 14

<!--/twistyPlugin twikiMakeVisibleInline-->

Table 5: There seems to be a third column of percentages but there is no column heading or description of what this is in the caption or text. Please clarify.

The \pm was missing; fixed.

L208-209, Table 9: You mention the single tau trigger was prescaled. This appears to result in less than 6 fb^-1 of luminosity. Is there no other single tau trigger with less of a prescale that could be used? What is the situation for 2018 data?

Yes, Table 1 details it as 5.75/fb. The only other single tau triggers have much higher thresholds or additional requirements that are unsuitable. The prescale was higher in 2018.

L243-4: Are there eta restrictions on the jets you use? |eta|<5 or |eta|<3 or |eta|<2.4 or |eta|<2.1 or something else? In Table 17 there is a cut of |eta|<2.4 but I'm not clear if this applies to all jets used in the analysis.

Yes, |eta| < 4.5 overall; this is added to the AN. Different requirements are as listed, e.g. in Table 17. The 10 GeV has also been fixed to 30 GeV: MINIAOD contains only jets with pT > 10 GeV, but the analysis considers only those with pT > 30 GeV.

L260: I have a general idea of how electrons and muons are reconstructed but not so much with the taus. I seem to think there is some flexibility in tau_h reconstruction. Can you add some information about how the taus are reconstructed? I seem to recall that one can select (with some ambiguity) tau+ decays to pi+ or pi+ pi0 or pi+ 2pi0 or pi+ pi- pi+. Do you consider all tau_h decays or just one-prong decays? It wouldn't hurt to add a few sentences about muon and electron reconstruction as well (or at least some references).

As stated we use the POG recommended decay mode reconstruction with light flavor rejection which does target multiple tau_h decays. This selection is only used to normalize the tau_h background estimate and must be inclusive to all tau_h decays. A brief reference has been added for the PF lepton reconstruction description.

L260-4: Have you checked the efficiency for selecting the correct primary vertex in signal events? One could imagine selecting the vertex based on the origin of the isolated track and/or ISR jet.

We use the standard PV recommendation using the highest sum-pt^2 vertex (L 260-264); we have no need of a specialized vertexing requirement. Figure 26 for example demonstrates that signal tracks are well-associated with the PV already.

Table 17: I'm a little confused. I get that there must be at least one jet which simultaneously has pT>110 GeV and |eta|<2.4 and passing tight ID with lepton veto. But when you measure max |Delta phi_jet,jet|, do each of the jets in the comparison need to pass those cuts as well? If so, then this cut only applies in events where there are two or more jets with pT>110 GeV. That doesn't seem right.

See above; jets are considered if pT > 30 GeV and |eta| < 4.5. So the |Delta phi_jet,jet| cut applies only if there are two or more jets, the minimal case being one ~110 GeV jet and a second ~30 GeV jet. A clarifying sentence has been added.
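A sketch of how the max |Delta phi_jet,jet| variable described here could be computed, over jets already filtered to pT > 30 GeV and |eta| < 4.5 (function name is ours, not from the analysis code):

```python
from itertools import combinations
from math import pi

def max_delta_phi(jet_phis):
    """Largest |Delta phi| over all pairs of considered jets, folded to [0, pi].

    Returns None when fewer than two jets pass the preselection,
    in which case the cut does not apply."""
    if len(jet_phis) < 2:
        return None

    def dphi(a, b):
        # Fold the azimuthal difference into [0, pi].
        d = abs(a - b) % (2 * pi)
        return 2 * pi - d if d > pi else d

    return max(dphi(a, b) for a, b in combinations(jet_phis, 2))
```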

L332-334: It is claimed that a jet pT cut of >110 GeV removes a lot of background and not signal. The plots in Figure 11 don't seem to back this up. It seems like about the same percentage of signal and background events are removed by the cut. Can you quantify the effect of this cut (background rejection and signal efficiency)?

L338-341: Would be nice to show the signal and background distributions for these two variables so we can evaluate for ourselves the effectiveness of the cut. Also, it would be helpful to report the background rejection and signal efficiency for these cuts.

# To do: provide N-1 plots for these variables, quantifying background rejection and signal efficiency.

Table 18: Are there no standard Tracking POG quality requirements on the tracks? I think they still have "loose" and "highPurity" requirements based on an MVA. Do you require either of these?

These exist but we do not use them; the standard quality flags do not make requirements on the hit pattern for example.

Table 19: You need to define what sigma is in the last line. Is it the beamspot width, is it the primary vertex transverse position uncertainty, is it the uncertainty on the track position at the distance of closest approach to the beamline or primary vertex, or some combination of these?

Both dxy and sigma here refer to the dxy measurement with respect to the origin. This is made more clear.

L365-367: You write that "all" muons and electrons are used. I would like to have a more complete description of this. It may help if you add some text around L260 describing muon and electron reconstruction. For muons, Table 13 defines tight and loose ID. Is it simply the OR of these two that you use? Or do you include tracker muons? I think there is also a soft muon ID. Are these included? What about standalone muons? For electrons, Table 12 defines tight and loose ID. Is it simply the OR of these two categories that you use? Or do you loosen up the requirements further? If so, what are the requirements?

Text describing this around line 365 has been added to explain that "all" means all available in MINIAOD, which has a minimal set of slimming requirements which are now provided in the text.

L368-370 and Table 20: Would be nice to see plots of Delta R to see why 0.15 is chosen. I would have expected a smaller value for muons and a larger value for electrons.

# To do: provide N-1 plots of Delta R to justify the choice of 0.15.

L363-370: Just to be clear, there are no requirements on the pT of the leptons? So, if there is a 4 GeV muon within Delta R of 0.15 of a 100 GeV track, then you reject the track?

As above these are all available in MINIAOD, which has a very minimal set of slimming requirements. For example muons passing the PF ID have no pt requirement whatsoever. If such a muon as you write is near our track, yes we reject it.

L389-91 and Figure 14: It would be good to plot Figure 14 with finer bins (at least between 0 and 10 GeV) to back up this statement.

The statistics are very limited and the intent is to show the separation between 0-10 GeV and above 10 GeV. The text now states "remove almost all background", which Figure 14 supports.

Figure 16: Why is the first bin at ~30% rather than 100%?

Fixed.

AN Table 22: Why don't you apply the full set of vetos for all lepton flavors? Is it an attempt to increases statistics? Can you perform the test with all vetos applied to see if the results are consistent?

Recall these are the definitions of the vetoes used to study P(veto) for each flavor; all three sets of requirements are applied in the signal region. Each flavor's P(veto) is measured by removing only that flavor's veto, while the other flavors' vetoes remain applied. This is tighter than removing all the cuts listed, and increases the purity of each flavor in each study. For example, when studying muons one would not remove the cut on ECalo in the denominator of P(veto).

L456-460: Your assumption here is that the same-sign sample has the same production rate as the background. Have you verified this? You could verify it with a high statistics sample of dilepton events (or come up with a scaling factor if it is not exactly true). Also, in L457-458 you list three sources of background: DY, non-DY, fake tracks. I don't see how a same-sign sample can be used to estimate the DY background? I would suggest calling DY part of the signal. For di-electron and di-muon events, you also have the possibility of using the sidebands around the Z mass to estimate the background. You could check that this gives consistent results.

Recall that this method was suggested by you in the review of EXO-16-044. The intent of the same-sign subtraction is to increase the purity of T&P events actually containing the lepton flavor under study. To the suggestion that the continuum DY is more correctly signal and isn't equal in rate to the opposite-sign sample, we agree and have reworded the AN's description here. We have also included plots of the invariant mass distributions of the T&P samples. Lastly it is not our intent to estimate the non-Z backgrounds here, but to increase the purity of the lepton flavor under study.
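The same-sign subtraction described here reduces to a simple counting formula. A sketch (the function and argument names are ours; the four counts are presumably those entering Equation 6 of the AN, with their roles assumed as below):

```python
def p_veto_os_minus_ss(n_os_probe, n_os_pass, n_ss_probe, n_ss_pass):
    """P(veto) from tag-and-probe counts with same-sign subtraction.

    The same-sign (SS) yields approximate the non-resonant contamination
    of the opposite-sign (OS) sample, and are subtracted from both the
    numerator (probes passing the veto) and the denominator (all probes)."""
    den = n_os_probe - n_ss_probe
    if den <= 0:
        raise ValueError("no probes left after same-sign subtraction")
    return (n_os_pass - n_ss_pass) / den
```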

L468-9: It would be helpful to provide a table giving the values for the 4 numbers in Equation 6 for each of the 9 cases (3 leptons * 3 nlayer bins). I would like to get an idea of the signal-to-background ratio. I may also want to calculate the effect of subtracting the background versus ignoring it.

# To do: remind him that this table was requested in a previous review; consider also adding the invariant mass plots.

L472-5: Are these recHits from the tracker or the calorimeter? I don't really understand what you are describing here. Are the electron seeds from the ECAL or pixel detector?

L482: It looks like the probe tracks already have a pT cut > 30 GeV. So going down to 20 GeV is just being extra safe. Is that right?

L478-487: It is not clear to me. Is this re-reconstruction needed for the signal region or not? Was it done for the signal region?

# To do: the signal selection requires track pT > 55 GeV, so this re-reconstruction is not needed for the signal region and was not done there.

Figure 19 and Tables 29-31. If I try to integrate the plots in Figure 19, I would estimate that the integral of red/blue is roughly 10^-6, 10^-7, and 10^-4, for electrons, muons, and taus, respectively. I would expect this to be approximately equal to P_veto. But in Tables 29-31, I find P_veto numbers of 10^-5, 10^-6, and 10^-3 for electrons, muons, and taus for nlayers>=6. So roughly a factor of 10 off. Can you explain this?

# To do: remind him that we count all possible tag-and-probe combinations, as he previously suggested, which should explain the apparent factor of ~10.

Figures 19, 21, 25, and 40 and page 72: I would suggest removing footnote #17 on page 72 and adding that information into the captions for figures 19, 21, 25, and 40. You should also add a similar explanation to the caption of Table 31 indicating that N_ctrl is scaled to the signal region luminosity.

Figure 22: Would be good to show the results for nlayers=4 and nlayers=5 (unless there are no entries in which case you should note that in the caption), similar to the way you show the results for tau for nlayers=5 even though it is not used.

L514-518 and Tables 29-31: I note that P_offline for electrons and muons is very similar, around 80%, while for taus it is much lower, around 20%. Do you understand the difference and do you think it is OK for the method? I can imagine two effects that could cause this. First, it could be that since the pion from the tau decay does not carry all of the tau momentum, the tau candidates from W decays will have a lower pT than the muon or electron candidates from W decays. So when the tau pT gets added to pTmiss, it will get shifted less than when the electron pT gets added and so more will fail the Ecalo cut. Based on Figure 25, this seems to be true and is probably innocuous.

But I think there is more to it. Comparing Figures 20 and 21, the modified pTmiss for the tau case has a large contribution at the bottom left corner that is not present in the electron (or muon) case. It seems that the electrons and muons are very consistent with the topology of a W recoiling from an ISR jet so delta phi is ~pi. However, for the tau case, there seem to be many events where the "tau" is part of the leading jet. I guess that since there is a Delta R cut of 0.5, the "tau" must differ from the jet a bit in eta. Given this evidence and the fact that we know that the tau purity is much worse than electron and muon purity, it seems likely that many of the events in Figure 21 do not contain taus. I would guess the events are multijet QCD events with either an isolated track by chance or a fake track.

So, my hypothesis is that the single tau control region has a large contamination of non-tau events. You use the same sample for measuring P_offline and then multiplying by P_offline as part of estimating the tau background. So we could consider this estimate as being the tau + single hadronic track + fake track contribution. But there are two problems with that. First, P_veto is measured on a much purer sample of taus as it uses Z decays and subtracts the same-sign contribution. So P_veto is really measuring taus, not the sum of tau + single hadronic track + fake track. And P_veto may not be the same if the other contributions were included. Second, you have a separate measurement of the fake track contribution so you would be double counting. Please let me know what you think.

Section 6.1.3: I think you need to be a little more clear here. I want to confirm that I understand. First, the figures mention HLT efficiency but I think this is really the full trigger efficiency (L1+HLT). Do you agree? Second, the trigger efficiencies shown in Figure 23 and 24 are the actual results from the L1+HLT that was run and the x-axis refers to the actual pTmiss,nomu of the event. That is, the x-axis is not the modified pTmiss,nomu where the electron pT is added back in. Is that correct? Then, the x-axis of Figure 25 shows the modified pTmiss,nomu with the electron or tau pT added back in. Is that correct? Try to make the text and figures a bit clearer.

Figure 25 and Tables 29 and 31: Figure 25 seems to show that the electron distribution is shifted higher than the tau distribution. Therefore, once you convolute this distribution with the trigger efficiency, I would expect the electron trigger efficiency to be higher than the tau trigger efficiency. However, the opposite is true. If I naively take the trigger efficiency as a step function which is 0% for pTmiss<200 GeV and 100% for pTmiss>200 GeV, I think I get about 30% for electrons and 13% for the tau, compared to 46% and 52%. Can you check the results and if correct, try to explain what I am missing?

Table 31: How did you determine the uncertainties for nlayers=4? The upper uncertainty of 0 for N^l_ctrl and the upper uncertainty of 0.0058 for estimate seem too small. Actually nlayers = 4 and nlayers = 5 for both muons and taus (Tables 30 and 31) have yields that are too small to assume Gaussian uncertainties. You should use Poisson uncertainties. You can ask the statistics committee for better advice but I think more correct uncertainties would be:

n = 0: +1.15 -0
n = 1: +1.36 -0.62
n = 2: +1.52 -0.86

This comes from the prescription on page 32 of http://pdg.lbl.gov/2019/reviews/rpp2018-rev-statistics.pdf but again, the statistics committee may have another prescription. Note that I think this results in an estimate for the tau background for nlayers=4 to be 0 +1.9 -0.0 rather than 0 +0.0058 -0.

Section 6.1.5: There needs to be more information. Do you calculate P_veto, P_offline, and P_trigger for each of the leptons using simulated samples following the same recipe as for data? If so, what simulated samples? Do you just use Z->ll for P_veto and W->lnu for the others? Does the single lepton control region come from just W->lnu events? Or do you include all the background samples in Tables 7-8? I think using all of the samples from Tables 7-8 for every calculation would make the most sense.

Section 6.1.5: It seems like even with the relaxed selection criteria, you are still quite lacking in MC statistics for this test. You mention that you only include ttbar events. Is this just the ttbar semileptonic sample? Although this may be the largest single sample, it seems like you could also include other samples. Most importantly would be the W->lnu and Z-> invisible (including the HT-binned samples) as these seem to be the largest source of background in Figures 14 and 15. Is there some reason you didn't include these? If not, I suggest you go ahead and do this.

Figure 26: Are all three results properly normalized to 41.5 fb^-1? If so, it seems like we should be including 3 layer tracks in the signal region because I can exclude a ct=10cm with this plot alone (observe 350 with a prediction of 125), which you can't do with the whole analysis.

Figure 27: Would be good to add the nlayers=4 result as is done in Figure 28. Should also say what simulated samples are included here.

Section 6.2: This needs to be cleaned up and explained better. Here are some specific comments/suggestions:
- I think that L566-569, Figures 26-28, and L602-615 can all be removed. It seems like they have nothing to do with the analysis that is done. They just lead to confusion. If you want to move this material to Appendix C, that is fine. But don't clutter up this section.
- I'm confused by the transfer factor. I assume that the fit in Figure 29 is actually a Gaussian + flat line. Is that correct? What is your hypothesis about what is contained in the Gaussian area and what is contained in the flat line area? I would have assumed that the Gaussian contribution indicates real tracks (since they peak at d0=0) and the flat line contribution indicates fake tracks. But this doesn't seem to match your hypothesis. Can you say exactly what the fit in Eq. 14 is doing? In L629-630 you quote a single transfer factor for each Z mode. Shouldn't there be a different transfer factor for each of the 9 sideband regions?
- What best describes your assumption of the fake track rate as a function of d0? Is it uniform (flat), Gaussian, Gaussian+flat, or something else?
- I don't see the advantage of having 9 different sideband regions. Simply take the sum of events from 0.05-0.5 and multiply by the overall transfer factor. This should minimize the statistical uncertainty. In fact, I would suggest combining the Z->mumu and Z->ee samples as well. Also, remember to use the correct Poisson uncertainties (as discussed for Table 31) when you only have a handful of events. If you somehow think it is a good idea to have 18 different measurements instead of 1 and you are using a transfer factor with an uncertainty, make sure to properly account for the fact that this uncertainty is correlated for different bins.
- L635-638: As mentioned above, I would suggest combining the Z->mumu and Z->ee results to get the final estimate, seeing as you are statistics limited. You can still use the difference between the two as a systematic uncertainty (but see below).
- L640-645: It is obvious that Z->mumu and Z->ee events are quite similar. They have the same production mechanism, they are selected by single lepton triggers, etc. So, it is not much of a test to show that they give the same result. On the other hand, your signal region requires large missing ET, a high pT jet, and a high pT isolated track that is neither a muon nor an electron. One might worry that the fake track rate depends on the amount of hadronic activity in an event, which is likely higher in the signal region than in Z events. One might also worry that the fake track rate depends on pileup, and the signal trigger/selection may be more susceptible to pileup than the single lepton trigger/selection. Ideally, I would suggest that you perform the same measurement on a QCD dominated region (like requiring a dijet or quadjet trigger or just high HT). You can require pTmiss,no mu < 100 GeV to ensure no signal contamination. If this is not possible, then you could consider taking what you have and either reweighting the pileup and HT distribution to match the signal region or checking that the fake rate is independent of these quantities.
- L649-653: I don't understand how these numbers are consistent with Figure 29. In Figure 29 (left) it seems there are about 9 events with |dxy|<0.02cm and about 15 with 0.05<|dxy|<0.10cm to be compared with 32 events and 68 events. There is a similar discrepancy for electrons. I guess the plots have been scaled for some reason as the entries are not integers. Please fix the plots and verify the results are consistent.
- Figure 29: Why do you not fit the region |dxy|<0.1cm? If you fit out to |dxy|=1.0cm, please show the entire range in the plots. It would be nice to see the results for nlayers=5 and 6 as well so we can evaluate the extent to which a fit may or may not be possible and whether the shape is consistent with nlayers=4.
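One hypothesis the comment above asks about, a fake-track rate uniform in d0, makes the suggested single transfer factor a simple ratio of window widths. A sketch under that assumption, using the boundaries quoted in the comment (|dxy| < 0.02 cm signal window, 0.05-0.5 cm sideband; both windows two-sided):

```python
def flat_transfer_factor(sr_halfwidth=0.02, sb_lo=0.05, sb_hi=0.50):
    """Sideband-to-signal-region transfer factor under a fake-track
    d0 distribution that is uniform (flat): the ratio of the total
    |dxy| window widths, counting both the +dxy and -dxy sides."""
    return (2 * sr_halfwidth) / (2 * (sb_hi - sb_lo))

def fake_estimate(n_sideband, tf):
    """Scale the summed 0.05-0.5 cm sideband count into the signal region."""
    return n_sideband * tf
```

Under a Gaussian+flat hypothesis instead, only the flat component's normalization would enter the transfer, which is presumably what the fit in Eq. 14 is meant to extract.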

Section 6.2.2: In Table 36, it would be enlightening to show the same results as in Table 35. That is, I am curious as to how P_fake compares between data and MC. Are these results normalized to 41 fb^-1? If so, then it seems like the MC predicts about 1/5 as many fake tracks as data. It is hard to be confident that the MC tells us anything if that is so.

Section 6.2.2: Your hypothesis is that the fake track rate is independent of selection so you can use the Z data to estimate the fake track rate in your signal region. I have suggested that you could also measure the fake track rate in QCD events to verify this. You can also check the effect in MC. I guess in Section 6.2.2 you apply the same criteria to MC as you do for data (selecting Z events). However, if your hypothesis is true, then you should also get the same fake rate if you use any MC sample. What happens if you use all the samples in Section 3.3 but remove the Table 33 and 34 requirements so you are using all events? If P_fake changes significantly, this is cause for concern. If not, then that is good. In either case, it still may not prove anything if the MC is really predicting 1/5 the amount of fake tracks.

Figure 31: Would be good to have a plot for nlayers=5 as well.

Figure 35: Please include the ratio of the two since this provides the scale factors that are used. It may be better to simply include the region of 50-300 GeV on a linear scale.

L752-754: While this signal yield reduction is interesting, just as interesting would be the change after all cuts are applied (with nlayers>=4). Can you provide this as well?

L760-762: How sure are we that the Z can be used to measure ISR difference for the signal model? I generally agree with the statement that both recoil off ISR but it would be nice if this could be confirmed somehow. Does the pT distribution for a 100 GeV chargino look similar to a Z in Pythia8? Does the ISR reweighting work for ttbar events or diboson events?

L772-773: Please expand on "is applied to the simulated signal samples". Do you reweight the events using the ISR jet in the event or the net momentum of the produced SUSY particles or something else.

Section 8.2.2: Please expand on this. I am very surprised that this is such a small effect. Given the problems encountered with the Phase 1 pixel detector (problems with timing levels 1 and 3, way more noise than expected in layer 1 causing high thresholds, DC-DC converter failures, etc.) I would have expected big differences between data and simulation on quantities requiring hits in the pixel detector. I know the tracking reconstruction was changed at HLT and offline to keep track reconstruction efficiency high but this doesn't remove the problem of missing pixel hits. So please expand on how you measure these uncertainties. Do you just use tracks with pT>55 GeV that are associated with a muon? One problem with using muons to evaluate tracking efficiency is that there are special track reconstruction techniques developed to recover muons missed by the standard tracking. These tend to use wider windows to discover silicon hits and so may not reflect the track reconstruction of "standard" charged particles. You could perhaps remove the electron and tau vetos to see what you get in those cases.

Section 8.2.5: This seems like an underestimate of the systematic uncertainty. If you had infinite data and MC statistics, your systematic uncertainty would be 0. As mentioned above, this doesn't address whether measuring the ISR using Z->mumu decays translates exactly into the ISR for the signal process. The paper mentions this is up to a 350% correction, so it is a big effect. I am very worried that the systematic uncertainty does not cover all that we don't know. I note that Figure 37 shows results with pT and pTmiss. Why did you use pT? Perhaps pTmiss could also be used as a systematic check.

Figure 43: I would suggest including the same comparison vs pT from the document you reference. This shows that for pT>55 GeV, the differences are similar as for lower pT.

Section 8.2.11: Please explain this better. My understanding is that various prescales were in place. In the original measurement of the trigger efficiency in Section 7.3, you rely on being above the track requirement plateau to measure the trigger efficiency versus pTmiss in data (which has all of the various prescales naturally included) and compare to MC (which just has an OR of all trigger paths, I think). This is the main reason why the data efficiency is lower than the MC efficiency. Is this correct? Now, in this section, you are measuring the trigger efficiency solely with MC, which is an OR of all trigger paths. So if the trigger path with the track requirement fails, the MC might still find that an MHT only trigger will fire, while in data the MHT-only trigger may be prescaled. Isn't this a problem? Can you perhaps repeat the exercise using only triggers in MC that were never prescaled (as the opposite extreme to assuming there was never any prescale)? Also, why would you average over the chargino lifetimes? Shouldn't this systematic uncertainty depend very strongly on chargino lifetime?
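The prescale concern raised here could be probed with a toy emulation along these lines (purely illustrative; `paths` would be per-path decision functions, not the real trigger menu, and the prescale model is an assumption):

```python
import random

def or_efficiency(events, paths, prescales, seed=42):
    """Fraction of events firing the OR of trigger paths, where each
    fired path is kept only 1/prescale of the time (prescale emulation).

    events:    iterable of event records
    paths:     list of functions event -> bool (did the path fire?)
    prescales: list of prescale factors, one per path (1 = unprescaled)
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    passed = sum(
        1 for ev in events
        if any(fires(ev) and rng.random() < 1.0 / pre
               for fires, pre in zip(paths, prescales))
    )
    return passed / len(events)
```

Running this once with all prescales set to 1 (the MC-like OR of all paths) and once with only the never-prescaled paths kept would bracket the two extremes the comment describes.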

Section 8.2.11: Per my discussion of pixel issues in 2017. It is relatively easy to get 5 pixel hits with only 4 pixel layers as in order to make a hermetic cylindrical detector with flat sensors, you need to have overlaps. These overlaps are largest in the first layer, which is where there were significant issues with the Phase 1 pixel detector. So I am concerned that if the MC is optimistic about layer 1 hits, then relying on the MC may not be wise. Maybe you can check the following. Take good tracks (not muons but large number of hits with pT>50 GeV). Check the fraction of tracks that have two layer 1 pixel hits compared to one layer 1 pixel hits between MC and data. Or, more generally, the average number of pixel hits. If they differ, then you could see how many times you would go from 5 pixel hits to 4 pixel hits in data vs MC and use this difference as another estimate of the difference in trigger efficiency.

Here are also some brief comments on the paper:

Whenever you have a range, it should be written in regular (not math) mode with a double hyphen in LaTeX and no spaces. That is, "1--2". Done correctly in L44. Incorrect in L186, L285, L321, L326, L327, L331.

In Section 2, I think it would be good to give more information about the tracker, especially the Phase 1 pixel detector. It is pretty important to know that we expect particles to pass through 4 pixel layers.

Should mention the difference between number of hits and number of layers with a hit.

L60-67: At the end you talk about physics quantities like tan beta, mu, and the chargino-neutralino mass difference. In principle, I believe the lifetime is set by the masses (mainly mass difference) of the chargino and neutralino. I think you need to be clear that the lifetimes are changed arbitrarily and also give the mass difference (could just say 0.2 GeV).

L113: pTmiss and pTmiss no mu should be vectors

L128: Should say why |eta|<2.1 is used.

L131: need to specify the eta and pT requirements on the jets, perhaps in L103-107.

L157: Should describe hadronic tau reconstruction. Could be at the end of L91-102 where electrons, muons, and charged hadron reconstruction is described.

L168,L177: Given that your special procedure removes 4% of the signal tracks, it is natural to wonder what fraction of the signal tracks are removed by the requirements of L159-168.

L179: Commas after "Firstly" and "Secondly"

L190: Should make it clear that leptons here refers to electrons, muons, and taus.

L194, 196, 222, 231: The ordering of P_offline and P_trigger in L194,196 is different than in L222,231. Better to be consistent.

L204: I think you mean "excepting" rather than "expecting"

L214: I don't think you need the subscript "invmass" given that you define it that way in L213.

L222: Change "condition" to "conditional"

L227: p_T^l should be a vector

L234-238: This will need to be expanded to make it clear

L247: I don't think it is useful to mention a closure test with 2% of the data. I mean a 2% test may reveal something that is horribly wrong but it is not going to convince anyone that you know what you are doing.

L339 and Table 3 caption: Suggest changing "signal yields" to "signal efficiencies"

Table 4: I guess to match the text it should be "spurious tracks" instead of "fake tracks"

Lots of the references need minor fixing. The main problems are:
- volume letter needs to go with title and not volume number: refs 2, 8, 26, 27, 30, 39, 40, 41
- only the first page number should be given: refs 2, 19, 30
- no issue number should be given: refs 8, 13, 31
- PDG should use the Bibtex entry given here: https://twiki.cern.ch/twiki/bin/view/CMS/Internal/PubGuidelines
- ref 40 needs help

<!--/twistyPlugin-->

## Questions from Juan Alcaraz (July 5)

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 113 to 315
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn-on in the selection? Anyway, according to Fig. 36 in the AN, quite some part of the signal is in the turn-on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn-on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn-on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.
Changed:
<
<
We do not cutaway the turn on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7 that describes an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region for those, which is roughly a 10% effect. However, since only about 10% of the signal is on the turn-on so even a 10% uncertainty this results in yield uncertainties of 1.1% and 0.5% systematic for ==4 and ==5 layers respectively, which is combined with the statistical uncertainties to reach a total trigger systematic uncertainty, which is presented in the updated AN in Table .
>
>
We do not cut away the turn on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7, which describe an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region for those, which is roughly a 10% effect. However, since only about 10% of the signal is on the turn-on, this uncertainty results in yield uncertainties of 1.1% and 0.5% for ==4 and ==5 layers respectively. This is combined with the statistical uncertainties to reach a total trigger systematic uncertainty, which is presented in the updated AN in Table .
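The dilution of the turn-on uncertainty into the total yield uncertainty quoted above is simple arithmetic; a minimal sketch, using the approximate fractions from the text (illustrative numbers, not analysis outputs):

```python
# Illustrative check: a ~10% efficiency uncertainty confined to the ~10% of
# selected signal sitting on the trigger turn-on dilutes to a ~1% uncertainty
# on the total signal yield. Numbers are the approximate values quoted in the
# text, not analysis outputs.

frac_on_turnon = 0.10      # fraction of selected signal with MET on the turn-on
turnon_uncertainty = 0.10  # relative efficiency uncertainty in the turn-on region

# Plateau events carry (approximately) no turn-on uncertainty, so the yield
# uncertainty is the turn-on uncertainty weighted by the affected fraction.
yield_uncertainty = frac_on_turnon * turnon_uncertainty

print(f"relative yield uncertainty: {yield_uncertainty:.1%}")  # prints "relative yield uncertainty: 1.0%"
```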

#### Revision 322019-07-15 - ChristopherHill

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 113 to 113
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn on in the selection? Anyway, according to Fig. 36 in the AN, quite some part of the signal is in the turn on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.
Changed:
<
<
Since MET from ISR is only needed for the trigger strategy, as a search for the disappearing tracks signature we wish to keep as much acceptance as possible. The small uncertainties you mention are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. We recently added Section 8.2.11 and Figure 44 to the AN in version 7, which introduce a signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region. Only about 10% of the signal is on the turn-on, so even a 10% uncertainty in the turn-on region results in only a ~1% yield systematic; this new AN section gives 1.1% and 0.5% systematics for ==4 and ==5 layers respectively. In the next version of the AN we will combine all of the trigger signal systematics into one section to make it easier to read.
>
>
We do not cut away the turn on in order to have as much signal acceptance as possible. The small uncertainties you refer to are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. This, however, is not the total uncertainty on the trigger efficiency, and we apologize if the AN gave that impression. We have added Section 8.2.11 and Figure 44 to the AN in version 7, which describe an additional component of the signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region, roughly a 10% effect. However, since only about 10% of the signal is on the turn-on, even a 10% uncertainty there results in yield uncertainties of only 1.1% and 0.5% for ==4 and ==5 layers respectively, which are combined with the statistical uncertainties to reach a total trigger systematic uncertainty, presented in the updated AN in Table .

#### Revision 312019-07-12 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 91 to 91
If one of the trigger paths has a tighter cut (5 hits) than the offline cut, why didn't you redefine the offline cuts and require >=5 hits when ONLY that trigger path is fired? I do not see any right to assume that we can count on an extra efficiency that does not really exist, even if it is small. Am I missing anything?
Changed:
<
<
This highlights the difference between the number of hits (nValidHits) and the number of layers (trackerLayersWithMeasurement). For nLayers >=6 and ==5, our requirements on missing hits do mean we are requiring >=6 and >=5 hits respectively. It is possible, although rare, for a track to have multiple hits associated to it in the same pixel layer, so in the case of nLayers ==4 there is a small efficiency to have >=5 hits. For 700 GeV charginos with ctau = 10cm, for example, about 9% of the selected tracks have more than 4 hits. So to your question of requiring >=5 hits: this would heavily reduce our efficiency in the ==4 layer category.

The more subtle issue here is the difference between online and offline track reconstruction. The offline track fitting does a much better job of associating hits to tracks than the online algorithm, something that can be seen clearly in the track leg efficiency (see Figure 53 in the AN). So even in the case of >=6 layers, the trigger requirement is fairly inefficient. This is why we take such a large OR with MET paths: at high MET we can improve on the track leg, and at lower MET the track leg is still an improvement over not using HLT_MET105_IsoTrk50 at all. See the plot below to that effect.

>
>
There is a small but non-zero probability that a track has multiple hits associated to it in the same pixel layer (9% in one signal sample, for example). This allows tracks with only 4 layers to have >=5 hits and fire the IsoTrk50 leg. But to your question: this is a small probability on top of the already small probability of having MET on the turn-on, where it would make a difference. Adding that cut is statistically consistent with not adding it and relying on only the inclusive MET paths, as is shown in the left plot below.
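The nValidHits vs. trackerLayersWithMeasurement distinction can be illustrated with a toy hit list; the layer labels below are purely illustrative stand-ins, not CMSSW quantities:

```python
# Toy illustration of hits vs. layers: a track can pick up two hits in the
# same pixel layer (e.g. in module overlaps), so a 4-layer track can still
# have >= 5 valid hits and satisfy the online IsoTrk50 hit requirement.
# The layer labels are purely illustrative.

hits = ["BPIX1", "BPIX2", "BPIX2", "BPIX3", "BPIX4"]  # two hits in BPIX2

n_valid_hits = len(hits)   # analogous to nValidHits
n_layers = len(set(hits))  # analogous to trackerLayersWithMeasurement

print(n_layers, n_valid_hits)  # prints "4 5"
```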

Deleted:
<
<
Our most recent addition (Section 8.2.11 and Figure 44 in the AN), which is quite a conservative approach to this, does present the efficiency in signal for the shorter track categories. Perhaps it would be prudent to show a comparison of Figure 44 to the efficiency in signal without HLT_MET105_IsoTrk50, but even as it is, these different turn-ons illustrate the effect of the track leg efficiency, which becomes less efficient (yet non-zero) with shorter tracks.
On a tangential matter: when can we expect to have any kind of 2018 results? Despite the suggestion from the EXO conveners, I am a bit uncomfortable with considering this step a trivial top-up operation in an analysis like this one. We know from experience that each new year can give rise to new features and significantly change the rate of pathological background events that we have to consider...

The above section will for now provide immediate updates on 2018 results. We will also deliver them in an updated analysis note when available. Currently, all background estimates are available in 2018 C. The nTuples are now complete in 2018 ABC and we are processing them quickly; we expect background estimates in ABC this week. The nTuples for 2018 D will progress quickly because we still have global CRAB priority.

Line: 117 to 113
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn on in the selection? Anyway, according to Fig. 36 in the AN, quite some part of the signal is in the turn on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.
Changed:
<
<
Our requirement of MET from ISR is an unfortunate reality for this analysis; it really does not have anything to do with the disappearing track signature. We try to keep our MET requirements low to have the most acceptance. But for the scale of the signal systematic from trigger efficiencies, recall that these are calculated by fluctuating the data and WJets MC efficiencies up/down by their statistical uncertainties, which are small. For example:

|  | MET value | Raw efficiency | Efficiency | Relative uncertainty |
| Data | 120 GeV | 38 / 108330 | 0.00035078002 + 6.6620299e-05 - 5.6645505e-05 | 17.6% |
| Data | 300 GeV | 3623 / 5084 | 0.71262785 + 0.0064020737 - 0.0064870057 | 0.9% |
| WJets | 120 GeV | 293.54503 / 379928.84 | 0.00077382879 + 4.7794247e-05 - 4.5089480e-05 | 6.0% |
| WJets | 300 GeV | 19228.983 / 19426.379 | 0.98985895 + 0.00071873992 - 0.00077098031 | 0.075% |

These uncertainties are larger in the turn-on, but they are convolved with the actual MET spectrum of our signal. The uncertainties are small due to the large samples we use to measure the efficiencies, and only a small portion of our selected signal receives the larger turn-on uncertainties.

>
>
Since MET from ISR is only needed for the trigger strategy, as a search for the disappearing tracks signature we wish to keep as much acceptance as possible. The small uncertainties you mention are the statistical uncertainties in the data and SM background MC efficiency measurements, and are small due to those samples being very large. We recently added Section 8.2.11 and Figure 44 to the AN in version 7, which introduce a signal systematic for the shorter (==4, ==5 layers) track categories due to the turn-on region. Only about 10% of the signal is on the turn-on, so even a 10% uncertainty in the turn-on region results in only a ~1% yield systematic; this new AN section gives 1.1% and 0.5% systematics for ==4 and ==5 layers respectively. In the next version of the AN we will combine all of the trigger signal systematics into one section to make it easier to read.

Changed:
<
<
Also, as requested, we measured the trigger efficiency in data using electrons instead of muons. The turn-on is slightly slower for electrons. The analyzers still feel using muons to study this is more appropriate, as the chargino signature is more muon-like, and electrons introduce hit pattern effects due to conversions and bremsstrahlung. Even ignoring that, if one were to take the ratio of these efficiencies and apply it as a weight to derive a signal systematic, convolving with the MET (no mu), one gets a 2.7-3.2% downward systematic across the NLayers categories, using AMSB 700GeV_100cm as an example.
>
>
Also, as requested, we measured the trigger efficiency in data using electrons instead of muons; see the plots below. One can take the ratio of these efficiencies and apply it as a weight (as a function of MET) to derive another systematic on the signal yields. This would give a 2.7-3.2% downward systematic across the NLayers categories, using 700 GeV, 100cm charginos as an example. The analyzers feel, however, that this is not appropriate to use, because the chargino signature is muon-like in the tracker, while electrons introduce hit pattern effects due to conversions and bremsstrahlung that would not affect the signal.
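The ratio-reweighting procedure mentioned here can be sketched as follows; the efficiency curves, binning, and event list are hypothetical stand-ins, not the measured turn-ons:

```python
import bisect

# Sketch of the efficiency-ratio reweighting: weight each selected signal
# event by eff_electron(MET) / eff_muon(MET) and compare the weighted yield
# to the nominal one. All curves and events below are hypothetical.

met_bin_edges = [0, 150, 200, 250, 300, 400]   # GeV; last bin is the overflow
eff_mu = [0.05, 0.30, 0.70, 0.90, 0.98, 1.00]  # per-bin muon-measured efficiency
eff_el = [0.04, 0.27, 0.66, 0.88, 0.97, 1.00]  # slightly slower electron turn-on

def ratio_weight(met):
    # find the bin containing this MET value (clamped to the overflow bin)
    i = min(bisect.bisect_right(met_bin_edges, met) - 1, len(eff_mu) - 1)
    return eff_el[i] / eff_mu[i]

signal_met = [180.0, 260.0, 320.0, 450.0, 500.0]  # hypothetical signal events
nominal = len(signal_met)
weighted = sum(ratio_weight(m) for m in signal_met)

print(f"downward yield shift: {1 - weighted / nominal:.1%}")
```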

#### Revision 302019-07-10 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 111 to 111
In your answer to (4) the plots show only the correction related to Fig. 37 in the AN. First, it is a bit surprising that there is a residual MC overprediction at high pT (right plot on the twiki) and the effect on recoil (left and middle figure) is marginal. Second, and more importantly, in the Pythia/MG part of the correction which is in Fig. 38 of the AN, it seems that Pythia does not generate enough high recoil events, so the resulting weight on high recoil signal (>~250 GeV) seems completely saturated. Do you have convincing arguments that this is not an issue?
Changed:
<
<
Again, if we applied this weight to the background MC, evaluating it as a function of reconstructed di-muon pT, the reweighted plots would agree perfectly by construction. But that is not the reweighting we are applying. We evaluate the weights as a function of the GEN-level electroweak-ino pair pT in our AMSB signal. So the closest meaningful comparison is to use the GEN-level di-muon pT in background MC in which such a pair exists: Drell-Yan. In the plots below, we therefore reweight only that one sample, because it did not make sense to reweight the others. We also feel that the weights being very similar between diMuonPt and metNoMu is due to the two being well correlated -- if they were not, the data/MC disagreement would differ substantially between them. The original intent of the question was whether diMuonPt serves as a good proxy (with much better resolution) for metNoMu, the real recoil, and the plots demonstrate that it does.
>
>
The residual MC overprediction is an artifact of the fact that we reweight the background MC as a function of the GEN-level electroweak-ino pair pT in our AMSB signal, not as a function of the reconstructed di-muon pT that is plotted.

Changed:
<
<
Secondly, for Pythia/MG, this is precisely right: Pythia does not generate enough high recoil events. This is a well-known feature of Pythia, one of the reasons why MadGraph was developed, and why this correction is necessary to correctly describe the AMSB hypothesis. As for "saturation": if you refer to the tail towards 1000 GeV, where eventually there are only 0-1 Pythia events -- yes, this would "saturate" the weights. However, we do not observe AMSB events with that large a recoil. Typically our signal samples have a recoil of about 150-300 GeV, with the tail dropping very low around 500 GeV. Across all masses and lifetimes we observe no AMSB events that fall into the overflow bin of the ISR weights, where a proper weight might not exist. Even if one had an AMSB event with ~1 TeV recoil, the weight would indeed be large, but with a large systematic uncertainty due to the low statistics in the ISR weight for that event.

>
>
For Pythia/MG, you are correct that Pythia does not generate enough high recoil events. This is a well-known feature of Pythia and one of the main reasons why MadGraph was developed, and why such a correction is necessary to correctly describe the AMSB hypothesis. Yes, it does result in weights of 3--4 for events >~ 250 GeV, but this is a necessary correction, so we don't see it as an issue.
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn on in the selection? Anyway, according to Fig. 36 in the AN, quite some part of the signal is in the turn on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.
Line: 670 to 668

 META FILEATTACHMENT attachment="metAndTriggerEff_700_100.png" attr="" comment="" date="1561388708" name="metAndTriggerEff_700_100.png" path="metAndTriggerEff_700_100.png" size="178148" user="bfrancis" version="1" attachment="compareDatasets_GrandOr_METPath_MuonElectron.png" attr="" comment="" date="1562621623" name="compareDatasets_GrandOr_METPath_MuonElectron.png" path="compareDatasets_GrandOr_METPath_MuonElectron.png" size="119630" user="bfrancis" version="1" attachment="ratio_of_efficiencies.png" attr="" comment="" date="1562621624" name="ratio_of_efficiencies.png" path="ratio_of_efficiencies.png" size="111310" user="bfrancis" version="1"
Deleted:
<
<
 META FILEATTACHMENT attachment="compareISRweights.png" attr="" comment="" date="1562625581" name="compareISRweights.png" path="compareISRweights.png" size="112998" user="bfrancis" version="1"

 META FILEATTACHMENT attachment="compareWithWithout700GeV100cmNLayers4.jpg" attr="" comment="" date="1562699528" name="compareWithWithout700GeV100cmNLayers4.jpg" path="compareWithWithout700GeV100cmNLayers4.jpg" size="107239" user="bfrancis" version="2" attachment="compareWithWithout700GeV100cmNLayers6plus.jpg" attr="" comment="" date="1562699529" name="compareWithWithout700GeV100cmNLayers6plus.jpg" path="compareWithWithout700GeV100cmNLayers6plus.jpg" size="105856" user="bfrancis" version="2" by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 292019-07-10 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 79 to 79

<!--/twistyPlugin-->
Changed:
<
<

>
>

## Questions from Juan Alcaraz (July 5)

<!--/twistyPlugin twikiMakeVisibleInline-->
Changed:
<
<
As requested, we measured the trigger efficiency in data using electrons instead of muons. The turn-on is slightly slower for electrons. The analyzers still feel using muons to study this is more appropriate, as the chargino signature is more muon-like and electrons introduce hit pattern effects due to conversions and bremsstrahlung. Even ignoring that, one could take the ratio of these efficiencies and convolve it with the metNoMu as a weight to derive a signal systematic.
>
>
Regarding the Z (or ewkino pair) recoil correction, are you really performing the following two steps for the signal: 1) reweight from Pythia8 to MG as a function of the recoil pt; 2) reweight the resulting signal MC again according to the data/MC observed recoil spectrum in Z->mumu events? Also, let me ask again (you probably answered this at the pre-approval meeting, but I forgot): was the data/MC discrepancy at low dimuon pt just due to the lack of MC dimuon events at low invariant mass? (This should be irrelevant given the ISR jet cut used in the analysis, but just to understand.)

Changed:
<
<
Even ignoring that, however, if one were to take the ratio of these efficiencies and apply it as a weight to derive a signal systematic, convolving with the MET (no mu), one gets a 2.7-3.2% downward systematic across the NLayers categories, using AMSB 700GeV_100cm as an example.
>
>
This is correct: we apply both weights. In the ARC review of EXO-16-044 (for which Kevin Stenson was chair, as he will recall), it was noted that applying only the data/(MG MC) correction would correct only the MG distribution to that seen in data. As our signal is generated in Pythia, we first need to correct Pythia's distribution to MG's; otherwise that correction is not applicable.
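The two-step correction can be sketched as a product of two per-bin histogram ratios; the bin contents below are placeholders, not the analysis histograms:

```python
# Sketch of the two-step recoil reweighting: the per-bin weight applied to the
# Pythia signal is w(pt) = [MG / Pythia](pt) * [data / MG-MC](pt), so the
# Pythia spectrum is first corrected to MadGraph, and then to data.
# All bin contents below are placeholders, not the analysis histograms.

pythia   = [100.0, 60.0, 20.0, 5.0]      # ewkino-pair pt spectrum, Pythia signal
madgraph = [90.0, 62.0, 28.0, 9.0]       # same spectrum from MadGraph
data     = [950.0, 640.0, 300.0, 95.0]   # Z->mumu recoil in data
mg_mc    = [1000.0, 630.0, 280.0, 90.0]  # Z->mumu recoil in MG MC

weights = [(mg / py) * (d / mc)
           for py, mg, d, mc in zip(pythia, madgraph, data, mg_mc)]

# After both steps the Pythia spectrum matches MadGraph scaled by data/MC:
corrected = [py * w for py, w in zip(pythia, weights)]
print([round(c, 2) for c in corrected])  # prints [85.5, 62.98, 30.0, 9.5]
```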

Changed:
<
<
>
>
The discrepancy at low dimuon pt is driven by the Drell-Yan samples; the samples available required M > 5 GeV in 2017 and M > 10 GeV in 2018. Recall we also require a jet here with pt > 110 GeV, since our signal requires that, and this also restricts the low-pt events we select. But due also to this jet requirement, in signal the electroweak-ino pair has a high sum pt. For example, for 700 GeV charginos with ctau = 100cm, the lowest selected event is at ~100 GeV and the mean pt is ~310 GeV -- where the data/MC ISR weights are closer to unity.

Changed:
<
<
Furthermore, there are continuing questions as to how the trigger efficiency systematics for signal are so low, 0.07-0.35%. Recall that these are calculated by fluctuating the data and WJets MC efficiencies up/down by their statistical uncertainties, which are small. For example:
>
>
If one of the trigger paths has a tighter cut (5 hits) than the offline cut, why didn't you redefine the offline cuts and require >=5 hits when ONLY that trigger path is fired? I do not see any right to assume that we can count on an extra efficiency that does not really exist, even if it is small. Am I missing anything?

Changed:
<
<
|  | MET value | Raw efficiency | Efficiency | Relative uncertainty |
| Data | 120 GeV | 38 / 108330 | 0.00035078002 + 6.6620299e-05 - 5.6645505e-05 | 17.6% |
| Data | 300 GeV | 3623 / 5084 | 0.71262785 + 0.0064020737 - 0.0064870057 | 0.9% |
| WJets | 120 GeV | 293.54503 / 379928.84 | 0.00077382879 + 4.7794247e-05 - 4.5089480e-05 | 6.0% |
| WJets | 300 GeV | 19228.983 / 19426.379 | 0.98985895 + 0.00071873992 - 0.00077098031 | 0.075% |
>
>
This highlights the difference between the number of hits (nValidHits) and the number of layers (trackerLayersWithMeasurement). For nLayers >=6 and ==5, our requirements on missing hits do mean we are requiring >=6 and >=5 hits respectively. It is possible, although rare, for a track to have multiple hits associated to it in the same pixel layer, so in the case of nLayers ==4 there is a small efficiency to have >=5 hits. For 700 GeV charginos with ctau = 10cm, for example, about 9% of the selected tracks have more than 4 hits. So to your question of requiring >=5 hits: this would heavily reduce our efficiency in the ==4 layer category.

Changed:
<
<
These uncertainties are larger in the turn-on, but they are convolved with the actual MET spectrum of our signal. The uncertainties are small due to the large samples we use to measure the efficiencies, and only a small portion of our selected signal receives the larger turn-on uncertainties.
>
>
The more subtle issue here is the difference between online and offline track reconstruction. The offline track fitting does a much better job of associating hits to tracks than the online algorithm, something that can be seen clearly in the track leg efficiency (see Figure 53 in the AN). So even in the case of >=6 layers, the trigger requirement is fairly inefficient. This is why we take such a large OR with MET paths: at high MET we can improve on the track leg, and at lower MET the track leg is still an improvement over not using HLT_MET105_IsoTrk50 at all. See the plot below to that effect.

Changed:
<
<
>
>

Our most recent addition (Section 8.2.11 and Figure 44 in the AN), which is quite a conservative approach to this, does present the efficiency in signal for the shorter track categories. Perhaps it would be prudent to show a comparison of Figure 44 to the efficiency in signal without HLT_MET105_IsoTrk50, but even as it is, these different turn-ons illustrate the effect of the track leg efficiency, which becomes less efficient (yet non-zero) with shorter tracks.

On a tangential matter: when can we expect to have any kind of 2018 results? Despite the suggestion from the EXO conveners, I am a bit uncomfortable with considering this step a trivial top-up operation in an analysis like this one. We know from experience that each new year can give rise to new features and significantly change the rate of pathological background events that we have to consider...

The above section will for now provide immediate updates on 2018 results. We will also deliver them in an updated analysis note when available. Currently, all background estimates are available in 2018 C. The nTuples are now complete in 2018 ABC and we are processing them quickly; we expect background estimates in ABC this week. The nTuples for 2018 D will progress quickly because we still have global CRAB priority.

<!--/twistyPlugin-->
Changed:
<
<

>
>

## Additional pre-approval followup Ivan Mikulec HN June 21

<!--/twistyPlugin twikiMakeVisibleInline-->
Changed:
<
<
Regarding the Z (or ewkino pair) recoil correction, are you really performing the following two steps for the signal: 1) reweight from Pythia8 to MG as a function of the recoil pt; 2) reweight the resulting signal MC again according to the data/MC observed recoil spectrum in Z->mumu events? Also, let me ask again (you probably answered this at the pre-approval meeting, but I forgot): was the data/MC discrepancy at low dimuon pt just due to the lack of MC dimuon events at low invariant mass? (This should be irrelevant given the ISR jet cut used in the analysis, but just to understand.)
>
>
In your answer to (4) the plots show only the correction related to Fig. 37 in the AN. First, it is a bit surprising that there is a residual MC overprediction at high pT (right plot on the twiki) and the effect on recoil (left and middle figure) is marginal. Second, and more importantly, in the Pythia/MG part of the correction which is in Fig. 38 of the AN, it seems that Pythia does not generate enough high recoil events, so the resulting weight on high recoil signal (>~250 GeV) seems completely saturated. Do you have convincing arguments that this is not an issue?

Changed:
<
<
This is correct: we apply both weights. In the ARC review of EXO-16-044 (for which Kevin Stenson was chair, as he will recall), it was noted that applying only the data/(MG MC) correction would correct only the MG distribution to that seen in data. As our signal is generated in Pythia, we first need to correct Pythia's distribution to MG's; otherwise that correction is not applicable.
>
>
Again, if we applied this weight to the background MC, evaluating it as a function of reconstructed di-muon pT, the reweighted plots would agree perfectly by construction. But that is not the reweighting we are applying. We evaluate the weights as a function of the GEN-level electroweak-ino pair pT in our AMSB signal. So the closest meaningful comparison is to use the GEN-level di-muon pT in background MC in which such a pair exists: Drell-Yan. In the plots below, we therefore reweight only that one sample, because it did not make sense to reweight the others. We also feel that the weights being very similar between diMuonPt and metNoMu is due to the two being well correlated -- if they were not, the data/MC disagreement would differ substantially between them. The original intent of the question was whether diMuonPt serves as a good proxy (with much better resolution) for metNoMu, the real recoil, and the plots demonstrate that it does.

Changed:
<
<
The discrepancy at low dimuon pt is driven by the Drell-Yan samples; the samples available required M > 5 GeV in 2017 and M > 10 GeV in 2018. Recall we also require a jet here with pt > 110 GeV, since our signal requires that, and this also restricts the low-pt events we select. But due also to this jet requirement, in signal the electroweak-ino pair has a high sum pt. For example, for 700 GeV charginos with ctau = 100cm, the lowest selected event is at ~100 GeV and the mean pt is ~310 GeV -- where the data/MC ISR weights are closer to unity.
>
>
Secondly, for Pythia/MG, this is precisely right: Pythia does not generate enough high recoil events. This is a well-known feature of Pythia, one of the reasons why MadGraph was developed, and why this correction is necessary to correctly describe the AMSB hypothesis. As for "saturation": if you refer to the tail towards 1000 GeV, where eventually there are only 0-1 Pythia events -- yes, this would "saturate" the weights. However, we do not observe AMSB events with that large a recoil. Typically our signal samples have a recoil of about 150-300 GeV, with the tail dropping very low around 500 GeV. Across all masses and lifetimes we observe no AMSB events that fall into the overflow bin of the ISR weights, where a proper weight might not exist. Even if one had an AMSB event with ~1 TeV recoil, the weight would indeed be large, but with a large systematic uncertainty due to the low statistics in the ISR weight for that event.

Changed:
<
<
If one of the trigger paths has a tighter cut (5 hits) than the offline cut, why didn't you redefine the offline cuts and require >=5 hits when ONLY that trigger path is fired? I do not see any right to assume that we can count on an extra efficiency that does not really exist, even if it is small. Am I missing anything?
>
>
We find the first paragraph of the answer to (5) confusing. If most of the signal events are in the plateau, why not cut away the turn on in the selection? Anyway, according to Fig. 36 in the AN, quite some part of the signal is in the turn on. If this is the case, we find it unbelievable that you can be confident about your efficiency in the middle of the steep turn on to a relative uncertainty of the order of 0.5%. We still think that a check with different datasets might provide some handle on the related systematics (position and slope of the turn on). We hope that the ARC can pay attention to this issue. We are fine with the second paragraph of the answer.

Changed:
<
<
This highlights the difference between the number of hits (nValidHits) and the number of layers (trackerLayersWithMeasurement). For nLayers >=6 and ==5, our requirements on missing hits do mean we are requiring >=6 and >=5 hits respectively. It is possible, although rare, for a track to have multiple hits associated to it in the same pixel layer, so in the case of nLayers ==4 there is a small efficiency to have >=5 hits. For 700 GeV charginos with ctau = 10cm, for example, about 9% of the selected tracks have more than 4 hits. So to your question of requiring >=5 hits: this would heavily reduce our efficiency in the ==4 layer category.
>
>
Our requirement of MET from ISR is an unfortunate reality for this analysis; it really does not have anything to do with the disappearing track signature. We try to keep our MET requirements low to have the most acceptance. But for the scale of the signal systematic from trigger efficiencies, recall that these are calculated by fluctuating the data and WJets MC efficiencies up/down by their statistical uncertainties, which are small. For example:

Changed:
<
<
The more subtle issue here is the difference between online and offline track reconstruction. The offline track fitting does a much better job of associating hits to tracks than the online algorithm, something that can be seen clearly in the track leg efficiency (see Figure 53 in the AN). So even in the case of >=6 layers, the trigger requirement is fairly inefficient. This is why we take such a large OR with MET paths: at high MET we can improve on the track leg, and at lower MET the track leg is still an improvement over not using HLT_MET105_IsoTrk50 at all. See the plot below to that effect.
>
>
|  | MET value | Raw efficiency | Efficiency | Relative uncertainty |
| Data | 120 GeV | 38 / 108330 | 0.00035078002 + 6.6620299e-05 - 5.6645505e-05 | 17.6% |
| Data | 300 GeV | 3623 / 5084 | 0.71262785 + 0.0064020737 - 0.0064870057 | 0.9% |
| WJets | 120 GeV | 293.54503 / 379928.84 | 0.00077382879 + 4.7794247e-05 - 4.5089480e-05 | 6.0% |
| WJets | 300 GeV | 19228.983 / 19426.379 | 0.98985895 + 0.00071873992 - 0.00077098031 | 0.075% |

Changed:
<
<
>
>
These uncertainties are larger in the turn-on, but they are convolved with the actual MET spectrum of our signal. The uncertainties are small due to the large samples we use to measure the efficiencies, and only a small portion of our selected signal receives the larger turn-on uncertainties.
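As a rough cross-check of the scale of these statistical uncertainties, one can compute a plain normal-approximation binomial error sqrt(p(1-p)/n) from the raw pass/total counts; the asymmetric intervals quoted in the table come from the actual efficiency machinery, so they differ slightly, especially in the very low-efficiency 120 GeV bins. For the 300 GeV data point this approximation gives a relative uncertainty of about 0.9%:

```python
import math

# Normal-approximation binomial uncertainty sigma = sqrt(p * (1 - p) / n),
# as a quick scale check against the statistical uncertainties quoted in the
# efficiency table. The analysis's asymmetric intervals differ slightly,
# especially for the very low-efficiency 120 GeV bins.

def binomial_sigma(passed, total):
    p = passed / total
    return p, math.sqrt(p * (1.0 - p) / total)

p, sigma = binomial_sigma(3623, 5084)  # data, MET = 300 GeV
print(f"eff = {p:.5f}, sigma = {sigma:.5f}, relative = {sigma / p:.2%}")
```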

Changed:
<
<
Our most recent addition (Section 8.2.11 and Figure 44 in the AN), which is quite a conservative approach to this, does present the efficiency in signal for the shorter track categories. Perhaps it would be prudent to show a comparison of Figure 44 to the efficiency in signal without HLT_MET105_IsoTrk50, but even as it is, these different turn-ons illustrate the effect of the track leg efficiency, which becomes less efficient (yet non-zero) with shorter tracks.
>
>

Changed:
<
<
On a tangential matter: when can we expect to have any kind of 2018 results? Despite the suggestion from the EXO conveners, I am a bit uncomfortable with considering this step a trivial top-up operation in an analysis like this one. We know from experience that each new year can give rise to new features and significantly change the rate of pathological background events that we have to consider...
>
>
Also, as requested, we measured the trigger efficiency in data using electrons instead of muons. The turn-on is slightly slower for electrons. The analyzers still feel using muons to study this is more appropriate, as the chargino signature is more muon-like, and electrons introduce hit pattern effects due to conversions and bremsstrahlung. Even ignoring that, if one were to take the ratio of these efficiencies and apply it as a weight to derive a signal systematic, convolving with the MET (no mu), one gets a 2.7-3.2% downward systematic across the NLayers categories, using AMSB 700GeV_100cm as an example.

Changed:
<
<
The above section will for now provide immediate updates on the 2018 results; we will also deliver them in an updated analysis note when available. Currently all background estimates are available in 2018C. The nTuples are now complete in 2018 ABC and are being processed quickly; we expect background estimates in ABC this week. The nTuples for 2018D should also progress quickly because we still have global CRAB priority.
>
>

<!--/twistyPlugin-->

#### Revision 282019-07-09 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 667 to 667

 META FILEATTACHMENT attachment="compareDatasets_GrandOr_METPath_MuonElectron.png" attr="" comment="" date="1562621623" name="compareDatasets_GrandOr_METPath_MuonElectron.png" path="compareDatasets_GrandOr_METPath_MuonElectron.png" size="119630" user="bfrancis" version="1" attachment="ratio_of_efficiencies.png" attr="" comment="" date="1562621624" name="ratio_of_efficiencies.png" path="ratio_of_efficiencies.png" size="111310" user="bfrancis" version="1" attachment="compareISRweights.png" attr="" comment="" date="1562625581" name="compareISRweights.png" path="compareISRweights.png" size="112998" user="bfrancis" version="1"
Changed:
<
<
 META FILEATTACHMENT attachment="compareWithWithout700GeV100cmNLayers4.jpg" attr="" comment="" date="1562694660" name="compareWithWithout700GeV100cmNLayers4.jpg" path="compareWithWithout700GeV100cmNLayers4.jpg" size="112250" user="bfrancis" version="1" attachment="compareWithWithout700GeV100cmNLayers6plus.jpg" attr="" comment="" date="1562694811" name="compareWithWithout700GeV100cmNLayers6plus.jpg" path="compareWithWithout700GeV100cmNLayers6plus.jpg" size="112524" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="compareWithWithout700GeV100cmNLayers4.jpg" attr="" comment="" date="1562699528" name="compareWithWithout700GeV100cmNLayers4.jpg" path="compareWithWithout700GeV100cmNLayers4.jpg" size="107239" user="bfrancis" version="2" attachment="compareWithWithout700GeV100cmNLayers6plus.jpg" attr="" comment="" date="1562699529" name="compareWithWithout700GeV100cmNLayers6plus.jpg" path="compareWithWithout700GeV100cmNLayers6plus.jpg" size="105856" user="bfrancis" version="2"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 272019-07-09 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 30 to 30
NOTE: Questions are in Red (Unanswered), or Green (Answered), or Purple (In Progress) while answers are in Blue .
Changed:
<
<

>
>

## On-going 2018 estimate updates (last updated July 9)

<!--/twistyPlugin twikiMakeVisibleInline-->
Line: 79 to 79

<!--/twistyPlugin-->
Changed:
<
<

>
>

## Additional pre-approval followup (July 9)

### Trigger efficiency signal systematics

Line: 124 to 124
The more subtle issue here is the difference between online and offline track reconstruction. The offline track fitting does a much better job of associating hits to tracks than the online algorithm, something that can be seen clearly in the track-leg efficiency (see Figure 53 in the AN). So even in the case of >= 6 layers, the trigger requirement is fairly inefficient. This is why we take such a large OR with MET paths: at high MET the pure MET paths recover the track-leg inefficiency, and at lower MET the track leg is still an improvement over not using HLT_MET105_IsoTrk50 at all. See the plot below to that effect.
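The benefit of the large OR can be illustrated with a toy emulation: per event, the grand OR fires if any path fires, so its efficiency is bounded below by each individual leg. The thresholds, track-matching probability, and MET spectrum below are invented for illustration only:

```python
import random

random.seed(7)

def passes_met_path(met, threshold=120.0):
    """Toy pure-MET path: fires on MET alone."""
    return met > threshold

def passes_met105_isotrk50(met, trk_matched):
    # Toy stand-in for HLT_MET105_IsoTrk50: lower MET threshold, but it also
    # needs the online track leg to have reconstructed the track, which the
    # text above notes is often inefficient relative to offline tracking.
    return met > 105.0 and trk_matched

# Toy events: (metNoMu, online track leg matched?)
events = [(random.uniform(80, 300), random.random() < 0.4)
          for _ in range(10000)]

eff_met    = sum(passes_met_path(m) for m, _ in events) / len(events)
eff_isotrk = sum(passes_met105_isotrk50(m, t) for m, t in events) / len(events)
eff_or     = sum(passes_met_path(m) or passes_met105_isotrk50(m, t)
                 for m, t in events) / len(events)
print(eff_met, eff_isotrk, eff_or)
```

Events with high MET but a failed online track match are recovered by the pure-MET path, while events just below its threshold can still fire the track-leg path, so the OR strictly improves on either leg alone in this toy.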
Changed:
<
<
>
>
Our most recent addition (Section 8.2.11 and Figure 44 in the AN), which is quite a conservative approach to this, does present the efficiency in signal for the shorter track categories. Perhaps it would be prudent to show a comparison of Figure 44 to the efficiency in signal without HLT_MET105_IsoTrk50, but even as it stands these different turn-ons illustrate the effect of the track leg, which becomes less efficient (yet non-zero) for shorter tracks.
Line: 667 to 667

 META FILEATTACHMENT attachment="compareDatasets_GrandOr_METPath_MuonElectron.png" attr="" comment="" date="1562621623" name="compareDatasets_GrandOr_METPath_MuonElectron.png" path="compareDatasets_GrandOr_METPath_MuonElectron.png" size="119630" user="bfrancis" version="1" attachment="ratio_of_efficiencies.png" attr="" comment="" date="1562621624" name="ratio_of_efficiencies.png" path="ratio_of_efficiencies.png" size="111310" user="bfrancis" version="1" attachment="compareISRweights.png" attr="" comment="" date="1562625581" name="compareISRweights.png" path="compareISRweights.png" size="112998" user="bfrancis" version="1"
>
>
 META FILEATTACHMENT attachment="compareWithWithout700GeV100cmNLayers4.jpg" attr="" comment="" date="1562694660" name="compareWithWithout700GeV100cmNLayers4.jpg" path="compareWithWithout700GeV100cmNLayers4.jpg" size="112250" user="bfrancis" version="1" attachment="compareWithWithout700GeV100cmNLayers6plus.jpg" attr="" comment="" date="1562694811" name="compareWithWithout700GeV100cmNLayers6plus.jpg" path="compareWithWithout700GeV100cmNLayers6plus.jpg" size="112524" user="bfrancis" version="1"

 META TOPICMOVED by="bfrancis" date="1556204305" from="Main.DisappearingTracks2017" to="Main.EXO19010"

#### Revision 262019-07-09 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 45 to 45
statesel="on" }%
Changed:
<
<
>
>
• Produce skimmed ntuples with CRAB

• MET
• EGamma (D in progress)
• SingleMuon
• Tau
Changed:
<
<
>
>
• Create fiducial maps
• Muon
• Electron (A, B in progress)

• Run channels without fiducial track selections
Changed:
<
<
• from MET
• basicSelection
• from EGamma
>
>
• basicSelection
• ZtoEE (BC complete, A in progress)
• ZtoMuMu
• Trigger efficiencies with muons
• Background estimates and systematics (C complete, requires fiducial maps)
• Electron
• Muon
• Tau
• Fake
• ZtoMuMu

• ZtoEE
Changed:
<
<
>
>
• Fetch RAW and re-reco lepton P(veto) passing events
• Signal corrections
• Pileup
• ISR weights
• missing middle/outer hits (requires fiducial maps)
• Trigger scale factors
• Signal Systematics
• Expected upper limits
• Unblind observation
%CHECKLISTEND%

#### Revision 252019-07-09 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 59 to 59

• from MET
• basicSelection
• from EGamma
>
>
• ZtoEE

• ElectronFiducialCalcBeforeOldCuts
• ElectronFiducialCalcAfterOldCuts
• from SingleMuon
Line: 97 to 98

• TauTagPt55MetTrig
• from ZtoMuMu
• ZtoMuMuDisTrkNoD0CutNLayers*
>
>
• from ZtoEE
• ZtoEEDisTrkNoD0CutNLayers*

Changed:
<
<
• SingleElectron
>
>
• EGamma

#### Revision 242019-07-09 - BrianFrancis

Line: 1 to 1

 META TOPICPARENT name="BrianFrancis"

# Search for new physics with disappearing tracks

Line: 32 to 32

## On-going 2018 estimate updates (last updated July 8)

>
>
<!--/twistyPlugin twikiMakeVisibleInline-->

( - todo - doing - done ) Reset

• Produce skimmed ntuples with CRAB, using the config files in DisappTrks/CandidateTrackProducer/test
• MET
• EGamma (D in progress)
• SingleMuon
• Tau
• icon:led-orange Update dataset names in DisappTrks/StandardAnalysis/python/miniAOD_92X_Samples.py and integrated luminosities in DisappTrks/StandardAnalysis/python/IntegratedLuminosity_cff.py
• MET
• EGamma
• SingleMuon
• Tau
• Run channels without fiducial track selections
• from MET
• _basicSelection_
• from EGamma
• ElectronFiducialCalcBeforeOldCuts
• ElectronFiducialCalcAfterOldCuts
• from SingleMuon
• _ZtoMuMu_
• MuonFiducialCalcBeforeOldCuts
• MuonFiducialCalcAfterOldCuts
• Trigger efficiency
• METLegDenominator
• METLegNumerator
• TrackLegDenominatorWithMuons
• TrackLegNumeratorWithMuons
• GrandOrDenominator
• GrandOrNumerator
• Commit (and elog plots of) the trigger efficiencies to DisappTrks/StandardAnalysis/data and update references in DisappTrks/StandardAnalysis/python/customize.py
• Commit electron and muon fiducial maps to OSUT3Analysis/Configuration/data and update references to file in DisappTrks/StandardAnalysis/python/customize.py
• Run channels for background estimates and systematics
• from basicSelection
• disTrkSelectionNLayers*
• isoTrkSelection
• muonCtrlSelection
• hitsSystematicsCtrlSelection
• from EGamma
• ZtoEleProbeTrkNLayers*
• ZtoEleProbeTrkWithFilterNLayers*
• ZtoEleProbeTrkWithSSFilterNLayers*
• ElectronTagPt55NLayers*
• ElectronTagPt55MetTrigNLayers*
• from SingleMuon
• ZtoMuProbeTrkNLayers*
• ZtoMuProbeTrkWithFilterNLayers*
• ZtoMuProbeTrkWithSSFilterNLayers*
• MuonTagPt55
• MuonTagPt55MetTrig
• from Tau
• TauTagPt55
• TauTagPt55MetTrig
• from ZtoMuMu
• ZtoMuMuDisTrkNoD0CutNLayers*
• Fetch RAW events from all Zto*DisTrkNoD0CutNLayers* channels and re-reconstruct them using DisappTrks/BackgroundEstimation/test/rereco_2018ABC_MINIAOD_cfg.py (or 2018D); add the output to DisappTrks/BackgroundEstimation/data
• EGamma
• SingleMuon
• Re-run Zto*DisTrkNoD0CutNLayers* channels interactively over re-reconstructed events
• SingleElectron
• SingleMuon
• Perform background estimates by updating directories in DisappTrks/BackgroundEstimation/test/bkgdEstimate_2018.py
• fake tracks
• electrons
• muons
• taus
• Evaluate background systematics by updating directories in DisappTrks/BackgroundSystematics/test/bkgdSystematics_2018.py
• fake tracks
• leptons
• Calculate missing middle/outer hits corrections by running findMissingMiddleHitsCorrection.py and findMissingOuterHitsCorrection.py; update numbers in DisappTrks/StandardAnalysis/python/MissingHitsCorrections_cff.py
• Run the disTrkSelection on all signal MC
• Re-run channels for plots over background MC only
• from MET
• muonCtrlSelection
• hitsSystematicsCtrlSelection
• from SingleMuon
• ZtoMuMuDisTrkNHits4NoECaloCut
• Update background estimates and other settings in config files for limit setting in DisappTrks/LimitSetting/test
• Produce datacards using makeDatacards.py