Electron-faking-tau Veto Optimization in 2012
Introduction
This twiki provides documentation for the electron-faking-tau veto (eveto) reoptimization in autumn 2012.
The veto is a BDT made with TMVA and scripted via taumva. The veto is trained and tested on taus (signal) and electrons (background) from DY/Zleplep MC.
Contact Alex Tuna (tuna at cern dot ch) with questions or concerns about the measurement.
Plots
Plots of input variables can be found here.
For now, plots of BDT investigations can be found in the Version notes section.
Input variables
17 variables were considered for the BDT. (Please alert Alex if the descriptions are incorrect!) Some notes are included.
- absdeltaeta: The difference between the cluster eta and the leading track eta.
  - Created by hand to mimic variables like el_deltaEta1, which are used in electron identification.
- absdeltaphi: The difference between the cluster phi and the leading track phi.
  - See notes for absdeltaeta.
- calcVars_corrFTrk: 1.0 / etOverPtLeadTrk, with an nVertex correction.
- calcVars_corrCentFrac: Same as seedCalo_centFrac, with an nVertex correction applied.
- calcVars_ChPiEMEOverCaloEME: (Leading track momentum - energy of the HCAL) divided by the energy of the ECAL.
  - E3 is part of the HCAL term.
  - "Leading track momentum" is actually the sum of track momenta, but here we only have one track.
- calcVars_EMFractionAtEMScale_moveE3: The fraction of energy deposited in the ECAL versus the energy deposited in the ECAL + HCAL.
  - Here, E3 is moved from the HCAL term to the ECAL term by hand.
  - This version of EMFraction is better modeled than the d3pd version because E3 is mis-modeled in 2012.
- calcVars_pi0BDTPrimaryScore: The score of the tau pi0 BDT.
  - This is not used in the training because the tight timescale does not allow for sufficient studies.
- calcVars_PSSFraction: The fraction of energy deposited in the presampler plus strips versus the energy deposited in the ECAL + HCAL.
- etOverPtLeadTrk: The energy of the ECAL + HCAL divided by the momentum of the leading track.
  - Not used because it is redundant with calcVars_corrFTrk.
- seedCalo_centFrac: The amount of ECAL energy with dR < 0.1 divided by the total ECAL energy (does not include E3).
  - Not used because it is redundant with calcVars_corrCentFrac.
- seedCalo_hadRadius: The equivalent of emRadius, except in the HCAL.
  - Not used at high eta because the contribution of E3 is mis-modeled.
- seedCalo_isolFrac: The amount of ECAL energy in a dR ring from 0.1 to 0.2 divided by the total ECAL energy (does not include E3).
- seedTrk_hadLeakEt: The amount of energy in the zeroth layer of the HCAL divided by the leading track momentum.
- seedTrk_secMaxStripEt: The maximum single strip energy in a 100x3 eta/phi window of the strips.
  - Only used in the barrel because it is not well defined in the crack and for track eta > 1.7.
- seedTrk_secMaxStripEtOverPt: seedTrk_secMaxStripEt divided by the leading track momentum.
  - Created by hand to reduce momentum-dependence of seedTrk_secMaxStripEt.
- seedTrk_sumEMCellEtOverLeadTrkPt: The energy of the ECAL divided by the momentum of the leading track (includes E3).
  - Not used because it is redundant with etOverPtLeadTrk and calcVars_corrFTrk.
- TRTHTOverLT_LeadTrk: The ratio of high threshold TRT hits to low threshold hits of the leading track.
  - Not used at high eta because the TRT coverage ends near 2.00.
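As a rough illustration of the hand-made variables described above (absdeltaeta, absdeltaphi, the moved-E3 EM fraction, and secMaxStripEtOverPt), here is a minimal Python sketch. The function and argument names are hypothetical, wrapping delta phi into [0, pi] is an assumption, and the nVertex corrections are left out because their form is not given on this page.

```python
import math


def abs_delta_eta(cluster_eta, track_eta):
    """|eta(cluster) - eta(leading track)|."""
    return abs(cluster_eta - track_eta)


def abs_delta_phi(cluster_phi, track_phi):
    """|phi(cluster) - phi(leading track)|, wrapped into [0, pi] (assumed convention)."""
    dphi = math.fmod(cluster_phi - track_phi, 2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    elif dphi < -math.pi:
        dphi += 2.0 * math.pi
    return abs(dphi)


def em_fraction_move_e3(ecal_energy, hcal_energy, e3_energy):
    """EM fraction with E3 moved from the HCAL term to the ECAL term.

    Assumed inputs: ecal_energy excludes E3, hcal_energy includes E3.
    """
    total = ecal_energy + hcal_energy
    if total <= 0.0:
        raise ValueError("total calorimeter energy must be positive")
    return (ecal_energy + e3_energy) / total


def sec_max_strip_et_over_pt(sec_max_strip_et, lead_track_pt):
    """secMaxStripEt divided by the leading track momentum."""
    if lead_track_pt <= 0.0:
        raise ValueError("leading track momentum must be positive")
    return sec_max_strip_et / lead_track_pt
```

Dividing by the leading track momentum is what removes most of the momentum dependence of the raw strip energy, which is the stated motivation for secMaxStripEtOverPt.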
Some helpful links for understanding how the input variables are made:
Grooming
Before being fed to taumva, taus from the d3pd are groomed as follows:
- pT > 20 GeV
- eta (cluster) < 2.5
- numTrack = 1
- author = 1 | 3
- pass JetBDTLoose
- categorization:
- signal:
- background:
- Leading truth object in dR cone is electron or photon (same as 2012 SF measurement)
- No overlap with tightPP electrons
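The kinematic and quality cuts above could be sketched as a single predicate. This is only an illustration: the class and field names are hypothetical, the eta cut is assumed to apply to |eta|, and "author = 1 | 3" is read as "author is 1 or 3". The truth-matching categorization and the tightPP overlap requirement are left out because their exact nesting is not recoverable from the list.

```python
from dataclasses import dataclass


@dataclass
class TauCandidate:
    # Hypothetical container; field names are not the d3pd branch names.
    pt_gev: float
    cluster_eta: float
    num_track: int
    author: int
    pass_jet_bdt_loose: bool


def passes_grooming(tau):
    """Apply the pT, eta, track multiplicity, author, and JetBDTLoose cuts."""
    return (
        tau.pt_gev > 20.0
        and abs(tau.cluster_eta) < 2.5  # assumption: cut is on |eta|
        and tau.num_track == 1
        and tau.author in (1, 3)        # assumption: "1 | 3" means 1 or 3
        and tau.pass_jet_bdt_loose
    )
```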
Sample statistics
| Sample | pT range (GeV) | eta range | # events |
| Ztautau | 15 -- 60 | 0.00 -- 0.80 | 908166 |
| Ztautau | 15 -- 60 | 0.80 -- 1.37 | 606908 |
| Ztautau | 15 -- 60 | 1.37 -- 1.52 | 150065 |
| Ztautau | 15 -- 60 | 1.52 -- 2.00 | 484130 |
| Ztautau | 15 -- 60 | 2.00 -- 3.00 | 421871 |
| Zee | 15 -- 60 | 0.00 -- 0.80 | 541343 |
| Zee | 15 -- 60 | 0.80 -- 1.37 | 490158 |
| Zee | 15 -- 60 | 1.37 -- 1.52 | 494821 |
| Zee | 15 -- 60 | 1.52 -- 2.00 | 370932 |
| Zee | 15 -- 60 | 2.00 -- 3.00 | 246371 |
| DYtautau | 60 -- 999 | 0.00 -- 0.80 | 427489 |
| DYtautau | 60 -- 999 | 0.80 -- 1.37 | 214597 |
| DYtautau | 60 -- 999 | 1.37 -- 1.52 | 36633 |
| DYtautau | 60 -- 999 | 1.52 -- 2.00 | 89852 |
| DYtautau | 60 -- 999 | 2.00 -- 3.00 | 39533 |
| DYee | 60 -- 999 | 0.00 -- 0.80 | 43608 |
| DYee | 60 -- 999 | 0.80 -- 1.37 | 37467 |
| DYee | 60 -- 999 | 1.37 -- 1.52 | 99135 |
| DYee | 60 -- 999 | 1.52 -- 2.00 | 21086 |
| DYee | 60 -- 999 | 2.00 -- 3.00 | 13900 |
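The pT and eta ranges in the table above define a simple two-dimensional binning. A small Python sketch of the bin lookup follows; the function names are hypothetical, the bins are assumed to be half-open (lower edge included), and the eta value is assumed to be |eta|. Only the bin edges come from the table.

```python
from bisect import bisect_right

# Bin edges copied from the sample statistics table.
PT_EDGES_GEV = [15.0, 60.0, 999.0]
ETA_EDGES = [0.00, 0.80, 1.37, 1.52, 2.00, 3.00]


def find_bin(value, edges):
    """Index of the half-open bin [edges[i], edges[i+1]) containing value, or None."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def kinematic_category(pt_gev, abs_eta):
    """Return (pT bin index, eta bin index), or None if outside the table ranges."""
    pt_bin = find_bin(pt_gev, PT_EDGES_GEV)
    eta_bin = find_bin(abs_eta, ETA_EDGES)
    if pt_bin is None or eta_bin is None:
        return None
    return (pt_bin, eta_bin)
```

For example, `kinematic_category(45.0, 1.45)` lands in the low pT, 1.37 -- 1.52 bin.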
Version notes
v10-02 (2012-12-02-14h29m14s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
- 6 or 5 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- First training of endcap, low pT category. NB: secMaxStripEtOverPt and TRTHToverLT are not considered here because the variables are not valid in the endcap.
- Three N-1 trainings crashed.
- Some over-training, but not dramatic.
- Action item: Discuss with SM.
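The NB notes across the version entries say which variables are dropped up front in each eta category. A minimal sketch of that bookkeeping, assuming the ten variables agreed with SM as the starting list; the dictionary layout and function name are hypothetical, and the region labels are the ones used in these notes.

```python
# The ten variables used as the starting point from v04-00 onward.
BASE_VARIABLES = [
    "tau_absdeltaeta",
    "tau_absdeltaphi",
    "tau_calcVars_corrFTrk",
    "tau_calcVars_corrCentFrac",
    "tau_calcVars_EMFractionAtEMScale_moveE3",
    "tau_calcVars_PSSFraction",
    "tau_seedCalo_isolFrac",
    "tau_seedTrk_hadLeakEt",
    "tau_seedTrk_secMaxStripEtOverPt",
    "tau_TRTHTOverLT_LeadTrk",
]

# Variables not considered in a region, as stated in the NB notes.
EXCLUDED_BY_REGION = {
    "crack": {"tau_seedTrk_secMaxStripEtOverPt"},
    "barrel extended": {"tau_seedTrk_secMaxStripEtOverPt"},
    "endcap": {"tau_seedTrk_secMaxStripEtOverPt", "tau_TRTHTOverLT_LeadTrk"},
}


def starting_variables(region):
    """Variables considered for the first training in the given region."""
    excluded = EXCLUDED_BY_REGION.get(region, set())
    return [name for name in BASE_VARIABLES if name not in excluded]
```

This reproduces the first-iteration counts in the notes: 10 variables in the barrel, 9 in the crack and extended barrel, and 8 in the endcap.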
v10-01 (2012-12-02-12h39m40s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
- 7 or 6 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- First training of endcap, low pT category. NB: secMaxStripEtOverPt and TRTHToverLT are not considered here because the variables are not valid in the endcap.
- No trainings crashed.
- Some over-training, but not dramatic.
- Action item: Remove absdeltaphi, retrain. Discuss with SM.
v10-00 (2012-12-02-12h08m03s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
- 8 or 7 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- First training of endcap, low pT category. NB: secMaxStripEtOverPt and TRTHToverLT are not considered here because the variables are not valid in the endcap.
- One N-1 training crashed, when absdeltaphi was removed.
- Some over-training, but not dramatic.
- Action item: Remove corrCentFrac, retrain.
v09-03 (2012-12-02-11h47m19s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Minor over-training.
- Action item: Move to next kinematic category.
v09-02 (2012-12-02-11h24m34s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 7 or 6 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- One N-1 training crashed, when EMFraction was removed.
- Some over-training, but not dramatic.
- Action item: Remove absdeltaphi, retrain.
v09-01 (2012-12-02-10h58m10s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 8 or 7 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Some over-training, but not dramatic.
- Action item: Remove isolFrac, retrain.
v09-00 (2012-12-02-10h27m51s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 9 or 8 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- First training of barrel extended, low pT category. NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Some over-training, but not dramatic.
- Action item: Remove corrCentFrac, retrain.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 9 variables | 0.893 |
| Drop absdeltaeta | 0.862 |
| Drop absdeltaphi | 0.878 |
| Drop corrFTrk | 0.860 |
| Drop corrCentFrac | 0.890 |
| Drop EMFraction | 0.857 |
| Drop PSSFraction | 0.869 |
| Drop isolFrac | 0.887 |
| Drop hadLeakEt | 0.869 |
| Drop TRTHTOverLT | 0.781 |
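The "Sig eff for bkg @ 0.01" column is read here as the signal efficiency at the score cut that keeps 1% of the background. A minimal Python sketch of that figure of merit, computed from lists of BDT scores, is given below. This is an interpretation of the column header and not the taumva implementation; the function name and the tie handling are assumptions.

```python
import math


def signal_efficiency_at_background(signal_scores, background_scores,
                                    target_bkg_eff=0.01):
    """Signal efficiency at the loosest cut (score > threshold) whose
    background efficiency does not exceed target_bkg_eff."""
    if not signal_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    bkg_sorted = sorted(background_scores, reverse=True)
    # Largest number of background events allowed to pass the cut.
    # The small epsilon guards against floating-point round-down.
    n_allowed = math.floor(target_bkg_eff * len(bkg_sorted) + 1e-9)
    if n_allowed >= len(bkg_sorted):
        threshold = -math.inf
    else:
        # At most n_allowed background scores lie strictly above this value.
        threshold = bkg_sorted[n_allowed]
    n_signal_pass = sum(1 for score in signal_scores if score > threshold)
    return n_signal_pass / len(signal_scores)
```

With a figure of merit like this, each row of an N-1 table is one number per training, which is what makes the drop-one comparisons straightforward.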
v08-04 (2012-12-02-10h09m54s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 5 or 4 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Minimal over-training.
- Action item: Move to next kinematic category.
v08-03 (2012-12-02-09h50m28s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Some over-training, but not dramatic.
- Action item: Remove absdeltaphi, retrain.
v08-02 (2012-12-02-08h59m45s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_PSSFraction
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 7 or 6 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Some over-training, but not dramatic.
- Action item: Remove PSSFraction, retrain.
v08-01 (2012-12-02-08h34m42s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 8 or 7 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Some over-training, but not dramatic.
- Action item: Remove EMFraction, retrain.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 8 variables | 0.899 |
| Drop absdeltaeta | 0.868 |
| Drop absdeltaphi | 0.877 |
| Drop corrFTrk | 0.872 |
| Drop corrCentFrac | 0.876 |
| Drop EMFraction | 0.888 |
| Drop PSSFraction | 0.879 |
| Drop hadLeakEt | 0.868 |
| Drop TRTHTOverLT | 0.750 |
v08-00 (2012-12-02-07h59m12s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_TRTHTOverLT_LeadTrk
- 9 or 8 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- First training of crack, low pT category. NB: secMaxStripEtOverPt is not considered here because the variable is not filled in the crack.
- No trainings crashed.
- Some over-training, but not dramatic.
- Action item: Remove isolFrac, retrain.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 9 variables | 0.899 |
| Drop absdeltaeta | 0.870 |
| Drop absdeltaphi | 0.882 |
| Drop corrFTrk | 0.875 |
| Drop corrCentFrac | 0.896 |
| Drop EMFraction | 0.892 |
| Drop PSSFraction | 0.883 |
| Drop isolFrac | 0.899 |
| Drop hadLeakEt | 0.868 |
| Drop TRTHTOverLT | 0.760 |
v07-04 (2012-12-02-06h54m45s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Negligible over-training.
- No obvious input variable candidate for removal.
- Action item: Discuss with SM. Move to next kinematic category.
v07-03 (2012-12-02-06h32m05s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 7 or 6 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- One N-1 training crashed, when corrFTrk was removed.
- Some over-training, but not dramatic, and reduced from previous iteration.
- Action item: Remove PSSFraction, retrain.
v07-02 (2012-12-02-05h05m16s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 8 variables | 0.835 |
| Drop absdeltaeta | 0.775 |
| Drop corrFTrk | 0.785 |
| Drop EMFraction | 0.815 |
| Drop PSSFraction | 0.818 |
| Drop isolFrac | 0.820 |
| Drop hadLeakEt | 0.815 |
| Drop secMaxStripEtOverPt | 0.789 |
| Drop TRTHTOverLT | 0.718 |
v07-01 (2012-12-02-04h37m07s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 9 variables | 0.853 |
| Drop absdeltaeta | 0.804 |
| Drop absdeltaphi | 0.835 |
| Drop corrFTrk | 0.802 |
| Drop EMFraction | 0.828 |
| Drop PSSFraction | 0.833 |
| Drop isolFrac | N/A |
| Drop hadLeakEt | 0.835 |
| Drop secMaxStripEtOverPt | N/A |
| Drop TRTHTOverLT | 0.729 |
v07-00 (2012-12-02-03h59m35s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 10 or 9 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- First training of barrel outer, low pT category.
- One N-1 training crashed, when isolFrac was removed.
- Some over-training, but not dramatic.
- Action item: Remove corrCentFrac, retrain.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 10 variables | 0.856 |
| Drop absdeltaeta | 0.809 |
| Drop absdeltaphi | 0.841 |
| Drop corrFTrk | 0.806 |
| Drop corrCentFrac | 0.853 |
| Drop EMFraction | 0.834 |
| Drop PSSFraction | 0.840 |
| Drop isolFrac | N/A |
| Drop hadLeakEt | 0.837 |
| Drop secMaxStripEtOverPt | 0.813 |
| Drop TRTHTOverLT | 0.738 |
v06-04 (2012-12-01-17h42m19s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- Two N-1 trainings crashed, when absdeltaeta or corrFTrk was removed.
- Still over-training, but improved from previous iteration.
- Action item: Discuss with SM. No obvious candidate for removal.
v06-03 (2012-12-01-17h34m29s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 7 or 6 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- One N-1 training crashed, when absdeltaeta was removed.
- Still significant over-training, but improved from previous iteration.
- Action item: Remove isolFrac, retrain in hope over-training is reduced with fewer input variables.
v06-02 (2012-12-01-17h25m55s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 8 or 7 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- Two N-1 trainings crashed, when isolFrac or absdeltaeta was removed.
- Still significant over-training, but improved from previous iteration.
- Action item: Remove absdeltaphi, retrain in hope over-training is reduced with fewer input variables.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 8 variables | 0.805 |
| Drop absdeltaeta | N/A |
| Drop absdeltaphi | 0.801 |
| Drop corrFTrk | 0.764 |
| Drop EMFraction | 0.775 |
| Drop isolFrac | N/A |
| Drop hadLeakEt | 0.715 |
| Drop secMaxStripEtOverPt | 0.758 |
| Drop TRTHTOverLT | 0.691 |
v06-01 (2012-12-01-17h17m14s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 9 or 8 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- One N-1 training crashed, when isolFrac was removed.
- Still significant over-training.
- Action item: Remove corrCentFrac, retrain in hope over-training is reduced with fewer input variables.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 9 variables | 0.810 |
| Drop absdeltaeta | 0.792 |
| Drop absdeltaphi | 0.804 |
| Drop corrFTrk | 0.763 |
| Drop corrCentFrac | 0.805 |
| Drop EMFraction | 0.789 |
| Drop isolFrac | N/A |
| Drop hadLeakEt | 0.734 |
| Drop secMaxStripEtOverPt | 0.771 |
| Drop TRTHTOverLT | 0.697 |
v06-00 (2012-12-01-17h07m32s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 10 or 9 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- First training of merge barrel, high pT category.
- No trainings crashed.
- Still significant over-training.
- Action item: Remove PSSFraction, retrain in hope over-training is reduced with fewer input variables.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 10 variables | 0.817 |
| Drop absdeltaeta | 0.797 |
| Drop absdeltaphi | 0.803 |
| Drop corrFTrk | 0.777 |
| Drop corrCentFrac | 0.810 |
| Drop EMFraction | 0.791 |
| Drop PSSFraction | 0.810 |
| Drop isolFrac | 0.803 |
| Drop hadLeakEt | 0.738 |
| Drop secMaxStripEtOverPt | 0.787 |
| Drop TRTHTOverLT | 0.712 |
v05-02 (2012-12-01-16h21m03s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 9 or 8 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Still significant over-training.
- Action item: After discussing with SM, merge barrel inner and barrel outer in high pT bin. Restart with 10 input variables, 20 trees.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 9 variables | 0.846 |
| Drop absdeltaeta | 0.831 |
| Drop absdeltaphi | 0.836 |
| Drop corrFTrk | 0.810 |
| Drop corrCentFrac | 0.829 |
| Drop EMFraction | 0.813 |
| Drop isolFrac | 0.832 |
| Drop hadLeakEt | 0.739 |
| Drop secMaxStripEtOverPt | 0.795 |
| Drop TRTHTOverLT | 0.721 |
v05-01 (2012-12-01-16h08m50s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 10 or 9 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Still significant over-training.
- Action item: Remove PSSFraction, retrain in hope over-training is reduced with fewer input variables.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 10 variables | 0.844 |
| Drop absdeltaeta | 0.824 |
| Drop absdeltaphi | 0.836 |
| Drop corrFTrk | 0.805 |
| Drop corrCentFrac | 0.820 |
| Drop EMFraction | 0.825 |
| Drop PSSFraction | 0.846 |
| Drop isolFrac | 0.821 |
| Drop hadLeakEt | 0.742 |
| Drop secMaxStripEtOverPt | 0.799 |
| Drop TRTHTOverLT | 0.734 |
v05-00 (2012-12-01-15h57m23s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 10 variables | 0.857 |
| Drop absdeltaeta | 0.836 |
| Drop absdeltaphi | 0.843 |
| Drop corrFTrk | 0.809 |
| Drop corrCentFrac | 0.836 |
| Drop EMFraction | 0.832 |
| Drop PSSFraction | 0.849 |
| Drop isolFrac | 0.832 |
| Drop hadLeakEt | 0.749 |
| Drop secMaxStripEtOverPt | 0.811 |
| Drop TRTHTOverLT | 0.734 |
v04-10 (2012-12-01-14h53m28s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 8 or 7 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Over-training is increased relative to 10 trees, but is still not dramatic.
- Action item: Consult with SM. Move to next kinematic category.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 8 variables | 0.878 |
| Drop absdeltaeta | 0.842 |
| Drop absdeltaphi | 0.862 |
| Drop corrFTrk | 0.825 |
| Drop corrCentFrac | 0.860 |
| Drop EMFraction | 0.860 |
| Drop hadLeakEt | 0.826 |
| Drop secMaxStripEtOverPt | 0.824 |
| Drop TRTHTOverLT | 0.731 |
v04-09 (2012-12-01-14h26m53s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 9 variables | 0.883 |
| Drop absdeltaeta | 0.848 |
| Drop absdeltaphi | 0.870 |
| Drop corrFTrk | 0.838 |
| Drop corrCentFrac | 0.870 |
| Drop EMFraction | 0.873 |
| Drop PSSFraction | 0.878 |
| Drop hadLeakEt | 0.845 |
| Drop secMaxStripEtOverPt | 0.841 |
| Drop TRTHTOverLT | 0.740 |
v04-08 (2012-12-01-13h48m46s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 10 variables | 0.886 |
| Drop absdeltaeta | 0.855 |
| Drop absdeltaphi | 0.873 |
| Drop corrFTrk | 0.841 |
| Drop corrCentFrac | 0.882 |
| Drop EMFraction | 0.875 |
| Drop PSSFraction | 0.882 |
| Drop isolFrac | 0.883 |
| Drop hadLeakEt | 0.894 |
| Drop secMaxStripEtOverPt | 0.845 |
| Drop TRTHTOverLT | 0.752 |
v04-07 (2012-12-01-09h31m15s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 80 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Over-training is minor, in all configurations.
- Minor improvement by going to 80 trees. It feels like we are asymptotically approaching the optimal number of trees to train with.
- Action item: try 10 vars with 40 trees.
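The tree-count study in these entries (10, 20, 40, 80 trees, with shrinking gains at each step) was done by hand. Purely as an illustration, the same doubling scan with a stopping rule could be sketched as below. The function names, the `min_gain` tolerance, and the toy figure of merit are all hypothetical; no numbers from the actual trainings are used.

```python
def scan_tree_counts(evaluate, start=10, max_trees=80, min_gain=0.005):
    """Double the number of trees until the gain in the figure of merit
    drops below min_gain or max_trees is exceeded.

    evaluate(n_trees) must return a figure of merit (higher is better),
    for example the signal efficiency at fixed background efficiency.
    Returns the list of (n_trees, figure_of_merit) pairs that were tried.
    """
    history = []
    previous = None
    n_trees = start
    while n_trees <= max_trees:
        score = evaluate(n_trees)
        history.append((n_trees, score))
        if previous is not None and score - previous < min_gain:
            break
        previous = score
        n_trees *= 2
    return history


def toy_figure_of_merit(n_trees):
    """Toy saturating curve, for demonstration only (not a real training)."""
    return 1.0 - 1.0 / n_trees
```

In practice `evaluate` would wrap a full training and evaluation, so the scan is only worth automating if each training is cheap.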
v04-06 (2012-12-01-09h09m14s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 40 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Over-training is minor, in all configurations.
- Some improvement by going to 40 trees, not as dramatic as going from 10 to 20.
- Action item: Increase number of trees from 40 to 80 to gauge the effect.
v04-05 (2012-12-01-08h50m31s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Over-training is minor, in all configurations.
- Significant improvement by going to 20 trees.
- Action item: Increase number of trees from 20 to 40 to gauge the effect.
v04-04 (2012-12-01-08h29m42s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 6 or 5 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- NB: Plots indicate v04-03 but they are v04-04.
- No trainings crashed.
- Over-training is minor, in all configurations.
- No input variable is an obvious candidate for removal. This could be a final set.
- Action item: Increase number of trees from 10 to 20 to gauge the effect.
v04-03 (2012-12-01-08h11m40s)
tau_absdeltaeta
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 7 or 6 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Over-training is minor, in all configurations.
- Action item: Remove "least valuable" variable (corrCentFrac) and do N-1 trainings with the remaining 6 input variables.
v04-02 (2012-12-01-07h50m31s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 8 or 7 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- One N-1 training crashed: training without absdeltaeta.
- Over-training is minor, in all configurations.
- Action item: Remove "least valuable" variable (absdeltaphi) and do N-1 trainings with the remaining 7 input variables.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 8 variables | 0.851 |
| Drop absdeltaeta | N/A |
| Drop absdeltaphi | 0.837 |
| Drop corrFTrk | 0.801 |
| Drop corrCentFrac | 0.833 |
| Drop EMFraction | 0.835 |
| Drop hadLeakEt | 0.794 |
| Drop secMaxStripEtOverPt | 0.799 |
| Drop TRTHTOverLT | 0.702 |
v04-01 (2012-12-01-07h17m38s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 9 or 8 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Over-training is minor, in all configurations.
- Action item: Remove "least valuable" variable (PSSFraction) and do N-1 trainings with the remaining 8 input variables.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 9 variables | 0.860 |
| Drop absdeltaeta | 0.822 |
| Drop absdeltaphi | 0.845 |
| Drop corrFTrk | 0.810 |
| Drop corrCentFrac | 0.844 |
| Drop EMFraction | 0.845 |
| Drop PSSFraction | 0.851 |
| Drop hadLeakEt | 0.815 |
| Drop secMaxStripEtOverPt | 0.818 |
| Drop TRTHTOverLT | 0.711 |
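Each entry's "N or N-1 input variables" refers to one training with the full list plus one training per dropped variable. A small Python sketch of how those N-1 configurations can be enumerated, using the v04-01 variable list; the function name is hypothetical and this is not how taumva itself is configured.

```python
V04_01_VARIABLES = [
    "tau_absdeltaeta",
    "tau_absdeltaphi",
    "tau_calcVars_corrFTrk",
    "tau_calcVars_corrCentFrac",
    "tau_calcVars_EMFractionAtEMScale_moveE3",
    "tau_calcVars_PSSFraction",
    "tau_seedTrk_hadLeakEt",
    "tau_seedTrk_secMaxStripEtOverPt",
    "tau_TRTHTOverLT_LeadTrk",
]


def n_minus_one_configs(variables):
    """Map each dropped variable to the list of variables that remain."""
    return {
        dropped: [name for name in variables if name != dropped]
        for dropped in variables
    }


configs = n_minus_one_configs(V04_01_VARIABLES)
# One N-1 training per input variable, each with one fewer variable.
```

For the 9 variables of v04-01 this gives 9 drop-one trainings of 8 variables each, matching the 9 "Drop ..." rows of the table.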
v04-00 (2012-12-01-06h13m24s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEtOverPt
tau_TRTHTOverLT_LeadTrk
- 10 or 9 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- Two N-1 trainings crashed: training without corrFTrk and training without secMaxStripEtOverPt.
- Over-training is minor, in all configurations.
- TRTHT ratio is super important, as expected.
- A number of variables qualify as "less valuable". See table below.
- Action item: Remove "least valuable" variable (isolFrac) and do N-1 trainings with the remaining 9 input variables.
| Input variable configuration | Sig eff for bkg @ 0.01 |
| All 10 variables | 0.864 |
| Drop absdeltaeta | 0.832 |
| Drop absdeltaphi | 0.850 |
| Drop corrFTrk | N/A |
| Drop corrCentFrac | 0.858 |
| Drop EMFraction | 0.853 |
| Drop PSSFraction | 0.859 |
| Drop isolFrac | 0.860 |
| Drop hadLeakEt | 0.822 |
| Drop secMaxStripEtOverPt | N/A |
| Drop TRTHTOverLT | 0.727 |
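The "least valuable" variable is the one whose removal costs the least signal efficiency, so it is the row with the highest N-1 efficiency. A short Python sketch of that selection, using the v04-00 numbers from the table above; the function names are hypothetical, and crashed trainings (N/A) are represented as `None` and skipped.

```python
# N-1 signal efficiencies from the v04-00 table; None marks a crashed training.
V04_00_BASELINE = 0.864
V04_00_N_MINUS_ONE = {
    "absdeltaeta": 0.832,
    "absdeltaphi": 0.850,
    "corrFTrk": None,
    "corrCentFrac": 0.858,
    "EMFraction": 0.853,
    "PSSFraction": 0.859,
    "isolFrac": 0.860,
    "hadLeakEt": 0.822,
    "secMaxStripEtOverPt": None,
    "TRTHTOverLT": 0.727,
}


def least_valuable(n_minus_one):
    """Variable whose removal leaves the highest signal efficiency."""
    valid = {name: eff for name, eff in n_minus_one.items() if eff is not None}
    if not valid:
        raise ValueError("no converged N-1 trainings to compare")
    return max(valid, key=valid.get)


def efficiency_loss(baseline, n_minus_one):
    """Loss in signal efficiency when each variable is dropped."""
    return {
        name: baseline - eff
        for name, eff in n_minus_one.items()
        if eff is not None
    }
```

Applied to the v04-00 table this picks isolFrac, in agreement with the action item, and shows TRTHTOverLT as by far the largest loss.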
v02-06 (2012-11-30-04h47m20s)
tau_calcVars_EMFractionAtEMScale_moveE3
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt
tau_TRTHTOverLT_LeadTrk
- 5 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- No trainings crashed.
- Performance of low pT bin improves without additional over-training. Good!
- Performance of high pT bin improves but with additional over-training. Bad.
- I am now suspicious of secMaxStripEt because I found out it is simply a raw energy, not energy / lead track pT or something like that.
- Action item: Discuss with SM.
- Action follow up: Discussed with SM, settled on 10 variables which should "in principle" be important for training. Will focus on those.
v02-05 (2012-11-30-04h36m22s)
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt
tau_TRTHTOverLT_LeadTrk
- 4 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- Low pT bin did not crash. This indicates the crashing is at least partly correlated with available statistics.
- Over-training is reduced further, possibly to acceptable levels. I will now focus on improving performance, which has degraded since I reduced the list of input variables.
- Action item: Re-introduce EMFraction.
v02-04 (2012-11-30-04h19m59s)
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt
tau_TRTHTOverLT_LeadTrk
- 4 input variables, 10 trees.
- Signal scores, background scores, and ROC curves
- pT bin 30-60 GeV crashed. For this configuration, there appears to be a threshold between 20 and 50 trees below which the training crashes.
- Over-training is significantly reduced, and performance degradation is small. I will stick with 10 trees for now.
- Action item: Merge some pT bins to see if statistics boost helps.
v02-03 (2012-11-30-04h07m51s)
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt
tau_TRTHTOverLT_LeadTrk
- 4 input variables, 50 trees.
- Signal scores, background scores, and ROC curves
- NB: The plots say v02-02, but these are v02-03. If in doubt, consult the date in the URL.
- pT bin 30-60 GeV did not crash here. Only change was increased number of trees. This behavior is not understood, but hopefully Noel's patches to taumva will help.
- Performance is only slightly improved, at the cost of more significant over-training.
- Action item: Go to 10 trees, keep 4 input variables.
v02-02 (2012-11-30-03h55m42s)
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt
tau_TRTHTOverLT_LeadTrk
- 4 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- pT bin 30-60 GeV crashed. Not clear why.
- Performance degradation (compared to 7 input variables) is significant.
- Over-training is reduced but still present.
- Action item: Go back to 50 trees, keep 4 input variables.
v02-01 (2012-11-29-13h34m54s)
tau_absdeltaeta
tau_calcVars_EMFractionAtEMScale_moveE3
tau_etOverPtLeadTrk
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt
tau_TRTHTOverLT_LeadTrk
- 7 input variables, 20 trees.
- Signal scores, background scores, and ROC curves
- Over-training is reduced but still present. Correlated with available statistics?
- Action item: Reduce number of input variables to see if this reduces over-training.
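This page does not say how over-training was quantified; it was presumably judged from the training and testing score distributions in the linked plots. One common way to put a number on it is a two-sample Kolmogorov-Smirnov distance between the training and testing scores. The sketch below is a plain Python illustration of that statistic, with a hypothetical function name, and is not taken from the taumva workflow.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest distance between the
    empirical CDFs of the two samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    d_max = 0.0
    # The empirical CDFs only change at data points, so it is enough
    # to compare them there.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max
```

A value near 0 means the training and testing score distributions agree; larger values point to over-training, with the caveat that the statistic also fluctuates more when the samples are small.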
v02-00 (2012-11-29-13h05m27s)
tau_absdeltaeta
tau_calcVars_EMFractionAtEMScale_moveE3
tau_etOverPtLeadTrk
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt
tau_TRTHTOverLT_LeadTrk
v01-00 (2012-11-29-04h46m15s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_ChPiEMEOverCaloEME
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_PSSFraction
tau_etOverPtLeadTrk
tau_seedCalo_isolFrac
tau_seedTrk_hadLeakEt
tau_seedTrk_secMaxStripEt (not used at high eta)
tau_TRTHTOverLT_LeadTrk (not used at high eta)
- Signal scores, background scores, and ROC curves
- Variables expected to offer minor gains (redundant or not powerful) were removed.
- Two trainings did not converge: pT 60--100 GeV, eta 2.00--3.00 and pT 100+ GeV, eta 2.00--3.00.
- Action item: Reduce variable list even further. Focus on specific kinematic region (inner barrel eta bin 0.00 - 0.80).
v00-00 (2012-11-28-08h22m31s)
tau_absdeltaeta
tau_absdeltaphi
tau_calcVars_corrFTrk
tau_calcVars_corrCentFrac
tau_calcVars_ChPiEMEOverCaloEME
tau_calcVars_EMFractionAtEMScale_moveE3
tau_calcVars_pi0BDTPrimaryScore
tau_calcVars_PSSFraction
tau_etOverPtLeadTrk
tau_seedCalo_centFrac
tau_seedCalo_hadRadius
tau_seedCalo_isolFrac
tau_seedTrk_secMaxStripEt (not used at high eta)
tau_seedTrk_sumEMCellEtOverLeadTrkPt
tau_TRTHTOverLT_LeadTrk (not used at high eta)
- First iteration
- Signal scores, background scores, and ROC curves
- All 15 variables were considered in all regions, except that secMaxStripEt and TRTHTOverLT_LeadTrk were excluded at high eta because they are unstable/undefined there.
- A handful of trainings did not converge.
- Among converged trainings, over-training is observed everywhere.
- Action item: Reduce variable list significantly in hopes that trainings will converge and over-training will be reduced. Re-introduce variables one-by-one to judge value.
Meetings
Meetings (previous e-veto)
Indico search for "Harvey Maddocks"
Links
-- AlexanderTuna - 27-Nov-2012