Responsible | Main.JohanSebastianBonilla |
Substructure Top Tagger (Part of Boosted Jet Taggers)
Boosted Jet Taggers: Official JSS Tagger Package
Introduction
In an effort to simplify the end-user interface for the various boosted object taggers supported by the JSS subgroup, all supported taggers are built into a common environment, IJSSTagger. The new interface is based on BoostedJetTaggers-00-00-25/ and is intended to be a collection of all ATLAS-supported taggers. These taggers are designed to identify the sources of hadronic decays in the ATLAS detector (W/Z/Higgs bosons, top quarks, q/g discrimination, etc.). The methods developed and supported include, but are not limited to:
- DNN (high-level & low-level variables)
- BDT
- 'basic' MVA (2 variable taggers, typically mass+substructure)
- Shower Deconstruction
Currently all necessary files are saved in the (BoostedJetTaggers) package, but as development continues, some of these data files (.xml, .json, .dat, .root) will be moved to the central area (SVN). Update to follow.
A JIRA page exists for the discussion of methodologies and conventions to be applied for all ATLAS taggers.
The IJSSTagger environment is used by the JSS Tagger Package, which contains the following boosted object taggers: Hbb Tagger, W/Z Tagger, top Tagger, NN Tagger, and BDT Tagger. This twiki describes how the SubstructureTopTagger is used in the IJSSTagger environment.
The JSS Tagger Interface
To unify the structure of all jet substructure taggers, a base class (interface) has been defined: IJSSTagger, which inherits from IJetSelector.
This interface ensures that all tagging algorithms follow the same structure (no matter who writes them) and users know what to expect from any tagger they implement.
The following functions are defined by the interface:
Function | Description |
StatusCode initialize() | Initialize all of the attributes of the tagger, set up the files to access and the working points |
int isTagged( xAOD::Jet jet ) | Get the tagging result: a return value of 0 means that all cuts PASSED |
StatusCode finalize() | Clear or delete anything necessary |
The initialize() function is called once per tagger instance, along with the constructor and its arguments. The isTagged() function is called in the execute part of your analysis and should be called once per large-R jet. The finalize() function is used in the finalize portion of the analysis and cleans residual memory from the IJSSTagger instance.
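The call order described above can be sketched with a small mock. This is only an illustration under stated assumptions: StatusCode, Jet, IJSSTaggerSketch, ToyMassTagger and countTaggedJets are hypothetical stand-ins written so the example compiles without the ATLAS software, and they are not the classes shipped in BoostedJetTaggers. The mock follows the table above in returning 0 when all cuts pass; the non-zero values it returns are arbitrary choices for this sketch.

```cpp
#include <vector>

// Hypothetical stand-ins so the sketch compiles on its own; in a real
// analysis these roles are played by StatusCode and xAOD::Jet.
enum class StatusCode { SUCCESS, FAILURE };
struct Jet { double pt; double eta; double m; };

// Mock with the same three functions as the interface table above.
class IJSSTaggerSketch {
public:
  virtual ~IJSSTaggerSketch() = default;
  virtual StatusCode initialize() = 0;
  virtual int isTagged(const Jet& jet) const = 0;
  virtual StatusCode finalize() = 0;
};

// Toy tagger with a single lower mass cut (assumption, not a real working point).
class ToyMassTagger : public IJSSTaggerSketch {
public:
  explicit ToyMassTagger(double minMass) : m_minMass(minMass) {}

  StatusCode initialize() override { m_ready = true; return StatusCode::SUCCESS; }

  // 0: all cuts passed; 1: cut failed; -1: tagger not initialized.
  // The values 1 and -1 are choices made for this sketch only.
  int isTagged(const Jet& jet) const override {
    if (!m_ready) return -1;
    return jet.m > m_minMass ? 0 : 1;
  }

  StatusCode finalize() override { m_ready = false; return StatusCode::SUCCESS; }

private:
  double m_minMass;
  bool m_ready = false;
};

// initialize() once, isTagged() once per large-R jet, finalize() once.
// Returns the number of jets passing, or -1 if initialization failed.
int countTaggedJets(IJSSTaggerSketch& tagger, const std::vector<Jet>& jets) {
  if (tagger.initialize() != StatusCode::SUCCESS) return -1;
  int nTagged = 0;
  for (const Jet& jet : jets) {
    if (tagger.isTagged(jet) == 0) ++nTagged;
  }
  tagger.finalize();
  return nTagged;
}
```

In an actual analysis the three calls live in the initialize, execute and finalize stages of the job rather than in one helper function; they are collected here only to show the ordering.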
- Current package located at
-
Header include:
#include "BoostedJetTaggers/IJSSTagger.h"
- Other package dependencies
Usage Recommendations 2016
Top Tagger Information
This tagger is recommended for use on AntiKt10LCTopoTrimmedPtFrac5SmallR20 jets with pT > 300 GeV and |eta| < 2.0.
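As a rough illustration, the kinematic part of this recommendation could be checked before calling the tagger. The helper name passesTopTagPreselection is hypothetical, and the sketch assumes pT is stored in MeV, which the MeV-scale mass values in the cut examples on this page (e.g. "120000<m") suggest but do not prove. The jet collection requirement is not checked here.

```cpp
#include <cmath>

// Hypothetical helper, not part of BoostedJetTaggers.
// Assumption: ptMeV is the jet pT in MeV, so 300 GeV = 300e3 MeV.
bool passesTopTagPreselection(double ptMeV, double eta) {
  constexpr double kMinPtMeV = 300.0e3;  // pT > 300 GeV
  constexpr double kMaxAbsEta = 2.0;     // |eta| < 2.0
  return ptMeV > kMinPtMeV && std::fabs(eta) < kMaxAbsEta;
}
```

For example, passesTopTagPreselection(350000.0, -1.5) is true, while a jet with pT of 250 GeV or |eta| of 2.1 is rejected.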
Initializing the Tool
In addition to initializing the IJSSTagger instance using the IJSSTagger::initialize() function, the TopTagger itself must be constructed. This is done with the configSubstTagger function, which has three forms:
- SubstructureTopTagger *t = configSubstTagger(tagName (string), taggingShortCut (string));
- tagName is the name of the tool and also what the tool will set as an attribute to tagged jets
- taggingShortCut can be one of the following keys
taggingShortCut | Description |
SmoothCut_50 | pT dependent cut maintaining ~50% signal efficiency |
SmoothCut_80 | pT dependent cut maintaining ~80% signal efficiency |
FixedCut_LowPt_50 | 50% efficiency in pT < 500 [GeV] region |
FixedCut_LowPt_80 | 80% efficiency in pT < 500 [GeV] region |
FixedCut_HighPt_50 | 50% efficiency in pT > 500 [GeV] region |
FixedCut_HighPt_80 | 80% efficiency in pT > 500 [GeV] region |
- SubstructureTopTagger *t = configSubstTagger(tagName (string), cutdescs (vector<string>));
- tagName is the name of the tool and also what the tool will set as an attribute to tagged jets
- cutdescs is a vector of strings, where each string describes a cut in the form "12.34<Att<45.67" OR "12.34<Att" OR "Att<45.67". The cuts will be applied using a logical AND.
- Att can be a jet attribute or one of the known keys:
- "abs(eta)"
- "Tau32_wta" (n-subjettiness ratio)
- "Tau21_wta" (n-subjettiness ratio)
- The string entry can also be one of the following available shortcuts
Shortcut | Description |
SmoothMassCut_50 | pT dependent cut maintaining ~50% signal efficiency |
SmoothMassCut_80 | pT dependent cut maintaining ~80% signal efficiency |
SmoothTau32Cut_50 | pT dependent cut maintaining ~50% signal efficiency |
SmoothTau32Cut_80 | pT dependent cut maintaining ~80% signal efficiency |
SmoothMassCut_50_loweta | pT dependent cut maintaining ~50% signal efficiency. Optimized for jets in |eta| < 1.0 region |
SmoothMassCut_50_higheta | pT dependent cut maintaining ~50% signal efficiency. Optimized for jets in 1.0 < |eta| < 2.0 region |
SmoothMassCut_80_loweta | pT dependent cut maintaining ~80% signal efficiency. Optimized for jets in |eta| < 1.0 region |
SmoothMassCut_80_higheta | pT dependent cut maintaining ~80% signal efficiency. Optimized for jets in 1.0 < |eta| < 2.0 region |
SmoothTau32Cut_50_loweta | pT dependent cut maintaining ~50% signal efficiency. Optimized for jets in |eta| < 1.0 region |
SmoothTau32Cut_50_higheta | pT dependent cut maintaining ~50% signal efficiency. Optimized for jets in 1.0 < |eta| < 2.0 region |
SmoothTau32Cut_80_loweta | pT dependent cut maintaining ~80% signal efficiency. Optimized for jets in |eta| < 1.0 region |
SmoothTau32Cut_80_higheta | pT dependent cut maintaining ~80% signal efficiency. Optimized for jets in 1.0 < |eta| < 2.0 region |
- Ex 1: SubstructureTopTagger * t = configSubstTagger("MassAndTau32Tag", {"120000<m", "Tau32_wta<0.61"} );
- Ex 2: SubstructureTopTagger * t = configSubstTagger("SmoothMTag", {"SmoothMassCut_50", "SmoothTau32Cut_50"} );
- SubstructureTopTagger *t = configSubstTagger(tagName (string), cutdescs (vector< vector<string> >));
- tagName is the name of the tool and also what the tool will set as an attribute to tagged jets
- cutdescs is a vector of vectors of strings. Within each inner vector, the string entries are treated as described in the previous bullet point and the cuts are combined using a logical AND. The results of the separate inner vectors are then combined using a logical OR.
- Ex: SubstructureTopTagger * t = configSubstTagger("MassAndTau32Tag",
{ {"abs(eta)<1","120000<m", "Tau32_wta<0.61"} ,
{"1<abs(eta)","120000<m", "Tau32_wta<0.61"} } );
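To make the cut-string semantics above concrete, here is a minimal sketch of how descriptions of the form "12.34<Att<45.67", "12.34<Att" or "Att<45.67" could be parsed and combined, with AND inside one vector of strings and OR across vectors. It is not the parser used by SubstructureTopTagger. JetAttributes, Cut, parseNumber, parseCut, attributeValue, passesAll and passesAny are all hypothetical names, the jet is modelled as a plain map from attribute name to value, only the "abs(eta)" known key is handled specially, and the named shortcuts (e.g. SmoothMassCut_50) are deliberately not supported.

```cpp
#include <cmath>
#include <cstdlib>
#include <limits>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a jet: attribute name -> value.
using JetAttributes = std::map<std::string, double>;

// One parsed cut: lo < att < hi, with infinite bounds when a side is absent.
struct Cut {
  std::string att;
  double lo = -std::numeric_limits<double>::infinity();
  double hi = std::numeric_limits<double>::infinity();
};

// True only if the whole token is a number.
bool parseNumber(const std::string& token, double& value) {
  char* end = nullptr;
  value = std::strtod(token.c_str(), &end);
  return !token.empty() && end == token.c_str() + token.size();
}

// Parse "lo<Att<hi", "lo<Att" or "Att<hi"; throws on anything else.
Cut parseCut(const std::string& desc) {
  std::vector<std::string> tokens;
  std::stringstream ss(desc);
  for (std::string t; std::getline(ss, t, '<');) tokens.push_back(t);

  Cut cut;
  double v = 0.0;
  if (tokens.size() == 3 && parseNumber(tokens[0], cut.lo) &&
      parseNumber(tokens[2], cut.hi)) {
    cut.att = tokens[1];
  } else if (tokens.size() == 2 && parseNumber(tokens[0], v)) {
    cut.lo = v;
    cut.att = tokens[1];
  } else if (tokens.size() == 2 && parseNumber(tokens[1], v)) {
    cut.hi = v;
    cut.att = tokens[0];
  } else {
    throw std::invalid_argument("bad cut description: " + desc);
  }
  return cut;
}

// Look up an attribute; "abs(eta)" is derived from "eta".
// Throws std::out_of_range if the attribute is missing.
double attributeValue(const JetAttributes& jet, const std::string& att) {
  if (att == "abs(eta)") return std::fabs(jet.at("eta"));
  return jet.at(att);
}

// Logical AND over all cuts in one vector of strings.
bool passesAll(const JetAttributes& jet, const std::vector<std::string>& cutdescs) {
  for (const std::string& desc : cutdescs) {
    const Cut cut = parseCut(desc);
    const double x = attributeValue(jet, cut.att);
    if (!(cut.lo < x && x < cut.hi)) return false;
  }
  return true;
}

// Logical OR over the separate vectors of strings.
bool passesAny(const JetAttributes& jet,
               const std::vector<std::vector<std::string>>& groups) {
  for (const auto& group : groups) {
    if (passesAll(jet, group)) return true;
  }
  return false;
}
```

With a jet described by eta = -1.4, m = 172000 and Tau32_wta = 0.45, the two-group example above is satisfied through its second group, since "abs(eta)<1" fails but "1<abs(eta)" passes.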
Tool Use, Jet Decoration, and Return Value
In the execute() portion of your analysis, the TopTagger tool that was initialized can be used. Because it inherits the same environment as the other taggers in the BoostedJetTaggers package, the tagging result is obtained by calling the isTagged( xAOD::Jet jet ) function.
Example:
int returnValue = m_SubstructureTopTagger.isTagged( jet ); // jet is the xAOD::Jet under test; run for each large-R jet
returnValue is an integer (0 or 1) denoting whether the large-R jet used as an argument is tagged or not. The isTagged() function may also decorate the argument jet. Is this true for SubstructureTopTagger?
Samples
Also took this from the old twiki. Are these up to date?
First look at available MC15 samples by Lily: here
Status for the Top-Tagging samples: here
For the signal: ttbar_allhad or ttbar_nonallhad in mtt slice, for several variations (in DC14 and MC15 for some variations)
For background: JZ*W samples (similar to W-tagging).
Signal samples
(I took this straight from the old twiki)
Editors contributors : Julien Caudron, James Ferrando
Samples preparation progress : started
contributors : Julien Caudron (+ Chris Delitzsch, Lily Asquith)
comments :
- Make sure that all the needed samples are available or requested.
Simple Top Tagger variables optimisation, ROC curve, WP definitions
progress : preliminary results
contributors : Dortmund, Grenoble
comments :
- Dortmund contribution using 14TeV NTUP_COMMON
- Grenoble study in xAOD
- Pile-up optimisation : also done by Dortmund. In this case, it's done with specific high pile-up samples, we can probably add this without update to xAOD
- Do we want to revisit the choice of variables according to their uncertainties ? (i.e., to choose a variable that is less efficient at rejecting background but that is less sensitive to the modeling)
Simple Top Tagger variables uncertainties
progress : preliminary results
contributors : Grenoble, Chris Delitzsch, James Ferrando
comments :
- Contributions from Grenoble, to be done with the correct samples, for the correct variations (work in progress)
- Run-1 uncertainties for relevant variables (Chris and James)
Simple Top Tagger scale factor and scale factor uncertainty
progress :
contributors :
comments :
Negligible variations cross-check
progress :
contributors : need contribution
comments :
- Low priority ?
- A few variations will not be included in the whole process because we expect their effect to be negligible. Only a few samples exist for them, and we can use those to check that these variations are indeed negligible.
Mass calibration for large-R jet
progress :
contributors : shared with W-tagging ?
comments :
- currently, the text will be written in the top-tagging paper repository, but it can move to the w-tagging paper if it's more relevant
Test of the Simple Top Tagger with the ttH signal
comments :
- proposed by the boosted ttH group, will be studied here.
-- JohanSebastianBonilla - 2016-11-24