V-tagger with jet mass de-correlation
A machine-learning-based V-boson tagger has been developed over the last few years. The tagger is trained on a set of substructure variables of UFO large-radius (R=1.0) jets, which are known to discriminate jets originating from a heavy two-prong resonance from jets originating from the usual QCD background (low-mass quarks/gluons).
Further details are given in a recent PUB note, ATL-PHYS-PUB-2021-029. Recall that the tagger is available for UFO jets with:
With respect to the published note, the network was re-trained to correct a couple of minor issues discovered in the very first analysis checks:
- bug in the calculation of the KtDr jet substructure variable
  - the KtDr calculation was not using a proper phi definition. The bug is now fixed in Athena.
  - this resulted in an unphysical dip at phi~0 in both the KtDr (phi) and the tagged-jet phi distributions.
- mass de-correlation at low jet pT
  - for the mass de-correlated tagger, a residual mass shaping was observed in the tagged QCD spectrum. This is unwanted behavior for analyses that rely on bump-hunting on smooth background mass distributions.
The latest developments were reported at the JetTag group: link1 link2
Implementation in Athena
The taggers are available in Athena through the BoostedJetTaggers package.
Input and configuration files for BoostedJetTaggers are available at
/afs/cern.ch/work/d/dmelini/public/Wtagger_KtDrFix/
The KtDr variable is one of the variables used in the training of the ML-based V-tagger. The distribution of KtDr (phi) before and after the fix in the KtDr calculation, for jets used in the training of the network, can be seen in the plot below. The calculation of KtDr was fixed in the tool that computes the jet variables and is no longer a problem if you are using a recent Athena version.
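As an illustration of why the phi definition matters for a variable such as KtDr, the sketch below computes an angular separation between two subjets with the azimuthal difference wrapped into [-pi, pi). It is a minimal sketch and not the Athena implementation: the function names are hypothetical, and it assumes that KtDr is built from the angular separation of two subjets and that the issue is of the phi-wrapping kind, which this page does not spell out.

```python
import math


def delta_phi(phi1, phi2):
    """Signed azimuthal difference phi1 - phi2, wrapped into [-pi, pi)."""
    # Shift by pi, reduce modulo 2*pi, shift back: this removes the
    # artificial 2*pi jump for angles on opposite sides of the +-pi boundary.
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2) using the wrapped d_phi."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Two hypothetical subjets close to each other across the phi = +-pi boundary.
naive = math.hypot(0.0, 3.1 - (-3.1))   # unwrapped difference, close to 2*pi
wrapped = delta_r(0.0, 3.1, 0.0, -3.1)  # wrapped difference, small
```

With an unwrapped difference, two nearby subjets on either side of the boundary appear almost 2*pi apart, which distorts any variable derived from their separation.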
Mass correlation at low pT
Attempts were made to reduce the shaping observed at low pT, unfortunately without much success. On the other hand, a more detailed investigation showed that many of the tagger working points are not strongly affected and that, in general, the mass shaping is much reduced for jet pT > 300 GeV.
We think this information could be of interest to analyses planning to use this tagger.
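For orientation, a working point (WP) quoted as a signal efficiency can be read as a cut on the tagger score chosen so that the stated fraction of signal jets passes. The sketch below derives such a cut separately in each pT bin. It is only an illustration: the function names and the input layout are hypothetical, and it assumes that the working points are defined as per-bin cuts at fixed signal efficiency.

```python
def score_cut_for_efficiency(signal_scores, target_eff):
    """Return a cut such that about target_eff of the signal has score >= cut."""
    if not signal_scores:
        raise ValueError("need at least one signal score")
    if not 0.0 < target_eff <= 1.0:
        raise ValueError("target_eff must be in (0, 1]")
    ordered = sorted(signal_scores, reverse=True)
    # Number of jets that should pass; at least one so the index is valid.
    n_pass = max(1, round(target_eff * len(ordered)))
    return ordered[n_pass - 1]


def cuts_per_pt_bin(signal_jets, pt_bins, target_eff):
    """signal_jets: list of (pt, score); returns {(lo, hi): cut or None}."""
    cuts = {}
    for lo, hi in pt_bins:
        scores = [s for pt, s in signal_jets if lo <= pt < hi]
        # Leave empty bins undefined instead of inventing a cut.
        cuts[(lo, hi)] = score_cut_for_efficiency(scores, target_eff) if scores else None
    return cuts
```

A jet in a given pT bin would then be tagged at that working point if its score is at least the cut for its bin.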
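| | 200 GeV < pT < 500 GeV | 230 GeV < pT < 280 GeV | 250 GeV < pT < 300 GeV | 300 GeV < pT < 500 GeV | 500 GeV < pT < 1000 GeV | 1000 GeV < pT < 2000 GeV |
| 50% sig.eff. WP | (plot) | (plot) | (plot) | (plot) | (plot) | (plot) |
| 60% sig.eff. WP | (plot) | (plot) | (plot) | (plot) | (plot) | (plot) |
| 70% sig.eff. WP | (plot) | (plot) | (plot) | (plot) | (plot) | (plot) |
| 80% sig.eff. WP | (plot) | (plot) | (plot) | (plot) | (plot) | (plot) |
| 90% sig.eff. WP | (plot) | (plot) | (plot) | (plot) | (plot) | (plot) |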
The purple lines are the distributions for the tagged QCD jets: the continuous purple line is for jets tagged with the NN tagger (expected to shape the mass distribution), and the dashed purple line is for jets tagged with the mass-decorrelated ANN tagger. Reference distributions of untagged QCD and W jets are also shown. The NN-tagged QCD is expected to peak around the W-boson mass, as the network essentially learns the jet substructure. The ANN-tagged QCD is instead expected not to learn the mass, at the cost of reduced tagging performance.
The mass distribution of ANN-tagged QCD jets shows a bias for pT < 300 GeV (1st, 2nd and 3rd columns), where some of the information on the W mass still seems to be learned. Depending on the needs of the analysis, this threshold can be lowered to 250 GeV (3rd column).
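One possible way to quantify such residual shaping is to compare, in each pT bin, the normalized mass histogram of the tagged QCD jets with that of all QCD jets. The sketch below uses the Jensen-Shannon divergence for this comparison. The choice of metric, the function names and the input layout (a list of (pT, mass, is_tagged) tuples) are assumptions made for illustration and are not part of the tagger or of BoostedJetTaggers.

```python
import math


def normalized_histogram(values, edges):
    """Fraction of values falling in each [edges[i], edges[i+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical shapes, at most 1."""
    def kl(a, b):
        # Bins with a == 0 contribute nothing; b is the mixture, so b > 0 when a > 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    mix = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


def mass_shaping(jets, pt_bins, mass_edges):
    """jets: list of (pt, mass, is_tagged); returns {(lo, hi): JSD or None}."""
    result = {}
    for lo, hi in pt_bins:
        in_bin = [(m, tagged) for pt, m, tagged in jets if lo <= pt < hi]
        all_masses = [m for m, _ in in_bin]
        tagged_masses = [m for m, tagged in in_bin if tagged]
        if not tagged_masses:
            result[(lo, hi)] = None  # nothing to compare in this bin
            continue
        result[(lo, hi)] = js_divergence(
            normalized_histogram(tagged_masses, mass_edges),
            normalized_histogram(all_masses, mass_edges),
        )
    return result
```

A value close to zero in a pT bin means the tagged and untagged mass shapes agree there, while larger values indicate that the tagger is sculpting the mass distribution in that bin.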
--
DavideMelini - 2023-03-29