Recent presentations in POG Meetings

  • 09/11/2020
  • 08/03/2021
  • 03/05/2021

Questions & comments received

Comments after 08/03/2021 presentation

* The data results (efficiency and fake-rate shape) look very much like the ones obtained from TTbar and not so much like DY, and you train on TTbar events. We propose adding isolation when you select loose muons, in order to reduce the number of fakes and to make sure that, if the data looks like TTbar, it is not because you have trained on this sample but rather because you have background.
Done, but no change is observed. The difference could come from the fact that for data we use the same cuts as for TTbar. More detail in the 3rd May presentation.

* Understand the difference in results between DY and TTbar (the TTbar efficiency is systematically higher). This is a closure test for the method. It might be due to the fact that you do not apply isolation to select the muons.
We don't believe the difference comes from the MVA, because it shows the same behaviour as the cut-based ID. It can come from the different composition of the samples and the different pt and eta shapes. More detail in the 3rd May presentation.

* Look at the results tables for different eta and pt ranges
Done, see slides 29, 30, 31 and 32 of the 3rd May presentation.

* Remove the new dz definition, which cannot help disentangle background from signal. It only allows a higher efficiency, so the question is rather whether this new cut should be part of the cut-based ID or not.

* Investigate step 3 on top of step 2, which looks to be worse. The bump at 0.2 in the discriminator has disappeared.
The bump disappears when the 'original' dz is not introduced in the training; it comes mainly from Medium muons. With step 3 on top of step 2 the bump also disappears (because it does not include the original dz).

* Add errors to the tables in order to check the consistency between the steps.

* Also, could you make a direct comparison between the steps?
Done, see slides 37, 38 and 39 of the 3rd May presentation.

* Slide 21: the efficiency for Tight is 0.0008. Is this a typo or is it real? If so, could you check what is going on?
This is because the comparison is not fair: in the cut-based ID the dz and dxy cuts are applied, whereas in the MVA ID they are not, so we need to cut at very high scores to reach the same fake rate, leading to a low efficiency. In the next slide, applying the dz and dxy cuts on top of the MVA ID, the efficiency looks fine. We have also tried training with more trees to check the robustness of the RF; increasing the number of trees, the results are similar: 1000 trees (eff = 0.0008) ---> 2000 (eff = 0.0005) ---> 5000 (eff = 0.0005).
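The working-point logic described here (cut the MVA score at whatever threshold gives a target fake rate, then read off the signal efficiency) can be illustrated with a small scikit-learn sketch. This is a toy stand-in only: the features, labels, and tree counts are hypothetical, not the actual muon ntuples or the study's Random Forest configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for signal (1) and fake (0) muons.
X, y = make_classification(n_samples=20000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
score = clf.predict_proba(X_te)[:, 1]

# Choose the score threshold that matches a target fake rate on background,
# then evaluate the signal efficiency at that working point.
target_fr = 0.01
bkg_scores = np.sort(score[y_te == 0])
thr = bkg_scores[int((1 - target_fr) * len(bkg_scores))]
eff = (score[y_te == 1] > thr).mean()   # signal efficiency at this cut
fr = (score[y_te == 0] > thr).mean()    # achieved fake rate (<= target)
print(f"threshold={thr:.3f}  efficiency={eff:.3f}  fake rate={fr:.4f}")
```

If the competing selection already applies extra cuts (here, dz and dxy in the cut-based ID), matching its fake rate forces a very high score threshold and hence a low efficiency, which is the effect seen on slide 21.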

* Could you add a definition of what you call non-primary? Is it also in data?
This is defined using gen matching, so of course it is only used in MC.

* If we remove dz from the training then maybe we can also remove dxy, since it has a lower weight in the BDT.

* Could you consider using a BDT instead of a random forest? We would like to think about the future maintenance of such an ID, and people in HEP are more familiar with BDTs.
We have spent time optimizing our model and we don't believe that a BDT brings any improvement in this study. In fact, an effort was made to also train BDTs with different algorithms, but the conclusion was that the change was not worth it. The Random Forest can be documented and added to CMSSW so that it is reproducible in the future, so we prefer to keep it.
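A head-to-head check of this kind can be sketched with scikit-learn, comparing a Random Forest against a gradient-boosted BDT on the same sample. This is a minimal illustration on hypothetical toy data, not a reproduction of the actual comparison performed in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical toy sample; the real study used muon ntuples.
X, y = make_classification(n_samples=10000, n_features=8, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

models = {
    "RandomForest": RandomForestClassifier(n_estimators=300, random_state=1),
    "BDT": GradientBoostingClassifier(n_estimators=300, random_state=1),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: ROC AUC = {auc:.3f}")
```

Comparing ROC AUCs (or efficiency at fixed fake rate) on held-out events is one way to document that switching model families brings no gain before committing either to CMSSW.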

-- ClaraRamonAlvarez - 2021-05-03

Topic revision: r2 - 2021-05-03 - AndreaTrapoteFernandez