Difference: SingleTopPolarization (107 vs. 108)

Revision 108 2015-01-19 - MatthiasKomm

Line: 1 to 1
 
META TOPICPARENT name="AndreaGiammanco"

Single Top Polarization analysis

Line: 31 to 31
  FCNC tgq interactions at leading order, from D0's paper:
Changed:
<
<
fcnc_tgq.jpg
>
>
fcnc_tgq.jpg
  The CDF (ref.1, ref.2) and ATLAS experiments optimize instead for the diagram qg->t and therefore they investigate the lepton + 1 jet topology, see figure below (from ATLAS):
Line: 69 to 69
 Samples with generic (non-left-handed) tWb couplings are being produced in Moscow.

Useful links

Changed:
<
<
>
>
 
Changed:
<
<
>
>
 
Changed:
<
<
>
>
 

General Work Plan

Line: 92 to 92
 
  • Produce FastSim samples for models with A < 100%
  • Set up statistics macros to extract A and set upper limits on non-SM models
  • Extract from dedicated control samples the abundance and the cos θ* shape for W + light jets, ttbar, QCD
Changed:
<
<
  • Evaluate all the systematics (see this wiki)
>
>
  • Evaluate all the systematics (see this wiki)
 
  • Get the results. There is no need to keep the analysis "blind" at this stage, as the main result will be the 8 TeV one. If we decide to use the full 7 TeV dataset, the additional statistics will be handled with the same work plan as for 8 TeV below.
Question to be answered once all data-driven background estimations and all systematics are included in the analysis: is the current definition of the θ* angle optimal for this analysis? What about using the "beam-line basis" instead of the "spectator basis"? (See Mahlon, Parke 2000)
  • Which of the two gives the best ΔA and the best limits when running on MC only? (Use pseudo-data diced from the overall MC expectation.)
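A minimal sketch of what "dicing" pseudo-data from the overall MC expectation could look like, assuming the expectation is given as a list of per-bin yields of the cos θ* histogram and that each bin fluctuates independently following a Poisson distribution. The function names and the example yields are hypothetical and are not taken from the analysis code.

```python
import math
import random


def poisson(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's method, fine for small means)."""
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def dice_pseudo_data(expected_yields, seed=None):
    """Fluctuate every bin of the MC expectation independently."""
    rng = random.Random(seed)
    return [poisson(mean, rng) for mean in expected_yields]


# Hypothetical per-bin MC expectation for a 4-bin cos(theta*) histogram.
expected = [12.5, 20.1, 31.7, 40.2]
toy = dice_pseudo_data(expected, seed=1)
print(toy)
```

Running the same ΔA and limit extraction on many such toys, once per basis definition, would give the comparison asked for above. For bins with very large expected yields a different Poisson generator would be needed, since exp(-mean) underflows.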
Line: 292 to 292
 

Naples ntuplizer

  • Naples code can be found here
Changed:
<
<
>
>
 CMSSW version: in the first stage, let's stick to 4_2_8 in order to reproduce Naples results. We will have to move to later releases (5_2_X) for the analysis of 2012 data. If we decide to perform the 7 TeV analysis with the full 2011 data set, moving to 4_4_4 is recommended (or to 5_2_X if a re-reco of the 2011 data and MC in this version is ready in time.)

Setting up SingleTop_52X on phys.hep.kbfi.ee

Line: 301 to 301
  export SCRAM_ARCH=slc5_amd64_gcc462
Changed:
<
<
Now follow the instructions, but instead of CMSSW_5_2_5 use CMSSW_5_2_5_patch1. The datafiles mentioned in the instructions are copied to /hdfs/local/stpol/sync_5_2_X
>
>
Now follow the instructions, but instead of CMSSW_5_2_5 use CMSSW_5_2_5_patch1. The datafiles mentioned in the instructions are copied to /hdfs/local/stpol/sync_5_2_X
 

Producing the trees using trees_wrapper_cfg.py

Line: 310 to 310
 cd CMSSW_4_2_8/src
 cvs co UserCode/STPol
 cp UserCode/STPol/util_scripts/trees_wrapper_cfg.py TopQuarkAnalysis/SingleTop/test/
Changed:
<
<
cd TopQuarkAnalysis/SingleTop/test/
>
>
cd TopQuarkAnalysis/SingleTop/test/
  Also copy the latest version of the file TChannel_cfg.py from the directory TopQuarkAnalysis/SingleTop/test/synch/
Changed:
<
<
cp synch/TChannel_cfg.py ./
>
>
cp synch/TChannel_cfg.py ./
  Now run the treemaker as
Changed:
<
<
cmsRun trees_wrapper_cfg.py inputFiles_load=infiles.txt outputFile=out.root maxEvents=-1 channel=CHAN
>
>
cmsRun trees_wrapper_cfg.py inputFiles_load=infiles.txt outputFile=out.root maxEvents=-1 channel=CHAN
  CHAN is the name found in the file SingleTopPSetsSummer _cfi.py with the Ele/Mu suffix removed. So when running on the TChannel ntuples, use channel=TChannel.
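The suffix rule can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and the assumption is that the names in the PSet file look like TChannelMu or TChannelEle.

```python
def channel_from_pset_name(pset_name):
    """Strip a trailing 'Ele' or 'Mu' to obtain the value for channel=..."""
    for suffix in ("Ele", "Mu"):
        if pset_name.endswith(suffix):
            return pset_name[: -len(suffix)]
    # Names without a lepton suffix are returned unchanged.
    return pset_name


print(channel_from_pset_name("TChannelMu"))  # prints "TChannel"
```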
Line: 371 to 369
 V00-04-11 RecoBTag/PerformanceDB
 V00-03-31 RecoEgamma/ElectronIdentification
 V03-03-05 RecoLuminosity/LumiDB
Changed:
<
<
SingleTop_42X TopQuarkAnalysis/SingleTop
>
>
SingleTop_42X TopQuarkAnalysis/SingleTop
 
Error occurred while creating for module of type 'SingleTopLeptonCounter' with label 'countLeptons'
StatusMismatch: Parameter 'minNumberTight' is designated as untracked in the code,
but is not designated as untracked in the configuration file.
Changed:
<
<
Please change the configuration file to 'untracked minNumberTight'.
>
>
Please change the configuration file to 'untracked minNumberTight'.
  Change the following things in the source files from untracked to tracked
Line: 388 to 384
  minTight_ = iConfig.getParameter<int>("minNumberTight");
  maxTight_ = iConfig.getParameter<int>("maxNumberTight");
  minLoose_ = iConfig.getParameter<int>("minNumberLoose");
Changed:
<
<
maxLoose_ = iConfig.getParameter<int>("maxNumberLoose");
>
>
maxLoose_ = iConfig.getParameter<int>("maxNumberLoose");
  python/SingleTopSelectors_cff.py
Line: 411 to 406
  maxNumberQCD = cms.untracked.int32(1),
  rejectOverlap = cms.untracked.bool(True),
  doQCD = cms.untracked.bool(True),
Changed:
<
<
)
>
>
)
  SelectionCuts_Skim_cff.py
Line: 420 to 414
 minTightLeptons = cms.int32(1)
 maxTightLeptons = cms.int32(99)
 minLooseLeptons = cms.int32(0)
Changed:
<
<
maxLooseLeptons = cms.int32(99)
>
>
maxLooseLeptons = cms.int32(99)
 
Error occurred while creating for module of type 'SingleTopSystematicsTreesDumper' with label 'TreesMu'
Error occurred while creating for module of type 'SingleTopSystematicsTreesDumper' with label 'TreesMu'
---- JetCorrectorParameters BEGIN
No definitions found!!!
Changed:
<
<

JetCorrectorParameters END
>
>

JetCorrectorParameters END
  You need to copy the file from CMSSW_4_2_8/src/TopQuarkAnalysis/SingleTop/test/JEC11_V12_AK5PF_UncertaintySources.txt to CMSSW_4_2_8/src/TopQuarkAnalysis/SingleTop/test/synch
Line: 437 to 429
  You need to set the LHAPATH environment variable
Changed:
<
<
export LHAPATH=/cvmfs/cms.cern.ch/slc5_amd64_gcc434/external/lhapdf/5.8.5-cms3/share/lhapdf/PDFsets
>
>
export LHAPATH=/cvmfs/cms.cern.ch/slc5_amd64_gcc434/external/lhapdf/5.8.5-cms3/share/lhapdf/PDFsets
 
python encountered the error: Path 'pathPreselection' contains a module of type 'FastjetJetProducer' which has no assigned label.
Line: 449 to 440
 process.selectedPatJetsForMETtype2Corr.src = cms.InputTag('selectedPatJets')
 process.patPFJetMETtype1p2Corr.type1JetPtThreshold = cms.double(10.0)
 process.patPFJetMETtype1p2Corr.skipEM = cms.bool(False)
Changed:
<
<
process.patPFJetMETtype1p2Corr.skipMuons = cms.bool(False)
>
>
process.patPFJetMETtype1p2Corr.skipMuons = cms.bool(False)
  and remove producePatPFMETCorrections from the path

process.pathPreselection = cms.Path(
        process.patseq #+  process.producePatPFMETCorrections
Changed:
<
<
)
>
>
)
 
crab weirdness introducing a lumi discrepancy
Added:
>
>
 Somehow, the results from crab -report and lumiCalc2.py are inconsistent. Diff between 83a02e9_Jul22 (Mario - old) and Aug4_c6a4b11 (Joosep - new). The former was used for the previous presentation at the single top meeting, and for the plots in the AN/PAS. The latter includes MET-PHI corrections, PU reweighting systematics, and top/ttbar reweighting by pt.
Line: 484 to 474
 
old WD_SingleMu3 /SingleMu/jpata-Jul16_7d17c5-7cb0fdcb434651e6fe30ffadc793c329/USER 5277
new ./Jul15/WD_SingleMu3 /SingleMu/jpata-Jul16_7d17c5-7cb0fdcb434651e6fe30ffadc793c329/USER 5319
Deleted:
<
<
 
Differences between WD_SingleMu2
Changed:
<
<
new
>
>
new
 
CMSSW.datasetpath : /SingleMu/joosep-Jul8_51f69b-7cb0fdcb434651e6fe30ffadc793c329/USER
CMSSW.dbs_url : https://cmsdbsprod.cern.ch:8443/cms_dbs_ph_analysis_02_writer/servlet/DBSServlet
Line: 500 to 491
 Total Jobs : 528
 Luminosity section summary file: /home/joosep/singletop/stpol/crabs/Aug4_c6a4b11/step2/data/iso/Jul15/WD_SingleMu2/res/lumiSummary.json
 # Jobs: Done:1
Changed:
<
<
# Jobs: Retrieved:527
>
>
# Jobs: Retrieved:527
 
Added:
>
>
old
 
Deleted:
<
<
old
 
CMSSW.datasetpath : /SingleMu/joosep-Jul8_51f69b-7cb0fdcb434651e6fe30ffadc793c329/USER
CMSSW.dbs_url : https://cmsdbsprod.cern.ch:8443/cms_dbs_ph_analysis_02_writer/servlet/DBSServlet
Line: 518 to 508
 Total Jobs : 520
 Luminosity section summary file: /home/mario/Summer13/stpol/crabs/83a02e9_Jul22/step2/data/iso/Jul15/WD_SingleMu2/res/lumiSummary.json
 # Jobs: Done:3
Changed:
<
<
# Jobs: Retrieved:517
>
>
# Jobs: Retrieved:517
  So somehow, the number of lumis is the same, but the number of events read is different, and therefore so is the final luminosity! How can that be?
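One way to cross-check the claim that the lumi sections are identical is to compare the two lumiSummary.json files directly. The sketch below assumes the usual compact layout of these files, a mapping from run number to a list of [first, last] lumi ranges. The function names and the run numbers in the example are hypothetical.

```python
import json


def expand_lumis(summary):
    """Turn {"run": [[first, last], ...]} into a set of (run, lumi) pairs."""
    pairs = set()
    for run, ranges in summary.items():
        for first, last in ranges:
            for lumi in range(first, last + 1):
                pairs.add((int(run), lumi))
    return pairs


def compare_summaries(old_summary, new_summary):
    """Return (lumis only in old, lumis only in new), both sorted."""
    old_pairs = expand_lumis(old_summary)
    new_pairs = expand_lumis(new_summary)
    return sorted(old_pairs - new_pairs), sorted(new_pairs - old_pairs)


def load_summary(path):
    """Read a lumiSummary.json file from disk."""
    with open(path) as handle:
        return json.load(handle)


# Inline dictionaries stand in for the two files here.
old = {"190645": [[10, 12]]}
new = {"190645": [[10, 11]], "190646": [[1, 1]]}
only_old, only_new = compare_summaries(old, new)
print(only_old, only_new)
```

If both returned lists are empty for the two WD_SingleMu2 summaries, the lumi content really is the same and the discrepancy has to come from the number of events read per lumi section.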
Line: 547 to 537
  If you are already working on an old version of this directory and you know there are updates, type cvs update.
Changed:
<
<
See also this how to
>
>
See also this how to
  (We will want to use "tags" at some point, but let's start with the basics...)
Line: 559 to 549
 export LCG_GFAL_INFOSYS=bdii.balticgrid.org:2170

source /opt/software/cms/cmsset_default.sh

Changed:
<
<
export CVSROOT=:ext:mario@cmscvs.cern.ch:/cvs_server/repositories/CMSSW
>
>
export CVSROOT=:ext:mario@cmscvs.cern.ch:/cvs_server/repositories/CMSSW
  replacing mario with your CERN username in the CVSROOT environment variable. To use CRAB, first run cmsenv in some CMSSW software area and then source it:
Line: 583 to 573
 use_server = 0

[GRID]

Changed:
<
<
se_white_list = kbfi
>
>
se_white_list = kbfi
 

GRID with local submission from *.hep.kbfi.ee

Added:
>
>
 You need to use a modified version of CRAB:

source /opt/software/CRAB2/crab.sh

Line: 601 to 591
 [PBSV2WITHSRM]
 forceTransferFiles = 1
 workernodebase = /home/USERNAME
Changed:
<
<
use_proxy = 1
>
>
use_proxy = 1
  You can check whether the jobs are running using qstat.

Datasets

Line: 611 to 600
 
dbs --search --query='find file where dataset like /T_TuneZ2_t-channel_7TeV-powheg-tauola/Summer11-PU_S4_START42_V11-v1/AODSIM'
Changed:
<
<
This lists the names of all the files corresponding to that dataset. Type dbs --help to know more. See also these instructions.
>
>
This lists the names of all the files corresponding to that dataset. Type dbs --help to know more. See also these instructions.
 
Changed:
<
<
Accessing the desired run range in real data requires the use of JSON files. Their use is explained here (and links within). The repository of officially validated JSON files is here.
>
>
Accessing the desired run range in real data requires the use of JSON files. Their use is explained here (and links within). The repository of officially validated JSON files is here.
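As an illustration of how such a JSON file constrains the data, the sketch below checks whether a given run and lumi section is covered. It assumes the common layout in which each run number maps to a list of [first, last] lumi ranges. The function name and the run numbers are hypothetical and do not come from an official JSON file.

```python
def is_certified(run, lumi, certified):
    """Return True if (run, lumi) falls inside one of the certified ranges."""
    for first, last in certified.get(str(run), []):
        if first <= lumi <= last:
            return True
    return False


# Hypothetical content of a certification JSON file, already parsed.
certified = {"190645": [[10, 110]], "190646": [[1, 111]]}

print(is_certified(190645, 50, certified))   # inside a listed range
print(is_certified(190645, 5, certified))    # before the first listed lumi
print(is_certified(190700, 1, certified))    # run not listed at all
```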
  Checking for local datasets in Tallinn using the DAS CLI
Line: 633 to 622
 

8 TeV analysis:

Changed:
<
<
>
>
 

5_3 datasets:

/T_t-channel_TuneZ2star_8TeV-powheg-tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/T_t-channel_TuneZ2star_8TeV-powheg-tauola/Summer12_DR53X-PU_S10_START53_V7A-v2/AODSIM
/Tbar_t-channel_TuneZ2star_8TeV-powheg-tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/T_tW-channel-DR_TuneZ2star_8TeV-powheg-tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/Tbar_tW-channel-DR_TuneZ2star_8TeV-powheg-tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/T_s-channel_TuneZ2star_8TeV-powheg-tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/Tbar_s-channel_TuneZ2star_8TeV-powheg-tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM

/TTJets_MassiveBinDECAY_TuneZ2star_8TeV-madgraph-tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/WJetsToLNu_TuneZ2Star_8TeV-madgraph-tarball/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/WJetsToLNu_TuneZ2Star_8TeV-madgraph-tarball/Summer12_DR53X-PU_S10_START53_V7A-v2/AODSIM
/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM

/WW_TuneZ2star_8TeV_pythia6_tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/WZ_TuneZ2star_8TeV_pythia6_tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/ZZ_TuneZ2star_8TeV_pythia6_tauola/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM

/QCD_Pt_20_MuEnrichedPt_15_TuneZ2star_8TeV_pythia6/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM

/QCD_Pt_20_30_BCtoE_TuneZ2star_8TeV_pythia6/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/QCD_Pt_30_80_BCtoE_TuneZ2star_8TeV_pythia6/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/QCD_Pt_80_170_BCtoE_TuneZ2star_8TeV_pythia6/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/QCD_Pt_170_250_BCtoE_TuneZ2star_8TeV_pythia6/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/QCD_Pt_250_350_BCtoE_TuneZ2star_8TeV_pythia6/Summer12_DR53X-PU_S10_START53_V7A-v1/AODSIM
/QCD_Pt_350_BCtoE_TuneZ2star_8TeV_pythia6/Summer12_DR53X-PU_S10_START53_V7A-v2/AODSIM

Line: 682 to 671
 
voms-proxy-init -voms cms
lcg-ls -b -D srmv2 -T srmv2 "srm://cmsse02.na.infn.it:8446/srm/managerv2?SFN=/dpm/na.infn.it/home/cms/store/user/oiorio/2012/Summer12/MergedJul24/"
Changed:
<
<
lcg-cp -b -D srmv2 -T srmv2 "srm://cmsse02.na.infn.it:8446/srm/managerv2?SFN=/dpm/na.infn.it/home/cms/store/user/oiorio/2012/Summer12/MergedJul24/remote_file.root" /path/to/local/file.root
>
>
lcg-cp -b -D srmv2 -T srmv2 "srm://cmsse02.na.infn.it:8446/srm/managerv2?SFN=/dpm/na.infn.it/home/cms/store/user/oiorio/2012/Summer12/MergedJul24/remote_file.root" /path/to/local/file.root
 

7 TeV analysis

Line: 706 to 694
 /WJetsToLNu_TuneZ2_7TeV-madgraph-tauola
 /DYJetsToLL_TuneZ2_M-50_7TeV-madgraph-tauola
 /QCD_Pt-20_MuEnrichedPt-15_TuneZ2_7TeV-pythia6
Changed:
<
<
/QCD_Pt-80to170_EMEnriched_TuneZ2_7TeV-pythia7
>
>
/QCD_Pt-80to170_EMEnriched_TuneZ2_7TeV-pythia7
  And the ones missing are:
Line: 721 to 708
 /QCD_Pt-30to80_EMEnriched_TuneZ2_7TeV-pythia6
 /GJets_TuneD6T_HT-40To100_7TeV-madgraph
 /GJets_TuneD6T_HT-100To200_7TeV-madgraph
Changed:
<
<
/GJets_TuneD7T_HT-200_7TeV-madgraph
>
>
/GJets_TuneD7T_HT-200_7TeV-madgraph
  All the MC datasets processed with
 SingleTopMC_PF2PAT_cfg.py 
Line: 732 to 718
 FCNC samples:

  • t,j -> b,l,nu,j with tug coupling.(MCDB 3655)
Deleted:
<
<
 
/TJetToBLNuJet_FCNC_tug_TuneZ2_7TeV-comphep-EDM/dkonst-TJetToBLNuJet_FCNC_tug_TuneZ2_7TeV-comphep-FASTSIM-92a8e0ecc98e6ae221fc036cdde0c771/USER
Changed:
<
<
dbs_url= https://cmsdbsprod.cern.ch:8443/cms_dbs_ph_analysis_02_writer/servlet/DBSServlet
>
>
dbs_url= https://cmsdbsprod.cern.ch:8443/cms_dbs_ph_analysis_02_writer/servlet/DBSServlet
 
  • t,j -> b,l,nu,j with tcg coupling.(MCDB 3655) 1 job is still running.
Line: 796 to 778
 currently located at https://twiki.cern.ch/twiki/bin/view/CMS/Internal/PubGuidelines, and the publications wiki page is
Changed:
<
<
https://twiki.cern.ch/twiki/bin/viewauth/CMS/Internal/Publications
>
>
https://twiki.cern.ch/twiki/bin/viewauth/CMS/Internal/Publications
 

Analysis Note 2012/448

Deleted:
<
<
 
> svn co -N svn+ssh://svn.cern.ch/reps/tdr2 myDir # where myDir is a placeholder for a name of your choice
> cd myDir
Line: 816 to 796
 > svn commit -m "commit message"
New files will first need to be added with > svn add NewFileNames
Changed:
<
<
before they can be committed.
>
>
before they can be committed.
  Note: I committed a script MAKENOTE for compiling without having to remember the exact command line.
Line: 838 to 816
 > svn commit -m "commit message"
New files will first need to be added with > svn add NewFileNames
Changed:
<
<
before they can be committed.
>
>
before they can be committed.
  Note: I committed a script MAKEPAS for compiling without having to remember the exact command line.
Line: 850 to 826
  PasTop13001QA
Added:
>
>
PaperTop13001QA
 

To-do-list towards the final paper

  • Why so much QCD in muon channel? [Joosep]
Line: 870 to 847
  From the ARC, Sep.26:
Changed:
<
<
     ---> The differences in event selection between the electron and muon channels are inducing some strong differences between the two channels (MT vs. MET); for example, the tight MET cut on electrons induces high sensitivity to some systematics like JER. For publication, we would like to suggest harmonizing the selection between the two channels, possibly using a cut on MT for both channels.

     ---> There is some lack of statistics in some systematic samples. These statistical fluctuations seem to affect the electron channel more, as its selection efficiency is lower. For publication, we would like to suggest producing more MC events where needed (hoping there are enough computing resources).

    ---> While a conservative approach was followed by the authors, we would like to see some more investigation of the mis-modeling of the costheta* distribution by madgraph. For publication, we would like to suggest working on a better understanding of this mis-modeling, possibly through detailed MC studies/comparisons (comparisons with other generators, effects of matching, propagation of spin information, etc.).

    ---> Concerning TopFit, the correlations of measurements in the limit calculation are neglected. This is a feature of the program (analysts have no control over it). This assumption, which is also made in the W-helicity measurements by ATLAS and CMS if I understood properly, is made clear in the text. For publication, we would like to see with the authors and Aguilar whether correlations can be introduced in a reasonable amount of time.

    ---> We would also suggest to investigate the reliability of jet-ID up to |eta|<4.5 and possibly (re-)optimize the "root-mean-square particle-jets deltaR" selection.

    ---> The use of the CSV tagger should help to remove more backgrounds, with a possible increase in signal statistics. A determination of the best working point might be needed.

    ---> We understood that the BDT selection would benefit from a re-optimization.

    ---> As discussed (and proposed) by the authors, the QCD background normalization in the second background fit should be fixed to the estimation of the first background fit.

    ---> In the combination of the top polarization, it might help to investigate the correlations between the systematics better.

    ---> Some synchronization with the W helicity in single-top could be investigated.
>
>
     ---> The differences in event selection between the electron and muon channels are inducing some strong differences between the two channels (MT vs. MET); for example, the tight MET cut on electrons induces high sensitivity to some systematics like JER. For publication, we would like to suggest harmonizing the selection between the two channels, possibly using a cut on MT for both channels.
     ---> There is some lack of statistics in some systematic samples. These statistical fluctuations seem to affect the electron channel more, as its selection efficiency is lower. For publication, we would like to suggest producing more MC events where needed (hoping there are enough computing resources).
    ---> While a conservative approach was followed by the authors, we would like to see some more investigation of the mis-modeling of the costheta* distribution by madgraph. For publication, we would like to suggest working on a better understanding of this mis-modeling, possibly through detailed MC studies/comparisons (comparisons with other generators, effects of matching, propagation of spin information, etc.).
    ---> Concerning TopFit, the correlations of measurements in the limit calculation are neglected. This is a feature of the program (analysts have no control over it). This assumption, which is also made in the W-helicity measurements by ATLAS and CMS if I understood properly, is made clear in the text. For publication, we would like to see with the authors and Aguilar whether correlations can be introduced in a reasonable amount of time.
    ---> We would also suggest to investigate the reliability of jet-ID up to |eta|<4.5 and possibly (re-)optimize the "root-mean-square particle-jets deltaR" selection.
    ---> The use of the CSV tagger should help to remove more backgrounds, with a possible increase in signal statistics. A determination of the best working point might be needed.
    ---> We understood that the BDT selection would benefit from a re-optimization.
    ---> As discussed (and proposed) by the authors, the QCD background normalization in the second background fit should be fixed to the estimation of the first background fit.
    ---> In the combination of the top polarization, it might help to investigate the correlations between the systematics better.
    ---> Some synchronization with the W helicity in single-top could be investigated.

 

From Jeremy, Sep.27:

Changed:
<
<
That would be great if you

>
>
That would be great if you

 could at least redo the nice analysis from Nadjieh. In particular,
Changed:
<
<
instead of inverting the isolation cut on electron isolation >0.1, one
>
>
instead of inverting the isolation cut on electron isolation >0.1, one
 could try to investigate how the mTW bias is behaving by bins of
Changed:
<
<
isolation, like 0.1 < iso < X. There might be some intervals with smaller bias.

Also, the fact that Nadjieh is looking at the 2jets0tag category make

>
>
isolation, like 0.1 < iso < X. There might be some intervals with smaller bias.
Also, the fact that Nadjieh is looking at the 2jets0tag category make

 the sample enriched in jet reconstructed as electrons, while there could be a significant effect of btagging for the fraction of non-prompt electron from heavy hadron decays. The bias can be smaller in the signal region.
Changed:
<
<

One could also investigate a combination of a loose MET cut and a

>
>
One could also investigate a combination of a loose MET cut and a

 tighter mWT cut, which would have to be optimized.
Line: 933 to 875
 tighter mWT cut, which would have to be optimized.
Deleted:
<
<
 From Jeannine, August 8, 2014
Changed:
<
<
1) I've only noticed several issues/problems with the QCD modeling:

>
>
1) I've only noticed several issues/problems with the QCD modeling:

 a) Looking at Figure 27 it makes no sense to have one QCD template and one template for non-QCD processes. There is quite some separation power for DY, EW V production,top. So, I suggest to perform a 4 template fit, giving the SM processes the usual width to float. Please report also the scale factors for the SM process.
Changed:
<
<
b) Looking at Figure 28, in particular at BDT_antiQCD in the region above the cut value, the contamination from non-QCD processes is by far too high. So we basically have no idea how to extrapolate from the cut region, as there is certainly a sizable uncertainty on the contamination. How can we trust the QCD estimation in the BDT_antiQCD>cut region given this contamination issue? Furthermore, how can we trust any QCD shape in the region BDT_antiQCD>cut and, even worse, the correlations between variables as needed for the selection BDT? How can we trust BDT_W,tt for QCD?
>
>
b) Looking at Figure 28, in particular at BDT_antiQCD in the region above the cut value, the contamination from non-QCD processes is by far too high. So we basically have no idea how to extrapolate from the cut region, as there is certainly a sizable uncertainty on the contamination. How can we trust the QCD estimation in the BDT_antiQCD>cut region given this contamination issue? Furthermore, how can we trust any QCD shape in the region BDT_antiQCD>cut and, even worse, the correlations between variables as needed for the selection BDT? How can we trust BDT_W,tt for QCD?
 c) Concerning table 7-10, the number that really matters is the number of QCD events (plus uncertainty) after applying the BDT_anti-QCD cut. How does the number change when the non-QCD contamination is altered? How does it change when the QCD MC (isolated region) is used?
 d) How can we trust the cosTheta* shape of QCD at all? Does it possibly peak at -1? How can we exclude that? Just thinking out loud: would it help to use QCD MC (isolated, anti-iso sel) in the 2j0t region with different cuts on BDT_antiQCD (e.g. 0...0.6 in steps of 0.1)? Furthermore, could we learn something from ttbar all-hadronic events (jet mimics a lepton), for example by doing the same check as for QCD MC?
 The W+jets modeling was already carefully attacked in the PAS and the new studies will certainly add more knowledge about the W+jets mismodeling, so I have no comments on this right now.
Added:
>
>
2) One comment on the BDT trainings, figures 5 and 11: for both BDTs it seems that there is overtraining. In the case of BDT_anti-QCD this is true for signal (KS-test < 5%); in the case of BDT_W,tt this is true for background (KS test=0). Is it feasible to find a BDT setting that does not overtrain?
3) Looking at Figure 9 it seems that the BDT_W,tt output has a small peak in the signal region. That is something one would like to avoid. Which background (tW, QCD, Q+jets, ttbar) causes this peak?

 
Deleted:
<
<
2) One comment on the BDT trainings, figures 5 and 11: for both BDTs it seems that there is overtraining. In the case of BDT_anti-QCD this is true for signal (KS-test < 5%); in the case of BDT_W,tt this is true for background (KS test=0). Is it feasible to find a BDT setting that does not overtrain?

3) Looking at Figure 9 it seems that the BDT_W,tt output has a small peak in the signal region. That is something one would like to avoid. Which background (tW, QCD, Q+jets, ttbar) causes this peak?
 On Fig. 9, only ttbar and W+jets are included in the background. The templates for all subcomponents will be plotted. In general, this "second peak" was discussed some time ago; the reason seems to be that for some events, the BDT is unable to distinguish them from signal and the gradient boosting does not reweight those trees down by a large enough factor. The style (hatching) of Fig. 9 will also be changed.
Changed:
<
<
4) Figures 24 and 25 (BDT_W,tt in the 2j0t and the 3j2t regions) look OK, as the observed deviation is covered by systematics. It would be nice to have, at least in the appendix, the data-MC comparison for all BDT_W,tt input variables. In principle the correlations among the most important input variables, and between them and BDT_W,tt, also have to be checked: are they the same for data and MC? (See the suggestion from Andrea: check the correlation between MT and BDT.)

5) Fitting:

>
>
4) Figures 24 and 25 (BDT_W,tt in the 2j0t and the 3j2t regions) look OK, as the observed deviation is covered by systematics. It would be nice to have, at least in the appendix, the data-MC comparison for all BDT_W,tt input variables. In principle the correlations among the most important input variables, and between them and BDT_W,tt, also have to be checked: are they the same for data and MC? (See the suggestion from Andrea: check the correlation between MT and BDT.)
5) Fitting:

 a) The W+jets template is a bit spiky. What subset causes the spikes? Can we safely ignore this part (e.g. Wc+1p) without introducing a kin. bias? The current smoothing studies are a good idea, I think. b) the single top scale factor for mu is 1.22. How does this compare to the published single top cross section measurement at 8TeV? Is it consistent?
Line: 964 to 895
 a) The W+jets template is a bit spiky. What subset causes the spikes? Can we safely ignore this part (e.g. Wc+1p) without introducing a kin. bias? The current smoothing studies are a good idea, I think. b) the single top scale factor for mu is 1.22. How does this compare to the published single top cross section measurement at 8TeV? Is it consistent?
Added:
>
>
 Joosep will plot the subcomponent templates; however, it's mostly just an issue of nominal MC becoming depleted also in W+2,3, for which we have no excellent approach.
Changed:
<
<
6) Correlation of BDT_W,tt and cosTheta*:

>
>
6) Correlation of BDT_W,tt and cosTheta*:

 Looking at figure 48 and 49 it is clear that a cut on BDT_W,tt results in ttbar and W+jets shapes that look more single top like. I think many variables used in the BDT_W,tt are correlated to cosTheta*, hence the correlation between BDT_W,tt and cosTheta* is even stronger for the BDT output. As long as the correlation between BDT_W,tt and cosTheta* (and better also the correlation of all variables entering the BDT to cosTheta*) is in data the same as predicted this is ok. However, this assumption has to be carefully checked in different control regions. I suggest to extend the MTW-BDT correlation study suggested today by Andrea towards cosTheta* and the BDT_W,tt output and its input variables and also towards different control regions. Furthermore, I suggest to show data-mc plots for cosTheta* in the 2j0t and 3j2t region for different BDT_W,tt cut values (do we always get reasonable data MC agreement?).
Added:
>
>
 Joosep will add additional plots with cut points.
Changed:
<
<
7) Comphep study and neyman construction:

>
>
7) Comphep study and neyman construction:

 It seems that the difference between Powheg and Comphep SM is, for some distributions, larger than the difference between the anomalous coupling samples. How is the Neyman construction done? Does it use Comphep SM for the unfolding, or Powheg? Is the use of Powheg in the migration matrix the reason why there is a bias for the SM case, although the pull distributions are all fine for the SM case?
 