TO DO list

  • ALERT: Missing ingredient: trigger weights are not included in the analysis (due to a misunderstanding, my fault). Each Monte Carlo event should be weighted to take the trigger into account. The MC samples do not include the trigger, so we need to weight each MC event with a function that tries to reproduce the effect of the trigger in data (see the sketch after this list).
  • Cross-section for WZ->3l using the generation cut 60 < Z(M_{ll}) < 120: I need to introduce a generic function (in AnalysisBase) to deal with this kind of generation cut DONE
  • Use of the Blue Method to get a total cross section using the per channel cross-sections DONE
  • Deal with the systematics: define a list of systematics, decide or find out whether I can use the final numbers from other analyses (HWW, WW) or calculate them myself. DONE
  • Closure Test with TTbar using just the PPF subsample (the 2-leptonic decay only) DONE, but results unexpected...
  • Plot the ET distribution of the leading jet in the ttbar sample. Choose the fake rate map whose ET peak is closest to the one obtained from the distribution in the ttbar sample. Why? Because the ttbar is the major contribution (to be redone, but with v04.00 it was about 55%)
  • Evaluate the Tight-NoTight-NoTight contribution DONE, negligible
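A minimal sketch of how such a per-event trigger weight could look (my assumption, not the final implementation; eff_mu and eff_el are hypothetical parameterizations of the data trigger efficiency versus pt and eta):

eff_mu = lambda pt, eta: 0.92   # hypothetical flat placeholder
eff_el = lambda pt, eta: 0.95   # hypothetical flat placeholder

def trigger_weight(leptons):
    # Event weight = P(at least one selected lepton fires the trigger)
    # 'leptons' is a list of (flavour, pt, eta) tuples
    p_none = 1.0
    for flavour, pt, eta in leptons:
        eff = eff_mu(pt, eta) if flavour == "mu" else eff_el(pt, eta)
        p_none *= (1.0 - eff)
    return 1.0 - p_none

# Example: a 2mu1e event
print trigger_weight([("mu", 30., 0.5), ("mu", 25., -1.2), ("el", 22., 0.8)])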

SOME WARNINGS:


Analysis code

  • Analysis note: http://cms.cern.ch/iCMS/jsp/openfile.jsp?tp=draft&files=AN2012_090_v1.pdf
  • Code
    • Current Stable release: v04.03
    • Darcs repo -- nightly build distribution
    • CVS repo -- last stable
    • Source documentation
    • Technicality: to update the CERN CVS you have to run darcs pull in the gridui directory /gpfs/csic_users/duarte/UserCode/JordiDuarte/AnalysisVH and then update the CVS (remember to run setkrbcern beforehand), i.e. cvs commit and cvs tag `darcs show tags|head -1`. This update has to be done every time we release new code, that is, every time we put a new tag in darcs. Note that, due to the directory migration in the CERN CVS, whenever you hit the error telling you to use migrate-cvsroot, run the following command (note that the application directory is the one where you do the commits):
      /cvmfs/cms.cern.ch/common/migrate-cvsroot /home/duarte/UserCode/JordiDuarte
      


SVN guide for Analysis note

Setting up: first time only

  • svn tdr help twiki
  • Non-lxplus machine (my Debian, e.g.): needs the package libwww-perl (LWP package)
> svn co -N svn+ssh://svn.cern.ch/reps/tdr2 analysisnotes
> cd analysisnotes
> svn update utils
> svn update -N notes
> svn update notes/AN-12-090
> eval `notes/tdr runtime -sh`   # assuming bash (if csh then change -sh by -csh)
> cd notes/AN-12-090/trunk
# (edit the template, then build the document; use 'nodraft' if you want the final note)
> make [nodraft]

Folder structure

The note is split into sections and each section corresponds to a .tex file (which is included in the main document via the input mechanism). So, the folder structure is as follows:
  • AN-12-090.tex parent file containing the title, authors, abstract, etc., and the \input statements for the section files
  • XX-titleofthesection.tex contains the content of each section. XX stands for the section number (zero-padded on the left)
  • figures folder where the .pdf (or .png) figures are stored. For the distribution plots of each channel there are subfolders named after the channel, containing the figures
  • tables same as figures but for tex tables
  • AN-12-090.bib bibliography

Editing the note

We are going to edit using the svn locking mechanism in order to avoid conflicts and interference between editors. The lock mechanism serializes the work, i.e. no one can edit a file while it is locked by somebody else.
A protocol has been established to edit the note. The following procedure has to be followed:
  1. Update the working directory (i.e. notes/AN-12-090)
    > cd notes/AN-12-090 
    > svn update 
  2. Choose the file you are going to edit (or create) and lock that file (for instance: 0X-systematics.tex):
    > svn lock 0X-systematics.tex -m "Some informative text (not mandatory but recommended)"
  3. After editing your file and just before committing, update the working directory again to avoid out-of-date conflicts
    > svn update 
  4. Unlock the file
    • Publish the changes: do the commit, and the file is then automatically unlocked
      > svn commit -m "Summarized description of the changes" 
    • Or, if you did not edit anything and no commit is needed, unlock the file (important)
      > svn unlock 0X-systematics.tex 
The final versions, i.e. versions to be released to iCMS, will be tagged as 'vX' (stands for version-X) and each tagged version is put in the 'tags' directory. So remember to always work in the 'trunk' directory.

Some Comments

  • Be sure to check the spelling before committing the changes; on Linux this can be done with (use american or british):
     ispell -d american|british AN-12-090.tex 
  • The CMS AN note citations have been changed in order to follow the standard CMS bib format. The format for the cite command is CMS_AN_XXXX-YYY, where XXXX is the year and YYY the number of the Analysis Note
  • The commands available by default are listed in the PDF output example under PTDR Symbol Definitions and Particle Symbols. Any new command that was needed has been defined in the master file (AN-12-090.tex)

VH->WWW Working Notes

Working Notes

29 March 2012 Composition of the no-Tight lepton for the mixed channels
Counting the flavour contribution for each mixed channel after all cuts, I found:

          noTight electrons   noTight muons   total sample
  2mu1e   92%                 8%              2124
  1mu2e   30%                 70%             396

So, it seems that the no-Tight lepton is more likely to be the W-candidate one. The Z-candidate lepton tagged as no-Tight is found more often in the electron case, 30% versus 8% for the muons.

The mumu Z-candidate channel contains more events than the Zee --> due to the fact that the muon efficiency is greater than the electron efficiency
The mu from W is proportionally more frequent than the electron from W --> due to the fact that the electron purity is greater than the muon purity

So, when Marta and I discussed the possibility that the fake rate estimation is the cause of the difference between the Fake sample yields in the mixed channels (2mu1e: 7 fake events estimated, 2e1mu: 3.8 fake events estimated), we decided that it might be possible if the electron were always the main noTight lepton in every sample, BUT as I showed, this is not the case. The differences in yields are just related to the fact that muons are more efficient (although with a higher fake rate, so less pure). Therefore, as the MC-data discrepancies in the electronic channel do not seem to be related to the fake rate estimation, they should be related directly to the electron itself.


30 March 2012 Systematics Fake method. WZ cross section measurement --- *Closure Test with ttbar*
---> See 10 March

Closure Test with ttbar


1 April 2012 Systematic Fake method continued
  • /gpfs/csic_users/duarte/ANALISISVH/PROD_V04_01/WZ2011_SYSTEMATIC_FAKES, where I took the fake rate maps built with the transverse energy of the leading jets to be peaking at ET=15 for electrons and ET=30 for muons.

Alicia instructed me how to extract the systematic: take those maps (created with a different shape in the ET of the jets of the sample used to extract the fake rate) and do the analysis. Subtract this output from the output obtained using the nominal fake rate maps (and divide the result by the nominal output in order to get a percentage). Note that the error will be used symmetrically; there are no maps to extract separate up and down errors.
(Technicalities: to compute this, I changed the hardcoded FR maps in the FOManager package... maybe this should be changed.)
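In formula form, the prescription reads (my notation):

   \begin{equation}
      \mathrm{sys} = \frac{N_{\mathrm{shifted\;map}} - N_{\mathrm{nominal\;map}}}{N_{\mathrm{nominal\;map}}}
   \end{equation}

taken with the same magnitude for the up and down variations.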

                     3mu0e   2mu1e   1mu2e   0mu3e   ALL
  Exactly3Leptons    29.7%   11.0%   27.1%   28.8%   10.3%
  HasZCandidate      29.5%   19.8%   29.7%   29.6%    8.1%
  HasWCandidate      11.8%   13.1%   0.70%   10.0%    8.8%

where HasWCandidate means third lepton with pt higher than 20 and MET higher than 40, i.e. after all cuts. Recall, these systematics were extracted using fake rate maps from a sample where the leading jet was peaking at ET=15 for electrons and ET=30 for muons, the nominal maps being ET=35 for electrons and ET=15 for muons.
2 and 3 April 2012 (happy birthday to me...) Systematics list
  • Luminosity: directly extracted from the lumi group --> 2.2%
  • Trigger efficiency:
  • Object efficiency: systematic arising from the use of the tag and probe method, so I can take directly the value given by the POG or whoever provides this number using the same object selection --> HWW,WW analysis:
    • Muon: 1.5%
    • Electron: 2.0%
  • Muon p scale: as recommended on the Muon momentum scale page, we can take 0.2% as the systematic uncertainty (for pt>200 GeV/c it is pt dependent: 5×10^-5 /GeV, i.e. at 1 TeV = 1000 GeV --> 5%). So I need to recompute the muon channel twice: once with pt−σ and once with pt+σ, extracting the upper and lower systematic uncertainties from the difference of these outputs with respect to the nominal one.
  • Electron e scale: I have not found a way to calculate this systematic with the MVA electron identification (nor with the cut-based electron id; the closest thing I found was in Electrons: recipes for 2011, but it doesn't give any numbers)... Could I use the value obtained by the HWW/WW people? The fact is we use the same object, and my selection is tighter than theirs (I ask for 3 leptons...), so I could expect an associated systematic smaller than in the HWW/WW analysis, ??
  • resolution: ??
  • Jet counting: ??
  • Simulation (Pile up): ??
  • PDF: ??
  • ...

10 April 2012

PFF and FFF contribution on the fake method

When extracting the estimation of the background due to the fake lepton contribution, we made some assumptions in the equations; our background estimation is given by:
   \begin{equation}
      N_{bkg\;total}(p,f) = N_{PPF}^{pass}(p,f) + N_{PFF}^{pass}(p,f) + N_{FFF}^{pass}(p,f)
   \end{equation}

where we are ignoring the PPP background (mainly ZZ), which is taken from MC, and p = prompt rate, f = fake rate. Approximating $p \simeq 1$ and $f \ll 1$ (i.e. expanding the above functions as Taylor series around these values) we obtain simplified equations (in fact the following equations are with p=1):
   \begin{eqnarray}
      N_{PPF}^{pass} &\simeq& \varepsilon N_{t2} + \mathcal{O}(\varepsilon^2) \\
      N_{PFF}^{pass} &\simeq& \mathcal{O}(\varepsilon^2) = \varepsilon^2 N_{t1} \\
      N_{FFF}^{pass} &\simeq& \mathcal{O}(\varepsilon^3) = \varepsilon^3 N_{t0}
   \end{eqnarray}

where $\varepsilon \equiv f/(1-f)$ and $N_{ti}$ is the number of events with exactly $i$ Tight leptons.
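To get a feel for the sizes involved, take an illustrative (not measured) fake rate of f = 0.1:

   \begin{equation}
      \varepsilon = \frac{f}{1-f} \simeq 0.11,\qquad \varepsilon^2 \simeq 0.012,\qquad \varepsilon^3 \simeq 0.0014
   \end{equation}

so each extra fake lepton suppresses the contribution by roughly another order of magnitude.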
So, if our approximation cut was right, i.e. considering negligible any term of order $\mathcal{O}(\varepsilon^2)$ or smaller, then we can estimate our background just with the PPF contribution, because the other two are negligible, i.e. we expect a 0-compatible number of weighted events coming from the PFF and FFF samples. This assumption can and should be checked just by evaluating the $N_{PFF}^{pass}$ and $N_{FFF}^{pass}$ sub-samples. The next table evaluates the $N_{PFF}^{pass}$ sample at different stages (pre-selection, Z candidate and after all cuts) and compares it with the PPF estimation (last row):
                    3mu0e     2mu1e     1mu2e     0mu3e
  Pre-selection     10±3      15±4      2.6±1.6   0.7±0.8
  Z candidate       5±2       10±3      1.5±1.2   0.5±0.7
  All cuts          0.3±0.5   0.1±0.3   0.1±0.3   0.0±0.2
  PPF estimation    12±4      7±3       3.8±1.9   0.81±0.08

Just to clarify: the table was made using the $N_{PFF}^{pass}$ estimate evaluated on the full 2011 data sample.
The conclusion is clear: the PFF contribution (and also the FFF, although it is not shown here) is negligible with respect to the PPF contribution. As long as the final uncertainties are above 2.6% (the difference in the worst case), our approximation is good.

Closure test for ttbar (exclusive dileptonic) sample

Recall the equations for the expected number of events in the fake method using the usual approximation (p=1, f→0):
   \begin{eqnarray}
      N_{PPP}^{pass} &\simeq& N_{t3} - \varepsilon N_{t2} \\
      N_{PPF}^{pass} &\simeq& \varepsilon N_{t2} \\
      N_{PFF}^{pass} &\simeq& 0 \\
      N_{FFF}^{pass} &\simeq& 0
   \end{eqnarray}

So, it is possible to test the method (i.e. the above equations) if we know the real composition of a sample. It can be done with a MC sample, where the truth is known, or in data, with an enriched Z-sample for instance. Therefore, if we have a pure PPF sample, where the number of PPP, PFF and FFF events is zero (a Z+Jets sample, a ttbar dileptonic sample, ...), then the events passing the selection must come entirely from PPF, i.e.

   \begin{equation}
      N_{t3} \simeq \varepsilon N_{t2}
   \end{equation}

Checking the last equality is equivalent to checking the method itself.

The test has been done using a Z+Jets MC sample (elsewhere) and a ttbar sample (inclusive and exclusive dileptonic; note that the inclusive one doesn't fulfil the requirements of the last equations for the test). As we have found that the major contribution to the fake background after all cuts comes from the ttbar (elsewhere, in the pdf), the closure test should be done with the ttbar. The table below summarizes the fake estimation using the method (i.e. data driven on Monte Carlo, DDM) and the straight analysis of the ttbar exclusive sample (i.e. MC):

  DDM MC
3mu0e 5.4±0.2 1.59±0.11
2mu1e 3.76±0.18 1.31±0.10
1mu2e 1.90±0.12 0.75±0.08
0mu3e 0.32±0.05 0.29±0.05

The pure electron channel shows good agreement, but the other channels don't. In particular, the muonic channel shows a disagreement of 70%!
*Hypothesis*: the fake rate estimation relies on the jet ET distribution of the sample used to estimate that rate. Therefore, when the fake rate (estimated using a QCD sample, for instance) is applied to a data sample, we would expect better agreement whenever the sample has a similar jet ET distribution.
The fake rate maps were built using QCD samples where the leading jet (transverse) energy distribution peaked at 15 for muons and 35 for electrons (I don't have those distributions and I don't know how to obtain them; I assume a Landau-like shape peaking at the values mentioned).

I've plotted the leading jet ET for the ttbar sample (without any cut; I suppose I should cut at least at pre-selection level) and the peak is around 97, while the same plot for the Z+Jets sample peaks at 47... So I also used as nominal map for muons the one peaking at 30:

  DDM MC
3mu0e 4.5±0.2 1.59±0.11
2mu1e 3.02±0.16 1.31±0.10
1mu2e 1.72±0.12 0.75±0.08
0mu3e 0.32±0.05 0.29±0.05

The table below shows the same as the last two tables but using a fake rate map extracted from events with the ET of the leading jet peaking at 50 (for the muon case):

  DDM MC
3mu0e 2.58±0.15 1.59±0.11
2mu1e 1.99±0.13 1.31±0.10
1mu2e 1.11±0.10 0.75±0.08
0mu3e 0.32±0.05 0.29±0.05

The agreement using the fake rate matrix extracted from a sample with an ET distribution of the leading jet peaking at 50 GeV/c is good. Recall that the fake rates were extracted from data; if they were extracted from Monte Carlo, the agreement would be even better.

The plots below show the ET distribution of the leading jets of the samples involved in the analysis.
[Figures: leadingJet_ET_cut_PreSelection, leadingJet_ET_cut_AllCuts, leadingJet_ET_cut_PreSelection_withFakes, leadingJet_ET_cut_AllCuts_withFakes]

Note: the Et spectrum for the Ztautau sample has been checked, and it was found that the events producing that spectrum are the events with .


11 April 2012

How to use the cuts TTree after the analysis processing

After running the runanalysis executable, the root file obtained contains not only all the histograms defined in the analysis but also a TTree called cuts with two leaves: EventNumber and cuts. These two variables store the cut-id number (see the analysis-dependent structure in the CutLevels/interface/CutLevels.h file) reached by the given event.
Using these root files, which contain the cut information of an analysis, together with the original LatinoTrees (large files), which contain all the variables, can be useful for a variety of tasks: optimization, behavioural plots, ... As an example I will show the steps followed to obtain the leading jet ET distribution of events which passed all the cuts in the WZ analysis, for the ttbar sample.

  • From scratch: open a session on a worker node (qrsh, qlogin) and start a python interpreter
from ROOT import TProof, TChain, gROOT, TCanvas
# Latino Tree chain, adding the ttbar Latinos sample
t = TChain("Tree")
t.Add("/gpfs/csic_projects/cms/data/LatinosSkims/MC_Fall11/Tree_TTbar_2L2Nu_Powheg_*.root")
# Output from the runanalysis executable; the tree is called "cuts"
tcut = TChain("cuts")
tcut.Add("../WZ2011_CT_TTbar_exclusive_MuonFRMapHigher/WZmmm/cluster_TTbar_2L2Nu_Powheg/Results/TTbar_2L2Nu_Powheg.root")
# Attach the cuts tree as a friend, so the Latinos tree can be filtered by the
# cut level reached by each event
t.AddFriend(tcut)
# Open a PROOF session, because the Latino Trees are usually large samples...
p = TProof.Open("")
# The Draw and Process methods of the t TChain are delegated to the PROOF instance
t.SetProof()
# --- Some preliminaries...
gROOT.SetBatch()
c1 = TCanvas()
# Extracting the ET of the leading jets at pre-selection level (cut-id 7; see
# the mapping in CutLevels/interface/CutLevels.h)
t.Draw("T_JetAKPFNoPU_Et[0]>>h(100,0,200)","cuts.cuts>=7")
c1.SaveAs("ttbarDiLeptonic_LeadingJet_ET_cut_PreSelection.pdf")
# Extracting the ET of the leading jets after all cuts (here the cut-id must be
# the last cut level defined in CutLevels.h)
t.Draw("T_JetAKPFNoPU_Et[0]>>h1(100,0,200)","cuts.cuts>=7")
c1.SaveAs("ttbarDiLeptonic_LeadingJet_ET_cut_AllCuts.pdf")
IT DOESN'T WORK. I looked into it and it seems to be related to the way PROOF processes the data (packets). The problem is with the friend tree: if you don't use the cuts tree, you can use PROOF. I'm investigating a way to deal with this problem.
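A workaround while the PROOF + friend tree issue is unresolved: skip PROOF entirely and let TChain::Draw run locally (slower, but friend trees behave normally). A sketch with the same chains as above:

from ROOT import TChain, TCanvas, gROOT
# Same Latinos chain and cuts chain as before
t = TChain("Tree")
t.Add("/gpfs/csic_projects/cms/data/LatinosSkims/MC_Fall11/Tree_TTbar_2L2Nu_Powheg_*.root")
tcut = TChain("cuts")
tcut.Add("../WZ2011_CT_TTbar_exclusive_MuonFRMapHigher/WZmmm/cluster_TTbar_2L2Nu_Powheg/Results/TTbar_2L2Nu_Powheg.root")
t.AddFriend(tcut)
# No TProof.Open / SetProof: the Draw loop runs in the local process
gROOT.SetBatch()
c1 = TCanvas()
t.Draw("T_JetAKPFNoPU_Et[0]>>h(100,0,200)", "cuts.cuts>=7")
c1.SaveAs("ttbarDiLeptonic_LeadingJet_ET_cut_PreSelection.pdf")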


17 April 2012

How many selected events in the ZZ madgraph sample are coming from the hadronic Z decay?

When using the fake method, the ZZ production contributes to the PPF sample when one of the Z bosons decays hadronically, mainly into b's. The fake method then estimates this contribution; however, we include in the analysis the inclusive ZZ Monte Carlo, so it is highly likely that we are double counting these hadronic events.
I've investigated what fraction of the total events (ZZ to everything) corresponds to the semi-hadronic decay (ZZ->2lbb) when the fake estimation is done, and compared it with the fraction of semi-hadronic ZZ events when the regular analysis is done. The table below shows that fraction at different stages of the analysis:
                 Fake Estimation   Regular analysis
  Preselection   29.22%            15.74%
  Z candidate    16.24%             2.36%
  All cuts       14.24%             1.47%

In total numbers (all channels added up), using the MC composition of the fakes study (PROD_V04_02/WZ2011_FAKESCOMPOSITION) and the MC study (PROD_V04_02/WZ2011_MCBKG), the table below shows the events obtained for a luminosity of 4.9 fb-1 (full 2011 data):

                 Fake Estimation              Regular analysis
                 ZZ->2lbb        ZZ->X        ZZ->2lbb       ZZ->X
  Preselection   1.19±0.06       4.09±0.19    13.14±0.14     83.5±0.9
  Z candidate    0.58±0.03       3.58±0.18    1.81±0.02      76.9±0.18
  All cuts       0.0612±0.0009   0.43±0.06    0.121±0.004    8.2±0.3


3 May 2012

Data driven (Fakeable Object method) systematics

Updated 26 May
Once we found the dependence of the fake rate matrix on the ET distribution of the leading jet, we can build the sample used to extract the fake rates by selecting events with an ET distribution similar to our DDD sample, in order to minimize this effect. This can be seen in the table below, comparing the MC estimation with the DDM for the ttbar sample. As the electronic channel was already showing good agreement, the fake rate matrix was changed only for muons. The one with the best agreement was chosen:

  DDM, ET=15 DDM, ET=30 DDM, ET=50 DDM, ET=70 MC
3mu0e 5.4±0.2 4.5±0.2 2.58±0.15 2.58±0.15 1.59±0.11
2mu1e 3.76±0.18 3.02±0.16 1.99±0.13 2.05±0.13 1.31±0.10
1mu2e 1.90±0.12 1.72±0.12 1.11±0.10 1.08±0.09 0.75±0.08
0mu3e 0.39±0.05 0.39±0.05 0.39±0.05 0.39±0.06 0.29±0.05

Baseline Fake rate matrix using events with the ET distribution of the leading jet peaking at:

  • Muons: ET=50 GeV/c
  • Electrons: ET=35 GeV/c

Note that increasing the cut-off (see column DDM, ET=70) does not improve the agreement. There seems to be some kind of saturation effect: the sample needed for the 70 GeV cut-off matrix suffers from a lack of statistics, and with such hard leading jets the probability to fake leptons is going to be more or less the same as with a slightly lower, but still hard, cut-off like 50.

Using the baseline fake rate matrix we can compare:

  • DDD: data driven with data samples
  • DDM: data driven with Monte Carlo samples, in particular Z+Jets and ttbar, as we have already shown that they are the main contributions to the fakes sample
  • MC: analysis with Monte Carlo samples

           DDD       DDM       MC
  3mu0e    3.2±1.8   2.9±0.3   2.0±0.3
  2mu1e    3.1±1.8   2.4±0.3   1.3±0.1
  1mu2e    1.9±1.4   1.3±0.3   0.75±0.08
  0mu3e    1.4±1.2   0.7±0.3   0.6±0.3

This table shows us:

  • the DDD and MC methods are pretty equivalent, so it is possible to use either method to estimate this kind of background. We have decided on the DDD because it is more robust than the MC: it avoids the luminosity uncertainty, cross-section uncertainty, ... of the Monte Carlo sample.
  • the DDM-MC comparison gives us a closure test of the method (although we have not used a fake rate matrix extracted from Monte Carlo). So, the difference between them can be used as a systematic: the error introduced by the method itself.
  • the DDD-MC comparison gives us the yields when using the fakeable object data driven method versus the yields when using a pure Monte Carlo estimation. So, we could use the difference as the systematic coming from the choice of using a particular method

           (DDD-MC)/DDD (#events)              (DDM-MC)/DDM (#events)
           [data driven vs pure Monte Carlo]   [closure test]
  3mu0e    38% (1.2)                           31% (0.9)
  2mu1e    58% (1.8)                           46% (1.1)
  1mu2e    61% (1.2)                           42% (0.6)
  0mu3e    57% (0.8)                           14% (0.1)

Cross section estimation

where X denotes the phase space of the signal; hereafter we are going to drop the superscript. Specifically,

  • , i.e. scale factors between data and Monte Carlo
  • integrated luminosity
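Schematically (my notation, to be cross-checked against the analysis note), the measurement is of the form

   \begin{equation}
      \sigma^{X} = \frac{N_{data} - N_{bkg}}{\rho\,\varepsilon_{MC}\,\mathcal{L}_{int}}
   \end{equation}

with $\varepsilon_{MC}$ the signal acceptance times efficiency from simulation, $\rho$ the data/Monte Carlo scale factors and $\mathcal{L}_{int}$ the integrated luminosity.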

Signal definition:

  • WZ decaying to all 3 leptons (mu,e,tau)
  • Mass of the Z between [60,120] GeV/c2
Note we are allowing all the leptonic decays, so we are considering as signal taus decaying to light leptons (e, mu); but in order to measure a cross-section observable, it makes more sense to calculate the observable where the leptons come promptly from the bosons: . Instead, given a channel, we are measuring the cross-section of the light leptons, which could be products of a boson or of taus: . But we can take advantage of our signal definition and recover the inclusive cross-section (in the Z mass range defined before) using any of the measured channels (eee, for instance), so

which can be linked with the prompt cross-section by using

So, we can use the tau decays of the WZ as signal, keeping in mind that when presenting the measurement we are going to show the prompt cross-section:

Results and comparison with AN/2011-259

  • 801792 events (from a total of 1216784, i.e. 66%) were generated in the phase space
  • NOTE: 605774 out of 780584 events after the Latinos skimming have the Z mass in [60,120], i.e. 77.61% of the sample,

Note that all errors are statistical and systematic added in quadrature.
Comparing with the CMS AN-11-259:

  channel   CMS AN-11/259 (pb) with 1.1 fb-1   Our AN with 4.9 fb-1   theoretical value
  3e0mu     0.086±0.022                        0.086±0.015            0.0637±0.0001
  2e1mu     0.060±0.017                        0.065±0.012            0.0626±0.0001
  2mu1e     0.053±0.018                        0.069±0.012            0.0637±0.0001
  3mu0e     0.060±0.016                        0.071±0.010            0.0627±0.0001

Combined using BLUE Method

                        CMS AN-11/259 (pb) with 1.1 fb-1   Our AN with 4.9 fb-1   theoretical value
  σ·BR (per channel)    0.062±0.010                        0.071±0.007            0.0643±0.0001
  σ(WZ) inclusive       17.0±2.4                           19.4±1.8               17.61±0.03
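For reference, the BLUE combination of per-channel measurements $x_i$ with covariance matrix $V$ is the minimum-variance linear unbiased estimator:

   \begin{equation}
      \hat{x} = \sum_i w_i x_i,\qquad w = \frac{V^{-1}\mathbf{1}}{\mathbf{1}^{T}V^{-1}\mathbf{1}},\qquad \sigma^{2}_{\hat{x}} = \frac{1}{\mathbf{1}^{T}V^{-1}\mathbf{1}}
   \end{equation}

which reduces to the usual weighted average when the channels are uncorrelated.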


16 May 2012

Some systematics

  • Luminosity: 2.2% (based on the CMS online luminosity monitoring)
  • ZZ background: (extracted from the Standard Model Cross Sections twiki page and propagated)
    , so the change in the cross section is propagated as . And that difference is propagated to the cross-section measurement, where it is barely having any effect at all... (I put it in the table because I did the calculation, but we could say this systematic contributes < 0.2% in the worst of the cases)

                                mmm    mme     eem    eee     lepton-channel
      Syst. cross-section ZZ    0.1%   0.05%   0.2%   0.03%   0.08%
  • PDF uncertainty: using just the CTEQ6.6 parton density function set, the effect of applying the master equation procedure (prescriptions in AN2009/048) to the signal acceptance is found to be , which propagated to the cross-section gives a systematic of ,
  • Pile-up reweighting (difference between reweighting and not reweighting at all): 0.5%
  • MET modelling:
    • in the HToWW analysis they find that the missing ET resolutions agree between data and a Monte Carlo control sample (W+Jets) to better than 1%, and the means are shifted by 1%. So they did not consider any systematic on the resolution, but vary the MET by the difference found between data and simulation. The effect on the signal efficiency they found is 2%
    • in the WZ analysis, they estimated the resolution and scale (mean) using a Z->ll sample and found values lower than 0.7%. I put here the table per channel:

                          3m      2mu1e   1e2mu   3e
        MET resolution    0.55%   0.47%   0.47%   0.44%
        MET scale         0.28%   0.23%   0.04%   0.07%
  • Lepton reconstruction and identification efficiency: 2% (HToWW and WZ)
  • Trigger efficiency: less than 1%
  • Muon momentum and electron energy scale: 2% for electrons, 0.5% for muons

17 May 2012

Z+Jets Closure Test using different fake rate matrices, at Preselection and Z-candidate level

At Preselection level

  DDM, ET=15 DDM, ET=30 DDM, ET=50 MC
3mu0e 71±15 45±4 33±3 41±3
2mu1e 124±6 123±6 121±6 74±5
1mu2e 41±3 26±3 20±2 25±3
0mu3e 22±3 22±3 22±3 20±2

At Z-candidate level

  DDM, ET=15 DDM, ET=30 DDM, ET=50 MC
3mu0e 65±4 42±4 31±3 38±3
2mu1e 113±6 112±6 110±6 59±4
1mu2e 38±3 25±3 18±2 24±3
0mu3e 20±2 20±2 20±2 18±2

The best agreement for the closure test using the Z+Jets samples is reached using the fake rate matrix extracted from a sample where the leading jet is required to have at least ET=30. Note that this result is in agreement with what we studied in the section "Closure test for ttbar (exclusive dileptonic) sample" (see the plots about the leading jet Et distributions)


21 May 2012

Table of cross-section with systematics

I found some problems with the cross-section numbers. I cannot reproduce the numbers found in the last table. I've checked the documentation but didn't find where (using which sample/folder) the cross-section table above was calculated. I've managed to reproduce the table for M_Z > 12 GeV (inside the folder PROD_V04_03Pre/WZ2011_25042012, which is a folder where the events were generated with mass going from 12 GeV up to ...; not only in the 60-120 mass range). Anyway, using this folder to evaluate the WZ cross-section I'm not able to find the same results... So, I'm using the folder PROD_V04_03Pre/WZ2011_10052012, which already contains the generation mass range cut.

 
  channel   σ·BR (pb)                                  σ(WZ) inclusive (pb)
  3e0mu     0.086±0.014(stat)±0.004(sys)±0.002(lumi)   23.7±3.9(stat)±1.2(sys)±0.5(lumi)
  2e1mu     0.065±0.011(stat)±0.003(sys)±0.002(lumi)   18.4±3.1(stat)±1.0(sys)±0.4(lumi)
  2mu1e     0.069±0.011(stat)±0.004(sys)±0.002(lumi)   19.0±3.0(stat)±1.0(sys)±0.4(lumi)
  3mu0e     0.071±0.009(stat)±0.003(sys)±0.002(lumi)   19.9±2.6(stat)±0.9(sys)±0.5(lumi)

LaTeX format: download xstable.tex

Combined cross-section (with BLUE method)

with weights:


26 May 2012

Resolving Matt's comments

Cross sections in the Z mass range between [60-120]

Recipe used by Vuko (with MCFM: Monte Carlo for FeMtobarn processes) to calculate the cross-section of the MadGraph WZ sample used in the analysis, where the phase space is defined by M(Z) > 12 GeV and both W and Z decaying into all 3 lepton flavours.
MCFM inclusive WZ NLO cross section with M(Z)>12 GeV:

W+Z    17005.086 ±    45.807 fb
W-Z      9730.410 ±    24.482 fb
===========================
Total: 26735 fb

Both W and Z are allowed to decay into e/mu/tau

Using the BRs:
Z ->ll : 3*0.03365 = 0.10095
W->l : 0.1075+0.1057+0.1125 = 0.3257
The cross section for the sample is
26735 fb * 3*0.03365*(0.1075+0.1057+0.1125) = 0.879 +- 0.002 pb

We can use the already calculated cross-section for Z mass > 12 GeV, which corresponds to 1216784 events (for a given integrated luminosity); we have obtained 801792 events within the Z mass range [60,120], so we can calculate the cross-section in that phase space,

  • , for all leptonic decays.
Thus, we can extract the inclusive cross-section in the Z mass range between 60 and 120 by dividing by the leptonic branching ratio.

We can restrict to the same-flavour leptonic decay, then
  • ,
where the relation between them is:
  • ,
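Putting numbers to the two steps above, using only the figures already quoted in this note:

   \begin{eqnarray}
      \sigma^{3\ell}_{[60,120]} &\simeq& 0.879\;\mathrm{pb}\times\frac{801792}{1216784} \simeq 0.579\;\mathrm{pb} \\
      \sigma^{incl}_{[60,120]} &\simeq& \frac{0.579\;\mathrm{pb}}{0.10095\times 0.3257} \simeq 17.6\;\mathrm{pb}
   \end{eqnarray}

which is consistent with the theoretical value 17.61±0.03 pb quoted in the BLUE table above.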

Tight-Tight-noTight sample composition

In order to see how the Tight-Tight-noTight sample is composed, we have performed the data driven method using Monte Carlo samples (procedure called DDM) for the contributions we suppose compose the sample; our guess is ttbar and Z+Jets. The table below shows the yields obtained by the DDM procedure for the ttbar and Z+Jets samples, comparing them with the data driven method using the actual data, procedure called DDD

  • Yields extracted after Z candidate cut level:
\begin{tabular}{l c c || c | r }\hline\hline
 channel                              &  $DDM_{t\bar{t}\rightarrow2\ell+X}$  &      $DDM_{Z+Jets}$     &     $DDM_{Total}$     &       $DDD$              \\ \hline
 $WZ\rightarrow3e\nu$       &    $0.60\pm0.07$                               &   $20\pm2$                     &         $21\pm2$        &       $25\pm5$     \\
 $WZ\rightarrow2e\mu\nu$  &   $2.70\pm0.15$                               &       $18\pm2$                 &      $21\pm2$        &       $28\pm5$     \\
 $WZ\rightarrow1e2\mu\nu$ &  $3.45\pm0.17$                               &   $110\pm6$                 &       $114\pm6$        &        $111\pm11$ \\
 $WZ\rightarrow3\mu\nu$   &    $4.9\pm0.2$                                   &      $31\pm3$                  &          $36\pm3$        &       $50\pm7$       \\
 \hline\hline
 $WZ\rightarrow3\ell\nu$     &    $11.7\pm0.3$                                 &        $180\pm7$              &         $191\pm7$        &      $214\pm15$  \\
\hline
\end{tabular}

Note: the $WZ\rightarrow3\ell\nu$ channel is just the sum of all the other channels.

  • Yields extracted after all cuts (in particular the MET cut):
\begin{tabular}{l c c || c | r }\hline\hline
 channel                              &  $DDM_{t\bar{t}\rightarrow2\ell+X}$  &      $DDM_{Z+Jets}$     &     $DDM_{Total}$     &       $DDD$              \\ \hline
 $WZ\rightarrow3e\nu$       &    $0.39\pm0.06$                               &   $0.3\pm0.3$                     &         $0.7\pm0.3$        &       $1.5\pm1.2$     \\
 $WZ\rightarrow2e\mu\nu$  &   $1.11\pm0.10$                               &      $0.2\pm0.3$              &      $1.3\pm0.3$        &       $1.89\pm1.37$     \\
 $WZ\rightarrow1e2\mu\nu$ &  $1.99\pm0.13$                               &   $0.4\pm0.3$                 &       $2.4\pm0.4$        &        $3.1\pm1.8$ \\
 $WZ\rightarrow3\mu\nu$   &    $2.58\pm0.15$                           &      $0.3\pm0.3$               &          $2.8\pm0.3$   &       $3.2\pm1.8$       \\
 \hline\hline 
 $WZ\rightarrow3\ell\nu$     &    $6.1\pm0.2$                                   &      $1.1\pm0.6$              &          $7.2\pm0.6$     &   $10\pm3$    \\
\hline
\end{tabular}

Note: the $WZ\rightarrow3\ell\nu$ channel is just the sum of all the other channels.

Recall: plots are in the usual place.

Considering a 70 GeV/c cut-off sample for the Fake rate matrix

See the updated table at Data driven Fakeable section

How the lepton selection, identification and trigger efficiencies are estimated: link to AN2011/148

In AN2011/148 you will find the definitions of our objects (Muon/Electron), plus the optimization done (isolation, identification, etc.) and also how the id, iso and trigger efficiencies are estimated, mainly using Tag and Probe
  • Name of the analysis note: "Search for Higgs Boson Decays to Two W Bosons in the Fully Leptonic Final State √s = 7 TeV with 2011 data of CMS detector." (W. Andrews et al.)

How the systematics are calculated

See some of them explained here
Trigger, identification and selection efficiencies (WZ,ZZ)
See lepton efficiencies paragraph
Muon Momentum/Electron Energy scale (WZ,ZZ)
Missing ET resolution (WZ,ZZ)
Varying the resolution of the objects contributing to the MET and propagating it to the MET itself. See the prescription in Official Prescription for calculating uncertainties on Missing Transverse Energy (MET)
Pile-up (WZ,ZZ)
An incorrect modelling of the pileup in the Monte Carlo samples can bias the expected event yields. The simulated events have been re-weighted on the basis of the number of reconstructed primary vertices; the effect of the re-weighting procedure on the yields is taken as the systematic (i.e., re-weighting vs. not re-weighting)
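A minimal sketch of the per-event weight implied here (PyROOT; the histogram names are hypothetical, and both distributions are assumed normalized to unit area):

def pu_weight(n_vertices, h_data, h_mc):
    # Ratio of the data to MC distributions of the number of reconstructed
    # primary vertices, evaluated at this event's vertex count
    ibin = h_data.FindBin(n_vertices)
    mc = h_mc.GetBinContent(ibin)
    return h_data.GetBinContent(ibin) / mc if mc > 0. else 0.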
PDF systematics (WZ,ZZ)
Systematic introduced by the choice of one PDF set and also, for a fixed set, the uncertainty coming from our limited knowledge of the PDFs. Details can be found in hep-ph/0605240 and in EWK utils. Briefly (you can find the details in the references), we apply the master equations to recalculate the affected observables (yields for ZZ and acceptance of the signal) when varying the PDF (the set itself, or the central value plus errors).
Luminosity (WZ,ZZ)
Extracted from the CMS online luminosity monitoring, directly applied 2.2%
Fakeable object method ( ttbar and Z+jets)
See section Fakable object method systematics

MC-data discrepancies in the pure electronic channel

  • Could it be the effect of not applying the internal bremsstrahlung cut until the requirement of having a 3rd lepton with pt>20 (before the MET cut)? If that were true, there should be an important discrepancy in the variable used to cut, i.e. dR: if the W-candidate lepton is inside a cone of 0.1 around one of the Z-candidate leptons, then the event is rejected. BUT, as you can see in the plots below, that is not the case: the estimation in the bin dR < 0.1 shows no problems. So we can conclude that the internal bremsstrahlung cut is not the cause of the MC-data discrepancies in the electronic channel
[Plots: fHdRl1Wcand, fHdRl2Wcand]
  • Could some background contribution have been forgotten? The fakeable method is not intended to catch electrons coming from photon conversions (the fake rate matrices were calculated using a sample without this kind of fakes) and we forgot to include that contribution: the V+gamma sample. So, we would expect the electronic channels (eee and mme) to improve with this background included.
The results found are not conclusive; it might be indicating that indeed a Monte Carlo sample was missing, since some bins recover pretty well the discrepancies between data and MC. However, some distributions are losing their shape...
We have tried to use a more reliable criterion than the naked eye and have evaluated the ψ statistic (a more general test estimator than the χ², see Probability Theory: The Logic of Science, E.T. Jaynes, Cambridge University Press) for several important distributions, with and without the V+gamma MC sample. The results are inconclusive.
eee channel without Vgamma contribution eee channel with Vgamma contribution in the MC
[Plots: fHLeptonCharge, fHPtLepton1, fHPtLepton2, fHPtLepton3, fHMETAfterZCand, fHZInvMassAfterZCand; each shown without and with the Vgamma sample]

5 June 2012

Asymmetric charge behaviour of V/gamma sample

Looking at the plot of the sum of lepton charges in the electronic channel --> Explained: due to lack of statistics. The VGJets sample has low statistics (in fact, at the level of the charge plot there are 30 Monte Carlo events, unweighted), so we might be seeing a fluctuation.


5 June 2012

New Z mass window and MET cut

Z mass window: M_Z±20 [71,111]

The first thing is to extract the number of events generated in that mass window. WARNING: Why do I get 1221134 events when looking at the "/gpfs/gaes/cms/store/mc/Fall11/WZJetsTo3LNu_TuneZ2_7TeV-madgraph-tauola/AODSIM/PU_S6_START42_V14B-v1/0000/*.root" samples, while Javi's spreadsheet says 1216784? Also, looking at the WZJetsTo3LNu DAS page, the number appearing is 1221134. What is the cause of the discrepancy?
  • Zmass between [71,111]: 787195
  • Zmass between [60,120]: 801792
The number of generated events is needed for the calculation of the WZ efficiency.
Technicality (you can skip this...)
The extraction of that number is done using just a simple piece of code, trying to avoid any processing with cmsRun (but the price you pay is a terribly slow process to extract just one number: more than 4 hours at gridui...),
-- Set a CMSSW environment (probably a 4_X_X series is good). Then open root:
$ root -l
TChain *evt = new TChain("Events");
// load the FWLite libraries so the AOD branches can be read
gSystem->Load("libFWCoreFWLite.so");
AutoLibraryLoader::enable();
evt->Add("/gpfs/gaes/cms/store/mc/Fall11/WZJetsTo3LNu_TuneZ2_7TeV-madgraph-tauola/AODSIM/PU_S6_START42_V14B-v1/0000/*.root");
gROOT->SetBatch();
// the number of entries of the drawn histogram is the number of generated
// status-3 Z bosons inside the mass window
evt->Draw("genParticles.mass()","genParticles.pdgId() == 23 && genParticles.status() == 3 && genParticles.mass() > 71. && genParticles.mass() < 111.");
Also, you can launch the python script genZInsideRange_AOD.py (you will find it in my home at gridui), which tries to use PROOF with FWLite, but it is extremely unstable...
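A plain FWLite event loop (no PROOF, no cmsRun) should also do the job; a minimal sketch, assuming a CMSSW environment is set and using a hypothetical single input file:

from DataFormats.FWLite import Events, Handle
events = Events("WZJetsTo3LNu_onefile.root")  # hypothetical local file
handle = Handle("std::vector<reco::GenParticle>")
nZ = 0
for event in events:
    event.getByLabel("genParticles", handle)
    for p in handle.product():
        # status-3 Z boson with generated mass inside the window
        if p.pdgId() == 23 and p.status() == 3 and 71. < p.mass() < 111.:
            nZ += 1
print nZ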


7 June 2012

PPP contamination to the Tight-Tight-noTight sample

Estimation extracted using the DDM over the PPP samples (ZZ and WZ), using:
  • FR Matrix ET=50 GeV for muons
  • FR Matrix ET=35 GeV for electrons
  • MET > 40
  • Z in [71,111]
  eee
                 ZZ     WZTo3LNu   Raw Fakes
  At 3 Leptons   0.61   2.18       32
  Z candidate    0.56   2.10       26
  All cuts       0.05   1.07       2.3

  eem
                 ZZ     WZTo3LNu   Raw Fakes
  At 3 Leptons   0.68   2.78       43
  Z candidate    0.58   2.61       29
  All cuts       0.07   1.15       2.8

  mme
                 ZZ     WZTo3LNu   Raw Fakes
  At 3 Leptons   1.06   3.22       142
  Z candidate    0.85   3.07       86
  All cuts       0.07   1.51       3.8

  mmm
                 ZZ     WZTo3LNu   Raw Fakes
  At 3 Leptons   1.08   3.64       66
  Z candidate    0.93   3.52       50
  All cuts       0.13   1.53       3.9


14 June 2012

Fake Rate matrix choice justification

The fakeable object method relies on the following assumption:
  • the probability for a jet to yield a lepton is a universal property, dependent only on the jet composition of the sample
Therefore, it is possible to use an enriched heavy-flavour sample to calculate how often we count as a prompt lepton (coming from a W or Z) a lepton which actually came from a heavy-flavour quark decay, the so-called fake lepton in our terminology. Using these frequencies (fake rates) calculated from the enriched heavy-flavour sample, we can estimate how many leptons are fakes in other samples. However, we should be sure about the jet composition of the samples involved, i.e. the fake rate extraction sample and the application samples. The transverse energy distribution of the fake-lepton jet (i.e. a jet with a lepton found in a cone around it) needs to be equivalent in each sample, because the probability to reconstruct a lepton coming from a jet depends mainly on the transverse energy of the mother quark.
We should use a fake rate matrix extracted from a sample whose fake-lepton-jet ET distribution is equivalent to that of the application sample: in our case, before the MET cut, the Z+Jets sample; and after the MET cut, the ttbar sample. It is possible to tune that jet ET distribution by imposing a cut-off on the ET of the remaining leading jet in the event.
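For reference, the fake rate used throughout is the standard per-lepton ratio measured in the extraction sample, binned in the lepton kinematics (the pt-eta binning is my assumption of the usual choice):

   \begin{equation}
      f(p_T,\eta) = \frac{N_{Tight}(p_T,\eta)}{N_{fakeable}(p_T,\eta)}
   \end{equation}

i.e. the fraction of fakeable (noTight or Tight) leptons that also pass the Tight selection.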
The following plots were made with:
  • Application samples:
    • Z+Jets, where the legend entries *_Zee_, Zmumu are Monte Carlo samples
      • Fakes is a data sample composed of Tight-Tight-NoTight leptons and requiring MET < 20 GeV (enriching the Z+Jets contribution)
    • TTbar (fully leptonic decay) Monte Carlo sample
  • FR matrix extraction samples
    • Differences come from the transverse energy cut-off applied to the leading jet. The leading jet is taken from the remaining jets in the event once the NoTight lepton, and the jet surrounding it (in a dR < 1.0 cone), have been identified

ELECTRONS / MUONS
[Plots: Zee_eee / Zmumu_mmm, Fakes_eee / Fakes_mmm, TTbar_2L2Nu_eee / TTbar_2L2Nu_mmm]


14 June 2012

Systematics calculations: technical details

  1. Systematics calculated using a slightly modified code
    Procedure
    1. Re-do the analysis using the slightly modified code (see the name of the repository at devel in the table below)
    2. The new files and yields obtained are compared with the nominal ones using the getsystematics utility. Assume the parent folder for the nominal analysis is WZ2011_nominal and the folder for systematics is WZ2011_sys; then
   # systematics affecting the WZ signal
   $ getsystematics -p WZ2011_nominal,WZ2011_sys -s yields -n WZ
   # systematics affecting the acceptance
   $ getsystematics -p WZ2011_nominal,WZ2011_sys -s acceptance -n WZ

Muon momentum/Electron energy scale We extracted the systematic due to the momentum scale mis-assignment by varying the transverse momentum by ±σ (see below for how much) and re-doing the complete analysis. Code repo at devel: VH_Analysis_PTSYS
Muons: we use a variation of 1% on the transverse momentum (extracted from the HWW analysis note); the Muon POG general recommendations for muons suggest 0.2% for scale and 0.6% for resolution.
Electrons: we varied the energy scale by 2% and 5% for electrons in the barrel and endcap respectively (extracted from the HWW analysis note).
Lepton and trigger efficiencies We extracted the systematic due to the trigger and lepton efficiencies by using the ±σ of the scale factors and re-doing the complete analysis with these new scale factors. Code repo at devel: VH_Analysis_LEPTONSYS

Links

References

-- JordiDuarte - 11-Mar-2012
