Difference: AmnonHarelStatusReportForStatisticsBoard (1 vs. 10)

Revision 10 - 2010-04-28 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Line: 173 to 173
 
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271789641" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14699" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp23220" user="aharel" version="2"
META FILEATTACHMENT attachment="plot_qs_model.png" attr="" comment="q* models (rough version from 2010-Apr-18)" date="1271781275" name="plot_qs_model.png" path="plot_qs_model.png" size="18663" stream="plot_qs_model.png" tmpFilename="/usr/tmp/CGItemp23632" user="aharel" version="2"
META FILEATTACHMENT attachment="plot_ci_model.png" attr="" comment="Contact interaction models" date="1271784777" name="plot_ci_model.png" path="plot_ci_model.png" size="19745" stream="plot_ci_model.png" tmpFilename="/usr/tmp/CGItemp23777" user="aharel" version="3"
Changed:
<
<
META FILEATTACHMENT attachment="priors.c" attr="" comment="Visualization of nuisance parameter priors" date="1272459729" name="priors.c" path="priors.c" size="3821" stream="priors.c" tmpFilename="/usr/tmp/CGItemp20752" user="aharel" version="1"
META FILEATTACHMENT attachment="priors_10_1.png" attr="" comment="m=10,w=1" date="1272459813" name="priors_10_1.png" path="priors_10_1.png" size="18896" stream="priors_10_1.png" tmpFilename="/usr/tmp/CGItemp21981" user="aharel" version="1"
META FILEATTACHMENT attachment="priors_1_01.png" attr="" comment="m=1,w=0.1" date="1272459830" name="priors_1_01.png" path="priors_1_01.png" size="21635" stream="priors_1_01.png" tmpFilename="/usr/tmp/CGItemp22031" user="aharel" version="1"
META FILEATTACHMENT attachment="priors_1_001.png" attr="" comment="m=1,w=0.01" date="1272459849" name="priors_1_001.png" path="priors_1_001.png" size="20894" stream="priors_1_001.png" tmpFilename="/usr/tmp/CGItemp21737" user="aharel" version="1"
>
>
META FILEATTACHMENT attachment="priors.c" attr="" comment="Visualization of nuisance parameter priors" date="1272476078" name="priors.c" path="priors.c" size="3834" stream="priors.c" tmpFilename="/usr/tmp/CGItemp25706" user="aharel" version="2"
META FILEATTACHMENT attachment="priors_10_1.png" attr="" comment="m=10,w=1" date="1272476096" name="priors_10_1.png" path="priors_10_1.png" size="18881" stream="priors_10_1.png" tmpFilename="/usr/tmp/CGItemp27336" user="aharel" version="2"
META FILEATTACHMENT attachment="priors_1_01.png" attr="" comment="m=1,w=0.1" date="1272476149" name="priors_1_01.png" path="priors_1_01.png" size="22030" stream="priors_1_01.png" tmpFilename="/usr/tmp/CGItemp27231" user="aharel" version="2"
META FILEATTACHMENT attachment="priors_1_001.png" attr="" comment="m=1,w=0.01" date="1272476112" name="priors_1_001.png" path="priors_1_001.png" size="21160" stream="priors_1_001.png" tmpFilename="/usr/tmp/CGItemp27371" user="aharel" version="2"

Revision 9 - 2010-04-28 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Added:
>
>

New prior shapes for nuisance parameters

The statistics board asked for gamma and lognormal priors. Each of these distributions has three free parameters, two of which are fixed by requiring a given mean and RMS. From a literature scan I found that the common practice is to assume mu=0, where I use the variable names used in ROOT's TMath.
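The mapping from a requested mean and RMS to the remaining two parameters can be sketched as below. This is a minimal illustration in plain Python, not the attached priors.c. It assumes the location parameter is fixed to 0 (mu in TMath::GammaDist and, as far as I understand the ROOT signature, theta in TMath::LogNormal); the function names are hypothetical.

```python
import math

def gamma_params(mean, rms):
    """Shape and scale of a gamma distribution with its location fixed to 0."""
    shape = (mean / rms) ** 2   # mean = shape * scale
    scale = rms ** 2 / mean     # variance = shape * scale**2
    return shape, scale

def lognormal_params(mean, rms):
    """Sigma (width of the log) and median of a lognormal with location 0."""
    rel2 = (rms / mean) ** 2
    sigma = math.sqrt(math.log1p(rel2))     # variance / mean**2 = exp(sigma**2) - 1
    median = mean / math.sqrt(1.0 + rel2)   # mean = median * exp(sigma**2 / 2)
    return sigma, median

# The three (mean, width) combinations shown in the example plots.
for mean, rms in [(1.0, 0.01), (1.0, 0.1), (10.0, 1.0)]:
    shape, scale = gamma_params(mean, rms)
    sigma, median = lognormal_params(mean, rms)
    print(f"mean={mean} rms={rms}: gamma shape={shape:.4g} scale={scale:.4g}; "
          f"lognormal sigma={sigma:.4g} median={median:.4g}")
```

For small relative widths both priors approach a Gaussian, which is why the differences only become visible for the wider cases.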

To see what these look like I used the attached priors.c (may require ROOT 5.26 to run). Some output examples are:

mean: 1 1 10
width: 0.01 0.1 1
plot: priors_1_001.png priors_1_01.png priors_10_1.png

The numbers in parentheses are the errors on the last digits. The first number is the mean of the histogram, and the second is the RMS. The Gauss and LogNormal histograms are filled by transforming a normal variable, rather than from the plotted function.
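Filling a lognormal sample by transforming a normal variable, and checking its mean and RMS, could look like the sketch below. It is plain Python rather than the ROOT macro; the sample size, the seed, and the use of the standard error of the mean as the quoted error are assumptions made for illustration.

```python
import math
import random
import statistics

def sample_lognormal(mean, rms, n, seed=1):
    """Draw n lognormal values with the requested mean and RMS (location 0)."""
    rel2 = (rms / mean) ** 2
    sigma = math.sqrt(math.log1p(rel2))
    mu_log = math.log(mean) - 0.5 * sigma ** 2   # mean of log(x)
    rng = random.Random(seed)
    # Transform a standard normal variable z into exp(mu_log + sigma * z).
    return [math.exp(mu_log + sigma * rng.gauss(0.0, 1.0)) for _ in range(n)]

values = sample_lognormal(mean=1.0, rms=0.1, n=100_000)
sample_mean = statistics.fmean(values)
sample_rms = statistics.pstdev(values)
mean_error = sample_rms / math.sqrt(len(values))   # standard error of the mean
print(f"mean = {sample_mean:.4f} +/- {mean_error:.4f}, RMS = {sample_rms:.4f}")
```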

 

19-Apr-2010 Information

Introduction and references

Some material in preparation for the April 20th meeting.
Line: 154 to 173
 
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271789641" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14699" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp23220" user="aharel" version="2"
META FILEATTACHMENT attachment="plot_qs_model.png" attr="" comment="q* models (rough version from 2010-Apr-18)" date="1271781275" name="plot_qs_model.png" path="plot_qs_model.png" size="18663" stream="plot_qs_model.png" tmpFilename="/usr/tmp/CGItemp23632" user="aharel" version="2"
META FILEATTACHMENT attachment="plot_ci_model.png" attr="" comment="Contact interaction models" date="1271784777" name="plot_ci_model.png" path="plot_ci_model.png" size="19745" stream="plot_ci_model.png" tmpFilename="/usr/tmp/CGItemp23777" user="aharel" version="3"
Added:
>
>
META FILEATTACHMENT attachment="priors.c" attr="" comment="Visualization of nuisance parameter priors" date="1272459729" name="priors.c" path="priors.c" size="3821" stream="priors.c" tmpFilename="/usr/tmp/CGItemp20752" user="aharel" version="1"
META FILEATTACHMENT attachment="priors_10_1.png" attr="" comment="m=10,w=1" date="1272459813" name="priors_10_1.png" path="priors_10_1.png" size="18896" stream="priors_10_1.png" tmpFilename="/usr/tmp/CGItemp21981" user="aharel" version="1"
META FILEATTACHMENT attachment="priors_1_01.png" attr="" comment="m=1,w=0.1" date="1272459830" name="priors_1_01.png" path="priors_1_01.png" size="21635" stream="priors_1_01.png" tmpFilename="/usr/tmp/CGItemp22031" user="aharel" version="1"
META FILEATTACHMENT attachment="priors_1_001.png" attr="" comment="m=1,w=0.01" date="1272459849" name="priors_1_001.png" path="priors_1_001.png" size="20894" stream="priors_1_001.png" tmpFilename="/usr/tmp/CGItemp21737" user="aharel" version="1"

Revision 8 - 2010-04-20 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Line: 20 to 20
 In particular, the systematic effects have large bin-to-bin correlations, which should dominate any quantitative treatment of the systematics (see [2], Fig. 2, off-diagonal elements). For example, the relative JES effects at a particular jet pT should be propagated to a range of Mjj bins.
Added:
>
>

Additional presentations

 

The preferred technique

(with a few questions to the SB interleaved)

Revision 7 - 2010-04-20 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Line: 145 to 145
 
META FILEATTACHMENT attachment="ratio_sys_var_note.pdf" attr="" comment="Note on systematic variations" date="1268850171" name="ratio_sys_var_note.pdf" path="ratio_sys_var_note.pdf" size="76351" stream="ratio_sys_var_note.pdf" tmpFilename="/usr/tmp/CGItemp5537" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR0_4K.png" attr="" comment="Limits with no systematics" date="1271642844" name="climLLR0_4K.png" path="climLLR0_4K.png" size="12429" stream="climLLR0_4K.png" tmpFilename="/usr/tmp/CGItemp14815" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR0_s2_1K.png" attr="" comment="Without systematics, 1K PDSs per ensemble" date="1271643673" name="climLLR0_s2_1K.png" path="climLLR0_s2_1K.png" size="12500" stream="climLLR0_s2_1K.png" tmpFilename="/usr/tmp/CGItemp14936" user="aharel" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="climLLR2_4K.png" attr="" comment="Limits without systematics" date="1271669682" name="climLLR2_4K.png" path="climLLR2_4K.png" size="16223" stream="climLLR2_4K.png" tmpFilename="/usr/tmp/CGItemp14761" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271669722" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14703" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp14805" user="aharel" version="1"
>
>
META FILEATTACHMENT attachment="climLLR2_4K.png" attr="" comment="Limits without systematics" date="1271789544" name="climLLR2_4K.png" path="climLLR2_4K.png" size="16256" stream="climLLR2_4K.png" tmpFilename="/usr/tmp/CGItemp23151" user="aharel" version="2"
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271789641" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14699" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp23220" user="aharel" version="2"
 
META FILEATTACHMENT attachment="plot_qs_model.png" attr="" comment="q* models (rough version from 2010-Apr-18)" date="1271781275" name="plot_qs_model.png" path="plot_qs_model.png" size="18663" stream="plot_qs_model.png" tmpFilename="/usr/tmp/CGItemp23632" user="aharel" version="2"
META FILEATTACHMENT attachment="plot_ci_model.png" attr="" comment="Contact interaction models" date="1271784777" name="plot_ci_model.png" path="plot_ci_model.png" size="19745" stream="plot_ci_model.png" tmpFilename="/usr/tmp/CGItemp23777" user="aharel" version="3"

Revision 6 - 2010-04-20 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Added:
>
>
 

19-Apr-2010 Information

Changed:
<
<
>
>

Introduction and references

 Some material in preparation for the April 20th meeting.

This supplements Robert's email and uses the references he provided:

Line: 77 to 78
  plot_ci_model.png
  • q* models (rough version from 2010-Apr-18):
    plot_qs_model.png
Deleted:
<
<

Tests of systematic variations

  • Test of unbinned technique used for SM and contact interaction in [4].
  • Test of binned JES systematics used in q* models (rough models from 2010-Apr-18):
    plot_qs_jes.png
 

17-Mar-2010 Information

Line: 117 to 114
 statistics dominates IN EVERY SINGLE BIN OF INTEREST.
  • covering only this case will leave us badly placed if some systematic effect turns out bigger than expected and increases the systematics by more than a factor of 2.
Changed:
<
<

Selected documentation:

>
>

Selected documentation:

 
Line: 128 to 125
 
Changed:
<
<

Now what?

>
>

Now what?

 I see two main options:
  1. use the current tools, with local limit setting for q* and an LLR, and ignoring all bins below ~800GeV.
    • Should work.
Line: 151 to 147
 
META FILEATTACHMENT attachment="climLLR0_s2_1K.png" attr="" comment="Without systematics, 1K PDSs per ensemble" date="1271643673" name="climLLR0_s2_1K.png" path="climLLR0_s2_1K.png" size="12500" stream="climLLR0_s2_1K.png" tmpFilename="/usr/tmp/CGItemp14936" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR2_4K.png" attr="" comment="Limits without systematics" date="1271669682" name="climLLR2_4K.png" path="climLLR2_4K.png" size="16223" stream="climLLR2_4K.png" tmpFilename="/usr/tmp/CGItemp14761" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271669722" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14703" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp14805" user="aharel" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="plot_qs_model.png" attr="" comment="q* models (rough version from 2010-Apr-18)" date="1271683894" name="plot_qs_model.png" path="plot_qs_model.png" size="18315" stream="plot_qs_model.png" tmpFilename="/usr/tmp/CGItemp14792" user="aharel" version="1"
META FILEATTACHMENT attachment="plot_ci_model.png" attr="" comment="Contact interaction models" date="1271683918" name="plot_ci_model.png" path="plot_ci_model.png" size="19745" stream="plot_ci_model.png" tmpFilename="/usr/tmp/CGItemp14982" user="aharel" version="1"
META FILEATTACHMENT attachment="plot_qs_jes.png" attr="" comment="Test of binned JES systematics used in q* models (rough models from 2010-Apr-18)" date="1271692624" name="plot_qs_jes.png" path="plot_qs_jes.png" size="20965" stream="plot_qs_jes.png" tmpFilename="/usr/tmp/CGItemp14745" user="aharel" version="1"
>
>
META FILEATTACHMENT attachment="plot_qs_model.png" attr="" comment="q* models (rough version from 2010-Apr-18)" date="1271781275" name="plot_qs_model.png" path="plot_qs_model.png" size="18663" stream="plot_qs_model.png" tmpFilename="/usr/tmp/CGItemp23632" user="aharel" version="2"
META FILEATTACHMENT attachment="plot_ci_model.png" attr="" comment="Contact interaction models" date="1271784777" name="plot_ci_model.png" path="plot_ci_model.png" size="19745" stream="plot_ci_model.png" tmpFilename="/usr/tmp/CGItemp23777" user="aharel" version="3"

Revision 5 - 2010-04-19 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Line: 77 to 77
  plot_ci_model.png
  • q* models (rough version from 2010-Apr-18):
    plot_qs_model.png
Added:
>
>

Tests of systematic variations

  • Test of unbinned technique used for SM and contact interaction in [4].
  • Test of binned JES systematics used in q* models (rough models from 2010-Apr-18):
    plot_qs_jes.png
 

17-Mar-2010 Information

Line: 149 to 153
 
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271669722" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14703" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp14805" user="aharel" version="1"
META FILEATTACHMENT attachment="plot_qs_model.png" attr="" comment="q* models (rough version from 2010-Apr-18)" date="1271683894" name="plot_qs_model.png" path="plot_qs_model.png" size="18315" stream="plot_qs_model.png" tmpFilename="/usr/tmp/CGItemp14792" user="aharel" version="1"
META FILEATTACHMENT attachment="plot_ci_model.png" attr="" comment="Contact interaction models" date="1271683918" name="plot_ci_model.png" path="plot_ci_model.png" size="19745" stream="plot_ci_model.png" tmpFilename="/usr/tmp/CGItemp14982" user="aharel" version="1"
Added:
>
>
META FILEATTACHMENT attachment="plot_qs_jes.png" attr="" comment="Test of binned JES systematics used in q* models (rough models from 2010-Apr-18)" date="1271692624" name="plot_qs_jes.png" path="plot_qs_jes.png" size="20965" stream="plot_qs_jes.png" tmpFilename="/usr/tmp/CGItemp14745" user="aharel" version="1"

Revision 4 - 2010-04-19 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Line: 19 to 19
 In particular, the systematic effects have large bin-to-bin correlations, which should dominate any quantitative treatment of the systematics (see [2], Fig. 2, off-diagonal elements). For example, the relative JES effects at a particular jet pT should be propagated to a range of Mjj bins.
Changed:
<
<

The preferred technique is

>
>

The preferred technique

 (with a few questions to the SB interleaved)

  1. The test statistics are log likelihood ratios that do not take systematic effects into account.
Line: 71 to 71
 I plan to explore the 2nd approach, using a profile likelihood fit with the "usual" binomial statistics. But we do not yet have a consensus on whether this is necessary for an ICHEP publication.
  • Currently, we do not fit the (nuisance parameters of the) prediction to the data, and we do not have a detailed model of the bin-to-bin correlations. We have not yet discussed internally whether such a detailed description of the systematics is realistic for the ICHEP timescale.
Added:
>
>

Additional information

Input models

  • Contact interaction models:
    plot_ci_model.png
  • q* models (rough version from 2010-Apr-18):
    plot_qs_model.png
 

17-Mar-2010 Information

Following the discussion just before midnight in the 16-Mar statistics board meeting, here is a status report for the statistics in the di-jet ratio, with context and links to notes and talks.

Line: 140 to 147
 
META FILEATTACHMENT attachment="climLLR0_s2_1K.png" attr="" comment="Without systematics, 1K PDSs per ensemble" date="1271643673" name="climLLR0_s2_1K.png" path="climLLR0_s2_1K.png" size="12500" stream="climLLR0_s2_1K.png" tmpFilename="/usr/tmp/CGItemp14936" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR2_4K.png" attr="" comment="Limits without systematics" date="1271669682" name="climLLR2_4K.png" path="climLLR2_4K.png" size="16223" stream="climLLR2_4K.png" tmpFilename="/usr/tmp/CGItemp14761" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271669722" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14703" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp14805" user="aharel" version="1"
Added:
>
>
META FILEATTACHMENT attachment="plot_qs_model.png" attr="" comment="q* models (rough version from 2010-Apr-18)" date="1271683894" name="plot_qs_model.png" path="plot_qs_model.png" size="18315" stream="plot_qs_model.png" tmpFilename="/usr/tmp/CGItemp14792" user="aharel" version="1"
META FILEATTACHMENT attachment="plot_ci_model.png" attr="" comment="Contact interaction models" date="1271683918" name="plot_ci_model.png" path="plot_ci_model.png" size="19745" stream="plot_ci_model.png" tmpFilename="/usr/tmp/CGItemp14982" user="aharel" version="1"

Revision 3 - 2010-04-19 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Line: 44 to 44
 
      • preferably with +/- 1 and 2 sigma bands
    • the 95% quantiles for the corresponding contact interaction scenario (dark red line and points),
    • the observed LLR (shown for a "golden dataset" below by the solid black line)
Changed:
<
<
    • Example of visual summaries of the statistical analysis of contact interaction:
Without systematics (4K pseudodatasets per ensemble) With systematics (1K PDSs per ensemble)
climLLR0_4K.png climLLR0_s2_1K.png
      • without systematics, we expect to exclude masses up to a bit less than 5TeV, maybe somewhat less with bad luck.
      • with systematics, the median expected exclusion still extends above 4TeV, but 1.5 sigma's worth of bad luck will leave us with no worthwhile limits.
>
>
    • Examples of visual summaries of the statistical analysis of contact interaction:
Without systematics With systematics
climLLR2_4K.png climLLR1_1K.png
 

Additional statistical analysis of high mass region?

We would appreciate your input on whether an additional test statistic that answers the question:
Line: 140 to 138
 
META FILEATTACHMENT attachment="ratio_sys_var_note.pdf" attr="" comment="Note on systematic variations" date="1268850171" name="ratio_sys_var_note.pdf" path="ratio_sys_var_note.pdf" size="76351" stream="ratio_sys_var_note.pdf" tmpFilename="/usr/tmp/CGItemp5537" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR0_4K.png" attr="" comment="Limits with no systematics" date="1271642844" name="climLLR0_4K.png" path="climLLR0_4K.png" size="12429" stream="climLLR0_4K.png" tmpFilename="/usr/tmp/CGItemp14815" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR0_s2_1K.png" attr="" comment="Without systematics, 1K PDSs per ensemble" date="1271643673" name="climLLR0_s2_1K.png" path="climLLR0_s2_1K.png" size="12500" stream="climLLR0_s2_1K.png" tmpFilename="/usr/tmp/CGItemp14936" user="aharel" version="1"
Added:
>
>
META FILEATTACHMENT attachment="climLLR2_4K.png" attr="" comment="Limits without systematics" date="1271669682" name="climLLR2_4K.png" path="climLLR2_4K.png" size="16223" stream="climLLR2_4K.png" tmpFilename="/usr/tmp/CGItemp14761" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR1_1K.png" attr="" comment="Limits with systematics" date="1271669722" name="climLLR1_1K.png" path="climLLR1_1K.png" size="14703" stream="climLLR1_1K.png" tmpFilename="/usr/tmp/CGItemp14805" user="aharel" version="1"

Revision 2 - 2010-04-19 - AmnonHarel

Line: 1 to 1
 
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010
Changed:
<
<

Introduction

>
>

19-Apr-2010 Information

Some material in preparation for the April 20th meeting.

This supplements Robert's email and uses the references he provided:
[1] my talk Thursday - http://indico.cern.ch/conferenceDisplay.py?confId=91456
[2] the D0 paper - http://arxiv.org/abs/hep-ex/9807014
[3] the 2006-2007 CMS studies - http://cms.cern.ch/iCMS/jsp/openfile.jsp?tp=draft&files=AN2007_039_v4.pdf
Another reference I will use is
[4] the 2009 CMS studies - http://cms.cern.ch/iCMS/jsp/openfile.jsp?tp=draft&files=AN2009_161_v1.pdf

Our preferred limit setting technique for ICHEP, and the publication plan that motivates it, are described in [1]. This plan hinges on our expectation that, for ICHEP, the transition between the systematics-dominated and statistics-dominated regions and the transition between the regions explored by the Tevatron and those we will explore for the first time fall very close to each other, around 800GeV. It is clear that further development will be needed for future publications.

In particular, the systematic effects have large bin-to-bin correlations, which should dominate any quantitative treatment of the systematics (see [2], Fig. 2, off-diagonal elements). For example, the relative JES effects at a particular jet pT should be propagated to a range of Mjj bins.

The preferred technique is

(with a few questions to the SB interleaved)

  1. The test statistics are log likelihood ratios that do not take systematic effects into account.
    • the likelihood uses only bins above the Tevatron limits, e.g., above Mjj=800GeV, and in the case of contact interactions only bins below the contact interaction scale.
    • as per the standard technique for a Poisson ratio, the distributions are conditioned on the observed total counts.
  2. Use ensemble tests to account for the systematic uncertainties and derive "hybrid" 95%CL limits:
    • leading systematics are the relative and absolute JES. The corresponding nuisance parameters are drawn from Gaussian distributions.
    • we set limits as in Section 3.4 of [4].
      • the delay was to verify that this procedure holds when including the leading systematics correctly
    • in [4] we used the limit corresponding to the median of the SM distributions as the "predicted limit". It is not an expectation value. We're toying with adding +/- 1 sigma and 2 sigma bands.
      • Any objections to basing them all on the corresponding SM quantiles?
      • this assumes negligible uncertainties on the curve used to translate the observable into a limit. Only statistical uncertainties are relevant, and they can be reduced arbitrarily by increasing the size of the NP ensembles. We haven't yet chosen how to estimate the uncertainty on the 95th quantile, and welcome any SB input.
  3. For each m_q* we can plot (x-axis is m_q*, y-axis is x-section):
    • the SM predicted limit,
      • preferably with +/- 1 and 2 sigma bands
    • the predicted NP x-section,
      • can add the predicted NP limit, if worth the visual clutter
    • the observed limit
    Since the q* peaks are local, the q*/SM log likelihood ratios have similar localities, and such a plot provides a possible answer to the question: is the data consistent with the SM?
  4. For each lambda we can plot (x-axis is m_q*, y-axis is LLR):
    • the SM predicted LLR (dashed line below),
      • preferably with +/- 1 and 2 sigma bands
    • the 95% quantiles for the corresponding contact interaction scenario (dark red line and points),
    • the observed LLR (shown for a "golden dataset" below by the solid black line)
    • Example of visual summaries of the statistical analysis of contact interaction:
Without systematics (4K pseudodatasets per ensemble) With systematics (1K PDSs per ensemble)
climLLR0_4K.png climLLR0_s2_1K.png
      • without systematics, we expect to exclude masses up to a bit less than 5TeV, maybe somewhat less with bad luck.
      • with systematics, the median expected exclusion still extends above 4TeV, but 1.5 sigma's worth of bad luck will leave us with no worthwhile limits.
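Steps 1 and 2 of the list above can be sketched as follows. This is a toy illustration in plain Python, not the code behind the attached plots. It assumes each Mjj bin holds an inner and an outer count whose expected ratio is R, so that, conditioned on the bin total, the inner count is binomial with p = R/(1+R). The sign convention LLR = -2 ln(L_NP/L_SM), the single fully correlated Gaussian nuisance parameter, and all numbers are hypothetical.

```python
import math
import random
import statistics

def log_likelihood(n_in, n_out, ratios):
    """Binomial log likelihood conditioned on the total count in each bin."""
    total = 0.0
    for a, b, r in zip(n_in, n_out, ratios):
        p = r / (1.0 + r)                     # probability of an inner event
        total += a * math.log(p) + b * math.log(1.0 - p)
    return total

def llr(n_in, n_out, r_sm, r_np):
    """-2 ln(L_NP / L_SM); systematics are NOT part of the test statistic."""
    return -2.0 * (log_likelihood(n_in, n_out, r_np)
                   - log_likelihood(n_in, n_out, r_sm))

def ensemble(r_true, r_sm, r_np, totals, rel_sys, n_pseudo, rng):
    """LLR values for pseudo-datasets generated with a smeared true ratio."""
    values = []
    for _ in range(n_pseudo):
        shift = rng.gauss(0.0, 1.0)           # one nuisance draw per pseudo-dataset
        n_in, n_out = [], []
        for r, s, n in zip(r_true, rel_sys, totals):
            r_shifted = max(r * (1.0 + shift * s), 1e-6)
            p = r_shifted / (1.0 + r_shifted)
            inner = sum(rng.random() < p for _ in range(n))   # binomial draw
            n_in.append(inner)
            n_out.append(n - inner)
        values.append(llr(n_in, n_out, r_sm, r_np))
    return values

# Hypothetical three-bin setup: flat SM ratio, NP rising with mass.
r_sm, r_np = [0.5, 0.5, 0.5], [0.6, 0.8, 1.2]
totals, rel_sys = [200, 60, 15], [0.06, 0.06, 0.06]
rng = random.Random(7)
sm_llr = ensemble(r_sm, r_sm, r_np, totals, rel_sys, 500, rng)
np_llr = ensemble(r_np, r_sm, r_np, totals, rel_sys, 500, rng)

sm_median = statistics.median(sm_llr)                 # "predicted" SM LLR
np_q95 = sorted(np_llr)[int(0.95 * len(np_llr))]      # 95% quantile under NP
print(f"SM median LLR = {sm_median:.2f}, NP 95% quantile = {np_q95:.2f}")
```

With this convention, an observed LLR above the NP 95% quantile would disfavour that NP scenario; the +/- 1 and 2 sigma bands would come from the corresponding quantiles of `sm_llr`.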

Additional statistical analysis of high mass region?

We would appreciate your input on whether an additional test statistic that answers the question: "Is the data above the Tevatron limits consistent with our SM predictions?"
  1. must be published (my views against this are in [1], slide 9)
  2. is interesting enough to include in a paper
    • this probably depends on how difficult it will be to present such a test statistic. In [4] we developed a possible generic test statistic to answer that question ("G").
  • this region will be statistics-dominated for ICHEP

Additional statistical analysis of low mass region?

There are two more, closely-related questions that can be asked, and we would appreciate your input on them.
  1. "Is the data below the Tevatron limits consistent with our SM predictions?" - generic null hypothesis testing
    • This region is dominated by systematics, so any method that ignores systematics will likely yield "no" here.

  2. "Can the data below the Tevatron limits be fit by the expected systematic variations of our SM predictions?" - goodness of fit statistic
    • This is one method to answer the question above. If the fit is limited to the high statistics regime, even a "standard" chi^2 test may do well enough, as the binomial distributions for such large numbers can easily be approximated by Gaussians. Another variation is to perform the test on the inner and outer spectra, taking into account their correlation. Perhaps this is what was done in [2] (note the diagonal correlations in their Fig. 2).

To a large extent, this question will be answered in the "R" plot, which summarizes our data and predictions. We will draw Clopper-Pearson intervals on the data points, and the systematics on the predictions, which will allow the reader to see any significant deviation.
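A Clopper-Pearson interval for one data point can be computed as in the sketch below. It is a plain Python illustration that assumes the interval is built on the inner fraction p = n_in/(n_in + n_out) and then translated to the ratio R = p/(1-p); the 68.27% default level and the function names are assumptions. The direct sum over binomial terms is only suitable for moderate counts (up to a few hundred events).

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); decreases as p grows."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k + 1))

def _solve(func, target):
    """Bisection on [0, 1] for a function that decreases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if func(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clopper_pearson(k, n, cl=0.6827):
    """Central exact interval on the binomial probability for k out of n."""
    alpha = 1.0 - cl
    # Lower edge: P(X >= k | p) = alpha/2, i.e. cdf(k-1) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve(lambda p: binom_cdf(k - 1, n, p),
                                      1.0 - alpha / 2.0)
    # Upper edge: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _solve(lambda p: binom_cdf(k, n, p),
                                      alpha / 2.0)
    return lower, upper

def ratio_interval(n_in, n_out, cl=0.6827):
    """Translate the interval on p into an interval on R = p / (1 - p)."""
    lo, hi = clopper_pearson(n_in, n_in + n_out, cl)
    to_ratio = lambda p: math.inf if p >= 1.0 else p / (1.0 - p)
    return to_ratio(lo), to_ratio(hi)

r_lo, r_hi = ratio_interval(10, 20)   # hypothetical bin: 10 inner, 20 outer
print(f"R = {10 / 20:.3f}, interval = [{r_lo:.3f}, {r_hi:.3f}]")
```

The interval is asymmetric around the observed ratio, which matters most in the sparsely populated high-mass bins.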

It would be nice to have a statistical statement that covers both the high-statistics, Tevatron-excluded-NP region and the low-statistics, we're-exploring-new-energies region, especially as the fact that the crossover between the two regimes falls at basically the same point is a temporary coincidence expected for an ICHEP result.

I plan to explore the 2nd approach, using a profile likelihood fit with the "usual" binomial statistics. But we do not yet have a consensus on whether this is necessary for an ICHEP publication.

  • Currently, we do not fit the (nuisance parameters of the) prediction to the data, and we do not have a detailed model of the bin-to-bin correlations. We have not yet discussed internally whether such a detailed description of the systematics is realistic for the ICHEP timescale.
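To illustrate what fitting a nuisance parameter buys, here is a sketch in the Gaussian (high-statistics) limit, which is a simplification swapped in for the binomial profile likelihood mentioned above. It assumes a single nuisance parameter with a unit Gaussian constraint that shifts every bin coherently by a known amount; in that case the profiled chi^2 has a closed form. All names and numbers are hypothetical.

```python
def profiled_chi2(data, prediction, stat_err, sys_shift):
    """Minimum over theta of sum((d - m - theta*t)^2 / e^2) + theta^2.

    Returns (chi2_min, theta_hat). Equivalent to a chi^2 with covariance
    diag(e^2) + t t^T, i.e. fully correlated bin-to-bin systematics.
    """
    q = a = s = 0.0
    for d, m, e, t in zip(data, prediction, stat_err, sys_shift):
        resid = d - m
        q += (resid / e) ** 2          # naive chi^2 without systematics
        a += resid * t / e ** 2
        s += (t / e) ** 2
    theta_hat = a / (1.0 + s)          # best-fit nuisance parameter
    return q - a * a / (1.0 + s), theta_hat

# Data sitting exactly one systematic shift above the prediction.
prediction = [0.50, 0.50, 0.50]
sys_shift = [0.03, 0.03, 0.03]         # +1 sigma effect on R in each bin
stat_err = [0.01, 0.01, 0.01]
data = [m + t for m, t in zip(prediction, sys_shift)]

chi2, theta = profiled_chi2(data, prediction, stat_err, sys_shift)
naive = sum(((d - m) / e) ** 2 for d, m, e in zip(data, prediction, stat_err))
print(f"naive chi2 = {naive:.2f}, profiled chi2 = {chi2:.3f}, theta = {theta:.3f}")
```

A coherent one-sigma shift that would look like a gross disagreement when systematics are ignored is absorbed almost entirely by the fit, which is the behaviour expected in the systematics-dominated low-mass region.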

17-Mar-2010 Information

  Following the discussion just before midnight in the 16-Mar statistics board meeting, here is a status report for the statistics in the di-jet ratio, with context and links to notes and talks.
Line: 65 to 136
 
      • the last two (3 & 4) look supported and should work cleanly, but they also look like unusual and complicated use cases.
    • thus migrating the code to RooStats is possible but not trivial, and is detrimental to having an ICHEP result.
Added:
>
>
 
META FILEATTACHMENT attachment="ratio_sys_var_note.pdf" attr="" comment="Note on systematic variations" date="1268850171" name="ratio_sys_var_note.pdf" path="ratio_sys_var_note.pdf" size="76351" stream="ratio_sys_var_note.pdf" tmpFilename="/usr/tmp/CGItemp5537" user="aharel" version="1"
Added:
>
>
META FILEATTACHMENT attachment="climLLR0_4K.png" attr="" comment="Limits with no systematics" date="1271642844" name="climLLR0_4K.png" path="climLLR0_4K.png" size="12429" stream="climLLR0_4K.png" tmpFilename="/usr/tmp/CGItemp14815" user="aharel" version="1"
META FILEATTACHMENT attachment="climLLR0_s2_1K.png" attr="" comment="Without systematics, 1K PDSs per ensemble" date="1271643673" name="climLLR0_s2_1K.png" path="climLLR0_s2_1K.png" size="12500" stream="climLLR0_s2_1K.png" tmpFilename="/usr/tmp/CGItemp14936" user="aharel" version="1"

Revision 1 - 2010-03-17 - AmnonHarel

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="AmnonHarelJetRatio"
-- AmnonHarel - 17-Mar-2010

Introduction

Following the discussion just before midnight in the 16-Mar statistics board meeting, here is a status report for the statistics in the di-jet ratio, with context and links to notes and talks.

There are several reasons to adopt different tools, and a different approach, for an ICHEP analysis than were used so far in the dijet ratio. This demotes some of the documentation to a record of historical attempts; if you wish to skip that, don't read sections 3.2 to 3.4 in the analysis note, and just look for "LLR" plots in the presentations, ignoring other test statistics.

Given an observable sensitive to a wide range of models, I started by looking for a test statistic that maintains that generality. This is documented in the AN. The presentation shows extensions, and also that systematic uncertainties have a huge effect (*). This shifts the focus to how to handle the systematics, and indicates we should switch to using the more familiar log-likelihood ratios (LLRs) so we can focus on the systematics.

(*) This is a very preliminary and unexpected result which I haven't fully debugged yet.

Given Jim's 7TeV numbers, and assuming ~3pb-1 for ICHEP, we are looking at systematic uncertainties of roughly 0.03 (absolute on R) and statistical uncertainties will dominate (i.e. >0.06) from around 800GeV. The Tevatron exclusions are Lambda<2.7TeV and Mq*<0.87TeV. So we can analyze our data starting from 870GeV, where systematics should not dominate, and ruling out a q* of 1TeV is in play (see Fig. 11 in AN, though the wrong Ecm is used), and so is a 3TeV contact interaction (see Jim's presentation).

To put it differently - for ICHEP we have the option of doing the analysis so that statistics dominates IN EVERY SINGLE BIN OF INTEREST.

  • covering only this case will leave us badly placed if some systematic effect turns out bigger than expected and increases the systematics by more than a factor of 2.
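The comparison of statistical and systematic uncertainties on R can be made concrete with a small sketch. It assumes R is the ratio of two independent Poisson counts (inner over outer) and uses standard error propagation; the factor-of-2 margin is read off from the 0.03 and 0.06 quoted above, and the example counts are hypothetical.

```python
import math

def ratio_stat_error(n_in, n_out):
    """Approximate statistical uncertainty on R = n_in / n_out."""
    r = n_in / n_out
    return r * math.sqrt(1.0 / n_in + 1.0 / n_out)

def statistics_dominate(n_in, n_out, sys_error=0.03, margin=2.0):
    """True if the statistical error exceeds margin times the systematic one."""
    return ratio_stat_error(n_in, n_out) > margin * sys_error

for n_in, n_out in [(1000, 2000), (100, 200), (10, 20)]:
    err = ratio_stat_error(n_in, n_out)
    print(f"n_in={n_in} n_out={n_out}: stat error on R = {err:.3f}, "
          f"statistics dominate: {statistics_dominate(n_in, n_out)}")
```

For R near 0.5 this puts the crossover at roughly two hundred outer events per bin; the approximation breaks down for very small counts, where exact intervals are the better tool.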

Selected documentation:

Now what?

I see two main options:
  1. use the current tools, with local limit setting for q* and an LLR, and ignoring all bins below ~800GeV.
    • Should work.
      • Latest test in a different "edge of statistical strength" scenario (in presentation) unexpectedly failed - some more tests and debugging are in order.
    • I will test with contact interactions early next week, and with Jim's help I'll also prepare the q* results later next week.
    • if the current tools are insufficient, I expect that incorporating the basic uncertainty in the test statistic, e.g. by doing a fit for the relative JES within each likelihood ("log profile-likelihood ratio"?) will handle the leading, and all other systematics.
  2. start over in RooStats, using only likelihoods.
    • re-examining RooFit and RooStats' latest versions with excellent help from Genadi Kukarzev, it seems that it should be possible to do our analysis in RooStats. Some of the things we need seem to be a bit unusual / supported in a cumbersome way, which I find a bit worrisome. But hopefully all works as it should.
    • the features we need are:
      1. ensemble testing by histograms (without individual events) - supported in latest version :-)
      2. variable binning - supported in a round-about way (seems we need to name each bin)
      3. using a different statistics model and ensemble-generation models
      4. defining the test statistic as a function of functions of the data, rather than the data itself
      • the last two (3 & 4) look supported and should work cleanly, but they also look like unusual and complicated use cases.
    • thus migrating the code to RooStats is possible but not trivial, and is detrimental to having an ICHEP result.

META FILEATTACHMENT attachment="ratio_sys_var_note.pdf" attr="" comment="Note on systematic variations" date="1268850171" name="ratio_sys_var_note.pdf" path="ratio_sys_var_note.pdf" size="76351" stream="ratio_sys_var_note.pdf" tmpFilename="/usr/tmp/CGItemp5537" user="aharel" version="1"
 