--
AmnonHarel - 12-Sep-2010
How should we interpret the "disappearance" of the CLs limits at high lambda?
CLs basics
The "modified frequentist" CLs approach is to exclude regions of phase space where $CL_s < 1 - \alpha$, where $\alpha$ is the desired confidence level. Here, $CL_s = CL_{sb} / CL_b$.
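As a minimal sketch of how such a CLs value could be computed from two ensembles of pseudo-datasets, assuming CLs is the ratio of the two tail fractions (the standard definition) and that both fractions count LLR values below the cut, as the N_{S+B} columns on this page do. The function names and the toy numbers are illustrative assumptions, not the analysis code.

```python
def fraction_below(llr_values, llr_cut):
    """Fraction of pseudo-datasets (PDSs) whose LLR is below llr_cut."""
    return sum(1 for v in llr_values if v < llr_cut) / len(llr_values)


def cls_at(llr_new_physics, llr_sm, llr_cut):
    """CLs at llr_cut: new-physics tail fraction over SM tail fraction."""
    cl_sb = fraction_below(llr_new_physics, llr_cut)
    cl_b = fraction_below(llr_sm, llr_cut)
    if cl_b == 0.0:
        return None  # no SM PDSs this far out, so CLs is undefined
    return cl_sb / cl_b


# Toy ensembles (made-up numbers, not from the analysis).
sm_ensemble = [-4.0, -3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0]
np_ensemble = [-2.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]

cls = cls_at(np_ensemble, sm_ensemble, llr_cut=-2.1)
excluded = cls is not None and cls < 0.05  # 95% confidence level
```

With these toy numbers one new-physics PDS out of ten and three SM PDSs out of eight lie below the cut, so the point is far from being excluded.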
Fine print
None of this really matters in the end, but for completeness:
- When the number of pseudo datasets (PDSs) that underlie either CLb or CLsb at the limit is below 10, we consider the determination of the limit unreliable and do not quote it.
- For the final results, we generate enough PDSs at the crucial points in phase space to prevent this requirement from affecting the results, except where it clearly removes statistical noise (see
TeV example below)
- Since the test statistic is the log likelihood ratio (LLR), with or without systematics, when we define an excluded region of the LLR, it will normally be simply connected and include $-\infty$ (the sign convention is such that this is the value most unlike the new physics scenario). We considered the possibility that when the LLR does not include systematics, this may fail to hold, in which case we would quote only the uppermost excluded LLR in the lowest exclusion region. Currently, we have no reason to believe this abnormal situation arises in our measurement except as a statistical fluctuation (from finite ensemble size).
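The fine print above can be sketched as a scan over candidate LLR cuts. This is only an illustration under stated assumptions: the function name `cls_limit`, the use of "LLR at or below the cut" for the counts, and the toy ensembles are all hypothetical, and the real code may interpolate or bin differently.

```python
from bisect import bisect_right


def cls_limit(llr_new_physics, llr_sm, alpha=0.05, min_count=10):
    """Uppermost excluded LLR in the lowest exclusion region, or None.

    Returns None when no cut is excluded, or when fewer than min_count
    PDSs underlie either tail fraction at the limit.
    """
    sb, b = sorted(llr_new_physics), sorted(llr_sm)
    limit = None
    for cut in sorted(set(sb) | set(b)):
        n_sb = bisect_right(sb, cut)  # new-physics PDSs with LLR <= cut
        n_b = bisect_right(b, cut)    # SM PDSs with LLR <= cut
        if n_b == 0:
            continue  # CLs undefined below the lowest SM PDS
        cls = (n_sb / len(sb)) / (n_b / len(b))
        if cls < alpha:
            limit = (cut, n_sb, n_b)
        elif limit is not None:
            break  # we have left the lowest exclusion region
    if limit is None:
        return None
    cut, n_sb, n_b = limit
    if min(n_sb, n_b) < min_count:
        return None  # determination considered unreliable, not quoted
    return cut


# Toy ensembles: flat LLR distributions displaced by 4 units (illustrative).
sm_toy = [(i - 500) / 100 for i in range(1000)]
np_toy = [(i - 100) / 100 for i in range(1000)]
limit = cls_limit(np_toy, sm_toy)
```

In this toy the limit sits at LLR = -0.8, where 21 new-physics PDSs and 421 SM PDSs lie at or below the cut; raising `min_count` above 21 would suppress it.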
Final results
What changed
- Found a bug that greatly reduced the shift uncertainty in all results, and another that reduced the absolute JES in the results sent to the statistics board during CWR. The three leading uncertainties were unaffected by these bugs, so the results barely change. It is only due to the finicky nature of CLs limits in the regime we are in that these tiny effects must be resolved.
- Better understanding of the low LLR tails, described in the next section
- Now using the CMS LPC batch system - generating obscene amounts of pseudo-datasets has never been easier
- Decided on the stopping conditions. We stop generating PDSs at a given lambda value once either of the following holds:
- the lambda value is excluded / allowed at the 2 sigma level
- the CLs value (i.e. the confidence level of the exclusion) is known to 0.5% accuracy
- note that this second condition is necessary: without it, at the key lambda value where the exclusion truly is 95%, only a large statistical fluctuation would yield a 2 sigma separation between the true CLs level and the desired 5%.
- as a benchmark, closure tests of common frequentist methods often show inaccuracies at the 1% level.
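A minimal sketch of the stopping conditions, assuming they are combined with a logical "or" and that the 0.5% accuracy refers to the MC statistical error on CLs. The function name and return labels are hypothetical; the CLs values and errors used in the checks are the data-column entries quoted in the tables on this page.

```python
def stop_reason(cls, cls_err, alpha=0.05, n_sigma=2.0, target_err=0.005):
    """Return why PDS generation can stop at this lambda, or None to go on."""
    if abs(cls - alpha) > n_sigma * cls_err:
        return "separated"  # excluded / allowed at the n_sigma level
    if cls_err <= target_err:
        return "accurate"   # CLs known to the target accuracy
    return None


# 4.00 TeV, final table: CLs = 0.0489 +- 0.0041
reason_400 = stop_reason(0.0489, 0.0041)
# 4.10 TeV, final table: CLs = 0.0636 +- 0.0054
reason_410 = stop_reason(0.0636, 0.0054)
# 4.00 TeV, "Full systematics" comparison ensemble: CLs = 0.0415 +- 0.0071
reason_full = stop_reason(0.0415, 0.0071)
```

The second condition is what terminates the 4.00 TeV point: its CLs is only about a quarter of a sigma from 5%, so the separation test alone would never fire there.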
What drives the low LLR tails?
The following earlier results hinted that statistical effects drive the low tails:
- the nuisance parameters in the tails are as one would expect from their correlations with LLR in the bulk. They are only mildly affected.
- the plot of the ensemble broken down by the last non-empty inner bin:
To check this further, we took a key ensemble and filtered out PDSs where any of the nuisance parameters is more than 2 sigma away from its nominal value, and checked how this affects the LLR tails.
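The filter described here could look roughly like the following sketch. The record layout (a dict with an `llr` value and a list of `pulls`, i.e. each nuisance parameter's deviation from nominal in units of its sigma) is a hypothetical stand-in for whatever the real ensemble files store.

```python
def within_n_sigma(pds_list, n_sigma=2.0):
    """Keep PDSs whose nuisance parameters are all within n_sigma of nominal."""
    return [p for p in pds_list
            if all(abs(pull) <= n_sigma for pull in p["pulls"])]


# Toy ensemble (made-up values): the second PDS has a 2.5 sigma pull.
toy_pds = [
    {"llr": -3.5, "pulls": [0.1, -0.4]},
    {"llr": -3.9, "pulls": [2.5, 0.0]},
    {"llr": 0.2, "pulls": [-1.9, 1.2]},
]
truncated = within_n_sigma(toy_pds)
kept_llr = [p["llr"] for p in truncated]
```

The tail fractions are then recomputed on the truncated ensemble and compared with those from the full one.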

To see the key numbers more clearly, here are the relevant table entries (tables explained below):
|  | Lambda (TeV) | N_{PDS} | At the CLs point: N_{S+B} | At the CLs point: CLb | At the CLs point: CLsb | At the CLs point: LLR value | Frequentist limiting: LLR value | Data: LLR value | Data: N_{S+B} | Data: CLs | Data: MC stat. err. on CLs |
| Full systematics | 4.00 | 151949 | 104 | 0.0007 | 0.0137 | -3.38 | -1.67 | -3.57 | 40 | 0.0415 | 0.0071 |
| Truncated systematics | 4.00 | 192126 | 86 | 0.0004 | 0.0089 | -3.50 | -1.69 | -3.57 | 55 | 0.0451 | 0.0063 |
The effect on CLs is negligible.
We conclude that the low LLR tails are driven entirely by the statistics, and our ability to study them is not limited by the accuracy of the systematic uncertainties.
Plots
The dashed red line in the right hand plots is the CLs=5% line.
- The two peak structure visible at 5.0TeV is due to having one very high mass event, which can be either inner or outer. See also the "old" plots of 4.05TeV below, with a breakdown by the last non-empty inner bin.
The table
"N_{S+B}" is the number of events in the new physics (QCD+contact interaction) ensembles whose LLR is below the relevant LLR value (CLs limiting value or observed data value).
| Lambda (TeV) | N_{PDS} | At the CLs point: N_{S+B} | At the CLs point: CLb | At the CLs point: CLsb | At the CLs point: LLR value | Frequentist limiting: LLR value | Data: LLR value | Data: N_{S+B} | Data: CLs | Data: MC stat. err. on CLs |
| 1.80 | 4000 | 199 | 0.0497 | 0.9943 | -1.09 | -1.06 | -27.75 | 1 | 0.0009 | 0.0009 |
| 1.85 | 4000 | 198 | 0.0494 | 0.9888 | -1.53 | -1.46 | -25.50 | 1 | 0.0010 | 0.0010 |
| 1.90 | 4000 | 197 | 0.0493 | 0.9854 | -2.02 | -1.93 | -24.14 | 1 | 0.0011 | 0.0011 |
| 1.95 | 4000 | 196 | 0.0490 | 0.9791 | -2.18 | -2.08 | -23.32 | 0 | 0.0005 | 0.0005 |
| 2.00 | 4000 | 194 | 0.0486 | 0.9720 | -2.38 | -2.24 | -21.23 | 0 | 0.0000 | 0.0000 |
| 2.20 | 4000 | 181 | 0.0453 | 0.9069 | -2.79 | -2.49 | -15.19 | 1 | 0.0019 | 0.0019 |
| 2.40 | 4000 | 180 | 0.0449 | 0.8975 | -2.05 | -1.77 | -13.98 | 0 | 0.0000 | 0.0000 |
| 2.60 | 4000 | 155 | 0.0387 | 0.7745 | -2.55 | -1.99 | -10.90 | 0 | 0.0000 | 0.0000 |
| 2.80 | 4000 | 125 | 0.0312 | 0.6244 | -2.80 | -2.02 | -8.84 | 0 | 0.0000 | 0.0000 |
| 3.00 | 4000 | 88 | 0.0220 | 0.4395 | -3.01 | -2.09 | -7.02 | 0 | 0.0013 | 0.0013 |
| 3.20 | 4000 | 44 | 0.0110 | 0.2190 | -3.64 | -2.09 | -5.99 | 1 | 0.0101 | 0.0101 |
| 3.40 | 30000 | 179 | 0.0060 | 0.1191 | -3.41 | -1.79 | -4.86 | 13 | 0.0276 | 0.0079 |
| 3.60 | 244084 | 455 | 0.0019 | 0.0373 | -3.50 | -1.64 | -4.00 | 120 | 0.0373 | 0.0035 |
| 3.80 | 92098 | 187 | 0.0020 | 0.0406 | -3.01 | -1.43 | -3.66 | 23 | 0.0321 | 0.0068 |
| 4.00 | 472238 | 164 | 0.0003 | 0.0069 | -3.56 | -1.69 | -3.57 | 151 | 0.0489 | 0.0041 |
| 4.05 | 413663 | 12 | 0.0000 | 0.0006 | -3.83 | -1.66 | -3.41 | 156 | 0.0566 | 0.0047 |
| 4.10 | 389236 | 46 | 0.0001 | 0.0024 | -3.44 | -1.62 | -3.27 | 151 | 0.0636 | 0.0054 |
| 4.15 | 420161 | --- | --- | --- | --- | -1.63 | -3.18 | 171 | 0.0670 | 0.0054 |
| 4.20 | 181056 | 4 | 0.0000 | 0.0005 | -3.34 | -1.55 | -3.03 | 71 | 0.0701 | 0.0088 |
| 5.00 | 25000 | --- | --- | --- | --- | -1.08 | -1.49 | 48 | 0.2426 | 0.0418 |
Conclusions
Going over the crucial lambda values one by one:
- 3.6TeV: Though the separation is low enough that it is almost power-constrained away, this lambda value is clearly excluded.
- 3.8TeV: excluded.
- 4.0TeV: The data CLs is known to an accuracy of 0.5%, and at that accuracy it is below 5%, so 4.0TeV is excluded.
- 4.05TeV: The data CLs is known to an accuracy of 0.5%, and at that accuracy it is above 5%, so 4.05TeV is not excluded.
- 4.1TeV: A critical CLs value probably exists at 4.1TeV. Our data lies above the possible critical value (at >2 sigma of MC statistics), so 4.1TeV is not excluded.
- 4.15TeV: If a critical CLs value exists, the data value is above it (at 3 sigma of MC statistics), so 4.15TeV cannot be excluded.
- 4.2TeV: There is some indication that a critical CLs value exists, but the data value is above it (at >2 sigma of MC statistics), so 4.2TeV is not excluded.
- 5.0TeV: If a critical CLs value exists, the data value is well above it, so 5.0TeV cannot be excluded.
So, as usual, we exclude lambda values less than or equal to 4.0TeV.
What is new is that CLs exclusion values are available at 4.05, 4.1, and 4.2TeV.
Presentation
Continue CLs line out to 4.1TeV. This shows the crossover, and should satisfy the reader looking for physics. For the statistics aficionado, the presence of an exclusion at 4.2 is interesting, but it would be difficult to communicate this visually in the graph, and it would be distracting from the main results of the paper. Presumably this information will be made public in a table format for such needs. Given such a presentation, it is not clear that any additional text is required. In short - as far as can be shown with our tools this is a plain-vanilla CLs situation, and it should be shown as such.
Updated results during CWR (30th of September)
Generated an obscene amount of additional PDSs...
Tools
- Added statistical uncertainties on the CLs values.
- Since the number of PDSs ($N_{PDS}$) is predetermined, the uncertainties on CLb and CLsb are (independent) binomial efficiency problems. I use the usual approximation of putting in the observed efficiency ($\hat{p} = k / N_{PDS}$, where $k$ is the number of PDSs below the LLR value). This gives an uncertainty on the relevant fraction ($\hat{p}$, either CLb or CLsb) of $\sqrt{\hat{p}(1-\hat{p})/N_{PDS}}$.
- Then the two numbers are combined with standard error propagation, which implies the Gaussian approximation.
- Added more diagnostic columns to the tables: N_{S+B} (explained below), CLs at data with its uncertainty due to MC statistics.
- Plots now available for all relevant lambda values
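The uncertainty treatment described in the first item can be sketched as follows: a binomial error on each tail fraction, combined by standard (Gaussian) error propagation for a ratio. The function names are illustrative. In the check, the SM count of 3088 is not quoted anywhere on this page: it is back-computed from the tabulated CLs at 4.00TeV under the assumption that both ensembles have the same size, so treat it as a consistency check only.

```python
import math


def binomial_fraction(k, n):
    """Observed fraction k/n and its binomial uncertainty sqrt(p(1-p)/n)."""
    p = k / n
    return p, math.sqrt(p * (1.0 - p) / n)


def cls_with_error(k_sb, n_sb, k_b, n_b):
    """CLs = p_sb / p_b with independent binomial errors propagated.

    Requires k_b > 0; with k_sb = 0 the approximation gives zero error.
    """
    p_sb, e_sb = binomial_fraction(k_sb, n_sb)
    p_b, e_b = binomial_fraction(k_b, n_b)
    cls = p_sb / p_b
    err = math.sqrt((e_sb / p_b) ** 2 + (p_sb * e_b / p_b ** 2) ** 2)
    return cls, err


# 4.00 TeV, final table: N_PDS = 472238, N_{S+B} at data = 151.
# k_b = 3088 is an assumption (back-computed, equal ensemble sizes).
cls, err = cls_with_error(151, 472238, 3088, 472238)
```

Under these assumptions the sketch reproduces the tabulated 0.0489 with an error close to the quoted 0.0041, dominated by the 151 new-physics PDSs in the tail.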
Plots
The dashed red line in the right hand plots is the CLs=5% line.
- The two peak structure visible at 5.0TeV is due to having one very high mass event, which can be either inner or outer. See also the "old" plots of 4.05TeV below, with a breakdown by the last non-empty inner bin.
The table
"N_{S+B}" is the number of events in the new physics (QCD+contact interaction) ensembles whose LLR is below the relevant LLR value (CLs limiting value or observed data value).
| Lambda (TeV) | N_{PDS} | At the CLs point: N_{S+B} | At the CLs point: CLb | At the CLs point: CLsb | At the CLs point: LLR value | Frequentist limiting: LLR value | Data: LLR value | Data: N_{S+B} | Data: CLs | Data: MC stat. err. on CLs |
| 1.80 | 4000 | 199 | 0.0498 | 0.9963 | -1.01 | -0.99 | -27.75 | 0 | 0.0000 | 0.0000 |
| 1.85 | 4000 | 199 | 0.0497 | 0.9939 | -0.82 | -0.78 | -25.50 | 0 | 0.0000 | 0.0000 |
| 1.90 | 4000 | 197 | 0.0494 | 0.9875 | -1.49 | -1.41 | -24.14 | 0 | 0.0005 | 0.0005 |
| 1.95 | 4000 | 198 | 0.0495 | 0.9906 | -0.78 | -0.73 | -23.32 | 1 | 0.0015 | 0.0015 |
| 2.00 | 4000 | 196 | 0.0489 | 0.9787 | -2.04 | -1.92 | -21.23 | 0 | 0.0002 | 0.0002 |
| 2.20 | 4000 | 185 | 0.0462 | 0.9238 | -2.24 | -2.03 | -15.19 | 0 | 0.0000 | 0.0000 |
| 2.40 | 4000 | 177 | 0.0443 | 0.8856 | -2.17 | -1.83 | -13.98 | 0 | 0.0000 | 0.0000 |
| 2.60 | 4000 | 184 | 0.0460 | 0.9194 | -0.29 | -0.12 | -10.90 | 0 | 0.0000 | 0.0000 |
| 2.80 | 4000 | 166 | 0.0414 | 0.8276 | -0.94 | -0.71 | -8.84 | 0 | 0.0000 | 0.0000 |
| 3.00 | 4000 | 84 | 0.0209 | 0.4188 | -3.10 | -1.95 | -7.02 | 2 | 0.0211 | 0.0162 |
| 3.20 | 4000 | 63 | 0.0157 | 0.3132 | -2.99 | -1.81 | -5.99 | 2 | 0.0181 | 0.0148 |
| 3.40 | 4000 | 19 | 0.0047 | 0.0933 | -3.46 | -1.67 | -4.86 | 1 | 0.0265 | 0.0268 |
| 3.60 | 21000 | 121 | 0.0058 | 0.1153 | -2.74 | -1.46 | -4.00 | 10 | 0.0389 | 0.0123 |
| 3.80 | 12000 | 28 | 0.0023 | 0.0466 | -2.90 | -1.38 | -3.66 | 2 | 0.0261 | 0.0187 |
| 4.00 | 16000 | 32 | 0.0020 | 0.0395 | -3.02 | -1.47 | -3.57 | 1 | 0.0057 | 0.0057 |
| 4.05 | 181000 | 346 | 0.0019 | 0.0382 | -2.89 | -1.47 | -3.41 | 44 | 0.0403 | 0.0063 |
| 4.10 | 146000 | 178 | 0.0012 | 0.0244 | -2.91 | -1.46 | -3.27 | 35 | 0.0434 | 0.0076 |
| 4.15 | 35000 | --- | --- | --- | --- | -1.46 | -3.18 | 15 | 0.0793 | 0.0216 |
| 4.20 | 4000 | --- | --- | --- | --- | -1.61 | -3.03 | 4 | 0.1469 | 0.0859 |
| 5.00 | 4000 | --- | --- | --- | --- | -1.07 | -1.49 | 11 | 0.4184 | 0.1493 |
Conclusions
Why did we see fake rises?
- The main problem is that we evaluated the significance using the number of PDSs in the low-end rise region. The right way to think of it is by analogy to a bump hunt: is there a significant excess over background? Just using the number of PDSs is looking at S+B. In particular, a high B is a reason to distrust an excess, not to trust it.
- It also hurt that the N_{S+B} numbers weren't readily available. They are in the tables now (they mostly help answer other questions).
- also a mild "look elsewhere" effect
Which CLs limits should we use?
It is not safe to use CLs limits which are based on small tail probabilities. How small?
- From these results, anything below ~0.001 is suspect
- proved irrelevant - see above! Our description of systematic uncertainties wasn't meant to cover such extreme cases. It's probably good enough till at least 3 sigma --> ~0.0015. It's certainly not good enough at 4 sigma (truncated there) --> ~0.00003. Also the choice of the prior for the JES wasn't checked below ~0.001 (see AmnonHarelStatusReportForStatisticsBoard#New_prior_shapes_for_nuisance_pa)
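The sigma-to-probability conversions quoted above are roughly reproduced by one-sided Gaussian tail probabilities. Reading them as one-sided is an assumption on my part; the helper name is illustrative.

```python
import math


def one_sided_tail(n_sigma):
    """Probability for a unit Gaussian to fluctuate beyond n_sigma on one side."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


tail_3 = one_sided_tail(3.0)  # about 0.00135
tail_4 = one_sided_tail(4.0)  # about 0.0000317
```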
Going over the crucial lambda values one by one:
- Though the separation at lambda=3.6TeV is low enough that it is almost power-constrained away, it is certain that a CLs limit exists and the data lies below it, so this value is excluded.
- 4.0TeV is excluded
- A CLs limit exists at 4.05TeV. It is probably above the data, so that 4.05 is excluded. I'd guesstimate the significance of the previous statement, in terms of MC statistics, at around 2 sigma.
- A CLs limit exists at 4.1TeV. There is a hint that it is above the data, so that 4.1 is excluded. I'd guesstimate the significance of the previous statement, in terms of MC statistics, at around 0.5 sigma.
- There is a hint that no CLs limit exists at 4.15TeV. If a crucial CLs value exists, the data value can easily (~50%?) lie above it.
So the best choice is to stick with the limits presented in the CWR, and add one more item to the discussion of the difficulties with CLs:
"CLs sometimes requires estimating extreme tails which stresses the validity of the systematic variations and is difficult using brute force ensemble testing."
the above may be true, but we now (after CWR) realize it is irrelevant to this measurement
Need to decide on the right presentation though...
Improved plots with ensembles as of a bit before CWR (early 21st of September)
The plots in this section use systematics which are slightly lower than the final ones. There are no qualitative differences: in particular, the sensitivity runs out at 4TeV either way, and quantitative differences are likely to be tiny.
The simple cases
To understand how the CLs limits go away, we'll look at plots of CLs as a function of LLR (for a given new physics model, here contact interactions).
Easy exclusion
No exclusion
[Plots: LLR distributions and CLs plot, shown with systematics (as usual) and with only statistical variations]
The leftmost "0" is real, but due to a single point from the SM ensemble. The 0,0 points (quite a few, actually) are an artifact that does not affect the code (these plots were meant to be internal to the code...).
CLs with borderline sensitivity
More than one behavior near the edge of the experimental sensitivity is possible. Here are two simple scenarios. In both scenarios, under both hypotheses (i.e. in both ensembles), the LLR is distributed as a Gaussian and the two distributions are displaced by an amount that decreases as we run out of experimental sensitivity.
- 1 - the Gaussians have the same width
- in this scenario, the lower the LLR (below the SM peak), the more the SM is preferred.
- thus, as $\Lambda$ increases and we run out of experimental sensitivity, the CLs limit exists, and will rapidly drop to $-\infty$. However, our ability to determine its correct value will deteriorate quickly - brute force ensemble testing is not a suitable tool for learning about the tails of distributions.
- this is the behavior that was observed in the ICHEP results, where increasing the ensemble size by an order of magnitude extended the limit
- 2 - the new physics distribution is wider
- in this scenario, for very low LLR values (well below the SM peak), the SM is no longer preferred.
- thus, as $\Lambda$ increases and we run out of experimental sensitivity, the CLs limit no longer exists as the plot is always above 0.05
- this is the behavior observed in the current results - increasing the ensemble size does not extend the limit
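The two scenarios can be illustrated with analytic Gaussians. This is a toy under stated assumptions: the SM LLR distribution is a unit Gaussian at 0, the new physics one is displaced upward by 0.5, and the widths 1.0 and 1.3 are arbitrary illustrative choices. CLs is taken as the ratio of the two lower-tail fractions.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with erfc so far tails stay accurate."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def cls_curve(llr, mu_np, sigma_np, mu_sm=0.0, sigma_sm=1.0):
    """CLs as a function of LLR for two Gaussian LLR distributions."""
    return norm_cdf((llr - mu_np) / sigma_np) / norm_cdf((llr - mu_sm) / sigma_sm)


grid = [i / 10 for i in range(-80, 41)]  # LLR from -8.0 to 4.0

# Scenario 1: same width. CLs keeps falling toward low LLR, so a limit exists,
# but only deep in the tails (here CLs crosses 0.05 between LLR -6 and -5).
min_equal = min(cls_curve(x, mu_np=0.5, sigma_np=1.0) for x in grid)

# Scenario 2: wider new physics distribution. CLs turns back up at low LLR
# and never gets anywhere near 0.05, so no limit exists.
min_wider = min(cls_curve(x, mu_np=0.5, sigma_np=1.3) for x in grid)
```

In scenario 1 the crossing sits more than five SM widths below the SM peak, which is why brute force ensembles struggle to locate it; in scenario 2 the minimum of the curve stays around 0.7.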
The current borderline sensitivity region
[Plots, first set: LLR distributions; CLs plot from 0 to 1 and zoomed in]
[Plots, second set: LLR distributions; CLs plot from 0 to 1 and zoomed in; LLR breakdown by highest non-zero inner bin]
[Plots, third set: LLR distributions; CLs plot from 0 to 1 and zoomed in, with one panel marked "---"]
The table
| Lambda (TeV) | N_{PDS} | At the CLs point: N_{S+B} | At the CLs point: CLb | At the CLs point: CLsb | At the CLs point: LLR value | Frequentist limiting: LLR value | Data: LLR value | Data: N_{S+B} | Data: CLs | Data: MC stat. err. on CLs |
| 1.80 | 4000 | 199 | 0.0498 | 0.9963 | -1.01 | -0.99 | -27.75 | 0 | 0.0000 | 0.0000 |
| 1.85 | 4000 | 199 | 0.0497 | 0.9939 | -0.82 | -0.78 | -25.50 | 0 | 0.0000 | 0.0000 |
| 1.90 | 4000 | 197 | 0.0494 | 0.9875 | -1.49 | -1.41 | -24.14 | 0 | 0.0005 | 0.0005 |
| 1.95 | 4000 | 198 | 0.0495 | 0.9906 | -0.78 | -0.73 | -23.32 | 1 | 0.0015 | 0.0015 |
| 2.00 | 4000 | 196 | 0.0489 | 0.9787 | -2.04 | -1.92 | -21.23 | 0 | 0.0002 | 0.0002 |
| 2.20 | 4000 | 185 | 0.0462 | 0.9238 | -2.24 | -2.03 | -15.19 | 0 | 0.0000 | 0.0000 |
| 2.40 | 4000 | 177 | 0.0443 | 0.8856 | -2.17 | -1.83 | -13.98 | 0 | 0.0000 | 0.0000 |
| 2.60 | 4000 | 184 | 0.0460 | 0.9194 | -0.29 | -0.12 | -10.90 | 0 | 0.0000 | 0.0000 |
| 2.80 | 4000 | 166 | 0.0414 | 0.8276 | -0.94 | -0.71 | -8.84 | 0 | 0.0000 | 0.0000 |
| 3.00 | 4000 | 84 | 0.0209 | 0.4188 | -3.10 | -1.95 | -7.02 | 2 | 0.0211 | 0.0162 |
| 3.20 | 4000 | 63 | 0.0157 | 0.3132 | -2.99 | -1.81 | -5.99 | 2 | 0.0181 | 0.0148 |
| 3.40 | 4000 | 19 | 0.0047 | 0.0933 | -3.46 | -1.67 | -4.86 | 1 | 0.0265 | 0.0268 |
| 3.60 | 8000 | --- | --- | --- | --- | -1.59 | -4.00 | 7 | 0.0803 | 0.0307 |
| 3.80 | 12000 | 28 | 0.0023 | 0.0466 | -2.90 | -1.38 | -3.66 | 2 | 0.0261 | 0.0187 |
| 4.00 | 16000 | 32 | 0.0020 | 0.0395 | -3.02 | -1.47 | -3.57 | 1 | 0.0057 | 0.0057 |
| 4.05 | 12000 | --- | --- | --- | --- | -1.66 | -3.41 | 6 | 0.0992 | 0.0440 |
| 4.10 | 8000 | --- | --- | --- | --- | -1.57 | -3.27 | 5 | 0.1350 | 0.0641 |
| 4.15 | 4000 | --- | --- | --- | --- | -1.60 | -3.18 | 4 | 0.2368 | 0.1259 |
| 4.20 | 4000 | --- | --- | --- | --- | -1.61 | -3.03 | 4 | 0.1469 | 0.0859 |
| 5.00 | 4000 | --- | --- | --- | --- | -1.07 | -1.49 | 11 | 0.4184 | 0.1493 |