-- AmnonHarel - 12-Sep-2010
How should we interpret the "disappearance" of the CLs limits at high lambda?
CLs basics
The "modified frequentist" CLs approach is to exclude regions of phase space where $CL_s < 1 - \alpha$, where $\alpha$ is the desired confidence level. Here, $CL_s = CL_{sb} / CL_b$.
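As a rough illustration of the definition above, here is a minimal sketch that evaluates CLs at one LLR value from two ensembles of pseudo datasets. The function names, the toy Gaussian ensembles and the 95% confidence level are assumptions made for the example, not the analysis code. It assumes the sign convention used on this page, where each tail fraction counts the PDSs below the LLR value.

```python
import random

def tail_fraction(ensemble, llr):
    """Fraction of pseudo datasets (PDSs) whose LLR lies below `llr`."""
    n_below = sum(1 for value in ensemble if value < llr)
    return n_below / len(ensemble)

def cls_value(llr, sb_ensemble, b_ensemble):
    """CLs = CLsb / CLb at a given LLR; None when no background PDS lies below it."""
    cl_sb = tail_fraction(sb_ensemble, llr)
    cl_b = tail_fraction(b_ensemble, llr)
    if cl_b == 0.0:
        return None  # CLs is undefined without any background PDS in the tail
    return cl_sb / cl_b

def is_excluded(llr, sb_ensemble, b_ensemble, confidence=0.95):
    """True when CLs < 1 - confidence (confidence level is an assumed default)."""
    cls = cls_value(llr, sb_ensemble, b_ensemble)
    return cls is not None and cls < 1.0 - confidence

# Toy ensembles (hypothetical): background-only sits at lower LLR than S+B.
rng = random.Random(1)
b_ensemble = [rng.gauss(-2.0, 1.0) for _ in range(4000)]
sb_ensemble = [rng.gauss(2.0, 1.0) for _ in range(4000)]
print(cls_value(-1.0, sb_ensemble, b_ensemble))
```

The `None` return for an empty background tail is a choice made here so that such points are never counted as excluded.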
Fine print
- When the number of pseudo datasets (PDSs) that underlie either $CL_{sb}$ or $CL_b$ at the limit is below 10, we consider the determination of the limit unreliable and do not quote it.
- For the final results, we generate enough PDSs at the crucial points in phase space to prevent this requirement from affecting the results, except where it clearly removes statistical noise (see TeV example below)
- Since the test statistic is the log likelihood ratio (LLR), with or without systematics, an excluded region of the LLR will normally be simply connected and include $-\infty$ (the sign convention is such that this is the value most unlike the new physics scenario). As we'll see, when the LLR does not include systematics, this may fail to hold. Currently, we quote only the uppermost excluded LLR in the lowest exclusion region.
- The plots on this page use systematics which are slightly lower than the final ones. There are no qualitative differences; in particular, the sensitivity runs out at 4 TeV either way, and quantitative differences are likely to be tiny.
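The two bookkeeping rules in the fine print (at least 10 PDSs behind each tail fraction, and quoting the uppermost excluded LLR of the lowest exclusion region) can be sketched as follows. This is only an illustration under assumptions: the function names, the scan grid and the CLs values are hypothetical, the threshold assumes a 95% confidence level, and the grid is assumed to be sorted in increasing LLR.

```python
MIN_PDS = 10      # minimum number of PDSs behind CLsb and behind CLb
THRESHOLD = 0.05  # 1 - confidence level, assuming 95% CL

def limit_is_reliable(n_sb_below, n_b_below, min_pds=MIN_PDS):
    """Quote a limit only if both tail fractions rest on enough PDSs."""
    return n_sb_below >= min_pds and n_b_below >= min_pds

def lowest_region_upper_edge(llr_grid, cls_values, threshold=THRESHOLD):
    """Uppermost excluded LLR of the lowest exclusion region, or None.

    `llr_grid` must be ascending; `cls_values` may hold None where CLs
    is undefined (such points are treated as not excluded).
    """
    edge = None
    for llr, cls in zip(llr_grid, cls_values):
        excluded = cls is not None and cls < threshold
        if excluded:
            edge = llr          # still inside the current exclusion region
        elif edge is not None:
            break               # left the lowest region; ignore higher ones
    return edge

# Hypothetical scan with a second, disconnected excluded point at LLR = -1.
llr_grid = [-5.0, -4.0, -3.0, -2.0, -1.0, 0.0]
cls_values = [None, 0.01, 0.03, 0.20, 0.04, 0.90]
edge = lowest_region_upper_edge(llr_grid, cls_values)
print(edge, limit_is_reliable(n_sb_below=28, n_b_below=560))
```

In this toy scan the isolated excluded point at LLR = -1 is deliberately ignored, mirroring the choice to quote only the lowest exclusion region.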
Updated results
Generated an obscene amount of additional PDSs...
Tools
- Added statistical uncertainties on the CLs values.
- Since the number of PDSs ($N$) is predetermined, the uncertainties on CLb and CLsb are (independent) binomial efficiency problems. I use the usual approximation of putting in the observed efficiency ($\hat{\varepsilon} = n/N$, where $n$ is the number of PDSs below the LLR value). This gives an uncertainty on the relevant fraction ($\hat{\varepsilon}$, either CLb or CLsb) of $\sqrt{\hat{\varepsilon}(1-\hat{\varepsilon})/N}$.
- Then the two numbers are combined with standard error propagation, which implies the Gaussian approximation.
- Added more diagnostic columns to the tables
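The uncertainty recipe described in the bullets above (binomial error on each fraction, then standard error propagation for the ratio) can be written out as a short sketch. The function names and the example counts are illustrative assumptions and are not taken from the tables; both ensembles are assumed to have the same size.

```python
import math

def binomial_error(n_below, n_total):
    """Binomial uncertainty on the fraction n_below / n_total,
    using the observed efficiency in place of the true one."""
    eff = n_below / n_total
    return math.sqrt(eff * (1.0 - eff) / n_total)

def cls_with_error(n_sb_below, n_b_below, n_total):
    """CLs = CLsb / CLb with its MC statistical error.

    The two fractions are treated as independent, and the relative
    errors are added in quadrature (Gaussian approximation).
    """
    if n_b_below == 0:
        raise ValueError("CLb is zero, CLs is undefined")
    cl_sb = n_sb_below / n_total
    cl_b = n_b_below / n_total
    cls = cl_sb / cl_b
    if n_sb_below == 0:
        return 0.0, 0.0  # the observed-efficiency approximation gives zero error
    rel_sb = binomial_error(n_sb_below, n_total) / cl_sb
    rel_b = binomial_error(n_b_below, n_total) / cl_b
    return cls, cls * math.hypot(rel_sb, rel_b)

# Illustrative counts: 10 S+B and 400 background PDSs below the LLR, out of 4000 each.
cls, err = cls_with_error(n_sb_below=10, n_b_below=400, n_total=4000)
print(f"CLs = {cls:.4f} +- {err:.4f}")
```

With counts this small in the S+B tail, the error is dominated by the S+B term, which is why a handful of PDSs in the tail gives a relative error on CLs close to 100%.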
Plots
The table
| Lambda (TeV) | N_{PDS} | CLs point: N_{S+B} | CLs point: CLb | CLs point: CLsb | CLs point: LLR value | Frequentist limiting: LLR value | Data: LLR value | Data: N_{S+B} | Data: CLs | MC stat. err. on CLs |
| 1.80 | 4000 | 199 | 0.0498 | 0.9963 | -1.01 | -0.99 | -27.75 | 0 | 0.0000 | 0.0000 |
| 1.85 | 4000 | 199 | 0.0497 | 0.9939 | -0.82 | -0.78 | -25.50 | 0 | 0.0000 | 0.0000 |
| 1.90 | 4000 | 197 | 0.0494 | 0.9875 | -1.49 | -1.41 | -24.14 | 0 | 0.0005 | 0.0005 |
| 1.95 | 4000 | 198 | 0.0495 | 0.9906 | -0.78 | -0.73 | -23.32 | 1 | 0.0015 | 0.0015 |
| 2.00 | 4000 | 196 | 0.0489 | 0.9787 | -2.04 | -1.92 | -21.23 | 0 | 0.0002 | 0.0002 |
| 2.20 | 4000 | 185 | 0.0462 | 0.9238 | -2.24 | -2.03 | -15.19 | 0 | 0.0000 | 0.0000 |
| 2.40 | 4000 | 177 | 0.0443 | 0.8856 | -2.17 | -1.83 | -13.98 | 0 | 0.0000 | 0.0000 |
| 2.60 | 4000 | 184 | 0.0460 | 0.9194 | -0.29 | -0.12 | -10.90 | 0 | 0.0000 | 0.0000 |
| 2.80 | 4000 | 166 | 0.0414 | 0.8276 | -0.94 | -0.71 | -8.84 | 0 | 0.0000 | 0.0000 |
| 3.00 | 4000 | 84 | 0.0209 | 0.4188 | -3.10 | -1.95 | -7.02 | 2 | 0.0211 | 0.0162 |
| 3.20 | 4000 | 63 | 0.0157 | 0.3132 | -2.99 | -1.81 | -5.99 | 2 | 0.0181 | 0.0148 |
| 3.40 | 4000 | 19 | 0.0047 | 0.0933 | -3.46 | -1.67 | -4.86 | 1 | 0.0265 | 0.0268 |
| 3.60 | 21000 | 121 | 0.0058 | 0.1153 | -2.74 | -1.46 | -4.00 | 10 | 0.0389 | 0.0123 |
| 3.80 | 12000 | 28 | 0.0023 | 0.0466 | -2.90 | -1.38 | -3.66 | 2 | 0.0261 | 0.0187 |
| 4.00 | 16000 | 32 | 0.0020 | 0.0395 | -3.02 | -1.47 | -3.57 | 1 | 0.0057 | 0.0057 |
| 4.05 | 181000 | 346 | 0.0019 | 0.0382 | -2.89 | -1.47 | -3.41 | 44 | 0.0403 | 0.0063 |
| 4.10 | 146000 | 178 | 0.0012 | 0.0244 | -2.91 | -1.46 | -3.27 | 35 | 0.0434 | 0.0076 |
| 4.15 | 35000 | --- | --- | --- | --- | -1.46 | -3.18 | 15 | 0.0793 | 0.0216 |
| 4.20 | 4000 | --- | --- | --- | --- | -1.61 | -3.03 | 4 | 0.1469 | 0.0859 |
| 5.00 | 4000 | --- | --- | --- | --- | -1.07 | -1.49 | 11 | 0.4184 | 0.1493 |
Conclusions
Why did we see fake rises?
- the main problem is that we evaluated the significance using the number of PDSs in the low-end rise region. The right way to think of it is by analogy to a bump hunt: is there a significant excess over background? Just using the number of PDSs is looking at S+B. In particular, a high B is a reason to distrust an excess, not to trust it
- it also hurt that the relevant numbers weren't readily available. They are in the tables now.
- also a mild "look elsewhere" effect
Which CLs limits should we use?
It is not safe to use CLs limits which are based on small tail probabilities. How small?
- From these results, anything below ~0.001 is suspect
- Our description of systematic uncertainties wasn't meant to cover such extreme cases. It's probably good enough up to at least 3 sigma (tail probability ~0.0015). It's certainly not good enough at 4 sigma (truncated there; tail probability ~0.00003). Also, the choice of the prior for the JES wasn't checked below ~0.001 (see AmnonHarelStatusReportForStatisticsBoard#New_prior_shapes_for_nuisance_pa)
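For reference, the tail probabilities quoted above correspond to one-sided Gaussian tails. A minimal sketch of the conversion follows; the function name is an assumption made for this example.

```python
import math

def one_sided_tail(n_sigma):
    """One-sided Gaussian tail probability beyond n_sigma standard deviations."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

for n_sigma in (3.0, 4.0):
    print(f"{n_sigma:.0f} sigma -> {one_sided_tail(n_sigma):.2e}")
# 3 sigma gives about 1.3e-3 and 4 sigma about 3.2e-5
```

These are close to the rounded values ~0.0015 and ~0.00003 used in the text.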
Going over the crucial lambda values one by one:
- Though the separation at lambda = 3.6 TeV is low enough that it is almost power-constrained away, it is certain that a CLs limit exists and the data lies below it, so this value is excluded.
- 4.0 TeV is excluded.
- A CLs limit exists at 4.05 TeV. It is probably above the data, so that 4.05 TeV is excluded. I'd guesstimate the significance of the previous statement, in terms of MC statistics, at around 2 sigma.
- A CLs limit exists at 4.1 TeV. There is a hint that it is above the data, so that 4.1 TeV is excluded. I'd guesstimate the significance of the previous statement, in terms of MC statistics, at around 0.5 sigma.
- There is a hint that no CLs limit exists at 4.15 TeV. If a crucial CLs value exists, the data value can easily (~50%?) lie above it.
So the best choice is to stick with the limits presented in the CWR, and add one more item to the discussion of the difficulties with CLs:
"CLs sometimes requires estimating extreme tails, which stresses the validity of the systematic variations and is difficult using brute force ensemble testing."
Need to decide on the right presentation though...
Improved plots with ensembles as of a bit before CWR (early 21st of September)
The simple cases
To understand how the CLs limits go away, we'll look at plots of CLs as a function of LLR (for a given new physics model, here contact interactions).
Easy exclusion
No exclusion
[Plots not preserved. Panels: LLR distributions and CLs plot, once with systematics (as usual) and once with only statistical variations.]
The leftmost "0" is real, but due to a single point from the SM ensemble. The 0,0 points (quite a few, actually) are an artifact that does not affect the code (these plots were meant to be internal to the code...).
CLs with borderline sensitivity
More than one behavior near the edge of the experimental sensitivity is possible. Here are two simple scenarios. In both scenarios, under both hypotheses (i.e. in both ensembles), the LLR is distributed as a Gaussian and the two distributions are displaced by an amount that decreases as we run out of experimental sensitivity.
- 1 - the Gaussians have the same width
- in this scenario, the lower the LLR (below the SM peak), the more the SM is preferred.
- thus, as $\Lambda$ increases and we run out of experimental sensitivity, the CLs limit exists, and will rapidly drop to $-\infty$. However, our ability to determine its correct value will deteriorate quickly: brute force ensemble testing is not a suitable tool for learning about the tails of distributions.
- this is the behavior that was observed in the ICHEP results, where increasing the ensemble size by an order of magnitude extended the limit
- 2 - the new physics distribution is wider
- in this scenario, for very low LLR values (well below the SM peak), the SM is no longer preferred.
- thus, as $\Lambda$ increases and we run out of experimental sensitivity, the CLs limit no longer exists as the plot is always above 0.05
- this is the behavior observed in the current results: increasing the ensemble size does not extend the limit
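The two scenarios above can be checked with analytic Gaussian tails rather than ensembles. The sketch below is a toy under stated assumptions: the background (SM) LLR is taken as a unit Gaussian at 0, the new physics distribution is displaced upward by 0.5, and the widths 1.0 and 1.3 are arbitrary illustrative choices. All names are hypothetical.

```python
import math

def gaussian_cdf(x, mean, width):
    """Probability that a Gaussian variable falls below x."""
    return 0.5 * math.erfc(-(x - mean) / (width * math.sqrt(2.0)))

def cls_curve(llr, sb_mean, sb_width, b_mean=0.0, b_width=1.0):
    """CLs = CLsb / CLb, with both tail fractions taken below `llr`."""
    return gaussian_cdf(llr, sb_mean, sb_width) / gaussian_cdf(llr, b_mean, b_width)

def min_cls(sb_mean, sb_width, lo=-8.0, hi=4.0, steps=1200):
    """Smallest CLs found on an LLR grid between `lo` and `hi`."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(cls_curve(x, sb_mean, sb_width) for x in grid)

# Scenario 1: same widths. CLs keeps falling toward low LLR, so a limit
# exists, but only far out in the tails of both distributions.
same_width = min_cls(sb_mean=0.5, sb_width=1.0)

# Scenario 2: wider new physics distribution. CLs turns back up at low LLR
# and never gets below 0.05, so there is no limit at all.
wider_signal = min_cls(sb_mean=0.5, sb_width=1.3)

print(f"same width:   min CLs = {same_width:.3f}")
print(f"wider signal: min CLs = {wider_signal:.3f}")
```

In the equal-width toy, CLs only crosses 0.05 more than five background widths below the SM peak, which is far beyond the reach of ensembles of a few thousand PDSs. In the wider-signal toy no ensemble size helps, since the curve never reaches 0.05.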
The current borderline sensitivity region
[Plots not preserved. Three sets of panels, each with LLR distributions and a CLs plot (from 0 to 1, and zoomed in); the second set also has an LLR breakdown by highest non-zero inner bin, and one panel of the third set is empty (---).]
The table
| Lambda (TeV) | N_{PDS} | CLs point: N_{S+B} | CLs point: CLb | CLs point: CLsb | CLs point: LLR value | Frequentist limiting: LLR value | Data: LLR value | Data: N_{S+B} | Data: CLs | MC stat. err. on CLs |
| 1.80 | 4000 | 199 | 0.0498 | 0.9963 | -1.01 | -0.99 | -27.75 | 0 | 0.0000 | 0.0000 |
| 1.85 | 4000 | 199 | 0.0497 | 0.9939 | -0.82 | -0.78 | -25.50 | 0 | 0.0000 | 0.0000 |
| 1.90 | 4000 | 197 | 0.0494 | 0.9875 | -1.49 | -1.41 | -24.14 | 0 | 0.0005 | 0.0005 |
| 1.95 | 4000 | 198 | 0.0495 | 0.9906 | -0.78 | -0.73 | -23.32 | 1 | 0.0015 | 0.0015 |
| 2.00 | 4000 | 196 | 0.0489 | 0.9787 | -2.04 | -1.92 | -21.23 | 0 | 0.0002 | 0.0002 |
| 2.20 | 4000 | 185 | 0.0462 | 0.9238 | -2.24 | -2.03 | -15.19 | 0 | 0.0000 | 0.0000 |
| 2.40 | 4000 | 177 | 0.0443 | 0.8856 | -2.17 | -1.83 | -13.98 | 0 | 0.0000 | 0.0000 |
| 2.60 | 4000 | 184 | 0.0460 | 0.9194 | -0.29 | -0.12 | -10.90 | 0 | 0.0000 | 0.0000 |
| 2.80 | 4000 | 166 | 0.0414 | 0.8276 | -0.94 | -0.71 | -8.84 | 0 | 0.0000 | 0.0000 |
| 3.00 | 4000 | 84 | 0.0209 | 0.4188 | -3.10 | -1.95 | -7.02 | 2 | 0.0211 | 0.0162 |
| 3.20 | 4000 | 63 | 0.0157 | 0.3132 | -2.99 | -1.81 | -5.99 | 2 | 0.0181 | 0.0148 |
| 3.40 | 4000 | 19 | 0.0047 | 0.0933 | -3.46 | -1.67 | -4.86 | 1 | 0.0265 | 0.0268 |
| 3.60 | 8000 | --- | --- | --- | --- | -1.59 | -4.00 | 7 | 0.0803 | 0.0307 |
| 3.80 | 12000 | 28 | 0.0023 | 0.0466 | -2.90 | -1.38 | -3.66 | 2 | 0.0261 | 0.0187 |
| 4.00 | 16000 | 32 | 0.0020 | 0.0395 | -3.02 | -1.47 | -3.57 | 1 | 0.0057 | 0.0057 |
| 4.05 | 12000 | --- | --- | --- | --- | -1.66 | -3.41 | 6 | 0.0992 | 0.0440 |
| 4.10 | 8000 | --- | --- | --- | --- | -1.57 | -3.27 | 5 | 0.1350 | 0.0641 |
| 4.15 | 4000 | --- | --- | --- | --- | -1.60 | -3.18 | 4 | 0.2368 | 0.1259 |
| 4.20 | 4000 | --- | --- | --- | --- | -1.61 | -3.03 | 4 | 0.1469 | 0.0859 |
| 5.00 | 4000 | --- | --- | --- | --- | -1.07 | -1.49 | 11 | 0.4184 | 0.1493 |