Statistics is used everywhere and is something you don't want to miss. Here is some of my understanding.

I will try to make this topic as rich and complete as possible.

# Significance

The significance measures the discrepancy between your data and your background-only (bkg-only) model. You expect the data to look like signal plus background (s+b), so how different the data look from the hypothesis with no signal determines your power to reject the bkg-only model.
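
As a rough illustration of the idea above, the sketch below computes the expected significance of a single counting experiment with the widely used asymptotic formula Z = sqrt(2((s+b) ln(1+s/b) - s)). The function names and the example yields are hypothetical, and the formula assumes a perfectly known background (no systematic uncertainties).

```python
import math


def asimov_significance(s: float, b: float) -> float:
    """Expected significance for s signal events on top of b background events.

    Asymptotic formula for one counting experiment with a known background:
    Z = sqrt(2 * ((s + b) * ln(1 + s / b) - s)).
    """
    if b <= 0:
        raise ValueError("background expectation must be positive")
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


def naive_significance(s: float, b: float) -> float:
    """Small-signal approximation s / sqrt(b), valid when s << b."""
    return s / math.sqrt(b)


# Hypothetical yields, chosen only for illustration.
s_example, b_example = 10.0, 100.0
z_asimov = asimov_significance(s_example, b_example)
z_naive = naive_significance(s_example, b_example)
print(f"Asimov Z = {z_asimov:.3f}, s/sqrt(b) = {z_naive:.3f}")
```

For s much smaller than b the two numbers agree closely; they drift apart as the signal becomes comparable to the background.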

# Errors

## What is Systematic Error?

Auxiliary measurements provide uncertainties on measured quantities that are then used in the analysis. Explicit examples are the luminosity, calibration factors, and scale factors in a given object bin of pt vs eta; implicit ones are effects such as "changing your smearing algorithm" or "changing your underlying parton distribution function". The auxiliary measurement provides each of them as a 1 sigma deviation.

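The 1 sigma variation is usually assigned to a Gaussian constraint on a nuisance parameter $\theta$, for example $G(\theta \mid 0, 1)$, and the prediction will be something like $N(\theta) = N_0\,(1 + \delta\,\theta)$, where $\delta$ is the relative 1 sigma effect (this is the simplest, linear case). This is based on the assumption that a 1 sigma variation of your parameter has a 1 sigma impact on the final result.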
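
Under the assumptions of the paragraph above, the constrained likelihood for a single counting measurement could be written as below. The symbols $n$ (observed yield), $N_0$ (nominal prediction), $\delta$ (relative 1 sigma effect) and $\theta$ (nuisance parameter) are notation chosen here for illustration; this is a sketch of the simplest linear case, not a general prescription.

```latex
\begin{align}
N(\theta) &= N_0\,(1 + \delta\,\theta), \\
L(\theta) &= \mathrm{Pois}\big(n \mid N(\theta)\big)\cdot
             \frac{1}{\sqrt{2\pi}}\,e^{-\theta^{2}/2}, \\
-\ln L(\theta) &= N(\theta) - n \ln N(\theta) + \frac{\theta^{2}}{2} + \text{const}.
\end{align}
```

The $\theta^2/2$ term is the penalty that the Gaussian constraint adds to the negative log likelihood: pulling $\theta$ by one unit costs 0.5.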

See the reference below for an introduction to systematic uncertainties. It also introduces the profile likelihood, which absorbs the nuisance parameters.
[ Pekka K. Definition and Treatment of Systematic Uncertainties in High Energy Physics and Astrophysics ].

# What is a fit?

A fit is a procedure that finds the minimum or maximum of a chosen metric over the model parameters, in order to quantify how well your model describes the data.
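
A minimal sketch of this idea, assuming a binned Poisson negative log likelihood as the metric and a single free parameter, the signal strength mu. The templates, the data, and the golden-section minimizer are all hypothetical choices made for illustration; real analyses typically rely on a dedicated minimizer.

```python
import math

# Hypothetical signal and background templates and observed yields per bin.
signal = [2.0, 5.0, 2.0]
background = [10.0, 10.0, 10.0]
observed = [12, 15, 12]


def nll(mu: float) -> float:
    """Poisson negative log likelihood, dropping the mu-independent ln(n!) terms."""
    total = 0.0
    for s, b, n in zip(signal, background, observed):
        nu = mu * s + b  # expected yield in this bin
        total += nu - n * math.log(nu)
    return total


def golden_section_minimize(f, lo: float, hi: float, tol: float = 1e-6) -> float:
    """Find the minimum of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # The minimum lies in [lo, x2]; reuse x1 as the new upper probe.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:
            # The minimum lies in [x1, hi]; reuse x2 as the new lower probe.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)


mu_hat = golden_section_minimize(nll, 0.0, 5.0)
print(f"best-fit mu = {mu_hat:.4f}")
```

The observed yields here equal signal plus background exactly, so the best-fit mu should come out at 1.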

## Best Fit Interpretation

### Overconstrain/Underconstrain

By convention, systematic errors are provided as a 1 sigma deviation from nominal, so nuisance parameters are usually assigned a Gaussian constraint with sigma=1. If the post-fit error estimated by the fitting algorithm is less than 1, the parameter is overconstrained. If it is larger than 1, it is underconstrained.

The estimated error on a nuisance parameter is what the fitting algorithm considers to be the 1 sigma error. So if you have an overconstraint, for example 0.9, the algorithm concluded that your error should be 0.9 * variation_1sigma. In this case you might have overestimated your error, or in other words it is too conservative. It could also mean that your response model is too simple.
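
The toy below sketches how an overconstraint can arise, assuming a single bin whose background prediction responds linearly to one nuisance parameter theta with a unit Gaussian constraint. All names and numbers are hypothetical. The post-fit error is taken from the numerical second derivative of the NLL at theta = 0, which is the best-fit point here because the observed yield equals the nominal prediction.

```python
import math


def make_nll(n_obs: int, b_nominal: float, rel_unc: float):
    """Build NLL(theta) for one bin with a linear response and a unit Gaussian constraint."""
    def nll(theta: float) -> float:
        nu = b_nominal * (1.0 + rel_unc * theta)  # prediction shifted by theta
        return nu - n_obs * math.log(nu) + 0.5 * theta ** 2
    return nll


def postfit_error(nll, theta_hat: float, h: float = 1e-3) -> float:
    """Error from the curvature: 1 / sqrt(second derivative), by central differences."""
    d2 = (nll(theta_hat + h) - 2.0 * nll(theta_hat) + nll(theta_hat - h)) / h ** 2
    return 1.0 / math.sqrt(d2)


# Low statistics: the data barely know about theta, so the error stays near 1.
err_low_stat = postfit_error(make_nll(4, 4.0, 0.10), 0.0)

# Higher statistics: the data constrain theta beyond the auxiliary measurement.
err_high_stat = postfit_error(make_nll(100, 100.0, 0.10), 0.0)

print(f"post-fit error, 4 events:   {err_low_stat:.3f}")
print(f"post-fit error, 100 events: {err_high_stat:.3f}")
```

In this toy the curvature is n * rel_unc^2 + 1, so the post-fit error drops below 1 as soon as the statistical power of the data becomes comparable to the assumed 1 sigma variation.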

## Technical procedure of fitting

### Algorithms of estimating errors

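The error on a POI or any other fit parameter is given by:

Hesse: the inverse of the second derivative of the NLL at the best-fit point gives the variance, i.e. the square of the error (with several parameters, this is the inverse of the Hessian matrix). This assumes the NLL has a parabolic shape.

Minos: find the points where the profile scan of the POI crosses min_NLL + 0.5. The resulting errors can be asymmetric.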

Minuit2:
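
The Hesse and Minos definitions above can be compared on a toy model. The sketch below assumes a single-bin counting experiment with one POI mu and no nuisance parameters, so the profile scan reduces to a plain scan of the NLL. The yields and helper names are hypothetical, and the numerical derivative and bisection are stand-ins for what a real minimizer does internally.

```python
import math

# Hypothetical single-bin counting experiment.
N_OBS, S_EXP, B_EXP = 15, 5.0, 10.0


def nll(mu: float) -> float:
    """Poisson NLL in mu, dropping the constant ln(n!) term."""
    nu = mu * S_EXP + B_EXP
    return nu - N_OBS * math.log(nu)


def bisect_root(g, a: float, b: float, tol: float = 1e-10) -> float:
    """Find a root of g in [a, b], assuming g(a) and g(b) have opposite signs."""
    ga = g(a)
    for _ in range(200):
        m = 0.5 * (a + b)
        gm = g(m)
        if (gm > 0) == (ga > 0):
            a, ga = m, gm
        else:
            b = m
        if abs(b - a) < tol:
            break
    return 0.5 * (a + b)


# Best fit: the expected yield matches the observed one.
mu_hat = (N_OBS - B_EXP) / S_EXP
nll_min = nll(mu_hat)

# Hesse-like error: 1 / sqrt(second derivative) at the best-fit point.
h = 1e-3
d2 = (nll(mu_hat + h) - 2.0 * nll_min + nll(mu_hat - h)) / h ** 2
hesse = 1.0 / math.sqrt(d2)


# Minos-like errors: where the NLL crosses nll_min + 0.5 on each side.
def crossing(mu: float) -> float:
    return nll(mu) - nll_min - 0.5


mu_floor = -B_EXP / S_EXP + 1e-9  # keep the expected yield positive
minos_lo = mu_hat - bisect_root(crossing, mu_floor, mu_hat)
minos_hi = bisect_root(crossing, mu_hat, mu_hat + 10.0 * hesse) - mu_hat

print(f"mu_hat = {mu_hat:.3f}")
print(f"Hesse error = {hesse:.3f}")
print(f"Minos errors = -{minos_lo:.3f} / +{minos_hi:.3f}")
```

Because the Poisson NLL is not exactly parabolic, the two Minos-like errors differ from each other and bracket the symmetric Hesse-like error.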

### Examples of metrics

• Likelihood $L$: the product of the probabilities of each individual measurement (usually the yield in a bin), giving the combined probability of observing the data.
• NLL ($-\ln L$): negative log likelihood. It has some nice properties in fitting, for example the product over measurements becomes a sum.
• $\chi^2$: it is simply $-2\ln L$ (up to a constant) when $L$ is Gaussian-like, and is used in simple fitting.
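
The three metrics above can be put side by side on a small binned example. The sketch below assumes Poisson-distributed bin yields; the observed and expected numbers are hypothetical. It compares Pearson's chi-square with twice the NLL difference relative to a model that matches the data exactly, which is the quantity that approaches the chi-square when the counts are large enough to look Gaussian.

```python
import math


def log_poisson(n: int, nu: float) -> float:
    """ln of the Poisson probability of observing n when nu is expected."""
    return -nu + n * math.log(nu) - math.lgamma(n + 1)


def likelihood(observed, expected) -> float:
    """Product of the per-bin Poisson probabilities."""
    return math.prod(math.exp(log_poisson(n, nu)) for n, nu in zip(observed, expected))


def nll(observed, expected) -> float:
    """Negative log likelihood: the product over bins becomes a sum."""
    return -sum(log_poisson(n, nu) for n, nu in zip(observed, expected))


def pearson_chi2(observed, expected) -> float:
    """Pearson chi-square: sum over bins of (n - nu)^2 / nu."""
    return sum((n - nu) ** 2 / nu for n, nu in zip(observed, expected))


# Hypothetical yields with fairly large counts per bin.
observed = [105, 92, 110]
expected = [100.0, 100.0, 100.0]

lik = likelihood(observed, expected)
chi2 = pearson_chi2(observed, expected)
# Twice the NLL difference with respect to a model that reproduces the data.
two_delta_nll = 2.0 * (nll(observed, expected) - nll(observed, [float(n) for n in observed]))

print(f"L = {lik:.3e}, NLL = {nll(observed, expected):.3f}")
print(f"chi2 = {chi2:.3f}, 2*delta(NLL) = {two_delta_nll:.3f}")
```

With about 100 events per bin the two numbers agree to within a few percent; with small counts they would differ more, which is one reason the NLL is preferred for low-statistics bins.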

### Some tutorials

-- RongkunWang - 2017-09-04
Topic revision: r5 - 2019-02-19 - RongkunWang