BsPlot

Usage of sPlot for the analysis and the control channel

Introduction

In this new page derived from the home page of Ioannis Xiotidis the strategy to have the first pass of the Bs effective lifetime is going to be described and explained. A set of plots is going to be presented with explanations and comments, in the attempt to describe as detailed as possible the procedure, that is described in the previous page.

Ntuple information

For this study the 2015/2016 Bmumu analysis ntuples were used because of the good understanding of the properties. Another benefit is that the data are unblinded, allowing to proceed with the usage of the sPlot technique without biasing ourselves.

Bs mumu ntuples (after skimming)

Type Period Stream DSID(MC)/Year(Data) Events Candidates
MC N/A N/A 300426 534475 534572
Data D Main 2015 41084 41110
Data E Main 2015 394939 395216
Data F Main 2015 263809 263986
Data G Main 2015 623371 623769
Data H Main 2015 207176 207306
Data J Main 2015 1075294 1076043
Data A Main 2016 328447 328692
Data B Main 2016 861654 862232
Data C Main 2016 1359363 1360316
Data D Main 2016 2137295 2138732
Data E Main 2016 764986 765508
Data F Main 2016 1441737 1442713
Data G Main 2016 1548242 1549332
Data I Main 2016 2903764 2905863
Data K Main 2016 1292971 1293961
Data L Main 2016 3208134 3210773
Data D Delayed 2016 342602 343080
Data E Delayed 2016 182085 182440
Data F Delayed 2016 336941 337623
Data G Delayed 2016 368415 369147
Data I Delayed 2016 547113 548338
Data K Delayed 2016 229967 230497
Data L Delayed 2016 553802 555296

A preselection is applied in those ntuples according to the one that is described in the internal note of the Bmumu analysis on the partial Run 2 dataset. The preselection code can be found in the git repository that has been set up for this analysis

B+ ntuples (after skimming)

TBA

The preselection for the B+ has been identified and set up, however cross checks against the cutflow in the internal note lead to some discrepancies that are currently investigated so this is has become an on-going task

Mass fits on Bs mass distribution

The next step after identifying the ntuples was to setup a class that is taking advantage of the RooFit framework and the knowledge of the partial run 2 data sets and fits the fitting region of the analysis [4766, 5966] MeV with the following model:

PDF_full = NCheb * Chebychev + NExpo * Exponential + NSignal * Gaussian

the models where chosen in order to describe specific components of the fitting region:

  1. Chebychev: Combinatorial background
  2. Exponential: Same sign same vertex (SSSV) background
  3. Gaussian: Signal

For those fits the BDT from the 2015/2016 analysis was used as it was and a cut was applied as shown also in the previous page. In order to cross check the fitter initially the highest BDT bin from the 2015/2016 analysis was used (0.416-1.00) and only the background was fitted, afterwards the cut has been loosen according the the study for looking at the proper time to 0.3-1.0

Mass fits

In this section we are showing the mass fits obtained by using the developed class. The cross checks were performed either against the internal note or in agreement with a fitter code that Alex wrote. The following regions were used for the mass fits:

B meson Range name Range [MeV]
Bs LowerSB 4766 - 5166
Bs UpperSB 5526 - 5966
Bs Signal Region 5166 - 5526

Bs mass fits

The following two fits were performed on 2015/2016 data and validated to provide the expected result.

Dataset 1516 BDTCut0416 BMUCALCMass SBFit.png
BDT bin 3 from the 2015/2016 analysis, SB only fit, pulls and residuals are wrong since they are only from the upper SB. FitLogBDTBin3MUCMassSBOnly
Dataset 1516 BDTCut03 BMUCALCMass FullRange.png
Full range fit with the full PDF described above and BDT cut 0.3. FitLogBDTCut03MUCMassFullRange

For the first fit we compared with the fit in the internal note for the second fit we had to compare with a macro that was written by Alex and compared with the class result in order to make sure that we are doing consistent things. An issue that was found with this method was the the too generous choice of limits on the parameters, increases significantly the step for the Minuit leading the fitter to miss the signal and flatten completely the gaussian in the signal region

Mass fit PDF model systematic studies

In this section a first study on the systematic effect of changing the bakground model and the range of fit in the invariant mass fit is being performed. Changing the background model and the range could lead to a better background subtraction from the sPlot technique. For this reason the following configuration has been introduced:

BDT cut Signal model Background model Fit range ct bins Fitted signal events Fit status Fit log Fit plots
0.365 Single gaussian chebychev + exponential [4766,5966] MeV 8 47.99 0 Full range full model fit log FullRangeFullModel
0.365 Single gaussian chebychev only [5200,5966] MeV 8 59.66 4* Cut range cheb only fit log CutRangeChebOnly
0.365 Single gaussian chebychev 0 order only [5200,5966] MeV 8 46.35 0 Cut range cheb 0 only fit log CutRangeCheb0Only
0.365 Single gaussian exponential only [5200,5966] MeV 8 45.64 0 Cut range exp only fit log CutRangeExpOnly
0.365 Single gaussian chebychev0 + exponential [4766,5966] MeV 8 42.9 0 Full range full mode with cheb 0 FullRangeFullModelCheb0

* For those fits the fitter did not manage to find an improvement and therefore returned an error code. The attempt to recover those fits is by applying a second follow up fit with the result obtained from the first one, however the fitters still fail, with a sensible result though.

From the subtraction obtained with sPlot the signal distributions where extracted and summarized in the following plot for the different background configurations.

SignalDistribution sPlot VariousBkgModelsAndRanges Fixed.png
Signal distribution take with sPlot for different background configurations

Another performed in order to see the strength of sPlot for background subtraction was to plot the distribution of the weights with the mass and with proper time. In this way we will be able to verify that the sPlot procedure is a more elaborate procedure rather than the basline by hand subtraction that someone can do.

BDT bin Distribution of weights
0.3-1.0 sWeightsVsMassAndProperTimeBaselineBDT
0.365-1.0 sWeightsVsMassAndProperTimeOptimalBDT

The final step of this initial study is to fit the lifetime distribution with out chi2 fitter to estimate the effect of the model selection in the invariant mass fit on the lifetime

Model Range Fitted Value Plot
Full model [4766,5966] 1.08 +0.5 -0.28
sPlotOnData BDT0.365 DifferentBkgModels LifetimeFitWithChi2.png
Chi2 scans for lifetime fit with different backgrounds
Cheb0 full model [4766,5966] 1.17 +0.7 -0.36
1st order cheb [5200,5966] 0.99 +0.33 -0.21
0th order cheb [5200,5966] 1.15 +0.66 -0.34
Exponential [5200,5966] 1.17 +0.69 -0.35

Mass fit residuals and pulls (Signal yield)

While checking the fitter for the lifetime, we observed that when include also the sPlot in the procedure there is some extra stuff that seem not to behave correctly. By a closer look it was seen that the mass fitter many times was mis-behaving or even failing, therefore the pulls and residuals of the signal yield has been computed:

FullDatasetToy 1000 BDTCut03 SignalYieldResidualsPulls NoConstMeanSigma.png
Signal yield residuals and pulls with no constraint on the fit parameters
FullDatasetToy 1000 BDTCut03 SignalYieldResidualsPulls ConstMeanSigma.png
Signal yield residuals and pulls with constrained the mean and sigma of gaussian, fit starting point are the values obtained from a signal only fit on the singal MC
SignaMC FitSingleGaussianModel.png
Signal only fit on signal MC

there are two main comments by looking at the plots from above. The main comment is that if we don't use any constraint we see that plenty of fits are failing leading to a loss of ~50% of our sample. The second thing to look is the the residuals plot in both scenarios. The residuals don't look good at all since they are flat. Discussions with Alex need to happen to see what we are missing.

Mass fits on B+ mass distribution

TBA

sPlot for Bs

Next step after fitting the mass range was to associate the mass fit with the sPlot technique. In sPlot what is required is a fit with known models to an independent variable (mass) in order to extract the distribution of the required control variable (effective lifetime). The two variables need to be uncorrelated in order to have a correct extraction of the control variable distributions as well as the error estimation.

Fit configuration for sPlot:

Variable Range Binning (for visual inspection) Units
Inv. Mass [4766,5966] 30 MeV
Prop. Time [0,12] 8 psec
BDT [0.3,1.0] NA NA

Mass Fit starting configuration:

Parameter Name Starting Value End Value
aCheb -0.05 -0.133577 +/- 0.315038
ExpSlope 500 123.771 +/- 26.7356
meanGaus 5366 5344.15 +/- 28.2744
sigmaGaus 160 70.2014 +/- 30.9718
chebEvents 200 332.438 +/- 79.8380
expoEvents 100 257.655 +/- 60.9702
gausEvents 60 49.8445 +/- 23.7094

For this fit we define the ranges above as signal region and SB.

Dataset 1516 BDTCut03 sPlot 8ctBins BDT03 FullMassRange.png
sPlot for BDT > 0.3 on data FitLogForMassFit
Dataset 1516 sPlot SignalDistribution.png
sPlot for signal events
Dataset 1516 sPlot SSSVdistribution.png
sPlot for SSSV background events
Dataset 1516 sPlot CombinatorialDistribution.png
sPlot for Combinatorial background events

sPlot for Bd

TBA

MC distribution fit to sPlot

In this section the fitting method for extracting the effective lifetime value is going to be discussed. In addition the cross checks to identify potential problems are going to be shown, with potential fixes.

Main concept

In our lifetime measurement as described in the previous twiki page, what we are going to fit our data with are MC template histograms parameterise with the following event per event weight. This method was preferred as a starting point, because it avoids parameterising the detector and selection effect on our data sample with a response functional form. The event per event weight used is: w(tau) = exp(-t/tauMC)-exp(-t/tau), where tauMC: is the value that the MC was generated and tau is fit parameter.

TauMC identification

An important part of our method is to identify what the tauMC value is our MC samples. To identify the value we performed an exponential extended fit on the full MC truth proper time sample far away from the pick of the distribution. The result of that fit that can be show below yield to a tauMC value of 1.511 ps, which is the value used to calculate the event per event weight for our MC template

SignalMC 1516 NoBDTCut TruthPropTimeFIt.png
Truth proper time fit on MC sample without BDT cut

the main reason why we are fitting this region is because we know that lower proper time values are affected by selection criteria and detector effects, therefore by taking advantage of the large statistics we move far away from that region and still be able to obtain a stable fit that effectively looks unbiased.

Weighted distribution

An example of the MC template histograms parameterised with tau can be seen in the following plot. In the plot 24bins in the measured ct have been selected and a BDT cut of 0.3 on the MC sample.

SignalMC 1516 BDTCut03 MeasuredProperTime.png
MC templates for different tau values

Fit Estimator

In this section the fit estimator used for the lifetime extraction is described. Initially in our studies we decided to fit our MC templates with the sPlot extracted data by using a Chi2 goodness-of-fit estimator. Essentially, what we are doing is for every MC template we calculate the sum of the Chi2 of each bin and then identify the template with the lowest values, which provides the value of the effective lifetime. To estimate the error with this procedure we rely on the statistics theory where is shows that the values of the chi2 distribution at Dchi2 = 1 are the lower and upper statistical errors. For the calculation of the chi2 a cut on the lower statistics bins has been applied since the impact on the calculation is huge and mostly unphysical. Therefore only bins with more than 10 events on data were used in the calculation.

Chi2Scan MCFitToDataset1516 BDTCut03 05psbinning.png
Chi2 scan for several lifetime values in a given range around the minimum

for reproducing the plot from above the following configuration was used:

Variable Value Range
BDT 0.3 [0.3,1.0]
sPlotEntries 10 [10,inf]
Lifetime (Step) 0.03 [0.5,2.5]
sPlotBins 24 NA
MassFitBins 30 [4766,5966]

Note: For the next studies the interfacing of the likelihood function with the standard tool of minimization (Minuit) is going to used. In this way the minimum identification is speed up and the error estimation performed in a more standard way.

Conclusions on first pass

With the first analysis pass we were able to extract the lifetime value with the corresponding uncertainty. However, detailed studies need to be performed on the Residuals and the Pulls for the fit estimator in order to identify whether our method is somehow biased or not. The first result from our dataset is tau = 1.00 [-0.2,+0.3] ps, however as noted again many changes need to be applied since we have to study in detail pulls and residuls.

ToyMC studies

In this section we are going to describe the toy MC studies on the fit method in order to identify any potential source of bias. Initially the identification of how we are planning to generate statistically independent toy samples had to be determined. Afterwards two methods to study the method have been identified. The first one is to study initially only the signal MC where only the fit estimator is being put in place and try to understand if there is any bias directly correlated to the method. The second one is a follow up after we agree that we are understanding in depth the fit estimator to include in the validation the sPlot method. Finally it has to be mentioned that both of the methods are going to be applied also for the control channel.

Bootstrap method

For generating toy MC the bootstrap method has been agreed to be used. The main concept is to use the full signal MC that includes ~70k events and with the use of a poisson distribution and a uniform extract independent datasets according to the expected number of events from the signal. Since our dataset is unblinded, by fitting the mass distribution we identified that with a BDT cut 0.3 O(50) events are expected in the signal region. For this reason the seed of the poisson distribution is 50, and then the events are selected by multiplying the value of a uniform distribution between 0 and 1 with the total number of events in the data set (70k). In this way we are able to extract as many independent datasets as we want, since the number of events required is very low in compare to the total number that are include in the full MC sample. To check that the toy generation has no bias from the generation procedure, the following graph has been produced. Where no dependence can be seen between the toy number and the mean value of the dataset.

ToyGeneration 100 MeanValueOfHistogramOfDataSet.png
proper time mean value from toy generation, signal only

Fitting the Toy samples

For fitting the toy sample two methods are put in place. The first one is the standard Chi2 scan as it was described above. The second one is use of Minuit, where we are interfacing our likelihood with minuit and with minos extract the value of the errors. The reason why the second method has been put into place is because of it's increase in performance for generating and fitting the toy samples. Initially a validation has been performed to identify whether Minuit is doing exactly what we expect as with the scan method:

ToyID1 BDTCut03 Bootstrap.png
Example toy sample for values quoted below, showing the invariant mass distribution and proper time
SignalMC BDTCut04 invMass propTime.png
Signal MC distribution used for the fit

Variable Method Value Errors Likelihood value FitLog
TauMin Scan 1.01 [-0.27,0.39] 0.809202  
TauMin Minuit 1.001 [-0.239735,0.375721] 0.809543 FitLogForChi2Fit

The result of fitting the lifetime with different tau values (scan) is being showed here where the minimum value is being found.

Chi2Scan ToyID1.png
Chi2 scan with step 0.03
ScanVsMinuitInChi2 100Toys.png
Chi2 fits with the scan method and minuit per event ratio and difference

Pulls/Residuals for toys

In this section what I am going to show is the results of fitting a set of toys and see how the lifetime varies for different datasets. Two stages are being implemented to be 100% that what is done is unbiased and we are having a good understanding of the whole procedure. Initially what we are looking at is generating a toys from the signal MC only. What I mean is that we are going to generate directly the proper time distribution according to the expected number of events from the signal fit on data (~50 events), parameterise with the same weight (described above) the MC distribution, scale one to each other and fit the histograms. The second technique involves also the use of the sPlot technique. If no clear mistake is going to be found in the signal only fits then we are going to generate bootstrap samples with a blend of the bbmumuX MC for the background and the signal MC according to the expected events from the signal fit on data.

Pulls/Residuals Singal only (Chi2)

Configuration:

#Toys BDTCut tBins FitRange Fitter Minimizer Plots
1000 0.3 6 [0,12] Chi2 Minuit
SignalToys 1000 BDTCut03 Chi2 6Bins.png
Chi2 fitter Residuals and Pulls on signal MC only
1000 0.3 8 [0,12] Chi2 Minuit
SignalToys 1000 BDTCut03 Chi2 8Bins.png
Chi2 fitter Residuals and Pulls on signal MC only
1000 0.3 12 [0,12] Chi2 Minuit
SignalToys 1000 BDTCut03 Chi2 12Bins.png
Chi2 fitter Residuals and Pulls on signal MC only

By looking at the above plots there are a couple of outcomes that can be extracted. First of all we see that our procedure has a very small bias in terms of identifying the minimum, since we see that the mean value of the residual and pull plot is very close to 0. Some, more information to be extracted is that the Std. Dev of the residuals is O(0.2) which is compatible with the error we are observing on data, giving as the sensitivity of our method. Finally, two more things can be seen from the residuals and pulls, one of them is that the value of the Std.Dev. from the Pulls is different than 0 leading on an underestimation of the error in our method. The second fact is that there is clearly a bin dependence with the selection of tha proper time bins. To lift those issues, we decided to change the fitter and move from a chi2 fitter to a multinomial fitter, since we know that the chi2 fitter is an approximation of large statistics, which here is not the case.

Pulls/Residuals Signal only (Multinomial)

As stated above the Chi2 fitter seems to have a significant issue with the calculation of the errors (underestimation), that is why a decision was made to move into a multinomial fitter, by implementing an -2LogLikelihood fitter. The same studies with the same configuration were run, and cross checked so to see whether we can reach a better performance.

#Toys BDTCut tBins FitRange Fitter Minimizer Plots
1000 0.3 6 [0,12] Multinomial Minuit
SignalToys 1000 BDTCut03 Multinomial 6Bins.png
Multinomial fitter Residuals and Pulls on signal MC only
1000 0.3 8 [0,12] Multinomial Minuit
SignalToys 1000 BDTCut03 Multinomial 8Bins.png
Multinomial fitter Residuals and Pulls on signal MC only
1000 0.3 12 [0,12] Multinomial Minuit
SignalToys 1000 BDTCut03 Multinomial 12Bins.png
Multinomial fitter Residuals and Pulls on signal MC only
1000 0.3 24 [0,12] Multinomial Minuit
SignalToys 1000 BDTCut03 Multinomial 24Bins.png
Multinomial fitter Residuals and Pulls on signal MC only
1000 0.3 42 [0,12] Multinomial Minuit
SignalToys 1000 BDTCut03 Multinomial 42Bins.png
Multinomial fitter Residuals and Pulls on signal MC only

From the plots it can be seen that the multinomial fitter with the same binning as the chi2 in proper time has a Std. Dev. value in the pulls = 1.00. Meaning that the error calculation is being performed correctly. Additionally, we see that the performance of the fitter is almost constant in terms of the binning. Therefore, we decided to try and increase the bin content to large values and see if there is any correlation. From the two extra plots (24,42) it can be seen that there is no dependance on the error calculation with the binning, neither for the mean of the residual distribution. Also, the performance in terms of resolution on the lifetime value seems to be the same as the chi2, which is what it was expected as well.

Pulls/Residuals full dataset (Chi2)

The next step is to integrate in the toy procedure the sPlot technique and check the impact it has on our residuals and pulls. Even if the Chi2 did not provide sufficient performance with the signal only we will still going to run the pulls and residuals on it for different configurations and see how it performs with sPlot included. So what we are using is the bb->mmX MC for the bkg and the signalMC for the signal. However, there is a flow in this method that we need to validate, and I will describe it in the Assumptions section a bit lower.

#Toys BDTCut tBins Exp. Signal Events Exp. Bkg Events FitRange Fitter Minimizer Plots
1000 0.3 6 50 590 [0,12] Chi2 Minuit2
FullDatasetToy 1000 BDTCut03 Chi2 6Bins NoConstraintOnMass.png
Chi2 fitter Residuals and Pulls on full data set with sPlot and 6 ct bins
1000 0.3 8 50 590 [0,12] Chi2 Minuit2
FullDatasetToy 1000 BDTCut03 Chi2 8Bins ConstraintOnMassDifferenetBkg.png
Chi2 fitter Residuals and Pulls on full data set with sPlot and 8 ct bins
1000 0.3 8 50 295 [0,12] Chi2 Minuit2
1000 0.3 8 50 148 [0,12] Chi2 Minuit2
1000 0.3 12 50 590 [0,12] Chi2 Minuit2
FullDatasetToy 1000 BDTCut03 Chi2 12Bins NoConstraintOnMass.png
Chi2 fitter Residuals and Pulls on full data set with sPlot and 12 ct bins

Additionally we created a set of plots to show how the mean, std. dev. of pulls and residuals evolves with the different number of expected background events

FullDatasetToy 1000 BDTCut03 Chi2 MeanResidualVsExpBkg.png
Evolution plot for Residual Mean with the errors
FullDatasetToy 1000 BDTCut03 Chi2 SigmaResidualVsExpBkg.png
Evolution plot for Residual Std. Dev. with the errors
FullDatasetToy 1000 BDTCut03 Chi2 MeanPullVsExpBkg.png
Evolution plot for Pulls Mean with the errors
FullDatasetToy 1000 BDTCut03 Chi2 SigmaPullVsExpBkg.png
Evolution for Pulls Std. Dev. with the errors

There is a very nice feature that has been observed by looking at the pulls and residuals. If we check the number of underflow bins in the pulls we see that they decrease when decreasing the expected number of background events. This leads to the a preliminary outcome that most likely the mass fit on the independent variable is not perfroming as expected, leading to fits that take more background inside than expected. For this reason a study is being made where the stabilization of the mass fit is being attempted initially by fixing the sigma and the mean of the gaussian component to the values observed from fitting the signal MC. This study is being performed with the multinomial an shown below.

Pulls/Residuals full dataset (Multinomial - no changed statistics)

#Toys BDTCut tBins Exp. Signal Events FitRange Fitter Minimizer Constraint in signal mass model Plots
1000 0.3 6 590 [0,12] Multinomial Minuit2 No
FullDatasetToy 1000 BDTCut03 Mult 6Bins NoConstraint OldStat.png
Pulls and residuals for 6 ct bins with no constraint and full bkg
1000 0.3 8 590 [0,12] Multinomial Minuit2 No
FullDataToy 1000 BDTCut03 Mult 8Bins NoConstraint DifferentBkg OldStat.png
Pulls and residuals for 8 ct bins with full, half, quarter of the expected background, with the sequence starting from top and no constraint on mass singal model
1000 0.3 8 295 [0,12] Multinomial Minuit2 No
1000 0.3 8 178 [0,12] Multinomial Minuit2 No
1000 0.3 12 590 [0,12] Multinomial Minuit2 No
FullDatasetToy 1000 BDTCut03 Mult 12Bins NoConstraint OldStat.png
Pulls and residuals for 12 ct bins with no constraint and full bkg
1000 0.3 6 590 [0,12] Multinomial Minuit2 Yes
FullDatasetToy 1000 BDTCut03 Mult 6Bins Constraint OldStat.png
Pulls and residuals for 6 bins with full background and keeping the mean and sigma of signal model in mass constant
1000 0.3 8 590 [0,12] Multinomial Minuit2 Yes
FullDatasetToy 1000 BDTCut03 Mult 8Bins ConstraintDifferentBkg OldStat.png
Pulls and residuals for 8 bins with full background and keeping the mean and sigma of signal model in mass constant and different bkg
1000 0.3 8 295 [0,12] Multinomial Minuit2 Yes
1000 0.3 8 178 [0,12] Multinomial Minuit2 Yes
1000 0.3 12 590 [0,12] Multinomial Minuit2 Yes
FullDatasetToy 1000 BDTCut03 Mult 12Bins Constraint OldStat.png
Pulls and residuals for 12 bins with full background and keeping the mean and sigma of signal model in mass constant

Pulls/Residuals full dataset (Multinomial)

The next step is to integrate the toy procedure with the new fitter we identified and this is the Multinomial likelihood. However, there is a small issue with using the multinomial out of the box as we did for the signal only toys. The main reason for this is because the multinomial requires the entries of the sPlot histogram which in our scenario is an event per event weighted histogram. This leads to a discrepancy between the full toy scenario and the signal only scenario. A solution has been identified to use instead of the direct entries of the bin the square of the sum of the weights in the bin divided by the sum of the squared weights. This solution remains an approximation and that's why it has to been cross checked if we can use it right away. So for the multinomial the following plots have been observed:

#Toys BDTCut tBins Exp. Signal Events FitRange Fitter Minimizer Constraint in signal mass model Plots
1000 0.3 6 590 [0,12] Multinomial Minuit2 No
FullDatasetToy 1000 BDTCut03 Mult 6Bins NoConstraint NewStat.png
Pulls and residuals for 6 ct bins with no constraint and full bkg
1000 0.3 8 590 [0,12] Multinomial Minuit2 No
FullDatasetToy 1000 BDTCut03 Mult NoConstraintDifferentBkg NewStat.png
Pulls and residuals for 8 ct bins with full, half, quarter of the expected background, with the sequence starting from top and no constraint on mass singal model
1000 0.3 8 295 [0,12] Multinomial Minuit2 No
1000 0.3 8 178 [0,12] Multinomial Minuit2 No
1000 0.3 12 590 [0,12] Multinomial Minuit2 No
FullDatasetToy 1000 BDTCut03 Mult 12Bins NoConstraint NewStat.png
Pulls and residuals for 12 ct bins with no constraint and full bkg
1000 0.3 6 590 [0,12] Multinomial Minuit2 Yes
FullDatasetToy 1000 BDTCut03 Mult 6Bins Constraint NewStat.png
Pulls and residuals for 6 bins with full background and keeping the mean and sigma of signal model in mass constant
1000 0.3 8 590 [0,12] Multinomial Minuit2 Yes
FullDatasetToy 1000 BDTCut03 Mult Constraint DifferenetBkg NewStat.png
Pulls and residuals for 8 bins with full background and keeping the mean and sigma of signal model in mass constant and different bkg
1000 0.3 8 295 [0,12] Multinomial Minuit2 Yes
1000 0.3 8 178 [0,12] Multinomial Minuit2 Yes
1000 0.3 12 590 [0,12] Multinomial Minuit2 Yes
FullDatasetToy 1000 BDTCut03 Mult 12Bins Constraint NewStat.png
Pulls and residuals for 12 bins with full background and keeping the mean and sigma of signal model in mass constant

Conclusion on toy studies

It has been seen that non of our likelihood parametrizations produced a sufficient acceptable result. This leads us to change our fitter approach. All approached indicate either a bias on the fitter (mean and pull distribution shifted) or a issue with the error estimation (Pull distribution sigma = 1 not symmetric pulls). Neither stabilization the mass fitter nor by reducing the background (increase the separation of sPlot) lead to an acceptable result. For this reason we started implementing a numerical calculation of the likelihood with the use of toy generation. This will lead to a CPU intensive method but is the final fort to parameterize a weighted histogram. For that reason a new twiki page needs to be create to enhance all the stuff in regard of the NumericalLikelihood studies.

-- IoannisXiotidis - 2020-05-19

Topic attachments
I Attachment History Action Size Date Who Comment
Texttxt Chi2MinuitOutput.txt r1 manage 2.9 K 2020-06-15 - 11:42 IoannisXiotidis  
PNGpng Chi2Scan_MCFitToDataset1516_BDTCut03_05psbinning.png r1 manage 9.3 K 2020-06-11 - 10:52 IoannisXiotidis  
PNGpng Chi2Scan_ToyID1.png r1 manage 9.8 K 2020-06-15 - 11:42 IoannisXiotidis  
Texttxt CutRangeCheb0OnlyModelFit.txt r1 manage 10.5 K 2020-11-21 - 17:16 IoannisXiotidis  
Texttxt CutRangeChebOnlyModelFit.txt r1 manage 21.7 K 2020-11-21 - 17:16 IoannisXiotidis  
Texttxt CutRangeExpOnlyModelFit.txt r1 manage 9.7 K 2020-11-21 - 17:16 IoannisXiotidis  
PNGpng Dataset_1516_BDTCut03_BMUCALCMass_FullRange.png r1 manage 21.9 K 2020-05-20 - 00:37 IoannisXiotidis  
PNGpng Dataset_1516_BDTCut03_sPlot_8ctBins_BDT03_FullMassRange.png r1 manage 28.8 K 2020-06-09 - 15:28 IoannisXiotidis  
PNGpng Dataset_1516_BDTCut0416_BMUCALCMass_SBFit.png r1 manage 22.3 K 2020-05-20 - 00:31 IoannisXiotidis  
PNGpng Dataset_1516_sPlot_CombinatorialDistribution.png r1 manage 8.5 K 2020-06-09 - 15:36 IoannisXiotidis  
PNGpng Dataset_1516_sPlot_SSSVdistribution.png r1 manage 8.8 K 2020-06-09 - 15:36 IoannisXiotidis  
PNGpng Dataset_1516_sPlot_SignalDistribution.png r1 manage 8.6 K 2020-06-09 - 15:36 IoannisXiotidis  
Texttxt FitLogBDTBin3MUCALCMassSBFit.txt r1 manage 22.5 K 2020-05-20 - 00:31 IoannisXiotidis  
Texttxt FitLogBDTCut03MUCALCMassFullRangeFit.txt r1 manage 37.9 K 2020-05-20 - 00:37 IoannisXiotidis  
Texttxt FitLog_BDT03_FullMassRange_FullFit.txt r1 manage 33.4 K 2020-06-09 - 15:28 IoannisXiotidis  
PNGpng FullDataToy_1000_BDTCut03_Mult_8Bins_NoConstraint_DifferentBkg_OldStat.png r1 manage 38.3 K 2020-06-26 - 15:37 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_12Bins_NoConstraintOnMass.png r1 manage 16.7 K 2020-06-26 - 14:57 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_178ExpBkg_8Bins.png r1 manage 16.2 K 2020-06-23 - 10:56 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_295ExpBkg_8Bins.png r1 manage 16.3 K 2020-06-23 - 10:56 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_590ExpBkg_12Bins.png r1 manage 24.8 K 2020-06-24 - 12:39 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_590ExpBkg_6Bins.png r1 manage 18.8 K 2020-06-23 - 10:56 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_590ExpBkg_8Bins.png r1 manage 21.7 K 2020-06-23 - 10:56 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_6Bins_NoConstraintOnMass.png r1 manage 19.0 K 2020-06-26 - 14:57 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_8Bins_ConstraintOnMassDifferenetBkg.png r1 manage 37.7 K 2020-06-26 - 14:57 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_MeanPullVsExpBkg.png r1 manage 10.3 K 2020-06-24 - 12:39 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_MeanResidualVsExpBkg.png r1 manage 10.8 K 2020-06-24 - 12:39 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_SigmaPullVsExpBkg.png r1 manage 11.3 K 2020-06-24 - 12:39 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Chi2_SigmaResidualVsExpBkg.png r1 manage 13.0 K 2020-06-24 - 12:39 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_12Bins_Constraint_NewStat.png r1 manage 15.6 K 2020-06-26 - 15:38 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_12Bins_Constraint_OldStat.png r1 manage 19.3 K 2020-06-26 - 15:37 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_12Bins_NoConstraint_NewStat.png r1 manage 18.8 K 2020-06-26 - 15:38 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_12Bins_NoConstraint_OldStat.png r1 manage 18.2 K 2020-06-26 - 15:37 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_12ctBinsOnFullBkgNoConstraint.png r1 manage 16.8 K 2020-06-25 - 11:02 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_6Bins_Constraint_NewStat.png r1 manage 14.4 K 2020-06-26 - 15:40 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_6Bins_Constraint_OldStat.png r1 manage 17.6 K 2020-06-26 - 15:37 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_6Bins_NoConstraint_NewStat.png r1 manage 18.6 K 2020-06-26 - 15:38 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_6Bins_NoConstraint_OldStat.png r1 manage 14.7 K 2020-06-26 - 15:37 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_6ctBinsOnFullBkgNoConstraint.png r1 manage 17.3 K 2020-06-25 - 11:02 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_8Bins_ConstraintDifferentBkg_OldStat.png r1 manage 38.8 K 2020-06-26 - 15:37 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_AllPullsAndResidualsForDifferentBinning_6to12_WithConstantGausParameters.png r1 manage 36.0 K 2020-06-25 - 10:24 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_AllPullsAndResidualsForDifferentBkgEvents.png r1 manage 38.2 K 2020-06-25 - 10:24 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_AllPulsAndResidualsForDifferentExpBkgWithConstantGaus.png r1 manage 38.9 K 2020-06-25 - 11:02 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_Constraint_DifferenetBkg_NewStat.png r1 manage 37.8 K 2020-06-26 - 15:38 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_MeanPullVsExpBkg.png r1 manage 11.1 K 2020-06-25 - 15:23 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_MeanResidualVsExpBkg.png r1 manage 10.1 K 2020-06-25 - 15:23 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_NoConstraintDifferentBkg_NewStat.png r1 manage 41.9 K 2020-06-26 - 15:38 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_SigmaPullVsExpBkg.png r1 manage 9.6 K 2020-06-25 - 15:23 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_Mult_SigmaResidualVsExpBkg.png r1 manage 12.3 K 2020-06-25 - 15:23 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_SignalYieldResidualsPulls_ConstMeanSigma.png r1 manage 26.0 K 2020-06-30 - 12:17 IoannisXiotidis  
PNGpng FullDatasetToy_1000_BDTCut03_SignalYieldResidualsPulls_NoConstMeanSigma.png r1 manage 19.4 K 2020-06-30 - 12:17 IoannisXiotidis  
Texttxt FullRangeFullModelFit.txt r1 manage 32.3 K 2020-11-21 - 17:16 IoannisXiotidis  
PNGpng ScanVsMinuitInChi2_100Toys.png r1 manage 18.1 K 2020-06-15 - 13:14 IoannisXiotidis  
PNGpng SignaMC_FitSingleGaussianModel.png r1 manage 11.5 K 2020-06-30 - 12:17 IoannisXiotidis  
PNGpng SignalDistribution_sPlot_VariousBkgModelsAndRanges.png r1 manage 26.8 K 2020-11-21 - 17:16 IoannisXiotidis  
PNGpng SignalDistribution_sPlot_VariousBkgModelsAndRanges_Fixed.png r1 manage 35.5 K 2020-11-25 - 11:28 IoannisXiotidis  
PNGpng SignalMC_1516_BDTCut03_MeasuredProperTime.png r1 manage 17.3 K 2020-06-10 - 18:34 IoannisXiotidis  
PNGpng SignalMC_1516_NoBDTCut_TruthPropTimeFIt.png r1 manage 12.0 K 2020-06-09 - 17:47 IoannisXiotidis  
PNGpng SignalMC_BDTCut04_invMass_propTime.png r1 manage 19.6 K 2020-06-15 - 11:47 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Chi2_12Bins.png r1 manage 29.6 K 2020-06-22 - 13:18 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Chi2_6Bins.png r1 manage 30.1 K 2020-06-22 - 13:18 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Chi2_8Bins.png r1 manage 30.9 K 2020-06-22 - 13:18 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Multinomial_12Bins.png r1 manage 31.1 K 2020-06-22 - 19:29 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Multinomial_24Bins.png r1 manage 32.5 K 2020-06-22 - 19:29 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Multinomial_42Bins.png r1 manage 30.5 K 2020-06-22 - 19:29 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Multinomial_6Bins.png r1 manage 32.9 K 2020-06-22 - 19:29 IoannisXiotidis  
PNGpng SignalToys_1000_BDTCut03_Multinomial_8Bins.png r1 manage 32.0 K 2020-06-22 - 19:29 IoannisXiotidis  
PNGpng ToyGeneration_100_MeanValueOfHistogramOfDataSet.png r1 manage 6.5 K 2020-06-15 - 11:42 IoannisXiotidis  
PNGpng ToyID1_BDTCut03_Bootstrap.png r1 manage 19.0 K 2020-06-15 - 11:42 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_Cheb0OnlyBkg_CutRange.png r1 manage 43.0 K 2020-11-21 - 17:16 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_Cheb0OnlyBkg_CutRange_Fixed.png r1 manage 25.1 K 2020-11-25 - 11:28 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_ChebOnlyBkg_CutRange.png r1 manage 43.8 K 2020-11-21 - 17:16 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_DifferentBkgModels_LifetimeFitWithChi2.png r1 manage 88.6 K 2020-11-26 - 10:50 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_ExpOnlyBkg_CutRange.png r1 manage 46.6 K 2020-11-21 - 17:16 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_ExpOnlyBkg_CutRange_Fixed.png r1 manage 26.3 K 2020-11-25 - 11:28 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_FullBkgModel_FullRange.png r1 manage 47.5 K 2020-11-21 - 17:16 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_FullBkgModel_FullRange_Fixed.png r1 manage 28.1 K 2020-11-25 - 11:28 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_FullBkgWithCheb0Order_FullRange_Fixed.png r1 manage 27.9 K 2020-11-25 - 11:28 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.365_WeightsVsMassAndVsPropTime.png r1 manage 31.5 K 2020-11-24 - 00:12 IoannisXiotidis  
PNGpng sPlotOnData_BDT0.3_WeightsVsMassAndVsPropTime.png r1 manage 35.8 K 2020-11-24 - 00:12 IoannisXiotidis  
Edit | Attach | Watch | Print version | History: r25 < r24 < r23 < r22 < r21 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r25 - 2021-01-18 - IoannisXiotidis
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    Main All webs login

This site is powered by the TWiki collaboration platform Powered by PerlCopyright & 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
or Ideas, requests, problems regarding TWiki? use Discourse or Send feedback