Two-Dimensional simultaneous fits
General
Currently (2020-08-24) the two-dimensional fits are set up as follows.
1. Binning in BDT and η is chosen. Example: 2 bins in BDT with edges {0.1439, 0.2455, 1} times 2 bins in η with edges {0, 0.75, 2.5}.
2. Evolution prototypes are fitted individually, taking into account binning in one variable at a time. Example: the BDT evolution prototype uses datasets binned in BDT as {0.1439, 0.2455, 1} and full-range in η, and vice versa for the η prototype.
3. Evolution parameters for the BDT/η evolution in the simultaneous fit are initialized with the values obtained from these prototypes (i.e. evolution in one variable is not affected by binning in the other). The constant term of a 2D evolution is introduced as a single variable initialized with the average of the constant terms of the BDT and η evolutions.
4. The simultaneous fit is performed with all desired dependencies.
5. The evolution of a parameter is projected on the initializing prototype.
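As an illustration of step 1, here is a minimal sketch of assigning a candidate to a 2D bin with the example edges above. The function names, the 0-based indices and the half-open bin convention [low, high) are assumptions made for this sketch and are not taken from the fit code.

```python
from bisect import bisect_right

# Example edges quoted in step 1.
BDT_EDGES = [0.1439, 0.2455, 1.0]
ETA_EDGES = [0.0, 0.75, 2.5]


def bin_index(value, edges):
    """Return the 0-based bin index, or None outside [edges[0], edges[-1])."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def bin_2d(bdt, eta):
    """Return (BDT bin, eta bin), or None if the candidate is outside the binning."""
    i_bdt = bin_index(bdt, BDT_EDGES)
    i_eta = bin_index(eta, ETA_EDGES)
    if i_bdt is None or i_eta is None:
        return None
    return i_bdt, i_eta
```

For a prototype in one variable, only the index in that variable is used and the other variable is left full-range.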
UPDATE October 2020 - point 3. no longer holds
The plane-evolution parameters are now obtained by fitting a plane to the two prototype lines. This means translating the lines a + b*avBDT and c + d*avEta into a plane parametrized as alpha + beta*BDT + gamma*eta. The plane is fitted by minimizing, as a function of alpha, beta and gamma, the chi2 between the prototype lines evaluated at avBDT (avEta) of each BDT (eta) prototype bin and the average of the plane values over all entries of the respective bin. The x error is the error on the prototype bin mean.
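A minimal sketch of the plane fit described above, in plain Python. It assumes that each prototype bin is summarized by the mean BDT and mean eta of its entries plus one uncertainty. Because the plane is linear, its average over the entries of a bin equals its value at those means, so the chi2 minimization reduces to weighted linear least squares. The function names and the input layout are hypothetical, and how the uncertainty follows from the x error on the bin mean is left to the caller.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_plane_to_prototypes(line_bdt, line_eta, bdt_bins, eta_bins):
    """Return [alpha, beta, gamma] of alpha + beta*BDT + gamma*eta.

    line_bdt = (a, b) and line_eta = (c, d) are the prototype lines.
    bdt_bins / eta_bins list (mean BDT, mean eta, sigma) per prototype bin.
    """
    a, b = line_bdt
    c, d = line_eta
    # Each point: bin means, uncertainty, and the prototype-line value there.
    points = [(x, y, s, a + b * x) for x, y, s in bdt_bins]
    points += [(x, y, s, c + d * y) for x, y, s in eta_bins]
    normal = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for x, y, s, target in points:
        basis = (1.0, x, y)
        weight = 1.0 / s ** 2
        for i in range(3):
            rhs[i] += weight * basis[i] * target
            for j in range(3):
                normal[i][j] += weight * basis[i] * basis[j]
    # Normal equations of the chi2 minimum.
    return solve_linear(normal, rhs)
```

The fit needs the bin means to span both directions; if the mean eta were identical in every bin of both prototypes, gamma would be undetermined.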
Background unblinded MC: 2 BDT X 2 η Bins, constant evolutions
Fit Log:
fitLog_2Dev_constExpConst_constSlope_2BDT2ETA.txt
Prototype Evolution plots
Background unblinded MC: 4 BDT X 2 η Bins, both constant in η, slope linear in BDT
Fit Log:
fitLog_2Dev_constExpConst_constSlope_4BDT2ETA.txt
Prototype Evolution plots
AFTER plane-initialization update
Fit Log:
fitLog.txt
Prototype Evolution plots
Background unblinded MC: 4 BDT X 3 Eta Bins
The evolutions are chosen based on the evolution plots: wherever the linear fit to the parameter of interest has a larger p-value than the constant fit, the linear evolution is used, and vice versa. When the fit does not converge, the constant evolution is used instead of the linear one for parameters whose constant fit is reasonable.
This only converged with the slope being linear in BDT. Moreover, NLL minimization through MIGRAD had to be used; otherwise the fit returned nonzero exit statuses from minimize and hesse.
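The p-value rule above can be sketched in plain Python as follows. This is an illustration under assumptions: the individual-fit results are taken as points with symmetric uncertainties, both the constant and the line are fitted by weighted least squares, and the p-value is the upper-tail chi2 probability. All function names are hypothetical; the actual selection was done from the evolution plots.

```python
import math


def chi2_sf(chi2, ndf):
    """Upper-tail chi2 probability (p-value) for integer ndf >= 1."""
    if chi2 <= 0.0:
        return 1.0
    half = 0.5 * chi2
    if ndf % 2 == 0:
        # Even ndf: exp(-x/2) * sum_{k < ndf/2} (x/2)^k / k!
        term = total = 1.0
        for k in range(1, ndf // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd ndf: erfc(sqrt(x/2)) plus a finite series in half-integer powers.
    term = math.sqrt(2.0 * chi2 / math.pi)
    total = 0.0
    for j in range(1, (ndf - 1) // 2 + 1):
        if j > 1:
            term *= chi2 / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def fit_chi2(xs, ys, sigmas, linear):
    """Weighted least-squares fit of a constant or a line; returns (chi2, ndf)."""
    w = [1.0 / s ** 2 for s in sigmas]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    slope = 0.0
    if linear:
        swxx = sum(wi * x * x for wi, x in zip(w, xs))
        swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    chi2 = sum(wi * (y - intercept - slope * x) ** 2
               for wi, x, y in zip(w, xs, ys))
    return chi2, len(xs) - (2 if linear else 1)


def choose_evolution(xs, ys, sigmas):
    """Return ('lin' or 'const', p_const, p_lin); needs at least 3 points."""
    p_const = chi2_sf(*fit_chi2(xs, ys, sigmas, linear=False))
    p_lin = chi2_sf(*fit_chi2(xs, ys, sigmas, linear=True))
    return ("lin" if p_lin > p_const else "const"), p_const, p_lin
```

The fallback to a constant evolution after a non-converging fit is a manual decision and is not covered by this sketch.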
Fit Log:
Prototype Evolution plots
Candidate count initialization with a plane
In certain cases the simultaneous fits were failing because of wrong individual fits in the 2D binning, which propagated into the simultaneous-fit starting values of the combinatorial/SSSV candidate counts (extended parameters). These were the only remaining parameters still initialized from the results of the individual 2D fits, which, with lower statistics, tend to fail or behave unexpectedly more often. An approach similar to the initialization of the shape parameter dependences, based on the 1D binned prototypes, was employed here. The fractions of nComb and nSSSV were studied in the BDT / eta binning, and a plane was fitted to these dependences. Then, with the nComb/nSSSV fraction as a function of bdt and eta, the starting values of nComb and nSSSV for the simultaneous fit are calculated by multiplying these fractions by the total number of entries in a given 2D bin. This is only used for the initialization, and the counts are otherwise unconstrained in the simultaneous fit. Should any of the components be initialized with a negative number, zero is used instead. With this initialization, the fit converges with both constant and linear dependences of the parameters in BDT and eta in the following 3x3 binning example.
A 2D linear ("pol1" in both bdt and eta) dependence of the SSSV/comb fractions was used for the initialization.
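The count initialization above amounts to the following sketch. It assumes the fraction plane is evaluated at the mean BDT and eta of each 2D bin; the function name, the bin labels and the numbers in the usage example are hypothetical.

```python
def initial_counts(fraction_plane, bin_means, bin_entries):
    """Starting values of a candidate count (e.g. nComb) per 2D bin.

    fraction_plane = (alpha, beta, gamma) of the fitted fraction plane.
    bin_means maps a bin label to (mean BDT, mean eta).
    bin_entries maps the same label to the total number of entries.
    """
    alpha, beta, gamma = fraction_plane
    counts = {}
    for label, (av_bdt, av_eta) in bin_means.items():
        fraction = alpha + beta * av_bdt + gamma * av_eta
        # A negative starting value is replaced by zero.
        counts[label] = max(0.0, fraction * bin_entries[label])
    return counts


# Usage with made-up numbers: the second bin would start negative.
example = initial_counts(
    (0.5, -1.0, 0.1),
    {"BDT_0_ETA_0": (0.2, 0.4), "BDT_1_ETA_0": (0.7, 0.4)},
    {"BDT_0_ETA_0": 1000, "BDT_1_ETA_0": 200},
)
```

These values only seed the fit; the counts themselves stay free in the simultaneous fit.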
3 bins in eta, 3 in bdt, slope and expConst linear in both
fitLog:
fitLog_3EtaX3BDT.txt
nComb/nSSSV initialization (panels: BDT Bins, η Bins)
Prototype Evolution plots
4 bins in BDT, 3 bins in eta, parameters constant in both
In order to save space here, I'm not uploading the mass fits at the moment, only the evolution plots
Black points - individual fits, red line - constant fit to black points, green line - linear fit to black points, purple points - simultaneous fit evolution
fit log:
fitLog_3x4_allconst.txt
Prototype Evolution plots
4 bins in BDT, 3 bins in eta, parameters linear in both - will supply pictures once it's clear how to plot them
The fit converges, but in this case MIGRAD has to be run twice to get exit status 0.
Black points - individual fits, red line - constant fit to black points, green line - linear fit to black points, purple points - simultaneous fit evolution
fit log:
fitLog_3x4.txt
Prototype Evolution plots
4 bins in BDT, 3 bins in eta, expConst constant in BDT
In this setting, i.e. keeping expConst constant in BDT with the other constraints linear, the fit converges smoothly on the first attempt.
Black points - individual fits, red line - constant fit to black points, green line - linear fit to black points, purple points - simultaneous fit evolution
fit log:
fitLog_expConst-bdtconst.txt
This particular setting was also fitted with the default RooFit "fitTo" method using Minimize, to cross-check that it gives the same results as MIGRAD minimization of an NLL built explicitly from the simultaneous pdf instance (RooSimultaneous). The same results were obtained. The latter approach seems to be more stable in certain cases and allows for multiple calls of MIGRAD, and will therefore be used preferentially. It was also used in the 15/16 analysis fit.
6 bins in BDT, 5 bins in eta, expConst constant in BDT
In this setting, i.e. keeping expConst constant in BDT with the other constraints linear, the fit converges, but it fits a negative number of SSSV events in the BDT_1_ETA_2 bin.
Black points - individual fits, red line - constant fit to black points, green line - linear fit to black points, purple points - simultaneous fit evolution
fit log:
fitLog_bkg_6BDTx5ETA.txt
Bs parameter evolution updated & revised
Up to now, the signal model was fitted only with a ROOT functional model in the individual fits. A RooFit step was then added to the signal fitting as well, for consistency of tools with the background fitting. The revised signal parameter evolutions with BDT and Eta from the RooFit step of the individual fits are summarized below:
BDT evolution of signal model parameters
Eta evolution of signal model parameters
The evolutions are chosen based on the evolution plots: wherever the linear fit to the parameter of interest has a larger p-value than the constant fit, the linear evolution is used, and vice versa. When the fit does not converge, the constant evolution is used instead of the linear one for parameters whose constant fit is reasonable.
signal evolution in BDT - M1 and M2 constant, rest linear
Based on the evolution plots above, these evolutions were tested in the fit in BDT bins only.
Fit log:
fitLog_signalSim_bdt.txt
BDT evolution of signal model parameters
signal evolution in ETA - M1 and M2 constant, rest linear
The M1 and M2 parameters are again made constant; the rest of the parameters are linear.
Based on the evolution plots above, these evolutions were tested in the fit in ETA bins only.
Fit log:
signal_1BDTx3ETA.txt
Eta evolution of signal model parameters
2D signal fits
3 BDT bins 3 ETA bins, M1 and M2 constant in both, rest linear in both
This fit converged properly after merging the top two BDT bins
Fit log:
fitLog_signal_3BDTx3ETA.txt
parameter evolution in BDT
parameter evolution in ETA
S1 evolution in BDT - wrongly plotted in previous picture
THE SAME SETTING FAILS WITH TOP 2 BDT BINS NOT MERGED INTO ONE
4 BDT bins 3 ETA bins, M1 and M2 constant in both, S1, S2, f1 constant in bdt only, rest linear in both
With stricter constraints, this converges in 4BDTx3ETA default binning
Fit log:
fitLog_signal_4BDTx3ETA.txt
parameter evolution in BDT
parameter evolution in ETA
4 BDT bins 4 ETA bins, M1 and M2 constant in both, S1, S2, f1 constant in bdt only, rest linear in both
The same constraints were tested also in 4-by-4 binning:
EtaBins = {0,0.75,1.5,2.,2.5}
BDTBins = {0.1439,0.2455,0.3312,0.4163,1}
Fit log:
fitLog_signal_4BDTx4ETA.txt
parameter evolution in BDT
parameter evolution in ETA
6 BDT bins 5 ETA bins, M1 and M2 constant in both, S1, S2, f1 constant in bdt only, rest linear in both
To check the stability with respect to the binning, the same constraints were also tested in 6-by-5 binning:
EtaBins = {0,0.75,1.25,1.6,2.,2.5}
BDTBins = {0.1439,0.2,0.2455,0.3,0.3312,0.4163,1}
Fit log:
fitLog_signal_6BDTx5ETA.txt
parameter evolution in BDT
parameter evolution in ETA
Bd component
Evolution of shape parameters in BDT/Eta prototypes
These are the evolutions (black - individual fit results, red - constant fit to black points, green - linear fit to black points)
BDT evolution of signal model parameters
Eta evolution of signal model parameters
Based on these, let's try the following evolutions:
| Parameter | BDT | ETA |
| M1 | const | const |
| S1 | const | lin |
| M2 | const | lin |
| S2 | const | lin |
| M3 | lin | const |
| S3 | lin | lin |
| f1 | const | const |
| f2 | lin | lin |
4BDT x 3Eta default binning with the above evolutions
fitLog:
fitLog_BdCheck_sim_4BDTx3Eta.txt
parameter evolution in BDT
parameter evolution in ETA
Revision of signal fit models
So far, the final step of the individual fitting was a sum of three Gaussians with all parameters free. The penultimate step uses the same three Gaussians, with the two narrower ones sharing a common mean. It could be viable to build the model from the penultimate step, i.e. with the common mean. The comparison of those two steps in the 4 BDT bin and 3 Eta bin prototypes follows.
Bs_modelComp_BDT.pdf,
fitLog_individual_BDT.txt
Bs_modelComp_ETA.pdf,
fitLog_individual_ETA.txt
Simultaneous fit to Bs MC in 4BDTx3Eta bins with common means of Gaussian 1 and 2
Based on the evolutions extracted from individual fits:
The evolutions of all parameters were set to be linear except the evolution of the common mean in eta, which was set constant. The result:
Fit log:
fitLog_commonMean_4BDTx3ETA.txt
Updated evolution plots containing projections of all bins in the sample: blue points correspond to the value of the respective parameter on the evolution plane at the average BDT and Eta of the corresponding 2D bin. Increasing bin number is represented by a darker colour. Example: in the BDT evolution plot of S2, there are three blue points in each BDT bin. The lightest blue corresponds to S2 calculated in the considered BDT bin and the first Eta bin, i.e. for BDT bin 2 this is the value in bin [2, 1], etc. Black points still correspond to the prototype results of individual fits of the sample binned in one direction only.
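How the blue projection points could be computed is sketched below. The sketch assumes 2D bins labelled by 1-based index pairs [BDT bin, Eta bin] as in the example above and a plane given as (alpha, beta, gamma); the function name and the input layout are hypothetical.

```python
def projection_points(plane, bin_means, axis):
    """Group plane values at the 2D-bin means by bin index along one axis.

    plane = (alpha, beta, gamma) of alpha + beta*BDT + gamma*eta.
    bin_means maps (BDT bin, Eta bin) to (mean BDT, mean eta) of that bin.
    axis = 0 gives the points for a BDT evolution plot, axis = 1 for Eta.
    """
    alpha, beta, gamma = plane
    grouped = {}
    # Sorting the bin labels keeps the points ordered in the other variable,
    # matching the light-to-dark colour order of the plots.
    for label, (av_bdt, av_eta) in sorted(bin_means.items()):
        value = alpha + beta * av_bdt + gamma * av_eta
        x = (av_bdt, av_eta)[axis]
        grouped.setdefault(label[axis], []).append((x, value))
    return grouped
```

With 3 Eta bins, each BDT bin in the result holds three (x, value) pairs, one per Eta bin.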
Simultaneous fit to Bs MC in 4BDTx3Eta bins with common means of G1&G2, all evolutions linear
The only constant parameter from above - the eta dependence of the common mean - has been allowed to evolve linearly.
Fit log:
fitLog_evolution_bdt_commonMeanCheck_constraints_allLin_4BDTx3Eta.txt
Simultaneous fits of signal components - aim for stability
The above setting with all parameters linear in both variables proved to be unstable when fitted to a different binning.
Releasing the dependencies one by one, i.e. starting with several parameters constant and releasing them one at a time to be linear in a given variable, did not help.
A stable setting with minimal constant dependencies was found to have Sigma 2 and f2 constant in BDT.
Check on Bs:
This fit model unfortunately doesn't work well for the 6x5 binning with Bd, despite being quite stable in Bs...
Meanwhile, there are problems with generation in RooCategories: when a category is supposed to generate 0 events, it generates the number of events given by the extended term.
Simultaneous fits of signal components: initialization of finer binning with coarser - all parameters linear in both BDT and Eta
Another approach that could lead to more stable fits is to initialize a finer binning with the result of a coarser one. Since the model with all evolutions linear in both BDT and Eta converges in the "default" 4BDT X 3ETA binning, its result will be used to initialize the evolutions in the 6X5 binning.
This indeed converges in both Bs and Bd case:
A cross-check has been performed: all the pdfs in the simultaneous pdf were added together into a single bin to be compared against the individual fit over the entire dataset. This was done for 1) all simFit evolutions constant (expecting to get the same result as from individual fit) 2) all simFit evolutions linear (expecting to get better agreement with the MC than the individual fit)
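The idea behind case 1) of the cross-check can be illustrated with a toy sketch. It assumes the per-bin pdfs are added with weights proportional to the per-bin yields, and it uses a single Gaussian as a stand-in for the real signal model; the function names, bin labels and numbers are hypothetical.

```python
import math


def gaussian(x, mean, sigma):
    """Unit-normalised Gaussian pdf."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def summed_pdf(x, bin_params, bin_yields):
    """Yield-weighted sum of the per-bin pdfs, normalised to unity."""
    total = sum(bin_yields.values())
    return sum(bin_yields[label] / total * gaussian(x, *bin_params[label])
               for label in bin_params)


# With constant evolutions every bin carries the same (mean, sigma),
# so the sum collapses to the single pdf of the individual fit.
const_params = {"bin_a": (5.37, 0.02), "bin_b": (5.37, 0.02)}
yields = {"bin_a": 100.0, "bin_b": 300.0}
difference = summed_pdf(5.36, const_params, yields) - gaussian(5.36, 5.37, 0.02)
```

With linear evolutions the per-bin parameters differ, and the summed pdf is no longer a single Gaussian, which is why case 2) can describe the MC better than the individual fit.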
--
OndrejKovanda - 2020-08-24