Yield fits for Full Run 2
Yield estimation
Bs and Bd yields were estimated in two ways - from Run 1 SM expectations and from 15/16 SM expectations - using the formula:
N_exp = N_run1 * (σ_FR2/σ_run1) * (eff_FR2/eff_run1) * (L_FR2/L_run1)
I.e., taking into account the increase in the B-production cross-section due to the increased centre-of-mass energy, the difference in trigger efficiencies, and the available integrated luminosities.
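As a minimal numeric sketch of this scaling (the function and argument names are illustrative placeholders, not the analysis values):

double scaleYield(double nRef, double xsecRatio, double effRatio, double lumiRatio)
{
  // N_exp = N_ref * (sigma_FR2/sigma_ref) * (eff_FR2/eff_ref) * (L_FR2/L_ref)
  return nRef * xsecRatio * effRatio * lumiRatio;
}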
The calculation is performed here (spreadsheet):
expCands_Run2.xlsx
The results from both approaches are 400 Bs and 44 Bd candidates expected in the FR2 dataset.
Something to remember: the lifetime corrections for the D_ref values in the 15/16 paper (and in the internal note as well) are switched between Bs and Bd - perhaps the labels got switched - resulting in the larger D_ref value being presented for the Bd while it should have been for the Bs.
Pseudo MC production
The toy datasets are produced directly from the extended RooSimultaneous. The generation is done in two variables - the RooRealVar mass and the RooCategory corresponding to the bin. A type flag is added to each generated component - the PDG ID for Bd and Bs decays and 999 for background. The expected numbers to be generated are first Poisson-fluctuated prior to the RooSimultaneous::generate() call. The relative population of the various bins in the generated dataset is taken automatically from the extended parameters of the RooSimultaneous. This works well as long as we generate a large enough sample that every bin has at least one event. That is, however, not the case for the Bd component, where it often happens that a bin should be empty. When all the desired events have been generated and some bins remain with no events left to generate, a number of events corresponding to the extended term of the RooSimultaneous belonging to that bin is generated instead of 0. That is of course highly undesirable, so a measure has been implemented to remove events generated this way.
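A minimal sketch of this generation step (names like simPdf and binCat are placeholders, not the actual implementation):

#include "RooSimultaneous.h"
#include "RooRealVar.h"
#include "RooCategory.h"
#include "RooDataSet.h"
#include "RooArgSet.h"
#include "RooRandom.h"

RooDataSet* generateToy(RooSimultaneous& simPdf, RooRealVar& mass,
                        RooCategory& binCat, double nExpected)
{
  // Poisson-fluctuate the expected yield before calling generate()
  const int nGen = RooRandom::randomGenerator()->Poisson(nExpected);
  // Generate in (mass, bin); the relative bin populations come from the
  // extended terms of the RooSimultaneous
  RooDataSet* toy = simPdf.generate(RooArgSet(mass, binCat), nGen);
  // ...here one would drop the spurious events produced for bins that
  // should have stayed empty, as described above
  return toy;
}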
A projection of 500 toys generated this way is shown here: 500ToysProjection-OK_titles.pdf
Average BDT and ETA values for each bin are also needed in order to parametrize the shape dependences. Therefore, at the fitting time, each event is assigned a random BDT/eta value sampled from the corresponding MC taking into account the actual BDT & ETA bin (i.e. for an event generated from the Bd pdf into BDT 2 ETA 3 bin, a random BDT & ETA value is sampled from BDT bin 2 ETA bin 3 in the Bd MC). That way we get a fluctuating average BDT and ETA for each toy.
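A minimal sketch of this per-bin sampling (mcBdtEta is a hypothetical per-component, per-bin collection of (bdt, eta) pairs taken from the MC):

#include <utility>
#include <vector>
#include "TRandom3.h"

std::pair<double,double> sampleBdtEta(
    const std::vector<std::pair<double,double>>& mcBdtEta, TRandom3& rng)
{
  // pick a random MC event from the matching component and BDT/ETA bin and
  // reuse its (bdt, eta); averaging per toy then gives a fluctuating bin mean
  const unsigned idx = rng.Integer(mcBdtEta.size());
  return mcBdtEta[idx];
}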
Fitting 15/16 statistics toys
First, a set of toys was generated using the procedure described above with the 15/16 statistics, and subsequently merged into a single ETA bin, reproducing the 15/16 BDT binning. These were then fitted with the 15/16 model. The fit was performed in two steps: first, the background-only model was initialized with the 15/16 parameters and fitted to the sidebands; in the next step, the full model was fitted, starting from the 15/16 SM-expected yields for the signal components. Pulls and residuals were then investigated. Results here are for 200 toys.
Out of the 200 toys, one failed with nonzero status; the others converged cleanly with a full-quality covariance matrix.
Subsequent fits and checks of the pulls and residuals in the Bs/Bd yields revealed certain bugs, such as the exponential constant being fixed rather than determined by the fit. After fixing these, a comparison was made between fits using the randomized BDT averages (as described above) and the fixed BDT averages used in the 15/16 analysis; the differences between the resulting pulls and residuals were found to be negligible.
BUG FIX 16.4.2021: the random BDT for the bkg was actually drawn from the Bd MC. This was fixed and the figures updated. No significant impact from this bug was observed.
The pulls and residuals with randomized BDT:
correlations:
Yields correlation: -0.625143
Resids correlation: -0.690806
Pulls correlation: -0.690563
And with the 15/16 fixed BDT mean:
correlations:
Yields correlation: -0.625483
Resids correlation: -0.691427
Pulls correlation: -0.69106
NOTE: the TH1 means/RMSs are calculated only from the plotted range, while those in the TH2s also include the over/underflows.
Toys with continuous BDT & Eta
It's desirable to produce toys with continuous distributions of BDT & eta, unlike above, where the binning was already assumed in the generation. With continuous BDT/ETA we can bin our toys however we want when we fit them. We therefore need to change from a single mass pdf in each bin to a more complex product pdf(m | bdt, eta) * pdf(bdt, eta).
For the pdf(bdt, eta) we can, for our purposes, use a non-parametric distribution obtained by interpolating the bdt/eta histogram with an appropriate binning, as shown in the following figure for the Bs component. Also shown are the generated distributions at the same statistics. The bdt/eta histograms are weighted and as such should better reproduce the dependences observed in data.
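A minimal sketch of building such a component pdf with RooFit's Conditional() product (massPdf, the histogram and the interpolation order are placeholders/assumptions):

#include "TH2.h"
#include "RooRealVar.h"
#include "RooAbsPdf.h"
#include "RooArgList.h"
#include "RooArgSet.h"
#include "RooDataHist.h"
#include "RooHistPdf.h"
#include "RooProdPdf.h"
#include "RooGlobalFunc.h"

RooProdPdf* buildComponentPdf(RooAbsPdf& massPdf, RooRealVar& mass,
                              RooRealVar& bdt, RooRealVar& eta, TH2& hBdtEta)
{
  // non-parametric pdf(bdt, eta): interpolated histogram pdf
  auto* dh = new RooDataHist("dh", "bdt/eta hist", RooArgList(bdt, eta), &hBdtEta);
  auto* histPdf = new RooHistPdf("pdfBdtEta", "pdf(bdt,eta)",
                                 RooArgSet(bdt, eta), *dh, 2 /*interp. order*/);
  // pdf(m | bdt, eta) * pdf(bdt, eta)
  return new RooProdPdf("fullPdf", "conditional product",
                        RooArgSet(*histPdf),
                        RooFit::Conditional(RooArgSet(massPdf), RooArgSet(mass)));
}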
Continuous generation. Left - MC distributions, right - generated distributions |
It's useful to know which exact weights should be used. For that reason, figure 2a from the 15/16 paper was reproduced. It turns out that what is marked as "combinatorial", "unmatched", "missing particles" and bb-mesons in the full bbmumuX MC should be treated as a single "combinatorial" category. Bc should be excluded to be consistent with the 15/16 fitting procedure. These updates were made to the fitted datasets and the background was re-fitted.
BDT distribution of various components |
It turns out that DDW and QLC are not used for the background components, and the B-isolation weight is also not included in this plot, in order to achieve perfect agreement with figure 2a in the 15/16 paper.
A discrepancy was also observed in the BDT/Eta distributions of the two signal components after the full analysis selection (incl. the BDT cut):
Comparison of WEIGHTED Bs and Bd MC after analysis selection in terms of BDT and maxAbsEta |
Comparison of UNWEIGHTED Bs and Bd MC after analysis selection in terms of BDT and maxAbsEta |
*** note on error bars: The (Roo)histograms above are scaled to unity (i.e. by the factor 1/hist->sumEntries()). In the unweighted case, the error bars correspond to the Poisson (asymmetric) error of the original bin content scaled by that factor. In the weighted case, the errors are sqrt(Sum(w^2)) (symmetric), also scaled by the same factor. This is the default behaviour of RooDataHist when plotting. ***
Further investigation of the discrepancy showed that the eta distribution in the studied samples agrees between Bs and Bd at a reasonable p-value (obtained from a two-sample Kolmogorov-Smirnov test) at various stages of the (pre)selection.
Comparison of UNWEIGHTED Bs and Bd MC Before analysis preselection (loose ntuple) |
Comparison of UNWEIGHTED Bs and Bd MC After analysis preselection without BDT cut |
Comparison of UNWEIGHTED Bs and Bd MC After full analysis preselection including BDT cut |
Similarly, the B-meson eta distributions (not the max(|eta1|, |eta2|)) show compatibility between Bs and Bd. The B-meson pT does not:
B-meson pT and eta: Comparison of UNWEIGHTED Bs and Bd MC Before analysis preselection (loose ntuple) |
B-meson pT and eta: Comparison of UNWEIGHTED Bs and Bd MC After analysis preselection without BDT cut (loose ntuple) |
The BDT distribution, also in the full range, does not show good compatibility. Therefore, the distributions of the BDT input variables were compared between the two samples:
BDT output: Comparison of UNWEIGHTED Bs and Bd MC (After analysis preselection). The BDT response is different for each component. |
BDT input variables: Comparison of UNWEIGHTED Bs and Bd MC (After analysis preselection). Purpose: why is the BDT response different for Bs and Bd? |
BDT input variables: Comparison of UNWEIGHTED Bs and Bd MC (Before analysis preselection). The purpose of this set of plots is to see whether the differences in the BDT input variable distributions are due to the analysis preselections or not. |
The following set of plots shows those distributions applicable at various stages of reconstruction (EVNT, DAOD, loose preselection, analysis ntuple), from generator output all the way to the analysis ntuple.
Basic kinematic variables of the Bmumu decay. There is some tension in the pT. Muon 1 is mu-, muon 2 is mu+. Additional variables, delta phi and plong, enter the BDT; the delta phi in particular is quite shifted.
TRUTH: truth extracted from DAODs; truth-matched candidates, from the unpreselected ntuple
RECO: truth-matched candidates, ntuplized from DAODs; truth-matched candidates, from the unpreselected ntuple; truth-matched candidates, from the preselected ntuple
Studies with generator-level truth showed that the discrepancy in B_pT arises after a pT cut on the final-state muons (meant to emulate the trigger selection). The lighter Bd then needs a larger pT to produce muons above the threshold than the heavier Bs does. This shapes the pT distributions of the two processes in different ways, introducing a shape incompatibility observed as a drop of the KS p-value. The preselection cuts, however, introduce a further cut on B_pT at 8 GeV. As the bulk of the discrepancy introduced by the trigger selection is likely to sit in the low-pT part of the spectrum, where the Bs/Bd mass difference is most pronounced, the preselection brings the KS value at the analysis-ntuple level back to the value observed in the MC truth in terms of pT. The rest of the kinematic variables (B_eta, B_phi, muon pT, muon eta, muon phi) show good shape-wise agreement throughout the process.
This led me to think that the bulk of the disagreement in the BDT response is not due to the kinematics. To be sure, I cut away the low-pT part of the B_pT spectrum at 10 GeV, which brought the kinematic distributions into very solid agreement, but the BDT remained inconsistent even after such a cut.
Then I turned to the most important BDT variables in the list, the DR and |a_2D|. These too have incompatible distributions between the two samples. The largest discrepancy is at low values of DR and |a_2D|. Therefore, I removed the lowest bins (DR < 0.03, |a_2D| < 0.0133333) in the input-variable histograms and looked at the BDT distribution; it started to take a consistent shape in the two samples. Then I considered what could cause the difference in these two variables between the two MC samples. They internally differ in mass, lifetime, and hadronisation-related quantities (such as isolation). The only viable option I could come up with is the lifetime, since a "longer" distance from PV to SV is less likely to be mis-measured (not sure if such reasoning is correct). I therefore found a region where the Bs and Bd lifetime distributions are consistent in shape (1.1 - 2.3 ps), where both samples have about 52k entries. The entries in this region also agree in terms of the BDT distribution.
A more detailed look at the |a_2D|, DR and properTime correspondence:
properTime cut 1.1 < t < 2.6 |
a_2D > 0.0133333 DR > 0.03 |
a_2D > 0.02 DR > 0.05 |
Toy generation as of June/July 21
For the signal components, mass shapes are fitted to the corresponding MCs, and the conditioning bdt/eta pdfs are obtained by interpolating the weighted histograms, also taken from the corresponding MCs.
For the background components, mass shapes are fitted as a combined model to the COMB + SSSV part of the bbmumuX MC, but the conditioning pdfs are obtained separately for each component - by interpolating the COMB part for the combinatorial pdf and the SSSV part for the SSSV pdf.
The binning used in the conditioning histograms is 10 bins for both signal and background.
In the Bd component, a warning is issued when an event in a toy is generated beyond a certain bdt/eta value, because of the parametrization of the f2 parameter of the signal model (the relative fraction between the two core components). Above a certain bdt and eta it becomes negative, which triggers the warning. The "pathological region" in BDT and eta, defined by the evolution parameters of the f2 fraction as -1.3410e+00 * bdt - 4.6308e-02 * eta + 9.1542e-01 < 0 (i.e. where f2 is negative), is unpopulated in the original MC, but thanks to the "rough" interpolation of the BDT and ETA histograms (10 bins each), there is a non-zero chance of an event being generated in that region.
The actual warning is:
[#0] WARNING:Eval -- RooAddPdf::updateCoefCache(twoG) WARNING: sum of PDF coefficients not in range [0-1], value=-0.00974772
BDT / ETA interpolated histpdfs and what's actually found in the MC. The pathological region (very high BDT, high ETA) is not populated in the original, but the interpolation reaches slightly into it. |
Blue: mass generated in the pathological region still forms the peak. Importantly, only O(50) such events were generated when the full MC statistics was requested (>120 000 events). |
This happens in fewer than one per mille of cases, and the generated mass still follows the Bd distribution.
Background fits with updated MC sample (Bc excluded)
The background was re-fitted in 4BDTX3ETA:
===========================
31.5.2021 BUG FIX: PlaneInitiator had the full bbmumuX MC hard-coded as the input file for calculating the starting values for the 2D evolutions. Fixing this had a negligible (relative order 10^-4) effect on the bkg fits, and a slightly larger (~O(10^-2)) effect on the Bs signal. The Bd signal was first fitted to the updated/correct ntuple only after this bug was fixed.
===========================
Since we're aiming to use weighted events, we need to check whether the shapes fitted to the unweighted MC also describe the weighted MC data points. For this reason, the distributions were plotted over the weighted data. First, an overall global scale factor, the same for all bins, was applied to the pdf: sum(weights)/sum(entries). Next, the pdf in each bin was scaled according to the sum of weights in that bin (which is also the default RooFit behaviour). After that, the normalizations of each component were left free to vary and were fitted to the weighted mass histogram in each BDTxEta bin.
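A minimal sketch of the global scale factor (data is a weighted RooDataSet; names are illustrative):

#include "RooDataSet.h"

double globalScale(const RooDataSet& data)
{
  // sumEntries() is the sum of weights, numEntries() the raw entry count
  return data.sumEntries() / data.numEntries();
}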
Projections of scaled PDF over weighted MC with global scale factor being sum(weights)/sum(entries) |
Projections of scaled PDF over weighted MC with per bin scale factor |
Projections of scaled PDF over weighted MC with component normalizations determined by fitting them to the mass histogram. Normalizations allowed to be negative. |
Projections of scaled PDF over weighted MC with component normalizations determined by fitting them to the mass histogram. Normalizations forced non-negative. |
Fitting individual bkg components, one at a time
When the background components are to be fitted individually, the input sample needs to be split into its COMB and SSSV parts, and the procedure needs to be adapted to this change. The evolution prototype generation was adjusted too, in order to properly initialize the simultaneous fits. The prototype evolutions are here:
BDT Evolution of shape parameters when fitted one component at a time |
BDT Evolution of shape parameters when fitted one component at a time |
Based on the p-values it's reasonable to use the same evolutions for the shape parameters as were used in the combined-model fitting: expConst constant in BDT, linear in eta, slope linear in both.
The mass projections in individual bins follow:
COMPARISON WITH COMBINED MODEL FITS:
Fitting toys continuous in BDT/Eta
15/16 statistics
These were generated with the SM-expected numbers for the signal components and with the bkg component yields obtained from SB fits to the 15/16 data.
nBs = 121.333 (91 in top 3 bins), nBd = 13.333 (10 in top 3 bins), nComb = 10188.3, nSSSV = 718.2
These numbers were Poisson-fluctuated before generation.
Subsequently, the toys were binned in the usual 4 BDT bins of the 15/16 analysis and fitted with the 15/16 yield fit model (modulo the peaking bkg and relative bin efficiencies). Here are the pulls and residuals:
Since the generation was continuous in BDT and eta, the background pulls/residuals are calculated wrt the actual number of generated COMB/SSSV events falling into each bin, i.e. the Poisson fluctuation is not included.
Signal yield pulls/resids |
Signal yield pulls/resids |
COMB pulls/resids |
SSSV pulls/resids |
FR2 statistics
These toys were generated with the SM projection for FR2 for the signal components. The COMB and SSSV yields were scaled from the SB fits to the 15/16 data with the 2020 BDT. The scale factor used here was the luminosity ratio FR2_available_luminosity/1516_available_luminosity (139/36.2).
nBs = 533.333 (400 in top 3 bins), nBd = 58.6667 (44 in top 3 bins), nComb = 19808.4, nSSSV = 2550.4
Again, these were fitted with the 15/16 model.
Since the generation was continuous in BDT and eta, the background pulls/residuals are calculated wrt the actual number of generated COMB/SSSV events falling into each bin, i.e. the Poisson fluctuation is not included.
Signal yield pulls/resids |
Signal yield pulls/resids |
COMB pulls/resids |
SSSV pulls/resids |
Investigating the bias - generating & fitting with the 1516 model
In order to determine the source of the bias, toys were generated with the simplified 1516 model (SSSV+COMB+Bs+Bd) and fitted with the same function. The toys were generated binned, so the background yield pulls and residuals are calculated wrt the expected value in each bin (before Poisson-fluctuating and generating).
1516 stats
Signal yield pulls/resids |
COMB pulls/resids |
SSSV pulls/resids |
FR2 stats
Signal yield pulls/resids |
COMB pulls/resids |
SSSV pulls/resids |
Generating with full model, constraining BDT and ETA evolutions, fitting with 1516 model
Summary table:
bias_summary.xlsx
a bit more elaborate text on this:
biasStudy.pdf
#1: Generation: evolutions in BDT & eta frozen for the signal; eta evolutions also frozen for the background; BDT evolution allowed only for the chebychev slope
- SAME evolutions as the fit model, although the functional forms for the signal are different (3 gauss in generation, 2 in fitting)
Doesn't seem to break at higher stats
#2: Generation: evolution in BDT frozen for signal, chebychev slope evolved in BDT, everything evolved in eta
- Proxy for the contribution of the bdt-evolution difference in signal between generating model (evolved in BDT) and fitting model (constant in BDT). Eta evolutions linear in all parameters.
#3: Generation: evolution in ETA frozen everywhere; otherwise everything except the exponential slope evolved in BDT
- Proxy for the contribution of the eta evolution in all parameters. That is different, since at generation everything depends on eta while the fitting is eta-agnostic
#4: Generation: evolution of signal frozen in both BDT and ETA, chebychev linear in both, exponential slope in ETA only
Eta-dependent fitting
First, the fitter was validated on toys generated without eta dependence in the signal and was shown to have a small bias. The fitting was performed in 3 eta bins (eta binning only), with events corresponding to BDT bin 0 removed (to get rid of the background). In the next step, a model describing the signal has to be found. Sub-step no. 1 is to generate Bs-only toys of the expected statistics using the full-evolution generating model and to find a minimal signal model accounting for the eta evolution that describes these toys. The differences between these two models are summarized in the following table:
Differences between generating and fitting signal model
The fit model was fitted to 4000 Bs-only toys (binned into 3 eta bins, events corresponding to BDT bin 0 removed) generated with the fully-evolved generating model. The resulting pulls and residuals are:
Pulls & residuals with above-mentioned model
... with BDT bin 0 events included
Unbiased residuals were also found for the case of including BDT bin 0 events.
Next, we need to account for Bd. The first attempt is to tie the shapes together using a single additional parameter - a mass shift - so that the Bd shape is the same as the Bs one, with the means of the gaussians shifted by a common value that is fitted.
This approach was found not to be viable, as the Bd events leak into the Bs component whenever there is freedom in the means. Instead, the separation was fixed to the value observed when fitting Bs + Bd samples generated at comparable statistics, about three times the Bs expectation. We therefore have a Bd + Bs model composed of two gaussians per component, with all shape parameters in common apart from the means, which are shifted by a common constant for the Bd component. Additionally, the fractions of the yields in each eta bin are required to be the same for Bs and Bd, and are determined in the fit.
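A minimal sketch of such a tied Bd + Bs shape (all numeric values, ranges and the separation deltaM are illustrative placeholders, not the fitted values):

#include "RooRealVar.h"
#include "RooFormulaVar.h"
#include "RooGaussian.h"
#include "RooAddPdf.h"
#include "RooArgList.h"

void buildTiedSignalModel()
{
  RooRealVar mass("mass", "m(mumu)", 4766., 5966.);      // illustrative range
  RooRealVar m1("m1", "Bs mean 1", 5366., 5300., 5430.);
  RooRealVar m2("m2", "Bs mean 2", 5366., 5300., 5430.);
  RooRealVar s1("s1", "sigma 1", 60., 20., 200.);
  RooRealVar s2("s2", "sigma 2", 110., 20., 400.);
  RooRealVar f1("f1", "fraction of gaussian 1", 0.8, 0., 1.);
  RooRealVar deltaM("deltaM", "Bs-Bd separation", -87.);  // fixed constant, not fitted
  // Bd means = Bs means shifted by the common, fixed separation
  RooFormulaVar m1Bd("m1Bd", "@0 + @1", RooArgList(m1, deltaM));
  RooFormulaVar m2Bd("m2Bd", "@0 + @1", RooArgList(m2, deltaM));
  RooGaussian g1Bs("g1Bs", "", mass, m1, s1), g2Bs("g2Bs", "", mass, m2, s2);
  RooGaussian g1Bd("g1Bd", "", mass, m1Bd, s1), g2Bd("g2Bd", "", mass, m2Bd, s2);
  // widths and the proportion f1 are shared between Bs and Bd
  RooAddPdf bsPdf("bsPdf", "Bs", RooArgList(g1Bs, g2Bs), RooArgList(f1));
  RooAddPdf bdPdf("bdPdf", "Bd", RooArgList(g1Bd, g2Bd), RooArgList(f1));
}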
The signal-only toys were generated with the analytical model containing all BDT and ETA evolutions. With all shape parameters frozen to values observed in Bs-only fits, the Bs + Bd toy fits are unbiased:
Pulls and residuals of the signal model fitted to the Bs + Bd toys. Shape parameters are fixed, only yields and relative fractions are determined in the fit. |
Fit result of one of the Bs + Bd toy fits, showing the initial values of the fitted parameters as well as values of the fixed parameters. |
Note that the slight average overestimation of the Bd yield can be traced back to the small difference between the BDT responses of the two signal components. The expected Bd yield of 44 that is subtracted when forming the residuals is calculated under the assumption that the responses of the two components to the BDT are the same. In fact, in this particular case, the average number of generated Bd events in the toys is 44.44, which effectively cancels out the ~0.5 mean of the residuals.
Background model and events were then included. The generation assumes linear shape evolution in ETA for both COMB and SSSV, and linear COMB and constant SSSV in BDT. The fitting then accounts for linearity in ETA in both background components. When this model is applied together with the background and fitted to the signal + background toys generated with the analytical models containing all considered BDT and ETA evolutions, the performance is nearly unbiased and the convergence rate is over 99.9%:
Pulls and residuals of the Bs and Bd yields with correlation |
Pulls and residuals of the Bs and Bd yields, 1D projections |
Fit result of one of the signal + background toy fits, showing the initial values of the fitted parameters as well as values of the fixed parameters. |
UPDATE 9. 11. 21: The version of the fitter was finalized, and the fixed parameters in the signal model were determined as follows:
1. Generate 5k Bs-only toys of 3x the expected FR2 statistics and fit with two gaussians with all parameters free (initialized with the 15/16 model values)
2. Fix M1 to the mean value observed in the step 1 fits, repeat step 1
3. Fix M2 to the mean value observed in the step 2 fits, repeat
4. Fix f1 to the mean value, repeat
5. Set the initial values of S1_const, S2_const and the widening rate to the values observed in step 4
6. Fix the Bs-Bd mass separation to the observed (mean) difference in means of generated Bs and Bd samples of 100k events each
7. The Bd model is identical to the Bs one, with M1 and M2 shifted by the value found in step 6
8. The Bs and Bd fractions of the total yield are bound to be the same amongst the eta bins
If the signal width parameters are released, some bias emerges already at the level of the signal-only toy fits, but it is small compared to the statistical uncertainty:
Pulls and residuals of the signal model fitted to the Bs + Bd toys. Shape parameters are fixed, only yields and relative fractions are determined in the fit. |
When fitted together with the background in the sequence bkg on sidebands -> bkg + frozen signal -> bkg + signal with released width, the bias gets worse but the correlation decreases significantly:
BUG 9. 11. 21: the S2_const parameter in the signal model very often ends up on the boundary of its allowed range when fitted to toys in eta-only binning. Investigation underway. First attempt: make the range broader; that helped in the sense that most of the values moved away from the boundary, but they still spread very broadly into large values. Second attempt: on top of the larger range, add an extra step to the fitting in which we first release s1_const and the widening rate, fit, and only then release s2_const for the final fit. That way s2_const no longer ends up on the boundary.
Pulls and residuals of the signal with released width parameters + bkg yields with correlation |
Pulls and residuals of the signal with released width parameters + bkg yields, 1D projections |
Other fitting strategies were tested, namely:
1) bkg on SB -> bkg + frozen signal -> release signal width and freeze signal -> release bkg
2) bkg + frozen signal -> release signal
3) bkg on SB -> bkg + frozen signal -> release widening rate -> release constant terms of sigmas
All have similar bias & correlation properties.
BUGFIX: the background yields were initialized with nonsensical values. For now, the initialization is changed to (total expected comb/sssv events)/(number of bins) for the eta-binned-only fits. The effect on the resulting yields is very small, of the order of 0.5 events in the Bs/Bd residual means.
Since the bias in the pulls is low, the large residual values that pull the residual mean away from zero also have large errors. The pull bias therefore corresponds to only a few per cent of the stat. uncertainty, which is acceptable.
The eta binning was changed to see the fit behaviour. First, three eta bins designed to have the same signal statistics were tested. The comparison of the resulting pulls/residuals with respect to the previous case of the "default" eta binning follows.
Comparison of default and equal-stat binning |
Default binning - Bs Bd correlation |
Equal stat binning - Bs Bd correlation |
Next, four eta bins with same signal statistics were found and fitted:
Comparison of default and equal-stat binning |
Equal stat binning in 4 bins - Bs Bd correlation |
Next, four eta bins with non-equal statistics were tested. The bin edges in this case were: {0., 0.9, 1.4, 2., 2.5}.
Comparison of default and equal-stat binning |
NON-Equal stat binning in 4 bins - Bs Bd correlation |
1516 model vs New model in default BDT-only binning
The old fitter was compared to the new one in the same setting - fully evolved toys, FR2 stats, 4 BDT bins (same as 1516) - and the differences in behaviour were investigated.
The widths of the signal components in the new model are roughly 10% wider for the Bd and 5% wider for the Bs wrt the 15/16 model. Additionally, the mean positions of the signal peaks (calculated as the weighted average of the means of the two gaussians making up a peak, weighted by the proportion factor) are shifted by ~+2.00 MeV for the Bs component and by ~+1.27 MeV for the Bd component. These shifts and width corrections were applied to the new model, which was fitted again to the set of fully evolved toys to check for the same trends in the biases as in the case of the 1516 model:
New model with signal width and position corrections on fully evolved toys |
Comparison at 15/16 statistics
The fitter was compared to the one used for the new BDT studies (which was already validated against the 15/16 results in autumn 2020).
A bug was fixed in the autumn 2020 version - the fraction of gaussian 1 in the Bd model was 0.88 but should be 0.83, in accordance with the internal note. The new fitter was brought into agreement in terms of the fixed parameter values for the signal shape, the peaking background model, the starting values and ranges of the fitted parameters, and the constraints on the average BDT value in each bin.
Changes introduced to the New fitter to be consistent with Autumn 2020 fitter |
signal means | set to 15/16 values (4 parameters)
signal sigmas constant terms | set to 15/16 values (4 parameters)
signal proportion factors | set to 15/16 values (2 parameters)
signal widening parameter | set to constant 0 (1 parameter)
eta-linear term of expConst | set to constant 0 (1 parameter)
eta-linear term of comb. slope | set to constant 0 (1 parameter)
fitting for true BDT mean | introduced
starting values of fitted parameters | set to Autumn 2020 starting values with 15-decimal-digit precision
ranges of fitted parameters | set to coincide with the reference
A cross-check on the 15/16 data showed no difference between the two fit results up to 5 significant digits. Then, an average of the 15/16 expectation (2.9/3) of peaking background events was generated in each BDT bin, added to the existing fully evolved toys of 15/16 statistics (after applying the 15/16 binning), and fitted with both fitters. The Bs/Bd pulls/residuals show cosmetic differences between the two results, which may come from a different order of variables during minimization and/or from differences in starting values beyond the 15th decimal digit or in fixed terms of the likelihood.
Pulls/resids from new fitter compared to those from autumn 2020 fitter |
Leaving out the BDT constraint in the new fitter and using fixed BDT values instead has next to no effect:
Fixed vs constrained average BDT |
Next, the model was gradually brought back to the new form and the pulls/residuals were compared. This is summarized in the table:
1516vsNew.xlsx
Adding Bhh component
We need a peaking-component model for the generation, dependent on BDT/eta. For this, we need to correct the VTX mass to resemble the MUCALC mass; this is ongoing. Meanwhile, the preselections were investigated in the Bhh ntuple. Wrt the signal ntuples, the preselections need to be adapted:
- muon pT & eta cuts applied to ID tracks instead of combined muons, which are unavailable
- no muon quality requirements applied (effect on mass distribution shape discussed in Fabio's thesis)
- no trigger requirement applied (effect on mass distribution shape discussed in Fabio's thesis)
Note: "isSignal" flag in the ntuples is false for significant fraction of Bhh events. For isSignal to be true, candidate must not only have a truth B, but also correct number of decay products, they must be either K or pi and have opposite charges.
Correcting VTX mass to MUCALC
VTX is shifted and wider wrt MUCALC. A mass-dependent correction has to be applied to VTX to resemble MUCALC. This was studied on the Bd mumu and Bs mumu MCs, where both mass calculations are available. Several approaches were explored:
- linear correction: fit a line to the (MUCALC - VTX) : VTX plot and read off the correction at any point as the value of that line - doesn't work
- binned correction: bin the (MUCALC - VTX) : VTX plot, calculate the average correction in each bin, and correct the VTX mass by the average correction value of that bin - doesn't work
These approaches don't work because the variance of the corrections falling into each VTX bin is very large. With these approaches we're merely shifting the VTX values in a given range by a constant, forgetting that the corrections are actually very different for each mass falling into a given bin - they need to be spread as well. This is supported by the observation that if we take as the correction not the mean value in a given bin but a random number sampled from gauss(bin mean, bin RMS), the corrected samples tend not to be inconsistent (by KS test) with the respective MUCALC reference when the correction is applied to its own sample (e.g. the Bd correction applied to the Bd sample). Cross-species application works as well, if we read off the correction from a point shifted by (mean(correctingSample) - mean(correctedSample)).
This, however, can't be used: due to the randomness involved, the approach is non-reproducible. Instead, an unbinned version of the (MUCALC - VTX) : VTX plot is used, and the correction is read off from the closest entry in that plot, while the cross-species application shifts are handled as described above.
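A minimal sketch of the closest-entry lookup (the table layout and the shift handling are illustrative assumptions):

#include <algorithm>
#include <iterator>
#include <vector>

struct CorrEntry { double vtx; double corr; };   // corr = MUCALC - VTX

// 'table' holds the unbinned correction plot, sorted in VTX;
// crossSpeciesShift = mean(correctingSample) - mean(correctedSample)
double correctVtxMass(double vtxMass, const std::vector<CorrEntry>& table,
                      double crossSpeciesShift)
{
  const double lookup = vtxMass + crossSpeciesShift;
  // binary search for the entry with the closest VTX value
  auto it = std::lower_bound(table.begin(), table.end(), lookup,
      [](const CorrEntry& e, double v) { return e.vtx < v; });
  if (it == table.begin()) return vtxMass + it->corr;
  auto prev = std::prev(it);
  if (it == table.end() || lookup - prev->vtx < it->vtx - lookup)
    it = prev;
  return vtxMass + it->corr;
}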
All corrections were done on the preselected ntuples, without BDT cut.
Here's how they work for B(d/s) --> B(d/s).
Correcting VTX mass using unbinned Bd/Bs correction plots & cross-validating
When applied to Bhh, both Bd & Bs corrections produce similar result:
Correcting VTX mass using unbinned Bd/Bs correction plots & cross-validating
NOTE: the preselections applied to Bhh before correcting the mass did not include the muon requirements, trigger, and mass cut. The muon requirements and trigger cannot be applied (but have a negligible effect on the mass shape - Fabio), and the mass cut was applied only AFTER correcting, on the corrected mass.
After this, the events were sampled from the ntuple based on the double-fake rates of each final state by accept/reject sampling. After applying a signal-region BDT cut (> 0.2455), the composition of the sample distantly resembles that found in Fabio's thesis / the 15/16 paper (the legend would be exactly the same as in Fabio's case):
Correcting VTX mass using unbinned Bd/Bs correction plots & cross-validating
In my case, the Bd->pipi and Bs->KK components are switched. Not sure why - bad management on my side, or bad labeling in Fabio's? Shape-wise they look very much the same in both plots...
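A minimal sketch of the accept/reject re-selection by double-fake rate mentioned above (the rate table and its normalisation are illustrative assumptions):

#include <map>
#include <string>
#include "TRandom3.h"

bool acceptEvent(const std::string& finalState, TRandom3& rng,
                 const std::map<std::string,double>& doubleFakeRate,
                 double maxRate)
{
  // doubleFakeRate: product of the single-leg fake rates of the two hadrons,
  // per final state (e.g. "KK", "Kpi", "pipi"); normalised to the largest
  // rate so the most fake-prone final state is always kept
  const double p = doubleFakeRate.at(finalState) / maxRate;
  return rng.Uniform() < p;
}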
Is the correction eta-dependent? Does it reflect the mass-resolution vs. eta correlation?
Here is a plot of the RMS of VTX and MUCALC in eta slices for Bs->mumu, Bd->mumu and B->hh, plus a nostack overlay of these slices for B->hh. The brighter the color, the higher the eta.
Mass slices in eta for Bhh. The brighter the color, the higher the eta. |
Residual dependence of Bhh mass shape on missing muon ID and trigger selection
In 15/16 this was investigated and no significant effect was found. They, however, integrated over eta, while we can have effects on the mass shape in different eta regions.
Muon ID = fake rates. The fake rates were recalculated and investigated as a function of eta:
Preselections other than usual: track pT > 4 GeV on both tracks, Mu ID applied to the numerator, HLT_mu4 applied to the numerator. Denominator without Mu ID and trigger. Just as in Fabio's thesis.
Single-leg fake rates of various hadron species as function of eta
The eta-integrated values differ from what Fabio found - partly, I think, because he switched kaons and pions:
Mine vs. Fabio's fake rates
Trigger = efficiency in different eta regions. The muon ID requirements then act on top of this, so if there are more events at some eta to act upon, there will be more fakes there. The eta distribution of Bhh without the trigger requirement is very different from that of B(s)mumu after applying the trigger requirements.
!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*
*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!
JANUARY 2022 REVISION BELOW - USE THAT!!!
!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*
*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!
The MUCALC and VTX mass values in MC are, in fact, conditional wrt the TRUTH mass. Both the VTX and MUCALC distributions are (sort of gaussian) randomly broadened TRUTH distributions, due to various detector effects. Since VTX has less information (or "hits") to be reconstructed from, it's broader, while MUCALC, on the contrary, has more info and thus less variance. The "find the closest entry" approach above showed that it's not only about moving the various VTX values by a mass-dependent offset, but rather about spreading them around some mean with some variance, manifested in the "correction plots" above as the position and y-axis spread of the cloud of points at a given VTX mass. What worked as well was to use a profile histogram with VTX on the x-axis and each bin's value and error being the mean and RMS of (MUCALC - VTX) in that VTX bin. This profile was then used to generate the correction, assuming it's gaussian: if a VTX value fell in bin n, it was corrected by a value generated from gauss(mean_n, RMS_n). In this narrative, our MUCALC values were thus conditional on the VTX values, and we obtained the narrower distribution by applying a "sort of broadening" to each VTX bin.
This worked because at the low and high ends of the mass spectrum the mean of the correction in the VTX bins is appreciable wrt its RMS, so the VTX values in a given bin are not only spread wrt one another but also shifted as a whole. This in turn is the case because the TRUTH distribution is basically a delta at the corresponding B mass. That is, however, not the case for Bhh, where we have multiple components, each with a different position & spread of the TRUTH due to particle mis-ID and Bs/Bd in the initial state. To do things properly, we therefore need to account for the TRUTH in our correction somehow. Since the TRUTH values are very narrowly spread (modulo radiative decays), we can base our correction not on VTX alone, but on VTX - TRUTH. Then, if the "correction profile" for Bd and Bs mumu shows the same behaviour, we can use the VTX - TRUTH based correction also for Bhh, as the differences between the TRUTH of the individual Bhh components due to muon mis-ID are of the order of the Bs/Bd difference (and thus also of the TRUTH difference between Bs -> mumu & Bd -> mumu).
The correction plots were created as follows: the baseline binning was 250 VTX - TRUTH bins in the range -900 MeV to 900 MeV. These were then re-binned in the following manner (see the sketch after the list):
1. Starting from the middle bin and going in the positive x-direction, each bin was checked. If it contained fewer than 10 entries, it was merged with the following bin. If the end of the histogram was reached without collecting 10 entries, the current bin was merged into the previous one.
2. Starting from the middle bin and going in the negative x-direction, the same procedure was applied.
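A minimal sketch of the merging in the positive direction (the helper name and the variable-edge output are illustrative; the negative direction is symmetric):

#include <vector>
#include "TH1D.h"

std::vector<double> mergedEdgesPositive(const TH1D& h, int minEntries = 10)
{
  const int mid = h.GetNbinsX() / 2;
  std::vector<double> edges{h.GetXaxis()->GetBinLowEdge(mid + 1)};
  double acc = 0.;
  for (int b = mid + 1; b <= h.GetNbinsX(); ++b) {
    acc += h.GetBinContent(b);
    if (acc >= minEntries) {                    // close the merged bin
      edges.push_back(h.GetXaxis()->GetBinUpEdge(b));
      acc = 0.;
    }
  }
  // leftover tail with < minEntries entries: merge it into the previous bin
  if (acc > 0. && edges.size() > 1)
    edges.back() = h.GetXaxis()->GetXmax();
  return edges;
}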
The correction for a given (VTX - TRUTH) value is then generated from a gaussian with mean and sigma read off from the corresponding (VTX - TRUTH) bin of the correction plot. The correction was cross-checked on the Bmumu MC samples:
Rebinned VTX - TRUTH correction cross-check |
Rebinned VTX - TRUTH corrections overlay |
The resulting KS p-values are reasonable considering the high statistics.
The performance of the corrections derived from the Bs mumu and Bd mumu samples was then compared on the Bhh MC after the applicable preselections (no trigger, no MuID, and for the time being no mass-window cut).
Rebinned VTX - TRUTH correction applied to Bhh |
The performance looks consistent. No attempt was made at a statistical test: the statistics is extremely large, while the expected number of events is ~5 orders of magnitude smaller, so even a significant inconsistency at high statistics would be irrelevant for the expected Bhh yield. As the Bs correction has somewhat better p-values in the Bmumu cross-check, it will be used further.
The Bhh sample was then accept-reject re-selected according to the updated fake rates (column "Me" above), and the mass-window cut was applied to the MUCALC proxy.
Analytical model for Bhh
First, we'll try to build a Bhh model for both generation and fitting that is flat in BDT and eta in terms of both shape-parameter evolutions and number of events. This was done using the averages of the newly found fake rates - the values in the table above in column "Me". The Bhh sample after preselections was accept-rejected to account for the final-state fake rates. The expected number of Bhh events in the full Run 2 dataset is 12.8 in the top 3 BDT bins. This number was obtained by the same procedure as the expected signal yields, scaling from the 15/16 analysis.
The model was built by fitting 1000 random samples of Poisson mean 1000 events with two gaussians. Parameters were gradually fixed to the average observed values. The resulting shape parameters are:
mean 1 = 5249.39 MeV
mean 2 = 5217.41 MeV
sigma 1 = 81.7998 MeV
sigma 2 = 185.808 MeV
f1 = 0.824468
This was then used to generate Bhh-only toys, uniform in BDT and eta, with a Poisson average of 12.8 events in the top 3 bins. These were then merged with the full-evolution toys already available. The same flat Bhh model was introduced into the fitter as well, with the fraction of events in each bin given by the area of that bin (uniformity assumption); these fractions are not fitted but fixed.
before flat Bhh |
with flat Bhh |
The large RMSs of the residuals/pulls in the 2D binning are due to several extreme fit results that nevertheless converged properly. This should be investigated. The bulk, however, has reasonable values.
!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*
JANUARY 2022 REVISION BELOW - USE THAT!!!
!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*!*
The Bhh analytical model was again built assuming no dependence of the shape on BDT or eta. 1000 random samples of Poisson(1000) events were drawn from the re-selected, corrected Bhh MC and fitted with a double gaussian. The starting values were those used in 15/16. After the first round of these 1000 fits, Mean 1 (the narrower gaussian) was fixed to its central value. In the next round, Sigma 1 was fixed. This was then repeated with the proportion fraction and Mean 2, after which we had the central values of all parameters. The reason for the step-wise fixing of the parameters was that not all of them had "nicely behaved" distributions after the first round with all parameters free. Then, starting from these values, the whole procedure was repeated in order to minimize the influence of the 15/16 starting values. The resulting shape parameters were:
'rough parameters' - after the first set of 1000 fits to O(1000) events:
Mean 1 = 5239.98 MeV
Mean 2 = 5227.26 MeV
Sigma 1 = 80.8908 MeV
Sigma 2 = 190.157 MeV
f1 = 0.785496
'refined parameters':
Mean 1 = 5240.12 MeV
Mean 2 = 5247.2 MeV
Sigma 1 = 81.7803 MeV
Sigma 2 = 162.742 MeV
f1 = 0.801962
The question is whether the rough parameters are sufficient (the procedure to obtain them is much cleaner to present). This will be checked in a GOF test series. Two tests were probed: Kolmogorov-Smirnov and its variation, Anderson-Darling. First, these two were validated with the null hypothesis satisfied, i.e. by setting up the Bhh model, generating events of O(1k), O(100) and O(10) statistics from it, and then running the tests. For each of the three statistics, 10k toys were generated.
The KS p-values are significantly biased towards positive values at low statistics. I think this is a downside of the pdf for the KS statistic used in ROOT. For the time being, let's use the AD test, which seems to work well at any of the probed statistics.
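A minimal sketch of running both tests with ROOT's GoFTest; shown here in its 2-sample variant, comparing a toy sample against a large reference sample drawn from the model (an assumption; the study itself tests against the model):

#include <vector>
#include "Math/GoFTest.h"

double adPValue(const std::vector<double>& toy, const std::vector<double>& ref)
{
  ROOT::Math::GoFTest gof(toy.size(), toy.data(), ref.size(), ref.data());
  // returns the p-value; KolmogorovSmirnov2SamplesTest() is the KS analogue
  return gof.AndersonDarling2SamplesTest();
}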
Bootstrap statistics | No binning | 5 eta bins of equal stat | 5 eta X 5 bdt of equal stat each
(each column shown for both rough and refined parameters)
O(1k) |
O(100) |
O(20) |
O(10) |
At O(10), the AD test shows no difference in sensitivity between the two parameter sets. The rough parameters can be used.
Adding relative yield constraints to the fitter
valid for January 2022
The relative yield constraints were introduced for the signal yields in the BDT bins according to the 15/16 analysis:
Bin 0 (0.1439 - 0.2455) : yield fraction = 1/3 (fixed)
Bin 1 (0.2455 - 0.3312) : yield fraction = 1. - Bin 2 fraction - Bin 3 fraction
Bin 2 (0.3312 - 0.4163) : yield fraction = 1/3 (mean) +/- 0.03 (sigma)
Bin 3 (0.4163 - 1.) : yield fraction = 1/3 (mean) +/- 0.04 (sigma)
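A minimal sketch of one such gaussian constraint (e.g. bin 2: 1/3 +/- 0.03; names are illustrative):

#include "RooRealVar.h"
#include "RooGaussian.h"

// observed fraction in BDT bin 2 (fitted parameter)
RooRealVar fracBin2("fracBin2", "signal yield fraction in BDT bin 2", 1./3., 0., 1.);
// constraint mean and width according to the 15/16 scheme above
RooRealVar fracMean2("fracMean2", "constraint mean", 1./3.);
RooRealVar fracSigma2("fracSigma2", "constraint width", 0.03);
RooGaussian constrBin2("constrBin2", "constraint on bin 2 fraction",
                       fracBin2, fracMean2, fracSigma2);
// multiplied into the likelihood, e.g. via fitTo(..., ExternalConstraints(constrBin2))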
Fits with flat Bhh
valid for January 2022
The Bhh events were generated with a flat distribution in BDT and eta - this literally means the unbinned distribution is uniform, which, as I learned later, was wrong; the correct version follows below. A mean of 12.8 events was maintained in the BDT region > 0.2455, and the events were added to the 4k toys containing the rest of the components. These were then binned and fitted in the following schemes: 4BDTX1ETA, 1BDTX3ETA, 4BDTX2ETA, 4BDTX3ETA - always with the eta/bdt bin edges designed to contain the same fraction of signal events. The results are:
summary plots:
correlation |
Bs summary |
Bd summary |
Fits with flat Bhh - correct version
Here the Bhh was generated with the Bs bdt and eta distribution, having exactly the same per-bin efficiency as the Bs, to which it is tied. There is a question whether to tie it to the observed event fraction (the fitted parameter) or to the true fraction (the mean of the constraint). So far it is tied to the fitted, observed fraction, but in that case the Bhh also picks up the statistical fluctuations of the Bs. On the other hand, it is then less prone to systematic effects related to a wrong (shifted) value of the constraint mean.
The convention has changed to explicitly mention the presence of BDT bin 0, i.e. what used to be 4BDTX3ETA is now 3p1BDTX3ETA. In that regard, the second row of what follows is not comparable to the table above, where BDT bin 0 was omitted completely.
summary plots:
correlation |
Bs summary |
Bd summary |
What's puzzling is the similarity of the residual RMS between Bs and Bd. Ideally, we'd expect a lower RMS (and thus expected uncertainty) on the Bd component. The expected sensitivity of the yield extraction in the 15/16 analysis is not available for comparison. If I run the 15/16 binning case with 15/16 statistics, I'm getting an RMS of 28 events on both components:
If I run on the same set of toys with the 15/16-consistent fitter presented a couple of sections back, I'm getting an RMS of 26 on both. If I run my actual new fitter (i.e. NOT the 15/16-mimicking version) on the 15/16 data, I'm getting an RMS of 21 on Bd and 23 on Bs, compared to 20 and 22 in the 15/16 results.
Let's do another test to see whether the correlation comes mainly from the overlap or from cross-talk with the background (for instance, if I made the Bs yield higher, it would take some events from COMB, which would affect Bd, etc...). We generate toys with an order of magnitude less background and look at the RMSs and correlation. Only the non-resonant background was reduced, since the Bhh yield is constrained anyway. The binning is 3p1BDTX1ETA (15/16 binning). In the 15/16 statistics case, the pulls are significantly biased due to a few extreme outliers.
15/16 stats, ten times lower non-res bkg |
FR2 stats, ten times lower non-res bkg |
FR2 stats, ten times lower non-res bkg - reparametrized signal - Feb 22 actual version - use this |
Reparametrizing signal fractions
The signal fractions were reparametrized to be treated per bin rather than as two 1D arrays of fractions (in BDT and in ETA). The fractions are determined from the Bs MC and constrained with gaussian constraints. The mean of each constraint is whatever comes from the MC, and the width has contributions from the BDT uncertainty and the ETA uncertainty. These will need to be revisited and possibly also treated as a source of systematics. For the 3+1 BDT bins the constraint widths are known from the 15/16 analysis. For the eta bins, the constraints are very preliminarily proxied by the integrated difference between the single-muon eta distributions in data and MC in J/psi K. The plot of mu- eta from internal note page 174 was used for this. It was folded into the absolute value of eta:
I.e., if I have an eta bin from 0 to 0.75, I integrate the difference in the above histogram, and that I pronounce to be my eta-induced uncertainty on the bin efficiency. It is then combined in quadrature with the BDT-induced uncertainty and introduced as the constraint width in a given BDTxETA bin. This will, again, be revisited.
The following scheme was introduced, with potentially varying numbers of bins in the BDT/eta directions:
The 15/16 BDT-induced uncertainties (sigmas of the constraints on bin efficiency) are split proportionately between the N eta bins forming one BDT bin.
Also, another calculation of the correlation was added: fitting the 2D residual/pull histograms with a 2D gaussian, excluding points more than a given number of sigmas away, and re-fitting iteratively. This is repeated until no more points are excluded, and the correlation parameter of the gaussian is reported (a sketch of the procedure follows the examples below). For example, this is what it looks like on the residuals of 1p1BDTX3ETA:
exclusion limit 5 sigma |
correlation original: -0.420, correlation excluded: -0.436, correlation gaussian: -0.524 (all with error approx 0.01) |
exclusion limit 3 sigma |
correlation original: -0.420, correlation excluded: -0.464, correlation gaussian: -0.522 (all with error approx 0.01) |
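A minimal sketch of the iterative 2D-gaussian correlation estimate (assumes a chi2 fit that skips zeroed bins; par[5] is the correlation rho):

#include "TF2.h"
#include "TH2D.h"
#include "TMath.h"

double gauss2D(double* x, double* p)
{
  // p: norm, muX, sigX, muY, sigY, rho
  const double dx = (x[0] - p[1]) / p[2], dy = (x[1] - p[3]) / p[4];
  const double rho = p[5];
  return p[0] * TMath::Exp(-0.5 * (dx*dx - 2.*rho*dx*dy + dy*dy) / (1. - rho*rho));
}

double fitCorrelation(TH2D& h, double nSigma = 5.)
{
  TF2 f("g2d", gauss2D, h.GetXaxis()->GetXmin(), h.GetXaxis()->GetXmax(),
        h.GetYaxis()->GetXmin(), h.GetYaxis()->GetXmax(), 6);
  f.SetParameters(h.GetMaximum(), h.GetMean(1), h.GetRMS(1),
                  h.GetMean(2), h.GetRMS(2), 0.);
  f.SetParLimits(5, -0.99, 0.99);
  bool excluded = true;
  while (excluded) {                          // refit until nothing is excluded
    h.Fit(&f, "Q0");                          // quiet, no drawing
    excluded = false;
    for (int i = 1; i <= h.GetNbinsX(); ++i)
      for (int j = 1; j <= h.GetNbinsY(); ++j) {
        const double dx = (h.GetXaxis()->GetBinCenter(i) - f.GetParameter(1)) / f.GetParameter(2);
        const double dy = (h.GetYaxis()->GetBinCenter(j) - f.GetParameter(3)) / f.GetParameter(4);
        if (h.GetBinContent(i, j) > 0 && dx*dx + dy*dy > nSigma*nSigma) {
          h.SetBinContent(i, j, 0);           // exclude the outlying point
          excluded = true;
        }
      }
  }
  return f.GetParameter(5);                   // correlation of the gaussian
}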
The above treatment of the binning was tested on 3p1BDTX1ETA, where the result was consistent with the previous binning, modulo the very small effect of the actual difference between the bin efficiencies calculated from MC and the nominal 1/3 assumed by the previous treatment. Next, 3p1BDTX3ETA was fitted with free, fixed, and constrained signal widening parameters (s2_const was found to misbehave). The widths of the constraints are taken from the final step of the Bs shape-determination procedure, as the widths of the corresponding parameter distributions. See the note from 9.11. above.
Signal-widening parameters free (default in everything above) | constrained | fixed |
3p1BDTX3ETA |
Fits with constrained signal width evolution parameters and with Bhh bootstraps
Various eta binnings were probed - from 1 to 6 eta bins with equal expected Bs efficiency. Fits were recorded with both fixed and constrained eta evolution parameters:
Correlation | Bd | Bs |
Green: constrained, Red: fixed |
Various exclusion limits for the correlation estimation with the 2D gaussian fit were probed. What follows are fits with fixed signal width (i.e. corresponding to the red points in the table above, which uses the 5 sigma exclusion).
4 sigma | 3 sigma | 2.5 sigma | 2 sigma | 1.5 sigma | 1 sigma |
Green: fixed |
Also, fits with Bhh bootstraps instead of the Bs-like toys (in BDT and ETA) were performed. The bootstraps are designed to have a mean of 12.8 events in the signal BDT range, although the BDT distribution is different from the signal one. Presented here in the "constrained" (not fixed) eta-signal-width evolution parameter version:
Correlation | Bd | Bs |
Green: constrained width, Bhh toys with Bs-like BDT and eta distribution; Blue: constrained width, Bhh bootstraps |
I tried to eliminate the effect of the eta and BDT inconsistency between the Bhh MC and the signal MC (due to the missing MuID and trigger cuts), and resampled the Bhh MC with accept-reject to have the same BDT and eta shape as the Bs prior to taking the bootstraps. Here are the resulting fits: the green points coincide with the green points in the table above (constrained width-evolution parameters, Bhh toys), and the red points use the constrained width-evolution parameters with bootstraps having the same eta/bdt distribution as the signal. The difference here is then only due to the missing eta evolution of the Bhh mass shape.
Correlation | Bd | Bs |
Green: constrained width, Bhh toys with Bs-like BDT and eta distribution; Red: constrained width, Bhh bootstraps with Bs-like BDT and eta distribution |
Cross-check - half-width signal
Toys were produced with the signal width scaled by 0.5 in both generation and fitting, aiming to check whether the bulk of the correlation lies in the overlap or in the bkg cross-talk. Fits were performed with fixed signal width evolution. The correlation dropped to ~-0.25 from ~-0.6 at full width. The narrowing factor was selected considering the Bs - Bd mass difference, so that, given the resolution, Bs and Bd are ~3 sigma apart at low eta and ~2 sigma apart at high eta. We would still like to see the effect of the eta binning if possible.
Green: fixed width, 2 times narrower signal |
So yes, the overlap seems to be the dominant cause of the correlation. But why does the sensitivity improve when we introduce eta binning if the signals don't decorrelate - neither at full nor at half signal width in the toys? Are the toys screwed up? It doesn't seem so. I proxied the resolution as a function of eta by slicing the signal MC (as a sort of reference) and the generated events into narrow eta bins, did a double-gaussian fit in both, and then plotted the sigma of the "g1 gaussian" (initialized as the narrower & larger one). For comparison, there's also the plain RMS of these slices:
<7.2. 2022 - switched to PDG mass separation between Bs and Bd - it's just a 0.5 MeV difference>
The RMS of the half-width toys is almost constant, and that of the regular toys & MC is larger than the fit proxy. I propose to explain this by the broad component: since it's supposed to be covered by the other gaussian than the plotted one, it doesn't influence the fit proxy. On the other hand, including the broad component makes the RMS larger than if it were determined by the core component alone (covered by the plotted gaussian 1). In the case of the half-width toys, only the sigmas were halved while the means remained the same. The broadest of the three gaussians therefore now lies relatively further away from the "bulk" two gaussians, so the RMS is likely driven by that rather than by the narrowing of all the gaussians.
Since no decorrelation occurs in the toy fits, the question arises whether the mass resolution is indeed well represented. Is the effect real? For that reason, we've looked at the pulls of the MUCALC mass in eta (Bs MC). These were calculated as (MUCALC - TRUTH)/sigma_m, where TRUTH is calculated from the truth track parameters.
This suggests that sigma_m is somewhat underestimated, by approx. 25-30%. I've therefore compared sigma_m with the RMS of the residuals calculated as (MUCALC - TRUTH). To support my theory that the broad component is responsible for the enhancement of the RMS, I've also added the plain RMS of MUCALC and the RMS of MUCALC in non-radiative events.
The only difference between green and orange is the radiative events.
In the process of studying the width evolution, an observation was made that of the two signal gaussians, only the narrower one actually gets wider with ETA, while the broader one stays pretty much the same. The signal model was re-built under this assumption on 5k signal-only toys and used in the following. The checks of these sections were redone and no difference in the results was observed - still no decorrelation with eta, still OK biases when using the Bhh bootstraps.
Radiative business
The radiative events seem to be a significant force behind the RMS of the signal. The fraction of radiative events seems to be independent of BDT, eta and the pointing angle - the quantities that would likely be influenced by photon radiation:
Several things were to be studied at this point:
1) If we assume no radiative events, do our signals decorrelate?
- re-build the toy models without radiative component, see what happens
2) What is the systematic effect of the fraction of radiative decays on the yields
- resample the MCs to have 25% and 50% radiative decays, then produce toys and fit with the 15/16-consistent fitter.
3) Can we get rid of them somehow, if it's worth reducing them?
1) No-radiative events world
First, the non-radiative events were isolated from the MC and a toy model was built in the usual way, except that the broadest gaussian was omitted from the model, as it made the fit unstable. The toy model used for both Bs and Bd was thus two gaussians with a common mean, all parameters linear in both bdt and eta.
Bs non-radiative toy model |
Bd non-radiative toy model |
Toys were generated with statistics of 1) FR2 stat, and 2) FR2 stat excluding the fraction of radiative decays from the expected signal yield (i.e. 400*(1-0.36) in the signal bdt range for Bs, analogously for Bd). Of course they got messed up and had to be re-generated, etc. Nothing ever works on the first go.
The signal fit shape was attempted as two gaussians with a common mean, the narrower one widening in eta and the common mean fixed in eta - although the toys show that the mean should actually decrease linearly with eta. Let's see if I can get away without this. The bias when fitting signal-only toys with this model in 3 eta bins was 3 events in Bs and -2 in Bd.
When fitting this to the 1) FR2 toys (i.e. 400 Bs non-radiative events, 44 Bd non-rad events), the correlation looks like this:
I've also tried to make a fancier signal fit model with only one gaussian whose mean depends on both BDT and eta, but it led to very heavy biases already with no eta binning, so I've abandoned this effort. Why it's still not significantly decorrelating remains a mystery for now. What is really going on? I've plotted the means & RMSs of the non-radiative toy & MC mass in both BDT and eta:
Sure, the toy reproduction is not perfect, but still - why does it not decorrelate more with eta binning? This is really strange to say the least...
Ten times lower bkg in no-binning case: correlation still strong.
Two times narrower signal: correlation significantly smaller.
Excluding radiative decays: correlation strong, pretty much same as with radiatives...
2) 25% and 50% radiative
First, the Bs and Bd MC was resampled to hold desired fractions of radiative events in the top 3 BDT bins. Then, the usual toy-generating models were built (3 gaussians, 2 of which have common mean):
[figures: Bs and Bd toy-generating models at 25% and 50% radiative fractions]
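A minimal sketch of what the resampling step could look like; the per-event radiative flag and the downsample-the-overrepresented-side strategy are assumptions, not necessarily how it was actually done:

```cpp
// Sketch: resample an MC sample so that radiative events make up fTarget of it.
// Whichever population is overrepresented w.r.t. fTarget gets downsampled.
#include "TRandom3.h"
#include <vector>
#include <cstddef>

std::vector<std::size_t> resampleToFraction(const std::vector<bool>& isRadiative,
                                            double fTarget) {
   double nRad = 0., nNon = 0.;
   for (bool r : isRadiative) (r ? nRad : nNon) += 1.;
   // Keep probabilities per population; one of the two always stays at 1.
   double pRad = 1., pNon = 1.;
   if (nRad / (nRad + nNon) > fTarget)
      pRad = fTarget / (1. - fTarget) * nNon / nRad;   // too many radiative
   else
      pNon = (1. - fTarget) / fTarget * nRad / nNon;   // too few radiative
   TRandom3 rng(0);
   std::vector<std::size_t> kept;   // indices of surviving events
   for (std::size_t i = 0; i < isRadiative.size(); ++i) {
      const double p = isRadiative[i] ? pRad : pNon;
      if (rng.Uniform() < p) kept.push_back(i);
   }
   return kept;
}
```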
For fitting these toys I used the 15/16-consistent flavour of the toy-generating fitter that had been used as the 15/16 reference in the previous sections. I've added the bin signal efficiency systematics according to the 15/16 values, which was the missing piece, and also turned off the "fitting for the true average BDT value", which 1) has absolutely no effect and 2) is inconsistent with 15/16. Here are the resids/pulls:
[figures: resids/pulls for 25% radiative, nominal (36%) and 50% radiative]
Here plotted as the Bs/d resid/pull mean against the % of radiative decays:
The trend is as expected - the more radiative events there are in the tail, the more of them fall under the Bd sphere of influence.
Summary of the correlation issues
correlation.pdf
Key observations:
1) Correlation is driven by the overlap of the peaks
- Shown by pretty much unchanged correlation when using 10x lower non-res bkg
- Shown also by significantly reduced correlation when using 2x narrower signals
2) Correlation is largely independent of statistics
- Shown by fitting the 15/16 stats in the radiative-fraction-systematics study
3) Correlation is not driven mainly through the radiative decays
- Shown by the largely unchanged correlation when fitting toys with and without the radiative component
- Radiative decays make the mass RMS larger, but that's an effect of a few more extreme values
- The fraction of radiative decays is constant in BDT, eta and a2D
4) MUCALC mass resolution is underestimated in the MC by 25-30%
- Shown by looking at the pulls of the MC
5) On the bright side - as soon as we introduce eta binning, the Bd sensitivity goes up by ~10%.
This is the state of things with different eta binnings in one place:
Sidenote on the no-eta binning case
Until now, for implementation & automation simplicity, we have introduced the eta values into the fit also in the no-eta-binning case, although the eta-dependent average background shapes were then not determined in the fit. We could afford this because the average etas in the individual BDT bins are pretty much the same, and their fluctuation is not large. If no eta binning were what we'd use in the final fit, however, we would not introduce eta into the fit at all. I therefore quickly checked what the drop in yield RMS would be from excluding the average-eta fluctuation, by making the no-eta-binning fit completely independent of eta. In practice, I just used a single plausible eta value for all the BDT bins, the same for all the toys. The effect is indeed very small, but I did this anyway for my peace of mind. Hopefully this shows that there's nothing to worry about in having the average eta present and fluctuating in the signal and bkg parameterization even in the no-eta-binning case.
[figures: no eta binning and no eta fluctuation either, so these fits carry no effect of average-eta fluctuations]
Addressing the mass resolution directly rather than eta
If one looks at the mass resolution distribution in the Bs MC, bkg MC and data sidebands, they seem reasonably similar (BDT > 0.2455 cut applied everywhere, weights applied to MC):
[figures: sigma_m distributions for Bs, bkg MC, data SB and a scaled overlay]
There are three peaks covering roughly 0-55 MeV, 55-65 MeV and 65-inf MeV. As a test, toys were generated with a signal model built on the 0-55 MeV sub-sample of the signal MC (both Bs and Bd), with the bkg left untouched. The correlation in the 4 BDT bin (15/16-like) setup immediately improves:
These toys were generated with FR2 statistics, but we've seen before that the correlation is rather independent of the stats.
Fitting sigma_m categories rather than eta
We could then attempt to perform the fit in these three regions of sigma_m. Given the similarity of the sigma_m distributions of our components, we may get away without a Punzi term, treating it in a way as a single (= non-simultaneous) fit. Indeed, the fractions of the components in the various sigma_m regions are:
[table: fractions of (weighted) events in the sigma_m regions]
These are safe to assume identical at the expected statistics.
The idea is then to build a likelihood of the form:
L = Pois(n|n') x L_1 x L_2 x L_3,
where each L_i term is the (non-extended) likelihood in one of the sigma_m bins. This incorporates the BDT categories, and I think it should be of the form:
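(The formula was an image in the original; reconstructed from the description below it should look roughly as follows, with the component fractions f_{c,b} and per-component mass pdfs P_{c,b} - notation assumed:)

```latex
L_i \;=\; \prod_{b \,\in\, \text{BDT bins}} \;\; \prod_{e \,\in\, \text{bin } b}
   \Big[\, p_b\!\left(\langle \mathrm{BDT} \rangle_e\right)
   \sum_{c \,\in\, \{B_s,\, B_d,\, Bhh,\, \mathrm{comb},\, \mathrm{sssv}\}}
   f_{c,b}\, P_{c,b}(m_e) \Big]
```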
With the outer product running over the 3+1 BDT bins, and the inner one over the events falling into each of these bins. Inside the inner product there is the "p" factor, which represents the value of the <BDT> pdf, and the sum runs over the components of the mass pdf (Bs, Bd, Bhh, comb, sssv).
Generating toys in sigma_m categories
The models were built as in the 15/16 case (3+1 BDT bins) on MC samples reduced to the three sigma_m categories:
[table of figures: toy-generating models for Bs, Bd and bkg in the sigma_m ranges 0-55 MeV, 55-65 MeV and 65-inf MeV]
They did not always converge with a pos-def error matrix, or had to be initialized with one another (I think the middle bin for Bs); in the highest sigma_m bin I had to change the starting value of the fit for the SSSV yield in bin 0, but the results look OK. I've also checked that the distributions of BDT are not incompatible between the subsamples and the entire sample for bkg; for Bs and Bd I need to scale down the stats of the MC significantly, and even then it works only in two out of the three sigma_m bins... I'll decide tomorrow how to generate the BDT values.
I've decided to generate the BDT from the entire-sample distribution just to keep things simple for the time being, for both bkg and signal. It can be tested later whether that's OK. As for the individual fractions of MC events in the different sigma_m categories with BDT > 0.2455 that are listed in the table above, I decided to use one fraction for all components. I can later generate toys with different fractions if needed. Let's then use the signal ones for the time being, i.e. 0.200657 for sigma_m 0-55 MeV, 0.161018 for sigma_m 55-65 MeV and 0.638324 for sigma_m > 65 MeV.
These need to be remembered as potential systematic sources: different BDT distributions in different sigma_m bins, and different sigma_m distributions between signal and background.
Eventually, I've implemented the fit in the following way: I've replaced our Bs, Bd, Bhh, sssv and comb pdfs with RooSimultaneous-es over the sigma_m categories, and then added these into a RooAddPdf, one for each BDT bin. These are then, as usual, combined into a RooSimultaneous over the BDT bins. There's a ROOT forum question I've posted to make sure this is OK:
https://root-forum.cern.ch/t/roosimultaneous-with-components-that-are-roosimultaneous-themselves/49015
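A minimal sketch of this nesting, with placeholder shapes and names (a plain Gaussian stands in for each real per-category shape; in real code the objects would live in a workspace):

```cpp
// Sketch: per-component RooSimultaneous over sigma_m categories, summed per
// BDT bin into a RooAddPdf, which enters the top-level RooSimultaneous.
#include "RooRealVar.h"
#include "RooCategory.h"
#include "RooGaussian.h"
#include "RooSimultaneous.h"
#include "RooAddPdf.h"
#include "RooArgList.h"

void buildNestedModel() {
   RooRealVar mass("mass", "dimuon mass [MeV]", 4766., 5966.);
   RooCategory sigmaCat("sigmaCat", "sigma_m category");
   sigmaCat.defineType("sm0_55");
   sigmaCat.defineType("sm55_65");
   sigmaCat.defineType("sm65_inf");

   // One mass shape per sigma_m category for a single component (placeholder values).
   RooRealVar mBs("mBs", "Bs mean", 5366.);
   RooRealVar s1("s1", "width, 0-55 MeV", 40.), s2("s2", "width, 55-65 MeV", 60.),
              s3("s3", "width, 65-inf MeV", 90.);
   RooGaussian g1("g1", "", mass, mBs, s1);
   RooGaussian g2("g2", "", mass, mBs, s2);
   RooGaussian g3("g3", "", mass, mBs, s3);

   // Component-level RooSimultaneous over the sigma_m categories.
   RooSimultaneous bsPdf("bsPdf", "Bs over sigma_m", sigmaCat);
   bsPdf.addPdf(g1, "sm0_55");
   bsPdf.addPdf(g2, "sm55_65");
   bsPdf.addPdf(g3, "sm65_inf");
   // ...build bdPdf, bhhPdf, combPdf, sssvPdf the same way...

   // Per-BDT-bin sum of the component-level RooSimultaneous-es; the top level
   // is then again a RooSimultaneous over the BDT bins, as elsewhere in the note:
   // RooAddPdf binPdf0("binPdf0", "", RooArgList(bsPdf, bdPdf, ...), RooArgList(nBs, nBd, ...));
   // RooSimultaneous fullModel("fullModel", "", bdtCat);
   // fullModel.addPdf(binPdf0, "bdt0"); ...
}
```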
I've built the signal-only model on signal-only toys as usual; it looks to work fine:
The result in the three sigma_m categories is:
Biases are reasonable and the correlation is about 10% better. The RMS's are similar to those observed once eta binning was introduced.
Scanning eta bin edges
The fitter was adapted to be able to fit non-equal-stat eta binnings. Changes had to be made in propagating the systematics on the individual bin efficiencies and in initializing the background yields (nSSSV and nCOMB are initialized as spread uniformly in eta). Instead of assuming exactly identical efficiencies for the eta bins, these are now taken from the MC (instead of always just dividing by nETABins), so that we can use arbitrary eta binnings. First, this was tested still with the uniform eta bin edges, but with this modification. The results are pretty much the same as in the case where the equal efficiency was hard-coded, with tiny differences in high decimal places, caused most probably by the fact that we don't get exactly the same efficiencies from the MC due to the finite statistics. But they do agree to all practicality:
[figures: green: before introducing the changes enabling non-equal-stat eta binning; blue: after introducing them]
Scanning was performed for the cases with 2 eta bins and 4 eta bins. I've moved the lowest-eta bin's upper edge up and down to places I deemed reasonable by consulting the infamous sigma_m vs. eta plot. For the 2 eta bin case the edges were {0.6, 0.7, 1., 1.2, 1.5, 1.7, 1.8}; for the 4 eta bin case they were {0.5, 0.6, 0.7, 0.9, 1., 1.1}. The numbers quoted are the percentage of converged fits.
[table: correlation, Bs and Bd results of the edge scans for the 2 eta bin and 4 eta bin cases]
The convergence is pretty low in the 2 bin case when the bin edge is too low. This was investigated in a couple of checks. The hypothesis is that, with the bkg shape depending linearly on eta, we are determining this line with just two points (as we only have 2 eta bins). To test this, we did:
1) Compare errors on the eta-evolution parameters between the 2 bin and 4 bin case. Only fully-converged fits were taken into account.
2) Inspect plots of a few of the not-properly-converged fits
3) Try to run it with some of the bkg shape parameters frozen
1) Indeed the average error on the eta-evolution parameters is notably higher in the 2 bin case:
2) I found out that the fits have not actually failed completely; it's just that their error matrix is not flawless ("forced positive-definite"). Several (O(10)) fits are genuinely weird, but the bulk suffers only from the covariance matrix issues. No wonder, then, that all the fit plots I checked looked OK.
3) I did a run with the SSSV slope frozen to its generation value, and full convergence (minimum + error matrix) was achieved in 3995 out of 4000 fits.
Indeed this all points toward the two bins not being able to constrain the fit properly with such an unequal binning.
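For reference, freezing such a shape parameter in RooFit is a two-call affair; the parameter name here is a placeholder for whatever the slope is called in the model:

```cpp
#include "RooRealVar.h"

// Freeze a shape parameter (here the SSSV eta-slope, name hypothetical) at a
// fixed value so the minimizer no longer floats it.
void freezeSlope(RooRealVar& sssvEtaSlope, double genValue) {
   sssvEtaSlope.setVal(genValue);
   sssvEtaSlope.setConstant(true);
}
```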
Additional systematics due to eta binning
It was decided that, for the decision whether the eta binning is worth it considering the additional systematics, we need to challenge the assumptions on the linearity of the bkg slopes in eta (and BDT).
For the time being I'm doing it like this:
baseline model & toys:
- 3+1 BDT x 4 ETA (can be changed), toys generated binned
- bkg expected yields: taken from 2020 BDT data SB fits, but distributed according to 15/16 BDT (same as the continuous generation)
- bkg shapes taken from MC (same as in the continuous generation)
- Tell the complete S+B model (RooSimultaneous) to generate the total number of events (i.e. not component by component as in the continuous case)
- On average the bias is +10 on Bs and +4 on Bd, about double w.r.t. the continuous-toys case. I don't know why, or whether it's a problem.
- The bias is not so large with 3 eta bins; actually it's pretty much the same as in the continuous-toys case
alternative model & toys:
- bkg slopes adjustable in each bin
- values taken from MC - I've done the fits on SSSV and comb separately, as the combined model often didn't converge
Null hypothesis toys:
literally just tell the RooSimultaneous to generate((mass, categories), total expected S+B events). Here's one:
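In RooFit terms that is a single call; a minimal sketch, reusing the hypothetical fullModel, mass and bdtCat names from the sketch above, with nTot standing for the total expected S+B yield:

```cpp
// Null-hypothesis toy: one generate() call on the full S+B RooSimultaneous,
// drawing the mass and the category together for the requested total yield.
RooDataSet* toy = fullModel.generate(RooArgSet(mass, bdtCat), nTot);
```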
3 eta bins
I've started with this, as the biases in the baseline case are not so large here. I did the variation of the sssv and comb slopes as described above:
[figures: sssv slope variations and comb variations]
[figures: null hypothesis fits with sssv and comb variations]
[figure: summary]
Difference between baseline (continuous toys) and null-hypothesis
The difference in RMS between the baseline (continuous) toys and the null-hypothesis toys generated from the fit model is strange. It seems we get the same gain by moving from baseline to null as we get, within null, between no eta binning and 3 eta bins. That has to be caused by the differences in the generation of the baseline toys and the null toys. We therefore need a complete list of these differences:
differences_in_toy_fits_baseline_vs_null_hypothesis.pdf
Effect of fluctuating <bdt> and <eta>
This was tested by fitting the baseline toys with fixed <BDT> and <eta> values, the same for all toys, rather than calculating them per toy. The effect of this is negligible.
Replacing signal and bkg with MC bootstraps
Signal, bkg and both were replaced with MC bootstraps in three sets of fits.
Adjusting BDT bin ranges
The BDT range in which we extract our signal is defined by a 54% signal efficiency and is further divided into 3 bins of equal (i.e. 18%) efficiency. This seems to be an unquestioned principle followed in the 15/16 and run 1 analyses. The 54% cut is probably the result of an optimization performed for the early run 1 paper. I don't know where the division into 3 bins comes from. In any case, this optimization was done to maximize the Bs sensitivity. That, however, makes the cut sub-optimal for the Bd component, which is ten times smaller and would thus have a stronger cut as its optimum. The idea here is to use the usual optimal cut for Bs sensitivity as the boundary of the signal-extraction BDT range, and, instead of dividing it into same-stat sub-intervals, find another bin boundary that would maximize the Bd sensitivity - and only then divide into further bins if needed (a sketch of such a scan is below).
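As a rough illustration of what such a boundary scan could look like - one possible figure of merit is the naive significance S_Bd/sqrt(S_Bd+B) above the candidate boundary; the merit function, the yield helpers and the scan granularity are all assumptions here, not the analysis' actual optimization:

```cpp
// Sketch: scan a candidate inner BDT boundary and pick the one maximizing a
// naive Bd significance above it. bdAbove(cut) / bkgAbove(cut) are hypothetical
// helpers returning the expected Bd and total-background yields above the cut.
#include <cmath>

double bestBdBoundary(double lowEdge, double highEdge,
                      double (*bdAbove)(double), double (*bkgAbove)(double)) {
   double bestCut = lowEdge, bestFom = -1.;
   for (double cut = lowEdge; cut <= highEdge; cut += 0.001) {
      const double s = bdAbove(cut);    // expected Bd yield above the cut
      const double b = bkgAbove(cut);   // expected background above the cut
      const double fom = (s + b > 0.) ? s / std::sqrt(s + b) : 0.;
      if (fom > bestFom) { bestFom = fom; bestCut = cut; }
   }
   return bestCut;
}
```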
OndrejKovanda - 2021-03-29