The most widely and recommended fitting tool in CMS is CMS.RooFit.
This section describes an alternative toolkit that allows to easily define in C++ function models to be used in fit problems.
Those fit tools have been originally developed for some BaBar analyses, then partially ported to CMSSW and used for Z → μ^{+}μ^{-} studies.
The toolkit has two main features:
Many fit applications are based on RooFit, which is supported under ROOT distribution. This toolkit is not intended to be a competitor of RooFit, it is just a possible alternative for the cases where defining a customized application may be more convenient.
This tool was ported to CMSSW to make easier the study of Z → μ^{+}μ^{-} inclusive cross section, and support for more general features will be considered only if there will be interest in the CMS community.
The toolikit is all contained in the package:
The following namespaces are used in this toolkit:
funct:: |
Function definitions |
fit:: |
Fit tools |
root:: |
Root utilities and wrappers |
Functions are specified as C++ function objects taking one or more arguments (up to two or three in some cases currently supported, but extensible). A simple example is the following:
struct MyFunct { double operator()(double x) { return exp(-x)*x; } };
So, one can call the following code:
MyFunct f; double y = f(x);
Expressions are like functions, but take no argument. One example is the following:
struct PiOverTwo { double operator()() { return M_PI / 2; } };
Functions and expressions become more useful than the above simple examples if parameters can be specified, as described in the following section.
Functions may be defined in terms of parameters that can be passed to function/expression
constructors. A convenient way to allow more functions to share the same
parameter is to use boost::shared_ptr
in the implementation. The utility funct::Paramter
helps passing parameters adding more functionalities to the bare boost::shared_ptr
(adds a string for the name, conversions to and from double). A simple example of parametric function is the Gaussian defined below:
namespace funct { struct Gaussian { static const double oneOverSqrtTwoPi = 1/sqrt(2*M_PI); Gaussian(const Parameter & m, const Parameter & s) : mean(m.ptr()), sigma(s.ptr()) { } double operator()(double x) const { double z = (x - *mean)/ *sigma; return oneOverSqrtTwoPi/ *sigma * exp(-z*z/2); } boost::shared_ptr<double> mean, sigma; }; }
To force, for instance, two Gaussians to have the same mean value, user code may look like the following:
Parameter mean("mean", 0.0), sigma1("sigma1", 1.0), sigma2("sigma2", 2.0); Gaussiam g1(mean, sigma1), g2(mean, sigma2); double x = 1.2345; double y1 = g1(x), y2 = g2(x);
Functions and expressions can be composed using:
+
, -
, *
, /
, power raising ^
; beware, of the unexpected precedence of the ^
operator in C++!)
A new function type is generated by the function argument types. Examples are:
typedef Sum<Product<Parameter,Gaussian>::type, Product<Parameter,Gaussian>::type>::type DoubleGaussian; Parameter a("a"), b("b"), mean("mean"), sigma1("sigma1"), sigma2("sigma2"); Gaussian g1(mean, sigma1), g2(mean, sigma2); DoubleGaussian g = a * g1 + b * g2;
Symbolic expressions are useful to define with simple C++ code complex functions without explicitly defining a new type. Symbolic expression manipulations, including simplifications, are evaluated using the compiler, and involve no polymorphism, and produce inlined code that is fully optimized by the compiler.
One has to first define the interesting variables. Variable types
X
, Y
and Z
are defined in the namespace funct::
by
default:
X x; Y y;
More variable types are easy to add:
// in the .h file: DEFINE_VARIABLE(W, "w"); // in the .cc file: IMPLEMENT_VARIABLE(W); // in user code: W w;
Expressions like the following can be easily defined:
X x; Y y; Expression expr = sin(x) * cos(y); x = 1.234; y = 4.321; double value = expr();
The only caveat is that you can't use explicitly numbers in your expression, but you have to use provided utilities:
X x; Numerical<1> _1; // this is a constant equal to 1 Numerical<2> _2; // this is a constant equal to 2 Expression expr = ((x - _1)^_2); x = 3; double value = expr(); // evaluates to (3-1)^2 = 4
Fractions can be used as follows:
Fraction<1,2> _1_2; // this evaluates to 1/2
Numerical constants and fractions can be generated on the fly with helper function templates:
Expression expr1 = x ^ num<2>(); // x ^2 Expression expre2 = x = fract<1, 2>(); // x ^ (1/2)
Symbolic derivatives are computed at compile time, including term simplifications, and involve no run-time overhead. Higher order derivative can also be computed, according to the following example:
cout << derivative<X>(sin(x)*cos(x)) << endl; cout << nth_derivative<2, X>(sin(x)*cos(x)) << endl; cout << nth_derivative<3, X>(sin(x)*cos(x)) << endl; cout << nth_derivative<4, X>(sin(x)*cos(x)) << endl;This will print the following:
cos(x)^2 - sin(x)^2 -4 sin(x) cos(x) -4 ( cos(x)^2 - sin(x)^2 16 sin(x) cos(x)More examples are provided below:
Function
, which
has as template parameters the variable(s) that need to be taken from the
expression and converted into C++ arguments. One example is:
funct::Parameter yield(kYield, commands.par(kYield)); funct::Parameter mean(kMean, commands.par(kMean)); funct::Parameter sigma(kSigma, commands.par(kSigma)); funct::Parameter c("C", 1./sqrt(2*M_PI)); funct::Numerical<2> _2; funct::Function<funct::X> f = yield * c * exp(-(((x-mean)/sigma)^_2)/_2); double y = f(1.2345);
typedef funct::Product<funct::Parameter, funct::BreitWigner>::type FitFunction; typedef fit::HistoChiSquare<FitFunction> ChiSquared; funct::Parameter yield("Yield", 1000); funct::Parameter mass("Mass", 91.2); funct::Parameter gamma("Gamma", 2.50); funct::BreitWigner bw(mass, gamma); FitFunction f = yield * bw; TF1 startFun = root::tf1("startFun", f, 0, 200, yield, mass, gamma); TH1D histo("histo", "Z mass (GeV/c)", 200, 0, 200); histo.FillRandom("startFun", yield); ChiSquared chi2(f, &histo, 80, 120); fit::RootMinuit<ChiSquared> minuit(chi2, true); minuit.addParameter(yield, 100, 0, 10000); minuit.addParameter(mass, 2, 70, 120); minuit.addParameter(gamma, 1, 0, 5); minuit.minimize(); minuit.migrad();
Any parametric expression can be used to be minimized. What is released so far is:
fit::HistoChiSquare
: chi-squared function based on an histogram and a fit function model.
fit::MultiHistoChiSquare
: simultaneous chi-squared on multiple histograms (up to three are suported) each with a fit function model.
Likelihood fits have been tested for a single dimension so far. They can be performed using the class template:
fit::Likelihood<Sample, PDF>
Sample
is the container that holds the data values. Could be std::vector
for
a single dimensional problem.
For Extended unbinned Maximum Likelihood fits, an extra parameter specifying the type of yield to
befitted has to be specified. Typically it is just funct::Parameter
:
fit::Likelihood<Sample, PDF, funct::Parameter>
The sequence of minuit commands can also be passed via data-cards, in order to avoid recompiling in case of small changes, like initial parameter values, fixing and releasing parameters, and so on. The last example of the section below uses the following data-card file as input:
The supported commands are:
par <name> <value> <min> <max> [free|fixed]
fix <name>
release <name>
set <name> <value>
minimize
migrad
print_all
Complete examples are released in CVS:
Expressions can be integrated in RooFit with the adapter tool root::RooFitFunction
.
An example of how to do it is shown below:
Interactive use of such adapters requires the generation of dictionaries, which should be trivial, but has not been tested yet.