-- LorenzoMoneta - 2015-11-16

ROOT Tutorials Exercises for UERJ (16-26 November 2015)


This tutorial aims to provide a solid base to efficiently analyse data with ROOT. The main features of ROOT are presented: histogramming, data analysis using trees and advanced fitting techniques.

The Indico page of the course is available at this link.

Material for the course

The slides of the lectures are available in electronic form. See each section for the corresponding links. Slides introducing the functionality of ROOT are also available. Here is a summary of all slides:

As complementary material, one can look at

  • ROOT Primer Guide, available in pdf, html or epub format. This introductory guide illustrates the main features of ROOT, relevant for the typical problems of data analysis: input and plotting of data from measurements and fitting of analytical functions.
  • ROOT User Guide. It can be downloaded in various formats (or as individual chapters) from here.
  • RooFit User Guide, available in pdf format. A concise RooFit quick start guide is also available here.
  • A special section introducing C++ (exercises, slides PDF)

Start using ROOT (slides PDF )

We will focus first on introductory exercises for getting started with the ROOT environment and writing our first ROOT macro. Two levels of help will be provided: a hint and a full solution. Try not to jump to the solution even if you experience some frustration. The help is organised as follows:

Here the hint is shown.

Here the solution is shown.

Some points linked to the exercises are optional and marked with a Pointing hand icon: they are useful to scrutinise in more detail some aspects. Try to solve them if you have the time.

Start using the ROOT prompt

First of all, open a terminal window on your computer and type

root

If this does not work, ROOT is not properly installed on your system. Stop here and ask for help to fix the installation. If it works, start playing with the prompt. Use it as a calculator and type:
root [0] 2+2
or whatever you like (see for example the lecture slides or page 5 of the booklet "A ROOT Guide for Beginners"). Note that when you issue a statement from the ROOT prompt, you can omit the ; required by C++. ROOT then prints the returned value, if there is one. For example:
root [0] TMath::Pi();
will not print anything, while
root [0] TMath::Pi()
will print the value of pi.

Afterwards start writing your first ROOT macro. A ROOT macro is a file containing some C++ code which can be run from the ROOT prompt. You need to define a function and write the code in the function scope. For example, create a file called mymacro.C. Inside the file, define a function mymacro(int value) and write the code of slide 8 of the lecture (or something else, if you prefer). Then you can run the macro from the ROOT prompt by typing:

root [0] .x mymacro.C(42)
Note that if the function name is different from the macro (file) name, you need two steps to run it. First you load the macro:
root [0] .L mymacro.C
Then you run the function in the macro. Let's assume the function name is test(int value):
root [1] test(42)

After having followed this introduction, you can try to move to the first exercise.

Exercise 1: Plotting a Function in ROOT

Following the lecture slide or page 6 of the Introductory Guide, create a TF1 object using the sin(x)/x function and draw it.

Then create a function with parameters, p0 * sin ( p1 * x) /x, and draw it for different parameter values. You can change the parameter values using TF1::SetParameters or the ROOT GUI. Try also to change the style of the line and its colour. You can use the ROOT GUI and/or the methods of the class TAttLine (see TAttLine reference documentation). Since TF1 derives from TAttLine, it inherits all its functions. Try for example to set the colour of the parametric function to blue.

After having drawn the function, compute for the parameter values (p0=1, p1=2):

  • function value for x = 1.
  • function derivative for x = 1
  • integral of the function between 0 and 3.

To solve the exercise you need to:
  • create a TF1 object using a formula expression. In the case of a parametric function, the two parameters are defined as [0] and [1]
  • call TF1::Draw()
  • call TF1::Eval(x), TF1::Derivative(x) and TF1::Integral(a,b)
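
As a cross-check of the numbers ROOT returns for p0=1 and p1=2, the same three quantities can be computed by hand. The sketch below is plain C++ with no ROOT dependency. The function names (fpar, derivative, integral) are invented for this illustration, and the central difference and Simpson rule used here are simple textbook choices, not necessarily the algorithms that TF1 uses internally.

```cpp
#include <cmath>

// p0*sin(p1*x)/x, using the limit p0*p1 at x = 0
double fpar(double x, double p0, double p1) {
   if (x == 0.) return p0 * p1;
   return p0 * std::sin(p1 * x) / x;
}

// central finite-difference approximation of the derivative at x
double derivative(double x, double p0, double p1, double h = 1.e-5) {
   return (fpar(x + h, p0, p1) - fpar(x - h, p0, p1)) / (2. * h);
}

// composite Simpson rule on [a,b] using n intervals (n must be even)
double integral(double a, double b, double p0, double p1, int n = 1000) {
   double h = (b - a) / n;
   double sum = fpar(a, p0, p1) + fpar(b, p0, p1);
   for (int i = 1; i < n; ++i)
      sum += fpar(a + i * h, p0, p1) * ((i % 2 == 1) ? 4. : 2.);
   return sum * h / 3.;
}
```

Saved in a file, these functions can be loaded with .L and called from the ROOT prompt, for example fpar(1,1,2), derivative(1,1,2) and integral(0,3,1,2), and compared with the output of TF1::Eval, TF1::Derivative and TF1::Integral.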

You can also find the available member functions of the class TF1, by using the Tab key on the ROOT prompt, for example

root [0] TF1 * f1 = ....
root [1] f1-><TAB>
And you can get the full signature of a method by doing for example:
root [1] f1->Derivative(<TAB>

#include "TF1.h"
#include <iostream>

void plotFunction() {

   TF1 * f1 = new TF1("f1","sin(x)/x",0,10);
   f1->Draw();

   TF1 * fp = new TF1("fp","[0]*sin([1]*x)/x",0,10);
   fp->SetParameters(1,2);
   fp->SetLineColor(kBlue);
   fp->Draw("SAME");

   // to change axis y margins
   // (or invert the order of plotting the functions)
   f1->SetMaximum(2.2);

   std::cout << "Value of f(x) at x = 1 is       " << fp->Eval(1.) << std::endl;
   std::cout << "Derivative of f(x) at x = 1 is  " << fp->Derivative(1.) << std::endl;
   std::cout << "Integral of f(x) in [0,3] is    " << fp->Integral(0,3)  << std::endl;
}

Pointing hand You can try to use a different function, for example the Gamma distribution, defined in ROOT::Math::gamma_pdf. Try to make a plot like the one in Wikipedia (see here), where the function is plotted for different parameter values. See here the reference documentation of the gamma distribution in ROOT.

#include "TF1.h"
#include "Math/DistFunc.h"

void plotGamma() {

   // Note that parameter [0] is called alpha in definition of gamma_pdf or kappa in Wikipedia
   // Note that parameter [1] is  theta

   // use range [0,20] as in Wikipedia plot
   TF1 * f1 = new TF1("f","ROOT::Math::gamma_pdf(x,[0],[1])",0,20);

   f1->SetParameters(1,2);
   // use DrawClone because we will plot many different copies of same object but with different
   // parameter values
   f1->DrawClone();

   // now change parameters and draw at different parameter values
   f1->SetParameters(2,2);
   f1->SetLineColor(kBlue);
   f1->DrawClone("SAME");

   f1->SetParameters(9,0.5);
   f1->SetLineColor(kGreen);
   f1->DrawClone("SAME");
}




Exercise 2: Plotting Points in ROOT (TGraph class)

We will learn in this exercise how to plot a set of points in ROOT using the TGraph class.

Suppose you have this set of points defined in the attached file graphdata.txt

Plot these points using the TGraph class. Use as a marker point a black box. Looking at the possible options for drawing the TGraph in TGraphPainter, plot a line connecting the points.

To solve the exercise you need to :
  • create the Graph using the constructor where you can specify the text file name containing the points.
Otherwise you can also
  • create an array for the X points: double x[] = {1,2...
  • create an array for the y points
  • create the TGraph from the X and Y arrays

You can also create first an empty TGraph object and set the point one by one by calling TGraph::SetPoint

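#include "TGraph.h"

void plotGraph() {

   TGraph * g = new TGraph("graphdata.txt");
   g->SetMarkerStyle(21);   // full square marker (black is the default colour)
   g->Draw("APL");          // A: axes, P: markers, L: line connecting the points
}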



Pointing hand Make a TGraphErrors and display it by using the attached data set, graphdata_error.txt, containing errors in x and y.

#include "TGraphErrors.h"

void plotGraphError() {

   TGraphErrors * g = new TGraphErrors("graphdata_error.txt");
   g->SetMarkerStyle(21);
   g->Draw("AP");
}



Exercise 3: Making an histogram in ROOT

We will learn in this exercise how to create a one-dimensional histogram in ROOT, how to fill it with data and how to plot it.

Create a one-dimensional histogram with 50 bins between 0 and 10 and fill it with 10000 Gaussian distributed random numbers with mean 5 and sigma 2. Plot the histogram and, looking at the documentation in the THistPainter, show in the statistics box the number of entries, the mean, the RMS, the integral of the histogram, the number of underflows, the number of overflows, the skewness and the kurtosis.

For generating gaussian random numbers use gRandom->Gaus(mean, sigma)

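#include "TH1.h"
#include "TRandom.h"
#include "TStyle.h"

void plotHistogram() { 

   TH1D * h1 = new TH1D("h1","h1",50,0,10);

   for (int i = 0; i < 10000; ++i) {
      double x = gRandom->Gaus(5,2);
      h1->Fill(x);
   }

   // show kurtosis, skewness, integral, overflows, underflows, RMS, mean and entries
   gStyle->SetOptStat(111111110);
   h1->Draw();
}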




Pointing hand After calling the function TH1::ResetStats() , you will see that the statistics (mean, RMS,..) of the histogram is slightly different. Try to understand the reason for this, by trying for example to compute the mean of the histogram yourself.

The initial statistics is computed using the original (un-binned) data, while after calling TH1::ResetStats(), the statistics is computed using the bin contents and centres (binned data).

Pointing hand The following macro, which creates, fills and plots a histogram, contains an error. Which one?

#include "TH1.h"

void testPlotHistogram() { 

   TH1D h1("h1","h1",50,-5,5);
   h1.FillRandom("gaus",10000);
   h1.Draw();
}


The histogram object is deleted at the end of the macro, therefore it is not shown in the plot after exiting the macro. To fix it, either call TH1::DrawClone or create the histogram using operator new as in the previous example.

Working with Histograms

We will focus in the next exercises on the histograms and their operations

Exercise 4: Working with the histogram bins

Create a histogram with 200 bins between 0 and 10. Then fill the histogram with 1000 Gaussian random numbers with mean 5 and sigma 0.3 and 10000 uniform random numbers between 0 and 10. Plot this histogram. Find the bin number of the histogram with the maximum content. What is its bin content? What are the bin centre and the bin error? Afterwards zoom the histogram plot between 4 and 6 using either TAxis::SetRange or the mouse (clicking on the axis).

To find the maximum bin you can use TH1::GetMaximumBin if you don't want to loop over the bins yourself.

For zooming a histogram you can use TAxis::SetRange(firstbin,lastbin) or TAxis::SetRangeUser(x1,x2). You can also do it with the GUI by selecting the axis range with the mouse.


Pointing hand Instead of zooming create a sub-histogram between 4 and 6 using TH1::GetBinContent on the original histogram and TH1::SetBinContent on the new histogram. How many bins has the new histogram ?

To create the sub-histogram, you first need to find the bin numbers in the original histogram corresponding to 4 and 6. You can use TAxis::FindBin for this. Then create the sub-histogram using as lower value for the axis the lower edge of the bin corresponding to 4 and, as upper value, the upper edge of the bin corresponding to 6. Afterwards, by looping on the bins, you can copy the bin content from the original histogram into the new one. It is better to use a value slightly larger than 4 (e.g. 4.0001) and a value slightly smaller than 6 (e.g. 5.999) to avoid the bin edges.


Exercise 5: Histograms Operations

We will work in this exercise on the histogram operations like addition, subtraction and scaling.

  • Make a Gaussian filled histogram between 0 and 10 with 100 bins and 1000 entries with mean 5 and sigma 1.
  • Make another histogram uniformly distributed between 0 and 10 with 100 bins and 10000 entries.
  • Add the two histograms into a new one using TH1::Add
  • Make another uniformly distributed histogram, still with 100 bins but with 100000 entries. Normalize this histogram to have a total integral of 10000 using TH1::Scale.
  • Now subtract the flat normalised histogram from the histogram which contains the sum of the flat and the Gaussian histograms, using TH1::Add
  • Plot the result using the error option (h1->Draw("E")). Do the errors make sense? If not, how can you get the correct bin errors?

  • Use TH1::Add to add the two histograms.
  • Use TH1::Scale(10000./100000) to re-normalise the histogram. Note the floating-point literal: in C++ 10000/100000 is an integer division and gives 0.
  • Use again TH1::Add to subtract the histogram, but with a second coefficient equal to -1 (TH1::Add(h1,h2,1,-1)).
  • For getting the right errors you must call TH1::Sumw2 before doing the operations on the histograms (i.e. before scaling and before subtracting them)
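
The reason the sum of squared weights is needed can be seen from the error propagation for a single bin. The sketch below is plain C++ and only illustrates the arithmetic: the struct BinValue and the helper functions are invented for this example and are not ROOT classes.

```cpp
#include <cmath>

// content and variance (sum of squared weights) of one histogram bin
struct BinValue {
   double content;
   double sumw2;
};

// a bin filled n times with weight 1: content n, variance n
BinValue fromCounts(double n) {
   BinValue b;
   b.content = n;
   b.sumw2 = n;
   return b;
}

// scaling by c multiplies the content by c and the variance by c*c
BinValue scale(BinValue b, double c) {
   b.content *= c;
   b.sumw2 *= c * c;
   return b;
}

// c1*a + c2*b for two independent bins: the variances add with c1^2 and c2^2
BinValue add(BinValue a, BinValue b, double c1, double c2) {
   BinValue r;
   r.content = c1 * a.content + c2 * b.content;
   r.sumw2 = c1 * c1 * a.sumw2 + c2 * c2 * b.sumw2;
   return r;
}

// bin error as the square root of the variance
double error(BinValue b) { return std::sqrt(b.sumw2); }
```

As an example, take a bin with 110 entries in the summed histogram and 1000 entries in the large flat histogram. After scaling the flat one by 0.1 and subtracting, the content is 10 and the propagated error is sqrt(110 + 10), about 11. If the error were instead taken as the square root of the final content, it would be about 3.2, which clearly underestimates the fluctuations.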


Pointing hand Now rebin the histogram (e.g. the one resulting from the subtraction) into a new histogram with bins 4 times larger.

For rebinning in 4 times larger bins, use TH1::Rebin with ngroup=4.

Add these lines of code at the end of the macro
   new TCanvas("c2","c2");

   TH1 * h6 = h5->Rebin(4,"h6");
   h6->SetTitle("Rebinned histogram");
   h6->Draw("E");

Exercise 6: Multi-Dimensional Histograms and Profiles

Create a 2-dimensional histogram with x and y in the range [-5,5] and [-5,5] and 40 bins in each axis. Fill the histogram with correlated random normal numbers. To do this generate 2 random normal numbers (mean=0, sigma=1) u and w. Then use x = u and y = w+0.5*u for filling the histogram. Plot the histogram using colour boxes (see the documentation in the THistPainter class) or choose whatever option you prefer. After having filled the histogram, compute the correlation using TH2::GetCorrelationFactor.

The option for plotting colour boxes is "COLZ", which draws also a palette for the scale on the Z axis (the bin content)
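
As a sanity check on the value returned by GetCorrelationFactor, the expected correlation can be worked out from the construction x = u, y = w + 0.5 u, with u and w independent and of unit variance:

```latex
\begin{aligned}
\mathrm{Var}(x) &= \mathrm{Var}(u) = 1, \\
\mathrm{Var}(y) &= \mathrm{Var}(w) + 0.5^2\,\mathrm{Var}(u) = 1.25, \\
\mathrm{Cov}(x,y) &= \mathrm{Cov}(u,\, w + 0.5\,u) = 0.5\,\mathrm{Var}(u) = 0.5, \\
\rho &= \frac{\mathrm{Cov}(x,y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}} = \frac{0.5}{\sqrt{1.25}} \approx 0.447 .
\end{aligned}
```

The value measured from the histogram should be close to this, up to statistical fluctuations and the small effects of the binning and of the finite axis range.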


Then make a projection of the 2-dimensional histogram on the x axis. Make also a profile of y along the x axis. Plot the resulting projection and profile in a new canvas divided in 2 pads.

For making the projection call TH2::ProjectionX and for making the profile call TH2::ProfileX

For dividing the canvas call TCanvas::Divide(1,2) and navigate in the pad contained in the canvas by calling TCanvas::cd(pad_number).

Add these lines of code at the end of the macro
   TH1 * hx = h2d->ProjectionX();
   TH1 * px = h2d->ProfileX();

   TCanvas * c2 = new TCanvas("c2","c2");
   // divide in 2 pad in x and one in y
   c2->Divide(2,1);
   c2->cd(1);
   hx->Draw();
   c2->cd(2);
   px->Draw();


Pointing hand Look at the bin error of the profile. Do you know how it is computed ? You can find this answer in the TProfile reference documentation.

Pointing hand If you still have time after having finished the exercises, you can look at some of the tutorials in the ROOT distribution directory, $ROOTSYS/tutorials/hist. They are available on the Web at this location. For example look at:

ROOT I/O and Trees (slides PDF )

This is a set of exercises on working with trees in ROOT. We will start with an exercise on the ROOT I/O, storing and reading a histogram from a file. Then we will move to exercises using the TTree class. The first one is very simple and can be skipped by those already familiar with ROOT.

Exercise 7: Writing and Reading Histogram from a file

Open a file, then create a simple histogram, for example a histogram generated with an exponential distribution. Fill it and write it in the file. Why does the ROOT canvas not show the histogram? Do you know what to do to have the histogram displayed?

Use TFile::Open to open the file or just create a TFile object. Call TH1::Write to write the histogram in the file after having filled it.

#include "TFile.h"
#include "TH1.h"
#include "TRandom.h"

void histogramWrite() { 

   TFile f("histogram.root","RECREATE");

   TH1D * h1 = new TH1D("h1","h1",100,0,10); 
   for (int i = 0; i < 10000; ++i) 
      h1->Fill(gRandom->Exp(5) ); 

   h1->Draw();
   // write the histogram in the file
   h1->Write();
   f.Close();
}


The histogram is not shown because, when the file is closed, the histogram is automatically deleted.

Now read the histogram from the file and plot it.

Create a file object (or call TFile::Open) and then TFile::Get

#include "TFile.h"
#include "TH1.h"

void histogramRead() { 

   TFile * file = new TFile("histogram.root");

   TH1 * h1 = 0;
   file->GetObject("h1",h1);
   // you can also use TFile::Get, but you need to cast if you compile the code
   //TH1 * h1 = (TH1*) file->Get("h1");

   h1->Draw();
}


Pointing hand You can also use the TBrowser to open the file and display the histogram.

Pointing hand What is going to happen if you delete the file after having retrieved the histogram from the file ?

Exercise 8: Creating a ROOT Tree containing a collection of LorentzVector's

Create a TTree containing a collection of 4D LorentzVectors. For example one could generate a list of pions (let's suppose 20 per event on average) with an exponential distribution in pt and a uniform distribution in phi and eta.

Measure the time to write a TTree with 100000 events.

Create the TTree object and then define a branch containing a std::vector<ROOT::Math::XYZTVector> (or, if you prefer, a std::vector<TLorentzVector>). In this second case you need to generate the dictionary for the type written in the tree. You can do this by adding at the beginning of the macro the following lines
#ifdef __MAKECINT__ 
#pragma link C++ class std::vector<TLorentzVector>+;
#endif
For measuring the writing time you can use for example the TStopwatch class. Before the loop you create the TStopwatch object and call TStopwatch::Start(). At the end of the macro you call TStopwatch::Stop() and then TStopwatch::Print() to get the elapsed time. You can also use the TTreePerfStats class (see its reference documentation), which will also make a summary performance graph.
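
Each generated pion is described by (pt, eta, phi), which has to be converted into a 4-vector before it is stored. The conversion can be sketched in plain C++ as follows. The struct FourVector and the function fromPtEtaPhiM are invented stand-ins for this illustration (the ROOT vector classes provide their own ways of doing this), and the mass is passed explicitly, for example about 0.1396 GeV for a charged pion.

```cpp
#include <cmath>

// plain stand-in for a 4D Lorentz vector (px, py, pz, E)
struct FourVector {
   double px, py, pz, e;
};

// build the 4-vector from transverse momentum, pseudorapidity,
// azimuthal angle and mass
FourVector fromPtEtaPhiM(double pt, double eta, double phi, double mass) {
   FourVector v;
   v.px = pt * std::cos(phi);
   v.py = pt * std::sin(phi);
   v.pz = pt * std::sinh(eta);
   double p2 = v.px * v.px + v.py * v.py + v.pz * v.pz;
   v.e = std::sqrt(p2 + mass * mass);
   return v;
}
```

In the macro the inputs would typically come from the random generator, for example gRandom->Exp for pt, gRandom->Uniform for phi and eta, and gRandom->Poisson(20) for the number of pions in the event. The choice of these generators and of their parameters is an assumption of this note, not a prescription of the exercise.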


Pointing hand Try to generate the tree in split mode (default) and no-split mode. Use TTree::Print() to see the content of the generated trees. Did you see a difference in writing performance with and without splitting?

For creating a branch that is not split, use a split level of zero, while for splitting use 99 (the default value). The split level is passed as last parameter in TTree::Branch (see the reference documentation).

Pointing hand Try now to use a TClonesArray or a TObjArray and measure the time to create the tree. Did you see an increase or decrease in performance?

Now you must add the TClonesArray (or TObjArray) object to the branch. Remember that a TClonesArray must be constructed by passing the class name of the contained object. You must also use only classes deriving from TObject. Thus you can only use TLorentzVector and not the template ROOT::Math::LorentzVector class. Remember that you need to use a special syntax to create the new object and to fill the TClonesArray. See the TClonesArray documentation.


Exercise 8b: Creating a flat ROOT Tree

This exercise shows a different way to create a ROOT tree in the case of a flat data structure. A simple tuple (TNtuple class) could also be used in this case.

Create a simple ROOT tree containing 4 variables (for example x,y,z,t). Fill the tree with data (for example 10000 events) where x is generated according to a uniform distribution, y a gaussian and z an exponential and t a Landau distribution. Write also the tree in the file.

Create the TTree object and then declare a branch for each simple variable as described in the lecture slides. See the documentation of TTree::Branch on how to declare branches for simple variables (fundamental types). You can also look at the tutorial tutorials/tree/tree1.C as an example of how TTree::Branch is used to define the tree branches containing the variables. Alternatively you can also use the TNtuple class. An example for the tuple class is the tutorial tutorials/hsimple.C.


After having saved the file, re-open it and get the tree. Plot each single variable and also one variable versus another one (for example x versus y) using TTree::Draw. You can also use the TBrowser.

Exercise 9: Creating a ROOT Tree containing an EventData object

Create a ROOT tree which contains an EventData object. The EventData object is defined in the file EventData.h.

It contains a collection (a std::vector) of Particle objects. The Particle object contains the initial vertex position of the particle, the particle momentum, its charge and a different code depending on whether it is a photon, an electron, a muon, a charged pion or a charged kaon.

The EventData object provides a method (Generate()) to generate one event. It is implemented in the file EventData.cxx.

Look at the macro defining the EventData class and the implementation generating the events. Afterwards, write the macro creating the tree and fill it with the EventData objects, which has to be generated for each event.

Look at the macro CreateEventTree.C and try to understand it. Run the macro to create and write the tree in the file.

Exercise 10: Create and query ROOT Tree from a data text file

This exercise shows how you can create a Tree from a data file in text format (e.g. a csv file). The aim is to create a Tree from a CMS public LHC data set which is in text format. You can use for example the J/Psi dimuon data available at this link.

  • Create a TTree from the given CMS text data using the function TTree::ReadFile
  • plot the di-muon invariant mass using TTree::Draw (variable M) for all the events
  • plot the di-muon invariant mass for the same charged and opposite charged muons
  • plot the pt distribution as function of eta in a profile plot for the first and the second muon
  • save the tree in a root binary file.

Use the function TTree::ReadFile to fill a tree with data from a text file. In principle the function is able to read the branch names directly from the first line of the text file, if the first line also contains the description of the branch type (e.g. "x/F"). Since this is not the case here, it is recommended to create a string defining the branches ("Type/C:RunNo/I:EvtNo/I:E1/F:px1:py1.....") and pass it as second argument to TTree::ReadFile. Note that if the following branches are of the same type one can omit the branch type descriptor. For example a valid string for that CMS file is

Use then TTree::Draw to query the variable in the Tree.

Here is the code to analyze the tree using TTree::Draw
// to draw invariant mass
tree->Draw("M");
// to draw the profile
tree->Draw("pt1:eta1 >> p1(40,-3,3)","","prof");
tree->Draw("pt2:eta2 >> p2(40,-3,3)","","prof");

Exercise 10b: Read a ROOT Tree containing a collection of LorentzVector's

Use TTree::Draw to plot:

  • The pt distribution of the tracks
  • The number of tracks per event in an histogram with 50 bins between 0 and 50.
  • The E distribution for |eta| < 2
  • A profile plot showing the pt of the tracks vs eta.

  • To plot the size of the collection use the special keyword "@".
  • To make a profile plot use the graphics option "prof" (3rd parameter in TTree::Draw).

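   TFile f("vectorCollection.root");
   // assuming the tree is stored in the file with the name t1
   TTree * t1 = (TTree*) f.Get("t1");
   t1->Draw("tracks.Pt()");
   t1->Draw("@tracks.size() >> h1(50,0,50)");
   t1->Draw("tracks.E()","abs(tracks.Eta()) < 2");
   t1->Draw("tracks.Pt():tracks.Eta() >> prof(50,-4,4)","","prof");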

See also TTree::Draw documentation

Pointing hand Then use C++ code to plot the invariant mass of all 2-track combinations. This cannot be done with TTree::Draw.

To read the tree using C++ code, write a macro where you retrieve the TTree from the file and then loop on its entries, retrieve the needed object data and fill the histograms. In more detail, this is a suggestion on how to write this code:
  • Open the file using its file name in TFile::Open(). Remember to check that the file pointer is not null. If it is null, the file does not exist.
  • Then get a pointer to the tree.
  • Connect a tree branch with the data member. We have to connect the branch we want to read with the variables used to actually store the data by calling TTree::SetBranchAddress().
  • Load the TTree data. For the analysis example we need to access the vector of tracks, which is stored in the branch with name "tracks". But the TTree first needs to load the data for each event it contains. For that call TBranch::GetEntry(entry) in a loop, passing the TTree entry number from the loop index to GetEntry(). Again TBranch is the class name, but you obviously need to call it on an object. To know how many entries the tree contains, simply call TTree::GetEntries().
  • Without the call to GetEntry(), the variables will not contain data. GetEntry() loads the data into the variables connected with the tree by the call to SetBranchAddress().
  • Once you have the event data (the vector of tracks) you can loop on its elements.
  • Make all the combinations of 2 elements (2 tracks) and add them together to retrieve the invariant mass. Just use the M() function of the summed LorentzVector to get the invariant mass.
  • Fill a histogram with the obtained value
  • Plot the histogram at the end of the loop
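
The combinatorial part of these steps (all pairs of tracks, sum of the two 4-vectors, invariant mass) can be sketched independently of ROOT. In the plain C++ below, the struct Track4 is an invented stand-in for the LorentzVector stored in the tree, and invariantMass plays the role of calling M() on the sum of two vectors.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// plain stand-in for a 4D Lorentz vector (px, py, pz, E)
struct Track4 {
   double px, py, pz, e;
};

// invariant mass of the sum of two 4-vectors
// (returns 0 if rounding makes the squared mass negative)
double invariantMass(const Track4 & a, const Track4 & b) {
   double e  = a.e  + b.e;
   double px = a.px + b.px;
   double py = a.py + b.py;
   double pz = a.pz + b.pz;
   double m2 = e * e - px * px - py * py - pz * pz;
   return (m2 > 0) ? std::sqrt(m2) : 0.;
}

// masses of all distinct 2-track combinations (i < j) of one event
std::vector<double> pairMasses(const std::vector<Track4> & tracks) {
   std::vector<double> masses;
   for (std::size_t i = 0; i < tracks.size(); ++i)
      for (std::size_t j = i + 1; j < tracks.size(); ++j)
         masses.push_back(invariantMass(tracks[i], tracks[j]));
   return masses;
}
```

Starting the inner loop at i + 1 counts each pair once and never combines a track with itself. In the real macro each returned mass would be passed to TH1::Fill inside the event loop.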


Exercise 11: Analyze a ROOT Tree using the new TTreeReader class

Read the tree containing the EventData objects using the TTreeReader class. The aim is to get the invariant mass distribution of the photons, of the opposite charged particles and of the opposite charged leptons (electrons and muons). As an example of using the TTreeReader, see the tutorial tutorials/tree/TreeReaderSimple.cxx. You need ROOT 6 to run this tutorial, since the TTreeReader class is not available in ROOT version 5.34.

Use the TTreeReader and the TTreeReaderValue classes to get an instance of the collection (std::vector) for each event. To do this you need to declare outside the event loop, after having opened the file containing the TTree:
TTreeReader myReader("tree", myFile);
TTreeReaderValue<std::vector< Particle> > particles_value(myReader, "fParticles");
Then you iterate using the TTreeReader iterator and you get the std::vector object for each event by de-referencing the TTreeReaderValue object, using the * operator.
std::vector< Particle>  & particles = *particles_value; 
Once you have the vector available, you can then write the code to select the right particle and compute the invariant mass using all the right combinations.


Exercise 11b: Analyzing a ROOT Tree using TTree::MakeClass

Read again the tree containing the EventData class to get the invariant mass distribution of the photons, of the opposite charged particles and of the opposite charged leptons (electrons and muons). This time you read the Tree using TTree::MakeClass. We will see in one of the next exercises how to use TTree::MakeSelector. Note that in order to get the right definition of the top-level branches, one needs to generate the tree with split level=1. In the case of a tree with a different splitting level, one needs to declare the needed branches and the contained variables oneself and call TTree::SetBranchAddress in the initialisation of the analysis class. Future versions of ROOT should have this limitation removed and should be able to correctly define the contained branches.

Use TTree::MakeClass("myclassname") to generate the header file and the implementation of the class code required to analyze the Tree. Fill the implementation file with the code needed to get the invariant mass distributions. To run the code, just do the following from the ROOT prompt (let's suppose the generated class is called EventDataClass)
root[0]  .L EventDataClass.C 
root[1]  EventDataClass c; 
root[2] c.Loop(); 

The header file obtained from MakeClass, EventDataClass.h, did not require any changes by the user. The implementation file of class EventDataClass, EventDataClass.C, contains the code to plot the invariant masses.

Exercise 12: Using the TSelector class for analysing a TTree

Create the Selector for the EventData TTree made before. Use TTree::MakeSelector to create your own Selector class. The aim as in a previous exercise is to plot the invariant masses for:

  • the photons,
  • the opposite charged particles
  • the muons and electrons with opposite charge

Inside the code of your Selector do the following:

  • book the histograms in the initialisation routine
  • fill the histogram in the Process function
  • draw the histogram in the Terminate function

Remember to generate first the TTree with a split-level = 0 to be able to have TTree::MakeSelector generating correctly the code for reading the collection (std::vector) object. Otherwise, you will have to declare its branch definition by hand using TTree::SetBranchAddress.

Here is what you need to do. After having opened the file with the tree, first load the dictionary library for the EventData class
.L EventData.cxx+
then create your Selector class
tree->MakeSelector("EventDataSelector");

The files EventDataSelector.h and EventDataSelector.C will be created. In EventDataSelector.h, inside the class EventDataSelector, add as new data members the histograms you want to create, for example

TH1D *  h_t;

Edit then the file EventDataSelector.C and add in EventDataSelector::SlaveBegin the booking of the histograms. For example:

h_t = new TH1D("h_t","t",100,0,100);

In EventDataSelector::Process, add the filling of the histogram after calling TSelector::GetEntry().


In EventDataSelector::Terminate add the drawing of the histogram.

After having saved the file run the selection by doing (for example from the ROOT prompt):

TFile f("tree.root");
tree->Process("EventDataSelector.C+");

The header file is EventDataSelector.h and the implementation file is EventDataSelector.C. The output list is used for the histograms, so this selector is ready to be used by PROOF (see next exercise).

Exercise 13: Chaining ROOT Files.

Using the macro to create the EventData tree, run it a few times (e.g. 2 or 3 times) using a different file name each time. Afterwards use the TChain class to merge the trees and analyse the obtained chain as in Exercise 11.

See the example in the lecture slide on how to use TChain or its reference documentation. The TChain must be created passing the name of the TTree existing in the files.

These are the few lines to create the TChain, which you can run directly from the prompt. You can also use wildcards to chain many files

%CODE{"cpp" style="background: yellow;"  }%
TChain chain("tree");    

Interactive Data Analysis with PROOF

We will learn in this exercise how to analyse a data set (a chain of ROOT files containing a tree) using PROOF-Lite.

Exercise 14: Using PROOF to analyze the TTree

Generate 10 files with CreateEventTree in a subdirectory called 'data'. Create a TChain with these files. Using the TSelector defined in the exercise on TTrees - EventDataSelector - run the selector using PROOF-Lite. Try to compare the processing time with the standard TChain processing.

Make sure the selector compiles with
root [] TSelector::GetSelector("EventDataSelector.C+")

Check the TProofBench pages.

Look at the slides for ways to create the TChain, and at the TChain reference documentation for enabling PROOF. Use gROOT->Time() to measure times.

Create the files
$ mkdir data
$ root -l
root [] .L EventData.cxx+
root [] .L CreateEventTree.C
root [] for(Int_t i = 0; i < 10; ++i) { CreateEventTree(Form("data/evtree_%d.root", i)); }

Create the TChain (use the correct path ... /home/admin if required ...)

root [] TChain chain("tree")
root [] for(Int_t i = 0; i < 10; ++i) { chain.AddFile(Form("file:///home/user/data/evtree_%d.root", i)); }
root [] chain.ls()

Process locally

root [] gROOT->Time()
root [] chain.Process("EventDataSelector.C+")

Process in PROOF-Lite

root [] TProof::Open("lite://")
root [] gProof->Load("EventData.cxx+")
root [] chain.SetProof()
root [] gROOT->Time()
root [] chain.Process("EventDataSelector.C+")

Fitting in ROOT (slides PDF )

Welcome to the hands-on session dedicated to fitting in ROOT.

Exercise 15: Gaussian fit of a histogram

We will start with an exercise where we fit a histogram with a simple function, to get familiar with the fitting options in ROOT.

  • Start by creating a histogram with 50 bins in [-5,5] and fill it with 1000 Gaussian distributed numbers
  • Fit the histogram using a Gaussian function
  • Get the value and error of the width of the Gaussian
  • Retrieve the fit result and print the correlation matrix.

To solve the exercise you first need to create the TF1 object, either using the pre-defined gaus function or using a formula expression with "[0]*ROOT::Math::normal_pdf(x,[2],[1])". Remember that in the second case you need to set the initial function parameters (e.g. f1->SetParameters(1,0,1) ).

To get access to the TFitResult object after fitting use option "S", as shown in slide 10 of the lecture.

Use gStyle->SetOptFit(1) to display the fit result in the statistics box.

#include "TF1.h"
#include "TH1.h"
#include "TFitResult.h"
#include "TMatrixDSym.h"
#include "TStyle.h"

#include <iostream>

void gausFit() { 

   TH1D * h1 = new TH1D("h1","h1",50,-5,5);
   // fill with 1000 numbers drawn from the pre-defined "gaus" function
   h1->FillRandom("gaus",1000);

   // display the fit result in the statistics box
   gStyle->SetOptFit(1);

   TF1 * f1 = new TF1("f1","gaus"); 
   // option "S" returns the fit result; option "Q" (quiet) avoids printing the result twice
   TFitResultPtr r = h1->Fit(f1,"S Q");
   // print the result
   r->Print();

   // get the correlation matrix and print it
   TMatrixDSym corrMatrix = r->GetCorrelationMatrix(); 
   corrMatrix.Print();

   // to get the sigma of the Gaussian (parameter 2 of the "gaus" function)
   std::cout  << "Gaussian sigma = " << f1->GetParameter(2) << "  +/- " << f1->GetParError(2) << std::endl;
}

Pointing hand If you repeat the fit a few times (without exiting ROOT) you will see that the sigma you obtain is almost always less than 1. The result is therefore slightly biased. Try to perform a likelihood fit (option "L") and a Pearson least-squares fit (option "P"). Are the results better?

The reason is that the least-squares fit is not correct in case of low statistics. The bins with zero events are not included in the fit, and this biases the result towards lower sigma values. In the case of the Pearson least-squares fit (i.e. using the expected bin error) the bias is towards higher sigma values. The likelihood fit is the correct method for fitting histograms of counts. You can also study the pull distribution of the fitted sigma values by generating pseudo-experiments and fitting each one of them.
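
To make the difference between the three methods concrete, here is a minimal stand-alone C++ sketch (no ROOT needed) of the three quantities being minimised. The function names are hypothetical and this is not ROOT's implementation; it only assumes that the default least-squares fit uses the observed count as bin variance and skips empty bins, as described above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Least squares with observed errors: the variance of a bin is its
// observed count, so empty bins have no usable error and are skipped.
double neymanChi2(const std::vector<double>& observed,
                  const std::vector<double>& expected) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        if (observed[i] <= 0.0) continue;  // empty bins are dropped
        const double diff = observed[i] - expected[i];
        chi2 += diff * diff / observed[i];
    }
    return chi2;
}

// Pearson least squares: the variance of a bin is the expected count,
// so empty bins do contribute.
double pearsonChi2(const std::vector<double>& observed,
                   const std::vector<double>& expected) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        if (expected[i] <= 0.0) continue;  // undefined for a zero prediction
        const double diff = observed[i] - expected[i];
        chi2 += diff * diff / expected[i];
    }
    return chi2;
}

// Binned Poisson likelihood, written as -2 log of the likelihood ratio
// so that it is on the same scale as a chi-square.
double poissonDeviance(const std::vector<double>& observed,
                       const std::vector<double>& expected) {
    assert(observed.size() == expected.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double n = observed[i];
        const double mu = expected[i];
        if (mu <= 0.0) continue;           // undefined for a zero prediction
        sum += mu - n;
        if (n > 0.0) sum += n * std::log(n / mu);
    }
    return 2.0 * sum;
}
```

For observed counts {0, 4, 9} and expected counts {1, 4, 9}, the first function returns 0 because the empty bin is ignored, while the other two penalise the prediction of one event in a bin where none was seen.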

Pointing hand You may notice that the sigma and the amplitude of the Gaussian are quite correlated. Can you find a better parametrisation?

You can fit using a normalised Gaussian. In this case you get a much smaller correlation. To do this create the TF1 as follows:
TF1 * f1 = new TF1("f1","[0]*ROOT::Math::normal_pdf(x,[2],[1])");  
// set the parameters (needed if not using a pre-defined function such as "gaus")
f1->SetParameters(1,0,1);
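
The following short derivation (standard Gaussian algebra, not taken from the lecture slides) may help to see where the correlation comes from. The symbols C, p_0, N and w are introduced here for illustration: C is the amplitude of the pre-defined gaus function, p_0 is parameter [0] of the normalised form, N is the number of entries and w the bin width.

```latex
\begin{align}
f_{\mathrm{gaus}}(x) &= C \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
  & \int f_{\mathrm{gaus}}(x)\,dx &= C\,\sigma\sqrt{2\pi}, \\
f_{\mathrm{norm}}(x) &= \frac{p_0}{\sqrt{2\pi}\,\sigma}
  \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
  & \int f_{\mathrm{norm}}(x)\,dx &= p_0 .
\end{align}
% The histogram fixes the area: N w = 1000 \times 0.2 = 200, hence
\begin{equation}
C \simeq \frac{N w}{\sqrt{2\pi}\,\sigma}, \qquad p_0 \simeq N w .
\end{equation}
```

With the pre-defined function the amplitude has to move whenever sigma moves in order to keep the area fixed, which shows up as a strong anti-correlation. In the normalised form the parameter [0] is the area itself and is nearly independent of sigma.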

Exercise 16: Fit a peak histogram

We are going to fit a histogram with a more complicated function. We can use the histogram obtained from the CMS tree data of Exercise 10. The aim is to compute the mass and width of the peak (in this case the J/Psi).

  • Create (or read from the ROOT file obtained in Exercise 10) the tree with the dimuon CMS data between 2 and 5 GeV (text data file).
  • Fill a histogram with 60 bins between 2 and 5 with the invariant mass for the events where the two muons have opposite charge.
  • Create a function composed of a Gaussian plus an exponential and fit it to the histogram. Does the fit work? What do you need to do to make the fit work?
  • Compute the number of peak events, by using the integral of the Gaussian function. Use TF1::IntegralError to compute also its error.

See Exercise 10 on how to create the tree. To fill and draw the histogram do for example:
tree->Draw("M >> h1(60,2,5)","Q1*Q2==-1")
Before fitting you need to set sensible parameter values. You can do this by first fitting a single Gaussian in the range [2.7,3.2] and then the exponential separately. If you don't set good initial parameter values, the fit will probably not converge.

Once the fit works, you can compute the number of peak events by using TF1::Integral on the Gaussian-only function. For the error you can use TF1::IntegralError, but you need to extract the covariance matrix from the fit and use the sub-matrix referring to the Gaussian part.
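
As a cross-check of what TF1::Integral and TF1::IntegralError should return, the peak yield can also be estimated analytically. The sketch below is stand-alone C++ with hypothetical names (Yield, gaussianYield). It assumes the peak is fully contained in the fit range, so the Gaussian area is amplitude × sigma × sqrt(2π), and that the yield is this area divided by the bin width (0.05 for 60 bins between 2 and 5). The error is obtained by linear propagation using the amplitude and sigma entries of the covariance matrix; the mean does not enter.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Result of the yield computation: central value and propagated error.
struct Yield {
    double value;
    double error;
};

// Number of events under constant*exp(-0.5*((x-mean)/sigma)^2), integrated
// over the whole axis and divided by the histogram bin width.
// varConstant, varSigma and covConstantSigma are the corresponding
// entries of the fit covariance matrix.
Yield gaussianYield(double constant, double sigma, double binWidth,
                    double varConstant, double varSigma,
                    double covConstantSigma) {
    assert(sigma > 0.0 && binWidth > 0.0);
    const double sqrt2Pi = std::sqrt(2.0 * std::acos(-1.0));
    const double scale = sqrt2Pi / binWidth;

    const double value = constant * sigma * scale;

    // partial derivatives of the yield with respect to the two parameters
    const double dConstant = sigma * scale;
    const double dSigma = constant * scale;

    const double variance = dConstant * dConstant * varConstant
                          + dSigma * dSigma * varSigma
                          + 2.0 * dConstant * dSigma * covConstantSigma;

    // guard against tiny negative values from rounding
    return {value, std::sqrt(std::max(variance, 0.0))};
}
```

Note the sign of the covariance term: amplitude and sigma are usually anti-correlated, so ignoring the off-diagonal element would overestimate the error on the yield.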

Solution macro: https://twiki.cern.ch/twiki/pub/Main/ROOTRioTutorial/../ROOTLaPlataTutorial/JPsiPeakFit.C

Exercise 17: Using the Fit Panel GUI

Repeat the previous exercise using the Fit Panel GUI.

  • Select the fit panel in the Canvas/Tools menu or by right-clicking on the histogram
  • Make the fit function (gaussian plus exponential) using the Operation/Add button
  • Set the initial parameters by playing with the sliders
  • Press the Fit button

Pointing hand Explore the other functionalities of the fit panel, such as changing the fit method, using a different minimiser, or plotting contours of the parameters.

Pointing hand If you want to know more about fitting, you can look at some of the tutorials in the ROOT distribution directory, $ROOTSYS/tutorials/fit. They are available on the Web at this location.

Fitting using RooFit

Welcome to the hands-on session dedicated to fitting using RooFit. The aim is to get familiar with RooFit and to understand the basic syntax for creating models using the workspace factory. We will also see how to save a workspace in a ROOT file, which makes it possible to perform the fitting analysis at a later stage or to share the models with other people.

RooFit also provides a separate User Guide (unfortunately it does not yet cover the workspace factory syntax). There is also a shorter booklet showing the main functionality, which you can find here.

Exercise 18: Gaussian model and fit to randomly generated data

We will start with an exercise similar to the one we did for ROOT fitting. We will create a Gaussian model from which we will generate a pseudo-data set and then we will fit this data set.

Start by creating the Gaussian model directly with the workspace factory, using the syntax introduced in the lecture slides. Once you have created the model, use the generate() method of the RooAbsPdf class to generate 1000 events. Try to plot the data set using RooPlot as shown in the lecture slides.

Afterwards, fit the model to the data and show the resulting fitted function.

At the end save the RooWorkspace object in a file, but first remember to import the data set you have generated into the workspace, by calling RooWorkspace::import(data). The workspace then contains not only the model but also the data, which allows the analysis to be repeated later on.

  • Use the syntax of the RooWorkspace factory to create first the variables (observables and parameters) of the Gaussian probability density function (p.d.f.), as shown in the corresponding lecture slide.
Every variable in RooFit needs to be created with a value and an allowed range [min, max]. Use very large values if you don't know the range. If you provide only a value, the variable is considered constant. If you provide only the minimum and the maximum, the initial value is taken as the middle of the range. To avoid undesired side effects, the given value should lie between the min and max of the range.
  • You need to define the [value, min, max] of a variable only the first time you create it in the factory. Afterwards you can reference the variable by its name.
See the lecture slides on how to create the p.d.f.
  • After you have created the variables and the p.d.f. in the workspace, you can access their pointers by using RooWorkspace::var for variables or RooWorkspace::pdf for p.d.f.s.
  • You can then generate the data set and plot it.
  • To fit the data set with the p.d.f., you need to call RooAbsPdf::fitTo.
  • To save the model, call RooWorkspace::writeToFile(file_name).

Below is the code solution:
// make a simple Gaussian model

#include "RooWorkspace.h"
#include "RooRealVar.h"
#include "RooAbsPdf.h"
#include "RooDataSet.h"
#include "RooPlot.h"

#include "TCanvas.h"

#include <iostream>

using namespace RooFit; 

void GaussianModel(int n = 1000) { 

   RooWorkspace w("w");
   // define a Gaussian pdf
   // (the initial values and ranges used here are illustrative assumptions)
   w.factory("Gaussian::pdf(x[-10,10], mu[1,-1000,1000], sigma[1,0.01,1000])");

   RooAbsPdf * pdf = w.pdf("pdf");   // access object from workspace
   RooRealVar * x = w.var("x");   // access object from workspace

   // generate n Gaussian measurements for a Gaussian(x, 1, 1)
   RooDataSet * data = pdf->generate( *x, n); 
   data->SetName("data");   // name used to retrieve the data set from the workspace

   // RooFit plotting capabilities 
   RooPlot * pl = x->frame(); 
   data->plotOn(pl);

   // now fit the data set 
   pdf->fitTo(*data);

   // plot the pdf on the same RooPlot object we have plotted the data 
   pdf->plotOn(pl);
   pl->Draw();

   // import data in workspace (IMPORTANT for saving it )
   w.import(*data);

   // write workspace in the file (recreate file if already existing)
   w.writeToFile("GaussianModel.root", true);

   std::cout << "model written to file " << std::endl;
}

Exercise 19: Reading a workspace from a file

Open the file you have just created in the previous exercise and get the RooWorkspace object from the file. Get a pointer to the p.d.f. describing your model, and a pointer to the data. Re-fit the data, but this time in the range [0,10], and plot the result.

To read and analyse the workspace you need to do:
  • Open the TFile object and use TFile::Get to get a pointer to the workspace using its name
  • Once you have the workspace in memory, retrieve from it the p.d.f. and the data set by name using RooWorkspace::pdf and RooWorkspace::data.
  • Call RooAbsPdf::fitTo again. You can set the fit range using the RooFit::Range(xmin,xmax) command argument in fitTo. See the reference documentation for all the possible options that you can pass (some are shown in the solution code).

Here is the solution. This macro (apart from the Range fit) can fit whatever workspace you have in the file. You just need to set the right names for the file, workspace, global p.d.f. and data set.

#include "RooWorkspace.h"
#include "RooAbsPdf.h"
#include "RooRealVar.h"
#include "RooPlot.h"
#include "RooDataSet.h"
#include "RooFitResult.h"

#include "TFile.h"

#include <iostream>

// roofit tutorial showing how to fit whatever model we get from a file 

// we assume the name of the workspace is w
// the name of the pdf is pdf 
// the name of the data is data
const char * workspaceName = "w"; 
const char * pdfName = "pdf"; 
const char * dataName = "data"; 

using namespace RooFit; 

void fitModel(const char * filename = "GaussianModel.root" ) { 

   // the following lines read the workspace
   // and check that everything is fine 

   // open the input file; quit if it is not found
   TFile *file = TFile::Open(filename);
   if(!file ){
      std::cout <<"file " << filename << " not found" << std::endl;
      return;
   }

   // get the workspace out of the file
   RooWorkspace* w = (RooWorkspace*) file->Get(workspaceName);
   if(!w ){
      std::cout <<"workspace with name " << workspaceName << " not found" << std::endl;
      return;
   }

   // get the pdf with name pdfName from the workspace
   RooAbsPdf * pdf = w->pdf(pdfName);
   if (!pdf) { 
      std::cout << "pdf with name " << pdfName << " does not exist in workspace " << std::endl;
      return;
   }

   // get the data out of the workspace
   RooAbsData* data = w->data(dataName);
   if(!data ){
      std::cout << "data " << dataName << " was not found" << std::endl;
      return;
   }

   //// real code starts here 

   // get variable x (is the first of the data)
   RooRealVar * x = w->var("x"); 
   RooPlot * plot = x->frame();
   data->plotOn(plot);

   // global fit  
   pdf->fitTo( *data );

   // fit restricted to a range (plus other options: save the result, use a different minimizer)
   RooFitResult * r = pdf->fitTo( *data, Save(true), Minimizer("Minuit2","Migrad"),Range(0.,10.) );

   // plot the fitted pdf on the same frame as the data
   pdf->plotOn(plot);
   plot->Draw();

   // if we have a result we can print it
   if (r) r->Print();
}

Exercise 20: Fit of a Higgs Signal over an Exponential Background

The aim of this exercise is to learn how to build a composite model in RooFit made of two p.d.f.s, one representing a signal and one a background distribution. We want to determine the number of signal events. For this we need to perform an extended maximum likelihood fit, where the number of signal events is one of the fit parameters.

First create the composite model formed by a Gaussian signal over a falling exponential background. Then read the data (in text format) from the attached file, Hgg.txt, and create a RooDataSet with all the data. Then perform an extended unbinned fit to the data to extract the Higgs signal strength. Plot the resulting fit function with separate signal and background components.
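
To see what "extended" adds to the likelihood, here is a minimal stand-alone C++ sketch of the quantity such a fit minimises. It is not RooFit code: the function names are hypothetical, and the background is simplified to a single exponential exp(-slope*x) rather than the two-term exponent used later in the solution. Both shapes are normalised on the observable range [lo, hi], and the negative log-likelihood is the total expected yield minus the sum over events of log(nSignal*f_sig + nBackground*f_bkg), dropping constant terms.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Gaussian density normalised to unit area on [lo, hi].
double gaussPdf(double x, double mean, double sigma, double lo, double hi) {
    assert(sigma > 0.0 && hi > lo);
    const double root2 = std::sqrt(2.0);
    const double pi = std::acos(-1.0);
    // fraction of the full Gaussian area that falls inside [lo, hi]
    const double inside = 0.5 * (std::erf((hi - mean) / (sigma * root2))
                               - std::erf((lo - mean) / (sigma * root2)));
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi) * inside);
}

// exp(-slope * x) normalised to unit area on [lo, hi] (slope must be non-zero).
double expoPdf(double x, double slope, double lo, double hi) {
    assert(slope != 0.0 && hi > lo);
    const double area = (std::exp(-slope * lo) - std::exp(-slope * hi)) / slope;
    return std::exp(-slope * x) / area;
}

// Extended negative log-likelihood for signal plus background.
// The first term is the Poisson constraint on the total number of events;
// without it only the relative fraction of the two shapes would be fitted.
double extendedNLL(const std::vector<double>& data,
                   double nSignal, double nBackground,
                   double mass, double width, double slope,
                   double lo, double hi) {
    double nll = nSignal + nBackground;
    for (double x : data) {
        const double density = nSignal * gaussPdf(x, mass, width, lo, hi)
                             + nBackground * expoPdf(x, slope, lo, hi);
        assert(density > 0.0);
        nll -= std::log(density);
    }
    return nll;
}
```

With the shapes held fixed and no signal, this function is smallest when nBackground equals the number of events in the data, which is the reason the fitted yields can be read directly as event counts.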

  • First read the attached file (Hgg.txt) into a RooDataSet using RooDataSet::read, or into a ROOT TTree using TTree::ReadFile
  • Create the model using the RooWorkspace factory
    • create the Exponential background p.d.f. and the Gaussian signal p.d.f. We build the exponent from two components (a linear and a quadratic term in x) to better reproduce the data.
    • then create a RooAddPdf using the Gaussian and the Exponential and their numbers of events. Note that we could use a relative fraction instead of the numbers of events. In that case we would fit only for the shape and not the normalisation.
    • The RooAddPdf is created using the SUM operator of the factory syntax.
// create model
   RooWorkspace w("w");
   w.factory("x[110,160]");  // invariant mass
   // create exponential model as two components
   w.factory("a1[ 7.5, -500, 500]");
   w.factory("a2[-1.5, -500, 500]");
   w.factory("expr::z('-(a1*x/100 + a2*(x/100)^2)', a1, a2, x)");
   w.factory("Exponential::bmodel(z, 1)");
    // signal model   
   w.factory("mass[130, 110, 150]");
   w.factory("width[1, 0.5, 5]");
   w.factory("Gaussian::smodel(x, mass, width)");

   // create RooAddPdf in extended mode
   w.factory("nbackground[10000, 0, 1000000]");
   w.factory("nsignal[100, 0.0, 1000.0]");
   w.factory("SUM::model(nbackground*bmodel, nsignal*smodel)");
   // generic form with the yields declared inline:
   // SUM::model(nsig[100,0,10000]*sig_pdf, nbkg[1000,0,10000]*bkg_pdf)

  • Fit the data: you need to call RooAbsPdf::fitTo.
// fit the data and save the result; optionally use the Minuit2 minimiser
RooAbsPdf * model = w.pdf("model");
RooFitResult * res = model->fitTo(*data, Minimizer("Minuit2","Migrad"), Save(true) );
  • Plot the resulting fit function in the same plot.
  • Plot also the signal and background fit components with different colours and styles
// draw the two separate pdf components
model->plotOn(plot, RooFit::Components("bmodel"), RooFit::LineStyle(kDashed) );
model->plotOn(plot, RooFit::Components("smodel"), RooFit::LineColor(kRed), RooFit::LineStyle(kDashed) );
plot->Draw();   // to show the RooPlot in the current ROOT canvas

// macro to fit Higgs to gg spectrum

#include "RooWorkspace.h"
#include "RooRealVar.h"
#include "RooAbsPdf.h"
#include "RooDataSet.h"
#include "RooFitResult.h"
#include "RooPlot.h"

#include "TTree.h"
#include "TError.h"

#include <iostream>

using namespace RooFit;

void fitHgg() {

   // read from the file and create a ROOT tree
   TTree tree("tree","tree");
   int nevt = tree.ReadFile("Hgg.txt","x");
   if (nevt <= 0) {
      Error("fitHgg","Error reading data from input file ");
      return;
   }
   std::cout << "Read " << nevt << " from the file " << std::endl;

   // make the RooFit model
   RooWorkspace w("w");
   w.factory("x[110,160]");  // invariant mass

   // upper limit as in the factory snippet above, so that nevt fits in the range
   w.factory("nbackground[10000, 0, 1000000]");
   w.var("nbackground")->setVal(nevt);

   // create exponential model as two components
   w.factory("a1[ 7.5, -500, 500]");
   w.factory("a2[-1.5, -500, 500]");
   w.factory("expr::z('-(a1*x/100 + a2*(x/100)^2)', a1, a2, x)");
   w.factory("Exponential::bmodel(z, 1)");

   // signal model   
   w.factory("nsignal[100, 0.0, 1000.0]");
   w.factory("mass[130, 110, 150]");
   w.factory("width[1, 0.5, 5]");
   w.factory("Gaussian::smodel(x, mass, width)");
   RooAbsPdf * smodel = w.pdf("smodel");

   w.factory("SUM::model(nbackground*bmodel, nsignal*smodel)");
   RooAbsPdf * model = w.pdf("model");

   // create RooDataSet and fit the model to it
   RooDataSet data("data","data",*w.var("x"),Import(tree) );
   RooFitResult * r = model->fitTo(data, Minimizer("Minuit2"),Save(true), Offset(true));

   // plot data and function
   RooPlot * plot = w.var("x")->frame();
   data.plotOn(plot);
   model->plotOn(plot);
   model->plotOn(plot, Components("bmodel"),LineStyle(kDashed));
   model->plotOn(plot, Components("smodel"),LineColor(kRed));
   plot->Draw();

   if (r) r->Print();
}

Figure: Hgg fit result (fit of the H->gg CMS data)

Pointing hand Do a scan of the likelihood (or profile likelihood) as a function of the Higgs mass parameter.

RooStats Exercises

For the RooStats exercises see this separate TWiki page. The slides are also available (here PDF).

IPython notebooks

The fitting, RooFit and RooStats exercises are also available as IPython notebooks, which are attached to this page.

-- LorenzoMoneta - 26 Nov 2013

Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| GausFit.ipynb | r1 | 6.8 K | 2015-11-24 - 05:05 | LorenzoMoneta | IPython notebooks for Fitting and RooFit exercises |
| GausModelRooFit.ipynb | r1 | 59.8 K | 2015-11-24 - 05:05 | LorenzoMoneta | IPython notebooks for Fitting and RooFit exercises |
| Hgg.txt | r1 | 330.5 K | 2015-11-23 - 20:55 | LorenzoMoneta | Higgs data file |
| HiggsFit.ipynb | r1 | 4.3 K | 2015-11-24 - 05:22 | LorenzoMoneta | Python notebook for Higgs fit |
| PeakFit.ipynb | r1 | 73.4 K | 2015-11-24 - 05:05 | LorenzoMoneta | IPython notebooks for Fitting and RooFit exercises |
| ROOT_Rio2015_Fitting.pdf | r1 | 3764.4 K | 2015-11-24 - 11:58 | LorenzoMoneta | Lecture slides |
| ROOT_Rio2015_Part1.pdf | r1 | 46067.9 K | 2015-11-24 - 12:06 | LorenzoMoneta | Lecture slides |
| ROOT_Rio2015_Part2.pdf | r1 | 2047.5 K | 2015-11-24 - 12:06 | LorenzoMoneta | Lecture slides |
| ROOT_Rio2015_RooStats.pdf | r2 r1 | 12989.2 K | 2015-11-26 - 04:24 | LorenzoMoneta | RooStats slides |
| Statistics_Rio_Part1.pdf | r1 | 4874.0 K | 2015-11-24 - 11:59 | LorenzoMoneta | Lecture slides |
| fitHgg.C | r1 | 1.7 K | 2015-11-23 - 20:57 | LorenzoMoneta | Higgs fit solution files |
| fitHgg.pdf | r1 | 25.2 K | 2015-11-23 - 20:57 | LorenzoMoneta | Higgs fit solution files |
| roostats_notebooks.tgz | r2 r1 | 199.5 K | 2015-11-26 - 04:22 | LorenzoMoneta | Tar file with RooStats notebooks and macros |
Topic revision: r12 - 2015-11-27 - LorenzoMoneta