Scanning and plotting events interactively in FWLite

fwlite::Scanner is a simple tool that provides functionalities similar to ROOT's TTree::Scan and TTree::Draw but using the FWLite libraries and the physicstools string parser to intepret the expressions, which is sometimes more convenient than the parser in CINT. It can also be used to create and fill on the fly RooFit datasets for quick unbinned fits.

A short presentation of this tool was given at the AnalysisTools meeting on 16/03/2010

At a glance

startup:

  • open ROOT, and set up your FWLite environment unless it's done automatically in your rootlogon.C
  • open your favourite CMSSW data file(s), and make a fwlite::Event or fwlite::ChainEvent. e.g.
  • load the scanner library and headers
TFile *muonFile = TFile::Open("rfio:/castor/cern.ch/user/g/gpetrucc/900GeV/DATA/Feb9Skims/Data_CollisionEvents_MuonSkim.root")
#include "DataFormats/FWLite/interface/Event.h"
fwlite::Event muonEvents(muonFile);
int loadme2 = gSystem->Load("libPhysicsToolsFWLite.so");
#include "CMS.PhysicsTools/FWLite/interface/Scanner.h"

make the scanner:

fwlite::Scanner<std::vector<reco::Muon> > muons(&muonEvents, "muons");

print out some variables of the muons, applying cuts:

muons.scan("pt:eta:phi:track.numberOfValidHits","isTrackerMuon");
 :       RUN : LUMI :     EVENT : #IT :       pt :      eta :      phi : alidHits :
 ---------------------------------------------------------------------------------- 
 :    123615 :   73 :   4651337 :   0 :  3.52303 : -1.66056 : -2.39754 :       19 :
 :    123732 :   73 :   7360653 :   0 :  0.94450 : -2.15472 :  2.05554 :       20 :
 :    123732 :   77 :   8904455 :   0 :  1.22863 :  -2.0736 :  2.33478 :       21 :
 :    123732 :   80 :  10253141 :   0 :  1.02454 : -1.89976 : -3.09361 :       19 :
 :    123732 :   81 :  10492607 :   0 :  2.23193 :  1.92248 : -2.36899 :       25 :
 :    123732 :   88 :  13374301 :   0 :  1.67321 : -1.69078 :  1.17773 :       21 :
 :    123732 :   99 :  17480050 :   0 :  1.65535 :  2.01428 :   3.0603 :       25 :
 :    123732 :  104 :  19369363 :   0 :  1.06741 : -1.75581 : -3.13176 :       20 :
 :    123815 :    9 :   2501115 :   0 :  1.93197 : -1.59389 : -0.23051 :       20 :

make a few plots:

muons.draw("pt","isGlobalMuon"); // automatic x-axis
muons.draw("eta",5,-2.5,2.5,"isGlobalMuon", "E1"); // specify binning and draw options

// 2D plots (binned)
muons.draw2D("eta","pt","isTrackerMuon","COLZ"); 
muons.draw2D("eta",5,-2.5,2.5,"pt",5,0.,5.,"isTrackerMuon","BOX TEXT"); // manual binning

// Profiles
muons.drawProf("eta",10,-2.5,2.5,"track.normalizedChi2","isTrackerMuon")

// Unbinned 2D scatter plot
muons.drawGraph("eta","pt","isTrackerMuon")

// Normalized plots
TH1 *tpt = muons.draw("pt",10,0,5,"isTrackerMuon", "NORM");
tpt->SetLineColor(kBlue);
TH1 *gpt = muons.draw("pt","isGlobalMuon", "NORM SAME");
gpt->SetLineColor(kRed);

// Fill a CMS.RooFit dataset with some variables
RooDataSet *ds = muons.fillDataSet("pt:eta:@hits=track.numberOfValidHits", "@id=muonID('TMLastStationAngTight'):@goodChi2=track.normalizedChi2<4", "track.isNonnull");

Documentation

Creating and configuring the fwlite::Scanner

The fwlite::Scanner<C> class can be created to scan through the elements of a collection C . In the usual case C will be a std::vector>something<, but the scanner can also handle some other EDM collections like OwnVector, RefVector, RefToBaseVector, PtrVector

A scanner object is created starting from a fwlite::Event (or fwlite::ChainEvent) and the usual set of labels like for the getByLabel call.

fwlite::Scanner(fwlite::EventBase *ev, const char *label, const char *instance = "", const char *process="") 

There are some options that you can configure:

  • setPrintFullEventId(bool printIt=true): if set to true, when scanning it will print the run, lumisection and event number for each row; if set to false, it will just print the index of the event within the file. The default is true for real data, false for mc.
  • setExpressionSeparator(TString separator): sets the character used to separate the expressions in the scan command; defaults to "==:==" as in ROOT.
  • setIgnoreExceptions(bool ignoreThem): by default all cms::Exceptions thrown when examining the event are printed to the error stream; setting this flag to true will cause the parser to silently ignore the exceptions.
  • setMaxLinesToPrint(int lines): sets the number of lines of printout after which the scanner pauses and asks for confirmation to continue; defaults to 50; if set to a negative number, it will never stop.
  • setMaxEvents(int max): if set to a positive number M, only the first M events in the file will be considered.

IDEA! If you get some errors when trying to create the scanner object, it could be that your're not specifying the type correctly (CINT has some problems in understanding templates and typedefs for objects that are accessed through Reflex dictionaries). In order to find out what is the correct type name that CINT will like you can do something like Events->GetBranch("recoMuons_muons__RECO.obj")->GetClassName() and then substitute edm::Wrapper with fwlite::Scanner.

Scanning and counting events

The main method to use is

fwlite::Scanner<C>::scan(const char *exprs, const char *cut="", int nmax=-1) ;
It will scan the first nmax events in the file, and for the objects that pass the given selection cut print out the values of one or more expressions (separated by "==:==")

The title of each column is the text of the expression, cropped to the last 8 characters if it's too long. If this is not satisfactory, one can specify a title explicitly by using the notation "@title=expression" Each row is prefixed by the event id and by the index of the object within the collection.

A method count exist for counting the number of objects that pass a selection

size_t fwlite::Scanner<C>::count(const char *cut) ;

You can also count the number of events, using countEvents(). This is normally not needed, but it might be relevant when you're using some feature to filter the list of events (see below).

Plotting

Histograms

1D histograms can be produced with the draw methods.

The different methods provide a different control on how the histogram is binned:

  • for the automatic binning, use draw(const char *expr, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
  • for manual binning, for for one of the two
    • draw(const char *expr, int nbins, double xlow, double xhigh, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
    • draw(const char *expr, int nbins, double *binEdges, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
  • to create a histogram cloning a template histogram use draw(const char *expr, const char *cut, TString drawopt, const char *hname, const char *, const TH1 *template)
  • fill in an existing histogram draw(const char *expr, const char *cut, TString drawopt, const char *hname, const TH1 *hist) . Note that the histogram is not reset before filling.
All methods return a pointer to the histogram; note however that the histograms named "htemp" will be considered temporary, and so they will be overwritten by the next draw command (except if drawopt contains "SAME").

A few comments on the draw options:

  • If GOFF is among the options, the plot is filled but_not_ drawn (as normally in TTree::Draw )
  • If NORM is among the options, the plot is scaled by 1.0/hist->Integral() before drawing
  • If SAME is specified, the histogram is drawn on top of the previous one, using the binning of that one.
    ALERT! This doesn't always work, although it should work in the most common cases; if you have problems, a failsafe solution is to give each histogram a different name and the same binning, keep the pointers returned by the multiple calls to draw, and then call Draw and Draw("SAME") manually.

HELP You can safely ignore the messages Warning in <TFile::Append>: Replacing existing TH1: htemp (Potential memory leak). that might pop up sometimes when using SAME (it's not totally trivial to correctly delete the old histogram while still keepting it displayed)

Profile histograms

1D profile histograms can be produced with the drawProf methods:
  • automatic binning: drawProf(const char *xexpr, const char *yexpr, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
  • manual binning: drawProf(const char *xexpr, int nbins, double xlow, double xhigh, const char *yexpr, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
  • filling an existing profile: drawProf(const char *xexpr, const char *yexpr, const char *cut, TString drawopt, TProfile *profile)

Two-dimensional histograms and plots

There are two types of bidimensional plots you can produce: binned 2D histograms (TH2) and unbinned scatter plots (TGraph). .

2D histograms can be produced with the draw2D methods:

  • automatic binning: draw2D(const char *xexpr, const char *yexpr, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
    note that unlike for 1D histograms, the automatic binning of 2D histograms requires to loop on the data twice.
  • manual binning: draw2D(const char *xexpr, int nbinsx, double xlow, double xhigh, const char *yexpr, int nbinsy, double ylow, double yhigh, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
  • filling an existing histogram: draw2D(const char *xexpr, const char *yexpr, const char *cut, TString drawopt, TH2 *h2)

Graphs can be made with drawGraph:

  • new graph: drawGraph(const char *xexpr, const char *yexpr, const char *cut = "", TString drawopt = "", const char *hname = "htemp")
  • add points to an existing histogram: drawGraph(const char *xexpr, const char *yexpr, const char *cut, TString drawopt, TGraph *graph)

Filling RooFit datasets

RooFit datasets can be created using the method
RooDataSet * fwlite::Scanner<C>::fillDataSet(const char *realvars, const char *boolvars, const char *cut="", const char *name="data") 

Real variables, mapped to RooRealVar can be declared just like the fields in the scan() method: either a simple expression, or @<name>=<expression>.
For boolean variables, a cut expression can be provided, so comparison and boolean operators are allowed; the variable will be mapped to a RooCategory with two states fail (0) and pass (1). The name of the category can be specified using the same syntax as for real variables.

Selecting the events to scan or plot

The cut that is applied in the various draw and scan commands only apply to individual items within the analyzed collection. If you want instead to filter the events that are accessed by the scanner, you can provide any event selection class that implements the generic interface defined in CMS.PhysicsTools/FWLite/interface/EventSelectors.h

namespace fwlite {
    class EventSelector : public TNamed {
        public:
            EventSelector(const char *name="", const char *title="") : TNamed(name,title) {}
            virtual ~EventSelector() {}
            virtual bool accept(const fwlite::EventBase &ev) const = 0;
    };   
}

EventSelector objects can be added to a fwlite::Scanner by using the addEventSelector method. The list of all selectors associated to a scanner can be retrieved using the eventSelectors method (which returns a TObjArray), and can be cleared using clearEventSelectors. Note that the Scanner doesn't own the pointers to the event selectors, which means you can add the same selector to more than one scanner; it also means you should delete the selectors yourself when you no longer need them.

The Scanner will process only the events that pass all the associated event selectors (i.e. it does the AND of them). Events rejected by the selectors still count against the maximum number of events to process set by the setMaxEvents method.

Two existing event selector implementations are provided:

  • fwlite::RunLumiSelector to select only events from a given run or lumi range, which is probably not very useful except as an example on how to write EventSelectors
  • fwlite::ObjectCountSelector<C> to select events by counting the number of elements of a collection C that pass a given cut. An ObjectCountSelector<C> can be created by providing the input labels (module, instance, process), a cut, and the minimum and maximum number of objects for the event to be accepted. %BR By default, minimum is 1 and maximum is set to no limit (-1). The minimum and maximum are inclusive, i.e. it select events for which min ≤ n ≤ max.
    The cut and limits can be specified also afterwards using setMin, setMax, setCut. Just like the Scanner, it can be configured to ignore cms::Exceptions when evaluating the cut by calling setIgnoreExceptions

An example fragment of macro to draw the mean number of tracks vs η for events that have at least one track with pT > 500 MeV and |&eta| < 1 is

fwlite::Event ev(gFile);
fwlite::Scanner<std::vector<reco::Track> > sc(&ev, "generalTracks");
fwlite::ObjectCountSelector<std::vector<reco::Track> > ntracks("generalTracks","","", "pt > 0.5 && abs(eta)<1", 1);
sc.addEventSelector(&ntracks);
TH1 *heta = sc.draw("eta",5,-2.5,2.5, "quality('highPurity')");

Reviewer/Editor and Date (copy from screen) Comments
GiovanniPetrucciani - 24-Mar-2010 created page

Responsible: GiovanniPetrucciani
Last reviewed by: GiovanniPetrucciani - 24-Mar-2010

Edit | Attach | Watch | Print version | History: r7 < r6 < r5 < r4 < r3 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r7 - 2010-05-04 - BorisMangano
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    CMSPublic All webs login

This site is powered by the TWiki collaboration platform Powered by PerlCopyright & 2008-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback