Introduction

Are you looking for the tutorial? See this page.

This analysis framework is designed to run with little effort from the end user, but customization requires a bit more understanding. A C++ program and a Python script are provided as examples. The idea is that analyses should be applied to events rather than to ntuple branches, which are only a medium for organizing information about events. To achieve this at runtime, the components are factorized as much as possible:

  • Input Ntuple format and ROOT file handling
  • Analysis algorithm and cut-flow support
  • Histogramming, trigger streams and output file handling

Moreover, the Reflex technology is used so that the AnalysisFramework library can be called from C++ and Python code (and from any other language that supports reflection information).

Getting Started

For a brief tutorial click this link.

We suggest creating a development directory and downloading/compiling everything there. It is also useful to have a local directory that will hold your user-specific libraries.

   cd $HOME
   mkdir -p local
   mkdir -p development

In your working area, set up Athena, or make sure that you have the Boost libraries, BAT and Minuit available; in the latter case you have to modify Makefile.inc.

   cd $HOME/development
   svn co $ATLGRP/Institutes/Bologna/AnalysisFramework/trunk AnalysisFramework
   cd AnalysisFramework
   gmake -j4 # 4 is the number of CPUs in your computer

If the local directory is not under $HOME, we suggest modifying the first line of Makefile.inc

  PREFIX=your_local_directory_here

or running make PREFIX=your_local_directory_here.

As an example, try out our ttbar cut-and-count (or cut-based) analysis. You can grab it from SVN:

   cd $HOME/development
   svn co $ATLGRP/Institutes/Bologna/TTbarMassAnalysis/trunk TTbarMassAnalysis
   cd TTbarMassAnalysis
   make

You can create symlinks to ntuples stored locally; adjust the makeLinksToDatasets.sh script according to your path structure.

Ntuples

The system has been designed to let users read different kinds of ntuples in a transparent way, i.e. to offer a common, straightforward interface for retrieving the events to be analyzed.

Suppose we want to read a given TopD3PD ntuple, implemented as a tree named "physics" that contains a number of branches which may change over time. You may not need all of them: disabling some branches may reduce the processing time significantly.
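How branches are switched off in practice depends on the wrapper you build below, but the underlying mechanism is ROOT's TTree::SetBranchStatus. A minimal sketch, e.g. inside a ROOT macro (the branch patterns are placeholders, not the real TopD3PD layout):

  // Sketch only: keep just the branches the analysis really needs.
  void trim_branches()
  {
    TFile * f = TFile::Open( "mytopd3pd.root" );
    TTree * physics = (TTree*) f->Get( "physics" );

    physics->SetBranchStatus( "*", 0 );       // first switch everything off...
    physics->SetBranchStatus( "el_*", 1 );    // ...then re-enable only what is used
    physics->SetBranchStatus( "mu_*", 1 );
    physics->SetBranchStatus( "jet_*", 1 );
  }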

First, one must create a "wrapper" using ROOT:

  root mytopd3pd.root
  physics->MakeClass("TopD3PD")

This can also be done with the scripts/NtupleInterfaceMaker.py utility:

   python $HOME/local/AnalysisFramework/scripts/NtupleInterfaceMaker.py -t analysis -w TopD3PD your_root_file_here.root

You then need to modify the generated implementation file slightly, adding:

using namespace std;

ROOT-generated classes expose all branches as public members. The result is a monolithic object that may change heavily as the ntuple definition evolves. Instead, we decided to use the Adapter Pattern to decouple the definition of the branches from what the user expects to see, i.e. a class that provides the object of the analysis (an event). This object can be processed by the user-defined analysis on an event-by-event basis. This is done with a small number of classes that fit together: Ntuple, Analysis and HistogramManager. The idea is that they represent three separate parts of your job. First you configure each piece, and then you link them like this:

  TopInputsWrapper * ntuple = (TopInputsWrapper*)NtupleFactory::CreateNtuple( TopInputs );
  // ...configure your ntuple...
  HistogramManager * hm = HistogramManager::GetInstance();
  // ...configure the histogram manager...
  MyCoolAnalysis * analy = new MyCoolAnalysis();
  // ...configure your cool analysis...

  *analy += ntuple;   // same as analy->SetNtuple( ntuple )
  *analy += hm;       // same as analy->SetHistogramManager( hm )

  analy->Loop();

All ntuples derive from a base class obviously called Ntuple, whose main public functions are:

int             AddDirectory( const std::string& directory );
virtual Event*  GetEvent( long nevt ) = 0;
Event*          GetCurrentEvent();
Event*          GetNextEvent();
virtual long    GetNEvents() = 0;
long            GetCurrentEventNumber() const;

Some of its private member functions, on the other hand, are virtual:

 
  bool            MakeEvent( const long eventNumber = -1 ); 
  void            ResetEvent();
  // these depend on the ntuple's content!
  virtual bool    MakeEventInfo() = 0;
  virtual bool    MakeTriggers()  = 0;
  virtual bool    MakeElectrons() = 0;
  virtual bool    MakePhotons()   = 0;
  virtual bool    MakeMuons()     = 0;
  virtual bool    MakeTaus()      = 0;
  virtual bool    MakeJets()      = 0;
  virtual bool    MakeMissingET() = 0;
  virtual bool    MakeTracks()    = 0;
  virtual bool    MakeVertices()  = 0;

At this point, the user can define his/her own ntuple this way:

class TopInputsWrapper : public Ntuple, private TopD3PD { ... };

and define the virtual functions that actually create the objects.
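As a sketch, one of these virtual functions might look like the following (the branch names mu_n, mu_pt, ... and the Muon/AddMuon interface are illustrative assumptions, not the actual TopD3PD or framework classes):

  // Sketch only: branch and class names are placeholders.
  bool TopInputsWrapper::MakeMuons()
  {
    for ( int i = 0; i < mu_n; ++i ) {                 // branches inherited from TopD3PD
      Muon mu;
      mu.SetPtEtaPhiE( mu_pt->at(i), mu_eta->at(i), mu_phi->at(i), mu_E->at(i) );
      GetCurrentEvent()->AddMuon( mu );                // hypothetical Event accessor
    }
    return true;
  }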

Finally, a factory method (NtupleFactory.cxx) simplifies the creation of standard ntuples, for example:

TopInputsWrapper * ntuple =  (TopInputsWrapper*)NtupleFactory::CreateNtuple( TopInputs );

Once the ntuple has been defined, this is the only call needed to create an instance of a standard ntuple. The user is also free to create his/her own ntuple format in standalone mode: you do not need to modify the NtupleFactory in order to deploy a custom-designed ntuple.

To read in the files:

ntuple->AddDirectory( inputDirectory );

The ntuple object provides a pointer to an event via GetNextEvent() or GetEvent(long nevt) functions. In this paradigm, analyses are applied to events, not to ntuple branches!
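For instance, a bare event loop built only on the public Ntuple functions listed above could look like this sketch:

  const long nEvents = ntuple->GetNEvents();
  for ( long i = 0; i < nEvents; ++i ) {
    Event * evt = ntuple->GetEvent( i );   // or, equivalently, ntuple->GetNextEvent()
    if ( !evt ) continue;
    // ...analyze evt here: the analysis code never touches the branches directly...
  }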

Histograms

A HistogramManager class is provided to manage histograms. Histograms are defined in an XML file (histograms.xml by default, but a different file can be passed as a parameter to the command-line executable).

The histograms are defined as such (ROOT-like style):

<TH1F name="cutflow"   title="Cut Flow" nbins=8 xlow=-0.5 xup=7.5 global=1/>
<TH1F name="hadt_m"   title="Hadronic Top Mass" nbins=100 xlow=0 xup=500 />
<TH2F name="etmiss_eT_vs_leptW_mT"   title="E_{T}^{miss} vs leptonic W transverse mass" nbinsx=50 xlow=0 xup=200 nbinsy=100 xlow=0 xup=100 />

The global attribute means that only one such plot is created per analysis channel (ELE, MUON), rather than one per cut level. This can be useful, e.g., to store the event weights.

Every analysis channel will contain histograms such as EGAMMA_el_pT, EGAMMA_mu_pT, EGAMMA_etmiss_eT, MUON_el_pT, MUON_mu_pT, MUON_etmiss_eT, etc. HistogramManager is a singleton, which means that only one instance is created per program:

  static HistogramManager * GetInstance();
  void AddAnalysisChannel( const AnalysisChannel& analysisChannel );
  void BookHistogram( const string& name, const string& title, int nbinsx, double xmin, double xmax );
  void BookHistogram( const string& name, const string& title, int nbinsx, double xmin, double xmax,
                                                               int nbinsy, double ymin, double ymax );
  void Finalize();
  TH1* Get( const string& name, const AnalysisChannel& analysisChannel );
  bool OpenOutputFile( const string& filename );
  void PrintBookedHistograms( const AnalysisChannel& analysisChannel = "" );

At the end of the analysis, the user should ask the histogram manager to Finalize(), i.e. to save the histograms to file and delete the booked histograms from memory.
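Putting the calls above together, a typical booking and filling sequence could look like the following sketch (it assumes that an AnalysisChannel can be built from a plain string, as the default argument of PrintBookedHistograms() suggests; electronPt and the histogram names are placeholders):

  const AnalysisChannel channel = "ELE";       // assuming AnalysisChannel is string-like

  HistogramManager * hm = HistogramManager::GetInstance();
  hm->OpenOutputFile( "histos.root" );
  hm->AddAnalysisChannel( channel );
  hm->BookHistogram( "el_pT", "Electron p_{T} [GeV]", 100, 0., 500. );

  // inside the event loop: fetch the per-channel copy and fill it
  hm->Get( "el_pT", channel )->Fill( electronPt );

  // at the end of the job: write histograms to file and free the memory
  hm->Finalize();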

Analyses

Analyses can be subclassed from the general Analysis base class, or from the more specialized CutBasedAnalysis; the latter provides support for cut flows. The base class has a few virtual functions. You may be interested in overriding the Execute() member function, which is called at each iteration of the event loop. However, we suggest a different approach, explained later on.

public:
  bool                           SetNtuple( Ntuple * ntuple ); // same as +=
  bool                           SetHistogramManager( HistogramManager * hm ); // same as +=
  virtual bool                   BookHistograms();
  bool                           Loop( const long nevents = -1 );
  virtual bool                   Finalize(); 
  virtual void                   PrintParameters();
protected:
  virtual bool                   Initialize();
  virtual bool                   PrepareBeforeLoop();
  virtual bool                   Execute() = 0;
  virtual void                   DumpCurrentEvent();

Cut-based analyses

The user can define a cut flow (e.g. ELE, MUON) and add a list of cuts to each cut flow.

public:
virtual bool                   CreateCutFlow( const string& key );
virtual inline CutFlow * const GetCutFlow( const string& key );

protected:
virtual bool                   ApplyCutFlow( const string& key, int * lastCutPassed = NULL );

For example:

const string key = "ElectronSingleLeptonSemileptonicTTbar";
CreateCutFlow( key );
GetCutFlow(key)->AddCut( new SingleLeptonCut( ELECTRON_CHANNEL ) );
GetCutFlow(key)->AddCut( new MissingETCut( 20.*GeV, "ETmiss>20 GeV" ) );
GetCutFlow(key)->AddCut( new JetCut( 20.*GeV, 2.5, 4, "4j20" ) );

The list of available cuts is quite long. Additional cuts can be created by subclassing the Cut base class. The user must re-define at least the virtual function

virtual bool Apply( const Event& evt, int * lastCutPassed = NULL ) const;

and possibly the configuration function:

virtual void Configure( const TiXmlElement& xml );
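As a sketch, a hypothetical cut on the number of primary-vertex tracks could look like the code below. Only the Apply() and Configure() signatures come from the base class; the Event accessor is an assumption, and the real Cut base class may require extra constructor arguments (e.g. a label, as in the examples above), which are omitted here.

  // Sketch of a user-defined cut; the Event accessor is hypothetical.
  class MyVertexCut : public Cut {
  public:
    MyVertexCut( int minTracks = 4 ) : m_minTracks( minTracks ) {}

    virtual bool Apply( const Event& evt, int * lastCutPassed = NULL ) const
    {
      return evt.GetPrimaryVertexNTracks() >= m_minTracks;   // hypothetical accessor
    }

    virtual void Configure( const TiXmlElement& xml )
    {
      xml.QueryIntAttribute( "ntrk", &m_minTracks );          // reads e.g. <cut ... ntrk="4" />
    }

  private:
    int m_minTracks;
  };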

An example of a complete analysis can be found here.

Remember that even CutBasedAnalysis cannot be instantiated directly: the user must define the Execute() function somewhere!

Your Own Analysis

In order to make your own analysis, just create a class that inherits publicly from Analysis or CutBasedAnalysis. You must override the Execute() member function, which is pure virtual in Analysis.
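A minimal skeleton might look like the sketch below (MyAnalyzer and the cut flow key are placeholders; only the cut shown in the earlier example is reused, and whether Initialize() is the right place to build the cut flow depends on the framework, the XML configuration described below being the recommended route anyway):

  // Minimal sketch of a user analysis; names are placeholders.
  class MyAnalyzer : public CutBasedAnalysis {
  protected:
    virtual bool Initialize()
    {
      CreateCutFlow( "ELE" );
      GetCutFlow( "ELE" )->AddCut( new MissingETCut( 20.*GeV, "ETmiss>20 GeV" ) );
      return true;
    }

    virtual bool Execute()
    {
      int lastCutPassed = -1;
      return ApplyCutFlow( "ELE", &lastCutPassed );
    }
  };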

If you plan to use the plugin factory called Jigsaw (see below), you must add the following lines to the header (assuming that your analysis is called TTbarMassAnalyzer):

typedef AnalysisFactory<TTbarMassAnalyzer> AnalysisFactory_TTbarMassAnalyzer;

extern "C" {
  AnalysisFactory_TTbarMassAnalyzer * MakeAnalysis();
}

plus the implementation in the source:

extern "C" {
  AnalysisFactory_TTbarMassAnalyzer * MakeAnalysis() 
  { 
    return new AnalysisFactory_TTbarMassAnalyzer(); 
  }
}

Explanation: the plugin factory looks for a shared library called libNAMEAnalyzer.so, where NAMEAnalyzer is the name of your analysis class and NAME is something physically meaningful such as TTbarMass; in this example the library is libTTbarMassAnalyzer.so. In the library it expects to find a symbol called MakeAnalysis, which must be a function returning an object of type AnalysisFactory, templated on the analysis class you defined. Note that AnalysisFactory_TTbarMassAnalyzer is only for internal use: you can assign whatever name you prefer, but we strongly recommend the AnalysisFactory_<class name> convention for consistency.
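The lookup itself can be pictured with plain dlopen/dlsym, roughly as in the following sketch (this is not Jigsaw's actual code, and in reality the loader only needs the generic AnalysisFactory interface):

  // Sketch of the lookup mechanism, not Jigsaw's actual implementation.
  #include <dlfcn.h>

  typedef AnalysisFactory_TTbarMassAnalyzer * (*MakeAnalysisFn)();

  void LoadPlugin()
  {
    void * handle = dlopen( "libTTbarMassAnalyzer.so", RTLD_NOW );
    if ( !handle ) return;

    MakeAnalysisFn makeAnalysis = (MakeAnalysisFn) dlsym( handle, "MakeAnalysis" );
    if ( !makeAnalysis ) return;

    AnalysisFactory_TTbarMassAnalyzer * factory = makeAnalysis();
    // ...ask the factory for the analysis object and run it...
  }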

Analysis Configuration

Even if it is still possible to configure the cuts by hand, we strongly recommend using an XML file. This is one working example:

<analysis name="Baseline" >

<cutflow key="ELE">
  <cut type="dummy" />
  <cut type="trigger" L1="" L2="" EF="EF_e15_medium" >
    <schedule manipulator="H_T" minjetpt=20 includeetmiss=1 />
  </cut>
  <cut type="pvx" nvtx=1 ntrk=4 />
  <cut type="sl" flavor="ele" >
    <schedule manipulator="leptW" flavor="ele" strategy="wmassconstraint" />
    <schedule manipulator="leptW_mT" flavor="ele" />
  </cut>
  <cut type="etmiss" met=35 />
  <cut type="mTW" min=25 flavor="ele" />
  <cut type="jetcut" n=4 ptmin=25 >
    <schedule manipulator="hadtnobtag" />
    <schedule manipulator="hadwfromhadtop" strategy="masswindow" />
    <schedule manipulator="leptt" />
    <schedule manipulator="ttbar" strategy="full" />
    <schedule manipulator="polarization" particletag="ttbar" />
  </cut>
  <cut type="btag" n=1 likelihood=5.85 />
  <cut type="jetcleaning" />
</cutflow>

<cutflow key="MUON">
  <cut type="dummy" />
  <cut type="trigger" L1="" L2="" EF="EF_mu10_MSonly,EF_mu13,EF_mu13_tight" > 
    <schedule manipulator="H_T"   minjetpt=20 includeetmiss=1 />
  </cut>
  <cut type="pvx" nvtx=1 ntrk=4 />
  <cut type="sl" flavor="mu" >
    <schedule manipulator="leptW" flavor="mu" strategy="wmassconstraint" />
    <schedule manipulator="leptW_mT" flavor="mu" />
  </cut>
  <cut type="etmiss" met=20 />
  <cut type="toptriangular" min=60 flavor="mu" />
  <cut type="jetcut" n=4 ptmin=25 >
    <schedule manipulator="hadtnobtag" />
    <schedule manipulator="hadwfromhadtop" strategy="masswindow" />
    <schedule manipulator="leptt" />
    <schedule manipulator="ttbar" strategy="full" />
    <schedule manipulator="polarization" particletag="ttbar" />
  </cut>
  <cut type="btag" n=1 likelihood=5.85 />
  <cut type="jetcleaning" />
</cutflow>

</analysis>

In AnalysisFramework's jargon, an EventManipulator is a class that takes the event and creates some quantity of interest, such as scalars (H_T) or reconstructed particles (the hadronic top). Reconstructed quantities are added temporarily to the event via the AddReconstructedScalarQuantity() and AddReconstructedParticle() member functions. These quantities end up in an STL map, i.e. a string-indexed array. GetReconstructedScalarQuantity() and GetReconstructedParticle() allow you to retrieve these objects from the event.
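For example, a hypothetical H_T manipulator could be sketched as follows (only the Add/Get calls come from the text above; the EventManipulator interface and the jet accessors are assumptions):

  // Sketch of an event manipulator; base-class interface and accessors are illustrative.
  class HTManipulator : public EventManipulator {
  public:
    virtual bool Apply( Event& evt ) const
    {
      double ht = 0.;
      for ( unsigned int i = 0; i < evt.GetNJets(); ++i )    // hypothetical accessors
        ht += evt.GetJet( i ).Pt();
      evt.AddReconstructedScalarQuantity( "H_T", ht );
      return true;
    }
  };

  // later, e.g. in a cut or in Execute():
  // double ht = evt.GetReconstructedScalarQuantity( "H_T" );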

Event manipulators can be avoided by putting the code directly inside the Execute() function. However, experience tells us that many functions are shared among different analyses, and copy & paste is not always a good choice.

Execution

How to run with C++

Using Jigsaw

When you compile AnalysisFramework, the Jigsaw executable gets installed in $PREFIX/bin (is this directory in your $PATH?). When launched in your analysis directory, Jigsaw loads the shared library containing the analysis plugin. Expect to see something similar to this:

#>  Jigsaw -d links2datasetsMC/105860_TTbar_PowHeg_Jimmy -n 1000

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Jigsaw                       
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running on 1000 events
Found analysis: libTTbarMassAnalyzer.so / TTbarMassAnalyzer
Plugin loaded...
Initializing base class Analysis
Registering standard event manipulators
Initializing CutBasedAnalysis
Initializing histogram fillers
Added histogram filler EventWeight / 1
Added histogram filler FinalStateParticles / 2
Default histogram fillers 2
Initializing TTbarMassAnalyzer
TTbarMassAnalyzer: adding histogram fillers
...

The Old Way

You can still program an executable of your own!

const string outFilename = "BaselineAnalysis.root";
const string inDir = "/home/disipio/datanas/ntuples/MC/MC09.105861.PowhegPythia";
const string run = "105861";      // run number tag of the dataset above
const long max_events = -1;       // -1 means all events found in the ntuple

TTbarMassAnalysis analy( run ); // defines a cut flow and implements Execute()

HistogramManager * p_hm = HistogramManager::GetInstance();
p_hm->OpenOutputFile( outFilename );
analy.SetHistogramManager( p_hm );

TopInputsWrapper * ntuple = (TopInputsWrapper*)NtupleFactory::MakeNtuple( TopInputs );
ntuple->AddDirectory( inDir );
analy.SetNtuple( ntuple );
analy.Loop( max_events );
analy.Finalize();

How to run with Python

This feature is currently disabled due to problems during compilation on the grid. In any case:

import ROOT
import PyCintex
PyCintex.loadDict("libTTbarMassAnalysis") 

from ROOT import TTbarMassAnalysis, HistogramManager, NtupleFactory

run = "105861"
max_event = 100
outFilename     = "BaselineAnalysis.root"
PowhegPythiaDir = "/home/disipio/datanas/ntuples/MC/MC09.105861.PowhegPythia"

ntuple = NtupleFactory.MakeNtuple( "JSearcyD3PD" )
ntuple.AddDirectory( PowhegPythiaDir )

hm = HistogramManager.GetInstance()
hm.OpenOutputFile( outFilename )

analy = TTbarMassAnalysis()
analy.SetHistogramManager( hm )
analy.SetNtuple( ntuple )

analy.Loop( max_event )
analy.Finalize()

-- RiccardoDiSipio - 22-Apr-2010
