4.1.1 More on CMSSW Framework

Goals of this page

When you finish this page, you should understand:
  • the modular architecture of the CMSSW framework and the Event Data Model (EDM)
  • how data are uniquely identified in an Event
  • how Event data are processed - AOD and miniAOD structures
  • the Framework Services, including the EventSetup

Introduction

The overall collection of software, referred to as CMSSW, is built around a Framework, an Event Data Model (EDM), and Services needed by the simulation, calibration and alignment, and reconstruction modules that process event data so that physicists can perform analysis. The primary goal of the Framework and EDM is to facilitate the development and deployment of reconstruction and analysis software.

Modular Event Content

It is important to emphasize that the event data architecture is modular, just as the framework is. Different data layers (using different data formats) can be configured, and a given application can use any layer or layers. The branches (which map one to one with event data objects) can be loaded or dropped on demand by the application. The following diagram illustrates this concept:

modular_event_products.gif

You can reprocess event data at virtually any stage. For instance, if the available AOD doesn't contain exactly what you want, you might want to reprocess the RECO (e.g., to apply a new calibration) to produce the desired AOD.

reprocess.gif

Custom quantities (data produced by a user or analysis group) can be added to an event and associated with existing objects at any processing stage (RECO/AOD -> candidates -> user data). Thus the distinction between "CMS data" and "user data" may change during the lifetime of the experiment.

user_data_in_event.gif

Identifying Data in the Event

Data within the Event are uniquely identified by four quantities:

C++ class type of the data
E.g., edm::PSimHitContainer or reco::TrackCollection.
module label
the label that was assigned to the module that created the data. E.g., "SimG4Objects" or "TrackProducer".
product instance label
the label assigned to the object from within the module (defaults to an empty string). This is convenient if many C++ objects of the same type are being put into the edm::Event from within a single module.
process name
the process name as set in the job that created the data

For example, if you run the following command (you can find the file MYCOPY.root here):

edmDumpEventContent MYCOPY.root

you get this output:

vector<reco::TrackExtra>          "electronGsfTracks"     ""            "RECO."        
vector<reco::TrackExtra>          "generalTracks"         ""            "RECO."        
vector<reco::TrackExtra>          "globalMuons"           ""            "RECO."        
vector<reco::TrackExtra>          "globalSETMuons"        ""            "RECO."        
vector<reco::TrackExtra>          "pixelTracks"           ""            "RECO."        
vector<reco::TrackExtra>          "standAloneMuons"       ""            "RECO."        
vector<reco::TrackExtra>          "standAloneSETMuons"    ""            "RECO."        
vector<reco::TrackExtra>          "tevMuons"              "default"     "RECO."        
vector<reco::TrackExtra>          "tevMuons"              "firstHit"    "RECO."        
vector<reco::TrackExtra>          "tevMuons"              "picky"       "RECO."    

In the above output:

  • vector<reco::TrackExtra> is the C++ class type of the data
  • "globalMuons" is the module label
  • "firstHit" is the product instance label
  • RECO is the process name
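
The four-part identification can be pictured as a composite lookup key. The following is only a toy sketch, not the CMSSW implementation: the names ProductKey, ToyEvent and makeToyEvent are hypothetical, and an int stands in for the real data product. It shows why the three "tevMuons" entries in the listing above are distinct products even though they share class type, module label and process name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical stand-in for the four quantities that identify a product.
struct ProductKey {
    std::string classType;      // C++ class type, e.g. "vector<reco::TrackExtra>"
    std::string moduleLabel;    // label of the module that created the data
    std::string instanceLabel;  // product instance label, empty by default
    std::string processName;    // process name of the job that created the data

    // Lexicographic ordering over all four fields so the key can index a map.
    bool operator<(const ProductKey& other) const {
        return std::tie(classType, moduleLabel, instanceLabel, processName) <
               std::tie(other.classType, other.moduleLabel,
                        other.instanceLabel, other.processName);
    }
};

// Toy event: one payload per key (an int stands in for the real product).
using ToyEvent = std::map<ProductKey, int>;

// Build a toy event holding the three "tevMuons" products from the listing.
inline ToyEvent makeToyEvent() {
    const std::string type = "vector<reco::TrackExtra>";
    ToyEvent ev;
    ev[ProductKey{type, "tevMuons", "default", "RECO"}] = 1;
    ev[ProductKey{type, "tevMuons", "firstHit", "RECO"}] = 2;
    ev[ProductKey{type, "tevMuons", "picky", "RECO"}] = 3;
    return ev;
}
```

Changing any one of the four fields, here only the product instance label, selects a different entry; a key with an empty instance label matches none of the three.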

Getting data from the Event

All Event data access methods use an edm::Handle<type>, where type is the C++ type of the datum, to hold the result of an access.

To request data from an Event, in your module, use a form of one of the following:

  • get which either returns one object or throws a C++ exception.
  • getMany which returns a list of zero or more matches to the data request.
After get or getMany, indicate how the data are to be identified, e.g. getByLabel or getManyByType, and then pass the handle declared for that type, as shown in the example below.

Sample EDAnalyzer Code

Here is a snippet from the EDAnalyzer code called DemoAnalyzer.cc (used in the next section) showing how data are identified and accessed by a module. Notes follow:

void DemoAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup)
{
  using namespace edm;

  // Declare a handle called "tracks" for the product type
  // "reco::TrackCollection" that you want to retrieve from the event "iEvent".
  edm::Handle<reco::TrackCollection> tracks;

  // Pass the handle "tracks" to the method "getByLabel", which is used to
  // retrieve one and only one instance of the type in question with
  // the label specified out of event "iEvent". If more than one instance
  // exists in the event, then an exception is thrown immediately when
  // "getByLabel" is called. If zero instances exist which pass
  // the search criteria, then an exception is thrown when the handle
  // is used to access the data. (You can use the "failedToGet" function
  // of the handle to determine whether the "get" found its data before
  // using the handle.)
  iEvent.getByLabel("generalTracks", tracks);
  .....................
  .....................
}

Notes:
  • Line 1: The method analyze receives a const reference iEvent to the edm::Event object, which contains all event data.
  • Middle section: Each type of event data is held in a container, which is accessed through an edm::Handle of the matching type.
  • Last section: iEvent.getByLabel takes the module label and the handle, retrieves the data from the event, and makes them available through the handle.

No matter which way you request the data, the results of the request will be returned in a smart pointer (C++ handle) of type edm::Handle<>.
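
The handle behaviour described in the snippet comments (a failed lookup only raises an exception when the handle is dereferenced, and can be checked beforehand) can be sketched with a toy model. Everything below is an illustrative assumption rather than the edm::Handle implementation: ToyHandle, LabelledEvent, TrackPts and sumPt are hypothetical names, and a vector of doubles stands in for a track collection.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy handle: points at the product, or at nothing if the lookup failed.
template <typename T>
class ToyHandle {
public:
    ToyHandle() = default;
    explicit ToyHandle(const T* product) : product_(product) {}
    bool isValid() const { return product_ != nullptr; }
    bool failedToGet() const { return product_ == nullptr; }
    // Dereferencing a failed handle throws, mimicking the deferred exception.
    const T& operator*() const {
        if (!product_) throw std::runtime_error("product not found");
        return *product_;
    }
private:
    const T* product_ = nullptr;
};

using TrackPts = std::vector<double>;  // stand-in for a track collection

// Toy event that stores products by module label only.
class LabelledEvent {
public:
    void put(const std::string& label, TrackPts product) {
        products_[label] = std::move(product);
    }
    // Fills the handle if the label exists; otherwise leaves it failed.
    bool getByLabel(const std::string& label, ToyHandle<TrackPts>& handle) const {
        auto it = products_.find(label);
        handle = (it == products_.end()) ? ToyHandle<TrackPts>()
                                         : ToyHandle<TrackPts>(&it->second);
        return handle.isValid();
    }
private:
    std::map<std::string, TrackPts> products_;
};

// Sum the values in a collection, or return -1.0 if the collection is missing.
inline double sumPt(const LabelledEvent& ev, const std::string& label) {
    ToyHandle<TrackPts> tracks;
    ev.getByLabel(label, tracks);
    if (tracks.failedToGet()) return -1.0;  // check before using the handle
    double sum = 0.0;
    for (double pt : *tracks) sum += pt;
    return sum;
}
```

Checking failedToGet before dereferencing lets a module skip an event that lacks the requested product instead of aborting on the exception.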

See the code DemoAnalyzer.cc in Section 4.1.2 for a use case.

The Processing Model

Events are processed by passing the Event through a sequence of modules. The exact sequence of modules is specified by the user via a path statement in a configuration file. A path is an ordered list of Producer/Filter/Analyzer modules which sets the exact execution order of all the modules. When an Event is passed to a module, that module can get data from the Event and put data back into the Event. When data is put into the Event, the provenance information about the module that created the data will be stored with the data in the Event. The components involved in the framework and EDM are shown here:

fw_edm.gif

The Standard Input Source shown above uses ROOT I/O. The Event is then passed to the execution paths. The paths can then be ordered into a list that makes up the schedule for the process. Note that the same module may appear in multiple paths, but the framework guarantees that a module is executed only once per Event. Since it will ask for exactly the same products from the event and produce the same result independent of which path it is in, it makes no sense to execute it twice. On the other hand, a user designing a trigger path should not have to worry about the full schedule (which could involve hundreds of modules). Each path should be executable by itself, in that modules within the path ask only for things they know have been produced by a previous module in the same path or by the input source. In a perfect world, the order of execution of the paths should not matter. However, due to the existence of bugs, an order dependence is always possible. Such dependencies should be removed during validation of the job.
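
The once-per-Event guarantee can be illustrated with a small scheduling sketch. This is a toy model under stated assumptions, not the framework's scheduler: Path and runSchedule are hypothetical names, modules are represented only by their labels, and the effect of a Filter stopping its path is ignored.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// A path is an ordered list of module labels.
using Path = std::vector<std::string>;

// Walk the paths in schedule order for one event and run each module at most
// once. Returns the labels in the order in which they were actually executed.
inline std::vector<std::string> runSchedule(const std::vector<Path>& schedule) {
    std::set<std::string> alreadyRun;
    std::vector<std::string> executed;
    for (const Path& path : schedule) {
        for (const std::string& module : path) {
            // insert() reports whether the label was new for this event.
            if (alreadyRun.insert(module).second) {
                executed.push_back(module);
            }
        }
    }
    return executed;
}
```

With two paths that both start with the same module, that module appears once in the returned list, while the relative order inside each path is preserved.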

Framework Services

ServiceRegistry System

The ServiceRegistry is used to deliver services such as the error logger or a debugging service which provides feedback about the state of the Framework (e.g., what module is presently running). Services are informed about the present state of the Framework, e.g., the start of a new Event or the completion of a certain module. Such information is useful for producing meaningful error messages from the error logger or for debugging. The services to be used in a job and the exact configuration of those services are set in the user's configuration file via a ParameterSet. For further information look here.

Event Setup

To be able to fully process an event, one has to take into account potentially changing and periodically updated information about the detector environment and status. This information (non-event data) is not tied to a given event, but rather to the time period for which it is valid. This time period is called its interval of validity or IOV, and an IOV typically spans many events. Examples of this type of non-event data include calibrations, alignments, geometry descriptions, magnetic field and run conditions recorded during data acquisition. The IOV of one piece of non-event data is not necessarily related to that of another. The EventSetup system handles this type of non-event data for which the IOV is longer than one Event. (Note that non-Event data initiated by the DAQ, such as the Event or a Run transition, are handled by the Event system.)

The figure illustrates the varying IOVs of different non-event data (calibrations and alignments), and how their values at the time of a given event are read by the EventSetup system.

Event setup from Paolo's presentation
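
Finding the value that is valid for a given event amounts to an interval lookup. The sketch below assumes, purely for illustration, that each IOV is labelled by the first run for which it is valid and lasts until the next IOV begins; ToyConditions is a hypothetical class and not part of the CMSSW conditions interface.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Toy conditions store: each payload is valid from its first run (the map key)
// up to, but not including, the first run of the next payload.
class ToyConditions {
public:
    void add(unsigned firstRun, double value) { payloads_[firstRun] = value; }

    // Return the payload whose interval of validity contains `run`.
    double valueAt(unsigned run) const {
        auto it = payloads_.upper_bound(run);  // first IOV that starts after `run`
        if (it == payloads_.begin()) {
            throw std::out_of_range("no IOV covers this run");
        }
        --it;  // step back to the IOV that contains `run`
        return it->second;
    }

private:
    std::map<unsigned, double> payloads_;
};
```

Two independent ToyConditions objects, for example one for calibrations and one for alignments, can have completely unrelated IOV boundaries, which mirrors the statement that the IOV of one piece of non-event data is not necessarily related to that of another.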

The EventSetup system design uses two categories of modules to do its work: ESSource and ESProducer. These components are configured using the same configuration mechanism as their Event counterparts, i.e., via a ParameterSet.

ESSource
is responsible for determining the IOV of a Record (or a set of Records). (A Record is an EventSetup construct that holds data and services which have identical IOVs.) The ESSource may also deliver data/services. For example, a user can request the ECAL pedestals via an ESSource that reads the appropriate values from a database.

ESProducer
is, conceptually, an algorithm whose inputs depend on data with IOVs. The ESProducer's algorithm is run whenever there is an IOV change for the Record to which the ESProducer is bound. For example, an ESProducer is used to read the ideal geometry of the tracker as well as the alignment corrections and then create the aligned tracker geometry from those two pieces of information. This ESProducer is told by the EventSetup system to create a new aligned tracker geometry whenever the alignment changes.
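
The recompute-on-IOV-change behaviour of an ESProducer can be sketched as a cached computation. The class below is a hypothetical toy, not a real ESProducer: a single double stands in for the ideal geometry, another for the alignment correction, and the start of the alignment IOV is assumed to be enough to tell one IOV from the next.

```cpp
#include <cassert>

// Toy producer: combines an ideal position with an alignment shift and
// rebuilds the aligned value only when the alignment IOV changes.
class ToyAlignedGeometryProducer {
public:
    explicit ToyAlignedGeometryProducer(double idealPosition)
        : ideal_(idealPosition) {}

    // `alignmentIovStart` identifies the current alignment IOV and
    // `shift` is the alignment payload valid in that IOV.
    double produce(unsigned alignmentIovStart, double shift) {
        if (!hasCache_ || alignmentIovStart != cachedIov_) {
            cached_ = ideal_ + shift;  // the "algorithm" runs here
            cachedIov_ = alignmentIovStart;
            hasCache_ = true;
            ++timesRun_;
        }
        return cached_;
    }

    // Number of times the algorithm actually ran.
    int timesRun() const { return timesRun_; }

private:
    double ideal_;
    double cached_ = 0.0;
    unsigned cachedIov_ = 0;
    bool hasCache_ = false;
    int timesRun_ = 0;
};
```

Calling produce for many events inside the same alignment IOV returns the cached result; the algorithm runs again only for the first event after the IOV changes.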

For further information look here.

Provenance Tracking

The CMS Offline framework stores provenance information within CMS's standard ROOT event data files. The provenance information is used to track how every data product was constructed, including what other data products were read in order to do the construction. We record information to understand the history of how data were produced and chosen. Provenance information does not have to be sufficient to allow an exact replay of a process. Storing provenance in output files is crucial to ensure trust in the data, given the large-scale, highly distributed nature of production, especially for physicists' personal skims, which are not centrally managed. Using provenance information, one can track the source of a problem seen in one file but not in another, guarantee compatibility when reading multiple files in a job, confirm that an analysis was done using the proper data, track why two analyses get different results, etc. A good source of information is a talk by Chris Jones given at CHEP09. Also refer to WorkBook 2.3 and see http://iopscience.iop.org/1742-6596/219/3/032011.
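
The idea of tracking which data products were read to construct another can be sketched as a walk over parent links. This is a toy illustration only: ToyProvenance, ProvenanceStore and ancestors are hypothetical names, products are identified by a plain string, and the real provenance record stored in CMS files holds more than a module label and a parent list.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy provenance record for one data product.
struct ToyProvenance {
    std::string moduleLabel;           // module that created the product
    std::vector<std::string> parents;  // products read in order to build it
};

// Maps a product name to its provenance record.
using ProvenanceStore = std::map<std::string, ToyProvenance>;

// Collect every direct and indirect parent of `product`.
inline std::set<std::string> ancestors(const ProvenanceStore& store,
                                       const std::string& product) {
    std::set<std::string> seen;
    std::vector<std::string> todo{product};
    while (!todo.empty()) {
        const std::string current = todo.back();
        todo.pop_back();
        auto it = store.find(current);
        if (it == store.end()) continue;  // no record kept for this product
        for (const std::string& parent : it->second.parents) {
            // Queue each parent once, even if reached by several routes.
            if (seen.insert(parent).second) todo.push_back(parent);
        }
    }
    return seen;
}
```

Comparing the ancestor sets of the same product in two files is one simple way to picture how provenance helps explain why two analyses disagree.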

Review status

Reviewer/Editor and Date (copy from screen) Comments
SudhirMalik - 25 March 2010 Filled the information in the Provenance Tracking section
PetarMaksimovic - 23 Jul 2009 Created by moving material from the old Sec. 2.3

Responsible: SudhirMalik
Last reviewed by: SudhirMalik - 26 Nov 2009
-- AltanCakir - 09 Oct 2017

Topic revision: r17 - 2017-10-09 - AltanCakir