Chapter 9: Advanced Tools and Tasks



9.1 Introduction to Advanced Tools and Tasks

Complete: 3
Detailed Review status

This chapter presents some of the more advanced tools in CMSSW and tasks involved in analysis.

Common EDM Utilities (WorkBookEdmUtilities) summarizes the standalone utilities available for checking and searching the components of your CMSSW application.

Common Containers for EDM Objects (WorkBookCommonContainersEdm) describes containers for objects produced by CMSSW reconstruction that must be stored according to the Event Data Model (EDM), and inserted into an Event. It covers standard, OwnVector and AssociationMap containers.

Some common data types in CMSSW (GlobalPoint, GlobalVector) (WorkBookCommonDataTypes) summarizes the constructors and functions of two commonly used data types, GlobalPoint and GlobalVector.

Writing your own framework objects to a file (WorkBookEDMTutorialProducer) is a tutorial that describes how to add data to the event. It covers creating a package to hold the C++ class for the data, and then creating an EDProducer, a framework module that creates the data and places it into the Event.

Navigating the CMS Detector Geometry (SWGuideDetectorGeom) discusses the standard configuration files for XML ideal geometry access and points to the POOL Object Relational Access page which provides information on the parameter set variables needed by this means of access.

Application of Alignment and Calibration Constants (SWGuideMisAlignCalib) explains how to simulate the effect of misalignment and miscalibration in Monte Carlo studies. In particular:

  • how to misalign the tracker geometry according to misalignment scenarios;
  • how to apply custom (mis)alignment to the tracker geometry;
  • how to miscalibrate the rechit energies in the ECAL starting from uncalibrated rechits;
  • how to miscalibrate the rechit energies starting from already produced rechits.

Finding the Beam Spot (SWGuideFindingBeamSpot) describes how to determine the beam spot position in simulated events.

Information Sources

Review status

Reviewer/Editor and Date (copy from screen) Comments
AnneHeavey - 20 Dec 2006 Created page, provided initial content
JennyWilliams - 08 Jan 2007 Added WorkBookEDMTutorialProducer page to listings

Responsible: SudhirMalik
Last reviewed by: KatiLassilaPerini - 28 Feb 2008



9.2 Common EDM Utilities

Complete: 3
Detailed Review status

Goals of this page:

To let people know about all the standalone utilities available in CMSSW.

cmsglimpse

Searches through all the files in the release for a given string

 > cmsglimpse funny
CondTools/L1Trigger/interface/L1ConfigOnlineProdBase.h:    
// Explanation of funny syntax: since record is dependent, we are not
CondCore/DBCommon/test/testTokenBuilder.cc:     std::cout<<"funny error"<<std::endl;
DQM/SiStripMonitorCluster/test/batch_mtcc.sh: 
#the funny sort is done so that the files are ordered 1, 2, 3, ..., 10, 11, ..., 
#and not 1,10,11,...,2,20, and so on
Validation/Generator/bin/FeedParser.py:     
This is so much trickier than it sounds, it's not even funny.
....

edmConfigHash

Returns the Parameter Set ID for a config file. Used by the production system.

Usage: edmConfigHash <configuration file name>

edmConfigIncludeChecker

Prints out a list of all the include files used by a config file, and checks for missing includes.

edmConfigSearch

Finds instances of a search string in the configuration. The syntax is

edmConfigSearch <searchstring> <configfile>

edmConfigFromDB

Retrieve job configuration text files (*.cfg / *.cff / *.py) from the configuration database. More documentation can be found here.

edmDumpEventContent

Shows which products exist in the file:


unix> edmDumpEventContent simemu100.root
L1GlobalTriggerObjectMapRecord    "hltL1GtObjectMap"      ""            "HLT."         
L1GlobalTriggerReadoutRecord      "hltGtDigis"            ""            "HLT."         
L1MuGMTReadoutCollection          "hltGtDigis"            ""            "HLT."         
(snipped)

More information is found here.

edmEventSize

edmEventSize is a tool to measure the average size per event of each edm::Product present in an output file of a cmsRun job. Detailed documentation may be found here.

edmFileUtil

edmFileUtil is a general EDM file utility. It can be used to get the Physical File Name (PFN) from the Logical File Name (LFN):

edmFileUtil -d /store/RelVal/2007/5/31/RelVal/RelVal142SingleEPt35-1180627277/0000/B46C3E2A-2811-DC11-866E-000E0C3F0C60.root
rfio:/castor/cern.ch/cms/store/RelVal/2007/5/31/RelVal/RelVal142SingleEPt35-1180627277/0000/B46C3E2A-2811-DC11-866E-000E0C3F0C60.root

See also WorkBookDataSamples#IntroDuction. edmFileUtil has many other options as well:
> edmFileUtil -h
Allowed options:
  -h [ --help ]                print help message
  -f [ --file ] arg            data file (Required)
  -c [ --catalog ] arg         catalog
  -l [ --ls ]                  list file content
  -P [ --print ]               Print all
  -u [ --uuid ]                Print uuid
  -v [ --verbose ]             Verbose printout
  -d [ --decodeLFN ]           Convert LFN to PFN
  -b [ --printBranchDetails ]  Call Print()sc for all branches
  -t [ --tree ] arg            Select tree used with -P and -b options
  --allowRecovery              Allow root to auto-recover corrupted files
  -e [ --events ] arg          Show event ids for events within a range or set
                               of ranges , e.g., 5-13,30,60-90

edmInventory.sh

See SWGuideUsingFileCheckingTool.

edmLumisInFiles.py

Lists lumi sections and total integrated luminosity of given EDM files. See Good Lumi Section Twiki for more details.

edmMakePhDThesis

This tool is still under development. Scheduled for inclusion within the main branch in CMSSW_9_6_3.

edmPluginDump

Shows which products and producers are visible to the framework. If you've written a new plugin, this will test to see if the framework "sees" it.

edmPluginHelp

Use edmPluginHelp to get the actual parameters of a plugin. For a given module name, e.g. PoolSource, you'd do

           edmPluginHelp -p PoolSource
This gives you the exact parameters that the code validates, i.e. the ones actually used.

edmPluginRefresh

Refreshes the lists of products and producers visible to the framework. Sometimes necessary after creating a new plugin.

edmProvDump

Prints out all the tracked parameters which were used to create this file.

usage: edmProvDump [options] <root-file>
--sort - sorts the resulting dump so that it can be reliably diff'd to a dump from a different file.

More information is found here.

edmPythonSearch

usage: edmPythonSearch <string> <config>

Searches the specified Python config, as well as all the fragments imported, for the given string.

>  edmPythonSearch noise RecoTau_DiTaus_pt_20-420_cfg.py

    #    3 time bins noise (in ADC counts)
RecoLocalMuon.CSCRecHitD.cscRecHitD_cff (line: 37)
From RecoTau_DiTaus_pt_20-420_cfg -> Configuration.StandardSequences.Reconstruction_cff -> 
RecoLocalMuon.Configuration.RecoLocalMuon_cff -> RecoLocalMuon.CSCRecHitD.cscRecHitD_cfi


# Addition of HCAL noise by JP Chou
RecoMET.Configuration.RecoMET_cff (line: 11)
From RecoTau_DiTaus_pt_20-420_cfg -> Configuration.StandardSequences.Reconstruction_cff

...

edmPythonTree

Dumps a color-coded list of all the configuration fragments included by this configuration.

 > edmPythonTree CSCDigitizerTest_cfg.py
 +  CSCDigitizerTest_cfg
   +  Configuration.StandardSequences.FrontierConditions_GlobalTag_cff
     +  CalibCalorimetry.EcalLaserCorrection.ecalLaserCorrectionService_cfi
     +  CalibCalorimetry.HcalPlugins.Hcal_Conditions_forGlobalTag_cff
     +  CalibTracker.Configuration.Tracker_DependentRecords_forGlobalTag_nofakes_cff
       +  CalibTracker.SiStripESProducers.SiStripGainESProducer_cfi
       +  CalibTracker.SiStripESProducers.SiStripQualityESProducer_cfi
       +  RecoLocalTracker.SiStripRecHitConverter.SiStripRecHitMatcher_cfi
       +  RecoLocalTracker.SiStripRecHitConverter.StripCPEfromTrackAngle_cfi
         +  RecoLocalTracker.SiStripRecHitConverter.OutOfTime_cff
     +  Configuration.StandardSequences.FrontierConditions_GlobalTag_cfi
       +  CondCore.DBCommon.CondDBSetup_cfi
   +  Configuration.StandardSequences.GeometryPilot2_cff
     +  Geometry.CMSCommonData.cmsPilot2IdealGeometryXML_cfi
     +  Geometry.CSCGeometryBuilder.idealForDigiCscGeometry_cff
       +  Alignment.CommonAlignmentProducer.fakeForIdealAlignmentProducer_cfi
....

edmCopyPickMerge

This utility is located in the package PhysicsTools/Utilities, tag V08-01-01 or higher (this tag is backwards compatible with CMSSW 3.6 or later). It can be used to copy or pick events or merge edm files. For more details, please look at WorkBookDataSamples and WorkBookPickEvents.

Review status

Reviewer/Editor and Date (copy from screen) Comments
CMSUserSupport - 13 Jul 2007 created the template page
RickWilkinson - 25 Feb 2008 try to make more user-centric, and flag obsolete tools
ElizabethSextonKennedy - 02 Jul 2008 more obsolete tools
-- CharlesPlager - 10 Jun 2009 Updated edmDump
Main.William.Tanenbaum - 25 Aug 2009 more obsolete tools
Responsible: RickWilkinson
Last reviewed by:

9.3 Common Containers for EDM Objects

Complete: 5
Detailed Review status

Introduction

An object produced by CMSSW reconstruction must be stored according to the Event Data Model (EDM) (see creating a new Event Data Product) and inserted into the Event. Most of the simplest objects can be stored just as:

std::vector<T>

where T is the object type.

In cases where more advanced features are required, more specialized containers are appropriate. Some common generic containers have been developed for the most common of these cases.

The class templates are defined under CMSSW in the package:

DataFormats/Common

The main generic containers are documented in the pages linked below.

OwnVector Container

OwnVector stores a collection of polymorphic objects that are automatically destroyed at the end of the event processing.

AssociationMap Container

The AssociationMap template implements different types of associations between objects stored in different collections. Object references are stored internally as indices of objects in existing collections, and are usable via edm::Ref<...>.

AssociationVector Container

The AssociationVector template implements simple one-to-one associations stored by value in the container. It is a lighter alternative to a one-to-value AssociationMap in cases where all objects in a collection have an associated quantity.

Review status

Reviewer/Editor and Date (copy from screen) Comments
LucaLista - 27 Mar 2006 added AssociationVector
ChrisDJones - 06 Dec 2006 corrected spelling
AnneHeavey - 12 Oct 2006 added Luca's page to workbook
LucaLista - 11 Oct 2006 created page

Responsible: LucaLista
Last reviewed by: PetarMaksimovic - 28 Feb 2008



9.3.1 OwnVector Container

Complete: 5
Detailed Review status

Introduction

In some cases, it may be necessary to store a collection of objects belonging to different class types inheriting from a common base class. Such a type of collection is called polymorphic. Storing polymorphic objects can't be done just using:

  • std::vector<T *>,
where T is the base class type. This is because once the std::vector<T *> is inserted in the event, the event takes ownership of the container, but not of the objects; the objects are thus not destroyed automatically when the collection is destroyed.

A container that would automatically destroy the polymorphic objects is a boost::ptr_vector<T>, but its persistent capabilities with the EDM have not been proven.

To destroy the contained objects automatically when the event is destroyed, a specific generic container template has been designed, OwnVector:

  • edm::OwnVector<T, P>

where T is the base class type and P is the clone policy. OwnVector is described below.

OwnVector Interface

The OwnVector interface is very similar to the std::vector interface. In order to insert a new object in the container, the pointer has to be passed to the push_back function:

  edm::OwnVector<MyBaseType> v;
  v.push_back( new MyConcreteType1( ... ) );
  v.push_back( new MyConcreteType2( ... ) );

Where MyConcreteType1 and MyConcreteType2 are two concrete class types that inherit from the base class MyBaseType.

Note that v takes ownership of the passed objects, so the passed pointer can't be used after push_back is called. For instance, the following assertion will pass:

  edm::OwnVector<MyBaseType> v;
  std::unique_ptr<MyBaseType> obj = std::make_unique<MyConcreteType1>( ... );
  v.push_back( std::move(obj) ); // obj is set to null here
  assert( obj == nullptr );

Using std::unique_ptr, as in this example, makes the ownership policy explicit in the code.
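The ownership transfer can be reproduced outside CMSSW with standard-library types only. The sketch below is not OwnVector itself: it assumes std::vector<std::unique_ptr<Base>> as a stand-in for an owning container, and Base, Derived, liveCount and demonstrateOwnershipTransfer are hypothetical names introduced purely for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base class that counts how many objects are alive.
struct Base {
  static int liveCount;
  Base() { ++liveCount; }
  virtual ~Base() { --liveCount; }
};
int Base::liveCount = 0;

// Hypothetical concrete type.
struct Derived : Base {};

// Walks through the ownership transfer step by step.
void demonstrateOwnershipTransfer() {
  {
    std::vector<std::unique_ptr<Base>> v;  // stand-in for an owning container
    std::unique_ptr<Base> obj = std::make_unique<Derived>();
    assert(Base::liveCount == 1);

    v.push_back(std::move(obj));  // ownership moves into the container
    assert(obj == nullptr);       // the source pointer is now null
    assert(v.size() == 1 && Base::liveCount == 1);
  }  // v goes out of scope and deletes the object it owns
  assert(Base::liveCount == 0);
}
```

Calling demonstrateOwnershipTransfer() runs all the checks; none of them should fire, because a moved-from std::unique_ptr is guaranteed to be null and the container deletes what it owns.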

One important difference from std::vector is that, as for boost::ptr_vector<T>, some of the STL algorithms can't be used on polymorphic containers. Below is a sentence taken from the Boost documentation:

  • "Unfortunately it is not possible to use pointer containers with mutating algorithms from the standard library"

In particular, the most used algorithm, sort, has been implemented, as for boost::ptr_vector<T>, as a member function. So, the following code can be used to sort an OwnVector:

  edm::OwnVector<MyBaseType> v;
  v.push_back( new MyConcreteType1( ... ) );
  v.push_back( new MyConcreteType2( ... ) );
  // . . .

  // sort using the < operator between 
  // two objects of type MyBaseType
  v.sort(); 

  // uses a custom comparator object
  v.sort( MyComparator() ); 
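To make the idea of a member-style sort concrete, here is a minimal standalone sketch, assuming a std::vector of std::unique_ptr rather than OwnVector. It orders the container by comparing the pointed-to objects while moving only the pointers, so no polymorphic object is ever copied or sliced. Item, Small, Large, ItemVector and sortByPointee are hypothetical names, and this is not how OwnVector::sort is necessarily implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical polymorphic hierarchy ordered by a virtual key.
struct Item {
  virtual ~Item() = default;
  virtual double key() const = 0;
};
struct Small : Item { double key() const override { return 1.0; } };
struct Large : Item { double key() const override { return 5.0; } };

// Ordering between two objects of the base type.
bool operator<(const Item& a, const Item& b) { return a.key() < b.key(); }

using ItemVector = std::vector<std::unique_ptr<Item>>;

// Sort using operator< on the pointees; only the pointers are rearranged.
void sortByPointee(ItemVector& v) {
  auto less = [](const std::unique_ptr<Item>& a, const std::unique_ptr<Item>& b) {
    return *a < *b;
  };
  std::sort(v.begin(), v.end(), less);
  assert(std::is_sorted(v.begin(), v.end(), less));
}

// Sort using a caller-supplied comparator that takes base-class references.
template <typename Compare>
void sortByPointee(ItemVector& v, Compare comp) {
  std::sort(v.begin(), v.end(),
            [&comp](const std::unique_ptr<Item>& a, const std::unique_ptr<Item>& b) {
              return comp(*a, *b);
            });
}

// Builds a small container that is deliberately out of order.
ItemVector makeUnsorted() {
  ItemVector v;
  v.push_back(std::make_unique<Large>());
  v.push_back(std::make_unique<Small>());
  return v;
}
```

After sortByPointee(v) on the result of makeUnsorted(), the Small item comes first; passing a comparator that reverses the arguments gives the opposite order.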

Clone Policies and OwnVector

In order to clone an OwnVector, the correct policy for cloning the contained objects has to be specified. This is done using a second template parameter type in OwnVector.

By default, the policy is ClonePolicy, which calls a method called clone() that is assumed to be defined (in most cases as pure virtual) in the base class.

It is possible to use any user-defined clone policy by implementing the following interface:

    static T * clone( const T & t );
where T is the base class type.
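As an illustration of how a clone policy with this interface might be used, the sketch below defines a small owning container whose copy constructor deep-copies its elements through the policy. It is a simplified model under stated assumptions, not the OwnVector implementation: Shape, Triangle, Square, VirtualClonePolicy and OwningVector are hypothetical names, and exception safety is deliberately ignored to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical base class with a virtual clone(), as a default policy would assume.
struct Shape {
  virtual ~Shape() = default;
  virtual Shape* clone() const = 0;
  virtual int sides() const = 0;
};
struct Triangle : Shape {
  Shape* clone() const override { return new Triangle(*this); }
  int sides() const override { return 3; }
};
struct Square : Shape {
  Shape* clone() const override { return new Square(*this); }
  int sides() const override { return 4; }
};

// Policy exposing the interface quoted above: static T* clone(const T&).
template <typename T>
struct VirtualClonePolicy {
  static T* clone(const T& t) { return t.clone(); }
};

// Minimal owning container; copies are made element by element through P.
template <typename T, typename P = VirtualClonePolicy<T>>
class OwningVector {
public:
  OwningVector() = default;
  OwningVector(const OwningVector& other) {
    data_.reserve(other.data_.size());
    for (const T* p : other.data_) {
      data_.push_back(P::clone(*p));  // the policy decides how to copy
    }
  }
  OwningVector& operator=(const OwningVector&) = delete;  // omitted for brevity
  ~OwningVector() {
    for (T* p : data_) {
      delete p;  // the container owns its elements
    }
  }
  void push_back(T* p) {
    assert(p != nullptr);
    data_.push_back(p);
  }
  std::size_t size() const { return data_.size(); }
  const T& operator[](std::size_t i) const { return *data_[i]; }

private:
  std::vector<T*> data_;
};
```

Copying an OwningVector<Shape> that holds a Triangle and a Square yields a second container with distinct objects of the same concrete types, which is what the policy is there to guarantee.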

Generating OwnVector Dictionaries

In order to create a dictionary to insert OwnVector objects in the event, the following guidelines should be followed:

  1. the base class and all subclass types should be added to the dictionary
  2. std::vector<MyBaseType *>, the underlying container, should be added
  3. edm::OwnVector<MyBaseType> should be added
  4. edm::Wrapper<edm::OwnVector<MyBaseType> > should be added
  5. if the type MyBaseType does not support the " < " operator, the sort() function should be excluded from the dictionary generation
  6. in order to allow automatic loading of the shared libraries containing the concrete subtypes, it may be necessary to add the dictionaries of the concrete types, like: edm::Wrapper<MyConcreteType1>. This is the case if the concrete types' dictionaries (MyConcreteType1, MyConcreteType2, ...) are contained in a separate library w.r.t. the base type (MyBaseType).

An example of dictionary generation is the following:

<lcgdict>
<selection>
  <class name="MyBaseType" />
  <class name="MyConcreteType1" />
  <class name="MyConcreteType2" />
  . . .
  <class name="edm::OwnVector<reco::MyBaseType, edm::ClonePolicy<reco::MyBaseType> >" />
  <class name="edm::Wrapper<edm::OwnVector<reco::MyBaseType, 
                                               edm::ClonePolicy<reco::MyBaseType> > >" />
</selection>
<exclusion>
  <class name="edm::OwnVector<reco::MyBaseType, edm::ClonePolicy<reco::MyBaseType> >">
    <method name="sort" />
  </class>
</exclusion>
</lcgdict>

Review status

Reviewer/Editor and Date (copy from screen) Comments
AnneHeavey - 12 Oct 2006 copied Luca's page to workbook; minor editing
LucaLista- 11 Oct 2006 created page

Responsible: LucaLista
Last reviewed by: PetarMaksimovic 28 Feb 2008



9.3.2 AssociationMap Container

Complete: 5
Detailed Review status

Introduction

In many cases it is convenient to associate quantities to objects existing in a collection. This is true for different reasons, some of which could be:

  • to add extra information to an object without modifying its structure
  • to reprocess only the associated quantities without needing to reprocess the main object
  • to make it possible to drop extra information when writing to disk in order to save disk space

We provide a generic association map implementation that uses EDM persistent references (edm::Ref<..>) in the template:

  • AssociationMap<T>

where T is one of several possible helper classes described below.

AssociationMap maps objects already existing in a collection to other objects that could either be stored in a different collection (association by reference) or stored inside the map itself (association by value).

Different Types of AssociationMap

Different types of AssociationMap are supported; the type is selected through the template argument T in AssociationMap<T>.

The supported types for the parameter T are:

  • edm::OneToValue <CKey, Val, index>: associates an object of type Val to an object in a collection of type CKey. The associated object is contained by value in the map. The reference to the object in the collection of type CKey is stored as an index of type index, which, by default, is the type unsigned int. Shorter indices can be used for collections with a small number of objects in order to save disk space.
  • edm::OneToOne <CKey, CVal, index>: associates an object in a collection of type CVal to an object in a collection of type CKey. The association is stored by reference in the map. As before, the reference to the object in the collection of type CKey is stored as an index of type index.
  • edm::OneToMany <CKey, CVal, index>: associates many objects in a collection of type CVal to an object in a collection of type CKey.
  • edm::OneToManyWithQuality <CKey, CVal, Q, index>: associates many objects in a collection of type CVal to an object in a collection of type CKey. The association is stored by reference in the map in conjunction with an object of type Q that is intended to measure the quality of the match. The class Q should support the operator " < ".

Data Storage in an AssociationMap

With the exception of data stored by value in an AssociationMap<OneToValue> type, references to objects are stored as indices, which are of type unsigned int by default. These indices can be of a narrower type if they refer to collections with a small number of objects, which may help save disk space. For instance, if the collection has fewer than 65536 objects you can use unsigned short, and if it has fewer than 256 objects you can use unsigned char.
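The rule of thumb for choosing an index type can be written down as a small helper. This is only a sketch of the arithmetic: narrowestIndexBytes and rawIndexBytes are hypothetical functions, the thresholds are the ones quoted in the text, and the byte counts are raw in-memory sizes that ignore any compression applied when the file is written.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// The thresholds below rely on these minimum ranges.
static_assert(std::numeric_limits<unsigned char>::max() >= 255, "unsigned char too narrow");
static_assert(std::numeric_limits<unsigned short>::max() >= 65535, "unsigned short too narrow");

// Size in bytes of the narrowest index type suggested for a collection
// holding `collectionSize` objects.
std::size_t narrowestIndexBytes(unsigned long long collectionSize) {
  if (collectionSize < 256ULL) {
    return sizeof(unsigned char);
  }
  if (collectionSize < 65536ULL) {
    return sizeof(unsigned short);
  }
  return sizeof(unsigned int);  // the default index type
}

// Raw (uncompressed) bytes spent on n indices into such a collection.
std::size_t rawIndexBytes(std::size_t n, unsigned long long collectionSize) {
  assert(collectionSize > 0 || n == 0);  // cannot index into an empty collection
  return n * narrowestIndexBytes(collectionSize);
}
```

For example, 1000 indices into a collection of 200 objects need 1000 bytes with unsigned char, against 1000 * sizeof(unsigned int) with the default type.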

A reference to the collection as a whole (edm::RefProd<...>) contains the product identifier of the collection (again an unsigned integer index). This is stored only once in the AssociationMap.

So, the total size of a map containing N associations is determined by:

  • one product identifier for the main collection of type CKey
  • one product identifier for the associated collection of type CVal (not stored for OneToValue map)
  • N indices for the associated objects in the collection of type CKey
  • either:
    • N objects of type Val, for OneToValue map
    • N indices for the associated objects from the collection of type CVal for OneToOne map
    • N vectors of indices for the type OneToMany map
    • N vectors of indices plus quality Q pair for the type OneToManyWithQuality map

AssociationMap Interface

The AssociationMap interface is designed in a way similar to std::map, from the Standard C++ Library.

Whenever the associated object is accessed by reference, a persistent edm::Ref<..> is used in place of a C++ reference.

The following example shows how to fill an AssociationMap and retrieve objects from it. It associates multiple track references to each jet:

  typedef 
    edm::AssociationMap<
      edm::OneToMany<CaloJetCollection, TrackCollection> 
    > JetTracksMap;

  JetTracksMap map;

  CaloJetRef rj = ...; 
  TrackRef rt = ...;
  map.insert( rj, rt );
  TrackRefVector tracks = map[ rj ];

  JetTracksMap::const_iterator i = map.find( rj );
  assert( i != map.end() );
  const CaloJetRef & jet1 = i->key;
  const TrackRefVector & tracks1 = i->val;
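The index-based storage described earlier can be mimicked with standard containers. The sketch below is a simplified model, not the AssociationMap implementation: Jet, Track and JetTrackIndexMap are hypothetical stand-ins, plain vector indices play the role of edm::Ref, and the internal std::map<unsigned int, std::vector<unsigned int> > mirrors the internal map type listed for OneToMany.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-ins for two collections already stored in the event.
struct Jet { double et; };
struct Track { double pt; };

// One-to-many association kept as: key index -> list of value indices.
class JetTrackIndexMap {
public:
  JetTrackIndexMap(const std::vector<Jet>& jets, const std::vector<Track>& tracks)
      : jets_(&jets), tracks_(&tracks) {}

  // Associate one more track with a jet; both are identified by index.
  void insert(unsigned int jetIndex, unsigned int trackIndex) {
    assert(jetIndex < jets_->size() && trackIndex < tracks_->size());
    map_[jetIndex].push_back(trackIndex);
  }

  // Resolve the stored indices back to the track objects.
  std::vector<const Track*> tracksFor(unsigned int jetIndex) const {
    std::vector<const Track*> out;
    auto it = map_.find(jetIndex);
    if (it != map_.end()) {
      for (unsigned int t : it->second) {
        out.push_back(&(*tracks_)[t]);
      }
    }
    return out;
  }

  // Number of jets that have at least one associated track.
  std::size_t size() const { return map_.size(); }

private:
  const std::vector<Jet>* jets_;      // plays the role of the key collection reference
  const std::vector<Track>* tracks_;  // plays the role of the value collection reference
  std::map<unsigned int, std::vector<unsigned int>> map_;
};
```

Only the two collection references and the indices are held by the map; the jets and tracks themselves stay in their own collections, which must outlive the map.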

AssociationMap Interactive Access

AssociationMap provides two methods, keys() and values(), that return purely transient vectors filled with object pointers (or values for OneToValue maps). For instance, if you have a branch called "assoc" containing an association of type:

  AssociationMap<OneToOne<TrackCollection, SuperClusterCollection> >

you could plot the track momentum vs super-cluster energy with:

   Events.Draw("assoc.keys().pt():assoc.values().energy()");

WARNING: the above example gives problems in recent releases, probably because of a ROOT bug. An error message like the following can be produced:

root [9] Events.Draw("electronMatch.keys()")
Error: class,struct,union or type constreco not defined  (tmpfile):1:
Error: class,struct,union or type constreco not defined  _vector.h:49:
Error: Illegal pointer operation (tovalue) (tmpfile):1:
*** Interpreter error recovered ***
This has been fixed in ROOT; the fix is due in the December 2007 release.

In order to use the above interactive ROOT command, you need to create, in addition to the dictionary of the AssociationMap, also a dictionary for the types:

    std::vector<const reco::Track *>
    std::vector<const reco::SuperCluster *>

The following transient vector types are returned by the methods keys() and values() respectively, and require a dictionary if you want to allow interactive access:

  • for AssociationMap<OneToOne<K, V> > :
       std::vector<const K::value_type *>
       std::vector<const V::value_type *>
  • for AssociationMap<OneToValue<K, V> > :
       std::vector<const K::value_type *>
       std::vector<V> 
  • for AssociationMap<OneToMany<K, V> > :
       std::vector<const K::value_type *>
       std::vector<std::vector<const V::value_type *> >
  • for AssociationMap<OneToManyWithQuality<K, V, Q> > :
       std::vector<const K::value_type *>
       std::vector<std::vector<std::pair<const V::value_type *, Q> > >

Generating AssociationMap Dictionaries

If you wish to store an AssociationMap in the event, it is mandatory to define a dictionary of the map type you are using.

In order to create a dictionary of an AssociationMap type, the following guidelines should be followed:

  • references to products (collections) are stored using the template type edm::helpers::KeyVal<CKey, CVal> for all maps except for OneToValue, which needs only one reference and uses the template edm::helpers::Key<CKey>. Those template specializations should be added to the dictionary
  • the internally stored map type should be added to the dictionary if not already defined in the DataFormats/Common library. In particular:
    • AssociationMap<OneToValue<CKey, Val, index> > requires std::map<index, Val> that is already defined in DataFormats/Common for some trivial cases of the type Val
    • AssociationMap<OneToOne<CKey, CVal, index> > requires std::map<index, index> that in most of the cases is already defined in DataFormats/Common library
    • AssociationMap<OneToMany<CKey, CVal, index> > requires std::map<index, std::vector<index> > that in most of the cases is already defined in DataFormats/Common library
    • AssociationMap<OneToManyWithQuality<CKey, CVal, Q, index> > requires std::map<index, std::vector<std::pair<index, Q> > >
  • the type edm::AssociationMap<...> should be added declaring the field transientMap_ as transient data member
  • the wrapper edm::Wrapper<edm::AssociationMap<...> > should be added, as for any EDM type

The following dictionary is also required for OneToValue map:

  • edm::helpers::KeyVal<edm::Ref<CKey>, Val>

The table below summarises the internally stored map type for the different association map types.

| map type | required internal map | availability of dictionary in DataFormats/Common |
| OneToValue<CKey, Val, index> | std::map<index, Val> | defined for Val identical to index and of type unsigned long, unsigned int, unsigned short |
| OneToOne<CKey, CVal, index> | std::map<index, index> | defined for index of type unsigned long, unsigned int, unsigned short |
| OneToMany<CKey, CVal, index> | std::map<index, std::vector<index> > | defined for index of type unsigned long, unsigned int, unsigned short |
| OneToManyWithQuality<CKey, CVal, Q, index> | std::map<index, std::vector< std::pair< index, Q > > > | not available |

An example of dictionary generation is the following. It associates many tracks to a jet, and is inspired by DataFormats/BTauReco:

<lcgdict>
  <class name="edm::helpers::KeyVal<edm::RefProd<std::vector<reco::CaloJet> >,
                       edm::RefProd<std::vector<reco::Track> > >" />
  <class name="edm::AssociationMap<edm::OneToMany<std::vector<reco::CaloJet>, 
                       std::vector<reco::Track>, unsigned int > >">
    <field name="transientMap_" transient="true" />
  </class>
  <!-- the dictionary for std::map<unsigned int, std::vector<unsigned int> > is not needed
         because it is defined in DataFormats/Common library -->
</lcgdict>

Review status

Reviewer/Editor and Date (copy from screen) Comments
AnneHeavey - 12 Oct 2006 moved page to workbook; minor edits and major questions!
LucaLista - 11 Oct 2006 created page

Responsible: LucaLista
Last reviewed by: PetarMaksimovic 28 Feb 2008



9.3.3 AssociationVector Container

Complete: 5
Detailed Review status

Introduction

AssociationVector<KeyRefProd, CVal> stores internally:

  • a container of type CVal storing the associated quantities. This could be as simple as a std::vector<float>.
  • a reference to a key collection (typically an edm::RefProd<...>)

The container interface enforces that the stored container (CVal) has the same size as the key collection referred to by KeyRefProd.

AssociationVector can be used as a lighter alternative to AssociationMap<OneToValue<...> > if every object in a collection has an associated quantity that can be stored by value in the association container.

AssociationVector Interface

AssociationVector<KeyRefProd, CVal> assumes there is a collection of objects of type Key already stored in the event, to which a reference object of type KeyRefProd refers, and stores internally a collection CVal containing the same number of entries as that collection.

AssociationVector<KeyRefProd, CVal> has an interface very similar to std::vector<std::pair<Key, Val> >, where Val is the object type contained in CVal, Key is a reference to an object in the collection referred to by KeyRefProd.

An example of usage of AssociationVector is the following:

  typedef AssociationVector<MuonRefProd, vector<double> > MuonIsolationCollection;
  
  Handle<MuonCollection> muons;
  event.getByLabel( "muons", muons );

  // create and fill the association vector
  MuonIsolationCollection isolations( MuonRefProd( muons ) );
  for( size_t i = 0; i < muons->size(); ++ i ) {
    isolations[ i ] = ....;
  }

  // read the association vector
  for( size_t i = 0; i < isolations.size(); ++ i ) {
    MuonIsolationCollection::value_type entry = isolations[ i ];
    MuonRef mu = entry.first;
    double iso = entry.second;
  }
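The "same size as the key collection" invariant can be illustrated with a small standalone class. This is a sketch under stated assumptions rather than the AssociationVector code: Muon and ParallelAssociation are hypothetical names, a plain pointer to the key vector stands in for the edm::RefProd, and a pointer to the key object stands in for the edm::Ref returned with each value.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical key type already stored in a collection.
struct Muon { double pt; };

// Values stored by value, one per key, in the same order as the key collection.
template <typename Key, typename Val>
class ParallelAssociation {
public:
  // The value container is sized from the key collection, so both always match.
  explicit ParallelAssociation(const std::vector<Key>& keys)
      : keys_(&keys), values_(keys.size()) {}
  // Refuse temporaries: the key collection must outlive the association.
  explicit ParallelAssociation(std::vector<Key>&&) = delete;

  std::size_t size() const { return values_.size(); }

  void setValue(std::size_t i, const Val& v) {
    assert(i < values_.size());
    values_[i] = v;
  }

  // Returns (pointer to key, associated value), similar in spirit to a pair<Key, Val>.
  std::pair<const Key*, Val> operator[](std::size_t i) const {
    assert(i < values_.size());
    return {&(*keys_)[i], values_[i]};
  }

private:
  const std::vector<Key>* keys_;
  std::vector<Val> values_;
};
```

Because there is no per-entry key index, the storage cost is just the values plus one reference to the key collection, which is why this layout is lighter than a map when every key has a value.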

AssociationVector Interactive Access

Given the simple internal structure of AssociationVector, interactive access is rather simple. The above example, modeling muon isolation, can be plotted against muon momentum as follows:

  Events.Draw( "isolations.data_:globalMuons.pt()" );

Dictionary Generation

AssociationVector<KeyRefProd, CVal> requires the dictionary generation of:

  • KeyRefProd,
  • CVal, and
  • std::pair<Key, Val>

where Key is a reference to an object in the collection referred to by KeyRefProd.

Moreover, the dictionary of the class itself has to be generated declaring the fields transientVector_  and fixed_  (Note: from CMSSW 7 the class template doesn't declare a fixed_ member anymore) as transient data members. Of course, the dictionary of edm::Wrapper<edm::AssociationVector<...> > has to be generated as well.

As an example, for the type:

  • edm::AssociationVector<edm::RefProd<std::vector<reco::Muon> >, std::vector<float> >

The following dictionary directives are needed:

  <class name="edm::AssociationVector<edm::RefProd<std::vector<reco::Muon> >,
                                      std::vector<float>,
                                      edm::Ref<std::vector<reco::Muon>,
                                               reco::Muon,
                                               edm::refhelper::FindUsingAdvance<std::vector<reco::Muon>,
                                                                                reco::Muon> >,
                                      unsigned int>">
    <field name="transientVector_" transient="true"/>
  </class>
  <class name="std::pair<edm::Ref<std::vector<reco::Muon>,
                                  reco::Muon,
                                  edm::refhelper::FindUsingAdvance<std::vector<reco::Muon>,
                                                                   reco::Muon> >,
                         float>"/>
  <class name="std::vector<std::pair<edm::Ref<std::vector<reco::Muon>,
                                              reco::Muon,
                                              edm::refhelper::FindUsingAdvance<std::vector<reco::Muon>,
                                                                               reco::Muon> >,
                                     float> >"/>
  <class name="edm::Wrapper<edm::AssociationVector<edm::RefProd<std::vector<reco::Muon> >,
                                                   std::vector<float>,
                                                   edm::Ref<std::vector<reco::Muon>,
                                                            reco::Muon,
                                                            edm::refhelper::FindUsingAdvance<std::vector<reco::Muon>,
                                                                                             reco::Muon> >,
                                                   unsigned int> >"/>

Review Status

Editor/Reviewer and date Comments
LucaLista - 26 Apr 2007 Updated to release 1_5_0_pre1
LucaLista - 23 Apr 2007 Page author and page content last edited

Responsible: LucaLista
Last reviewed by: PetarMaksimovic 28 Feb 2008



9.4 Commonly used vector/matrix classes in CMSSW (GlobalPoint, LorentzVector, etc.)

Complete: 5
Detailed Review status

Goal of this page

This page describes commonly used vector/matrix classes in CMSSW. These use heavily templated classes, so it is not straightforward to discover their functionality by direct reading of the code. (The templating is used for generality, and would make it trivial to define, for example, analogous classes using double precision rather than float).

GlobalPoint and GlobalVector

The main constructors and functions of GlobalPoint and GlobalVector are summarized here. This is a fairly complete list. These classes are based on PV3DBase (see http://cmslxr.fnal.gov/source/DataFormats/GeometryVector/interface/PV3DBase.h or https://cmssdt.cern.ch/lxr/source/DataFormats/GeometryVector/interface/PV3DBase.h ), which lists additional functions inside them.

GlobalPoint and GlobalVector are classes in DataFormats/GeometryVector for representing, with float precision, a 3-dim space point and 3-dim direction vector respectively in the CMS global coordinate system. They are widely used in track reconstruction code.

Similar classes called LocalPoint and LocalVector represent points and vectors in the local coordinate system of a given Detector Unit. They are not described here, but have an entirely analogous set of functions (due once again to templating). Transforming between global and local coordinates is a functionality of the GeomDet class, the fundamental base class of many tracking subdetector units in CMS.

GlobalPoint constructors

constructor what it does
GlobalPoint( float x, float y, float z ); // construct from Cartesian x, y, z coordinates
GlobalPoint(); // effectively GlobalPoint( 0., 0., 0. )
GlobalPoint( float x, float y ); // effectively GlobalPoint( x, y, 0. )
GlobalPoint( Polar(float theta, float phi, float r) ); // construct from spherical polar coordinates, phi angular range (-pi,+pi]
GlobalPoint( Cylindrical(float r, float phi, float z) ); // construct from cylindrical coordinates
GlobalPoint( GlobalPoint gp ); // copy constructor

GlobalPoint functions

function what it does
GlobalPoint& operator+=( const GlobalVector& gv ); // shifts a point - the Cartesian components are added
GlobalPoint& operator-=( const GlobalVector& gv ); // shifts a point - the Cartesian components are subtracted
   
float x() const; // Cartesian x component
float y() const; // Cartesian y component
float z() const; // Cartesian z component
float mag2() const; // x*x + y*y + z*z
float mag() const; // sqrt( x*x + y*y + z*z )
float perp2() const; // x*x + y*y
float perp() const; // sqrt( x*x + y*y )
float transverse() const; // same as perp()
Geom::Phi phi() const; // azimuthal phi, radians, range (-pi, +pi]
Geom::Theta theta() const; // polar theta, radians, range [0, pi]
float eta() const; // pseudorapidity

Notes on GlobalPoint functions

i) BEWARE! Geom::Phi enforces its range to be (-pi, +pi] and defines its own operators, e.g. operator*, which may surprise you. For example, scaling phi() by hand does not do what you probably intend:

gp.phi() * 180./3.14159; // gp is a GlobalPoint

This does not give you the phi in degrees, because the result is again a Geom::Phi and is therefore kept within (-pi, +pi].

But Geom::Phi has a function degrees() which you can use instead of doing your own scaling:

float ang3 = gp.phi().degrees(); // returns the phi in degrees

There's a matching function to return the value in radians:

float ang2 = gp.phi().value();    // returns the angle in radians

However Phi has a type conversion operator so that the following simple call works:

float ang1 = gp.phi();          // returns the angle in radians using implicit 
                                 // type conversion to template type (float)
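
The pitfall in note i) can be reproduced outside CMSSW. The following is a minimal standalone Python sketch, not CMSSW code: normalize_phi, naive_degrees and degrees are hypothetical helper names, and the sketch assumes that Geom::Phi wraps the result of a multiplication back into (-pi, +pi], as described above.

```python
import math

def normalize_phi(phi):
    """Wrap an angle in radians into the half-open range (-pi, +pi]."""
    wrapped = math.fmod(phi, 2.0 * math.pi)  # now strictly inside (-2pi, +2pi)
    if wrapped > math.pi:
        wrapped -= 2.0 * math.pi
    elif wrapped <= -math.pi:
        wrapped += 2.0 * math.pi
    return wrapped

def naive_degrees(phi):
    """Scale by 180/pi, then wrap as if the product were still an angle in radians
    (assumed behaviour of multiplying a Geom::Phi by a number)."""
    return normalize_phi(phi * 180.0 / math.pi)

def degrees(phi):
    """Plain conversion to degrees, with no wrapping applied afterwards."""
    return phi * 180.0 / math.pi
```

For phi = 1 radian, degrees() gives about 57.3, while naive_degrees() gives a value inside (-pi, +pi] that has no meaning as an angle in degrees.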

ii) phi() and theta() use atan2 to calculate their values.

iii) eta() is calculated as

{ float x(z()/perp()); return log(x+sqrt(x*x+1));}
which is claimed to be faster than the direct -log( tan( theta()/2. ) ). It does not check for a zero transverse component; in that case the behavior is that of a floating-point divide-by-zero, i.e. system-dependent.
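
The equivalence of the two eta expressions in note iii) can be checked numerically. This is a standalone Python sketch of the arithmetic only, not CMSSW code; the function names are hypothetical, and the explicit guard for a zero transverse component is an addition made here for illustration (the formula quoted above has no such check).

```python
import math

def eta_from_components(x, y, z):
    """Pseudorapidity via log(t + sqrt(t*t + 1)) with t = z / perp, i.e. asinh(t)."""
    perp = math.hypot(x, y)
    if perp == 0.0:
        # Guard added for this sketch: return +/-infinity along the z axis,
        # and 0.0 for the null vector (an arbitrary choice).
        return math.copysign(math.inf, z) if z != 0.0 else 0.0
    t = z / perp
    return math.log(t + math.sqrt(t * t + 1.0))

def eta_from_theta(theta):
    """Textbook definition: eta = -log(tan(theta / 2))."""
    return -math.log(math.tan(theta / 2.0))
```

For a point at (1, 0, 1), theta is atan2(perp, z) = pi/4 and both functions agree to floating-point precision.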

GlobalVector constructors

constructor what it does
GlobalVector( float x, float y, float z ); // construct from Cartesian x, y, z coordinates
GlobalVector(); // effectively GlobalVector( 0., 0., 0. )
GlobalVector( float x, float y ); // effectively GlobalVector( x, y, 0. )
GlobalVector( Polar(float theta, float phi, float r) ); // construct from spherical polar coordinates, phi angular range (-pi,+pi]
GlobalVector( Cylindrical(float r, float phi, float z) ); // construct from cylindrical coordinates
GlobalVector( GlobalVector gv ); // copy constructor

GlobalVector functions

function what it does
GlobalVector& operator+=( const GlobalVector& gv ); // adds a vector - the Cartesian components are added
GlobalVector& operator-=( const GlobalVector& gv ); // subtracts a vector - the Cartesian components are subtracted
GlobalVector& operator*=( float ); // multiply by a scalar
GlobalVector& operator/=( float ); // divide by a scalar
GlobalVector operator-() const; // returns GlobalVector(-x, -y, -z)
GlobalVector cross( const GlobalVector& gv ) const; // cross (vector) product
float dot( const GlobalVector& gv ) const; // dot (scalar) product
GlobalVector unit() const; // unit vector parallel to this. If mag()=0 a zero vector is returned.
   
GlobalVector also has the same functions as GlobalPoint (and the same caveats apply - see above):  
float x() const; // Cartesian x component
float y() const; // Cartesian y component
float z() const; // Cartesian z component
float mag2() const; // x*x + y*y + z*z
float mag() const; // sqrt( x*x + y*y + z*z )
float perp2() const; // x*x + y*y
float perp() const; // sqrt( x*x + y*y )
float transverse() const; // same as perp()
Geom::Phi phi() const; // azimuthal phi, radians, range (-pi, +pi]
Geom::Theta theta() const; // polar theta, radians, range [0, pi]
float eta() const; // pseudorapidity
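
As an illustration of what dot(), cross() and unit() compute, here is a standalone Python sketch that works on plain (x, y, z) tuples. It is not the CMSSW implementation; it only mirrors the behaviour listed in the table above, including the zero vector returned by unit() when mag() is 0.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors given as (x, y, z) tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Vector product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    """Unit vector parallel to a; a zero vector is returned if the magnitude is 0."""
    mag = math.sqrt(dot(a, a))
    if mag == 0.0:
        return (0.0, 0.0, 0.0)
    return (a[0] / mag, a[1] / mag, a[2] / mag)
```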

Particle::Point, Vertex::Point, Particle::Vector and Vertex::Vector

Particle::Point and Particle::Vector are defined in DataFormats/Candidate/ . They are used to represent 3-dimensional space-point and direction information for MC truth particles and for most reconstructed objects (except tracks).

Vertex::Point and Vertex::Vector are identical to Particle::Point and Particle::Vector . They are defined in DataFormats/VertexReco and used to provide information about reconstructed vertices.

These classes are all based on the ROOT objects:

PositionVector3D<ROOT::Math::Cartesian3D<double> >

DisplacementVector3D<ROOT::Math::Cartesian3D<double> >

which are described in the ROOT User Manual.

Lorentz Vectors

To manipulate Lorentz vectors, CMSSW uses (notably in the Candidate class) code similar to the following example:

  // N.B. Confusingly this #include defines XYZTLorentzVector, not LorentzVector !
  #include "DataFormats/Math/interface/LorentzVector.h"
  using namespace math;

  XYZTLorentzVector p4Sum;
  for (i=0; ....) {
      p4Sum += XYZTLorentzVector(px[i], py[i], pz[i], E[i]);
  }
  double massSum = p4Sum.M();

Note you can also create a ROOT Lorentz vector specifying the 3-momentum and the mass (instead of the energy):

  #include "Math/LorentzVector.h" 
  #include "Math/PxPyPzM4D.h"
  ROOT::Math::LorentzVector<ROOT::Math::PxPyPzM4D<double> > p4(px, py, pz, m);

These Lorentz vectors are based on ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double> >, whose ROOT documentation lists the functions that are available.
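
The arithmetic behind the two examples above (summing four-vectors and taking the invariant mass, and building a four-vector from a 3-momentum plus a mass) can be sketched in standalone Python. This is not the ROOT or CMSSW API; the helper names are hypothetical, four-vectors are plain (px, py, pz, E) tuples, and returning a negative value for a negative mass squared is a convention chosen here for illustration.

```python
import math

def add_p4(a, b):
    """Component-wise sum of two four-vectors stored as (px, py, pz, E)."""
    return tuple(x + y for x, y in zip(a, b))

def mass(p4):
    """Invariant mass, M^2 = E^2 - |p|^2 (natural units)."""
    px, py, pz, e = p4
    m2 = e * e - (px * px + py * py + pz * pz)
    # Negative M^2 can arise from rounding; flag it with a negative mass.
    return math.sqrt(m2) if m2 >= 0.0 else -math.sqrt(-m2)

def p4_from_mass(px, py, pz, m):
    """Build (px, py, pz, E) from a 3-momentum and a mass, E = sqrt(|p|^2 + m^2)."""
    return (px, py, pz, math.sqrt(px * px + py * py + pz * pz + m * m))
```

For example, two massless particles with momenta of 45 along +z and -z sum to a system of invariant mass 90.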

AlgebraicSymMatrix55 and AlgebraicVector55

These are matrix and vector classes commonly used inside the track reconstruction software. They are usually only used by tracking software developers, so are described in the CMSSW Offline Guide.

Warning: Do not confuse these with AlgebraicVector and AlgebraicSymMatrix (with no number "55"). These also exist, but are completely different classes based on CLHEP/Matrix/SymMatrix.h !

Acknowledgements

GlobalPoint and GlobalVector have been part of CMS software since 1999. They were written by Teddy Todorov.

Review Status

Editor/Reviewer and date Comments
TimCox - 02 Jul 2007 page author
JennyWilliams - 03 Jul 2007 added workbook markup and moved page into cms workbook
IanTomalin - 25 Jun 2008 added info on Particle::Point and Vertex and AlgebraicSymMatrix55
IanTomalin - 11 Nov 2009 various improvements. Added LorentzVector documentation

Responsible: TimCox
Last reviewed by: TimCox - 25 Jan 2008
Last reviewed by: IanTomalin - 25 Jun 2008



9.5 Writing your own framework objects to a file

Complete: 5
Read first WorkBookWriteFrameworkModule

Introduction

CMSSW lets you store your own objects in a framework output file, which can then be read back in a private analysis that also runs in the framework. The main advantage of this approach is that most of the analysis is done in the framework, so tools developed for one analysis are easy to port to another. Running your analysis mostly in the framework also makes it easier to use grid and batch tools, which speeds up the analysis process.

We will only discuss how to use objects that already exist in the framework. It is of course also possible to create new personal object types and book these into your events but this is covered in more advanced tutorials. We will discuss one of many possible analysis techniques.

  • We want to save only certain objects in the event and save some additional information. For example: we want to save all standard tracks and their inner and outermost points.

Other possibilities would include the use of the Candidate container classes, or a small analysis that for example would create Z boson or J/Psi candidates from two muons and would save the resonance candidates as LorentzVectors or some kind of Candidate.

CMSSW releases

This tutorial has been done in CMSSW_5_3_11.

The basic principles and design concepts of EDProducers

A producer is a module in the framework that creates new objects. Besides the usual class constructors and destructor methods a producer has the following methods:

  void beginJob();
  void produce( edm::Event& , const edm::EventSetup& );
  void endJob();
The beginJob and endJob methods are similar to those used in framework analyzers, and can be used to instantiate objects and to finish up after running. The actual storing of objects in the event happens in the produce method.

The most convenient way to create a producer code skeleton is by using the scripts that are available in the framework. In a set up CMSSW environment you can create a producer by typing:

scram p CMSSW CMSSW_5_3_11
cd CMSSW_5_3_11/src
cmsenv
mkdir ProdTutorial
cd ProdTutorial
mkedprod ProducerTest

1. Take a look at the code skeleton created by the mkedprod script. Identify the different methods in the ProducerTest/src/ProducerTest.cc file. Things to notice are:

  • In the producer's constructor you define the name and type of object that you will eventually produce;
  • In the produce() method the objects are created and then saved into the event;
  • A framework macro DEFINE_FWK_MODULE(ProducerTest); is called to define the producer object as a framework plugin.
Note that you will always have to modify the BuildFile of your producer if you want to save objects, to make sure that the appropriate libraries are known to the framework.

A very simple producer that saves existing tracks and two space points

The following example loops over tracks and saves the innermost and outermost point of each track, together with the existing track. The points are stored as objects of type math::XYZPointD. This is also the type returned by the track's Track::innerPosition(), Track::outerPosition() and Track::vertex() methods.

Creating the code

Create the producer skeleton

The first step would be to create a producer skeleton:

cd $CMSSW_BASE/src 
cd ProdTutorial
mkedprod TrackAndPointsProducer

Note: Doing cd $CMSSW_BASE/src above brings you back to the CMSSW_5_3_11/src directory.

Open the source file TrackAndPointsProducer/src/TrackAndPointsProducer.cc in your favorite editor and start adding the following code:

Include the appropriate header files

You will use objects of type Point and Track, and will use the standard vector classes and the standard library, so include these in your list of header files:
#include <vector>
#include <iostream>
#include "DataFormats/Math/interface/Point3D.h"
#include "DataFormats/TrackReco/interface/Track.h"
#include "DataFormats/TrackReco/interface/TrackFwd.h"

Modify the class definition and add members

In the class definition you will need to define the labels of the input objects (Tracks in this case). This way you can use your configuration file to switch between different track algorithms without recompiling. So add the input information to your class definition. You should also make sure that your class knows the containers that you will eventually book into the event; here we use Point objects in a container class PointCollection. After adding these, your class definition should look like this:

class TrackAndPointsProducer : public edm::EDProducer {
   public:
      explicit TrackAndPointsProducer(const edm::ParameterSet&);
      ~TrackAndPointsProducer();

   private:
      virtual void beginJob() ;
      virtual void produce(edm::Event&, const edm::EventSetup&);
      virtual void endJob() ;

      // ----------member data ---------------------------
  
    edm::InputTag src_;
    typedef math::XYZPointD Point;
    typedef std::vector<Point> PointCollection;
   
};

Modify the class constructor

When the class constructor is called, the framework must be told that the TrackAndPointsProducer class will add something to the file. This is done by calling the produces < CollectionName > (label) method. This is also the place to read in information from the config file. Your constructor should look like this:
TrackAndPointsProducer::TrackAndPointsProducer(const edm::ParameterSet& iConfig)
{ 
  src_  = iConfig.getParameter<edm::InputTag>( "src" );
  produces<PointCollection>( "innerPoint" ).setBranchAlias( "innerPoints");
  produces<PointCollection>( "outerPoint" ).setBranchAlias( "outerPoints");
}

Modify the produce() method to book your objects

In the produce method you will have to create the appropriate vectors, fill them and put them in the event:

void TrackAndPointsProducer::produce(edm::Event& iEvent, const edm::EventSetup& iSetup)
{
   using namespace edm; 
   using namespace reco; 
   using namespace std;
   // retrieve the tracks
   Handle<TrackCollection> tracks;
   iEvent.getByLabel( src_, tracks );
   // create the vectors. Use auto_ptr, as these pointers automatically delete
   // what they own when they go out of scope, which helps to avoid memory leaks.
   auto_ptr<PointCollection> innerPoints( new PointCollection );
   auto_ptr<PointCollection> outerPoints( new PointCollection );
   // and already reserve some space for the new data, to control the size
   // of your executable's memory use.

   const int size = tracks->size();
   innerPoints->reserve( size );
   outerPoints->reserve( size );
   // loop over the tracks:
   for( TrackCollection::const_iterator track = tracks->begin(); 
       track != tracks->end(); ++ track ) {
     // fill the points in the vectors
     innerPoints->push_back( track->innerPosition() );
     outerPoints->push_back( track->outerPosition() );
   }
   // and save the vectors
   iEvent.put( innerPoints, "innerPoint" );
   iEvent.put( outerPoints, "outerPoint" );
   
}

The complete module is here: TrackAndPointsProducer.cc

Edit the BuildFile

Make sure that before you compile you edit the BuildFile so it includes the object libraries you will use in your analysis. The following BuildFile includes the Track and Point class libraries:
<use name=FWCore/Framework>
<use name=FWCore/PluginManager>
<use name=DataFormats/TrackReco>
<use name=DataFormats/Math>
<flags EDM_PLUGIN=1>
<export>
   <lib name=1/>
</export>

And... Compile!

You are now ready to compile:
cd $CMSSW_BASE/src  
cd ProdTutorial
scram b
If everything goes to plan you shouldn't have any compilation errors and you should see something similar to the following printout:

[lxplus429] /afs/cern.ch/user/x/xuchen/workbook/CMSSW_5_3_11/src/ProdTutorial > scram b
Reading cached build data
>> Local Products Rules ..... started
>> Local Products Rules ..... done
>> Entering Package ProdTutorial/ProducerTest
>> Creating project symlinks
>> Leaving Package ProdTutorial/ProducerTest
>> Package ProdTutorial/ProducerTest built
>> Entering Package ProdTutorial/TrackAndPointsProducer
>> Leaving Package ProdTutorial/TrackAndPointsProducer
>> Package ProdTutorial/TrackAndPointsProducer built
>> Subsystem ProdTutorial built
>> Local Products Rules ..... started
>> Local Products Rules ..... done
gmake[1]: Entering directory `/afs/cern.ch/user/x/xuchen/workbook/CMSSW_5_3_11'
>> Creating project symlinks
>> Done python_symlink
>> Compiling python modules python
>> Compiling python modules src/ProdTutorial/ProducerTest/python
>> Compiling python modules src/ProdTutorial/TrackAndPointsProducer/python
>> All python modules compiled
>> Pluging of all type refreshed.
gmake[1]: Leaving directory `/afs/cern.ch/user/x/xuchen/workbook/CMSSW_5_3_11'

Running the producer

If you do not have any compilation errors you should now have a working producer module. The next step is to run the module. This means creating a config file, which should contain the following:
  • a definition of your producer module. Note that you can easily define more than one if you want to compare objects/algorithms. Only at this point do you have to choose which track objects to use;
  • an input file of some sort. You can use the DBS/DLS database discovery tool to find data files you want to look at, or create your own;
  • an output file that contains the information you need and drops everything you are not interested in.

In fact this file already exists in the $CMSSW_BASE/src/ProdTutorial/TrackAndPointsProducer/ directory; it is called trackandpointsproducer_cfg.py. Replace its contents with the lines below. The complete file is also available here: trackandpointsproducer_cfg.py.txt

import FWCore.ParameterSet.Config as cms

process = cms.Process("OWNPARTICLES")

process.load("FWCore.MessageService.MessageLogger_cfi")

process.maxEvents = cms.untracked.PSet( input = cms.untracked.int32(100) )

process.source = cms.Source("PoolSource",
    # replace the file name below with the source file you want to use
    fileNames = cms.untracked.vstring(
       'file:/afs/cern.ch/cms/Tutorials/TWIKI_DATA/CMSDataAnaSch_RelValZMM536.root'
    )
)

#from ProdTutorial.TrackAndPointsProducer.trackandpointsproducer_cfi import *
process.MuonTrackPoints = cms.EDProducer('TrackAndPointsProducer'
        ,src    =cms.InputTag('globalMuons')

)

process.TrackTrackPoints = cms.EDProducer('TrackAndPointsProducer'
        ,src    =cms.InputTag('generalTracks')
)

process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string('myOutputFile.root')
    ,outputCommands = cms.untracked.vstring('drop *',
      "keep *_generalTracks_*_*",
      "keep *_globalMuons_*_*",
       "keep *_MuonTrackPoints_*_*",
      "keep *_TrackTrackPoints_*_*")

)


process.p = cms.Path(process.MuonTrackPoints*process.TrackTrackPoints)

process.e = cms.EndPath(process.out)
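
The outputCommands list above is applied in order, so a later keep or drop statement overrides an earlier one for any branch that both match. Branch names have the form type_moduleLabel_instanceLabel_processName. The following standalone Python sketch imitates that selection with simple wildcard matching on the whole branch name; it is an approximation for illustration, not the framework's own implementation. is_kept is a hypothetical helper, the branch names used with it are examples, and treating a branch that matches no command as dropped is an assumption of this sketch.

```python
from fnmatch import fnmatchcase

def is_kept(branch, output_commands):
    """Return True if the branch survives the ordered keep/drop commands."""
    kept = False  # assumption: a branch matched by no command is dropped
    for command in output_commands:
        action, pattern = command.split(None, 1)  # e.g. "keep", "*_generalTracks_*_*"
        if fnmatchcase(branch, pattern.strip()):
            kept = (action == "keep")  # the last matching command wins
    return kept
```

With the commands from the configuration above, a branch such as recoTracks_generalTracks__RECO is kept, while a branch from a module that is not listed is removed by the leading 'drop *'.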

To run the module do the following:

cmsRun trackandpointsproducer_cfg.py

This creates an output root file called myOutputFile.root

Look at the output

The output file myOutputFile.root can now be viewed in plain ROOT or analyzed further in the framework. The newly added objects appear in the ROOT browser:
[Figure: a ROOT browser window]

Example of a Simple Producer (EventCountProducer)

In many workflows, you might run a filter that drops some portion of events, but you want to keep track of how many events were present before the filter was run. This information can be provided by a simple tool called EventCountProducer.

In your python configuration, you simply create an instance and then include it in your path at the point where you want to count events. For example, if you include a filter to select events with muons, you could create two producers, one to count events before the filter and one to count the number of events that pass the filter:

process.nEventsTotal = cms.EDProducer("EventCountProducer")
process.nEventsFiltered = cms.EDProducer("EventCountProducer")

process.p = cms.Path(
    process.nEventsTotal *
    process.muonFilter *
    process.nEventsFiltered
)

The EventCountProducer stores its product in the luminosity block and is able to merge event counts from multiple files. So, if you were creating patTuples in the previous step, and were now running an analyzer on several patTuple files, you could access the event counts in the endLuminosityBlock method of your analyzer:

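void MyAnalyzer::endLuminosityBlock(const edm::LuminosityBlock& lumi, const edm::EventSetup& setup) {
  // EventCountProducer stores an edm::MergeableCounter
  // (DataFormats/Common/interface/MergeableCounter.h) in each luminosity block.
  // Total number of events is the sum of the events in each of these luminosity blocks
  edm::Handle<edm::MergeableCounter> nEventsTotalCounter;
  lumi.getByLabel("nEventsTotal", nEventsTotalCounter);
  nEventsTotal += nEventsTotalCounter->value;

  // Same for the events that passed the filter
  edm::Handle<edm::MergeableCounter> nEventsFilteredCounter;
  lumi.getByLabel("nEventsFiltered", nEventsFilteredCounter);
  nEventsFiltered += nEventsFilteredCounter->value;
}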
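
Once the two running sums have been accumulated over all luminosity blocks, the filter efficiency follows directly. The bookkeeping can be sketched in standalone Python; the function name and the per-block counts are hypothetical and only illustrate the arithmetic, not any CMSSW interface.

```python
def filter_efficiency(n_total_per_lumi, n_filtered_per_lumi):
    """Sum per-luminosity-block counts and return passed / total."""
    n_total = sum(n_total_per_lumi)
    n_filtered = sum(n_filtered_per_lumi)
    if n_total == 0:
        raise ValueError("no events were counted before the filter")
    return n_filtered / n_total
```

For three luminosity blocks with 100, 250 and 50 events before the filter and 10, 30 and 0 events after it, the efficiency is 40 / 400.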

Summary and Conclusion

This tutorial shows several ways to use producers for analysis purposes. Using producers for analysis is a very efficient way to use the framework. It reduces the amount of code duplication that usually happens in private ntuple based analyses and has the additional advantage that all debugging facilities in the framework are available, which makes debugging easier than in root macros.

Review Status

Editor/Review and date Comments
FreyaBlekman - 23 Feb 2007 Original Author
JennyWilliams - 05 Mar 2007 Moved tutorial into WorkBook, moved contents of this WB page to SWGuideEDProducer
ChristopherJones - 28 Jan 2008 Updated to work with releases equal to or greater than 1_5_X
AndriusJuodagalvis - 2009-09-05 Added info how to access the produced branches
XuanChen - 10 Jun 2014 Changed CMSSW release to CMSSW_5_3_11, replaced files and outputs
Responsible: Sudhir Malik
Last reviewed by: Reviewer. Sudhir Malik- 26 November 2009.

Topic revision: r13 - 2010-01-19 - KatiLassilaPerini