Creating New Products


Goal of this page

This page contains instructions on how to introduce a new product into the system so that it can be stored in the edm::Event and later in an EDM/ROOT file.

We distinguish between a product put directly into the edm::Event (e.g. a JetCollection), which we call an EDProduct, and a product that is a constituent of an EDProduct (e.g. a single Jet, a Hit), which we refer to as a constituent.

Making the package containing your product:

The package that defines your product must be separate from the package of the EDProducer that will actually create instances of your product. If this is not done, then errors may occur when running cmsRun.

The package defining your product needs to contain:

  • The code defining your product
  • The classes_def.xml file
  • The classes.h file
Multiple related products can be placed in the same package.

Restrictions on the package defining your persistence capable product:

These restrictions apply to the package defining your products that will be persisted in an EDM/ROOT file.

Note: Products that will be persisted only in non-EDM ROOT files need to follow the rules for transient products given in the next section.

The package defining your product should be located in a subsystem devoted to product definitions. Examples of such subsystems are AnalysisDataFormats, DataFormats, FastSimDataFormats, SimDataFormats, and TBDataFormats.

In your package BuildFile.xml, you declare any dependencies on other packages or external tools.

Your BuildFile.xml must not contain a plugin directive.

The files defining your products must be in the src or interface directories.

If your package includes any headers in DataFormats/Common, you must declare a dependency on DataFormats/Common.

If your package throws any exceptions, you must explicitly declare a dependency on FWCore/Utilities.

You must not write directly to cerr or cout. If you need text output, use the MessageLogger. If your package uses the MessageLogger, you must explicitly declare a dependency on FWCore/MessageLogger.

Apart from the dependencies listed above, the package containing your product must not depend on any other CMSSW package, with one exception: it may depend on other packages defining lower level products that are constituents of your product.

There are also restrictions on external dependencies (dependencies on tools/packages outside of CMSSW).

In most cases, there should be none.

A dependency on clhep is permitted if your product contains a clhep product (e.g. HepLorentzVector).

A dependency on boost is permitted if necessary.

Dependencies on ROOT I/O packages are prohibited. Dependencies on ROOT non-I/O packages (e.g. Math, Histograms) are permitted.

Dependencies on geant4 are prohibited.

Each persistence capable class must have a public default constructor.
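
A compile-time check is one way to catch a missing public default constructor before the dictionary is ever generated. The sketch below repeats the SampleProd struct used as the example on this page and relies only on the standard std::is_default_constructible trait. The helper name hasPublicDefaultCtor and the counter-example NoDefault are hypothetical and are not part of CMSSW.

```cpp
#include <type_traits>

// Same shape as the SampleProd example used on this page.
struct SampleProd {
  explicit SampleProd(int v) : value_(v) {}
  SampleProd() : value_(0) {}
  int value_;
};

// Counter-example (hypothetical): no default constructor, so it would
// not satisfy the rule for persistence capable classes.
struct NoDefault {
  explicit NoDefault(int v) : value_(v) {}
  int value_;
};

// Hypothetical helper, not part of CMSSW. The trait is evaluated from a
// context unrelated to T, so a private or protected default constructor
// also yields false.
template <typename T>
constexpr bool hasPublicDefaultCtor() {
  return std::is_default_constructible<T>::value;
}

// Fails the build early if someone later removes the default constructor.
static_assert(hasPublicDefaultCtor<SampleProd>(),
              "SampleProd needs a public default constructor");
```

A static_assert like the last line could sit next to the class definition, so the requirement is checked on every build rather than discovered at run time.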

If your package directly throws any exceptions, they must be of the class "cms::Exception" or "edm::Exception". Do not throw a std::exception or define your own exception classes.

The methods of your class should be kept short and simple.

Restrictions on a package defining your transient only product:

In rare cases, you may wish to put a product in the event that must NEVER be output into an EDM/ROOT file.

Also, you may need ROOT dictionaries for products that will be persisted only in non-EDM ROOT files. These follow the same rules as transient only products.

If the package defining your product is located in a subsystem devoted to product definitions (e.g. DataFormats, SimDataFormats), your package must meet all the same restrictions as given above for persistence capable products. However, for a transient only product, you may choose to locate your package outside of one of these subsystems. If so, the requirements on external dependencies of your package are eased somewhat. Your package may be dependent on other CMSSW packages outside of the subsystems devoted to product definitions, but you still must avoid dependencies on packages defining plug-ins (e.g. producers, filters, analyzers, etc.). Your package must still meet all of the other restrictions given for persistence capable products above. In particular, your product still does need a public default constructor.

Restrictions on the package producing your product:

You must not write directly to cerr or cout. If you need text output, use the MessageLogger. If your package uses the MessageLogger, you must explicitly declare a dependency on FWCore/MessageLogger. (Applies to all packages)

If your package directly throws any exceptions, they must be of the class "cms::Exception" or "edm::Exception". Do not throw a std::exception. If you define your own exception class, it must publicly inherit from cms::Exception. (Applies to all packages).
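
The public-inheritance requirement for user-defined exception classes can be illustrated with a small stand-alone sketch. Because cms::Exception is only available inside CMSSW, the code below uses a stand-in class sketch::Exception; that class, the derived classes and the helper inheritsPubliclyFromBase are all hypothetical names chosen for this illustration. In real code the base would be cms::Exception.

```cpp
#include <exception>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {
  // Stand-in for cms::Exception so that this example compiles outside
  // CMSSW. It is NOT the real class and has none of its features.
  class Exception : public std::exception {
  public:
    explicit Exception(std::string category) : category_(std::move(category)) {}
    const char* what() const noexcept override { return category_.c_str(); }
  private:
    std::string category_;
  };
}  // namespace sketch

// Follows the rule: public inheritance from the common base.
class BadSampleInput : public sketch::Exception {
public:
  BadSampleInput() : sketch::Exception("BadSampleInput") {}
};

// Breaks the rule: private inheritance hides the base from callers.
class PrivatelyDerived : private sketch::Exception {
public:
  PrivatelyDerived() : sketch::Exception("PrivatelyDerived") {}
};

// Hypothetical check: a pointer to a publicly derived class converts
// implicitly to a pointer to the base; with private inheritance it does not.
template <typename E>
constexpr bool inheritsPubliclyFromBase() {
  return std::is_convertible<E*, sketch::Exception*>::value;
}

// A handler written for the base class catches the publicly derived type.
inline bool caughtAsBase() {
  try {
    throw BadSampleInput();
  } catch (sketch::Exception const&) {
    return true;
  } catch (...) {
    return false;
  }
}
```

The point of the rule shows up in caughtAsBase(): code that catches the common base class can only handle a user-defined exception if the inheritance is public.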

Your producer should do as little as possible other than producing the product(s) and putting them into the event. The producer should depend only on packages necessary for this task. Therefore, dependencies on ROOT (other than non-I/O packages, such as math packages) are strongly discouraged.

Make your product

For this example, our product will be very simple.

#ifndef DataFormats_SampleProd_h
#define DataFormats_SampleProd_h

#include <vector>


// a simple class
struct SampleProd
{
  explicit SampleProd(int v):value_(v) { }
  SampleProd():value_(0) { }
  int value_;
};

// this is our new product, it is simply a 
// collection of SampleProd held in an std::vector
typedef std::vector<SampleProd> SampleCollection;

#endif

The first step is to add this header and the associated implementation file to the interface and src directories of your package.

Create a classes_def.xml in the src directory

The classes_def.xml file directs scram dictionary generation and must contain the names of the classes that you will want to put into the event. The EDM always adds a wrapper around an EDProduct that you put directly into the event. You will need to:

  • Specify the product and any of its (non-transient) constituents defined in your package.
  • If the product or a constituent defined in your package has any transient (i.e. non-persistent) data members, these members must be indicated as such in the classes_def.xml file. A dictionary is not usually needed for the transient member itself or its constituents.
  • If the product is an EDProduct, mention the wrapped EDProduct.
  • If a dictionary for an std::map is specified, also specify the dictionary for the corresponding std::pair.
  • In some exceptional cases, due to the requirements of schema evolution, a dictionary for a transient member or a component of a transient member may be required. If such a needed dictionary is missing, a run time exception should occur indicating what dictionary or dictionaries are needed.

In addition, you will likely need to mention any of the class template instantiations used in the product.

Some important rules must be followed:

  • The namespace of each class must be fully spelled out
  • If the class is an instance of a template, the values of all template parameters, including defaulted parameters, must be fully specified.
  • A typedef may often be used in place of the fully specified type, but the typedef must expand to the fully defined type.
  • In some cases a typedef may not work.

Finally, here are the contents of the file for our sample product:

<lcgdict>
 <class name="SampleProd"/>
 <class name="std::vector<SampleProd>"/>
 <class name="edm::Wrapper<std::vector<SampleProd> >"/>
</lcgdict>

classes_def.xml entries for an EDProduct that stores only some of its data members

Suppose there is a data member of your class that you do not want to store because it can be calculated from other stored data members. E.g.

// a simple class
struct SampleProd
{
private:
  //do not store
  mutable int square_;
public:
  explicit SampleProd(int v):value_(v),square_(0) { }
  SampleProd():value_(0),square_(0) { }
  int value_;

  int square() const {
    if ( 0 == square_) {
       square_=value_*value_;
    }
    return square_;
  }
};

In the entry for the class in the classes_def.xml file you use a <field name="..." transient="true" /> node, where the name attribute is the name of the member you do not want to store. E.g.,

 <class name="SampleProd">
   <field name="square_" transient="true"/>
 </class>
 <class name="std::vector<SampleProd>"/>
 <class name="edm::Wrapper<std::vector<SampleProd> >"/>

In addition, one needs to be sure ROOT clears the value each time it reads a new object back from storage. This clearing must be done because ROOT may reuse the same memory area over and over for different events. The clearing can be done with the same tools that ROOT provides for explicit schema evolution handling (http://root.cern.ch/root/html532/io/DataModelEvolution.html). This entails writing an ioread rule in the classes_def.xml file. E.g.

<ioread sourceClass = "SampleProd" version="[1-]" targetClass="SampleProd" source="" target="square_">
<![CDATA[ square_=0;
]]>
</ioread>
The rule applies to the class SampleProd, whose type does not change between the file and the job (sourceClass is identical to targetClass). There is no data to read from the file (source is empty), and the rule is needed whenever ROOT must read the square_ data member (target).
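
Why the clearing matters can be seen in a small stand-alone simulation. The sketch below does not use ROOT at all: the two simulateRead functions are hypothetical stand-ins that imitate ROOT overwriting only the persistent member of an object whose memory is reused, first without and then with the effect of the ioread rule. The struct CachedProd is a simplified copy of SampleProd with public members so that the simulation can touch them.

```cpp
// Simplified copy of SampleProd. Members are public here only so that the
// simulated "read" functions below can modify them.
struct CachedProd {
  CachedProd() : value_(0), square_(0) {}
  int value_;           // persistent member
  mutable int square_;  // transient cache, not stored

  int square() const {
    if (0 == square_) {
      square_ = value_ * value_;
    }
    return square_;
  }
};

// Hypothetical: imitates a read into reused memory with NO ioread rule.
// Only the persistent member is overwritten; the cache keeps its old value.
inline void simulateReadWithoutRule(CachedProd& obj, int storedValue) {
  obj.value_ = storedValue;
}

// Hypothetical: imitates the same read followed by the rule body
// "square_=0;", which resets the cache.
inline void simulateReadWithRule(CachedProd& obj, int storedValue) {
  obj.value_ = storedValue;
  obj.square_ = 0;
}
```

Reading 3 and calling square() caches 9. A second read of 4 without the rule still returns 9, while the same read with the rule returns 16.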

It is also possible to just do the calculation right at readback time. In that case the class would not need a mutable data member and would include a function which could be used both by the constructor and the ROOT readback code. E.g.,

// a simple class
struct SampleProd
{
private:
  //do not store
  int square_;
public:
  explicit SampleProd(int v):value_(v),square_(0) { initSquare(); }
  SampleProd():value_(0),square_(0){}
  int value_;

  int square() const {
    return square_;
  }
  //called by constructor and ROOT read back
  void initSquare() { square_=value_*value_;}
};
The declaration of the member as transient remains the same, but a new ioread rule is needed:
<ioread sourceClass = "SampleProd" version="[1-]" targetClass="SampleProd" source="int value_" target="square_">
<![CDATA[ newObj->initSquare();
]]>
</ioread>
The difference between this rule and the previous one is that we have to tell ROOT which data member needs to be read from disk (source="int value_") so that our call to initSquare() works properly.
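
The ordering that the source attribute guarantees can be sketched without ROOT. In the stand-alone example below, simulateRead is a hypothetical function that imitates what happens on readback: the persistent member is filled first, and only then is initSquare() called, as the rule body does through newObj. EagerProd is a renamed copy of the class above.

```cpp
// Renamed copy of the second SampleProd variant on this page.
struct EagerProd {
  explicit EagerProd(int v) : value_(v), square_(0) { initSquare(); }
  EagerProd() : value_(0), square_(0) {}
  int value_;  // persistent member

  int square() const { return square_; }
  // Called by the constructor and, in the real setup, by the ioread rule.
  void initSquare() { square_ = value_ * value_; }

private:
  int square_;  // transient, recomputed at readback
};

// Hypothetical stand-in for the readback step. value_ must be set before
// initSquare() runs, which is what source="int value_" asks ROOT to ensure.
inline void simulateRead(EagerProd& newObj, int storedValue) {
  newObj.value_ = storedValue;
  newObj.initSquare();
}
```

Because square_ is recomputed on every simulated read, reusing the same object for a second read cannot leave a stale value behind, and square() no longer needs to be lazy.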

Class versioning

The above examples of classes_def.xml files ignore class versioning. Class versioning is used to support schema evolution by giving the current version of a persistent class a version number, and incrementing the version number if and when a persistent non-static data member of the class is added, removed, renamed, or has its type changed.

Up to and including CMSSW_4_4_X, CMSSW uses class versioning only sporadically. However, in CMSSW_5_0_X, class versioning has been implemented throughout CMSSW as much as is feasible. Most of this initial implementation was done centrally by the Core Software group. Users should add class versioning for any classes added in 5_0_X or subsequent releases. Class versioning need not be added to 4_4_X and prior releases, but it is harmless if done correctly.

Class versioning is used in classes_def.xml files only for classes that are not instances of templates. For example, here is the classes_def.xml for a persistent product with a transient data member, using class versioning:

 <class name="SampleProd" ClassVersion="10">
  <version ClassVersion="10" checksum="238838498"/>
   <field name="square_" transient="true"/>
 </class>
 <class name="std::vector<SampleProd>"/>
 <class name="edm::Wrapper<std::vector<SampleProd> >"/>

Version numbers 0, 1, and 2 have special meaning to ROOT and should not be used as version numbers. By convention, CMS uses 3 as the initial version number of a class which has never been stored before. If a class was previously stored without a class version assigned, then one should use 10 as the initial version instead of 3. This is because ROOT automatically assigns a version number to each unversioned class, starting with 3, and increments the version number each time it finds an instance of the class with a different checksum while writing to the file. By using 10 we can accommodate 7 unversioned instances of a stored class, which should be more than enough to avoid conflicts.

The checksum is a number generated automatically from the class. The tool currently used to add class versioning to a file, edmAddClassVersion, will generate the checksums automatically. scram build will fail if any class with a class version does not have a correct checksum. The error message will indicate the correct value for the checksum.
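
The numbering convention can be written down as a few constants. This is only a restatement of the arithmetic in the text; the constant and function names are hypothetical and do not correspond to anything in CMSSW or ROOT.

```cpp
// Hypothetical names; the values come from the convention described above.
constexpr int kFirstVersionNeverStored = 3;      // class never stored before
constexpr int kFirstVersionStoredUnversioned = 10;  // class stored earlier without a version

// Initial ClassVersion to write into classes_def.xml.
constexpr int initialClassVersion(bool previouslyStoredUnversioned) {
  return previouslyStoredUnversioned ? kFirstVersionStoredUnversioned
                                     : kFirstVersionNeverStored;
}

// Versions 3..9 stay free for the numbers ROOT assigned automatically to
// unversioned instances, which gives 10 - 3 = 7 slots.
constexpr int unversionedSlots() {
  return kFirstVersionStoredUnversioned - kFirstVersionNeverStored;
}

// Each change to the persistent data members bumps the version by one.
constexpr int nextClassVersion(int current) { return current + 1; }
```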

Note that the templated classes do not have any versioning information in classes_def.xml. Versioning of CMS provided templates will be done by instrumenting a function in the template code itself.

Should the class SampleProd be modified by adding, removing, renaming, or changing the type of any non-transient non-static data member, the new classes_def.xml file might look like this:

 <class name="SampleProd" ClassVersion="11">
   <version ClassVersion="11" checksum="169027539"/>
     <field name="square_" transient="true"/>
   <version ClassVersion="10" checksum="238838498"/>
     <field name="square_" transient="true"/>
 </class>
 <class name="std::vector<SampleProd>"/>
 <class name="edm::Wrapper<std::vector<SampleProd> >"/>

The new class version number is 11. The information about the previous version remains in the file for backward compatibility.

Schema evolution

ROOT is able to automatically deal with some changes to a class (i.e. schema evolution) such as
  • dropping a data member
  • adding a data member (when reading back old versions, the new data member will have the value it was assigned in the class' default constructor)
  • changing a data member from one builtin type to another builtin type, e.g. from float to double or unsigned short to unsigned long.
However, it is possible to explicitly handle schema changes by adding a snippet of code to the classes_def.xml file. ROOT's documentation for that feature can be found at http://root.cern.ch/root/html532/io/DataModelEvolution.html .
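
The case of an added data member is the reason the default constructor should give every member a sensible value. The stand-alone sketch below imitates it without ROOT: SampleProdV2 is a hypothetical newer layout with an extra member weight_, and simulateReadOfOldLayout is a hypothetical stand-in for reading an object that was written before weight_ existed.

```cpp
// Hypothetical newer layout of the sample product: weight_ is new.
struct SampleProdV2 {
  SampleProdV2() : value_(0), weight_(1.0) {}
  int value_;      // present in old and new layouts
  double weight_;  // added later, absent from old files
};

// Hypothetical stand-in for reading an old-layout object: start from a
// default constructed object and fill only the members found in the file.
inline SampleProdV2 simulateReadOfOldLayout(int storedValue) {
  SampleProdV2 obj;
  obj.value_ = storedValue;
  return obj;
}
```

After the simulated read, weight_ still holds 1.0, the value chosen in the default constructor. If that default is not a meaningful value for old data, an explicit rule in classes_def.xml is the place to compute a better one.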

classes_def.xml entries for a transient only EDProduct

In rare cases, you may wish to put a product in the event that must NEVER be output into an EDM/ROOT file. If you specify

persistent="false"
in the classes_def.xml file in the appropriate place (an example is just below), the framework will guarantee that any EDProduct of this class will never be written to an EDM/ROOT file. You do not need a dictionary for constituents of the transient only class. In CMSSW_4_3_0_pre3 and prior releases, you do not need to specify a dictionary for a Wrapper for the transient only class. In CMSSW_4_3_0_pre4 and later releases, a dictionary for the wrapper must be specified.

CMSSW_4_3_0_pre4 or later release example.

<lcgdict>
 <class name="SampleProd"/>
 <class name="std::vector<SampleProd>"/>
 <class name="edm::Wrapper<std::vector<SampleProd> >" persistent="false"/>
</lcgdict>

CMSSW_4_3_0_pre3 or prior release example:

<lcgdict>
 <class name="SampleProd"/>
 <class name="std::vector<SampleProd>" persistent="false"/>
</lcgdict>

Note that this directive does not prevent the output of an EDProduct containing objects of type:

std::vector<SampleProd>
It only prevents the output of an EDProduct of the exact type:
std::vector<SampleProd>
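
The distinction between the exact type and a type that merely contains it can be made concrete with a type trait. In the sketch below, SampleBundle is a hypothetical product that holds a std::vector<SampleProd> as a data member, and isExactlySampleVector is a hypothetical helper. Neither is part of CMSSW.

```cpp
#include <type_traits>
#include <vector>

struct SampleProd {
  explicit SampleProd(int v) : value_(v) {}
  SampleProd() : value_(0) {}
  int value_;
};

// Hypothetical product that CONTAINS a std::vector<SampleProd>.
struct SampleBundle {
  std::vector<SampleProd> items_;
};

// Hypothetical helper: true only for the exact type named in the directive.
template <typename T>
constexpr bool isExactlySampleVector() {
  return std::is_same<T, std::vector<SampleProd> >::value;
}
```

Only the type for which the helper returns true corresponds to the wrapper marked persistent="false". A product such as SampleBundle is a different type with its own wrapper, so the directive does not affect it.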

Create classes.h

This C++ file must contain an #include (direct or indirect) of a header containing a full definition of each class or template used in classes_def.xml.

#include "DataFormats/<mypackage>/interface/SampleProd.h"
#include "DataFormats/Common/interface/Wrapper.h"

Do a build

You can now run "scram b". You will see the dictionary generation step. Watch for errors or warnings during this step, because the build might continue even if there are messages. If you see a message like "WARNING: dictionary not generated for XX", you must go back to classes_def.xml and classes.h and fix the problem. The cause could be as simple as a missing or misspelled namespace name, or a class that is used in a later declaration but that you forgot to mention. Remember to make instances of any template used in the classes.h file.

After a successful build, run edmPluginDump to make sure your new product and associated things from the classes_def.xml are included in the list.

Producing the EDProduct

NOTE: The producer must not be in the same package as the EDProduct.

The producer package is a plug-in, and must follow the rules for plug-ins: https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideDeclarePlugins

This section only contains code relevant to producing and using a new EDProduct. Any producer that makes one of these new products must announce that it makes it.

This section is not directly relevant to constituents, as only EDProducts have producers.

Here is an example EDProducer constructor and produce function that makes one instance of a SampleCollection. Because we are inserting only one product of each type (and, in this case, we are dealing with only one type), there is no need to add an extra instance name. The class of your EDProducer must publicly inherit from class EDProducer.

Note that Event::put() does not immediately add the product to the event. It simply places the product in a queue. Immediately after the EDProducer returns successfully, the framework will put any queued products into the event. This means that a product cannot be obtained from the event by the same EDProducer instance that produced it.

[NOTE: You can use the skeleton code generator mkedprod to generate the initial C++ source files, which include examples]


SamplesProducer::SamplesProducer(edm::ParameterSet const& ps)
{
   // note: no argument in the call to produces
   // The product instance name will be empty.
   // The "setBranchAlias()" for setting the ROOT branch alias is optional.
   // If "setBranchAlias()" was not used, the aliases would default to the module label.
 
  produces<SampleCollection>().setBranchAlias("SampleCollection");
}

void SamplesProducer::produce(edm::Event& e, edm::EventSetup const&)
{
   std::unique_ptr<SampleCollection> result(new SampleCollection);
   // ... fill the collection ...
   e.put(std::move(result));
}
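
The "fill the collection" step and the queueing behaviour of Event::put() described in the note above can be tried out in a stand-alone toy model. ToyEvent below is a hypothetical stand-in and is NOT edm::Event: its put() only queues the product, and commit() imitates what the framework does after the producer returns. The function toyProduce and the accessor get() are likewise hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct SampleProd {
  explicit SampleProd(int v) : value_(v) {}
  SampleProd() : value_(0) {}
  int value_;
};
typedef std::vector<SampleProd> SampleCollection;

// Toy stand-in for the event. It only models the queue-then-commit idea.
class ToyEvent {
public:
  // Like the behaviour described for Event::put(): queue, do not store yet.
  void put(std::unique_ptr<SampleCollection> product) { queued_ = std::move(product); }
  // Imitates the framework step that runs after produce() returns.
  void commit() {
    if (queued_) {
      stored_ = std::move(queued_);
    }
  }
  // Returns nullptr until the queued product has been committed.
  SampleCollection const* get() const { return stored_.get(); }

private:
  std::unique_ptr<SampleCollection> queued_;
  std::unique_ptr<SampleCollection> stored_;
};

// Body of a produce function: build the collection, fill it, hand it over.
inline void toyProduce(ToyEvent& e, int n) {
  std::unique_ptr<SampleCollection> result(new SampleCollection);
  result->reserve(static_cast<std::size_t>(n));
  for (int i = 0; i < n; ++i) {
    result->emplace_back(i);  // constructs SampleProd(i) in place
  }
  e.put(std::move(result));
}
```

In this model, get() returns nullptr between toyProduce() and commit(), which mirrors why a producer cannot read back its own product within the same call.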

Here is an example producer constructor and produce function that makes two named instances of SampleCollection. Each call to put is now given an instance name, so that the two instances of the same class, created by the same producer instance, can be distinguished within the Event.

SamplesProducer::SamplesProducer(edm::ParameterSet const& ps)
{
   // notes: The argument string (e.g. "one") identifies the instance name.
   // The "setBranchAlias()" for setting the ROOT branch alias is optional.
   // If "setBranchAlias()" was not used, the aliases would default to "one" and "two".

   produces<SampleCollection>("one").setBranchAlias("SampleCollectionOne");
   produces<SampleCollection>("two").setBranchAlias("SampleCollectionTwo");
}

void SamplesProducer::produce(edm::Event& e, edm::EventSetup const&)
{
   std::unique_ptr<SampleCollection> result1(new SampleCollection);
   std::unique_ptr<SampleCollection> result2(new SampleCollection);
   // ... fill the collections ...
   e.put(std::move(result1),"one");
   e.put(std::move(result2),"two");
}

If a module does not put the product in the event itself but instead uses a helper class, then it can allow the helper class to also make the calls to the "produces" function. This is done by calling the function "producesCollector()" which returns a ProducesCollector. Then the ProducesCollector is passed as an argument into the constructor of the helper class. The ProducesCollector class defines "produces" functions and calling them as member functions of the ProducesCollector will have the same effect as calling the module "produces" member functions directly. (Note ProducesCollector was added to the repository in the 11_0_X release series. It didn't exist before then.)
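
The delegation pattern can be sketched with toy classes that compile outside CMSSW. Everything below is hypothetical: ToyProducesCollector, ToyHelper and ToyModule only imitate the relationship between a module, its ProducesCollector and a helper class. The real produces functions are templated on the product type, which this toy version leaves out and replaces by recording the instance name alone.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a ProducesCollector: it forwards declarations to the
// module's list of declared products.
class ToyProducesCollector {
public:
  explicit ToyProducesCollector(std::vector<std::string>* declared) : declared_(declared) {}
  void produces(std::string const& instanceName) { declared_->push_back(instanceName); }

private:
  std::vector<std::string>* declared_;  // owned by the module
};

// Helper class that declares the products it will later put into the event.
class ToyHelper {
public:
  explicit ToyHelper(ToyProducesCollector collector) {
    collector.produces("one");
    collector.produces("two");
  }
};

// Toy module: hands a collector to its helper during construction.
class ToyModule {
public:
  ToyModule() : helper_(producesCollector()) {}
  // Direct declaration, equivalent in effect to going through the collector.
  void produces(std::string const& instanceName) { declared_.push_back(instanceName); }
  ToyProducesCollector producesCollector() { return ToyProducesCollector(&declared_); }
  std::vector<std::string> const& declared() const { return declared_; }

private:
  std::vector<std::string> declared_;  // declared before helper_ so it is constructed first
  ToyHelper helper_;
};
```

The design choice illustrated here is that the helper never needs access to the module itself. It receives a small object whose only capability is declaring products, and the declarations end up in the same place as direct calls made by the module.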

Review Status

Reviewer/Editor and Date Comments
WilliamTanenbaum - 04 Mar 2016 replace std::auto_ptr with std::unique_ptr
WilliamTanenbaum - 05 Nov 2015 removed or replaced obsolete terminology
WilliamTanenbaum - 09 Mar 2015 removed references to rootrflx
ChrisDJones - 18-Sep-2012 added link to ROOT's data model evolution documentation
WilliamTanenbaum - 19 Aug 2011 updated information for class versioning
WilliamTanenbaum - 28 Jul 2011 add preliminary information for class versioning
WilliamTanenbaum - 18 Apr 2011 add required dictionary for Wrapper of transient classes
WilliamTanenbaum - 28 Aug 2009 removed obsolete information about exporting dependencies
WilliamTanenbaum - 04 May 2009 corrected classes.h example again
WilliamTanenbaum - 04 May 2009 corrected classes.h example
WilliamTanenbaum - 21 Apr 2009 Minor updates
WilliamTanenbaum - 27 Nov 2008 Minor updates
MatthewLeBourgeois - 04 Aug 2007 corrected classes def xml entries
WilliamTanenbaum - 14 May 2007 General editing

Responsible: WilliamTanenbaum
Last reviewed by: Reviewer

Topic revision: r65 - 2019-10-11 - DavidDagenhart



 