How to write your own input source

Goal of this page

This page contains examples of how to write and use your own input source in release cycle CMSSW_6_1_X and later releases. For release cycle CMSSW_6_0_X and prior cycles, use the following link:

https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideInputSources?rev=23

Introduction

An input source creates an event and inserts zero or more products into the event.

This page does not apply to input sources such as PoolSource that read events previously created by the framework. This page only describes those input sources that create a new event from scratch.

An input source is very much like a producer. As you will see below, if you know how to write an EDProducer, you are 90% of the way to writing an input source. The following wiki page describes how to create new products using an EDProducer:

https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCreatingNewProducts

All of the information there applies also to input sources, except as indicated below.

However, we strongly prefer that, if possible, you use EmptySource as your input source and create any products with an EDProducer. Write your own input source only if there is a very good reason that you cannot use EmptySource and a producer.

Writing an input source

Note: In the 6_1_X cycle, GeneratedInputSource and ExternalInputSource have been replaced by two new base classes, ProducerSourceBase and ProducerSourceFromFiles, respectively. The new base classes have the same configuration parameters as the classes they replace, but the details of the produce() and setRunAndEventInfo() functions differ. The framework group will migrate existing sources to the new base classes.

Your input source must be a class publicly inheriting from the class ProducerSourceBase, ProducerSourceFromFiles, or RawInputSource. In almost every case, ProducerSourceBase or ProducerSourceFromFiles should be used.

RawInputSource was created explicitly for use by the DAQ online source, but it can be used as a base class by other input sources if appropriate. See the section on RawInputSource below.

The headers for the base classes can be included by:

#include "FWCore/Sources/interface/ProducerSourceBase.h"
OR
#include "FWCore/Sources/interface/ProducerSourceFromFiles.h"
OR
#include "FWCore/Sources/interface/RawInputSource.h"

Writing a ProducerSource

Your class must contain the following methods:

  • A constructor, as documented below
  • A setRunAndEventInfo() method, as documented below
  • A produce() method, as documented below

Your class may optionally contain the following methods:

  • A beginRun() method
  • An endRun() method
  • A beginLuminosityBlock() method
  • An endLuminosityBlock() method
  • Any other methods your class needs internally.

The beginRun() and endRun() methods are called at the beginning and end of a run, respectively, and can be used to put products into a run. The beginLuminosityBlock() and endLuminosityBlock() methods are called at the beginning and end of a luminosity block, respectively, and can be used to put products into a luminosity block.

Making an event with generated product(s)

Here is an example input source constructor and produce function that makes an event with one instance of a SampleCollection, generated internally.

MySource::MySource(edm::ParameterSet const& ps,
                   edm::InputSourceDescription const& desc) :
                   ProducerSourceBase(ps, desc, false)
{
   // The false argument to the ProducerSourceBase constructor indicates that this is not real data
   // ... read any configuration parameters specific to your input source
   // ... do any other initialization specific to your input source.

   // note: no argument in the call to produces
   // the standard module label is assumed
   produces<SampleCollection>();
}

bool MySource::setRunAndEventInfo(edm::EventID& id, edm::TimeValue_t& time) {
   // This function must return false if it can be determined there are no more input events to process.
   // This function may modify the run number, lumi number, event number, or time value.

   // If this function always returns true and never modifies the EventID or time value,
   // you should not write your own source.  You should use EmptySource plus a producer. 

    return true;
}


void MySource::produce(edm::Event& e)
{
   auto result = std::make_unique<SampleCollection>(); // more concise than std::unique_ptr<SampleCollection> result(new SampleCollection);
   // ... fill the collection ...
   e.put(std::move(result));
}

Note that no code is required to create the event or to track the number of events created. The framework creates the event for you automatically, and will exit when maxEvents events have been created. The event numbers will be consecutive, beginning at 1, as will the run numbers, subject to control by the configuration parameters. All you need to do is declare and create your products and put them into the event.

Note also that you do not have to read the common parameters (those listed under "Configuration parameters for ProducerSourceBase" on this page) in your constructor. You only need to read parameters specific to your input source.

Note also that this InputSource code looks just like an EDProducer, with only these differences:

  • The class inherits from ProducerSourceBase (or ProducerSourceFromFiles) instead of from EDProducer.

  • The constructor takes two arguments rather than one, and must pass both to the base class constructor, along with a boolean argument indicating whether this is real data. Usually this will be false.

  • The produce() method takes only an Event as an argument (no EventSetup argument).

  • A setRunAndEventInfo() function must be provided. If your source does not need a setRunAndEventInfo() function, throw away your source and use EmptySource plus a producer.

If both setRunAndEventInfo() and produce() need the same external data, there is no need to read the same data twice. You can, if you wish, read the data in setRunAndEventInfo(), and store your data, or pointers to your data, as member(s) of your input source.

The framework-supplied input source EmptySource is simply a ProducerSource that puts no products into the event.

Making an event with externally read product(s)

Here is an example input source constructor, setRunAndEventInfo() method, and produce() method that makes an event with two instances of a SampleCollection, where the information needed for each event is read externally from a file.

MySource::MySource(edm::ParameterSet const& ps,
                   edm::InputSourceDescription const& desc) :
                   ProducerSourceFromFiles(ps, desc, false)
{
   // ... read any configuration parameters specific to your input source
   // ... do any other initialization specific to your input source.

   // "product instance name" argument needed
   // to distinguish two otherwise identical products.
 
   produces<SampleCollection>("one");
   produces<SampleCollection>("two");
}

bool MySource::setRunAndEventInfo(edm::EventID& id, edm::TimeValue_t& time) {
   // This example method is much longer than needed.
   // It is written this way for instructional purposes.

   // Read external data to see if there is more data.
   // Save data if necessary.
   // (noMoreData and newFileWasOpened stand for state tracked by your own source.)
   if (noMoreData) return false;

   // If a new input file has been opened, you may call incrementFileIndex(), a member function of ProducerSourceFromFiles
   // with no arguments and no return value, to notify the framework that a new file has been opened.  If you do not call
   // incrementFileIndex() when a new file is opened, your source will still function, but the framework will not know that
   // a new input file has been opened.  This may cause some options not to work properly (e.g. the NOMERGE option in the options block).

   if (newFileWasOpened) incrementFileIndex();

   // Optionally modify the EventID and/or the time value for the timestamp.
   // If you don't modify them, they will be determined by the previous event, incremented by the values of the configuration parameters.

   // Modify the EventID or timestamp only if you need to, e.g.:  id = edm::EventID(runNum, lumiNum, eventNum)

   return true;
}

void MySource::produce(edm::Event& e) {
   auto one = std::make_unique<SampleCollection>();
   auto two = std::make_unique<SampleCollection>();

   // ... fill the collections ...

   // The instance names must match those declared with produces<>() in the constructor.
   e.put(std::move(one), "one");
   e.put(std::move(two), "two");
}

The framework calls the setRunAndEventInfo() method before it creates the event, so the information set here becomes part of the event. The input values of the run number, lumi number, event number, and timestamp will be those of the previous event incremented as determined by the configuration parameters. If you need external data to set the run number, event number, or timestamp, you must read it in setRunAndEventInfo(), as the produce() method is not called until later, after the event is created by the framework. The setRunAndEventInfo() method needs to return false when there is no more data to be read. The setRunAndEventInfo() method should call incrementFileIndex() if it has opened a new input file during this invocation.

Note: The incrementFileIndex() function was introduced in CMSSW_6_2_0_pre4. It did not exist prior to then.

If both setRunAndEventInfo() and produce() need the same external data, there is no need to read the same data twice. You can, if you wish, read the data in setRunAndEventInfo(), and store your data, or pointers to your data, as member(s) of your input source.
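
The read-once pattern described above can be sketched outside the framework. The following is a minimal, self-contained illustration and not framework code: SketchEventID, SketchRecord, SketchSource and runSketch are hypothetical stand-ins, each inner vector plays the role of one input file, and the fileSwitches counter only marks the place where a real source would call incrementFileIndex().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for edm::EventID (NOT the framework class).
struct SketchEventID {
  std::uint32_t run = 1;
  std::uint32_t lumi = 1;
  std::uint64_t event = 0;
};

// One record of hypothetical external input.
struct SketchRecord {
  std::uint32_t run;
  std::uint64_t event;
  int payload;
};

// Hypothetical source; each inner vector plays the role of one input file.
class SketchSource {
public:
  explicit SketchSource(std::vector<std::vector<SketchRecord>> files)
      : files_(std::move(files)) {}

  // Plays the role of setRunAndEventInfo(): reads the next record once,
  // caches it, updates the ID, and returns false when no data is left.
  bool setRunAndEventInfo(SketchEventID& id) {
    haveRecord_ = false;
    // Move on to the next file while the current one is exhausted.
    while (fileIndex_ < files_.size() &&
           recordIndex_ >= files_[fileIndex_].size()) {
      ++fileIndex_;
      recordIndex_ = 0;
      // A real source would call incrementFileIndex() at this point.
      if (fileIndex_ < files_.size()) ++fileSwitches_;
    }
    if (fileIndex_ >= files_.size()) return false;
    cached_ = files_[fileIndex_][recordIndex_++];
    haveRecord_ = true;
    id.run = cached_.run;
    id.event = cached_.event;
    return true;
  }

  // Plays the role of produce(): uses the cached record, with no second read.
  int produce() const {
    assert(haveRecord_);
    return cached_.payload;
  }

  std::size_t fileSwitches() const { return fileSwitches_; }

private:
  std::vector<std::vector<SketchRecord>> files_;
  std::size_t fileIndex_ = 0;
  std::size_t recordIndex_ = 0;
  std::size_t fileSwitches_ = 0;
  SketchRecord cached_{};
  bool haveRecord_ = false;
};

// Drives the source in the order the text describes: ID first, then products.
std::vector<int> runSketch(SketchSource& source, SketchEventID& id) {
  std::vector<int> payloads;
  while (source.setRunAndEventInfo(id)) payloads.push_back(source.produce());
  return payloads;
}
```

In a real source the cached record would be a data member of your class derived from ProducerSourceFromFiles, filled in setRunAndEventInfo() and consumed in produce().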

Configuration parameters for ProducerSourceBase

The following configuration parameters are read and used by ProducerSourceBase:

  • untracked uint32 firstRun - The number of the first run. The default value is 1. 0 is an illegal run number.

  • untracked uint32 firstLuminosityBlock - The number of the first luminosity block. The default value is 1. 0 is an illegal luminosity block number. firstLuminosityBlock was not supported prior to CMSSW_1_3_0_pre1.

  • untracked uint32 firstEvent - The number of the first event. The default value is 1. 0 is an illegal event number.

  • untracked uint32 numberEventsInLuminosityBlock - The maximum number of events in a luminosity block. After this number of events is generated, the luminosity block number will be incremented by 1. The default is the value of maxEvents. numberEventsInLuminosityBlock was not supported prior to CMSSW_1_3_0_pre1.

  • untracked uint32 numberEventsInRun - The maximum number of events in a run. After this number of events is generated, the run number will be incremented by 1, and the event number and luminosity block number will begin again at 1. The default is the value of maxEvents.

  • untracked uint32 firstTime - The value of the timestamp before the first event. The default value is 0. Note that the timestamp will be incremented by timeBetweenEvents once before the first event.

  • untracked uint32 timeBetweenEvents - The time interval between events. The default value is 5,000,000. The unit of the time interval is nanoseconds, and 5,000,000 corresponds to 200 events/second. This number affects only the event timestamps. The time delay is not simulated (no delay in generation).
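
To make the interplay of these parameters concrete, here is a small standalone model of the numbering rules as described in the list above. It is a sketch under stated assumptions and not the framework's implementation: SketchConfig, SketchStamp, sketchStamps and eventsPerSecond are hypothetical names, a value of 0 for the two "numberEventsIn..." fields is this sketch's own way of saying "default to maxEvents", and event numbers are assumed to keep counting across a luminosity block boundary, since the text only mentions a reset at a run boundary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; NOT the framework's classes.
struct SketchConfig {
  std::uint32_t firstRun = 1;
  std::uint32_t firstLuminosityBlock = 1;
  std::uint32_t firstEvent = 1;
  std::uint32_t numberEventsInRun = 0;              // 0: use maxEvents (sketch convention)
  std::uint32_t numberEventsInLuminosityBlock = 0;  // 0: use maxEvents (sketch convention)
  std::uint64_t firstTime = 0;
  std::uint64_t timeBetweenEvents = 5000000;        // nanoseconds
};

struct SketchStamp {
  std::uint32_t run;
  std::uint32_t lumi;
  std::uint64_t event;
  std::uint64_t time;
};

// Returns the run, lumi, event number and timestamp of each generated event.
std::vector<SketchStamp> sketchStamps(const SketchConfig& c, std::uint32_t maxEvents) {
  const std::uint32_t perRun = c.numberEventsInRun ? c.numberEventsInRun : maxEvents;
  const std::uint32_t perLumi =
      c.numberEventsInLuminosityBlock ? c.numberEventsInLuminosityBlock : maxEvents;
  std::vector<SketchStamp> out;
  SketchStamp s{c.firstRun, c.firstLuminosityBlock, c.firstEvent, c.firstTime};
  std::uint32_t inRun = 0;
  std::uint32_t inLumi = 0;
  for (std::uint32_t i = 0; i < maxEvents; ++i) {
    if (i > 0) {
      if (inRun == perRun) {
        // New run: event and luminosity block numbers begin again at 1.
        ++s.run;
        s.lumi = 1;
        s.event = 1;
        inRun = 0;
        inLumi = 0;
      } else if (inLumi == perLumi) {
        // New luminosity block within the same run.
        ++s.lumi;
        ++s.event;
        inLumi = 0;
      } else {
        ++s.event;
      }
    }
    // The timestamp is incremented once before the first event as well.
    s.time += c.timeBetweenEvents;
    ++inRun;
    ++inLumi;
    out.push_back(s);
  }
  return out;
}

// Event rate implied by a time interval given in nanoseconds.
std::uint64_t eventsPerSecond(std::uint64_t nanosecondsBetweenEvents) {
  assert(nanosecondsBetweenEvents != 0);
  return 1000000000ULL / nanosecondsBetweenEvents;
}
```

With numberEventsInRun = 4 and numberEventsInLuminosityBlock = 2, the fifth event generated by this model falls in run 2, luminosity block 1, with event number 1. The default interval of 5,000,000 ns gives 1,000,000,000 / 5,000,000 = 200 events per second, matching the figure quoted above.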

Configuration parameters for ProducerSourceFromFiles

The parameters for ProducerSourceFromFiles are identical to those of ProducerSourceBase, with one additional parameter:

  • untracked vstring fileNames - A vector of file names. This is a required parameter. Each entry identifies a file or other external source of data. This parameter is read and stored by the framework, but your input source must do its own I/O from these external sources. If a specified file name contains a colon (':'), it is interpreted as a physical file name and is not modified by the framework. If a specified file name does not contain a colon, it is interpreted as a logical file name: the framework looks up the name in the file catalog and translates it into a physical file name, throwing an exception if no catalog entry is found.
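
The colon rule can be illustrated with a few lines of standalone C++. This is only a sketch of the rule as stated above, not the framework's catalog code: SketchCatalog and resolveFileName are hypothetical names, and a std::map stands in for the file catalog.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the file catalog: logical name -> physical name.
using SketchCatalog = std::map<std::string, std::string>;

// Applies the rule from the text: a name containing ':' is physical and is
// returned unmodified; any other name is logical and must be in the catalog.
std::string resolveFileName(const std::string& name, const SketchCatalog& catalog) {
  if (name.find(':') != std::string::npos) {
    return name;
  }
  const auto entry = catalog.find(name);
  if (entry == catalog.end()) {
    throw std::runtime_error("no catalog entry for logical file name: " + name);
  }
  return entry->second;
}
```

Your own source never needs to do this translation; the sketch only shows which of the two interpretations a given entry of fileNames will receive.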

Writing a RawInputSource

RawInputSource was designed explicitly for use by the DAQ online source. It can be considered as an alternative to ProducerSourceFromFiles if the event number, run number, and timestamp are taken directly from your data and do not need to be configurable, and you don't need Event::put() to put products into the event. However, RawInputSource is a much lower-level interface than ProducerSourceFromFiles, so you will have to handle more details yourself.

Your class must contain the following methods:

  • A constructor, as documented below
  • A checkNextEvent() method, as documented below
  • A read() method, as documented below

Your class may optionally contain the following methods:

  • Any other methods your class needs internally.

Making an event with a RawInputSource

Placeholder: checkNextEvent() to be documented here. read() to be documented here.

RawInputSource provides no fileNames parameter. If you need one, you must implement it in your source.

Review Status

Reviewer/Editor and Date          Comments
WilliamTanenbaum - 15 Jan 2007    page author
JennyWilliams - 01 Feb 2007       editing to include in SWGuide
WilliamTanenbaum - 26 Mar 2007    editing for maxEvents changes
WilliamTanenbaum - 17 Oct 2012    RawInputSource has changed
WilliamTanenbaum - 07 Nov 2012    RawInputSource has changed again
WilliamTanenbaum - 13 Nov 2012    ProducerSourceBase introduced
WilliamTanenbaum - 19 Nov 2012    ProducerSourceFromFiles introduced
WilliamTanenbaum - 06 Mar 2013    incrementFileIndex() introduced
WilliamTanenbaum - 17 Sep 2016    Replace auto_ptr with unique_ptr

Responsible: WilliamTanenbaum
Last reviewed by: Reviewer

Topic revision: r30 - 2016-09-17 - WilliamTanenbaum



 