Luminosity block data and run data


Purpose of this document

The purpose of this document is to define the concepts of luminosity block data and run data, and to define the public (i.e. outside the framework) interface to this data and the behavior one should expect. Except for the last section titled Design Considerations (which readers might want to skip), the internals of this implementation are not discussed.

Events, Luminosity Blocks, and Runs

For the purpose of this document, an event is the recorded data that corresponds to one beam crossing. A luminosity block is a collection of temporally consecutive events. A run is a collection of temporally consecutive luminosity blocks. Prior to CMSSW_1_3_0, there was no ability to define per run data or per luminosity block data. Only per event data was supported.

Event Data, Run Data, and Luminosity Block Data

Event data consists of EDProducts (Event Data Products). EDProducts are capable of persistence. Luminosity block data and run data will also consist of EDProducts. There is no difference in requirements on the C++ classes defining event, luminosity block, and run EDProducts. Logically event data corresponds to a particular event, run data consists of data corresponding to either an entire run or the parts of the run processed when the product was created, while luminosity block data consists of data corresponding to either an entire luminosity block or the parts of a luminosity block processed when the product was created.

The Event object

The user's access to event data is controlled by an object of the class Event. This is a transient object created and owned by the framework. The user does not explicitly create the Event. The user may produce new EDProducts and put them into the Event, or get products from the Event.

The Run and LuminosityBlock objects

These are transient objects created and owned by the framework, analogous to the Event object. The user does not explicitly create the Run or the LuminosityBlock. The user may produce new EDProducts and put them into the Run or LuminosityBlock, or get products from the Run or LuminosityBlock. The public interface to an object of class Run or of class LuminosityBlock is analogous to the interface of class Event. Many parts of these interfaces are identical, although there are some differences:
  • Class Event has functions getRun() and getLuminosityBlock() to access the corresponding Run or LuminosityBlock.
  • Class LuminosityBlock has function getRun() to access the corresponding Run.
  • The id() function of LuminosityBlock will return an object of type LuminosityBlockID which contains both the run number and luminosity block number.
  • The id() function of Run will return an object of type RunID which contains the run number.
For both runs and luminosity blocks, valid numbers begin with 1; zero is an invalid value.

Producing an EDProduct and putting it in the Event

A per event EDProduct can be produced by an EDProducer, an InputSource, or (rarely) an EDFilter, as described in SWGuideCreatingNewProducts and SWGuideInputSources. In all cases, as documented in those links, the user must declare any product that might be produced in the producer, source, or filter constructor, and put any product actually produced into the event in the produce() function of the producer or source (or the filter() function of the filter).

Producing an EDProduct and putting it in the Run or LuminosityBlock

An EDProduct can be produced by an EDProducer, an InputSource, or (rarely) an EDFilter, and then put into a LuminosityBlock or Run in a manner very similar to that for putting an EDProduct into an Event. There are three main differences. First, when the producer, source, or filter constructor declares which products it produces, an extra template parameter must be provided. For example, if a product of type ThingCollection were going to be produced, the module constructor would use one of the following declarations, depending on the transition at which the product will actually be put:

  • produces<ThingCollection, edm::Transition::BeginLuminosityBlock>("instanceName")
  • produces<ThingCollection, edm::Transition::EndLuminosityBlock>("instanceName")
  • produces<ThingCollection, edm::Transition::BeginRun>("instanceName")
  • produces<ThingCollection, edm::Transition::EndRun>("instanceName")

Second, the put function is invoked as a member function of the Run or LuminosityBlock instead of as a member function of the Event. And third, the put function is called from a different function depending on the circumstances. For stream type modules these functions are described in FWMultithreadedFrameworkStreamModuleInterface, for global modules in FWMultithreadedFrameworkGlobalModuleInterface, and for one type modules in FWMultithreadedFrameworkOneModuleInterface. Legacy type modules are no longer allowed to put products into a Run or LuminosityBlock.
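As an illustration, a global producer that puts a product into the LuminosityBlock at the end-luminosity-block transition might look like the following sketch. ThingCollection, the module name, and the instance label are hypothetical (ThingCollection is stood in for by a vector here), and the interfaces follow the multithreaded framework pages linked above; consult those pages for the authoritative signatures.

```cpp
// Sketch only: a global module producing a per luminosity block product.
#include "FWCore/Framework/interface/global/EDProducer.h"
#include "FWCore/Framework/interface/Event.h"
#include "FWCore/Framework/interface/LuminosityBlock.h"
#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include <memory>
#include <vector>

using ThingCollection = std::vector<int>;  // stand-in for a real product type

class ThingLumiProducer
    : public edm::global::EDProducer<edm::EndLuminosityBlockProducer> {
public:
  explicit ThingLumiProducer(edm::ParameterSet const&) {
    // Declare the product together with the transition at which it is put.
    produces<ThingCollection, edm::Transition::EndLuminosityBlock>("instanceName");
  }

  void produce(edm::StreamID, edm::Event&, edm::EventSetup const&) const override {
    // Per event work (e.g. accumulation) would go here.
  }

  void globalEndLuminosityBlockProduce(edm::LuminosityBlock& lumi,
                                       edm::EventSetup const&) const override {
    // put is invoked on the LuminosityBlock, not the Event.
    lumi.put(std::make_unique<ThingCollection>(), "instanceName");
  }
};
```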

Getting a product from the Event

The user may retrieve an EDProduct from the Event by using any of the methods documented in this link: SWGuideEDMGetDataFromEvent. Retrieving an EDProduct does not modify the content of the Event.

Getting a product from the Luminosity Block or Run

Products can be retrieved from the LuminosityBlock or Run by invoking the appropriate get function, just as they can be retrieved from the Event. If needed, the LuminosityBlock or Run can first be retrieved from the Event with a call to getLuminosityBlock() or getRun(), respectively. For the most part the interface is the same for Run and LuminosityBlock as for Event. Again there are three main differences. First, when the module constructor declares which products the module consumes, an extra template parameter must be provided. For example, if a product of type Thing were going to be consumed, the module constructor would call one of the following:

  • consumes<Thing,edm::InRun>(inputTag)
  • consumes<Thing,edm::InLumi>(inputTag)

Second, the get function (for example, getByToken) would be invoked as a member function of the Run or LuminosityBlock instead of as a member function of the Event. And third, the get function would be called from a different function depending on the circumstances. For stream type modules these functions are described here: FWMultithreadedFrameworkStreamModuleInterface. For global modules they are described here: FWMultithreadedFrameworkGlobalModuleInterface. For one type modules they are described here: FWMultithreadedFrameworkOneModuleInterface.
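For illustration, an analyzer retrieving a per run product at the endRun transition might be sketched as follows. Thing, the module name, and the input tag are hypothetical, and the interfaces follow the multithreaded framework pages linked above; consult those pages for the authoritative signatures.

```cpp
// Sketch only: a one type analyzer consuming a per run product.
#include "FWCore/Framework/interface/one/EDAnalyzer.h"
#include "FWCore/Framework/interface/Event.h"
#include "FWCore/Framework/interface/Run.h"
#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include "DataFormats/Common/interface/Handle.h"

struct Thing {  // stand-in for a real product type
  double value;
};

class ThingReader : public edm::one::EDAnalyzer<edm::one::WatchRuns> {
public:
  explicit ThingReader(edm::ParameterSet const&)
      // The extra edm::InRun template parameter marks this as a per run product.
      : token_(consumes<Thing, edm::InRun>(edm::InputTag("thingProducer"))) {}

  void analyze(edm::Event const&, edm::EventSetup const&) override {}
  void beginRun(edm::Run const&, edm::EventSetup const&) override {}

  void endRun(edm::Run const& run, edm::EventSetup const&) override {
    edm::Handle<Thing> handle;
    run.getByToken(token_, handle);  // getByToken invoked on the Run
  }

private:
  edm::EDGetTokenT<Thing> token_;
};
```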

Products in a Run or LuminosityBlock might be mergeable (this is defined below). It is highly recommended that mergeable run products be retrieved in an endRun method and mergeable luminosity block products in an endLuminosityBlock method, rather than in a beginRun, beginLuminosityBlock, or event method. The values in mergeable products can change as additional fragments of the Run or LuminosityBlock are merged. This merging happens any time a new input file is opened and the first run or luminosity block in the newly opened file continues the last run or luminosity block from the previous file. (Similar merging of all fragments of a run or luminosity block within a single input file also occurs while cmsRun processes files, but that all occurs before the beginRun transition.)

Note that when the secondary file input feature is being used, mergeable run and luminosity block products from the secondary file are automatically dropped on input (see here for more information about secondary file input: SWGuideEDMParametersForModules#PoolSource).

Making the Data persistent

The EDProducts making up an event, luminosity block, or run can be written to persistent store by invoking the PoolOutputModule in the configuration file, as documented in SWGuideSelectingBranchesForOutput. By default, all EDProducts are written to persistent store. In a later job, the EDProducts can be read back in from persistent store and the Event, LuminosityBlock, or Run object reconstituted by using PoolSource as the primary source in the configuration file, as documented in SWGuidePoolInputSources. In persistent store, the event data is stored in a ROOT tree named Events, while the run data and luminosity block data are stored in ROOT trees named Runs and LuminosityBlocks, respectively.

Limitations

  • edm::Ref, edm::Ptr, edm::RefToBase, and any variations of them, are supported only for per event products.
  • The View feature is not supported by Run or LuminosityBlock.
  • Filters only affect the execution of the event functions of modules on a Path. Filters do not have the ability to stop the execution of the functions associated with the begin/end transitions for luminosity blocks or runs; they work as filters only on a per event basis. The run and luminosity block functions are always executed.
  • Modules that produce run or luminosity block products can be configured to run unscheduled, but only the event methods have the special unscheduled behavior. The begin/end run and luminosity block functions are always run.

Note there is also a special mode for running modules unscheduled that produce run or luminosity block products but no event products, when you still want the event method to run on every event. For more information about this special case see:

Merging Run and Luminosity Block Products

Run products get merged during or after processes that merge multiple input files into one output file. The same is true for luminosity block products. Event products do not get merged. Let's start by considering a simple example of this merging.

Assume for the purpose of this example that we have two files, each containing 10 events. Also assume that all the events and luminosity blocks in the files are associated with the same ProcessHistoryID, run number, and luminosity block number, and that the runs in the files are associated with the same ProcessHistoryID and run number as the events and luminosity blocks. Assume each of these files has 10 entries in its Events TTree, 1 entry in its Runs TTree, and 1 entry in its LuminosityBlocks TTree. If one runs a cmsRun process that uses both of these files as input and only these two files, one can create a single output file. This merging process will see 1 beginRun transition, 1 beginLuminosityBlock transition, 20 event transitions, 1 endLuminosityBlock transition, and finally 1 endRun transition. In the merged output file, there will be 1 entry in the Runs TTree, 1 entry in the LuminosityBlocks TTree, and 20 entries in the Events TTree. For the events this is simple: the products are copied forward, with newly produced products added. There is a one to one correspondence between event entries in the input and output, and also between the products contained in those TTree entries (except of course for newly produced or dropped products). There is no merging of event products. But for runs and luminosity blocks, two different products (one from each input file) must be used to create each product in the output file.

Constant and Mergeable Products

There are two different kinds of run and luminosity block products. For the first kind, we expect all the products associated with the same run (or luminosity block) to have exactly the same content. Let's call these constant products. For example, a product might contain trigger settings, and CMS might have a rule that a new run must be started if these trigger settings are changed, or that they can only be changed at the start of a new luminosity block. For the second kind, we expect the product to accumulate some quantity from events or luminosity blocks as they are processed. An example might be a histogram of the transverse momentum of all tracks in all events processed. Let's call these mergeable products.

The Framework requires that the C++ class that defines a constant product must have a member function with the following signature:

    bool isProductEqual(ProductType const& product) const;

The Framework uses this function to check that the product is in fact the same in all cases where two products are to be combined into one. If the function returns false, a MessageLogger error is printed. It is up to the person defining the function what criteria are checked. After the comparison, the products are combined by keeping the first product and its values and ignoring the subsequent ones. Note this is a const function.
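For illustration, a constant product class might implement isProductEqual as in the following sketch. TriggerSettings and its contents are hypothetical names invented here, not an actual CMSSW data format.

```cpp
#include <utility>
#include <vector>

// Hypothetical constant run product holding trigger prescale settings.
class TriggerSettings {
public:
  explicit TriggerSettings(std::vector<int> prescales = {})
      : prescales_(std::move(prescales)) {}

  // Called by the Framework when two run (or luminosity block) entries are
  // combined; returning false causes a MessageLogger error to be printed.
  bool isProductEqual(TriggerSettings const& other) const {
    return prescales_ == other.prescales_;
  }

private:
  std::vector<int> prescales_;
};
```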

The Framework requires that the C++ class that defines a mergeable product must have a member function with the following signature:

    bool mergeProduct(ProductType const& product);

The Framework uses this function to merge together two products which are accumulating something.

For example, the mergeProduct function might add the contents of two histograms bin by bin to form one new histogram. It is up to the developer who defines the function what it does exactly. Note that this function is not const and it is expected that in most cases it will modify the object as it merges in the other product.
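A minimal mergeable product might look like the following sketch. TrackPtHistogram and its members are hypothetical names invented here, not an actual CMSSW data format; the sketch also includes the swap member function that the 10_3_X changes (described later in this document) require of mergeable run products.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical mergeable run product: a histogram accumulated over events.
class TrackPtHistogram {
public:
  explicit TrackPtHistogram(std::size_t nBins = 0) : bins_(nBins, 0) {}

  void fillBin(std::size_t i) { ++bins_[i]; }
  long binContent(std::size_t i) const { return bins_[i]; }

  // Called by the Framework to merge two accumulating products;
  // here the bin contents are simply added bin by bin.
  bool mergeProduct(TrackPtHistogram const& other) {
    if (bins_.size() != other.bins_.size())
      return false;  // incompatible binning, cannot merge
    for (std::size_t i = 0; i < bins_.size(); ++i)
      bins_[i] += other.bins_[i];
    return true;
  }

  // Required of mergeable run products by the 10_3_X changes.
  void swap(TrackPtHistogram& other) { bins_.swap(other.bins_); }

private:
  std::vector<long> bins_;
};
```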

The Framework determines whether a product is constant or mergeable by checking whether the above functions exist in the C++ class definition. If the product class defines neither the mergeProduct nor the isProductEqual function, then the MessageLogger prints a warning every time the Framework tries to combine two products into one (in this case the Framework treats them as constant products without running any equality check). To get rid of this warning one must define one of the two functions. Do not define both: it is confusing, and the Framework will silently use the mergeProduct function and ignore the isProductEqual function if both are defined.

ProcessHistoryID, Run Numbers, and LuminosityBlock Numbers

The Framework will never try to combine two runs if they have different ProcessHistoryIDs or run numbers. If either the ProcessHistoryID or the run number differs, the products are written into separate TTree entries, and the Framework orders its work so that the runs and everything they contain are processed separately, in non-overlapping sequences that each start with a beginRun transition, continue with event transitions, and finish with an endRun transition.

The analogous thing is true for luminosity blocks, except to be combined they must also have the same luminosity block number in addition to having the same ProcessHistoryID and run number.

The ProcessHistoryID depends on many things. It depends on all processes in the processing history of a run, luminosity block or event where any module declared that it produced a persistent product or where a Path was defined (because definition of a Path causes a TriggerResults product to be produced). The ProcessHistoryID depends on the order of those processes and their process names. It also depends on the release version of those processes. And it depends on the tracked parts of the ParameterSet defining the process and the tracked objects the top level ParameterSet contains. For example, if you modified one tracked parameter in the configuration of one producer, then the ProcessHistoryID would be different. This would cause mergeable products with and without that change to never be merged.

To be very precise, one should mention that the ProcessHistoryID used is actually the "reduced" ProcessHistoryID. This means that many things in the configuration that do not affect the products in the output file, such as the configurations of EDAnalyzers and OutputModules on EndPaths, are dropped before calculating the ProcessHistoryID. Also, the last of the three numbers in the release name is dropped when calculating the reduced ProcessHistoryID.

Contiguous Runs and Luminosity Blocks

Runs must be contiguous in the processing order to be merged. If a run is encountered with a different ProcessHistoryID or run number, then the run in memory is written to the output TTree and after that point it is not modified in that process.

Within a single input file, the Framework sorts so that all runs, luminosity blocks, and events within that single file that have the same run number and ProcessHistoryID are processed contiguously. This sorting does not extend across input file boundaries. Everything from an input file is processed before moving to the next input file.

When multiple input files are merged, this can and often does create output files that contain multiple entries corresponding to the same ProcessHistoryID and run number. These entries do not get merged together until the next process, where they are merged as they are read.

The analogous thing is true for LuminosityBlocks. The only difference is that the luminosity block number is also used.

Almost always, the contents of an input file are processed in order of first appearance in the TTree. This means one starts with the first entry in the TTree and continues processing everything with that ProcessHistoryID and run number in that file until it is all done, then moves on to whatever appears next in the TTree. The analogous thing is done for luminosity blocks. An alternate order can be configured but is almost never used (see SWGuideEDMParametersForModules#PoolSource for details about the noEventSort parameter). These sort orders apply only within one input file; everything in a file is always processed before moving to the next file.

A design restriction with mergeable products

Before the 10_3_X release series, the mergeProduct function was always called when trying to combine two mergeable run or luminosity block products. Always! This creates a restriction on the ways data containing mergeable run or luminosity block products can be processed. The following example illustrates a problem case:

  1. Assume there is a file containing one run with 20 events.
  2. Then assume a process is run that creates a mergeable run product and that process runs over all 20 events and is written to an output file. The histograms in that product correspond to all 20 events.
  3. Then assume there is another processing step which consists of 2 separate processes. One process runs over the first 10 events and the other process runs over the other 10 events. Both output files contain the same histograms which correspond to all 20 events even though there are only 10 events in each file.
  4. Then there is another processing step where one process uses both files as input and merges them into a single output file. Assume the mergeProduct function adds the histograms bin by bin, so that in the output of this job the histogram bins are incorrectly set to double the values they should hold.

The design prior to the 10_3_X release series does not allow splitting an input file unless the split is done on a run boundary! It is a CMS rule that the splitting done in step 3 of the above example is not allowed, because it is not done on a run boundary. If you split a file this way, you will silently get invalid values in any mergeable run or luminosity block products.

Proposed changes for the 10_3_X release series

(When this was originally written the changes were just a proposal. Since then those changes were approved and merged into the 10_3_X release series.)

The changes proposed for the 10_3_X release series would make it possible to split runs on luminosity block boundaries and merge those split files back together without causing problems in the content of mergeable run products. The handling of luminosity block mergeable products does not change. Splitting runs in places other than luminosity block boundaries is also still not supported if there are mergeable run products.

Here is how it will work. The Framework will start keeping track of which luminosity blocks were processed when mergeable run products were created. It keeps track of this for every run and also for every process that created a run mergeable product that is in the input. In addition, the Framework will keep track of which luminosity blocks are processed in the current process. By comparing luminosity block numbers processed currently with those processed when the product was created, the Framework can detect when a particular run entry was split and when a run mergeable product represents more luminosity blocks than were actually processed to make a file. This information is recorded in each output file. Depending on the result of this comparison, the Framework will select one of two possible merging behaviors.

  1. If a particular run mergeable product in a particular run entry represents the same set of luminosity blocks as were processed when creating the run entry, then the merging behavior is the same as before. The mergeProduct function is used. This will be true if files were never split. It will also be true if files were split but all the fragments were already merged back together. Note that this merging can occur in one process if all the data to be merged is processed contiguously, but if the processing is not contiguous it may not be until after the succeeding process that all run entries have been merged in the data file.
  2. If a particular run mergeable product in a particular run entry does not represent the same set of luminosity blocks as were processed when creating the run entry (input was split and is still split), then a different algorithm is used.

In the second case, the Framework will compare the set of luminosity blocks associated with each of the two mergeable run products being combined. These are the sets of luminosity blocks processed when the products to be combined were created.

  • If the sets of luminosity block numbers are disjoint, then the mergeProduct function will be used.
  • Else if the set of luminosity block numbers of the existing product contains the set of luminosity block numbers of the product being merged, then nothing is done (this includes the common case where the sets are the same).
  • Else if the set of luminosity block numbers of the product being merged contains the set of luminosity block numbers corresponding to the existing product then the existing product is replaced by the product being merged.
  • Otherwise, it is impossible to properly merge the objects. The mergeProduct function is used just as in the first case, but the product is marked in the Provenance as knownImproperlyMerged, and this can be queried through the Provenance from the Handle when the product is retrieved.
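The four cases above can be sketched with standard-library set operations. This is an illustration of the selection rules only, not the Framework's actual implementation; the type and function names are invented for this sketch.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

enum class MergeAction { Merge, KeepExisting, ReplaceExisting, MergeAndFlagImproper };

// Decide how to combine two mergeable run products, given the sets of
// luminosity block numbers processed when each product was created.
MergeAction chooseMergeAction(std::set<unsigned> const& existingLumis,
                              std::set<unsigned> const& incomingLumis) {
  // Disjoint sets: safe to call mergeProduct.
  std::set<unsigned> overlap;
  std::set_intersection(existingLumis.begin(), existingLumis.end(),
                        incomingLumis.begin(), incomingLumis.end(),
                        std::inserter(overlap, overlap.begin()));
  if (overlap.empty())
    return MergeAction::Merge;

  // Existing product already covers the incoming one (includes equal sets).
  if (std::includes(existingLumis.begin(), existingLumis.end(),
                    incomingLumis.begin(), incomingLumis.end()))
    return MergeAction::KeepExisting;

  // Incoming product covers the existing one: replace the existing product.
  if (std::includes(incomingLumis.begin(), incomingLumis.end(),
                    existingLumis.begin(), existingLumis.end()))
    return MergeAction::ReplaceExisting;

  // Partial overlap: a proper merge is impossible; flag knownImproperlyMerged.
  return MergeAction::MergeAndFlagImproper;
}
```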

If one has used getByToken to get a handle to a mergeable run product, then one can call a Provenance function to determine whether the problem discussed in the last bullet above has occurred:

    handle.provenance()->knownImproperlyMerged()

Here is an example of how the problem discussed in the last bullet above can occur.

  1. Assume we start with three files containing only the same run and assume they contain a mergeable run product. Assume one file contains lumis 1-10, the second file contains lumis 11-20, and the third file contains lumis 21-30.
  2. Then assume a subsequent processing step consisting of 4 processes that produces 4 split files. Say the split files contain lumis as follows: file 1 contains lumis 6-10, file 2 contains 11-15, file 3 contains 16-20, and file 4 contains 21-25.
  3. Then assume there is another processing step where split file 1 and split file 2 are merged to form a file containing lumis 6-15 and also split file 3 and split file 4 are merged to form a file containing lumis 16-25.
  4. Finally we are left with two files that cannot be properly merged. They both contain run products which represent ranges of luminosity blocks that partially overlap and partially do not overlap. There is no way to properly merge these products. This is the case where the product Provenance gets marked as knownImproperlyMerged.

If we modify processing step 3 in the above example so that all 4 split files are processed in the same process and the run entries are processed contiguously, there will be no problem with the merged output. But it is important to realize that even if the 4 split files are merged in one process into one output file, there will still be a problem if the processing of the run fragments is not contiguous and some other run is processed between the first two run fragments and the last two run fragments.

If runs or luminosity blocks are split at other than luminosity block boundaries, then problems with merging will occur. These problems will likely not be detected by the Framework, and the knownImproperlyMerged function may return false even when they occur: the function only detects problems that can arise when the split occurred on a luminosity block boundary.

The changes proposed for the 10_3_X release series add a requirement that the C++ classes defining the data format of mergeable run products have a swap member function.

The changes proposed for the 10_3_X release series add an additional check when there is an attempt to combine two mergeable products. If one product was dropped and the other was not dropped, then an exception will be thrown. Similarly, if one product was actually put into the run by the producer and the other product was not actually put into the run by the producer an exception will be thrown. If inconsistent input files need to be merged then the offending branches can be dropped on input to avoid these exceptions.

Design considerations

We considered performance when implementing the behaviors described above. This affected the design. Also there are optimizations in the implementation to reduce overheads due to support for run mergeable products. In cases where the number of runs and luminosity blocks is small or moderate, the overheads should be negligible. In cases where there are very large numbers of luminosity blocks or runs, one might consider profiling a process to determine if the overheads are tolerable and also looking at the size of the branch in the output file (TBranch MergeableRunProductMetadata in TTree MetaData exists only to support mergeable run products). If the overheads become unacceptably large, one could drop all the run mergeable products on input and then the overheads should be negligible. Almost nothing is done or saved in that case.

Note that when the input is not split, the persistent information that needs to be stored is already stored in the IndexIntoFile object that is required in every data file for unrelated purposes. The implementation takes advantage of that.

There are alternative designs and some comments follow on the choices we made.

We require splitting occur on luminosity block boundaries and only let runs be split. This requires the implementation to keep track of many luminosity block numbers. Instead one could keep track of event numbers or event ranges. This would allow supporting splitting files at any event instead of only at luminosity block boundaries. This would significantly increase the overheads, because there would be a lot more event numbers than luminosity block numbers that we would have to keep track of. This increased overhead is why we did not make that choice.

One might consider looking at the configuration of the current process and previous processes. Then one might try to determine the history of how files were split and merged from this configuration information. The problem with this approach is that one use case driving the splitting is that we may run on resources where availability is limited and unpredictable. We may have to stop a process because the resource is no longer available, not because of some configured parameters. We may not know in advance how long a resource will be available. So in general an approach using configuration information would not work.

One might consider filling a database containing information related to how files have been created, what runs and luminosity blocks were initially created and the splitting/merging that has occurred. This is not something that could be implemented solely within the Core Framework code. Unless it has information from outside the process, the Framework can do little more than observe and record the runs, luminosity blocks and events that are processed. And this limits what can be done.

Another alternative would be to have a top level parameter that controls the merging behavior in a particular process. This would be useful if a file was split in one processing step and then there was a dedicated merging step that followed, where it was known in advance that the mergeProduct function should never be called. This would support splits at arbitrary points and the Framework part of this would be simple to implement. We did not implement this because there was no indication that anyone wants to run in this manner. There would be many issues that would arise in designing workflows with a processing step dedicated to merging run and luminosity block products.

Another consideration is that in practice there are not many mergeable products in existence. As of July 2018, here is the list of the mergeable products in the CMSSW repository, excluding the ones that exist only for unit tests:

  • CondFormats/Common/interface/FileBlobCollection.h
  • DataFormats/Common/interface/MergeableCounter.h
  • DataFormats/Histograms/interface/MEtoEDMFormat.h
  • DataFormats/NanoAOD/interface/MergeableCounterTable.h
  • SimDataFormats/GeneratorProducts/interface/GenFilterInfo.h
  • SimDataFormats/GeneratorProducts/interface/GenLumiInfoProduct.h
  • SimDataFormats/GeneratorProducts/interface/LHERunInfoProduct.h
  • SimDataFormats/GeneratorProducts/interface/LHEXMLStringProduct.h

In designing future improvements to code that merges run and luminosity block products, one should ask the questions:

  • In what workflows and processing steps do we need to keep these products?
  • What are the actual file splitting needs in those workflows and processing steps?
  • Is it worth resources to develop code and execute all the bookkeeping necessary to support file splitting for the use cases that exist?

One possible alternative would be to require that all the mergeable products be dropped when files are split. Then all these issues vanish.

Review Status

Reviewer/Editor and Date          Comments
WilliamTanenbaum - 17 Jan 2007    page created
JennyWilliams - 07 Feb 2007       editing to include in SWGuide
DavidDagenhart - 17 July 2018     Added section on merging; updated and made major revisions to the rest

Responsible: DavidDagenhart
Last reviewed by: DavidDagenhart - 17 July 2018

Topic revision: r13 - 2018-12-06 - DavidDagenhart



 