Subprocesses


Goal of this page

This page describes how two or more tiers (e.g. HLT and RECO) can be run in a single process.

This feature is available as of IB CMSSW_4_2_X_2011-01-30-0200. It will be in CMSSW_4_2_X_pre2. But see the caveat immediately below.

Caveat:

The code to propagate information needed to resolve references (edm::Ref, etc.) passed down from a process to a subprocess was implemented only as of CMSSW_5_0_X, and not in CMSSW_4_X_X. If you wish to use subprocesses in 4_X_X, please contact the framework group.

Overview of Subprocesses

The subprocess feature allows a user to run two or more processing steps (e.g. HLT and RECO) in a single process. Prior to CMSSW_8_0_X, the top level process could have at most one subprocess, and each subprocess could in turn have at most one subprocess of its own. There was no limit to the length of this chain, other than your job running out of a resource (e.g. memory). Beginning with CMSSW_8_0_X, the top level process and each subprocess may have multiple subprocesses. There is no limit on the number of subprocesses a process may have, other than your job running out of a resource.

A simple example is given here, of RECO running as a subprocess of HLT. The syntax used in a configuration file to specify subprocesses changed in CMSSW_8_0_X, so the example is shown twice. Note that the subprocess object has two optional top level parameters of its own, SelectEvents and outputCommands, which select events and products respectively in exactly the same manner as they do for an output module. By default, all events and products are kept.

Pre CMSSW_8_0_X example

import FWCore.ParameterSet.Config as cms
process = cms.Process("HLT")
#configuration for process goes here.
#This syntax only works in CMSSW_7_X_X and prior releases.
prodProcess = cms.Process("PROD") # The name 'prodProcess' is arbitrary.
process.subProcess = cms.SubProcess(prodProcess,
   # Optional SelectEvents parameter can go here.
   outputCommands = cms.untracked.vstring('keep *')  #Optional parameter, defaulting to "keep *"
)
#configuration for prodProcess goes here
#same format as for 'process', but using 'prodProcess' instead of 'process'

Post CMSSW_8_0_X example

import FWCore.ParameterSet.Config as cms
process = cms.Process("HLT")
#configuration for process goes here
#This syntax only works in CMSSW_8_0_X and subsequent releases.
prodProcess = cms.Process("PROD") # The name 'prodProcess' is arbitrary.
process.addSubProcess(cms.SubProcess(prodProcess,
   # Optional SelectEvents parameter can go here.
   outputCommands = cms.untracked.vstring('keep *')  #Optional parameter, defaulting to "keep *"
))
#configuration for prodProcess goes here
#same format as for 'process', but using 'prodProcess' instead of 'process'
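
As a further illustration, the configuration sketch below uses the CMSSW_8_0_X syntax to attach two subprocesses to one parent and to set both optional parameters on the first. The process names RECO and DQM, the path name 'hltPath', and the module label 'hltScratch' are hypothetical. The nested SelectEvents PSet assumes the same form that output modules use, which is an assumption based on the statement above that the parameter behaves as it does for an output module.

```python
import FWCore.ParameterSet.Config as cms

process = cms.Process("HLT")
# Configuration for the parent goes here. It is assumed to define
# a path called 'hltPath' (hypothetical name).

# First subprocess: sees only events that passed 'hltPath' in the parent,
# and does not see products made by the hypothetical module 'hltScratch'.
recoProcess = cms.Process("RECO")
process.addSubProcess(cms.SubProcess(recoProcess,
    SelectEvents = cms.untracked.PSet(
        SelectEvents = cms.vstring('hltPath')
    ),
    outputCommands = cms.untracked.vstring(
        'keep *',
        'drop *_hltScratch_*_*'
    )
))
# Configuration for recoProcess goes here.

# Second subprocess of the same parent, using the defaults
# (all events and all products are passed down).
dqmProcess = cms.Process("DQM")
process.addSubProcess(cms.SubProcess(dqmProcess))
# Configuration for dqmProcess goes here.
```

In releases before CMSSW_8_0_X only the single-subprocess assignment syntax shown earlier is available, so the second addSubProcess call has no equivalent there.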

Relation between a Subprocess and its Parent

With a few restrictions, the configuration of a subprocess is independent of the configuration of its parent:

  • A subprocess has its own modules, paths, end paths, all of which are independent of those in its parent.
  • A subprocess may also have its own services, most of which are independent of those in its parent. See restrictions, below.
  • A subprocess may also have its own options block, although only some of the parameters have an effect in subprocesses. See restrictions below.

Currently, the feature is implemented so that any configuration parameters that violate the restrictions in a subprocess are ignored, and will not cause an error.

The restrictions are:

  • A subprocess may not have a primary source. The input runs, luminosity blocks, and events are passed down directly from the parent. All products processed by the parent are passed to the subprocess, except those dropped on input in the parent, or by the outputCommands parameter of the subprocess. All runs, lumis, and events processed by the parent are passed to the subprocess, except those events dropped by the SelectEvents parameter of the subprocess.
  • A subprocess may not contain a looper. If the parent contains a looper, the subprocess will loop also.
  • A subprocess may not contain a top level maxEvents or maxLuminosityBlocks parameter set. These are inherited from the parent.
  • A subprocess may not reconfigure the MessageLogger service, or certain other specific services that have been designated as process-wide services. These are inherited from the parent.
  • In the options block in a subprocess, only the wantSummary parameter and the exception handling parameters (Rethrow, SkipEvent, FailPath, FailModule, IgnoreCompletely) are meaningful. Other parameters are either irrelevant to the subprocess or inherited from the parent.
  • Services and event setup are discussed in more detail in the following sections.
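
The effect of outputCommands on the products passed to a subprocess can be pictured with a small stand-alone model. This is only a sketch written in plain Python, not framework code: the function name select_products is hypothetical, the branch names are made up, and the pattern matching is done with Python's fnmatch rather than the framework's own matcher. It assumes that commands are applied in order with later commands overriding earlier ones, and that a product matched by no command is not passed down.

```python
from fnmatch import fnmatchcase


def select_products(branch_names, output_commands=("keep *",)):
    """Toy model of keep/drop selection over product branch names.

    Each command is 'keep <pattern>' or 'drop <pattern>'. Commands are
    applied in order, so the last matching command decides the outcome.
    """
    kept = []
    for name in branch_names:
        keep = False  # assumption of this sketch: unmatched products are dropped
        for command in output_commands:
            action, pattern = command.split()
            if action not in ("keep", "drop"):
                raise ValueError("unknown command: " + command)
            if fnmatchcase(name, pattern):
                keep = (action == "keep")
        if keep:
            kept.append(name)
    return kept


# Hypothetical branch names of the form type_label_instance_process.
branches = ["int_putInt_x_HLT", "int_putInt2_x_HLT", "Tracks_reco__HLT"]
passed_down = select_products(branches, ["keep *", "drop *_putInt2_*_*"])
```

With the commands in the usage lines, the subprocess in this model would receive every product of the parent except the one made by the module labeled putInt2.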

Subprocesses and Services

A service configured in the top level process or a subprocess is inherited by subprocesses. A service sees all signals generated by its own (sub)process and any subprocesses recursively, including path and module transitions. A service configured in a subprocess does not see module or path transition signals from its parent or ancestors.

By default, configurable services can be configured in more than one (sub)process. The default implementation of services in subprocesses is such that if the same service is configured in more than one (sub)process, each configured (sub)process will have its own instance of the service, operating independently.

For some services, such as EnableFloatingPointExceptions or RandomNumberGenerator, this is the desired behavior. For some other services, it might be better if the service in a parent overrode the service in a subprocess, so that there will be at most a single instance of the service. A mechanism to enable each service to select its own behavior has been implemented. Services that have a single instance, and cannot be reconfigured in a subprocess, include CPU, IgProf, InitRootHandlers, JobReport, MessageLogger, SiteLocalConfig, TFileAdaptor, and TFileService.

If multiple instances of the RandomNumberGenerator service are used and they are configured to save the engine states to enable replay, then the output filenames used for the saveFileName parameter must be different and the module labels used for the RandomEngineStateProducer must be different.
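
A configuration along the following lines would satisfy that requirement. It is a sketch only: the file names and the module labels rngStateHLT and rngStateRECO are hypothetical, the per-module seed settings that a real job needs are omitted, and it is assumed that the service is declared under the plugin name RandomNumberGeneratorService and that the CMSSW_8_0_X subprocess syntax is in use.

```python
import FWCore.ParameterSet.Config as cms

process = cms.Process("HLT")
recoProcess = cms.Process("RECO")
process.addSubProcess(cms.SubProcess(recoProcess))

# Parent: its own service instance, state file, and state producer label.
process.RandomNumberGeneratorService = cms.Service("RandomNumberGeneratorService",
    saveFileName = cms.untracked.string('rngStatesHLT.txt')   # hypothetical file name
    # per-module seed PSets omitted in this sketch
)
process.rngStateHLT = cms.EDProducer("RandomEngineStateProducer")
process.saveStates = cms.Path(process.rngStateHLT)

# Subprocess: a second, independent instance. Both the file name and
# the producer's module label differ from those used in the parent.
recoProcess.RandomNumberGeneratorService = cms.Service("RandomNumberGeneratorService",
    saveFileName = cms.untracked.string('rngStatesRECO.txt')  # hypothetical file name
)
recoProcess.rngStateRECO = cms.EDProducer("RandomEngineStateProducer")
recoProcess.saveStates = cms.Path(recoProcess.rngStateRECO)
```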

If you are the author of a service, and you wish to guarantee that there is only a single instance of your service in a process and that the service cannot be reconfigured in a subprocess, write a simple isProcessWideService function, as shown below for the MessageLogger service, and put it in the header file defining your service. The function must be in namespace edm::service, regardless of the namespace in which your service resides. Note that the function is not a member function of your service. It is not necessary that the function be inlined.

namespace edm {
   namespace service {
      class MessageLogger {
         ...  // class definition goes here
      };
      inline
      bool isProcessWideService(MessageLogger const*) {
         return true;
      }
   }
}

Regardless of whether there is a separate instance of your service for each subprocess or a single instance for the entire process, some rewriting of your service for subprocesses may be necessary. Each instance of your service sees all the signals not only from its own tier, but also from all its subprocess tiers, recursively. Depending on the nature of your service, it may therefore need to distinguish the signals and treat them differently according to the tier in which they originated.

SubProcess's and the EventSetup

For the most part, the EventSetup is run and configured in a SubProcess in the same way as in a normal process. There is one special feature of the EventSetup related to SubProcess's. When possible, ESSource's and ESProducer's are shared across SubProcess's. The purpose of this sharing is to save memory by having only one copy of the same data and same module in memory instead of one copy per SubProcess.

An ESSource is shared between two SubProcess's if its configuration is identical in both SubProcess's. Both tracked and untracked parameters must be identical. If they are not, then an instance of the ESSource and its associated data is created for each SubProcess (which is not a problem except for the fact that it takes more memory). This is done automatically. If a user wants this sharing to occur, then the user only needs to make sure the configurations are identical. A LogInfo message is printed during the initialization of the EventSetup to indicate when an ESSource is being shared.

There is one important rule related to ESSource's. An ESSource is not supposed to get data from the EventSetup, only deliver data. Checks were made for ESSource's that violated this rule in standard sequences, and the two violations that were found were fixed. At the moment there is nothing in the code to automatically enforce this requirement, although we are discussing how to add this in the future. Subtle problems could occur if this rule were violated.

The situation is similar for ESProducer's except more things must be true for an ESProducer to be shared. Here is a list of the requirements:

  • The two ESProducer's must have identical configurations.
  • All other ESProducer's and ESSource's associated with the same records as the shared ESProducer must have identical configurations.
  • There must be the same number of ESProducer's and ESSource's associated with the same records as the ESProducer.
  • There can be no looper associated with the same records as the ESProducer.
  • Plus the same must be true for all records that the records associated with the ESProducer depend on.

As for ESSource's, this sharing is automatic. The only thing a user needs to ensure is that the configurations are identical. There is a LogInfo level message that will be printed when ESProducer's are shared.
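
The requirements for sharing an ESProducer can be summarized in a small stand-alone model. This is an illustrative sketch in plain Python, not the framework's implementation: the classes ESModule and ESConfig, the function can_share, and the record names in the usage lines are all hypothetical. Record dependencies are supplied by hand as a dictionary, whereas in the framework they are a property of the record types.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ESModule:
    """An ESProducer or ESSource: label, full configuration, associated records."""
    label: str
    parameters: tuple      # (name, value) pairs, tracked and untracked alike
    records: frozenset     # names of the records the module is associated with


@dataclass
class ESConfig:
    """The EventSetup part of one (sub)process configuration."""
    modules: list
    looper_records: frozenset = frozenset()  # records a looper is associated with


def record_closure(records, depends_on):
    """The given records plus every record they depend on, recursively."""
    seen, todo = set(), list(records)
    while todo:
        record = todo.pop()
        if record not in seen:
            seen.add(record)
            todo.extend(depends_on.get(record, ()))
    return seen


def can_share(producer, config_a, config_b, depends_on=None):
    """True if this model would share 'producer' between the two configurations."""
    # The producer must be configured identically in both (sub)processes.
    if producer not in config_a.modules or producer not in config_b.modules:
        return False
    records = record_closure(producer.records, depends_on or {})
    # No looper may be associated with any of the relevant records.
    if records & (config_a.looper_records | config_b.looper_records):
        return False

    def relevant(config):
        # Multiset of all modules touching the relevant records.
        return Counter(m for m in config.modules if m.records & records)

    # Same number of modules, each with an identical configuration.
    return relevant(config_a) == relevant(config_b)


geom = ESModule("geom", (("align", True),), frozenset({"GeometryRecord"}))
cond_v1 = ESModule("cond", (("tag", "v1"),), frozenset({"ConditionsRecord"}))
cond_v2 = ESModule("cond", (("tag", "v2"),), frozenset({"ConditionsRecord"}))
parent = ESConfig([geom, cond_v1])
child = ESConfig([geom, cond_v2])
```

In this model geom is shared between parent and child as long as GeometryRecord does not depend on ConditionsRecord. If it does, the differing cond configurations prevent the sharing even though geom itself is configured identically in both.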

Note this sharing was not fully implemented until the 6_0_X release series. It is not in 6_0_0_pre5, but should be in one of the pre releases soon after.

Review Status

Reviewer/Editor and Date (copy from screen) | Comments
WilliamTanenbaum - 28 Oct 2015 | Added support for multiple subprocesses in CMSSW_8_0_X
WilliamTanenbaum - 26 Aug 2011 | Added limitation info on as yet unimplemented Refs in 4_4_X.
WilliamTanenbaum - 28 Jul 2011 | Added info on process wide services
WilliamTanenbaum - 02 Feb 2011 | updating

Responsible: WilliamTanenbaum
Last reviewed by: Reviewer

Topic revision: r10 - 2018-01-30 - DavidDagenhart
 