Unscheduled Execution
Explanation
The EDM Framework provides two ways to set up the order in which EDProducers, EDFilters, and EDAnalyzers run: scheduled and unscheduled.
- scheduled: the configuration file explicitly states the order of modules on Paths and EndPaths.
- unscheduled: some EDProducers and EDFilters are not explicitly placed on a Path or EndPath and are run the first time someone asks for their data (this is known as 'lazy evaluation').
The important thing to understand about unscheduled execution is that it appears to work exactly as if you had placed all the unscheduled modules at the beginning of the path, in exactly the correct order. Behind the scenes, however, the system runs an unscheduled module only when its data is requested for the first time for the Event.
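The lazy-evaluation idea can be illustrated with a small stand-alone Python sketch. It is a toy model, not framework code: the LazyEvent class and the make_* functions are hypothetical names invented for this illustration.

```python
class LazyEvent:
    """Toy stand-in for an Event that runs producers on demand."""

    def __init__(self, producers):
        # producers: label -> callable(event) returning the product
        self._producers = producers
        self._products = {}   # products already made for this event
        self.run_order = []   # labels in the order their producers finished

    def get(self, label):
        # Run the producer only on the first request; reuse the cached product afterwards.
        if label not in self._products:
            self._products[label] = self._producers[label](self)
            self.run_order.append(label)
        return self._products[label]


def make_tracks(event):
    return ["track1", "track2"]


def make_jets(event):
    # Requesting "tracks" here triggers make_tracks if it has not run yet.
    return ["jet built from %d tracks" % len(event.get("tracks"))]


def make_unused(event):
    return "never requested"


event = LazyEvent({"tracks": make_tracks, "jets": make_jets, "unused": make_unused})
jets = event.get("jets")
event.get("tracks")  # served from the cache; make_tracks does not run again
print(event.run_order)
```

In this sketch the "unused" producer never runs because nothing requests its product, and "tracks" finishes before "jets" even though only "jets" was requested directly.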
Usage
Prior to release 9_1_0, one would add the following line to a Python configuration
file to enable 'unscheduled' execution in cmsRun:
process.options = cms.untracked.PSet( allowUnscheduled = cms.untracked.bool(True) )
In release 9_1_0 and later the above line is obsolete and no longer has any effect. As a
matter of cleanup it should be deleted anywhere it occurs (although it is harmless). In release
9_1_0 and later unscheduled execution is always 'enabled'.
Prior to release 9_1_0, one configured an EDProducer to run unscheduled by defining it and
attaching it to the Process, but not putting it on any Path or EndPath.
In release 9_1_0 and later one configures an EDProducer or EDFilter to run unscheduled by
also putting it in a Task and associating the Task to a Path, EndPath, Sequence, or Schedule.
The details are described here:
SWGuideAboutPythonConfigFile#Task_Objects.
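As a rough illustration of the 9_1_0-and-later style, a configuration fragment might look like the following. The module type names ("TrackMaker", "JetMaker", "JetAnalyzer"), the labels, and the process name are hypothetical placeholders, and the fragment omits the source and other settings a complete cmsRun configuration needs.

```python
import FWCore.ParameterSet.Config as cms

process = cms.Process("DEMO")

# Module type names below are hypothetical placeholders, not real plugins.
process.tracks = cms.EDProducer("TrackMaker")
process.jets = cms.EDProducer("JetMaker", src = cms.InputTag("tracks"))
process.ana = cms.EDAnalyzer("JetAnalyzer", src = cms.InputTag("jets"))

# Producers placed in a Task are candidates for unscheduled execution.
process.makers = cms.Task(process.tracks, process.jets)

# Associate the Task with the Path that holds the scheduled analyzer.
process.p = cms.Path(process.ana, process.makers)
```

Here only the analyzer is explicitly scheduled. The two producers would run on demand when a module declares that it consumes their products.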
In release 9_1_0, unscheduled execution behaves as follows. When an EDProducer
or EDFilter is run unscheduled, it is executed immediately before execution of the first module
that declares it consumes a product the module produces. This includes the case when an
OutputModule declares it consumes a product only to write it to an output file.
The filter decision of an EDFilter is ignored when it is run in unscheduled mode. You must
put the EDFilter on a Path to make use of its filter decision.
If a module is both on a Path or EndPath and also on a Task, then the module is run
on the Path or EndPath and not run unscheduled. The fact that it is on a Task is ignored
in this case.
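The rules above can be mimicked with a small stand-alone Python sketch. It is a toy model under simplifying assumptions (a single Path, no circular dependences, no filter decisions), and the function name and module labels are hypothetical.

```python
def execution_order(path, task, consumes):
    """Return the order in which modules run for one event in this toy model.

    path:     labels scheduled on a Path, in order
    task:     labels placed in a Task (candidates for unscheduled execution)
    consumes: label -> labels of the modules whose products it declares it consumes
    """
    # A module that is on the Path runs there; its Task membership is ignored.
    unscheduled = set(task) - set(path)
    order = []

    def run(label):
        if label in order:
            return  # already ran for this event
        # Unscheduled modules run immediately before their first consumer.
        for dep in consumes.get(label, ()):
            if dep in unscheduled:
                run(dep)
        order.append(label)

    for label in path:
        run(label)
    return order


path = ["vertices", "analyzer", "output"]
task = ["tracks", "jets", "spare", "vertices"]
consumes = {
    "jets": ["tracks"],
    "analyzer": ["jets", "vertices"],
    "output": ["jets", "tracks"],
}
order = execution_order(path, task, consumes)
print(order)
```

In this example "vertices" runs in its Path position despite also being in the Task, "tracks" and "jets" run just before "analyzer", and "spare" never runs because nothing consumes its product.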
Circular Dependence Errors
If an EDProducer with label A needs a product produced by an
EDProducer with label B and also EDProducer B needs a product
produced by EDProducer A, there is a circular dependence.
The code that manages unscheduled execution checks for
these circular dependences and will throw a fatal exception
if one is encountered. If it didn't, then when A was executed it would ask
for data from B and B would be executed, then B would ask
for A's data so A would be executed again (while it is already
running on the stack), then A would ask for B's data and
B would be executed again ... This pattern would continue
until memory was exhausted and the process would die for
that reason.
A circular dependence could involve any number of modules.
Module A could need module B which needs module C which
needs module D which needs module A. The problem is the
same. Also if module A tries to get data it produces itself,
the same problem occurs. The circular dependence could
involve only one module.
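A check of this kind can be sketched as a depth-first search over the "consumes" relation. The code below is only an illustration of the idea, not the framework's actual implementation; the function name and the dictionary layout are assumptions.

```python
def find_cycle(consumes):
    """Return a list of labels forming a cycle, or None if there is none.

    consumes maps each module label to the labels whose products it requests.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current call chain, finished
    state = {label: WHITE for label in consumes}
    chain = []

    def visit(label):
        state[label] = GREY
        chain.append(label)
        for dep in consumes.get(label, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already "running" further up the chain: circular dependence
                return chain[chain.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        chain.pop()
        state[label] = BLACK
        return None

    for label in list(consumes):
        if state[label] == WHITE:
            found = visit(label)
            if found:
                return found
    return None


two = find_cycle({"A": ["B"], "B": ["A"]})
four = find_cycle({"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]})
one = find_cycle({"A": ["A"]})
none = find_cycle({"A": ["B"], "B": []})
print(two, four, one, none)
```

The three failing cases correspond to the situations described above: a two-module circle, a longer chain that closes on itself, and a module that requests its own product.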
The best way to avoid or fix these circular dependences
is to use getByToken or getByLabel to request data from the
Event and specify the module label, instance, and type for
each data request. Then the logical dependences between
modules should be designed in such a way that there are
no circles. Finally, if the module label, instance, and type
uniquely define a product and its EDProducer, there will
be no circular dependences.
If a circular dependence error occurs, there are several
things that can be done. First, one should check that
all the data requests are actually needed and
used. One link in the circle may be unnecessary or a mistake.
The second thing one should consider is redesigning the
dependences to break the circle. A few other options
are described in the following paragraphs.
One way for these circular dependences to occur is if
the same module label, instance, and type are produced
in different processes. In this case, one could specify the
process name in the getByLabel or getByToken call and
this might serve to resolve the circular dependency.
In some cases, the intent is that one product involved in
the circle is always retrieved from the input file and never produced in
the current process. Starting in 6_2_X, one can require this
using an InputTag in the configuration as follows:
cms.InputTag("theModuleLabel", processName=cms.InputTag.skipCurrentProcess())
Note that using the same label, instance, and type in multiple processes
restricts the flexibility to move modules between processes
or combine multiple processes into one or use differing
process names.
Circular dependences can be caused by using the function getManyByType
or GetterOfProducts and only specifying the product type.
In these cases, the data request is too general and can
end up requesting unneeded products and causing unscheduled
EDProducers to run to produce them. In this case, the
recommended solution is to replace that call with getByLabel
or getByToken where the module label is specified.
One last option is to place some or all of the EDProducers in
the circle on a path in the schedule. One should also place
other EDProducers that depend on data from these EDProducers
in the schedule. This is a bad option if many things depend
on the modules placed in the schedule.
Converting Scheduled to Unscheduled
A convenience Python function is available to automatically convert a cms.Process
from scheduled to unscheduled mode. The function is
FWCore.ParameterSet.Utilities.convertToUnscheduled. Simply pass the function the
cms.Process instance you want converted.
Prior to release 9_1_0 the function will:
- Remove all modules not on Paths or EndPaths
- Pull EDProducers not dependent upon EDFilters off of all Paths
- Drop any Paths which are made empty by the change
- Fix up the Schedule if needed
In release 9_1_0 and later releases the function will:
- Resolve SequencePlaceholders
- Remove EDProducers and ignored EDFilters from Paths and EndPaths
- Add removed modules to a Task
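A typical call might look like the fragment below. It assumes the function returns the converted cms.Process, which is how it is commonly used. The process name is a placeholder and the module definitions are left out.

```python
import FWCore.ParameterSet.Config as cms
from FWCore.ParameterSet.Utilities import convertToUnscheduled

process = cms.Process("DEMO")
# ... define modules, Paths and EndPaths in the usual scheduled style ...

# Convert once the scheduled configuration is complete.
process = convertToUnscheduled(process)
```

Calling the function at the end of the configuration file, after all Paths and EndPaths are defined, lets it see the full set of scheduled modules.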
Review Status
Responsible:
ChrisDJones
Last reviewed by:
Sudhir Malik - 24 January 2009