Instructions for using the DataMixingModule


The DataMixingModule provides a means of merging the single-channel information from two or more data streams into a single event. Its primary purpose is to allow the production of Monte Carlo events that take the "simulation" of detector noise, pileup, and anything else that could be present in a given beam-crossing from the collider data itself. To make the following discussion easier, we define an "input" stream, which would typically be the Monte Carlo hard scatter event, and an "overlay" stream, which would typically be zero bias data, and which can consist of multiple events.

In order to preserve as much information as possible, the philosophy has been to merge the data streams at the earliest stage where the two streams can have the same format. This typically occurs at the single-channel level, although there are some exceptions that will be mentioned below. The merging is done by "adding", in a sub-detector-appropriate manner, the energy or charge deposition that happens to occur in the same channel in both streams. The DataMixingModule makes a new copy of the single-channel information that combines the two streams. Channels with hits in only one of the streams are passed directly through to the new output stream. As will be seen below, it is not possible to just overwrite the old single-channel information, so any subsequent processing that is done on these events must be redirected to take the new output stream as its input (e.g., Reconstruction must be re-directed to use the new information instead of the "standard" collections).

One consequence of this combination philosophy is that the DataMixingModule is source and geometry "agnostic", meaning it doesn't care what the input and overlay streams are or where they come from. As long as the appropriate database conditions are provided in the event processing, data events can be overlaid on data events, simulated events can be overlaid on data, simulation can be overlaid on simulation, etc. Also, different detector geometries are naturally accommodated (e.g., for upgrade designs) because the overlay is done by matching channel IDs, not geometrical elements. As long as the same channels are present in the input and overlay streams, the combination should be handled correctly.
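
The channel-by-channel merging described above can be illustrated with a minimal Python sketch. This is not the module's actual code (which lives inside CMSSW); the function name merge_channels, the use of plain dictionaries keyed by channel ID, and the simple addition used for overlapping channels are all assumptions made for illustration. The real "adding" is sub-detector specific.

```python
def merge_channels(input_hits, overlay_hits):
    """Merge two {channel_id: deposit} maps into a new map (hypothetical helper).

    Channels present in both streams are summed; channels present in only
    one stream are passed through unchanged. Neither argument is modified,
    mirroring the fact that the merged result is a new collection.
    """
    merged = dict(input_hits)  # copy, so the "input" stream is left intact
    for channel_id, deposit in overlay_hits.items():
        merged[channel_id] = merged.get(channel_id, 0.0) + deposit
    return merged


signal = {101: 5.0, 102: 1.5}     # stand-in for MC hard-scatter deposits
zero_bias = {102: 0.5, 205: 2.0}  # stand-in for zero-bias data deposits
combined = merge_channels(signal, zero_bias)
print(combined)
```

Because the match is made on channel IDs alone, nothing in the sketch depends on where a channel sits geometrically, which is the sense in which the overlay is geometry "agnostic".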

The following sections outline the necessary steps to create the input and overlay streams, run the DataMixingModule, and submit its results to further processing.

Current Release Setup (Current as of 17 March, 2009)

The DataMixingModule and the associated packages from other systems should be available in the 3_1_X releases, and a work-around for 2_2_X also exists.

For 3_1_X:

For now, add version V00-01-20 of SimGeneral/DataMixingModule to your release area. This will change as soon as the next _preN version comes out, at which point you can use the version in the release. All of the requisite files should be present. If, for some reason, you get lots of SimCalorimetry errors, try adding the following tags:

SimCalorimetry/HcalSimProducers  V01-10-08
SimCalorimetry/HcalSimAlgos      V02-06-03
SimCalorimetry/CaloSimAlgos      V00-07-06

There seem to be some problems (not related to the DataMixer) in _pre3 at the moment - I would stick to _pre1. If you do this, you will need the latest tag of Mixing/Base (V03-00-03).

For 2_2_X:

Here, several specific release tags are necessary. They are listed here:

SimGeneral/DataMixingModule       V00-01-20
SimCalorimetry/HcalSimAlgos       rpwMix22X
SimCalorimetry/HcalSimProducers   V01-10-08
SimCalorimetry/CaloSimAlgos       V00-07-06
Mixing/Base                       V02-01-01

Note that the DataMixer executable may or may not be available in the release, so you may have to build it in your local area.

Overlay Options

Most subdetectors are overlaid using Digis, with the collections specified below. Various overlay options exist for Calorimetry: code exists to overlay RecHits, Digis, or the specific detailed ("production") mode of overlay for the Hcal. Switching between calorimeter options is described below in the "How to Run" section; the options are mentioned here so that the following specification of input classes will make sense.

Event Preparation

As mentioned above, the DataMixingModule is capable of overlaying any type of data on any other, if the event streams have been prepared with the appropriate collections of input data. For the majority of subdetectors, events are merged at the Digi level, so the desired collections of digis must be present in the input and overlay streams, and must be specified with the corresponding InputTags. The major exception to this treatment is the Hcal, where proper overlay for MC production should add data pedestals to simulated hits before the digitization is done.

The input collections are specified in the files that configure the DataMixingModule. These are listed in the /python directory, and are given in the files

Each of these specifies a "signal" input and a "pileup" input.

The following input collections should be present in the events to be overlaid:

For an MC "input" file ("Signal"): (NOTE: the suffix on each of the input labels must be "Sig" for the signal input.)

The following InputTags are specified:


SistripLabelSig = cms.InputTag("ZeroSuppressed"),
SistripdigiCollectionSig = cms.InputTag("simSiStripDigis"),


pixeldigiCollectionSig = cms.InputTag("simSiPixelDigis"),


CSCDigiTagSig = cms.InputTag("simMuonCSCDigis"),
CSCwiredigiCollectionSig = cms.InputTag("muonCSCWireDigi"),
CSCstripdigiCollectionSig = cms.InputTag("muonCSCStripDigi"),
RPCDigiTagSig = cms.InputTag("simMuonRPCDigis"),
RPCdigiCollectionSig = cms.InputTag("simMuonRPCDigis"),
DTDigiTagSig = cms.InputTag("simMuonDTDigis"),
DTdigiCollectionSig = cms.InputTag("simMuonDTDigis"),


for Ecal Digis:

EBdigiProducerSig = cms.InputTag("simEcalDigis"),
EBdigiCollectionSig = cms.InputTag("ebDigis"),
EEdigiProducerSig = cms.InputTag("simEcalDigis"),
EEdigiCollectionSig = cms.InputTag("eeDigis"),
ESdigiProducerSig = cms.InputTag("simEcalPreshowerDigis"),
ESdigiCollectionSig = cms.InputTag(""),

or for Ecal RecHits:

EBProducerSig = cms.InputTag("ecalRecHit"),
EBrechitCollectionSig = cms.InputTag("EcalRecHitsEB"),
EEProducerSig = cms.InputTag("ecalRecHit"),
EErechitCollectionSig = cms.InputTag("EcalRecHitsEE"),
ESProducerSig = cms.InputTag("ecalPreshowerRecHit"),
ESrechitCollectionSig = cms.InputTag("EcalRecHitsES"),

for Hcal Digis*:

HBHEdigiCollectionSig  = cms.InputTag("simHcalDigis"),
HOdigiCollectionSig    = cms.InputTag("simHcalDigis"),
HFdigiCollectionSig    = cms.InputTag("simHcalDigis"),
ZDCdigiCollectionSig   = cms.InputTag("ZDCdigiCollection"),

or for Hcal RecHits:

HBHEProducerSig = cms.InputTag("hbhereco"),
HBHErechitCollectionSig = cms.InputTag("HBHERecHitCollection"),
HOProducerSig = cms.InputTag("horeco"),
HOrechitCollectionSig = cms.InputTag("HORecHitCollection"),
HFProducerSig = cms.InputTag("hfreco"),
HFrechitCollectionSig = cms.InputTag("HFRecHitCollection"),
ZDCrechitCollectionSig = cms.InputTag("ZDCRecHitCollection"),

*Note that in the "production" overlay mode, the Hcal Digis for the MC input file are NOT used: pCaloSimHits are used instead, so these must be present in the event.

For a Data "overlay" file: (NOTE: the suffix on each of the input labels must be "Pile" for the overlay file.)


SistripLabelPile = cms.InputTag("ZeroSuppressed"),
SistripdigiCollectionPile = cms.InputTag("siStripDigis"),


pixeldigiCollectionPile = cms.InputTag("siPixelDigis"),


CSCDigiTagPile = cms.InputTag("muonCSCDigis"),
CSCwiredigiCollectionPile = cms.InputTag("muonCSCWireDigi"),
CSCstripdigiCollectionPile = cms.InputTag("muonCSCStripDigi"),
RPCDigiTagPile = cms.InputTag("muonRPCDigis"),
RPCdigiCollectionPile = cms.InputTag("MuonRPCDigis"),
DTDigiTagPile = cms.InputTag("muonDTDigis"),
DTdigiCollectionPile = cms.InputTag("MuonDTDigis"),


for Ecal Digis:

EBdigiCollectionPile = cms.InputTag("ebDigis"),
EEdigiCollectionPile = cms.InputTag("eeDigis"),
ESdigiCollectionPile = cms.InputTag(""),
EBdigiProducerPile = cms.InputTag("ecalDigis"),
EEdigiProducerPile = cms.InputTag("ecalDigis"),
ESdigiProducerPile = cms.InputTag("ecalPreshowerDigis"),

or for Ecal RecHits:

EBProducerPile = cms.InputTag("ecalRecHit"),
EBrechitCollectionPile = cms.InputTag("EcalRecHitsEB"),
EEProducerPile = cms.InputTag("ecalRecHit"),
EErechitCollectionPile = cms.InputTag("EcalRecHitsEE"),
ESProducerPile = cms.InputTag("ecalPreshowerRecHit"),
ESrechitCollectionPile = cms.InputTag("EcalRecHitsES"),

for Hcal Digis:

HBHEdigiCollectionPile  = cms.InputTag("hcalDigis"),
HOdigiCollectionPile    = cms.InputTag("hcalDigis"),
HFdigiCollectionPile    = cms.InputTag("hcalDigis"),
ZDCdigiCollectionPile   = cms.InputTag("ZDCdigiCollection"),

or for Hcal RecHits:

HBHEProducerPile = cms.InputTag("hbhereco"),
HBHErechitCollectionPile = cms.InputTag("HBHERecHitCollection"),
HOProducerPile = cms.InputTag("horeco"),
HOrechitCollectionPile = cms.InputTag("HORecHitCollection"),
HFProducerPile = cms.InputTag("hfreco"),
HFrechitCollectionPile = cms.InputTag("HFRecHitCollection"),
ZDCrechitCollectionPile = cms.InputTag("ZDCRecHitCollection"),
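
Since the suffix convention ("Sig" for the signal input, "Pile" for the overlay) is easy to get wrong when editing these lists by hand, a small check can help. The following is a minimal sketch in plain Python; the helper name check_suffix and the use of ordinary dictionaries in place of a CMSSW parameter set are assumptions for illustration, and only a few of the tags listed above are reproduced.

```python
def check_suffix(parameters, suffix):
    """Return the parameter names that do NOT end with the required suffix.

    Hypothetical helper: 'parameters' is any mapping of parameter name to
    collection label, standing in for the real configuration object.
    """
    return [name for name in parameters if not name.endswith(suffix)]


signal_tags = {
    "pixeldigiCollectionSig": "simSiPixelDigis",
    "DTDigiTagSig": "simMuonDTDigis",
}
overlay_tags = {
    "pixeldigiCollectionPile": "siPixelDigis",
    "DTDigiTagPile": "muonDTDigis",
}

bad_signal = check_suffix(signal_tags, "Sig")     # empty when all names are valid
bad_overlay = check_suffix(overlay_tags, "Pile")  # empty when all names are valid
print(bad_signal, bad_overlay)
```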

Note that the standard input files offer three different choices. The file is configured to send data inputs to both input and overlay with the appropriate tags. The file is configured to overlay simulated hits on simulated hits. Note that in this mode, the "overlay" events are treated as data, such that the "lowest" level of hit that can be used is the Digi. In particular, the Hcal overlay cannot overlay pCaloSimHits on pCaloSimHits. (That's what the standard MixingModule is for.) With the appropriate configuration, it is possible to overlay MC Digis on MC Digis.

Example scripts have been provided in the DataMixingModule/python directory for some of this pre-processing. One of them, for example, is a cosmics prompt-reco script modified to produce just the appropriate digis.

How to Run

In addition to the standard parameters for running a CMS framework job, the "input" (MC) file, the Frontier conditions for calibration, and the mode of overlay are all specified in a standard configuration file like the one found in the DataMixingModule/test directory. The mode of overlay is chosen by specifying the appropriate file, as in


As mentioned above, the inputTags for all of the data types are specified in one of the files in the DataMixingModule/python directory. The overlay file is also specified here.

The files in the DataMixingModule/python directory also contain some specific flags for running the DataMixingModule and the underlying overlay code in Mixing/Base, which handles all of the event juggling. These are listed below, with comments.

nbPileupEvents = cms.PSet(
    averageNumber = cms.double(1.0)  # one pileup per event
),
type = cms.string('fixed'),  # constant value of one pileup per event (as opposed to, e.g., 'poisson')

# other Mixing Module parameters - these don't really matter for the DataMixer, as long as maxBunch
# and minBunch are set to zero.
Label = cms.string(''),
maxBunch = cms.int32(0),
bunchspace = cms.int32(25),
minBunch = cms.int32(0),
checktof = cms.bool(False),  # probably best to leave this false; COMMENT OUT this line for 2_2_X

DataMixingModule parameters/Overlay Options:

There are two modes for Ecal overlay, and three modes for Hcal. The Ecal overlay can merge Digis by merely adding channels with hits in the input and overlay streams after gain conversion. An appropriate gain is selected for the final merged Digi. The Ecal can also add RecHit energies in cells where there is energy in both the input and overlay event. For Hcal, the same two Digi or RecHit modes are available. In addition, there is a "production" mode where the digis from the overlay stream are used as "pedestal" for the SimHits that are in the input stream. The Digis are converted back to charge using appropriate calibration constants, the charge from the SimHits is added, and then new Digis are made.

EcalMergeType = cms.string('Digis'),  # set to "Digis" to merge digis
HcalMergeType = cms.string('Digis'),
HcalDigiMerge = cms.string('FullProd'),  # select full production mode, using Digis on SimHits
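
The sequence of steps in the Hcal "production" mode (overlay Digis converted back to charge, SimHit charge added, new Digis made) can be sketched in plain Python. This is only an illustration under strong assumptions: a single linear gain per channel (fc_per_count), a 7-bit ADC ceiling (max_adc), and the function names are all hypothetical. The real code uses the calibration constants from the conditions database.

```python
def digi_to_charge(adc_counts, fc_per_count):
    """Convert ADC counts to charge, assuming a linear gain (hypothetical)."""
    return [adc * fc_per_count for adc in adc_counts]


def charge_to_digi(charges, fc_per_count, max_adc=127):
    """Convert charge back to integer ADC counts, clamped to [0, max_adc]."""
    return [min(max_adc, max(0, round(q / fc_per_count))) for q in charges]


def overlay_production_mode(overlay_digi, simhit_charge, fc_per_count):
    """Use the overlay Digi as a 'pedestal' for the SimHit charge.

    overlay_digi and simhit_charge hold one value per time sample of a
    single channel and must have the same length.
    """
    pedestal_charge = digi_to_charge(overlay_digi, fc_per_count)
    total = [p + s for p, s in zip(pedestal_charge, simhit_charge)]
    return charge_to_digi(total, fc_per_count)


overlay_digi = [4, 5, 4, 3]             # ADC counts from the data overlay
simhit_charge = [0.0, 10.0, 20.0, 0.0]  # charge from simulated hits
new_digi = overlay_production_mode(overlay_digi, simhit_charge, fc_per_count=2.0)
print(new_digi)
```

The point of working in charge rather than ADC counts is that the two contributions are added in the same physical quantity before a single digitization step is applied.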

Frontier Conditions

Note that you will need different Frontier Conditions for 3_X vs 2_X. Here are two examples:

For 3_1_X:

process.GlobalTag.connect = "frontier://FrontierInt/CMS_COND_30X_GLOBALTAG"
process.GlobalTag.globaltag = "CRAFT_30X::All"

For 2_2_X:

process.GlobalTag.connect = "frontier://PromptProd/CMS_COND_21X_GLOBALTAG"
process.GlobalTag.globaltag = "CRUZET4_V4P::All"
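
If the same job script has to be used with both release series, the two examples above can be kept in one lookup. This is a minimal sketch in plain Python: the connect strings and global tags are copied from the examples above, while the dictionary layout, the keys, and the helper name conditions_for are hypothetical.

```python
# Values copied from the two examples above; keys and helper are hypothetical.
FRONTIER_CONDITIONS = {
    "3_1_X": {
        "connect": "frontier://FrontierInt/CMS_COND_30X_GLOBALTAG",
        "globaltag": "CRAFT_30X::All",
    },
    "2_2_X": {
        "connect": "frontier://PromptProd/CMS_COND_21X_GLOBALTAG",
        "globaltag": "CRUZET4_V4P::All",
    },
}


def conditions_for(release_series):
    """Return the example conditions for a release series, or raise ValueError."""
    try:
        return FRONTIER_CONDITIONS[release_series]
    except KeyError:
        raise ValueError("no example conditions for release series %r" % release_series)


conditions = conditions_for("3_1_X")
# In a real configuration these would be assigned to process.GlobalTag.connect
# and process.GlobalTag.globaltag, as in the examples above.
print(conditions["connect"], conditions["globaltag"])
```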

Also, be sure that the magnetic field is appropriate for the data you want to overlay:


DataMixingModule Outputs

As specified in the files, the outputs of the DataMixingModule are new Digis with different producers and labels. One cannot just overwrite the old Digis, unfortunately. Here is a list of the possible produced objects:

SiStripDigiCollectionDM = cms.string('siStripDigisDM'),
PixelDigiCollectionDM = cms.string('siPixelDigisDM'),
DTDigiCollectionDM = cms.string('muonDTDigisDM'),
CSCWireDigiCollectionDM = cms.string('MuonCSCWireDigisDM'),
CSCStripDigiCollectionDM = cms.string('MuonCSCStripDigisDM'),
RPCDigiCollectionDM = cms.string('muonRPCDigisDM'),

Calorimeter Digis

EBDigiCollectionDM   = cms.string('EBDigiCollectionDM'),
EEDigiCollectionDM   = cms.string('EEDigiCollectionDM'),
ESDigiCollectionDM   = cms.string('ESDigiCollectionDM'),
HBHEDigiCollectionDM = cms.string('HBHEDigiCollectionDM'),
HODigiCollectionDM   = cms.string('HODigiCollectionDM'),
HFDigiCollectionDM   = cms.string('HFDigiCollectionDM'),
ZDCDigiCollectionDM  = cms.string('ZDCDigiCollectionDM')

Calorimeter RecHits

EBRecHitCollectionDM = cms.string('EcalRecHitsEBDM'),
EERecHitCollectionDM = cms.string('EcalRecHitsEEDM'),
ESRecHitCollectionDM = cms.string('EcalRecHitsESDM'),
HBHERecHitCollectionDM = cms.string('HBHERecHitCollectionDM'),
HFRecHitCollectionDM = cms.string('HFRecHitCollectionDM'),
HORecHitCollectionDM = cms.string('HORecHitCollectionDM'),
ZDCRecHitCollectionDM = cms.string('ZDCRecHitCollectionDM'),

As can be seen in the cff file, these are from producer 'mix', with process name 'PRODMIX'.

Further Processing of DataMixingModule Output

As is obvious from the previous section, any new processor that wishes to read the new Digis has to have its input re-directed to look at the DataMixer digis instead of the "standard" inputs.
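
As a rough illustration of what this re-direction amounts to, the sketch below builds the colon-separated "label:instance:process" form of an input tag for a few of the DataMixer outputs listed above, using the producer 'mix' and process name 'PRODMIX' quoted earlier. The helper name datamixer_tag is hypothetical, and whether a given downstream module takes its input in exactly this form is an assumption; the provided Reconstruction scripts remain the reference.

```python
def datamixer_tag(instance_label, module_label="mix", process_name="PRODMIX"):
    """Build a 'label:instance:process' tag string (hypothetical helper)."""
    return ":".join([module_label, instance_label, process_name])


# A few of the instance labels from the output list above.
instances = ["siStripDigisDM", "siPixelDigisDM", "muonDTDigisDM"]
redirected = {name: datamixer_tag(name) for name in instances}

for name, tag in redirected.items():
    print(name, "->", tag)
```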

For Reconstruction, a sample set of scripts has been provided in the DataMixingModule/python directory. The script sets up the Reconstruction. This imports, where all of the re-direction of inputs is done. In the current version of the release, there is also the file which handles some issues with the SiStrip reconstruction processing specific to cosmics data. These scripts will be updated as proper Reconstruction sequences change.

-- MichaelHildreth - 17 Mar 2009

Topic revision: r3 - 2009-03-19 - AmnonHarel