Skim Configuration Guidelines

This page describes a common structure for skim configurations.

Secondary Datasets (SD) and Central Skims (CS)

The configurations to be run by DataOps are collected in the package Configuration/Skimming. Additional C++ code needed for an SD/CS goes into PhysicsTools/Skimming.

Declaration of a skim (a.k.a. filtered stream)

To declare a skim, a snippet of the following kind is needed:

import FWCore.ParameterSet.Config as cms
# ... (definition of paths and producers, or import of these from other files;
#      pathSDHighPtMuon and RECOEventContent must be defined at this point)

# Select the events that pass the skim path
selectionSDHighPtMuon = cms.PSet(
    SelectEvents = cms.untracked.PSet(
        SelectEvents = cms.vstring('pathSDHighPtMuon')
    )
)

# Declare the filtered stream itself
SDHighPtMuon = cms.FilteredStream(
    responsible = 'John Doe',
    name = 'SDHighPtMuon',
    paths = (pathSDHighPtMuon,),
    content = RECOEventContent.outputCommands,
    selectEvents = selectionSDHighPtMuon.SelectEvents,
    dataTier = cms.untracked.string('ALCARECO')
)

Several such snippets can then be combined as follows:

import FWCore.ParameterSet.Config as cms

process = cms.Process("SKIM")
process.schedule = cms.Schedule()
from Configuration.PyReleaseValidation.ConfigBuilder import installFilteredStream
installFilteredStream(process, process.schedule, "SDHighPtMuon", definitionFile = "Configuration.Skimming.SDHighPtMuon_cff")
installFilteredStream(process, process.schedule, "SDHighPtElectron", definitionFile = "Configuration.Skimming.SDHighPtElectron_cff")

Naming conventions

To avoid name clashes, the names of modules, sequences and select statements should contain the name of the respective SD/CS.
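This convention can be checked mechanically before a configuration is submitted. The following is a minimal sketch in plain Python and is not part of CMSSW: the helper names_missing_skim_tag and the example labels are hypothetical, and the check is a simple case-sensitive substring test.

```python
def names_missing_skim_tag(skim_name, labels):
    """Return the labels that do not contain the SD/CS name.

    Hypothetical helper: 'labels' is any iterable of module, sequence
    or selection names used by the skim called 'skim_name'.
    """
    return [label for label in labels if skim_name not in label]


# Example labels for a skim called SDHighPtMuon (illustrative only).
labels = ["pathSDHighPtMuon", "selectionSDHighPtMuon", "muonFilter"]
offenders = names_missing_skim_tag("SDHighPtMuon", labels)
print(offenders)  # ['muonFilter']
```

Any label reported by such a check would need to be renamed, for example to a name that embeds SDHighPtMuon.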

PAG specific skims

PAG-specific skim configurations can be put in either the package XYZAnalysis/Skimming or XYZAnalysis/Configuration. They need to be defined as a filtered stream (cms.FilteredStream), as described above.

Naming conventions

Some simple naming conventions, also adopted for simulation and reconstruction, should be followed throughout the configuration files.

  • Package names should start with upper case (ZReco, not zReco).
  • Module and sequence names should start with lower case (e.g.: zToMuMu)

Since a skim usually matches one or more analysis sequences, we also adopt the following convention:

  • Skim names should start with lower case (e.g.: zToMuMu, not ZToMuMu)

Module names should be unique across all analysis sequences. This avoids clashes when several skims run within a single job, a setup that makes skim management easier on the computing side.
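The lower-case and uniqueness rules above lend themselves to a quick automated check. The sketch below is plain Python and makes no use of CMSSW; the function names and the example skims (zToMuMu, zToEE) with their module labels are assumptions made for illustration.

```python
from collections import defaultdict


def starts_with_lower_case(name):
    """True if the first character is a lower-case letter (zToMuMu, not ZToMuMu)."""
    return bool(name) and name[0].islower()


def find_shared_module_names(modules_by_skim):
    """Map each module name used by more than one skim to the skims using it."""
    users = defaultdict(set)
    for skim, modules in modules_by_skim.items():
        for module in modules:
            users[module].add(skim)
    return {m: sorted(s) for m, s in users.items() if len(s) > 1}


# Hypothetical skims and module labels.
modules_by_skim = {
    "zToMuMu": ["zToMuMuFilter", "goodMuons"],
    "zToEE": ["zToEEFilter", "goodMuons"],
}
clashes = find_shared_module_names(modules_by_skim)
print(clashes)  # {'goodMuons': ['zToEE', 'zToMuMu']}
```

In this example goodMuons is defined by both skims, so running them in one job would clash; giving each skim its own label for the module would resolve it.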

Including external module configurations

If standard module configurations are imported, replace statements should be used with care. Parameter changes take effect globally, so other groups' skims could be affected. The standard solution is to clone the module and change the parameters in the same step:

from aPackage import oldName
# clone() returns a copy, so oldName itself stays unchanged
newName = oldName.clone(changedParameter = 42)

Details can be found in the Workbook.

Event content definition

The event content for a Physics Analysis Group should be defined in a file called:

  • XYZAnalysis/Configuration/data/XYZAnalysis_EventContent.cff

This file should include configuration fragment files (.cff) from all other packages within the sub-system XYZAnalysis.

For each skim two blocks should be defined:

  • skimNameEventContent, defining the event content
  • skimNameEventSelection, defining the event selection criteria (i.e.: filter paths to select)

The event content directives should specify only the analysis collections, not the standard event content (e.g.: AOD or RECO). This makes it possible to add the analysis content flexibly to other output definitions (typically AOD, RECO or FEVT).
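As an illustration, the two blocks for a skim named zToMuMu might look as follows in the Python configuration syntax used elsewhere on this page. This is only a sketch under assumptions: the module label zToMuMuCandidates and the path name zToMuMuPath are hypothetical, and the fragment needs a CMSSW environment to be imported.

```python
import FWCore.ParameterSet.Config as cms

# Analysis collections only; AOD/RECO content is added by the output definition.
zToMuMuEventContent = cms.PSet(
    outputCommands = cms.untracked.vstring(
        'keep *_zToMuMuCandidates_*_*'  # hypothetical module label
    )
)

# Filter paths whose decision selects the events to be written.
zToMuMuEventSelection = cms.PSet(
    SelectEvents = cms.untracked.PSet(
        SelectEvents = cms.vstring('zToMuMuPath')  # hypothetical path name
    )
)
```

Keeping the standard content out of the first block means the same block can be appended to an AOD, RECO or FEVT output definition without duplication.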

Detailed documentation on how to configure event output content can be found in the following document:

Warning: Please avoid using wildcards ('*') for module names; otherwise your skim risks picking up collections defined by other groups!
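A simple check for this can be scripted. The sketch below is plain Python, not a CMSSW tool: the function name is hypothetical, and it assumes that keep statements use branch patterns of the form type_module_instance_process, plus the bare pattern '*' that matches everything.

```python
def wildcard_module_commands(output_commands):
    """Return the 'keep' commands whose module label contains a wildcard.

    Assumes branch patterns of the form type_module_instance_process,
    plus the bare pattern '*' that matches everything.
    """
    flagged = []
    for command in output_commands:
        parts = command.split()
        if len(parts) != 2 or parts[0] != "keep":
            continue  # only 'keep' statements can pull in foreign collections
        fields = parts[1].split("_")
        # With four fields the second one is the module label; otherwise
        # treat the whole pattern as the label (covers the bare '*').
        module = fields[1] if len(fields) == 4 else parts[1]
        if "*" in module:
            flagged.append(command)
    return flagged


# Hypothetical output commands.
commands = [
    "drop *",
    "keep *_zToMuMuCandidates_*_*",
    "keep recoMuons_*_*_*",
    "keep *",
]
print(wildcard_module_commands(commands))
# ['keep recoMuons_*_*_*', 'keep *']
```

Wildcards in the type, instance or process fields are left alone here, since the warning concerns the module name only.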

Testing the skimming process

A script to test a single skim can easily be written. More details to follow.

Skim developers should test their skims before submitting them for release. Failing to do so may cause the skim to be rejected from the release.

Review status

Reviewer/Editor and Date: BenediktHegner - 17 Jul 2007
Comments: created page

Responsible: BenediktHegner

Topic revision: r24 - 2010-03-24 - BenediktHegner