Misc. CMSSW notes / howtos
Where is the cmsRun source code ?
It seems to be
FWCore/Framework/bin/cmsRun.cpp
.
Crashes when using the debugger
- some advice is given here
(seems however not to help with uaf and zsh)
How to (graphically) browse CMSSW configuration files and their included paths/sequences
How to produce a vector from non-cartesian coordinates ?
- ROOT's TLorentzVector
has a few useful functions in this respect such as:
-
TLorentzVector::SetPtEtaPhiM(...)
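To illustrate what such a conversion involves, here is a plain-Python sketch (no ROOT needed) that computes the Cartesian components from pt, eta, phi and mass. The function name pt_eta_phi_m_to_cartesian is made up for this note, a non-negative mass is assumed, and this is not ROOT's actual implementation:

```python
import math

def pt_eta_phi_m_to_cartesian(pt, eta, phi, mass):
    """Return (px, py, pz, energy) for the given pt, eta, phi and mass.

    Hypothetical helper for illustration only; assumes mass >= 0.
    """
    px = pt * math.cos(phi)      # transverse components from the azimuth
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)     # longitudinal component from pseudorapidity
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return px, py, pz, energy

if __name__ == "__main__":
    # a massless particle along the x axis
    print(pt_eta_phi_m_to_cartesian(20.0, 0.0, 0.0, 0.0))
```

In real code one would normally just call SetPtEtaPhiM on a TLorentzVector and read back Px(), Py(), Pz() and E().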
How to check on an event-by-event basis whether an event is MC or data ?
How to mix data events (combine different events into one) ?
How to re-run the HLT ?
How can I combine files which seem to have duplicate events ?
When running an MC production with several jobs each having the same run number, these files look like they have duplicate events.
Assuming your input source is
process.source
, one may add the
following line to the
cmsRun
configuration file to ignore
duplicates:
process.source.duplicateCheckMode = cms.untracked.string('noDuplicateCheck')
What parameters does the MessageLogger know in general ?
What severity levels does the MessageLogger know ?
- see the method
ELseverityLevel::getInputStr()
in FWCore/MessageLogger/src/ELseverityLevel.cc
. At the moment, these seem to be: ?no value?
, ZERO
, INCIDENTAL
, DEBUG
, INFO
, WARNING
, WARNING2
, ERROR
, ERROR2
, NEXT
, UNSPECIFIED
, SYSTEM
, SEVERE2
, ABORT
, FATAL
and HIGHEST
. Unfortunately, it only seems to support the following in the configuration file: DEBUG
, INFO
, WARNING
, ERROR
.
What destinations does the MessageLogger know ?
- cout, cerr and file names
How to suppress/limit the 'Begin processing the...' messages ?
These are in the category
FwkReport
. Set the limit of this category to zero to suppress them entirely, or use reportEvery to print only every n-th one.
For example, to print only every 1000th event, do:
process.load("FWCore.MessageLogger.MessageLogger_cfi")
process.MessageLogger.cerr.FwkReport.reportEvery = 1000
Can I suppress the messages printed when opening/closing a file (e.g. Initiating request to open ...
) ?
- These messages seem to have the category
fileAction
. However, setting a limit of zero for this category does not seem to suppress them (one can, however, duplicate them, e.g. on cerr
).
How can I know in which modules events are rejected in the HLT paths ?
- Set the parameter
wantSummary
of the process' options
to true, e.g. by doing:
process.options = cms.untracked.PSet(wantSummary = cms.untracked.bool(True))
This will produce a rather verbose list of which modules were called how often
etc. at the end of the job. This is useful e.g. to see where events
are rejected on the HLT etc. Look for lines starting with
TrigReport
(above the ones starting with
TimeReport
).
See also
this link in the CMSSW workbook.
How do I select events which passed a certain HLT (High-Level Trigger) path ?
Here is an example for vetoing random trigger events (
HLT_Random
path):
import HLTrigger.HLTfilters.triggerResultsFilter_cfi as hlt
process.rejectHltRandom = hlt.triggerResultsFilter.clone(
hltResults = cms.InputTag( "TriggerResults","","HLT"),
triggerConditions = ( 'NOT HLT_Random', ),
l1tResults = '',
throw = False
)
This creates an
EDFilter
object which can then e.g. be put in front of other sequences. In order to remove these events from the output file (if any), one needs to use the
SelectEvents
parameters of the
PoolOutputModule
(see also
Which parameters does PoolOutputModule know), e.g. like:
process.... = cms.OutputModule("PoolOutputModule",
SelectEvents = cms.untracked.PSet(SelectEvents = cms.vstring('myfilterpath')),
...
)
Which parameters does PoolOutputModule
know ?
- new: there seems to be a list here.
- Look for calls to
getParameter
and getUntrackedParameter
in IOPool/Output/src/PoolOutputModule.cc
. There seem to be additional parameters (such as selectEvents
) which appear to be defined somewhere else in the code...
How can I find out which CMSSW versions are installed on a grid site ?
- go to https://cmsweb.cern.ch/sitedb/sitelist/
(hypernews login required)
- select the site you're interested in
- at the bottom left you should see a frame titled
Software installed on <sitename>
- this lists the installed software versions for each scram architecture
How can I dump the HLT decisions from a file ? (How to print out which trigger path fired how often ?)
- Make sure there is a
MessageLogger
known to the process, e.g. by adding:
process.load("FWCore.MessageLogger.MessageLogger_cfi")
- Add something like the following to the configuration file:
process.hltTrigReport = cms.EDAnalyzer( 'HLTrigReport',
HLTriggerResults = cms.InputTag( 'TriggerResults','','HLT' )
)
process.HLTAnalyzerEndpath = cms.EndPath( process.hltTrigReport )
process.MessageLogger.categories.append("HLTrigReport")
where the
HLT
in the input tag refers to the process which actually calculated these bits.
Which eventcontents are available for cmsDriver.py and what do they contain ?
Not sure, but probably those defined in
Configuration/EventContent/python/EventContent_cff.py
(look for variables ending in
EventContent
).
The following python snippet:
import Configuration.EventContent.EventContent_cff as evc
for name in dir(evc):
    if not name.endswith("EventContent"):
        continue
    # strip the "EventContent" suffix to get the short name
    print(name[:-len("EventContent")])
gave me (at the time of writing) the following output:
ALCARECO
AOD
AODSIM
Common
DATAMIXER
FEVTDEBUG
FEVTDEBUGHLT
FEVT
FEVTHLTALL
FEVTSIM
HLTDEBUG
MIXINGMODULE
RAWDEBUG
RAWDEBUGHLT
RAW
RAWSIM
RECODEBUG
RECO
RECOSIM
One can then get the keep/drop statements for e.g.
HLTDEBUG
by doing:
print(evc.HLTDEBUGEventContent)
How to get the number of events in a given run and dataset ?
e.g. with the following query:
find sum(file.numevents) where dataset = /XXX/YYY/ZZZ and run = 123456
FWLite: Getting the aliases of an Events tree
create a dict mapping from branch name to alias:
import pprint
pprint.pprint(dict([ (x.GetTitle(), x.GetName()) for x in ROOT.Events.GetListOfAliases() ]))
finding a string in branch names and print the alias:
searchtext = "trackcandidates"
searchtext = searchtext.lower()
for x in ROOT.Events.GetListOfAliases():
    if searchtext in x.GetTitle().lower():
        print("branch: %s / alias: %s" % (x.GetTitle(), x.GetName()))
No registered converter was able to produce a C++ rvalue of type std::string from this Python object of type NoneType
When running with my configuration file, I get the following error message:
%MSG-s ConfigFileReadError: 11-Sep-2010 18:32:42 CEST pre-events
Problem with configuration file test.py
---- Configuration BEGIN
python encountered the error: No registered converter was able to produce a C++ rvalue of type X from this Python object of type NoneType
---- Configuration END
However, when running
python test.py
the Python interpreter does not complain.
One possibility to find out where this comes from is to do
edmConfigDump test.py | less
and look for occurrences of
None
or
NoneType
and check whether they should be there or not.
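For long dumps, the search can be scripted. The following is a minimal sketch in plain Python; the helper name find_none_lines is made up for this note, and it only assumes that the text of the configuration dump has been read into a string:

```python
import re

def find_none_lines(dump_text):
    """Return (line number, stripped line) pairs mentioning None or NoneType.

    Hypothetical helper for illustration only.
    """
    # match None / NoneType as whole words only
    pattern = re.compile(r"\bNone(Type)?\b")
    hits = []
    for number, line in enumerate(dump_text.splitlines(), 1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits
```

One could, for example, redirect the output of edmConfigDump into a file, read that file into a string and pass it to this function to get the line numbers worth inspecting.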
See also
this hypernews message
.
A possible source for this problem is the following scenario:
- module
X
is created and put into the process object as process.X
- a path
Y
referencing module X
is added to the process
- another configuration file is loaded into the process which *also* defines a module
X
and thus overwrites the old module definition. The original module now appears with label None
.
Which operators exist to put together modules into a sequence ? How can I negate an EDFilter decision ?
How do I get a list of modules which are used in any path in a CMSSW configuration file ?
process.load(..)
often loads more modules into the
process
object than are actually
put into any of the paths. To get a list of the names of all modules which are used in at least one path, the following should work:
import itertools
set(itertools.chain(*[ path.moduleDependencies().keys() for path in process.paths.itervalues() ]))
Note that this does not include things like
ESProducers
etc. which usually aren't part of any path.
This also does not include endpaths.
How can I select events with at least n objects of a given type ?
Try using a
CandViewCountFilter
in the CMSSW configuration, e.g. as follows:
process.goodElectronsCounter1 = cms.EDFilter("CandViewCountFilter",
src = cms.InputTag("gsfElectrons"),
minNumber = cms.uint32(2)
)
I am not sure, however, with which types of input collections this works. Most likely it works only with collections of elements inheriting from
reco::Candidate
.
There is also a selector acting on the elements of such a collection, e.g.:
process.selectedObjects = cms.EDFilter("CandViewSelector",
src = cms.InputTag("inputCollectionLabel"),
cut = cms.string("pt > 20 & abs( eta ) < 1.4")
)
Note that these can be chained, e.g. first produce a collection with objects passing a given pt cut
and then with a filter require that there is at least one object in this collection.
See also
CommonTools/CandAlgos/plugins
,
CMSPublic/SWGuideCandidateModules and
CMSPublic/SWGuidePhysicsCutParser.
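The logic of chaining a selector with a count filter can be sketched in plain Python as follows. This is only an illustration of the idea, not how the CMSSW modules are implemented; the function names and the dictionaries standing in for candidates are made up for this note:

```python
def select_candidates(candidates, min_pt, max_abs_eta):
    """Mimic a cut like 'pt > min_pt & abs(eta) < max_abs_eta'.

    Returns a new list, analogous to the selector producing a new collection.
    """
    return [c for c in candidates
            if c["pt"] > min_pt and abs(c["eta"]) < max_abs_eta]

def passes_count_filter(candidates, min_number):
    """Mimic the count filter: accept the event if enough objects remain."""
    return len(candidates) >= min_number

if __name__ == "__main__":
    # made-up candidates of a single event, for illustration only
    event = [{"pt": 35.0, "eta": 0.3},
             {"pt": 12.0, "eta": -1.0},
             {"pt": 50.0, "eta": 2.1}]
    selected = select_candidates(event, 20.0, 1.4)
    print(passes_count_filter(selected, 1))
```

In the configuration file, the same chaining is obtained by giving the label of the selector module as the src of the count filter and putting both modules into the same path, selector first.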
Are there predefined functions to calculate delta phi and delta R ?
There are functions called
deltaPhi
in
DataFormats/Math/interface/deltaPhi.h
. One of these (overloaded) functions also works with objects which have a member
phi()
, e.g. some of the vector classes used in CMSSW.
There is a similar file for
deltaR
in
DataFormats/Math/interface/deltaR.h
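For quick checks outside CMSSW, the two quantities can be sketched in plain Python as below. This only illustrates the usual definitions (delta phi wrapped into one full turn around zero, delta R as the quadratic sum of delta eta and delta phi). The function names are made up for this note, and the exact convention at the interval boundaries in the CMSSW headers may differ:

```python
import math

def delta_phi(phi1, phi2):
    """Return phi1 - phi2 wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)   # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Return sqrt(delta_eta^2 + delta_phi^2) using the wrapped delta phi."""
    deta = eta1 - eta2
    dphi = delta_phi(phi1, phi2)
    return math.sqrt(deta * deta + dphi * dphi)
```

The wrapping matters for objects on either side of the phi = pi boundary: without it, two nearby objects at phi = 3.0 and phi = -3.0 would appear to be almost a full turn apart.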
How can I combine two (sets of) files with different data tiers of the same events ?
This is called the 'two-file solution' (using 'secondary input files'). One can do the following in the configuration file:
process.source = cms.Source("PoolSource",
# main data tier to run on (e.g. RECO)
fileNames = cms.untracked.vstring(...),
# these should be a 'parent' data tier
# e.g. simhits when the other files are rechits
secondaryFileNames = cms.untracked.vstring(...),
...
)
Is there a FAQ for the E/gamma High-Level Trigger (HLT) ?
Yes, see
this link.
Are there any tools to match reconstructed objects to generated particles ?
Yes, see
this link.
Where does the E/gamma HLT calculate the distance in phi (delta phi) and eta (delta eta) between track and cluster ?
See
RecoEgamma/EgammaHLTProducers/src/EgammaHLTElectronDetaDphiProducer.cc
.
I'm trying to associate reconstructed to generated/simulated tracks, as described
here but I get the following error message:
No "TrackAssociatorRecord" record found in the EventSetup.
Please add an ESSource or ESProducer that delivers such a record.
even though I added
process.load('SimTracker.TrackAssociation.TrackAssociatorByHits_cfi')
to my configuration.
Solution: see
this link
.
How can I select events with at least one supercluster in a given eta range ?
The following sequence worked for me:
# combine barrel and endcap super clusters into one collection
process.barrelAndEndcapSuperClusters = cms.EDProducer("EgammaSuperClusterMerger",
src = cms.VInputTag(
cms.InputTag('correctedHybridSuperClusters'), # barrel
cms.InputTag('correctedMulti5x5SuperClustersWithPreshower') # endcap
))
# produce another collection based on the previous one containing only
# superclusters in the given eta range
process.selectedSuperClusters = cms.EDFilter("SuperClusterSelector",
filter = cms.bool(True),
src = cms.InputTag("barrelAndEndcapSuperClusters"),
# using the cut parser. See https://twiki.cern.ch/twiki/bin/view/CMS/SWGuidePhysicsCutParser
cut = cms.string('abs(eta()) <= 1.4442 || (abs(eta()) >= 1.566 && abs(eta()) <= 2.5)')
)
# we use a count filter here for counting the number
# of superclusters in the collection which contains all those
# within the fiducial volume.
#
# note that this does NOT produce a new collection but just
# vetoes the event or lets it pass
process.superClusterCountFilter = cms.EDFilter("EtMinSuperClusterCountFilter",
# input collection for this filter. We take the output
# of the previous module
# (see CommonTools/UtilAlgos/interface/ObjectCountEventSelector.h)
src = cms.InputTag("selectedSuperClusters"),
# require at least one object in the collection
# see CommonTools/UtilAlgos/interface/MinNumberSelector.h
minNumber = cms.uint32(1),
# minimum Et for the super clusters
# see CommonTools/RecoAlgos/plugins/EtMinSuperClusterSelector.h
etMin = cms.double(-1),
)
# put these three modules into one path.
# depending on how the filter is used, we could also
# put these three modules into a sequence which is then inserted
# in another path or use this path with the SelectEvents
# option of the PoolOutputModule.
process.superClusterFilterPath = cms.Path(process.barrelAndEndcapSuperClusters *
process.selectedSuperClusters *
process.superClusterCountFilter)
Can I easily match generated electrons with reconstructed superclusters ?
There is a module
MCTruthDeltaRMatcher
for this based on a delta R match (which probably does not
take into account the bending of the electron in the magnetic field).
See
this link
and the section 'Using
MCTruthDeltaRMatcher' on
this page.
How can I merge multiple EDM (CMSSW root) files into one ?
Is there a way to automatically determine the appropriate global tag in a python configuration file ?
Can't rfcp a file even though rfdir shows it
Trying to do:
rfcp /castor/cern.ch/.... /my/local/dir/
I get the following error message:
/castor/cern.ch/.... : File has no copy on tape and no diskcopies are accessible
even though I can see the file with
rfdir
.
Similarly, when trying to open this file from a
cmsRun
job, I get something like the following:
%MSG-s CMSException: AfterFile 07-Jan-2011 14:04:40 CET pre-events
cms::Exception caught in cmsRun
---- FileOpenError BEGIN
---- StorageFactory::open() BEGIN
Failed to open the file 'rfio:///castor/cern.ch/...' because:
---- RFIOFile::open() BEGIN
rfio_open(name='rfio:///?path=/castor/cern.ch/...', flags=0x0, permissions=0666) => error 'No such file or directory' (rfio_errno=0, serrno=2)
---- RFIOFile::open() END
---- StorageFactory::open() END
RootInputFileSequence::initFile(): Input file rfio:/castor/cern.ch/... was not found or could not be opened.
Error occurred while creating source PoolSource
---- FileOpenError END
Also
stager_qry -M
reported the following:
stager_qry -M /castor/cern.ch/...
Error 2/No such file or directory (File /castor/cern.ch/... (.......@castorns) not on this service class)
In my case, the problem was that the file (according to DBS) was stored at
caf.cern.ch
. The solution was
to log in to CAF (see
https://twiki.cern.ch/twiki/bin/view/CMS/CAF#Access_to_the_Interactive_CAF_cm ), then
initialize the environment (see
https://twiki.cern.ch/twiki/bin/view/CMS/CAFSETUP#Setup ) . After this
I could
rfcp
the file.
Is there a simple way to convert a JSON file (good lumi sections file) to a CMSSW configuration fragment ?
See the discussion
here
.
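As a rough idea of what such a conversion involves, here is a minimal plain-Python sketch. It assumes that the JSON file maps run numbers to lists of [first, last] lumi section pairs and that the target format is a list of 'run:lumi-run:lumi' strings; the function name and the numbers in the example are made up for this note:

```python
import json

def json_to_lumi_ranges(json_text):
    """Convert good-lumi JSON text into 'run:first-run:last' strings.

    Hypothetical helper; assumes {"run": [[first, last], ...], ...} input.
    """
    lumis = json.loads(json_text)
    ranges = []
    for run in sorted(lumis, key=int):           # runs in numerical order
        for first, last in lumis[run]:
            ranges.append("%s:%d-%s:%d" % (run, first, run, last))
    return ranges

if __name__ == "__main__":
    # made-up run and lumi section numbers, for illustration only
    example = '{"123456": [[1, 20], [25, 30]]}'
    for lumi_range in json_to_lumi_ranges(example):
        print(lumi_range)
```

Strings of this form could then presumably be passed to a cms.untracked.VLuminosityBlockRange assigned to process.source.lumisToProcess, but check the discussion linked above for the recommended way.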
Which types of parameters does ParameterSet
support ?
See the template specializations of the method
ParameterSet::getUntrackedParameter()
at
http://cmslxr.fnal.gov/lxr/source/FWCore/ParameterSet/src/ParameterSet.cc
.
I want to select single events (e.g. for scanning them) but the datasets are not available at my site. How do I select them 'from the grid' ?
See
this link.
How do I compile with debug symbols built into the binaries ?
use
scram b USER_CXXFLAGS=-g
How do I get the per-bunch crossing instantaneous luminosity for each event in CMSSW ?
see e.g.
Where is the Event class defined ?
--
AndreHolzner - 26-Feb-2010