Command line option parsing
Being revised
Command line option parsing is a way to set the values of variables when you run your executable from the command line.
After reading this page, you should be able to set command line options from the command line, add new ones, change the default values of the default command line options, and use them in FWLite.
VarParsing is recommended when you have only a few command line options. If the number of command line options keeps growing, configuring the job through a Python configuration file is easier and more flexible; you can find an example at WorkBookFWLiteExamples.
So far there is still no simple way to use CRAB for FWLite scripts. VarParsing can be used with both condor and CRAB; if you want to use CRAB, you should switch to an EDAnalyzer.
Using VarParsing from command line
Let us start right away with an example that shows the use of command line options with a Python script called copyPickMerge_cfg.py. This script copies a given number of events from a given data file to an output file. The number of events, the names of the input data files and the name of the output file can all be given as command line options, as follows.
cmsRun copyPickMerge_cfg.py inputFiles=one.root inputFiles=two.root inputFiles=third.root outputFile=MyOutputFile.root maxEvents=100
OR
cmsRun copyPickMerge_cfg.py inputFiles=one.root,two.root,third.root outputFile=MyOutputFile.root maxEvents=100
OR
cmsRun copyPickMerge_cfg.py inputFiles=myFiles.txt outputFile=MyOutputFile.root maxEvents=100
where myFiles.txt is a text file that lists the data files one.root, two.root and third.root on three separate lines.
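The first two forms above give the same list of input files. The following minimal pure-Python sketch shows how repeated name=value arguments and comma-separated values can both accumulate into one list. It is an illustration only and not the VarParsing implementation; the helper name collect_list_option is hypothetical.

```python
def collect_list_option(args, name):
    """Gather every value given for a list-type option called `name`.

    Hypothetical helper for illustration; VarParsing does its own parsing.
    """
    values = []
    for arg in args:
        key, sep, value = arg.partition("=")
        if sep and key == name:
            # A single argument may carry several comma-separated values.
            values.extend(v for v in value.split(",") if v)
    return values


repeated = ["inputFiles=one.root", "inputFiles=two.root",
            "inputFiles=third.root", "maxEvents=100"]
comma_separated = ["inputFiles=one.root,two.root,third.root", "maxEvents=100"]

print(collect_list_option(repeated, "inputFiles"))
print(collect_list_option(comma_separated, "inputFiles"))
```

Note that the comma-separated form must not contain spaces, because the shell would split the value into separate arguments.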
Using a text file that contains the names of the data files is particularly useful when you select them from a dataset in DBS discovery using the plain option.
[Screenshot: location of the plain option for a dataset in DBS, marked with a red circle]
After you click plain, the list of files is displayed. You can simply copy and paste it into a text file and use this text file for the inputFiles option.
Alternatively, you can click on plain, copy the link, and paste it within double quotes into the command below to download the list of files into a text file:
wget -O myListOfFiles.txt "<paste>" --no-check-certificate
Note: the link in the above example is
https://cmsweb.cern.ch/dbs_discovery/getLFNsForSite?dbsInst=cms_dbs_prod_global&site=all&datasetPath=/JetMETTau/Run2010A-Jul23ReReco_PreProd_v1/RECO&what=txt&userMode=user&run=*
You can read more about this at WorkBookDataSamples.
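Before handing such a file list to cmsRun, it can help to check what it actually contains. The following is a minimal sketch, assuming one file name per line; read_file_list is a hypothetical helper and not part of CMSSW, and the temporary file only stands in for myListOfFiles.txt.

```python
import os
import tempfile


def read_file_list(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


# Demonstration with a temporary stand-in for myListOfFiles.txt.
with tempfile.TemporaryDirectory() as tmpdir:
    list_path = os.path.join(tmpdir, "myListOfFiles.txt")
    with open(list_path, "w") as handle:
        # The blank line is deliberate: it should be skipped.
        handle.write("one.root\n\ntwo.root\nthird.root\n")
    files = read_file_list(list_path)

print(files)
```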
VarParsing in config file
To understand how variable parsing is used inside a configuration file, we look in detail at the copyPickMerge_cfg.py script used above. This configuration file imports a Python module called VarParsing.py. VarParsing.py contains the infrastructure to parse the variable definitions passed through cmsRun to the configuration script copyPickMerge_cfg.py.
The variables passed to cmsRun are defined in the configuration script copyPickMerge_cfg.py as follows (the comments starting with "# Note:" are explanations added here and are not part of the code):

import FWCore.ParameterSet.Config as cms

# Note: the line below always has to be included to make VarParsing work
from FWCore.ParameterSet.VarParsing import VarParsing

# Note: in the line below, options is an instance of the VarParsing object, created with the 'analysis' tag
options = VarParsing ('analysis')

# Note: here we define our own two VarParsing options
# add a list of strings for events to process
options.register ('eventsToProcess',
                  '',
                  VarParsing.multiplicity.list,
                  VarParsing.varType.string,
                  "Events to process")
options.register ('maxSize',
                  0,
                  VarParsing.multiplicity.singleton,
                  VarParsing.varType.int,
                  "Maximum (suggested) file size (in Kb)")
options.parseArguments()

# Note: options.inputFiles in the lines below is simply replaced by, say, one.root
process = cms.Process("PickEvent")
process.source = cms.Source ("PoolSource",
    fileNames = cms.untracked.vstring (options.inputFiles),
)

# Note: here we use our own VarParsing options defined above
if options.eventsToProcess:
    process.source.eventsToProcess = cms.untracked.VEventRange (options.eventsToProcess)

process.maxEvents = cms.untracked.PSet(
    input = cms.untracked.int32 (options.maxEvents)
)

process.Out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string (options.outputFile)
)

if options.maxSize:
    process.Out.maxSize = cms.untracked.int32 (options.maxSize)

process.end = cms.EndPath(process.Out)
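The two if statements in the configuration above apply a setting only when the corresponding option was actually given. The sketch below illustrates that pattern in plain Python, assuming that an unset list option evaluates as an empty list and an unset integer option as 0. The SimpleNamespace objects stand in for parsed options, optional_settings is a hypothetical helper, and the string "1:10-1:20" is only a placeholder value.

```python
from types import SimpleNamespace


def optional_settings(options):
    """Return only the settings whose option differs from its empty default."""
    settings = {}
    if options.eventsToProcess:      # an empty list is falsy, so it is skipped
        settings["eventsToProcess"] = list(options.eventsToProcess)
    if options.maxSize:              # 0 is falsy, so it is skipped
        settings["maxSize"] = options.maxSize
    return settings


# Stand-ins for parsed options: one left at the defaults, one customised.
defaults = SimpleNamespace(eventsToProcess=[], maxSize=0)
customised = SimpleNamespace(eventsToProcess=["1:10-1:20"], maxSize=100000)

print(optional_settings(defaults))
print(optional_settings(customised))
```

With this pattern, parameters that were not requested on the command line are simply never added to the configuration.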
Default options
The default options defined in VarParsing.py are as follows:
- maxEvents
- totalSections
- section
- inputFiles
- outputFile
Adding new options
If you need to add new options, you can define them in your configuration file. In the configuration file copyPickMerge_cfg.py discussed here, the two new options added are eventsToProcess and maxSize. They are defined as follows in copyPickMerge_cfg.py:
options.register ('eventsToProcess',
                  '',
                  VarParsing.multiplicity.list,
                  VarParsing.varType.string,
                  "Events to process")
options.register ('maxSize',
                  0,
                  VarParsing.multiplicity.singleton,
                  VarParsing.varType.int,
                  "Maximum (suggested) file size (in Kb)")
To use the option maxSize, pass it on the command line as follows:
outputFile=output.root maxSize=100000
where maxSize=100000 sets the maximum (suggested) size of the output file in kB.
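To show what registering an option with a default, a multiplicity and a type means in practice, here is a toy stand-in written in plain Python. It is not the VarParsing class and does not reproduce its behaviour in detail; MiniOptions and its methods are hypothetical names used only for this sketch.

```python
class MiniOptions:
    """Toy stand-in for an option registry; NOT the VarParsing class."""

    def __init__(self):
        self._spec = {}    # name -> (is_list, converter, info)
        self.values = {}   # name -> current value

    def register(self, name, default, is_list, converter, info=""):
        """Declare an option with its default, multiplicity and type."""
        self._spec[name] = (is_list, converter, info)
        # A list option starts from a fresh list; '' becomes an empty list.
        self.values[name] = list(default) if is_list else default

    def parse(self, args):
        """Apply name=value arguments, converting each value to its type."""
        for arg in args:
            name, sep, raw = arg.partition("=")
            if not sep or name not in self._spec:
                raise ValueError("unknown or malformed option: %r" % arg)
            is_list, converter, _info = self._spec[name]
            if is_list:
                self.values[name].extend(
                    converter(v) for v in raw.split(",") if v)
            else:
                self.values[name] = converter(raw)


options = MiniOptions()
options.register("outputFile", "default.root", False, str,
                 "Name of the output file")
options.register("maxSize", 0, False, int,
                 "Maximum (suggested) file size (in Kb)")
options.register("eventsToProcess", "", True, str, "Events to process")
options.parse(["outputFile=output.root", "maxSize=100000"])
print(options.values)
```

The point of declaring the type at registration is that maxSize arrives in the configuration as an integer rather than as the string typed on the command line.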
Command Line Options in FWLite (PyROOT)
In FWLite (PyROOT) you can read the command line options in your script as shown in the following lines:
from FWCore.ParameterSet.VarParsing import VarParsing
options = VarParsing ('python')
options.parseArguments()
and then load the events using options:
from DataFormats.FWLite import Events, Handle
events = Events (options)
Finally, you can run the script with the options. Example:
python3 DataFormats/FWLite/examples/bin/Jpsi_peak.py inputFiles=/eos/cms/store/relval/CMSSW_12_1_0_pre2/RelValPsi2SToJPsiPiPi_14/MINIAODSIM/121X_mcRun3_2021_realistic_v1-v3/10000/934b45ff-3f99-4f69-9648-bfd129ab7d90.root secondaryInputFiles=/eos/cms/store/relval/CMSSW_12_1_0_pre2/RelValPsi2SToJPsiPiPi_14/MINIAODSIM/121X_mcRun3_2021_realistic_v1-v3/10000/934b45ff-3f99-4f69-9648-bfd129ab7d90.root maxEvents=10
Important: VarParsing is the only way to use the secondary file solution in FWLite/PyROOT.
See examples:
https://github.com/jmejiagu/MyFWLite_12 (https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookFWLitePython)
In CMSSW, DataFormats/FWLite unit test:
https://github.com/gpetruc/cmssw/blob/master/DataFormats/FWLite/test/pyroot_multichain.py
https://github.com/gpetruc/cmssw/blob/master/DataFormats/FWLite/test/run_all_t.sh#L26
Command Line Options in FWLite
In FWLite you can read out the command line options in your C++ executable as indicated in the following lines:
#include "PhysicsTools/FWLite/interface/CommandLineParser.h"
// ...
// initialize command line parser
optutl::CommandLineParser parser ("Analyze FWLite Histograms");
// set defaults
parser.integerValue ("maxEvents" ) = 1000;
parser.integerValue ("outputEvery") = 10;
parser.stringValue ("outputFile" ) = "analyzeFWLiteHistograms.root";
// parse arguments
parser.parseArguments (argc, argv);
int maxEvents_ = parser.integerValue("maxEvents");
unsigned int outputEvery_ = parser.integerValue("outputEvery");
std::string outputFile_ = parser.stringValue("outputFile");
std::vector<std::string> inputFiles_ = parser.stringVector("inputFiles");
Note: Don't forget to make the object known to the compiler by including the proper .h file. You will also need to link the FWLite package to your executable to make the implementation known. In the first part of this example the CommandLineParser is initialised; in the second part default values for the command line options outputFile, maxEvents and outputEvery are defined. In the third part the command line parameters listed before and the additional parameter inputFiles are passed on. You can find the full executable in the FWLiteHistograms.cc file in the bin directory of the FWLite package. Have a look at WorkBookFWLiteExamples to see the command line parsing in FWLite in action. The lines above were taken from Example 2. Please refer to the experts in case of further questions.
If you are using PythonProcessDesc to parse _cfg.py configuration files, you must pass the argc and argv variables to its constructor so that the _cfg.py parser has access to the options given on the command line. An example is given below.
#include <iostream>
#include "FWCore/PythonParameterSet/interface/MakeParameterSets.h"

int main(int argc, char* argv[])
{
  // load framework libraries
  gSystem->Load( "libFWCoreFWLite" );
  AutoLibraryLoader::enable();

  // require at least one argument for this simple example, which should be
  // the python cfg file
  if ( argc < 2 ) {
    std::cout << "Usage : " << argv[0] << " [parameters.py]" << std::endl;
    return 0;
  }

  // get the python configuration
  // PythonProcessDesc builder(argv[1]); /* this method prevents option parsing in the _cfg.py file */
  PythonProcessDesc builder(argv[1], argc, argv); // <--- use this constructor.
  edm::ParameterSet cfg = *builder.processDesc()->getProcessPSet();

  /* your analysis here */

  return 0;
}