How to Submit SiD Simulation and Reconstruction Jobs Using ILCDirac

Interfaces for all the SiD simulation and reconstruction software packages are available in ILCDIRAC and allow easy definition and submission of jobs using Python. This page includes Python examples for all available steps. It also explains how to use a flexible job submission script that submits common job configurations directly from the command line, so you can get started immediately.

Python Examples

The following statements are required to create an ILCDirac job.

# sys is needed for the sys.exit calls in the step examples further down
import sys

# import ILCDIRAC classes
from ILCDIRAC.Interfaces.API.DiracILC import DiracILC
from ILCDIRAC.Interfaces.API.ILCJob import ILCJob

# create an instance of ILCDIRAC and define a repository file to store the submitted job IDs
dirac = DiracILC ( True , "myRepositoryFile.txt" )

# create a new job
job = ILCJob ( )

# add any step to be executed here

job.setSystemConfig ( "x86_64-slc5-gcc43-opt" ) # needs to correspond to a config defined in the dirac configuration
job.setName ( "myJob" ) # just an identifier

# the following statements are optional
job.setOutputSandbox ( [ "*.log" , "*.mac", "*.xml", "*.lcsim" ] ) # being able to retrieve the steering files is useful for debugging
job.setInputSandbox ( [ "file1", "file2", ... ] ) # required if the job needs input
job.setOutputData ( [ "outputFile1", "outputFile2", ... ], "CERN-SRM", "/my/storage/path" ) # other possible storage elements are RAL-SRM, IN2P3-SRM, etc.
job.setCPUTime( cpuLimit ) # maximum is 300000 seconds
job.setJobGroup( "myGroup" ) # helps finding jobs that belong together
job.setBannedSites( [ "Site1", "Site2", ... ] )
job.setDestination( "LCG.CERN.ch" )

# submit the job
dirac.submit ( job )

In addition, any of the following steps can be added before the submit command to define what is executed in the job.

If several steps are defined, the output of one step is automatically used as the input for the following step (if no input is defined). Only certain combinations are supported, though. LCSim can follow an SLIC or SlicPandora step and SlicPandora can follow an LCSim step.
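The chaining rule above can be checked before submission with a small helper. This is a minimal sketch and not part of ILCDIRAC: the function name check_step_order and the step labels are hypothetical, it encodes only the combinations listed above, and it does not model the overlay step.

```python
# Hypothetical helper, not part of ILCDIRAC. It encodes only the step
# combinations named in the text: LCSim may follow SLIC or SlicPandora,
# and SlicPandora may follow LCSim.
ALLOWED_PREDECESSORS = {
    "LCSim": {"SLIC", "SlicPandora"},
    "SlicPandora": {"LCSim"},
}


def check_step_order(steps):
    """Return True if every step may follow the step directly before it."""
    for previous, current in zip(steps, steps[1:]):
        if previous not in ALLOWED_PREDECESSORS.get(current, set()):
            return False
    return True
```

For example, the chain SLIC, LCSim, SlicPandora, LCSim passes this check, while SLIC followed directly by SlicPandora does not.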

SLIC

res = job.setSLIC ( appVersion = "v2r9p8" , # has to be available in the dirac configuration
		detectorModel = "clic_sid_cdr" , # dirac will retrieve the detector geometry automatically from www.lcsim.org
		macFile = "mySlicMacro.mac" , # steering file
		inputGenfile = "LFN:/my/files/input.stdhep", # optional input file, if an LFN is given it is automatically added to the input sandbox
		nbOfEvents = 100 ,
		startFrom = 0 , # only set if you want to skip events from the input file
		outputFile = "slicOutput.slcio" # don't forget to add it also to the outputdata!
		)
# checking the return value is not required, but considered good practice
if not res['OK']:
	print res['Message']
	sys.exit(2)
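Because the same return-value check is repeated after every step, it can be wrapped in a small function. The sketch below is not part of ILCDIRAC: check_result is a hypothetical name, and it relies only on the 'OK' and 'Message' keys used in the example above.

```python
def check_result(res, step_name):
    """Raise RuntimeError if a DIRAC-style result dictionary reports failure.

    Hypothetical helper: it assumes only the 'OK' and 'Message' keys that
    appear in the examples on this page.
    """
    if not res["OK"]:
        raise RuntimeError("%s failed: %s" % (step_name, res["Message"]))
    return res
```

Raising an exception instead of calling sys.exit leaves it to the calling script to decide whether to abort or to skip the job.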

Overlay background

This step will download the required background files and overlay them in the next LCSim step present.

res = job.addOverlay( detector = "SID", # need to set which background samples to use
		energy = "3tev", # need to set which background samples to use
		BXOverlay = 60, # number of bunch crossings per signal event
		NbGGtoHadInts = 3.2, # number of events per bunch crossing
		NSigEventsPerJob = 100 # number of signal events
		)
# checking the return value is not required, but considered good practice
if not res['OK']:
	print res['Message']
	sys.exit(2)
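To get a feeling for how much background the overlay parameters imply, the three numbers can simply be multiplied. This is a rough estimate only: it assumes NbGGtoHadInts is a mean value per bunch crossing, the function name is hypothetical, and the number actually overlaid on an individual event may differ.

```python
def expected_background_events(bx_overlay, interactions_per_bx, signal_events):
    """Mean number of background events overlaid on the given signal events.

    Hypothetical helper: a plain product of the three overlay parameters.
    """
    return bx_overlay * interactions_per_bx * signal_events


# Values taken from the addOverlay example above.
mean_per_signal_event = expected_background_events(60, 3.2, 1)   # about 192
mean_per_job = expected_background_events(60, 3.2, 100)          # about 19200
```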

LCSim with background overlay

res = job.setLCSIM ( appVersion = "CLIC_CDR" , # has to be available in the dirac configuration
		xmlfile = "myLcsim.xml" , # steering file
		aliasproperties = "myAlias.properties" , # optional
		evtstoprocess = 100, # optional, default is -1
		inputslcio = [ "input1.slcio", "LFN:/my/files/input2.slcio", ... ], # can be a list of files, LFNs will be added to the input sandbox automatically
		outputFile = "lcsimOutput.slcio" # don't forget to add it also to the outputdata!
		)
# checking the return value is not required, but considered good practice
if not res['OK']:
	print res['Message']
	sys.exit(2)

SlicPandora

res = job.setSLICPandora ( appVersion = "CLIC_CDR" , # has to be available in the dirac configuration
		detectorgeo = "clic_sid_cdr" , # pandora.xml or detector name, dirac will then retrieve the detector geometry automatically from www.lcsim.org
		inputslcio = [ "input1.slcio", "LFN:/my/files/input2.slcio", ... ], # can be a list of files, LFNs will be added to the input sandbox automatically
		pandorasettings = "myPandoraSettings.xml" , # optional, if none given a default settings file will be taken
		nbevts = 100 , # optional, default is -1
		outputFile = "slicPandoraOutput.slcio" # don't forget to add it also to the outputdata!
		)
# checking the return value is not required, but considered good practice
if not res['OK']:
	print res['Message']
	sys.exit(2)

Job Submission Scripts

Some flexible command line scripts are shipped together with ILCDIRAC:

$ILCDIRAC/Interfaces/API/Examples/SIDChain
You can also check them out standalone from the repository:
svn co svn+ssh://svn.cern.ch/reps/dirac/ILCDIRAC/trunk/ILCDIRAC/Interfaces/API/Examples/SIDChain

LCSim Job

The script lcsimJob.py executes a single lcsim step with a user-provided steering file. It can run directly on the full output of a given production ID or on a provided list of LFNs. Since DIRAC does not know which output files to expect and upload, the script takes care of this: it parses the steering file and replaces placeholder strings with the correct file name for every job. Your steering xml should therefore contain the following strings instead of file names wherever output files are produced:
  • __outputSlcio__
  • __outputAida__
  • __outputRoot__
  • __outputDat__
  • __outputTxt__
All of these will be replaced by the output file name, which is based on the job title and the input file name, plus the respective file ending. If your script creates multiple files of the same type or a different file type you can simply extend the list of replacement strings that are looked for.
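The replacement described above could look roughly like the following sketch. It is not the code from lcsimJob.py: the function name, the file endings and the exact naming scheme (job title, underscore, input file stem) are assumptions made for illustration.

```python
import os

# Placeholder strings from the list above, mapped to a file ending.
# The endings and the naming scheme are assumptions for illustration.
PLACEHOLDERS = {
    "__outputSlcio__": ".slcio",
    "__outputAida__": ".aida",
    "__outputRoot__": ".root",
    "__outputDat__": ".dat",
    "__outputTxt__": ".txt",
}


def fill_placeholders(steering_text, job_title, input_file):
    """Replace placeholders and return (new text, list of output files)."""
    # Base the output name on the job title and the input file name.
    stem = os.path.splitext(os.path.basename(input_file))[0]
    base_name = "%s_%s" % (job_title, stem)
    output_files = []
    for placeholder, ending in sorted(PLACEHOLDERS.items()):
        if placeholder in steering_text:
            file_name = base_name + ending
            steering_text = steering_text.replace(placeholder, file_name)
            output_files.append(file_name)
    return steering_text, output_files
```

The returned list contains only the files whose placeholder actually occurs in the steering file, which is the information needed to declare the output data of the job. Supporting a further file type means adding one entry to the dictionary.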

If the job requires user code that is not available in the lcsim build, you have to provide it in a jar file (see the tutorial here). The easiest way is to place the jar file in a web-accessible location and set the URL directly in the lcsim steering file using jarURL instead of jar, so that lcsim fetches the file automatically at runtime. Alternatively, add the jar file to the input sandbox by appending -J <myCode.jar> to the submission command; the steering file should then define the jar file in the jar field.

Examples

The simplest way to submit a job is:
python lcsimJob.py -p <prodID> -l <myLcsimSteering.xml>
In this case all DST files from the given production will be processed. The output file path (default: detectorName/eventType/jobTitle) will be determined from the production meta data and the job name will be based on the steering file name.

Instead of using a production ID to define the input data, one can also provide a list of LFNs directly. In this case the LFN list has to be passed as a separate Python script that just contains a list of strings (the LFNs) called lfnlist, for example:

lfnlist = [ "LFN:/my/file1.slcio", ... ]
This convention is identical to the one created by the script dirac-repo-create-lfnlist.
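Such a file can also be written and read back with a few lines of Python. The sketch below is an illustration only: both function names are hypothetical, and loading the file with runpy is an assumption, since how lcsimJob.py reads the list is not described here.

```python
import runpy


def write_lfn_list(path, lfns):
    """Write a Python file that defines lfnlist as a list of strings."""
    with open(path, "w") as handle:
        handle.write("lfnlist = [\n")
        for lfn in lfns:
            handle.write("    %r,\n" % lfn)
        handle.write("]\n")


def read_lfn_list(path):
    """Execute the file and return the lfnlist variable it defines."""
    return runpy.run_path(path)["lfnlist"]
```

Reading the file back before submission is a cheap way to catch a missing quote or comma in a hand-edited list.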

When using such a file list as input, the minimal example looks like this:

python lcsimJob.py -i <myFileList.py> -l <myLcsimSteering.xml> -e <myEventType>
In addition to the file list, an event type has to be given, since no meta data is available to determine it automatically. Alternatively, one can specify the full output file path directly:
python lcsimJob.py -i <myFileList.py> -l <myLcsimSteering.xml> -o <my/output/path>
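The defaults described in these examples can be summarised in a short sketch. The function names are hypothetical and the exact way the script derives the job title from the steering file name is an assumption; only the detectorName/eventType/jobTitle layout is taken from the text.

```python
import os


def default_job_title(steering_file):
    """Derive a job title from the steering file name, dropping the ending."""
    return os.path.splitext(os.path.basename(steering_file))[0]


def output_path(detector, event_type, job_title, override=None):
    """Return the override path if given, else detectorName/eventType/jobTitle."""
    if override:
        return override
    return "/".join([detector, event_type, job_title])
```

With the default detector clic_sid_cdr, an event type myEventType and a steering file myLcsimSteering.xml, this gives clic_sid_cdr/myEventType/myLcsimSteering, unless an explicit output path is passed.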

Command Line Parameters

| Short | Long | Takes argument | Description |
| a | alias | yes | Sets the alias.properties file to use. Necessary if your desired detector model is not in the lcsim repository. |
| A | agent | no | Submits the job in agent mode for debugging. The job is executed on the local machine but is also registered with the job monitoring system. |
| b | banlist | yes | Sets the file with the list of banned sites. Default is bannedSites.py, shipped with the script. |
| D | detector | yes | Sets the detector name. Only used for determining the output path, since lcsim gets the geometry files automatically. Default is clic_sid_cdr. |
| e | eventtype | yes | Sets the name of the event type used for the output path. Overrides the one obtained from the production meta data. |
| f | files | yes | Sets the maximum number of files to process, e.g. 5 to process only the first five files. |
| h | help | no | Shows this list of parameters. |
| i | input | yes | Sets the file with the list of LFNs to process. See above for details. |
| J | jar | yes | Sets which jar file should be added to the input sandbox. |
| l | lcsimxml | yes | Sets the xml steering file to be used. |
| L | lcsim | yes | Sets the lcsim version to use. Has to be available in DIRAC. Default is CLIC_CDR. |
| M | merge | yes | Sets the number of input files processed in each job. |
| n | events | yes | Sets the number of events to process per job. Default is all events in each input file. |
| o | outputpath | yes | Sets the path where the output data will be stored directly, instead of using detectorName/eventType/jobTitle. |
| O | override | no | By default, jobs which would produce a file that already exists in the file catalog are skipped. Warning: if this option is set, the old file will be deleted and the job will be submitted. |
| p | prodid | yes | Sets the production ID to be processed. |
| R | recfiles | no | If running over a production, the REC files are used instead of the DST files. |
| S | storageelement | yes | Sets the storage element where the output data will be stored. Default is CERN-SRM. |
| t | time | yes | Sets the maximum CPU time limit in seconds. Default is 100000, maximum is 300000. |
| T | title | yes | Sets the job title, used in the output path and in the job monitoring. Default is the steering file name. |
| v | verbose | no | Switches off the output that is usually printed. |
| y | strategy | yes | Sets the tracking strategy file used in the job. Only required for special jobs. |
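How the files and merge options interact can be pictured as a truncation followed by a split into groups, one group per job. This is a sketch of that idea rather than the script's actual code: the function name is hypothetical and the default of one file per job is an assumption.

```python
def group_input_files(lfns, max_files=None, files_per_job=1):
    """Split a list of LFNs into per-job groups.

    Hypothetical helper: max_files mirrors the files option and
    files_per_job mirrors the merge option.
    """
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    # Keep only the first max_files entries, if a limit is given.
    selected = lfns if max_files is None else lfns[:max_files]
    # Cut the remaining list into consecutive groups of files_per_job.
    return [selected[i:i + files_per_job]
            for i in range(0, len(selected), files_per_job)]
```

With seven input files, a limit of five files and two files per job, this yields three jobs, the last of which processes a single file.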
Topic revision: r5 - 2011-07-04 - ChristianGrefe