This is the unofficial page of the Ganga CMSSW Plugin (GangaCMS)

At Uniandes, we have developed a Ganga plugin for the CMS software framework, which we call GangaCMS. Ganga is a great application for submitting jobs to any batch system, including the Grid. Its motto is "Configure once - run anywhere".

Important disclaimer: this is not an official initiative from CMS nor from the Ganga team. The information contained on this page is for general information purposes and is valid only within our research group.

Get started with Ganga

  • First of all, make sure you can run Ganga on your machine (or wherever you will submit jobs from). Details on installation are found on the official pages. Ganga is installed on the lxplus machines, and one way to set it up from your account is to add an alias in your shell configuration file (.bashrc or .cshrc):

# this is to get Ganga added to your environment
alias gangaenv 'eval `/afs/ --version=latest --interactive --experiment=generic csh` '

  • Note: in our Tier-3, Ganga is installed under /opt/exp_soft/ganga/:

# this is to get Ganga added to your environment
alias gangaenv 'eval `/opt/exp_soft/ganga/ --version=latest --interactive --experiment=generic sh` '

Get the Plugin

  • Get a copy of the plugin. At the moment it is in our group SVN repository (not included in the official Ganga release):

svn co svn+ssh://

First time Configuration

  • After downloading the plugin, you are ready to run Ganga for the first time. Set up the environment and start Ganga with the option "-g":

<lxplus249> gangaenv
Setting up Ganga 5.3.1 (csh,generic)
<lxplus249> ganga -g

The -g option creates a hidden configuration file for Ganga ( .gangarc ). We need to add a few lines there to drive the plugin. Open .gangarc in your favourite editor and add/edit the following lines (adapt them to your case):

## ....... Edit the following lines


gangadir = /afs/

## ....... Add the following lines anywhere in the configuration file
## .. dataTwiki is optional - this is the URL to a twiki where dataset files are located (in ascii files) 

dataOutput = /castor/
dataTwiki =

copyCmd = rfcp
mkdirCmd = rfmkdir

  • These options correspond to:
    • RUNTIME_PATH: to tell Ganga where the plugin is located
    • gangadir: Ganga creates a repository for your jobs. Tell Ganga where you want this repository to be created (needs space on disk)
    • [CMSSW] and [CMSCAF]: these sections contain the corresponding options to be given to the GangaCMS plugin (for example, dataOutput tells Ganga where is the massive storage element to save the output from your jobs)
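The .gangarc file uses the standard INI-style layout, so its sections and options can be inspected with Python's configparser. The sketch below is only an illustration of the options discussed above: the file content and paths are hypothetical examples, not an official template.

```python
# Sketch: read a .gangarc-style file and look up the plugin options.
# The section/option names follow the text above; values are made up.
from configparser import ConfigParser

GANGARC_EXAMPLE = """
[Configuration]
RUNTIME_PATH = /home/user/GangaCMS
gangadir = /home/user/gangadir

[CMSSW]
dataOutput = /castor/cern.ch/user/u/user
copyCmd = rfcp
mkdirCmd = rfmkdir
"""

cfg = ConfigParser()
cfg.read_string(GANGARC_EXAMPLE)

# RUNTIME_PATH tells Ganga where the plugin lives;
# gangadir is where the job repository is created.
plugin_path = cfg.get("Configuration", "RUNTIME_PATH")
copy_cmd = cfg.get("CMSSW", "copyCmd")
```

This is handy for sanity-checking your configuration before starting Ganga.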

  • Regarding Grid, the following lines could be edited in the .gangarc file:

#  Enables/disables the support of the GLITE middleware


#  sets the LCG-UI environment setup script for the GLITE middleware


#  sets the name of the grid virtual organisation
VirtualOrganisation = cms

A full list of the current [CMSSW] options and their defaults:

  • arch: platform/architecture (default: slc4_ia32_gcc345)
  • cmsswdir: path to the CMSSW installation (default: $VO_CMS_SW_DIR)
  • version: default version of CMSSW (default: CMSSW_3_2_5)
  • dataOutput: where output data should go (default: $HOME/scratch0)
  • dataTwiki: points to a Twiki where the datafile list can be uploaded (default: empty)
  • workArea: top directory where the user creates the CMSSW working area (default: $HOME)
  • dbsPath: path to the script (default: /afs/.../cms/dbs-client/DBS_2_0_6/lib/DBSAPI)
  • dbsCommand: the command-line DBS client

The Application

A job in Ganga is made up of different building blocks. The two main blocks are the application and the backend. In our case, the application is the cmsRun executable.

At the moment, the cmsRun application has the following attributes:

  • platform: OS platform where the application is supposed to run (default: slc4_ia32_gcc345)
  • version: CMSSW version (default: CMSSW_3_2_5)
  • args: simple list of arguments associated to a list of parameters (default: empty)
  • uselibs: whether the application depends on user-built libraries (0=false, 1=true) (default: 0)
  • cfgfile: configuration file for cmsRun (default: empty)

Other components of a Job

(Figure: GangaJob.png, from the Ganga homepage: the building blocks of a job in Ganga.)

Job splitting


A very simple job splitter was implemented: SplitByFiles. As its name indicates, a job is split into n subjobs given a partition of the input dataset. You construct the splitter object by passing a Ganga File containing a plain list of the dataset file names, and by telling it what type of data is going to be used ("local" prepends the prefix "file:", "castor" prepends "rfio:", etc.). Here is a snippet defining the input dataset that the splitter will partition:

ff      = File(name='/opt/CMS/CMSSW_2_2_3/src/ForGangaTest/SimpleAnalyzer/files.txt')
fdata = CMSDataset( ff , 'local' )
myjob.inputdata = fdata
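The splitting itself can be pictured as partitioning the prefixed file list into fixed-size chunks. The toy sketch below only illustrates that semantics, using the filesPerJob and maxFiles attributes shown later in the full script; the function name and signature are illustrative, not the plugin's actual API.

```python
# Toy sketch of what a SplitByFiles-style splitter does: prefix each dataset
# file according to the storage type, then chunk the list into subjob inputs.
PREFIXES = {"local": "file:", "castor": "rfio:"}

def split_by_files(filenames, storage, files_per_job, max_files=-1):
    """Return one input-file list per subjob (illustrative stand-in)."""
    if max_files >= 0:                       # maxFiles = -1 means "take all"
        filenames = filenames[:max_files]
    prefixed = [PREFIXES[storage] + f for f in filenames]
    return [prefixed[i:i + files_per_job]
            for i in range(0, len(prefixed), files_per_job)]

subjobs = split_by_files(["a.root", "b.root", "c.root"], "local", 2)
# each element of `subjobs` is the input file list of one subjob
```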


Ganga comes with an Argument Splitter: you provide a list of n argument sets and Ganga builds n subjobs, each one with its specific set of arguments. We adapted this functionality to modify the configuration file that drives the cmsRun application. You first need to identify the parameter to change, its type, and the value (argument) it should take in each subjob:

# You want the following configuration parameter to change in each job:

process.source = cms.Source("EmptySource", 
                                                  firstEvent = cms.untracked.uint32(1001) )

You need to use the cmsRun() application method add_cfg_Parameter, which takes two arguments: the parameter and its type. Ganga will append the appropriate line at the end of the file and pass the argument. You can add as many lines as you want, but make sure they also match the number of arguments in each list, i.e.

parameter_1 = parameter_type_1 ------> [ [ arg_1_1, 
parameter_2 = parameter_type_2 ------>     arg_1_2,
parameter_n = parameter_type_n ------>     arg_1_n ],

parameter_1 = parameter_type_1 ------>    [ arg_2_1, 
parameter_2 = parameter_type_2 ------>      arg_2_2,
parameter_n = parameter_type_n ------>      arg_2_n ],

# app is your application object of type cmsRun()
# in this example we will change the first event of each subjob.

# Register the parameter and its type
app.add_cfg_Parameter( 'process.source.firstEvent' , 'cms.untracked.uint32' )

# List of arguments for this job
arguments = [ [1] , [11] , [21] , [31] ]

# Set the Argument Splitter for this job
myjob.splitter = ArgSplitter( args = arguments )

# ArgSplitter will produce 4 subjobs in this case

The effect on the file will be, for example, process.source.firstEvent = cms.untracked.uint32( 1 ) for subjob 1.
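The append-and-fill mechanism described above can be sketched in plain Python. The function below is a hypothetical stand-in for what happens per subjob, not the plugin's actual implementation: each registered (parameter, type) pair becomes one appended line, filled with that subjob's argument.

```python
# Sketch: for each subjob, append one line per registered (parameter, type)
# pair to the cmsRun configuration text, using that subjob's arguments.
def render_subjob_cfg(cfg_text, parameters, args):
    """parameters: list of (name, type) pairs; args: one argument per pair."""
    lines = [f"{name} = {ptype}( {arg} )"
             for (name, ptype), arg in zip(parameters, args)]
    return cfg_text + "\n" + "\n".join(lines) + "\n"

params = [("process.source.firstEvent", "cms.untracked.uint32")]
arguments = [[1], [11], [21], [31]]          # one argument list per subjob

cfgs = [render_subjob_cfg("# base cfg", params, a) for a in arguments]
# cfgs[0] now ends with:
# process.source.firstEvent = cms.untracked.uint32( 1 )
```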

More: ArgSplitter in the Ganga documentation.

Job merging

Ganga comes with some great merging plugins, among them RootMerger, which collects the ROOT output from all subjobs and merges it (using hadd). RootMerger has the attributes files, overwrite and ignorefailed, all of them self-explanatory. Here is an example of a RootMerger definition:

rm = RootMerger()
rm.files = ['histo.root']    # files to merge
rm.overwrite = True          # overwrite output files
rm.ignorefailed = True       # ignore root files that failed to open
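To make the merge semantics concrete, here is a toy dict-based illustration of what an hadd-style merge does: histograms with the same name from different subjob outputs are combined bin by bin, and failed files are skipped when ignorefailed is set. The real RootMerger delegates to ROOT's hadd; this sketch only shows the idea.

```python
# Toy illustration of hadd-style merging: sum bin contents of same-named
# histograms across subjob outputs. A None entry stands for a file that
# failed to open (handled according to ignorefailed).
def merge_histograms(outputs, ignorefailed=True):
    merged = {}
    for out in outputs:
        if out is None:                      # a file that failed to open
            if ignorefailed:
                continue
            raise IOError("failed to open a subjob output")
        for name, bins in out.items():
            if name not in merged:
                merged[name] = list(bins)
            else:
                merged[name] = [a + b for a, b in zip(merged[name], bins)]
    return merged

subjob_outputs = [{"histo": [1, 2, 3]}, {"histo": [0, 1, 1]}, None]
total = merge_histograms(subjob_outputs)
```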


The following script illustrates all main characteristics of a job configuration:

from GangaCMS.Lib.CMSexe import *

#construct the cmsRun application object
app = cmsRun()
app.uselibs = 1
app.cfgfile = File(name='/opt/CMS/CMSSW_2_2_3/src/ForGangaTest/SimpleAnalyzer/')
app.version = 'CMSSW_2_2_3'

#construct the job with backend Local
myjob = Job( application = app, backend = 'Local' )

#set the data you want to get back from the job
myjob.outputsandbox = ['histo.root']
#define a data set
ff      = File(name='/opt/CMS/CMSSW_2_2_3/src/ForGangaTest/SimpleAnalyzer/files.txt')
fdata = CMSDataset( ff , 'local' )
myjob.inputdata = fdata

#create a splitter
sp = SplitByFiles()
sp.filesPerJob = 1
sp.maxFiles = -1

#create a root merger
rm = RootMerger()
rm.files = ['histo.root']
rm.overwrite = True
rm.ignorefailed = True

myjob.splitter = sp
myjob.merger = rm

#submit the job
myjob.submit()

LXPLUS and CAF queues

This table summarizes the CAF batch queues available in addition to the usual LSF queue on lxplus ( 1nh, 1nd, 1nw ):

Name       max jobs/user  max length
cmscaf1nh  100            1 norm. hour
cmscaf1nd  100            1 norm. day
cmscaf1nw  10             1 norm. week

(cmscaf1nd replaces cmscaf8nh from July 2009).
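The queue limits in the table above suggest a simple selection rule: take the shortest queue whose time limit covers the expected job length. The queue names and limits come from the table; the helper function itself is just an illustrative sketch, not part of Ganga.

```python
# Sketch: pick the shortest CAF queue that can hold a job of the given
# expected length (in normalized hours), per the table above.
QUEUE_LIMITS_HOURS = {"cmscaf1nh": 1, "cmscaf1nd": 24, "cmscaf1nw": 168}

def pick_queue(expected_hours):
    for name, limit in sorted(QUEUE_LIMITS_HOURS.items(), key=lambda kv: kv[1]):
        if expected_hours <= limit:
            return name
    raise ValueError("job too long for any CAF queue")

# e.g. a 5 normalized-hour job fits in the one-day queue, cmscaf1nd
```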


Available Backends Runtime Handlers

Here is a list of the backend runtime handlers we have implemented so far:

Backend  Tested  Dataset
Local    yes     CMSDataset


With the LCG backend one sends jobs directly to the Grid using the GLITE middleware. Depending on your needs, some configuration is required:

  • In your .gangarc:

search for [CMSSW]

# this is the path to your output files on your favorite Storage Element: for example
dataOutput = /dpm/

search for [LCG]:

#here you put the name of the Server that runs as Storage Element
DefaultSE = 

  • In your job script make sure you select the LCG backend:

#... construct the job with backend LCG

myjob = Job( application = app, backend = 'LCG' )
myjob.backend.middleware = 'GLITE'

#... specify here the Computing Element where you want your job to run
myjob.backend.CE = ''


A very simple CRAB wrapper has been implemented. It basically helps to create a job configuration file in a consistent way and submit it to CRAB.


DBS Search

To do

There are lots of exciting things to be developed:

  • Integrate with other APIs (DBS-API for instance)
  • Talk to CRAB Server directly
  • Clean up the code; anything else?

-- AndresOsorio - Mar 2009


Topic revision: r23 - 2009-09-07 - AndresOsorio