Ganga Utilities
Ganga is the be-all and end-all LHCb software submission tool. If you can do it on the command line, you can do it in Ganga, mostly with less hassle. If you want to submit a local test job and then a grid job without changing your environment or writing new scripts, Ganga's your friend. Press Ctrl-D to exit Ganga.
GreigCowan and RobLambert have put together utilities for some common operations needed when using Ganga to submit, monitor and manage jobs. Hopefully some of these utilities will eventually be folded into the main Ganga release.
Ganga utils is configurable, and is frequently expanded by users. By default it is configured to work at Edinburgh, and has been adopted at Imperial and Nikhef. If you'd like to add your own institute, give us a bell!
Getting Started
If Ganga Utils is not directly provided at your institute, or you want a different version, or you want to edit the code, you can get access to these utilities by checking them out of the LHCb SVN repository.
The examples here are for the bash shell on Unix, so if you're not using bash, translate them accordingly.
There are three ways to obtain Ganga Utils, depending on exactly what you want to do (methods A, B and C below).
-A) Using PYTHONPATH
This is the simplest way to use Ganga Utils:
$ svn checkout svn+ssh://svn.cern.ch/reps/lhcb/packages/trunk/GangaUtils/python/GangaUtils <somewhere_local> #authenticated access
$ svn checkout http://svn.cern.ch/guest/lhcb/packages/trunk/GangaUtils/python/GangaUtils <somewhere_local> #anonymous access
Once you have checked out the package to somewhere local, add the location to your Python path in your .bashrc:
export PYTHONPATH=${PYTHONPATH}':/Home/uname/<somewhere_local>'
-B) Using sys.path
This is the next-simplest way, in case you don't want to edit .bashrc, so it is a bit safer. Check out the package following method A, but then append to PYTHONPATH at runtime, rather than in your .bashrc.
The PYTHONPATH variable is only required to keep track of the package location. Python parses it internally into sys.path, which you can also edit at runtime:
<<ganga 5.X>>
[Configuration]
StartupGPI = import sys as __sys__
    __sys__.path.append('/Home/uname/<somewhere_local>')
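As a minimal sketch of what the run-time approach amounts to in plain Python, the helper below appends a directory to sys.path only if it is not already there. The function name add_to_sys_path and the directory in the usage line are illustrative assumptions, not part of Ganga Utils.

```python
import os
import sys


def add_to_sys_path(directory):
    """Append a directory to sys.path unless it is already listed.

    Hypothetical helper for illustration; returns the absolute path used.
    """
    path = os.path.abspath(os.path.expanduser(directory))
    if path not in sys.path:
        sys.path.append(path)
    return path


# Usage (placeholder directory): after this call, modules in that
# directory can be imported for the rest of the session.
checkout = add_to_sys_path("~/my_ganga_utils_checkout")
```

Unlike an export in .bashrc, the change lasts only for the current Python session.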
-C) Using CMT
The third option is the safest, but also the most cumbersome.
Ganga Utils is also wrapped in a CMT package, so if you like you can make your own wrapper to import it automatically:
- create a MyStuff local CMT project
- create a MyStuffSys with use GangaUtils v
- getpack GangaUtils head into that project
- SetupProject Ganga --runtime "MyStuff"
Loading ganga utils
If it is not loaded automatically at your institute, in Ganga 5.X you can load it at start-up by adding it to your gangarc:
<<ganga 5.X>>
[Configuration]
StartupGPI = import ganga_utils as gu
    gu.configure() #configuration options can go in the brackets
You can then call the functions by doing:
gu.function_name()
You can also do:
from ganga_utils import *
if you want all of the functions in the global namespace.
Feedback on these is welcome, as are patches and contributions of your own functions.
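The gangarc fragments on this page spread the StartupGPI value over two lines. The sketch below assumes that gangarc follows the same ini-style rules as Python's standard configparser module, in which a continuation line has to be indented to count as part of the previous value. It parses such a fragment and lists the statements found. The helper name startup_statements is hypothetical, and Ganga's own config reader may differ in detail.

```python
import configparser

# The second statement is indented so that it continues the StartupGPI value.
GANGARC_SNIPPET = """\
[Configuration]
StartupGPI = import ganga_utils as gu
    gu.configure()
"""


def startup_statements(text):
    """Return the statements stored in [Configuration] StartupGPI."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(text)
    value = parser["Configuration"]["StartupGPI"]
    return [line.strip() for line in value.splitlines() if line.strip()]


statements = startup_statements(GANGARC_SNIPPET)
```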
Updating to the latest version
This can be done automatically inside Ganga using gu.update():
gu.update()
reload(gu)
gu.configure('mylocation') #configure as you like; reloading discards the configuration
Similarly, functions to view/edit the code and to commit changes are provided: gu.edit() and gu.commit('I fixed the bug').
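The note that reloading discards the configuration follows from how Python re-executes a module on reload: module-level state is set back to whatever the source file assigns. The sketch below shows this with a throwaway module. The module name toy_utils and its SETTINGS dictionary are invented for the demonstration and say nothing about how ganga_utils stores its settings. In Python 3 the built-in reload() used above lives at importlib.reload.

```python
import importlib
import os
import sys
import tempfile

# Source of a toy module that keeps its configuration in module-level state.
MODULE_SOURCE = '''
SETTINGS = {"location": "default"}

def configure(location):
    SETTINGS["location"] = location
'''


def demo_reload_resets_settings():
    """Configure a toy module, reload it, and report the setting before and after."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "toy_utils.py"), "w") as handle:
            handle.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            toy = importlib.import_module("toy_utils")
            toy.configure("mylocation")
            before = toy.SETTINGS["location"]
            toy = importlib.reload(toy)  # re-runs the module source
            after = toy.SETTINGS["location"]
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("toy_utils", None)
    return before, after
```

This is why configure() has to be called again after every reload.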
Key features
- Enables a lot of ganga features to be more easily accessed
- Configures the behaviour of Condor and SGE automatically
- Managed Locally
- User-editable
- gu.info() and gu.man() print some help information.
- tab completion of gu. will list all members
- help(gu.function) will give the usage of each function
- Jobs can be cast from int, string or float. All job functions accept ints, strings or floats as references to jobs or subjobs, for example:
- gu.function(jobs(20))
- j=jobs(20); gu.function(j)
- gu.function(20)
- gu.function('20')
- gu.function(jobs(20).subjobs[0])
- js=j.subjobs[0]; gu.function(js)
- gu.function(20.0)
- gu.function('20.0')
- Most job functions also work on lists and slices, for example:
- gu.function([jobs(20),jobs(40)])
- j_list=[jobs(20),jobs(40)]; gu.function(j_list)
- gu.function([20,40])
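The casting described above can be pictured with a small stand-alone sketch. It assumes that a reference such as 20.0 or '20.0' means subjob 0 of job 20, in line with Ganga's job.subjob numbering. The function names parse_job_ref and normalise_refs are hypothetical, and the real utilities resolve references to Ganga job objects rather than tuples.

```python
def parse_job_ref(ref):
    """Turn an int, float or string into (job_id, subjob_id).

    subjob_id is None when the reference names a whole job.
    Assumption: 'J.S' (or the float J.S) means subjob S of job J.
    """
    if isinstance(ref, bool):
        raise TypeError("booleans are not job references")
    if isinstance(ref, int):
        return (ref, None)
    if isinstance(ref, float):
        ref = repr(ref)  # 20.0 becomes '20.0'
    if isinstance(ref, str):
        job_part, dot, sub_part = ref.strip().partition(".")
        return (int(job_part), int(sub_part) if dot else None)
    raise TypeError("unsupported job reference: %r" % (ref,))


def normalise_refs(refs):
    """Accept a single reference or a list/tuple of them; always return a list."""
    if isinstance(refs, (list, tuple)):
        return [parse_job_ref(r) for r in refs]
    return [parse_job_ref(refs)]
```

Under this reading a float cannot tell 20.1 from 20.10, so a string is the safer way to name a subjob with index 10 or above.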
Configuration
There are many variables which need to be set to configure e.g. Condor/SGE in Ganga 4.X. With Ganga utils set up through the features of Ganga 5.X, this is no longer required.
[Configuration]
StartupGPI = import ganga_utils as gu
    gu.configure()
This is configured to set up all the required variables in Ganga at Edinburgh and Imperial. You do still need to edit your .bashrc, though.
To alter the configuration, call the gu.configure() function, supplying arguments.
Configuration options include:
| Option | Description | Example |
| location | Set up paths and commands specific to the institute | gu.configure(location='edinburgh') |
| editor | Choose an editor to use on options and requirements files | gu.configure(editor='emacs') |
| userdata | Choose a path to a directory of gaudi cards or ascii files of lists of data | gu.configure(userdata='~/cmtuser/MyPyOpts/Data/') |
| urldata | Choose a web directory which contains gaudi cards or ascii files of lists of data | gu.configure(urldata='http://www.ph.ed.ac.uk/~gcowan1/data/') |
Typical use case
j=jobs[-1] #get the last job in the list
j=gu.local_copy(j) #copy it and correct the cmt paths
j.inputdata=gu.dataset_from_twiki() #change dataset
gu.options(j) #change the options file
j.submit() #submit it
gu.qstat() #check the status of the queue
#wait 'til the job completes
gu.subjob_checkAndFail(j) #has it really completed?
gu.subjob_resubmit(j) #submit failed subjobs
#wait 'til the job completes
gu.subjob_checkAndFail(j) #has it really completed?, maybe it missed some files
k=gu.resplit(j) #get a new job with the missed files
k.splitter.filesPerJob=3 #change this job in some way
k.submit() #submit this new job
#wait 'til the job completes
gu.subjob_checkAndFail(k) #has it finally all completed?
gu.subjob_merge([j,k]) #merge this output with that in the previous job
gu.cd(j) #go to the output directory of j to check the merged files
!root -l
gu.subjob_remove([j,k]) #remove the subjob input and output
#but leave the merged files alone... do last!
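The resplit step in the workflow above can be pictured as follows: gather the input files of the subjobs that did not complete, then regroup them with a new number of files per job. This is only a rough sketch of the idea using plain dictionaries. The subjob layout and the names missed_files and split_by_files are assumptions for illustration, not the actual gu.resplit() implementation.

```python
def missed_files(subjobs):
    """Collect the input files of every subjob whose status is not 'completed'."""
    files = []
    for subjob in subjobs:
        if subjob["status"] != "completed":
            files.extend(subjob["inputfiles"])
    return files


def split_by_files(files, files_per_job):
    """Group files into consecutive chunks of at most files_per_job entries."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    return [files[i:i + files_per_job]
            for i in range(0, len(files), files_per_job)]


# Toy stand-ins for subjobs: one completed, two failed.
subjobs = [
    {"status": "completed", "inputfiles": ["a.dst", "b.dst"]},
    {"status": "failed", "inputfiles": ["c.dst", "d.dst"]},
    {"status": "failed", "inputfiles": ["e.dst", "f.dst"]},
]
new_groups = split_by_files(missed_files(subjobs), 3)
```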
Creating an LHCbDataset from the list of locally available files
The ganga_utils module contains a method for automatically creating an LHCbDataset from a .dat file listed in the LHCbEdinburghGroupDataFiles, LHCbImperialGroupData or AfsDataSets table, i.e.
j.inputdata = gu.dataset_from_twiki('name_of_file.dat')
where j is your job and name_of_file.dat is the name of the file that you are interested in. A blank filename will give the list of files from LHCbEdinburghGroupDataFiles, LHCbImperialGroupData or AfsDataSets.
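The layout of the .dat files is not described on this page. As a rough sketch, assuming one file name per line with blank lines and lines starting with # ignored, a list of data files could be read like this. The function name parse_file_list is hypothetical, and it returns a plain Python list rather than an LHCbDataset.

```python
def parse_file_list(lines):
    """Return the file names found in an iterable of text lines.

    Assumed format: one file name per line; blank lines and lines
    starting with '#' are skipped.
    """
    names = []
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            names.append(entry)
    return names


# Usage with in-memory lines; an open text file works the same way.
example = parse_file_list(["first.dst\n", "\n", "# a comment\n", "  second.dst  \n"])
```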
-- RobLambert - 31 Mar 2009