Ganga Tutorial 1
We start with a very simple problem which doesn't require knowledge of any other LHCb software - finding the prime factors of integers. Be careful what you copy and paste from this wiki, as python doesn't like random whitespace (it's best if you don't copy and paste anything).
Ganga Setup
First, set the environment for Ganga in a fresh terminal:
SetupProject Ganga
which gives you the latest version (5.4.3 at the time of writing). If you last used ganga before version 5.4, you may need to do
ganga -g
to update your .gangarc file to include some of the newer options.
For this tutorial, we need to add an extra Ganga configuration file, so for csh-like shells do
setenv GANGA_CONFIG_PATH ${GANGA_CONFIG_PATH}:/afs/cern.ch/user/j/jwilliam/public/GangaTutorial/Tutorial.ini
or for bash-like shells
export GANGA_CONFIG_PATH=${GANGA_CONFIG_PATH}:/afs/cern.ch/user/j/jwilliam/public/GangaTutorial/Tutorial.ini
If you don't have access to /afs/cern.ch/user/j/jwilliam/public/GangaTutorial (e.g. you're using a machine that doesn't have afs, or something is wrong with my afs account), then you can do everything in this tutorial except submitting to the grid by simply doing this instead:
setenv GANGA_CONFIG_PATH ${GANGA_CONFIG_PATH}:GangaTutorial/Tutorial.ini
or for bash-like shells
export GANGA_CONFIG_PATH=${GANGA_CONFIG_PATH}:GangaTutorial/Tutorial.ini
You will not need to do either of these for standard LHCb running.
Now start an interactive session:
ganga
*** Welcome to Ganga ***
Version: Ganga-5-4-3
Documentation and support: http://cern.ch/ganga
Type help() or help('index') for online help.
This is free software (GPL), and you are welcome to redistribute it
under certain conditions; type license() for details.
In [1]:
You should now see the Ganga prompt! Check to make sure that the application for this tutorial was loaded (we need PrimeFactorizer):
In [1]:plugins('applications')
Out[1]: ['GaudiPython', 'Executable', 'Brunel', 'Moore', 'DaVinci', 'Panoptes', 'Gauss', 'Boole', 'Gaudi', 'Vetra', 'Root', 'Euler', 'PrimeFactorizer']
Ignore the warning if you don't have a valid Grid proxy (you should only see this once; we'll create the proxy when we need it below). You can check which plugins are available to you in each category in your current Ganga session using plugins. Try using the help utility to see if you can figure out how to list all of the available plugins in all categories:
In [2]:help(plugins)
This runs less, so type q to exit. Ganga provides help information on just about every object, method, etc. Try this first if you get stuck.
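(If you can't work it out from the help: in Ganga 5, calling plugins() with no argument should list the available plugins in every category, though you should confirm this against the help output for your version.)
In [3]: plugins()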
If you would like to run the files locally, copy them to a directory you own and change the contents of the Tutorial.ini file.
cp -R /afs/cern.ch/user/j/jwilliam/public/GangaTutorial ~/public/
The Tutorial.ini file should be edited to look something like this:
[Configuration]
RUNTIME_PATH = /afs/cern.ch/user/a/auser/public/GangaTutorial
Now add your Tutorial.ini file to GANGA_CONFIG_PATH.
setenv GANGA_CONFIG_PATH ${GANGA_CONFIG_PATH}:/afs/cern.ch/user/a/auser/public/GangaTutorial/Tutorial.ini
or for bash-like shells
export GANGA_CONFIG_PATH=${GANGA_CONFIG_PATH}:/afs/cern.ch/user/a/auser/public/GangaTutorial/Tutorial.ini
Prime Number Factorization
In this tutorial, our task is to find the prime factors of a given integer. Finding very large prime factors requires a lot of CPU time. This tutorial provides code that can factorize any number whose prime factors are among the first 15 million known prime numbers. We have 15 tables of 1 million prime numbers each, and we can scan the tables in search of the factors. The python modules we will use (which are already written for you) include a collection of prime number tables called a PrimeTableDataset, which is used by the PrimeFactorizer application.
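To get a feel for what the application does with each table, here is a minimal python sketch of trial division against a (tiny) table of primes; it is purely illustrative and is not the actual PrimeFactorizer code:
# Illustrative sketch only -- not the real PrimeFactorizer implementation.
def factorize_with_table(n, primes):
    """Return the factors of n found in 'primes' as a list of (prime, power) pairs."""
    factors = []
    for p in primes:
        power = 0
        while n % p == 0:   # divide out p as many times as possible
            n //= p
            power += 1
        if power:
            factors.append((p, power))
    return factors

print factorize_with_table(1925, [2, 3, 5, 7, 11, 13])   # prints [(5, 2), (7, 1), (11, 1)]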
Running a Factorization Job using Ganga
Let's start with a small example. The goal is to find the prime factors of the integer 1925. For such a small number, we (clearly) only need the first prime number data table (recall that each table contains 1 million prime numbers). At the Ganga prompt, type the following:
In [1]: j = Job()
In [2]: j.application = PrimeFactorizer(number=1925)
In [3]: j.inputdata = PrimeTableDataset(table_id_lower=1, table_id_upper=1)
At this point, we've created a Job object but we haven't run anything yet. We're free to edit its attributes as much as we like prior to submitting the job. For actual LHCb jobs, the application might be DaVinci while the inputdata could be a list of LHCb data files; the idea and most of the syntax, though, are the same as in this simple example. To see all of the job's attributes, do
In [4]:j
Out[4]: Job (
status = 'new' ,
name = '' ,
...
backend = Local ( ... )
)
Notice that the backend is set to Local (which is the default value since we didn't specify where we wanted the job to run). This means that the job will run in the background on the local machine.
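For orientation only, an actual LHCb analysis job follows the same pattern; the attribute values below are illustrative placeholders (and the exact dataset syntax depends on your GangaLHCb version), so don't try to run this one:
lhcb_job = Job()
lhcb_job.application = DaVinci()                             # an LHCb application instead of PrimeFactorizer
lhcb_job.inputdata = LHCbDataset(files=['LFN:/lhcb/...'])    # a list of LHCb data files (placeholder LFN)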
OK, let's submit the job:
In [5]: j.submit()
Ganga.GPIDev.Lib.Job : INFO submitting job 0
Ganga.GPIDev.Adapters : INFO submitting job 0 to Local backend
Ganga.GPIDev.Lib.Job : INFO job 0 status changed to "submitted"
We can check the status of the job by doing
In [6]:j.status
This will either be submitted, running or completed. If the job hasn't finished yet, wait for a few seconds and check again (for such a small number, the job should finish very quickly).
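If you'd rather not keep checking by hand, you can let python do the waiting at the Ganga prompt; a minimal sketch (assuming the usual Ganga status strings, e.g. 'completed' and 'failed'):
import time
while j.status not in ('completed', 'failed'):   # poll until the job finishes one way or the other
    time.sleep(2)
print j.status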
We can see what files were output by the job by doing
In [7]:j.peek()
total 8.0K
-rw-r--r-- 1 jwilliam z5 0 Jan 9 14:24 __syslog__
-rw-r--r-- 1 jwilliam z5 232 Jan 9 14:24 stdout
-rw-r--r-- 1 jwilliam z5 4.9K Jan 9 14:24 stderr
-rw-r--r-- 1 jwilliam z5 86 Jan 9 14:24 __jobstatus__
-rw-r--r-- 1 jwilliam z5 26 Jan 9 14:24 factors-1925.dat
All Ganga jobs return the standard output and error in the files stdout and stderr. This job has also produced the file factors-1925.dat. We can view the contents of this file using
In [8]:j.peek('factors-1925.dat')
which opens the file using less (use standard less commands to scroll etc.; type q to quit).
The file should contain the factors [(5, 2), (7, 1), (11, 1)]; let's check whether this is correct:
In [9]:(5**2)*7*11 == 1925
Out[9]: True
Remember, standard python syntax works at the Ganga prompt!
OK, so we've run a job and checked the output using Ganga's magic but for a real analysis you'll often want direct access to the file. So, where is factors-1925.dat? It's in the job's output directory. You can obtain the full path of this directory via
In [10]:j.outputdir
Out[10]: /afs/cern.ch/user/j/jwilliam/gangadir/workspace/jwilliam/LocalAMGA/0/output/
This is a normal directory that you own; thus, you have permission to access the files there from a process independent of Ganga. So, you could exit Ganga and examine factors-1925.dat using, e.g., cat on the Linux command line... or, you could do this from Ganga. You can access shell commands from the Ganga prompt using ! as follows:
In [11]:!ls ~/.globus
usercert.pem userkey.pem
In [12]:!cat $j.outputdir/factors-1925.dat
[(5, 2), (7, 1), (11, 1)]
Notice that you can use the $ character to access python variables when using ! to access the shell!
A few other basic convenience features which you can play around with are scrolling through the history and using TAB completion. Try using arrow-UP to scroll through the history of the Ganga commands you've executed so far (it works the same as in a shell). You can use TAB completion on keywords, variables, objects, etc. Try the following (where TAB and arrow-UP mean hit those keys, don't type them out):
In [13]:j.app<TAB>
In [13]:j.application
In [13]:j.application<arrow-UP>
In [13]:j.application = PrimeFactorizer(number=1925)
The arrow-UP key scrolls through the history of commands that match what's been typed so far. In this case it scrolls through all commands which start with j.application (which is only 1 command so far, but try it again later on in the tutorial!). This behavior is similar to using ESC-P in tcsh or CTRL-R in bash.
Splitting a Ganga Job into Multiple Concurrent Jobs
Now that you've seen some of the basics of Ganga, let's try something a little more interesting - factorizing a very large integer. For this we'll need a PrimeTableDataset which contains all 15 tables of prime numbers. To speed things up, we will also split the job into 5 local subjobs which will run concurrently.
First, define a job as before, but with a larger number and using all 15 prime number tables (feel free to use the arrow-UP and TAB keys to do this instead of typing it all out!):
In [1]: j = Job()
In [2]: j.application = PrimeFactorizer(number=118020903911855744138963610)
In [3]: j.inputdata = PrimeTableDataset()
In [4]: j.inputdata.table_id_lower = 1
In [5]: j.inputdata.table_id_upper = 15
Now add a splitter to divide up the task of finding all the prime factors (here we'll make 5 subjobs):
In [6]: j.splitter = PrimeFactorizerSplitter(numsubjobs=5)
For LHCb jobs, similar splitters are provided to split up jobs which run over multiple data files, etc.
We also want to add a merger to merge the output from each of the 5 subjobs:
In [7]: j.postprocessors = TextMerger(files=['factors-118020903911855744138963610.dat'])
When all 5 subjobs are complete, the merger will merge the contents of each of the 5 factors-118020903911855744138963610.dat files into a single file in the master job's output directory (we'll look at what this means below).
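If you're curious what the merger actually does: once all of the subjobs are complete, it is roughly equivalent to the following manual concatenation (a sketch only; the real TextMerger also writes a merge_summary file alongside the merged output):
import os
fname = 'factors-118020903911855744138963610.dat'
merged = ''
for sj in j.subjobs:                      # each subjob has its own output directory
    merged += open(os.path.join(sj.outputdir, fname)).read()
print merged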
OK, now submit the job (actually, the 5 jobs) just like we did above:
In [8]: j.submit()
Ganga.GPIDev.Lib.Job : INFO submitting job 1
Ganga.GPIDev.Adapters : INFO submitting job 1.0 to Local backend
Ganga.GPIDev.Lib.Job : INFO job 1.0 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 1.1 to Local backend
Ganga.GPIDev.Lib.Job : INFO job 1.1 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 1.2 to Local backend
Ganga.GPIDev.Lib.Job : INFO job 1.2 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 1.3 to Local backend
Ganga.GPIDev.Lib.Job : INFO job 1.3 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 1.4 to Local backend
Ganga.GPIDev.Lib.Job : INFO job 1.4 status changed to "submitted"
You can check the status of all 5 jobs by simply doing:
In [9]:j.status
If any of the jobs is still running, the status of the master job will be listed as running. If all 5 jobs are completed, the master job's status will also be completed. Wait until all 5 jobs are done (it should take less than a minute) before moving on (in the meantime you can play around with help, e.g. try help(j.submit)... remember, type q to quit).
Once the jobs are complete, let's look at the output of one of the subjobs (do exactly what we did above):
In [10]:j.subjobs[2].peek()
total 18K
-rw-r--r-- 1 jwilliam z5 0 Jan 9 18:08 __syslog__
-rw-r--r-- 1 jwilliam z5 564 Jan 9 18:08 stdout
-rw-r--r-- 1 jwilliam z5 15K Jan 9 18:08 stderr
-rw-r--r-- 1 jwilliam z5 86 Jan 9 18:08 __jobstatus__
-rw-r--r-- 1 jwilliam z5 17 Jan 9 18:08 factors-118020903911855744138963610.dat
In [11]:j.subjobs[2].peek('factors-118020903911855744138963610.dat')
The file should contain the factor [(141650963, 1)]. Each of the j.subjobs is itself a Job (try printing it), so you can do anything on a subjob that you would do on an independent job.
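For example, you can loop over the subjobs just as you would over any python list; a quick sketch:
for sj in j.subjobs:
    print sj.id, sj.status, sj.outputdir   # each subjob has its own id, status and output directory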
Now examine the merged output of all the jobs:
In [12]:j.peek()
total 2.0K
-rw-r--r-- 1 jwilliam z5 653 Jan 9 18:08 factors-118020903911855744138963610.dat.merge_summary
-rw-r--r-- 1 jwilliam z5 869 Jan 9 18:08 factors-118020903911855744138963610.dat
In [13]:j.peek('factors-118020903911855744138963610.dat')
The file should contain the factors [(2, 1), (3, 1), (5, 1), (7, 1), (15485867, 1)] [] [(141650963, 1)] [] [(256203221, 1)] (some of the prime number tables don't contain any factors of this particular number). You can check if they're right at the Ganga prompt like we did above. Notice that the master job doesn't have the stdout and stderr files since it was never actually run itself. In fact, had we not added the merger to the job, there would be no output in the master job's directory.
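As a final sanity check, the merged factors really do multiply back to the original number, and you can verify this right at the Ganga prompt:
(2 * 3 * 5 * 7 * 15485867 * 141650963 * 256203221) == 118020903911855744138963610   # True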
Running Ganga Jobs on the Grid
NOTE: This appears to be broken as of 17 Dec 2012, and ganga complains that is_prepared is not set for PrimeFactorizer. Perhaps this error is why there is an attached document with a corrected PrimeFactorizer.py file, which is supposed to be an updated version for Ganga > 5.7 (current is 5.8).
For many LHCb jobs (which often involve processing large amounts of data), running concurrently on the local machine isn't enough. When a large number of CPUs is required for a job, we need the grid! Specifically, we want to run on the LHC Computing Grid (LCG). For LHCb jobs, this involves the DIRAC workload manager.
As an example of running on the grid, we'll run the same set of jobs we ran above but using a different backend. We could retype all of the required info from the previous job definition or, better yet, we could use the TAB and arrow-UP functionality to re-enter the info. An easier way is to just copy the previous Job object, then change the backend so that the jobs run via Dirac:
In [14]: j = j.copy() # we could've also used Job(j), etc.
In [15]: j.backend = Dirac()
In [16]: j.outputfiles = [SandboxFile('factors-118020903911855744138963610.dat')]
Notice that we had to add a SandboxFile to the outputfiles. This is to tell ganga that we want this file returned to us after the grid job is completed.
Now just submit the jobs the same way as before (since this is the first time we've done something that requires a grid proxy, you'll be asked for your grid password if you don't currently have a valid grid proxy on this machine):
In [17]: j.submit()
Ganga.GPIDev.Lib.Job : INFO submitting job 2
Enter Certificate password:
Ganga.GPIDev.Adapters : INFO submitting job 2.0 to Dirac backend
Ganga.GPIDev.Lib.Job : INFO job 2.0 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 2.1 to Dirac backend
Ganga.GPIDev.Lib.Job : INFO job 2.1 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 2.2 to Dirac backend
Ganga.GPIDev.Lib.Job : INFO job 2.2 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 2.3 to Dirac backend
Ganga.GPIDev.Lib.Job : INFO job 2.3 status changed to "submitted"
Ganga.GPIDev.Adapters : INFO submitting job 2.4 to Dirac backend
Ganga.GPIDev.Lib.Job : INFO job 2.4 status changed to "submitted"
Congratulations! You've just submitted 5 jobs to the LCG grid via Dirac.
Let's check the status of the jobs:
In [17]: j.subjobs
Out[2]:
Job slice: jobs(2).subjobs (5 jobs)
--------------
# fqid status name subjobs application backend backend.actualCE
# 2.0 completed PrimeFactorizer Dirac LCG.Glasgow.uk
# 2.1 completed PrimeFactorizer Dirac LCG.PDC.se
# 2.2 completed PrimeFactorizer Dirac LCG.GRIDKA.de
# 2.3 completed PrimeFactorizer Dirac LCG.PIC.es
# 2.4 completed PrimeFactorizer Dirac LCG.USC.es
Notice that the site which ran (or is running, if the job hasn't finished yet) each job is displayed along with the current status. Hopefully your jobs will start soon, but it's possible (depending on where the job is running) that some of your jobs will stay in the submitted state for a while. If all the jobs are finished, go ahead and check the master's output. If some are still running, check some of the subjobs' output and check that it matches what was output by the same subjob when run locally. Once any of the jobs is running or completed, you've run on The Grid!
Running a Later Version of Ganga
If you are running a version of Ganga that is at least v5.7.0, you may encounter problems when submitting the jobs to the Grid. To resolve these, the application needs to be converted to a prepared application. This requires creating a local version of GangaTutorial, as described above.
Inside the Lib directory of the GangaTutorial folder, download the corrected version (you may need to specify the -k flag to ignore the unknown CERN certificate):
curl https://twiki.cern.ch/twiki/pub/LHCb/GangaTutorial1/PrimeFactorizer.py.txt -o PrimeFactorizer.py
This converts the PrimeFactorizer app to a prepared app.
Recreate the job in Ganga (from scratch, not using j.copy()) and try submitting the job to the Grid (Dirac) again.
A Few Other Features
The Job Registry
All of the jobs you've ever run (and not deleted) are contained in the list jobs:
In [1]: jobs
Out[1]:
Job slice: jobs (3 jobs)
--------------
# fqid status name subjobs application backend backend.actualCE
# 0 completed PrimeFactorizer Local lxplus242.cern.ch
# 1 completed 5 PrimeFactorizer Local
# 2 completed 5 PrimeFactorizer Dirac
If we wanted to rerun the first job, we could do the following:
In [2]: j = jobs(0).copy()
In [3]: j.submit()
The last job can always be accessed directly using python list syntax: jobs[-1].
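You can also filter the registry; for example, something like the following should pick out only your completed jobs (check help(jobs.select) to confirm the arguments supported by your Ganga version):
In [4]: jobs.select(status='completed')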
Job Templates
Often when running LHCb jobs you will want to rerun a particular type of job (e.g. Monte Carlo production jobs). Rather than always copying a previous job, you can set up a template of it. To make a template of the first job we ran, do
In [1]:t = JobTemplate(jobs(0))
In [2]:t.name = 'small-prime-factorizer'
You don't have to name it, but this will be useful later on to help you find the template you're looking for. The list of all your job templates is stored in the python list templates (the same way jobs are stored in jobs). Try printing it.
Now, create a new job from the template and run it:
In [3]:j = Job(t) # or j = Job(templates(0)),...
In [4]:j.submit()
Job templates are quite useful because they're easy and fast to search through.
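For example, once your templates are named, something like the following should find the one we made above (a sketch, assuming the template registry supports select() in the same way the job registry does):
In [5]:templates.select(name='small-prime-factorizer')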
Removing Jobs
If you want to remove a job to save disk space or just because it's obsolete, simply do (try it):
In [1]: jobs
Out[1]:
Job slice: jobs (3 jobs)
--------------
# fqid status name subjobs application backend backend.actualCE
# 0 completed PrimeFactorizer Local lxplus242.cern.ch
# 1 completed 5 PrimeFactorizer Local
# 2 completed 5 PrimeFactorizer Dirac
... plus whatever other jobs you've submitted so far ...
In [2]:jobs(0).remove()
Ganga.GPIDev.Lib.Job : INFO removing job 0
In [3]: jobs
Out[3]:
Job slice: jobs (2 jobs)
--------------
# fqid status name subjobs application backend backend.actualCE
# 1 completed 5 PrimeFactorizer Local
# 2 completed 5 PrimeFactorizer Dirac
... plus whatever other jobs you've submitted so far ...
This removes the job workspace (i.e. the output directory and all output files) and all traces of the job in Ganga's registries... so be careful when doing this!
The GANGA Box
You can persist (store) any GANGA object in the GANGA box. E.g., you could create a bookkeeping query object:
In[1]: bkq = BKQuery()
In[2]: bkq.path = '/LHCb/Collision09/Beam450GeV-VeloOpen-MagDown/Real Data + RecoToDST-07/90000000/DST'
which can be used at any time to get an up-to-date list of the LHCb data files matching this query:
In[3]: data = bkq.getDataset()
This data could then be used as the input data for Ganga jobs. To store this object so that you don't need to recreate it every time you want to update the query, simply do:
In[4]: box.add(bkq,'example bk query')
You can then access this object at any time. E.g., try quitting and restarting GANGA and then do:
In[1]: data = box['example bk query'].getDataset()
In[2]:data[0]
Out[2]: LogicalFile (
name = '/lhcb/data/2009/DST/00005842/0000/00005842_00000194_1.dst'
)
Writing Your Own Functions
You can also write your own functions and load them into Ganga. Exit Ganga and create the file ~/.ganga.py:
def foo(): print 'bar'
Ganga will automatically load this file, so restart it and try the following:
In [1]:foo()
bar
As an exercise, try to write your own function that creates a job from your 'small-prime-factorizer' template, submits it and returns a reference to the Job object.
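If you get stuck, here is one possible sketch of such a function (it assumes the template created earlier in this tutorial and the name we gave it); you could add it to ~/.ganga.py:
def run_small_prime_job(name='small-prime-factorizer'):
    """Create a job from the named template, submit it and return the Job object."""
    for t in templates:                 # the template registry is iterable, like jobs
        if t.name == name:
            j = Job(t)
            j.submit()
            return j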
Etc...
There are many more features in Ganga which I don't have time to cover here. Remember to use the help function if you're unsure about something (try help() if you're unsure about everything!). There are answers to common questions on the FAQ wiki page, and the user's guides are also quite useful.
Good luck and happy grid-ing!
--
MikeWilliams - 09 Jan 2009