ARC Academy 2010

Introduction

Welcome to the ARC Academy, part of the NGIn School at NorduGrid 2010. The aim of this short course is to give novice grid users a feel for what the grid is about and let them submit some simple first jobs, while still offering relevant information and more complex exercises for experienced users.

The tutorial consists of two parts:

  1. A series of introductory lectures. Program here
  2. A set of exercises, which can be found on this page.

Prerequisites

To follow the exercise part of this tutorial, you need:

  • A computer with web access
  • A grid certificate issued by the NorduGrid Certificate Authority (CA), or from elsewhere if you're already part of the ATLAS Virtual Organization (VO). This is needed to get access to the resources made available for the tutorial. If you do not have a grid certificate, a test certificate can be provided. The test certificate, however, is not a member of the ATLAS VO and therefore cannot access any ATLAS data.
  • For the ganga exercise, you need a linux computer running a RHEL4/5 (or compatible) distribution. If you're on another system, logging in to an external RHEL machine via ssh will work.

Exercises

The exercises given here are meant for users at various levels. If you already know the content of one exercise, feel free to skip on to the next one. However:

  • The steps in part 1 (regarding the grid proxy) must be followed by everyone
  • Parts 2 and 3 are essentially independent, though for part 3 we recommend that you at least read through part 2 first.

1) Getting a certificate, making a voms proxy

Getting a certificate

In order to use the grid you need a certificate provided by a Certificate Authority. The general instructions for requesting a grid certificate can be found at the NorduGrid Certificate Authority.

However, since getting a certificate normally takes a couple of days, we have provided some temporary certificates.

To get a temporary certificate

  • Download and unpack the certificate tar-ball
  • Follow the instructions given in the package at ngin2010-certificate/README.

Logging in to the grid

First, log in to the grid by creating a proxy, which will be used to identify you when using the grid. The proxy is time limited; the default lifetime is 12 hours.

Create your proxy with

grid-proxy-init

followed by your grid password. If you are using one of the test certificates, the password is given in the ngin2010-certificate/README file in the tarball.

You can view information about the validity of your proxy with

grid-proxy-info

If you need a proxy with a different validity time, you can create it with

 grid-proxy-init -valid hh:mm

However you will not need this for these exercises.

Once you are finished using the grid you can destroy your proxy (log out) by typing

grid-proxy-destroy

2) The ARC middleware, direct job submission

This part of the tutorial will guide you through how to obtain and use the ARC middleware. For the usage we will focus on the ARC command line client. For more information please see the ARC User Manual.

Getting the ARC middleware client

If you do not have a working copy of the ARC middleware, it can be obtained from the NorduGrid download page. Just click the 'Standalone client package' button.

  • Download and unpack the standalone client
  • Set up the package from inside the unpacked directory
    cd nordugrid-arc-standalone-...
    source setup.sh

Submitting a simple job

Consider the script hellogrid.sh provided in the examples directory of the examples tar-ball. In order to create a grid job that will execute this script, we need a job description of the following form

& (executable=hellogrid.sh) 
(stdout=hello.out) 
(stderr=hello.err) 
(gmlog=gridlog) 
(cputime=10) 

Now save the above job description in hellogrid.xrsl and you are ready to send it to a cluster.
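If you would rather generate job descriptions than type them, the description above can be assembled from a list of attribute/value pairs. The following is only a minimal Python sketch: build_xrsl and hello_job are hypothetical names introduced here for illustration, not part of ARC, and the helper only reproduces the simple (attribute=value) form used in this tutorial.

```python
def build_xrsl(attributes):
    """Render (name, value) pairs as an xRSL description of the form used above.

    Hypothetical helper for illustration; it does no quoting or validation.
    """
    relations = ["({}={})".format(name, value) for name, value in attributes]
    return "& " + "\n".join(relations)


# The attributes of the hellogrid job from this tutorial.
hello_job = [
    ("executable", "hellogrid.sh"),
    ("stdout", "hello.out"),
    ("stderr", "hello.err"),
    ("gmlog", "gridlog"),
    ("cputime", 10),
]

print(build_xrsl(hello_job))
```

Redirecting the printed text into hellogrid.xrsl should give a file equivalent to the one described above.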

To submit the job without specifying a cluster, run

ngsub hellogrid.xrsl

If you are using one of the test certificates, you should specify the cluster to submit to:

ngsub -c pikolit.ijs.si hellogrid.xrsl

If the submission is successful you will be provided with a job-id of the form gsiftp://pikolit.ijs.si:2811/jobs/225321262768030254571653.
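The job-id has the shape of a URL, so if you want to keep track of which cluster a job went to (for example in your own bookkeeping script), it can be taken apart with Python's standard urllib.parse module. This is a sketch under that assumption; parse_job_id is a hypothetical helper, not an ARC command.

```python
from urllib.parse import urlparse


def parse_job_id(job_id):
    """Split a job-id of the form shown above into its visible components."""
    parts = urlparse(job_id)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # The last path component is the identifier of the job on the cluster.
        "job_number": parts.path.rsplit("/", 1)[-1],
    }


info = parse_job_id("gsiftp://pikolit.ijs.si:2811/jobs/225321262768030254571653")
print(info["host"], info["port"], info["job_number"])
```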

To see the status of your jobs you can either

  • check the status of a specific job
    ngstat "job-id"
  • check the status of all your jobs
    ngstat -a

If you get an error message saying that the job was not found just wait a bit and try again.

Once a job has finished, its status will be either FINISHED (if successful) or FAILED (if an error happened).

  • To retrieve the output from a specific job
    ngget "job-id"
  • To retrieve the output from all jobs
    ngget -a

Once the output from a job has been retrieved, all information about the job on the cluster is cleaned unless the --keep option is given to ngget. Try to resubmit a job and retrieve the output with

ngget --keep "job-id"

Then try to clean the job with

ngclean "job-id"

This command is also very useful if you want to clean up information about failed jobs.

In some cases it can be useful to kill a running job. This can be done with the ngkill command.

  • Extend the sleep command in hellogrid.sh and submit a job.
  • Then while the job is running try to kill it by executing
    ngkill "job-id"

More advanced jobs

In the previous job no files other than the job description were transferred to or from the cluster. In general, jobs will make use of both input and output data. A simple example of this is given in the following job description

& (executable=hellogrid2.sh) 
(inputfiles=("myinputfile.txt" ""))
(outputfiles=("myoutputfile.txt" ""))
(jobname=hellogrid2) 
(stdout=hello.out) 
(stderr=hello.err) 
(gmlog=gridlog) 
(cputime=10) 

In this example the job expects an input file called myinputfile.txt and produces an output file called myoutputfile.txt. For input files, the second parameter is the location of the file as seen from where the job is submitted ("" means that the input file should be found in the current directory).

For output files, the second parameter is the destination of the file, i.e. once the job has finished, the file named by the first parameter is copied to the location given by the second. If the second parameter is "", no transfer is done and the file stays on the cluster until you run ngget.
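The pair structure of inputfiles and outputfiles can be illustrated with a small Python sketch that formats such an attribute from (name, location) pairs. The helper name file_list is hypothetical, and the destination URL in the second call is a made-up placeholder, not a real storage element.

```python
def file_list(attribute, pairs):
    """Format an inputfiles/outputfiles attribute from (name, location) pairs.

    An empty location reproduces the "" convention described above.
    Hypothetical helper for illustration only.
    """
    entries = " ".join('("{}" "{}")'.format(name, location) for name, location in pairs)
    return "({}={})".format(attribute, entries)


# Input file taken from the submission directory, as in the tutorial job.
print(file_list("inputfiles", [("myinputfile.txt", "")]))

# Output file copied to an explicit destination (placeholder URL, an assumption).
print(file_list("outputfiles", [("myoutputfile.txt", "gsiftp://se.example.org/data/myoutputfile.txt")]))
```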

Try to submit this job and see that you get back the output files once you do the ngget command.

3) The NorduGrid monitor

To track the progress of your jobs, and also to monitor the general load level of the various ARC-connected resources, you can use the NorduGrid web monitor:

http://www.nordugrid.org/monitor

Open a web browser and point it to the above URL. You will see a long list of ARC sites and an overview of their various resources.

Try the following:

  • Click on a site name to get more detailed information about the site. E.g. to see what software is installed there, you can browse the 'Runtime environments' box close to the bottom.
  • Find a site with a number of running jobs, shown as a green bar on the main monitor page. Click the bar to get a list of the running jobs. Click a job in this window to get details on that particular job.
  • Click the "running man" icon close to the top right corner of the main page. From the list of users, find your own name (or the name associated with your certificate if you are using a test proxy). Click it to get a list of your own jobs that are presently in the system.

4) The ganga job submission toolkit

This part of the tutorial will introduce you to ganga, a tool for defining and submitting grid jobs. If you don't know what ganga is, see the talk 'Introduction to ganga' at NorduGrid 2010.

Installing ganga

The first step is to install ganga. This should take less than a minute, depending on the network connection.

If you have a linux system that is compatible with either Red Hat Enterprise Linux 4 or 5 (RHEL4/5), or equivalently Scientific Linux 4 or 5 (SLC4/5), then we recommend that you install ganga on your local machine. (Other linux flavours are not supported at the moment, because ganga comes with a number of external dependencies that are only available for these systems.)

If not, ssh into a compatible system, e.g. lxplus at CERN if you have an account there.

Then, follow these steps:

  1. Make a directory for ganga, e.g. ~/ganga/ and cd into it.
  2. Get the ganga install script, like this:
    wget http://cern.ch/ganga/download/ganga-install
  3. Make the script executable: chmod u+x ganga-install
  4. Figure out your system version string from the following table:
     System         String
     RHEL4 32 bit   slc4_ia32_gcc34
     RHEL4 64 bit   slc4_amd64_gcc34
     RHEL5 32 bit   i686-slc5-gcc43-opt
     RHEL5 64 bit   x86_64-slc5-gcc43-opt
  5. Run the following command:
     ./ganga-install --extern=GangaAtlas,GangaNG --platf=VERSIONSTRING --prefix=${PWD} 5.5.4
    where VERSIONSTRING is the string you found in step 4.

This should install ganga on your system.
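The table lookup and the install command from the steps above can be combined in a few lines of Python, which may help avoid typos in the platform string. This is only a sketch: PLATFORM_STRINGS and install_command are hypothetical names, and the function merely builds the command line as text without running it.

```python
# Platform strings copied from the table in this tutorial.
PLATFORM_STRINGS = {
    ("RHEL4", 32): "slc4_ia32_gcc34",
    ("RHEL4", 64): "slc4_amd64_gcc34",
    ("RHEL5", 32): "i686-slc5-gcc43-opt",
    ("RHEL5", 64): "x86_64-slc5-gcc43-opt",
}


def install_command(release, bits, version="5.5.4"):
    """Return the ganga-install command line for a given system (not executed)."""
    platform_string = PLATFORM_STRINGS[(release, bits)]
    return (
        "./ganga-install --extern=GangaAtlas,GangaNG "
        "--platf={} --prefix=${{PWD}} {}".format(platform_string, version)
    )


print(install_command("RHEL5", 64))
```

An unsupported combination raises a KeyError, which mirrors the remark that other linux flavours are not supported.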

Things to note:

  1. 5.5.4 is the latest version as of NorduGrid 2010. Check the ganga homepage to see the latest version at any given time.
  2. The install option
    --extern=GangaAtlas,GangaNG
    ensures you get all the external software you need to control ARC jobs via ganga. One of these is the ARC middleware itself, so if you use ganga you don't need to install this.

Finally, the install script ends by telling you the path to the ganga executable. Make a note of this - in the following we will call it /path/to/ganga. Feel free to add an alias, like this:

alias ganga="/path/to/ganga" (for bash)

or

alias ganga /path/to/ganga (for csh)

Setting ganga up for submitting ARC jobs

Before running Ganga properly, we will create a configuration script that you can use to alter the way Ganga behaves:

ganga -g

This creates a file '.gangarc'. The options given here will override anything else (except the command line).

To activate GangaNG and ARC support, you need to make one change to this file:

Close to the top of the file there's a line like this:

#RUNTIME_PATH =

Replace this line with

RUNTIME_PATH =GangaAtlas:GangaNG

Your ganga is now set up for ARC usage.
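If you prefer to script this change instead of using a text editor, the replacement can be sketched in Python as below. The function name enable_runtime_path is hypothetical, and the sample text is a made-up excerpt standing in for a real .gangarc, so treat this as an illustration of the edit rather than a tested tool.

```python
def enable_runtime_path(config_text, value="GangaAtlas:GangaNG"):
    """Return config_text with the commented-out RUNTIME_PATH line activated."""
    updated = []
    for line in config_text.splitlines():
        # Replace the commented-out default with the active setting.
        if line.strip().startswith("#RUNTIME_PATH"):
            line = "RUNTIME_PATH = " + value
        updated.append(line)
    return "\n".join(updated) + "\n"


# Made-up excerpt standing in for the top of a real .gangarc file.
sample = "[Configuration]\n#RUNTIME_PATH =\n"
print(enable_runtime_path(sample), end="")
```

To apply it to your real file you would read the file's text, pass it through the function and write the result back, ideally after making a backup copy.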

In addition, to safeguard against some potential grid certificate problems, please do the following before starting ganga (in the terminal where you plan to start it):

cd /where/you/installed/ganga/external/nordugrid-arc-standalone/0.6.5/*/
source setup.sh

First steps with ganga

NOTE: This section is only designed to give a brief overview of basic Ganga functionality. For more information, see the Ganga website!

Starting Ganga

We now assume you have a working ganga available - start it up:

/path/to/ganga

(or if you've put it in your path, just ganga)

This should present you with something similar to:


*** Welcome to Ganga ***
Version: Ganga-5-5-4
Documentation and support: http://cern.ch/ganga
Type help() or help('index') for online help.

This is free software (GPL), and you are welcome to redistribute it
under certain conditions; type license() for details.


ATLAS Distributed Analysis Support is provided by the "Distributed Analysis Help" HyperNews forum. You can find the forum at
    https://hypernews.cern.ch/HyperNews/Atlas/get/distAnalysisHelp.html
or you can send an email to hn-atlas-dist-analysis-help@cern.ch

GangaAtlas                         : INFO     Found 0 tasks
Ganga.GPIDev.Lib.JobRegistry       : INFO     Found 0 jobs in "jobs", completed in 0 seconds
Ganga.GPIDev.Lib.JobRegistry       : INFO     Found 0 jobs in "templates", completed in 0 seconds

********************************************************************
New in 5.2.0: Change the configuration order w.r.t. Athena.prepare()
              New Panda backend schema - not backwards compatible
For details see the release notes or the wiki tutorials
********************************************************************

In [1]:

You can quit Ganga at any point using Ctrl-D.

Getting Help

Ganga is based completely on Python and so the usual Python commands can be entered at the IPython prompt. For the specific Ganga related parts, however, there is an online help system that can be accessed using:

In [1]: help() 
************************************

*** Welcome to Ganga ***
Version: Ganga-5-3-5
Documentation and support: http://cern.ch/ganga
Type help() or help('index') for online help.

This is free software (GPL), and you are welcome to redistribute it
under certain conditions; type license() for details.


This is an interactive help based on standard pydoc help.

Type 'index'  to see GPI help index.
Type 'python' to see standard python help screen.
Type 'interactive' to get online interactive help from an expert.
Type 'quit'   to return to Ganga.
************************************

help>

Type 'index' at the prompt to see the Class list available. Then type the name of the particular object you're interested in to see the associated help. You can use 'q' to quit the entry you're currently viewing (though there is currently a bug that displays help on a 'NoneType' object!). You can also do this directly from the IPython prompt using:

In [1]: help(Job)

You might find it useful at this point to have a look in the help system about the following classes that we will be using:

Job
Athena
AthenaMC
ATLASLocalDataset
ATLASOutputDataset
AthenaJobSplitter
DQ2Dataset
DQ2OutputDataset
DQ2JobSplitter
LCG
Panda
NG

Your First Job

We will start with a very basic Hello World job that will run on the machine you are currently logged in on. This will hopefully start getting you used to the way Ganga works. Create a basic job object with default options and view it:

In [1]: j = Job()
In [2]: j

This should give an output similar to:

Out[2]: Job (
 status = 'new' ,
 name = '' ,
 inputdir = '/home/slater/gangadir/workspace/mws/LocalAMGA/0/input/' ,
 outputdir = '/home/slater/gangadir/workspace/mws/LocalAMGA/0/output/' ,
 outputsandbox = [] ,
 id = 0 ,
 info = JobInfo (
    submit_counter = 0
    ) ,
 inputdata = None ,
 merger = None ,
 inputsandbox = [] ,
 application = Executable (
    exe = 'echo' ,
    env = {} ,
    args = ['Hello World']
    ) ,
 outputdata = None ,
 splitter = None ,
 subjobs = 'Job slice:  jobs(0).subjobs (0 jobs)
' ,
 backend = Local (
    actualCE = '' ,
    workdir = '' ,
    nice = 0 ,
    id = -1 ,
    exitcode = None
    )
 )

Note that when you just type the job variable ('j'), IPython prints the information regarding it. For the job object, this is a summary of the object that Ganga uses to manage your job. It includes the following parts:

  • application - The type of application to run
  • backend - Where to run
  • inputsandbox/outputsandbox - The files required for input and output that will be sent with the job
  • inputdata/outputdata - The required dataset files to be accessed by the job
  • splitter - How to split the job up into several subjobs
  • merger - How to merge the completed subjobs

For this job, we will be using a basic 'Executable' application ('echo') with the arguments 'Hello World'. There is no input or output data, so these are not set. We'll now submit the job:

In [3]: j.submit()

If all is well, the job will be submitted and you can then check its progress using the following:

In [4]: jobs

This will show a summary of all the jobs currently running. Your basic Hello World job will go through the following stages: 'submitted', 'running', 'completing' and 'completed'. When your job has reached the completed state, the standard output and error output are transferred to the output directory of the job (as listed in the job object). There are several ways to check this output. First, we will use the 'peek' function of the job object:

In [5]: j.peek()

This function can also be used to look at specific files:

In [6]: j.peek("stdout")

The shell command 'less' is used by default. To use other programs, you can specify them as a second argument:

In [7]: j.peek("stdout", "emacs")

You can also use the exclamation mark (!) to directly access shell commands and the dollar sign to use 'python' variables. The above commands could also be carried out using:

In [8]: !ls $j.outputdir
In [9]: !emacs $j.outputdir/stdout

Using either of these two methods, view the stdout file. With any luck, you will see the Hello World message. Congratulations, you've run your first job!

Creating Submission Scripts

Clearly, it would be very tedious if you had to keep typing out the same text to submit a job and so there is scripting available within Ganga. To test this, let's try three types of Hello World job in one go. Create a file called 'first_job.py' and copy the following into it (Mini-test: can you see what's going on in each case?):

j = Job()
j.submit()

j = Job()
j.application.args=['Hello Another World', '42', 'My aunt is a Neptunian giant hamster']
j.submit()

j = Job()
j.application.exe='python'
j.application.args=['-c','print "Hello pythons"']
j.submit()

Then, from within Ganga, you can use the 'execfile' command to execute the script:

In [1]: execfile('first_job.py')

You can also run Ganga in batch mode by doing the following:

ganga first_job.py

At this point, just to show the persistency of your jobs, quit and restart Ganga. Your jobs will be preserved just as you left them!

More Advanced Job Manipulation

To finish off, we will cover some useful features of managing jobs. This is a fairly brief overview and a more complete list can be found at: http://ganga.web.cern.ch/ganga/user/html/GangaIntroduction/

Copying Jobs

You can copy a job regardless of its status using the following:

j = Job()
j2 = j.copy()

The copy can be submitted regardless of the original job's status. Consequently, you can do the following:

j = jobs(3).copy()
j.submit()

Job Status

Jobs can be killed and then resubmitted using the following:

j.kill()
j.resubmit()

The status of a job can be forced (e.g. if you think it has hung and you want to set it to failed) using the following:

j.force_status('failed')

Removing Jobs

To clean up your job repository, you can remove jobs using the 'remove' method:

j.remove()

Configuration Options

You can supply different configuration options for Ganga at startup through the .gangarc file. If you wish to change things on the fly however, you can use (there are examples in the next section):

config[section][parameter] = value

To show the current settings, just enter the config section on its own, as you did with jobs:

config[section]

Interactively, the 'config' object behaves as a class so you can do the above using:

config.section.parameter = value

This also allows you to tab-complete your commands.

A simple ARC job

You should now have a basic feel for what ganga can do. The next step is to send a job to the grid. This requires only one further line of code in your job definition - try the following:

j = Job()
j.backend=NG()
j.submit()

You will notice that job submission now takes a bit longer. This is because ganga now uses the ARC middleware to submit the job to the set of grid resources that you are authorized to use. After a few seconds, your job should be accepted by some grid site, and will be queued there for execution. After a while it should finish, and the output will be copied back to the computer where you're running ganga.

NB: Ganga has to be running for this to happen. You can shut it down and it will pick up all finished jobs when you restart it, but if you keep it off for more than 24 hours the grid site will regard the job as orphaned and delete the output.

A slightly more complex ARC job

Finally, let's try a more advanced job that uses a few more features of ganga. For this you need something for your job to do. Start by downloading these two files: bgwrapper.sh and BattleGrid.tar.gz.

This is the source code for a simplistic gladiator fight simulator on a square arena. To see what it does, try the following command:

./bgwrapper.sh -B -x 25 -y 25 -f 100 --printfield -w 0.2

The goal now is to run a number of such simulations on grid sites. Create a file with the following ganga job description:

# Make a job
j = Job()

# Give the job a reasonable name
j.name = 'BattleGrid'

# Set the executable application...
j.application=Executable()
# ...and in this case specify it to be 'source'
j.application.exe='source'

# Add some input files to the job
j.inputsandbox=['./bgwrapper.sh','./BattleGrid.tar.gz']

# Set up a job splitter, which will create a number of subjobs
j.splitter=ArgSplitter()
j.splitter.args.append('bgwrapper.sh -B -x 100 -y 100 -f 100 -N 10')
j.splitter.args.append('bgwrapper.sh -B -x 200 -y 200 -f 200 -N 10')
j.splitter.args.append('bgwrapper.sh -B -x 100 -y 100 -f 100 -N 50')
j.splitter.args.append('bgwrapper.sh -B -x 1000 -y 1000 -f 1000 -N 1')

# Specify the ARC/NorduGrid backend
j.backend=NG()

# Submit the job
j.submit()

Now execute this file (execfile('jobfile.py')) in ganga, and you should get four simulations running with various parameters.
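The four argument strings in the job description differ only in their numeric options, so they can be generated rather than typed. The sketch below is plain Python that runs outside ganga; bgwrapper_args and PARAMETER_SETS are hypothetical names, and the helper makes no assumption about what the -f and -N options mean beyond filling in their values.

```python
def bgwrapper_args(size, f_value, n_value):
    """Build one argument string in the pattern used in the job description.

    Hypothetical helper: 'size' fills both -x and -y (square arena),
    'f_value' fills -f and 'n_value' fills -N.
    """
    return "bgwrapper.sh -B -x {0} -y {0} -f {1} -N {2}".format(size, f_value, n_value)


# (size, -f value, -N value) for the four subjobs of the tutorial job.
PARAMETER_SETS = [(100, 100, 10), (200, 200, 10), (100, 100, 50), (1000, 1000, 1)]

subjob_args = [bgwrapper_args(*params) for params in PARAMETER_SETS]
for args in subjob_args:
    print(args)
```

Inside a ganga job file, each generated string could be appended to j.splitter.args in the same way as the hand-written strings.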

This concludes the tutorial - feel free to contact the responsible people if you have further questions.

Links and references

Contact info

Name                   Email                               Main area
Martin Skou Andersen   skou@nbiNOSPAMPLEASE.dk             ARC
Bjorn H. Samset        b.h.samset@fysNOSPAMPLEASE.uio.no   Ganga

-- BjornS - 29-Dec-2009

-- BjornS - 03-May-2010

Topic revision: r8 - 2010-05-07 - MartinSkouAndersen
 