Preliminary steps

To use the grid resources you need a certificate and you must join a virtual organisation (VO). The certificate guarantees your identity; membership of a VO gives you access to the corresponding resources (for example, the computer centres that accept jobs from that VO).

Each future user has to apply for a certificate. It is a 5-minute procedure and the instructions are at: http://lcg.web.cern.ch/LCG/digital.htm . Please take the time to read the page so that you apply to the correct certification authority (it depends on your affiliation).

The second step is joining the VO. If you have been told to join the GEAR VO (often used for initial training and application porting), do the following:

  • Upload the certificate into your browser (part of the previous step)
  • Visit https://lcg-voms.cern.ch:8443/vo/vo.gear.cern.ch/vomrs and follow registration Phases I and II. It is very important to complete Phase II in full: after Phase I you will receive a mail leading to Phase II.

Import the certificate on lxplus (here we assume it is your (initial) submission host). Instructions (linked from the previous pages) are at https://ca.cern.ch/ca/Help/?kbid=024010 .

Try out your certificate

Normally one would do:

bash
. /afs/cern.ch/project/gd/LCG-share/sl5/etc/profile.d/grid-env.sh
voms-proxy-init --voms vo.gear.cern.ch
voms-proxy-info

but you can let Ganga do all the work for you.
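
If a script needs to know whether a proxy is still valid, a minimal sketch could wrap voms-proxy-info. It assumes that the --timeleft option prints the remaining lifetime in seconds as a single integer; the helper name proxy_seconds_left and its command parameter are illustrative and not part of any grid tool.

```python
import subprocess


def proxy_seconds_left(command=("voms-proxy-info", "--timeleft")):
    """Return the remaining proxy lifetime in seconds, or None if unknown.

    Assumption: the command prints a single integer (seconds) on stdout.
    """
    try:
        output = subprocess.check_output(list(command))
    except (OSError, subprocess.CalledProcessError):
        # Command not installed (not a grid UI) or it reported an error.
        return None
    try:
        return int(output.decode().strip())
    except ValueError:
        # The output was not a plain number.
        return None


if __name__ == "__main__":
    left = proxy_seconds_left()
    if left is None or left <= 0:
        print("No valid proxy: run voms-proxy-init --voms vo.gear.cern.ch")
    else:
        print("Proxy valid for %d more seconds" % left)
```

The function returns None rather than raising, so the same script can run on a machine without the grid environment.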

Fire up Ganga

The main tool for porting and supporting applications on the Grid is Ganga (http://cern.ch/ganga). A very complete user documentation page is available at http://ganga.web.cern.ch/ganga/user/index.php (also reachable from the Ganga home page). We suggest starting from it, because Ganga is a natural bridge from your familiar batch system to Grid infrastructures.

To know more about the internals of the Grid middleware used in LCG, a very good document is available (maintained by the LCG project). This can be found under: https://edms.cern.ch/file/722398/1.2/gLite-3-UserGuide.pdf

This section is taken from the Ganga pages: http://ganga.web.cern.ch/ganga/user/installation/other.php . Essentially you only have to run Ganga (there is no need to install it on lxplus).

First of all, log in to lxplus.

Get the example .gangarc file (attached to this page as gangarc) and save it as ~/.gangarc. Then start Ganga:

% bash
% /afs/cern.ch/sw/ganga/install/5.5.3/bin/ganga

First examples

First helloWorld example (see the Ganga user guide):

j0 = Job()
j0.application = Executable(exe=File('/bin/echo'), args=['HelloWorld'])
j0.submit()

This will execute on your local machine.
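
A script that has to wait for such a job can poll its status attribute. The sketch below is plain Python and works on any object with a status attribute. The helper name wait_for and the set of final states are assumptions: 'completed' and 'failed' appear on this page, 'killed' is assumed.

```python
import time

# Assumed final status strings ('killed' does not appear on this page).
FINAL_STATES = ("completed", "failed", "killed")


def wait_for(job, poll_seconds=5.0, timeout=600.0):
    """Poll job.status until it is final or the timeout expires.

    Returns the last status seen.
    """
    deadline = time.time() + timeout
    status = job.status
    while status not in FINAL_STATES and time.time() < deadline:
        time.sleep(poll_seconds)
        status = job.status
    return status
```

From the Ganga prompt this could be called as wait_for(j0) after j0.submit().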

If you want to run it on LSF, please do:

j1 = j0.copy()  #"Same" job but can be executed (you cannot re-execute j0 since it ran already)
j1.backend='LSF'
j1.submit()

Then on the Grid (LCG)...

j2 = j0.copy()  #"Same" job but can be executed (you cannot re-execute j0 since it ran already)
j2.backend='LCG'
j2.submit()
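
The copy, set backend, submit pattern used above can be wrapped in a small helper. This is only a sketch: submit_on_backends is a hypothetical name, and it relies solely on the copy(), backend and submit() members shown in the examples on this page.

```python
def submit_on_backends(template_job, backend_names):
    """Copy template_job once per backend name, submit each copy, return the copies."""
    copies = []
    for name in backend_names:
        job = template_job.copy()  # fresh copy: a job that already ran cannot be re-executed
        job.backend = name         # e.g. 'LSF' or 'LCG', as in the examples above
        job.submit()
        copies.append(job)
    return copies
```

From the Ganga prompt this could be called as submit_on_backends(j0, ['LSF', 'LCG']).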

More on Ganga

More on files (executable, input, output)

A slightly more complex example. To run it, you need the file gangaHello.py (attached to this page) in your home directory. The job executes this script and copies files into the job's working directory. The files declared in the "outputsandbox" are returned to the user if the execution is successful.

import os
os.system('cp /etc/fstab ~/input1') # prepare a file
jf=j0.copy()
jf.application = Executable(exe=File('~/gangaHello.py'),args=['World!'])
jf.inputsandbox = [File('~/input1')]
jf.outputsandbox=['output1','output2','output3']

jf.submit()

!ls -l $jf.outputdir
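
The attached gangaHello.py is not reproduced on this page. As a purely hypothetical stand-in (the real script may behave differently), a script compatible with the job definition above could greet its argument and copy input1 to the three files named in the output sandbox. The function name run_hello and the workdir parameter are illustrative.

```python
#!/usr/bin/env python
# Hypothetical stand-in for gangaHello.py; the attached script may differ.
import os
import shutil
import sys


def run_hello(args, workdir="."):
    """Print a greeting and copy input1 to output1..output3 inside workdir.

    Returns the list of output files written (empty if input1 is missing).
    """
    print("Hello " + " ".join(args))
    source = os.path.join(workdir, "input1")  # shipped via the input sandbox
    written = []
    if os.path.isfile(source):
        for name in ("output1", "output2", "output3"):
            target = os.path.join(workdir, name)
            shutil.copyfile(source, target)
            written.append(target)
    return written


if __name__ == "__main__":
    run_hello(sys.argv[1:])
```

Since the job runs the file directly as its executable, the script would need the interpreter line at the top and execute permission.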

Example with splitter (same exe multiple inputs)


mySplitter = ArgSplitter(args=[[str(x)] for x in range(10)])
full_print(mySplitter)

### Typical output:
### ArgSplitter (
### args = [['0'], ['1'], ['2'], ['3'], ['4'], ['5'], ['6'], ['7'], ['8'], ['9']] 
### ) 

j3=j2.copy()

# j3 will be one job with 10 subjobs. A single submission generates 10 (sub)jobs each one with input parameter 0,1,2,...9
j3.splitter=mySplitter

j3.submit()
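
The argument list does not have to be a simple range. The following plain-Python sketch builds one argument list per subjob for a hypothetical two-parameter scan; the names energies and seeds and their values are made up for illustration. Arguments are converted to strings, as in the example above.

```python
import itertools

# Hypothetical parameter scan: every combination of an energy and a seed.
energies = [10, 20]
seeds = [1, 2, 3]

# One inner list per subjob, each holding that subjob's arguments as strings.
args = [[str(e), str(s)] for e, s in itertools.product(energies, seeds)]

for index, subjob_args in enumerate(args):
    print("subjob %d gets arguments: %s" % (index, " ".join(subjob_args)))
```

In Ganga the resulting list would be passed on as ArgSplitter(args=args), giving 6 subjobs in this case.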

After a little while you should see something like this (the "mother" job is called 16 and its subjobs are 16.0, 16.1, ..., 16.9):

    fqid |    status |      name | subjobs |    application |        backend |                             backend.actualCE 
-----------------------------------------------------------------------------------------------------------------------------
    16.0 |   running |           |         |     Executable |            LCG |    ce.cyf-kr.edu.pl:2119/jobmanager-pbs-gear 
    16.1 |   running |           |         |     Executable |            LCG |    ce.cyf-kr.edu.pl:2119/jobmanager-pbs-gear 
    16.2 | submitted |           |         |     Executable |            LCG |   gazon.nikhef.nl:2119/jobmanager-pbs-ekster 
    16.3 | submitted |           |         |     Executable |            LCG | trekker.nikhef.nl:2119/jobmanager-pbs-ekster 
    16.4 |   running |           |         |     Executable |            LCG |    ce.cyf-kr.edu.pl:2119/jobmanager-pbs-gear 
    16.5 | submitted |           |         |     Executable |            LCG |   gazon.nikhef.nl:2119/jobmanager-pbs-ekster 
    16.6 |   running |           |         |     Executable |            LCG |ce01.lcg.cscs.ch:2119/jobmanager-lcgpbs-other 
    16.7 | submitted |           |         |     Executable |            LCG |   gazon.nikhef.nl:2119/jobmanager-pbs-medium 
    16.8 | submitted |           |         |     Executable |            LCG |   gazon.nikhef.nl:2119/jobmanager-pbs-medium 
    16.9 | submitted |           |         |     Executable |            LCG |ce11.lcg.cscs.ch:2119/jobmanager-lcgpbs-other 

To find out (after job completion) what happened to subjob 16.0, just do the following from the Ganga prompt (assuming job 16 is the last one in jobs):

outdir = jobs[-1].subjobs[0].outputdir
!ls -l $outdir

The next block of code automatically merges the stdout files of all subjobs of the last job. The merge happens only if every subjob completed with exit code 0. Note that from the Ganga prompt you can execute a file by entering execfile('myfile.py').

  
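import os

lastJob = jobs[-1]

nj = len(lastJob.subjobs)   # total number of subjobs
cj = nj                     # subjobs not yet counted as successful

cmd = 'cat '

for sj in lastJob.subjobs:
    # count a subjob only if it completed with exit code 0
    if sj.status == 'completed' and sj.backend.exitcode == 0:
        cj -= 1
        cmd += os.path.join(sj.outputdir, 'stdout') + ' '

if cj == 0:
    print('Job completed: merge possible')
    print(cmd)
    os.system(cmd)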
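
The same merge can be done without calling cat through the shell. The sketch below is plain Python; merge_stdout is a hypothetical helper that takes a list of output directories, so it does not depend on Ganga itself.

```python
import os
import shutil


def merge_stdout(output_dirs, merged_path, filename="stdout"):
    """Concatenate <dir>/<filename> for each directory, in order, into merged_path.

    Returns the number of files merged; directories without the file are skipped.
    """
    merged = 0
    with open(merged_path, "wb") as out:
        for directory in output_dirs:
            path = os.path.join(directory, filename)
            if os.path.isfile(path):
                with open(path, "rb") as part:
                    shutil.copyfileobj(part, out)
                merged += 1
    return merged
```

From the Ganga prompt it could be fed with [sj.outputdir for sj in jobs[-1].subjobs if sj.status == 'completed'].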

One more reason to use subjobs...

Realistic tasks consist of several jobs. In most cases all jobs (or at least a given percentage) must be executed successfully. Some jobs may fail, and the bookkeeping of resubmitting them (and keeping track of which job "replaces" its failed counterpart) is a nightmare. In the following example (for illustration: obtained by artificially allowing sites with an old version of the Python interpreter to be selected as well) some subjobs failed:

  100.91 |    failed |           |         |     Executable |            LCG |dc2-grid-65.brunel.ac.uk:2119/jobmanager-lcgp 
  100.93 |    failed |           |         |     Executable |            LCG |dgc-grid-44.brunel.ac.uk:2119/jobmanager-lcgp 
  100.95 |    failed |           |         |     Executable |            LCG |    ce.cyf-kr.edu.pl:2119/jobmanager-pbs-gear 
  100.99 |    failed |           |         |     Executable |            LCG |    ce.cyf-kr.edu.pl:2119/jobmanager-pbs-gear 

Ganga allows you to resubmit the failed ones by "recycling" their subjob numbers (assuming this is the last job in jobs[]):

jobs[-1].subjobs.select(status="failed").resubmit()
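
To decide whether "a given percentage" of the subjobs has succeeded, one can count the subjobs per status. The sketch below only reads the status attribute of each subjob. The helper names and the 95% default threshold are illustrative assumptions.

```python
from collections import Counter


def status_summary(subjobs):
    """Count subjobs per status string, e.g. {'completed': 8, 'failed': 2}."""
    return Counter(sj.status for sj in subjobs)


def enough_completed(subjobs, required_fraction=0.95):
    """Return True if at least required_fraction of the subjobs are 'completed'."""
    counts = status_summary(subjobs)
    total = sum(counts.values())
    return total > 0 and counts["completed"] >= required_fraction * total
```

From the Ganga prompt these could be called as status_summary(jobs[-1].subjobs) and enough_completed(jobs[-1].subjobs, 0.9).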

Monitoring my jobs... (extremely preliminary - complete new version)

Resources?

With a valid proxy, one can use lcg-infosites. An example is shown here:

bash-3.2$ lcg-infosites  -v 2 --vo vo.gear.cern.ch ce
RAMMemory    Operating System    System Version              Processor   Subcluster name
-------------------------------------------------------------------------------------------------------------------------
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce202.cern.ch
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce203.cern.ch
   2048             ScientificSL   Beryllium                               Xeon                      ce.cyf-kr.edu.pl
  32768             ScientificSL   Boron                                       Xeon                      ce02.lcg.cscs.ch
  32768             ScientificSL   Boron                                    Opteron                      ce01.lcg.cscs.ch
  32768             ScientificSL   Boron                                       Xeon                      ce11.lcg.cscs.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce106.cern.ch
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce129.cern.ch
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce133.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce125.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce111.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce103.cern.ch
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce128.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce113.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce124.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce112.cern.ch
   3072                   CentOS   Final                                       IA32                       gazon.nikhef.nl
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce107.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce105.cern.ch
   3072                   CentOS   Final                                       IA32                       gazon.nikhef.nl
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce127.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce114.cern.ch
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce130.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce104.cern.ch
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce131.cern.ch
   2000        ScientificCERNSLC   Beryllium                               Xeon                         ce126.cern.ch
   2000        ScientificCERNSLC   Boron                                       Xeon                         ce132.cern.ch
   3072                   CentOS   Final                                       IA32                     trekker.nikhef.nl
   2048             ScientificSL   Boron                                    Opteron                   grid36.lal.in2p3.fr
   2000        ScientificCERNSLC   Boron                                       Xeon                   grid10.lal.in2p3.fr
   4054             ScientificSL   Final    Dual Core AMD Opteron(tm) Processor 265                         ce201.cern.ch
  16384             ScientificSL   Final                                       Xeon              dgc-grid-40.brunel.ac.uk
   4054             ScientificSL   Beryllium    Dual Core AMD Opteron(tm) Processor 265              dc2-grid-65.brunel.ac.uk
      0                                                                                          dgc-grid-44.brunel.ac.uk

Release "Beryllium" corresponds to SLC4, "Boron" to SLC5.
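
The listing can also be post-processed, for example to group the subclusters by operating system release. The sketch below assumes that each data row starts with the RAM size and ends with the subcluster name, and that the release name appears as a separate word in between. The function names are illustrative.

```python
# Release name to OS version, as stated in the text above.
RELEASE_TO_OS = {"Beryllium": "SLC4", "Boron": "SLC5"}


def parse_ce_line(line):
    """Extract (ram, release, host) from one data row of the listing.

    Returns None for the header, the separator and blank lines.
    release is None when neither known release name is present.
    """
    fields = line.split()
    if len(fields) < 2 or not fields[0].isdigit():
        return None
    release = next((f for f in fields[1:-1] if f in RELEASE_TO_OS), None)
    return int(fields[0]), release, fields[-1]


def hosts_by_os(lines):
    """Group subcluster names under 'SLC4', 'SLC5' or 'other'."""
    groups = {}
    for line in lines:
        parsed = parse_ce_line(line)
        if parsed is None:
            continue
        _, release, host = parsed
        groups.setdefault(RELEASE_TO_OS.get(release, "other"), []).append(host)
    return groups
```

Rows with another release (such as "Final") or with no release at all end up under 'other'.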

-- MassimoLamanna - 20-Apr-2010

Topic attachments

  • Screenshot.png (142.9 K, 2010-04-26, MassimoLamanna): Monitoring screen shot
  • gangaHello.py.txt (0.4 K, 2010-05-07, MassimoLamanna)
  • gangarc (36.0 K, 2010-04-26, MassimoLamanna): Decent .gangarc (it should be placed in your home directory)
Topic revision: r13 - 2010-09-10 - MassimoLamanna
 