Ganga/Dirac hands on session

Playing with Python/Ganga

First of all you need to gain some experience with the Ganga CLIP (Command Line Interface In Python) and with Python itself. From a fresh login on your favourite Grid UI, start a Ganga session:

  • GangaEnv (pick latest version)
  • Type "ganga" at prompt.

Playing with CLIP

Try a few Python exercises at the prompt to gain some minimal experience, for example:

  • Create a variable
  • Print its value
  • Write an if statement
  • Test tab expansion and the history feature
  • Try the help system, e.g. "help(Gauss)" or "dir(jobtree)"
  • Execute a shell command from within Ganga
  • Write a loop over items in a list/array
  • Write a script and execute it with execfile('scriptname')
  • Import the module needed to access environment variables
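As a warm-up, the plain-Python parts of these exercises look like the following (ordinary Python, nothing Ganga-specific; `os` is the module needed for environment variables):

```python
import os

# Create a variable and print its value
nevents = 10
print(nevents)

# A simple if statement
if nevents > 5:
    size = "large"
else:
    size = "small"

# A loop over items in a list
squares = []
for n in [1, 2, 3]:
    squares.append(n * n)

# Access environment variables through the os module
home = os.environ.get("HOME", "(not set)")
print(home)
```

The same statements work at the CLIP prompt, and a script containing them can be run from within Ganga with execfile('scriptname').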

Ganga Jobs

Now you're ready to start playing with jobs in Ganga.

Hello World job

Create a trivial Hello World job in Ganga and submit it to the local machine.

  • Create a default job 'j = Job()'
  • Investigate content 'print j'
  • Change executable to something like /bin/hostname or any other exec that you like
  • Submit the job: 'j.submit()'
  • Investigate the stdout in the j.outputdir directory
  • Copy the job, change the backend to LSF and submit again.
  • Query the job status using the jobs command or the ganga query script (from outside ganga)
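Put together, the steps above look roughly like this inside a Ganga session (a sketch only: attribute names such as `exe` can differ between Ganga versions, so check with help(Job) if something does not match):

```python
j = Job()                            # create a default job
print j                              # investigate its content
j.application.exe = '/bin/hostname'  # change the executable
j.submit()                           # run on the default (local) backend
# the stdout file ends up in j.outputdir

lsfjob = j.copy()                    # copy the job ...
lsfjob.backend = LSF()               # ... and switch the backend to LSF
lsfjob.submit()
jobs                                 # query the status of your jobs
```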

Gauss Jobs

Try to replicate what you did for exercise 2 of yesterday's Simulation exercise (generator-level events).

  • Create an application object inside Ganga of type Gauss.
    • g = Gauss()
  • Change the parameters to match your Gauss job
    • options file location (run over a small number of events only)
    • version (v30r4)
  • Create a job and submit it to the Local backend.
    • j = Job(application = g)
    • j.submit()
  • Look at the outputdir directory for the files placed there.
  • Create a copy of the job, add an "extraopts" line to the application part and submit the job.
    • j.application.extraopts = 'ApplicationMgr.EvtMax = 20;' to change the number of simulated events before submitting the job.
  • Create another copy, change the backend to LSF and submit a new job.
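As a session sketch, the Gauss exercise above amounts to something like the following (the options-file attribute name and path are assumptions here; adapt them to your setup):

```python
g = Gauss()
g.version = 'v30r4'
g.optsfile = 'myGauss.opts'   # placeholder: your options file, few events
j = Job(application = g)
j.submit()                    # Local backend

j2 = j.copy()
j2.application.extraopts = 'ApplicationMgr.EvtMax = 20;'
j2.submit()

j3 = j.copy()
j3.backend = LSF()
j3.submit()
```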

To exercise the splitters, try the GaussSplitter:

  • Export a job, modify it and submit it
  • Try the splitter on a backend other than Dirac (suggestion: use a small number of events on the Local backend so you can see the result immediately)

Boole Jobs

You can try to process some newly generated .sim files with Boole.

  • First of all run a Gauss job that generates 10 events (full simulation)
  • Choose the Boole version you want to use for the processing and set up your UI (setenvBoole vxry)
  • Load the ganga session and do everything from ganga:
    • getpack the proper Boole code
    • make it
    • edit the options files (or append the correct extraopts at the end)
    • submit the job to LSF or Dirac

DaVinci Jobs

In this exercise you will get ready for the Analysis part and at the same time learn about templates in Ganga. You will create a job, save it as a template, modify the template to your own needs, and then create and run a DaVinci job from it.

  • Define the DaVinci job
  • Create a template: save the file DaVinci.txt somewhere in your file system
  • Load a template for an empty DaVinci job.
    • t = load('DaVinci.txt')
  • Look at available templates
    • templates
  • Change template to fit your own directories.
    • Select also the backend and any other option you'd like to change
  • Create job from the template
    • j = Job(t)
  • Download any other package you'd like to use in your DaVinci job
  • [if needed] Compile code
    • 'j.application.make()'
  • Submit a job to the Grid
    • Submit the job. Notice how the monitoring will inform you about its progress.
  • Monitor the job status using the web interface
  • You have now completed your first grid analysis job!
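The template workflow above, condensed into a session sketch (treat it as an outline; the template is the DaVinci.txt file you saved earlier):

```python
t = load('DaVinci.txt')   # load the saved template
templates                 # list the available templates
# ... edit t: your directories, the backend, any other option ...
j = Job(t)                # create a job from the template
j.application.make()      # compile your code, if needed
j.submit()                # submit to the Grid; the monitoring reports progress
```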

Practising datasets

Ganga is particularly useful here because you can use the LHCbDataset and LHCbDataFile classes to handle datasets. To practise with these tools you should:

  • Create a Python script that takes as input a list of LFNs and builds an LHCbDataset from them, checking whether or not each file has a replica.
  • Submit the job for the files defined in the LHCbDataset

Try this script with the file data.opts, which you can download from the attachment at the bottom of this page.

Example

def datasetFromCard(filename):
    # Collect the LFNs from the DATAFILE lines of a Gaudi options file
    files = []
    f = open(filename)
    for line in f:
        l = line.lstrip()
        if l[0:2] == "//": continue            # skip comment lines
        if l.rfind("DATAFILE") == -1: continue # keep only DATAFILE lines
        wds = l.split("'")
        files.append(wds[1])
    f.close()

    # Build the dataset once, after all lines have been read
    if len(files) > 0:
        ds = LHCbDataset(files)
    else:
        ds = LHCbDataset()

    return ds

###### Get the data
print "Get LHCb data set"
ds = datasetFromCard("data.opts")
print "Update replica cache"
ds.updateReplicaCache()

###### Call the previous function when defining a dataset
print "Create good and bad list"
goodlist=[]
badlist=[]
for df in ds.files:
  if len(df.replicas)==0:
    badlist.append(df)
  else:
    goodlist.append(df)

dsbad=LHCbDataset(files=badlist)
dsgood=LHCbDataset(files=goodlist)

# Save the good dataset for later reuse
f = open('goodlist.py', 'w')
f.write(str(dsgood))
f.close()

job = Job(application=DaVinci(version='v19r7'), backend=Dirac(CPUTime=160000))
# Use the files that have replicas as input
# (the saved list can be reloaded later with eval(open('goodlist.py').read()))
job.inputdata = dsgood

## here you should specify / change the code you want to run

## print "Calls Dirac splitter"
job.splitter = DiracSplitter(filesPerJob = 10, maxFiles = -1)
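The replica check at the heart of the script is just a partition of the dataset files by whether they have at least one replica. Stripped of the Ganga classes, it can be tested in isolation (the dictionaries standing in for LHCbDataFile objects are purely illustrative):

```python
def partition_by_replicas(files):
    """Split files into (good, bad): good files have at least one replica."""
    good, bad = [], []
    for f in files:
        if f.get("replicas"):
            good.append(f)
        else:
            bad.append(f)
    return good, bad

# Illustrative stand-ins for LHCbDataFile objects
sample = [
    {"lfn": "LFN:/lhcb/a.dst", "replicas": ["CERN-disk"]},
    {"lfn": "LFN:/lhcb/b.dst", "replicas": []},
]
good, bad = partition_by_replicas(sample)
print(len(good), len(bad))
```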

Ganga scripts

To exercise the ganga scripts you need to:

  • Create a script for a DaVinci application on Dirac backend
  • Edit the script in order to have it doing what you want to
  • Submit the script and monitor the job status without logging into ganga (using just scripts)

   
# Create a job for submitting DaVinci to DIRAC
ganga make_job Davinci DIRAC test.py
# Edit test.py to set the DaVinci properties
# Submit the job
ganga submit test.py
# Query status, triggering output retrieval if the job is completed
ganga query

Dirac MC production

To exercise the Dirac submission of jobs divided into steps you need to:

  • To setup your ui using the DiracEnv command
  • Write a Dirac job that uses Gauss, Boole and Brunel in 3 different steps
  • Submit the job using Python
  • Retrieve the job output using a Dirac script

Example of Dirac submission

from DIRAC.Client.Dirac import *
dirac = Dirac()
job = Job()
step1 = Step()
step1.setApplication('Gauss', 'v30r4')
step1.setInputSandbox(['Sim/Gauss/v30r4/options/v200601.opts'])
step1.setOutputSandbox(['Gauss.hbook', 'Gauss_v30r4.log'])
step1.setOption('ApplicationMgr.OutStream += {"GaussTape"}; GaussTape.Output = "DATAFILE=\'PFN:Gauss.sim\' TYP=\'POOL_ROOTTREE\' OPT=\'RECREATE\'";')
job.addStep(step1)
step2 = Step()
step2.setApplication('Boole', 'v13r3')
step2.setOption('EventSelector.Input = {"DATAFILE=\'PFN:Gauss.sim\' TYP=\'POOL_ROOTTREE\' OPT=\'READ\'"};')
job.addStep(step2)
jobid = dirac.submit(job,verbose='1')

Example of output retrieval

from DIRAC.Client.Dirac import *

import sys
dirac = Dirac()
jobid = sys.argv[1]
dirac.getOutput(jobid)

-- AlessioSarti - 15 Nov 2007

Topic attachments

  • data.opts (35.0 K, 2007-11-26, AlessioSarti): Option file containing data
Topic revision: r8 - 2007-11-27 - AlessioSarti
 