Distributed Analysis Example from the CERN / GVA Cluster

This twiki gives a simple example, along with some hopefully useful hints, of how to use Ganga for distributed analysis from the CERN / GVA cluster. Please note that complete and up-to-date instructions are given on DistributedAnalysisUsingGanga. In this example we will produce an ntuple from AOD with EventView.

Setting up the environment

First set up your Athena release, as described in WorkBookSetAccount. Say we are using 15.3.1:

source ~/cmthome/setup.sh -tag=15.3.1,32

Then type

source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh

to set up Ganga. This script sets the correct environment and finds the latest stable release of Ganga. Ganga will create a directory 'gangadir' in your AFS home directory. To prevent quota issues it is recommended that you make 'gangadir' a symbolic link to a directory on one of the cluster's disks. Do

mkdir /atlas/users/<username>/gangadir
cd ~
ln -s /atlas/users/<username>/gangadir gangadir
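
The link setup can also be scripted. The following Python sketch is only an illustration: the function name link_gangadir is made up for this example, and the path in the usage comment is the placeholder from the commands above. It creates the target directory if needed, leaves an existing link alone, and refuses to touch a real 'gangadir' directory that is already in place.

```python
import os


def link_gangadir(home_dir, target_dir, name="gangadir"):
    """Create target_dir if needed and make home_dir/name a symlink to it.

    Returns the path the link points to. The function name and signature
    are hypothetical; they are not part of Ganga.
    """
    link_path = os.path.join(home_dir, name)
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    if os.path.islink(link_path):
        # A link already exists; leave it alone and report where it points.
        return os.readlink(link_path)
    if os.path.exists(link_path):
        # A real directory or file is in the way; do not overwrite it.
        raise OSError("%s exists and is not a symbolic link" % link_path)
    os.symlink(target_dir, link_path)
    return target_dir


# Example call (replace <username> with your own account name):
# link_gangadir(os.path.expanduser("~"), "/atlas/users/<username>/gangadir")
```

If Ganga has already created a real 'gangadir' in your home directory, move its contents to the cluster disk by hand before creating the link.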

Finding the dataset location

The idea is to send your job to a grid site where the dataset you are running on is already present, so that you don't have to download it locally before sending the job. It is therefore useful to check in advance at which computing sites, and in which computing clouds, complete replicas of your desired dataset are available. This can be done with

dq2-ls -r <datasetname>

assuming you have first initialized the dq2 commands from the CERN AFS repository by doing

source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
voms-proxy-init -voms atlas

In our example we want to run on the dataset mc08.108364.AlpgenQcdJ4Np4_pt20.merge.AOD.e389_s462_r635_t53, so we get the output:

Container name: mc08.108364.AlpgenQcdJ4Np4_pt20.merge.AOD.e389_s462_r635_t53/
Total  datasets: 1
Summary:
        SITE                      / # COMPLETE / # INCOMPLETE /  TOTAL
        ----------------------------------------------------------------
        CYFRONET-LCG2_MCDISK         1             0             1
        TRIUMF-LCG2_MCDISK           1             0             1
        INFN-FRASCATI_MCDISK         1             0             1
        UKI-NORTHGRID-LIV-HEP_MCDISK    1             0             1
        TOKYO-LCG2_MCDISK            1             0             1
        BNL-OSG2_MCDISK              1             0             1
        RRC-KI_MCDISK                1             0             1

As we can see, several sites hold a complete replica.
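
If you check many datasets, reading the summary by eye gets tedious. The following Python sketch picks out the sites that hold a complete replica from the text printed by dq2-ls -r. It assumes the output has exactly the layout shown above (site name followed by three integer columns); the function name complete_replica_sites is made up for this example, and the sample text is a shortened copy of the listing above.

```python
def complete_replica_sites(listing):
    """Return the sites with at least one complete and no incomplete replica."""
    sites = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows look like: SITE  <complete>  <incomplete>  <total>
        if len(fields) != 4:
            continue
        name, counts = fields[0], fields[1:]
        if not all(c.isdigit() for c in counts):
            continue
        complete, incomplete = int(counts[0]), int(counts[1])
        if complete > 0 and incomplete == 0:
            sites.append(name)
    return sites


sample = """\
Container name: mc08.108364.AlpgenQcdJ4Np4_pt20.merge.AOD.e389_s462_r635_t53/
Total  datasets: 1
Summary:
        SITE                      / # COMPLETE / # INCOMPLETE /  TOTAL
        ----------------------------------------------------------------
        CYFRONET-LCG2_MCDISK         1             0             1
        UKI-NORTHGRID-LIV-HEP_MCDISK    1             0             1
        BNL-OSG2_MCDISK              1             0             1
"""

print(complete_replica_sites(sample))
```

Header, separator and summary lines are skipped because they do not consist of a name plus three integers, so the function also tolerates the uneven column spacing seen for long site names.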

Note that you can also use other tools such as AMI to check the dataset availability. More information on how to find your data is summarized on FullGangaAtlasTutorial#7_3_Finding_your_Data. If you have a big production and want Ganga to check automatically where datasets are available, this thread from the ATLAS distributed analysis hypernews might be of interest to you.

Choosing backend, clouds, sites

Now we need to decide which grid backend to run on. The options are LCG, NorduGrid, and Panda. So far I have had the best experience in terms of job failure rate and dataset availability with Panda, so this example shows how to run on the Panda backend. The usage of the other backends, however, is very similar. More details can be found on FullGangaAtlasTutorial.

Submitting Jobs

You are now set up for job submission. Go to your workarea and make sure you have a working job options file for your Athena job. If you want to follow this example, get these job options and put them into your working directory. Start the Python interface of Ganga by simply typing

ganga

from your workarea. You will have to enter your grid credentials. Now use the following script to submit your Ganga job:

j = Job()
j.name = 'PandaTestJob'
j.application = Athena()
j.application.exclude_from_user_area = ["*.o","*.root*","*.exe"]
# Enter the path to your jobOptions file here.
j.application.option_file = ['runEV_QCD_udsc_J4_108364_e389_s462_r635_t53.py']
j.application.prepare()
j.inputdata = DQ2Dataset()
# Enter the DQ2 dataset you want to run on. If you give the name of a
# DQ2 dataset container, it must end with a slash.
j.inputdata.dataset = "mc08.108364.AlpgenQcdJ4Np4_pt20.merge.AOD.e389_s462_r635_t53/"
j.outputdata = DQ2OutputDataset()
# Desired output dataset name; Ganga will prepend user09.FirstNameLastName.ganga.
j.outputdata.datasetname = "PandaTestJob"
j.splitter = DQ2JobSplitter()
j.splitter.numsubjobs = 250
# The backend is specified here.
j.backend = Panda()
# Cloud to run in. We have seen above that our dataset is available at
# BNL-OSG2_MCDISK, so we send our job to the US cloud.
j.backend.requirements.cloud = 'US'
# Optionally exclude sites in the cloud, e.g. if you know there are problems.
j.backend.requirements.excluded_sites = ['ANALY_SLAC']
j.submit()

Ganga should split your job into subjobs as specified above and submit them. You can check the status of your Ganga jobs by typing jobs. To get more specific information about the status and details of a job or one of its subjobs, do jobs(<jobid>) and jobs(<jobid>).subjobs[<subjob>]. There are also several functions available to kill, resubmit, etc. jobs and subjobs. You can see these functions with tab-complete, e.g.

mySubJob = jobs(<jobX>).subjobs[<subjobY>]
mySubJob.<tab-complete>
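
With 250 subjobs, a per-status tally is often quicker to read than the full listing. The following is a minimal sketch, assuming each subjob exposes a status attribute holding a string such as 'completed' or 'failed' (check with tab-complete in your Ganga version). The helper name status_summary is made up for this example and is not part of Ganga.

```python
def status_summary(job):
    """Count the subjobs of a job by their status string."""
    counts = {}
    for subjob in job.subjobs:
        counts[subjob.status] = counts.get(subjob.status, 0) + 1
    return counts


# Inside a Ganga session (replace <jobX> with your job id):
# status_summary(jobs(<jobX>))
```

The function only reads job.subjobs and each subjob's status, so it does not change anything about the job.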

Once your jobs finish or fail, Panda will send you an email notification. If any subjobs failed, you will have to resubmit them manually by doing

jobs(<jobX>).subjobs[<subjobY>].resubmit()
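
When many subjobs have failed, resubmitting them one by one is tedious. The following sketch loops over all subjobs and calls resubmit() on those whose status is 'failed'. It assumes that subjobs expose status and id attributes and that failed subjobs report the string 'failed'; the helper name resubmit_failed is made up for this example and is not part of Ganga.

```python
def resubmit_failed(job):
    """Resubmit every subjob whose status is 'failed'; return their ids."""
    resubmitted = []
    for subjob in job.subjobs:
        if subjob.status == "failed":
            subjob.resubmit()
            resubmitted.append(subjob.id)
    return resubmitted


# Inside a Ganga session (replace <jobX> with your job id):
# resubmit_failed(jobs(<jobX>))
```

Before resubmitting in bulk, it is worth looking at why the subjobs failed: if the cause is a broken site or a bug in the job options, the resubmitted subjobs will most likely fail again.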

If all jobs finished successfully, you can copy your dataset from the grid using e.g. dq2-get. Note that your output dataset will be deleted from the grid 30 days after creation! See https://twiki.cern.ch/twiki/bin/view/Atlas/DDMReplicationDeletionPolicy#Deletion_policy_per_DDM_site.

Miscellaneous

  • List clouds, sites, storage elements
    a=AtlasLCGRequirements()
    
    #List clouds:
    a.list_clouds()
    
    #List sites (SEs) in a cloud:
    a.list_sites_cloud('DE')
    
    #List CE of a SE:
    a.list_ce('FZK-LCG2_MCDISK') 
    

Useful Links

DistributedAnalysisUsingGanga

FullGangaAtlasTutorial

DQ2ClientsHowTo

ATLAS Distributed Analysis Hypernews (very useful!)

Old but still useful:

Trash.AtlasGangaTutorial5

Trash.AtlasGangaShortTutorial5


Major updates:
-- MoritzBackes - 2009-09-18

%RESPONSIBLE% MoritzBackes
%REVIEW% Never reviewed

-- MoritzBackes - 11-Nov-2009

Topic revision: r5 - 2010-10-12 - JohannesElmsheuser
 