Ganga Notes

These are some personal notes on using Ganga on the RAL PPD Tier 2. I relied mainly on

Setting up your account to run Atlas Ganga

You first need an ATLAS Grid Certificate.

Use the RAL PPD Tier 2 SL4 front-ends (linux.pp.rl.ac.uk). Here is the first-time setup, for run directory ~/testarea/13.0.40/WorkArea/run and using the scratch area /opt/ppd/scratch/USER for Ganga's workspace:

% cp ~adye/testarea/13.0.40/setup.sh ~/testarea/13.0.40/setup.sh
% cp ~adye/testarea/13.0.40/WorkArea/run/submit.sh ~/testarea/13.0.40/WorkArea/run/submit.sh

setup.sh sets up your environment. submit.sh contains an example submission command.

% mkdir /opt/ppd/scratch/USER

For your convenience, you may want to replace ~/gangadir with a symlink to this directory.
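
One way to set up that symlink is sketched below. The helper name link_gangadir is hypothetical (it is not part of Ganga), and the sketch assumes Ganga's files live in a gangadir subdirectory of the scratch area, matching the paths in the .gangarc settings in these notes. Any existing ~/gangadir is moved aside rather than deleted.

```shell
# Sketch: point a Ganga directory at the scratch area via a symlink.
# link_gangadir is a hypothetical helper, not part of Ganga.
link_gangadir() {
    target="$1"   # assumed layout, e.g. /opt/ppd/scratch/USER/gangadir
    link="$2"     # e.g. ~/gangadir
    mkdir -p "$target" || return 1
    # Already a symlink: leave it alone.
    if [ -L "$link" ]; then
        return 0
    fi
    # Keep any existing directory as a backup instead of removing it.
    if [ -e "$link" ]; then
        mv "$link" "$link.bak" || return 1
    fi
    ln -s "$target" "$link"
}
```

It would be called as link_gangadir /opt/ppd/scratch/USER/gangadir ~/gangadir, with USER replaced by your account name.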

% source ~/testarea/13.0.40/setup.sh
% ganga -g

The first time setup.sh runs, it prompts you for your Grid Certificate passphrase (via voms-proxy-init). Running ganga -g then creates a default ~/.gangarc, in which you can edit the following settings (each must go in the same section as the commented-out template setting with the same name):

RUNTIME_PATH = GangaAtlas
local_root = /opt/ppd/scratch/USER/gangadir/repository
topdir = /opt/ppd/scratch/USER/gangadir/workspace
merge_output_dir = /opt/ppd/scratch/USER/gangadir/merge_results
VirtualOrganisation = atlas
ATLAS_SOFTWARE = /opt/ppd/atlas/sl4
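
To find which section each of these settings belongs in, something like the following sketch can help. It assumes the generated ~/.gangarc uses [section] headers and that template settings appear as lines of the form "#name = value". Both are assumptions about the file layout, and find_gangarc_sections is a hypothetical helper name.

```shell
# Sketch: report which [section] of a config file holds each named setting,
# whether the setting is active or still a commented-out template line.
# find_gangarc_sections is a hypothetical helper, not part of Ganga.
find_gangarc_sections() {
    rcfile="$1"; shift
    for key in "$@"; do
        awk -v key="$key" '
            /^\[.*\]/ { section = $0 }        # remember the current [section]
            {
                line = $0
                sub(/^[# \t]+/, "", line)      # drop leading "#" and blanks
                split(line, parts, /[ \t]*=/)  # parts[1] is the setting name
                if (parts[1] == key) print key ": " section
            }
        ' "$rcfile"
    done
}
```

For example, find_gangarc_sections ~/.gangarc RUNTIME_PATH topdir prints one "name: [section]" line for each place the setting is found.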

Using Ganga to submit an Athena Job

Each time you log in:

% source ~/testarea/13.0.40/setup.sh
% cd ~/testarea/13.0.40/WorkArea/run

Create AnalysisSkeleton_topOptions_Ganga.py by commenting out the InputCollections and EvtMax lines from AnalysisSkeleton_topOptions.py. Edit submit.sh to specify the job submission parameters you want (e.g. the input dataset). Then run it:

% ./submit.sh
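
The options-file edit described above can be sketched with sed. This is a minimal sketch, not the method actually used for these notes: make_ganga_options is a hypothetical helper name, and it only comments out single lines, so an InputCollections list that spans several lines would still need fixing by hand.

```shell
# Sketch: write a copy of a job-options file with every not-yet-commented
# line that mentions InputCollections or EvtMax commented out.
# make_ganga_options is a hypothetical helper name.
make_ganga_options() {
    src="$1"   # e.g. AnalysisSkeleton_topOptions.py
    dst="$2"   # e.g. AnalysisSkeleton_topOptions_Ganga.py
    sed -e '/^[^#]*InputCollections/s/^/#/' \
        -e '/^[^#]*EvtMax/s/^/#/' "$src" > "$dst"
}
```

It would be called as make_ganga_options AnalysisSkeleton_topOptions.py AnalysisSkeleton_topOptions_Ganga.py from the run directory.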

To force the job to run on the RAL Tier 2, specify --ce heplnx206.pp.rl.ac.uk:2119/jobmanager-lcgpbs-atlas, though I haven't tested this. For a full list of PPD CEs, use lcg-infosites --vo atlas ce | fgrep .pp.rl.ac.uk.

To check the job status and see when it finishes, the GUI is convenient (or at least as convenient an option as I found!):

% ganga --gui &

stdout is automatically copied to ~/gangadir/workspace/Local/JOBNUM/output if Ganga is running.

Right-click on the job and select Extras -> Job.outputdata.retrieve to retrieve AnalysisSkeleton.aan.root to this directory.

Alternatively, it should be possible to get the ROOT output with dq2_get (see datasetname in the output data section of Job Details), but it didn't work for me: I can't get dq2_get working at all on the PPD Tier 2, except when Ganga runs it with the retrieve command above.

You can also watch individual jobs with edg-job-status https://... (the URL comes from backend.id).

List of Ganga bugs, annoyances, and questions that maybe I'll post somewhere sometime

  • Why aren't the Grid commands included in the release kit? AtlasLogin replaces, in the $PATH and $LD_LIBRARY_PATH etc, the local installation (eg. /opt/globus/bin) with release directories (eg. /opt/ppd/atlas/sl4/13.0.40/sw/lcg/external/Grid/globus/4.0.3-VDT-1.6.0/slc3_ia32_gcc323/globus/bin), but those directories don't exist in the kit (they are in the CERN installation). I worked around this problem by adding the local installation back after running ~/cmthome/setup.sh:

export PATH="$PATH:/opt/globus/bin"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/c-ares/lib"
unset SRM_PATH

(this was a suggestion from Chris and Bill).

  • Warning messages are displayed in yellow, which is nearly invisible on a light background. I tried to fix this in .gangarc with args = ['-colors','NoColor'], but that didn't change anything.

  • Ganga is very verbose. When run in "ganga athena" mode, surely it could just say what it is doing, not print pages of help and summary info. On the other hand, the summary info could be written to a log file.

Ganga.GPIDev.Lib.Job : WARNING  ApplicationConfigurationError: No inputdata has been specified.  ... reverting job 12 to the new status

  • Have to keep Ganga open to watch for output. No simple "bjobs" shell command.

  • Can I find the output dataset name and EDG id without starting Ganga?

  • Is there a better way to retrieve the output ROOT file? Should be able to use dq2_get, but that's not working at RAL (and need to fire up GUI to get dataset name). Using the GUI's Extras -> Job.outputdata.retrieve seemed to work well, but this is hardly obvious!

  • Ganga GUI gives a segmentation fault on exit. I hope that's OK.

  • "ganga athena" submit gives lots of error messages (in red), eg.
GangaAtlas.Lib.AtlasLCGRequirements: ERROR    Cannot extract host from root://acas0420.usatlas.bnl.gov//data
GangaAtlas.Lib.AtlasLCGRequirements: ERROR    No CE information on site BNLXRDHDD1. Maybe it failes the SAM test.

  • By default Ganga puts all its output in ~/gangadir. Even without the job output ROOT files, that can quickly fill up one's quota (Atlas logfiles are enormous). My recipe uses a scratch disk. Perhaps that should be documented. Or is there some way to control the download of stdout files?

  • It would be really nice to have a Panda-style status page to see the progress of jobs in the system. Have they started yet (on a real CPU, rather than somewhere in Gridland)? Where are they running? View logfiles when finished, rather than downloading to home dir.

-- TimAdye - 13 Feb 2008

Topic revision: r4 - 2008-04-21 - TimAdye
 