pRun

Introduction

prun allows users to submit general jobs to Panda. A job may run ROOT (CINT, C++, pyROOT), ARA, a user's own executable, and so on. By default, all files under the current directory are sent to the worker nodes (WNs).

Getting started

Installation

The installation procedure differs from pathena's because the Athena runtime is not mandatory.

via tarball

$ wget https://twiki.cern.ch/twiki/pub/Atlas/PandaRun/panda-client-0.1.tar.gz
$ tar xvfz panda-client-*
$ cd panda-client-*
$ python setup.py install --prefix=[install dir]

via rpm

$ wget https://twiki.cern.ch/twiki/pub/Atlas/PandaRun/panda-client-0.1-1.noarch.rpm
$ rpm -Uvh panda-client-*

Setup

First, set up the grid runtime, or set PATHENA_GRID_SETUP_SH if you don't want to pollute your environment variables, e.g.,
$ source /afs/cern.ch/project/gd/LCG-share/current/etc/profile.d/grid_env.sh
or
$ export PATHENA_GRID_SETUP_SH=/afs/cern.ch/project/gd/LCG-share/current/etc/profile.d/grid_env.sh
This page explains PATHENA_GRID_SETUP_SH. Then, source the prun setup script according to your installation.

via tarball

$ source [install dir]/etc/panda/panda_setup.[c]sh

via rpm

$ source /etc/panda/panda_setup.[c]sh

How to run (an example of ARA job)

The usage of prun is
$ prun [options] script
Try
$ prun -h
to see all available options.

Here is an example of an ARA job.

$ cat aratest.py

#!/bin/bash

"exec" "python" "-Wignore" "$0" "$@"

def main():
    import sys
    import user
    import ROOT
    import PyCintex
    # output
    outF = ROOT.TFile('out1.root','recreate')
    import AthenaROOTAccess.transientTree
    CollectionTree = ROOT.AthenaROOTAccess.TChainROOTAccess('CollectionTree')
    # input
    inputFiles = sys.argv[1].split(',')
    for inputFile in inputFiles:
        print "add %s" % inputFile
        CollectionTree.Add(inputFile)
    tt = AthenaROOTAccess.transientTree.makeTree(CollectionTree)
    # event loop
    for i in range(tt.GetEntries()):
        tt.GetEntry(i)
        photons = tt.PhotonAODCollection
        print [e.eta() for e in photons]

if __name__ == "__main__":
    main()
This script gets the list of input filenames via sys.argv[1] and produces an output file, out1.root. You can run this script locally with
$ aratest.py AOD1.pool.root,AOD2.pool.root
Now you can submit this to Panda by using prun.
$ prun --athenaTag=14.2.24,32,setup --exec "aratest.py %IN" --outDS user08.TadashiMaeno.test123 --inDS valid1.006384.PythiaH120gamgam.recon.AOD.e322_s412_r577 --outputs out1.root --nFiles 5
prun gathers all files under --workDir (default=./) and sends them to WNs. Like a pathena job, a single prun job instantiates one buildGen job and several runGen jobs. buildGen stores your source files on the remote SE and activates the runGen jobs as soon as it finishes. The real jobs run in runGen. The argument of the --exec option is executed after %IN is converted to a list of input files. Input files are available in the current directory on the WN. Output files are automatically renamed to DatasetName_SerialNumber_OriginalName, e.g., user08.TadashiMaeno.test123._00001.out1.root. Output files can be retrieved by using dq2 tools.
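The %IN substitution and the output renaming described above can be sketched in a few lines of Python. This is only an illustration, not prun's actual code: the helper names expand_exec and grid_output_name are hypothetical, the comma-separated form of the file list is inferred from the local run shown earlier, and the name layout is inferred from the single example user08.TadashiMaeno.test123._00001.out1.root.

```python
def expand_exec(exec_template, input_files):
    """Replace %IN with a comma-separated list of input files (assumed format)."""
    return exec_template.replace("%IN", ",".join(input_files))


def grid_output_name(out_ds, serial, original_name):
    """Build an output name laid out like the example in the text (assumed layout)."""
    return "%s._%05d.%s" % (out_ds, serial, original_name)


command = expand_exec("aratest.py %IN", ["AOD1.pool.root", "AOD2.pool.root"])
output = grid_output_name("user08.TadashiMaeno.test123", 1, "out1.root")
print(command)
print(output)
```

A sketch like this can help when checking locally what command line a script should expect on the WN before submitting.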

More Examples

Run ROOT macro

Here is an example of a CINT macro.
$ cat macrotest.C

#include <iostream>
#include <string>
#include <vector>

#include <TROOT.h>
#include <TChain.h>
#include <TFile.h>

gROOT->Reset();

void macrotest()
{
  std::string argStr;

  std::cin >> argStr;

  // split by ','
  std::vector<std::string> fileList;
  for (int i=0,n; i <= argStr.length(); i=n+1)
    {
      n = argStr.find_first_of(',',i);
      if (n == std::string::npos)
        n = argStr.length();
      std::string tmp = argStr.substr(i,n-i);
      fileList.push_back(tmp);
    }

  // open input files
  TChain fChain("CollectionTree");
  for (int iFile=0; iFile<fileList.size(); ++iFile)
    {
      std::cout << "open " << fileList[iFile].c_str() << std::endl;
      fChain.Add(fileList[iFile].c_str());
    }

  Int_t           EventNumber;
  TBranch        *b_EventNumber;
  fChain.SetBranchAddress("EventNumber", &EventNumber, &b_EventNumber);

  // main loop
  Long64_t nentries = fChain.GetEntriesFast();
  for (Long64_t jentry=0; jentry<nentries;jentry++)
    {
      Long64_t ientry = fChain.LoadTree(jentry);
      if (ientry < 0)
        break;
      fChain.GetEntry(jentry);

      std::cout << EventNumber << std::endl;
    }
}
This macro reads a string from stdin and splits it into the list of input files. You can run it locally with
$ echo NTUP1.root,Ntup2.root | root.exe macrotest.C
You can then submit it with
$ prun --exec "echo %IN | root.exe macrotest.C" --athenaTag=14.2.24 --inDS ...
This section explains the --athenaTag option. Stand-alone ROOT is not available on the grid WNs, so you have to use the ROOT included in an Athena release.
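Since prun can also run pyROOT scripts, the same job could be written in Python. The following is a rough, untested sketch rather than a drop-in replacement for the macro: split_file_list and run are hypothetical names, and it assumes that the ROOT Python module from the Athena release is importable on the WN and that the tree has an EventNumber branch, as in the macro. Unlike the macro, the splitting helper skips empty items.

```python
def split_file_list(arg_str):
    """Split a comma-separated string into file names, skipping empty items."""
    return [name for name in arg_str.strip().split(",") if name]


def run(arg_str):
    """Chain the input files and print EventNumber for each entry."""
    import ROOT  # imported here so that the splitting helper works without ROOT

    chain = ROOT.TChain("CollectionTree")
    for file_name in split_file_list(arg_str):
        print("open %s" % file_name)
        chain.Add(file_name)
    for i in range(chain.GetEntries()):
        chain.GetEntry(i)
        print(chain.EventNumber)


# To mirror the macro, a script would end with:
#   import sys
#   run(sys.stdin.readline())
```

With those last two lines added, such a script would read the file list from stdin in the same way as macrotest.C.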

Run jobs without input files

If you don't specify --inDS, --nJobs (default=1) sub-jobs are instantiated. You may want to set random seeds using %RNDM in the --exec option, e.g.,
$ prun --exec "somescript %RNDM=123 %RNDM=456" myscript --outDS user...
where the base number given as %RNDM=basenumber (e.g., %RNDM=100) is incremented per sub-job.
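The effect of %RNDM can be pictured with the short Python sketch below. It is an illustration under stated assumptions, not prun's implementation: expand_rndm is a hypothetical helper, and it assumes that the first sub-job receives the base number itself and that each following sub-job adds one.

```python
import re


def expand_rndm(exec_template, job_index):
    """Replace each %RNDM=base with base + job_index (assumed increment of 1)."""
    def bump(match):
        return str(int(match.group(1)) + job_index)
    return re.sub(r"%RNDM=(\d+)", bump, exec_template)


template = "somescript %RNDM=123 %RNDM=456"
commands = [expand_rndm(template, i) for i in range(3)]
for command in commands:
    print(command)
```

Under these assumptions, each sub-job sees a different pair of seeds while the command line is written only once.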

Use Athena Runtime on WN

Sometimes you may need the Athena runtime on WNs, e.g.,
$ prun --athenaTag=14.2.24,32,setup ...
This sets up the Athena 14.2.24 runtime on the remote WNs. The syntax of --athenaTag may be familiar to Athena users. To use a cache, specify, for example, 14.2.24.3,32,AtlasProduction,setup... as the tag.

Archive output files

When you have a lot of output files on the WN, you may want to archive them.
$ prun --outputs "abc.data,JiveXML_*.xml" ...
This will produce DatasetName_SerialNumber_abc.data and DatasetName_SerialNumber_JiveXML_XYZ.xml.tgz. The latter will contain all JiveXML_*.xml files.
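What the resulting archive contains can be reproduced locally with Python's standard tarfile module. This is a minimal sketch, not the code prun runs on the WN: archive_outputs is a hypothetical helper, the replacement of '*' by 'XYZ' is taken from the example name above, and the DatasetName_SerialNumber_ prefix is left out.

```python
import glob
import os
import tarfile
import tempfile


def archive_outputs(pattern, work_dir):
    """Pack all files matching a wildcard pattern into one .tgz in work_dir."""
    # Naming follows the example in the text: '*' becomes 'XYZ', plus '.tgz'.
    archive_name = pattern.replace("*", "XYZ") + ".tgz"
    archive_path = os.path.join(work_dir, archive_name)
    matches = sorted(glob.glob(os.path.join(work_dir, pattern)))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in matches:
            tar.add(path, arcname=os.path.basename(path))
    return archive_path


# Small demonstration in a temporary directory.
demo_dir = tempfile.mkdtemp()
for name in ("JiveXML_1.xml", "JiveXML_2.xml", "abc.data"):
    with open(os.path.join(demo_dir, name), "w") as handle:
        handle.write("dummy\n")

archive = archive_outputs("JiveXML_*.xml", demo_dir)
with tarfile.open(archive) as tar:
    members = sorted(tar.getnames())
print(os.path.basename(archive), members)
```

Files that do not match the wildcard, such as abc.data here, stay outside the archive, which matches the behaviour described for --outputs.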

Send jobs to a particular site

prun automatically chooses an appropriate site using information about dataset location, site occupancy, and the user's VOMS FQAN. However, users can send jobs to a particular cloud or site with the --cloud or --site option, e.g.,
$ prun --cloud FR ...
$ prun --site TRIUMF ...


TODO

  • development of book-keeping tool


Major updates:
-- TadashiMaeno - 14 Nov 2008



Responsible: TadashiMaeno

Never reviewed

Topic revision: r4 - 2008-11-15 - TadashiMaeno
 