Using prun
Prun documentation
https://twiki.cern.ch/twiki/bin/view/Atlas/PandaRun
and more general information:
https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda
Setup pathena (at LPSC)
ssh lpsc-ui
source ~/setup_grid.sh
export PATHENA_GRID_SETUP_SH=/etc/profile.d/env.sh
source /etc/panda/panda_setup.sh
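Before submitting anything, it can be useful to check that the grid proxy and the panda client are in place. A quick sanity check, assuming a standard ATLAS VOMS configuration:
voms-proxy-init -voms atlas   # create a grid proxy, if setup_grid.sh did not already do it
voms-proxy-info -all          # verify that the proxy is valid
which prun                    # verify that the panda client tools are in the PATH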
Run prun with a ROOT macro
cd /atlas/donini/Analysis/BTagAna/grid
First, test that this works locally:
echo [list of files separated by commas] > input.txt; root.exe run.C
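For instance, with two hypothetical local files (the file names are placeholders; -b -q are standard ROOT options to run in batch mode and quit when the macro finishes):
echo file1.root,file2.root > input.txt; root.exe -b -q run.C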
If successful, send the job to the grid (here on just 5 files):
prun --exec "echo %IN > input.txt; root.exe run.C" --athenaTag=15.5.0 \
  --outDS user10.JulienDonini.test1 \
  --inDS user10.CecileLapoire.mc09_7TeV.108067.J1_pythia_jetjet_muFIXED.JetTagNtuple35.e521_s765_s767_r1207_r1210rereco1001 \
  --outputs out1.root --nFiles 5
Job splitting: use option --nFilesPerJob:
--nFilesPerJob 10
creates subjobs of 10 files each.
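For example, a sketch of a full command combining the options above (the input dataset is a placeholder, and user10.JulienDonini.test2 is a hypothetical output dataset name):
prun --exec "echo %IN > input.txt; root.exe run.C" --athenaTag=15.5.0 \
  --inDS <input dataset> --outDS user10.JulienDonini.test2 \
  --outputs out1.root --nFilesPerJob 10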
Panda monitoring
Panda bookkeeping
Use PandaBook for bookkeeping, resubmitting failed jobs, etc.:
- pbook --gui: with a nice graphical interface
- pbook: with the command line; see help() for the list of commands.
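A minimal command-line session might look as follows (show, retry and kill are among the commands listed by help(); check your client version):
pbook
>>> show()         # list your jobs and their current status
>>> show(JobID)    # show the details of one job
>>> retry(JobID)   # resubmit the failed subjobs of a job
>>> kill(JobID)    # kill a queued or running job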
Using Ganga
Ganga: official website and user documentation.
Submitting Athena jobs (at LPSC)
See the instructions (in French) to install and configure Ganga here:
tutorial
Go to the cmt/ directory of the package that you want to submit to the grid, then start Ganga:
ganga
then, within Ganga, configure and submit the job:
# Select the Athena release within Ganga (GangaAtlas commands)
cmtsetup 14.2.21,setup
setup
j = Job()
j.application=Athena()
# Do not ship object files, ROOT files or executables with the job
j.application.exclude_from_user_area=["*.o","*.root*","*.exe"]
# Compile the user area before submission (use athena_compile=False to skip compilation)
j.application.prepare(athena_compile=True)
#j.application.prepare(athena_compile=False)
j.application.option_file="$HOME/testarea/AtlasOffline-14.2.21/SingleTopDPDMaker/run/runSingleTopDPDMaker_FDR2_test.py"
j.application.max_events=100
# Input dataset, resolved through DQ2
j.inputdata=DQ2Dataset()
j.inputdata.dataset="fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10"
# Register the output as a DQ2 dataset
j.outputdata=DQ2OutputDataset()
j.outputdata.outputdata=['SingleTopD3PD.root']
# Split into 6 subjobs of 1 file each
j.splitter=DQ2JobSplitter()
j.splitter.numsubjobs=6
j.splitter.numfiles=1
# Merge the subjob outputs once all subjobs are done
j.merger=AthenaOutputMerger()
# Run on the LCG backend, restricted to the FR cloud
j.backend=LCG()
j.backend.requirements=AtlasLCGRequirements()
#j.backend.requirements.sites=['IN2P3-LPSC_DATADISK']
j.backend.requirements.cloud="FR"
j.submit()
General remarks
Some hints taken from this tutorial.
A few hints on how to work with the Ganga Command Line Interface in Python (CLIP), after you have started Ganga with:
ganga
- The repository for input/output files for every job is located by default at:
$HOME/gangadir/workspace/Local/
- All commands typed on the Ganga command line can also be saved to script files like mygangajob1.py, etc. and executed with: execfile('/home/Joe.User/mygangajob1.py') (a batch-mode alternative is sketched after this list)
- The job repository can be viewed with:
jobs
- Subjobs of a specific job can be viewed with: subjobs jobid (e.g. jobid=42)
- A running or queued job can be killed with:
jobs(jobid).kill()
- A completed job can be removed from the job repository with:
jobs(jobid).remove()
- The job output directory of finished jobs that is retrieved back to the job repository can be viewed with:
jobs(jobid).peek()
- The stdout log file of a finished job can be viewed with: jobs(jobid).peek('stdout', 'cat')
- Export a job configuration to a file with:
export(jobs(jobid), '~/jobconf.py')
- Load a job configuration from a file with:
load('~/jobconf.py')
- Change the logging level of Ganga to get more information during job submission etc. with:
config["Logging"]['GangaAtlas'] = 'INFO'
or config["Logging"]['GangaAtlas'] = 'DEBUG'
- For further instructions on how to work with the Ganga Command Line Interface in Python (CLIP) see: Working with GANGA 5
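As mentioned in the list above, a saved job script does not have to be run through execfile() interactively. A minimal batch-mode sketch, assuming your Ganga version accepts a script file as argument (mygangajob1.py is the hypothetical script from the list above):
ganga ~/mygangajob1.py   # run the saved job script and exit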
Monitoring the jobs
Ganga commands:
jobs
jobs(<job_number>)
Other commands:
glite-wms-job-status <subjob_url> # the job URL is given by jobs(<job_number>)
glite-wms-job-output <subjob_url> # retrieve the job logfile output
Using pathena
See some useful documentation on Panda here.
Setup of pathena (at LPSC)
Follow the instructions here (in French):
setup
Job submission
Go to the run/ directory of the package (where the job options are).
Run:
pathena --inDS <input dataset> --outDS <output dataset: user08.JulienDonini....> --cloud=FR --split 6 --nEventsPerJob 100 <job option file name>
for example:
pathena --inDS fdr08_run2.0052280.physics_Egamma.merge.AOD.o3_f8_m10 --outDS user.FabianLambert.panda.1.20081105 --site IN2P3-LPSC_DATADISK --split 6 --nEventsPerJob 100 RunTopAnalysisDPD.py
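Once the jobs have finished, the output dataset can be retrieved locally with dq2-get; a sketch using the output dataset name from the example above:
dq2-get user.FabianLambert.panda.1.20081105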
Job monitoring
Monitoring webpage.
There are two methods:
1- Click on listUser (top right), then click on your user name
2- Type the job id in the box on the left (quick search -> job)
Help on DQ2
Useful web pages:
Register local files on dq2
These instructions work for local files at LPSC (not tested elsewhere).
Note: be careful to choose unique dataset and file names. Besides, dataset names must follow certain rules, explained here.
Create and register dataset:
dq2-register-dataset user.JulienDonini.PythiaZbnoMI.evgen.EVNT.v1
dq2-register-location user.JulienDonini.PythiaZbnoMI.evgen.EVNT.v1 IN2P3-LPSC_LOCALGROUPDISK
Copy one local file into the dataset:
dq2-put -d -L IN2P3-LPSC_LOCALGROUPDISK -f pythia.JulienDonini.ZcnoMI.1.pool.root user.JulienDonini.PythiaZcnoMI.evgen.EVNT.v1
Copy the content of a directory:
dq2-put -d -L IN2P3-LPSC_LOCALGROUPDISK -s ./ user.JulienDonini.PythiaZcnoMI.evgen.EVNT.v1
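To check that the files ended up in the dataset, a sketch using standard dq2 client commands (the dataset name is taken from the examples above):
dq2-ls -f user.JulienDonini.PythiaZcnoMI.evgen.EVNT.v1                  # list the files registered in the dataset
dq2-list-dataset-replicas user.JulienDonini.PythiaZcnoMI.evgen.EVNT.v1  # show at which sites the dataset is located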