-- MarcusDeBeurs - 2017-10-25
Introduction
This short tutorial shows how to make I/O monitoring plots of panda jobs offline.
All the information from a panda job is available at https://bigpanda.cern.ch/job?pandaid=<jobid>, where <jobid> is the ID number of the specific job. For an example, have a look at the panda page of this derivation job:
https://bigpanda.cern.ch/job?pandaid=3523491380
(here the ID is 3523491380).
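The URL pattern above can be sketched as a small shell helper. The function name panda_job_url is hypothetical (it is not part of the attached scripts), and the check that the ID is purely numeric is an assumption based on the example ID:

```shell
# Hypothetical helper (not part of the original tutorial): print the
# panda job page URL for a given job ID.
panda_job_url() {
    local jobid="$1"
    # Assumption: job IDs are purely numeric, as in the example job.
    case "$jobid" in
        ''|*[!0-9]*)
            echo "usage: panda_job_url <jobid>" >&2
            return 1
            ;;
    esac
    printf 'https://bigpanda.cern.ch/job?pandaid=%s\n' "$jobid"
}

# Print the URL of the example derivation job.
panda_job_url 3523491380
```

Rejecting an empty or non-numeric argument early avoids requesting a malformed URL later on.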
I/O plots are also provided on the panda website; however, using this offline script lets the user tune them to their liking.
In this example, additional information about the separate steps a panda job goes through is added to the plot.
Set up the environment
We need to set up athena, so let's start by setting up the ATLAS software environment:
setupATLAS
This alias is already defined on lxplus. After you run it, you will see a list of commands for setting up various ATLAS computing tools.
If you are not working on lxplus, you first need to define the following variable and alias:
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'
setupATLAS
Now we can set up athena. Here we use version 21.0.22:
asetup Athena,21.0.22
Getting the info from the Panda website
Before we can start plotting, we need to extract the values from the panda website. The I/O values are in the file 'memory_monitor_output.txt', which is located among the log files.
Many more interesting files are located there, for example the output of the athena job ('athena_stdout.txt'), which we will use to extract information about the subsequent steps the job goes through.
The attached script getPandaJob.sh grabs everything of interest from the panda website. It requires one argument, the pandaID. So let's run it for our example, where pandaID = 3523491380:
. getPandaJob.sh 3523491380
It produces the following 4 files:
- <pandaID>_memory_monitor_output.txt -> Contains the full output of the memory + I/O monitor
- <pandaID>_athena_stdout.txt -> Contains the full standard output of the athena job
- <pandaID>_jobsteps.txt -> Contains all the steps in the job execution, each with a time stamp (in seconds) measured from the start of the job. The start of each execution step is found by searching for "Starting execution" in athena_stdout.txt. The start of the validation is found by searching for "INFO Validating output files" in athena_stdout.txt
- <pandaID>_jobinfo.txt -> Contains additional information stated on the panda website, such as the size of the input and output files
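The search for step markers described for the jobsteps file can be illustrated with a minimal sketch. This is not the code of the attached getPandaJob.sh; the function name find_job_step_markers is hypothetical. It only lists the matching lines with their line numbers. Converting them into seconds since the start of the job depends on the time format used in the log, which is not specified here, so that part is left out:

```shell
# Hypothetical helper (not the attached getPandaJob.sh): list the lines of an
# athena stdout file that mark job steps, prefixed with their line numbers.
find_job_step_markers() {
    local logfile="$1"
    # -n prints line numbers, -F treats the patterns as fixed strings,
    # and each -e adds one of the two search strings quoted above.
    grep -n -F \
        -e 'Starting execution' \
        -e 'INFO Validating output files' \
        "$logfile"
}

# Example call on the file produced for the example job:
# find_job_step_markers 3523491380_athena_stdout.txt
```

Using fixed-string matching keeps the search literal, so no character in the two search strings is interpreted as a regular expression.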