Running AtlasHbb

Introduction

These are the instructions I learned from Mike and ran myself to create flat tuples on the grid for the v16 data files. There are 12 data periods (6 for MUONS, 6 for EGAMMA), each labelled by a letter (letters A and C are missing because period C is bad and period A is only used in v17?). The first step is to submit these to the grid using multitask, with one task per data period (tasks 0 through 11). For data, each task has one unit, one unit is one job (in ganga), each unit has several partitions, and each partition has several files. After that we use ganga and panda to monitor the status of the jobs. When one period finishes, we run a script to create a log file listing the output files on the grid. Finally we run a script to retrieve those files onto a Glasgow computer.

We use the screen command for all of these steps, because we want to be able to close our own computer while they keep running. To speed up the retrieval from the grid we do not use a single computer but several ppe computers at the same time, one for each data period.

How to use screen

ssh -Y ppepc137.physics.gla.ac.uk

Then open an xterm terminal.

xterm

To list the existing screen sessions and see whether they are attached or detached

screen -ls 

The output looks like this

There are screens on:
        18281.pts-5.ppepc137    (Detached)
        31921.pts-6.ppepc137    (Detached)
2 Sockets in /var/run/screen/S-abuzatu.

The two are independent terminals, so you can connect to one, then detach and connect to the other one, all from the same xterm terminal. To connect to the first one

screen -D -R 18281

Then you work in it and you detach with "control + A" and then "D". Then you can attach again to another one

screen -D -R 31921

To kill a screen terminal, type this inside it

exit

To start a new screen session

screen

and not "screen -D -R" which would attach to the existing one, if there is only one.
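When several sessions are open it can help to pull the session numbers out of the "screen -ls" listing automatically. The following is a minimal sketch in standalone Python 3, assuming the listing has exactly the layout shown above; the helper names parse_screen_ls and detached_pids are hypothetical and are not part of screen or of AtlasHbb.

```python
import re

# Matches session lines such as "        18281.pts-5.ppepc137    (Detached)".
# Assumption: the listing has the layout shown in this section.
SESSION_RE = re.compile(r"^\s*(\d+)\.(\S+)\s+\((\w+)\)\s*$")


def parse_screen_ls(text):
    """Return a list of (pid, name, state) tuples found in `screen -ls` output."""
    sessions = []
    for line in text.splitlines():
        match = SESSION_RE.match(line)
        if match:
            pid, name, state = match.groups()
            sessions.append((int(pid), name, state))
    return sessions


def detached_pids(text):
    """Return the session numbers that are currently detached."""
    return [pid for pid, _name, state in parse_screen_ls(text) if state == "Detached"]


SAMPLE = """There are screens on:
        18281.pts-5.ppepc137    (Detached)
        31921.pts-6.ppepc137    (Detached)
2 Sockets in /var/run/screen/S-abuzatu.
"""

if __name__ == "__main__":
    # Print the reattach command for every detached session.
    for pid in detached_pids(SAMPLE):
        print("screen -D -R %d" % pid)
```

In practice the text would come from running "screen -ls" rather than from the SAMPLE string.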

How to check the current blacklisted sites

Start a new xterm

xterm

We have to set up athena v17, as it is the only release that also works in 64 bits, which is what the script that parses the database of blacklisted sites uses.

export AtlasSetup=/afs/cern.ch/atlas/software/dist/AtlasSetup 
alias asetup='source $AtlasSetup/scripts/asetup.sh'
asetup 17.1.0,64

Then we setup AGIS

source /afs/cern.ch/user/a/agis/public/AGISClient/latest/setup.py26.sh

We go to the appropriate folder

cd /data/atlas13/abuzatu/AtlasHbb/Grid_GlaNtpPackage/ModifiedD3PD

Run the script from this folder

python blacklistedSites.py

And you get an output like this

IN2P3-CC,SARA-MATRIX,NDGF-T1,INFN-ROMA1,RAL-LCG2,INFN-FRASCATI,UKI-LT2-RHUL,NIKHEF-ELPROD,IN2P3-CPPM
['IN2P3-CC', 'SARA-MATRIX', 'NDGF-T1', 'INFN-ROMA1', 'RAL-LCG2', 'INFN-FRASCATI', 'UKI-LT2-RHUL', 'NIKHEF-ELPROD', 'IN2P3-CPPM']

We list these sites in the excluded sites option of the command that submits the grid jobs, as we do not want to run on them: they will be down during the next week, when our jobs will run. To these you may add the sites you prefer to avoid, such as ANALY_OX.
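Merging the blacklist with your own unwanted sites by hand is error prone, so here is a minimal sketch of how the comma-separated value for the excluded sites option could be assembled. It is standalone Python 3, the function name build_exclude_sites is hypothetical, and it takes as input the first output line of blacklistedSites.py as shown above.

```python
def build_exclude_sites(blacklist_line, extra_sites=()):
    """Merge blacklisted and personally excluded sites into one comma-separated string.

    The original order is kept, and empty entries and duplicates are dropped.
    """
    merged = []
    candidates = blacklist_line.split(",") + list(extra_sites)
    for site in candidates:
        site = site.strip()
        if site and site not in merged:
            merged.append(site)
    return ",".join(merged)


if __name__ == "__main__":
    # First output line of blacklistedSites.py, copied from this page.
    blacklist = ("IN2P3-CC,SARA-MATRIX,NDGF-T1,INFN-ROMA1,RAL-LCG2,"
                 "INFN-FRASCATI,UKI-LT2-RHUL,NIKHEF-ELPROD,IN2P3-CPPM")
    print(build_exclude_sites(blacklist, ["ANALY_OX"]))
```

Dropping empty entries means that an empty blacklist does not leave a stray leading comma in the result.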

How to clean, build, and test AtlasHbb

Start a new xterm. You do not need to set up anything, as the script does the setup for you.

cd /data/atlas13/abuzatu/AtlasHbb/

To see the instructions to run the script

./AtlasHbbScript.sh -h

If we do just small edits like adding some couts,

cd GlaNtpPackage/testsvn1/AtlasHbb-00-00-20/BaseAnalysisCode/src/
emacs FlatTupleMaker.cxx &

To clean, rebuild, and run the flat tuple test on only 10 events

./AtlasHbbScript.sh -a 16.6.5 -p 00-00-20 -g 00-00-42 -n -r -cleanBuildHbb -runtype corrections -D3PDtype flattuple -process T1 -Nevents 10

If we make smaller changes, we can build without a clean

./AtlasHbbScript.sh -a 16.6.5 -p 00-00-20 -g 00-00-42 -n -r -BuildHbb -runtype corrections -D3PDtype flattuple -process T1 -Nevents 10

How to submit to the grid using multitask

First setup ganga

source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh

Then copy AtlasHbb from Mike to your own local area and make sure you remove the .log files, as they are very big (20 GB out of 21 GB total). In the future you may check out AtlasHbb, but some of the latest changes are not in CVS yet. Go to the appropriate folder

cd /data/atlas13/abuzatu/AtlasHbb

and then run the command below. The example is pretty self-explanatory as to what the options mean.

./AtlasHbbScript.sh -a 16.6.5 -p 00-00-20 -g 00-00-42 -n -r -grid -runtype data -d3pdtag r2300_p605 -version V1 -release R3 -grid_subV FLT -inputDSDictFile InputDSV16_Data1034.95invfb1.txt -ganga 5.7.12 -D3PDtype flattuple -user_area_path /data/atlas13/abuzatu/ -ExcludeSites JINR-LCG2,IN2P3-CC,NDGF-T1,INFN-ROMA1,RRC-KI,SE-SNIC-T2,JINR-DNLP-T3,UKI-LT2-Brunel,IN2P3-CC-T2,ANALY_OX

This will submit the jobs to the grid. Now you can detach screen and it will continue running. After it finishes submitting, it closes ganga, so we should then open ganga again so that the job statuses get updated as jobs finish or fail.

If you want to rerun only a subset of the periods, because some of their partitions failed, do this.

cd /data/atlas13/abuzatu/AtlasHbb/Grid_GlaNtpPackage/ModifiedD3PD
cp InputDSV16_Data1034.95invfb1.txt InputDSV16_Data1034.95invfb1_reduced4.txt
emacs -nw InputDSV16_Data1034.95invfb1_reduced4.txt
(and remove the datasets you do not want)

Then find out again which grid sites are currently blacklisted, by opening a new xterm and following the steps described above.

Go back to

cd /data/atlas13/abuzatu/AtlasHbb

and then run the submission again, replacing the input text file with the new one

./AtlasHbbScript.sh -a 16.6.5 -p 00-00-20 -g 00-00-42 -n -r -grid -runtype data -d3pdtag r2300_p605 -version V1 -release R3 -grid_subV FLT -inputDSDictFile InputDSV16_Data1034.95invfb1_reduced4.txt -ganga 5.7.12 -D3PDtype flattuple -user_area_path /data/atlas13/abuzatu/ -ExcludeSites ,ANALY_OX
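The two submission commands above differ only in the input dataset file and in the excluded sites, so they can be generated from one template. This is a minimal sketch in standalone Python 3 that only builds the command string and does not run it. The option names and values are copied from the commands above; the function name build_submit_command and the constant FIXED_OPTIONS are hypothetical.

```python
import shlex

# Options copied verbatim from the submission commands in this section.
FIXED_OPTIONS = [
    "-a", "16.6.5", "-p", "00-00-20", "-g", "00-00-42", "-n", "-r", "-grid",
    "-runtype", "data", "-d3pdtag", "r2300_p605", "-version", "V1",
    "-release", "R3", "-grid_subV", "FLT",
]


def build_submit_command(input_ds_file, exclude_sites,
                         user_area="/data/atlas13/abuzatu/"):
    """Return the AtlasHbbScript.sh grid submission command as one string.

    Only the dataset file and the list of excluded sites vary between calls.
    """
    args = ["./AtlasHbbScript.sh"] + FIXED_OPTIONS + [
        "-inputDSDictFile", input_ds_file,
        "-ganga", "5.7.12",
        "-D3PDtype", "flattuple",
        "-user_area_path", user_area,
        "-ExcludeSites", ",".join(exclude_sites),
    ]
    # Quote every argument so the string can be pasted into a shell safely.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(build_submit_command(
        "InputDSV16_Data1034.95invfb1_reduced4.txt",
        ["IN2P3-CC", "ANALY_OX"],
    ))
```

The printed line can be pasted into the screen session from which the jobs are submitted.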

How to check on the progress of your jobs

Start a new xterm

xterm

Setup ganga

source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh

Start ganga

ganga

To see which tasks you have

tasks

To see which jobs you have

jobs

To see for a given task what jobs it has (for data we expect only one)

tasks(24).getJobs()

To see for a given task the unit overview (all blue means completed)

tasks(24).unitOverview()

To see for a given task the overview of the partitions (all blue means completed)

tasks(24).overview()

To loop over all our desired tasks, pausing for 4 seconds after each task's output, we type the following at the ganga prompt

import time
for i in range(24,30):
    print "TASK: %s" % i
    tasks(i).unitOverview()
    tasks(i).overview()
    time.sleep(4)

To check the details of one partition inside the unit of the task, for example those that have errors.

tasks(25).transforms[0].getPartitionJobs(462)

We can see an output like

462.188

Now we can see the details like

jobs('462.188')

From this I get the jobsetID (the panda job id), here 1218, which belongs to the first retry. Since that retry has only just started rerunning, we expect to see it running with no error. The errors belong to the previous jobsetID (1185), which we get with "jobs(462)".

From that we get the jobsetID, which is what we search for in the panda webpage

http://panda.cern.ch:25980/server/pandamon/query

More precisely, the jobs for Adrian Buzatu

http://panda.cern.ch/server/pandamon/query?ui=user&name=adrian%20buzatu
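The per-user link above is just the query page with the user name percent-encoded. As a minimal sketch, assuming the query parameters "ui" and "name" behave as in the link above, the same URL can be built for any user in standalone Python 3; the function name panda_user_url is hypothetical.

```python
from urllib.parse import quote, urlencode

# Base address taken from the per-user link in this section.
PANDA_QUERY_URL = "http://panda.cern.ch/server/pandamon/query"


def panda_user_url(full_name):
    """Return the panda monitor URL that lists the jobs of the given user."""
    # quote_via=quote encodes the space as %20 rather than as a plus sign.
    query = urlencode({"ui": "user", "name": full_name}, quote_via=quote)
    return PANDA_QUERY_URL + "?" + query


if __name__ == "__main__":
    print(panda_user_url("adrian buzatu"))
```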

To remove a task (but first note down which data and period it refers to)

tasks(11).remove(True)

From these overviews you can see which tasks have finished completely. The outputs of those tasks are ready to be downloaded at Glasgow.

Prepare the files to download at Glasgow

Go to the screen window from which you submitted the ganga jobs, which is also the window where you are currently running ganga to monitor the jobs. Exit ganga with "control+D". Alternatively, you can start a new xterm and set up ganga.

xterm

All we need is to set up ganga, but first we need to close any ganga session that is running elsewhere.

source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh

Either way, after that you have to first make a text file with the output file names and locations

cd /data/atlas13/abuzatu/AtlasHbb/Grid_GlaNtpPackage/ModifiedD3PD/

To see the instructions on how to use the script

python dq2get.py -h

To produce the output text file for task 3

ganga dq2get.py --ProduceOutfiles --TaskNumber 3 --TrfMinNo 0 --TrfMaxNo 0 --UnitMinNo 0 --UnitMaxNo 0 
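When several tasks have finished, the same command has to be repeated with a different task number each time. The following minimal sketch in standalone Python 3 only generates the command strings. The flags are copied from the command above, the default ranges of 0 to 0 reflect that a data task has a single unit as described in the introduction, and the function name produce_outfiles_command is hypothetical.

```python
def produce_outfiles_command(task_number, trf_range=(0, 0), unit_range=(0, 0)):
    """Return the dq2get.py command that writes the output list for one task."""
    return (
        "ganga dq2get.py --ProduceOutfiles --TaskNumber %d "
        "--TrfMinNo %d --TrfMaxNo %d --UnitMinNo %d --UnitMaxNo %d"
        % (task_number, trf_range[0], trf_range[1], unit_range[0], unit_range[1])
    )


if __name__ == "__main__":
    # Example task numbers only; use the tasks that ganga reports as completed.
    for task in (3, 5, 8):
        print(produce_outfiles_command(task))
```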

Download output at Glasgow

Now that the text file is made, we use it to download the output files. It is better to start a new screen window, and even better to do so on a different machine. We used ppepc101 and ppepc103.

Start a new xterm

xterm

cd /data/atlas13/abuzatu/AtlasHbb/Grid_GlaNtpPackage/ModifiedD3PD/

All we need is to setup dq2 first.

source Dq2Setup.sh

Then we retrieve the desired files.

python dq2get.py --RunFromFile --FileName DATA_periodE_EGAMMA_V1_R3_FLT_gangaOutput.txt --RootFileDir /data/atlas12/abuzatu/data/flattuple

To retrieve the output of another task, close this screen window, start a new one on a different machine and repeat the procedure.
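To keep track of which period is downloaded on which machine, the pairing can be written down before starting. This is a minimal sketch in standalone Python 3 that assigns each output list to a host in turn and prints the matching dq2get.py command. The function name plan_downloads is hypothetical, and the second file name in the example is an assumption that follows the naming pattern of the periodE file above.

```python
from itertools import cycle


def plan_downloads(output_lists, hosts, root_dir):
    """Pair each ganga output list with a host (round-robin) and its command."""
    plan = []
    # cycle() reuses the hosts when there are more output lists than machines.
    for filename, host in zip(output_lists, cycle(hosts)):
        command = (
            "python dq2get.py --RunFromFile --FileName %s --RootFileDir %s"
            % (filename, root_dir)
        )
        plan.append((host, command))
    return plan


if __name__ == "__main__":
    lists = [
        "DATA_periodE_EGAMMA_V1_R3_FLT_gangaOutput.txt",
        # Assumed name, following the pattern of the file above.
        "DATA_periodD_EGAMMA_V1_R3_FLT_gangaOutput.txt",
    ]
    for host, command in plan_downloads(
            lists, ["ppepc101", "ppepc103"], "/data/atlas12/abuzatu/data/flattuple"):
        print("%s: %s" % (host, command))
```

If a host appears more than once in the plan, its downloads have to run one after the other in that machine's screen window.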

Investigate the output

If the download takes too long to come back, we can check how big the output is. Start a new xterm, then go to the appropriate folder

cd /data/atlas13/abuzatu/AtlasHbb/Grid_GlaNtpPackage/ModifiedD3PD

then set up dq2 with

source Dq2Setup.sh

Then we list all file names and sizes

dq2-ls -fH user.abuzatu.20120503165133.DATA_periodD_EGAMMA_V1_R3_FLT.id_8.periodD.0.periodD.j457.t8.trf0.u

To check where the container is and how many replicas it has, since the output may be at a site that is temporarily down, in which case you cannot retrieve it at the moment

dq2-ls -r user.abuzatu.20120503165133.DATA_periodD_EGAMMA_V1_R3_FLT.id_8.periodD.0.periodD.j457.t8.trf0.u
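The output dataset names are long, and it is easy to mix up periods and streams when checking several of them. The following minimal sketch in standalone Python 3 extracts the leading fields of such a name. It assumes that the 14-digit field is the submission time in the form YYYYMMDDHHMMSS, which this page does not confirm, and the function name parse_output_dataset is hypothetical. The fields after the version tags are left unparsed.

```python
import re
from datetime import datetime

# Pattern derived from the dataset name used in the commands above.
NAME_RE = re.compile(
    r"^user\.(?P<user>[^.]+)\.(?P<stamp>\d{14})\."
    r"DATA_period(?P<period>[A-Z])_(?P<stream>[A-Z]+)_"
    r"(?P<version>V\d+)_(?P<release>R\d+)_(?P<subversion>[A-Z]+)\."
)


def parse_output_dataset(name):
    """Return a dict with the user, period, stream and tags of a dataset name."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError("unrecognised dataset name: %s" % name)
    fields = match.groupdict()
    # Assumption: the 14-digit field is a timestamp (YYYYMMDDHHMMSS).
    fields["submitted"] = datetime.strptime(fields.pop("stamp"), "%Y%m%d%H%M%S")
    return fields


if __name__ == "__main__":
    info = parse_output_dataset(
        "user.abuzatu.20120503165133.DATA_periodD_EGAMMA_V1_R3_FLT."
        "id_8.periodD.0.periodD.j457.t8.trf0.u"
    )
    print(info["period"], info["stream"], info["submitted"])
```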

-- AdrianBuzatu - 08-May-2012

Topic revision: r8 - 2012-05-17 - AdrianBuzatu
 