This page is outdated. The new FileStager tutorial can be found here.

Introduction to the FileStager package

The FileStager package is ideal for running over collections on a nearby Tier1 or Tier2 site, such as the AODs and DPDs located at CERN, as if these files were available on your local hard disk.

Given a list of collections on a nearby Tier1 or Tier2 site that you wish to run over, the FileStager allows you to run over these files interactively, by making local copies of the desired collections one by one and running over those. The next file is copied while you are running over the previous one. The copying happens with a given (local) copy command, e.g. rfcp or lcg-cp. Given a fast enough network connection, this is as fast as running over the same files on a local hard disk, and is normally faster than running over NFS.

How it works:

For example, the stager makes a local copy of file1 while you are running over file0. At the end of (the local copy of) file0, the filestager deletes (the temporary) file0. While you are running over the local copy of file1, it starts staging file2, and so on.

Copying a file and running over it locally turns out, on average, to be more than 4 times faster than running over the collection interactively using a network protocol such as rfio or gfal. When running a normal reconstruction job, e.g. taking at least a few ms per event, the next file should normally have finished staging before it is actually needed. Running over the collections is then as fast as having the files on local disk, and the only time loss you experience from staging is the copying of the very first file.
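
To make this scheme concrete, here is a minimal standalone sketch of the one-file-ahead pattern. It illustrates the idea only and is not the actual TStageManager implementation; the stage() and process() helpers and the hard-wired lcg-cp call are assumptions.

#include <cstdio>
#include <cstdlib>
#include <future>
#include <string>
#include <vector>

// Hypothetical helper: copy one remote file to a local tmp path by shelling
// out to an external copy command (the real stager uses e.g. lcg-cp or rfcp).
std::string stage(std::string remote) {
  std::string local = "/tmp/tcf_" + remote.substr(remote.rfind('/') + 1);
  std::system(("lcg-cp -v --vo atlas " + remote + " file:" + local).c_str());
  return local;
}

// Hypothetical helper: run over the events in the local copy.
void process(const std::string& local) {
  std::printf("running over %s\n", local.c_str());
}

int main() {
  std::vector<std::string> files; // srm:// filenames, e.g. from a .def file
  if (files.empty()) return 0;

  // stage the first file, then always stay one file ahead of the processing
  std::future<std::string> next = std::async(std::launch::async, stage, files[0]);
  for (size_t i = 0; i < files.size(); ++i) {
    std::string local = next.get();   // wait here if staging has not finished
    if (i + 1 < files.size())         // kick off the copy of the next file
      next = std::async(std::launch::async, stage, files[i + 1]);
    process(local);                   // run over the current local copy ...
    std::remove(local.c_str());       // ... and delete the temporary file
  }
  return 0;
}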

Files

The FileStager can run standalone or in Athena. The core software of the filestager is Root/TStageManager.cxx, which takes care of copying the files you need behind the scenes. This stage manager is called in ROOT through the classes TCopyFile and TCopyChain.

In Athena, the stage manager is called by the wrapper algorithm src/FileStagerAlg.cxx.

Requirements

  • Make sure to have a working grid certificate.
  • A few gigabytes of temporary disk space.

Setting up the FileStager package

Start from a clean session. I assume that we are working with Athena release 14.2.24 and that it has already been set up, via

source ~/cmthome/setup.sh -tag=14.2.24,32
or similar commands on your account. In this tutorial we will work with the FileStager package, so check it out in your Athena working directory via
cmt co -r FileStager-00-00-31 Database/FileStager
go to
cd Database/FileStager/cmt
and type
cmt config
source setup.sh
gmake

Something we won't do now, but which can be very useful: the filestager can also be compiled standalone, i.e. independently of Athena. Go to

cd Database/FileStager/cmt
and type
gmake -f Makefile.Standalone
in which case the shared library will be stored in the StandAlone/ directory.
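
To use the standalone build in a plain ROOT session, load the resulting shared library before using the stager classes. A minimal sketch, assuming the library is named libFileStager.so (check the StandAlone/ directory for the actual name):

// in ROOT, before using TCopyFile/TCopyChain (library name assumed):
gSystem->Load("StandAlone/libFileStager.so");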

Getting a grid proxy certificate

You need a working grid certificate to use the FileStager package. To activate a grid proxy certificate, do:

source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
voms-proxy-init -voms atlas -out $HOME/.globus/gridproxy.cert
export X509_USER_PROXY=${HOME}/.globus/gridproxy.cert
This way, your grid proxy will also work when submitting batch jobs, since the certificate is not stored in the local /tmp directory.

You can check with the command

grid-proxy-info
whether you have a valid, working grid proxy certificate.

Defining a grid sample:

Go back to

cd Database/FileStager/
and do
ls -l scripts/

If you want to run over files on a particular tier site, you can do:

define_dq2_sample [-n sampleName] dq2-sample [dq2-destination]
where dq2-sample is a sample found with dq2-ls. The list of available dq2-destinations can be seen using
dq2-destinations
By default, define_dq2_sample searches for files at CERN-PROD_MCDISK.

A second command exists in case you want to run over files stored on castor, at CERN. Do:

define_castor_sample [-n <sampleName>] <castorDirectory>
It does a similar job (at CERN) as define_dq2_sample, for files in castor directories.

e.g. try:

define_dq2_sample valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768
define_dq2_sample -n top valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768

or at CERN:

define_castor_sample /castor/cern.ch/grid/atlas/tzero/prod2/perm/fdr08_run2/physics_Egamma/0052283/fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10/
define_castor_sample -n 52283 /castor/cern.ch/grid/atlas/tzero/prod2/perm/fdr08_run2/physics_Egamma/0052283/fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10/

Perform the second and fourth commands, and take a look at the file:

samples/52283.def
Apart from some flags (which you can ignore), it contains the srm grid filenames of the files stored in the castor directory given above. Note that these names are preceded by the prefix:
"gridcopy://" 
This prefix is used by the FileStager package to trigger the filestager protocol. These sample definition files will be used below as input to the filestager examples.
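
For illustration, an entry in samples/52283.def looks something like this (one file per line; the exact filenames will differ):

gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/prod2/perm/fdr08_run2/physics_Egamma/0052283/fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10/fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10._0001.1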

Grid collection names starting with "lfn" can be staged too, as long as they have the "gridcopy://" prefix. But note that these do not necessarily point to local files on the Tier-X, whereas srm files do.

Search for a few more dq2 samples, using dq2-ls, and use define_dq2_sample to get the corresponding srm filenames.

Finally, there is a similar script called define_rfio_sample, e.g.:

define_rfio_sample -n 52283 /castor/cern.ch/grid/atlas/tzero/prod2/perm/fdr08_run2/physics_Egamma/0052283/fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10/

This produces a sample definition file (samples/52283rfio.def) that can be used when the FileStager uses rfcp as the underlying copying command. See more below.

Running the FileStager

The FileStager makes local copies of the grid files you wish to run over. Behind the scenes, the stage manager by default uses lcg-cp to copy over grid collections. (The copy command and its arguments can be changed; see below.) The default settings should work for nearly everyone in ATLAS with a grid certificate.

General note: if the stager fails or seems to take forever, it means the underlying copy command hangs. Try doing a manual lcg-cp of the slow file to see what's going on. The FileStager is only as stable as the network access to the grid collections in question.

Before we go to the examples, first some general comments.

The TStageManager is a singleton class (i.e. there is only one instance), and can be retrieved anywhere in C++ by doing:

TStageManager& manager = TStageManager::instance();
By default, the filestager stages only one file ahead. This can be changed with the command:
manager.setPipeLength(2);
where the argument 2 means the stager stages two files ahead.

To turn on verbosity of the stager, there are two options:

manager.verboseWait();
tells you when you're waiting for a file to finish copying. It's useful to have this turned on by default.
manager.verbose();
gives all the stager output, which is quite a lot, and is only useful for debugging.

See more examples below.

Running in ROOT

Let's first run the example and then see what it's doing. Go to

cd Database/FileStager/run
and do
root -l
.L stagerExampleRoot.C
Example("../samples/top.def")
ROOT will now start running over the collections defined in the file "../samples/top.def", copying them one by one. The actual events in the collections are skipped over.

You will see output like:

TStageManager::getFile() : Waiting till <gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00001.pool.root.1> is staged.
and
TCopyChain::LoadTree() : Opened new file <gridcopy://srm//srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00001.pool.root.1>
Now processing event 0

When the job has finished, type

.q
and you should see:
TStageManager::print() : Successfully staged 1756 mb over 5 files.

Now take a look at what stagerExampleRoot.C is actually doing. At the start of the Example() function, the following lines are called:

// make sure ROOT picks up TCopyFile when filename starts with gridcopy://
gROOT->GetPluginManager()->AddHandler("TFile", "^gridcopy:", "TCopyFile","TCopyFile_cxx", "TCopyFile(const char*,Option_t*,const char*,Int_t)");
These make sure that ROOT associates filenames starting with "gridcopy://" with the class TCopyFile.

Next comes the configuration of the file stager:

    // turn on staging functionality
    TCopyChain::SetOriginalTChain(false);
    TCopyFile::SetOriginalTFile(false);
  
    // stager settings
    TStageManager& manager = TStageManager::instance();
    //manager.setCpCommand("cp");
    //manager.addCpArg("-f");
    manager.setInfilePrefix("gridcopy://");
    manager.setOutfilePrefix("file:");
    manager.setCpCommand("lcg-cp");
    manager.addCpArg("-v");
    manager.addCpArg("--vo");
    manager.addCpArg("atlas");
    manager.addCpArg("-t");
    manager.addCpArg("1200");
    //by default manager stores in $TMPDIR, or else /tmp ifndef
    //manager.setBaseTmpdir("/tmpdir");
    //manager.setPipeLength(1);

    // turn on verbosity
    if (kFALSE) manager.verbose();     // lots of output
    if (kTRUE)  manager.verboseWait(); // useful to see if your next file has not yet finished staging

TCopyFile and TCopyChain work just like TFile and TChain in ROOT, and handle file traffic via the TStageManager class. By default the file staging is turned off, so it is first turned on with these lines:

    // turn on staging functionality
    TCopyChain::SetOriginalTChain(false);
    TCopyFile::SetOriginalTFile(false);

Then follow the filestager settings:

    // stager settings
    TStageManager& manager = TStageManager::instance();
    //manager.setCpCommand("cp");
    //manager.addCpArg("-f");
    manager.setInfilePrefix("gridcopy://");
    manager.setOutfilePrefix("file:");
    manager.setCpCommand("lcg-cp");
    manager.addCpArg("-v");
    manager.addCpArg("--vo");
    manager.addCpArg("atlas");
    manager.addCpArg("-t");
    manager.addCpArg("1200");
    //by default manager stores in $TMPDIR, or else /tmp ifndef
    //manager.setBaseTmpdir("/tmpdir");
    //manager.setPipeLength(1);
In the example, TStageManager is configured to use the underlying copy command lcg-cp, with the options -v --vo atlas -t 1200. Before copying, the string "gridcopy://" is stripped off the input collection name, and the prefix "file:" is put in front of the output collection name, e.g. resulting in:
lcg-cp -v --vo atlas -t 1200 srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/prod2/perm/fdr08_run2/physics_Egamma/0052283/fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10/fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10._0001.1 file:/tmp/tcf_fdr08_run2.0052283.physics_Egamma.merge.AOD.o3_f8_m10._0001.1
For example, by simply doing:
    // stager settings
    TStageManager& manager = TStageManager::instance();
    manager.setInfilePrefix("gridcopy://");
    manager.setCpCommand("rfcp");
one configures the filestager to use rfcp as the underlying copying command.

The verbosity can be changed with the options

    // turn on verbosity
    if (kFALSE) manager.verbose();     // lots of output
    if (kTRUE)  manager.verboseWait(); // useful to see if your next file has not yet finished staging

Just for fun, turn on

manager.verbose()
and rerun the example. You'll see all the lcg-cp commands being performed to copy over the grid collections. Since the copy commands and the ROOT program run independently, the output on the screen can look like a big mess.

The remaining code in stagerExampleRoot.C adds the input collections to TCopyChain and loops over the events. It uses TCopyChain (which uses TCopyFile) just like one would use the TChain and TFile classes in ROOT.
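
For reference, here is a minimal sketch of such a loop, assuming the AOD tree name CollectionTree and illustrative filenames (the real macro reads its file list from the sample definition file):

void MiniStagerExample() {
  // associate "gridcopy://" filenames with TCopyFile (as shown above)
  gROOT->GetPluginManager()->AddHandler("TFile", "^gridcopy:", "TCopyFile",
      "TCopyFile_cxx", "TCopyFile(const char*,Option_t*,const char*,Int_t)");

  // turn on staging functionality
  TCopyChain::SetOriginalTChain(false);
  TCopyFile::SetOriginalTFile(false);

  // build and loop over the chain exactly as one would with a TChain
  TCopyChain chain("CollectionTree");
  chain.Add("gridcopy://srm://some.site/some/path/AOD._00001.pool.root.1");
  chain.Add("gridcopy://srm://some.site/some/path/AOD._00002.pool.root.1");

  for (Long64_t i = 0; i < chain.GetEntriesFast(); ++i) {
    if (chain.LoadTree(i) < 0) break;  // opening a new file triggers staging
    // chain.GetEntry(i);              // read and process the event here
  }
}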

Running in Athena

The other example in the run directory, stagerExampleAthena.py, uses the FileStager to loop over grid collections in Athena. By default it's configured to run over the list of samples defined in "../samples/52283.def". Change it if you like. Run the script by doing:

cd FileStager/run/
athena stagerExampleAthena.py
and follow the output on the screen. Again, the script is simply looping over the collections, not doing anything with the events.

You'll see output like:

doStaging? True
FileStager.FileStagerTool() : Waiting till <gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00001.pool.root.1> is staged.
and
|=/***** Algorithm FileStagerAlg/FileStager **********************************************************
| |-AuditAlgorithms        = False
| |-AuditBeginRun          = False
| |-AuditEndRun            = False
| |-AuditExecute           = False
| |-AuditFinalize          = False
| |-AuditInitialize        = False
| |-AuditReinitialize      = False
| |-BaseTmpdir             = '/tmp/mbaak'  (default: '')
| |-CpArguments            = ['-v', '--vo', 'atlas', '-t', '1200']  (default: [])
| |-CpCommand              = 'lcg-cp'  (default: 'lcg-cp')
| |-Enable                 = True
| |-ErrorCount             = 0
| |-ErrorMax               = 1
| |-FirstFileAlreadyStaged = True  (default: False)
| |-InfilePrefix           = 'gridcopy://'  (default: 'gridcopy://')
| |-InputCollections       = ['gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00001.pool.root.1', 'gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00002.pool.root.1', 'gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00005.pool.root.1', 'gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00003.pool.root.1', 'gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00004.pool.root.1']
| |                        (default: [])
| |-MonitorService         = 'MonitorSvc'
| |-OutfilePrefix          = 'file:'  (default: 'file:')
| |-OutputCollections      = []  (default: [])
| |-OutputLevel            = 0
| |-PipeLength             = 1
| |-TreeName               = 'CollectionTree'
| |-VerboseStager          = False
| |-VerboseWaiting         = True
| \----- (End of Algorithm FileStagerAlg/FileStager) -------------------------------------------------
Note again the lcg-cp command, its settings, and the list of input collections to be staged by the FileStager algorithm.

Furthermore, you see the temporary collections being opened by the Athena event looper:

POOL> Unknown storage type requested:
ImplicitCollection  Warning Cannot find persistency storage type. Trying ROOT_StorageType
/tmp/mbaak/tcf_AOD.022768._00002.pool.root.1   Always Root file version:51800
/tmp/mbaak/tcf_AOD.022768._00002.pool.root.1   Always Root file version:51800
PoolSvc           WARNING File is not in Catalog or does not exist.
PoolSvc           WARNING Do not allow this ERROR to propagate to physics jobs.
and the stage manager waiting for the next file to finish copying before Athena continues:
TStageManager::getFile() : Waiting till <gridcopy://srm://srm-atlas.cern.ch/castor/cern.ch/grid/atlas/tzero/atlasmcdisk/valid3/AOD/valid3.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s429_b38_r461_tid022768/AOD.022768._00005.pool.root.1> is staged.

And at the end of the job:

TStageManager::print() : Successfully staged 1756 mb over 5 files.

Now take a look at stagerExampleAthena.py.

The input files need to be staged before they get picked up by the EventSelector. Since the (or any) FileStager algorithm is run after the EventSelector, the first file needs to be staged by a separate tool, 'stagetool'. All subsequent files are copied over by the FileStager algorithm. Again, the copying of one file starts when the previous file is being opened. If the FileStager has not yet finished copying a file before it is needed, it will wait until the copying has finished. The EventSelector then picks up the local file copy.

The relevant section that configures the file stager is much like the settings in ROOT. It comes right below the line

job = AlgSequence()
and reads:
## File with input collections
sampleFile = "/afs/cern.ch/user/m/mbaak/muwork/workarea/14.1.0/Database/FileStager/samples/top.def"
# Alternative:
# myInputCollections = ["gridcopy://foo","gridcopy://bar","gridcopy://bal"]

## Import file stager classes
from FileStager.FileStagerConf import FileStagerAlg
from FileStager.FileStagerTool import FileStagerTool

## process sample definition file
stagetool = FileStagerTool(sampleFile=sampleFile)
#Alternative:
#stagetool = FileStagerTool(sampleList=myInputCollections)

## check if collection names begin with "gridcopy"
print "doStaging?", stagetool.DoStaging()

if stagetool.DoStaging():
  EventSelector.InputCollections = stagetool.GetStageCollections()
else:
  EventSelector.InputCollections = stagetool.GetSampleList()

## filestageralg needs to be the first algorithm added to the job.
if stagetool.DoStaging():
   job += FileStagerAlg('FileStager')
   job.FileStager.InputCollections = stagetool.GetSampleList()
   #job.FileStager.PipeLength = 2
   #job.FileStager.VerboseStager = True
   job.FileStager.BaseTmpdir    = stagetool.GetTmpdir()
   job.FileStager.InfilePrefix  = stagetool.InfilePrefix
   job.FileStager.OutfilePrefix = stagetool.OutfilePrefix
   job.FileStager.CpCommand     = stagetool.CpCommand
   job.FileStager.CpArguments   = stagetool.CpArguments
   job.FileStager.FirstFileAlreadyStaged = stagetool.StageFirstFile

The actual file staging happens in two steps, using FileStagerTool and FileStagerAlg. FileStagerTool does the staging of the very first grid collection, before FileStagerAlg can be run in the Athena event looper.

The FileStagerTool

stagetool = FileStagerTool(sampleFile=sampleFile)
processes the collections in the sample definition file. (It can also process lists of files that do not need to be staged.) It passes the list of (to-be-staged) temporary filenames to the EventSelector, and the list of grid collections to the FileStager algorithm, which does the actual file copying.

The FileStagerTool also sets the location where temporary files need to be copied to.

Copy this snippet of code into any jobOptions file of yours to incorporate the file-staging functionality.

Finally, the file:

stagerExampleAthenaRFCP.py
gives an example of how to use the stager in Athena with the copy command rfcp.
## configure the stager
stagetool.CpCommand = "rfcp"
stagetool.CpArguments = []
stagetool.OutfilePrefix = ""
stagetool.checkGridProxy = False
Use this script in combination with the output from the define_rfio_sample script.

Submitting a batch job

Go to

cd FileStager/run/
and make a script batch.sh that includes
#!/bin/sh

## script for restarting grid proxy certificate
source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
export X509_USER_PROXY=${HOME}/.globus/gridproxy.cert
voms-proxy-init -voms atlas -noregen

source ~/cmthome/setup.sh -tag=14.2.24,32
cd ~/workarea/14.2.24/Database/FileStager
cd cmt 
source setup.sh
cd ../run
athena stagerExampleAthena.py
Be sure to adjust the directories. Note that the first few lines restart your grid certificate on the batch node. Then do
chmod +x batch.sh
bsub -q 1nh batch.sh
This should give you a running batch job that uses the filestager. The output will appear in the LSFJOB directory.

FileStager and Ganga

Two things need to be done to get the FileStager working with Ganga.

First. We will use the script

run/stagerExampleAthenaRFCP.py
To use Ganga, you will need to give the exact location of your sample definition file in stagerExampleAthenaRFCP.py, e.g.:
sampleFile = "/afs/cern.ch/user/m/mbaak/muwork/workarea/14.2.24/Database/FileStager/samples/52283rfio.def"

Second. Currently there is a problem in Athena 14.x.y which introduces some Python conflicts. As a workaround, you have to remove /afs/cern.ch/sw/lcg/external/Python/2.5/slc4_ia32_gcc34/lib/python2.5 from the Python path variable $PYTHONPATH. To do so, first print it via

echo $PYTHONPATH

Then take the above entry away and set the remaining value again:

export PYTHONPATH=...

We change to the run directory and create a file named gangaStager.py, containing

config["Athena"]["ATLAS_SOFTWARE"] = "/afs/cern.ch/atlas/software/releases"
j = Job()
j.name='AnalysisExample'
j.application=Athena()
j.application.exclude_from_user_area=["*.o","*.root*","*.exe"]
j.application.prepare(athena_compile=True)
j.application.option_file='$HOME/muwork/workarea/14.2.24/Database/FileStager/run/stagerExampleAthenaRFCP.py'
j.application.max_events='-1'
j.backend=LSF(queue='1nh')
j.submit()
Run Ganga with this example as you are used to. In Ganga, do:
execfile("gangaStager.py")

Note on performance

Note that these examples are not realistic, since we are not actually processing any events; we just skip over them. In reality, when you are doing analysis, the average processing time per event is perhaps a few milliseconds. At several thousand events per file, it then takes more than ~20 s to process one file, in which time the next file should have finished staging. So you will experience little or no time loss from staging input files.
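
For illustration, with round numbers: at 5 ms per event and 5000 events per file, processing one file takes about 25 seconds. The five files staged in the examples above total 1756 MB, i.e. roughly 350 MB per file, so a sustained transfer rate of around 15 MB/s is already enough to have the next file staged before it is needed.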

In fact, running the stager is often as fast as running the same job over files on a local hard disk, and is usually faster than running over files on AFS.

In a simple test with FDR1 AODs on castor, running a Z->ee analysis that reads out trigger information, the filestager gave a speed increase of more than a factor of 9 compared with running over the same files using the rfio protocol.

Setting the temporary directory & file deletion:

The tmp directory, which is used for storing the temporary local copies, needs to have a few GB of disk space available. By default, tmp files are stored in $TMPDIR, or in $WORKDIR on lxbatch machines. If this directory does not exist, the stage manager falls back on /tmp.

Under normal circumstances, temporary files are deleted right after they have been used. In addition, a monitoring job, StageMonitor.exe, is started by the TStageManager. StageMonitor.exe makes sure all temporary files get properly deleted if the original application using the stage manager hangs.

To see this in action, start an xterm

xterm &
and, in the first shell, rerun
athena stagerExampleAthena.py
Over the course of the next minute or so, take a look at $TMPDIR. Staged collections should start appearing.
ls -ltr $TMPDIR
Then, after a while, in the xterm do:
ps -ef | grep StageMonitor
You will see something like this:
mbaak    26748 25909  1 23:59 ?        00:00:00 StageMonitor.exe 25909 /tmp/mbaak /tmp/mbaak
Then do
^C
in the window that runs athena, and after that keep track of the files in $TMPDIR and of the StageMonitor process. You will find that StageMonitor stops too, and in doing so deletes the remaining temporary collections in $TMPDIR.

Common problems:

  • Make sure your grid scripts are initialized. Without them, the stager will crash.
  • The temporary directory does not have enough disk space. Be sure to always have a few GB of temporary disk space for copying over collections; incomplete collections will make your application hang.
  • The filestager takes forever to copy over a file. Kill your application and do:
lcg-cp -v --vo atlas srm://hangingfileyouwanttostage file:/tmp/foo.root
... which is simply the underlying staging command. Ask your local grid expert why this isn't working. Most likely it's a grid-site problem, or else a local network problem. Note that in the staging examples lcg-cp times out after 1200 seconds; this is set with the option -t 1200.

Further documentation:

See "FileStager tutorial" under:

FileStager rfio / xrootd comparison test:

-- MaxBaak - 29 Jul 2008
