FAX for end-users

FAX (Federated ATLAS Xrootd) lets you access most ATLAS data from anywhere without needing to know where the files are stored. It relies on a gLFN (global logical file name) convention: when you need a file, you specify only its gLFN, and FAX finds a site that holds the file and delivers it to you. More than 40 of the largest ATLAS sites have been federated, and more sites will join soon.

Why use FAX?

  • FAX simplifies your code: file naming stays the same whether your code runs interactively on your laptop, on your Tier3 (even one with no disk space), on a batch farm, or on the grid.
  • FAX allows for recovery: if a file is missing from one site (because the site is down or has another problem), there is a good chance it exists somewhere else in the federation, and FAX will deliver it from there.
  • FAX allows the use of more advanced tools such as frun.

Prerequisites

Datasets must be registered in DDM (Rucio). All grid-produced datasets are registered automatically, whether they come from official production or are simply the result of a user's job. If your files are not registered, it is easy to register them; a very detailed description of how to do this is given here.

Usage

Set up environment

All sites now have CVMFS mounted, so setting up FAX is easily done with localSetupFAX:

     export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
     source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
     localSetupFAX
    

As you can see from its output, localSetupFAX sets up the grid middleware, the Rucio client, and xrootd, and selects the best endpoint. If you want to choose the endpoint yourself, you can find the endpoint addresses and their current statuses here, then do:

export STORAGEPREFIX=root://endpointAddress:port/
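When setting the endpoint by hand, a quick format check can catch a missing port or trailing slash before the value is exported. The following is a minimal sketch; the function name is_valid_storageprefix is hypothetical and not part of the FAX tools, and it checks only the root://host:port/ shape shown above, not whether the endpoint is actually reachable.

```shell
# Hypothetical helper (not part of the FAX tools): succeed only if the
# argument has the form root://host:port/ with a numeric port.
is_valid_storageprefix() {
    case "$1" in
        root://*:*/) ;;          # must start with root:// and end with /
        *) return 1 ;;
    esac
    # Strip the scheme and exactly one trailing slash, then split host and port.
    _hostport=${1#root://}
    _hostport=${_hostport%/}
    _host=${_hostport%:*}
    _port=${_hostport##*:}
    [ -n "$_host" ] || return 1
    case "$_host" in
        */*) return 1 ;;         # host must not contain a path
    esac
    case "$_port" in
        ''|*[!0-9]*) return 1 ;; # port must be digits only
    esac
    return 0
}

# Example: export the prefix only if it passes the check.
candidate="root://fax.mwt2.org:1094/"
if is_valid_storageprefix "$candidate"; then
    export STORAGEPREFIX="$candidate"
else
    echo "STORAGEPREFIX should look like root://host:port/" >&2
fi
```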

You will also need a grid proxy, which requires a valid ATLAS grid certificate:

voms-proxy-init -voms atlas

Check availability of your dataset/datacontainer in FAX

Until all sites have joined the federation, it is important to have a simple way to check whether a dataset is available in FAX. The tool for this is called fax-is-dataset-covered and is available after running localSetupFAX. Usage is simple: just give it a dataset or datacontainer name. For each input dataset it prints the number of FAX endpoints holding complete and incomplete replicas. Example:

fax-is-dataset-covered data12_8TeV.periodH2.physics_Muons.PhysCont.NTUP_SMWZ.grp13_v01_p1067/
data12_8TeV.00212809.physics_Muons.merge.NTUP_SMWZ.f481_m1233_p1067_p1141_tid01012924_00    complete replicas: 1    incomplete: 0
data12_8TeV.00212858.physics_Muons.merge.NTUP_SMWZ.f481_m1233_p1067_p1141_tid01014068_00    complete replicas: 1    incomplete: 0
data12_8TeV.00212815.physics_Muons.merge.NTUP_SMWZ.f481_m1233_p1067_p1141_tid01012923_00    complete replicas: 1    incomplete: 0
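For large containers it can help to filter this output down to the datasets that have no complete replica. The sketch below assumes the line layout shown in the example above (dataset name, then "complete replicas: N", then "incomplete: M"); the function name list_uncovered_datasets is hypothetical, and the filter would need adjusting if the tool's output format differs.

```shell
# Hypothetical helper: read fax-is-dataset-covered output on stdin and print
# the names of datasets with zero complete replicas.
# Assumed line layout (taken from the example above):
#   <dataset>    complete replicas: <n>    incomplete: <m>
list_uncovered_datasets() {
    awk '$2 == "complete" && $3 == "replicas:" && $4 == 0 { print $1 }'
}

# Intended use (requires the FAX environment):
#   fax-is-dataset-covered <container>/ | list_uncovered_datasets
```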

Find the gLFNs of your files

Obtain the gLFNs (global logical filenames) using Rucio, which scales better with load and is faster and more robust than the older approaches. To print the Rucio gLFNs, use the following executable, which is available once FAX is set up (see above):

fax-get-gLFNs data12_8TeV.periodH2.physics_Muons.PhysCont.NTUP_SMWZ.grp13_v01_p1067/
root://fax.mwt2.org:1094//atlas/rucio/data12_8TeV:NTUP_SMWZ.01014068._000103.root.1
root://fax.mwt2.org:1094//atlas/rucio/data12_8TeV:NTUP_SMWZ.01014068._000123.root.1
:

The results should use your local $STORAGEPREFIX, which in this example is root://fax.mwt2.org:1094/.

In order to get a text file with the list of individual filenames, you can do:

  fax-get-gLFNs data12_8TeV.periodH2.physics_Muons.PhysCont.NTUP_SMWZ.grp13_v01_p1067/ > my_list_of_gLFNS.txt
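Because each gLFN begins with the $STORAGEPREFIX that was active when the list was made, a saved list can be pointed at a different endpoint by rewriting that prefix. This is a minimal sketch under that assumption; retarget_glfns is a hypothetical name, and the new prefix must not contain the characters | or &, which are special in the sed replacement used here.

```shell
# Hypothetical helper: replace the leading root://host:port/ of each gLFN
# read from stdin with the prefix given as the first argument.
retarget_glfns() {
    # $1: new prefix in the form root://host:port/
    sed "s|^root://[^/]*/|$1|"
}

# Intended use, with STORAGEPREFIX set to the new endpoint:
#   retarget_glfns "$STORAGEPREFIX" < my_list_of_gLFNS.txt > my_new_list.txt
```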

To use these files

  • from ROOT:

 
  TFile *f = TFile::Open("root://myRedirector:port//atlas/rucio/user/ivukotic:group.test.hc.NTUP_SMWZ.root"); 

  • locally:

  xrdcp $STORAGEPREFIX/atlas/rucio/user/ivukotic:group.test.hc.NTUP_SMWZ.root /tmp/myLocalCopy.root 

  • from prun: Instead of giving

--inDS myDataset

option, provide it with

--pfnList my_list_of_gLFNS.txt

(where the filelist my_list_of_gLFNS.txt specified here is generated using the example above)
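To copy every file in the list locally rather than one at a time, the xrdcp call above can be generated per gLFN. The sketch below only prints the commands so they can be reviewed first; print_xrdcp_commands is a hypothetical name, and it assumes the local file name is the part of the gLFN after the last colon (the scope separator in the examples above).

```shell
# Hypothetical helper: print one xrdcp command per gLFN (dry run).
print_xrdcp_commands() {
    # $1: text file with one gLFN per line; $2: destination directory
    while IFS= read -r glfn; do
        [ -n "$glfn" ] || continue        # skip empty lines
        name=${glfn##*:}                  # text after the last colon
        printf 'xrdcp %s %s/%s\n' "$glfn" "$2" "$name"
    done < "$1"
}

# Intended use:
#   print_xrdcp_commands my_list_of_gLFNS.txt /tmp        # review the commands
#   print_xrdcp_commands my_list_of_gLFNS.txt /tmp | sh   # then run them
```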

Using frun

If you use prun on local resources, please consider using frun.py instead. frun is a simple wrapper around prun: all it does is replace the inDS you give with the corresponding gLFNs. All other options are simply forwarded to prun. The main advantage of frun is that if files are missing at the site where the job runs, the FAX layer will provide them as long as they exist anywhere in the US or UK (hopefully this will soon be worldwide).

Example:

frun -v --inDS=user.ilijav.HCtest.1 --exec "echo %IN | sed -e \"s/,/\n/g\" > input.txt; ./doRead.sh" \
--athenaTag=17.2.0 \
--noBuild \
--outputs=input.txt,ilija.txt \
--outDS=user.ivukotic.HCtestREWRITTEN.1 \
--site=ANALY_CERN_XROOTD
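In the --exec string above, %IN is the placeholder that prun replaces with a comma-separated list of input files, and the sed expression turns that list into one name per line. The sketch below shows the same transformation in isolation, using tr as a portable alternative to the \n in the sed replacement; the file names and the function name split_input_list are made up for illustration.

```shell
# Hypothetical stand-in for the value substituted for %IN.
in_list="fileA.root,fileB.root,fileC.root"

# Turn a comma-separated list into one entry per line.
split_input_list() {
    printf '%s\n' "$1" | tr ',' '\n'
}

split_input_list "$in_list"
```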

Example analysis jobs running against FAX

To test the efficiency of FAX data access, to have a way to stress a site, and to have examples that can be used in tutorials, we came up with analysis examples that should be easy to run.

More information and Troubleshooting

A tutorial, found here (Tutorial.tar), contains a PowerPoint presentation on FAX, examples of its usage, and ready-to-test source code. If something does not work or you have any questions, please feel free to contact atlas-adc-fax-operations@cern.ch


Major updates:
-- IlijaVukotic - 05-Sep-2012


Topic revision: r30 - 2015-08-20 - unknown
 