5.13 Using Xrootd Service (AAA) for Remote Data Access
Goals of this workbook page
This page describes Any Data, Anytime, Anywhere (AAA), CMS's implementation of a generic xrootd service for analyzing disk-resident CMS data located at any grid site with bare ROOT or the CMSSW/FWLite environment, without downloading it to your local storage. Much effort has been invested in having ROOT and CMSSW read remote files efficiently, so that you will be able to analyze data without knowing whether the input file is on your computer or halfway around the world! AAA also allows for greater resilience against damaged or missing input files, and for greater use of opportunistic resources.
Introduction to the AAA Service
To access a particular file, no matter where in the world it is, you only need to know the Logical File Name (LFN) of the file. The LFN uniquely identifies any file within the /store directory tree across all of CMS storage. The LFNs of files that are in defined CMS datasets can be found through the DAS service (see WorkBook Chapter 5.4). For files that are not in official datasets, such as those in /store/user or /store/group, you will have to know the actual file names yourself. Examples below demonstrate how files can be accessed through their LFNs in various contexts. What you don't need to know is the actual physical location of the file: based on the LFN, the system looks up the physical location using a "redirector" that queries potential locations for you and points your application to a valid location without any intervention from you. Your application then proceeds as it would if your file were local.
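As a rough illustration of how an LFN turns into something an application can open, the sketch below builds an xrootd URL by placing a redirector host in front of the LFN. The function name lfn_to_url is hypothetical, and the check that an LFN starts with /store/ is an assumption based on the description above, not an official validation rule.

```python
def lfn_to_url(lfn, redirector="cmsxrootd.fnal.gov"):
    """Build an xrootd URL of the form root://<redirector>/<LFN>.

    Hypothetical helper for illustration only; it does not contact any server.
    """
    # Assumption: every LFN lives under the /store directory tree.
    if not lfn.startswith("/store/"):
        raise ValueError("an LFN is expected to start with /store/: %r" % lfn)
    # The LFN keeps its own leading slash, so the URL shows '//' after the host.
    return "root://%s/%s" % (redirector, lfn)


print(lfn_to_url("/store/path/to/file"))
# prints root://cmsxrootd.fnal.gov//store/path/to/file
```

The resulting string only names the file; whether it can actually be read still depends on a valid grid proxy and on the file being on disk somewhere.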
Note that your file must be disk-resident rather than archived on tape for this to work. If the file only lives at a site that includes "MSS" or "Buffer" in the site name, then it is only available on tape and would need to be staged to disk first for use via AAA.
Interested users can consult CmsXrootdArchitecture for technical details on the AAA implementation.
Warning for LXPLUS users
Currently LXPLUS at CERN seems to be using an unusual IPv6 configuration, which confuses certain XrootD releases (like the 4.0.4 in CMSSW 7-8). If you get the error
No servers are available to read the file
when you would expect the file to be reachable, you can try setting an environment variable that forces the use of IPv4.
If you are using bash, run
export XRD_NETWORKSTACK=IPv4
If you are using tcsh, run
setenv XRD_NETWORKSTACK IPv4
Quick steps to analyze remote data located in remote Tier-2 sites
Have a valid grid proxy
To use AAA, you MUST have a valid grid proxy with a valid VOMS extension for CMS.
This requires that you already have a grid certificate installed (see Chapter 5 of the CMS workbook). The needed grid proxy is obtained via the usual command
voms-proxy-init --voms cms
Note that neither grid-proxy-init nor a plain voms-proxy-init without the --voms cms option will work. Nor will it work to let xrootd ask for the passphrase and create a proxy internally, since it uses grid-proxy-init.
The error
[FATAL] Redirect limit has been reached
is usually due to the proxy errors above. However, it can also be caused by a problem elsewhere on the grid; you can use the DebugXrootd options listed below to try to understand where the failure is.
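Before chasing a redirect error, it can help to confirm that a proxy file exists where the tools will look for it. The sketch below is a minimal check under two assumptions: that the XXXX in /tmp/x509up_uXXXX is your numeric user ID, and that $X509_USER_PROXY, when set, takes precedence. The function names are hypothetical. It does not verify the VOMS extension or the remaining lifetime; "voms-proxy-info -all" remains the tool for that.

```python
import os


def proxy_path(environ=None, uid=None):
    """Return the path where a grid proxy is expected to live."""
    environ = os.environ if environ is None else environ
    uid = os.getuid() if uid is None else uid
    # Assumption: $X509_USER_PROXY wins; otherwise fall back to /tmp/x509up_u<uid>.
    return environ.get("X509_USER_PROXY") or "/tmp/x509up_u%d" % uid


def proxy_file_present(path):
    """True if a non-empty file exists at the given location."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


path = proxy_path()
if proxy_file_present(path):
    print("proxy file found at", path)
else:
    print("no proxy file at", path, "- try: voms-proxy-init --voms cms")
```

A file being present only rules out the simplest failure; an expired proxy or one created without the CMS VOMS extension will still be rejected.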
Know your redirector
As stated above, when you attempt to open a file, your application must query a redirector to find the file. You must specify the redirector to the application. Which redirector you use depends on your region, to minimize the distance over which the data must travel and thus minimize the reading latency. These "regional" redirectors will try file locations in your region first before trying to go overseas.
If you are working in the US, it is best to use cmsxrootd.fnal.gov, while in Europe and Asia, it is best to use xrootd-cms.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
In the examples below, cmsxrootd.fnal.gov is always used, but feel free to replace that with a choice more appropriate for your region.
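The regional choice described above can be written down as a small lookup table. In this sketch the region keys ("us", "europe", "asia") and the function name pick_redirector are made up for illustration; only the three host names come from the text. Falling back to the global redirector for unknown regions is a design choice of the sketch, not a documented rule.

```python
# Host names as listed above; the region keys are hypothetical labels.
REDIRECTORS = {
    "us": "cmsxrootd.fnal.gov",
    "europe": "xrootd-cms.infn.it",
    "asia": "xrootd-cms.infn.it",
}
GLOBAL_REDIRECTOR = "cms-xrd-global.cern.ch"


def pick_redirector(region):
    """Return the regional redirector, or the global one for unknown regions."""
    return REDIRECTORS.get(region.strip().lower(), GLOBAL_REDIRECTOR)


for region in ("US", "Europe", "elsewhere"):
    print(region, "->", pick_redirector(region))
```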
Open a file using ROOT
If you are using bare ROOT, you can open files in the xrootd service just like you would any other file:
TFile *f = TFile::Open("root://cmsxrootd.fnal.gov//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root");
Note the prefix root://cmsxrootd.fnal.gov/ (or any other redirector name) in front of your LFN. This returns a TFile object, and you can proceed normally. The same is true for the FWLite environment.
BEWARE: do not use the apparently equivalent syntax, which is known not to work:
TFile("root://cmsxrootd.fnal.gov//store/foo/bar")
BEWARE: This syntax also will fail a large percentage of the time for files accessed through xrootd:
root root://cmsxrootd.fnal.gov//store/foo/bar
Open a file in CMSSW
You want to edit the PoolSource line in your python configuration file to point directly at the Xrootd service, instead of using a generic LFN.
For example, this might be the "before" picture:
process.source = cms.Source("PoolSource",
    # replace 'myfile.root' with the source file you want to use
    fileNames = cms.untracked.vstring('/store/myfile.root')
)
Here's the same file, but accessed through the Xrootd Service by simply adding the prefix root://cmsxrootd.fnal.gov/ :
process.source = cms.Source("PoolSource",
    # replace 'myfile.root' with the source file you want to use
    fileNames = cms.untracked.vstring('root://cmsxrootd.fnal.gov//store/myfile.root')
)
Note that if your site has fallback configured, as described here, you don't even need to make the above change -- CMSSW will automatically read the file from a remote site!
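When a configuration lists many input files, editing each entry by hand becomes tedious. Since the CMSSW configuration is itself Python, one possible approach is to prefix the LFNs programmatically. The helper with_redirector below is hypothetical, and the commented lines at the end only indicate how the result might be handed to PoolSource; they are left as comments so the sketch runs outside a CMSSW environment.

```python
def with_redirector(file_names, redirector="cmsxrootd.fnal.gov"):
    """Prefix plain LFNs with root://<redirector>/ and leave other entries alone."""
    prefixed = []
    for name in file_names:
        if name.startswith("/store/"):
            prefixed.append("root://%s/%s" % (redirector, name))
        else:
            # Already a URL (root://...) or a local file: path; keep unchanged.
            prefixed.append(name)
    return prefixed


file_names = with_redirector(['/store/myfile.root'])
print(file_names)

# Inside a CMSSW configuration the list could then be used like this:
# process.source = cms.Source("PoolSource",
#     fileNames = cms.untracked.vstring(*file_names)
# )
```

Entries that are not plain LFNs pass through untouched, so applying the helper twice does not double the prefix.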
Let CRAB find your file
Since you are now able to read data from any location, your CRAB job doesn't necessarily have to run at the same site where the data actually resides. This potentially gives your grid jobs access to a much wider range of grid sites -- indeed, any in CMS.
To allow CRAB2 to ignore the location of your dataset and just run where there is an open batch slot, include the line
data_location_override = sites list
in the [GRID] block of the configuration file.
Here, sites list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For example,
data_location_override = T2_US
will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.
In CRAB3, the equivalent line in the configuration file will be
config.Data.ignoreLocality = True
This option is false by default, but users can explore its behavior.
Open a file in Condor Batch or CERN Batch
Condor
To use a local Condor batch system to analyze user/group skims located at remote sites, the only modification needed is to add
use_x509userproxy = true
to your condor jdl file (the file which defines universe, Executable, etc.).
For OLDER versions of HTCondor (before 8.0.0), you need:
x509userproxy = /tmp/x509up_uXXXX
Here /tmp/x509up_uXXXX is the value shown in the "path:" line of the output of "voms-proxy-info -all", i.e. the file that contains your valid grid proxy. Condor will pass this information to the worker node of the condor batch system.
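To make the two variants concrete, the sketch below renders a minimal submit description as text. The function name, the executable name run_analysis.sh and the choice of the vanilla universe are assumptions for illustration. The text above does not say whether older HTCondor versions need the explicit path in addition to or instead of use_x509userproxy; this sketch assumes "instead of".

```python
def submit_description(executable, proxy_path=None):
    """Render a minimal HTCondor submit description that forwards a grid proxy.

    If proxy_path is given, the explicit x509userproxy line is written
    (older HTCondor); otherwise use_x509userproxy = true is used.
    """
    lines = [
        "universe = vanilla",
        "executable = %s" % executable,
    ]
    if proxy_path is None:
        lines.append("use_x509userproxy = true")
    else:
        lines.append("x509userproxy = %s" % proxy_path)
    lines.append("queue")
    return "\n".join(lines) + "\n"


# 'run_analysis.sh' is a placeholder executable name.
print(submit_description("run_analysis.sh"))
print(submit_description("run_analysis.sh", "/tmp/x509up_uXXXX"))
```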
CERN Batch
Jobs submitted to the CERN batch farm will look for a valid grid proxy in the location pointed to by the environment variable $X509_USER_PROXY. (If $X509_USER_PROXY is not set, Xrootd looks for the proxy in the default location in /tmp.)
To make your proxy available to the lxbatch jobs, first copy your proxy to an area in AFS, for example your home directory:
cp /tmp/x509up_uXXXX /afs/cern.ch/user/u/username/
If you are submitting a job with bsub myscript.sh, then in myscript.sh, set this environment variable:
export X509_USER_PROXY=/afs/cern.ch/user/u/username/x509up_uXXXX
File download with command-line tools
If for some reason (perhaps intensive debugging of a particular event) you wish to have the file available locally, AAA also provides a command-line tool called xrdcp. This utility ships with stand-alone ROOT and CMSSW, and provides a much easier way to copy a grid file than lcg-cp, srm-cp, or the FileMover utility.
xrdcp root://cmsxrootd.fnal.gov//store/path/to/file /some/local/path
Turn On/Off xrootd Debug
The xrootd debug mode is controlled by an environment variable, which can be turned on or off. Run the commands inside the quotes (' ') once at the command line to switch debugging on or off, or set up the aliases in your login files so that they are available at your next login:
If using bash, add to .bash_profile:
alias xrddebugon='export XRD_LOGLEVEL=Debug'
alias xrddebugoff='unset XRD_LOGLEVEL'
If using tcsh, add to .tcshrc:
alias xrdDebug 'setenv XRD_LOGLEVEL Debug'
alias xrdDebugOff 'unsetenv XRD_LOGLEVEL'
Access a file from one specific site
In some special cases, when there are multiple replicas around, you may want to bypass the redirectors and access a file at a specific site. Here is an example of how to do that:
- 1. locate file replicas
dasgoclient -query="site file=/store/data/Run2018A/EGamma/MINIAOD/UL2018_MiniAODv2-v1/50000/8D399CEC-A51E-004C-B4F5-D74B70706892.root"
returns
T1_FR_CCIN2P3_Tape
T1_US_FNAL_Disk
T2_IN_TIFR
T3_US_FNALLPC
- 2. Prepend /store/test/xrootd/SITENAME to the logical filename, e.g. /store/test/xrootd/T3_US_FNALLPC/store/data/Run2018A/EGamma/MINIAOD/UL2018_MiniAODv2-v1/50000/8D399CEC-A51E-004C-B4F5-D74B70706892.root. This forces the xrootd redirector to look only at the specified site, because only that site will have a path /store/test/xrootd/SITENAME (which is used for SAM tests). This may not work for some T3 sites. Sites listed with _Tape in the name cannot be read from. For sites listed with _Disk in the name, remove _Disk.
- 3. Test with:
xrdfs root://cmsxrootd.fnal.gov/ ls -l /store/test/xrootd/T3_US_FNALLPC/store/data/Run2018A/EGamma/MINIAOD/UL2018_MiniAODv2-v1/50000/8D399CEC-A51E-004C-B4F5-D74B70706892.root
returns:
-r-- 2022-05-31 21:08:09 4159912199 /store/test/xrootd/T3_US_FNALLPC/store/data/Run2018A/EGamma/MINIAOD/UL2018_MiniAODv2-v1/50000/8D399CEC-A51E-004C-B4F5-D74B70706892.root
- 4. Caveat: Sites are advised to make /store/test/xrootd/ a link back to /store, but the only requirement is that the SAM datasets be accessible via this path. In the past, some sites preferred to copy the SAM dataset there manually. So the above should work with most sites but is not guaranteed to work for all of them.
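Step 2 of the recipe above can be sketched as a small path-rewriting function. The name site_specific_lfn is hypothetical, and treating the tape/disk markers as case-insensitive suffixes of the site name is an assumption based on the dasgoclient output shown in step 1. The function only builds the path; whether the site actually serves it is subject to the caveat in step 4.

```python
def site_specific_lfn(lfn, site):
    """Build the /store/test/xrootd/SITENAME/... form of an LFN for one site."""
    # Assumption: tape and disk endpoints are marked by a name suffix.
    if site.upper().endswith("_TAPE"):
        raise ValueError("tape-only site cannot be read from: %s" % site)
    if site.upper().endswith("_DISK"):
        site = site[:-len("_Disk")]  # e.g. T1_US_FNAL_Disk -> T1_US_FNAL
    return "/store/test/xrootd/%s%s" % (site, lfn)


lfn = "/store/data/Run2018A/EGamma/MINIAOD/UL2018_MiniAODv2-v1/50000/8D399CEC-A51E-004C-B4F5-D74B70706892.root"
print(site_specific_lfn(lfn, "T3_US_FNALLPC"))
print(site_specific_lfn(lfn, "T1_US_FNAL_Disk"))
```

The returned path is still an LFN-style name, so it is used with a redirector prefix in the same way as any other file.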
Where is "anywhere"?
The data at any CMS disk site are currently available through AAA.
If the file only lives at a site that includes "MSS" or "Buffer" in the site name, then it is only available on tape and would need to be staged to disk first for use via AAA.
Further, there is no guarantee of availability for files only on T3 sites. Only T1_*_Disk and T2_* sites are required to be available for AAA read at all times.
If you wish to check whether your desired file is actually accessible through AAA, execute the command
xrdfs cms-xrd-global.cern.ch locate /store/path/to/file
As long as you do not get the message
No servers have the file
it is safe for you to use the AAA service!
Note that if a location is reported as an IP address, you can find out which site it belongs to with the Unix command nslookup.
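The availability check above can be wrapped in a script when many files need to be tested. The sketch below is one way to do it: the function names are hypothetical, the runner argument exists only so the logic can be exercised without network access, and treating a non-zero exit status as "not available" is an assumption beyond what the text states.

```python
import subprocess


def locate_command(lfn, redirector="cms-xrd-global.cern.ch"):
    """Return the xrdfs command line used to locate a file."""
    return ["xrdfs", redirector, "locate", lfn]


def is_available(lfn, redirector="cms-xrd-global.cern.ch", runner=subprocess.run):
    """Run 'xrdfs <redirector> locate <lfn>' and interpret the result."""
    result = runner(locate_command(lfn, redirector), capture_output=True, text=True)
    output = result.stdout + result.stderr
    # Assumption: a failing exit status also means the file cannot be used.
    return result.returncode == 0 and "No servers have the file" not in output


print(" ".join(locate_command("/store/path/to/file")))
```

Calling is_available requires the xrdfs client on the PATH and, as everywhere else on this page, a valid grid proxy.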
Support
If there is any problem, please post to the Computing Tools hypernews with the printout from the debugging command below:
xrdcp -d 1 -f root://cmsxrootd.fnal.gov//store/<path-to-file> /dev/null
where <path-to-file> is your file path, which usually starts with /store/.
Additionally, one can reach the xrootd experts by writing to hn-cms-wanaccess@cern.ch.
Review status
Responsible: BrianBockelman, JieChen