Difference: WorkBookXrootdService (1 vs. 40)

Revision 40 2019-08-18 - NitishDhingra

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 11 to 11
 
Changed:
<
<
>
>
 
Line: 20 to 20
 
Added:
>
>
 
Line: 35 to 37
  Interested users can consult CmsXrootdArchitecture for technical details on AAA implementation.
Changed:
<
<
#lxplusWarning
>
>
 

Warning for LXPLUS users

Added:
>
>
Currently LXPLUS at CERN seems to be using a strange IPv6 configuration, which confuses certain XrootD releases (like the 4.0.4 in CMSSW 7-8). If you get a "No servers are available to read the file" error for a file you would expect to be reachable, you can try setting an environment variable which forces the use of IPv4.

If you are using bash, please do

Line: 180 to 175
 xrdcp root://cmsxrootd.fnal.gov//store/path/to/file /some/local/path
Added:
>
>

Turn On/Off xrootd Debug

One can turn the xrootd debug mode on or off via the XRD_LOGLEVEL environment variable.

If using bash, in .bash_profile please add:

alias xrddebugon='export XRD_LOGLEVEL=Debug'
alias xrddebugoff='unset XRD_LOGLEVEL'
If using tcsh, in .tcshrc please add:
alias xrdDebug 'setenv XRD_LOGLEVEL Debug'
alias xrdDebugOff 'unsetenv XRD_LOGLEVEL'
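For scripted use, the same toggle can be wrapped in a small Python helper. This is only a sketch (the context-manager name is hypothetical); the only thing taken from the aliases above is the XRD_LOGLEVEL variable itself:

```python
import os
from contextlib import contextmanager

@contextmanager
def xrd_debug(level="Debug"):
    """Temporarily enable xrootd client debugging via XRD_LOGLEVEL."""
    previous = os.environ.get("XRD_LOGLEVEL")
    os.environ["XRD_LOGLEVEL"] = level
    try:
        yield
    finally:
        # Restore the previous state so later commands are not flooded with output
        if previous is None:
            os.environ.pop("XRD_LOGLEVEL", None)
        else:
            os.environ["XRD_LOGLEVEL"] = previous

# Any xrdcp or ROOT call launched inside the block inherits the variable:
with xrd_debug():
    pass  # e.g. run the debugging xrdcp command from the Support section here
```

Child processes started inside the with block see XRD_LOGLEVEL=Debug, which is equivalent to running the xrddebugon alias first and xrddebugoff afterwards.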
 

Where is "anywhere"?

Line: 196 to 207
  where <path-to-file> is your file path which usually starts with /store/.
Added:
>
>
Additionally, one can reach the xrootd experts by writing to hn-cms-wanaccess@cern.ch.
 

Review status

Revision 39 2018-07-13 - LeonardoCristella

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 183 to 183
 

Where is "anywhere"?

Changed:
<
<
The data at three T1 sites and about 40 T2 sites are currently available through AAA. This is not in fact every site in CMS, but about 95% of datasets are available through AAA, and this amount is increasing as we add more sites into the system.
>
>
The data at any CMS site are currently available through AAA.
 
Changed:
<
<
These are the sites that do NOT currently make their data available:

  • T1 sites: Files that are only tape resident are not available.
  • T2 sites: T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan

If you wish to check if your desired file is actually available through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/path/to/file. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!

>
>
If you wish to check whether your desired file is actually accessible through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/path/to/file. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!
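For many files it can be convenient to script this check. The sketch below is hypothetical (the helper name and the sample output strings are invented); only the "No servers have the file" message is taken from the text above:

```python
def file_is_available(xrdfs_output: str) -> bool:
    """Interpret the output of `xrdfs cms-xrd-global.cern.ch locate <lfn>`.

    The fatal answer is the 'No servers have the file' message; any listing
    of server endpoints means at least one replica is reachable via AAA.
    """
    return "No servers have the file" not in xrdfs_output

# Illustrative (invented) outputs:
assert file_is_available("[::192.0.2.10]:1094 Server ReadWrite")
assert not file_is_available("[ERROR] ... No servers have the file")
```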
 

Support

If there's any problem, please post to the Computing Tools hypernews with the printout from the debugging command below.

Changed:
<
<
xrdcp -d 1 -f root://cmsxrootd.fnal.gov//store/path/to/file /dev/null
>
>
xrdcp -d 1 -f root://cmsxrootd.fnal.gov//store/<path-to-file> /dev/null
 
Added:
>
>
where <path-to-file> is your file path which usually starts with /store/.
 

Review status

Revision 38 2017-10-12 - StefanoBelforte

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 57 to 57
 

Have a valid grid proxy

Changed:
<
<
To use AAA, you MUST have a valid grid proxy. This requires that you already have a grid certificate installed (see Chapter 5 of the CMS workbook). The grid proxy is obtained via the usual command
>
>
To use AAA, you MUST have a valid grid proxy with a valid VOMS extension for CMS. This requires that you already have a grid certificate installed (see Chapter 5 of the CMS workbook). The needed grid proxy is obtained via the usual command
  voms-proxy-init --voms cms
Added:
>
>
Note that neither grid-proxy-init nor a simple voms-proxy-init without the --voms cms option will work. Neither will it work to let xrootd ask for the passphrase and create a proxy internally, since it uses grid-proxy-init.
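A script can verify up front that a usable proxy exists. The sketch below assumes only the standard voms-proxy-info --timeleft option; the helper names and the one-hour renewal threshold are illustrative:

```python
import subprocess

def proxy_seconds_left() -> int:
    """Remaining proxy lifetime in seconds; 0 if no proxy can be found."""
    try:
        out = subprocess.run(
            ["voms-proxy-info", "--timeleft"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        return int(out)
    except (subprocess.CalledProcessError, FileNotFoundError, ValueError):
        return 0

def needs_new_proxy(seconds_left: int, minimum: int = 3600) -> bool:
    # Renew whenever less than an hour remains
    return seconds_left < minimum

if needs_new_proxy(proxy_seconds_left()):
    print("run: voms-proxy-init --voms cms")
```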
 

Know your redirector

Revision 37 2017-08-23 - DmitrySosnov

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 110 to 110
  )
Changed:
<
<
Note that if your site has fallback configured, as described [ConfiguringFallback][here], you don't even need to make the above change -- CMSSW will automatically read the file from a remote site!
>
>
Note that if your site has fallback configured, as described here, you don't even need to make the above change -- CMSSW will automatically read the file from a remote site!
 

Let CRAB find your file

Revision 36 2016-09-08 - TommasoBoccali

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 11 to 11
 
Added:
>
>
 
Line: 34 to 35
  Interested users can consult CmsXrootdArchitecture for technical details on AAA implementation.
Added:
>
>
#lxplusWarning

Warning for LXPLUS users

Currently LXPLUS at CERN seems to be using a strange IPv6 configuration, which confuses certain XrootD releases (like the 4.0.4 in CMSSW 7-8). If you get a "No servers are available to read the file" error for a file you would expect to be reachable, you can try setting an environment variable which forces the use of IPv4.

If you are using bash, please do

export XRD_NETWORKSTACK=IPv4
If using tcsh, please do
setenv XRD_NETWORKSTACK IPv4
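The same variable can also be set from inside a Python or PyROOT session, provided it happens before the first remote file is opened (a sketch using only the variable introduced above):

```python
import os

# Must be set before the xrootd client library is first initialized,
# i.e. before the first TFile::Open or xrdcp call made by this process
os.environ["XRD_NETWORKSTACK"] = "IPv4"
```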

 

Quick steps to analyze remote data located in remote Tier-2 sites

Revision 35 2015-06-22 - RomainRougny

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 168 to 168
 These are the sites that do NOT currently make their data available:

  • T1 sites: Files that are only tape resident are not available.
Changed:
<
<
  • T2 sites: T2_BE_IIHE, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
>
>
  • T2 sites: T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
  If you wish to check if your desired file is actually available through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/path/to/file. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!

Revision 34 2015-05-19 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 93 to 93
  )
Added:
>
>
Note that if your site has fallback configured, as described [ConfiguringFallback][here], you don't even need to make the above change -- CMSSW will automatically read the file from a remote site!
 

Let CRAB find your file

Line: 165 to 167
  These are the sites that do NOT currently make their data available:
Changed:
<
<
  • T1 sites: T1_ES_PIC, T1_TW_ASGC
  • T2 sites: T2_BE_IIHE, T2_BR_UERJ, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
>
>
  • T1 sites: Files that are only tape resident are not available.
  • T2 sites: T2_BE_IIHE, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
  If you wish to check if your desired file is actually available through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/path/to/file. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!

Revision 33 2015-05-07 - WellsWulsin

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 173 to 173
 

Support

Changed:
<
<
If there's any problem, please post to the WAN access to CMS data hypernews with the print out from below debugging command.
>
>
If there's any problem, please post to the Computing Tools hypernews with the print out from below debugging command.
 
xrdcp -d 1 -f root://cmsxrootd.fnal.gov//store/path/to/file /dev/null

Revision 32 2015-05-06 - WellsWulsin

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 173 to 173
 

Support

Changed:
<
<
If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-computing-tools@cern.ch with the print out from below debugging command.
>
>
If there's any problem, please post to the WAN access to CMS data hypernews with the print out from below debugging command.
 
xrdcp -d 1 -f root://cmsxrootd.fnal.gov//store/path/to/file /dev/null

Revision 31 2015-01-26 - TommasoBoccali

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 59 to 59
 If you are using bare ROOT, you can open files in the xrootd service just like you would any other file:
Changed:
<
<
TFile *f =TFile::Open("root://cmsxrootd.fnal.gov//store/mc/JobRobot/RelValProdTTbar/GEN-SIM-DIGI-RECO/MC_3XY_V24_JobRobot-v1/0001/56E18353-982C-DF11-B217-00304879FA4A.root");
>
>
TFile *f =TFile::Open("root://cmsxrootd.fnal.gov///store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root");
 

Note the prefix root://cmsxrootd.fnal.gov/ (or any other redirector name) in front of your LFN. This returns a TFile object, and you can proceed normally. The same is true for the FWLite environment.
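Forming the full URL from a redirector and an LFN can be sketched as a small helper (the function name is hypothetical; the redirector and the double-slash layout follow the example above):

```python
def xrootd_url(lfn: str, redirector: str = "cmsxrootd.fnal.gov") -> str:
    """Prefix an LFN (which must start with /store/) with an xrootd redirector."""
    if not lfn.startswith("/store/"):
        raise ValueError("expected an LFN starting with /store/, got %r" % lfn)
    # root://<redirector>/ followed by /store/... yields root://host//store/...
    return "root://%s/%s" % (redirector, lfn)

url = xrootd_url("/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/"
                 "CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root")
# url can now be passed to TFile.Open(url) in PyROOT
```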

Revision 30 2015-01-23 - TommasoBoccali

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 49 to 49
  As stated above, when you attempt to open a file, your application must query a redirector to find the file. You must specify the redirector to the application. Which redirector you use depends on your region, to minimize the distance over which the data must travel and thus minimize the reading latency. These "regional" redirectors will try file locations in your region first before trying to go overseas.
Changed:
<
<
If you are working in the US, it is best to use cmsxrootd.fnal.gov, while in Europe and Asia, it is best to use xrootd.ba.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
>
>
If you are working in the US, it is best to use cmsxrootd.fnal.gov, while in Europe and Asia, it is best to use xrootd-cms.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
  In the examples below, cmsxrootd.fnal.gov is always used, but feel free to replace that with a choice more appropriate for your region.
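The regional rule above can be written down as a small lookup (a sketch; the region keys are invented, while the hostnames are the ones quoted in the text):

```python
# Regional redirectors; the global one queries all locations
REDIRECTORS = {
    "us": "cmsxrootd.fnal.gov",
    "eu-asia": "xrootd-cms.infn.it",
    "global": "cms-xrd-global.cern.ch",
}

def pick_redirector(region: str) -> str:
    # Unknown regions fall back to the global redirector
    return REDIRECTORS.get(region, REDIRECTORS["global"])
```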

Revision 29 2014-10-21 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 111 to 111
  config.Data.ignoreLocality = True
Changed:
<
<
but this feature is not currently operational in CRAB3.
>
>
This option is false by default, but users can explore its behavior.
 

Open a file in Condor Batch or CERN Batch

Revision 28 2014-09-17 - StefanoBelforte

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 64 to 64
  Note the prefix root://cmsxrootd.fnal.gov/ (or any other redirector name) in front of your LFN. This returns a TFile object, and you can proceed normally. The same is true for the FWLite environment.
Added:
>
>
BEWARE: do not use the apparently equivalent syntax below, which is known not to work:
TFile("root://cmsxrootd.fnal.gov//store/foo/bar")

 

Open a file in CMSSW

Revision 27 2014-09-02 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 49 to 49
  As stated above, when you attempt to open a file, your application must query a redirector to find the file. You must specify the redirector to the application. Which redirector you use depends on your region, to minimize the distance over which the data must travel and thus minimize the reading latency. These "regional" redirectors will try file locations in your region first before trying to go overseas.
Changed:
<
<
If you are working in the US, it is best to use xrootd.unl.edu, while in Europe and Asia, it is best to use xrootd.ba.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
>
>
If you are working in the US, it is best to use cmsxrootd.fnal.gov, while in Europe and Asia, it is best to use xrootd.ba.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
 
Changed:
<
<
In the examples below, xrootd.unl.edu is always used, but feel free to replace that with a choice more appropriate for your region.
>
>
In the examples below, cmsxrootd.fnal.gov is always used, but feel free to replace that with a choice more appropriate for your region.
 

Open a file using ROOT

Line: 59 to 59
 If you are using bare ROOT, you can open files in the xrootd service just like you would any other file:
Changed:
<
<
TFile *f =TFile::Open("root://xrootd.unl.edu//store/mc/JobRobot/RelValProdTTbar/GEN-SIM-DIGI-RECO/MC_3XY_V24_JobRobot-v1/0001/56E18353-982C-DF11-B217-00304879FA4A.root");
>
>
TFile *f =TFile::Open("root://cmsxrootd.fnal.gov//store/mc/JobRobot/RelValProdTTbar/GEN-SIM-DIGI-RECO/MC_3XY_V24_JobRobot-v1/0001/56E18353-982C-DF11-B217-00304879FA4A.root");
 
Changed:
<
<
Note the prefix of the root://xrootd.unl.edu/ (or any other redirector name name) in front of your LFN. This returns a TFile object, and you can proceed normally. The same is true for FWLite environment.
>
>
Note the prefix root://cmsxrootd.fnal.gov/ (or any other redirector name) in front of your LFN. This returns a TFile object, and you can proceed normally. The same is true for the FWLite environment.
 

Open a file in CMSSW

Line: 77 to 77
  )
Changed:
<
<
Here's the same file, but accessed through the Xrootd Service by simply adding prefix root://xrootd.unl.edu/ :
>
>
Here's the same file, but accessed through the Xrootd Service by simply adding prefix root://cmsxrootd.fnal.gov/ :
 
process.source = cms.Source("PoolSource",
                            #                            # replace 'myfile.root' with the source file you want to use
Changed:
<
<
fileNames = cms.untracked.vstring('root://xrootd.unl.edu//store/myfile.root')
>
>
fileNames = cms.untracked.vstring('root://cmsxrootd.fnal.gov//store/myfile.root')
  )
Line: 147 to 147
If for some reason (perhaps intensive debugging of a particular event) you wish to have the file located locally, AAA also provides a command-line tool called xrdcp. This command-line utility ships with stand-alone ROOT and CMSSW, and provides a much easier way to copy a grid file than lcg-cp, srm-cp, or the FileMover utility.
Changed:
<
<
xrdcp root://xrootd.unl.edu//store/path/to/file /some/local/path
>
>
xrdcp root://cmsxrootd.fnal.gov//store/path/to/file /some/local/path
 

Line: 167 to 167
  If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-computing-tools@cern.ch with the print out from below debugging command.
Changed:
<
<
xrdcp -d 1 -f root://xrootd.unl.edu//store/path/to/file /dev/null
>
>
xrdcp -d 1 -f root://cmsxrootd.fnal.gov//store/path/to/file /dev/null
 

Revision 26 2014-08-19 - WellsWulsin

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 17 to 17
 
Changed:
<
<
>
>
 
Line: 106 to 106
 but this feature is not currently operational in CRAB3.

Changed:
<
<

Open a file in Condor Batch

>
>

Open a file in Condor Batch or CERN Batch

Condor

  If one wants to use the local condor batch system to analyze user/group skims located at remote sites, the only modification needed is adding:
Line: 124 to 126
  The string /tmp/x509up_uXXXX is the path shown in the "path:" line of the output of voms-proxy-info -all; this file contains your valid grid proxy. Condor will pass this information to the working node of the condor batch.
Added:
>
>

CERN Batch

Jobs submitted to the CERN batch farm will look for a valid grid proxy in the location pointed to by the environment variable $X509_USER_PROXY. (If $X509_USER_PROXY is not set, Xrootd looks for the proxy in the default location in /tmp.)

To make your proxy available to the lxbatch jobs, first copy your proxy to an area in afs, for example your home directory:

cp /tmp/x509up_uXXXX  /afs/cern.ch/user/u/username/

If you are submitting a job with bsub myscript.sh, then in myscript.sh, set this environment variable:

export X509_USER_PROXY=/afs/cern.ch/user/u/username/x509up_uXXXX
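The lookup order described above can be sketched in Python (the helper name is hypothetical; the $X509_USER_PROXY override and the /tmp/x509up_u<uid> default come from the text):

```python
import os

def proxy_path(environ=os.environ, uid=None):
    """Where xrootd will look for the grid proxy on the batch node."""
    if "X509_USER_PROXY" in environ:
        return environ["X509_USER_PROXY"]
    if uid is None:
        uid = os.getuid()  # current user's numeric id
    return "/tmp/x509up_u%d" % uid
```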

 

File download with command-line tools

Revision 25 2014-07-08 - StefanoBelforte

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 92 to 92
  To allow CRAB2 to ignore the location of your dataset and just run where there is an open batch slot, include the line
Changed:
<
<
data_location_override = se_white_list
>
>
data_location_override = sites list

in the [GRID] block of the configuration file.

Here, sites list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For example, data_location_override = T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.

 
Deleted:
<
<
in the [GRID] block of the configuration file. Here, se_white_list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For instance, an se_white_list value of T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.
  In CRAB3, the equivalent line in the configuration file will be

Revision 24 2014-07-01 - StefanoBelforte

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 145 to 145
 

Support

Changed:
<
<
If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-wanaccess@cern.ch with the print out from below debugging command.
>
>
If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-computing-tools@cern.ch with the print out from below debugging command.
 
xrdcp -d 1 -f root://xrootd.unl.edu//store/path/to/file /dev/null

Revision 23 2014-06-12 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 138 to 138
 These are the sites that do NOT currently make their data available:

  • T1 sites: T1_ES_PIC, T1_TW_ASGC
Changed:
<
<
  • T2 sites: T2_BE_IIHE, T2_BR_UERJ, T2_EE_Estonia, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
>
>
  • T2 sites: T2_BE_IIHE, T2_BR_UERJ, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
  If you wish to check if your desired file is actually available through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/path/to/file. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!

Revision 22 2014-06-03 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 127 to 127
If for some reason (perhaps intensive debugging of a particular event) you wish to have the file located locally, AAA also provides a command-line tool called xrdcp. This command-line utility ships with stand-alone ROOT and CMSSW, and provides a much easier way to copy a grid file than lcg-cp, srm-cp, or the FileMover utility.
Changed:
<
<
xrdcp root://xrootd.unl.edu//store/foo /some/local/path
>
>
xrdcp root://xrootd.unl.edu//store/path/to/file /some/local/path
 

Line: 140 to 140
 
  • T1 sites: T1_ES_PIC, T1_TW_ASGC
  • T2 sites: T2_BE_IIHE, T2_BR_UERJ, T2_EE_Estonia, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
Changed:
<
<
If you wish to check if your desired file is actually available through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/foo. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!
>
>
If you wish to check if your desired file is actually available through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/path/to/file. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!
 

Support

If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-wanaccess@cern.ch with the print out from below debugging command.

Changed:
<
<
xrdcp -d 1 -f root://xrootd.unl.edu//store/foo /dev/null
>
>
xrdcp -d 1 -f root://xrootd.unl.edu//store/path/to/file /dev/null
 

Revision 21 2014-05-30 - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 105 to 105
 

Open a file in Condor Batch

Changed:
<
<
If one wants to use local condor batch to analyze user/group skims located at remote sites. The only modification needed is adding
>
>
If one wants to use the local condor batch system to analyze user/group skims located at remote sites, the only modification needed is adding:

use_x509userproxy = true

in your condor jdl file (the file which defines universe, Executable, etc..).

For OLDER versions of HTCondor (before 8.0.0), you need:

 
     x509userproxy = /tmp/x509up_uXXXX
Changed:
<
<
in your condor jdl file (the file which defines universe, Executable, etc..). The string /tmp/x509up_uXXXX is the string in the "path:" statement from output of "voms-proxy-info -all", which contains your valid grid proxy. Condor will pass this information to the working node of the condor batch.
>
>
The string /tmp/x509up_uXXXX is the string in the "path:" statement from output of "voms-proxy-info -all", which contains your valid grid proxy. Condor will pass this information to the working node of the condor batch.
 

File download with command-line tools

Revision 20 2014-05-19 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 88 to 88
 

Let CRAB find your file

Changed:
<
<
Since you are now able to read data from any location, your CRAB job doesn't necessarily have to run at the same site where the data actually resides. This potentially gives your grid jobs access to a much wider range of grid sites -- indeed, any in CMS. To allow CRAB to ignore the location of your dataset and just run where there is an open batch slot, include the line
>
>
Since you are now able to read data from any location, your CRAB job doesn't necessarily have to run at the same site where the data actually resides. This potentially gives your grid jobs access to a much wider range of grid sites -- indeed, any in CMS.

To allow CRAB2 to ignore the location of your dataset and just run where there is an open batch slot, include the line

  data_location_override = se_white_list

in the [GRID] block of the configuration file. Here, se_white_list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For instance, an se_white_list value of T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.

Added:
>
>
In CRAB3, the equivalent line in the configuration file will be

config.Data.ignoreLocality = True

but this feature is not currently operational in CRAB3.

 

Open a file in Condor Batch

Revision 19 2014-04-17 - TommasoBoccali

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 49 to 49
  As stated above, when you attempt to open a file, your application must query a redirector to find the file. You must specify the redirector to the application. Which redirector you use depends on your region, to minimize the distance over which the data must travel and thus minimize the reading latency. These "regional" redirectors will try file locations in your region first before trying to go overseas.
Changed:
<
<
If you are working in the US, it is best to use xrootd.unl.edu, while in Europe and Asia, it is best to use xrootd-cms.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
>
>
If you are working in the US, it is best to use xrootd.unl.edu, while in Europe and Asia, it is best to use xrootd.ba.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
  In the examples below, xrootd.unl.edu is always used, but feel free to replace that with a choice more appropriate for your region.

Revision 18 2014-04-14 - TommasoBoccali

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 49 to 49
  As stated above, when you attempt to open a file, your application must query a redirector to find the file. You must specify the redirector to the application. Which redirector you use depends on your region, to minimize the distance over which the data must travel and thus minimize the reading latency. These "regional" redirectors will try file locations in your region first before trying to go overseas.
Changed:
<
<
If you are working in the US, it is best to use xrootd.unl.edu, while in Europe and Asia, it is best to use xrootd.ba.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
>
>
If you are working in the US, it is best to use xrootd.unl.edu, while in Europe and Asia, it is best to use xrootd-cms.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.
  In the examples below, xrootd.unl.edu is always used, but feel free to replace that with a choice more appropriate for your region.

Revision 17 2014-04-07 - ElliotHughes

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 25 to 25
 

Goals of this workbook page

Changed:
<
<
This page describes Any Data, Anytime, Anywhere (AAA), CMS's implementation of a generic xrootd service for analyzing CMS data located at any grid site with bare ROOT or the CMSSW/FWLite environment, without downloading it to your local storage. Much effort has been invested in having ROOT and CMSSW read remote files efficiently, so that you will be able to analyze data without know whether the input file is on your computer or halfway around the world! AAA is also allowing for greater resilience against damaged or missing input files, and for greater use of opportunistic resources.
>
>
This page describes Any Data, Anytime, Anywhere (AAA), CMS's implementation of a generic xrootd service for analyzing CMS data located at any grid site with bare ROOT or the CMSSW/FWLite environment, without downloading it to your local storage. Much effort has been invested in having ROOT and CMSSW read remote files efficiently, so that you will be able to analyze data without knowing whether the input file is on your computer or halfway around the world! AAA also allows for greater resilience against damaged or missing input files, and for greater use of opportunistic resources.
 

Introduction to the AAA Service

Revision 16 2014-03-28 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 13 to 13
 
Changed:
<
<
>
>
 
Line: 44 to 44
  voms-proxy-init --voms cms
Changed:
<
<
#Redirector
>
>
 

Know your redirector

As stated above, when you attempt to open a file, your application must query a redirector to find the file. You must specify the redirector to the application. Which redirector you use depends on your region, to minimize the distance over which the data must travel and thus minimize the reading latency. These "regional" redirectors will try file locations in your region first before trying to go overseas.

Revision 15 2014-03-26 - EmyrClement

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 92 to 92
  data_location_override = se_white_list
Changed:
<
<
in the [USER] block of the configuration file. Here, se_white_list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For instance, an se_white_list value of T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.
>
>
in the [GRID] block of the configuration file. Here, se_white_list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For instance, an se_white_list value of T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.
 

Open a file in Condor Batch

Revision 14 2014-03-14 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 120 to 120
  These are the sites that do NOT currently make their data available:
Changed:
<
<
  • T1 sites: T1_DE_KIT, T1_ES_PIC, T2_FR_CCIN2P3, T1_TW_ASGC
  • T2 sites: T2_BE_IIHE, T2_BR_UERJ, T2_EE_Estonia, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_PT_NCG_Lisbon, T2_RU_PNP, T2_RU_RRC_KI, T2_TH_CUNSTDA, T2_TR_METU, T2_TW_Taiwan
>
>
  • T1 sites: T1_ES_PIC, T1_TW_ASGC
  • T2 sites: T2_BE_IIHE, T2_BR_UERJ, T2_EE_Estonia, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_TR_METU, T2_TW_Taiwan
 
Changed:
<
<
If you wish to check if your desired file is actually available through AAA, execute the command xrd xrootd.unl.edu existfile /store/foo. If you are told, "The file exists." then it is safe for you to use the AAA service!
>
>
If you wish to check if your desired file is actually available through AAA, execute the command xrdfs cms-xrd-global.cern.ch locate /store/foo. As long as you do not get the message No servers have the file, it is safe for you to use the AAA service!
 

Support

Revision 13 2014-01-29 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service (AAA) for Remote Data Access

Line: 92 to 92
  data_location_override = se_white_list
Changed:
<
<
where se_white_list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For instance, an se_white_list value of T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.
>
>
in the [USER] block of the configuration file. Here, se_white_list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For instance, an se_white_list value of T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.
 

Open a file in Condor Batch

Revision 12 2014-01-29 - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"
Changed:
<
<

5.13 Using Xrootd Service for Remote Data Access

>
>

5.13 Using Xrootd Service (AAA) for Remote Data Access

 
<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 10 to 10
 

Contents

Changed:
<
<
>
>
 
Changed:
<
<
>
>
 
Added:
>
>
 
Changed:
<
<
>
>
 

Goals of this workbook page

Changed:
<
<
This page provides you a generic Xrootd service in analyzing CMS data (AOD/RECO data, Group skim ntuple, User skim ntuple) located at any grid site with bare root or CMSSW/FWLite environment, without downloading it to your institution cluster or your laptop.
>
>
This page describes Any Data, Anytime, Anywhere (AAA), CMS's implementation of a generic xrootd service for analyzing CMS data located at any grid site with bare ROOT or the CMSSW/FWLite environment, without downloading it to your local storage. Much effort has been invested in having ROOT and CMSSW read remote files efficiently, so that you will be able to analyze data without knowing whether the input file is on your computer or halfway around the world! AAA also allows for greater resilience against damaged or missing input files, and for greater use of opportunistic resources.
 
Changed:
<
<

Introduction to the Xrootd Service

>
>

Introduction to the AAA Service

 
Changed:
<
<
Xrootd (the eXtended ROOT file server daemon) is a new service provided under the Any Data, Any Time, Anywhere (AAA) Infrastructure of Open science grid and Worldwide LHC Computing Grid. AAA provides a new paradigm shift in data analysis from co-location of data and CPU to a dynamic model which analyzes remotely located data with local CPU. The Xrootd has been integrated with standalone ROOT. Interested users can consult CmsXrootdArchitecture for technical details on Xrootd implementation.
>
>
To access a particular file, no matter where in the world it is, you only need to know the Logical File Name (LFN) of the file. The LFN uniquely identifies any file that is somewhere within the /store directory tree within all of CMS storage. The LFNs of files that are in defined CMS datasets can be found through the DAS service (see WorkBook Chapter 5.4). For files that are not in official datasets, such as those in /store/user or /store/group, you will have to know the actual file names yourself. Examples below demonstrate how the files can be accessed through their LFNs in various contexts. What you don't need to know is the actual physical location of the file -- based on the LFN, the system will look up the physical location using a "redirector" that queries potential locations for you, and point your application to a valid location without any intervention from you. Your application will then proceed as it would if your file was local.
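The mapping from an LFN to the URL your application actually opens is purely mechanical; a minimal Python sketch (the function name `lfn_to_url` is ours, not part of any CMS tool):

```python
def lfn_to_url(lfn, redirector="cms-xrd-global.cern.ch"):
    """Prefix a CMS Logical File Name with an xrootd redirector host.

    Note the double slash in the result: 'root://<host>/' is followed
    by the LFN, which itself starts with '/store/'.
    """
    if not lfn.startswith("/store/"):
        raise ValueError("a CMS LFN starts with /store/")
    return "root://%s/%s" % (redirector, lfn)

print(lfn_to_url("/store/user/bbockelm/foo"))
# root://cms-xrd-global.cern.ch//store/user/bbockelm/foo
```

Any such URL can then be handed to ROOT, CMSSW, or xrdcp exactly as in the examples below.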
 
Changed:
<
<
At CMS, Xrootd service allows user to process the data located anywhere in CMS grid without downloading it to local desktop/laptop/computer cluster. One only needs to know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your group/user files at a Tier-2 (for example /store/group/exotica/foo or /store/user/bbockelm/foo and NOTE every CMS user has at least one /store/user space at some Tier-2). You can easily get those LFNs through DAS (see WorkBook Chapter 5.4) service, or you just specify your own user/group store path if the file is not part of a global or local dataset instance.

This will be useful in remotely analyzing skim Patuple or simple ntuples produced by group skim or user skim located at Tier-2 centers with even bare ROOT, doing test job on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite. In future, this Xrootd service will be expanded to any grid cite(TIer-1, Tier-2 and Tier-3) and any file type.

>
>
Interested users can consult CmsXrootdArchitecture for technical details on AAA implementation.
 

Quick steps to analyze remote data located in remote Tier-2 sites

Changed:
<
<

Pre-requisite

>
>

Have a valid grid proxy

 
Changed:
<
<
To use Xrootd, you only need to have your grid certificate ready. i.e.,you must have a grid certificate installed into ~/.globus and registered in CMS, see Chapter 5 of the CMS workbook on how to get the certificate. To set up your grid environment, do in the normal way.
>
>
To use AAA, you MUST have a valid grid proxy. This requires that you already have a grid certificate installed (see Chapter 5 of the CMS workbook). The grid proxy is obtained via the usual command
 
Changed:
<
<
From CERN lxplus, source the following script:
source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh
>
>
voms-proxy-init --voms cms
 
Changed:
<
<
From FNAL lpc, source
source /uscmst1/prod/grid/gLite_SL5.csh
>
>
#Redirector

Know your redirector

As stated above, when you attempt to open a file, your application must query a redirector to find the file. You must specify the redirector to the application. Which redirector you use depends on your region, to minimize the distance over which the data must travel and thus minimize the reading latency. These "regional" redirectors will try file locations in your region first before trying to go overseas.

 
Changed:
<
<
Now you're ready to analyze remote data with bare ROOT or CMSSW (FWLite) environment.
>
>
If you are working in the US, it is best to use xrootd.unl.edu, while in Europe and Asia, it is best to use xrootd.ba.infn.it. There is also a "global redirector" at cms-xrd-global.cern.ch which will query all locations.

In the examples below, xrootd.unl.edu is always used, but feel free to replace that with a choice more appropriate for your region.
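The regional choices just described can be captured in a small lookup table; a minimal Python sketch (the region labels are our own shorthand, not official CMS names; the hosts are the ones quoted above):

```python
# Redirector hosts quoted in the text above, keyed by region.
REDIRECTORS = {
    "us": "xrootd.unl.edu",
    "eu-asia": "xrootd.ba.infn.it",
    "global": "cms-xrd-global.cern.ch",
}

def redirector_for(region):
    """Pick the regional redirector, falling back to the global one."""
    return REDIRECTORS.get(region, REDIRECTORS["global"])

print(redirector_for("us"))       # xrootd.unl.edu
print(redirector_for("unknown"))  # cms-xrd-global.cern.ch
```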

 

Open a file using ROOT

Line: 63 to 62
 TFile *f =TFile::Open("root://xrootd.unl.edu//store/mc/JobRobot/RelValProdTTbar/GEN-SIM-DIGI-RECO/MC_3XY_V24_JobRobot-v1/0001/56E18353-982C-DF11-B217-00304879FA4A.root");
Changed:
<
<
note the prefix of the root://xrootd.unl.edu/ (or anyother XrootdSrv name) in front of your LFN. This returns a TFile object, and you can proceed normally. Same is true for FWLite environment.
>
>
Note the prefix root://xrootd.unl.edu/ (or any other redirector name) in front of your LFN. This returns a TFile object, and you can proceed normally. The same is true for the FWLite environment.
 

Open a file in CMSSW

Line: 86 to 85
  )
Added:
>
>

Let CRAB find your file

Since you are now able to read data from any location, your CRAB job doesn't necessarily have to run at the same site where the data actually resides. This potentially gives your grid jobs access to a much wider range of grid sites -- indeed, any in CMS. To allow CRAB to ignore the location of your dataset and just run where there is an open batch slot, include the line

data_location_override = se_white_list

where se_white_list is a set of site names where you will allow your jobs to run (even if the target dataset is not physically at those sites). For instance, an se_white_list value of T2_US will allow your job to run at any T2 site in the US. This will only work with the remoteGlidein scheduler for CRAB.
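Putting the two pieces together, the relevant part of a CRAB configuration file might look like the sketch below (a hedged example only, built from the statements above; the whitelist value T2_US is the example quoted in the text):

```
[USER]
# allow CRAB to ignore the dataset location and run where there is a free slot
data_location_override = se_white_list
# sites where the job is allowed to run, even without a local copy of the data
se_white_list = T2_US
```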

 

Open a file in Condor Batch

Line: 96 to 104
in your condor jdl file (the file which defines universe, Executable, etc.). The path /tmp/x509up_uXXXX is the value of the "path:" field in the output of "voms-proxy-info -all"; that file contains your valid grid proxy. Condor will pass this information to the worker node of the condor batch.
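For instance, a minimal submit file might look like the sketch below (the executable name and output file names are placeholders; the proxy path must be replaced by the one reported by "voms-proxy-info -all"):

```
universe      = vanilla
executable    = my_analysis.sh
output        = job.out
error         = job.err
log           = job.log
# pass the grid proxy so the worker node can authenticate to the xrootd service
x509userproxy = /tmp/x509up_uXXXX
queue
```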
Deleted:
<
<
This feature allows you to process large amount of edm or non-edm root files remotely using your local batch CPU power. For example, processing Tier-2 skim ntuple with your local Tier-3 condor.
 
Changed:
<
<

File Download with command-line tools

>
>

File download with command-line tools

 
Changed:
<
<
If you still like to have the file to located locally, Xrootd service also provides a command-line tool called xrdcp. This command line utility ships with stand-alone ROOT and CMSSW, which provides a much easier way to copy a grid file than lcg-cp, srm-cp, and FileMover utility.
>
>
If for some reason (perhaps intensive debugging of a particular event) you wish to have the file located locally, AAA also provides a command-line tool called xrdcp. This command-line utility ships with stand-alone ROOT and CMSSW, and provides a much easier way to copy a grid file than the lcg-cp, srm-cp, and FileMover utilities.
 
xrdcp root://xrootd.unl.edu//store/foo /some/local/path
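If you drive such copies from a script, it can help to assemble the command line programmatically; a minimal Python sketch (the function name is ours; run the result with, e.g., subprocess.call):

```python
def xrdcp_command(lfn, dest, redirector="xrootd.unl.edu"):
    """Build the xrdcp argument list for copying an LFN to a local path."""
    return ["xrdcp", "root://%s/%s" % (redirector, lfn), dest]

print(" ".join(xrdcp_command("/store/foo", "/some/local/path")))
# xrdcp root://xrootd.unl.edu//store/foo /some/local/path
```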

Changed:
<
<

Current Xrootd Service Status

>
>

Where is "anywhere"?

The data at three T1 sites and about 40 T2 sites are currently available through AAA. This is not in fact every site in CMS, but about 95% of datasets are available through AAA, and this amount is increasing as we add more sites into the system.

These are the sites that do NOT currently make their data available:

  • T1 sites: T1_DE_KIT, T1_ES_PIC, T2_FR_CCIN2P3, T1_TW_ASGC
  • T2 sites: T2_BE_IIHE, T2_BR_UERJ, T2_EE_Estonia, T2_GR_Ioannina, T2_MY_UPM_BIRUNI, T2_PK_NCP, T2_PL_Warsaw, T2_PT_NCG_Lisbon, T2_RU_PNP, T2_RU_RRC_KI, T2_TH_CUNSTDA, T2_TR_METU, T2_TW_Taiwan
 
Changed:
<
<
The service is in production. Below are the list of sites currently providing Xrootd Service (Jan 2013):
>
>
If you wish to check if your desired file is actually available through AAA, execute the command xrd xrootd.unl.edu existfile /store/foo. If you are told, "The file exists." then it is safe for you to use the AAA service!
 
Deleted:
<
<
  • US Region
    1. T1_US_FNAL (disk-only; includes test EOS service)
    2. T2_US_Caltech
    3. T2_US_Florida
    4. T2_US_Nebraska
    5. T2_US_Purdue
    6. T2_US_UCSD
    7. T2_US_Wisconsin
    8. T2_US_MIT
    9. T2_US_Vanderbilt
  • EU Region
    1. T1_CH_CERN (EOSCMS only)
    2. T2_IT_Bari
    3. T2_IT_Pisa
    4. T2_IT_Legnaro
    5. T2_IT_Rome
    6. T2_DE_DESY
    7. T2_UK_*
    8. T2_EE_Estonia
Xrootd Service is also exploited by CRAB jobs when using glidein, in two ways:
  1. CMS sites are configuring their local file catalogs so that whenever cmsRun fails to open a file, it will try to find it at a remote site via the Xrootd Service before giving up and returning an error. This also works for interactive cmsRun.
  2. If your job has been queued for more than 12 hours at a US T2 site, it may be run at additional US sites (currently, T2_US_Purdue, T2_US_Nebraska, T2_US_Wisconsin, or T2_US_UCSD), even if the files are not present at those sites. When run, the jobs will automatically switch to reading the data from remote sites. There is nothing which needs to be done on the user side to use this feature, while it can be disabled if for some reason running elsewhere is not desired; see the Crab documentation. This feature may be extended to other regions than the US in the future.
 

Support

Line: 148 to 139
 
<!-- Add your review status in this table structure with 2 columns delineated by three vertical bars -->

Reviewer/Editor and Date (copy from screen) Comments
Added:
>
>
-- KenBloom - 29 Jan 2014 Substantial revisions to reflect current status
 
-- LucianoBarone - 27 Nov 2013 added Rome among sites with Xrootd enabled
-- JohnStupak - 14-September-2013 Review with minor changes
-- StefanoBelforte - 16 Jan 2013 more details on how Crab uses this

Revision 112013-11-27 - LucianoBarone

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Access

Line: 127 to 127
 
    1. T2_IT_Bari
    2. T2_IT_Pisa
    3. T2_IT_Legnaro
Added:
>
>
    1. T2_IT_Rome
 
    1. T2_DE_DESY
    2. T2_UK_*
    3. T2_EE_Estonia
Line: 147 to 148
 
<!-- Add your review status in this table structure with 2 columns delineated by three vertical bars -->

Reviewer/Editor and Date (copy from screen) Comments
Added:
>
>
-- LucianoBarone - 27 Nov 2013 added Rome among sites with Xrootd enabled
 
-- JohnStupak - 14-September-2013 Review with minor changes
-- StefanoBelforte - 16 Jan 2013 more details on how Crab uses this
-- JieChen - 8 Jan 2013 modified the content slightly to be user friendly

Revision 102013-09-18 - DanielHuizenga

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Access

Added:
>
>
 
<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->

Complete: 3

Line: 4 to 5
 
<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->

Complete: 3

Changed:
<
<

Detailed Review status
>
>

Detailed Review status
 

Contents

Line: 24 to 23
 

Goals of this workbook page

Changed:
<
<
This page provides you a generic Xrootd service in analyzing CMS data (AOD/RECO data, Group skim ntuple, User skim ntuple) located at any grid cite with bare root or CMSSW/FWLite environment, without downloading it to your institution cluster or your laptop.
>
>
This page provides you a generic Xrootd service in analyzing CMS data (AOD/RECO data, Group skim ntuple, User skim ntuple) located at any grid site with bare root or CMSSW/FWLite environment, without downloading it to your institution cluster or your laptop.
 

Introduction to the Xrootd Service

Line: 63 to 62
 
TFile *f =TFile::Open("root://xrootd.unl.edu//store/mc/JobRobot/RelValProdTTbar/GEN-SIM-DIGI-RECO/MC_3XY_V24_JobRobot-v1/0001/56E18353-982C-DF11-B217-00304879FA4A.root");
Changed:
<
<
note the prefix of the root://xrootd.unl.edu/ (or anyother XrootdSrv name) in front of your LFN. This returns a TFile object, and you can proceed normally. Same is true for FWLite environment.
>
>
note the prefix of the root://xrootd.unl.edu/ (or anyother XrootdSrv name) in front of your LFN. This returns a TFile object, and you can proceed normally. Same is true for FWLite environment.
 

Open a file in CMSSW

Line: 134 to 133
Xrootd Service is also exploited by CRAB jobs when using glidein, in two ways:
  1. CMS sites are configuring their local file catalogs so that whenever cmsRun fails to open a file, it will try to find it at a remote site via the Xrootd Service before giving up and returning an error. This also works for interactive cmsRun.
  2. If your job has been queued for more than 12 hours at a US T2 site, it may be run at additional US sites (currently, T2_US_Purdue, T2_US_Nebraska, T2_US_Wisconsin, or T2_US_UCSD), even if the files are not present at those sites. When run, the jobs will automatically switch to reading the data from remote sites. There is nothing which needs to be done on the user side to use this feature, while it can be disabled if for some reason running elsewhere is not desired; see the Crab documentation. This feature may be extended to other regions than the US in the future.
Deleted:
<
<
 

Support

Changed:
<
<
If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-wanaccess@cern.ch with the printout from the debugging command below.
>
>
If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-wanaccess@cern.ch with the printout from the debugging command below.
 
xrdcp -d 1 -f root://xrootd.unl.edu//store/foo /dev/null
Line: 150 to 147
 
<!-- Add your review status in this table structure with 2 columns delineated by three vertical bars -->

Reviewer/Editor and Date (copy from screen) Comments
Added:
>
>
-- JohnStupak - 14-September-2013 Review with minor changes
 
-- StefanoBelforte - 16 Jan 2013 more details on how Crab uses this
-- JieChen - 8 Jan 2013 modified the content slightly to be user friendly
-- JieChen - 7 Jan 2013 move the page from Brian Bockelman's HdfsXrootdto page to SWGuide format

Revision 92013-01-17 - StefanoBelforte

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"
Changed:
<
<

5.13 Using Xrootd Service for Remote Data Accessing

>
>

5.13 Using Xrootd Service for Remote Data Access

 
<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->

Complete: 3

Revision 82013-01-16 - StefanoBelforte

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 109 to 109
 

Current Xrootd Service Status

Changed:
<
<
The service is in production. Below are the list of sites currently providing Xrootd Service.
>
>
The service is in production. Below are the list of sites currently providing Xrootd Service (Jan 2013):
 
  • US Region
    1. T1_US_FNAL (disk-only; includes test EOS service)
Line: 130 to 130
 
    1. T2_UK_*
    2. T2_EE_Estonia
Changed:
<
<
Xrootd service is also integrated to CRAB now. If your job has been queued for more than 12 hours, it may be run at additional US sites (currently, T2_US_Purdue, T2_US_Nebraska, T2_US_Wisconsin, or T2_US_UCSD), even if the files are not present at those sites. When run, the jobs will automatically switch to reading the data from remote sites.There is nothing which needs to be done on the user side to use this feature.
>
>
Xrootd Service is also exploited by CRAB jobs when using glidein, in two ways:
  1. CMS sites are configuring their local file catalogs so that whenever cmsRun fails to open a file, it will try to find it at a remote site via the Xrootd Service before giving up and returning an error. This also works for interactive cmsRun.
  2. If your job has been queued for more than 12 hours at a US T2 site, it may be run at additional US sites (currently, T2_US_Purdue, T2_US_Nebraska, T2_US_Wisconsin, or T2_US_UCSD), even if the files are not present at those sites. When run, the jobs will automatically switch to reading the data from remote sites. There is nothing which needs to be done on the user side to use this feature, while it can be disabled if for some reason running elsewhere is not desired; see the Crab documentation. This feature may be extended to other regions than the US in the future.
 

Support

Line: 146 to 150
 
<!-- Add your review status in this table structure with 2 columns delineated by three vertical bars -->

Reviewer/Editor and Date (copy from screen) Comments
Added:
>
>
-- StefanoBelforte - 16 Jan 2013 more details on how Crab uses this
 
-- JieChen - 8 Jan 2013 modified the content slightly to be user friendly
-- JieChen - 7 Jan 2013 move the page from Brian Bockelman's HdfsXrootdto page to SWGuide format

Revision 72013-01-14 - JieChen

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 15 to 15
 
Added:
>
>
 
Line: 30 to 31
  Xrootd (the eXtended ROOT file server daemon) is a new service provided under the Any Data, Any Time, Anywhere (AAA) Infrastructure of Open science grid and Worldwide LHC Computing Grid. AAA provides a new paradigm shift in data analysis from co-location of data and CPU to a dynamic model which analyzes remotely located data with local CPU. The Xrootd has been integrated with standalone ROOT. Interested users can consult CmsXrootdArchitecture for technical details on Xrootd implementation.
Changed:
<
<
At CMS, Xrootd service allows user to process the data located anywhere in CMS grid without downloading it to local desktop/laptop/computer cluster. One only needs to know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your group/user files at a Tier-2 (/store/group/exotica/foo or /store/user/bbockelm/foo). You can easily get those LFNs through DAS (see WorkBook Chapter 5.4) service.
>
>
At CMS, Xrootd service allows user to process the data located anywhere in CMS grid without downloading it to local desktop/laptop/computer cluster. One only needs to know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your group/user files at a Tier-2 (for example /store/group/exotica/foo or /store/user/bbockelm/foo and NOTE every CMS user has at least one /store/user space at some Tier-2). You can easily get those LFNs through DAS (see WorkBook Chapter 5.4) service, or you just specify your own user/group store path if the file is not part of a global or local dataset instance.
 
Changed:
<
<
This will be useful in remotely analyzing ntuples produced by group skim or user skim located at Tier-2 centers with bare ROOT, doing test job on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite. In future, this Xrootd service will be expanded to any grid cite(TIer-1, Tier-2 and Tier-3) and any file type.
>
>
This will be useful in remotely analyzing skim Patuple or simple ntuples produced by group skim or user skim located at Tier-2 centers with even bare ROOT, doing test job on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite. In future, this Xrootd service will be expanded to any grid cite(TIer-1, Tier-2 and Tier-3) and any file type.
 

Quick steps to analyze remote data located in remote Tier-2 sites

Line: 85 to 86
  )
Added:
>
>

Open a file in Condor Batch

If one wants to use a local condor batch to analyze user/group skims located at remote sites, the only modification needed is adding
     x509userproxy = /tmp/x509up_uXXXX
in your condor jdl file (the file which defines universe, Executable, etc..). The string /tmp/x509up_uXXXX is the string in the "path:" statement from output of "voms-proxy-info -all", which contains your valid grid proxy. Condor will pass this information to the working node of the condor batch.

This feature allows you to process large amounts of edm or non-edm root files remotely using your local batch CPU power. For example, processing Tier-2 skim ntuples with your local Tier-3 condor.

 

File Download with command-line tools

Revision 62013-01-14 - JieChen

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 94 to 94
 xrdcp root://xrootd.unl.edu//store/foo /some/local/path
Deleted:
<
<
You may also want to think about using the -R option, which allows you to recursively download a directory.
 

Current Xrootd Service Status

Revision 52013-01-11 - JieChen

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 62 to 62
 
TFile *f =TFile::Open("root://xrootd.unl.edu//store/mc/JobRobot/RelValProdTTbar/GEN-SIM-DIGI-RECO/MC_3XY_V24_JobRobot-v1/0001/56E18353-982C-DF11-B217-00304879FA4A.root");
Changed:
<
<
note the prefix of the root://xrootd.unl.edu/ (or anyother XrootdSrv name) in front of your LFN.
>
>
note the prefix of the root://xrootd.unl.edu/ (or anyother XrootdSrv name) in front of your LFN.
 This returns a TFile object, and you can proceed normally. Same is true for FWLite environment.

Open a file in CMSSW

Changed:
<
<
You want to edit the PoolSource line in your python configuration file to point directly at the Xrootd service, instead of using a generic LFN.
>
>
You want to edit the PoolSource line in your python configuration file to point directly at the Xrootd service, instead of using a generic LFN.
  For example, this might be the "before" picture:
Line: 102 to 102
 The service is in production. Below are the list of sites currently providing Xrootd Service.

  • US Region
Changed:
<
<
    1. T1_US_FNAL (disk-only; includes test EOS service)
>
>
    1. T1_US_FNAL (disk-only; includes test EOS service)
 
    1. T2_US_Caltech
    2. T2_US_Florida
    3. T2_US_Nebraska

Revision 42013-01-10 - JieChen

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 7 to 7
 
Detailed Review status
Changed:
<
<

Goals of this page:

>
>

Contents

Goals of this workbook page

  This page provides you a generic Xrootd service in analyzing CMS data (AOD/RECO data, Group skim ntuple, User skim ntuple) located at any grid cite with bare root or CMSSW/FWLite environment, without downloading it to your institution cluster or your laptop.
Added:
>
>
 

Introduction to the Xrootd Service

Xrootd (the eXtended ROOT file server daemon) is a new service provided under the Any Data, Any Time, Anywhere (AAA) Infrastructure of Open science grid and Worldwide LHC Computing Grid. AAA provides a new paradigm shift in data analysis from co-location of data and CPU to a dynamic model which analyzes remotely located data with local CPU. The Xrootd has been integrated with standalone ROOT. Interested users can consult CmsXrootdArchitecture for technical details on Xrootd implementation.

Line: 19 to 34
  This will be useful in remotely analyzing ntuples produced by group skim or user skim located at Tier-2 centers with bare ROOT, doing test job on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite. In future, this Xrootd service will be expanded to any grid cite(TIer-1, Tier-2 and Tier-3) and any file type.
Added:
>
>

Quick steps to analyze remote data located in remote Tier-2 sites

 
Changed:
<
<

Quick steps to analyze data located in remote Tier-2 sites.

>
>
 

Pre-requisite

To use Xrootd, you only need to have your grid certificate ready. i.e.,you must have a grid certificate installed into ~/.globus and registered in CMS, see Chapter 5 of the CMS workbook on how to get the certificate. To set up your grid environment, do in the normal way.

Line: 39 to 55
 Now you're ready to analyze remote data with bare ROOT or CMSSW (FWLite) environment.
Changed:
<
<
>
>
 

Open a file using ROOT

If you are using bare ROOT, you can open files in the xrootd service just like you would any other file:
Changed:
<
<
TFile *f =TFile::Open("root://xrootd.unl.edu//store/foo");
>
>
TFile *f =TFile::Open("root://xrootd.unl.edu//store/mc/JobRobot/RelValProdTTbar/GEN-SIM-DIGI-RECO/MC_3XY_V24_JobRobot-v1/0001/56E18353-982C-DF11-B217-00304879FA4A.root");
 
Changed:
<
<
note the prefix of the root://xrootd.unl.edu/ (or anyother xrootdSrv name) in front of your LFN.
>
>
note the prefix of the root://xrootd.unl.edu/ (or anyother XrootdSrv name) in front of your LFN.
 This returns a TFile object, and you can proceed normally. Same is true for FWLite environment.
Added:
>
>
 

Open a file in CMSSW

You want to edit the PoolSource line in your python configuration file to point directly at the Xrootd service, instead of using a generic LFN.
Line: 56 to 73
 
process.source = cms.Source("PoolSource",
                            #                            # replace 'myfile.root' with the source file you want to use
Changed:
<
<
fileNames = cms.untracked.vstring('/store/foo')
>
>
fileNames = cms.untracked.vstring('/store/myfile.root')
  )
Line: 64 to 81
 
process.source = cms.Source("PoolSource",
                            #                            # replace 'myfile.root' with the source file you want to use
Changed:
<
<
fileNames = cms.untracked.vstring('root://xrootd.unl.edu//store/foo')
>
>
fileNames = cms.untracked.vstring('root://xrootd.unl.edu//store/myfile.root')
  )
Changed:
<
<
>
>
 

File Download with command-line tools

If you still like to have the file to located locally, Xrootd service also provides a command-line tool called xrdcp. This command line utility ships with stand-alone ROOT and CMSSW, which provides a much easier way to copy a grid file than lcg-cp, srm-cp, and FileMover utility.
Line: 79 to 96
  You may also want to think about using the -R option, which allows you to recursively download a directory.
Changed:
<
<
>
>
 

Current Xrootd Service Status

The service is in production. Below are the list of sites currently providing Xrootd Service.

Line: 105 to 122
  Xrootd service is also integrated to CRAB now. If your job has been queued for more than 12 hours, it may be run at additional US sites (currently, T2_US_Purdue, T2_US_Nebraska, T2_US_Wisconsin, or T2_US_UCSD), even if the files are not present at those sites. When run, the jobs will automatically switch to reading the data from remote sites.There is nothing which needs to be done on the user side to use this feature.
Added:
>
>

Support

If there's any problem, please send email to hypernews WAN access to CMS data hn-cms-wanaccess@cern.ch with the printout from the debugging command below.
xrdcp -d 1 -f root://xrootd.unl.edu//store/foo /dev/null
 

Review status

Revision 32013-01-08 - JieChen

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 9 to 9
 

Goals of this page:

Changed:
<
<
This page provides you a generic way in analyzing CMS data (AOD/RECO data, Group skim ntuple, User skim ntuple) located at Tier-2s with bare root or CMSSW/FWLite environment, without downloading it to your institution cluster or your laptop.
>
>
This page provides you a generic Xrootd service in analyzing CMS data (AOD/RECO data, Group skim ntuple, User skim ntuple) located at any grid cite with bare root or CMSSW/FWLite environment, without downloading it to your institution cluster or your laptop.
 

Introduction to the Xrootd Service

Changed:
<
<
Xrootd is a new service provided under the Any Data, Any Time, Anywhere (AAA) Infrastructure of Open science grid and Worldwide LHC Computing Grid. AAA provides a new paradigm shift in data analysis from co-location of data and CPU to a dynamic model which analyzes remotely located data with local CPU. The Xrootd has been integrated with standalone ROOT. Interested users can consult CmsXrootdArchitecture for technical details on Xrootd implementation.
>
>
Xrootd (the eXtended ROOT file server daemon) is a new service provided under the Any Data, Any Time, Anywhere (AAA) Infrastructure of Open science grid and Worldwide LHC Computing Grid. AAA provides a new paradigm shift in data analysis from co-location of data and CPU to a dynamic model which analyzes remotely located data with local CPU. The Xrootd has been integrated with standalone ROOT. Interested users can consult CmsXrootdArchitecture for technical details on Xrootd implementation.
 
Changed:
<
<
At CMS, Xrootd service allows user to process the data located anywhere in CMS grid without downloading it to local desktop/laptop/computer cluster. You only need know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your group/user files at a T2 (/store/group/exotica/foo or /store/user/bbockelm/foo). You can easily get those LFNs through DAS(see WorkBook Chapter 5.4) service.
>
>
At CMS, Xrootd service allows user to process the data located anywhere in CMS grid without downloading it to local desktop/laptop/computer cluster. One only needs to know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your group/user files at a Tier-2 (/store/group/exotica/foo or /store/user/bbockelm/foo). You can easily get those LFNs through DAS (see WorkBook Chapter 5.4) service.
 
Changed:
<
<
This will be useful in remotely analyzing ntuples produced by group skim or user skim located at Tier-2 centers with bare ROOT, doing test job on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite.
>
>
This will be useful in remotely analyzing ntuples produced by group skim or user skim located at Tier-2 centers with bare ROOT, doing test job on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite. In future, this Xrootd service will be expanded to any grid cite(TIer-1, Tier-2 and Tier-3) and any file type.
 

Quick steps to analyze data located in remote Tier-2 sites.

Pre-requisite

Changed:
<
<
To use Xrootd, you only need to have your grid certificate ready (see Chapter 5 of the CMS workbook on how to get the certificate) To set up your grid environment,
>
>
To use Xrootd, you only need to have your grid certificate ready, i.e., you must have a grid certificate installed into ~/.globus and registered in CMS; see Chapter 5 of the CMS workbook for how to get the certificate. Set up your grid environment in the normal way.
  From CERN lxplus, source the following script:
Line: 44 to 44
 If you are using bare ROOT, you can open files in the xrootd service just like you would any other file:
Changed:
<
<
TFile::Open("root://xrootd.unl.edu//store/foo");
>
>
TFile *f = TFile::Open("root://xrootd.unl.edu//store/foo");
 
Changed:
<
<
note the prefix of the root://xrootd.unl.edu/ in front of your LFN.
>
>
Note the prefix root://xrootd.unl.edu/ (or any other Xrootd server name) in front of your LFN.
This returns a TFile object, and you can proceed normally. The same is true for the FWLite environment.
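Forming the URL passed to TFile::Open is just string concatenation of the redirector prefix and the LFN. Below is a minimal Python sketch of that mapping; the helper name xrootd_url is hypothetical and not part of ROOT or CMSSW, and the default redirector is the one used in the examples above:

```python
# Build the URL passed to TFile::Open from a CMS logical file name (LFN).
# The helper name and the default redirector argument are illustrative only.

def xrootd_url(lfn, redirector="xrootd.unl.edu"):
    """Prepend the Xrootd redirector prefix to an LFN."""
    if not lfn.startswith("/store/"):
        raise ValueError("CMS LFNs are expected to start with /store/")
    # Note the resulting double slash: root://<redirector>/ followed by
    # the LFN, which itself begins with a slash.
    return "root://{0}/{1}".format(redirector, lfn)

print(xrootd_url("/store/foo"))
# root://xrootd.unl.edu//store/foo
```

The same string works unchanged as a PoolSource file name or an xrdcp source argument.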

Open a file in CMSSW

Line: 77 to 77
 xrdcp root://xrootd.unl.edu//store/foo /some/local/path
Changed:
<
<
You will get a progress bar as the file downloads. You may also want to think about using the -R option, which allows you to recursively download a directory.
>
>
You may also want to think about using the -R option, which allows you to recursively download a directory.
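When many files need to be fetched, the xrdcp invocations can be assembled in a small script. The following Python sketch only builds the argument vectors; it does not run xrdcp itself (passing one of the returned lists to subprocess.call would perform the actual copy), and the function name is illustrative:

```python
# Assemble an xrdcp argument vector for a given LFN and destination.
# xrdcp itself is assumed to be on the PATH (it ships with stand-alone
# ROOT and CMSSW); this sketch only constructs the command line.

def xrdcp_command(lfn, dest, redirector="xrootd.unl.edu", recursive=False):
    cmd = ["xrdcp"]
    if recursive:
        cmd.append("-R")  # recursively download a directory
    cmd.append("root://{0}/{1}".format(redirector, lfn))
    cmd.append(dest)
    return cmd

print(" ".join(xrdcp_command("/store/foo", "/some/local/path")))
# xrdcp root://xrootd.unl.edu//store/foo /some/local/path
```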
 

Current Xrootd Service Status

Revision 2 - 2013-01-08 - JieChen

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->
Line: 9 to 9
 

Goals of this page:

Changed:
<
<
This page provides you a generic way in analyzing CMS data located at Tier-2s with bare root or CMSSW/FWLite environment, without downloading it to your institution cluster or your laptop.
>
>
This page provides a generic way of analyzing CMS data (AOD/RECO data, group skim ntuples, user skim ntuples) located at Tier-2s with bare ROOT or the CMSSW/FWLite environment, without downloading it to your institution's cluster or your laptop.
 

Introduction to the Xrootd Service

Xrootd is a new service provided under the Any Data, Any Time, Anywhere (AAA) Infrastructure of the Open Science Grid and the Worldwide LHC Computing Grid. AAA represents a paradigm shift in data analysis, from co-location of data and CPU to a dynamic model in which remotely located data are analyzed with local CPU. Xrootd has been integrated with standalone ROOT. Interested users can consult CmsXrootdArchitecture for technical details on the Xrootd implementation.

Changed:
<
<
At CMS, Xrootd service allows user to process the data located anywhere in CMS grid without downloading it to local desktop/laptop/computer cluster. You only need know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your user files at a T2 (/store/user/bbockelm/foo). You can easily get those LFNs through DAS(see WorkBook Chapter 5.4) service.
>
>
At CMS, the Xrootd service allows users to process data located anywhere in the CMS grid without downloading it to a local desktop/laptop/computer cluster. You only need to know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your group/user files at a T2 (/store/group/exotica/foo or /store/user/bbockelm/foo). You can easily get those LFNs through the DAS service (see WorkBook Chapter 5.4).
 
Changed:
<
<
This will be useful in remotely analyzing ntuples produced by group skim located at Tier-2 centers with bare ROOT, doing test job on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite.
>
>
This will be useful for remotely analyzing ntuples produced by group or user skims located at Tier-2 centers with bare ROOT, and for running test jobs on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite.
 

Quick steps to analyze data located in remote Tier-2 sites.

Line: 46 to 46
 
TFile::Open("root://xrootd.unl.edu//store/foo");
Changed:
<
<
>
>
Note the prefix root://xrootd.unl.edu/ in front of your LFN.
 This returns a TFile object, and you can proceed normally. Same is true for FWLite environment.

Open a file in CMSSW

Revision 1 - 2013-01-08 - JieChen

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WorkBook"

5.13 Using Xrootd Service for Remote Data Accessing

<!-- Enter an integer between 0 (zero) and 5 after the word COMPLETE, below, to indicate how complete the page/topic is; 0 means empty, 1 is for very incomplete, ascending to 5 for complete.  -->

Complete: 3
Detailed Review status

Goals of this page:

This page provides a generic way of analyzing CMS data located at Tier-2s with bare ROOT or the CMSSW/FWLite environment, without downloading it to your institution's cluster or your laptop.

Introduction to the Xrootd Service

Xrootd is a new service provided under the Any Data, Any Time, Anywhere (AAA) Infrastructure of the Open Science Grid and the Worldwide LHC Computing Grid. AAA represents a paradigm shift in data analysis, from co-location of data and CPU to a dynamic model in which remotely located data are analyzed with local CPU. Xrootd has been integrated with standalone ROOT. Interested users can consult CmsXrootdArchitecture for technical details on the Xrootd implementation.

At CMS, the Xrootd service allows users to process data located anywhere in the CMS grid without downloading it to a local desktop/laptop/computer cluster. You only need to know the Logical File Name (LFN) of a production dataset (/store/data/foo) or one of your user files at a T2 (/store/user/bbockelm/foo). You can easily get those LFNs through the DAS service (see WorkBook Chapter 5.4).

This will be useful for remotely analyzing ntuples produced by group skims located at Tier-2 centers with bare ROOT, and for running test jobs on selected events with AOD/RECO data located at Tier-2 sites with CMSSW/FWLite.

Quick steps to analyze data located in remote Tier-2 sites.

Pre-requisite

To use Xrootd, you only need to have your grid certificate ready (see Chapter 5 of the CMS workbook on how to get the certificate). To set up your grid environment:

From CERN lxplus, source the following script:

source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh

From FNAL lpc, source

source /uscmst1/prod/grid/gLite_SL5.csh

Now you're ready to analyze remote data with bare ROOT or CMSSW (FWLite) environment.

Open a file using ROOT

If you are using bare ROOT, you can open files in the xrootd service just like you would any other file:

TFile::Open("root://xrootd.unl.edu//store/foo");

This returns a TFile object, and you can proceed normally. Same is true for FWLite environment.

Open a file in CMSSW

You want to edit the PoolSource line in your python configuration file to point directly at the Xrootd service, instead of using a generic LFN.

For example, this might be the "before" picture:

process.source = cms.Source("PoolSource",
                            # replace 'myfile.root' with the source file you want to use
                            fileNames = cms.untracked.vstring('/store/foo')
                            )

Here's the same file, but accessed through the Xrootd Service by simply adding prefix root://xrootd.unl.edu/ :

process.source = cms.Source("PoolSource",
                            # replace 'myfile.root' with the source file you want to use
                            fileNames = cms.untracked.vstring('root://xrootd.unl.edu//store/foo')
                            )
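For configurations with many input files, the redirector prefix can be applied to the whole list before it is handed to cms.untracked.vstring. A plain-Python sketch of that step (FWCore is deliberately not imported here; only the list manipulation is shown, and the LFNs are placeholders):

```python
# Prefix every LFN with the Xrootd redirector; the resulting strings are
# what would be passed to cms.untracked.vstring(*file_list) in the config.

REDIRECTOR = "root://xrootd.unl.edu/"

lfns = [
    "/store/foo",
    "/store/bar",
]

file_list = [REDIRECTOR + lfn for lfn in lfns]
for name in file_list:
    print(name)
```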

File Download with command-line tools

If you would still like to have the file located locally, the Xrootd service also provides a command-line tool called xrdcp. This command-line utility ships with stand-alone ROOT and CMSSW, and provides a much easier way to copy a grid file than lcg-cp, srm-cp, or the FileMover utility.

xrdcp root://xrootd.unl.edu//store/foo /some/local/path

You will get a progress bar as the file downloads. You may also want to think about using the -R option, which allows you to recursively download a directory.

Current Xrootd Service Status

The service is in production. Below is the list of sites currently providing the Xrootd service.

  • US Region
    1. T1_US_FNAL (disk-only; includes test EOS service)
    2. T2_US_Caltech
    3. T2_US_Florida
    4. T2_US_Nebraska
    5. T2_US_Purdue
    6. T2_US_UCSD
    7. T2_US_Wisconsin
    8. T2_US_MIT
    9. T2_US_Vanderbilt
  • EU Region
    1. T1_CH_CERN (EOSCMS only)
    2. T2_IT_Bari
    3. T2_IT_Pisa
    4. T2_IT_Legnaro
    5. T2_DE_DESY
    6. T2_UK_*
    7. T2_EE_Estonia

The Xrootd service is also integrated with CRAB now. If your job has been queued for more than 12 hours, it may be run at additional US sites (currently, T2_US_Purdue, T2_US_Nebraska, T2_US_Wisconsin, or T2_US_UCSD), even if the files are not present at those sites. When run, the jobs will automatically switch to reading the data from remote sites. There is nothing that needs to be done on the user side to use this feature.

Review status

<!-- Add your review status in this table structure with 2 columns delineated by three vertical bars -->

Reviewer/Editor and Date (copy from screen) Comments
-- JieChen - 8 Jan 2013 modified the content slightly to be user friendly
-- JieChen - 7 Jan 2013 move the page from Brian Bockelman's HdfsXrootdto page to SWGuide format

Responsible: BrianBockelman JieChen
Last reviewed by: Most recent reviewer

 