Difference: HdfsXrootd (2 vs. 3)

Revision 3 2010-05-13 - BrianBockelman

Line: 1 to 1
Changed:
<
<
META TOPICPARENT name="BrianBockelmanSandbox"

CMS Xrootd Service

>
>
META TOPICPARENT name="Sandbox.BrianBockelmanSandbox"

USCMS Xrootd Service

 
Changed:
<
<
This document covers joining the CMS xrootd service based on the redirector xrootd.unl.edu. It was originally written for CMS HDFS sites, but any xrootd instance can join if it:
  • Exports a compliant xrootd protocol.
  • Is compatible with a Scalla cmsd daemon for redirection.
  • Provides security acceptable to CMS (i.e., GSI-based authentication).
The last two bullet points may limit the acceptable clients to only Scalla-based xrootd servers. We have integrated HDFS with Scalla xrootd by providing a custom integration module that implements a Scalla Xrootd OFS. Such a module can be written for any storage element that exports a reasonably stable C API.
>
>
This page covers the user and sysadmin aspects of the USCMS Xrootd Service.
 
Changed:
<
<
Any storage element that is POSIX-compliant (Lustre, GPFS, etc.) needs no customization: it can run the Scalla Xrootd daemon directly!
>
>
The USCMS Xrootd Service is an exploration into new data access techniques targeted toward end users. It has the following goals:
 
Changed:
<
<
The rest of the document assumes you are going to run the Scalla Xrootd server on top of HDFS.
>
>
  1. Ease-of-use: One should be able to use the service directly from ROOT. The intricacies of picking a site, and mapping the CMS file name to the site's file name should be hidden.
  2. Reliability: If a site or SE fails, the application should not fail. Rather, it should gracefully fail over.
  3. Efficient: CMSSW should be able to efficiently run analysis, even over transatlantic data streams.
  4. Global: A user should be
 
Changed:
<
<

Installation

>
>
We believe that these goals will greatly reduce the difficulty of data access for physicists on the small or medium scale.
 
Changed:
<
<
You must already have HDFS working and configured on the node. A FUSE mount is not needed; however, you should be able to use hadoop fs -put and hadoop fs -get to move files in and out of HDFS on the node. If the node is already a functioning HDFS GridFTP server, then it probably meets these requirements.
>
>
Note that we deliberately did not list scalability here: the existing infrastructure already scales just fine. We have no intention of replacing current CMS data access methods.
 
Changed:
<
<
First, add the OSG-Hadoop repository to the node (if it is not there already):
>
>

Topics:

 
Changed:
<
<
if [ ! -e /etc/yum.repos.d/osg-hadoop.repo ]; then
  rpm -Uvh http://vdt.cs.wisc.edu/hadoop/osg-hadoop-1-2.el5.noarch.rpm
fi
>
>
  1. Joining the USCMS Xrootd Service (sysadmins only)
  2. Using the USCMS Xrootd Service
 
Deleted:
<
<
Next, install the xrootd RPM from the unstable repo. This will add the xrootd user if it does not already exist - ROCKS users might want to create this user beforehand.
yum --enablerepo=hadoop-unstable install xrootd
The version should be at least 1.1-4.

If your CMS namespace is not truly trivial (i.e., if the CMS top-level directory in Hadoop is not /store), copy your storage.xml to /etc/xrootd/storage.xml. Make sure your storage.xml exports a hadoop protocol (which should provide the PFN relative to Hadoop; see Nebraska's TFC as inspiration if necessary).

Copy the template config file, /etc/xrootd/xrootd_sample.cfg, to /etc/xrootd/xrootd.cfg. If your site requires storage.xml, uncomment (and possibly update) the oss.namelib line.

Finally, create a copy of the host certs to be xrootd service certs:

mkdir -p /etc/grid-security/xrd
cp /etc/grid-security/hostcert.pem /etc/grid-security/xrd/xrdcert.pem
cp /etc/grid-security/hostkey.pem /etc/grid-security/xrd/xrdkey.pem
chown xrootd: -R /etc/grid-security/xrd
chmod 400 /etc/grid-security/xrd/xrdkey.pem # Yes, 400 is required
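
The result of the certificate steps above can be double-checked before starting the daemons. The following is only a minimal sketch: the helper names check_mode and check_owner are hypothetical (they are not part of the xrootd packaging), and it assumes GNU coreutils stat.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if FILE has exactly the expected octal mode.
# Assumes GNU coreutils stat (the -c option), as found on RHEL-like systems.
check_mode() {
  local file="$1" expected="$2" actual
  actual=$(stat -c '%a' "$file") || return 1
  if [ "$actual" != "$expected" ]; then
    echo "$file has mode $actual, expected $expected" >&2
    return 1
  fi
}

# Hypothetical helper: succeed only if FILE is owned by the given user.
check_owner() {
  local file="$1" expected="$2" actual
  actual=$(stat -c '%U' "$file") || return 1
  if [ "$actual" != "$expected" ]; then
    echo "$file is owned by $actual, expected $expected" >&2
    return 1
  fi
}

# Example usage (paths and user taken from the steps above):
#   check_mode  /etc/grid-security/xrd/xrdkey.pem 400
#   check_owner /etc/grid-security/xrd/xrdkey.pem xrootd
```

Both helpers print a message to stderr and return non-zero on a mismatch, so they can be chained with && in a site's own checks.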

Operating xrootd

There are two init services, xrootd and cmsd, which must both be working for the site to participate in the xrootd service:

service xrootd start
service cmsd start

Everything is controlled by a proper init script.

Log files are kept in /var/log/xrootd/{cmsd,xrootd}.log, and are auto-rotated.

After startup, the xrootd and cmsd daemons drop privilege to the xrootd user.

Port usage:

The following information is probably needed for sites with strict firewalls:
  • The xrootd server listens on TCP port 1094.
  • The cmsd server needs outgoing TCP port 1213 to xrootd.unl.edu.
  • Usage statistics are sent to xrootd.unl.edu on UDP ports 3333 and 3334.
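
For sites with strict firewalls, the two TCP requirements listed above can be probed from the command line. This is a rough sketch, not an official test: check_tcp is a hypothetical helper, and it assumes a bash built with /dev/tcp support and the coreutils timeout command. The UDP ports used for usage statistics cannot be verified this way.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if a TCP connection to HOST:PORT can be opened.
# Uses bash's built-in /dev/tcp redirection in a child bash, wrapped in
# coreutils `timeout` so a silently filtered port cannot hang the check.
check_tcp() {
  local host="$1" port="$2"
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage:
#   From the xrootd node, check the outgoing cmsd connection:
#     check_tcp xrootd.unl.edu 1213 && echo "cmsd port reachable"
#   From a machine outside the site, check the xrootd listener
#   (replace the hostname with your own server):
#     check_tcp local_hostname.example.com 1094 && echo "xrootd port reachable"
```

A failure only shows that the connection could not be opened within five seconds; it does not say whether the firewall or the daemon is at fault.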

Testing the install.

The newly installed server can be tested directly using:
xrdcp xroot://local_hostname.example.com//store/foo/bar /tmp/bar
You will need a grid certificate installed in your user account for the above to work.

You can then see if your server is participating properly in the xrootd service by checking:

xrdcp root://xrootd.unl.edu//store/foo/bar /tmp/bar2
Here, /store/foo/bar should be a file that exists only at your site.

Known Issues:

  • In 1.1-4, you need to make sure the line "HADOOP_CONF_DIR=/etc/hadoop" is in /etc/sysconfig/hadoop
  • In 1.1-4, most sites will want to change the log level for xrootd.log to be less verbose. Change the ofs.trace line to read ofs.trace none.
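
The two workarounds above can be applied in a repeatable way with a couple of small functions. This is a sketch under stated assumptions: ensure_line and quiet_ofs_trace are hypothetical helpers, GNU sed is assumed for the -i option, and the ofs.trace directive is assumed to be uncommented and to start at the beginning of its line.

```shell
#!/bin/bash
# Hypothetical helper: append LINE to FILE unless an identical line is already there.
ensure_line() {
  local line="$1" file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Hypothetical helper: rewrite every line that begins with "ofs.trace"
# followed by arguments so that it reads "ofs.trace none".
# Assumes GNU sed (in-place editing with -i).
quiet_ofs_trace() {
  local cfg="$1"
  sed -i 's/^ofs\.trace[[:space:]].*$/ofs.trace none/' "$cfg"
}

# Example usage (paths taken from this page):
#   ensure_line 'HADOOP_CONF_DIR=/etc/hadoop' /etc/sysconfig/hadoop
#   quiet_ofs_trace /etc/xrootd/xrootd.cfg
```

Running either function a second time leaves the file unchanged. The daemons presumably need to be restarted before the new settings take effect.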
 \ No newline at end of file
Added:
>
>
META TOPICMOVED by="bbockelm" date="1273756668" from="Sandbox.HdfsXrootd" to="Main.HdfsXrootd"
 