Xrootd Production and Integration

Unlike the current production data management system, the xrootd data access system is oriented toward the end user. We are concerned with maintaining both a satisfactory end-user experience and the stability of the system.

Toward this goal, we partition sites into production and integration. The integration testbed (ITB) allows us to send controlled tests to a participating site to help it stabilize operations without exposing it to chaotic user load. Once a site feels it is ready and passes our criteria, it will be allowed into the production infrastructure.

Xrootd Production Checklist

Before verification of your configuration (a one-time step), we suggest checking the following:

  • Allows remote access from public internet.
  • GSI-enabled authentication and authorization.
  • Works with xrdcp and ROOT.
  • Access is read-only.
  • Xrootd exports CMS namespace, not the site namespace.
  • In the CMS namespace: /store/test/xrootd/$SITENAME/(.*) must map to /store/$1. This allows us to query the redirector for a specific file and only get a response from hosts at $SITENAME, an important characteristic for our testing.
  • Participation in a regional redirector that is within 50 ms RTT on the network.
  • Consider the impact of a malicious (or uneducated) user. Do you feel you have comfortable controls for the following dimensions: site bandwidth utilized, IOPS, and namespace queries?
  • Understand what monitoring is available to watch the service at your site.
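
The test-namespace rule in the checklist can be illustrated with a short Python sketch. This is not the actual xrootd name-translation configuration; the function name map_test_path and the example file path are hypothetical, and the rule is applied literally as written above.

```python
import re
from typing import Optional


def map_test_path(path: str, sitename: str) -> Optional[str]:
    """Apply the rule /store/test/xrootd/$SITENAME/(.*) -> /store/$1.

    Returns the mapped path, or None when the path is addressed to a
    different site or is not a test path at all. In that case this
    site would not answer the redirector's query.
    """
    # re.escape guards against regex metacharacters in the site name.
    pattern = re.compile(
        r"^/store/test/xrootd/" + re.escape(sitename) + r"/(.*)$"
    )
    match = pattern.match(path)
    if match is None:
        return None
    return "/store/" + match.group(1)
```

For example, at T2_US_Nebraska a query for /store/test/xrootd/T2_US_Nebraska/data/file.root would resolve to /store/data/file.root, while the same query arriving at any other site would yield no match.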

Production Criteria

This checklist is meant to be evaluated periodically to determine which sites have achieved a sufficient level of stabilization to become a part of the production Xrootd architecture. We envision verifying the criteria once a month for all production sites; the transition from integration into production can happen whenever the site feels ready. Eventually, these items will be integrated into the normal site status board; currently, the list is primarily for the US regional redirector.

  1. At least three site xrootd servers at T2 sites and two servers for T3 sites. For T2s, the expected load should require two servers to handle; the third server is for redundancy. T3 sites should need one server to handle the load; the second is also for redundancy.
    • This requirement is so an end-user can expect reasonable performance when accessing official CMS data.
    • Each server ought to be a 4-core machine with at least 8 GB of RAM.
    • Fewer servers can be used for sites serving only a unique namespace, e.g., a T3 with a private namespace which doesn't conflict with the official CMS namespace.
  2. 95% availability in the redirector as measured by heartbeat tests.
    • The heartbeat tests frequently (approximately every 10 minutes) download a few bytes from a single, known file. This will be done directly against each known server (not through the redirector), and to the site via
  3. 95% availability in the random file tests.
    • The random file test will attempt to download a random file registered at the site in PhEDEx approximately once an hour. This will be done via the regional redirector.
  4. 95% success rate in xrootd JobRobot.
    • A CRAB task will be run approximately daily at one site in the region (T2_US_Nebraska for the US) on the JobRobot test utilizing files from the remote site via the redirector. Success rate is the percentage of successful jobs run that day.
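
As a rough illustration of how the 95% figures in criteria 2 through 4 could be computed, the Python sketch below treats each test run as a pass/fail result and reports the fraction that passed. The function name availability and the constant THRESHOLD are assumptions for illustration, and the page does not specify the window over which heartbeat or random file results are aggregated.

```python
from typing import Sequence

THRESHOLD = 0.95  # the 95% level used by criteria 2, 3 and 4


def availability(results: Sequence[bool]) -> float:
    """Return the fraction of test runs that succeeded.

    `results` holds one boolean per test run (True means success),
    e.g. one entry per heartbeat probe or per JobRobot job.
    """
    if not results:
        raise ValueError("no test results recorded")
    # True counts as 1 and False as 0 when summed.
    return sum(results) / len(results)
```

With a probe roughly every 10 minutes there are about 144 heartbeat results per day; if 140 of them succeed, availability is 140/144 (about 97%), which is above THRESHOLD. A JobRobot task with 18 successful jobs out of 20 gives 90%, which is below it.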

If one of these criteria is not currently measured (for example, the JobRobot test is estimated to be available 2 months after all other monitoring), then the site is excused from that criterion.
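
A minimal Python sketch of the overall evaluation follows, assuming each measured criterion is summarized as a fraction between 0 and 1 and an unmeasured criterion is recorded as None. The names meets_production_criteria, MIN_SERVERS, and the dictionary keys are hypothetical, and the allowance for fewer servers at sites serving only a unique namespace is deliberately not modeled.

```python
from typing import Dict, Optional

THRESHOLD = 0.95                   # 95% level for criteria 2-4
MIN_SERVERS = {"T2": 3, "T3": 2}   # criterion 1


def meets_production_criteria(
    tier: str,
    n_servers: int,
    rates: Dict[str, Optional[float]],
) -> bool:
    """Check a site against the production criteria.

    `rates` maps a criterion name to its measured fraction, or to
    None when that criterion is not currently measured.
    """
    if n_servers < MIN_SERVERS[tier]:
        return False
    for rate in rates.values():
        if rate is None:
            continue  # not measured yet: the site is excused
        if rate < THRESHOLD:
            return False
    return True
```

For instance, a T2 site with three servers, 97% heartbeat availability, 96% random file availability, and no JobRobot measurement yet would pass under this sketch, since the unmeasured criterion is skipped.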

Topic revision: r1 - 2011-02-04 - BrianBockelman
 