CMS Xrootd Architecture

This is the homepage for the Xrootd-based federations in CMS.


For Users

The following user documentation is available:

For Admins

The following documentation is aimed at the sysadmins of CMS sites:

For Operators


CMS is exploring a new architecture for data access, emphasizing the following four items:

  • Reliability: The end-user should never see an I/O error or failure propagated up to their application unless no USCMS site can serve the file. Failures should be caught as early as possible and I/O retried or rerouted to a different site (possibly degrading the service slightly).
  • Transparency: All actions of the underlying system (catalog lookups, redirections, reconnections) should be automatic for the user. There should not be a different workflow for accessing data "close by" versus halfway around the world. This implies the system serves user requests almost instantly; opening files should be a "lightweight" operation.
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
  • Global: A CMS user should be able to get at any CMS file through the Xrootd service.
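The reliability goal above (catch failures early, reroute to another site, and surface an error only when no site can serve the file) can be illustrated with a minimal Python sketch. This is not the Xrootd client's implementation; `open_with_failover`, `AllSitesFailed`, and `fake_open` are hypothetical names introduced only for this example.

```python
from typing import Callable, Iterable


class AllSitesFailed(IOError):
    """Raised only when no site could serve the file."""


def open_with_failover(lfn: str, sites: Iterable[str],
                       open_at: Callable[[str, str], bytes]) -> bytes:
    """Try each site in turn; surface an error only if every site fails."""
    errors = {}
    for site in sites:
        try:
            return open_at(site, lfn)
        except IOError as exc:
            # Catch the failure early and reroute to the next site.
            errors[site] = str(exc)
    raise AllSitesFailed(f"no site could serve {lfn}: {errors}")


def fake_open(site: str, lfn: str) -> bytes:
    # Hypothetical stand-in for a remote open; only site "B" has the file.
    if site != "B":
        raise IOError(f"{site}: file not available")
    return b"event data from " + site.encode()


data = open_with_failover("/store/x.root", ["A", "B", "C"], fake_open)
```

In this sketch the failure at site "A" never reaches the caller; the application sees an exception only when the list of sites is exhausted.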

To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment. Note that we specifically did not list scalability here: we already have an existing infrastructure that scales just fine. We do not intend to replace the current CMS data access methods for production.

We believe that meeting these goals will greatly reduce the difficulty of data access for physicists working at small or medium scale. This new architecture has four deliverables for CMS:

  1. A production-quality, global xrootd infrastructure.
  2. Fallback data access for jobs running at the T2.
  3. Interactive access for CMS physicists.
  4. A disk-free data access system for T3 sites.


To explore the xrootd architecture, we put together a prototype for the WLCG, involving CMS sites worldwide and all the relevant storage technologies. This prototype wrapped up in January 2011, and we are moving to a regional redirector-based system. This adds another layer to the hierarchy, which ensures that requests stay within a local network region when possible.

Local-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is within the region. First (1), the user application attempts to open the file in the regional redirector. If the regional redirector does not know the file's location, it will then query all of the logged-in sites (2). In this diagram, Site A responds that it has the file, so the redirector redirects (3) the client to Site A's xrootd server. Finally, the client contacts Site A (4) and starts reading data (5). This is all implemented within the Xrootd client; no user interaction is necessary.

Regional Xrootd.png
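The five steps of the local-region case can be sketched as a small Python simulation. This is a toy model of the message flow only, not the Xrootd protocol or client; the `Site` and `RegionalRedirector` classes, their methods, and the file name are assumptions made for illustration.

```python
class Site:
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)

    def has_file(self, lfn):
        return lfn in self.files

    def read(self, lfn):
        return f"data for {lfn} from {self.name}"


class RegionalRedirector:
    def __init__(self, sites):
        self.sites = list(sites)   # sites currently logged in
        self.locations = {}        # file name -> Site, filled in as files are located

    def locate(self, lfn):
        # (1) the client asks the redirector to open the file
        site = self.locations.get(lfn)
        if site is None:
            # (2) location unknown: query every logged-in site
            for candidate in self.sites:
                if candidate.has_file(lfn):
                    site = candidate
                    self.locations[lfn] = site
                    break
        if site is None:
            raise FileNotFoundError(lfn)
        # (3) redirect the client to the site that has the file
        return site


def client_open(redirector, lfn):
    site = redirector.locate(lfn)   # steps (1) to (3)
    return site.read(lfn)           # (4) contact the site, (5) read data


site_a = Site("Site A", ["/store/data/file.root"])
site_b = Site("Site B", [])
redirector = RegionalRedirector([site_b, site_a])
result = client_open(redirector, "/store/data/file.root")
```

The caller only invokes `client_open`; the lookup and redirection happen inside it, mirroring the point that no user interaction is needed.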

Cross-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is not within the region. This proceeds as in the previous case, except that all local sites respond that they do not have the file. Then, the regional redirector will contact the other regions (3); if the file location is not in its cache, the other regional redirector will query its sites (4). In this example, the user is redirected to Site C (5) and successfully opens the file (6 and 7).

Regional Xrootd Regional Redirect.png
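The cross-region lookup can be sketched in Python in the same spirit. This is a simplified model under stated assumptions: the `Region` class, the region names "US" and "EU", and the dictionary-based location cache are hypothetical and do not describe the real redirector internals.

```python
class Region:
    def __init__(self, name, site_files):
        self.name = name
        self.site_files = site_files   # site name -> set of file names
        self.cache = {}                # file name -> site name
        self.peers = []                # other regional redirectors

    def query_sites(self, lfn):
        """Check the location cache, then ask this region's own sites."""
        if lfn in self.cache:
            return self.cache[lfn]
        for site, files in self.site_files.items():
            if lfn in files:
                self.cache[lfn] = site
                return site
        return None

    def locate(self, lfn):
        """Return (region name, site name) for the file."""
        site = self.query_sites(lfn)         # (1)-(2) local lookup
        if site is not None:
            return self.name, site
        for peer in self.peers:              # (3) contact the other regions
            site = peer.query_sites(lfn)     # (4) peer checks cache, then its sites
            if site is not None:
                return peer.name, site       # (5) the client is redirected here
        raise FileNotFoundError(lfn)


us = Region("US", {"Site A": set(), "Site B": set()})
eu = Region("EU", {"Site C": {"/store/f.root"}})
us.peers.append(eu)
location = us.locate("/store/f.root")
```

After the first lookup, the remote region has the location cached, so a repeated request does not need to query its sites again.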

Fallback Access

In the prototype, most sites won't use Xrootd as their primary method; instead, they will use it primarily as a fallback. The image below shows how the file access would work for such a site:
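As a rough illustration of the fallback idea, the Python sketch below tries the site's primary storage first and turns to Xrootd only when that open fails. It does not show how CMSSW configures fallback; `open_with_fallback`, `make_opener`, and the in-memory stores are hypothetical stand-ins.

```python
def make_opener(store):
    """Build a toy 'open' function backed by a dict of file name -> contents."""
    def opener(lfn):
        if lfn not in store:
            raise FileNotFoundError(lfn)
        return store[lfn]
    return opener


def open_with_fallback(lfn, open_local, open_xrootd):
    """Use the site's primary storage first; fall back to Xrootd on failure."""
    try:
        return open_local(lfn), "local"
    except OSError:
        # The local open failed (for example, the file is missing from
        # site storage), so retry through the Xrootd federation.
        return open_xrootd(lfn), "xrootd"


local = make_opener({"/store/a.root": b"A"})
remote = make_opener({"/store/a.root": b"A", "/store/b.root": b"B"})

first = open_with_fallback("/store/a.root", local, remote)
second = open_with_fallback("/store/b.root", local, remote)
```

Here the second file is absent locally, so it is served through the fallback path while the first is still read from local storage.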


Notes for Project Staff

Participating Sites


  1. T1_US_FNAL
  2. T2_US_Nebraska
  3. T2_US_Caltech
  4. T2_US_UCSD
  5. T2_US_Purdue
  6. T2_US_Wisconsin
  7. T2_US_MIT
  8. T2_US_Vanderbilt
  9. T2_US_Florida
  10. T2_UK_London_IC
  11. T2_IT_Legnaro
  12. T2_IT_Bari
  13. T2_IT_Pisa
  14. T2_DE_DESY

Improving CMSSW I/O

CMSSW has traditionally been very sensitive to latency. To make remote streaming feasible, we have been working closely with the CMSSW and ROOT teams to provide guidance and code to remove this sensitivity.

The following is a list of changes:

  • ROOT TTreeCache functioning (some items landed in 3.3; true functionality was in 3.6).
    • Squashing accompanying memory leak
  • ROOT TTreeCache on by default. Delivered in 3.7
  • Fix broken caching on RAW files. Delivered in 3.8 and 3.9
  • Fallback protocols in CMSSW. Delivered 3.9
  • Xrootd stagein calls. Delivered 3.9
  • Removal of non-Event TTrees. Important for high-latency links. Delivered 3.9
  • Fix broken caching for Lumi and Run trees. Upcoming (4.2)
  • Addition of secondary cache for learning phase. Upcoming (4.2)
  • Validation of ROOT 5.26+ auto-clustering. Upcoming (4.2)
  • Validation of ROOT 5.32 TFile.Prefetching. Patches sent to ROOT - ROOT 5.34?
  • Allow limited backward seeks. Upcoming (5_2)
  • Combine read coalescing and vector reads. Upcoming (6_0)
  • Switch from TXNetFile to XrdAdaptor. Upcoming (6_0)

Several of these improvements were implemented by others but benefit us, so they are listed here.
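To make the "read coalescing" item concrete, here is a minimal Python sketch of the general technique: nearby byte-range requests are merged so that fewer requests cross a high-latency link. It is not the CMSSW or ROOT implementation; the function name `coalesce_reads` and the `max_gap` parameter are assumptions for illustration.

```python
def coalesce_reads(requests, max_gap=0):
    """Merge (offset, length) reads whose gaps are at most max_gap bytes."""
    merged = []  # list of [start, end) ranges, kept sorted by start
    for offset, length in sorted(requests):
        end = offset + length
        if merged and offset - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


reads = [(400, 10), (0, 100), (100, 50)]
tight = coalesce_reads(reads)               # merge only touching ranges
loose = coalesce_reads(reads, max_gap=300)  # also bridge gaps up to 300 bytes
```

Each separate request pays at least one network round trip, so merging three reads into one or two trades a little extra transferred data for less time spent waiting. The merged ranges could then be sent together in a single vector read.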

Tests and Issues

XRootD related

  • The tests we performed for the Xrootd Demonstrator (an initiative dating back to 2010) are documented on this page.
  • We are also trying to document all the issues we observe with the xrootd-based system here: CmsXrootdIssues.
  • We record the CMSSW/ROOT I/O improvements needed here: CmsRootIoIssues.

XRootD-AAA related

Presentations and Workshops

Project Deliverables and Milestones

Project timeline for the US region.

Topic attachments
  • AAA_DPM-Federica.pdf (r1, 799.6 K, 2015-02-25 15:28, MericTaze)
  • FallbackAccess.png (r1, 27.5 K, 2010-07-26 22:30, BrianBockelman)
  • GlobalAccess.png (r1, 80.5 K, 2010-07-26 21:56, BrianBockelman)
  • Regional_Xrootd.png (r1, 51.7 K, 2011-02-11 19:49, BrianBockelman): Diagram of xrootd usage when file is in local region
  • Regional_Xrootd_Regional_Redirect.png (r1, 56.2 K, 2011-02-11 19:49, BrianBockelman): Diagram of xrootd usage when file is not in local region
  • ken-CHEP2013-paper.pdf (r1, 1233.4 K, 2015-02-25 15:28, MericTaze)
  • ken-aaa_xrootd_150127.pdf (r1, 2666.6 K, 2015-02-25 15:28, MericTaze)
  • ken-osg-ahm-2014-aaa_140410.pdf (r1, 2616.7 K, 2015-02-25 15:28, MericTaze)
  • matevz-osg-ahm-2014-BeyondIoPatterns-FS14.pdf (r1, 3816.2 K, 2015-02-25 15:28, MericTaze)

Topic revision: r50 - 2015-03-27 - JohnArtieda