Difference: CmsXrootdArchitecture (1 vs. 54)

Revision 54 (2017-07-31) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 25 to 25
 
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd federation.
    • DPM - Instructions for a DPM site; link to the external DPM site. The main page still lacks some of the generic info, like contacts, but should be a good start!
      • More tuning hints - in order to get better overall performance, check out this page.
Added:
>
>
 

Revision 53 (2016-09-27) - RokasMaciulaitis

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 24 to 24
 
      • dCache - Xrootd proxy - Joining dCache to the global federation using a site proxy. Also applies to dCache sites whose LFN->PFN mapping is more complex than adding a prefix.
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd federation.
    • DPM - Instructions for a DPM site; link to the external DPM site. The main page still lacks some of the generic info, like contacts, but should be a good start!
Deleted:
<
<
      • CMS-specific pieces for the federation configuration are here.
 

Revision 52 (2016-06-23) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 132 to 132
 

Presentations and Workshops

  • Presentations:
Added:
>
>
 

Revision 51 (2015-08-24) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 33 to 33
 
Changed:
<
<
>
>
 

For Operators

Revision 50 (2015-03-27) - JohnArtieda

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 38 to 38
 

For Operators

Added:
>
>
 

Introduction

Revision 49 (2015-02-25) - MericTaze

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 35 to 35
 
Added:
>
>

For Operators

 

Introduction

CMS is exploring a new architecture for data access, emphasizing the following four items:

Line: 125 to 129
 
Added:
>
>

Presentations and Workshops

 

Project Deliverables and Milestones

Project timeline for the US region.

Line: 147 to 163
 
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"
META FILEATTACHMENT attachment="Regional_Xrootd_Regional_Redirect.png" attr="" comment="Diagram of xrootd usage when file is not in local region" date="1297450156" name="Regional_Xrootd_Regional_Redirect.png" path="Regional_Xrootd_Regional_Redirect.png" size="57592" stream="Regional_Xrootd_Regional_Redirect.png" tmpFilename="/usr/tmp/CGItemp58577" user="bbockelm" version="1"
META FILEATTACHMENT attachment="Regional_Xrootd.png" attr="" comment="Diagram of xrootd usage when file is in local region" date="1297450179" name="Regional_Xrootd.png" path="Regional_Xrootd.png" size="52925" stream="Regional_Xrootd.png" tmpFilename="/usr/tmp/CGItemp58792" user="bbockelm" version="1"
Added:
>
>
META FILEATTACHMENT attachment="ken-aaa_xrootd_150127.pdf" attr="" comment="" date="1424874516" name="ken-aaa_xrootd_150127.pdf" path="ken-aaa_xrootd_150127.pdf" size="2730616" user="mtaze" version="1"
META FILEATTACHMENT attachment="AAA_DPM-Federica.pdf" attr="" comment="" date="1424874516" name="AAA_DPM-Federica.pdf" path="AAA_DPM-Federica.pdf" size="818833" user="mtaze" version="1"
META FILEATTACHMENT attachment="ken-CHEP2013-paper.pdf" attr="" comment="" date="1424874516" name="ken-CHEP2013-paper.pdf" path="ken-CHEP2013-paper.pdf" size="1263020" user="mtaze" version="1"
META FILEATTACHMENT attachment="ken-osg-ahm-2014-aaa_140410.pdf" attr="" comment="" date="1424874517" name="ken-osg-ahm-2014-aaa_140410.pdf" path="ken-osg-ahm-2014-aaa_140410.pdf" size="2679544" user="mtaze" version="1"
META FILEATTACHMENT attachment="matevz-osg-ahm-2014-BeyondIoPatterns-FS14.pdf" attr="" comment="" date="1424874518" name="matevz-osg-ahm-2014-BeyondIoPatterns-FS14.pdf" path="matevz-osg-ahm-2014-BeyondIoPatterns-FS14.pdf" size="3907803" user="mtaze" version="1"

Revision 48 (2015-02-04) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 23 to 23
 
    • dCache - native Xrootd doors - Joining a dCache site to the global Xrootd federation.
      • dCache - Xrootd proxy - Joining dCache to the global federation using a site proxy. Also applies to dCache sites whose LFN->PFN mapping is more complex than adding a prefix.
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd federation.
Changed:
<
<
    • DPM - Instructions for a DPM site; link to the external DPM site. The instructions still lack the CMS-specific pieces, but should be a good start!
>
>
    • DPM - Instructions for a DPM site; link to the external DPM site. The main page still lacks some of the generic info, like contacts, but should be a good start!
      • CMS-specific pieces for the federation configuration are here.
      • More tuning hints - in order to get better overall performance, check out this page.
 

Revision 47 (2014-07-30) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 115 to 115
 Several of these improvements were implemented by others, but benefit us and are listed here.

Tests and Issues

Changed:
<
<
This page documents the tests we've performed.

We are also trying to document all the issues we observe with the xrootd-based system here: CmsXrootdIssues.

We record the CMSSW/ROOT I/O improvements needed here: CmsRootIoIssues.

This page documents the open scalability tests we've performed.

>
>

XRootD related

  • Tests for the Xrootd Demonstrator (back to 2010 initiative) we've performed are documented on this page.
  • We are also trying to document all the issues we observe with the xrootd-based system here: CmsXrootdIssues.
  • We record the CMSSW/ROOT I/O improvements needed here: CmsRootIoIssues.

XRootD-AAA related

 

Project Deliverables and Milestones

Revision 46 (2014-06-23) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 31 to 31
 
Changed:
<
<
>
>
 

Introduction

Revision 45 (2014-06-10) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 31 to 31
 
Added:
>
>
 

Introduction

Revision 44 (2014-05-13) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 26 to 26
 
    • DPM - Instructions for a DPM site; link to the external DPM site. The instructions still lack the CMS-specific pieces, but should be a good start!
  • Checklist for production sites. Requirements for a site to reach (and maintain) production status.
Changed:
<
<
>
>
 

Revision 43 (2014-04-08) - FedericaFanzago

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 121 to 121
  We record the CMSSW/ROOT I/O improvements needed here: CmsRootIoIssues.
Changed:
<
<
This page documents the scale tests we've performed.
>
>
This page documents the open scalability tests we've performed.
 

Project Deliverables and Milestones

Revision 42 (2014-04-08) - FedericaFanzago

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 121 to 121
  We record the CMSSW/ROOT I/O improvements needed here: CmsRootIoIssues.
Added:
>
>
This page documents the scale tests we've performed.
 

Project Deliverables and Milestones

Project timeline for the US region.

Revision 41 (2014-04-03) - MarianZvada

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"
Changed:
<
<

CMS Xrootd Architecture

>
>

CMS Xrootd Architecture

  This is the homepage for the Xrootd-based federations in CMS.
Added:
>
>
 

Documentation

For Users

Revision 40 (2014-02-06) - KenBloom

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 10 to 10
  We have the following user documentation available also:
Changed:
<
<
>
>
 

For Admins

Revision 39 (2013-09-17) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 18 to 18
 
Changed:
<
<
    • dCache - Joining a dCache site to the global Xrootd federation.
      • dCache < 1.9.12 - Joining an older version of dCache to the global federation. Also applies to dCache sites whose LFN->PFN mapping is more complex than adding a prefix.
>
>
    • dCache - native Xrootd doors - Joining a dCache site to the global Xrootd federation.
      • dCache - Xrootd proxy - Joining dCache to the global federation using a site proxy. Also applies to dCache sites whose LFN->PFN mapping is more complex than adding a prefix (see the toy mapping sketch after this list).
 
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd federation.
    • DPM - Instructions for a DPM site; link to the external DPM site. The instructions still lack the CMS-specific pieces, but should be a good start!
  • Checklist for production sites. Requirements for a site to reach (and maintain) production status.
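The LFN->PFN distinction above is a site-local rewrite of the experiment-wide logical file name. As a toy illustration (all paths and rules below are invented, not any site's real mapping), here is the difference between the simple prefix mapping and the more complex, rule-based case that needs a proxy or name-translation plugin:

```python
# Toy LFN->PFN mappings; paths and rules are illustrative only.

def pfn_by_prefix(lfn, prefix="/mnt/hadoop"):
    """Simple case: the PFN is the LFN with a site-local prefix prepended."""
    return prefix + lfn

def pfn_by_rules(lfn):
    """Complex case (like the dCache proxy scenario above): different
    LFN namespaces land in different local trees."""
    if lfn.startswith("/store/user/"):
        return "/dcache/users" + lfn[len("/store/user"):]
    return "/dcache/pool" + lfn

print(pfn_by_prefix("/store/data/run1.root"))    # /mnt/hadoop/store/data/run1.root
print(pfn_by_rules("/store/user/alice/x.root"))  # /dcache/users/alice/x.root
```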

Revision 38 (2013-02-27) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 28 to 28
 
Added:
>
>
 

Introduction

Revision 37 (2013-01-29) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 24 to 24
 
    • DPM - Instructions for a DPM site; link to the external DPM site. The instructions still lack the CMS-specific pieces, but should be a good start!
  • Checklist for production sites. Requirements for a site to reach (and maintain) production status.
Added:
>
>
 

Revision 36 (2012-09-14) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 15 to 15
 

For Admins

The following documentation is aimed at the sysadmins of CMS sites:

Changed:
<
<
>
>
 
  • How to integrate Xrootd into your site. Select the appropriate one for your SE technology.
    • HDFS - Joining a HDFS site to the global Xrootd federation.
    • dCache - Joining a dCache site to the global Xrootd federation.

Revision 35 (2012-09-07) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 21 to 21
 
    • dCache - Joining a dCache site to the global Xrootd federation.
      • dCache < 1.9.12 - Joining an older version of dCache to the global federation. Also applies to dCache sites whose LFN->PFN mapping is more complex than adding a prefix.
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd federation.
Added:
>
>
    • DPM - Instructions for a DPM site; link to the external DPM site. The instructions still lack the CMS-specific pieces, but should be a good start!
 

Revision 34 (2012-09-06) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 63 to 63
 

Notes for Project Staff

Changed:
<
<

US Participants

Production

>
>

Participating Sites

 
Added:
>
>
US:
 
  1. T1_US_FNAL
  2. T2_US_Nebraska
  3. T2_US_Caltech
Line: 77 to 76
 
  1. T2_US_Vanderbilt
  2. T2_US_Florida
  3. T3_US_FNALLPC
Added:
>
>
UK:
  1. T2_UK_London_IC
Italy:
  1. T2_IT_Legnaro
  2. T2_IT_Bari
  3. T2_IT_Pisa
Germany:
  1. T2_DE_DESY
Switzerland:
  1. CERN EOS
 

Improving CMSSW I/O

Revision 33 (2012-04-05) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 10 to 10
  We have the following user documentation available also:
Changed:
<
<
>
>
 

For Admins

Line: 21 to 21
 
    • dCache - Joining a dCache site to the global Xrootd federation.
      • dCache < 1.9.12 - Joining an older version of dCache to the global federation. Also applies to dCache sites whose LFN->PFN mapping is more complex than adding a prefix.
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd federation.
Deleted:
<
<
<!-- 
    • T3Xrootd - Configuration of Xrootd for USCMS T3 sites.
-->
 
Line: 70 to 67
 

Production

Added:
>
>
  1. T1_US_FNAL
 
  1. T2_US_Nebraska
  2. T2_US_Caltech
  3. T2_US_UCSD
Deleted:
<
<
  1. T3_US_FNALLPC

Integration

 
  1. T2_US_Purdue
  2. T2_US_Wisconsin
Added:
>
>
  1. T2_US_MIT
  2. T2_US_Vanderbilt
  3. T2_US_Florida
  4. T3_US_FNALLPC
 

Improving CMSSW I/O

Line: 95 to 93
 
  • Fix broken caching for Lumi and Run trees. Upcoming (4.2)
  • Addition of secondary cache for learning phase. Upcoming (4.2)
  • Validation of ROOT 5.26+ auto-clustering. Upcoming (4.2)
Changed:
<
<
Several of these improvements were implemented by others, but benefit the prototype and are listed here.
>
>
  • Validation of ROOT 5.32 TFile.Prefetching. Patches sent to ROOT - ROOT 5.34?
  • Allow limited backward seeks. Upcoming (5_2)
  • Combine read coalescing and vector reads. Upcoming (6_0)
  • Switch from TXNetFile to XrdAdaptor. Upcoming (6_0)
Several of these improvements were implemented by others, but benefit us and are listed here.
 

Tests and Issues

Revision 32 (2012-04-05) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 121 to 121
 
Added:
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 31 (2012-02-07) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 118 to 118
 
Changed:
<
<
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 30 (2012-01-11) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 118 to 118
 
Changed:
<
<
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 29 (2011-11-29) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 117 to 117
 
Added:
>
>
 

META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"

Revision 28 (2011-11-15) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 116 to 116
 
Added:
>
>
 

META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"

Revision 27 (2011-11-11) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 28 to 28
 
Added:
>
>
 

Introduction

Revision 26 (2011-11-02) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Changed:
<
<
CMS is exploring a new architecture for data access, emphasizing the following four items:
  • Reliability: The end-user should never see an I/O error or failure propagated up to their application unless no USCMS site can serve the file. Failures should be caught as early as possible and I/O retried or rerouted to a different site (possibly degrading the service slightly).
  • Transparency: All actions of the underlying system should be automatic for the user: catalog lookups, redirections, reconnections. There should not be a different workflow for accessing the data "close by" versus halfway around the world. This implies the system serves user requests almost instantly; opening files should be a "lightweight" operation.
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
  • Global: A CMS user should be able to get at any CMS file through the Xrootd service.

To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment. Note that we specifically did not put scalability here - we already have an existing infrastructure that scales just fine. We have no intention of replacing current CMS data access methods for production.

We believe that these goals will greatly reduce the difficulty of data access for physicists on the small or medium scale. This new architecture has four deliverables for CMS:

  1. A production-quality, global xrootd infrastructure.
  2. Fallback data access for jobs running at the T2.
  3. Interactive access for CMS physicists.
  4. A disk-free data access system for T3 sites.
>
>
This is the homepage for the Xrootd-based federations in CMS.
 

Documentation

Line: 30 to 17
 The following documentation is aimed at the sysadmins of CMS sites:

  • How to integrate Xrootd into your site. Select the appropriate one for your SE technology.
Changed:
<
<
    • HDFS - Joining a HDFS site to the global Xrootd cluster.
    • dCache - Joining a dCache site to the global Xrootd cluster.
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd cluster.
>
>
    • HDFS - Joining a HDFS site to the global Xrootd federation.
    • dCache - Joining a dCache site to the global Xrootd federation.
      • dCache < 1.9.12 - Joining an older version of dCache to the global federation. Also applies to dCache sites whose LFN->PFN mapping is more complex than adding a prefix.
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd federation.
 
<!-- 
    • T3Xrootd - Configuration of Xrootd for USCMS T3 sites.
-->
Line: 41 to 29
 
Added:
>
>

Introduction

CMS is exploring a new architecture for data access, emphasizing the following four items:

  • Reliability: The end-user should never see an I/O error or failure propagated up to their application unless no USCMS site can serve the file. Failures should be caught as early as possible and I/O retried or rerouted to a different site (possibly degrading the service slightly).
  • Transparency: All actions of the underlying system should be automatic for the user: catalog lookups, redirections, reconnections. There should not be a different workflow for accessing the data "close by" versus halfway around the world. This implies the system serves user requests almost instantly; opening files should be a "lightweight" operation.
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
  • Global: A CMS user should be able to get at any CMS file through the Xrootd service.

To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment. Note that we specifically did not put scalability here - we already have an existing infrastructure that scales just fine. We have no intention of replacing current CMS data access methods for production.

We believe that these goals will greatly reduce the difficulty of data access for physicists on the small or medium scale. This new architecture has four deliverables for CMS:

  1. A production-quality, global xrootd infrastructure.
  2. Fallback data access for jobs running at the T2.
  3. Interactive access for CMS physicists.
  4. A disk-free data access system for T3 sites.
 

Architecture

To explore the xrootd architecture, we put together a prototype for the WLCG, involving CMS sites worldwide and all the relevant storage technologies. This prototype wrapped up in January 2011, and we are moving to a regional redirector-based system. This injects another layer into the hierarchy which will make sure requests stay within a local network region if possible.

Revision 25 (2011-11-01) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 109 to 109
 
Added:
>
>
 

META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"

Revision 24 (2011-10-25) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 108 to 108
 
Changed:
<
<
>
>
 

META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"

Revision 23 (2011-05-18) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 108 to 108
 
Added:
>
>
 

META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"

Revision 22 (2011-04-27) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 107 to 107
 
Added:
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 21 (2011-04-13) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 106 to 106
 
Added:
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 20 (2011-03-16) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 105 to 105
 
Added:
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 19 (2011-03-09) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 30 to 30
 The following documentation is aimed at the sysadmins of CMS sites:

  • How to integrate Xrootd into your site. Select the appropriate one for your SE technology.
Changed:
<
<
    • HdfsXrootdInstall - Joining a HDFS site to the global Xrootd cluster.
    • DcacheXrootd - Joining a dCache site to the global Xrootd cluster.
    • PosixXrootd - Joining a POSIX site to the global Xrootd cluster.
>
>
    • HDFS - Joining a HDFS site to the global Xrootd cluster.
    • dCache - Joining a dCache site to the global Xrootd cluster.
    • POSIX - Joining a POSIX (Lustre, GPFS) site to the global Xrootd cluster.
<!--
 
    • T3Xrootd - Configuration of Xrootd for USCMS T3 sites.
Added:
>
>
-->
 
Added:
>
>
 

Architecture

To explore the xrootd architecture, we put together a prototype for the WLCG, involving CMS sites worldwide and all the relevant storage technologies. This prototype wrapped up in January 2011, and we are moving to a regional redirector-based system. This injects another layer into the hierarchy which will make sure requests stay within a local network region if possible.

Revision 18 (2011-03-06) - FrankWuerthwein

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 63 to 63
 
  1. T2_US_Nebraska
  2. T2_US_Caltech
Added:
>
>
  1. T2_US_UCSD
  2. T3_US_FNALLPC
 

Integration

Deleted:
<
<
  1. T2_US_UCSD
 
  1. T2_US_Purdue
  2. T2_US_Wisconsin
Deleted:
<
<
  1. T1_US_FNAL
 

Improving CMSSW I/O

Revision 17 (2011-03-02) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 101 to 101
 Project timeline for the US region.
Added:
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 16 (2011-02-11) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 17 to 17
 
  1. Interactive access for CMS physicists.
  2. A disk-free data access system for T3 sites.
Deleted:
<
<

Architecture

To explore the xrootd architecture, we put together a prototype for the WLCG, involving CMS sites worldwide and all the relevant storage technologies. This prototype wrapped up in January 2011, and we are moving to a regional redirector-based system. This injects another layer into the hierarchy which will make sure requests stay within a local network region if possible.

Envisioned Use Cases

Local-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is within the region. First (1), the user application attempts to open the file in the regional redirector. If the regional redirector does not know the file's location, it will then query all of the logged-in sites (2). In this diagram, Site A responds that it has the file, so the redirector redirects (3) the client to Site A's xrootd server. Finally, the client contacts Site A (4) and starts reading data (5). This is all implemented within the Xrootd client; no user interaction is necessary.

Regional Xrootd.png

Cross-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is not within the region. This proceeds as in the previous case, except all local sites respond that they do not have the file. Then, the regional redirector will contact the other regions (3); if the file location is not in cache, the other regional redirector will query its sites (4). In this example, the user is redirected to Site C (5) and successfully opens the file (6 and 7).

Regional Xrootd Regional Redirect.png

Fallback Access

In the prototype, most sites won't use Xrootd as their primary access method; instead, they will use it mainly as a fallback. The image below shows how the file access would work for such a site:

FallbackAccess.png

 

Documentation

For Users

Line: 56 to 38
 
Added:
>
>

Architecture

To explore the xrootd architecture, we put together a prototype for the WLCG, involving CMS sites worldwide and all the relevant storage technologies. This prototype wrapped up in January 2011, and we are moving to a regional redirector-based system. This injects another layer into the hierarchy which will make sure requests stay within a local network region if possible.

Local-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is within the region. First (1), the user application attempts to open the file in the regional redirector. If the regional redirector does not know the file's location, it will then query all of the logged-in sites (2). In this diagram, Site A responds that it has the file, so the redirector redirects (3) the client to Site A's xrootd server. Finally, the client contacts Site A (4) and starts reading data (5). This is all implemented within the Xrootd client; no user interaction is necessary.

Regional Xrootd.png

Cross-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is not within the region. This proceeds as in the previous case, except all local sites respond that they do not have the file. Then, the regional redirector will contact the other regions (3); if the file location is not in cache, the other regional redirector will query its sites (4). In this example, the user is redirected to Site C (5) and successfully opens the file (6 and 7).

Regional Xrootd Regional Redirect.png
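To make the two walkthroughs above concrete, here is a small Python sketch of the redirection logic alone (a toy model: the region/site tables and the query mechanics are invented stand-ins for what the real redirector machinery does):

```python
# Toy model of regional and cross-region redirection; all names are invented.

REGIONS = {
    "us": {"SiteA": {"/store/foo.root"}, "SiteB": set()},
    "eu": {"SiteC": {"/store/bar.root"}},
}

def query_region(region, lfn):
    """Step 2 (or 4): the redirector asks every logged-in site in a region."""
    for site, files in REGIONS[region].items():
        if lfn in files:
            return site
    return None

def open_file(local_region, lfn):
    """Step 1: the client asks its regional redirector first; if no local
    site has the file, the redirector consults the other regions (step 3)."""
    site = query_region(local_region, lfn)
    if site is None:
        for region in REGIONS:
            if region != local_region:
                site = query_region(region, lfn)
                if site:
                    break
    if site is None:
        raise FileNotFoundError(lfn)
    return "redirected to " + site   # the client then reads from this site

print(open_file("us", "/store/foo.root"))  # local-region case -> SiteA
print(open_file("us", "/store/bar.root"))  # cross-region case -> SiteC
```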

Fallback Access

In the prototype, most sites won't use Xrootd as their primary access method; instead, they will use it mainly as a fallback. The image below shows how the file access would work for such a site:

FallbackAccess.png
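From the application's point of view, both direct federation access and fallback reduce to opening a root:// URL against a redirector instead of a local path. A minimal PyROOT sketch (assumes ROOT built with XRootD support; the redirector hostname and file path are placeholders):

```python
import ROOT

# Placeholder redirector and LFN; a real job gets these from the site
# configuration rather than hard-coding them.
url = "root://xrootd-redirector.example.org//store/demo/file.root"

f = ROOT.TFile.Open(url)
if f and not f.IsZombie():
    f.ls()       # the open succeeded at whichever site the redirector chose
    f.Close()
else:
    print("open failed: no site in the federation could serve the file")
```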

 

Notes for Project Staff

US Participants

Revision 15 (2011-02-11) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 20 to 20
 

Architecture

To explore the xrootd architecture, we put together a prototype for the WLCG, involving CMS sites worldwide and all the relevant storage technologies. This prototype wrapped up in January 2011, and we are moving to a regional redirector-based system. This injects another layer into the hierarchy which will make sure requests stay within a local network region if possible.
Changed:
<
<
The prototype will initially focus on two use cases: end-user access and T3s. We feel it is important to focus on these as they are the "least represented" in the traditional CMS model.
>
>

Envisioned Use Cases

Local-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is within the region. First (1), the user application attempts to open the file in the regional redirector. If the regional redirector does not know the file's location, it will then query all of the logged-in sites (2). In this diagram, Site A responds that it has the file, so the redirector redirects (3) the client to Site A's xrootd server. Finally, the client contacts Site A (4) and starts reading data (5). This is all implemented within the Xrootd client; no user interaction is necessary.
 
Changed:
<
<
Additionally, the prototype will utilize a global redirector and invite as many T1/T2 sites to participate as possible. This way, we have a large amount of "interesting data" for users.
>
>
Regional Xrootd.png

Cross-region redirection

The image below shows the communication paths for a user application querying the regional redirector when the desired file is not within the region. This proceeds as in the previous case, except all local sites respond that they do not have the file. Then, the regional redirector will contact the other regions (3); if the file location is not in cache, the other regional redirector will query its sites (4). In this example, the user is redirected to Site C (5) and successfully opens the file (6 and 7).
 
Changed:
<
<
The image below shows the high-level architecture of how file access would work in the global architecture:

GlobalAccess.png

>
>
Regional Xrootd Regional Redirect.png
 
Added:
>
>

Fallback Access

 In the prototype, most sites won't use Xrootd as their primary access method; instead, they will use it mainly as a fallback. The image below shows how the file access would work for such a site:

FallbackAccess.png

Deleted:
<
<

Generating Usage

In order to generate constant usage of the architecture, the demonstrator group has the T3_US_Omaha site advertised in the BDII as "close to" several of the participating sites. This allows CRAB analysis jobs to land at Omaha and access data at Nebraska or Caltech. These jobs are accounted for in the CMS dashboard, allowing us to keep separate statistics just for these jobs (i.e., we can monitor error rates).

You can see the Xrootd cluster usage information here.

Participants

Currently, the following sites are participating in the global redirector and allow CRAB jobs to run at Omaha using data from their sites:

  1. T2_US_Nebraska.
  2. T2_US_Caltech.

The following sites are participating in the global redirector:

  1. T2_US_UCSD.
  2. T2_US_Purdue.
  3. T2_US_Wisconsin
  4. T2_IT_Bari.

The following sites are participating, but not yet fully integrated into the global redirector.

  1. T1_US_FNAL
  2. T3_CH_PSI
  3. T3_US_UCR

The following have shown interest in participating, but don't have the local xrootd infrastructure running (yet!):

  1. UK DPM sites
  2. T3_US_Cornell
  3. T2_CH_PSI
We are working with these sites to help solve the local requirements.

From our xrootd monitoring, 38 distinct server hostnames have participated in the global redirector.

 

Documentation

For Users

Line: 86 to 58
 

Notes for Project Staff

Added:
>
>

US Participants

Production

  1. T2_US_Nebraska
  2. T2_US_Caltech

Integration

  1. T2_US_UCSD
  2. T2_US_Purdue
  3. T2_US_Wisconsin
  4. T1_US_FNAL
 

Improving CMSSW I/O

CMSSW has traditionally been very sensitive to latency. In order to make remote streaming feasible, we have been working closely with the CMSSW and ROOT team to provide guidance and code to remove this sensitivity.

Line: 119 to 105
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"
Added:
>
>
META FILEATTACHMENT attachment="Regional_Xrootd_Regional_Redirect.png" attr="" comment="Diagram of xrootd usage when file is not in local region" date="1297450156" name="Regional_Xrootd_Regional_Redirect.png" path="Regional_Xrootd_Regional_Redirect.png" size="57592" stream="Regional_Xrootd_Regional_Redirect.png" tmpFilename="/usr/tmp/CGItemp58577" user="bbockelm" version="1"
META FILEATTACHMENT attachment="Regional_Xrootd.png" attr="" comment="Diagram of xrootd usage when file is in local region" date="1297450179" name="Regional_Xrootd.png" path="Regional_Xrootd.png" size="52925" stream="Regional_Xrootd.png" tmpFilename="/usr/tmp/CGItemp58792" user="bbockelm" version="1"

Revision 14 (2011-02-11) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 8 to 8
 
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
  • Global: A CMS user should be able to get at any CMS file through the Xrootd service.
Changed:
<
<
To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment.
>
>
To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment. Note that we specifically did not put scalability here - we already have an existing infrastructure that scales just fine. We have no intention of replacing current CMS data access methods for production.
 
Changed:
<
<
We believe that these goals will greatly reduce the difficulty of data access for physicists on the small or medium scale.
>
>
We believe that these goals will greatly reduce the difficulty of data access for physicists on the small or medium scale. This new architecture has four deliverables for CMS:
 
Changed:
<
<
Note that we specifically did not put scalability here - we already have an existing infrastructure that scales just fine. We have no intention of replacing current CMS data access methods.
>
>
  1. A production-quality, global xrootd infrastructure.
  2. Fallback data access for jobs running at the T2.
  3. Interactive access for CMS physicists.
  4. A disk-free data access system for T3 sites.
 
Changed:
<
<

Prototype

To explore the xrootd architecture, we are putting together a worldwide xrootd testbed. We hope this prototype will demonstrate that we are able to include all deployed SE technologies, learn potential usage patterns, and identify areas of work needed to bring the data access changes into production.
>
>

Architecture

To explore the xrootd architecture, we put together a prototype for the WLCG, involving CMS sites worldwide and all the relevant storage technologies. This prototype wrapped up in January 2011, and we are moving to a regional redirector-based system. This injects another layer into the hierarchy which will make sure requests stay within a local network region if possible.
  The prototype will initially focus on two use cases: end-user access and T3s. We feel it is important to focus on these as they are the "least represented" in the traditional CMS model.

Revision 13 (2011-02-10) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 112 to 112
  Project timeline for the US region.
Added:
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 12 (2011-02-09) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 111 to 111
 

Project Deliverables and Milestones

Project timeline for the US region.

Added:
>
>
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 11 (2011-02-08) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 78 to 78
 
Added:
>
>
 

Notes for Project Staff

Revision 10 (2011-02-08) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 98 to 98
 
  • Validation of ROOT 5.26+ auto-clustering. Upcoming (4.2)
Several of these improvements were implemented by others, but benefit the prototype and are listed here.
Changed:
<
<

Xrootd Tests and Issues

>
>

Tests and Issues

  This page documents the tests we've performed.

We are also trying to document all the issues we observe with the xrootd-based system here: CmsXrootdIssues.

Added:
>
>
We record the CMSSW/ROOT I/O improvements needed here: CmsRootIoIssues.
 

Project Deliverables and Milestones

Project timeline for the US region.

Revision 9 (2011-02-04) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 60 to 60
  From our xrootd monitoring, 38 distinct server hostnames have participated in the global redirector.
Changed:
<
<

Improving CMSSW I/O

>
>

Documentation

 
Changed:
<
<
CMSSW has traditionally been very sensitive to latency. In order to make remote streaming feasible, we have been working closely with the CMSSW and ROOT team to provide guidance and code to remove this sensitivity.
>
>

For Users

 
Changed:
<
<
The following is a list of changes:
  • ROOT TTreeCache functioning (some items landed in 3.3; true functionality was in 3.6).
    • Squashing accompanying memory leak
  • ROOT TTreeCache on by default; 3.7
  • Fix broken caching on RAW files. 3.8 and 3.9
  • Fallback protocols in CMSSW. Upcoming (3.9)
  • Xrootd stagein calls. Upcoming (3.9)
  • Removal of non-Event TTrees. Important for high-latency links. Upcoming (3.9)
Several of these improvements were implemented by others, but benefit the prototype and are listed here.
>
>
We have the following user documentation available also:
 
Changed:
<
<
This page documents the tests we've performed.
>
>
 
Changed:
<
<

Documentation

>
>

For Admins

  The following documentation is aimed at the sysadmins of CMS sites:
Added:
>
>
  • How to integrate Xrootd into your site. Select the appropriate one for your SE technology. (A minimal config sketch follows this list.)
 
  • HdfsXrootdInstall - Joining a HDFS site to the global Xrootd cluster.
  • DcacheXrootd - Joining a dCache site to the global Xrootd cluster.
  • PosixXrootd - Joining a POSIX site to the global Xrootd cluster.
  • T3Xrootd - Configuration of Xrootd for USCMS T3 sites.
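As a rough illustration of what the pages above converge on, a minimal XRootD server configuration for a POSIX-backed site might look like the sketch below. The hostname and export path are placeholders, and real federation configs add authentication, monitoring, and the LFN-to-PFN translation the per-SE pages cover:

```
# Illustrative xrootd config only; not any site's real setup.
all.role server
all.manager xrootd-redirector.example.org:1213   # subscribe to the redirector
all.export /store                                # namespace served to the federation
xrootd.port 1094
```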
Added:
>
>
 
Changed:
<
<
We have the following user documentation available also:
>
>

Notes for Project Staff

 
Changed:
<
<
>
>

Improving CMSSW I/O

CMSSW has traditionally been very sensitive to latency. In order to make remote streaming feasible, we have been working closely with the CMSSW and ROOT team to provide guidance and code to remove this sensitivity.

The following is a list of changes:

  • ROOT TTreeCache functioning (some items landed in 3.3; true functionality was in 3.6).
    • Squashing accompanying memory leak
  • ROOT TTreeCache on by default; Delivered in 3.7
  • Fix broken caching on RAW files. Delivered in 3.8 and 3.9
  • Fallback protocols in CMSSW. Delivered 3.9
  • Xrootd stagein calls. Delivered 3.9
  • Removal of non-Event TTrees. Important for high-latency links. Delivered 3.9
  • Fix broken caching for Lumi and Run trees. Upcoming (4.2)
  • Addition of secondary cache for learning phase. Upcoming (4.2)
  • Validation of ROOT 5.26+ auto-clustering. Upcoming (4.2)
Several of these improvements were implemented by others, but benefit the prototype and are listed here.
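Most of the items above center on ROOT's TTreeCache, which coalesces many small branch reads into large sequential or vector reads so that high-latency remote access stays efficient. A minimal PyROOT sketch of driving it by hand (the file URL and tree name are placeholders):

```python
import ROOT

f = ROOT.TFile.Open("root://xrootd-redirector.example.org//store/demo/file.root")
tree = f.Get("Events")               # placeholder tree name

tree.SetCacheSize(20 * 1024 * 1024)  # 20 MB TTreeCache
tree.AddBranchToCache("*", True)     # cache all branches (and subbranches)

for i in range(tree.GetEntries()):   # entry reads now hit the cache, which
    tree.GetEntry(i)                 # fetches data in large coalesced requests

f.Close()
```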

Xrootd Tests and Issues

This page documents the tests we've performed.

  We are also trying to document all the issues we observe with the xrootd-based system here: CmsXrootdIssues.
Added:
>
>

Project Deliverables and Milestones

Project timeline for the US region.

 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 8 (2010-10-17) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 80 to 80
  The following documentation is aimed at the sysadmins of CMS sites:
Changed:
<
<
  • HdfsXrootd - Joining a HDFS site to the global Xrootd cluster.
>
>
 
  • DcacheXrootd - Joining a dCache site to the global Xrootd cluster.
  • PosixXrootd - Joining a POSIX site to the global Xrootd cluster.
  • T3Xrootd - Configuration of Xrootd for USCMS T3 sites.

We have the following user documentation available also:

Changed:
<
<
  • HdfsXrootdUsage - How to utilize the current demonstrator infrastructure.
>
>
  We are also trying to document all the issues we observe with the xrootd-based system here: CmsXrootdIssues.

Revision 7 (2010-09-06) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 58 to 58
 
  1. T2_CH_PSI
We are working with these sites to help solve the local requirements.
Added:
>
>
From our xrootd monitoring, 38 distinct server hostnames have participated in the global redirector.
 

Improving CMSSW I/O

CMSSW has traditionally been very sensitive to latency. In order to make remote streaming feasible, we have been working closely with the CMSSW and ROOT team to provide guidance and code to remove this sensitivity.

Line: 72 to 74
 
  • Removal of non-Event TTrees. Important for high-latency links. Upcoming (3.9)
Several of these improvements were implemented by others, but benefit the prototype and are listed here.
Added:
>
>
This page documents the tests we've performed.
 

Documentation

The following documentation is aimed at the sysadmins of CMS sites:

Revision 6 (2010-08-31) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

CMS is exploring a new architecture for data access, emphasizing the following four items:

  • Reliability: The end-user should never see an I/O error or failure propagated up to their application unless no USCMS site can serve the file. Failures should be caught as early as possible and I/O retried or rerouted to a different site (possibly degrading the service slightly).
  • Transparency: All actions of the underlying system should be automatic for the user: catalog lookups, redirections, reconnections. There should not be a different workflow for accessing the data "close by" versus halfway around the world. This implies the system serves user requests almost instantly; opening files should be a "lightweight" operation.
Changed:
<
<
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
>
>
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
 
  • Global: A CMS user should be able to get at any CMS file through the Xrootd service.

To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment.

Line: 63 to 63
 CMSSW has traditionally been very sensitive to latency. In order to make remote streaming feasible, we have been working closely with the CMSSW and ROOT team to provide guidance and code to remove this sensitivity.

The following is a list of changes:

Changed:
<
<
  • ROOT TTreeCache functioning (some items landed in 3.3; true functionality was in 3.6).
>
>
  • ROOT TTreeCache functioning (some items landed in 3.3; true functionality was in 3.6).
 
    • Squashing accompanying memory leak
Changed:
<
<
  • ROOT TTreeCache on by default; 3.7
>
>
  • ROOT TTreeCache on by default; 3.7
 
  • Fix broken caching on RAW files. 3.8 and 3.9
  • Fallback protocols in CMSSW. Upcoming (3.9)
  • Xrootd stagein calls. Upcoming (3.9)

Revision 5 (2010-08-24) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Revision 4 (2010-08-24) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 29 to 29
  FallbackAccess.png
Added:
>
>

Generating Usage

In order to generate constant usage of the architecture, the demonstrator group has the T3_US_Omaha site advertised in the BDII as "close to" several of the participating sites. This allows CRAB analysis jobs to land at Omaha and access data at Nebraska or Caltech. These jobs are accounted for in the CMS dashboard, allowing us to keep separate statistics just for these jobs (i.e., we can monitor error rates).

You can see the Xrootd cluster usage information here.

 

Participants

Changed:
<
<
Currently, the following sites are participating:
>
>
Currently, the following sites are participating in the global redirector and allow CRAB jobs to run at Omaha using data from their sites:
 
  1. T2_US_Nebraska.
  2. T2_US_Caltech.
Added:
>
>
The following sites are participating in the global redirector:
 
  1. T2_US_UCSD.
  2. T2_US_Purdue.
Added:
>
>
  1. T2_US_Wisconsin
 
  1. T2_IT_Bari.
Changed:
<
<
  1. T1_US_FNAL - note: FNAL will not be fully integrated until they are done with their studies.
>
>
The following sites are participating, but not yet fully integrated into the global redirector.
  1. T1_US_FNAL
  2. T3_CH_PSI
  3. T3_US_UCR

The following have shown interest in participating, but don't have the local xrootd infrastructure running (yet!):

  1. UK DPM sites
  2. T3_US_Cornell
  3. T2_CH_PSI
We are working with these sites to help solve the local requirements.

Improving CMSSW I/O

CMSSW has traditionally been very sensitive to latency. In order to make remote streaming feasible, we have been working closely with the CMSSW and ROOT team to provide guidance and code to remove this sensitivity.

The following is a list of changes:

  • ROOT TTreeCache functioning (some items landed in 3.3; true functionality was in 3.6).
    • Squashing accompanying memory leak
  • ROOT TTreeCache on by default; 3.7
  • Fix broken caching on RAW files. 3.8 and 3.9
  • Fallback protocols in CMSSW. Upcoming (3.9)
  • Xrootd stagein calls. Upcoming (3.9)
  • Removal of non-Event TTrees. Important for high-latency links. Upcoming (3.9)
Several of these improvements were implemented by others, but benefit the prototype and are listed here.
 

Documentation

Revision 3 (2010-07-27) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

Line: 52 to 52
 
  • HdfsXrootdUsage - How to utilize the current demonstrator infrastructure.
Added:
>
>
We are also trying to document all the issues we observe with the xrootd-based system here: CmsXrootdIssues.
 
META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 2 (2010-07-26) - BrianBockelman

Line: 1 to 1
 
META TOPICPARENT name="BrianBockelman"
Deleted:
<
<
 

CMS Xrootd Architecture

CMS is exploring a new architecture for data access, emphasizing the following four items:

Line: 8 to 6
 
  • Reliability: The end-user should never see an I/O error or failure propagated up to their application unless no USCMS site can serve the file. Failures should be caught as early as possible and I/O retried or rerouted to a different site (possibly degrading the service slightly).
  • Transparency: All actions of the underlying system should be automatic for the user: catalog lookups, redirections, reconnections. There should not be a different workflow for accessing the data "close by" versus halfway around the world. This implies the system serves user requests almost instantly; opening files should be a "lightweight" operation.
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
Added:
>
>
  • Global: A CMS user should be able to get at any CMS file through the Xrootd service.
 To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment.
Added:
>
>
We believe that these goals will greatly reduce the difficulty of data access for physicists on the small or medium scale.

Note that we specifically did not put scalability here - we already have an existing infrastructure that scales just fine. We have no intention of replacing current CMS data access methods.

 

Prototype

To explore the xrootd architecture, we are putting together a worldwide xrootd testbed. We hope this prototype will demonstrate that we are able to include all deployed SE technologies, learn potential usage patterns, and identify areas of work needed to bring the data access changes into production.

The prototype will initially focus on two use cases: end-user access and T3s. We feel it is important to focus on these as they are the "least represented" in the traditional CMS model.

Added:
>
>
Additionally, the prototype will utilize a global redirector and invite as many T1/T2 sites to participate as possible. This way, we have a large amount of "interesting data" for users.

The image below shows the high-level architecture of how file access would work in the global architecture:

GlobalAccess.png

In the prototype, most sites won't use Xrootd as their primary access method; instead, they will use it mainly as a fallback. The image below shows how the file access would work for such a site:

FallbackAccess.png

Participants

Currently, the following sites are participating:

  1. T2_US_Nebraska.
  2. T2_US_Caltech.
  3. T2_US_UCSD.
  4. T2_US_Purdue.
  5. T2_IT_Bari.
  6. T1_US_FNAL - note: FNAL will not be fully integrated until they are done with their studies.
 

Documentation

Changed:
<
<
  • HdfsXrootd - Fledgling Xrootd service for CMS data.
>
>
The following documentation is aimed at the sysadmins of CMS sites:

  • HdfsXrootd - Joining a HDFS site to the global Xrootd cluster.
  • DcacheXrootd - Joining a dCache site to the global Xrootd cluster.
  • PosixXrootd - Joining a POSIX site to the global Xrootd cluster.
 
  • T3Xrootd - Configuration of Xrootd for USCMS T3 sites.
\ No newline at end of file
Added:
>
>
We have the following user documentation available also:

  • HdfsXrootdUsage - How to utilize the current demonstrator infrastructure.

META FILEATTACHMENT attachment="GlobalAccess.png" attr="" comment="" date="1280174175" name="GlobalAccess.png" path="GlobalAccess.png" size="82462" stream="GlobalAccess.png" tmpFilename="/usr/tmp/CGItemp38990" user="bbockelm" version="1"
META FILEATTACHMENT attachment="FallbackAccess.png" attr="" comment="" date="1280176254" name="FallbackAccess.png" path="FallbackAccess.png" size="28117" stream="FallbackAccess.png" tmpFilename="/usr/tmp/CGItemp39011" user="bbockelm" version="1"

Revision 1 (2010-07-12) - BrianBockelman

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="BrianBockelman"

CMS Xrootd Architecture

CMS is exploring a new architecture for data access, emphasizing the following three items:

  • Reliability: The end-user should never see an I/O error or failure propagated up to their application unless no USCMS site can serve the file. Failures should be caught as early as possible and I/O retried or rerouted to a different site (possibly degrading the service slightly).
  • Transparency: All actions of the underlying system should be automatic for the user: catalog lookups, redirections, reconnections. There should not be a different workflow for accessing the data "close by" versus halfway around the world. This implies the system serves user requests almost instantly; opening files should be a "lightweight" operation.
  • Usability: All CMS application frameworks (CMSSW, FWLite, bare ROOT) must natively integrate with any proposed solution. The proposed solution must not degrade the event processing rate significantly.
To achieve these goals, we will be pursuing a distributed architecture based upon the Xrootd protocol and software developed by SLAC. The proposed architecture is also similar to the current data management architecture of the ALICE experiment.

Prototype

To explore the xrootd architecture, we are putting together a worldwide xrootd testbed. We hope this prototype will demonstrate that we are able to include all deployed SE technologies, learn potential usage patterns, and identify areas of work needed to bring the data access changes into production.

The prototype will initially focus on two use cases: end-user access and T3s. We feel it is important to focus on these as they are the "least represented" in the traditional CMS model.

Documentation

  • HdfsXrootd - Fledgling Xrootd service for CMS data.
  • T3Xrootd - Configuration of Xrootd for USCMS T3 sites.
 