Performance studies and improvements of CMS Distributed Data Transfers

CMS computing needs reliable, stable and fast connections among its multi-tiered computing infrastructures. For data distribution, the CMS experiment relies on the File Transfer Service (FTS), a low-level data movement service responsible for moving sets of files from one site to another while allowing participating sites to control network resource usage. FTS servers are provided by the Tier-0 and Tier-1 centers and used by all CMS computing sites, subject to the setup policies established by CMS and by the sites. The servers are shared by all the virtual organizations (VOs) using Grid resources at the site and must be dimensioned to satisfy all their requirements. Managing the service efficiently requires good knowledge of CMS needs on every kind of transfer route, and of the sharing and interference with other VOs using the same FTS transfer managers. This contribution presents a complete revision of all FTS servers used by CMS, customizing the topologies and improving their setup so that CMS keeps transferring data at the desired levels in a reliable and robust way. It also presents complete performance studies for every kind of transfer route, including measurements of the overheads introduced by SRM servers and storage systems, detection of FTS server misconfigurations, identification of congested channels, historical per-stream transfer throughputs for site-to-site comparisons, and file-latency studies, among others. This information is retrieved directly from the FTS servers through the FTS Monitor web pages and archived for further analysis. The project provides a monitoring interface for all these values.
We show measurements, problems and improvements for CMS sites connected to the LHCOPN, where differences of up to a factor of 100 are visible; continuous performance measurements of data flowing from the Tier-0 to the Tier-1s; comparisons with other existing monitoring tools (perfSONAR, LHCOPN dashboard); and the use of the graphical interface to understand, among other things, the effects on sites of connecting to the LHCONE network. Given the multi-VO added value of this tool, this work is serving as a reference for building the WLCG FTS monitoring tool, which will be based on the FTS messaging system.
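
As an illustration of two of the quantities named above, per-stream throughput and the overhead added around the network transfer, the following is a minimal sketch in Python. It is not the project's implementation: the record fields (`size_bytes`, `streams`, `total_s`, `network_s`), the `summarize` function and the site names are assumptions made for this example, and the real FTS Monitor pages may expose different fields. Per-stream throughput is taken here as bytes moved divided by the summed (streams x network time), and the overhead fraction as the share of wall time not spent moving bytes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Transfer:
    """One archived file transfer (hypothetical record layout)."""
    source: str
    dest: str
    size_bytes: int
    streams: int        # number of parallel streams used for the file
    total_s: float      # wall time, including storage/SRM preparation
    network_s: float    # time spent actually moving bytes


def summarize(transfers):
    """Aggregate per-route throughput per stream and overhead fraction."""
    acc = defaultdict(lambda: {"bytes": 0, "stream_s": 0.0,
                               "total_s": 0.0, "network_s": 0.0})
    for t in transfers:
        a = acc[(t.source, t.dest)]
        a["bytes"] += t.size_bytes
        a["stream_s"] += t.streams * t.network_s
        a["total_s"] += t.total_s
        a["network_s"] += t.network_s

    summary = {}
    for route, a in acc.items():
        if a["stream_s"] == 0 or a["total_s"] == 0:
            continue  # skip routes with no usable timing information
        summary[route] = {
            # megabytes per second carried by a single stream, on average
            "mb_per_s_per_stream": a["bytes"] / 1e6 / a["stream_s"],
            # share of wall time spent outside the network transfer
            "overhead_fraction": 1.0 - a["network_s"] / a["total_s"],
        }
    return summary


if __name__ == "__main__":
    # Illustrative records only; not measured data.
    sample = [
        Transfer("T0_CH_CERN", "T1_ES_PIC", 1_000_000_000, 5, 25.0, 20.0),
        Transfer("T0_CH_CERN", "T1_ES_PIC", 500_000_000, 5, 15.0, 10.0),
    ]
    for route, stats in summarize(sample).items():
        print(route, stats)
```

Comparing `mb_per_s_per_stream` between routes, or over time for one route, is one way to spot the kind of site-to-site differences and congested channels discussed above; a high `overhead_fraction` would point at the storage or SRM side instead of the network.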

-- NicoloMagini - 27-Oct-2011
