Design and Implementation of the next generation FTS, a distributed file transfer service.




Initially only gLite is involved. At some point in the time plan, storage elements within EMI, and possibly outside, will be required to provide feedback on their load, in order to allow FTS to efficiently schedule transfers.

Motivation and technical description

(This introduction is taken from "The gLite Transfer Service (FTS)".)

FTS is the gLite file transfer service. It performs file transfers on channels, and an FTS instance serves a configurable set of channels. A channel is a specific, possibly dedicated network pipe for transferring files between two sites. Every channel is unidirectional, so the two directions of a network link between two sites may intentionally be served by different FTS instances. Usually the receiving site configures the corresponding channel, as shown in Figure 4.

Production channels. These are typically dedicated Tier0-Tier1, Tier1-Tier1 or Tier1-Tier2 network pipes. Production transfer jobs run on these channels to ensure efficient bulk distribution. All files in a job must be assignable to the same channel; otherwise the job will not be assigned to a production channel.

Non-production channels. These are typically open networks shared with other applications. Jobs that cannot be assigned to a production channel may be serviced on the open network. Such channels are usually denoted by a 'Star'.

Over the past years, FTS has been continuously improved and adjusted to the needs of the WLCG production file delivery service. However, in the opinion of the FTS experts, further improvements are no longer possible without significant changes to the basic FTS design.
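The channel assignment rule described above can be illustrated with a minimal sketch. It is not taken from the FTS code base: the channel table, the channel names and the function assign_channel are hypothetical, and Python is used only for brevity.

```python
# Hypothetical channel table: (source site, destination site) -> channel name.
# The site pairs and channel names are illustrative assumptions.
PRODUCTION_CHANNELS = {
    ("CERN", "FZK"): "CERN-FZK",
    ("CERN", "RAL"): "CERN-RAL",
}
STAR_CHANNEL = "STAR"  # catch-all, non-production channel


def assign_channel(files):
    """Pick a channel for a job given as a list of (source, destination) site pairs.

    A job goes to a production channel only if every file maps to the same
    site pair and a production channel exists for that pair. Everything
    else falls back to the 'Star' channel.
    """
    pairs = set(files)
    if len(pairs) == 1:
        channel = PRODUCTION_CHANNELS.get(next(iter(pairs)))
        if channel is not None:
            return channel
    return STAR_CHANNEL
```

In this sketch, a job whose files all go from CERN to FZK is assigned to the production channel, while a job mixing two destinations, or using an unconfigured site pair, ends up on the 'Star' channel.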

Work plan

Design Phase [M16]
Over the coming months, the high-level design of the new service will be introduced, including a detailed plan of the new functionality. This will cover an improved FTS configuration system that allows remote configuration and persistence of the configured parameters. Currently, FTS is restricted to the Oracle database as its persistence backend. This interface will be generalized and an abstract database interface will be provided. The first implementation of the interface, however, will be for Oracle.
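One way to picture an abstract database interface for configuration persistence is sketched below. All names (ConfigStore, SqliteConfigStore, the config table) are hypothetical, and the standard-library sqlite3 module stands in for a real backend only so that the example runs without a database server; the plan above names Oracle as the first backend.

```python
import sqlite3
from abc import ABC, abstractmethod


class ConfigStore(ABC):
    """Hypothetical backend-neutral interface for persisted configuration."""

    @abstractmethod
    def set(self, key, value):
        """Store or overwrite a configuration parameter."""

    @abstractmethod
    def get(self, key, default=None):
        """Return the stored value, or default if the key is unknown."""


class SqliteConfigStore(ConfigStore):
    """Stand-in backend; an Oracle backend would implement the same interface."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS config (key TEXT PRIMARY KEY, value TEXT)"
        )

    def set(self, key, value):
        # The connection context manager commits on success, rolls back on error.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO config (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key, default=None):
        row = self._db.execute(
            "SELECT value FROM config WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else default
```

Code written against ConfigStore does not need to know which database is behind it, which is the point of generalizing the interface.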
First implementation and proof of concept [M20]
Instead of Java, the new FTS system will be implemented in C++. The channel mechanism described above will be removed completely, because it has proved too inflexible for the rapidly changing load of the network and the storage elements. Instead, performance information on the network and the connected storage elements will be taken into account. Initially, those parameters will have to be configured manually using the FTS administration interface. The interface to the different transfer logics (transfer protocols) will be abstracted and made easily pluggable. Initially the SRM and GridFTP protocols will be implemented, as they are the current workhorses of the entire WLCG data exchange. It is very likely that the HTTP(S) protocol will be provided as an additional plug-in.
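A pluggable protocol interface of the kind described above could look roughly like the following sketch. It is an illustration under assumptions, not the planned C++ design: the class names, the register decorator and the selection by URL scheme are all hypothetical, and the transfer methods are placeholders that perform no network I/O.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class TransferPlugin(ABC):
    """Hypothetical base class that every protocol plug-in implements."""

    schemes = ()  # URL schemes handled by the plug-in

    @abstractmethod
    def transfer(self, source, destination):
        """Copy source to destination and return a status string."""


_REGISTRY = {}


def register(plugin_cls):
    """Class decorator that makes a plug-in selectable by URL scheme."""
    for scheme in plugin_cls.schemes:
        _REGISTRY[scheme] = plugin_cls
    return plugin_cls


@register
class GridFtpPlugin(TransferPlugin):
    schemes = ("gsiftp",)

    def transfer(self, source, destination):
        return f"gridftp: {source} -> {destination}"  # placeholder only


@register
class SrmPlugin(TransferPlugin):
    schemes = ("srm",)

    def transfer(self, source, destination):
        return f"srm: {source} -> {destination}"  # placeholder only


def plugin_for(url):
    """Instantiate the plug-in registered for the scheme of the given URL."""
    scheme = urlparse(url).scheme
    try:
        return _REGISTRY[scheme]()
    except KeyError:
        raise ValueError(f"no transfer plug-in for scheme {scheme!r}") from None
```

With this layout, an additional protocol such as HTTP(S) would be added by registering one more class, without touching the code that selects and calls plug-ins.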
Prototype [M23]
The initial functional prototype will be deployed to evaluation facilities and tested extensively. The code is not yet meant to be used in any production environment.
Pilot Service [M30]
While the prototype is being hardened, some more features will be added. The transfer job queue will be based on the common EMI messaging infrastructure. Information from the different storage element endpoints, as well as from the network, will be obtained automatically through a defined interface to those entities. This information will be used to schedule and optimize transfers. At this point, the FTS developers expect to be confident that the software can be used in a limited pilot service infrastructure. The pilot service is planned to run in parallel to the current production service, which will be replaced gradually.
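How load feedback might drive scheduling can be sketched as follows. The policy shown (skip transfers that touch an overloaded endpoint, then start with the least loaded ones) is only one possible choice and is not specified in this plan. The function name, the load scale from 0.0 to 1.0, the threshold and the treatment of endpoints without a report are assumptions.

```python
def schedule(queue, load, max_load=0.8):
    """Order queued transfers by the load of their busiest endpoint.

    queue    -- list of (source, destination) storage element names
    load     -- reported load per storage element, 0.0 (idle) to 1.0 (saturated)
    max_load -- transfers touching an endpoint at or above this load are held back

    Endpoints without a load report are treated as idle (an assumption).
    """
    runnable = []
    for source, destination in queue:
        bottleneck = max(load.get(source, 0.0), load.get(destination, 0.0))
        if bottleneck < max_load:
            runnable.append((bottleneck, source, destination))
    # list.sort is stable, so transfers with equal load keep their queue order.
    runnable.sort(key=lambda item: item[0])
    return [(source, destination) for _, source, destination in runnable]
```

For example, with a reported load of 0.9 on one storage element, every queued transfer that reads from or writes to it stays in the queue until a later scheduling pass, while transfers between lightly loaded endpoints are started first.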
Risks and influence on Key Performance Indicators
During a WLCG workshop on the evolution of data management in WLCG [R60], alternative data management approaches were suggested which could replace coordinated data transfers altogether. If one of those approaches is approved and implemented, this objective will become superfluous, as only WLCG is using FTS. If, however, the design of the current data management system is kept, a redesign of the file transfer service is inevitable: the channel concept described above has limitations that will become painful with the increasing load of the overall system and with the opportunistic use of network and storage element resources, which FTS by design does not take into account.

This objective carries one of the highest risks of not being finished within the funded period of EMI, for two main reasons. First, the objective is demanding but was introduced late in the project, leaving only two years for design, implementation and deployment, while the total amount of effort does not increase. Second, the objective is to replace an existing system which, despite the deficiencies described above, has been operational for years. As with the introduction of the new EMI data client library in 4.16, a minimum level of acceptance is required to replace a system already in production. However, the work plan is split into small functional pieces, which significantly reduces that risk.

-- PatrickFuhrmann - 03-Aug-2011
