-- Main.atsareg - 21 Nov 2005
LHCb/LCG Task Force page
List of the LHCb Baseline service requirements
Workload Management services
| Item | Priority |
| Stable and redundant Resource Broker (RB) service | High |
| A list of the RBs available to the VO should be defined, and an easy or transparent mechanism for switching from one RB to another should be provided | High |
| A single RB end-point should be provided, with automatic load balancing across the RB services behind it | Medium |
| No jobs or job results should be lost due to temporary unavailability of an RB service. The current estimate of the load on a combined RB service is ~10^5 jobs per day, or a ~1 Hz submission rate | Medium |
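The RB switching requirement can be illustrated on the client side. The following is a minimal sketch, not an existing LCG tool: `submit_with_failover`, `SubmissionError` and the `submit` callable are hypothetical names, and it assumes that an unavailable RB shows up as a `ConnectionError`. The helper `submission_rate_hz` only checks the arithmetic behind the load estimate.

```python
from typing import Callable, Sequence


class SubmissionError(Exception):
    """Raised when no Resource Broker in the list accepted the job."""


def submit_with_failover(
    job: str,
    brokers: Sequence[str],
    submit: Callable[[str, str], str],
) -> str:
    """Try each RB end-point in order and return the first job ID obtained.

    `submit(broker, job)` is a hypothetical callable standing in for the real
    submission client; it is assumed to raise ConnectionError when the broker
    is temporarily unavailable.
    """
    failures = []
    for broker in brokers:
        try:
            return submit(broker, job)
        except ConnectionError as exc:
            failures.append(f"{broker}: {exc}")  # remember why, then move on
    raise SubmissionError("all RBs failed: " + "; ".join(failures))


def submission_rate_hz(jobs_per_day: float) -> float:
    """Convert a daily job count into an average submission rate in Hz."""
    return jobs_per_day / 86_400  # seconds per day
```

With 10^5 jobs per day, `submission_rate_hz(1e5)` gives roughly 1.16, which matches the quoted ~1 Hz. A job is only reported as lost here once every RB in the list has refused it.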
Computing Element
| Item | Priority |
| A Computing Element (CE) service should be provided that allows direct access in order to: get the status of the computing resource and, in particular, the number of waiting/running tasks for a given VO; submit, monitor and manipulate jobs through the CE service interface | High |
| Allow running special jobs (Agents) on a worker node that can steer the execution of jobs belonging to other users on the same worker node | High |
| A mechanism should be provided to change the identity of a job running on the worker node: the site policy service is interrogated for permission to run a job of a particular user; if the answer is positive, the new user proxy is acquired from the VO service for subsequent job operations; the Agent job continues even after the user job has finished executing | High |
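The identity-change sequence for Agent jobs can be sketched as a simple loop. This is an illustration under stated assumptions, not the interface of any existing service: `run_agent`, `site_allows`, `fetch_proxy` and `execute` are hypothetical names that stand in for the site policy service, the VO proxy service and the job launcher.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def run_agent(
    queue: Iterable[Tuple[str, str]],
    site_allows: Callable[[str], bool],
    fetch_proxy: Callable[[str], Optional[str]],
    execute: Callable[[str, str], int],
) -> List[Tuple[str, str, str]]:
    """Process (user, job) pairs one by one and report what happened to each."""
    report = []
    for user, job in queue:
        if not site_allows(user):          # 1. ask the site policy service
            report.append((user, job, "refused by site policy"))
            continue
        proxy = fetch_proxy(user)          # 2. get the user's proxy from the VO service
        if proxy is None:
            report.append((user, job, "no proxy available"))
            continue
        exit_code = execute(job, proxy)    # 3. run the job under the new identity
        report.append((user, job, f"finished with exit code {exit_code}"))
        # 4. the agent keeps going: the loop moves on to the next job
    return report
```

The point of the sketch is the ordering: the site is consulted before any proxy is requested, and a refusal or a finished user job never terminates the Agent itself.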
Data Management services
Storage management
| Item | Priority |
| An SRM service should be provided at all sites, with exactly the same semantics for each interface method on every storage backend | High |
| The functionality advertised for SRM v2.1 is needed, in particular file pinning and bulk file removal | High |
| Storage Element functionality should include checking file validity after a new replica is created | Medium |
| SRM should provide the possibility to specify a particular storage pool | Medium |
| SRM client tools should be based on a highly optimized C/C++ library (gfal). In particular, command line tools based on the C/C++ API (and not Java-based) should be available. A Python binding is required | High |
| The C/C++ API (gfal library) should be able to provide POSIX file access based on the file LFN | Medium |
| POSIX file access should include an efficient strategy for choosing the "best replica" in the context of a running job. The strategy should take into account site location, prioritization of the different storage classes, the current state of the network, etc. | Medium |
| It should be possible to define group-level and user-level disk quotas | Medium |
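One possible reading of the "best replica" strategy is a lexicographic ranking over the three criteria listed in the table. The sketch below is an assumption, not gfal behavior: `Replica`, `best_replica`, `class_priority` and `link_cost` are hypothetical names, and the ordering of the criteria (site first, then storage class, then network) is only one reasonable choice.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Replica:
    url: str             # storage URL of this copy
    site: str            # site hosting the copy
    storage_class: str   # e.g. "disk" or "tape" (illustrative labels)


def best_replica(
    replicas: Sequence[Replica],
    local_site: str,
    class_priority: Dict[str, int],
    link_cost: Dict[str, float],
) -> Replica:
    """Return the replica with the lowest cost for a job running at local_site.

    Ordering: local copies first, then preferred storage class (lower number
    is better), then the cheapest network link to the hosting site.
    """
    if not replicas:
        raise ValueError("no replicas registered for this file")

    def cost(r: Replica):
        return (
            0 if r.site == local_site else 1,
            class_priority.get(r.storage_class, len(class_priority)),
            link_cost.get(r.site, float("inf")),
        )

    return min(replicas, key=cost)
```

Because the cost is a tuple, a local tape copy still wins over a remote disk copy. A deployment that prefers remote disk to local tape would swap the first two tuple entries.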
File Transfer service
| Item | Priority |
| FTS is needed to provide a single central entry point to all the required LHCb transfer channels, including T0-T1, T1-T1 and T1-T2/T2-T1 transfers for the T2 sites running LHCb analysis tasks | High |
| The central FTS service should also handle automatic proxy renewal | High |
File Placement Service
| Item | Priority |
| FPS should allow VO-specific agents to be plugged in easily to implement retry policies in case of any kind of failure | High |
| FPS should handle higher-level operations such as: data routing, if necessary; replication operations (without specifying the file source); File Transfer Requests with multiple destination sites | High |
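A pluggable retry policy of the kind requested for FPS could take the form of a plain callable supplied by the VO. This is a minimal sketch, not the FPS plug-in interface: `RetryPolicy`, `capped_backoff` and `transfer_with_retries` are hypothetical names, and exponential backoff is just one example of a policy a VO might choose.

```python
from typing import Callable, Optional

# A retry policy maps (attempt number, error) to a delay in seconds before the
# next attempt, or None to give up. Attempt numbers start at 1.
RetryPolicy = Callable[[int, Exception], Optional[float]]


def capped_backoff(max_attempts: int, base_delay: float) -> RetryPolicy:
    """Build a policy that doubles the delay each time, up to max_attempts."""
    def policy(attempt: int, error: Exception) -> Optional[float]:
        if attempt >= max_attempts:
            return None
        return base_delay * 2 ** (attempt - 1)
    return policy


def transfer_with_retries(
    transfer: Callable[[], str],
    policy: RetryPolicy,
    sleep: Callable[[float], None],
) -> str:
    """Run `transfer` until it succeeds or the policy says to stop."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return transfer()
        except Exception as error:
            delay = policy(attempt, error)
            if delay is None:
                raise          # policy gave up: surface the last failure
            sleep(delay)
```

The policy receives the error object, so a VO agent could treat failures differently (for example, give up at once on a permission error). Passing `sleep` as an argument keeps the loop testable without real waiting; `time.sleep` would be the natural value in practice.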
Grid File Catalog
| Item | Priority |
| An LFC instance should be available at CERN, together with an unauthenticated read-only entry point | High |
| Read-only mirrors of the central LFC service should be available at a subset of, or all, the T1 sites. The mirror update interval is of the order of 30-60 minutes | Medium |
| The LFC should be highly optimized for different kinds of queries; bulk operations for file and replica registration should be supported | High |
| The basic file access API (gfal library) should be able to talk to several instances of the LFC catalog to ensure redundancy for high availability as well as load balancing for efficiency | Medium |
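Talking to several LFC instances for both redundancy and load balancing can be sketched as a rotating start point with fall-through. The class below is an illustration only and does not reflect the gfal or LFC client API: `CatalogPool` and the `query` callable are hypothetical, and an unreachable instance is assumed to raise `ConnectionError`.

```python
import itertools
from typing import Callable, List, Sequence


class CatalogPool:
    """Spread read-only lookups over several catalog instances."""

    def __init__(self, endpoints: Sequence[str]):
        if not endpoints:
            raise ValueError("at least one catalog end-point is required")
        self._endpoints = list(endpoints)
        self._start = itertools.cycle(range(len(self._endpoints)))

    def lookup(self, lfn: str, query: Callable[[str, str], List[str]]) -> List[str]:
        """Return the replicas of `lfn`, starting from a rotating end-point."""
        first = next(self._start)                 # load balancing: rotate the start
        n = len(self._endpoints)
        last_error = None
        for offset in range(n):                   # redundancy: try the others in turn
            endpoint = self._endpoints[(first + offset) % n]
            try:
                return query(endpoint, lfn)
            except ConnectionError as exc:
                last_error = exc
        raise ConnectionError(f"no catalog instance answered for {lfn}") from last_error
```

Since the mirrors are read-only and may lag the central catalog by the update interval, a scheme like this suits lookups only. Registrations would still go to the central instance.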
Information Services
| Item | Priority |
| VOMS should provide a stable service to define user roles and groups | High |
| It should be possible to store arbitrary user metadata in VOMS, with an easy interface for accessing the user parameters, e.g. passing them in the VOMS proxy | Medium |
| The Grid Information System (BDII or equivalent) should provide stable access to static information (service end-points and characteristics) | High |
| The Grid Information System must provide precise, timely and consistent information | High |
High |
VO Boxes
| Item | Priority |
| VO Boxes are needed at all the LHCb T1 sites as well as at selected T2 centers that run LHCb user analysis tasks. See the attached document for a more detailed specification of the LHCb VO box and the services that will run on it | High |
Services deployment on sites
| Item | Priority |
| Each site should provide a Storage Element with an SRM interface | Medium |
| Tier1 sites as well as LHCb analysis Tier2 sites should provide different classes of storage with distinct SRM end-points: MSS storage (if available) for infrequently accessed data (archives); disk storage with write access for production managers; disk storage with write access for all VO users. A mechanism should be provided for choosing, at a given site, the SE with the required characteristics | High |
| Each site should provide a directly accessible Computing Element service | High |
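The mechanism for choosing an SE by storage class could be as simple as a lookup keyed on the three classes named in the table. The following sketch is one possible policy, not an agreed scheme: `choose_se`, the class labels and the rule that archival writes fall back to disk when a site has no MSS are all assumptions made for illustration.

```python
from typing import Dict

# Storage classes named after the three cases in the requirement; the keys are
# illustrative labels, not an agreed naming scheme.
ARCHIVE, PRODUCTION_DISK, USER_DISK = "archive", "production-disk", "user-disk"


def choose_se(site_endpoints: Dict[str, str], is_production_manager: bool, archival: bool) -> str:
    """Pick the SRM end-point at one site for a write by the given kind of user."""
    if archival and ARCHIVE in site_endpoints:
        wanted = ARCHIVE              # MSS is optional ("if available")
    elif is_production_manager:
        wanted = PRODUCTION_DISK
    else:
        wanted = USER_DISK
    try:
        return site_endpoints[wanted]
    except KeyError:
        raise LookupError(f"site does not publish a '{wanted}' storage class") from None
```

Here `site_endpoints` maps each class label to the SRM end-point a site publishes for it. In practice that mapping would presumably come from the information system rather than being written by hand.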