Status of the SRM 2.2 WLCG usage agreement and its addendum
Addendum requirements
| Feature | BeStMan Gateway | CASTOR | dCache | DPM | StoRM | GFAL/lcg-util | Priority (ATLAS) | Priority (CMS) | Priority (LHCb) |
| Protection of spaces from (mis-)usage by generic users via space ACLs | No | Yes, but not by DN/FQAN | Yes, but only Stage-to-Space | Yes, but only Write-to-Space | Yes | N/A | 2 | 1: to protect spaces dedicated to special activities (e.g. T1D0 at T1) | 1: needed for better data protection |
| Full VOMS-awareness | Yes | No | Yes | Yes | Yes | Yes | 1.5 | 1: easier management of access privileges | 1: needed for better data protection |
| Selecting spaces for read operations | No | Yes | No | Yes | No, but no tape backend | Yes | 1 | 1: all T1s could use space tokens | 1: needed to understand data movement and used disk space |
| Correct implementation of srmGetSpaceMetadata | Yes | Yes | Yes | Yes | Yes | Yes |  |  |  |
| Providing information to efficiently store data to tapes | N/A | No | No | N/A | N/A | No | 0 | 0 | 0 |
| srmLs returns all space tokens with a copy of the file | No | Yes | No, but files can be in one space only | Yes | No, but files can be in one space only | Yes | 0.5 | 1: to understand where a file is | 0.5 |
| Release files without request token | Yes | Yes | Yes | Yes | Yes | Yes |  |  | 2 |
N/A = Not applicable
Items in red changed recently.
Priority:
- 0 = useless
- 1 = if available, it could allow for more functionality in data management, or better performance, or easier operations, but it is not critical
- 2 = critical: it should be implemented as soon as possible and its lack causes a significant degradation of functionality / performance / operations.
Other requirements
| Feature | BeStMan | CASTOR | dCache | DPM | StoRM | GFAL/lcg-util | Priority (ATLAS) | Priority (CMS) | Priority (LHCb) |
| File pinning | N/A | No | Yes | Yes | Yes | Yes | 2: "soft" pinning is acceptable | 2 | 2 |
CMS:
The two main issues at Tier-2 sites are scalability and stability at the scale needed to fulfil the experiments' requirements. In particular, the SRM front-end should guarantee that the activity of a single user cannot disrupt the service.
Implementation details
Space protection
According to the Addendum, a space allocation can have an Access Control List (ACL) defining who can do what. The operations that are mandatory for WLCG are Read-from-Space, Write-to-Space, Stage-to-Space, Replicate-from-Space and Purge-from-Space. The optional operations are Release-Space, Update-Space, Modify-Space-ACL and Query-Space. A user should be identified by their DN and/or VOMS primary FQAN. The Addendum proposes to follow the syntax used by NFSv4. Given that Modify-Space-ACL is optional, it is acceptable that only the administrator can modify the ACLs.
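As an illustration of the model described above, the following is a minimal sketch of a per-space ACL check. The class names, the "allow entry" structure and the default-deny behaviour are assumptions made for this example; they are not taken from the Addendum or from any implementation, and the sketch does not use the NFSv4 syntax the Addendum proposes.

```python
from dataclasses import dataclass, field
from typing import Optional

# Operations listed in the Addendum.
MANDATORY_OPS = {
    "Read-from-Space", "Write-to-Space", "Stage-to-Space",
    "Replicate-from-Space", "Purge-from-Space",
}
OPTIONAL_OPS = {"Release-Space", "Update-Space", "Modify-Space-ACL", "Query-Space"}


@dataclass(frozen=True)
class AclEntry:
    """One allow entry (hypothetical model); a None field matches anyone."""
    operation: str
    dn: Optional[str] = None
    fqan: Optional[str] = None


@dataclass
class Space:
    """A space allocation with its ACL (hypothetical model)."""
    token: str
    acl: list = field(default_factory=list)

    def is_allowed(self, operation, dn, primary_fqan):
        """Return True if some entry grants `operation` to this identity."""
        for entry in self.acl:
            if entry.operation != operation:
                continue
            if entry.dn is not None and entry.dn != dn:
                continue
            if entry.fqan is not None and entry.fqan != primary_fqan:
                continue
            return True
        return False  # assumption: anything not explicitly granted is denied
```

With such a model, a space could grant Stage-to-Space only to a production role while leaving Read-from-Space open to everyone, which is the use case behind the "protection of spaces" requirement.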
BeStMan Gateway
As BeStMan is deployed on top of other file systems, none of the SRM space features are supported; in the experience of the T2 sites using BeStMan, they are not really required either. Management features (such as spaces in XrootD or quotas in HDFS, GPFS and Lustre) are usually provided by the underlying file system.
Full VOMS awareness has been a crucial feature for management; it has been available for about a year and is well tested.
CASTOR
CASTOR allows certain privileges to be granted or denied to users or groups via a native interface consisting of the commands stager_addprivilege, stager_removeprivilege and stager_listprivileges. There are several possible operations, among which are Get, Put, PrepareToGet, PrepareToPut, PutDone, Rm, DiskCopyReplica, ChangePrivilege and ListPrivileges. The operations in the Addendum are covered, but the ACLs identify users via username and groupname rather than DN or VOMS FQAN.
dCache
Starting from release 1.9.4, dCache provides support for Tape Protection, that is, restriction of who may read from tape (Tape Read Access). This mechanism can be activated only by the administrator, who has to decide which users can trigger file staging. Users are identified by the DN of their certificate and optionally a VOMS FQAN. Wildcards are allowed. The only operation supported is Stage-to-Space.
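The kind of check described here (a DN, an optional FQAN, wildcards allowed) can be sketched as follows. This is not dCache's configuration format or code: the rule representation as `(dn_pattern, fqan_pattern)` pairs and the function names are hypothetical, and only `*` is treated as a wildcard.

```python
import re


def wildcard_match(pattern, value):
    """Match `value` against `pattern`, where only '*' is special."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def may_stage(rules, dn, fqan=None):
    """Return True if some rule allows this identity to trigger staging.

    `rules` is a list of (dn_pattern, fqan_pattern) pairs; an fqan_pattern
    of None means the rule applies regardless of VOMS attributes.
    """
    for dn_pattern, fqan_pattern in rules:
        if not wildcard_match(dn_pattern, dn):
            continue
        if fqan_pattern is None:
            return True
        if fqan is not None and wildcard_match(fqan_pattern, fqan):
            return True
    return False  # assumption: staging is denied unless a rule matches
```

For example, the rule `("*", "/atlas/Role=production")` would let any user holding that FQAN stage files, while a rule with an FQAN pattern of `None` would authorise a specific set of DNs whatever their VOMS attributes.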
DPM
For DPM the most important requirement for space protection (to forbid file staging by generic users) does not apply by definition. The only form of space protection is the possibility to restrict writing to a space to a list of VOMS FQANs. In general, a space belongs to a group and only members of the group can write to the space.
Full VOMS awareness
CASTOR
CASTOR has a very crude form of VOMS awareness: authorisation via SRM works using the grid-mapfile mechanism, which is built using a tool equivalent to edg-mkgridmap. Its configuration file maps VOMS FQANs to local accounts (possibly pool accounts) and the DNs having a certain FQAN are retrieved from the VOMS server. Therefore, the VOMS extensions of the user proxy are totally irrelevant. For the moment there are no definite plans to implement real VOMS support.
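The consequence of this mechanism can be made concrete with a small sketch: once the grid-mapfile has been generated, authorisation is a pure DN lookup, so whatever FQANs the proxy carries play no role. The parser below assumes the usual `"<DN>" <account>` line layout and ignores edge cases; the function names and the example DN are hypothetical, and this is not CASTOR code.

```python
import shlex


def parse_gridmap(text):
    """Parse grid-mapfile text into a {DN: local account} dictionary."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        dn, account = shlex.split(line)  # the DN is quoted, the account is not
        mapping[dn] = account
    return mapping


def authorise(mapping, dn, proxy_fqans=()):
    """Return the local account for `dn`, or None if the DN is unknown.

    `proxy_fqans` is deliberately unused: with a grid-mapfile the VOMS
    extensions presented by the user do not influence the result.
    """
    return mapping.get(dn)
```

A user listed under a given FQAN on the VOMS server is therefore mapped to the same account whether or not the proxy actually carries that FQAN.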
Selecting spaces for read operations
If space selection for read operations is supported, srmChangeSpaceForFiles does not need to be supported, as the same functionality can be obtained by invoking srmBringOnline or srmPrepareToGet followed by srmPurgeFromSpace.
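The equivalence can be illustrated with a toy in-memory model in which each SURL is associated with the set of space tokens holding a copy. The class and method names are hypothetical stand-ins for the SRM calls named above, not a real client API, and the token names are made up.

```python
from collections import defaultdict


class ToyStorage:
    """Toy model: maps each SURL to the set of space tokens holding a copy."""

    def __init__(self):
        self.copies = defaultdict(set)

    def bring_online(self, surl, target_space_token):
        """Stand-in for srmBringOnline with a target space token."""
        self.copies[surl].add(target_space_token)

    def purge_from_space(self, surl, space_token):
        """Stand-in for srmPurgeFromSpace."""
        self.copies[surl].discard(space_token)

    def move_between_spaces(self, surl, source_token, target_token):
        """Same end state as srmChangeSpaceForFiles, built from two calls."""
        self.bring_online(surl, target_token)
        self.purge_from_space(surl, source_token)
```

The copy is created in the target space before the source copy is purged, so at no point in the sequence is the file left without a copy in some space.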
dCache
It is not possible to select the space token on which a file should be staged, because in dCache all files have a single "SRM copy" (which does not prevent other copies from existing on pools that cannot contain space tokens). At the dCache level it is possible to select the pool on which to stage a file according to rules based on the client IP or the protocol, but this cannot be controlled via SRM.
DPM
DPM allows a target space token to be specified for srmBringOnline requests, but this is irrelevant in the WLCG context, as DPM is not currently used with a tape backend.
StoRM
In StoRM, a file can exist in only one space, hence space selection for read operations does not make sense. Moreover, StoRM is not currently used with a tape backend. Even when StoRM supports T1D0 (using GPFS-TSM), it will not be possible to specify a target space and the file will always be staged to the same disk cache.
Currently StoRM at CNAF implements T1D1, but using the tape backend (GPFS-TSM) as a backup system; if a file disappears from disk, it will be automatically staged back from tape.
Correct implementation of srmGetSpaceMetadata
srmGetSpaceMetadata should return information about the space used and available for a given space token.
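A client-side sanity check on such a reply could look like the sketch below. The record is a simplified, hypothetical stand-in for what srmGetSpaceMetadata returns; the field names are chosen for this example and the sizes are assumed to be in bytes.

```python
from dataclasses import dataclass


@dataclass
class SpaceMetadata:
    """Simplified, hypothetical view of a space metadata reply (sizes in bytes)."""
    space_token: str
    total_size: int
    unused_size: int

    @property
    def used_size(self):
        """Space currently occupied, derived from the two reported sizes."""
        return self.total_size - self.unused_size

    def is_consistent(self):
        """A plausible reply never reports more free space than total space."""
        return 0 <= self.unused_size <= self.total_size
```

A monitoring script could use `is_consistent()` to flag replies that cannot be right, for instance an unused size larger than the total size of the space.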
Providing information to efficiently store data to tapes
This request is now unanimously considered irrelevant.
File pinning
ATLAS
Ideally file pinning should be "hard" everywhere. However, "soft" pinning is acceptable, but only because a dedicated prestaging service is being developed for ATLAS to handle the space management.
LHCb
File pinning is extremely important to make data reprocessing manageable, given that the disk cache is limited. A "soft" pinning can be acceptable, but it should have a real effect on the order in which files are garbage collected, which is not the case today.
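What "a real effect on the order in which files are garbage collected" could mean is sketched below: files with an active soft pin are considered for eviction only after all unpinned files. This is one possible policy written for illustration, not the behaviour of any of the storage systems discussed; the names and the least-recently-used tie-break are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    """A disk-cache entry (hypothetical model); times are epoch seconds."""
    name: str
    last_access: float
    pin_expiry: float = 0.0  # 0 means the file was never pinned


def eviction_order(files, now):
    """Return files in the order a soft-pin-aware collector would evict them.

    Unpinned files (or files whose pin has expired) go first, least recently
    used first. Files with an active pin follow, earliest pin expiry first.
    """
    def key(f):
        pinned = f.pin_expiry > now
        return (pinned, f.pin_expiry if pinned else 0.0, f.last_access)

    return sorted(files, key=key)
```

With this ordering a soft pin does not forbid eviction, but a pinned file is removed only when evicting all unpinned files has not freed enough space.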
WLCG SRM Usage agreement summary
In this section all requirements expressed in the WLCG SRM usage agreement are listed "as they are", just for convenience. Critically reviewing them is, for the time being, outside the scope of this document.
- Storage classes. Only T1D0, T1D1 and T0D1 are used.
- File lifetime. All files are permanent and can be completely removed only using srmRm.
- Protocol negotiation. Protocols are specified in order of preference in Get, Put and BringOnline requests.
- Space size. It only refers to the disk space, while tape space is considered infinite.
- Space reservation. Only static space reservations are mandatory.
- Space reservation. The input parameter sizeOfTotalSpaceDesired shall be ignored.
- Space reservation. It can be restricted to certain DNs or VOMS roles. If not allowed, the server returns SRM_AUTHORIZATION_FAILURE.
- Space reservation. The input parameter expectedFileSize shall be ignored.
- Space reservation. The input parameter preferredRetentionPolicyInfo should be renamed retentionPolicyInfo and the corresponding output parameter removed.
- Space reservation. The input parameter storageSystemInfo shall not be specified by the client and shall be ignored by the server.
- Space reservation. The lifetime of the reserved space may be infinite; the client must take care of removing the space.
- Space reservation. The input parameter transferParameter is optional.
- Space reservation. The method srmChangeSpaceForFiles does not need to be supported.
- Ls. The file locality and the space token returned by srmLs do not have to be returned for directory listings.
- Ls. The file locality must be LOST for permanent hardware failures, UNAVAILABLE for temporary hardware failures.
- Ls. For directory listings the number of entries returned can be limited.
- Ls. If the file does not exist, SRM_INVALID_PATH is returned.
- Ls. If numberOfLevels is zero, the returned information refers to the directory itself.
- Get/BringOnline. The target space token can be specified.
- Get/Put. The transfer protocol is mandatory.
- Get/Put/BringOnline. The method is asynchronous.
- Get. If the connection type is not provided, the server can choose it.
- Get/Put/BringOnline. The output estimated times cannot be relied upon.
- Get/BringOnline. If a file is temporarily unavailable, SRM_FILE_UNAVAILABLE is returned.
- Get/BringOnline. If a file is permanently lost, SRM_FILE_LOST is returned.
- Get/Put/Copy/BringOnline. If a file still has an open Put (= no srmPutDone issued), SRM_FILE_BUSY is returned.
- Get/Put/Copy/BringOnline. If the request does not complete within the totalRequestTime (negotiated with the server), it times out and returns SRM_TIMEDOUT. Not required for the start of data taking.
- Put. The client must provide SURLs and the implementation does not have to generate names.
- Put/BringOnline. The server shall allow the request to continue provided that at least one file is successful or in progress.
- Get/Put. If both target space token and RetentionPolicyInfo are specified, they must match, otherwise SRM_INVALID_REQUEST is returned.
- Get/Put/Copy. The WLCG clients will specify the target space token and not the RetentionPolicyInfo. A default space token will be chosen by the server if not specified.
- Put/Copy. WLCG shall not use any SURL lifetime.
- Put. The TURL lifetime is the time available to write the file, after which the TURL may become invalid.
- Put. A TURL returned by an srmPrepareToPut cannot be used for read access.
- Copy. Files are immutable and therefore WLCG clients shall not specify overwriteOption.
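Several of the rules above fix the status code returned for a Get or BringOnline request. The sketch below collects those rules in one function as a reading aid. The function name, its parameters and the order in which the checks are applied are assumptions of this example; the agreement as summarised here does not define a precedence between them.

```python
def get_error(locality, has_open_put=False, token_policy=None, requested_policy=None):
    """Return the error code the listed rules prescribe, or None if none applies.

    `locality` is the file locality as a string (e.g. "ONLINE", "LOST").
    `token_policy` is the retention policy of the target space token and
    `requested_policy` the RetentionPolicyInfo given in the request, if any.
    """
    # Get/Put: a target space token and RetentionPolicyInfo must match.
    if (token_policy is not None and requested_policy is not None
            and token_policy != requested_policy):
        return "SRM_INVALID_REQUEST"
    # Get/Put/Copy/BringOnline: a Put without srmPutDone is still open.
    if has_open_put:
        return "SRM_FILE_BUSY"
    # Get/BringOnline: permanent loss versus temporary unavailability.
    if locality == "LOST":
        return "SRM_FILE_LOST"
    if locality == "UNAVAILABLE":
        return "SRM_FILE_UNAVAILABLE"
    return None  # none of the rules above produces an error
```

Since WLCG clients specify the target space token and not the RetentionPolicyInfo, the first check would normally be skipped in practice.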
--
AndreaSciaba - 2009-08-28