LCG Management Board

Date/Time:

Tuesday 9 May 2006 at 16:00

Agenda:

http://agenda.cern.ch/fullAgenda.php?ida=a061500

Members:

http://lcg.web.cern.ch/LCG/Boards/MB/mb-members.html

 

(Version 1 - 13.5.2006)

Participants:

A.Aimar (notes), D.Barberis, S.Belforte, I.Bird, D.Boutigny, N.Brook, F.Carminati, T.Cass, Ph.Charpentier, J.Gordon, B.Gibbard, C.Grandi, J. Knobloch, M.Lamanna, M.Litmaath, H.Marten, P.Mato, M.Mazzucato, G.Merino, B.Panzer, Di Qing, L.Robertson (chair), Y.Schutz, J.Shiers, J.Templon

Action List

https://twiki.cern.ch/twiki/bin/view/LCG/MbActionList 

Next Meeting:

Tuesday 16 May 2006 at 16:00

1.      Minutes and Matters arising (minutes)

 

1.1         Minutes of the Previous Meeting

Minutes approved.

1.2         Clarification of the discussion on accounting (document)

The comments received do not change the substance of the proposal:

-          K.Bos (email): accounting data should be sent more regularly than just once a month

-          H.Marten (email): user accounting is mentioned in the proposal, but J.Gordon will present a more detailed proposal on user accounting to the MB (an action already in the MB Action List).

 

 

2.      Action List Review (list of actions)

 

 

09 May 06 - L.Robertson will discuss with E.Laure and C.Grandi the status of the development of the features needed by the LHC (in Flavia’s list).

L.Robertson discussed with C.Grandi the features being developed in JRA1. C.Grandi will present priorities and next developments to the MB.

 

Action:

31 May 06 – C.Grandi presents to the MB the EGEE middleware priorities and development of the features needed by the LHC (in Flavia's list).

 

3.      gLite 3.0 Update - I.Bird

 

 

gLite 3.0 was released on Thursday 4 May 2006. The agreement is that within two weeks gLite 3.0 should be installed on all sites (in two groups, one per week).

 

The current status of gLite 3.0 deployment is:

-          CERN: started deployment

-          PIC: deployment completed

-          RAL: starts deployment in the next couple of days

-          TRIUMF-Vancouver: started deployment

-          USATLAS BNL: starts deployment in the next couple of days

-          INFN CNAF: not able to start before 22 May

 

Status of deployment on the sites of the second group:

-          ASGC, FZK, IN2P3, NDGF, SARA: will start in a week

-          USCMS FNAL: started deployment

 

 

4.      SRM 2.1 working group (transparencies) - M.Litmaath

-          Progress, features and plans of the SRM implementations

 

4.1         SRM working group update

The working group meets weekly and will conclude with a workshop on 22-23 May at FNAL.

 

Introduction

The strategy is to start, as far as possible, from the features already planned for SRM 3 and select those that should be implemented earlier, in order to include them in the next release of SRM 2 (Sep-Oct 2006).

 

The summary of all discussions (ontology, classes, attributes, etc.) is in a document maintained by J.Casey and J.-Ph. Baud that will be distributed publicly (the temporary link here).

 

Lifetime of files

The lifetime of a file can be: Volatile/Durable/Permanent (slide 2). The durable lifetime, as already defined for SRM 2 and 3, is not useful for this purpose and is being redefined. The main issue is that when a durable file reaches its “end of life” there is no efficient way to inform the system to delete it and recover the space. Instead, the experiments prefer to have permanent files that they manage directly via their bookkeeping procedures.

 

One reason for introducing “durable” files was also to avoid consuming tape quotas. This will instead be addressed with cache attributes of the file that avoid storage on tape.

 

The SRM 3 custodial responsibility ensures the safe storage of the data but does not define where it is stored. Instead the experiments want to know where the data is stored. This issue is still being discussed and storage classes are being defined accordingly.

 

Storage Classes

The storage classes are identified by the minimum number of copies (0, 1, >1) on tape and disk (see slide 3).

 

    +---------------+----------------------+---------------------+
    |               | min. required copies |                     |
    | Storage Class +----------+-----------+     Mumbai term     |
    |               |   Tape   |   Disk    |                     |
    +---------------+----------+-----------+---------------------+
    |       A       |    1     |    0      | "permanent"         |
    |       B       |    1     |    1      | "permanent-durable" |
    |       C       |    0     |    1      | "durable"           |
    |       D       |    0     |   >1      |                     |
    |       E       |   >1     |    0      |                     |
    |       F       |   >1     |    1      |                     |
    |       G       |   >1     |   >1      |                     |
    +---------------+----------+-----------+---------------------+
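The class definitions above can be captured in a small lookup structure. The following is a minimal illustrative sketch in Python; the dictionary layout, field names, and the convention of representing ">1" as 2 are assumptions for illustration, not part of any SRM specification:

```python
# Minimum required copies per storage class, keyed by class label.
# Convention (an assumption for this sketch): 2 stands for ">1" copies.
# The "Mumbai term" names were only defined for classes A-C.
STORAGE_CLASSES = {
    "A": {"tape": 1, "disk": 0, "mumbai_term": "permanent"},
    "B": {"tape": 1, "disk": 1, "mumbai_term": "permanent-durable"},
    "C": {"tape": 0, "disk": 1, "mumbai_term": "durable"},
    "D": {"tape": 0, "disk": 2, "mumbai_term": None},
    "E": {"tape": 2, "disk": 0, "mumbai_term": None},
    "F": {"tape": 2, "disk": 1, "mumbai_term": None},
    "G": {"tape": 2, "disk": 2, "mumbai_term": None},
}

def requires_tape(storage_class: str) -> bool:
    """True if the class mandates at least one tape copy."""
    return STORAGE_CLASSES[storage_class]["tape"] >= 1
```

Such a table makes explicit, for example, that only classes A, B, E, F and G guarantee a tape copy of the data.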

 

Put and Get functions

Slide 4: For the PUT function a new “storage class” attribute was added. The PUT command could also carry extra cache parameters to indicate the typical usage of the file. A new method to change the storage class of a file will also be needed (but will not be implemented for the autumn release).

 

The GET function is not symmetric to PUT. Class A is system-managed, so a volatile copy is sufficient. Classes B and C are “user-managed” caches, but because copies could be needed on other pools, the volatile class may still be needed even in this case.
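As an illustration of the proposed PUT extension, a request record might carry the new storage-class attribute alongside optional cache hints. This is a hypothetical sketch only; all field names and the hint keys are assumptions, since the actual SRM 2.2 method signatures were still under discussion at the time:

```python
from dataclasses import dataclass, field

@dataclass
class PutRequest:
    """Hypothetical sketch of a PUT request carrying the proposed
    "storage class" attribute and optional cache-usage hints."""
    surl: str                     # storage URL of the file to be written
    storage_class: str            # proposed new attribute, e.g. "A".."G"
    cache_hints: dict = field(default_factory=dict)  # optional usage hints

# Example: write a file as class B ("permanent-durable"), hinting
# that it will be read frequently in the near term (hint key assumed).
req = PutRequest(
    surl="srm://example-site/path/to/file",
    storage_class="B",
    cache_hints={"access_pattern": "frequent-read"},
)
```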

 

The MB discussed whether the SRM interface should be separated into high-level user features and lower-level functions for more advanced users. This was not agreed, because the classification of user vs. advanced functions depends on the applications and varies between teams and VOs.

 

Further discussions

The experiments' usage and models will require the implementation and optimization of 4-5 typical access patterns (slide 5).

New methods for pre-staging and asynchronous space reservations will also be implemented (slide 6). Transfer speeds and pool rates should be carefully studied in order to obtain the best performance.

 

The GLUE interface needs adaptations in order to advertise what an SRM service makes available. This will require changes or extensions to the schema in order to adequately describe storage classes, cache attributes, etc.

 

Discussions are still needed to clarify and extend important concepts. The example given was the interpretation of “free space”, which can refer to the tape system as a whole or to the cache of such a system.

 

4.2         SRM 2.1 Timescale concerns

The discussions on the timescale are ongoing. The work needed to achieve uniform SRM implementations will be greater than expected.

 

The existing SRM implementations are not compatible, even for the simplest SRM operations (e.g. put, get). Therefore new functionality should be avoided or limited as much as possible in a new SRM 2.2 version.

 

For instance, the storage classes needed could be implemented temporarily with the three types (permanent, durable, and volatile) already available, without new development.
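One hypothetical way to express such an interim scheme is a mapping from the proposed classes to combinations of the three existing types. The correspondence shown below is an assumption for illustration only; no mapping was agreed in the meeting:

```python
# Illustrative interim mapping (assumption, not an agreed scheme) from
# the proposed storage classes A-C to the three file types already
# available in SRM 2: permanent, durable, volatile.
INTERIM_MAPPING = {
    "A": ["permanent"],             # 1 tape copy, system managed
    "B": ["permanent", "durable"],  # tape copy plus a durable disk copy
    "C": ["durable"],               # disk-only, user managed
}

def interim_types(storage_class: str) -> list:
    """Return the existing SRM 2 types that would stand in for a class."""
    return INTERIM_MAPPING.get(storage_class, [])
```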

 

Action:

31 May 2006 – M.Litmaath presents the results of the SRM workshop in FNAL.

 

5.      Accounting CPU, Disk and Tape Availability and Usage (document ) - L.Robertson

 

The proposal now takes into account the decisions to report:

-          cpu/wall clock times

-          grid/local job execution

-          disk allocation/usage

-          tape usage

 

Data collection will initially be manual. In the future all of the data should be collected in the GOC database, as the grid CPU accounting data is at present. Monthly collection should start from April 2006, with data reported retroactively from January 2006.
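A monthly report covering the four quantities listed in the proposal could be modelled as a simple record. This is a sketch under stated assumptions: the field names, units, and the example site are hypothetical, not taken from the report form:

```python
from dataclasses import dataclass

@dataclass
class MonthlyAccountingReport:
    """Illustrative sketch of one site's monthly accounting report,
    covering the quantities in the proposal (field names assumed)."""
    site: str
    month: str               # e.g. "2006-04"
    cpu_time_h: float        # CPU time consumed
    wall_clock_h: float      # wall-clock time consumed
    grid_jobs: int           # jobs submitted via the grid
    local_jobs: int          # jobs submitted locally
    disk_allocated_tb: float
    disk_used_tb: float
    tape_used_tb: float

# Hypothetical example entry for one month at one site.
report = MonthlyAccountingReport(
    site="Example-Tier1",
    month="2006-04",
    cpu_time_h=1200.0,
    wall_clock_h=1500.0,
    grid_jobs=4000,
    local_jobs=1000,
    disk_allocated_tb=50.0,
    disk_used_tb=35.0,
    tape_used_tb=80.0,
)
```

Keeping grid and local job counts separate, as the proposal requires, allows both to be summed for total usage while still distinguishing the submission route.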

 

Action:

20 May 2006 – All SC4 sites send accounting data using the report form that will be sent to them by Fabienne Baud-Lavigne (for Jan, Feb, Mar, Apr 2006).

 

The RRB will receive a summary of this information (e.g. resources pledged, delivered and used). The format of the report to the RRB will be discussed at the MB in September.

 

Note on availability metrics

Uptime and reliability of the sites will be covered in a separate (automated) report that will use the SFT/SAME testing framework.

 

Note on MoU resources

From 2007 the yearly capacities in the MoU should be available from April. For 2006 the formal deadline is moved to July.

 

6.      AOB

 

 

The SC4 schedule and requests will be reviewed in the next two weeks and the wiki pages will be updated accordingly.

 

7.      Summary of New Actions

 

 

Action:

20 May 2006 – All SC4 sites send accounting data using the report form that will be sent to them by Fabienne Baud-Lavigne (for Jan, Feb, Mar, Apr 2006).

 

Action:

31 May 2006 – M.Litmaath presents the results of the SRM workshop in FNAL.

 

Action:

31 May 06 – C.Grandi presents to the MB the EGEE middleware priorities and development of the features needed by the LHC (in Flavia's list).

 

The full Action List, current and past items, will be on this wiki page before the next MB meeting.