LCG Management Board

Date/Time:

Tuesday 19 June 2007 16:00-17:00 - Phone Meeting

Agenda:

http://indico.cern.ch/conferenceDisplay.py?confId=13800

Members:

http://lcg.web.cern.ch/LCG/Boards/MB/mb-members.html

 

(Draft Version 2 26.6.2007)

Participants:

D. Barberis, I. Bird, N. Brook, T. Cass, Ph. Charpentier, L. Dell’Agnello, C. Eck, M. Ernst, I. Fisk, S. Foffano (notes), J. Gordon, F. Hernandez, M. Kasemann,  J. Knobloch, M. Lamanna, H. Marten, P. Mato, P. McBride, G. Merino, B. Panzer, R. Pordes, L. Robertson (Chair), Y. Schutz,  J. Shiers, O. Smirnova, R. Tafirout, J. Templon.

Action List

https://twiki.cern.ch/twiki/bin/view/LCG/MbActionList

Mailing List Archive:

https://mmm.cern.ch/public/archive-list/w/worldwide-lcg-management-board/

Next Meeting:

Tuesday 26 June 2007 16:00-17:00 - Phone Meeting

1.      Minutes and Matters arising

 

L. Robertson noted the frequent absence of RAL at the MB. J. Gordon to follow this up with T. Doyle.

 

Minutes:

Nick Brook requested the following 2 additions to the last minutes:

Item 5: "LHCb raised the issue of milestones associated with the gLite RB, as LHCb had been informed that a usable version of the RB could not be deployed at CERN until the proper certification procedures had been completed. The MB believed that the gLite RB was in production. ACTION: Les would seek clarification."

 

Item 6: "LHCb still had concerns about how best to estimate the disk cache in front of the MSS."

 

D. Barberis commented on the statement at the end of point 3 that “ATLAS expect the MoU pledge to be honoured no matter what is specified in Harry’s tables”, pointing out that there should be no contradiction: if the numbers differ between the MoU and Harry’s table, the MoU is the reference.

 

The MB accepted the changes and comment and the minutes were approved.

 

J. Templon asked for the link to the latest version of Harry’s table to be confirmed. It is available from the LCG project planning page, Resources section, under the Resource Planning Tables heading (an explanatory email was sent by L. Robertson on 20/06/07 for clarification).

 

Matters arising:

I. Bird gave a brief status of the resource broker: a packaged version of the new workload management system was given to Imperial College for testing. There were issues with an overloaded BDII and with machine configuration. A service has been set up at CERN to duplicate and validate what is being tested, with a plan to deploy as soon as possible.

 

M. Lamanna commented on the document made available to the MB in the agenda showing trends and efficiency of Crab jobs (CMS), Pilot jobs (ALICE), Job Agents (LHCb) and Ganga jobs (ATLAS). Only jobs submitted through a Resource Broker are monitored. Comments and feedback should be sent to M. Lamanna.

 

2.      Conclusions on accounting non-grid work – J. Gordon

 

A proposal was made at the F2F meeting on 05/06/07; however, the discussion had been cut short.

 

Conclusion from minutes of F2F meeting:

The MB would like this issue investigated further. It is necessary to distinguish between grid and non-grid work, and non-grid work should be reported with as much information as possible (including the VO and the user identity), as needed by the experiments.

 

A technical study has started and a new proposal will be made at the next Face to Face meeting on 03/07/07.

 

3.      GDB Main points – J. Gordon

 

J. Gordon invited members to read the meeting summary document which covers all subjects treated at the meeting, as only certain points would be raised in this MB agenda item.

 

  • SL4 - The gLite Worker Node release is now ready and statements are needed from the experiments about whether they are ready now for full deployment of SL4. J. Shiers had requested these statements in the context of the operations meeting. Current testing is being done to different degrees by the different experiments. The positions of ALICE, ATLAS and LHCb were detailed in a mail thread to the MB list. CMS said in the meeting that it was ready for SL4 deployment. Deployment details should be agreed in the Operations Meeting.

 

  • BDII - Database indexing has significantly improved performance and has stopped the SAM timeouts.

 

 

  • gLite CE - The lcgCE is the only interface currently available, but there is no plan at present to port it to SL4. The gLite CE is still being tested and should be ready for certification during the summer.

 

  • Job Priorities- the conclusion of this point at the GDB was for a fresh start involving different people. R. Pordes emphasized the need to correctly identify the requirements.  L. Robertson remarked that somebody, or a Working Group with a minimum set of members, should drive this. D. Barberis referred to a recent ATLAS discussion on this where the recommendation was to stop deployment on the short-term and develop a longer-term proposal through a re-formatted Working Group. L. Robertson suggested that the reformed group should be led by S. Campana with a few other people. The MB agreed with the proposal.
  • Conclusion: S. Campana will be asked to lead the reformed Working Group. Plans will be rolled back while waiting for a new proposal. The Operations Meeting will define the rollback instructions to sites. To be confirmed at the next MB meeting (26 June).

 

  • VOMS - The plan is to replicate the VOMS databases. However, the use cases are unclear and are being investigated by Maria Dimou. I. Bird noted that CNAF and Brookhaven had planned only to have a read-only copy of the VOMS database for reference, and Oracle Streams is not necessary for this. L. Robertson noted that there was still little experience with Oracle Streams and that we should be careful about using complex technologies where simpler solutions are available. If the sites involved have time to explore better solutions that is of course welcome, but this should not delay satisfying the basic requirement for a VOMS db copy outside CERN.

 

On the Generic Attributes issue several things are unclear: whether all the use cases are known, what is expected to happen once they all exist, and which middleware will be used. J. Gordon stressed the importance of reaching a common understanding of roles, groups and attributes, and the need for a middleware group to discuss this.

 

Comment received from C. Grandi:

Just one clarification on the VOMS Generic Attributes: the API call that retrieves the GA is different from the one that retrieves the FQAN. So the existing middleware is not affected by the introduction of GAs. This ensures full backward compatibility. How to exploit GAs in the middleware is another story. An example is in the work done for gLite-Shibboleth interoperability. I agree that the MWSG could be a proper forum for further discussions.

 

R. Pordes mentioned the Middleware Security Working Group and a paper that describes the current use of certificates in the Open Science Grid, confirming that OSG is using the Middleware Security Working Group for such discussions. J. Templon expressed interest in the paper and J. Gordon added that the Middleware Security Working Group seemed to be the right forum for the discussion. However, I. Bird commented that a wider and more technical discussion forum was needed. Ph. Charpentier added that generic attributes do not have a well defined usage. J. Gordon stated that how middleware can use generic attributes still had to be defined and a focused group needed to look into this issue. J. Templon suggested the reformed Job Priorities group led by S. Campana. L. Robertson suggested that J. Gordon make a proposal to the GDB or MB.

 

  • glexec/Pilot jobs - The MB was reminded of the 2006 summary from Jeff Templon concluding that if the pilot job downloaded subsequent payload from the same user then everyone was happy. If the pilot job downloaded payload for a different user and did not change identity then the VO was breaking the AUP. If it were to change identity then glexec is required. This point had been referred back to the MB at the GDB meeting. N. Brook commented that LHCb had stated clearly that they wanted pilot jobs, and I. Bird added that sites are running pilot jobs - they just don’t know it. Jobs run as a generic user taking workload from a different user, and these users are not known to the system. R. Pordes commented that OSG are waiting for an AUP for the VO and would develop it if necessary. L. Robertson asked if the OSG solution would be acceptable to others and stated that priority should be put on this by the Joint Security Group (Dave Kelsey).

 

4.      Megatable issues (slides) - C. Eck

 

C. Eck informed the MB of a meeting of the Megatable team which had taken place on 14/06/07 and at which three main issues had been revealed:

 

  • The need to recall certain MB decisions

Resources pledged have to be installed and available on 1st April of the year of the pledge, with two exceptions: 1. for 2007, pledged resources have to be installed and available on 1st July; 2. Russian pledges are installed and available at the end of the year, so the Megatable for 2008 will use the 2007 pledges from Russia. Each site has to plan its acquisition and installation schedule in accordance with these dates. Requirements published by experiments are gross capacities, including resources required for usage inefficiency and disk capacity required for efficient tape access, i.e. no hidden resources in addition to the pledge. C. Eck recalled that this had already been discussed at length by the MB.

Following the meeting, in an email sent to the MB, H. Marten communicated an additional exception: “Out of the total Alice resource increments planned at GridKa for 2008, about 25% are planned to be provided in April and 75% in October.” This should be discussed at the meeting on 26 June.

  • The need to clarify the aim of the Megatable 

C. Eck said that the Megatable is intended as a translation of the TDR requirements of the experiments into requests for storage at the sites. D. Barberis corrected this, saying that it was rather a translation of the pledges of the sites into the bandwidth available at the sites. ATLAS have converged between requirements and pledges, but it is nonetheless the pledges that have been used as input to the Megatable. C. Eck remarked that the aim was to give a clearer view of what is requested and what is expected to be available. It also provided a learning exercise, allowing T1<=>T2 relationships to be established and the T2<=>T1 dependency network to be optimized. Some discussions had resulted in the move of T2s to different T1s. C. Eck reminded the MB that the Megatable is not intended as a detailed recipe for implementing storage and networking at the sites. By splitting storage and requesting the tape cache size, some sites got the impression that the Megatable was becoming a blueprint of their system architecture. Creating a last version of the Megatable for 2008 now should prepare sites and experiments for adaptation to future years and to the changing LHC schedule.

  • The need to improve site-experiment communication in certain cases - some experiments claim big difficulties in finding discussion partners at certain sites.

Proposal to the MB: a last version of the Megatable for 2008 should be published, based on the requirements from October 2006 and the resulting pledges from April 2007. This version of the Megatable will only contain summary information on tape and disk and should be ready by the beginning of July.

 

L. Robertson asked for clarification on the proposal, as it seemed to remove all detail of the allocation of storage to different classes and leave the experiments to discuss this detail with the sites.

 

F. Hernandez commented that the estimated network bandwidth requirements were needed beyond 2008. H. Marten expressed surprise that the Megatable was being discussed again following the agreement on the last version, where it was understood that any new versions would contain minor corrections only, in case of additional T2s. C. Eck clarified that the difference between the last version (January 2007) and the proposed new final version is the updated pledge levels of the funding agencies for the April Computing Resource Review Board, allowing experiments and sites to update the table based on more realistic resource planning.

 

D. Barberis commented that the MB had asked for a new set of requirements based on no run in 2007. ATLAS are currently working on this, and an updated Megatable, if it were really a necessity, could only follow, as the two are linked. He questioned the usefulness, as new pledges would be available in October and so yet another version would have to be produced. L. Robertson stated that probably only one version should be produced per year and asked the MB to comment on the proposal to remove storage classes. J. Templon and J. Gordon asked for clarification on where to get the information if not from the Megatable. L. Robertson noted that such detail comes from discussions between the experiments and the sites; the Megatable is only a central repository for this information. M. Kasemann confirmed that CMS prefers gross numbers in the Megatable, with the details discussed between the experiments and the sites. L. Robertson commented that as the Megatable only contains information provided by the experiments, if the experiments are in doubt about its usefulness it should not be maintained.

 

D. Barberis commented that the main added value is the calculation of network bandwidth. J. Templon added that the T2-T1 matching process and the storage class information were useful. Ph. Charpentier commented that the information in the Megatable is useful, but not complete. G. Merino was of the same opinion. D. Barberis suggested adding a T1-T1 matrix. N. Brook commented that disk cache could not be included, as he was not in a position to provide this information. L. Robertson remarked that it is necessary to allocate some of the disk storage for use as a disk cache for tape, even if this is a minor fraction of the resources, and so there must be bilateral discussions between the sites and experiments to decide on the size of the cache.

 

L. Robertson commented that there were two proposals:

  • the first for a simpler solution providing only T1-T2 relationships and bandwidth targets;
  • the second to maintain the current set of data, possibly even enlarging it with detailed T1-T1 data rates.

In both cases the data is provided by the experiments after discussion with the sites. The usefulness is to have a central repository for this data to allow people to cross-check commitments.

 

L. Robertson summarised the discussion, stating that it favoured continuing with the second solution, but underlined that the Megatable is just a centrally located repository for information provided by the experiments from their model and from discussion with the sites. If storage classes are required then the experiment must agree them with the sites.

 

The experiment computing coordinators should provide input towards a new version of the Megatable in July, based on their new requirements, via the appointed Megatable team members of each experiment.

 

5.      AOB 

 

 

L. Robertson informed the MB that the meeting with the referees may be moved. L. Robertson to send the current proposed agenda for the meeting, which may have to change depending on who is available.

 

6.      Summary of Actions

 

J. Gordon to follow up RAL participation at the MB with T. Doyle.

MB members to send comments on the trends and efficiency document to M. Lamanna.

J. Gordon to make a new proposal for non-grid accounting at the F2F meeting on 03/07/07.

I. Bird to follow up the suggestion that S. Campana be asked to lead the reformed Job Priorities Working Group and to define instructions to sites via the operations meeting.

J. Gordon to propose which group (Middleware Security Working Group, Job Priorities Working Group or other) should look into the Generic Attributes issue.

J. Gordon to contact D. Kelsey regarding follow-up of the AUP for the VO issue.

Experiment Computing Coordinators to work on the update of the Megatable for July in collaboration with the appointed Megatable team members of each experiment.

L. Robertson to send the proposed LHCC referees meeting agenda.