LCG Management Board |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Date/Time: |
Tuesday
16 October 2007 16:00-17:00 – Phone Meeting |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Agenda: |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Members: |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
(Version 1 - 18.10.2007) |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Participants:
|
A.Aimar
(notes), D.Barberis, I.Bird, T.Cass, Ph.Charpentier, L.Dell’Agnello, T.Doyle,
M.Ernst, S.Foffano, J.Gordon, F.Hernandez, M.Kasemann, J.Knobloch, M.Lamanna,
U.Marconi, P.Mato, G.Merino, R.Pordes, Di Qing, L.Robertson, Y.Schutz,
J.Shiers, R.Tafirout, J.Templon |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Action
List |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Mailing
List Archive: |
https://mmm.cern.ch/public/archive-list/w/worldwide-lcg-management-board/ |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Next Meeting: |
Tuesday
23 October 2007 16:00-17:00 – Phone Meeting |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1. Minutes and Matters arising (Minutes) |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1.1 Minutes of Previous Meeting
The minutes of the previous meeting were distributed only on Tuesday morning. Unless the MB members send comments or feedback in the next few days, the minutes will be considered approved.

Update: No comments received; minutes approved.

1.2 Sites Names (Document)
A.Aimar proposed that each site be identified by a single, unique name in all reports, documents, tables, etc. At present several names are in use for some of the sites. Below is the table with a couple of proposals, but it is for each site to decide.

He proposed that:
- each site chooses the name that will be used to identify the site itself
- as for the Tier-2 sites, the 2-letter ISO country code is prefixed to the name
- the only character used as separator is the hyphen (avoiding slashes, underscores, etc).

L.Robertson added that the Tier-2 sites were asked for their name and for the name they have registered in GOCDB. The proposal is that the Tier-1 sites do the same; this will not require any change in GOCDB.

L.Dell’Agnello noted that there is “CNAF-INFN” as the Tier-1 and a smaller Tier-2 also called “CNAF”. They will have to discuss and agree which name to choose.

Decision: The MB agreed to have the country code prefixed to the names of all Tier-1 sites (except NDGF, which spans several countries).

Action: The Tier-1 sites should define the name that will be used in all reports of the LCG.

1.3 LCG MB Private Web Area (Slides)
The MB had agreed that the Sites would share their 24x7 and VO Boxes SLA documents, but only among the members of the MB.

A.Aimar has created a private Web Area accessible only by the members of the MB and will distribute the details after the meeting.

Information sent by email after the meeting:
------------
https://cern.ch/lcg-mb-private/Documents
------------

R.Tafirout asked whether confidential information can safely be put there.
L.Robertson replied that it is each site’s decision what to share and whether to remove confidential information from the documents. But it would be useful for the 24x7 and the VO Boxes documents to be shared among sites, as requested by several of them. |
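The site naming rules agreed under item 1.2 (country code prefix, hyphen as the only separator, NDGF exempt) can be sketched as a small helper. This is only an illustration: the function name, the exemption set and the example input are assumptions and do not come from the minutes or from GOCDB.

```python
import re

# Hypothetical helper; names below are illustrative, not an LCG or GOCDB API.
EXEMPT_SITES = {"NDGF"}  # spans several countries, so no country prefix


def report_site_name(site_name: str, country_code: str) -> str:
    """Build the report name of a Tier-1 site: ISO prefix, hyphens only."""
    # Collapse slashes, underscores, whitespace and repeated hyphens
    # into single hyphens.
    name = re.sub(r"[\s/_-]+", "-", site_name.strip()).strip("-")
    if name.upper() in EXEMPT_SITES:
        return name
    code = country_code.strip().upper()
    if not re.fullmatch(r"[A-Z]{2}", code):
        raise ValueError(f"expected a 2-letter ISO country code, got {country_code!r}")
    # Do not double the prefix if the chosen name already carries it.
    if name.upper().startswith(code + "-"):
        return name
    return f"{code}-{name}"


print(report_site_name("CNAF/INFN", "it"))
print(report_site_name("NDGF", ""))
```

The helper only normalizes the form of a name; which name a site uses remains the site's own choice, as agreed by the MB.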
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2. Action List Review (List of actions)
Actions that are late are highlighted in RED. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
· I.Bird agreed to report to the Management Board about the progress of the Job Priority working group.
  He will report next week after a new meeting.
· 16 October 2007 - Sites should send the pointers to their documents about 24x7 and VO Boxes to A.Aimar. A.Aimar will prepare a protected web area for confidential documents of the LCG Management Board.
  Done. A.Aimar created the private Web Area and sites can upload the documents themselves.
· D.Barberis agreed to clarify with the Reviewers the kind of presentations and demos that they are expecting from the Experiments at the Comprehensive Review.
  Ongoing. D.Barberis started the discussions with the Reviewers and with the other Computing Coordinators. He will send a summary via email in the next days. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3. SRM 2.2 Weekly Update (Agenda Edinburgh workshop; dCache 1.8 deployment schedule; dCache site) – J.Shiers
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
J.Shiers presented the weekly update on the SRM roll-out at the Tier-1 Sites.

Deployment: The dCache 1.8 deployment schedule is now available (Link). Preparations for the production deployment at named sites, at defined dates, continue and are on track.

Issues: The outstanding issues that were blocking high-level tools were fixed with the latest dCache releases. A new issue was found and is being solved. The situation is similar for CASTOR2 SRM. The 1.1.5 release is going to be available for deployment in the coming week(s) and fixes several outstanding bugs.

Sites: CNAF (CASTOR2 + STORM) is in production for ATLAS and is now available for testing by the Experiments. A recent mail from Frank Wuerthwein proposes that dCache 1.8 is ready for production deployment at (CMS) Tier2s now (forwarded by GSSD & GDB mail lists).

Experiments: It is important that Experiments test the SRM V2.2 features not only by running SRM 1.1 applications but also with applications using the SRM 2.2 features. This will come back later, under the CCRC'08 topic.

LHCb foresee restarting testing on Thursday this week, along the lines of the original plan:
- transfer of data
- access data from applications on the WN
- deletion of data (not originally included in our list of tests) |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
4. Update on CCRC-08 Planning (CCRC'08 Meetings, Slides) – J.Shiers
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
J.Shiers summarized the weekly CCRC’08 phone meeting held the day before. The goals of the first meeting were to start working on:
1. First draft of a combined schedule
2. First draft of combined goals (as started in the CSA08 description)
3. Initial identification of key (existing) services for the February run

Slide 2 shows the overall proposed schedule.

Phase 1 - February 2008:
Possible scenario: blocks of functional tests. Try to reach 2008 scale for tests for:
1. CERN: data recording, processing, CAF, data export
2. Tier-1’s: data handling (import, mass-storage, export), processing, analysis
3. Tier-2’s: Data Analysis, Monte Carlo, data import and export

Phase 2 - May 2008:
Duration of challenge: 1 week setup, 4 weeks challenge

Of course the Phase 1 results will be used to define Phase 2:
- Use February (pre-)GDB to review metric, tools to drive tests and monitoring tools
- Use March GDB to analyze CCRC phase 1
- Launch the May challenge at the WLCG workshop (April 21-25, 2008)

The next F2F CCRC meetings will cover:
- Nov 6: agreement on key services & goals, including with sites; draft schedule for component testing; check-point on Explicit Requirements (ERs)
- Dec 4: progress with component testing; plans for integration testing; remaining ERs; status of site services
- Jan 8: review metric, tools to drive tests and monitoring tools; progress with integration
- Feb 12: mid-challenge assessment.

Slide 4 shows the tasks, activities, holidays, etc. from now to May 2008.
Slide 5 shows the current explicit requirements from the Experiments.
J.Gordon asked how many sites are going to run SL3 for WMS in CCRC.
J.Shiers replied that CERN will do so and maybe one other site will have to do it for LHCb, if needed. But Ph.Charpentier added that LHCb does not require any WMS outside of CERN.

J.Templon asked how many RBs ATLAS is expecting to have at the Tier-1 sites.
J.Shiers replied that CCRC will soon prepare a clear list of which service should be running at which site (“what service and where” table).

J.Templon noted that the “pilot jobs” limitation will also imply a change in the way ALICE is operating and therefore ALICE should be added to LHCb in the corresponding cell, in the table above.

J.Shiers then highlighted (slide 10) the fact that there cannot be “implicit requirements”. For each service, the VOs must specify the installations required, the level of service and the target performance to be reached in CCRC-Feb08 and CCRC-May08. Slides 12 and 13 show two examples (CMS and LHCb) of the kind of information about targets that is needed for the preparation.

All Experiments agreed to produce this kind of information at the CCRC meeting on Monday. At the CCRC meeting it was also agreed that a usual milestone plan will be initiated (by A.Aimar) and then used to plan and monitor CCRC’08.

The focus of the next 1-2 CCRC meetings is to obtain the detailed targets described above from all 4 Experiments:
- Week 1: equivalent of CMS targets
- Week 2: resource requirements at sites

Agendas for the next CCRC meetings (up to but not including F2F) are in Indico: http://indico.cern.ch/categoryDisplay.py?categId=1613 |
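The "no implicit requirements" rule (per service, each VO states the installations required, the level of service and the target performance for a given CCRC phase) could be tracked with a record like the one below. This is a minimal sketch: the class, its field names and the placeholder values are assumptions made for illustration, not a format agreed at the CCRC meeting.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ServiceRequirement:
    """Hypothetical record of one explicit requirement (one VO, one service, one phase)."""
    vo: str
    service: str
    phase: str                                # "CCRC-Feb08" or "CCRC-May08"
    installations: Optional[str] = None       # where the service must be installed
    service_level: Optional[str] = None       # level of service expected
    target_performance: Optional[str] = None  # target to be reached in that phase


def missing_fields(req: ServiceRequirement) -> List[str]:
    """Return the names of the fields that have not been specified yet."""
    return [f.name for f in fields(req) if not getattr(req, f.name)]


# Example entry with only the installation known; the other values are still missing.
req = ServiceRequirement(vo="LHCb", service="WMS", phase="CCRC-Feb08", installations="CERN")
print(missing_fields(req))
```

A requirement would count as explicit only once `missing_fields` returns an empty list for it.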
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5. Report from the EGI Workshop (Slides) - J.Knobloch |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
J.Knobloch presented a summary of the EGI Workshop in Budapest and other information on EGI.

5.1 Introduction to EGI
As described in slide 2, starting nearly two years ago, CERN has prepared the ground (nickname: EGO) to create a sustainable infrastructure in Europe, with the vision of transferring the know-how and the responsibility for a global e-infrastructure into a new organization independent from CERN. CERN has also offered to host this new organization, at least initially, to facilitate a smooth transition from the present CERN-led operations.

CERN’s proposal was well received and, following a large information campaign initiated via EGEE, which included visits to many countries, the idea is now generally accepted. Supported by the position of e-IRG, the EU has now opened, as part of FP7, a call for design studies with EGI in mind.

Although CERN had proposed to lead the design project, the recent choice by the majority of the EGI preparation team was different. CERN accepts this, and it does not change CERN’s position as far as the need for a sustainable e-infrastructure is concerned. CERN must ensure that the needs of the LHC community are fully taken into account.

CERN expects that the future EGI will gradually take over, together with the NGIs, the responsibility for the operation presently provided by EGEE, the software integration, certification and distribution, as well as the required support and training. While global coordination is important, it is not sufficient, and EGI will have to provide reliable long-term services until such services can be obtained from industry.

CERN also sees a role for EGI in the coordination of future middleware developments and in standardization, as described in the vision paper prepared for the February 2007 workshop in Munich. The EGI project must help to ensure that National Funding will take over a large fraction of the EU funding for operations, which is expected to run down with time.

J.Templon asked how this mandate compares to the OMII mandate (http://omii-europe.org/).
I.Bird replied that the OMII goal is mainly the standardization of the existing middleware and this will not be part of the EGI mandate.

5.2 EGI Design Study
The current work concerns the setup and operation of a new organizational model of a sustainable pan-European grid infrastructure. The main dates are:
- February 26-27: EGI Workshop in Munich
- May 2: Proposal submitted to the EC for funding within FP7-INFRA-2007-1, 1.2.1 Design Studies
- September 1: Project start (if approved)
- September 27: End of negotiations with EC
- October 2: EGI workshop in Budapest

The work will involve about 300 person months.

The institutes that will participate in the preparation of the design documents are:
- Johannes Kepler Universität Linz (GUP)
- Greek Research and Technology Network S.A. (GRNET)
- Istituto Nazionale di Fisica Nucleare (INFN)
- CSC - Scientific Computing Ltd. (CSC)
- CESNET, z.s.p.o. (CESNET)
- European Organization for Nuclear Research (CERN)
- Verein zur Förderung eines Deutschen Forschungsnetzes - DFN-Verein (DFN)
- Science & Technology Facilities Council (STFC)
- Centre National de la Recherche Scientifique (CNRS)

Slide 6 shows the Management Structure that has been agreed, with an Advisory Board and a Management Board. The overall Project Director is Dieter Kranzlmüller (Linz) and 6 work packages have been defined (WP1 to WP6). Below are the details of each work package, each with a leading partner (slide 7).

J.Knobloch also added that J.Shiers represents CERN in WP3: Functions Definitions.

I.Bird added that “Functions Definitions” will be about the functions of the EGI infrastructure from the NGIs, etc., not about the functionalities and requirements of the middleware software.

J.Shiers said that he will distribute the links to the EGI web site and to the Use Cases Letter that is being prepared.

Information received:
http://www.eu-egi.org/
http://www.eu-egi.org/public/EGI_Use_Case_Letter.pdf
5.3 Scope of WP5: Establishment of EGI
The main objectives of WP5 are:
- Generate with WP3 and WP4 the “blueprint” which will serve to establish EGI
- Get the Organization and its Conventions ratified by a significant majority of European States
- Prepare and start the transition from EGEE to EGI

WP5 will be based on the results from WP2, WP3 and WP4 and, if needed, direct investigations. It is vital that the process for establishing EGI completes successfully at least 3 months before the end of EGEE-III, anticipated to be March 2010. WP5 will therefore span 23 months starting in January 2008, not counting preparatory work done outside the project.

The main tasks of WP5 (with the partners working on them) will be:
- Establish the convention of the organisation (CERN)
- Get the convention agreed by a majority of European NGIs (all)
- Maintain the relationship with the EC in view of supporting EGI (CERN and GUP)
- Initiate and complete the ratification process with the NGIs willing to join EGI (all)
- Incorporate the organisation (CERN)
- Initiate and complete the hand-over from major RI-project (e.g. EGEE) operations (all)

The preparation work will be done mostly by the lead partner, CERN, but all partners will contribute to obtaining the agreement from the NGIs and to the ratification process.

5.4 Project Status
The “Description of Work” and the “Grant agreement Preparation Forms” have now been submitted to the EU.

The Project started on 1 September 2007 even though it is still waiting for official approval:
- Development of an NGI knowledge base
- Collected use cases (NGIs, EGEE, …)
- Elected chair of advisory board: Gaspar Barreira (Portugal)

5.5 Next Steps
Dec 2007:
Feb 2008:
Mar 2008:
Apr 2008:
June 2008:

J.Templon asked when the proposal will be negotiated with all the other NGIs that do not participate in the design project.
J.Knobloch replied that all the NGIs are represented in the Advisory Board, which has a more operational role than its name suggests. That board is now more an oversight body for the whole project than just an advisory one. In addition, the workshop in March 2008 is open to everyone who wants to contribute. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
6. VO-specific SAM Tests (VO-specific SAM tests) - Experiments
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The Experiments had agreed to comment on the results of the VO-specific SAM tests when they are above the targets (in red in the table below).

6.1 ALICE
Y.Schutz will ask for information about the SAM tests.

6.2 ATLAS
D.Barberis explained that the ATLAS tests are still under development; therefore the values reported, both the positive and the negative ones, are not to be considered very realistic. The issues were mostly in the test configuration of the connection between SE and CE at some of the sites.

A major bug was fixed only in mid-October; therefore part of the values for this month will not be reliable either. From this week on the ATLAS SAM values should be correct.

6.3 CMS
S.Belforte and A.Sciabá are looking into the issue.
- FNAL: there is a name clash in the SRM endpoint that is being fixed.
- IN2P3: a misconfigured test is being executed at IN2P3. CMS will fix it.

6.4 LHCb
The issues at INFN are about accessing files. This could be a mismatch between the file catalog and the SE, and not a site problem. There is no real daily follow-up of the tests and it is difficult to catch up at the end of the month. The numbers in the table are overly positive because the SAM tests do not test all the services needed by LHCb.

L.Dell’Agnello asked where INFN could find more information about the LHCb tests.
Ph.Charpentier replied that all the information is on the LHCb SAM tests page.

L.Robertson concluded that the tests are still under development and will be prepared during the next few months. Until then the results will be discussed in the MB but not presented in other reports. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
7. Sites Reliability Reports for September 2007 (Sites Reports; Slides) - A.Aimar |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
A.Aimar briefly commented on the Site Reliability Reports for September 2007. Here is the summary of the reliability since January 2007.

We have again 7 sites above 91% (current target) and 2 above 82% (90% of the target). The two sites below target (FNAL and RAL) are very close to it, so we could easily have had 9 sites above target this time.

Below is the progression of the global averages in the last 6 months:
Average 8 best sites: Sept 93%, Aug 94%, Jul 93%, Jun 87%, May 94%, Apr 92%
Average all sites: Sept 89%, Aug 88%, Jul 89%, Jun 80%, May 89%, Apr 89%

The site reports are available here and are summarized by the table below. Most of the issues are on:
- SRM and SE components that will anyway be upgraded; therefore not much progress is expected until these upgrades.
- Operational issues regarding certificates, network or power problems, or maintenance

J.Gordon explained that a certificate badly renewed on Friday was discovered only on Monday. This was enough to bring the reliability down to 90%, and therefore RAL is below the target.

F.Hernandez noted that the scheduled downtime for IN2P3 is not taken into account; otherwise the availability would be better.

G.Merino reported that the cause of the unavailability on 11 September is not clear. The SAM tests started working again at PIC without any intervention. As only PIC was down that day, it is not clear why the SAM tests failed and then recovered at PIC.

J.Templon added that in October there will be a week with a major problem at SARA and he submitted a ticket about it.

L.Robertson noted that from next quarter the target should rise to 93%.
|
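The thresholds quoted above can be checked with a few lines of arithmetic: 90% of the 91% target is 81.9%, which the report rounds to 82%. The sketch below applies the same bands to the monthly averages listed in this section; the function name and the choice of treating a value equal to the target as "above target" are assumptions made for illustration.

```python
TARGET = 91          # current reliability target, in percent (from the minutes)
NEAR_FRACTION = 0.9  # "90% of the target"


def classify(reliability: float, target: float = TARGET) -> str:
    """Place a reliability value (in percent) in one of the three report bands."""
    if reliability >= target:  # assumption: reaching the target counts as above it
        return "above target"
    if reliability >= NEAR_FRACTION * target:
        return "above 90% of target"
    return "below 90% of target"


# Monthly averages as listed in the minutes (percent).
best8 = {"Sept": 93, "Aug": 94, "Jul": 93, "Jun": 87, "May": 94, "Apr": 92}
all_sites = {"Sept": 89, "Aug": 88, "Jul": 89, "Jun": 80, "May": 89, "Apr": 89}

for month in best8:
    print(month, classify(best8[month]), "/", classify(all_sites[month]))
```

With the target raised to 93%, the lower band would start at 83.7%, which can be explored by calling `classify(value, target=93)`.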
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
8. AOB |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
I.Fisk and R.Tafirout were confirmed as speakers at the Comprehensive Review. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
9. Summary of New Actions |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The full Action List, current and past items, will be in this wiki page before next MB meeting. |