LCG Management Board

Date/Time: Tuesday 7 February 2006 at 16:00
Agenda:
Members:
(Version 1 - 17.2.2006)
Participants: A.Aimar (notes), S.Belforte, I.Bird, K.Bos, N.Brook, T.Cass, Ph.Charpentier, D.Foster, J.Gordon, M.Lamanna, E.Laure, H.Marten, P.Mato, G.Merino, B.Panzer, R.Popescu, R.Pordes, Di Qing, L.Robertson (chair), Y.Schutz
Action List
Next Meeting: Tuesday 21 February 2006 from 16:00 to 18:00
1. Minutes and Matters Arising (minutes)

1.1 Minutes

No feedback was received; the minutes of the previous meeting were approved.

1.2 Matters Arising

FZK SC4 throughput tests in April
In April FZK will be in a "hardware set-up" phase and will have difficulty performing the SC4 throughput tests in that period. This will be discussed with J.Shiers and the SC4 team.

SC4 Sites Planning
The initial site plans for SC4 (by F.Donno) are available on the LCG/Planning page, in the section "SC4 Planning". Sites and experiments should review the plans and send their feedback to F.Donno.

2. Action List Review (list of actions)

· 31 Jan 06 - B.Panzer will discuss with the experiments and present to the MB a plan with possible dates and resources for the experiments and for IT activities (performance tests, etc.). The plan will include a proposal for the ATLAS TDAQ large-scale tests.
· 31 Jan 06 MB - J.Gordon will prepare a presentation on the situation of grid accounting.

3. EGO status - E.Laure (presentation)

The EGEE Project Office has organized a workshop to discuss a strategy towards a permanent European Grid Organization (working name "EGO"), with the participation of the EGEE MB and of Kyriakos Baxevanides from the EU. The goal of EGO is to provide a sustainable grid infrastructure as a follow-up to the EGEE 2 project. A summary by W.Von Rueden will be distributed. The presentation at the MB described the goals and the open issues.

The model followed is that of GEANT for the network organization. EGO would be a consortium of the partners involved in EGEE 1 and 2 and would provide an infrastructure in collaboration with the National Grid Initiatives (NGI). Slide 4 shows the list of NGI partners already working with EGEE. The resources would be owned by the NGIs; EGO would enable them to work together and share common channels of communication and operations.

Slides 5 and 6 show that EGO would provide:
- Inter-operation among the NGI infrastructures, coordination of common services, and interaction with user communities (VOs) and other organizations (EU, GEANT, etc.).
- Testing and certification of middleware, as well as coordination of the NGI support groups.
- Dissemination, including media relations, event organization and public surveys, as well as public outreach and training of the NGI management and operations.

Industry could benefit by making commercial use of this infrastructure or by acting as a sub-contractor for support, operations and infrastructure. Industry could also contribute via manpower (joint projects) and sponsorship (equipment). The funding could come from several channels (NGI, EU FP7, openlab, etc.) but this is still all under discussion. Governance of the organization would be by a separate entity, initially hosted at CERN but later moving to separate premises. Slide 9 shows the calendar of key events, with a target date of Q1 2008 for the constitution of EGO.

The open issues that still need to be discussed are (slide 10):
- What type of organization should EGO be (a consortium or an organisation)?
- Should EGO be involved in grid development? Should it have R&D projects?
- What is the right model for funding involving the EU and the NGI organizations?
- How will it interact with user communities and bring benefits to industry?

A short discussion followed. EGO's focus will not be limited to Europe, because interoperability, standardization and collaboration with other organizations are fundamental, and the LCG should not have to interact with many grid bodies and separate regional organizations. Currently some Tier-2 sites depend on EGEE funds for operations, but in the future this funding will have to come from other sources.

4. OSG status - R.Pordes (presentation)

Status of the middleware support:
- Condor and Globus have funding for 5 more years from NSF.
- GriPhyN has ended; there may be one more year to bridge the gap for VDT.
- PPDG and iVDGL will complete by the end of FY2006.

Slide 3 describes the organization chart and the next steps. The OSG consortium is being transformed into a project-like structure, with the goal of delivering and operating a Distributed Facility to meet LHC needs and timescales.

OSG includes three thrusts (slide 4):
- Distributed Facility: including operation, security, middleware, etc., in order to define a long-term infrastructure.
- Education, Training and Outreach: with grid schools, minority and international outreach, etc.
- Science Driven Extension: adding capacity to the Facility via joint projects with external teams and organizations.

The call for funds (SciDAC 2 call) has been sent; see slides 5 and 6 for the funding levels and the scope. The same single proposal has been sent to multiple program offices in DOE and NSF. It is expected that by June 2006 the call will be answered and, if the answer is positive, work should start in FY07 (July or September 2006) for 3 or 5 years.

The cooperation between OSG and EGEE to deliver to the WLCG is continuing. The collaboration focuses mostly on:
- OSG 0.4.0 includes the EGEE information providers, the publishing through the OSG-BDII to the LCG-BDII, and support (to date) for RB submissions for ATLAS, CMS and Geant4.
- Common testing and monitoring tools (SFT and ACDC).
- OSG 0.4.1 is working on testing interoperation and collaboration with the gLite CEMon.
- Synchronizing the schedules for the SRM deployment.
- There are common concerns about the VOMS ADMIN situation, with issues of compatibility of role-based authorization between EGEE and OSG.

OSG is also working on interoperability with TeraGrid and contributing to the Multi-Grid Interoperability work between EGEE and OSG.

5. Accounting issues and policy - J.Gordon (document, transparencies)

There is not a single solution for accounting: there are multiple grids and therefore multiple accounting systems have been developed. The accounting system in production since December 2004 is APEL, with about 180 sites publishing data and about 100K records per week. Slides 6 to 10 show several examples of views and aggregations of the accounting data.

The issues currently open are:
- APEL is not fully deployed by the sites.
- Legal and privacy concerns.
- Validation of the data: are all records collected? Is the normalisation correct?
- Level of detail.
- Other resources to account for, in addition to CPU.
- Maintenance problems for the APEL system. The system depends on data coming from many log files and other sources; they are all different and change format at almost every release (Globus, batch systems, LSF, gLite, etc.).
- Standards and interoperability.

The MB discussed the mandatory deployment of accounting at all sites, whether accounting should also include local job submissions, and the granularity of the data (VO, group, or user level).

Decision: As also stated in the MoU, accounting from the sites should be mandatory at the VO group level, and the data should be submitted to a central repository. All LHC usage (grid and locally submitted work) must be accounted for. Reports to the CB and to the RRB will only take into account the data in the central accounting.

Action (to report to the GDB): 1 Apr 2006 - Sites should provide information to the GDB about any legal issues in their country.

Action: 28 Feb 2006 - J.Gordon will ask the sites to install the accounting software. If they do not install it, they should explain the reasons for not doing so. J.Gordon will report to the MB early in March.

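As a purely illustrative sketch of the normalisation and aggregation questions raised above (this is not the APEL implementation; the record fields, site scaling factors and group names are hypothetical), raw batch-system CPU seconds could be normalised to KSI2K-hours and summed per VO group roughly as follows:

    # Hypothetical sketch: normalising raw CPU records to KSI2K-hours
    # and aggregating them per VO group. Field names and scaling factors
    # are illustrative only, not the actual APEL schema.
    from collections import defaultdict

    # Per-site scaling factor: SpecInt2000 rating of a worker node / 1000 (KSI2K).
    SITE_KSI2K = {"SiteA": 1.2, "SiteB": 0.8}

    records = [
        # (site, vo_group, cpu_seconds) as they might come out of batch logs
        ("SiteA", "atlas/production", 7200),
        ("SiteB", "cms/analysis", 3600),
        ("SiteA", "atlas/production", 1800),
    ]

    usage = defaultdict(float)  # VO group -> normalised KSI2K-hours
    for site, vo_group, cpu_seconds in records:
        usage[vo_group] += cpu_seconds / 3600.0 * SITE_KSI2K[site]

    for group, hours in sorted(usage.items()):
        print(f"{group}: {hours:.2f} KSI2K-hours")

The validation questions in the list above amount to checking that every job in the batch logs produces exactly one such record and that the per-site scaling factors are correct.
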
6. OPN update - D.Foster (transparencies)

This update follows the meeting on 31 January. The Optical Private Network for the LHC includes all the infrastructure needed for connecting the Tier-0, Tier-1 and Tier-2 sites and relies on many different networks (e.g. USLHCNet, NRENs, GEANT, Internet, etc.). A first clear objective is to take data quickly and reliably from the Tier-0 to the Tier-1 sites, but the complexity increases when looking also at the Tier-2 sites and all the possible interconnections.

Currently the status of the different parts of the LHCOPN to the Tier-1 sites is:
- Complete: SARA, IN2P3, CNAF, FNAL
- Spring: FZK, BNL
- Mid-year: RAL, NDGF, TRIUMF
- End of year: PIC
- Unclear: ASGC (plans to go to 2x2.5 Gb/s)

Probably 6 sites will be ready for April, but 3 of them will not be on GEANT. Therefore the "OPN milestones" in the WLCG High Level Plan should be modified accordingly.

The open issues and problems to solve are:
- Cost sharing with the GEANT partners is not agreed yet and negotiations are still on-going.
- Some links are not dedicated but are also used by other traffic; they will become fully available when needed.
- Monitoring and Operations are not very clear yet. Several options exist but the exact processes have not been defined yet.
- Many network operations are not supported at a 24x7 level.

Tier-1/Tier-1 connections are not organized as part of the OPN and are discussed directly between the sites, presumably using the standard Internet connectivity. A connection (via CERN) could be provided but it is not planned unless it turns out to be really needed. Backup links are usually available for all connections and there are several possible options; the focus now is on the main links to the Tier-1 sites.

More information on the OPN status is available at: http://lhcopn.cern.ch

7. Scheduling of the CERN resources in 2006 - B.Panzer (document)

The document describes the planning of the IT resources at CERN for 2006, including allocations of CPU, disk storage and tapes. Pages 3 and 4 list the activities planned. Experiments should check that their plans agree with this general schedule; clashes will have to be discussed. An example is on page 5 for the "ATLAS DAQ Tests": during 4 weeks in October the other experiments will have a reduced amount of resources available (e.g. in week 45 only 55% of the CPU will be available to the other experiments).

B.Panzer will produce a detailed week-by-week table describing the resources needed by each experiment for every week of 2006. Once this initial schedule is done, all clashes will have to be discussed and negotiated (see the illustrative sketch after the action below).

Action: 28 Feb 2006 - Experiments should clarify with B.Panzer their exact schedule (week numbers) and the resources needed (KSI2000) in 2006, so that the whole plan can be consolidated.

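As a minimal sketch of such a week-based clash check (the capacity figure, experiment names and requests below are hypothetical, not taken from the document or from B.Panzer's table), the weekly requests can be summed and compared against the available capacity:

    # Hypothetical sketch of a week-by-week clash check between the assumed
    # CERN CPU capacity and the experiments' requests (KSI2K = KSI2000 units;
    # all numbers are illustrative).
    TOTAL_CAPACITY_KSI2K = 2000  # assumed total CPU capacity per week

    # requests[experiment][week] = KSI2K needed in that week of 2006
    requests = {
        "ATLAS": {44: 600, 45: 900, 46: 600},
        "CMS":   {44: 500, 45: 700},
        "LHCb":  {45: 400},
    }

    for week in range(1, 53):
        asked = sum(per_week.get(week, 0) for per_week in requests.values())
        if asked > TOTAL_CAPACITY_KSI2K:
            share = TOTAL_CAPACITY_KSI2K / asked * 100
            print(f"Week {week}: requested {asked} KSI2K exceeds capacity "
                  f"({share:.0f}% of each request satisfiable pro rata)")
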
8. AOB

Because of the CHEP conference the next meeting is in 2 weeks.

9. Summary of New Actions

Action (to report to the GDB): 1 Apr 2006 - Sites should provide information to the GDB about any legal issues in their country concerning the reporting of accounting data.

Action: 28 Feb 2006 - J.Gordon will ask the sites to install the accounting software. If they do not install it, they should explain the reasons for not doing so. J.Gordon will report to the MB early in March.

Action: 28 Feb 2006 - Experiments should clarify with B.Panzer the exact schedule (week numbers) and the resources needed (KSI2000) in 2006.

The full Action List, with current and past items, will be in this wiki page before the next MB meeting.