LCG Management Board |
|
Date/Time: Tuesday 11 July 2006 at 16:00
Agenda: |
|
Members: |
|
|
(Version 3 - 19.7.2006)
Participants:
A.Aimar (notes), L.Bauerdick, M.Barroso, S.Belforte, I.Bird, K.Bos, N.Brook, T.Cass, Ph.Charpentier, L.Dell’Agnello, I.Fisk, B.Gibbard, J.Gordon (chair), M.Lamanna, H.Marten, G.Merino, Di Quing, H.Renshall, Y.Schutz, J.Shiers, O.Smirnova, R.Tafirout, J.Templon
Action List |
|
Next Meeting: Tuesday 18 July from 16:00 to 17:00
1. Minutes and Matters arising (minutes) |
|
1.1 Minutes of Previous Meeting
The minutes of the 4 July 2006 meeting were only distributed on Tuesday; therefore there will be a few more days for comments. Apologies from A.Aimar.

1.2 QR report 2006Q2
The reports have been expected since 10 July 2006.
Received before the meeting (chronological order): CERN, FZK, ALICE, Deployment Area, SARA-NIKHEF, INFN, ATLAS, and PIC.
Update: Received since then: ASGC, TRIUMF, IN2P3, LHCb.
Missing QRs at the time of the MB: Applications Area, ARDA, CMS, DB-Services, GDB, NDGF, RAL, SC4.
|
2. Action List Review (list of actions)
Actions that are late are highlighted in RED.
|
Not done: BNL, FNAL, NDGF and TRIUMF.
- BNL had some difficulties that are being solved.
- FNAL: Their FTS server is running. The connections to the Tier-1s are running; the ones to the Tier-2s are still being defined and will be done this week.
- NDGF does not have any hardware for now.
- TRIUMF tested the Tier-1/Tier-2 connections but not yet the connections to other Tier-1 sites.
Not done.
Not done yet.
Not done. Presented to the GDB in June and asked for feedback, but none was received. A reminder was given at the July GDB. If J.Gordon does not receive any use cases he will propose some himself.
|
|
3. Evolving the Operations and Service Coordination meetings (transparencies) - M.Barroso |
|
The presentation, by M.Barroso, proposed some changes to the WLCG Operations and Service Coordination meetings. First she presented an overview of the current meetings (slide 2).

3.1 Current Meetings

WLCG-OSG-EGEE Operations Meeting (OPS) - Link to Agendas
- Mondays at 16:00 (Wednesdays from mid-September) in 28-R-15.
- Attended by representatives of the ROCs, sites, VOs, operations coordination and the GGUS team.
- Discusses and solves operational issues from the previous week, as raised by ROCs and VOs.
- Distributes and discusses information about operational tools and procedures, and future releases.

L.Bauerdick asked whether the move to Wednesdays in September was definitive. M.Barroso replied that so far there had been no objections. Wednesday is considered better because (1) the OPS meeting is then not just after the SCM meeting and (2) it is not at the beginning of the week, when there are typically fewer attendees.

No objections from the MB to moving the OPS meeting to Wednesday at 16:00.

LCG Service Coordination Meeting (LCG SCM) - Link to the Agendas
- Wednesdays at 10:00 in the OpenLab space at CERN (513 ground floor).
- Attended by the service responsibles from the FIO, PSS and GD groups at CERN.
- Defines and discusses the deployment and delivery of the CERN services.

LCG Resource Scheduling Meeting (LCG RSM) - Link to the Agendas
- Mondays at 15:00 in 40 R-D-10, chaired by J.Shiers.
- Attended by the LHC experiments, Tier-0, SC and PPS representatives.
- The experiments bring their resource requirements, schedules and plans.

J.Templon and J.Gordon asked for details about the RSM meeting. J.Shiers explained that the meeting was started after Mumbai in order to receive the experiments' requirements and plans as soon as they change. For any change H.Renshall updates the wiki page where the Experiments Plans are collected. If the changes are important he also reports them to the following Operations Meeting (OPS) in order to inform all the Tier-1 sites. H.Renshall also reports to the MB, periodically or if there are important issues.

Daily CERN Operations Meeting - Link to the Agendas
- Every morning at 09:00 in the OpenLab space.
- Attended by the SMOD, GMOD and the service teams in FIO, PSS, DES, CS and GD.
- It focuses only on short term operational issues.

3.2 People involved

Slide 3 shows the people and teams that coordinate the meetings:
- The Operations Coordination team: Maite Barroso and Nicholas Thackray
- The Service Coordination team: Jamie Shiers, Harry Renshall, James Casey, Maarten Litmaath, Flavia Donno
- The Experiment Integration Support (EIS) team: Andrea Sciaba, Simone Campana, Patricia Mendez Lorenzo, Roberto Santinelli

3.3 Proposed Changes to the OPS and SCM Meetings (Slides 4 and 5)

Improve the Information Flow
- The EIS team should report, for each experiment, to the SCM meeting with an overview of the achievements of the week, possible suggestions and the list of outstanding problems.
- The Coordination team will collect this information and take it (in writing) to the Operations Meeting, in summarized form and focusing on the outstanding issues to solve. Problems should be discussed and followed up at the Operations Meeting until they are solved.

Improve the Escalation of Problems
The experiments should report all problems, which should then be escalated until they are solved:
- 1st step: experiments report to GGUS
- 2nd step: experiments escalate, via EIS, the unsolved GGUS ticket to the LCG SCM meeting
- 3rd step: the Service Coordination team brings the issue to the Operations Meeting and a new action is created in the OPS Action List, referring to the original GGUS ticket(s), which are put on “hold” status.

J.Shiers noted that this process was used with CMS recently and some old outstanding problems were solved efficiently. L.Bauerdick agreed that this process worked very well for their case.

J.Gordon asked how the GGUS tickets were escalated. It was explained that they were put on hold and managed through the Operations Meeting, where the actions were referenced by GGUS ticket number.

Changes to the Operations Meeting
The OPS meeting should follow the action list more rigorously and prioritize the actions better:
- Extract the “Top 10 actions” with highest priority, and discuss them at every meeting.
- Define the escalation step to undertake when an action is not fulfilled on the due date (e.g. escalate to

As already mentioned above, when discussing the “information flow”, the OPS meeting will have a section “weekly report on experiments activities and issues”.

Ph.Charpentier noted that the SCM meeting involves mostly people from CERN services and that the participation should be re-discussed. J.Shiers agreed that the participants are mostly from CERN, but replied that many experts are at the meeting (on Castor, FTS, LFC, etc) and they do not cover only CERN issues but all the GGUS tickets that they receive, from the whole LCG community. All services are *covered* at SCM but some (dCache, WMS, etc) are not *represented* by the responsible people.

J.Templon said that it is important that the EIS team reports as much information as possible to the SCM meeting. I.Bird and M.Barroso noted that this is what should already be happening: the experiments should already “use” the EIS team as the channel for escalating if the GGUS tickets are not solved.

Decision: The MB endorsed the changes to the OPS and SCM meetings proposed in the presentation.

Note: N.Brook restated that LHCb, as announced at the GDB, will continue to define parallel contact channels with the Tier-1 sites.
|
4. LCG Bulletin Proposal (initial proposal of bulletin, email text) - A.Aimar |
|
A.Aimar described the purpose of the bulletin along with the text of the email distributed to the MB before the meeting.

The goal of the bulletin is to streamline the distribution of information in the LCG and to concentrate in a short summary "what should be known" by the people in the LCG, providing all links to such information.

The Bulletin will be distributed every two weeks. Therefore future issues will be shorter than the proposal below. Anybody wishing to have some (relevant) information published in an issue should simply send it to A.Aimar.

The suggestions received by email and at the meeting were discussed.

D.Boutigny suggested reversing the chronological order (most recent first). In the future the bulletin will be much shorter. Note: Done already in Bulletin Issue No. 1.

J.Gordon suggested that the link to accounting and monitoring (SAM) data should be provided in a consistent way and be easily reachable from every bulletin issue. Note: Already done in Bulletin Issue No. 1. The top heading of the bulletin, “Sites Availability - Accounting Summary - LCG Planning”, contains links that will be in every bulletin issue.

O.Smirnova suggested using some weblog tool where many can contribute. J.Gordon answered that the goal of the bulletin is to concentrate all the information that is already in several wikis, web pages, agendas, etc. and to provide a short summary from which to reach that information. One more weblog where many contribute would not help. Alberto's editorial input is what distinguishes the bulletin from just another wiki or blog.

J.Shiers suggested having one section collecting information from sites to/from the experiments. But it was unclear at the MB how this would work and whether there would be any contributions. This possibility is postponed for the moment.

The MB agreed on the sections in the initial proposal of the bulletin except the “Top Concerns” section. That section should be discussed in more detail at the MB. What should its contents be? How would entries be added to and removed from the list? Which body would decide what the top concerns are?

Action: The MB should reconsider whether to have somewhere a “top concerns” list and, if so, how to manage it.

Decision: The MB supports the initial proposal and the distribution of Issue No. 1. It also suggested that the MB meetings should have an entry like “Items for the Bulletin”.
|
5. AOB |
|
No AOB. |
|
6. Summary of New Actions |
|
Action: 31 July 2006 - The MB should reconsider whether to have somewhere a “top concerns” list and, if so, how to manage it.

The full Action List, current and past items, will be in this wiki page before the next MB meeting.