Maite pointed out that, after a period in which all reports were completed, some ROCs are again failing to submit theirs. She stressed that completing the report is in the ROCs' own interest and is something they should be happy to do.
Define with gstat (roc-dev@listsNOSPAMPLEASE.grid.sinica.edu.tw) the new value to be added to the list of allowed OS names to describe the Scientific Linux 5 run at the site. Update 17th April: Min is looking at it, but the site should really submit a ticket, as per the instructions on http://goc.grid.sinica.edu.tw/gocwiki/How_to_publish_the_OS_name
An issue in the portal (confirmed after the meeting) prevented the reports from being read. After the fix, no further issues were reported.
News about upcoming releases in the Agenda. Notably the certification and imminent release to PPS of the WMS and LB for SL4 was announced.
Lev Shamardin (ROC Russia) asked for news about the outcome of certification.
Antonio (PPS): Release notes not received yet. Apparently there is still an open issue on bulk submission affecting DAG jobs, but collections are working fine.
Maite (OCC): is the WMS now supposed to go through the full PPS cycle?
Antonio: a pre-deployment test will certainly be done. A lot will depend on whether the VOs are available to test the WMS in the PPS environment. If they are not, a "pilot" service will have to be set up at some friendly production sites. This is a survey/decision to be made as soon as the release documentation is available.
Roberto Santinelli (LHCb): the two-month grace period foreseen by the gLite Middleware team for the support of the lcg-RB after the deployment of the new WMS is not enough for LHCb. The development of Dirac2 (the submission engine currently used for LHCb production) is frozen and the framework can use only the RB. The VO is working intensively on the development of Dirac3, which supports the new system, but this will not be ready in two months.
Maite (OCC): there are no problems from the operations point of view. We will forward the point to the gLite team.
Oliver Keeble (gLite): commented later (off-line via e-mail):
On Mon, 7 Apr 2008, Oliver Keeble wrote:
>
> Well, this is a largely hypothetical discussion as we haven't released
> an RB update for months and there are in fact no RB developers left...
> So, I don't think we can meaningfully 'extend support', but this does
> not prevent LHCb from continuing to use their existing RBs.
>
> Oliver Keeble Information Technology Department
> oliver.keeble@cern.ch CERN
> +41 22 76 72360 CH-1211 Geneva 23
>
>
> Maite Barroso Lopez wrote:
> > Hi Oliver,
> >
> > At today's ops meeting, Roberto said that the 2 months time after
> > the SL4 WMS release to support the RBs might not be enough for LHCb:
> >
> > Lhcb is moving to dirac3, but normal production uses dirac2, which
> > still uses the RB.
> >
> > Could you please discuss it internally and with the RB developers
> > so we extend the RB support till LHCB migrates to the new Dirac version?
> >
> > We can check where they are in 2 months from now (action at the ops
> > meeting).
> >
> > Thanks,
> >
> > Maite
| Assigned to | Due date | Description | State | Closed | Notify |
| Main.OCC | 2008-06-10 | Check with LHCb the status of the development of Dirac3 (the version of the submission engine interfaced to the WMS). Update 17th April: Dirac3 will not be released for at least 2 months; close the action item for now. | | | |
(ROC France): Some site administrators complained because their e-mail address was added to a VO mailing list without their agreement. The VO has been contacted and the problem is being solved, but the incident raises the more general problem of spam generated by the project itself. Could we agree on a good rule for administering mailing lists? At the very least, except for some obvious and mandatory mailing lists, a person should be able to unsubscribe from any mailing list by him/herself. The way to unsubscribe should be made clear by the mailing list.
Rolf (ROC France): it is annoying for sysadmins to be solicited for application and usage problems as well. This issue was also reported to the Coordinator of VO Management (Pierre Girard), so we hope that such incidents remain sporadic.
(ROC DECH): Please reopen action item 150. The problem is still present. See GGUS:33850.
Maite (OCC): the action was discussed and closed last week. After an intervention on SAM's side, it turned out that also gstat needed a fix, so the ticket was re-opened and assigned to gstat. The action is re-opened to track the issue.
Ioannis: as the test affects the availability results of a site which is currently at risk of suspension, it is very important for our ROC to be able to rely on correct SAM data in this case.
Maite (OCC): The ticket was wrongly assigned to the RB software support, where nobody was listening, and it stood idle for a long time. It has now been re-assigned to the SAM support, and there are already some replies from the supporters.
(ROC SEE): LCG-TAU still has some problems, thus it is now in downtime for the next 7 days in order to upgrade to the latest gLite release.
(ROC SWE): We would like to have an update on Top-BDII failover awareness in the gLite client tools. Is it possible to configure several BDIIs in the form of a list with YAIM?
Lev Shamardin (ROC Russia): at least GFAL and lcg-utils should support a multi-BDII configuration, although YAIM does not support this option. A list of BDIIs can be given in the LCG_GFAL_INFOSYS variable since gfal version 1.10.6. No corresponding Savannah bug was opened.
The feature will be tried by Atlas and CMS (mostly interested in the option) and eventually documented in the release notes.
Verify and document in the User Guide the option to configure the GFAL client to use multiple BDIIs. Update 17th April: Maite will check. Update 19th May: Andrea changed this on the same day the action was raised. This action can be closed.
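As a rough illustration of the multi-BDII option discussed above, the sketch below builds a value for LCG_GFAL_INFOSYS from several endpoints. It assumes that the variable takes a comma-separated list of host:port entries; the host names and the helper build_infosys_value are hypothetical, and 2170 is used only because it is the customary BDII port. The exact syntax should be checked against the GFAL release notes.

```python
import os


def build_infosys_value(endpoints):
    """Join BDII endpoints ("host:port") into one comma-separated string.

    Assumption: GFAL >= 1.10.6 accepts a comma-separated list in
    LCG_GFAL_INFOSYS. This helper only validates and formats the value.
    """
    cleaned = [e.strip() for e in endpoints if e.strip()]
    if not cleaned:
        raise ValueError("at least one BDII endpoint is required")
    for endpoint in cleaned:
        host, sep, port = endpoint.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {endpoint!r}")
    return ",".join(cleaned)


# Hypothetical top-level BDIIs; replace with the real ones for the site.
bdiis = ["bdii1.example.org:2170", "bdii2.example.org:2170"]
os.environ["LCG_GFAL_INFOSYS"] = build_infosys_value(bdiis)
```

Any GFAL or lcg-utils command started from this process would then inherit the variable.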
(ROC UKI): GGUS should clarify whether the UKI-SOUTHGRID-CAM-HEP problem of 100 mails for the same ticket is a bug.
A reply was sent by GGUS via email:
From: Grein, Guenter [mailto:Guenter.Grein@iwr.fzk.de]
Sent: Monday, April 07, 2008 3:58 PM
To: Maite Barroso Lopez; Torsten Antoni
Cc: Maria Dimou; Guenter Grein
Subject: RE: GGUS issue for today's ops meeting
[...]
Dear Maria and All,
This huge traffic was caused by a guy from UK. He has entered the GGUS mail address helpdesk@ggus.org to the "Assign ticket to one person" field. As every update triggers an email to the mail addresses in this field, the system was looping until I got aware of it and stopped it.
Meanwhile we have updated our mail parsing tool to avoid such things in future.
Best regards
Guenter
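The kind of guard described in the reply can be sketched as follows. This is not the GGUS mail parsing tool, whose implementation is not described in these minutes; the function name and the set of protected addresses are hypothetical, and the only idea taken from the e-mail is that the helpdesk's own address must never end up among the notification recipients.

```python
# Addresses owned by the helpdesk itself (assumption: only the one
# mentioned in the e-mail; a real deployment may have more).
HELPDESK_ADDRESSES = {"helpdesk@ggus.org"}


def filter_notification_recipients(assignees):
    """Return the assignee addresses that are safe to notify.

    Helpdesk addresses are dropped: a notification sent to the helpdesk
    would be treated as a ticket update and trigger another notification,
    which is the loop described above.
    """
    safe = []
    for address in assignees:
        normalized = address.strip().lower()
        if normalized and normalized not in HELPDESK_ADDRESSES:
            safe.append(normalized)
    return safe
```

Comparing normalized (trimmed, lower-cased) addresses avoids the guard being bypassed by a trailing space or different capitalisation.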
(ROC UKI): There have been many complaints in UKI about the new requirement to complete the site reports every day. Site admins often fill out the report for the whole week in one go and this seems a sensible approach; at least they should be able to choose. Several sites have indicated that they will stop filling out the reports in this new format. On the positive side, the new interface seems better, with the graphical representation of downtime etc. However, it would be very welcome if the colours used were consistent between tools. Previously grey represented downtime and red a failure; now we have black. Sites would also like to see the past history for the report so they can cross-reference previous failures, a feature that was lost in this upgrade.
Gilles Mathieu (CIC Portal): The developers have already been contacted and they will restore the functionality as it was, offering site admins both options.
Maite encourages the ROCs to provide written feedback on the new interface.
(ROC UKI): The move to validating every use of a certificate on a site is becoming tedious. Is this a feature of the browser settings or does everyone get greeted with constant requests to use their certificate? Is it possible to have a compact view and a detailed view of site problems? I cannot see correlations between sites anymore.
Maite: This needs clarification: Is it a general comment or related in particular to the CIC portal?
| Assigned to | Due date | Description | State | Closed | Notify |
| Main.UKRoc | 2008-04-14 | Clarify the scope of the issue reported in WlcgOsgEgeeOpsMinutes2008x04x07 about continuous certificate requests. Is it a general comment or related in particular to the CIC portal? Update 17th April: Gilles has done something. | | | |
Test of the Tier0 to Tier1 Optical Private Networks backup links from 15.00 to 19.00 CEST (13.00 to 17.00 UTC) on Wednesday 9 April.
More details in agenda.
ATLAS Service (Simone Campana, Alessandro Di Girolamo)
Deployment of the new version of DPM (1.6.7-4): request for update (detailed request on the agenda)
Antonio (PPS, SA1): as explained in the release section, a technical issue in the creation of the repository prevented the deployment to production from happening last Wednesday as announced. It will definitely be done today.
P.S.: gLite 3.1 Update 18 was actually released a few hours after the meeting.
ATLAS sites with lcg-utils for SRM2: request to the ROCs for follow-up (detailed request on the agenda)
Maite: from the SAM link, half the sites seem to have fixed the problem. The ROCs are kindly invited to finish the work. Are there issues at any sites which Atlas would like to address in particular?
Alessandro Di Girolamo (Atlas): No, the T1s are all working and this is the important thing for Atlas.
Are there any updates regarding the SAM test being developed by Atlas to test the size of the Atlas SW area?
Alessandro sent details in an e-mail:
From: Alessandro Di Girolamo
Sent: Monday, April 07, 2008 4:46 PM
To: Maite Barroso Lopez
Subject: ATLAS issue: 100GB space in the sw area
Ciao Maite,
we are running the test CE-sft-vo-swspace that is trying to understand how much
space has been allocated for the ATLAS sw area.
https://lcg-sam.cern.ch:8443/sam/sam.py?CE_atlas_disp_tests=CE-sft-lcg-version&CE_atlas_disp_tests=CE-sft-vo-swspace&order=RegionName&funct=ShowSensorTests&disp_status=na&disp_status=ok&disp_status=info&disp_status=note&disp_status=warn&disp_status=error&disp_status=crit&disp_status=maint
I said "trying" since not for all the sites is possible to retrieve correctly this information,
but it is already a good starting point.
Could be very useful if ROCs could have a look to the sites belonging to their clouds and
try to see the output one by one to see if sites match the 100GB ATLAS request (or if the
sites has problem in giving this number and in this case would be useful if the ROC would
directly ask to the site admin)
Thanks
CiaoCiao
Ale
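For ROCs or site admins who want a quick local cross-check against the 100GB request, a minimal sketch is given below. It is not the CE-sft-vo-swspace test, whose implementation is not described in these minutes. It assumes that "100GB" means 100 * 10^9 bytes and that the total size of the file system hosting the software area is an acceptable proxy for the space allocated to the VO, which will not hold where quotas or shared file systems are used. The function names are hypothetical.

```python
import shutil

# Assumption: the 100GB request is read as decimal gigabytes.
REQUESTED_BYTES = 100 * 10**9


def meets_request(total_bytes, requested_bytes=REQUESTED_BYTES):
    """Return True if the given size satisfies the requested size."""
    return total_bytes >= requested_bytes


def check_sw_area(path):
    """Return (total_bytes, ok) for the file system that contains `path`.

    Note: this reports the size of the whole file system, not a per-VO
    quota, so it can overestimate the space really available to the VO.
    """
    usage = shutil.disk_usage(path)
    return usage.total, meets_request(usage.total)
```

A site admin could call check_sw_area on the directory published as the ATLAS software area (for example the path held in VO_ATLAS_SW_DIR, if that variable is set on the worker nodes) and compare the result with what the SAM test reports.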
ALICE Service
No report received
CMS Service (Andrea Sciaba')
Nothing to report
LHCb Service (Roberto Santinelli)
(LHCb): LHCb is planning week by week. The effort is currently focused on the development of Dirac3, to accelerate the commissioning of the framework. Work is in progress on some relevant new features of Dirac3 which were not implemented during the first phase in February but will be in place for May.
Last week a lot of sites were found in the SAM DB which are not in the production BDII. This is being analysed with the SAM experts (Judit). Apparently the sites are relics of other ages and a clean-up is probably needed.
Antonio Retico: These sites are currently not displayed in the SAM portal. Can the same flag not be applied also for LHCb applications?
Roberto: in theory, however there are actually issues recognised by Judit and she is working on them.
CCRC08; Sites
No comment
Roberto (LHCb) asks PIC and SARA (or the relevant ROCs) for an update (ping) about the status of the installation of the LFC, needed by the 18th of April.
Goncalo Borges (PIC): The schedule is unchanged: we still foresee to deliver next week.
Jules Wolfrat (NE ROC) will ask Ron Trompert to update the status.
GGUS:31037 was closed; agreed it was a mistake
GGUS:33220: a long discussion between Steve Traylen and Rob Quick. It is still not clear what the conclusion is. An interesting point is that the UFRJ, already part of EELA, is managed by OSG. Can this rule be somehow generalised and a FAQ be generated accordingly for the OSG support unit in GGUS?
Rob Quick: Definitely not. We will send a list of resources outside the US supported by OSG.
An e-mail was received from Rob Quick following the discussion at the OPS meeting:
----------
Maria,
Here are the OSG resources not within the US borders.
Rob
Taiwan: osgc01.grid.sinica.edu.tw
Europe: rhilxs.ph.bham.ac.uk
South America: osgce.hepgrid.uerj.br, osg-ce.sprace.org.br, osg-se.sprace.org.br
----------
GGUS:33220 will be closed and split internally into two GOC tickets, to be handled separately at the UFL and UFRJ sites.
Maria: Announced the phone conference of the User Support Advisory Group on Apr. 10th at 11am CEST, for VOs, ROCs and T1 sites.
The USAG agenda is available at:
http://indico.cern.ch/conferenceDisplay.py?confId=30349
Next Meeting
The next meeting will be Monday, 14 Apr 2008 15:00 UTC (16:00 Swiss local time).
Attendees can join from 14:45 UTC (15:45 Swiss local time) onwards.
The meeting will start promptly at 15:00 UTC (16:00 Swiss local time).
The WLCG section will start at the fixed time of 15:30 UTC (16:30 Swiss local time).