WLCG-OSG-EGEE Ops' Minutes Mon 09 Mar 2009

Summary

gLite 3.2 is announced as the first version of gLite to support signed RPMs

LHCb is doing extensive testing of GGUS alarm tickets against T1 sites

Attendance

EGEE

  • Asia Pacific ROC: ShuTing Liao
  • Central Europe ROC: Malgorzata Krakowian
  • OCC / CERN ROC: John Shade, Antonio Retico, Steve Traylen
  • French ROC: Rolf Rumler, Helene Cordier
  • German/Swiss ROC: Angela Poschlad
  • Italian ROC: -
  • Northern Europe ROC: Ron Trompert
  • Russian ROC: Victor Edneral
  • South East Europe ROC: Kostas Koumantaros
  • South West Europe ROC: Kai Neuffer
  • UK/Ireland ROC: Jeremy Coles
  • GGUS: Guenter Grein
  • GOCDB: Gilles Mathieu
  • CIC Portal: Helene Cordier
  • C-COD: Vera Hansper
  • OSG: Rob Quick

WLCG

  • WLCG Service Coordination: Harry Renshall, Jamie Shiers

WLCG Tier 1 Sites

  • ASGC: ShuTing Liao
  • BNL: Absent
  • CERN site: Ignacio Reguero
  • FNAL: Catalin Dumitrescu
  • FZK: Angela Poschlad
  • IN2P3: Pierre Girard
  • INFN: -
  • NDGF: Vera Hansper, Tore Mauset
  • PIC: Kai Neuffer
  • RAL: Derek Ross, Gareth Smith
  • SARA/NIKHEF: Absent
  • TRIUMF: Absent

LHC Experiments

  • ATLAS: Absent
  • LHCb: Roberto Santinelli
  • CMS: Absent
  • ALICE: Absent

Feedback on Last Week's Minutes

None was given.

EGEE Items

Grid Operator Hand Over on Duty

        Primary Team    Secondary Team
From    ROC Italy       ROC UK/I
To      ROC Russia      ROC SEE

  • List of unresponsive sites:
    • No unresponsive sites
  • Problems encountered during shift:
    • On Friday the dashboard was unavailable until approximately 14:00 UTC

Sites Considered For Suspension

PPS Reports and Issues

  • None

gLite Release News

Harry: info about upcoming FTS fixes
Antonio (off-line): This update fixes the FTS web service's delegation code so that delegated proxy certificates are no longer corrupted (BUG:33641). The fix is applied server-side and requires only a restart of the service after upgrading the glite-data-transfer-fts package.

EGEE Items From ROC Reports

  • ROC France:
    • GRID2-FR certificates were rejected by the VOMS servers behind the alias voms.cern.ch. Steve Traylen fixed the problem this morning.
  • ROC SEE:
    • Sites in SEE request that top priority be given to BDII version 5 certification. There has been no progress on https://savannah.cern.ch/patch/?2623 for a month. This prevents more top-level BDIIs from being deployed to improve load balancing, which is needed because the current gLite production BDII has numerous issues (GGUS ticket 43578).

Laurence (off-line): Patch 2623 was rejected because a newer version of the BDII was created in response to issues raised by some early adopters. The new patch is 2783, which is currently being tested.
PATCH:2784: I am aware of one packaging issue which will result in this patch being rejected, but I am waiting for the package to be rejected to see if there are any other issues which will need fixing. As soon as I have this information I will create a new patch. I notice that PATCH:2784 is currently assigned priority normal. This should be increased to high if this patch is seen as important.

Antonio will follow up with the EMT.

Assigned to Due date Description State Closed Notify  
AntonioRetico 2009-03-16 Follow up with the EMT on the re-prioritisation of PATCH:2784, possibly raising its priority to high

UPDATE 20-Mar: I discussed this with the EMT Coordinator. There is a long list of services which currently have priority in both certification and release preparation. They welcomed the idea of a pilot service to be run at some sites, which, by providing real-usage records, would help them speed up the certification. I am preparing a request for sites to join this activity, to be presented at the next OPS meeting.

2009-03-18


  • ROC UK/I:
    • UKI-LT2-UCL-HEP - New CE now online and passing SAM tests. Decided to shorten the downtime in GOCDB. Downtime lifted on 05/03 at 16:05. It took 10 hours for this information to be propagated through the system (e.g. SAM).

John: This is not normal; Gridview should synchronise with GOCDB every half hour.

Antonio: This is possibly related to the GOCDB connection problems experienced last Friday and reported via e-mail to the GOCDB discuss list.

WLCG Items

WLCG issues coming from ROC reports

  • NE ROC:
    • Recently there has been a request from WLCG to reinstall the WNs with 64-bit SL5. In principle this is OK. We have had similar requests from non-HEP users in the past. However, a substantial fraction of our user community depends on the UIs to compile their codes, so we would very much like the UI to be on the same platform as the WNs. For this reason we would like to have a 64-bit SL5 glite-UI or a 32-bit glite-UI which is certified on SL5 64-bit. Will such a glite-UI release be made available, and if so, when?

Oliver (off-line): A gLite3.2/SL5/64 UI will be made available. Indeed, all services will eventually be made available on that platform.
The situation with the UI is that we have 99% of the build, but zero runtime experience with a number of components on it (e.g. LB 2.0). Andreas Unterkircher will be covering the subject (gLite 3.2 roll-out plan) during Wednesday's GDB (http://indico.cern.ch/conferenceDisplay.py?confId=45473).

Upcoming WLCG Service Interventions

  • Consult links on the agenda page.

WLCG Service Coordination

Harry: A request was made to reschedule the intervention on the CERN networks, foreseen for the 18th of March, to the 1st of April. Apparently this is not possible. Updates will be communicated at this meeting and at the daily WLCG ops meeting.

ATLAS Service

-

ALICE Service

-

CMS Service

-

LHCb Service

Harry: LHCb performed a "GGUS alarm ticket test". This went well, and they will run another one this afternoon that more closely reflects a real LHCb production use case.

Roberto: LHCb phoned the CASTOR service admins at CERN-PROD directly about an issue observed last Thursday. Basically there is an issue with Globus XIO where many zombie processes are left running on the gridftp server and need to be cleaned up. A ticket, GGUS:46946, was also opened with top priority. The issue was initially fixed by the administrator contacted by phone, but it is now happening again, preventing LHCb users from working. Throughout all this the GGUS ticket worklog has remained empty, so the ticket seems stuck at the ROC level.

Antonio: The ticket is not stuck, because it is being followed up by the CERN ROC as a standard team ticket (same workflows as for the CODs). The fact that the work is not being logged is disappointing, though, and will be followed up.

OSG Items

nothing to report

Newly Created Action Items

Assigned to Due date Description State Closed Notify  

Review of Open Action Items

Open Action Items

Id Submitter Description Creation Due Assigned To

Actions Closed in Last 20 Days

Id Submitter Description Creation Due Assigned To Closed

AOB

Helene: A combined release of the CIC Portal and GOCDB with a brand new downtime mechanism is in preparation. The new mechanism allows users to subscribe to notifications, thus avoiding "spamming" via broadcast messages. A further reminder will be issued on Monday and, unless serious issues are reported, the roll-out is scheduled for Wednesday the 18th of March. In order to benefit from the new features, users will need to apply configuration changes; otherwise the behaviour of the interface stays unchanged.

Maria (via e-mail): Please record in today's minutes the announcement of the dedicated meeting we shall hold on 12/3 on GGUS-OSG ticket routing for Tier1s. I shall include the agenda and notes in the next Ops meeting agenda.

Next Meeting

The next meeting will be Monday, 16 Mar 2009 15:00 UTC (16:00 Swiss local time).

  • Attendees can join from 14:45 UTC (15:45 Swiss local time) onwards.
  • The meeting will start promptly at 15:00 UTC (16:00 Swiss local time).
  • The WLCG section will start at the fixed time of 15:30 UTC (16:30 Swiss local time).
  • To dial in to the conference:
    • Dial +41227676000
    • Enter access code 0148141


Topic revision: r5 - 2009-03-18 - JohnShade
 