WLCG-OSG-EGEE Ops' Minutes Mon 12 Oct 2009

Summary

  • Only half of the sites running a CREAM instance are passing the SAM tests for CREAM in validation. The present version of CREAM is working, so sites should check what the issues are.
  • A new version of the simplified EGEE intervention procedures was ratified by the ROC managers (https://edms.cern.ch/document/1032984). Sites whose reliability or availability is below 50% for three consecutive months will be suspended and will have to go through the certification process again.
  • Downtimes longer than one month should be exceptional; they must be approved beforehand by the corresponding ROC, and this body must be notified.

Attendance

EGEE

  • Asia Pacific ROC:
  • Central Europe ROC:
  • OCC / CERN ROC: John Shade, Antonio Retico, Maite Barroso
  • French ROC: Helene Cordier
  • German/Swiss ROC: Sven, Wen Mei
  • Italian ROC: Paolo Veronessi
  • Northern Europe ROC: Thomas Bellman, Gert Svensson
  • Russian ROC: Lev Shamardin
  • South East Europe ROC: Marios Chatziangelou
  • South West Europe ROC:
  • UK/Ireland ROC: Jeremy Coles
  • GGUS:
  • GOCDB: Gilles Mathieu

Feedback on Last Week's Minutes

None was given.

EGEE Items

Grid Operator Hand Over on Duty

  c-COD Team
From ROC France
To ROC CE

  • Report from cCOD:

1 ticket (GGUS #51458) for ROC_SE has been open for more than 1 month. I have sent ROC_SE a reminder asking whether the problem can be solved.
ROC_AP has 2 APEL tickets that have been open for more than 1 month. Work is in progress for MY_MIMOS-GC-01. For TW-FTT, ROC_AP sent a reminder and will escalate to the last step if there is no answer.
Gilles will check whether there are any tickets related to these sites.
Question about the APEL test: we cannot take a site out of production for an APEL problem that has not been solved for more than 1 month. How should we handle such tickets?
Good practice: the ROC should first try to isolate and debug the problem. If this still does not help, involve the APEL team in the ticket.

PPS Reports and Issues

Only half of the sites running a CREAM instance are passing the CREAM tests in validation. The present version of CREAM is working, so sites should check what the issues are.
MPI tests? They are part of the CE/CREAM CE tests and are included there, in validation.

gLite Release News

  • Since Monday last week: security update 56, including CREAM CE version 1.5 (PATCH:3259). It is being applied at some sites, as observed in rollout.
  • gLite 3.2 07 is in preparation: VOBOX and a new version of the WMS UI configuration. It has been scheduled for 1 month.
  • Additionally, a security fix for a vulnerability is being prepared; it will be moved to production quickly, within this week. The patch will not require reconfiguration, only a restart, and it affects ~8 node types.

EGEE Items From ROC Reports

  • FZK [INFO]: On the 14th of October DE-KIT will run a test of the LHCOPN backup link infrastructure. We expect this intervention to be completely transparent. The link test will start at 9:00 am (CEST).
  • FZK [AT RISK]: Planned intervention on 20-10-2009, 8:00 - 10:00 UTC. Due to the application of an Oracle patch, GridKa/DE-KIT's LHCb 3D/LFC database is at risk.

Grid Service Interventions

  • Consult links on the agenda page.

Miscellaneous

  • Recent vulnerability: 7 sites still not patched, OSCT following up
  • EGEE ROC managers ratified the simplified intervention procedures. We are asking sites to declare downtimes correctly and on time. Gilles: this will be enforced in GOCDB as of next Wednesday, 21st of October, 2 pm UTC, and also at the CIC portal. https://edms.cern.ch/document/1032984
  • Sites with reliability of less than 50% over 3 consecutive months will be suspended. Who should suspend them? Their ROC.
  • Downtimes longer than 1 month should be exceptional and closely followed up by the ROC
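
The suspension rule above can be illustrated with a minimal sketch. It is not the procedure's official implementation: the function name, the input layout, and the reading that a month counts against a site when either its availability or its reliability is below 50% are all assumptions made for illustration. Only the 50% threshold and the three consecutive months come from these minutes.

```python
from typing import Sequence, Tuple

THRESHOLD = 0.50          # 50%, as stated in the minutes
CONSECUTIVE_MONTHS = 3    # three consecutive months, as stated in the minutes


def should_suspend(monthly: Sequence[Tuple[float, float]]) -> bool:
    """Return True if a site meets the suspension criterion.

    `monthly` is a chronological sequence of (availability, reliability)
    pairs, each given as a fraction between 0 and 1 (hypothetical layout).
    Assumption: a month is "bad" if either metric is below the threshold.
    """
    run = 0  # length of the current streak of bad months
    for availability, reliability in monthly:
        if availability < THRESHOLD or reliability < THRESHOLD:
            run += 1
            if run >= CONSECUTIVE_MONTHS:
                return True
        else:
            run = 0  # a good month breaks the streak
    return False
```

A ROC could run such a check over its monthly figures before deciding on a suspension; a single month at or above 50% on both metrics resets the count.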

Newly Created Action Items

Assigned to Due date Description State Closed Notify  

Review of Open Action Items

Open Action Items

Id Submitter Description Creation Due Assigned To

Actions Closed in Last 20 Days

Id Submitter Description Creation Due Assigned To Closed

Next Meeting

The next meeting will be Monday, 19 October 2009 14:00 UTC (16:00 Swiss local time).

  • Attendees can join from 13:45 UTC (15:45 Swiss local time) onwards.
  • The meeting will start promptly at 14:00 UTC (16:00 Swiss local time).
  • To dial in to the conference:
    • Dial +41227676000
    • Enter access code 0148141


Topic revision: r3 - 2009-10-19 - SteveTraylen