WLCG-OSG-EGEE Ops' Minutes Mon 05 Oct 2009

Summary

  • There are still sites with unpatched worker nodes (WNs). The EGEE PMB will require sites to update their kernel. OSCT is following the issue. Each ROC security contact has provided a list of unpatched sites. The deadline for upgrading is Friday, October 9th. Sites that are not patched by Friday will be suspended.
  • Two sets of SAM tests are now entering the validation period: MPI and CREAM CE tests.

Attendance

EGEE

  • Asia Pacific ROC: Jason Shih
  • Central Europe ROC: Malgorzata Krakowian
  • OCC / CERN ROC: Antonio Retico, Nick Thackray, Steve Traylen, Diana Bosio
  • French ROC: Pierre Girard, Helene Cordier
  • German/Swiss ROC: Wen Mei, Sven Hermann
  • Italian ROC: Alessandro Paolini, Paolo Veronesi, Giuseppe Misurelli
  • Latin American ROC: Renato Santana
  • Northern Europe ROC: Ron Trompert
  • Russian ROC: Lev Shamardin
  • South East Europe ROC: Marios Chatzangelou
  • South West Europe ROC: Christian Neissner
  • UK/Ireland ROC: Jeremy Coles
  • GGUS: Torsten Antoni
  • GOCDB: absent

WLCG Tier 1 Sites

  • FZK: Sven Hermann

Feedback on Last Week's Minutes

None was given.

EGEE Items

Grid Operator Hand Over on Duty

  c-COD Team
From ROC IT
To ROC France

Report from cCOD:

  • As last week, there are two tickets regarding APEL problems on two AP sites that have been open for more than a month. Only one of them shows some progress; however, I will raise both at the WLCG meeting.

  1. ROC_SW has a ticket that expired today (UPV-GRyCAP), as does ROC_DECH (SWITCH).
  2. ROC_SE has a ticket opened a month ago (TR-04-ERCIYES) that is not yet solved and expires today (2nd level of escalation).

  • WLCG items:

  1. Site MY-MIMOS-GC-01 (AP) still has a ticket, GGUS:51229, opened a month ago. There has been no feedback from the site admins, only from the AP ROC managers three days ago, after a reminder sent on Sep 30th: "sorry for the late, we have some problem dealing with bouncycastle bcprov.jar, we're working on it now as being stuck due to the series of event recently. before the ticket open, we don't read the same error with Taiwan-LCG2 accounting publishing, as well as rgma client check. keep you posted then."

  • Answer from AP ROC: We are dealing with the issue. We are waiting for the site admin to negotiate to open ports for the MON box.

Sites Considered For Suspension

PPS Reports and Issues

  • Update on the pilot service of the CREAM CE:
    • CMS is testing WMS-based submission to the CREAM CEs at CERN, CNAF, FZK and RAL.
    • The tests so far have not shown any performance limitation in ICE. The upgrade to the latest version, released with gLite 3.1 Update 55, is nevertheless recommended.
    • A known issue in CREAM, reported as BUG:52651 "CREAM file descriptor overuse", was hit at CNAF.

gLite Release News

  • Soon in production: CREAM 1.5
    • This is the version that was withdrawn because of the security issue. Patch 3259 is expected to go to production within days.

Pierre (ROC_FR): What is the file descriptor problem with CREAM? Is it the usual problem with inodes?
Antonio: Probably. It has not been seen by ALICE, but it can be seen when using ICE. It showed up as soon as CMS started using CREAM.

EGEE Items From ROC Reports

  • Pierre (ROC_FR): There was a problem with the report: an issue I had entered did not appear. The issue concerns the SAM SFT job. The jobs are submitted to the short queue, but they are killed because they consume too much CPU. OPS was removed from our short queue. The SAM tests should specify requirements.
Steve: Submit a GGUS ticket; the requirement probably needs to be changed.

  • ROC_FR: dCache was moved to Chimera. The move was successful.

  • Helene: There were problems with the ROC reports this morning. The summary of the ROC reports is now available again.

Grid Service Interventions

  • Consult links on the agenda page.

Miscellaneous

  • REMINDER: Security vulnerabilities. Last month several alerts were sent to all sites about critical security vulnerabilities in the Linux kernel (CVE-2009-2692 and CVE-2009-2698).
    • A number of sites still have not upgraded.
    • The EGEE PMB agreed to suspend the vulnerable sites (but they will be warned in advance).
    • All sites having questions or issues should contact their ROC security contact.

  • ROC_DECH: we will have our last site patched by October 12th.

  • REMINDER: Sites must move to WMS 3.2 (available in the gLite repository) by the end of October!

Newly Created Action Items

Assigned to | Due date | Description | State | Closed | Notify

Review of Open Action Items

Open Action Items

Id | Submitter | Description | Creation | Due | Assigned To

Actions Closed in Last 20 Days

Id | Submitter | Description | Creation | Due | Assigned To | Closed

AOB

  • Two sets of SAM tests are now entering the validation period. ROCs are encouraged to look at them and report back if there are problems. The tests are:
    • MPI tests.
    • CREAM CE tests. Results are available from the validation pages:

https://sam-val.cern.ch:8443/sam/sam.py?CREAMCE_ops_disp_tests=CREAMCE-sft-brokerinfo&CREAMCE_ops_disp_tests=CREAMCE-sft-caver&CREAMCE_ops_disp_tests=CREAMCE-sft-csh&CREAMCE_ops_disp_tests=CREAMCE-sft-djs&CREAMCE_ops_disp_tests=CREAMCE-sft-job&CREAMCE_ops_disp_tests=CREAMCE-sft-lcg-rm&CREAMCE_ops_disp_tests=CREAMCE-sft-softver&setdefs=on&order=NodeName&funct=ShowSensorTests&disp_status=na&disp_status=ok&disp_status=info&disp_status=note&disp_status=warn&disp_status=error&disp_status=crit&disp_status=maint

The test results look identical to the LCG-CE ones.
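For ROCs reviewing the validation results, it may help to unpack the query string of the validation link to see which CREAM CE tests and status levels it selects. The following is only a minimal sketch using the Python standard library: the URL is copied from these minutes, while the variable names are illustrative and not part of SAM.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the minutes, split across lines for readability.
SAM_URL = (
    "https://sam-val.cern.ch:8443/sam/sam.py"
    "?CREAMCE_ops_disp_tests=CREAMCE-sft-brokerinfo"
    "&CREAMCE_ops_disp_tests=CREAMCE-sft-caver"
    "&CREAMCE_ops_disp_tests=CREAMCE-sft-csh"
    "&CREAMCE_ops_disp_tests=CREAMCE-sft-djs"
    "&CREAMCE_ops_disp_tests=CREAMCE-sft-job"
    "&CREAMCE_ops_disp_tests=CREAMCE-sft-lcg-rm"
    "&CREAMCE_ops_disp_tests=CREAMCE-sft-softver"
    "&setdefs=on&order=NodeName&funct=ShowSensorTests"
    "&disp_status=na&disp_status=ok&disp_status=info&disp_status=note"
    "&disp_status=warn&disp_status=error&disp_status=crit&disp_status=maint"
)

# parse_qs collects repeated keys into lists, preserving their order.
params = parse_qs(urlsplit(SAM_URL).query)
selected_tests = params["CREAMCE_ops_disp_tests"]
selected_statuses = params["disp_status"]

print("Tests shown:")
for name in selected_tests:
    print("  " + name)
print("Statuses shown: " + ", ".join(selected_statuses))
```

Removing a `disp_status` value from the URL (for example `ok`) should, assuming the page honours the parameter as its name suggests, narrow the display to the remaining status levels.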

Pierre: Will they become critical?
Steve: Yes, in time they will become critical, after the ROC managers' approval.

Next Meeting

The next meeting will be Monday, 12 Oct 2009 14:00 UTC (16:00 Swiss local time).

  • Attendees can join from 13:45 UTC (15:45 Swiss local time) onwards.
  • The meeting will start promptly at 14:00 UTC (16:00 Swiss local time).
  • To dial in to the conference:
    • Dial +41227676000
    • Enter access code 0148141


Topic revision: r5 - 2009-10-19 - SteveTraylen
 