WLCG-OSG-EGEE Ops Minutes Mon 21 Jan 2008

Attendance

EGEE

  • Asia Pacific ROC: Min
  • Central Europe ROC: Marcin
  • OCC / CERN ROC: John Shade, Antonio Retico, Steve Traylen, Farida Naz
  • French ROC: Absent
  • German/Swiss ROC: Clemens Koerdt
  • Italian ROC: Paolo
  • Northern Europe ROC: Ron
  • Russian ROC: Lev Shamardin
  • South East Europe ROC: Kostas
  • South West Europe ROC: Goncalo
  • UK/Ireland ROC: Absent
  • GGUS: Guenter
  • OSCT: Absent

Most of the ROC managers were in Sofia.

WLCG

  • WLCG Service Coordination: Harry, Jamie

WLCG Tier 1 Sites

  • ASGC: Min Tsai
  • BNL: Absent
  • CERN PROD: Gavin McCance
  • FNAL: Absent
  • FZK: Clemens Koerdt
  • IN2P3: Absent
  • INFN:
  • NDGF: Vera Hansper
  • PIC: Goncalo
  • RAL: Derek
  • SARA/NIKHEF: Ron
  • TRIUMF: Rod Walker

Reports Not Received

  • WLCG Tier 1s:
  • VOs: CMS, LHCb and ALICE
  • EGEE ROCs (Prod Sites): Italy, North Europe, Russia, South Western, South Eastern
  • EGEE ROCs (PPS Sites): AP, CE, IT, NE, RU, SEE, SWE

Feedback on Last Week's Minutes

None were given.

EGEE Items (29 ')

Grid Operator on Duty Handover

      Primary Team     Secondary Team
From  Central Europe   France
To    Taiwan           CERN

Marcin commented on the handover from CE to Asia: the CE first-line support now intercepts and handles alarms raised for the CE region. The alarms will become visible to the CODs only 24 hours after they are raised, possibly enriched with annotations. Only after that will the CODs open tickets, if needed. Further details are in the hand-over log.

PPS Reports & Issues

PPS reports were not received from these ROCs: AP, CE, IT, NE, RU, SEE, SWE

Issues from EGEE ROCs:

Nothing to report.

Release News:

1. gLite3.1.0-PPS-UPDATE13 has been released to PPS, has passed the pre-deployment tests and is currently being deployed at the PPS sites. Details, release notes and test reports are available as usual at www.cern.ch/pps/index.php?dir=./release/

EGEE Items From ROC Reports

1. Central Europe

  • From previous week: When can we expect MON BOX on SL(C)4? It is in certification (PATCH:1537).

OCC(Steve): The final packages are not yet available for certification, so it is hard to produce a timeline. Work is being done on that, but with lower priority.

  • SL3->SL4 DPM migration. Is there a documented way to migrate DPM from SL3 to SL4? Specifically, an UPGRADE option would be good, i.e. to upgrade the OS on an existing DPM SE machine and then the DPM packages and database safely.

OCC(Steve): There is a wiki page written by Lana Abadie on the subject: DpmMigratingFromSL3TOSL4

  • CE(Marcin): What we actually need is a procedure describing the upgrade path from SL3 to SL4. Small sites may not have all the nodes needed to perform the upgrade as described in the wiki page.
Steve: The upgrade is not specifically covered by gLite, but the use case described is valid. We will ask Lana Abadie to comment and we will post a question to the e-mail list to see if someone has already worked on that.

2. Germany-Switzerland.

  • According to GSSDCCRCBaseVersions sites are expected to deploy certain service releases by early February for CCRC08. Some of them are still in certification. What are the expected timelines? Will some of these services (e.g. LFC) be fast tracked to production? Any reason put forward by the experiments why they need those particular updates?

OCC(Steve): Almost all the software required for CCRC08 was released today, including GFAL and lcg_utils for WNs. The outstanding exception is the LFC. Priority in certification is given to the 64-bit version. The T0 is going to use the release currently in certification.

SC(Jamie): T1s will have to upgrade their LFC for ATLAS as well.

  • FZK(Clemens): The schedule is rather tight.

Steve: The other update is an FTS update 3.0 (to be released this week in PPS). In addition, a GFAL patch was submitted today and will go through the normal certification/PPS path.

  • PIC(Goncalo): Is it important for the experiments that LFC is brought to the SLC4 version?

Steve: It is certainly the first one to be released and so the first available to the sites.

Jamie: The important functionality for the VO is the bulk deletion. The deployment will likely be planned case by case, defining the priority as needed. For instance, we know that ATLAS will perform a bulk delete at the end of every week, which gives us a bit more time to enable the functionality after the release. Realistically, however, we will have to support different versions, as it is not feasible for the T1s to update all at the same time.

  • Goncalo: Is the MySQL version of LFC going to be released at the same time?

All: Unsure

gLite Release News:

1. gLite 3.1 update 10 released today:

  • See gLite 3.1 updates for exact details.
  • 3 new gLite 3.1 services - glite-AMGA , glite-PX and glite-VOBOX.
  • New glite-WN with gfal and lcg-utils in particular.
  • Updated glite-FTM
  • Yaim
  • SL4 VO Box: all implementations of this VO Box for ALICE must be coordinated with ALICE.

Steve stressed the need to coordinate with ALICE (Patricia) for the deployment of the new boxes at the sites.

WLCG Items (30 ')

Tier1 Reports

WLCG issues coming from ROC reports

WLCG Service Interventions (with dates / times where known)

Link to CIC Portal (broadcasts/news), scheduled downtimes (GOCDB) and CERN IT Status Board

Time at WLCG T0 and T1 sites.

FZK(Clemens): Downtime (or at least service at risk) because of the installation of the latest dCache patches.

CERN(Gavin): Major Castor update at CERN-PROD this Wednesday, with service downtime. Impact on the SAM tests is foreseen (for the CODs to take into account): the tests will have a "hiccup", failing the replica tests for one cycle. SAM should then fall back on other working SRMs.

FTS Service Review (05 ')

None again this week; relevant staff still on leave.

Gavin McCance (CERN) , Steve Traylen

ATLAS Service

See also TierZero20071 and ComputingOperations for more information.

1. Nothing special to report except the status of items raised the previous week, especially BNL failing the SAM SE/SRM critical tests.

Jamie: There was a request from Michael Ernst for the availability to be reviewed. Since

Atlas(Alessandro): An issue was observed with GGUS: a ticket submitted via e-mail cannot be accessed for update by the user from the web interface.

CERN ROC(Antonio): I observed the issue as well, but I could not reproduce it when trying with a different unregistered user.

GGUS(Guenter): This has been observed from time to time and is caused by an incorrect mapping between DN and e-mail account.

LHCb(Roberto) for Guenter: I received an e-mail to verify but I can't see the button.

Guenter: Likely to be the same issue. To be solved off-line.

UPDATE (after meeting): A possible explanation is that the e-mail was actually sent from an e-mail account different from the one registered for the DN in use.

ALICE Service

Patricia Mendez Lorenzo (CERN IT/GD)

CMS Service

Daniele Bonacorsi (CNAF-INFN BOLOGNA, ITALY)

LHCb Service

  • Roberto asked for an update on the CNAF problem (reference to last week's minutes).
Paolo Veronesi (IT) said "what problem?" Action: Paolo to read the previous meeting's minutes and follow up. Paolo said that the person responsible will give an update by e-mail.

WLCG Service Coordination

Harry Renshall / Jamie Shiers

OSG Items

  • Discussion of open tickets for OSG

https://gus.fzk.de/pages/download_escalation_reports_roc.php

Nobody connected for OSG

User Support(Maria) to GGUS(Guenter): GGUS:29474 (very urgent) got stuck because someone from OSG asked the submitter (Nicola De Filippis) a question and, instead of setting the ticket to 'waiting for reply', set it to 'in progress'. Is that due to incorrect handling, or is it a problem in the GGUS/OSG-GOC interface?

The GGUS developer Guenter Grein explained that the OSG ticketing system doesn't have states corresponding to all the GGUS ones. Maria said that the user cannot guess that he has to take action, so how to handle such cases will be discussed at the dedicated GGUS development meeting on 22/1/2008.

Review of Action Items (05 ')

AOB

NDGF(Vera): How do you switch off monitoring for a site prior to putting it into production?

A: The site needs to be "put into downtime". The change in procedure wasn't well advertised; CSC people are upset. GGUS:31311 has already been opened.

Steve: An item based on this ticket will be added to next week's agenda.

Next Meeting

The next meeting will be Monday, 28 Jan 2008 15:00 UTC (16:00 Swiss local time).

  • Attendees can join from 14:45 UTC (15:45 Swiss local time) onwards.
  • The meeting will start promptly at 15:00 UTC (16:00 Swiss local time).
  • The WLCG section will start at the fixed time of 15:30 UTC (16:30 Swiss local time).
  • To dial in to the conference:
    • Dial +41227676000
    • Enter access code 0157610


Topic revision: r7 - 2008-03-03 - SteveTraylen