Week of 140804

WLCG Operations Call details

  • At CERN the meeting room is 513 R-068.

  • For remote participation we use the Alcatel system. At 15.00 CE(S)T on Monday and Thursday (by default) do one of the following:
    1. Dial +41227676000 (just 76000 from a CERN office) and enter access code 0119168, or
    2. To have the system call you, click here

  • In case of problems with Alcatel, we will use Vidyo as backup. Instructions can be found here. The SCOD will email the WLCG operations list in case the Vidyo backup should be used.

General Information

  • The SCOD rota for the next few weeks is at ScodRota
  • General information about the WLCG Service can be accessed from the Operations Web

Monday

Attendance:

  • local: Maria Alandes (chair, minutes), Zbigniew Baranowski (Databases), Belinda Chan Kwok Cheong (Storage), Ben Jones (Grid&Batch), Andrew McNab (LHCb)
  • remote: Sang-Un Ahn (KISTI), Stefano Belforte (CMS), Thomas Belleman (NDGF), Jeremy Coles (GridPP), Michael Ernst (BNL), Kyle Gross (OSG), Tiju Idiculla (RAL), Dmitry Nilsen (KIT), Emmanouil Vamvakopoulos (IN2P3), Alexander Verkooijen (NL-T1)

Experiments round table:

  • ALICE - (Not present)
    • NTR

  • LHCb reports (raw view) -
    • MC and User jobs mostly
    • With more than one job manager now allowed, 55000 jobs were running concurrently over the weekend.
    • T1:
      • PIC CRLs problem fixed.
      • More RRC-KI DNS problems, but again fixed.
      • Unable to use RAL currently because of ARC CE client library fixes needed on our side.

Sites / Services round table:

  • ASGC: Not present
  • BNL: NTR
  • CNAF: Not present
  • FNAL: Not present
  • GridPP: NTR
  • IN2P3: NTR
  • JINR: Not present
  • KISTI: There was a 1h outage due to a network intervention that resulted in a fiber cut with Chicago. This is now fixed.
  • KIT: NTR
  • NDGF: A series of issues over the weekend mainly affecting ATLAS data (maybe also some ALICE data):
    • The two broken RAID controllers reported last Thursday are still waiting for replacement, and the associated disk pools are only accessible in read-only mode for the time being.
    • Two tape pools had problems during the weekend. One of them was related to a SELinux problem that has been understood and fixed. The other one is not yet understood.
    • dCache nodes were overloaded due to a problem that is not yet fully understood. Experts have found a workaround and the overload is now fixed.
  • NL-T1: NTR
  • OSG: NTR
  • PIC: Not present
  • RAL: NTR
  • RRC-KI: Not present
  • TRIUMF: Not present

  • CERN batch and grid services:
    • FTS2 service now fully decommissioned as announced last week. See IT SSB entry for more details.
    • SLC5 ce20[1-7] removed from the BDII due to the reduction of SLC5 capacity at CERN. Note that it is still possible to submit jobs to SLC5 from SLC6 CEs.
    • CVMFS stratum 0 services upgrade for AMS and SFT. See IT SSB entry for more details.
  • CERN storage services: NTR
  • Databases: The CMS Offline DB now restricts connections from outside CERN, as requested by CMS. Only nodes on a white list can connect from outside CERN. Stefano adds that this has been announced on the PhEDEx operational list and that anyone who has problems should get in touch with them or with Oracle support at CERN. In principle all PhEDEx agents have been included.
  • GGUS: Not present
  • Grid Monitoring: Not present
  • MW Officer: Not present

AOB:

Thursday

Attendance:

  • local:
  • remote:

Experiments round table:

  • ALICE -
    • NTR

Sites / Services round table:

  • ASGC:
  • BNL:
  • CNAF:
  • FNAL:
  • GridPP:
  • IN2P3:
  • JINR:
  • KISTI:
  • KIT:
  • NDGF:
  • NL-T1:
  • OSG:
  • PIC:
  • RAL:
  • RRC-KI:
  • TRIUMF:

  • CERN batch and grid services:
  • CERN storage services:
  • Databases:
  • GGUS:
  • Grid Monitoring:
  • MW Officer:
    • New EMI Update released today affecting APEL, caNl library, CREAM GE, dCache and DPM ARGUS. All details can be found in the release notes.
      • It is worth mentioning the dCache fix whereby the NFS protocol is no longer published in the BDII. This had caused problems for LHCb in the past, as reported in previous meetings, and was tracked as a known MW issue.
    • A new minor release of several Information System packages is now available for Readiness Verification. More details are in the release notes. The release is expected to be included in the next EMI Update, planned for September.

AOB:

Topic revision: r9 - 2014-08-07 - MaartenLitmaath
 
