Week of 130812

Daily WLCG Operations Call details

To join the call, at 15.00 CE(S)T Monday to Friday inclusive (in CERN 513 R-068) do one of the following:

  1. Dial +41227676000 (Main) and enter access code 0119168, or
  2. To have the system call you, click here

The SCOD rota for the next few weeks is at ScodRota

WLCG Availability, Service Incidents, Broadcasts, Operations Web

| VO Summaries of Site Usability | SIRs | Broadcasts | Operations Web |
| ALICE ATLAS CMS LHCb | WLCG Service Incident Reports | Broadcast archive | Operations Web |

General Information

| General Information | GGUS Information | LHC Machine Information |
| CERN IT status board, WLCG Baseline Versions, WLCG Blogs | GgusInformation | Sharepoint site - LHC Page 1 |


Monday

Attendance:

  • local: Simone (SCOD), Ivan (WLCG Monitoring), Luca (CERN Databases), Luca (CERN Storage)
  • remote: Dimitri (KIT), Sang-Un (KISTI), David (CMS), Michael (BNL), Tiju (RAL), Matteo (CNAF), Wei-Jen (ASGC), Onno (NL-T1), Pavol (ATLAS), Rob (OSG), Ulf (NDGF)

Experiments round table:

  • ATLAS reports (raw view) -
    • T0
      • CERN-PROD GGUS:96524 FTS2 channel CERN->ASGC got stuck again at 04:00 on Friday. After some investigation a spurious agent was removed in the afternoon, but it took around 48 hours until the backlog was cleared.
      • CERN EOS GGUS:96519 The ATLAS EOS instance had memory issues and was restarted.
    • T1
      • FZK (GGUS:96245) : Transfer problems from/to FZK with different sites affecting a fraction of transfers. Under investigation.

  • CMS reports (raw view) -
    • MC production and rereconstruction continue
    • GGUS:96546/INC:356501 CMSEOS files not showing up in the global xrootd redirector, although they are visible directly in eoscms.cern.ch -- possibly similar to an issue seen on May 30.
      • Luca: the cmsd daemon had a problem reading a config file after a restart. Fixed.
    • GGUS:96504 User with possibly expired certificate
    • GGUS:96482 Transfers from Caltech to T1_UK_RAL -- investigation continues.
    • GGUS:96559 Hammercloud failures reading files at ASGC -- in progress
      • Wei-Jen: ASGC failed HC due to an expired host certificate. A new one has been requested.

  • ALICE -
    • NTR

Sites / Services round table:

  • NL-T1: SARA had one pool node in HW maintenance this morning. Some files were unavailable.
  • ASGC: scheduled downtime tonight for 1 day for network hardware upgrade.
  • NDGF: problem with SRM during the weekend. Half-hour downtime between Saturday and Sunday.

AOB:

Thursday

Attendance:

  • local: Simone (SCOD), Ivan (WLCG Monitoring), Luca (CERN Databases), Luca (CERN Storage), Alex (CERN Grid Services), Vitor (CERN Grid Services)
  • remote: Michael (BNL), WooJin (KIT), David (CMS), John (RAL), Jeremy (GridPP), Rob (OSG), Wei-Jen (ASGC), Ulf (NDGF)

Experiments round table:

  • CMS reports (raw view) -
    • Relatively light activity -- primarily upgrade MC production
    • No new GGUS tickets -- GGUS:96482 (Caltech/RAL transfers) waiting for more info, CMS transfer team will follow up.

  • ALICE -
    • NTR

  • LHCb reports (raw view) -
    • Mostly MC productions ongoing, tail of reprocessing and restripping campaign
    • T0:
      • CERN: NTR
    • T1:
      • Recovered from network interruptions at RAL earlier in the week
      • Local transfer failures at IN2P3 and SARA resolved (SRM overloads?)

Sites / Services round table:

  • WLCG Monitoring:
    • WLCG Transfers Dashboard: a new dashboard prototyping the future evolution of the WLCG Transfers Dashboard has been deployed. http://dashb-wlcg-transfers-new.cern.ch/
      • follows a hierarchical architecture designed to provide a common feature set independent of transfer protocol, while delegating to FTS and XRootD Dashboards for protocol-specific features
      • includes monitoring of ALICE XRootD data traffic.
      • The current production WLCG Transfers Dashboard remains available. http://dashb-wlcg-transfers.cern.ch/
    • ATLAS DDM Dashboard: monitoring of on-demand transfers for ATLAS (dq2-get / dq2-put) has been deployed to the integration version of ATLAS DDM Dashboard (http://dashb-atlas-data-soup-tbed.cern.ch/ddm2/#activity=%288%29). This feature is currently undergoing validation by ATLAS before release to production.
  • CERN Storage:
    • On Tue 20 the CASTOR Oracle backend will be upgraded. The intervention is transparent.
  • Grid services:
    • the batch farm nodes will be reinstalled in a rolling fashion over the next few days (transparent)

AOB:

Topic revision: r9 - 2013-08-15 - MaartenLitmaath
 