Week of 150615

WLCG Operations Call details

  • At CERN the meeting room is 513 R-068.

  • For remote participation we use the Vidyo system. Instructions can be found here.

General Information

  • The purpose of the meeting is:
    • to report significant operational issues (i.e. issues which can or did degrade experiment or site operations) which are ongoing or were resolved after the previous meeting;
    • to announce or schedule interventions at Tier-1 sites;
    • to inform about recent or upcoming changes in the experiment activities or systems having a visible impact on sites;
    • to provide important news about the middleware;
    • to communicate any other information considered interesting for WLCG operations.
  • The meeting should run from 15:00 until 15:20, exceptionally to 15:30.
  • The SCOD rota for the next few weeks is at ScodRota
  • General information about the WLCG Service can be accessed from the Operations Web
  • Whenever a particular topic needs to be discussed at the daily meeting and requires information from sites or experiments, it is highly recommended to announce it by email to wlcg-operations@cern.ch, so that the relevant parties have time to collect the required information or invite the right people to the meeting.

Monday

Attendance:

  • local: Ilija (ATLAS), Jerome (batch and grid services), Maarten (SCOD + ALICE), Massimo (storage), Prasanth (databases), Stefan (LHCb), Xavi (storage)
  • remote: Christian (NDGF), Christoph (CMS), Felix (ASGC), Kyle (OSG), Lisa (FNAL), Michael (BNL), Onno (NLT1), Preslav (CMS), Rolf (IN2P3), Sang-Un (KISTI), Sonia (CNAF), Tiju (RAL), Vladimir (LHCb)

Experiments round table:

  • ATLAS reports (raw view) -
    • FTS upgrade to fix the issue with StoRM: all the FTS servers have been upgraded, thanks a lot!
    • CERN-PROD: the high failure rate was understood and fixed thanks to the quick interactions with CERN-IT DSS. GGUS:114293 was reopened today; the fix was possibly incomplete.

  • CMS reports (raw view) -
    • File access problems at CCIN2P3: GGUS:114343
      • Rolf: that issue concerns our T2
    • Possible file corruption issues at CERN EOS: GGUS:114304
      • Massimo: for different experiments we have seen a strong correlation with the network incident of June 11
    • Some files seem not to migrate to CASTOR tape at CERN: GGUS:114282
    • File transfer issues from FNAL to RAL: GGUS:114275
    • Tape staging test started at CERN: GGUS:114283 (for information logging)
    • Any news regarding P5-Wigner network link?
      • Xavi: this is still being followed up further by network experts

  • ALICE -
    • high activity

  • LHCb reports (raw view) -
    • Validating data processing workflow
    • T0
      • NTR
    • T1
      • NTR

Sites / Services round table:

  • ASGC: ntr
  • BNL: ntr
  • CNAF: ntr
  • FNAL: ntr
  • GridPP:
  • IN2P3:
    • reminder: downtime tomorrow
  • JINR:
  • KISTI: ntr
  • KIT:
  • NDGF:
    • downtime tomorrow 10:00-16:00 CEST for dCache upgrades
  • NL-T1:
    • this morning the SARA squid service crashed and was restarted
    • there was a DNS issue affecting the computing cluster at SARA; it has been fixed, but the effects might not be fully gone until all previously cached information has expired
    • the NIKHEF farm seemed rather quiet, maybe due to the squid problem?
      • after the meeting: ALICE had 2k jobs running throughout the day...
  • NRC-KI:
  • OSG: ntr
  • PIC: ntr
  • RAL:
    • reminder: network maintenance Wed afternoon
  • TRIUMF:

  • CERN batch and grid services: ntr
  • CERN storage services:
    • tomorrow new HW will be added, which should be transparent, but might affect any experiment
  • Databases:
    • tomorrow CMS DB firewall rules will be moved from DB triggers to iptables
  • GGUS:
  • Grid Monitoring:
  • MW Officer:

AOB:

Thursday

Attendance:

  • local:
  • remote:

Experiments round table:

  • ALICE -

Sites / Services round table:

  • ASGC:
  • BNL:
  • CNAF:
  • FNAL:
  • GridPP:
  • IN2P3:
  • JINR:
  • KISTI:
  • KIT:
  • NDGF:
  • NL-T1:
  • NRC-KI:
  • OSG:
  • PIC:
  • RAL:
  • TRIUMF:

  • CERN batch and grid services:
  • CERN storage services:
  • Databases:
  • GGUS:
    • Release scheduled for the 24th of June. Downtime announced on GOCDB. The service might not be available during the intervention. More info about the release
  • Grid Monitoring:
  • MW Officer:

AOB:

Topic attachments
  • MB-Jun-15.pptx (r1, 2871.0 K, 2015-06-15 09:41, PabloSaiz)
Topic revision: r9 - 2015-06-17 - PabloSaiz
 