Week of 200622

WLCG Operations Call details

  • For remote participation we use the Vidyo system. Instructions can be found here.

General Information

  • The purpose of the meeting is:
    • to report significant operational issues (i.e. issues which can or did degrade experiment or site operations) which are ongoing or were resolved after the previous meeting;
    • to announce or schedule interventions at Tier-1 sites;
    • to inform about recent or upcoming changes in the experiment activities or systems having a visible impact on sites;
    • to provide important news about the middleware;
    • to communicate any other information considered interesting for WLCG operations.
  • The meeting should run from 15:00 Geneva time until 15:20, exceptionally to 15:30.
  • The SCOD rota for the next few weeks is at ScodRota
  • Whenever a particular topic requiring information from sites or experiments needs to be discussed at the operations meeting, it is highly recommended to announce it by email to wlcg-scod@cern.ch, so that the SCOD can make sure the relevant parties have time to collect the required information, or can invite the right people to the meeting.

Best practices for scheduled downtimes



  • local:
  • remote:

Experiments round table:

  • CMS reports -
    • The (virtual) CMS Computing Workshop is taking place, so likely nobody from CMS can call in
    • A bad CMS workflow caused a storage overload at CC-IN2P3
      • Clarified within CMS and the bad workflow was aborted - no GGUS ticket
    • Otherwise no major items to report

  • ALICE -
    • NTR

  • LHCb reports -
    • completing stripping of heavy-ion collision data
    • MC production as usual
    • planning for the DB outage on Saturday
    • GGUS down -- some impact on operations

Sites / Services round table:

  • ASGC:
  • BNL: FTS was upgraded to version 3.9.4 last Monday.
  • EGI:
  • FNAL:
  • IN2P3: during the maintenance last Tuesday:
    • CREAM-CEs have been decommissioned.
    • upgrade of HTCondorCE to 4.1.0: incident on June 19th with HTCondor job router -> impact on ALICE during the afternoon.
    • dCache: upgrade to 5.2.22 and rollback to version 5.2.21 (last pools for ATLAS/CMS analysis done today midday)
    • new endpoints for LHCb on dCache to distinguish disk and tape. Registered in GOC DB.
  • KISTI:
  • KIT:
  • NDGF:
  • NL-T1:
    • The slow directory listings at the Sara dCache (reported last week) are understood. A user has 3 million files in one directory, which is a bit of a challenge. We fixed this by increasing pnfsmanager.limits.list-threads from 2 (default) to 10. This slightly increases the database load, but not beyond what it can handle.
    • The Sara tape backend was down from Saturday until this afternoon. It is operational now.
  • NRC-KI:
  • OSG:
  • PIC: PIC Tier 1 will be in scheduled downtime on Tuesday June 30th, from 08:00 to 14:00 (CERN and local time), in order to perform upgrades on the compute (HTCondor) and storage (dCache and Enstore) services. As usual, access to the CPU farm will be closed right before the start of the SD.
  • RAL: NTR

  • CERN computing services:
  • CERN storage services:
    • On 23rd of June, final closure of the ATLAS area in CASTOR: access to the ATLAS tree of CASTOR will be definitively blocked before the final migration of data to CTA (OTG0057293)
    • On 25th of June, the EOSCTA ATLAS instance will be opened for reads and writes (OTG0057317)
    • On 27th of June, all CASTOR and CTA instances will be down due to a database intervention (OTG0057292)
    • On 27th of June, many FTS instances will be unavailable due to a DB storage intervention (OTG0057307)

  • CERN databases:
  • GGUS:
    • Access via CERN Grid CA certificates was refused from Sun afternoon till Mon morning.
      • Due to an expired CRL.
      • The problem was announced on the wlcg-operations list Sun afternoon.
      • The GOCDB entry for GGUS provides the contact e-mail to be used for such cases.
  • Monitoring: We will also share the draft reports with site managers during the week, after the green light from the experiment representatives
  • MW Officer:
  • Networks:
  • Security: NTR


Topic revision: r21 - 2020-06-22 - DavidBouvet