WLCG Operations Coordination Minutes, July 2, 2020





  • local:
  • remote:
  • apologies:

Operations News

  • The next meeting is planned for Sep 3

Special topics

WLCG Critical Services proposal followup

CERN Grid CA OCSP incident

  • After a scheduled intervention on June 24, the CERN Grid CA OCSP service
    became inaccessible from outside CERN (OTG:0057432)
  • Requests to the service were dropped by the CERN perimeter firewall
  • A CREAM CE will try to check a client certificate's status via OCSP,
    if the existence of such an endpoint is indicated in the certificate details
    • It appears other CE flavors rely on CRLs only and just ignore OCSP services
  • Checks of CERN Grid CA certificates were then hanging until a timeout was reached
  • The CREAM client code would time out first, thus failing job submissions that
    used CERN Grid CA certificates
  • This affected the 4 experiments and, through the SAM tests, sites running CREAM
    • Some A/R recomputations may be needed
  • The service was restored about 24h later on June 25
  • Some improvements are foreseen to make a recurrence much less likely

  • Further details here
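
The failure mode described above can be sketched as a small decision model. This is only an illustration of the reasoning in the bullets, not CREAM code: the `Submission` type, the `ocsp_check_outcome` function and the timeout values are hypothetical, and the minutes do not say what a CE does once its own OCSP timeout expires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """Hypothetical model of one job submission to a CE."""
    has_ocsp_endpoint: bool    # certificate details advertise an OCSP endpoint
    ocsp_reachable: bool       # OCSP service reachable from the CE
    ce_ocsp_timeout_s: float   # assumed: how long the CE waits on OCSP
    client_timeout_s: float    # assumed: how long the client waits on the CE


def ocsp_check_outcome(s: Submission) -> str:
    """Classify a submission according to the behaviour described in the minutes."""
    if not s.has_ocsp_endpoint:
        # No endpoint in the certificate, so no OCSP lookup is attempted.
        return "no OCSP check"
    if s.ocsp_reachable:
        return "OCSP check completes"
    # Requests are silently dropped, so the check hangs until a timeout.
    if s.client_timeout_s < s.ce_ocsp_timeout_s:
        return "client times out first"
    return "CE abandons OCSP check first"


if __name__ == "__main__":
    # Incident-like case: endpoint advertised, service unreachable,
    # client timeout shorter than the CE-side OCSP timeout (values made up).
    incident = Submission(True, False, ce_ocsp_timeout_s=120.0, client_timeout_s=60.0)
    print(ocsp_check_outcome(incident))
```

In this model, only certificates that advertise an OCSP endpoint are exposed to the outage, which matches the observation that CE flavors relying on CRLs alone were not affected.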

SAM migration progress

Middleware News

Tier 0 News

Tier 1 Feedback

Tier 2 Feedback

Experiments Reports

ALICE


  • Mostly business as usual, no major issues

ATLAS


  • Stable Grid production with up to ~380k concurrently running grid job slots with the usual mix of MC generation, simulation, reconstruction, derivation production and user analysis, including ~45k slots from the HLT/Sim@CERN-P1 farm and ~15k slots from Boinc. Occasional additional peaks of 200k job slots from HPCs.
  • Continuing with about 60k job slots used for Folding@Home jobs since 4 April. 50% from ~55 different grid sites via opt-in and 50% at CERN-P1
  • No other major issues apart from the usual storage or transfer related problems at sites
  • Finishing grand unification of production+analysis queues in PanDA in the coming days.
  • All systems recovered quickly from the Oracle/DBonDemand downtime last Saturday - we would appreciate it if such downtimes could be kept off weekends in future
  • CTA in production for ATLAS since Monday - still fixing some issues in Rucio/middleware

CMS


  • COVID-19 compute contributions being returned to experiment use
  • main processing activities:
    • Run 2 ultra-legacy Monte Carlo
    • Run 2 pre-UL Monte Carlo
  • migration to Rucio ongoing
    • production of nanoAOD samples configured for PhEDEx being bumped up to complete more quickly

LHCb


  • still running F@H on part of HLT farm
  • large MC requests coming up so we are going to reduce this Covid-19-related activity
  • processing (small) samples of lead-lead collisions and lead-neon fixed target collisions
  • grid drained in preparation for last Saturday's CERN Oracle/DBOD outage; DIRAC services and agents were switched off, then on again after the outage, and everything went extremely smoothly

Task Forces and Working Groups

GDPR and WLCG services

Accounting TF

Archival Storage WG

Containers WG

CREAM migration TF

Details here


  • 90 tickets
  • 14 done: 7 ARC, 7 HTCondor
  • 16 sites plan for ARC, 15 are considering it
  • 20 sites plan for HTCondor, 14 are considering it, 8 consider using SIMPLE
  • 14 tickets on hold, to be continued in the coming weeks / months
  • 7 tickets without reply
    • response times possibly affected by COVID-19 measures

dCache upgrade TF

DPM upgrade TF

StoRM upgrade TF

Information System Evolution TF

IPv6 Validation and Deployment TF

Detailed status here.

Machine/Job Features TF


MW Readiness WG

Network Throughput WG

Traceability WG

Action list

| Creation date | Description | Responsible | Status | Comments |

Specific actions for experiments

| Creation date | Description | Affected VO | Affected TF/WG | Deadline | Completion | Comments |

Specific actions for sites

| Creation date | Description | Affected VO | Affected TF/WG | Deadline | Completion | Comments |


Topic revision: r9 - 2020-07-02 - ConcezioBozzi