local: Andrea M (CERN data mgmt), Boris (WLCG), Concezio (LHCb), Julia (WLCG), Maarten (ALICE + WLCG), Renato (CBPF + LHCb)
remote: ? (KIT), Alessandra (Manchester + ATLAS + WLCG), Andrea S (GRIF), Christoph (CMS), Dave (FNAL), Di (TRIUMF), Eric (IN2P3-CC), Felix (ASGC), Igor (NRC-KI), Johannes (ATLAS), Luca (CNAF), Marcelo (CNAF), Mike (ASGC), Miro (databases + WLCG), Panos (WLCG), Pepe (PIC), Prasun (Kolkata), Ron (NL-T1), Simon (TRIUMF), Thomas (DESY), Tigran (dCache), Vikas (Kolkata)
apologies:
Operations News
The next meeting is planned for Oct 3
Please let us know if that date would pose a major inconvenience
The storage baselines have been updated following the DOMA TPC WG requirements
Tier 0 News
Tier 1 Feedback
Tier 2 Feedback
Experiments Reports
ALICE
NTR
ATLAS
Smooth Grid production over the past weeks with ~320k concurrently running grid job slots, with the usual mix of MC generation, simulation, reconstruction, derivation production, user analysis and a dedicated reprocessing campaign (see below). In addition, ~90k job slots from the HLT/Sim@CERN-P1 farm when it was not used for TDAQ purposes. There were also periods of additional HPC contributions, with peaks of ~50k concurrently running job slots doing simulation via the EventService.
Since August 8 a special reprocessing campaign of 2018 data (~7 PB, ~3 million files) has been running using the data carousel setup. This requires staging in all RAW inputs from tape at the Tier-1s. A few notes and possible future improvements:
Expected throughput: staging 7 PB in 2 weeks requires ~5.8 GB/s overall, i.e. ~580 MB/s for a Tier-1 holding a 10% share (see the sketch after this list)
INFN-T1 could not be used due to its ~2-week downtime; CERN CTA was used very successfully instead.
Contention was observed in the data export along the tape -> disk buffer -> data disk path at TRIUMF and IN2P3-CC, since the tape staging was faster than the copying of the data away from the disk buffer. There is room for improvement on the FTS and dCache side; more news is expected from the FTS experts.
Suboptimal tape staging performance was observed at FZK and PIC
Tier-1s: are the file pins respected on the tape disk buffer?
Improve the ATLAS WFMS task release threshold w.r.t. the optimal fraction of input files available on disk after staging from tape
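As a sanity check of the throughput figures quoted above, a minimal Python sketch of the arithmetic (assuming 1 PB = 10^15 bytes, a flat 14-day window and a 10% Tier-1 share; these assumptions are illustrative, not part of the report):

```python
# Back-of-the-envelope check of the staging rates quoted above.
# Assumptions (illustrative): 1 PB = 1e15 bytes, a flat 14-day window,
# and a Tier-1 holding 10% of the RAW data.

DATA_VOLUME_PB = 7          # 2018 RAW data to reprocess
WINDOW_DAYS = 14            # target: stage everything in 2 weeks
TIER1_SHARE = 0.10          # fraction of data held by a "10%" Tier-1

seconds = WINDOW_DAYS * 24 * 3600
total_rate_gb_s = DATA_VOLUME_PB * 1e15 / seconds / 1e9
tier1_rate_mb_s = total_rate_gb_s * TIER1_SHARE * 1e3

print(f"Overall staging rate: {total_rate_gb_s:.1f} GB/s")  # ~5.8 GB/s
print(f"10% Tier-1 share:     {tier1_rate_mb_s:.0f} MB/s")  # ~580 MB/s
```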
Could dCache please look into implementing the following space token writing feature request, which is critical for TPC and non-GridFTP stores: https://github.com/dCache/dcache/issues/3920
Now in the tails of switching to the new PanDA worker node pilot version 2 + Singularity. Only CentOS7 queues are being moved. Excluding CERN-P1, the job slots converted / not yet converted are: ~300k pilot2, ~260k pilot2+Singularity, ~20k pilot1 (still to be migrated to pilot2+Singularity+CentOS7).
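For illustration only, a minimal sketch of what wrapping a payload in a Singularity container on a CentOS7 queue can look like on a worker node; the container image path and the pilot invocation below are hypothetical placeholders, not the actual pilot2 integration:

```python
# Minimal sketch: launch a payload inside a Singularity container,
# roughly what a "pilot2 + Singularity" conversion implies on a worker node.
# The image path and payload command are hypothetical examples.
import subprocess

IMAGE = "/cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7"  # example CentOS7 image
PAYLOAD = ["python", "pilot.py", "--queue", "SOME_CE_QUEUE"]                  # placeholder pilot invocation

cmd = [
    "singularity", "exec",
    "-B", "/cvmfs:/cvmfs",   # bind-mount CVMFS into the container
    "-B", "/tmp:/tmp",       # scratch area for the job
    IMAGE,
] + PAYLOAD

subprocess.run(cmd, check=True)
```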