Week of 130401
Daily WLCG Operations Call details
To join the call, at 15.00 CE(S)T Monday to Friday inclusive (in CERN 513 R-068) do one of the following:
- Dial +41227676000 (Main) and enter access code 0119168, or
- To have the system call you, click here
The SCOD rota for the next few weeks is at ScodRota
WLCG Availability, Service Incidents, Broadcasts, Operations Web
General Information
Monday: Easter
Monday holiday
- The meeting will be held on Tuesday instead.
Tuesday
Attendance:
- local: Raja, Maarten, Jan, Jerome, Stefan
- remote: Peter, Xavier, Rolf, Wei-Jen, Oliver, Onno, Lisa, Lucia, Rob, Pepe, Gareth, Jeremy, Roger
Experiments round table:
- CMS reports (raw view) -
- LHC / CMS
- Re-reconstruction of 2012 data is in the tails; load at the T1 sites is small
- CERN / central services and T0
- Frontier system was under high load over the weekend: a FastSim workflow was misconfigured to use the FullSim job splitting, which produced very short jobs that made many requests to the Squid caches for alignment and calibration constants. If you see failures in SAM tests and/or HammerCloud tests because of failed access to Frontier, please open a Savannah ticket to get the SiteReadiness calculation corrected.
- Tier-1:
- Tier-2:
- ALICE -
- NTR
- Xavier: ALICE asked why jobs were lost last week; the reason was a reboot of the VOBOX
- Maarten: Yes, this was also reported in the WLCG Ops meeting last week, but some instabilities were also seen after the reboot
- LHCb reports (raw view) -
- Mainly user jobs with some MC ongoing.
- T0:
- No SAM tests displayed on the SUM dashboard - solved now (GGUS:92924). Solution not very clear though.
- T1:
- RAL: Continuing to have occasional problems with setting up the job environment.
Sites / Services round table:
- FNAL: NTR
- KIT: Today at 8:30 am one file server showed issues; the hardware is currently being replaced. For the moment 6 x 30 TB are not available for ATLAS
- CNAF: Kernel upgrade is finished. Pledges are available as of today
- ASGC: NTR
- RAL: NTR
- NL-T1: NTR
- NDGF: NTR
- PIC: Last week's scheduled downtime went well, but after coming back online the Chimera system was unstable and was therefore rolled back to the previous version. Pledges are installed as of today.
- IN2P3: NTR
- OSG: NTR
- GridPP: NTR
- Batch Services: NTR
- Storage: Announcement: next Monday the "file update functionality" for CASTOR will be removed for ATLAS, CMS and LHCb. ALICE was already running without it.
AOB:
Thursday
Attendance:
Experiments round table:
Sites / Services round table:
AOB: