Week of 150302
WLCG Operations Call details
- At CERN the meeting room is 513 R-068.
- For remote participation we use the Vidyo system. Instructions can be found here.
General Information
- The purpose of the meeting is:
- to report significant operational issues (i.e. issues which can or did degrade experiment or site operations) which are ongoing or were resolved after the previous meeting;
- to announce or schedule interventions at Tier-1 sites;
- to inform about recent or upcoming changes in the experiment activities or systems having a visible impact on sites;
- to provide important news about the middleware;
- to communicate any other information considered interesting for WLCG operations.
- The meeting should run from 15:00 until 15:20, exceptionally to 15:30.
- The SCOD rota for the next few weeks is at ScodRota
- General information about the WLCG Service can be accessed from the Operations Web
- Whenever a particular topic requiring information from sites or experiments needs to be discussed at the daily meeting, it is highly recommended to announce it by email to wlcg-operations@cern.ch so that the relevant parties have time to collect the required information or invite the right people to the meeting.
Monday
Attendance:
- local: Alessandro (ATLAS), Jan (storage), Maarten (SCOD + ALICE), Steve (grid services), Zbyszek (databases)
- remote: Christian (NDGF), Christoph (CMS), Di (TRIUMF), Felix (ASGC), Lisa (FNAL), Onno (NLT1), Pepe (PIC), Rob (OSG), Rolf (IN2P3), Sang-Un (KISTI), Tiju (RAL), Vladimir (LHCb)
Experiments round table:
- ATLAS reports -
- Central Services/T0/T1
- FTS3 server upgrade: we would like BNL, RAL and CERN to upgrade to the latest FTS server release to fix the activity share issue.
- Steve: the downtime will be less than 5 min; do the upgrade tomorrow (Tue)?
- Christoph: OK for CMS
- Vladimir: OK for LHCb
- Alessandro: we will only re-enable the shares once all 3 FTS for ATLAS have been upgraded; we will coordinate this through the FTS-3 mailing list
- ALICE -
- high activity
- RAL now allow up to 6k concurrent ALICE jobs when other VOs have little activity - thanks!
- LHCb reports -
- Distributed computing dominated by Monte Carlo and user activities.
- T0: NTR
- T1: NTR
- Vladimir: there are fewer jobs than usual due to a minor issue that appeared after the migration to the DIRAC File Catalog and is being worked on
Sites / Services round table:
- ASGC: ntr
- BNL:
- CNAF:
- FNAL:
- the issue of GGUS alarm tickets paging the wrong number has been resolved
- GridPP:
- IN2P3:
- reminder of the downtime tomorrow; the ATLAS tape buffer space will be more than sufficient during the intervention
- JINR:
- KISTI: ntr
- KIT:
- NDGF:
- downtime tomorrow morning for dCache head nodes upgrade
- there was a power supply problem at the Copenhagen site; the problem has disappeared, but the root cause has not been found yet
- NL-T1:
- on Mon March 9 the dCache upgrade from 2.6 to 2.10 will start; the downtime has been declared for 2 days because the DB upgrade might not finish in 1 day
- NRC-KI:
- OSG: ntr
- PIC:
- on Tue March 10 there will be a 5h downtime for network upgrades and maintenance as well as a dCache 2.10 patch
- RAL: ntr
- RRC-KI:
- TRIUMF: ntr
- CERN batch and grid services:
- the migration from VOMRS to VOMS-Admin is proceeding as planned; the new service is expected to be available by 17:00 CET (and it is)
- CERN storage services:
- CASTOR-LHCb will be down for an upgrade this Wed
- CASTOR-ALICE will be down for an upgrade next Mon
- a bunch of rolling DB updates will soon be announced; they should be fairly transparent from the client side; some daemons may need to be restarted on the server side
- Databases:
- tomorrow there will be rolling updates of WLCG integration DBs (INT6R and INT11R)
- the corresponding production DB updates will be done in 1 or 2 weeks
- tomorrow there will be a full-day transparent intervention on back-end storage serving half of all DBs
- CASTOR, WLCG and experiments are affected in principle
- GGUS:
- Grid Monitoring:
- MW Officer:
AOB:
Thursday
Attendance:
Experiments round table:
Sites / Services round table:
- ASGC:
- BNL:
- CNAF:
- FNAL:
- GridPP:
- IN2P3:
- JINR:
- KISTI:
- KIT:
- NDGF:
- NL-T1:
- OSG:
- PIC:
- RAL:
- TRIUMF:
- CERN batch and grid services:
- CERN storage services:
- Databases:
- GGUS:
- Grid Monitoring:
- MW Officer:
AOB: