WLCG Tier1 Service Coordination Minutes - 11/11/2010


Oliver, Dirk, Kors, Gavin, Alexei, Flavia, Roberto, Massimo, Flavia, Andrea V., Maarten, Miguel, Manuel, Alessandro, Nicolo, Simone, Jamie, Maria A., Maria D., Maria G., Juan (ATLAS). Connected: Michael, Carlos, Jon B., Jon (NDGF), John DS, John K., Felix, Elena, Xavier, Gonzalo, Elena, Andrew, Ron, Rolf, Carmine, Andrew S., Andreas, Alexander.

Local: Maria Allandes, Dirk, Nicolo, Maarten, Manuel, Ricardo, Oliver, Flavia, MariaDZ, Gavin, Jamie, Maria, Huang, Andrea V, Massimo, Simone, Alessandro, Alexei, Roberto

Remote: Jon, Jeremy, Foud, Joel, John Kelly, Ian Fisk, Alexander Verkooijen, Elena Planas, Carmine, Carlos, DESY, Andreas@GridKA, FelixLee(ASGC), Jhen-Wei, 00886922923533, Dave Dykstra

Release Update

WLCG Baseline Versions

Data Management & Other Tier1 Service Issues

Site | Status | Recent changes | Planned changes
CERN | CASTOR 2.1.9-9 (ALICE, CMS and LHCb); SRM 2.9-4 (all); xrootd 2.1.9-7 | LHCb upgraded on 8-NOV-2010 |
ASGC | CASTOR 2.1.7-19 (stager, nameserver); CASTOR 2.1.8-14 (tapeserver); SRM 2.8-2 | 11/11 00:00-06:00 UTC: scheduled downtime for network maintenance | none
BNL | dCache 1.9.4-3 (PNFS) | |
CNAF | StoRM 1.5.4-5 (ATLAS, CMS, LHCb, ALICE) | |
FNAL | dCache 1.9.5-23 (PNFS); Scalla xrootd 2.9.1/1.4.2-4 | Upgraded dCache on Nov 8; xrootd read access opened to CMS VO |
IN2P3 | dCache 1.9.5-22 (Chimera) | |
KIT | dCache 1.9.5-15 (admin nodes) (Chimera); dCache 1.9.5-5 - 1.9.5-15 (pool nodes) | |
NDGF | dCache 1.9.7 (head nodes) (Chimera); dCache 1.9.5, 1.9.6 (pool nodes) | |
NL-T1 | dCache 1.9.5-19 (Chimera) (SARA), DPM 1.7.3 (NIKHEF) | |
PIC | dCache 1.9.5-21 (PNFS) | |
RAL | CASTOR 2.1.7-27 and 2.1.9-6 (stagers); 2.1.9-1 (tape servers); SRM 2.8-2 and SRM 2.8-6 | | CMS upgrade to 2.1.9-6 on 16-18/11/10 and ATLAS to 2.1.9-6 on 6-8/12/10
TRIUMF | dCache 1.9.5-21 with Chimera namespace | none | none


CERN operations

LHCb has been upgraded. We are following up on an alarm ticket (not necessarily linked to the upgrade). A post-mortem (upgrade + ticket) is planned for the end of the week.

We are preparing for the DB upgrades on the stagers (after the HI run but before Christmas). The DB upgrade related to the NS is definitely scheduled for January 2011 (it affects all instances, hence all VOs, and will require some downtime).


Release 2.1.9-10 has been produced, which mainly targets issues found at RAL following their upgrade to 2.1.9. Full release notes and upgrade instructions are available.

xrootd news

dCache news

  • Installed an experimental RSS feed for the different dCache downloads. It is linked from the dCache download pages and the feeds are organized per target (e.g. clients, server versions, etc.)
  • New Golden Release, 1.9.5-23 (see the release notes). Highlights:
    • Space Manager Database access was optimized to make better use of indexes.
    • Fixed read from pool that was disabled with the -rdonly flag of the pool disable command.
    • Fixed xrootd mover TCP port allocation to avoid reusing a port until previous transfers on the port have finished.
    • Fixed implementation of rep ls -l=c command.
  • Recommended feature release: 1.9.10-2 (release notes). Highlights:
    • Mostly speedups of the SRM, xrootd and NFS 4.1 protocols.
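
The xrootd mover fix listed above (a TCP port is not reused until earlier transfers on it have finished) can be illustrated with a small reference-counting allocator. This is only a sketch of the idea in Python, not dCache code: the class name `PortAllocator`, its methods and the port range are hypothetical and chosen for illustration.

```python
class PortAllocator:
    """Hand out ports from a fixed range; a port is reused only once it is idle.

    Hypothetical illustration, not the dCache implementation.
    """

    def __init__(self, first, last):
        self._ports = list(range(first, last + 1))
        # Number of transfers currently running on each port.
        self._active = {port: 0 for port in self._ports}

    def acquire(self):
        """Return the lowest idle port and record one transfer on it."""
        for port in self._ports:
            if self._active[port] == 0:
                self._active[port] = 1
                return port
        raise RuntimeError("no idle port available")

    def attach(self, port):
        """Record an additional transfer on a port that is already in use."""
        if self._active[port] == 0:
            raise ValueError(f"port {port} is not in use")
        self._active[port] += 1

    def release(self, port):
        """Record that one transfer on the port has finished."""
        if self._active[port] == 0:
            raise ValueError(f"port {port} has no running transfers")
        self._active[port] -= 1
```

With this bookkeeping, a port that still carries one unfinished transfer is skipped by `acquire()` and only becomes available again after the last `release()`.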

StoRM news

FTS news

DPM news

  • DPM 1.8.0-1 has been certified
  • dpm-xrootd 2.2.0-1 has been certified
  • Working on DPM 1.8.1:
    • faster dpm-drain and replication
    • refactoring of Name Server code for better performance of SRM, xroot and NFS 4.1

LFC news

  • LFC 1.8.0-1 has been certified

LFC deployment

Site | Version | OS, n-bit | Backend | Upgrade plans
ASGC | 1.7.2-4 | SLC4 64-bit | Oracle | Testing on a database dump, upgrade will be scheduled after tests, no date for now
BNL | 1.7.2-4 | SL4 | Oracle | 1.7.4 on SL5 in November
CERN | 1.7.3 | SLC4 64-bit | Oracle | Will upgrade to SLC5 64-bit by the end of the year
CNAF | 1.7.2-4 | SLC4 32-bit | Oracle | 1.7.4 on SL5 64-bit in November
FNAL | N/A | | | Not deployed at Fermilab
IN2P3 | 1.7.4-7 | SL5 64-bit | Oracle |
KIT | 1.7.4 | SL5 64-bit | Oracle |
PIC | 1.7.4-7 | SL5 64-bit | Oracle |
RAL | 1.7.4-7 | SL5 64-bit | Oracle |
TRIUMF | 1.7.2-5 | SL5 64-bit | MySQL |

Experiment issues

GGUS Issues

Outstanding SIRs

Conditions data access and related services



Database services

  • Topics of general discussion:
    • Distributed Database Operations Workshop: please register if attending the social dinner.

  • Experiment reports:
    • ALICE:
      • Nothing to report
    • ATLAS:
      • Conditions replication to SARA fixed.
      • ATLAS PanDA applications suffered from transaction locking issues on Wednesday afternoon (3rd Nov). DBAs had to intervene and kill user sessions to unblock the application. We are following up with ATLAS on the issue.
    • CMS:
      • On Tuesday morning (9th Nov) CMS PVSS replication from the online to the offline database was unexpectedly disabled for 2 hours due to the failure of one of the weekly automatic maintenance procedures (shrinking of the LogMiner table). The procedure has been modified to inform DBAs via email whenever its execution is unsuccessful.
      • On Tuesday (9th Nov) CMS PVSS replication was affected once again, for 30 minutes, because of a user error (use of a table without a primary key), which caused the apply process to abort.
    • LHCb:
      • Conditions replication to SARA fixed.
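
The change to the CMS maintenance procedure reported above (email the DBAs whenever a run is unsuccessful) follows a common wrapper pattern, sketched here in Python. It is an illustration only, not the actual procedure: `run_with_alert`, `shrink_logminer_table` and the injectable `notify` callback are hypothetical names, and a real deployment would send mail rather than collect messages in a list.

```python
import traceback


def run_with_alert(job, notify, name=None):
    """Run job(); on any exception call notify(subject, body) and return False."""
    name = name or getattr(job, "__name__", "job")
    try:
        job()
    except Exception as exc:
        # Report the failure instead of letting it pass silently.
        notify(f"[maintenance] {name} failed: {exc}", traceback.format_exc())
        return False
    return True


# Hypothetical stand-in for the weekly LogMiner table shrink; it always fails here.
def shrink_logminer_table():
    raise RuntimeError("shrink did not complete")


sent = []  # stands in for the DBA mailbox in this sketch
ok = run_with_alert(shrink_logminer_table,
                    lambda subject, body: sent.append((subject, body)))
```

Passing the notifier in as a callback keeps the wrapper independent of the mail transport, so the same code can be exercised in a test without sending anything.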

  • Site reports:
Site | Status, recent changes, incidents, ... | Planned interventions
ASGC | Nothing to report | Install and set up TSM; Data Guard studies, testbed creation and implementation plans.
BNL | Nothing to report | Deployment of PSU OCT 2010 in the TAGS test cluster; performance tests of data replication using Transportable Tablespaces between TRIUMF and BNL for the TAGS database.
CNAF | Nothing to report | None
KIT | Nothing to report | None
IN2P3 | Nothing to report | None
NDGF | Nothing to report | None
PIC | Nothing to report | None
RAL | Nothing to report | Planning to apply the October PSU on 3D DBs at the next CERN technical stop.
SARA | Nothing to report | November 16th, 7:00 to 17:00 UTC: intervention caused by maintenance on the network infrastructure.
TRIUMF | Nothing to report | None


Action List

Action number | Description | Announced | Due | Last Update | Status
20101028_01 | RAL out of production due to ATLAS upgrade | 20101028 | 20101124-27 | 20101028 | Open
20101028_02 | Configure new ASGC T2 channels. Done at: ASGC, CERN, IN2P3. KIT: tomorrow | 20101028 | 20101104 | 20101111 | Open
20101028_03 | CMS to decide on redirector fix of GGUS:62696 | 20101028 | a.s.a.p. | 20101028 | Open
20101028_04 | IN2P3 (P. Girard) - CERN (H. Renshall) WG to address AFS issues of the LHCb shared area (GGUS:59880, GGUS:62800). The WG is active (thanks to Harry) and an intermediate report is attached | 20101028 | a.s.a.p. | 20101111 | Closed
20101028_05 | Invite Dave Dykstra to discuss FroNTier/squid sharing by ATLAS and CMS sites | 20101028 | 20101111 | 20101028 | Open

-- AndreaSciaba - 10-Nov-2010

Topic attachments
CCIN2P3-WLCGT1SCM-LHCB-SW-Problem-Report-20101111-0.doc (Microsoft Word, r1, 38.5 K, 2010-11-11 12:21, AndreaSciaba): Intermediate report for the LHCb software area AFS problem
Topic revision: r11 - 2010-11-11 - MatthewViljoen1