WLCG Tier1 Service Coordination Minutes - 9 February 2012

Attendance

Local:

Remote:

Action list review

Release update

Data Management & Other Tier1 Service Issues

Site Status Recent changes Planned changes
CERN CASTOR 2.1.11-9 (tapegateway active) + SRM-2.11 for all main instances ; c2xrootd: 2.1.11-1
FTS: all nodes in SLC5 3.7.0-3
EOSATLAS: 0.1.1-11/xrootd-3.1; EOSCMS: 0.1.0/xrootd-3.0.4
 

CASTOR 2.1.12-1 is being released. Deployment will start with test/validation instances and the general roll-out is foreseen for the first part of March.

Campaign of FileClasses changes (10 days for now). Preparatory work for 2.1.12 to organise user data on tape (Monday 20th)

ASGC CASTOR 2.1.11-6
SRM 2.11-0
DPM 1.8.2-3
None None
BNL dCache 1.9.12.10 (Chimera, Postgres 9 w/ hot backup)
http (aria2c) and xrootd/Scalla on each pool
None None
CNAF StoRM 1.8.0 (Atlas, CMS, LHCb) None None
FNAL dCache 1.9.5-23 (PNFS, postgres 8 with backup, distributed SRM) httpd=2.2.3
Scalla xrootd 2.9.7/3.1.0.osg
Oracle Lustre 1.8.6
EOS 0.1.1-12/xrootd 3.1.0.osg with Bestman 2.0.10
   
IN2P3 dCache 1.9.5-29 (Chimera) on core servers and pool nodes ? ?
KIT dCache
atlassrm-fzk.gridka.de: 1.9.12-11 (Chimera)
cmssrm-fzk.gridka.de: head nodes 1.9.5-26 (Chimera), pool nodes 1.9.5-6 through -25
gridka-dcache.fzk.de: head nodes 1.9.5-26 (PNFS), pool nodes 1.9.5-24,-25
xrootd (version 20100510-1509_dbg)
? ?
NDGF dCache 1.9.14 (Chimera) on core servers. Mix of 1.9.13 and 2.0.0 on pool nodes. ? ?
NL-T1 dCache 1.9.12-10 (Chimera) (SARA), DPM 1.7.3 (NIKHEF) ? ?
PIC dCache 1.9.12-14 (last upgrade to patch release on 14-Dec); PNFS on Postgres 9.0 ? ?
RAL CASTOR 2.1.10-1
2.1.10-0 (tape servers)
SRM 2.11

Upgraded SRM to 2.11

Upgrade NS to 2.1.11 on 14 Feb

Upgrade CASTOR to 2.1.11 between 20-29 Feb
TRIUMF dCache 1.9.5-28 with Chimera namespace None None

Other site news

CASTOR news

CERN operations and development

EOS news

xrootd news

dCache news

StoRM news

FTS news

  • FTS 2.2.8 is now installed on the CERN pilot service. Stress tests from ATLAS, PhEDEx transfers, and the Oracle 11 database are all working. Action: Oliver will provide a table of FTS version / Oracle version / OS version compatibility. gLite FTS 2.2.8 is now installed in production at the Tier0 by CERN/IT/PES, whereas EMI FTS 2.2.8 is installed on the pilot service (and in production at RAL). The official EMI release date is 2012/02/16. Sites should report every Thursday, at the end of the daily meeting, when they plan to upgrade, so that all upgrades are done before LHC data taking (mid-March).

DPM news

LFC news

LFC deployment

| Site | Version | OS, n-bit | Backend | Upgrade plans |
| ASGC | NA | NA | NA | NA |
| BNL | 1.8.0-1 | SL5, 64-bit | Oracle | None |
| CERN | 1.8.2-0 | 64-bit SLC5 | Oracle | Upgrade to SLC5 64-bit only pending for lfcshared1/2 |
| CNAF | 1.8.0-1 | SL5 64-bit | Oracle | None |
| FNAL | N/A | | | Not deployed at Fermilab |
| IN2P3 | 1.8.2-2 | SL5 64-bit | Oracle 11g | |
| KIT | 1.7.4-7 | SL5 64-bit | Oracle | Oracle backend migration pending |
| NDGF | 1.7.4.7-1 | Ubuntu 10.04 64-bit | MySQL | None |
| NL-T1 | 1.7.4-7 | CentOS5 64-bit | Oracle | |
| PIC | 1.7.4-7 | SL5 64-bit | Oracle | |
| RAL | 1.7.4-7 | SL5 64-bit | Oracle | |
| TRIUMF | 1.7.3-1 | SL5 64-bit | MySQL | None |
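
The spread of LFC releases in the table above can be summarised with a short script. The following is only a sketch: the version strings are copied from the table, while the names `LFC_VERSIONS`, `release_key` and `sites_by_release` are hypothetical helpers introduced here for illustration. It assumes that comparing the first three numeric fields of a version string is enough to order the releases.

```python
import re
from collections import defaultdict

# Versions copied from the LFC deployment table.
# ASGC and FNAL are omitted because no LFC version is listed for them.
LFC_VERSIONS = {
    "BNL": "1.8.0-1",
    "CERN": "1.8.2-0",
    "CNAF": "1.8.0-1",
    "IN2P3": "1.8.2-2",
    "KIT": "1.7.4-7",
    "NDGF": "1.7.4.7-1",
    "NL-T1": "1.7.4-7",
    "PIC": "1.7.4-7",
    "RAL": "1.7.4-7",
    "TRIUMF": "1.7.3-1",
}


def release_key(version):
    """Return the first three numeric fields as a tuple (assumed ordering)."""
    return tuple(int(field) for field in re.split(r"[.-]", version)[:3])


def sites_by_release(versions):
    """Group site names by release, ordered from oldest to newest release."""
    grouped = defaultdict(list)
    for site, version in versions.items():
        grouped[release_key(version)].append(site)
    return dict(sorted(grouped.items()))


if __name__ == "__main__":
    for release, sites in sites_by_release(LFC_VERSIONS).items():
        print(".".join(str(n) for n in release), ", ".join(sites))
```

Grouping on the release rather than the full package string keeps NDGF's `1.7.4.7-1` together with the `1.7.4-7` sites, which is an assumption about how the two packaging schemes relate.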

Experiment issues

WLCG Baseline Versions

Status of open GGUS tickets

Review of recent / open SIRs and other open service issues

Conditions data access and related services

Database services

  • Experiment reports:
    • All experiment databases have been successfully upgraded to 11gR2.
    • January security patches have also been deployed on all experiment databases.
    • Change of the compatible parameter from 10.2.0.5 to 11.2.0.3 ongoing.
    • Several issues affected Streams replication after the upgrades:
      • Bug when using the segment advisor (Metalink note "Streams Capture Aborting With ORA-26767 Due To Temp Tables Created By DBMS_COMPRESSION" [ID 1082323.1]). Workaround: add capture rules to ignore the offending LCRs.
      • 11.2.0.3 cannot capture changes on compressed tables when 'compatible' parameter is not raised to 11.2.0.3.
      • The 'compatible' parameter has to be modified before or after the upgrade, when all instances are up. Otherwise logminer will abort with ORA-600. Taking a dictionary dump after changing the parameter is good practice.
      • The capture process tends to crash while reading logs written during the upgrade. Workaround: recreate the process so that it starts just after the upgrade.

  • Site reports:
Site Status, recent changes, incidents, ... Planned interventions
BNL None Migration of the Conditions Database to 11g and new hardware, Feb 22 2012. Oracle Data Guard (physical standby database) will be used during this intervention.
CNAF LHCb updated to 11g on 31 Jan - 01 Feb. We still have a large amount of old FTS data to delete; it mainly slows down FtsMonitor and is too big to be purged by the official scripts. This issue will be addressed by starting with a clean DB after the scheduled downtime for FTS hardware consolidation and the upgrade to SL5+EMI. We are getting closer to this, but there is no fixed date yet.
KIT Jan 24: ATLAS 3D DB migration to 11g, Feb 1: LHCb 3D/LFC DB upgrade to 11g and migration to new hardware.  
IN2P3 On 7 February ASM got stuck, which caused the apply process to abort. ASM had raised many access errors on some old raw devices which had been deleted at the OS level. The problem was fixed by bouncing the machines.
Oracle has provided a solution for the problem observed during the last upgrade attempt. Unfortunately the solution stopped working just before redoing the upgrade of the LHCb database (09.02).
 
PIC LHCb database upgraded to 11g (30.01) The FTS upgrade and the migration of the ATLAS LFC are planned between 27 and 29 February, but the exact day is not fixed yet.
RAL LHCb 3D upgraded to 11g (31.01) January security patches to be applied, no date yet
SARA LHCb 3D upgraded to 11g (01.02) - took until 02.02  
TRIUMF Oracle's JAN 2011 CPU Patch applied to our ATLAS 3D, FTS & TAG database servers. Short outage to change the database_compatible parameter to 11.2.0.3 to be scheduled
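
Two of the Streams issues in the experiment reports above depend on whether the 'compatible' parameter has been raised to 11.2.0.3. As a minimal sketch of that check, assuming the parameter value has already been read from the database as a dotted string, the comparison can be done numerically rather than as text. The function names `parse_version` and `compatible_is_raised` are hypothetical and not part of any Oracle tool.

```python
def parse_version(text):
    """Turn a dotted version string such as '11.2.0.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def compatible_is_raised(compatible, required="11.2.0.3"):
    """Return True when `compatible` is at least `required`.

    The default threshold is the value quoted in the minutes.
    """
    have = parse_version(compatible)
    need = parse_version(required)
    # Pad the shorter tuple with zeros so '11.2' compares like '11.2.0.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    for value in ("10.2.0.5", "11.2.0.3"):
        print(value, compatible_is_raised(value))
```

A numeric comparison is used because a plain string comparison would order versions such as '9.2.0.8' after '11.2.0.3'.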

AOB

-- AndreaSciaba - 08-Feb-2012

Topic revision: r15 - 2012-02-09 - GiuseppeLoPresti
 