---+ WLCG Tier1 Service Coordination Minutes - 20th January 2011
%TOC%
---++ Attendance
local(); remote ();
---++ Release update
---++ Data Management & Other Tier1 Service Issues
| *Site* | *Status* | *Recent changes* | *Planned changes* |
| !CERN | CASTOR 2.1.10 (CMS, ATLAS and ALICE)<br />CASTOR 2.1.9-9 (LHCb)<br />SRM 2.9-4 (all)<br />xrootd 2.1.9-7 | Oracle changes (all instances): upgraded to 10.2.0.5 on 6 Jan | LHCb will be upgraded to 2.1.10 |
| ASGC | CASTOR 2.1.7-19 (stager, nameserver)<br />CASTOR 2.1.8-14 (tapeserver)<br />SRM 2.8-2 | 14/1: 4h "at risk" [[https://next.gocdb.eu/portal/index.php?Page_Type=View_Object&object_id=24425&grid_id=0][intervention]] on tape system due to construction work on the electrical power system in the data center | None |
| BNL | dCache 1.9.5-23 (PNFS, Postgres 9) | 1.9.4 upgraded to 1.9.5-23; PG 8.3 to PG 9 | None |
| CNAF | !StoRM 1.5.6-3 (ATLAS, CMS, LHCb, ALICE) | | Upgrade OS to SL5 in February |
| FNAL | dCache 1.9.5-23 (PNFS)<br />Scalla xrootd 2.9.1/1.4.2-4 | None | Putting Lustre into Production Service for Merging Pools |
| !IN2P3 | dCache 1.9.5-22 (Chimera) | | |
| KIT | dCache 1.9.5-15 (admin nodes) (Chimera)<br />dCache 1.9.5-5 - 1.9.5-15 (pool nodes) | | |
| NDGF | dCache 1.9.7 (head nodes) (Chimera)<br />dCache 1.9.5, 1.9.6 (pool nodes) | | |
| NL-T1 | dCache 1.9.5-23 (Chimera) (SARA), DPM 1.7.3 (NIKHEF) | | |
| PIC | dCache 1.9.5-23 (PNFS) | | |
| !RAL | CASTOR 2.1.9-6 (stagers)<br />2.1.9-1 (tape servers)<br />SRM 2.8-6 | Upgraded ATLAS disk servers to SL5 64-bit and enabled checksum support | Will next upgrade CMS disk servers and enable checksum support. Will upgrade Oracle to 10.2.0.5 on 1 Feb 2011 |
| TRIUMF | dCache 1.9.5-21 with Chimera namespace | None | None |
---+++ CASTOR news
---++++ CERN operations
---++++ Development
No news.
---+++ xrootd news
---+++ dCache news
---+++ !StoRM news
   * !StoRM 1.6.0 released for Early Adopters for SL5 x86_64 on Jan 13. Changelog is [[http://storm.forge.cnaf.infn.it/documentation/changelog][here]].
---+++ FTS news
No news.
---+++ DPM news
   * DPM 1.8.0 released for gLite 3.2 / SL5 on Jan 18, including these highlights:
      * new VOMS library fixing memory leaks in SRM daemons
      * facility to ban users and groups (VOMS attributes)
   * DPM 1.8.0 for gLite 3.1 / SL4 still in Staged Rollout:
      * http://glite.cern.ch/staged_rollout
      * Edinburgh already upgraded
      * should be released to production soon
---+++ LFC news
   * LFC 1.8.0 for gLite 3.1 / SL4 and 3.2 / SL5 still in Staged Rollout
      * new VOMS library fixing memory leaks in LFC daemon
      * facility to ban users and groups (VOMS attributes)
      * http://glite.cern.ch/staged_rollout
      * BNL already upgraded
      * should be released to production soon
---++++ LFC deployment
| *Site* | *Version* | *OS, n-bit* | *Backend* | *Upgrade plans* |
| ASGC | 1.7.4-7 | SLC5 64-bit | Oracle | None (upgrade done on 4/1) |
| BNL | 1.8.0-1 | SL5 64-bit | Oracle | |
| [[https://twiki.cern.ch/twiki/bin/view/PESgroup/PesGridServicesSfwLevels][CERN]] | 1.7.3 | SLC4 64-bit | Oracle | Will upgrade to SLC5 64-bit by the end of January or the beginning of February. |
| CNAF | 1.7.4-7 | SL5 64-bit | Oracle | |
| FNAL | N/A | | | Not deployed at Fermilab |
| !IN2P3 | 1.8.0-1 | SL5 64-bit | Oracle | Upgraded to LFC 1.8.0 on January 4th |
| KIT | 1.7.4 | SL5 64-bit | Oracle | |
| NDGF | | | | |
| NL-T1 | 1.7.4-7 | !CentOS5 64-bit | Oracle | |
| PIC | 1.7.4-7 | SL5 64-bit | Oracle | |
| !RAL | 1.7.4-7 | SL5 64-bit | Oracle | |
| TRIUMF | 1.7.3-1 | SL5 64-bit | !MySQL | |
---+++ Experiment issues
---++ Status of open GGUS tickets
---++ GGUS - Service Now interface
---++ Review of open SIRs
---++ Conditions Data Access and related services
---++ Database services
   * Experiment reports:
      * Generic:
         * LHCb and ATLAS downstream capture databases were patched to 10.2.0.5 on Wednesday 5th of Jan.
      * ALICE:
         * Online DB upgraded to 10.2.0.5 on 12th of Jan.
         * Online DB not available between 19th of Jan 16h and 20th of Jan 19h due to power tests in the pit.
      * ATLAS:
         * The Atlas offline database accounts for ADC applications (PANDA, DQ2 and prodsys) were moved to a dedicated database on Monday 17th of Jan 2011. The operation involved a scheduled downtime for the affected accounts from 9am till 6pm and included a full backup to tape of the new DB accounts. The rest of the Atlas offline DB, in particular conditions and PVSS, was untouched.
         * Atlas online DB was upgraded to 10.2.0.5 on Wednesday afternoon (19th of Jan).
         * After the CC power cut on 18th of December, the ATLARC database failed to restart. The investigation showed that one of the online log files was corrupted. The database had already archived this file, but the archived copy turned out to be corrupted as well. This looks like a database bug, since a corrupted online log file should not be archived successfully. As a consequence, the database had to be restored from backup. Recovery was possible only to a point in time a few hours before the power cut (7:30 a.m.), because the TSM service lost some backup data after the power cut and the existing archivelog was corrupted. The users agreed to restore the DB to that point in time, without waiting for TSM to be able to recover more data.
      * CMS:
         * CMS online DB was not available between 8:25 and 19:10 on the 4th of January, due to a power cut in P5. Manual intervention was required to start the DB after power was back (~17h).
         * Maintenance activities over the weekend on schemas replicated by CMS PVSS streaming caused aborts and high replication latency. To avoid further problems, other changes were applied manually on the target database (17th of Jan).
         * The CMS online production database was stopped twice last week, on Monday and Tuesday morning (10-11 Jan). The first downtime was needed for power tests at P5; at the same time, several disks critical for database operation were replaced with new ones. The second downtime was for the upgrade to 10.2.0.5, which completed successfully.
         * CMS offline database was upgraded to version 10.2.0.5 on Wednesday afternoon (19th of Jan).
      * LHCb:
         * LHCb online DB was not available between 0:10 and 19h on the 20th of Dec due to scheduled power tests in the LHCb pit.
   * Site reports:
| *Site* | *Status, recent changes, incidents, ...* | *Planned interventions* |
| ASGC | LFC DB upgraded to 10.2.0.5 | None |
| BNL | Conditions DB - underlying storage firmware patches applied%BR%Updates of OS RHEL 5 and Oracle to 10.2.0.5:%BR%- LFC BNL, LFC Tier 3 and FTS database services successfully migrated to new hardware (head nodes/storage)%BR%- VOMS / dCache Priority Stager database successfully migrated to new hardware (head nodes/storage) | Upgrade Conditions DB to 10.2.0.5 |
| CNAF | | Still no exact plan for the 10.2.0.5 upgrade, but will define it soon. |
| KIT | A new DBA: Stefan Waldecker | Upgrade of 3D RACs (ATLAS, LHCb) to Oracle 10.2.0.5 on Jan 26 (during full !GridKa/DE-KIT downtime 7:00-18:00 UTC). |
| IN2P3 | Nothing to report | - DBLHCB, DBATL and DBAMI - On 8th Feb [9:00 - 18:00 CET], we are going to upgrade our storage system and network switch. All 3D databases will be shut down during this intervention.%BR%- DBAMI - On 20th Jan, we will add a new schema to the AMI stream. |
| NDGF | Nothing to report | None |
| PIC | | We are planning a downtime in early February to upgrade our Oracle databases and perform other tasks, but the date is not fixed yet. |
| RAL | We have installed the new CASTOR and ATLAS LFC hardware, which is now being tested. The next step is to install Data Guard in High Availability mode and test it before going into production. | - Planning to upgrade Castor DBs on the 31st if we get the final approval from the experiment.%BR%- Waiting for CERN to finish their upgrade before we proceed to upgrade our 3D and LFC/FTS DBs |
| SARA | Successfully moved the database back to the original cluster hardware on 18th of Jan | No date for 10.2.0.5 upgrade yet |
| TRIUMF | Nothing to report | Planning to upgrade to Oracle 10.2.0.5 sometime in February. |
---++ Action List
| *Action number* | *Description* | *Announced* | *Due* | *Last Update* | *Status* |
| 20101028_01 | !RAL out of production due to Atlas upgrade<br>%RED%Being an announcement and not an action it will not be further followed up%ENDCOLOR% | 20101028 | 20101206-08 | 20101119 | Open |
| 20101028_02 | Configure new ASGC T2 channels<br>%RED%Done at: ASGC, CERN, !IN2P3. KIT: tomorrow%ENDCOLOR% | 20101028 | 20101104 | 20101111 | Open |
| 20101028_03 | CMS to decide on redirector fix of GGUS:62696 | 20101028 | a.s.a.p. | 20101028 | Open |
| 20101028_05 | Invite Dave Dijkstra to discuss !FroNTier/squid sharing by Atlas and CMS sites | 20101028 | 20101111 | 20101028 | Open |
| 20101216_01 | Write SIR on Atlas db server reboot | 20101216 | a.s.a.p. | 20110120 | Done |
| 20101216_01 | Write SIR on Atlas db server reboot | 20101216 | a.s.a.p. | 20110120 | Open |
---++ AOB
Topic revision: r22 - 2011-01-20 - DawidWojcik