WLCGTier1ServiceCoordinationMinutes110915 (2011-10-12, AndrewWong)
---+ WLCG Tier1 Service Coordination Minutes - 15 September 2011

%TOC%

---++ Attendance

---++ Action list review

---++ Release update

---+++ Data Management & Other Tier1 Service Issues

| *Site* | *Status* | *Recent changes* | *Planned changes* |
| !CERN | CASTOR 2.1.11-6 (SL5); CASTOR-SRM 2.10-x (SL5); xrootd: 2.1.11-1<br /> FTS: 5 nodes in SLC5 3.7.0-3; 7 nodes in SLC4 3.2.1<br /> EOS -0.1.0/xrootd-3.0.4 | CASTOR-SRM on SL5 | EOS-SRM (bestman) update to 2.1.3 |
| ASGC | CASTOR 2.1.11-2<br />SRM 2.11-0<br />DPM 1.8.0-1 | None | None |
| BNL | dCache 1.9.5-23 (PNFS, Postgres 9) | None | Transition to Chimera during next TS (Nov) |
| CNAF | !StoRM 1.7.0 | | |
| FNAL | dCache 1.9.5-23 (PNFS) httpd=1.9.5.-25<br />Scalla xrootd 2.9.1/1.4.2-4<br />Oracle Lustre 1.8.3 | | |
| !IN2P3 | dCache 1.9.5-26 (Chimera) on core servers. Mix of 1.9.5-24 and 1.9.5-26 on pool nodes | | |
| KIT | dCache (admin nodes): 1.9.5-27 (ATLAS, Chimera), 1.9.5-26 (CMS, Chimera) 1.9.5-26 (LHCb, PNFS)<br />dCache (pool nodes): 1.9.5-6 through 1.9.5-27 | | |
| NDGF | dCache 1.9.12 | | |
| NL-T1 | dCache 1.9.5-23 (Chimera) (SARA), DPM 1.7.3 (NIKHEF) | | |
| PIC | dCache 1.9.12-10 (last upgrade to patch release on 13-Sep); PNFS on Postgres 9.0 | | Planning intervention in the MSS: upgrade to Enstore2. Possible date 28-Sep. In contact with experiments to check if this is ok. |
| !RAL | CASTOR 2.1.10-2 <br />2.1.10-0 (tape servers)<br />SRM 2.10-0 | | 7/9: will apply DB patches to 3D ATLAS and LHCb, FTS and LFC. Services will be "at risk" |
| TRIUMF | dCache 1.9.5-28 with Chimera namespace | None | None |

---++++ Other site news

---++++ CASTOR news

---+++++ CERN operations and development

---++++ EOS news

---++++ xrootd news

---++++ dCache news

---++++ !StoRM news

---++++ FTS news

   * FTS 2.2.5 in gLite Staged Rollout: http://glite.cern.ch/staged_rollout
   * FTS 2.2.6 released in EMI-1 Update 6 on Sep 1
   * FTS 2.2.7 being prepared for certification: [[https://savannah.cern.ch/patch/?4862][FTS 2.2.7 patch]] (see list of bugs at the end)

---++++ DPM news

   * DPM 1.8.2-2 ready for final certification (code already validated extensively and in use at some sites)
      * fast dpm-drain
      * filesystem selection algorithm configurable by admin
      * support for central banning (Argus)
      * https://savannah.cern.ch/patch/?5005
      * https://savannah.cern.ch/patch/?5006

---++++ LFC news

   * LFC 1.8.2-2 ready for final certification (code already validated extensively and in use at some sites)
      * fix for read-only replica operation (LHCb)
      * support for central banning (Argus)
      * https://savannah.cern.ch/patch/?5003
      * https://savannah.cern.ch/patch/?5004

---++++ LFC deployment

| *Site* | *Version* | *OS, n-bit* | *Backend* | *Upgrade plans* |
| ASGC | 1.8.0-1 | SLC5 64-bit | Oracle | None |
| BNL | 1.8.0-1 | SL5, 64-bit | Oracle | None |
| [[https://twiki.cern.ch/twiki/bin/view/PESgroup/PesGridServicesSfwLevels][CERN]] | 1.8.2-0 64-bit | SLC5 | Oracle | Upgrade to SLC5 64-bit only pending for lfcshared1/2 |
| CNAF | 1.7.4-7 (ATLAS, to be dismissed)<br />1.8.0-1 (LHCb, recently updated) | SL5 64-bit | Oracle | |
| FNAL | N/A | | | Not deployed at Fermilab |
| !IN2P3 | 1.8.0-1 | SL5 64-bit | Oracle 11g | Oracle DB migrated to 11g on Feb. 8th |
| KIT | 1.7.4-7 | SL5 64-bit | Oracle | Oracle backend migration pending |
| NDGF | 1.7.4.7-1 | Ubuntu 9.10 64-bit | !MySQL | None |
| NL-T1 | 1.7.4-7 | !CentOS5 64-bit | Oracle | |
| PIC | 1.7.4-7 | SL5 64-bit | Oracle | |
| !RAL | 1.7.4-7 | SL5 64-bit | Oracle | |
| TRIUMF | 1.7.3-1 | SL5 64-bit | !MySQL | |

---++++ Experiment issues

---+++ WLCG Baseline Versions

   * Release report: [[https://twiki.cern.ch/twiki/bin/view/LCG/LcgScmStatus#Deployment_Status][deployment status wiki page]]
   * WLCG Baseline versions: [[https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions][table]]

---++ Status of open GGUS tickets

   * All 4 experiments confirmed they had no issues. ATLAS shifters are reminded to use TEAM tickets in order to share the ticket ownership across shifts.
   * The "Type of Problem" field for TEAM and ALARM tickets will be introduced with the 2011/09/28 GGUS Release, as planned according to Savannah:117206. The field values will be, as agreed:
      * Infrastructure (File transfer/access, Batch, Monitoring)
      * Storage Systems
      * Databases
      * Network problem
      * Middleware

---++ Review of recent / open SIRs and other open service issues

---++ Conditions data access and related services

---++ Database services

   * Experiment reports:
      * ATLAS:
         * Due to a high load coming from the COOL application, the third node of the ATLAS offline database (ATLR) rebooted on Thursday (8th of Sept) around 2 AM. After the third node had restarted, the first instance of ATLR crashed due to a clusterware error (according to Oracle documentation, this is a known problem in 10g, per an unpublished bug). All services were relocated to the second node, which managed to survive. All ATLR instances were back in operation in around 20 minutes.
         * As requested by the ATLAS experiment, the new schema ATLAS_CONF_TRIGGER_REPR was added to the ATLAS T0-T1s replication on Wednesday (14th of Sept). The intervention was not transparent and required 1 hour of downtime (from 10:30 until 11:30) of the whole ATLAS replication service between T0 and T1s.
      * CMS:
         * There were several failures of the CMS PVSS streams replication on Monday, Tuesday and Wednesday (12-14th of Sept) due to user mistakes: views were created with wrong syntax, and there were attempts to recompile non-existent views.
      * WLCG:
         * LCGR database services were unreachable for 1 hour on Sunday (11th Sept ~19:30). The problem was caused by the database reaching its maximum number of open sessions. This was triggered by the fact that some VOMS application stopped disconnecting from the database but kept opening new sessions. To make the database available again, instance no. 4 was restarted.
   * Site reports:

| *Site* | *Status, recent changes, incidents, ...* | *Planned interventions* |
| BNL | - Resynchronization of a table in the Conditions DB due to a streams instantiation problem.%BR%- Renew DOE certificates in the BNL Oracle Enterprise Manager Grid Control. | - apply CPU patches initially to Conditions Database and proposed patch from Oracle (P6011045). |
| CNAF | | Applying CPU July patches on 20th and 21st of September |
| KIT | ATLAS 3D DB: due to the migration to new hardware, disk group names changed and we had to re-create a few dba_directories. | None |
| IN2P3 | | |
| NDGF | | |
| PIC | | We are going to apply the July CPU patch on all databases next week, between Tuesday and Wednesday. |
| RAL | Nothing to report | None |
| SARA | Nothing to report (Not attending) | None |
| TRIUMF | Nothing to report | None |

---++ AOB

-- Main.JamieShiers - 14-Sep-2011
Topic revision: r13 - 2011-10-12 - AndrewWong