---+!! Week of 131209

%TOC%

---++ WLCG Operations Call details

To join the call, at 15.00 CE(S)T, by default on Monday and Thursday (at CERN in 513 R-068), do one of the following:
   1. Dial +41227676000 (Main) and enter access code 0119168, or
   2. To have the system call you, click [[https://audioconf.cern.ch/call/0119168][here]]

The SCOD rota for the next few weeks is at ScodRota

---++ WLCG Availability, Service Incidents, Broadcasts, Operations Web

| *VO Summaries of Site Usability* ||||*SIRs* |*Broadcasts* |*Operations Web* |
| [[http://dashb-alice-sum.cern.ch/dashboard/request.py/historicalsmryview-sum#view=siteavl&time%5B%5D=lastWeek&profile=ALICE_CRITICAL&group=all%2Bsites&site%5B%5D=CCIN2P3&site%5B%5D=CERN&site%5B%5D=CNAF&site%5B%5D=FZK&site%5B%5D=NIKHEF&site%5B%5D=RAL&site%5B%5D=SARA&type=quality][ALICE]] | [[http://dashb-atlas-sum.cern.ch/dashboard/request.py/historicalsmryview-sum#view=siteavl&time%5B%5D=lastWeek&profile=ATLAS_CRITICAL&group=All%2Bsites&site%5B%5D=BNL-ATLAS&site%5B%5D=CERN-PROD&site%5B%5D=FZK-LCG2&site%5B%5D=IN2P3-CC&site%5B%5D=INFN-T1&site%5B%5D=NDGF-T1&site%5B%5D=NIKHEF-ELPROD&site%5B%5D=pic&site%5B%5D=RAL-LCG2&site%5B%5D=SARA-MATRIX&site%5B%5D=Taiwan-LCG2&site%5B%5D=TRIUMF-LCG2&type=quality][ATLAS]] | [[http://dashb-cms-sum.cern.ch/dashboard/request.py/historicalsmryview-sum#view=siteavl&time%5B%5D=lastWeek&profile=CMS_CRITICAL_FULL&group=Tier1s%2B%252B%2BTier0&site%5B%5D=T0_CH_CERN&site%5B%5D=T1_CH_CERN&site%5B%5D=T1_DE_KIT&site%5B%5D=T1_ES_PIC&site%5B%5D=T1_FR_CCIN2P3&site%5B%5D=T1_IT_CNAF&site%5B%5D=T1_TW_ASGC&site%5B%5D=T1_UK_RAL&site%5B%5D=T1_US_FNAL&type=quality][CMS]] | [[http://dashb-lhcb-sum.cern.ch/dashboard/request.py/historicalsmryview-sum#view=siteavl&time%5B%5D=lastWeek&profile=LHCb_CRITICAL&group=Tier%2B0/1&site%5B%5D=LCG.CERN.ch&site%5B%5D=LCG.CNAF.it&site%5B%5D=LCG.GRIDKA.de&site%5B%5D=LCG.IN2P3.fr&site%5B%5D=LCG.NIKHEF.nl&site%5B%5D=LCG.PIC.es&site%5B%5D=LCG.RAL.uk&site%5B%5D=LCG.SARA.nl&type=quality][LHCb]] | [[https://twiki.cern.ch/twiki/bin/view/LCG/WLCGServiceIncidents][WLCG Service Incident Reports]] | [[https://operations-portal.egi.eu/broadcast/archive][Broadcast archive]] | [[WLCGOperationsWeb][Operations Web]] |

---++ General Information

| *General Information* ||| *GGUS Information* | *LHC Machine Information* |
| [[http://itssb.web.cern.ch/][CERN IT status board]] | [[https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions][WLCG Baseline Versions]] | [[http://cern.ch/planet-wlcg][WLCG Blogs]] | GgusInformation | [[https://espace.cern.ch/be-dep-op-lhc-machine-operation/default.aspx][Sharepoint site]] - [[http://op-webtools.web.cern.ch/op-webtools/vistar/vistars.php?usr=LHC1][LHC Page 1]] |

<HR>

---++ Monday

Attendance:
   * local: Simone (SCOD), Alessandro (ATLAS), Raja (LHCb), Przemek (CERN-DB), Vitor (CERN-PES), Maarten (ALICE)
   * remote: Xavier (KIT), Pepe (PIC), Sang-Un (KISTI), Rolf (IN2P3), Michael (BNL), Tiju (RAL), Jeremy (GridPP), Roger (NDGF), Onno (NL-T1), Rob (OSG)

Experiments round table:

   * ATLAS [[https://twiki.cern.ch/twiki/bin/view/Atlas/ADCOperationsDailyReports2013][reports]] ([[https://twiki.cern.ch/twiki/bin/view/Atlas/ADCOperationsDailyReports2013?raw=on][raw view]]) -
      * Central services
         * ATLAS_DDM_VOBOXes were unstable on Dec. 5th. Stable again at 2:00 UTC on Dec. 6th.
         * PilotFactories also. Degraded from 12:00 UTC to 24:00 UTC on Dec. 5th.
      * T0/T1
         * FZK-LCG2: Network trouble caused DNS lookup errors on Dec. 6th. GGUS:99571. Fixed.
         * FZK-LCG2: Transfer failures due to 'RQueued' (reported last Thursday) still happening. Failure rate of around 10% since Dec. 8th.
         * TAIWAN-LCG2: Recovered from disk server crash on Oct. 30th. GGUS:98482 closed.

   * CMS [[https://twiki.cern.ch/twiki/bin/view/CMS/FacOps_WLCGdailyreports][reports]] ([[https://twiki.cern.ch/twiki/bin/view/CMS/FacOps_WLCGdailyreports?raw=on][raw view]]) -
      * It has been a very quiet few days, largely just some scattered issues at scattered T2 sites.
      * The exception to this is CNAF, for which the storage was down for several days. It's back now.
      * I have just (13:40) learned that there is trouble with the CERN BDII that is making sites appear unavailable in SAM tests. GGUS:99521, perhaps there will be an update by 15:00?

   * ALICE -
      * NTR

   * LHCb [[https://twiki.cern.ch/twiki/bin/view/LHCb/ProductionOperationsWLCGdailyReports][reports]] ([[https://twiki.cern.ch/twiki/bin/view/LHCb/ProductionOperationsWLCGdailyReports?raw=on][raw view]]) -
      * Main activity is simulation at all sites.
      * T0:
         * Movement of LHCb VO-boxes is completed. DIRAC services started and jobs are going out to the grid again.
         * SRM not responding according to the dashboard (http://dashb-lhcb-sum.cern.ch/dashboard/request.py/historicalsmryview-sum#view=serviceavl&time[]=last12&granularity[]=default&profile=LHCb_CRITICAL&group=Tier+0/1&site[]=LCG.CERN.ch&type=quality)
      * T1:
         * CNAF storage is back in operation
         * IN2P3: Continuing problem with nagios probe (GGUS:99420). Waiting for support from SAM/dashboard administrators.

Sites / Services round table:

   * KIT: there will be 3 downtimes tomorrow: CMS dCache, firewall, tape management software. On Thursday dCache for ATLAS will be upgraded as well.
   * PIC: finishing the SIR on the network incident that occurred last week. It will be provided ASAP.
   * BNL: there will be a 2h network intervention one week from now (next Monday). Besides T1 services, it will also affect access to LFC and FTS (and therefore T2 activity). Next Tuesday dCache will be upgraded to the SHA-2 compliant version.
   * IN2P3: downtime tomorrow. Operations portal down from 8:30 to 10:30.
   * NL-T1: downtime on December 17th: 24 hours of maintenance of the MSS. It will not be possible to stage files during that time.
      * Maarten: what is the situation with the disk servers (which gave a lot of trouble in the past weeks)?
      * Onno: they seem to be stable now after a lot of hardware replacement. New hardware should also arrive before the end of the year.
      * Maarten: at the end of the process, a SIR should be provided (there was also some minimal data loss).
      * Onno: will do.
   * NDGF: in downtime this morning for an upgrade of central storage services. On Wednesday there will be a network intervention which will affect some pools; therefore some data might be unavailable.
   * CERN DB: intervention on Wednesday (10 AM CET) on the WLCGR test and integration database.
   * ASGC: During the weekend, our data center suffered from high temperatures due to problems with our air conditioning, which caused some CASTOR disks to become unstable. The situation should improve on Monday morning.
   * Maarten for PES: it is very urgent to upgrade the CERN and SAM BDII to the latest version to make sure the FCR mechanism does not affect SAM tests. T1s are also invited to upgrade.

AOB:

---++ Thursday

Attendance:
   * local:
   * remote:

Experiments round table:

   * ATLAS [[https://twiki.cern.ch/twiki/bin/view/Atlas/ADCOperationsDailyReports2013][reports]] ([[https://twiki.cern.ch/twiki/bin/view/Atlas/ADCOperationsDailyReports2013?raw=on][raw view]]) -

   * CMS [[https://twiki.cern.ch/twiki/bin/view/CMS/FacOps_WLCGdailyreports][reports]] ([[https://twiki.cern.ch/twiki/bin/view/CMS/FacOps_WLCGdailyreports?raw=on][raw view]]) -

   * ALICE -

   * LHCb [[https://twiki.cern.ch/twiki/bin/view/LHCb/ProductionOperationsWLCGdailyReports][reports]] ([[https://twiki.cern.ch/twiki/bin/view/LHCb/ProductionOperationsWLCGdailyReports?raw=on][raw view]]) -

Sites / Services round table:

   * *CERN CvmFS* The stratum 0 (cvmfs-stratum-zero.cern.ch) and stratum 1 (cvmfs-stratum-one.cern.ch) will migrate to new hardware and a new OS; the stratum 1 will also be upgraded from 2.0.? to 2.1.15. The migration will be transparent for all stratum 1s that are replicating from the stratum 0. It will also be transparent for all CvmFS clients (both 2.0.* and 2.1.*) that are using the stratum 1. See [[https://cern.service-now.com/service-portal/view-outage.do?&n=OTG5767][ITSSB]].

AOB:
   * Middleware Readiness WG meeting *TODAY* at 4pm CET. Agenda and connection details at https://indico.cern.ch/conferenceDisplay.py?confId=285681
Topic revision: r9 - 2013-12-11 - MariaDimou
Copyright © 2008-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.