WorkflowTeamMeeting20140424 (2014-04-24, JenniferAdelmanMcCarthy)
---+!! Workflow Team Meeting - April 24
%TOC{depth="3" title="Contents:"}%

---++ [[https://indico.cern.ch/event/254703/][Vidyo Link]]

---++ Attending
   * FNAL - Jen, Seangchan
   * CERN - Dave, Julian, Andrew, Adli

---++ Personnel
| April 17 -> April 24 | Jasper |
| April 24 -> May 1 | Adli |

   * Note: the schedule only goes up to next Thursday. We need a shift schedule for May/June! Please fill in the [[https://doodle.com/uvg5654vib78qa32#admin][doodle poll]]
   * Luis is going to Colombia for a seminar April 21-25, then on vacation April 28-May 2
   * Dave is at CERN April 21-29
   * Julian will be on vacation from 16-May to 18-May
   * Jen will be working from home Fri April 25th

---++ News
   * We think RAL and KIT will have downtimes next week
      * RAL's downtime is on Tues & Wed for network upgrades. We will put them in drain on Monday and bring them back up when the downtime is over on Wed.
   * High priority WFs will come this week
      * MinBias and "other stuff"; the plan is to run them on the high-prio agents
      * GEN-SIM goes first: assign them to a limited set of favorite T2s, and to T1s that are not RAL, KIT, FNAL or CERN
      * Expect e-logs from Vincenzo and Dave about these

---++ Jasper's Notes
   * Glidein problems at several sites
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=104780
   * Recent tickets
      * RALPP - status of this? Site is in drain. Undrain it!
         * https://ggus.eu/index.php?mode=ticket_info&ticket_id=102917
         * https://savannah.cern.ch/support/index.php?142870
      * T1_TW_ASGC
         * https://ggus.eu/?mode=ticket_info&ticket_id=103047
         * https://ggus.eu/?mode=ticket_info&ticket_id=104855
         * Only failing jobs for the last week because there have been problems with links to the T2s
         * Jasper and Julian put it in drain
         * A ticket has been opened
      * IIFCA
         * https://ggus.eu/?mode=ticket_info&ticket_id=104851
      * PNPI - still in the waiting room because SAM tests are failing. It has been in down for a long time. This morning it was in skip in SSB. It received some jobs that failed.
         * John will move it and T2_GR_IOANNA into down from skip
      * Who?
         * https://ggus.eu/index.php?mode=ticket_info&ticket_id=104831

---++ Agent Issues
   * 235 - stability issues. It was down Sun-Wed and is still not entirely stable!
      * Crashed again! Ivan is ticketing the IT guys.
      * Ticket: https://cern.service-now.com/service-portal/view-incident.do?n=INC0539969
      * It should be running now. There were problems with the host certs. Priorities were set wrong.
      * Alison is working on autorestart from Condor. If it doesn't start, we need to start it manually.
   * 216 - upgraded
   * 234 - now up and running
   * We have nothing in drain, everything should be green
   * All major agents are re-deployed except the high-prio machines

---++ Site Issues
   * RAL downtime Tues
   * KIT downtime Monday

---++ Workflow Issues

---++ MonteCarlo
   * High priority stuff: the former HighPrio Spring13 is getting pushed out of the way. We need to keep an eye on the high-prio stuff and keep it moving.
   * http://spinoso.web.cern.ch/spinoso/mc/issues.html

---++ Redigi/ReReco
   * The issues we had last week with WFs being "stuck" in acquired eventually worked their way through the system. The ACDCs weren't really "stuck"; they were sitting behind higher priority WFs in the queue, so they weren't moving along.
   * We managed to get ~150 WFs through, most of which needed some sort of special attention

---++ Andrew's issues
   * All log collect jobs are failing at Fermilab, even with the new patch. There is already a GitHub issue.
   * Seangchan is trying to talk to Catilyn about it, but Catilyn has been out sick all week
   * For RelVal we just want to write to the _DISK endpoint. When you make deletions, delete from both MSS and DISK.
   * For Pileup datasets we make duplicates. Put it in the comments.

---++ Seangchan's questions
   * Alan and Seangchan have a new tag. We need to do upgrades once we get through the high-priority stuff.
   * Premixing: any progress on the premixing test workflow? Ask Oli.

-- Main.JenniferAdelmanMcCarthy - 23 Apr 2014
Topic revision: r5 - 2014-04-24 - JenniferAdelmanMcCarthy