(2015-06-25, JenniferAdelmanMcCarthy)
---+!! Workflow Team Meeting - June 25 4PM CERN, 9AM FNAL time
%TOC{depth="3" title="Contents:"}%

---++ Vidyo Link
   * https://indico.cern.ch/event/402959/

---++ Attending
   * US: Jen, Ajit, Jorge, Matteo, SeangChan, Luis
   * EU: Julian, Andrew

---++ Personnel
   * Jen off June 26-July 5 - will have e-mail access but painfully slow internet
   * Jen off Aug 10-26 (tentative)
   * SeangChan off July 27-31
   * Julian off Sept 14-30

---++ News
   * A few upgrades in the system but nothing major

---++ 3 top issues affecting production
   * File read issues at FNAL, xrootd problems - both Julian and Jen submitted ACDCs on Wed.; check in the morning to see if they worked
   * RunIISpring15DR74: (Exit Code: 8003) = Step3 miniaod problem, across sites, across workflows for
      * getting somewhere with ACDCs but not to full recovery
      * Julian will let them run over the weekend, and then report back to PPD on Monday
   * EXO-RunIIWinter15GS workflows with high failure rates (31 so far):
      * https://cms-logbook.cern.ch/elog/Workflow+processing/20900
      * maxRSS exceeded; I'm testing finer splitting. Andrew says he didn't see any problem with these on RelVal
      * using too much RSS memory, probably will need to be reset; finer splitting isn't fixing the problem. Should we send them back?
      * elog and hypernews discussions already in the works
   * ACDCs stuck in acquired, stuck in Global Queue - SeangChan needs to look at them - Global Queue got wrong from PhEDEx
      * https://cms-logbook.cern.ch/elog/Workflow+processing/20907

---++ Site support - John
---+++ Waiting Room

---++ Workflows
---+++ ReDigi
   * Problems discussed above in the top issues section
---+++ TaskChains
   * There are 2 task chains in complete. Julian, have you looked at them?
      * Waiting for reset
---+++ Rereco
   * Need to check and handle separately
      * https://cms-logbook.cern.ch/elog/Workflow+processing/20903
      * What do we need to do?
      * Julian will ask at Monday meeting
---+++ Store Results
   * NTR
---+++ MonteCarlo
   * MC workflows with >95% failure rate are exceeding memory; sending them back

---++ Agent Issues
---+++ Redeployment Plan
   * Deploying version 1.0.8.pre6.
   * We also want another opportunistic WMAgent for FNAL+Amazon
      * probably will use one of the submit machines
   * vocms0304 - backfill team - needs a few tests before using it for backfill and scale testing; try to increase maxjobs running for CERN agents
   * Deploy new CERN agent 308; then we have all new agents and can start draining the old ones

---++ RelVal Andrew
   * Discussing injecting log files into PhEDEx

---++ L3 discussion - Ajit, Jean-Roch, Matteo
   * Nothing special; no more sending redigi/rereco to SDSC
   * GEN_SIM not moving, seems to be a site issue
---+++ Opportunistic Resources - Stefan
   * Old backfills sitting in running-closed but not closing out. What should we do?
---++++ HLT
   * HLT testing - T2_CH_CERN_HLT is not in the menu in WMStats. Are you just assigning via script? Answer: it needs to be assigned via script
   * Perhaps we should do a more systematic campaign like what we are doing with SDSC
   * Let's make sure that we submit reasonable workflows
---++++ SDSC
---+++ Automatic Assignment And Unified Software
---+++ AOB

-- Main.JenniferAdelmanMcCarthy - 2015-06-24
Topic revision: r6 - 2015-06-25 - JenniferAdelmanMcCarthy
Copyright © 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.