WorkflowTeamMeeting20140218
(2014-02-18, JenniferAdelmanMcCarthy)
Vidyo Link
Attending
John, Seangchan, Dave, Jen, Luis
Julian, Andrew
Shift
Feb 11 -> Feb 18
Sunil
Feb 18 -> Feb 25
???
News
The return to production with DBS3 went amazingly well! Thank you again, everyone, for your hard work! Hope everyone managed to grab a little breather.
Lessons learned?
Agent Issues
PhEDEx/DBS3 error?
Fix made and changes applied to the WMAgents.
All the agents were reinstalled with a few changes:
No mc-highprio team.
vocms237 with no step0 patch (zero/events file will be reported as a request error)
vocms85 in the mc team, not started yet, will be started as a backup
DBS3Upload crashing on Thursday 12804:
Incomplete SE information reported by jobs.
Seangchan and Luis ran a recovery for missing blocks
GitHub issue: 4964
DBS3 Upload time-lag, blocks may be uploaded after the workflow is complete:
Wait 18h to resubmit workflows?
How to check the completion-time?
Add this wait to the closeout script?
Seangchan will open a GitHub issue. A new state will be introduced; for now the workaround is to wait until 18 hours after the WF moves to closeout.
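The 18-hour workaround above could be sketched roughly as follows. This is only an illustration, not the actual closeout script: the function names and the constant are hypothetical, and how the closeout timestamp is obtained is left open here, just as it is in the notes ("How to check the completion-time?").

```python
from datetime import datetime, timedelta

# Wait discussed in the meeting: 18 hours after the workflow moves to closeout.
# The constant name is hypothetical, chosen for this sketch only.
DBS3_UPLOAD_WAIT = timedelta(hours=18)


def ready_to_resubmit(closeout_time, now, wait=DBS3_UPLOAD_WAIT):
    """Return True once `wait` has elapsed since the workflow moved to closeout."""
    return now - closeout_time >= wait


def time_remaining(closeout_time, now, wait=DBS3_UPLOAD_WAIT):
    """Return how much longer to wait (zero if the wait has already elapsed)."""
    remaining = closeout_time + wait - now
    return max(remaining, timedelta(0))


if __name__ == "__main__":
    # Example timestamps only; a real script would take these from the workflow record.
    moved_to_closeout = datetime(2014, 2, 18, 8, 0)
    check_time = datetime(2014, 2, 18, 18, 0)
    print(ready_to_resubmit(moved_to_closeout, check_time))
    print(time_remaining(moved_to_closeout, check_time))
```

Keeping the wait as a parameter means the value can be changed, or the check dropped, once the planned new workflow state exists.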
Workflow Issues
Closeout script problems - Julian how far did you get on this?
Already fixed and running; it was an error in the way I used the DBS3 API.
The monitoring script that reports to dashboard:
Julian is working on that
MonteCarlo
NTR
ReDigi/ReReco
Cloned WFs assigned
Need to go through them and make sure everything is OK now. We have 1 with duplicates and several that PhEDEx is not updating on.
Site Issues:
Disk/tape separation on PIC, switching plan:
Julian will submit a plan for the CompOps meeting
Global-prioritization
When are we doing this? Does April sound good? Validate with PPD.
Plan (Scratch):
Drain the reproc_highprio agents
Create a "production" team:
vocms85 + ex_reproc_highprio
Drain one of the reproc_lowprio and one of the mc
Switch assignment to the new team.
Add the agents to the new team as they are drained
Drain the rest of the agents
At the end:
One team with 5 agents + 2 backup (will they be enough?)
Release Validation
WMAgent parameters issue
High-memory RelVal workflows
How can JINR set a limit of 8 GB per core if each worker node has 12 cores and 48 GB RSS total?
slow transfers from FNAL:
http://savannah.cern.ch/support/?141982
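The arithmetic behind the JINR memory question above can be checked with a small sketch. The 12 cores, 48 GB and 8 GB figures come from the notes; the function names are hypothetical, and the assumption that every job is single-core and uses its full memory limit is made here only for illustration.

```python
def memory_per_core_gb(total_rss_gb, cores):
    """Average memory available per core if all cores are busy."""
    return total_rss_gb / cores


def max_concurrent_jobs(total_rss_gb, cores, job_limit_gb):
    """Jobs that fit on one node, assuming single-core jobs that each
    use the full memory limit (an assumption of this sketch)."""
    by_memory = int(total_rss_gb // job_limit_gb)
    return min(cores, by_memory)


if __name__ == "__main__":
    # Figures quoted in the notes: 12 cores and 48 GB RSS per worker node.
    print(memory_per_core_gb(48, 12))      # 4.0
    print(max_concurrent_jobs(48, 12, 8))  # 6
```

Under these assumptions a fully loaded node averages 4 GB per core, so an 8 GB limit can only be honoured if about half of the cores stay idle.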
-- JenniferAdelmanMcCarthy - 18 Feb 2014
Topic revision: r4 - 2014-02-18 - JenniferAdelmanMcCarthy