Workflow Team Meeting - July 10 4PM CERN time

Vidyo Link


  • Jen will be late - Julian will be running the meeting
  • At FNAL: Luis, Dave and SeangChan
  • At CERN: Andrew, Alan, Julian


June 26 - July 3: Jasper
July 3 - July 10: Sara
July 10 - July 17: Jasper
(Or tell Oli | CW | CP)
  • Jen will be off July 28 - Aug 8 and may have limited access in the evenings
    • Juan will be working as a US shifter while Jen is on vacation, so we will have eyes on the system
  • Julian Badillo will be on vacation July 24 - July 25
  • Dave Mid July - possibly not happening
  • Dave will be on jury duty on Tuesdays until October.
  • Note: Krista will be going on maternity leave sometime in July


  • As part of Juan's ramp-up as a developer, we are going to have him do operations for a couple of weeks, covering the US shift while Jen is on vacation.
    • This has been cleared through SeangChan and Eric; we haven't talked to the OSG guys yet. He won't be 100% operations, but at least we will have somebody looking at the system.
    • Luis will back Juan up and help train him.
    • Juan, please send your IM (gtalk/skype/whatever) info to the workflow team (Luis).

Sara's notes

Site support

  • BE_UCL? not running jobs.

Agent Issues

  • Couch-erlang upgrade: All agents up-to-date
    • Do we need a new redeployment procedure/RPMs?
    • The new RPMs will be included in the 0.9.95b (or next) tag
    • Alan and Justas are still validating it; hopefully on Monday or Tuesday we will get the new tag that can be used to redeploy the agents.
  • Redeployment plan:
    • Next in line: vocms216 --> set to drain --> will be redeployed next week with the new tag.
    • vocms237 (step0) next in line.
    • cmssrv112 and cmssrv98 --> set to drain and will be split between the reproc and mc teams.
      • cmssrv98 first --> mc team - Alan will do the agent checks and switch the teams.
      • on Friday cmssrv112 --> reproc_lowprio
      • we need to check that no wf gets assigned to the reproc_highprio team.
      • we need to switch priorities in the collector.
      • They won't be redeployed.
  • About SL6 agents in FNAL:
    • They will be assigned to a different team (production) and finish the tests.


  • Workflows in acquired: we've had a LOT of issues reported about this over the past week.
  • Jen spent some time looking at old WFs in the system and needs input from Andrew/Dave on what to do with these:
    • April Workflows still in assigned?? 15733
    • March WF's still in the system 15704
    • Feb WF's still in the system 15679


  • Both Jen and Dave have "hacked" the closeout script so we can close these out automatically. These are not subscribed anywhere, so we do not need to do subscription tests on them.
    • Jen's hack just comments out the PhEDEx check line and runs the regular script with a list of aodsim files. If we are going to be running a lot of these in the future (after CSA14) we should probably code this up more formally.
    • Julian made a change based on the request name having the "miniaod" substring, see pull request
      • better to do a check based on the output dataset name (tier)
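A minimal sketch of the tier-based check suggested above, not the actual closeout script. It assumes output dataset names follow the usual /Primary/Processed/TIER layout; the function names and the set of unsubscribed tiers are hypothetical and would need to match whatever the team decides.

```python
# Hypothetical list of tiers that are not subscribed anywhere (assumption,
# to be adjusted by the workflow team).
UNSUBSCRIBED_TIERS = {"MINIAOD", "MINIAODSIM"}


def data_tier(dataset_name):
    """Return the data tier, i.e. the last field of /Primary/Processed/TIER."""
    parts = dataset_name.strip().split("/")
    # A well-formed name splits into ['', Primary, Processed, TIER].
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError("unexpected dataset name: %r" % dataset_name)
    return parts[3]


def skips_phedex_check(dataset_name):
    """True if the dataset's tier is one we do not expect to be subscribed."""
    return data_tier(dataset_name).upper() in UNSUBSCRIBED_TIERS


if __name__ == "__main__":
    # Made-up dataset names, for illustration only.
    for name in ("/Prim/Proc-v1/MINIAODSIM", "/Prim/miniaod_test-v1/AODSIM"):
        print(name, skips_phedex_check(name))
```

Checking the tier avoids skipping the PhEDEx check for a request that merely has "miniaod" somewhere in its name but writes a different tier, as in the second made-up dataset above.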


  • Some workflows are "closed-out" in WMStats but "complete" in ReqMgr, waiting to be announced. Once they are moved to announced, WMStats should get the state right.
  • Did we retry the "pre-mixing" workflow from Oli's email? pdmvserv_EXO-Spring14premixdr-00002_T1_US_FNAL_MSS_00001_v0_premixinPilotGF_140527_171514_2197
    • Dave will look at it.


Store Results

  • 1 ticket completed with 99%: 50979
  • 3 tickets "On hold", what to do about them?
  • 4 requests failed that are going to be manually retried.


  • Workflows were waiting for the ReqMgr patch in order to be extended (Elog) - both are now extended.
    • pdmvserv_SMP-Summer12-00013_00105_v0__140328_041711_9773 (60M events) 80%
    • pdmvserv_BPH-Summer12-00166_00114_v0__140410_161619_101 (20M events) 91%
    • Egamma samples aborted, but extended.

RelVal Andrew

Topic revision: r6 - 2014-07-10 - JulianBadillo