Workflow Team Meeting - Aug 28 4PM CERN time

Vidyo Link

Attending

  • FNAL: Jen, Dave, SeangChan
  • CERN: Julian, Andrew and Alan
  • Joe

Personnel

Aug 21 --> Aug 28 Joseph
Aug 28 --> Sept 4 Sara

  • this is as far as the schedule goes.
    • Who can set up the new fall schedule?
    • Once people's class/meeting schedules change in the fall, do we need to change our meeting time so the operators can make the meeting?
  • Julian will be on vacation September 1st to 5th
    • Please remember to fill in the CompOps meeting report. I'll leave a cronjob that generates the plots Monday 13:30 CERN time, in a public location.
  • Dave is on jury duty on Tuesdays until the end of September

News

  • still quiet on the operations front
    • Dave is about to assign 15 real workflows as soon as he makes sure the datasets are in place.
    • We are expecting the next round of upgrade workflows to show up (~1 week of work).
  • We need to come up with a list of things to do while things are quiet. Might as well take advantage of the slow time to make our busy time easier:
    • How can we use backfill to streamline operations? What tests should we be running?
      • Dave is running redigi backfill across sites to test the network.
        • How aggressive do you want us to be in tracking down and reporting issues? This is mainly a networking test that can only be done when we have very little else going on.
      • Test/backfill workflows
        • What do you want us to do with them when they move to complete? Are you looking at outputs, or can they be cleaned up and moved out of the way?
      • RAL is showing issues. We need to keep on them about these so that we can get them working properly.
    • Do we need to review our documentation?
    • We keep coming up with stuck workflows. What can we do to help the developers debug this?
    • What upgrades are on the horizon for, say, the next 6 months? Can any of them be pushed up and tested while we are slow? What can we do to help facilitate this?
      • SL6
      • moving to one team
  • Pages for issues and "stuckness" are in test and are generated twice a day. Please check them out to see workflow progress:

Site support

Joseph's notes

Agent Issues

  • Workflows keep getting stuck in various states. What has changed? When we were busy things just moved; now that we are slow, nothing is moving on its own.
  • We need to collect WMAgent issues to push to the next Release Planning Meeting.
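
One way to collect concrete cases for the developers is to flag workflows that have sat in the same state longer than some threshold. The sketch below is only an illustration: the record layout, the `find_stuck` helper, the set of "in flight" states and the 48-hour threshold are all assumptions, not part of WMAgent or the request manager.

```python
from datetime import datetime, timedelta

# Hypothetical record layout: (workflow name, current state, time it entered that state).
# Which states count as "in flight" is an assumption made for this illustration.
IN_FLIGHT = {"running-open", "running-closed"}

def find_stuck(records, now, max_age=timedelta(hours=48)):
    """Return (name, state, age) for in-flight workflows older than max_age, oldest first."""
    stuck = [
        (name, state, now - since)
        for name, state, since in records
        if state in IN_FLIGHT and now - since > max_age
    ]
    return sorted(stuck, key=lambda item: item[2], reverse=True)

if __name__ == "__main__":
    now = datetime(2014, 8, 28, 16, 0)
    # Made-up example records, not real workflows.
    records = [
        ("workflow_a", "running-closed", datetime(2014, 8, 22, 9, 0)),
        ("workflow_b", "running-open", datetime(2014, 8, 28, 8, 0)),
        ("workflow_c", "completed", datetime(2014, 8, 20, 12, 0)),
    ]
    for name, state, age in find_stuck(records, now):
        print(f"{name}: in {state} for {age.days}d {age.seconds // 3600}h")
```

A list like this, with the state and the time spent in it, could go straight into the issues we bring to the Release Planning Meeting.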

Redeployment plan

  • All agents are up to date
    • Next in line: vocms216 - 38% disk --> it can wait.
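
The "it can wait" call above is a disk-usage threshold decision. A minimal sketch of that kind of check, assuming an 80% threshold (the notes do not state the real cutoff) and hypothetical helper names `disk_used_percent` and `can_wait`:

```python
import shutil

def disk_used_percent(path="/"):
    """Return the used share of the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def can_wait(used_percent, threshold=80.0):
    """True if usage is below the (assumed) threshold, so redeployment is not urgent."""
    return used_percent < threshold

if __name__ == "__main__":
    used = disk_used_percent("/")
    verdict = "can wait" if can_wait(used) else "schedule redeployment"
    print(f"{used:.0f}% disk --> {verdict}")
```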

Workflows

miniaod's

  • dmason_TSG-Spring14miniaod-00022_00023_v0__140816_202411_6545 - duplicates in the input dataset [[https://cmslogbook.cern.ch/elog/Workflow+processing/16558][16558]]
    • I know we were going to return these to the requestors, but I don't want to move it off the list until it is decided what to do about the inputs. Otherwise they will be sitting there and valid until we forget about them and re-run against them and have this issue again 6 months from now.
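
To avoid re-running against a bad input by accident, a quick duplicate check before assignment could help. The sketch below assumes "duplicates" means the same (run, lumi) pair appearing in more than one file of the dataset; that reading, the input layout and the `find_duplicate_lumis` helper are assumptions for illustration, and the file names are made up.

```python
from collections import defaultdict

def find_duplicate_lumis(file_lumis):
    """Map each (run, lumi) found in more than one file to the sorted list of those files.

    `file_lumis` is a hypothetical mapping: file name -> iterable of (run, lumi) pairs.
    """
    seen = defaultdict(set)
    for filename, lumis in file_lumis.items():
        for run_lumi in lumis:
            seen[run_lumi].add(filename)
    return {rl: sorted(files) for rl, files in seen.items() if len(files) > 1}

if __name__ == "__main__":
    # Made-up example input, not taken from the real dataset.
    example = {
        "file_a.root": [(1, 1), (1, 2)],
        "file_b.root": [(1, 2), (1, 3)],
    }
    for (run, lumi), files in find_duplicate_lumis(example).items():
        print(f"run {run} lumi {lumi} appears in: {', '.join(files)}")
```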

ReDigi

  • workflows being stuck in various states
    • Last Friday it was stuck in running-open due to a problem in the request manager
    • This week things are sitting in running-closed: I have 4 "real" workflows going, and neither they nor Dave's backfill workflows are moving.
  • FSQ-ppSpring2014DRX53-00004_00004_v0_castor
    • I have been waiting almost a week for this input dataset to be complete on _DISK so I can try cloning it again, hoping Jorge found the issue.
    • ggus ticket
    • elog
    • in the end this may very well end up going back to the requestors, but we have very little else to look at so we might as well pound on it and find out why it is having so many issues.

T1's summary

Rereco

  • Nothing has run

Store Results

  • moved to main request manager
    • They seem to be running OK. What else is there to do here?
    • Documentation updated. No tickets open.

MonteCarlo

  • crickets

RelVal Andrew

-- JenniferAdelmanMcCarthy - 27 Aug 2014
Topic revision: r6 - 2014-08-28 - JenniferAdelmanMcCarthy