Vidyo Link


  • Jen, Julian, Andrew, Luis, SeangChan, Dave, Adli, Alan


Feb 25 -> March 4 Xavier
March 4 -> March 11 Jasper + Xavier


  • Fermilab is moving to separated disk/tape storage this week. Unlike other sites, it does not require a full drain, since we were already reading everything via xrootd.
  • Fermilab is doing storage upgrades this week as well.
  • Pools have been moved as of yesterday.
  • Input data is all on the disk endpoint, so we shouldn't see errors, but we need to check
  • merges should succeed with xrootd, but we need to keep an extra-close eye on them
  • complaints from above about latency in workflows that have been running. What can we do internally to make things run more smoothly?
    • Dave is working on numbers that look closely at the latency and will get them to us by Friday
    • we need the big-picture items that need to be taken care of on Friday

Operations notes:

  • Do we need to move the meeting, or find a separate time to meet with the operators?
    • Julian will set up a Doodle poll to see what we can work out.
  • What can we do to more effectively use our operators?
    • have operators do more wf & site debugging
    • maybe have operators do ACDCs and force-complete WFs
    • stuck workflows?

Agent Issues

  • Oracle upgrade went through with no major issues.
  • vocms235: too many pending jobs (it pulled in all the workload), causing a bottleneck 13105
    • Almost everything recovered.
  • ReDigis with duplicate events: status?
  • Top WMAgent (& friends) issues for the Release planning meeting.
    • API to get number of jobs running/pending per category: This will help us to spot stuck workflows and to accurately measure workflow progress.
    • DBS3 late event registration (closed block field?)
    • Job location missing (crash of DBS3 Uploader): is this fixed with the patch?
    • Duplicate lumis?
  • next redeploy cycle will begin at the end of March

Site issues related to the Workflow Team


  • NTR


  • Where are we in the recovery from the mass clone of ReDigi for DBS?
  • Where are my ACDCs?
  • Are there ReDigis stuck in acquired?

Andrew's questions/Luis's answers

  • e-mail about datasets that need to be on disk at FNAL - Dave saw several
  • nearly everything from last week's subscription request is done
  • Andrew: make the subscription but do not approve it. Dave will touch base with Burt and Catalin to make sure we are OK, and then Dave will approve.
    • requests should be made to user group dataops
  • Can we start subscribing output datasets to disk, since they would be input for another workflow? If you write something out, it will be registered to buffer; if it's going to be an input for a future WF, put it to disk. Once we do our TFC change, it will automatically go to disk and not buffer.
  • couch replication will automatically restart
  • there should be no change in the time it takes for datasets to be registered in DBS; not sure why we are seeing one
  • do we need to make tape family requests for HI WFs at IN2P3?
    • once you make the subscription request, there does need to be a tape family; the site will tell you what they need

Site support issues

-- JenniferAdelmanMcCarthy - 03 Mar 2014
Topic revision: r4 - 2014-03-04 - JenniferAdelmanMcCarthy