Workflow Team Meeting - April 23 4PM CERN, 9AM FNAL time

Vidyo Link

Attending

  • FNAL: Jen, Luis, Matteo, SeanChan
  • US: Ajit
  • CERN: Alan, Dima, Julian, JeanRoc, Andrew
  • EU:

Personnel

  • Luis to Colombia around May 1-15th, working remotely (very little)
    • John will be in charge of the T0 while Luis is gone
  • Jorge is in Colombia
  • Matteo will be at CERN May 4-14

News - DIMA

  • TP workflows are getting late
    • can close and announce lumi-based at 80% and event-based at 50% or more
      • no need to force complete yet
  • Major Run2 DigiReco is delayed by ~2 weeks
    • they found another SW problem to fix and check, hence the 2 week delay; 25 million events
  • What is the latest on the TP workflows? (Event based splitting)
  • A few changes on WmAgentScripts, please pull latest version
    • resubmit.py
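
The closing thresholds quoted above for the TP workflows (80% for lumi-based, 50% or more for event-based) can be written as a small check. This is only a sketch: the function name, the splitting labels and the idea of passing completion as a fraction are assumptions for illustration, not part of WmAgentScripts.

```python
# Hypothetical helper: thresholds are the ones quoted in the notes,
# everything else (names, labels) is assumed for illustration.
CLOSE_THRESHOLDS = {
    "lumi": 0.80,   # lumi-based splitting: close and announce at 80%
    "event": 0.50,  # event-based splitting: close and announce at 50% or more
}


def can_close_and_announce(splitting, completion):
    """Return True if the workflow has reached its closing threshold.

    splitting  -- "lumi" or "event" (hypothetical labels)
    completion -- completed fraction of the request, between 0.0 and 1.0
    """
    if splitting not in CLOSE_THRESHOLDS:
        raise ValueError(f"unknown splitting type: {splitting!r}")
    return completion >= CLOSE_THRESHOLDS[splitting]
```

A workflow below its threshold is simply left running; per the notes there is no need to force complete yet.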

3 top issues affecting production

  • Problem with DBS calls in the new agent caused us to lose part of the weekend. Thank you Alan for getting this fixed.
  • Changes in walltime definition causing work to be put into condor limbo: https://cms-logbook.cern.ch/elog/Workflow+processing/19818
    • as of the afternoon (FNAL time) many of the workflows are running, but I still had a few with all jobs pending. Check in the AM.
    • There are still some jobs with a large MaxWalltime. Julian is on top of it and is changing MaxWalltimes, but that is only good for things we have to shove through quickly.
    • Alan is looking at the Event based ACDC, and then will work on the MaxWalltime bug
  • MaxWallTimeMins calculation for ACDCs (and some other workflows): check this https://cms-logbook.cern.ch/elog/Workflow+processing/19836
  • Wfs stopped in acquired
    • if they are just resource starved, we need to move them out of FNAL
    • supercomputer sites do not have data access, so where can we move them to?
    • the high priority stuff is running at FNAL, RAL, JINR, so we should move other stuff to other sites.
    • JeanRoc will take care of re-assigning them

Site support - John

  • Why are transfers to IN2P3 failing? Need to track down John to discuss this

Workflows

ReDigi

  • Ongoing upgrade, non-upgrade workflows continue to get older

TaskChains

  • Dave was complaining that TaskChains do something that can't be run at SanDiego. They are 2-step wfs where the output of the 1st step is needed to run the 2nd, so we need to be able to stage out to disk, which is the issue at SanDiego. A TaskChain with only one step could run there.
  • we can use the SD storage element, so we could use it like a T1; we just need to work on it now.
  • Can we send MC clones and resubmit there? yes
  • We should send as much pure gen MC at them as we can

MiniAODs

  • nothing

Rereco

  • nothing to report

Store Results

  • there are open tickets; Yuyi said the input was invalid, so it's back in the hands of CRAB3

MonteCarlo

Agent Issues

Redeployment Plan

  • We brought up new agents last week. vocms308 is already drained and the rest are getting there. If we see components down on draining agents, we should restart them.

RelVal - Andrew

  • What is the status of the GitHub issues?

L3 discussion - Ajit, Jean-Roch, Matteo

Opportunistic Resources - Stefan

HLT

  • Status of the current test - it is tested, how is it going?
    • Andrew L joined this list and there was an e-mail: we need to move it to the global pool. Andrew is helping to get this working, and nobody is injecting work until this is done. Lots of people were out the last week, so not much progress has been made.
  • Future plan

SDSC

  • Status of the special LHE request
  • Running DigiReco, what do we need?
    • we need input storage space. Discuss with Dave for more information, mostly about testing; things should work out of the box. If we don't use this resource now, we will never get it again. It has the potential to be a reliable T1/T2, but we need to really work at it!
  • TaskChains?

Automatic Assignment And Unified Software

AOB

Topic revision: r7 - 2015-04-23 - JenniferAdelmanMcCarthy
 