Workflow Team Meeting - Jan 29 4PM CERN time

Vidyo Link

Attending

  • FNAL: Dave, Jen, Matteo, Luis, SeangChan, Jorge
  • US: Ian, Stefan, Ajit
  • EU: Julian, Andrew Levin

Personnel

  • New postdoc Matteo Cremonesi
    • will be 50% physics, 50% computing; just graduated
    • has accounts everywhere
    • Together with Oli, the Christophs will figure out exactly what he will be doing; for now he is roughly taking over Redigi
  • New US Operator coming to FNAL next week for training
    • Stefan Piperov - knows transfers, did workflow stuff in the Prod Agent Era
      • is working on opportunistic sites
      • Physics, started in computing in 2006 doing HCal
      • at CERN doing computing and transfer 2011-2013
      • wants to learn the agent and get ramped up at NERSC to get opportunistic resources
    • Bo - coming out of CDF
      • opportunistic resources; will be at FNAL next week Mon-Wed
  • John will be back next week
  • Julian will be off Feb 10-13 (in Istanbul); calendar already updated
  • Alan is on vacation until Friday in Russia

News

  • Amazon - CF - we are no longer doing this; it is dead.
  • Upgrade workflows - going fairly smoothly, except for issues with jobs attempting to read invalidated files
    • closing out WFs at 95%
  • What to expect (David Lange)
    • Dave or Dima should reply to this e-mail with the timescale and priority of these workflows. This looks to be ~1 month of keeping the T1s busy
    • Upgrades:
      • Expect approximately 4 (+/-1) additional sets of 16 million high-pileup digi-reco, in addition to the one being finished now
      • Expect approximately 2 (+/-1) sets of GEN-SIM (also ~16 million plus some minbias)
    • GEN-SIM for startup:
      • The ~200 million done so far will be scrapped and resubmitted (software bug that has been deemed important enough)
      • Most other samples are held up by generator readiness
      • Dave is going to suggest that they rename this era to something sane
    • DIGI-RECO for startup:
      • Still planned to start at the end of March
    • Other (10 to 100 million):
      • (Possibly) repeat of "GEN validation" campaign in 53x
      • Finalizing E/gamma and JetMet calibrations
      • Samples for electron/muon ID development
      • Samples for PF paper (would be partly 74x)
  • The 1B campaign is delayed - any fresh news on this?
    • part of this is the ~200 million events being scrapped (see above)
  • It's likely that we will need to redo 300 requests for the RunIIFall14GS campaign.
    • any word on this?

EU Operators Meeting Notes

  • More info on S. Korean operators - not yet; still working through the politics. One senior person and 1-2 students; we don't know yet how much of their time we will have.

Site support

Agent Issues

  • Agents were fairly unstable over the weekend but seem to be doing better now.
  • Friday night/over the weekend, when the upgrade workflows were hitting hard, backfill was cleared out, and that took out all of the agents. If you kill a lot of jobs in condor, the agents start to fail. The agent should bring the high-priority work in; it doesn't kill the jobs already running.
  • We will be full all the time, and it will be real work that we can't abort. There will always be very high priority work coming in, though, and it needs to be able to take over even if we have other work running.
  • We need to get priorities settled so that when new high-priority work comes in it can really take over!
  • One agent is sucking down all the work. Early next week, put in fresh backfill and see if we can reproduce what we saw and get it working right, when SeangChan and Alan can watch... not on a weekend.

Redeployment plan

  • Submit2 redeployed on Wed
    • Global Pool
      production SL6
      submit1 (up)
      submit2 (up)
      cmssrv217 (up)
      218 (up)
      219 (up)
      vocms0308 (down)
      vocms0309 (down)
      vocms0310 (down)
    • Production Pool:
      mc/reproc SL5
      vocms216 (up)
      vocms234 (up)
  • vocms201 - retired
  • CERN machines installed and tested:
    • We will wake them up when one of the FNAL agents reaches 75% disk usage
    • The idea: drain submit1 and submit2 and use them as backup.
    • Please check that you have access to the machines!
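
The wake-up rule above could be sketched roughly as follows. This is only an illustration: the function names, the input dictionary, and the sample usage figures are hypothetical, and only the 75% threshold comes from the notes.

```python
# Hypothetical sketch of the wake-up rule: bring the CERN agents up
# once any FNAL agent reaches the disk threshold noted above (75%).
DISK_THRESHOLD = 0.75  # from the notes; everything else is illustrative


def agents_over_threshold(disk_usage, threshold=DISK_THRESHOLD):
    """Return (sorted) names of agents whose disk usage fraction is at or above threshold."""
    return sorted(name for name, used in disk_usage.items() if used >= threshold)


def should_wake_cern_agents(disk_usage, threshold=DISK_THRESHOLD):
    """True if at least one FNAL agent has reached the threshold."""
    return bool(agents_over_threshold(disk_usage, threshold))


# Made-up usage fractions, for illustration only.
fnal_usage = {"submit1": 0.62, "submit2": 0.78, "cmssrv217": 0.40}
print(should_wake_cern_agents(fnal_usage))
print(agents_over_threshold(fnal_usage))
```

How the disk usage would actually be collected from the agents is not covered in the notes and is left out here.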

Workflows

ReDigi

  • All old Redigis have been cleared out. Happy dance!
  • What are we checking for when closing out Redigis? Do we want events or lumis to match? Dave and Julian to discuss.
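
To make the events-versus-lumis question concrete, here is a minimal sketch that evaluates both criteria side by side. All names and numbers are hypothetical, and the 0.95 default is borrowed from the upgrade-workflow closeout noted under News; whether it should apply to Redigis is exactly the open question.

```python
def completion_fraction(produced, expected):
    """Fraction of the expected count that was actually produced."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return produced / expected


def closeout_check(events_out, events_expected, lumis_out, lumis_expected,
                   threshold=0.95):
    """Report whether each criterion (events, lumis) meets the threshold.

    The threshold default is an assumption, not an agreed Redigi policy.
    """
    events_frac = completion_fraction(events_out, events_expected)
    lumis_frac = completion_fraction(lumis_out, lumis_expected)
    return {
        "events_ok": events_frac >= threshold,
        "lumis_ok": lumis_frac >= threshold,
    }


# Made-up counts: the two criteria can disagree for the same workflow.
print(closeout_check(events_out=960, events_expected=1000,
                     lumis_out=90, lumis_expected=100))
```

With these made-up counts the workflow passes on events but fails on lumis, which is why the choice of criterion matters.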

miniaods

  • All old miniaods have been cleared out!

Rereco

Store Results

MonteCarlo

RelVal Andrew

AOB

-- JenniferAdelmanMcCarthy - 2015-01-28
Topic revision: r5 - 2015-02-04 - JenniferAdelmanMcCarthy