Luis (FNAL), Diego (FNAL), John, Andrew, Edgar, Dorian


  • Coming off Shift - Sunil
  • Coming on Shift - Sunil
  • Jen will be on vacation Aug 9-21 and will have very little Internet access during this time.
  • Edgar will not be taking any time off during this period, so we are covered.
  • Dorian will be watching things during US hours while Jen is on vacation.
  • Dorian will start watching this so you can ramp back up.

Issues last week

Site Issues

Sites for Production

| Site | MC Slots | Status | Notes | Issues |
| T2_AT_Vienna | 212 | skip | commissioned | 32% failure rate - CVMFS black hole on Saturday |
| T2_GR_Ioannina | 94 | skip | under commissioning | 100% failure rate - Aug 08: WF reassigned |
| T2_UA_KIPT | 200 | skip | commissioned | 0% failure rate - running 116 jobs |
| T2_UK_London_Brunel | 1000 | drain | Aug 09: request from the site | It seems that they had at least one black-hole node due to CVMFS issues. |


  • 237 was upgraded to the new version and now uses Oracle.


  • EXO and TSG: waiting for Ajit and Vincenzo to give these back to the requestors. The time/event value was set very badly.

IEEE Paper

Draft Outline #1

  • Introduction (why we need to run so many simulations, why we need to do a re-reconstruction of the data) (Edgar/Jen)
  • A brief discussion of what the different types of workflows are, and how they are processed differently (Diego/Jen/Edgar)
  • Monitoring for T1 & T2 sites (Diego/Jen/Edgar)
  • How we ran prior to 2011
    • ProdAgent vs WMAgent (Diego/Alan) (focus on differences and improvements)
    • Reprocessing and Production (Jen/Xavier) (how this was handled with ProdAgent and why we needed to move to another framework)
  • How we ran with WMAgent (after 2011)
    • WMAgent/ReqMgr/WorkQueue (Diego/Edgar/Alan) (general comment on how it works)
    • PREP/ReqMgr interaction (Vincenzo?)
    • Organization of the workflow team and operations around it (Edgar)
  • Achievements
    • Events reconstructed (L3s)
    • Usage of the grid (Edgar/Jen/L3s)
  • Conclusions / Outlook (Edgar/Jen)

Action Items

  • Write twiki on disk/tape separation at T1_IT_CNAF - Edgar
  • SVN - IEEE paper - Edgar. Done.
  • Recovery workflows - Jen - suspend
    • The first 2 workflows are completely through, and we are now waiting for people to look closely and make sure there are no showstoppers before we do the other 50.
    • Guillemo is asking JeanRoc whether people have actually looked at the data.
  • We need to add a daily report on workflow stats - still needs debugging work
    • A new state for completed and already dealt with ACDC.
    • How many workflows running, pending, waiting, stuck
    • Is it documented yet?
      • Need to pull documentation out of the e-log and put it on the twiki - Jen - Done
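
A minimal sketch of what the daily workflow stats report from the action item above could look like. The state labels come from the bullet list (running, pending, waiting, stuck); the workflow names, the input format, and both function names are hypothetical, and the real ReqMgr/WMAgent state names may differ. The proposed new state for workflows already dealt with through ACDC could be added to the same tuple once it exists.

```python
from collections import Counter

# States named in the action item; the actual ReqMgr/WMAgent state names
# are an assumption and may differ.
REPORT_STATES = ("running", "pending", "waiting", "stuck")


def summarize_workflows(workflows):
    """Count workflows per state; unrecognized states are grouped as 'other'."""
    counts = Counter()
    for _name, state in workflows:
        counts[state if state in REPORT_STATES else "other"] += 1
    return counts


def format_report(counts):
    """Render one line per state, in a fixed order, for a daily post."""
    lines = [f"{state}: {counts.get(state, 0)}" for state in REPORT_STATES]
    lines.append(f"other: {counts.get('other', 0)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample input: (workflow name, state) pairs.
    sample = [
        ("wf_example_1", "running"),
        ("wf_example_2", "running"),
        ("wf_example_3", "stuck"),
        ("wf_example_4", "completed"),
    ]
    print(format_report(summarize_workflows(sample)))
```

Keeping the counting separate from the formatting means the same counts could feed an e-mail, a twiki table, or a plot without changing how states are tallied.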


Topic revision: r3 - 2013-08-14 - EdgarFajardo