Indico link: https://indico.cern.ch/conferenceDisplay.py?confId=254662

Attending

Personnel

  • Coming off shift - Sara
  • Going on shift - Sunil
  • Jen will be taking some vacation time in mid-August; exact dates TBD
  • Edgar will be in Paris from Friday to Monday.

Site Issues

  • T2_US_Vanderbilt - not giving enough slots over weekend
  • ASGC back as opportunistic processing

  • T2_FI_HIP & T2_IT_Bari
    • moved out of drain - keeping an eye on these sites during this week

  • T2_GR_Ioannina & T2_AT_Vienna - need to be commissioned
    • Edgar doesn't have a lot of time to help with the commissioning
    • Do we have instructions written down so that John can do this? John has a basic list and will try to talk to Edgar when stuck.
    • Let's get this on a twiki linked off the Workflow Team Main twiki

| Site            | in MC WN | Status | Notes |
| T0_CH_CERN      | 2000     | down   | leave down |
| T1_RU_JINR      | 0        | skip   | waiting to be commissioned |
| T1_TW_ASGC      | 1500     | drain  | working in opportunistic mode |
| T1_UK_RAL_Disk  | 2000     | down   | leave down, it exists only for PhEDEx |
| T2_BR_UERJ      | 200      | drain  | network problems |
| T2_GR_Ioannina  | 0        | skip   | needs to be commissioned |
| T2_IN_TIFR      | 200      | drain  | keep in drain as long as possible - everything is an issue at this site |
| T2_RU_IHEP      | 1000     | drain  | in waiting room |
| T2_RU_INR       | 100      | drain  | network problem |
| T2_RU_JINR      | 1500     | drain  | network problem |
| T2_RU_RRC_KI    | 0        | drain  | network problem |
| T2_RU_SINP      | 100      | drain  | network problem |
| T2_TR_METU      | 200      | drain  | in waiting room |
| T2_AT_Vienna    | 400      | skip   | need to start commissioning - John |
| T2_FR_GRIF_IRFU | 0        | skip   | shares WNs with LLR - as long as we are using GRIF_LLR we don't need IRFU |
| T2_KR_KNU       | 300      | skip   | needs re-commissioning |
| T2_RU_PNPI      | 10       | skip   | in waiting room |
| T2_UA_KIPT      | 500      | skip   | too small to make it work |

Agents

  • Agents upgraded in previous week
  • Agents down/drain for upgrade
    • 201 - in drain for upgrade
  • Agents with other issues
    • Do we understand the dip in jobs on 235 over the weekend? It seemed to "fix itself", which always makes me nervous.

Workflows

IEEE Paper

Draft Outline #1

  • Introduction (Why we need to run so many simulations, why we need to do a re-reconstruction of the data) (Edgar/Jen)
  • How we ran prior to 2011
    • ProdAgent vs WMAgent (Diego/Alan) (Focus on differences and improvements)
    • Reprocessing and Production (Jen/Xavier) (How this was handled with ProdAgent and why we needed to move to another framework)
  • How we ran with WMAgent (after 2011)
    • WMAgent/ReqMgr/WorkQueue (Diego/Edgar/Alan) General comment on how it works
    • PREP/ReqMgr interaction (Vincenzo?)
    • Organization of the workflow team and operations around it (Edgar)
  • Achievements
    • Events reconstructed (L3s)
    • Usage of the grid (Edgar/Jen/L3s)
  • Conclusions / Outlook (Edgar/Jen)

Action Items

  • Recovery workflows - Jen - ongoing
    • Diego got us an updated recovery workflow script!
    • discovered that the recovery workflows were creating some duplicates
  • We need to add a daily report on workflow stats
    • How many workflows are running, pending, waiting, stuck
      • Jen - come up with a template report
      • Edgar - please comment on workflow statuses. I feel like we are not always communicating which workflows are in a waiting status, and for which issues.
      • Diego - how hard would it be to have a "manual switch" that we can set on workflows for "waiting"? If there is a group of workflows where we are waiting to hear back from a site or the requesters before closing out, we could put those workflows in "waiting", so that everything in "complete" really is either ready to be closed or in need of a look.
  • Diego - Can we have the script you wrote for finding stuck workflows?
    • Diego will put it in a public place so we can add it to svn
    • Is it documented yet?
      • Need to pull the documentation out of the e-log and put it on the twiki
  • Problems with dbsTest.py https://cmslogbook.cern.ch/elog/Workflow+processing/8656
    • Edgar, have you looked at this yet?
    • Not solved yet; Edgar will look at it. The script is made to look at DAS run by run, which is slow. Maybe we need to think about splitting the script.
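
As a starting point for the daily report template mentioned in the action items, here is a minimal sketch in Python. It is only an illustration: the function name, the input format (a list of workflow name and status pairs) and the status labels are assumptions made for this example, not an existing WMAgent or ReqMgr interface, and where the list of workflows comes from is left open.

```python
from collections import Counter

# Status labels taken from the action item; the exact names are an assumption.
REPORT_STATUSES = ("running", "pending", "waiting", "stuck")


def build_daily_report(workflows, report_date):
    """Return a plain-text summary of how many workflows are in each status.

    `workflows` is an iterable of (workflow_name, status) pairs. How that
    list is obtained (a script, a manual dump, ...) is not covered here.
    """
    counts = Counter(status for _, status in workflows)
    lines = ["Workflow stats for %s" % report_date]
    for status in REPORT_STATUSES:
        lines.append("  %-8s %d" % (status, counts.get(status, 0)))
    # Anything outside the four statuses is lumped together so it is not lost.
    other = sum(n for s, n in counts.items() if s not in REPORT_STATUSES)
    lines.append("  %-8s %d" % ("other", other))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical workflow names, for illustration only.
    sample = [
        ("example_wf_a", "running"),
        ("example_wf_b", "running"),
        ("example_wf_c", "waiting"),
        ("example_wf_d", "complete"),
    ]
    print(build_daily_report(sample, "2013-07-09"))
```

Keeping "waiting" as its own line in the report would make it visible when workflows are parked pending a reply from a site or a requester, which is the communication gap raised above.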

AOB

  • None!
Topic revision: r5 - 2013-07-09 - EdgarFajardo
 