Reprocessing and Production Team Meeting - May 5, 2016 - VIRTUAL!

Vidyo Link

  • Virtual

Attending

  • FNAL: Jen, Scarlet, Jesus, Matteo
  • US: Gaston, Allie
  • CERN: JR
  • Korea:

Personnel

  • Youn on shift
  • CERN holidays in the next week, May 5 and 6.
  • Gaston in Colombia May 12-29, giving a talk on May 13
  • Allie will be gone April 29 through May 8.
  • Alan on holiday from May 5 to May 11 (inclusive).
  • Paola May 4, May 9 and May 13
  • Jen Vacation May 20, 1/2 day May 13

News - Dima

3 top issues affecting production

  • Lots of low-priority wmLHEGS work is taking over the system. Known issue, discussion ongoing. Is any action required?
  • File mismatch: delays in uploading are bad for bookkeeping. pdmvserv_SUS-RunIISummer15GS-00131_00284_v0__160323_032949_5198 has been in the list for a couple of weeks.
    • Should the running* => completed transition happen only once all files are in? Is there an API to check that this is all done? Is there any way to automate this? Please provide the script that checks the agents for files pending injection.
  • T0 merge: why do permission issues come and go?
  • /reqmgr2/data/request?status= is failing repeatedly, causing errors in Unified.
  • EventBased splitting and ACDC: what is the situation? What more can we do about this?
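
One possible mitigation on the client side for the repeatedly failing /reqmgr2/data/request?status= query is to retry with an increasing delay before reporting an error. The sketch below is only an illustration: call_with_retry and make_flaky are hypothetical names, not part of Unified or WmAgentScripts, and the real query would be wrapped in the zero-argument callable passed as fetch.

```python
import time


def call_with_retry(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is any zero-argument callable that raises on failure, e.g. a
    wrapper around the request-status query (hypothetical, not shown).
    The delay doubles after each failure: base_delay, 2x, 4x, ...
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:  # deliberately broad for a sketch
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError("gave up after %d attempts" % attempts) from last_error


def make_flaky(failures, result):
    """Stand-in for a flaky service: fails `failures` times, then returns result."""
    state = {"left": failures}

    def fetch():
        if state["left"] > 0:
            state["left"] -= 1
            raise IOError("simulated failure")
        return result

    return fetch


if __name__ == "__main__":
    # Simulated service that fails twice before answering; no real sleeping.
    print(call_with_retry(make_flaky(2, ["some_workflow"]), sleep=lambda s: None))
```

Passing sleep as a parameter keeps the sketch testable without waiting; in production the default time.sleep would be used.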

Site support - Gaston

Date                | Site                | Into the Waiting Room | Out of the Waiting Room | Into the morgue | Out of the morgue
2016-04-25 00:00:01 | T2_UK_London_Brunel | x                     |                         |                 |
2016-04-25 00:00:01 | T2_US_UCSD          | x                     |                         |                 |
2016-04-25 00:00:01 | T2_BR_SPRACE        | x                     |                         |                 |
2016-04-28 00:00:01 | T2_BR_UERJ          |                       | x                       |                 |
2016-04-28 00:00:01 | T2_BE_UCL           |                       | x                       |                 |
2016-04-28 00:00:01 | T2_EE_Estonia       | x                     |                         |                 |
2016-04-28 00:00:01 | T2_BR_SPRACE        |                       | x                       |                 |
2016-04-29 00:00:01 | T2_US_UCSD          |                       | x                       |                 |
2016-04-29 00:00:01 | T2_PL_Swierk        |                       | x                       |                 |
2016-04-30 00:00:01 | T2_UK_London_Brunel |                       | x                       |                 |
2016-04-30 00:00:01 | T2_RU_IHEP          |                       | x                       |                 |
2016-05-02 00:00:01 | T2_PL_Swierk        | x                     |                         |                 |
2016-05-03 00:00:01 | T2_DE_RWTH          | x                     |                         |                 |

Transfers - Jorge

Workflows

  • Lots of workflows are completing; lots of recoveries are pending.
  • Why do we have a two-step process for ACDC (creation, then assignment)? Does this require syncing the wmagent script? Let's please do this.

ReDigi

MiniAOD

TaskChains

StepChain

  • NA

Rereco

  • A recent rereco was taken back because of the wrong GT: we'll find a way for them to provide the rejection/invalidation to Unified (à la McM invalidation).
  • HIRun2015 reprocessing is still outstanding.

Store Results

MonteCarlo

Agent Issues

Agent redeployment

RequestMgr2 Migration

Merging Scripts

  • Needs testing; then make a PR and make the new scripts available to the group. Allie/Matteo/Paola will be testing next week. Plan to have this done by the next group meeting.
  • The merging changes are in this pull request: https://github.com/CMSCompOps/WmAgentScripts/pull/139.
  • reject.py/resubmit.py tested OK.
  • Do we want to merge resubmitUnprocessedBlocks.py and resubmitWithBlockBlacklist.py? Has Unified picked up the below functionality?
  • assignProdTaskChain.py was changed according to https://github.com/CMSCompOps/WmAgentScripts/pull/140; script tested OK.
  • What is left?
  • Automatic ACDC, step 1: start with a script that drills down through the various wmstats calls for a given workflow and exposes the main issues: error codes, sites, etc. Paola will be on top of that.
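
As a starting point for the drill-down script, the aggregation step could look roughly like the sketch below. It assumes the wmstats responses have already been flattened into a nested mapping {task: {exit_code: {site: failed_job_count}}}; that layout, the function names, and the sample values are all hypothetical and would need to be adapted to what wmstats actually returns.

```python
from collections import Counter


def summarize_failures(task_errors):
    """Rank (task, exit code, site) combinations by number of failed jobs.

    task_errors is a HYPOTHETICAL nested mapping
        {task_name: {exit_code: {site: failed_job_count}}}
    standing in for whatever the wmstats calls return.
    """
    counts = Counter()
    for task, by_code in task_errors.items():
        for code, by_site in by_code.items():
            for site, n_failed in by_site.items():
                counts[(task, code, site)] += n_failed
    # most_common() sorts by count, highest first
    return counts.most_common()


def top_issue_per_task(task_errors):
    """Return {task: (exit_code, site, count)} for the worst combination per task."""
    worst = {}
    for (task, code, site), n_failed in summarize_failures(task_errors):
        # The list is already sorted, so the first entry seen per task is its worst
        worst.setdefault(task, (code, site, n_failed))
    return worst


if __name__ == "__main__":
    # Made-up placeholder data, not taken from any real workflow.
    sample = {
        "taskA": {"1": {"T2_XX_SiteA": 5, "T2_XX_SiteB": 1}, "2": {"T2_XX_SiteA": 3}},
        "taskB": {"2": {"T2_XX_SiteB": 7}},
    }
    for key, n_failed in summarize_failures(sample):
        print(key, n_failed)
```

Keeping the aggregation separate from the wmstats queries means the ranking logic can be tested on canned data before it is pointed at a live workflow.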

RelVal - Andrew

AOB

Topic revision: r9 - 2016-05-12 - MatteoCremonesi
 