Reprocessing and Production Team Meeting - May 12, 4PM CERN time, 9AM FNAL time

Vidyo Link

Attending

  • FNAL: Jen, Jorge, Jesus, Matteo, SeangChan
  • US: Allie
  • CERN : Alan, Paola, Andrew, JeanRoch
  • Korea :

Personnel

  • Gaston in Colombia May 12-28, talk on the 13th
  • Paola May 13
  • Jen half day on May 13 and May 20

News - Dima

  • New Shifters! Mykola and Svenja from DESY University.
    • Paola & Alan will do some training with them next week

3 top issues affecting production

  • Lots of submit failures at FNAL, KIT, RWTH and PISA that need manual cleanup
  • Where are we in testing the merge issues at T0_CH_CERN?
  • The script has been integrated; I downloaded it on Wednesday and it is working perfectly! Thanks Paola!
  • File mismatch: the delay in uploading is bad for bookkeeping. pdmvserv_SUS-RunIISummer15GS-00131_00284_v0__160323_032949_5198 has been in the list for a couple of weeks
    • running* => completed transition only once all files are in? Is there an API to check that this is all done? Is there any way to have this automated? Please provide the script that checks the agents for files pending injection.
  • T0 merge: how come the permission issues come and go?
  • /reqmgr2/data/request?status= is failing repeatedly, causing errors in Unified.
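
For the file mismatch item above, the check being asked for boils down to a set comparison between the files a workflow produced and the files already injected. A minimal sketch under assumptions, not the existing script: `expected_lfns` and `injected_lfns` are hypothetical inputs that would have to be filled from the agents and from bookkeeping, and no real WMAgent or DBS call is used.

```python
def pending_injection(expected_lfns, injected_lfns):
    """Return the sorted list of files that are expected but not yet injected."""
    return sorted(set(expected_lfns) - set(injected_lfns))


def can_complete(expected_lfns, injected_lfns):
    """True only when no file is still pending injection."""
    return not pending_injection(expected_lfns, injected_lfns)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    expected = ["/store/a.root", "/store/b.root", "/store/c.root"]
    injected = ["/store/a.root", "/store/c.root"]
    print("pending:", pending_injection(expected, injected))
    print("can complete:", can_complete(expected, injected))
```

If something like this ran per workflow, the running* => completed transition could be held back while the pending list is non-empty.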

Site support - Gaston is on his way to Colombia

  • T2_CH_CSCS - where are we in the testing?
    • Last week's test failed but this week's is OK, so Paola will report her findings in a GGUS ticket

| Date | Site | Into the Waiting Room | Out of the Waiting Room | Into the morgue | Out of the morgue |
| 2016-04-21 00:00:01 | T2_US_Caltech | x | | | |
| 2016-04-21 00:00:01 | T2_PL_Swierk | x | | | |
| 2016-04-22 00:00:01 | T2_TW_NCHC | x | | | |
| 2016-04-22 00:00:01 | T2_IT_Bari | | x | | |
| 2016-04-24 00:00:01 | T2_IN_TIFR | | x | | |
| 2016-04-25 00:00:01 | T2_US_Caltech | x | | | |
| 2016-04-25 00:00:01 | T2_UK_London_Brunel | x | | | |
| 2016-04-25 00:00:01 | T2_US_UCSD | x | | | |
| 2016-04-25 00:00:01 | T2_BR_SPRACE | x | | | |

Transfers - Jorge

  • there is a GenSim dataset stuck in transfers that Jorge is working on

Workflows

  • Lots of workflows are completing, lots of recoveries pending
    • Of the smaller workflows that just finished, 25% went into recovery; it would be interesting to keep track of this number over time.
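
One way to keep track of the recovery rate over time is to append a dated row to a small CSV file after each check. This is only a sketch: the counts are assumed to be obtained by hand or from some other tool, and the file path and column layout are hypothetical.

```python
import csv
import datetime


def recovery_fraction(n_finished, n_in_recovery):
    """Fraction of finished workflows that went into recovery (0.0 if none finished)."""
    if n_finished <= 0:
        return 0.0
    return n_in_recovery / float(n_finished)


def append_record(path, n_finished, n_in_recovery, day=None):
    """Append one row (date, finished, in recovery, fraction) to a CSV file."""
    day = day or datetime.date.today()
    fraction = recovery_fraction(n_finished, n_in_recovery)
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerow(
            [day.isoformat(), n_finished, n_in_recovery, "%.3f" % fraction]
        )
    return fraction
```

Calling `append_record("recovery_rate.csv", 200, 50)` would log a fraction of 0.25 for that day; the resulting file can be plotted later to see the trend.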

ReDigi

MiniAOD

TaskChains

StepChain

  • NA

Rereco

Store Results

MonteCarlo

Agent Issues

Agent redeployment

RequestMgr2 Migration

Merging Scripts

  • The merging changes are in this pull request: https://github.com/CMSCompOps/WmAgentScripts/pull/139
  • reject.py/resubmit.py tested OK.
  • Do we want to merge resubmitUnprocessedBlocks.py and resubmitWithBlockBlacklist.py? Has Unified picked up the below functionality?
  • assignProdTaskChain.py was changed according to https://github.com/CMSCompOps/WmAgentScripts/pull/140, script tested OK. Are these changes propagated to the new merged assign.py?
  • What is left?
  • Automatic ACDC, step 1: start with a script that drills down through the various wmstats calls for a given workflow and exposes the main issues: error codes, sites, etc. Paola will be on top of that.
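
The summary part of ACDC step 1 could look roughly like the sketch below. It assumes the wmstats results have already been flattened into a list of records with `exit_code`, `site` and `count` keys; that record shape is hypothetical and is not the actual wmstats response format, and the exit codes in the sample are made up.

```python
from collections import Counter


def summarize_failures(records, top=3):
    """Aggregate failure counts per exit code and per site, most frequent first."""
    by_code = Counter()
    by_site = Counter()
    for rec in records:
        by_code[rec["exit_code"]] += rec["count"]
        by_site[rec["site"]] += rec["count"]
    return {
        "total": sum(by_code.values()),
        "top_codes": by_code.most_common(top),
        "top_sites": by_site.most_common(top),
    }


if __name__ == "__main__":
    # Made-up records for illustration; real ones would come from wmstats.
    sample = [
        {"exit_code": 111, "site": "T1_US_FNAL", "count": 40},
        {"exit_code": 222, "site": "T2_CH_CSCS", "count": 12},
        {"exit_code": 111, "site": "T2_CH_CSCS", "count": 5},
    ]
    print(summarize_failures(sample))
```

Ranking by count keeps the output short enough to read per workflow before deciding whether an ACDC is worth submitting.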

RelVal - Andrew

AOB

-- JenniferAdelmanMcCarthy - 2016-05-12
