Workflow Team Meeting - March 31 4PM CERN, 10 FNAL time
Vidyo Link
Attending
- FNAL: Jen, Gaston, Jesus, Scarlet, SeangChan, Jorge
- US: Matteo, Alli
- CERN: Paola, Dima, JeanRoc, Sebastian (welcome, new transfer person)
Personnel
- Jorge to Columbia April 15-May 2, Talk on April 27
- SeangChan taking 1/2 day next Thurs and Fri off
News - Dima
- The next big reprocessing tests are getting going. The big test workflow was just injected with priority 150, so it should take over everything, and we need to pay attention to it. We should be running ~70K/4 cores in parallel. If this goes well, we will know that we are ready to run multicore for real.
- keep an eye on pdmvserv_task_TOP-RunIISpring16DR80-00001__v1_T_160331_151408_3872
- We need to prioritize running at the T0 and fully saturate it (13K jobs); jobs were crashing.
- We have a problem: the T0 was down, and there was something wrong with the production status metric.
- vlimant_BTV-RunIISummer15GS-Backfill-00048_00275_v0__160330_122825_8568 - is the test workflow
- The April Fools reprocessing will hopefully start on Monday. We still need to prestage data, which is not ready yet; Dima is working on cleaning things up by the end of the day so things can be prestaged. We will be using high priority transfers to get things staged quickly.
3 top issues affecting production
- ReReco workflow not being picked up by Unified
- just testing and waiting
- stuck workflows - there was a problem with T0_CH_CERN and Alan helped us clear them out
Site support - Gaston
| 2016-03-25 00:00:01 | T2_EE_Estonia | x | | | |
| 2016-03-25 00:00:01 | T2_IN_TIFR | x | | | |
| 2016-03-30 00:00:01 | T2_PK_NCP | | | x | |
| 2016-03-30 00:00:01 | T2_UK_London_Brunel | x | | | |
- There is a problem with the production script: all the T3s are listed as on for production.
- There is a change in how the production metric will be used; Gaston will work with the team to get it set properly.
Transfers - Jorge
- No new missing files at KIT, so thank you Alan and Jorge! Old workflows may still give us missing files, so the workflow sandboxes will be affected. The changes went in on March 16, so anything older than that we may have to redo; anything newer than that should be fine.
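The March 16 cutoff can be checked against a workflow name. This is only a rough sketch, and it rests on assumptions not stated in these notes: that the trailing `_YYMMDD_HHMMSS_<number>` in a workflow name is its injection time, that this is the age that matters for the sandbox change, and that "older" means strictly before March 16. The helper names `injected_at` and `may_need_redo` are hypothetical.

```python
import re
from datetime import datetime

# Assumption: the sandbox changes went in on 16 March 2016 (from the notes),
# and anything stamped strictly before that day counts as "older".
CUTOFF = datetime(2016, 3, 16)

# Assumption: workflow names end in _YYMMDD_HHMMSS_<number>.
STAMP = re.compile(r"_(\d{6})_(\d{6})_\d+$")


def injected_at(name):
    """Return the timestamp embedded in a workflow name, or None if absent."""
    match = STAMP.search(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1) + match.group(2), "%y%m%d%H%M%S")
    except ValueError:
        # Digits were present but do not form a valid date and time.
        return None


def may_need_redo(name):
    """True if the name carries a timestamp earlier than the cutoff."""
    stamp = injected_at(name)
    return stamp is not None and stamp < CUTOFF
```

For the workflow names quoted in these notes, `may_need_redo` flags the original fabozzi Cosmics workflow (stamped 160302) and not the workflows stamped 160330 or 160331.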
Workflows
- All work backlog is current to the past month!
Rereco
- fabozzi_Commissioning2015-Cosmics-Boff-01Mar2016_763p2_160302_100618_8061 needs to be cloned; this one got lost in file invalidation land
- it has been cloned: jen_a_Commissioning2015-Cosmics-Boff-01Mar2016_763p2_160331_172001_8114
Store Results
Agent Issues
Agent redeployment
Merging Scripts
AOB
--
JenniferAdelmanMcCarthy - 2016-03-29
Topic revision: r6 - 2016-03-31 - JenniferAdelmanMcCarthy