Workflow Team Meeting - Jan 22 4PM CERN time
Vidyo Link
Attending
- FNAL: Jen, Luis, SeangChan
- Ian, Ajit
- EU: Julian, Andrew and Alan, Andrew
Personnel
- We have a new postdoc starting next week at FNAL who will be taking over Redigi coordination. I will start him out by teaching him operations.
- First week of Feb we will have another US operator starting
News
- Upgrade workflows should be available for assignment in the next 24-48 hours. Once they arrive they are absolute top priority; they were needed last week.
- MinBias already staged
- If needed, kill backfill.
- any specific rules? What %? Are we making ACDCs? Do we have a drop-dead date?
- Andrew doesn't know, need to get more info from Dima when he is around
- nothing else significant/critical is expected for at least a week.
- the 1B campaign is delayed because of validation, and the configuration is not finalized
- it's likely that we will need to redo 300 requests for the RunIIFall14GS campaign. No decision has been made yet. PPD is evaluating the impact of the bug that they found.
- we don't know if it's a gen-sim again or a digi reco
- no clue in what time frame
- Transfer Team set up a new Phedex instance for tracking logs:
- Dataset name convention will be /YEAR/MONTH/WORKFLOW_NAME, with YEAR and MONTH taken from the creation date
- They'll pull every new file that appears on Castor.
- does this name convention make sense?
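To make the proposed convention concrete, here is a minimal Python sketch of how a log dataset name could be built. The function names are hypothetical, the two-digit month is an assumption (the notes do not say whether MONTH is zero-padded), and reading the `141218_011408` token in a workflow name as a YYMMDD_HHMMSS creation stamp is also an assumption, not something the Transfer Team has confirmed.

```python
import re
from datetime import date, datetime


def creation_date_from_name(workflow_name):
    """Guess the creation time from a trailing _YYMMDD_HHMMSS_NNNN token.

    Assumption: the two six-digit groups near the end of the workflow
    name are a date and time stamp. Returns None if no such token exists.
    """
    match = re.search(r"_(\d{6})_(\d{6})_\d+$", workflow_name)
    if match is None:
        return None
    return datetime.strptime(match.group(1) + match.group(2), "%y%m%d%H%M%S")


def log_dataset_name(workflow_name, created):
    """Build /YEAR/MONTH/WORKFLOW_NAME from the creation date.

    Assumption: MONTH is written as two digits (01-12).
    """
    return "/{:04d}/{:02d}/{}".format(created.year, created.month, workflow_name)


if __name__ == "__main__":
    wf = "pdmvserv_MUO-Phys14DR-00009_00087_v0__141218_011408_4922"
    created = creation_date_from_name(wf)
    print(log_dataset_name(wf, created))
```

One consequence worth checking with the Transfer Team: under this reading, all workflows created in the same month share the /YEAR/MONTH prefix, so the workflow name alone has to keep dataset names unique.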
EU Operators Meeting Notes
- Julian needs to get in touch with Christoph about EU operators. Any news yet?
Site support
- the site support team is updating info on the site status board
Agent Issues
Redeployment plan
- Submit2 redeployed on Wed
- Global Pool
- production SL6: submit1 (up), submit2 (up), cmssrv217 (up), 218 (up), 219 (up)
- Production Pool:
- vocms202 (reproc), vocms237 (step0) and vocms235 (mc) retired.
- Next in line: vocms201
Workflows
- Backfill
- is it just me, or is work getting "stuck" behind the low-priority backfill? I had a lot of jobs just sit in queued for a day even though there wasn't anything higher priority running. I bumped up the priority of the WFs and things started moving again. I also force-completed a few old WFs that were stuck. We've also been restarting components, so a number of things could be happening, but we need to make sure we keep an eye on this!
- we have been tracking sites that are not running properly, and the glidein team is watching it
- pdmvserv_MUO-Phys14DR-00009_00087_v0__141218_011408_4922
MiniAODs
- 2 MiniAODs with 100% ProductNotFound errors: they need to be sent back to the requestor
- these are the 2 oldest requests in the system. Can we PLEASE get them returned so we can reject them!
Rereco
- Manually closed out the last of the ReRecos; too many lumis per job to process any further
Store Results
- A few requests this week (3 new, 1 from last week)
- Documentation and scripts updated
AOB
--
JenniferAdelmanMcCarthy - 2015-01-21
Topic revision: r4 - 2015-01-22 - JenniferAdelmanMcCarthy