https://indico.cern.ch/conferenceDisplay.py?confId=254692

Attending

  • Jen, Dave, Seangchan, John, Luis, Julian, Alan, Andrew, Adli, Sunil

Shift

  • Feb 4-11 Sunil
  • Feb 11-18 Xaviar
  • Anybody who can put time in this next week is encouraged to do so!
  • Please e-log when you come on and off shift; I am seeing very little traffic from our shifters.

Extra Meetings this week:

  • DBS Upgrade meeting tomorrow at 5: https://indico.cern.ch/conferenceDisplay.py?confId=298463
  • Thurs we will have a Bonus Workflow Team Meeting at 5:30 CERN time. Jen will set up Vidyo

Issues

  • We are pushing the agents to their limits right now. It is important that the shifters log onto all the machines and do a df to make sure that no disk is getting over 90% full. When we get above that threshold couch starts having issues and the agent goes down. Let's try to stay ahead of this!
  • Where are we in clearing out MC? I am focusing pretty much completely on Redigi. Julian, where does MC stand?
    • Fewer than 20K jobs pending; I think we're going to be ready by tomorrow afternoon.
  • Any WF's not at 80% by Thurs will be killed and cloned after the upgrade
  • Changes in clone script - Luis
  • Changes in scripts for the dbs3 upgrade - Luis & Julian
  • New script for setting thresholds: 12492
    • I haven't checked out the new procedure. Has anybody but Luis or John done so?
  • Clearing out Redigi WF's - Jen, Dave and Andrew
    • KIT - went through these Mon. Most are reading all the data from FNAL via xrootd; fingers crossed that they will run.
    • IN2P3 - Dave went through these Monday; we are in "kill and clone" mode for them. Luis made some tweaks to the resubmit.py script and is cloning them tonight. We will see in the morning how they are going.
    • Other sites:
      • A large number of WF's are showing duplicates in their outputs. For the ones whose inputs we have checked, the inputs appear to be OK. The following e-logs document this issue and how we are working through it: 12485, 12460, 12200
    • PIC WF's - no errors and WF's not at 100%, tried clone and it didn't fix the issue
      • pdmvserv_HIG-Fall11R2-01424_T1_ES_PIC_MSS_00019_v0__140120_140109_7919 12292
      • pdmvserv_TOP-Summer12DR53X-00187_T1_ES_PIC_MSS_00105_v0__131208_161000_3034 12449
    • CNAF WF's - these all have file read errors where recovery/acdc did not fix the issue. Input blocks have been checked.
    • RAL - errors even with 100% of datasets in place. Check with Dave, but we may have to clone?
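
The disk check asked of shifters at the top of this list (a df on every machine, with 90% as the danger line) could be scripted roughly as below. This is only a sketch using Python's standard library, not an existing team script: the function names and the default path "/" are assumptions, and the real partitions used by the agent and couch would need to be filled in. The percentage is computed as used/total, so it may differ slightly from the Use% column that df prints.

```python
import shutil

# Threshold taken from the note above: couch starts having issues above 90%.
DISK_FULL_THRESHOLD = 90.0


def percent_used(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def over_threshold(paths, threshold=DISK_FULL_THRESHOLD):
    """Return {path: percent_used} for every path above `threshold`."""
    result = {}
    for path in paths:
        pct = percent_used(path)
        if pct > threshold:
            result[path] = pct
    return result


if __name__ == "__main__":
    # Hypothetical list of mount points; replace with the agent's partitions.
    for path, pct in over_threshold(["/"]).items():
        print(f"WARNING: {path} is {pct:.1f}% full")
```

Run on each agent machine, it prints a warning only for partitions above the threshold, so an empty output means the machine is fine.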

  • Who has time to check whether there are any datasets in dbs2 or dbs3 that are not in the other database? Yuyi will want this info for the dbs3 upgrade meeting Wed morning. Julian? Sunil?
    • There are some files. Andrew will post the difference and tell us what they are and when the WF's ran. We need this for tomorrow's meeting.
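
For the dbs2/dbs3 comparison above, once the names (datasets or files) have been dumped from each database by whatever means, the difference itself is a pair of set subtractions. The sketch below assumes the two dumps are already available as lists of strings; it does not query DBS, and the function name and sample names are made up for illustration.

```python
def compare_names(dbs2_names, dbs3_names):
    """Return (only_in_dbs2, only_in_dbs3) as sorted lists.

    Both arguments are iterables of name strings. Surrounding whitespace
    is stripped and empty entries are ignored.
    """
    dbs2 = {name.strip() for name in dbs2_names if name.strip()}
    dbs3 = {name.strip() for name in dbs3_names if name.strip()}
    return sorted(dbs2 - dbs3), sorted(dbs3 - dbs2)


if __name__ == "__main__":
    # Made-up names, only to show the shape of the result.
    only_dbs2, only_dbs3 = compare_names(
        ["/A/B/C", "/D/E/F"],
        ["/D/E/F", "/G/H/I"],
    )
    print("Only in dbs2:", only_dbs2)
    print("Only in dbs3:", only_dbs3)
```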

Site issues - John

Andrew's questions

AOB

  • Next week should be much more sane than the last month and a half have been. Let's take next week's meeting to go through all the scripts we use (Julian, where is the list?) and vet them. Which ones are actually used? Which ones can go away? For the ones we use, have they been properly updated for the DBS3 transfer?
  • We had to hack the dbsTest.py script over the weekend to get things to close out. We need to formalize the changes and get them checked into git.
  • Next Challenge after migration: Unifying mc, mc_highprio, reproc_lowprio, reproc_high_prio teams:
    • Fewer agents
    • Request-based priority on global workqueue.
    • Make agents move to drain automatically when they get above 80% once we have teams unified - new GitHub issue - Luis
-- JenniferAdelmanMcCarthy - 04 Feb 2014
Topic revision: r5 - 2014-02-04 - JenniferAdelmanMcCarthy