https://indico.cern.ch/conferenceDisplay.py?confId=254678
Attending:

Personel:
  • Oct 22 --> Oct 29 Julian (+Adli)
  • Oct 29 --> Nov 4 Adli
  • Edgar is gone Nov 8th now?? or are you officially gone on the 31st?
  • John is gone Oct 23-Nov 11
Agent issues
  • vomcs202 (reprocessing agent): up un running
  • FNAL not getting any jobs: behaving normally
  • WMAgent issues:
    • v0.9.82 deployed in vocms202, vocms235, cmssrv98, and vocms85. Next: vocms216, vocms234.
    • Oracle on vocms85
  • disk full problem: warning patch is being tested
  • PhEDEx subscriptions issue: solved

Workflow issues:

  • Workflow with massive fail - Input Data invalidation.
    • once the files were invalidated it moved along nicely and is now out of our hands
  • We have a number of workflows that are not at 100%, but we've run ACDC and have no failures.
  • we have made HUGE strides in understanding why WF's are getting stuck and having the stuck list back under control.. for now... please keep up the good work everyone!
"backfill"
  • due to recent agent issues, and the fact that we currently do not have a lot of jobs running to keep sites busy Oli has requested that we run "backfill" which is basically running a known job over and over and over again to make sure that there are no stability issues
  • the first 2 workflows are in: https://cmslogbook.cern.ch/elog/Workflow+processing/10729
    • we need to keep some end to end statistics
      • when was the WF submitted'
      • when did it end
      • how long did it take to ACDCD to go through etc
      • we will treat these like normal WF's only when we are all done, we delete the outputs as we already have run the data.
    • as soon as one backfill WF is finished the next one goes in the idea is that we keep constant pressure on the sites to insure that there are no problems creeping up on us. We will start with just the T1's once we get that going smoothly we will add T2's

Site Problems
  • We have started rolling out the SL6 workers in the FNAL cluster. All the test workflows ran but if you or anyone in dataops see any issues with FNAL, please let me or cmst1 know so we can take care of it asap.
cms-comp-ops-site-support-team (Site Support Team) <cms-comp-ops-site-support-team@cern.ch>

Warning: Can't find topic CMSPublic.CompOpsMeetingTickets

  • Any problem, email site support team list (while John is in vacation).
AOB
Edit | Attach | Watch | Print version | History: r4 < r3 < r2 < r1 | Backlinks | Raw View | Raw edit | More topic actions...
Topic revision: r2 - 2013-10-29 - JenniferAdelmanMcCarthy
 
    • Cern Search Icon Cern Search
    • TWiki Search Icon TWiki Search
    • Google Search Icon Google Search

    CMSPublic All webs login

This site is powered by the TWiki collaboration platform Powered by PerlCopyright & 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
or Ideas, requests, problems regarding TWiki? use Discourse or Send feedback