Week of 051128

Open Actions from last week:
  • New LFC sensor to detect current thread usage, and external service availability via CLI tools (James)
  • Work out how to do log expiry with log4j (Gavin)
  • New version of LCG_MON_GRIDFTP (Maarten)
  • Check with ZS what is needed and why for gridview (James)
  • PS DB team will do a reboot of the DB for all LFC+FTS on Monday 9.30 AM DONE
  • Test + Deploy new QF for FTS (Gavin) TESTED
  • Test + Deploy new version of LFC (Sophie) IN TESTING

Chair: Harry

On Call: Sophie + Andrea

Monday:

Log: Castor2 outage Saturday evening through Sunday. Another possible outage (seen from monitoring) on Monday morning. DMA alarm on FTS node.

New Actions:

  • James - rewrite procedure so "standard" sysadmin alarms are handled by sysadmins

Discussion:

  • Olof said the problem on Saturday was due to the main LSF batch daemon not being able to communicate with the scheduler. Approximately 80K jobs backed up (mostly stages from LHCb, ~75K). The jobs were reinjected into the system once it was back online.
  • Eric said they had some corruption on the stager DB this morning, which might be linked to the outage. They have a manual process to recover from this corruption, which was successful.
  • PS DB outage at 9.30 on all LFC and FTS production services.

Tuesday:

Log: Nothing to report

New Actions:

  • FTS upgrade to QF tomorrow (Wed)

Discussion:

  • Meeting upstairs tomorrow morning
  • lcg-mon-gridftp deployed on DPM; waiting for the alarm update before putting it on WAN nodes

Wednesday:

Log: Nothing

Actions: T. Kleinwort is moving the lxserv function to a new machine, so the FTS QF upgrade will wait for that to complete.

Discussion: GRIDVIEW statistics showed no traffic because R-GMA was temporarily stopped while waiting for a security fix. The fix has now been applied.

Thursday:

Log:

Actions:

Discussion:

Friday:

Log:

Actions:

Discussion:

Topic revision: r4 - 2005-12-01 - HarryRenshall
 