SAM to Nagios migration of sensors/tests - ALICE

This page covers the migration of ALICE sensors and tests that use the SAM submission/execution framework.

Goals

  • migrate all tests in the SAM ALICE CE and VO-box sensors to the Nagios-based monitoring framework
    • make the tests Nagios compliant, either by
      • using special wrappers from the org.sam framework, or
      • rewriting the tests from scratch
  • integrate the migrated tests into the Nagios-based monitoring framework

  • Result:
    • all ALICE SAM tests using SAM submission framework migrated
    • RPM with the tests released and put into egee-SA1 repository
    • migrated tests integrated into new Nagios based monitoring framework
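
As an illustration of what "Nagios compliant" means for a rewritten test, here is a minimal sketch, not the actual org.sam wrapper code. It relies only on the standard Nagios plugin convention: one status line on stdout and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The names `run_check`, `format_result`, `example_probe` and `main` are hypothetical and chosen for this example.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def format_result(code, summary):
    """Return the single status line that Nagios reads from stdout."""
    return "%s: %s" % (STATUS_NAMES.get(code, "UNKNOWN"), summary)


def run_check(probe):
    """Run a probe and map its outcome to (exit code, status line).

    `probe` is a hypothetical callable returning (code, summary).
    An unexpected exception is reported as UNKNOWN instead of crashing.
    """
    try:
        code, summary = probe()
    except Exception as exc:
        code, summary = UNKNOWN, "probe raised %s" % exc
    if code not in STATUS_NAMES:
        code, summary = UNKNOWN, "invalid status %r" % (code,)
    return code, format_result(code, summary)


def example_probe():
    # Placeholder logic; a real test would inspect the VO-box or WN here.
    return OK, "placeholder probe succeeded"


def main():
    """Print the status line and return the exit code for the plugin."""
    exit_code, line = run_check(example_probe)
    print(line)
    return exit_code

# A real plugin script would end with:
#     import sys; sys.exit(main())
```

Returning the exit code from `main()` keeps the logic testable; the final `sys.exit(main())` call is what hands the status to Nagios.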

Planning

Two sensors to be migrated:

  • CE
    • one proxy
    • CE-sft-job - Role=lcgadmin; 2 tests on WN; no dependency between tests on WN
  • VO-box
    • one proxy
    • six tests

During integration, account for the test -> metric name changes, e.g.:

| Sensor | SAM test | Nagios check |
| CE | CE-sft-job | org.sam.CE-JobState (UI/Nagios) |
| CE | CE-sft-vo-swdir | org.sam.WN-swdir (UI/Nagios/WN) |
| CE | CE-sft-softver | org.sam.WN-SoftVer (UI/Nagios/WN) |
| VO-box | VOBOX-PM | org.alice.VOBOX-PM |
| VO-box | VOBOX-DPD | org.alice.VOBOX-DPD |
| VO-box | VOBOX-PR | org.alice.VOBOX-PR |
| VO-box | VOBOX-PSR | org.alice.VOBOX-PSR |
| VO-box | VOBOX-SA | org.alice.VOBOX-SA |
| VO-box | VOBOX-UPR | org.alice.VOBOX-UPR |
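
The renaming above could be kept in one lookup table during integration. The sketch below is only an illustration: the dictionary entries are copied from the table, while the names `SAM_TO_NAGIOS` and `nagios_metric` are hypothetical. The "(UI/Nagios)" and "(UI/Nagios/WN)" annotations are left out because they do not appear to be part of the metric name.

```python
# SAM test name -> Nagios metric name, copied from the table above.
SAM_TO_NAGIOS = {
    # CE sensor
    "CE-sft-job": "org.sam.CE-JobState",
    "CE-sft-vo-swdir": "org.sam.WN-swdir",
    "CE-sft-softver": "org.sam.WN-SoftVer",
    # VO-box sensor
    "VOBOX-PM": "org.alice.VOBOX-PM",
    "VOBOX-DPD": "org.alice.VOBOX-DPD",
    "VOBOX-PR": "org.alice.VOBOX-PR",
    "VOBOX-PSR": "org.alice.VOBOX-PSR",
    "VOBOX-SA": "org.alice.VOBOX-SA",
    "VOBOX-UPR": "org.alice.VOBOX-UPR",
}


def nagios_metric(sam_test):
    """Return the Nagios metric for a SAM test name, or None if unmapped."""
    return SAM_TO_NAGIOS.get(sam_test)
```

For example, `nagios_metric("CE-sft-job")` returns `"org.sam.CE-JobState"`, and an unknown test name returns `None` so that unmapped tests can be flagged rather than silently dropped.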

Plan

| P.ID | Name | Notes | Result |
| 1 | migration of CE tests for Role=lcgadmin | try using the org.sam/samtest-run wrapper | tests are submitted with org.sam/CE-probe and produce Nagios compliant output; results come from MB; part of RPM in egee-SA1 repo |
| 2 | integration of PI1 with Nagios | management of proxy with Role=lcgadmin on Nagios box | tests run with Role=lcgadmin under Nagios |
| 3 | migration of VO-box tests | this is a set of custom tests, which may require a rewrite to be able to run under Nagios | tests are submitted from the command line against VO-boxes and produce Nagios compliant output; part of RPM in egee-SA1 repo |
| 4 | integration of PI3 with Nagios | nothing | tests run under Nagios |

Milestones

| Milestone | Date | Result |
| M1 | 15 Nov'09 | all tests migrated and the first release of RPM is made |
| M2 | 15 Dec'09 | migrated tests integrated into new Nagios based monitoring framework |

Progress

| P.ID | Planned | Ongoing | Done |
| PI1 | - | - | CE-sft-job, CE-sft-vo-swdir, CE-sft-softver |
| PI2 | integration of PI1 | - | - |
| PI3 | - | - | vobox-DPD |
| PI3 | - | - | vobox-PM |
| PI3 | - | - | vobox-PR |
| PI3 | - | - | vobox-PSR |
| PI3 | - | - | vobox-SA |
| PI3 | - | - | vobox-UPR |
| PI4 | integration of PI3 | integrated manually using Hash.pm on samnag014 * | - |

* metrics are properly configured by NCG for Nagios; invoking them from the CLI with nagios-run-check produces meaningful results; however, they fail with "(Service check did not exit properly)" when run under Nagios. This needs debugging.
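
One possible first debugging step is to check how the check process actually terminates, since a check that is killed by a signal or returns a code outside 0-3 cannot be interpreted by Nagios. The sketch below is an assumption about a useful diagnostic, not a known cause of the failure. The helper names `describe_exit` and `run_and_describe` are hypothetical, and the command line of the check has to be supplied by the caller.

```python
import subprocess


def describe_exit(returncode):
    """Describe a subprocess return code in Nagios terms."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        return "killed by signal %d" % -returncode
    if returncode in (0, 1, 2, 3):
        return "exited normally with Nagios status %d" % returncode
    return "exited with non-Nagios code %d" % returncode


def run_and_describe(argv):
    """Run a check command and return (exit description, stdout text)."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return describe_exit(proc.returncode), proc.stdout.decode(errors="replace")
```

Running the same command this way both from an interactive shell and as the account Nagios uses may show whether the difference lies in the environment (for example the proxy or environment variables) rather than in the test itself.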

* created and started to populate ALICE_CRITICAL profile in MDDB: metric set org.alice.VOBOX

-- KonstantinSkaburskas - 04-Nov-2009

Topic revision: r2 - 2009-12-03 - unknown
 