Detailed Monitoring for WLCG perfSONAR-PS Instances

The WLCG perfSONAR-PS deployment task force is evaluating MaDDash and OMD for use in monitoring all our perfSONAR-PS instances. Some details on the initial testing deployment are documented at https://twiki.cern.ch/twiki/bin/view/LCG/MadDashWLCG

We can augment the WLCG perfSONAR-PS monitoring by using some additional capabilities of the OMD (http://omdistro.org/start) deployment being tested. The Checkmk component of OMD provides agents that can be installed from RPM. If perfSONAR-PS host managers install these agents, we can easily put additional monitoring in place.

NOTE: Sites that have configured extra security via /etc/hosts.allow and /etc/hosts.deny need to verify that the services they enable below are allowed!

Enabling Check_MK agent Monitoring

To install the check_mk-agents on a perfSONAR-PS host, do the following:

NOTE: The Checkmk download links above use an old address. New versions are available from https://checkmk.de/download.php

  • Make sure any firewall allows access from the WLCG OMD test instance at maddash.aglt2.org (192.41.231.110) and from the OSG monitoring subnet (129.79.53.0/24).
    • -A INPUT -m state --state NEW,ESTABLISHED -m tcp -p tcp --dport 6556 -s 192.41.231.110/32 -j ACCEPT
    • -A INPUT -m state --state NEW,ESTABLISHED -m tcp -p tcp --dport 6556 -s 129.79.53.0/24 -j ACCEPT
  • Update your /etc/check_mk/logwatch.cfg file to properly track the perfSONAR-PS logs. (An example to customize is available as the logwatch.cfg attachment on this page.)
  • Notify Shawn McKee that you have these installed on your host(s) so the monitoring config can be updated.
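Before notifying anyone, it can help to confirm that the agent actually answers on TCP port 6556. The following is a minimal sketch, not part of the official procedure: the function name looks_like_agent_output is made up for illustration, and it relies only on the fact that Check_MK agent output contains a "<<<check_mk>>>" section header. Whether a connection from localhost is accepted depends on how the agent service is configured on your host, which is an assumption here.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the text on stdin
# contains the "<<<check_mk>>>" section header that Check_MK agent
# output normally includes.
looks_like_agent_output() {
    grep -q '^<<<check_mk>>>'
}

# Possible usage on the perfSONAR-PS host itself (needs bash for /dev/tcp,
# and assumes the agent accepts connections from 127.0.0.1):
#   bash -c 'cat < /dev/tcp/127.0.0.1/6556' | looks_like_agent_output \
#       && echo "agent answers on port 6556"
```

A local success does not prove that the firewall rules are correct; only a connection from the monitoring hosts listed above exercises those.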

Enable SNMP Monitoring

We can also get additional information via snmp if sites are willing to deploy it. Here are the steps:

  • Log in as root on the host
  • Install net-snmp via yum: yum install net-snmp
  • Make sure snmpd starts at boot
    • chkconfig  snmpd  on
  • Configure /etc/snmp/snmpd.conf to allow the readonly community 'WLCGperfSONAR' by adding:
  • Ensure any firewalls allow access to UDP port 161 from maddash.aglt2.org (192.41.231.110) and the OSG monitoring subnet (129.79.53.0/24). Here are example lines for iptables:
    • -A INPUT -m state --state NEW,ESTABLISHED -p udp -m udp -s 192.41.231.110/32 --dport 161 -j ACCEPT
    • -A INPUT -m state --state NEW,ESTABLISHED -p udp -m udp -s 129.79.53.0/24 --dport 161 -j ACCEPT
  • Start snmpd via 'service snmpd start'
  • Notify Shawn McKee that you have enabled snmp monitoring access so the OMD configuration for your host(s) can be updated
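The exact snmpd.conf lines for the read-only community are not shown on this page. As a hedged sketch only, the helper below appends a line to a file if it is not already there, so the configuration step can be repeated safely. The function name ensure_line is made up for illustration, and the rocommunity lines in the usage comment are an assumption based on the standard net-snmp "rocommunity COMMUNITY [SOURCE]" directive, not necessarily the lines originally intended here.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE unless an identical line
# is already present (-x matches the whole line, -F treats it as a
# fixed string rather than a regular expression).
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Possible usage as root. The rocommunity lines are an ASSUMPTION,
# restricting the community to the monitoring sources named above:
#   ensure_line /etc/snmp/snmpd.conf 'rocommunity WLCGperfSONAR 192.41.231.110'
#   ensure_line /etc/snmp/snmpd.conf 'rocommunity WLCGperfSONAR 129.79.53.0/24'
```

Restricting the community to specific sources, rather than allowing it from anywhere, keeps the SNMP exposure in line with the firewall rules.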

Enable Dell Open Manage Server Administrator (OMSA) (For Dell Servers)

If you have a perfSONAR-PS node deployed on Dell server hardware we can monitor even more details if you install and enable Dell OMSA. Here are the steps (assumes you enabled SNMP above):

  • Log in as root on the host
  • Configure to use the Dell OMSA repo:
  • Install OMSA via yum
    • yum  install  srvadmin-all
    • Once this finishes, log out and then log back in to make sure your environment/path is correct.
  • Install OpenIPMI via yum
  • Source the appropriate setup:
    • source /etc/profile.d/srvadmin-path.sh
  • Enable ipmi
    • srvadmin-services.sh enable ipmi
  • Confirm that the smuxpeer is set up to pass OMSA information via snmp
    • Check that the /etc/snmp/snmpd.conf file contains a line like the one below. If not, add it.
    • smuxpeer .1.3.6.1.4.1.674.10892.1
  • Restart snmpd: service  snmpd  restart
  • Start OMSA: srvadmin-services.sh  start
  • Verify that snmp can see the Dell OIDs:
    • This requires the snmpwalk utility. Either use a host that already has it installed, or install it with: yum install net-snmp-utils
    • snmpwalk -v2c -c WLCGperfSONAR <host> 1.3.6.1.4.1.674.10892.1 (Where <host> is your host FQDN)
  • Notify Shawn McKee that OMSA is installed so the OMD configuration for your host(s) can be updated
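The two checks in the steps above (the smuxpeer line and the snmpwalk result) can be scripted. This is a rough sketch under stated assumptions: the function names has_dell_smuxpeer and dell_oids_visible are made up for illustration, and the second function assumes that snmpwalk reports a missing subtree with a "No Such Object" or "No Such Instance" message, which is the usual net-snmp behaviour.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if FILE (default /etc/snmp/snmpd.conf)
# contains the smuxpeer line for the Dell OID subtree.
has_dell_smuxpeer() {
    grep -q '^smuxpeer[[:space:]][[:space:]]*\.1\.3\.6\.1\.4\.1\.674\.10892\.1' \
        "${1:-/etc/snmp/snmpd.conf}"
}

# Hypothetical helper: reads snmpwalk output on stdin and succeeds if at
# least one "OID = value" line remains after dropping "No Such ..." replies.
dell_oids_visible() {
    grep ' = ' | grep -v 'No Such' | grep -q .
}

# Possible usage (replace <host> with your host FQDN):
#   has_dell_smuxpeer && echo "smuxpeer line present"
#   snmpwalk -v2c -c WLCGperfSONAR <host> 1.3.6.1.4.1.674.10892.1 \
#       | dell_oids_visible && echo "Dell OIDs visible"
```

An empty result or a "No Such Object" reply usually means snmpd was not restarted after the smuxpeer line was added, or that the OMSA services are not running.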

Send any questions or comments to Shawn.

-- ShawnMcKee - 28 Jan 2014

  • logwatch.cfg: Logwatch configuration file for check_mk-agent-logwatch
Topic revision: r7 - 2019-10-17 - ShawnMcKee
 