Yaim Based Installation of Nagios & NCG

NCG Overview

An overview of NCG is provided at GridMonitoringNcgOverview. This document describes an automated installation based on YAIM and yum.

Tutorial

Configuring the repositories

In order to install via YAIM, you need to add some yum repositories. SL5 is now the only maintained version. These are:

Requirements

You need
  • a host certificate in order to secure the Nagios web portal.

Installing packages

Once this is done, you can install with yum install httpd && yum install egee-NAGIOS lcg-CA. The explicit httpd install is needed because httpd must be installed before the nagios RPM: the nagios RPM as supplied by DAG is missing an RPM prerequisite.

All On One Box

  • yum install httpd
  • yum install glite-UI or yum groupinstall 'glite-UI (production - x86_64)'
  • yum install lcg-CA egee-NAGIOS
  • vi site-info.def # Configure with below parameters.
  • /opt/glite/yaim/bin/yaim -s /root/site-info.def -c -n glite-UI -n glite-NAGIOS

On Two Boxes

  • Box 1, Nagios host.
    • yum install httpd
    • yum install egee-NAGIOS lcg-CA
    • /opt/glite/yaim/bin/yaim -s /root/site-info.def -c -n glite-NAGIOS
  • Box 2, UI and NRPE node.
    • SL4: yum install lcg-CA glite-UI egee-NRPE
    • SL5: yum install lcg-CA egee-NRPE && yum groupinstall 'glite-UI (production - x86_64)'
    • /opt/glite/yaim/bin/yaim -s /root/site-info.def -c -n glite-NRPE -n glite-UI

If you plan to have an NRPE node running the local tests on an existing UI then install that with yum install egee-NRPE.
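
The one-box and two-box sequences above can be sketched as a small wrapper script. This is only an illustration: the role names (onebox, nagios, nrpe-ui), the run helper and the DRY_RUN switch are assumptions made for the sketch and are not part of YAIM. By default the script only prints the commands; the package names, node types and the /root/site-info.def path are the ones listed above, using the SL5 form of the UI install.

```shell
#!/bin/sh
# Dry-run sketch: print (or, with DRY_RUN=0, execute) the install steps
# for one role. Role names and helpers are hypothetical, for illustration.
DRY_RUN="${DRY_RUN:-1}"
SITE_INFO="${SITE_INFO:-/root/site-info.def}"   # path used in the steps above
YAIM=/opt/glite/yaim/bin/yaim

run() {
    # Echo the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_role() {
    case "$1" in
        onebox)
            run yum install httpd          # must precede the nagios RPM
            run yum groupinstall 'glite-UI (production - x86_64)'
            run yum install lcg-CA egee-NAGIOS
            run "$YAIM" -s "$SITE_INFO" -c -n glite-UI -n glite-NAGIOS
            ;;
        nagios)
            run yum install httpd
            run yum install egee-NAGIOS lcg-CA
            run "$YAIM" -s "$SITE_INFO" -c -n glite-NAGIOS
            ;;
        nrpe-ui)
            run yum install lcg-CA egee-NRPE
            run yum groupinstall 'glite-UI (production - x86_64)'
            run "$YAIM" -s "$SITE_INFO" -c -n glite-NRPE -n glite-UI
            ;;
        *)
            echo "usage: install_role onebox|nagios|nrpe-ui" >&2
            return 2
            ;;
    esac
}

install_role "${1:-onebox}"
```

Run it once with the default dry-run setting to review the commands, then rerun as root with DRY_RUN=0 after site-info.def has been edited.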

YAIM's site-info.def File

The configuration requires you to set the following variables in the YAIM site-info.def file:
| Variable | Description | Example |
| INSTALL_ROOT | Location of grid middleware. | /opt |
| SITE_NAME | The site you wish to monitor. | MY-SITE |
| NAGIOS_HOST | Nagios hostname. | nagios.example.org |
| NCG_NRPE_UI | UI hostname for running NRPE. Set this only when using a remote UI; if the UI is on the local box, do not set it. | ui.example.org |
| PX_HOST | MyProxy server from which to retrieve a certificate to run local tests under. | myproxy.example.org |
| SITE_BDII_HOST | The site BDII for the monitored site, SITE_NAME. | site-bdii.example.org |
| BDII_HOST | A top-level BDII that you can use. | lcg-bdii.cern.ch |
| VOS | A list of VOs who can view the Nagios information. | "ops dteam alice" |
| VO_<VONAME>_VOMS_SERVERS | URI for the VOMS service. | vomss://voms.cern.ch:8443/voms/ops?/ops/ |
| NAGIOS_ADMIN_DNS | Comma-separated list of local admin DNs that can perform actions via the Nagios web interface. | "/DC=ch/OU=Users/CN=Dr Kildare,/DC=ch/OU=User/CN=Dr Who" |
| MYSQL_ADMIN | Root password of MySQL. Unset by default. If this is set, MySQL will be configured on the localhost to support NDOUtils. If it is not set, MySQL and the NDOUtils schema must be loaded by hand outside of YAIM. The easiest option is to set this to a string of your choice. | |
| NAGIOS_NSCA_PASS | The shared secret used by NSCA for sending results back to the Nagios server. | tomato |
| ATP_DB_PASS | The MySQL password for the ATP database. | lemon |
| ATP_DB_NAME | The database name for the ATP database. | atp |
| MS_DB_PASS | The MySQL password for the Metric Store database. | lemon |
| MS_DB_NAME | The database name for the Metric Store database. | metricstore |
| MDDB_DB_PASS | The MySQL password for the MDDB database. | lemon |
| MDDB_DB_NAME | The database name for the MDDB database. | mddb |
| MYEGEE_DB_TYPE | Type of the DB (Pdo_Mysql/Oracle). | Pdo_Mysql |
| MYEGEE_DB_USER | The database username for the MyEGEE portal. | lemon |
| MYEGEE_DB_PASS | The database password for the MyEGEE portal. | lemon |
| MYEGEE_DB_SCHEMA | The database schema of the MyEGEE portal. | atp |
| MYEGEE_DB_ATP | The ATP database name of the MyEGEE portal. | atp |
| MYEGEE_DB_MS | The Metric Store database name of the MyEGEE portal. | metricstore |
| MYEGEE_DB_MDDB | The Metric Description database name of the MyEGEE portal. | mddb |
| MYEGEE_DB_HOST | The MySQL hostname of the MyEGEE portal. It is only needed when used with MySQL. | localhost |
| MYEGI_ADMIN_NAME | System administrator name (optional). | David Horat |
| MYEGI_ADMIN_EMAIL | System administrator email (optional). | example@exampleNOSPAMPLEASE.com |
| MYEGI_DEFAULT_PROFILE | Default Nagios profile (default: ROC_CRITICAL). | ROC_CRITICAL |
| MYEGI_DATABASE_ENGINE | Type of DB (values: mysql/oracle; default: mysql). | mysql |
| MYEGI_DATABASE_USER | The database username for the MyEGI portal (default: myegi). | myegi |
| MYEGI_DATABASE_NAME | The database name to be used by the MyEGI portal (default: mrs). | mrs |
| MYEGI_DATABASE_PASSWORD | The database password for the MyEGI portal (mandatory). | lemon |
| MYEGI_DATABASE_HOST | The database hostname (default: localhost). | localhost |
| MYEGI_DATABASE_PORT | The database port (optional). | 3306 |
| MYEGI_DEBUG | Turn debug mode for MyEGI on or off (values: True/False; default: False). | False |
| ROC_NAME | Your ROC name; use GOCDB names. Only compulsory if you are a ROC. Cannot contain multiple values. In case of multiple ROCs, remove ROC_NAME and use NCG_GOCDB_ROC_NAME. | no default |
| ATP_WEB_SECRET_KEY | A secret key string, based on a UUID, for the ATP web front-end. | no default |
| NCG_MDDB_SUPPORTED_PROFILES | List of supported profiles; needed by MRS to know the profiles for status recalculations. | ROC,ROC_CRITICAL, ROC_OPERATORS |
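
As an illustration of how these settings look in practice, the sketch below writes a partial site-info.def using the example values from the table and then reports which of a handful of variables are still empty. Several things here are assumptions: the file is written to a temporary location rather than /root/site-info.def, the variable name VO_OPS_VOMS_SERVERS assumes the usual YAIM convention of substituting the upper-cased VO name for <VONAME>, the MYSQL_ADMIN value is a placeholder, and the missing_vars helper with its short list of names is hypothetical and is not the check YAIM itself performs.

```shell
#!/bin/sh
# Write an example (partial) site-info.def using the example values from
# the table. Every value is a placeholder to be replaced with site settings.
SITE_INFO="$(mktemp)"
cat > "$SITE_INFO" <<'EOF'
INSTALL_ROOT=/opt
SITE_NAME=MY-SITE
NAGIOS_HOST=nagios.example.org
PX_HOST=myproxy.example.org
SITE_BDII_HOST=site-bdii.example.org
BDII_HOST=lcg-bdii.cern.ch
VOS="ops dteam alice"
VO_OPS_VOMS_SERVERS="vomss://voms.cern.ch:8443/voms/ops?/ops/"
NAGIOS_ADMIN_DNS="/DC=ch/OU=Users/CN=Dr Kildare,/DC=ch/OU=User/CN=Dr Who"
MYSQL_ADMIN=changeme
NAGIOS_NSCA_PASS=tomato
EOF

# Print the name of every listed variable that is empty or unset in the
# given file. The list is a subset chosen for illustration only.
missing_vars() {
    (
        # Source the file in a subshell so nothing leaks into the caller.
        . "$1"
        for name in SITE_NAME NAGIOS_HOST PX_HOST SITE_BDII_HOST BDII_HOST \
                    VOS NAGIOS_ADMIN_DNS NAGIOS_NSCA_PASS; do
            eval "value=\${$name:-}"
            [ -n "$value" ] || echo "$name"
        done
    )
}

missing="$(missing_vars "$SITE_INFO")"
if [ -n "$missing" ]; then
    echo "missing: $missing"
else
    echo "all listed variables are set"
fi
```

Because site-info.def is plain shell variable assignments, sourcing it in a subshell is a cheap way to catch quoting mistakes before running YAIM.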

The following variables have defaults but you may well want to change them.

| Variable | Default | Description |
| NCG_INCLUDE_EMPTY_HOSTS | 1 | Show hosts without associated services. |
| NCG_ENABLE_NOTIFICATIONS (TODO) | 0 | If set to true, Nagios will be configured to send notifications. |
| NCG_NAGIOS_ADMIN (TODO) | root@localhost | Email address which will receive notifications for Nagios internal checks (e.g. GridProxy-Get, GridProxy-Valid, MyProxy-ProxyLifetime, org.egee.SendToMsg, etc). |
| NAGIOS_MYPROXY_USER | nagios | Change the MyProxy username, i.e. the -l option to myproxy-init and myproxy-logon. |
| MSG_BROKER_CACHE_NETWORK | PROD | Set the broker service to look for in the information system. |
| MSG_BROKER_CACHE_HOST | null | The hostname of a broker to hard-code. Setting this disables the variable MSG_BROKER_CACHE_NETWORK and auto-discovery of the broker. |
| NAGIOS_HTTPD_ENABLE_CONFIG | false | Set true to update the Apache configuration for X509 auth. Will overwrite /etc/httpd/conf.d/nagios.conf and ssl.conf. If you don't do this, you will have to configure Apache by hand for X509 certificate authentication. |
| NAGIOS_NCG_ENABLE_CRON (TODO) | false | Set true for YAIM to enable the ncg cron job that reruns ncg.pl every 3 hours. |
| NAGIOS_NCG_ENABLE_CONFIG | false | Set true for YAIM to write /etc/ncg/ncg.conf and execute ncg.pl for you. |
| NAGIOS_SUDO_ENABLE_CONFIG | false | Set true for YAIM to append to /etc/sudoers to allow nagios to call certain probes as root. |
| NAGIOS_NAGIOS_ENABLE_CONFIG | false | Set true for YAIM to write /etc/nagios/nagios.cfg for you and reload Nagios. |
| NAGIOS_CGI_ENABLE_CONFIG | false | Set true for YAIM to write /etc/nagios/cgi.cfg for you. |
| NCG_LDAP_FILTER | "" | If set, your Nagios will not monitor the SITE_NAME specified above but will instead query the top-level BDII for GlueSite objects that match this filter. E.g. to monitor all sites maintained under the Italian ROC, set this value to GlueSiteOtherInfo=EGEE_ROC=ITALY. |
| NCG_GOCDB_ROC_NAME | "" | Set this to a GOCDB ROC name, e.g. CERN, to collect a list of sites from GOCDB within a ROC. In case of multiple ROCs, set a space-separated list. |
| NCG_GOCDB_COUNTRY_NAME | "" | Set this to a GOCDB country name to collect a list of sites from GOCDB within a country. |
| NCG_TOPOLOGY_USE_SAM | false | If true, uses SAM for getting services. |
| NCG_TOPOLOGY_USE_GOCDB | true | If true, uses GOCDB for getting services. |
| NCG_TOPOLOGY_USE_ENOC | true | If true, uses ENOC for getting services. |
| NCG_TOPOLOGY_USE_LDAP | true | If true, uses LDAP for getting site names. |
| NCG_REMOTE_USE_SAM | true | If true, show SAM remote results in Nagios. |
| NCG_REMOTE_USE_NAGIOS | false | If true, show project or ROC remote results in Nagios. |
| NCG_REMOTE_USE_ENOC | true | If true, show ENOC (DownCollector) remote results in Nagios. |
| NAGIOS_ROLE | site | One of Site, ROC, Project or VO. Denotes whether the Nagios is acting in a site, ROC or project level monitoring role. |
| NCG_PROBES_TYPE | all | Defines which type of probes should be configured: remote, local or all. Local probes are probes executed by the Nagios. Remote probes are probes imported from external systems (e.g. SAM, remote Nagios, ENOC DownCollector). |
| NCG_VO | dteam | List of VOs the tests should run as. A space-separated list, e.g. "dteam cms lhcb". You must have a member of each VO willing to store a proxy for your retrieval. |
| GGUS_SERVER_FQDN | null | The hostname of the GGUS endpoint. Setting this also opens GGUS tickets for service notifications. |
| ATP_WEB_DB_USER | no default | User name for the ATP database. |
| ATP_WEB_DB_PASS | lemon | Password for the ATP database. |
| ATP_WEB_DB_NAME | no default | Name of the ATP database. |
| ATP_WEB_DB_ENGINE | oracle | Database engine for the ATP database. |
| ATP_WEB_DEBUG | false | Debug flag for the ATP web front-end. |
| ATP_WEB_TEMPLATE_DEBUG | false | Template debug flag for the ATP web front-end. |
| ATP_WEB_VIEW_TEST | false | Functional-test flag for the ATP web front-end. |
| ATP_WEB_INTERNAL_IPS | 127.0.0.1 | Internal IP setting for the ATP web front-end. |
| ATP_WEB_SERVER_EMAIL | root@localhost | Server email address for the ATP web front-end. |
| ATP_WEB_EMAIL_HOST | localhost | Email host for the ATP web front-end. |
| ATP_WEB_PREFIX | localhost | Server prefix for the ATP web front-end. |

The following variables are optional and have no default.

| Variable | Default | Description |
| VO_<VONAME>_NCG_VO_FQAN | no default | A comma-separated list of VOMS FQANs to run metrics as for a given VO. You must have a member of each VO with the appropriate FQAN willing to store a proxy for your retrieval. |

Defining a Set of Sites

Note that setting more than one of NCG_LDAP_FILTER, NCG_GOCDB_ROC_NAME or NCG_GOCDB_COUNTRY_NAME gives the union of the sites matched by each.
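
A minimal sketch of such a combination, as it might appear in site-info.def, is given here. The two values are the examples quoted in the variable descriptions (the Italian ROC filter and the CERN ROC name); pairing them in one file is purely illustrative.

```shell
# site-info.def fragment (sketch): monitor the union of two site sets.

# Sites whose GlueSite entry in the top-level BDII matches the Italian ROC...
NCG_LDAP_FILTER="GlueSiteOtherInfo=EGEE_ROC=ITALY"

# ...plus every site that GOCDB lists under the CERN ROC.
NCG_GOCDB_ROC_NAME="CERN"

# Left empty: no country-based selection is added to the union.
NCG_GOCDB_COUNTRY_NAME=""
```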

Co-Existing with Existing Nagios and/or Apache

The 4 variables NAGIOS_*_ENABLE_CONFIG above are false by default but can be set to true resulting in more configuration being done for you. Setting all to true is the easiest but may clobber any existing work. When merging in with an existing configuration of NAGIOS or Apache you may wish to leave some or all of them as false.
| Variable | Files that will be Edited or Replaced by YAIM | Actions |
| NAGIOS_HTTPD_ENABLE_CONFIG | /etc/httpd/conf.d/nagios.conf & ssl.conf | apache will be reloaded |
| NAGIOS_NCG_ENABLE_CONFIG | /etc/ncg/ncg.conf | ncg.pl will be executed creating /etc/nagios/wlcg.d/* and /etc/nagios/nrpe/*. |
| NAGIOS_NAGIOS_ENABLE_CONFIG | /etc/nagios/nagios.cfg | nagios will be reloaded |
| NAGIOS_CGI_ENABLE_CONFIG | /etc/nagios/cgi.cfg | No actions |
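
For example, a host that already runs a hand-tuned Apache but has no prior Nagios configuration might use settings along the following lines. This particular split is an assumption chosen for illustration, not a recommendation from the original page.

```shell
# site-info.def fragment (sketch): keep the existing Apache setup intact.

# Do not touch /etc/httpd/conf.d/nagios.conf or ssl.conf. X509 certificate
# authentication then has to be configured in Apache by hand.
NAGIOS_HTTPD_ENABLE_CONFIG=false

# Let YAIM write /etc/ncg/ncg.conf and run ncg.pl.
NAGIOS_NCG_ENABLE_CONFIG=true

# Let YAIM write /etc/nagios/nagios.cfg (Nagios is reloaded) and cgi.cfg.
NAGIOS_NAGIOS_ENABLE_CONFIG=true
NAGIOS_CGI_ENABLE_CONFIG=true
```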

The following configuration files are edited by YAIM even if all of the above are set to false. If this is a problem for your existing installation, please let us know.

Will be altered with every run of YAIM
/etc/nagios/ndo2db.cfg
/etc/voms2htpasswd.conf
/etc/nagios-proxy-refresh.conf
/etc/httpd/conf.d/nrpe.conf
/etc/broker-cache-file.conf
/etc/sysconfig/msg-to-queue
/etc/nagios/nsca.cfg
/etc/nagios/send_nsca.cfg
/etc/ncg/ncg-localdb.d/yaim-nagios-to-msg-queue.conf
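
Since these files are rewritten on every run, it may be worth copying them aside before invoking YAIM on a host with an existing setup. The sketch below is one way to do that; the backup_yaim_files helper, the BACKUP_DIR variable and the /tmp destination are assumptions made for the example, while the file list is the one given above.

```shell
#!/bin/sh
# Copy the files that YAIM rewrites on every run into a backup directory.
FILES="/etc/nagios/ndo2db.cfg
/etc/voms2htpasswd.conf
/etc/nagios-proxy-refresh.conf
/etc/httpd/conf.d/nrpe.conf
/etc/broker-cache-file.conf
/etc/sysconfig/msg-to-queue
/etc/nagios/nsca.cfg
/etc/nagios/send_nsca.cfg
/etc/ncg/ncg-localdb.d/yaim-nagios-to-msg-queue.conf"

backup_yaim_files() {
    dest="$1"
    mkdir -p "$dest" || return 1
    for f in $FILES; do
        if [ -f "$f" ]; then
            # Recreate the directory layout under $dest so names cannot clash.
            mkdir -p "$dest$(dirname "$f")"
            cp -p "$f" "$dest$f"
            echo "saved $f"
        else
            echo "skipped $f (not present)"
        fi
    done
}

# Hypothetical default destination: a timestamped directory under /tmp.
backup_yaim_files "${BACKUP_DIR:-/tmp/yaim-backup-$(date +%Y%m%d-%H%M%S)}"
```

Comparing the saved copies with the regenerated files (for instance with diff -r) shows exactly what a YAIM run changed.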

The following variables have defaults, and can be changed if you have a non-standard installation of nagios or httpd.

| Variable | Default | Description |
| NCG_MAIN_DB_FILE | /etc/ncg/ncg.localdb | Location of your local configurations for NCG. |
| NCG_TEMPLATES_DIR | /usr/share/grid-monitoring/config-gen/nagios | The location of NCG configuration templates. |
| NCG_OUTPUT_DIR | /etc/nagios/wlcg.d | Where the Nagios configuration files for the server will be generated. |
| NCG_NRPE_OUTPUT_DIR | /etc/nagios/nrpe/ | Where NRPE configuration files will be generated. |
| NAGIOS_HTPASSWD_FILE | /etc/nagios/htpasswd.users | Location of allowed users for the Nagios web portal. |
| NAGIOS_DB_HOST (removed) | localhost | Hostname of MySQL for NDOUtils; localhost makes everything very easy. |
| NAGIOS_DB_USER (removed) | ndouser | MySQL user name for NDOUtils. |
| NAGIOS_DB_PASS (removed) | ndopassword | Especially if MySQL is on localhost, this can be left alone. You may wish to change it though. |

Configuration

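To configure, you just need to run YAIM. There are really two deployment options to consider: everything on one box, or the Nagios host and the UI/NRPE node on two boxes, as described in the installation steps above.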

Once this has completed successfully, you should be able to browse the Nagios web portal at https://SERVER_NAME/nagios/.
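
A quick smoke test from the command line could look like the following sketch. The portal_url and check_portal helpers are hypothetical, and nagios.example.org is a placeholder host. Because the portal is normally protected by X509 authentication, an unauthenticated request may be refused; in that case curl's --cert and --key options can be used to present a user certificate.

```shell
#!/bin/sh
# Build the portal URL for a given Nagios host.
portal_url() {
    echo "https://$1/nagios/"
}

# Probe the portal and print the HTTP status code.
# -k skips CA verification, so this is only suitable as a first smoke test;
# --max-time keeps the check from waiting forever on a hung connection.
check_portal() {
    curl -k -s -o /dev/null --max-time 20 -w '%{http_code}\n' \
        "$(portal_url "$1")"
}

# Only print the URL here; call check_portal <host> to contact the server.
portal_url "${NAGIOS_HOST:-nagios.example.org}"
```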

You may find that your web server hangs connections because of BUG:48458. If this is the case, installing the dummy-ca-certs package and restarting Apache should resolve it.

Security

Local probes require a valid proxy certificate, which is obtained from a MyProxy service. The host running the local probes, i.e. the NRPE node, must therefore be allowed to retrieve a valid proxy. That is, the MyProxy server should have at least the following YAIM configuration:
PX_HOST=myproxy.example.ch
GRID_TRUSTED_RETRIEVERS="'/DC=ch/DC=cern/OU=computers/CN=nrpe-ui.example.ch'"       
GRID_AUTHORIZED_RETRIEVERS="'/DC=ch/DC=cern/OU=computers/CN=nrpe-ui.example.ch'"
SITE_EMAIL="mybigsite@example.ch"

Finally, from a UI somewhere, you must upload a proxy to MyProxy that can be retrieved by the NRPE node.

$ myproxy-init -c 336 -k NagiosRetrieve-nrpe-ui.example.ch-dteam -s myproxy.example.org \
        -l nagios -x -Z "/DC=ch/OU=computers/CN=nrpe-ui.example.ch"
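
When several VOs are listed in NCG_VO, one upload per VO is presumably needed. The sketch below only prints the commands and executes nothing. It assumes the credential name follows the pattern of the example above (NagiosRetrieve-<NRPE host>-<VO>), which is inferred from that single example; the NRPE_HOST, NRPE_DN and MYPROXY_USER variables and the myproxy_cmd helper are hypothetical names introduced for the illustration.

```shell
#!/bin/sh
# Print one myproxy-init command per VO; nothing is executed.
# Defaults mirror the example command above and are placeholders.
NRPE_HOST="${NRPE_HOST:-nrpe-ui.example.ch}"
NRPE_DN="${NRPE_DN:-/DC=ch/OU=computers/CN=$NRPE_HOST}"
PX_HOST="${PX_HOST:-myproxy.example.org}"
MYPROXY_USER="${MYPROXY_USER:-nagios}"   # matches the NAGIOS_MYPROXY_USER default

myproxy_cmd() {
    vo="$1"
    # -c: credential lifetime in hours (336 h = 14 days)
    # -k: credential name, -s: MyProxy server, -l: MyProxy username
    echo "myproxy-init -c 336 -k NagiosRetrieve-$NRPE_HOST-$vo -s $PX_HOST -l $MYPROXY_USER -x -Z \"$NRPE_DN\""
}

# NCG_VO is a space-separated list; fall back to its documented default.
for vo in ${NCG_VO:-dteam}; do
    myproxy_cmd "$vo"
done
```

Each printed command has to be run by a member of the corresponding VO, since that member's credential is what gets stored.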

Information System

Nagios now publishes important information about itself into the information system. Please add the GRIS running on the Nagios node into your site BDII.
Topic revision: r67 - 2012-05-03 - WojciechLapka
 