(2012-05-03,
WojciechLapka
)
---+!! Yaim Based Installation of Nagios & NCG

%TOC%

---++ NCG Overview

An overview of NCG is provided at EGEE.GridMonitoringNcgOverview. This document describes an automated installation based on YAIM and yum.

---++ Tutorial

   * In September 2008 a tutorial on this process was given at the EGEE'08 conference: GridMonitoringNcgYaimTutorial2008.
   * In September 2009 a tutorial on this process was given at the EGEE'09 conference: GridMonitoringNcgYaimTutorial2009.

---++ Configuring the repositories

In order to install via YAIM, you need to add some yum repositories. *SL5 is now the only maintained version.* These are:

   * SL5
      * [[http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/lcg-CA.repo][lcg-CA]]
      * [[http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.X/glite-BDII.repo][glite-BDII.repo]]
      * [[https://rpmrepo.org/RPMforge/Using][rpmforge]] - Dag himself now recommends rpmforge.
      * [[http://grid-deployment.web.cern.ch/grid-deployment/glite/repos/3.2/glite-UI.repo][glite-UI 3.2]] - _this is only to get the base yaim packages and glite-version_
      * [[http://www.sysadmin.hep.ac.uk/rpms/egee-SA1/centos5/x86_64/repoview/sa1-release.html][sa1-release]] - contains the SL5 x86_64 yum config for SA1

---+++ Requirements

You need:

   * a =host certificate= in order to secure the Nagios web portal.

---++ Installing packages

Once the repositories are configured, you can install by running =yum install httpd && yum install egee-NAGIOS lcg-CA=. The explicit =httpd= install is needed because httpd must be installed before the nagios RPM: the nagios RPM as supplied by DAG is missing an RPM !PreRequisite.

---+++ All On One Box

   * =yum install httpd=
   * =yum install glite-UI= or =yum groupinstall 'glite-UI (production - x86_64)'=
   * =yum install lcg-CA egee-NAGIOS=
   * =vi site-info.def= # Configure with the parameters below.
   * =/opt/glite/yaim/bin/yaim -s /root/site-info.def -c -n glite-UI -n glite-NAGIOS=

---+++ On Two Boxes

   * Box 1, NAGIOS host.
      * =yum install httpd=
      * =yum install egee-NAGIOS lcg-CA=
      * =/opt/glite/yaim/bin/yaim -s /root/site-info.def -c -n glite-NAGIOS=
   * Box 2, UI and NRPE node.
      * SL4: =yum install lcg-CA glite-UI egee-NRPE=
      * SL5: =yum install lcg-CA egee-NRPE && yum groupinstall 'glite-UI (production - x86_64)'=
      * =/opt/glite/yaim/bin/yaim -s /root/site-info.def -c -n glite-NRPE -n glite-UI=

If you plan to have an NRPE node running the local tests on an existing UI, install NRPE there with =yum install egee-NRPE=.

---++ YAIM's *site-info.def* File

The configuration requires you to set the following variables in the YAIM _site-info.def_ file:

| *Variable* | *Description* | *Example* |
| INSTALL_ROOT | Location of grid middleware. | /opt |
| SITE_NAME | The site you wish to monitor. | MY-SITE |
| NAGIOS_HOST | Nagios hostname. | nagios.example.org |
| NCG_NRPE_UI | UI hostname for running NRPE. This should *only* be set if using a remote UI. If the UI is on the local box, don't set it. | ui.example.org |
| PX_HOST | !MyProxy server from which to retrieve a certificate to run local tests under. | myproxy.example.org |
| SITE_BDII_HOST | The site BDII for the monitored site, =SITE_NAME=. | site-bdii.example.org |
| BDII_HOST | A top level BDII that you can use. | lcg-bdii.cern.ch |
| VOS | A list of VOs who can view the nagios information. | "ops dteam alice" |
| VO_<VONAME>_VOMS_SERVERS | URI for the VOMS service. | !vomss://voms.cern.ch:8443/voms/ops?/ops/ |
| NAGIOS_ADMIN_DNS | Comma separated list of local admin DNs that can perform actions via the nagios web interface. | "/DC=ch/OU=Users/CN=Dr Kildare,/DC=ch/OU=User/CN=Dr Who" |
| MYSQL_ADMIN | Root password of MySQL. Unset by default. If set, MySQL will be configured on the localhost to support NDOUtils. If not set, MySQL and the NDOUtils schema must be loaded by hand outside of YAIM. The easiest option is to set this. | a string of your choice |
| NAGIOS_NSCA_PASS | The shared secret used by NSCA for sending results back to the nagios server. | tomato |
| ATP_DB_PASS | The mysql password for the ATP database. | lemon |
| ATP_DB_NAME | The database name for the ATP database. | atp |
| MS_DB_PASS | The mysql password for the Metric Store database. | lemon |
| MS_DB_NAME | The database name for the Metric Store database. | metricstore |
| MDDB_DB_PASS | The mysql password for the MDDB database. | lemon |
| MDDB_DB_NAME | The database name for the MDDB database. | mddb |
| MYEGEE_DB_TYPE | Type of the DB (Pdo_Mysql/Oracle). | Pdo_Mysql |
| MYEGEE_DB_USER | The database username for the MyEGEE portal. | lemon |
| MYEGEE_DB_PASS | The database password for the MyEGEE portal. | lemon |
| MYEGEE_DB_SCHEMA | The database schema of the MyEGEE portal. | atp |
| MYEGEE_DB_ATP | The ATP database name of the MyEGEE portal. | atp |
| MYEGEE_DB_MS | The Metric Store database name of the MyEGEE portal. | metricstore |
| MYEGEE_DB_MDDB | The Metric Description database name of the MyEGEE portal. | mddb |
| MYEGEE_DB_HOST | The MySQL hostname of the MyEGEE portal. It is only needed when used with MySQL. | localhost |
| MYEGI_ADMIN_NAME | System administrator name (optional). | David Horat |
| MYEGI_ADMIN_EMAIL | System administrator email (optional). | example@example.com |
| MYEGI_DEFAULT_PROFILE | Default nagios profile (default: ROC_CRITICAL). | ROC_CRITICAL |
| MYEGI_DATABASE_ENGINE | Type of DB (values: mysql/oracle) (default: mysql). | mysql |
| MYEGI_DATABASE_USER | The database username for the MyEGI portal (default: myegi). | myegi |
| MYEGI_DATABASE_NAME | The database name to use by the MyEGI portal (default: mrs). | mrs |
| MYEGI_DATABASE_PASSWORD | The database password for the MyEGI portal (mandatory). | lemon |
| MYEGI_DATABASE_HOST | The database hostname (default: localhost). | localhost |
| MYEGI_DATABASE_PORT | The database port (optional). | 3306 |
| MYEGI_DEBUG | Turn on/off debug mode for MyEGI (values: True/False) (default: False). | False |
| ROC_NAME | Your ROC name; use GOCDB names. Only compulsory if you are a ROC. Cannot contain multiple values. In case of multiple ROCs, remove ROC_NAME and use NCG_GOCDB_ROC_NAME. | no default |
| ATP_WEB_SECRET_KEY | A secret key string based on a UUID for the ATP web front-end. | no default |
| NCG_MDDB_SUPPORTED_PROFILES | List of supported profiles. Needed by MRS to know the profiles for status recalculations. | ROC,ROC_CRITICAL, ROC_OPERATORS |

The following variables have defaults but you may well want to change them.

| *Variable* | *Default* | *Description* |
| NCG_INCLUDE_EMPTY_HOSTS | 1 | Show hosts without services associated. |
| NCG_ENABLE_NOTIFICATIONS (TODO) | 0 | If set to true, nagios will be configured to send notifications. |
| NCG_NAGIOS_ADMIN (TODO) | root@localhost | Email address which will receive notifications for Nagios internal checks (e.g. GridProxy-Get, GridProxy-Valid, MyProxy-ProxyLifetime, org.egee.SendToMsg, etc). |
| NAGIOS_MYPROXY_USER | nagios | Changes the myproxy username, i.e. the =-l= option to myproxy-init and myproxy-logon. |
| MSG_BROKER_CACHE_NETWORK | PROD | Sets the broker service to look for in the information system. |
| MSG_BROKER_CACHE_HOST | null | The hostname of a broker to hard code. Setting this disables the MSG_BROKER_CACHE_NETWORK variable and auto-discovery of the broker. |
| NAGIOS_HTTPD_ENABLE_CONFIG | false | Set true to update the apache configuration for X509 authentication. Will overwrite =/etc/httpd/conf.d/nagios.conf= and =ssl.conf=. If you don't do this you will have to configure apache by hand for X509 certificate authentication. |
| NAGIOS_NCG_ENABLE_CRON (TODO) | false | Set true for YAIM to enable the ncg cronjob that reruns =ncg.pl= every 3 hours. |
| NAGIOS_NCG_ENABLE_CONFIG | false | Set true for YAIM to write =/etc/ncg/ncg.conf= and execute =ncg.pl= for you. |
| NAGIOS_SUDO_ENABLE_CONFIG | false | Set true for YAIM to append to =/etc/sudoers= to allow nagios to call certain probes as root. |
| NAGIOS_NAGIOS_ENABLE_CONFIG | false | Set true for YAIM to write =/etc/nagios/nagios.cfg= for you and reload NAGIOS. |
| NAGIOS_CGI_ENABLE_CONFIG | false | Set true for YAIM to write =/etc/nagios/cgi.cfg= for you. |
| NCG_LDAP_FILTER | "" | If set, your NAGIOS will not monitor the SITE_NAME specified above but will instead query the top BDII for !GlueSite objects that match this filter. E.g. to monitor all sites maintained under the Italian ROC, set this value to =GlueSiteOtherInfo=EGEE_ROC=ITALY= |
| NCG_GOCDB_ROC_NAME | "" | Set this to a GOCDB ROC name, e.g. CERN, to collect the list of sites within that ROC from GOCDB. In case of multiple ROCs, set a space separated list. |
| NCG_GOCDB_COUNTRY_NAME | "" | Set this to a GOCDB country name to collect the list of sites within that country from GOCDB. |
| NCG_TOPOLOGY_USE_SAM | false | If true, uses SAM for getting services. |
| NCG_TOPOLOGY_USE_GOCDB | true | If true, uses GOCDB for getting services. |
| NCG_TOPOLOGY_USE_ENOC | true | If true, uses ENOC for getting services. |
| NCG_TOPOLOGY_USE_LDAP | true | If true, uses LDAP for getting sitenames. |
| NCG_REMOTE_USE_SAM | true | If true, show SAM remote results in Nagios. |
| NCG_REMOTE_USE_NAGIOS | false | If true, show project or ROC remote results in Nagios. |
| NCG_REMOTE_USE_ENOC | true | If true, show ENOC (DownCollector) remote results in Nagios. |
| NAGIOS_ROLE | site | One of =Site=, =ROC=, =Project=, =VO=. Denotes whether the nagios is acting in a site, ROC, project or VO level monitoring role. |
| NCG_PROBES_TYPE | all | Defines which type of probes should be configured: =remote=, =local= or =all=. Local probes are probes executed by the Nagios. Remote probes are probes imported from external systems (e.g. SAM, remote Nagios, ENOC Downcollector). |
| NCG_VO | dteam | List of VOs the tests should run as. A space separated list, e.g. "dteam cms lhcb". You must have a member of each VO willing to store a proxy for your retrieval. |
| GGUS_SERVER_FQDN | null | The hostname of the GGUS endpoint. Setting this also opens GGUS tickets for service notifications. |
| ATP_WEB_DB_USER | no default | User name for the ATP database. |
| ATP_WEB_DB_PASS | lemon | Password for the ATP database. |
| ATP_WEB_DB_NAME | no default | Name of the ATP database. |
| ATP_WEB_DB_ENGINE | oracle | Database engine for the ATP database. |
| ATP_WEB_DEBUG | false | Debug flag for the ATP web front-end. |
| ATP_WEB_TEMPLATE_DEBUG | false | Template debug flag for the ATP web front-end. |
| ATP_WEB_VIEW_TEST | false | Functional-test flag for the ATP web front-end. |
| ATP_WEB_INTERNAL_IPS | 127.0.0.1 | Internal IP setting for the ATP web front-end. |
| ATP_WEB_SERVER_EMAIL | root@localhost | Server email address for ATP web. |
| ATP_WEB_EMAIL_HOST | localhost | Email host for the ATP web front-end. |
| ATP_WEB_PREFIX | localhost | Server prefix for the ATP web front-end. |

The following variables are optional and have no default.

| *Variable* | *Default* | *Description* |
| VO_<VONAME>_NCG_VO_FQAN | no default | A comma separated list of VOMS FQANs to run metrics as for a given VO. You must have a member of each VO with the appropriate FQAN willing to store a proxy for your retrieval. |

---+++ Defining a Set of Sites

Note that setting more than one of =NCG_LDAP_FILTER=, =NCG_GOCDB_ROC_NAME= or =NCG_GOCDB_COUNTRY_NAME= will give the union of the sites.

---+++ Co-Existing with Existing Nagios and/or Apache

The =NAGIOS_*_ENABLE_CONFIG= variables above are _false_ by default but can be set to _true_, resulting in more configuration being done for you. Setting all of them to true is the easiest option but may clobber any existing work. When merging with an existing configuration of NAGIOS or Apache you may wish to leave some or all of them as false.

| *Variable* | *Files that will be Edited or Replaced by YAIM* | *Actions* |
| NAGIOS_HTTPD_ENABLE_CONFIG | =/etc/httpd/conf.d/nagios.conf= & =ssl.conf= | apache will be reloaded |
| NAGIOS_NCG_ENABLE_CONFIG | =/etc/ncg/ncg.conf= | =ncg.pl= will be executed, creating _/etc/nagios/wlcg.d/*_ and _/etc/nagios/nrpe/*_. |
| NAGIOS_NAGIOS_ENABLE_CONFIG | =/etc/nagios/nagios.cfg= | nagios will be reloaded |
| NAGIOS_CGI_ENABLE_CONFIG | =/etc/nagios/cgi.cfg= | No actions |

The following configuration files are edited by YAIM even if all the above are set to false. If this is a problem for your existing installations, please let us know.
| *Will be altered with every run of YAIM* |
| _/etc/nagios/ndo2db.cfg_ |
| _/etc/voms2htpasswd.conf_ |
| _/etc/nagios-proxy-refresh.conf_ |
| _/etc/httpd/conf.d/nrpe.conf_ |
| _/etc/broker-cache-file.conf_ |
| _/etc/sysconfig/msg-to-queue_ |
| _/etc/nagios/nsca.cfg_ |
| _/etc/nagios/send_nsca.cfg_ |
| _/etc/ncg/ncg-localdb.d/yaim-nagios-to-msg-queue.conf_ |

The following variables have defaults, and can be changed if you have a non-standard installation of nagios or httpd.

| *Variable* | *Default* | *Description* |
| NCG_MAIN_DB_FILE | /etc/ncg/ncg.localdb | Location of your local configurations for NCG. |
| NCG_TEMPLATES_DIR | /usr/share/grid-monitoring/config-gen/nagios | Location of the NCG configuration templates. |
| NCG_OUTPUT_DIR | /etc/nagios/wlcg.d | Where the nagios configuration files for the server will be generated. |
| NCG_NRPE_OUTPUT_DIR | /etc/nagios/nrpe/ | Where the NRPE configuration files will be generated. |
| NAGIOS_HTPASSWD_FILE | /etc/nagios/htpasswd.users | Location of the list of users allowed to access the nagios web portal. |
| NAGIOS_DB_HOST (removed) | localhost | Hostname of MySQL for NDOUtils; localhost makes everything very easy. |
| NAGIOS_DB_USER (removed) | ndouser | MySQL user name for NDOUtils. |
| NAGIOS_DB_PASS (removed) | ndopassword | Can be left alone, especially if MySQL is on localhost, though you may wish to change it. |

---++ Configuration

To configure, you just need to run YAIM. There are two deployment options to consider, both described under Installing packages above: all on one box, or on two boxes.

Once this has completed successfully, you should be able to browse the Nagios web portal at =https://SERVER_NAME/nagios/=.

You may find that your web server hangs connections because of BUG:48458. If this is the case, installing the _dummy-ca-certs_ package and restarting apache should resolve it.

---+++ Security

If you are using _local_ probes, these require a valid proxy certificate, which is obtained from a !MyProxy service. Allow the host running the local probes, the NRPE node, to retrieve a valid proxy.
That is, the !MyProxy server should have a YAIM configuration of at least:

<verbatim>
PX_HOST=myproxy.example.ch
GRID_TRUSTED_RETRIEVERS="'/DC=ch/DC=cern/OU=computers/CN=nrpe-ui.example.ch'"
GRID_AUTHORIZED_RETRIEVERS="'/DC=ch/DC=cern/OU=computers/CN=nrpe-ui.example.ch'"
SITE_EMAIL="mybigsite@example.ch"
</verbatim>

Finally, from a UI you must upload a proxy to !MyProxy that can be retrieved by the NRPE node.

<verbatim>
$ myproxy-init -c 336 -k NagiosRetrieve-nrpe-ui.example.ch-dteam -s myproxy.example.org \
      -l nagios -x -Z "/DC=ch/OU=computers/CN=nrpe-ui.example.ch"
</verbatim>

---++ Information System

Nagios now publishes important information about itself into the information system. Please add the GRIS running on the Nagios node to your site BDII.
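
---++ Example site-info.def Sketch

The variables described under YAIM's site-info.def File can be pulled together into a small file. The following is only a sketch, not a tested configuration. It covers a subset of the listed variables and assumes an all-on-one-box install, so =NCG_NRPE_UI= is left unset. Every hostname, DN and password is a placeholder copied from the example columns of the tables in this document. The name =VO_OPS_VOMS_SERVERS= assumes that =<VONAME>= is replaced by the upper-case VO name, and setting the four =*_ENABLE_CONFIG= variables to true assumes there is no existing Nagios or Apache configuration to preserve.

<verbatim>
# /root/site-info.def -- illustrative placeholder values only, replace with your own
INSTALL_ROOT=/opt
SITE_NAME=MY-SITE
NAGIOS_HOST=nagios.example.org
PX_HOST=myproxy.example.org
SITE_BDII_HOST=site-bdii.example.org
BDII_HOST=lcg-bdii.cern.ch

# VOs allowed to view the nagios information, and the VO the tests run as
VOS="ops dteam alice"
NCG_VO="dteam"
# Assumption: <VONAME> is written in upper case
VO_OPS_VOMS_SERVERS="vomss://voms.cern.ch:8443/voms/ops?/ops/"

NAGIOS_ADMIN_DNS="/DC=ch/OU=Users/CN=Dr Kildare,/DC=ch/OU=User/CN=Dr Who"

# Setting MYSQL_ADMIN lets YAIM configure MySQL on localhost for NDOUtils
MYSQL_ADMIN=lemon
NAGIOS_NSCA_PASS=tomato

# Let YAIM write the Apache, NCG, Nagios and CGI configuration.
# This may overwrite existing files; leave as false to merge by hand.
NAGIOS_HTTPD_ENABLE_CONFIG=true
NAGIOS_NCG_ENABLE_CONFIG=true
NAGIOS_NAGIOS_ENABLE_CONFIG=true
NAGIOS_CGI_ENABLE_CONFIG=true
</verbatim>

With a file like this in place, the all-on-one-box command given earlier, =/opt/glite/yaim/bin/yaim -s /root/site-info.def -c -n glite-UI -n glite-NAGIOS=, would pick it up. The database and portal variables (=ATP_DB_*=, =MS_DB_*=, =MDDB_DB_*=, =MYEGEE_*=, =MYEGI_*=) are omitted here for brevity and still need to be set as described in the tables.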