
Ganga/DIANE Monitoring

The Ganga/DIANE monitoring dashboard runs on port 80 at http://gangamon.cern.ch

Instructions for users

How to configure Ganga to send monitoring messages?

Add the following to .gangarc:

[MonitoringServices]
Executable/* = Ganga.Lib.MonitoringServices.MSGMS.MSGMS

[MSGMS]
server = gridmsg101.cern.ch
port = 6163

See also: /afs/cern.ch/sw/arda/install/su3/2009/config-lostman.gear
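To double-check which MSG server a configuration points to, the settings above can be parsed with a few lines of Python. This is only a sketch: it uses the Python 3 standard-library configparser, which is not Ganga's own configuration reader, so a real .gangarc may contain syntax it does not accept. The helper name msg_endpoint is made up for this example.

```python
import configparser

# The settings shown above, as they would appear in .gangarc.
GANGARC_SNIPPET = """
[MonitoringServices]
Executable/* = Ganga.Lib.MonitoringServices.MSGMS.MSGMS

[MSGMS]
server = gridmsg101.cern.ch
port = 6163
"""


def msg_endpoint(text):
    """Return (server, port) from the [MSGMS] section of gangarc-style text.

    Hypothetical helper: Ganga itself does not provide this function.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("MSGMS", "server"), parser.getint("MSGMS", "port")


print(msg_endpoint(GANGARC_SNIPPET))
```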

How to configure DIANE to send monitoring messages?

Messages are sent automatically. Monitoring may be disabled in the run() function of the run file: config.MSGMonitoring.MSG_MONITORING_ENABLED = False
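In context, the setting might look like the sketch below. The run(input, config) signature is an assumption about the run file layout, and the SimpleNamespace object is only a stand-in so that the snippet runs outside DIANE. In a real run file DIANE supplies the config object.

```python
from types import SimpleNamespace


def run(input, config):
    # Assumed run-file entry point; only the next line comes from this page.
    config.MSGMonitoring.MSG_MONITORING_ENABLED = False


# Stand-in for the config object that DIANE would pass in (illustration only).
config = SimpleNamespace(
    MSGMonitoring=SimpleNamespace(MSG_MONITORING_ENABLED=True)
)
run(None, config)
print(config.MSGMonitoring.MSG_MONITORING_ENABLED)
```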

Production service operations

The code is developed in SVN: http://svnweb.cern.ch/world/wsvn/ganga/trunk/external/dashb

The code is deployed in: /data/django/dashboard

MySQL DB access: configuration /data/django/dashboard/server/monitoringsite/settings.py

We use gangage as a service account.

How to deploy a new version of the service?

Make sure that the developers have already tagged the code in SVN as dashboard-X-Y.

Log in as gangage@gangamon

For convenience define:

export DASHTAG=dashboard-X-Y

Export the SVN tag:

cd /data/django
svn export svn+ssh://svn.cern.ch/reps/ganga/tags/packages/$DASHTAG

Fix configuration strings and write-protect the release area:

cd /data/django/service
python configure-access.py $DASHTAG
chmod -R a-w /data/django/$DASHTAG

Switch the production code:

cd /data/django
rm -f dashboard && ln -s $DASHTAG dashboard

Restart the service.
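The steps above (excluding the restart) can be collected into a dry-run helper that only prints the commands for a given tag, so they can be reviewed before being typed on gangamon. This is a sketch, not a tool that exists in /data/django/service. The function name deploy_commands and the tag dashboard-1-0 are hypothetical.

```python
def deploy_commands(tag):
    """Return the shell commands from the procedure above for one SVN tag.

    Nothing is executed; the caller decides what to do with the strings.
    """
    if not tag.startswith("dashboard-"):
        raise ValueError("expected a tag of the form dashboard-X-Y: %r" % tag)
    return [
        # Export the SVN tag.
        "cd /data/django && svn export "
        "svn+ssh://svn.cern.ch/reps/ganga/tags/packages/%s" % tag,
        # Fix configuration strings.
        "cd /data/django/service && python configure-access.py %s" % tag,
        # Write-protect the release area.
        "chmod -R a-w /data/django/%s" % tag,
        # Switch the production code.
        "cd /data/django && rm -f dashboard && ln -s %s dashboard" % tag,
    ]


for command in deploy_commands("dashboard-1-0"):  # hypothetical tag
    print(command)
```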

How to restart the service?

Log in as gangage@gangamon

The collector is restarted automatically via a crontab which is defined in Quattor.

All service related commands are in /data/django/service.

To disable the collector for maintenance (e.g. a code upgrade), do this:

cd /data/django/service
./switch_to_maintanance
......
./switch_to_production
./status_gm_service
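If the maintenance window is driven from a script, one way to make sure the collector is always switched back is a context manager around the same three commands. This is a sketch under assumptions: the scripts are taken to need no arguments, as in the listing above, and switching back to production even when the upgrade step fails is a design choice that may not suit every case. The demonstration uses a recording function instead of really running the scripts.

```python
import subprocess
from contextlib import contextmanager

SERVICE_DIR = "/data/django/service"


@contextmanager
def maintenance(run=subprocess.check_call):
    """Switch to maintenance, yield, then always switch back and show status."""
    run(["./switch_to_maintanance"], cwd=SERVICE_DIR)
    try:
        yield
    finally:
        run(["./switch_to_production"], cwd=SERVICE_DIR)
        run(["./status_gm_service"], cwd=SERVICE_DIR)


# Demonstration with a stand-in runner that only records the command names.
calls = []


def record(cmd, cwd):
    calls.append(cmd[0])


with maintenance(run=record):
    calls.append("upgrade")  # the maintenance work would go here

print(calls)
```

On gangamon itself the default runner would be used, i.e. `with maintenance(): ...`.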

To check manually whether the collector is running:

ps aux | grep runcollector
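The same check can be scripted. The sketch below filters ps output for runcollector and drops the grep process itself, which the plain pipeline above also lists. The sample ps lines are invented for illustration, and collector_running() is defined but not called, since it only gives a meaningful answer on the production host.

```python
import subprocess


def collector_lines(ps_output):
    """Return the ps lines that mention runcollector, excluding grep itself."""
    return [
        line
        for line in ps_output.splitlines()
        if "runcollector" in line and "grep" not in line
    ]


def collector_running():
    """Run `ps aux` and report whether a runcollector process is listed."""
    result = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    )
    return bool(collector_lines(result.stdout))


# Invented sample output, for illustration only.
sample = """\
gangage   4321  0.5  1.2 123456 7890 ?     S  10:00 0:42 python manage.py runcollector
gangage   4400  0.0  0.0   6100  700 pts/0 S+ 10:05 0:00 grep runcollector
"""
print(collector_lines(sample))
```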

Restart apache (as root):

sudo /etc/init.d/httpd restart

Automatic check that the service is available (and alarm)

Not implemented yet.
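As a starting point only, an availability probe could be as small as the sketch below: it requests the dashboard URL and reports whether the answer was HTTP 200. This is one possible shape, not the service's actual check, and the alarm part is left out entirely. The opener parameter is there so that the function can be tried without network access. The two stand-in openers are illustrative.

```python
from urllib.request import urlopen


def is_available(url="http://gangamon.cern.ch", opener=urlopen, timeout=10):
    """Return True if the URL answers with HTTP status 200."""
    try:
        response = opener(url, timeout=timeout)
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
    try:
        return response.status == 200
    finally:
        response.close()


# Stand-ins that simulate a healthy and an unreachable server.
class FakeResponse:
    status = 200

    def close(self):
        pass


def fake_up(url, timeout):
    return FakeResponse()


def fake_down(url, timeout):
    raise OSError("connection refused")


print(is_available(opener=fake_up), is_available(opener=fake_down))
```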

Backup

TSM backup is currently being set up.

Old backups went to: voatlas30:/data/gangamon/gangamon

Disk space management - database logs

You may need to purge the database binary logs from time to time:

mysql> PURGE BINARY LOGS BEFORE '2010-10-1 22:46:26';
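To avoid typing the cutoff date by hand, the statement can be generated from a retention period. This is a sketch: the 30-day retention is an arbitrary example, not a policy of this service, and the helper only builds the SQL text without connecting to the database.

```python
from datetime import datetime, timedelta


def purge_statement(keep_days, now=None):
    """Build a PURGE BINARY LOGS statement that keeps the last keep_days days."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=keep_days)
    return "PURGE BINARY LOGS BEFORE '%s';" % cutoff.strftime("%Y-%m-%d %H:%M:%S")


# Fixed "now" so that the example output is reproducible.
print(purge_statement(30, now=datetime(2010, 10, 31, 22, 46, 26)))
```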

Production server configuration

The software is managed by Quattor (see the list of exceptions below).

Lemon monitoring page: http://lemonweb.cern.ch/lemon-status/info.php?entity=voatlas65&type=host

Relevant Quattor templates:

Installation notes for voatlas65 (local changes done by hand)

Local changes to voatlas65:

  • added /etc/httpd/conf.d/dashboard.conf
  • simplejson egg easy_installed to /usr/lib/python2.4/site-packages
  • created symlink: ln -s /usr/lib/python2.4/site-packages/django /usr/lib64/python2.5/site-packages/django
  • created symlink: ln -s /usr/lib/python2.4/site-packages/yaml /usr/lib64/python2.5/site-packages/yaml
  • disabled mod_python by renaming the file: /etc/httpd/conf.d/python.conf__
  • gangausage app: added pygooglechart in /data/django/external; it is referred to by the /data/django/dashboard/server/django.wsgi file

MSG Information

Destinations consumed by runcollector

Currently there is no protection for the production queues in the MSG server. NEVER run the collector in production mode outside the scope of the production service.

See runcollector.py source for up-to-date destinations.

As of 2009-11-04:

Server: gridmsg101.cern.ch
Port: 6163
Ganga destinations: /queue/ganga.status, /topic/ganga.status
DIANE destinations: /queue/diane.journal, /queue/diane.status

TODO:

  • Restrict read access to the queues to the official runcollector.

Web access to the message queues

You need a valid Grid certificate to access these pages (tested with Firefox).

Production server: https://gridmsg101.cern.ch/admin/queues.jsp

Development server: https://gridmsg001.cern.ch/admin/queues.jsp

Creating development environment

It is recommended that the development environment and production server are as close as possible, to avoid compatibility issues when moving the code into production.

Whenever possible, use the same versions as installed on the gangamon server. Check the installed packages in production.

What do we need?

  • Apache2 webserver (with mod_wsgi installed)
  • MySQL database
  • Python 2.5+
  • SVN
  • Django

Ubuntu

Install Environment

Install Apache, MySQL, Python and SVN:

sudo apt-get install apache2 mysql-server python subversion

Install mod_wsgi for apache

sudo apt-get install libapache2-mod-wsgi

Install MySQL support for Python

sudo apt-get install python-mysqldb

Install Django:

sudo apt-get install python-django python-django-doc

If you want the latest development version, see also: http://docs.djangoproject.com/en/1.1/intro/install

Setup MySQL database

Create the MySQL database (gangamon) and a user with the proper privileges (replace myuser and mypasswd with real values):

mysql -u root
mysql> CREATE DATABASE gangamon CHARACTER SET utf8;
mysql> CREATE USER 'myuser'@'localhost' IDENTIFIED BY 'mypasswd';
mysql> GRANT ALL PRIVILEGES ON gangamon.* TO 'myuser'@'localhost';

Here is more information on this:

Install Ganga/Diane Dashboard application

In production the application is installed in /data/django. We assume the same location for the development environment, because setting up the application requires root access anyway to modify the Apache configuration files. Using the same location also simplifies the transition between development and production environments.

Create working copy of Ganga/Diane Dashboard application:

mkdir -p /data/django
cd /data/django
svn co svn+ssh://svn.cern.ch/reps/ganga/trunk/external/dashb dashboard

Update apache configuration:

sudo cp /data/django/dashboard/server/dashboard.conf /etc/apache2/conf.d
sudo service apache2 restart

Note: on Scientific Linux the apache conf path is: /etc/httpd/conf.d

Update settings.py file with DB connection information:

cd /data/django/dashboard/server/monitoringsite
cp settings.py.TEMPLATE settings.py
# edit settings.py and update DATABASE_USER, DATABASE_PASSWORD and SECRET_KEY

NEVER commit the settings.py file to SVN (it contains sensitive information).
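For orientation, the relevant part of settings.py might look like the fragment below. It uses the flat DATABASE_* setting names of Django 1.1 and earlier. Only DATABASE_USER, DATABASE_PASSWORD and SECRET_KEY are named on this page. The engine and database name lines are assumptions based on the MySQL setup above, and settings.py.TEMPLATE may already contain them. All values are placeholders.

```python
# Fragment of settings.py (Django 1.1-style database settings).
DATABASE_ENGINE = "mysql"       # assumption: MySQL backend, as set up above
DATABASE_NAME = "gangamon"      # assumption: database created above
DATABASE_USER = "myuser"        # placeholder, replace with the real user
DATABASE_PASSWORD = "mypasswd"  # placeholder, replace with the real password

# Placeholder only: use a long random string that is unique to your
# installation.
SECRET_KEY = "replace-with-a-long-random-string"
```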

Rename the file settings.js-example (located in client/media/scripts/) to settings.js

Initialize django databases:

cd /data/django/dashboard/server/monitoringsite
python manage.py syncdb

You should get this output:

Creating table auth_permission
[...]

You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (Leave blank to use 'myuser'): 
E-mail address: myuser@some.mail
Password: 
Password (again): 
Superuser created successfully.
Installing index for auth.Permission model
[...]

That's it! Try http://localhost/monitoring

If you want to test with a web browser running on a machine other than localhost, one more fix is needed: the JavaScript web application must be told at which URL Django is serving data. Edit the /data/django/dashboard/client/media/scripts/settings.js file and replace localhost with your fully qualified server name.
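The replacement can also be scripted. The sketch below shows only the text substitution. The sample line is invented, since the real contents of settings.js are not reproduced on this page, and myhost.example.org is a placeholder for your server name.

```python
def point_at_server(js_text, server_name):
    """Replace every occurrence of localhost with the given server name."""
    return js_text.replace("localhost", server_name)


# Invented sample line; the real settings.js may look different.
sample = 'var SERVER = "http://localhost/monitoring";'
print(point_at_server(sample, "myhost.example.org"))
```

To apply it, read settings.js, pass its text through point_at_server, and write the result back. socket.getfqdn() from the standard library is one way to obtain the fully qualified name of the local machine.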

You may also run a collector in test mode (using test queues).

Development of Task Monitoring applications

Code base for taskmonitor ("TRUNK") is developed here: http://svnweb.cern.ch/world/wsvn/ganga/trunk/external/dashb/taskmonitor

DIANE task monitoring is a branch of taskmonitor and it is kept here: http://svnweb.cern.ch/world/wsvn/ganga/trunk/external/dashb/dianetaskmonitor

The branch should be kept up to date with the trunk by merging: svn merge svn+ssh://svn.cern.ch/reps/ganga/trunk/external/dashb/taskmonitor

The following are added and are DIANE-specific: client/media/scripts/settings.js

The following are svn copies:

  • client/media/css is a copy of client/media/css_example (and may be merged within the branch separately if needed)
  • index.html

Similarly for Ganga (TODO).

Send test messages to your development service instance

Run collector in the test mode

By default the collector runs in the test mode. So simply run: python manage.py runcollector

It will use the development server gridmsg001.cern.ch instead of the production server.

The MSG destinations used by the collector are the same as on the production server.

Configure Ganga to send messages in test mode

Now, suppose that you would like to test the gangausage or monitoring messages. You may force Ganga to send messages to the development MSG server like this:

ganga -o[MSGMS]server=gridmsg001.cern.ch

This works as of release 5.5.14 (previously the usage destination was hardcoded).

Configure DIANE to send messages in test mode

This works with code later than the 2.1 release.

Set the DIANE_MSG_TEST environment variable and all messages will be published to the development server.
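When DIANE is started from a Python wrapper, the variable can be set in the child environment as sketched below. The value "1" is an assumption: this page only says that the variable must be set, not what it should contain. The helper name diane_test_env is hypothetical.

```python
import os


def diane_test_env(base=None):
    """Return a copy of the environment with DIANE_MSG_TEST set."""
    env = dict(os.environ if base is None else base)
    env["DIANE_MSG_TEST"] = "1"  # assumed value; the page does not specify one
    return env


print(diane_test_env({})["DIANE_MSG_TEST"])
```

The returned dictionary can be passed as the env argument of subprocess.call when launching DIANE.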

Notes

Other ways to restart Apache

/etc/init.d/apache2 restart
Or
/etc/init.d/httpd restart

-- JakubMoscicki - 2009-09-04

Topic revision: r34 - 2010-11-09 - JakubMoscicki
 