The Siteview collector

The Siteview collector is developed within the Dashboard framework.

How to stop and start the collector

The collector is stopped and started with general Dashboard commands. To see the list of installed collectors:
$dashb-agent-list
SERVICE GROUP                      STATUS     SERVICES
airplane.demo.collector            STOPPED    'random.plane.generator',
ElisaTest.demo.collector           STOPPED    'Elisa.Test',
SiteViewText.siteview.collector    STOPPED    'SiteView.Text',
SiteViewAllVos.siteview.collector  STARTED    'SiteView.AllVos',
To stop and start one collector:
$dashb-agent-stop SiteViewAllVos.siteview.collector
STOPPED
$dashb-agent-start SiteViewAllVos.siteview.collector
.STARTED
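
The commands above can also be driven from a script, for instance to check a collector's status before restarting it. The following is a minimal Python sketch, not part of Dashboard: the helper names `parse_agent_list` and `restart_collector` are hypothetical, the parser assumes the column layout shown in the listing above, and it is an assumption that the stop and start commands exit with a non-zero status on failure.

```python
import subprocess


def parse_agent_list(output):
    """Map each service group name to its status, given the text
    printed by dashb-agent-list (layout assumed as in the listing above)."""
    statuses = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row ("SERVICE GROUP STATUS SERVICES").
        if len(fields) < 2 or fields[0] == "SERVICE":
            continue
        statuses[fields[0]] = fields[1]
    return statuses


def restart_collector(name, run=subprocess.run):
    """Stop, then start, one collector with the commands shown above.

    check=True raises CalledProcessError if a command exits non-zero;
    whether the Dashboard commands do so on failure is an assumption.
    """
    run(["dashb-agent-stop", name], check=True)
    run(["dashb-agent-start", name], check=True)
```

Applied to the listing above, `parse_agent_list(text)["SiteViewAllVos.siteview.collector"]` evaluates to `"STARTED"`. The `run` parameter exists only so that the restart logic can be exercised without the Dashboard commands installed.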

The collector logfile

The Dashboard logfiles are in: /opt/dashboard/var/log/. The Siteview collector logfile is: /opt/dashboard/var/log/dashboard.SiteViewAllVos.siteview.collector
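
A quick way to see whether the collector is still active is to look at how recently its logfile was modified. The sketch below assumes that the collector writes to this logfile regularly during normal operation, which is not stated here; the helper name `is_stale` and any threshold passed to it are illustrative choices.

```python
import os
import time

# Logfile path as given above.
LOGFILE = "/opt/dashboard/var/log/dashboard.SiteViewAllVos.siteview.collector"


def is_stale(path, max_age_seconds, now=None):
    """Return True if the file at `path` was last modified more than
    `max_age_seconds` seconds before `now` (default: the current time)."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) > max_age_seconds
```

For example, `is_stale(LOGFILE, 1800)` reports whether the logfile has been untouched for more than half an hour; a suitable threshold depends on how often the collector runs.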

How to configure the logfile

The connection with the database

The collector connects to the database with a writer account. Parameters to connect to the database are set in the configuration file: /opt/dashboard/etc/dashboard-service-config/SiteViewAllVos.siteview.collector/etc/dashboard-dao/dashboard-dao.cfg

If this file does not exist, the file ~/.dashboard/etc/dashboard-dao.cfg is used instead. For example, for the development instance we set:

user        = LCG_SITE_MONITORING_DEV
password    = [password]
pool_size   = 5
connect_string = (DESCRIPTION =      (ADDRESS = (PROTOCOL = TCP)(HOST = int6r1-v.cern.ch)(PORT = 10121))
     (ADDRESS = (PROTOCOL = TCP)(HOST = int6r2-v.cern.ch)(PORT = 10121))
     (LOAD_BALANCE = yes)
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = lcg_dev_db.cern.ch)
       (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 200)(DELAY = 15))
     )
   )
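
When debugging a connection problem it helps to confirm which of the two files is actually in use and what it contains. The following Python sketch makes two assumptions that are not confirmed here: that the file consists of plain `key = value` lines, and that indented lines continue the previous value, as in the excerpt above. The helper names `pick_config` and `parse_settings` are hypothetical and are not part of Dashboard.

```python
import os

# The two locations named above, in lookup order.
SERVICE_CFG = ("/opt/dashboard/etc/dashboard-service-config/"
               "SiteViewAllVos.siteview.collector/etc/dashboard-dao/dashboard-dao.cfg")
USER_CFG = os.path.expanduser("~/.dashboard/etc/dashboard-dao.cfg")


def pick_config(candidates=(SERVICE_CFG, USER_CFG), exists=os.path.exists):
    """Return the first candidate path that exists, or None if none does."""
    for path in candidates:
        if exists(path):
            return path
    return None


def parse_settings(text):
    """Parse 'key = value' lines; an indented line continues the previous value."""
    settings = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and key is not None:
            # Continuation line, e.g. the later lines of connect_string.
            settings[key] += " " + line.strip()
            continue
        # Split on the first '=' only: connect_string values contain '=' too.
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a key/value line; ignored in this sketch
        key = name.strip()
        settings[key] = value.strip()
    return settings
```

With the excerpt above as input, `parse_settings` yields the keys `user`, `password`, `pool_size` and `connect_string`, with the multi-line connect string joined into one value. Avoid printing the `password` entry when inspecting the result.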

Typical problems

A typical scenario: the GridMap displays metrics that have not been updated recently. The first thing to do is to check the data in the database. Most probably the data there have not been updated recently either, because the collector is stuck and is not inserting new data into the database.

When the collector is stuck, it is most probably for one of these reasons:

  • It tried to connect to an unavailable URL. Connections to URLs should be wrapped in 'try' blocks; check that they are.
  • It tried to connect to the database and the connection failed. This could be because the DB password has expired, because the DB is locked by another connection, or because of a network problem.

In any case, the first thing to do is to stop and start the collector. If the collector starts and then gets stuck again, a deeper examination of the logfile is needed.
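
The advice about 'try' blocks can be illustrated as follows. This is a minimal sketch in Python 3 and is not taken from the collector's code: the function name `fetch_or_none` and the 30-second default timeout are assumptions, and the collector itself may use a different HTTP library. The point is that an unreachable URL is logged and skipped, and that an explicit timeout keeps a hanging connection from blocking the process indefinitely.

```python
import logging
import urllib.error
import urllib.request

log = logging.getLogger("siteview.example")  # hypothetical logger name


def fetch_or_none(url, timeout=30, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the URL cannot be read.

    Failures are logged instead of being allowed to stop the caller.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # URLError: unreachable host or HTTP error; OSError: socket-level
        # failures including timeouts; ValueError: malformed URL.
        log.warning("could not read %s: %s", url, exc)
        return None
```

A caller can then skip a source whose result is `None` and move on to the next one, so that one unavailable URL does not stall the whole collection cycle.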

-- ElisaLanciotti - 10 Jun 2009

Topic revision: r4 - 2009-06-11 - ElisaLanciotti
 