
Help page of the Site Status Board

Information for the end user

Basic concepts

The Site Status Board (SSB) is a monitoring tool that describes the status of sites. The status is based on metrics. A metric is anything that can be measured: number of jobs, number of successful transfers, status of services, declaration of downtimes, etc. The metrics are fetched regularly, and they assign values to all the sites that are being monitored.

A metric entry can carry several attributes. Only the first two are required:

  • Metric: the metric that is being gathered
  • Site: name of the site
  • Validity: defined by 'start time' and 'end time'
  • Color: red if there are problems, yellow for a warning, green if everything is OK
  • Label: Free style text
  • Value: Number for th
  • URL : pointer to more information
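As an illustration only, the attributes listed above could be modelled as a small record. The class and field names below are hypothetical and are not part of the SSB; the numeric type of the value is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MetricEntry:
    """One value of one metric for one site (hypothetical names, for illustration)."""

    metric: str                            # required: the metric being gathered
    site: str                              # required: name of the site
    start_time: Optional[datetime] = None  # validity: start time
    end_time: Optional[datetime] = None    # validity: end time
    color: Optional[str] = None            # e.g. "red", "yellow" or "green"
    label: Optional[str] = None            # free-style text
    value: Optional[float] = None          # numeric value (type is an assumption)
    url: Optional[str] = None              # pointer to more information


# Only the first two attributes have to be given.
entry = MetricEntry(metric="Running", site="LCG.CERN.ch", color="green", value=400)
print(entry.metric, entry.site, entry.color)
```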

Metrics can be combined into views. A view is a set of metrics that apply to the same sites. For instance, there could be a view called 'Job Monitoring' that has metrics like the 'Number of running jobs', 'Number of successful jobs', 'Status of the CE', etc. Usually, the SSB administrators define the views, and the users just see the information.

An example view, TESTCOLLECTOR: view.png

Fig.1 An example of a view

Its metrics are: AverageJobNumber, CondorAveJobRun24h, CondorPledges, Running, TopologyMaintenence

Navigation tools

There is a menu at the top where you can find links to the help, to the bug report system, and to instructions on how to authenticate to the application. You will need to authenticate if you want to change any of the values.

A non-authorized user (who cannot make any changes) sees the following menu at the top of the view (Fig.2): view_noauth.png

Fig. 2 Menu for non-authorized users.

After authorization, additional items are added to the menu (Fig.3): view_auth.png

Fig. 3 Menu for authorized users (administrators of the view).

From any page, clicking on 'Index' (Fig.4) takes you to the home page (Fig.4.1), and clicking on 'Expanded table' (Fig.5) shows more details about the view. On both pages you can select the view with the dropdown menu (Fig.6).

view_auth_index.png Fig. 4 Click on Index to return to the home page of the view

The home page (Fig.4.1) presents all the sites, with an icon that represents the status of the site in that particular view. To calculate the status of the site, the SSB takes all the critical metrics of the view (how to set the critical metrics for a view is explained in Fig. 4.2). The icon next to a site name means:

  • A green tick: the site is working fine
  • A yellow tick: at least one of the critical metrics is in warning
  • A red dot: at least one of the critical metrics is failing
  • An “at work” symbol: at least one of the critical metrics reports the maintenance color (brown)
  • A “lens” symbol: the site is under investigation (i.e. a ticket has been submitted in the Savannah ticketing system to follow up the problem)

view_home.png Fig. 4.1 Home page
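The rule above can be sketched in a few lines of Python. This is only an illustration, not the SSB code: the function names are hypothetical, and the order in which the colors take precedence (brown, then red, then yellow) is an assumption, since this page does not say which icon wins when several critical metrics disagree. Showing the lens together with the status icon is likewise an assumption.

```python
# Assumed precedence when critical metrics disagree (not stated on this page).
PRECEDENCE = ["brown", "red", "yellow"]

ICONS = {
    "green": "green tick",
    "yellow": "yellow tick",
    "red": "red dot",
    "brown": "at work",
}


def site_status(critical_colors):
    """Combine the colors of the critical metrics of a view into one site color."""
    for color in PRECEDENCE:
        if color in critical_colors:
            return color
    return "green"


def site_icons(critical_colors, ticket_url=None):
    """Return the icons for a site; the lens is added when a ticket URL is attached."""
    icons = [ICONS[site_status(critical_colors)]]
    if ticket_url:
        icons.append("lens")
    return icons


print(site_icons(["green", "yellow", "green"]))
print(site_icons(["green", "red"], ticket_url="http://example.org/ticket/1"))
```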

In the home page, if you click on:

  • A site: you will get all the metrics for the site
  • An icon next to the site: you will get the history of the site for the critical metrics
  • The magnifying glass (next to some sites): you will see the ticket associated with the problem
    • A ticket (actually any URL) can be associated with the problem, meaning that it is under investigation. The lens is removed automatically when the site becomes green again
    • HowToSetTheLensInSSB

define-critical-metrics.png Fig. 4.2 Defining critical metrics for a view

  • From the menu for authorized users, select Metric -> List
  • You are taken to a page with the defined metrics. From the view select list, choose the view for which you want to define critical metrics
  • You are taken to the list of metrics defined for the selected view, where you can set which metrics are critical for this view

view_auth_exptable.png Fig. 5 Click on Expanded table to get more details of the view

view_auth_choose.png Fig. 6 Choose the view

The Expanded table shows all the metrics and the sites for a particular view. You can filter, sort, and save the information. Clicking on the header of any of the columns leads to a page with more details about that particular metric: plots and tables with the evolution of the metric over time, quality measurement plots, etc.

Information for the metric providers

Adding new metrics:

The SSB offers the possibility to define new metrics. The procedure to do this is quite straightforward.

  • First of all, you will need write access to the SSB. If you click on 'Help -> List of administrators', you can see the people who are authorized to define new metrics. If you are not listed, ask one of the existing administrators to add you, and send them the DN subject of your certificate (you can find it in your ~/.globus/usercert.pem; it looks like /C=XX/O=XXXX/....)
Once you are registered and open https://dashb-ssb-dev.cern.ch, http://dashb-ssb.cern.ch or one of the other two instances (for ALICE and LHCb), you should see the menu shown in Fig. 3. If you still see the menu of Fig. 2, something is wrong with your authentication. Please inform the administrators.

  • Next, create a text file that will contain the values that you want to insert in the SSB. Each line of the file is one row, with tab-separated entries. The format is the following:
<timestamp> <sitename> <value> <color> <URL>
Lines starting with a # are treated as comments and ignored. For instance:

[root@dashboard06 /tmp]# cat siteviewtext.13
#R.Santinelli: Dashboard feeder
#Space token share pledged
2009-12-14 13:30:45 LCG.CERN.ch 400 green http://sls.cern.ch/sls/history.php?id=CERN-LHCb_MC_M-DST
2009-12-14 13:30:45 LCG.CNAF.it 30 green http://sls.cern.ch/sls/history.php?id=CNAF-LHCb_MC_M-DST
2009-12-14 13:30:45 LCG.GRIDKA.de 40 green http://sls.cern.ch/sls/history.php?id=GRIDKA-LHCb_MC_M-DST
2009-12-14 13:30:45 LCG.NIKHEF.nl 65 green http://sls.cern.ch/sls/history.php?id=SARA-LHCb_MC_M-DST
2009-12-14 13:30:45 LCG.PIC.es 15 green http://sls.cern.ch/sls/history.php?id=PIC-LHCb_MC_M-DST
2009-12-14 13:30:45 LCG.IN2P3.fr 65 green http://sls.cern.ch/sls/history.php?id=IN2P3-LHCb_MC_M-DST
2009-12-14 13:30:45 LCG.RAL.uk 40 green http://sls.cern.ch/sls/history.php?id=RAL-LHCb_MC_M-DST

This file should be accessible through a web page (let's assume http://myhost/mymetricsource.txt).
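A file in this format could be produced with a short script such as the sketch below. The function names are hypothetical, the two rows are copied from the example above, and the script writes to an in-memory buffer; in practice you would write to the file that your web server publishes.

```python
import io
from datetime import datetime


def format_row(timestamp, site, value, color, url):
    """Return one feed line: timestamp, site, value, color and URL joined by tabs."""
    stamp = timestamp.strftime("%Y-%m-%d %H:%M:%S")
    return "\t".join([stamp, site, str(value), color, url])


def write_feed(handle, rows, comments=()):
    """Write the comment lines (prefixed with '#') and then the data rows."""
    for comment in comments:
        handle.write(f"# {comment}\n")
    for row in rows:
        handle.write(format_row(*row) + "\n")


stamp = datetime(2009, 12, 14, 13, 30, 45)
rows = [
    (stamp, "LCG.CERN.ch", 400, "green",
     "http://sls.cern.ch/sls/history.php?id=CERN-LHCb_MC_M-DST"),
    (stamp, "LCG.CNAF.it", 30, "green",
     "http://sls.cern.ch/sls/history.php?id=CNAF-LHCb_MC_M-DST"),
]

buffer = io.StringIO()  # stands in for the published file
write_feed(buffer, rows, comments=["Space token share pledged"])
print(buffer.getvalue(), end="")
```

Note that the timestamp itself contains a space, which is why the entries have to be separated by tabs and not by spaces.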

Make sure that all the columns are in the proper order. If the column (metric) that you added to a view is not being updated, please check the format of mymetricsource.txt.
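Before publishing the file, a quick local check of the format can save some debugging. The sketch below is not part of the SSB: the function name is hypothetical, the set of accepted colors is an assumption based on the colors mentioned on this page, and skipping blank lines is also an assumption.

```python
from datetime import datetime

# Assumption: the colors named on this page (brown is the maintenance color).
KNOWN_COLORS = {"red", "yellow", "green", "brown"}


def check_line(line):
    """Return a list of problems found in one feed line (an empty list means OK)."""
    if not line.strip() or line.startswith("#"):
        return []  # comments are ignored; blank lines are skipped (assumption)
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        return [f"expected 5 tab-separated fields, found {len(fields)}"]
    timestamp, site, value, color, url = fields
    problems = []
    try:
        datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        problems.append(f"bad timestamp: {timestamp!r}")
    if color not in KNOWN_COLORS:
        problems.append(f"unknown color: {color!r}")
    return problems


good = "2009-12-14 13:30:45\tLCG.PIC.es\t15\tgreen\thttp://sls.cern.ch/sls/history.php?id=PIC-LHCb_MC_M-DST"
bad = good.replace("\t", " ")  # a common mistake: spaces instead of tabs
print(check_line(good))
print(check_line(bad))
```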

  • Define the metric in the SSB. Log in to the SSB. At the top left of the page, there should be a menu. Click on 'Metric' (Fig.7), and then on 'new...' (Fig.8). If you are one of the authorized contributors for that SSB, you will be taken to a form where you can describe the new metric (Fig.9).

view_auth_metric.png Fig.7 Click on metric

metric.png Fig.8 Click on new

metric_new.png Fig.9 Describe the new metric. Do not forget to click on Add column

The most important fields are the name of the column and the source (which should point to the file that you created in the previous step, http://myhost/mymetricsource.txt). If you use a text file, the frequency and validity can be adjusted later by modifying the metric (see the next section). If you choose a collector, the frequency and validity can be set when the metric is created.

* Note: A frequency of 600 means that the column is updated every 10 minutes. The validity should be twice the frequency, i.e. 1200.*
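The arithmetic of the note can be written down as a tiny helper. The function name is hypothetical, and the unit (seconds) is inferred from the statement that 600 corresponds to 10 minutes.

```python
def recommended_validity(frequency_seconds):
    """Validity suggested by the note above: twice the update frequency."""
    return 2 * frequency_seconds


frequency = 600  # seconds, i.e. one update every 10 minutes
validity = recommended_validity(frequency)
print(f"frequency={frequency}s ({frequency // 60} min), "
      f"validity={validity}s ({validity // 60} min)")
```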

Modifying existing metrics

If you want to change the information of any metric (like the name of the metric, source, refresh rate, etc.):

Click on Metric and then on List (Fig.10).

metric_modif.png Fig.10 Click on List

Browse the list and choose the metric that you need. Click on Modify (Fig.10.1).

metric_modif2.png Fig.10.1 Click on Modify

After modifying the data, click on save (Fig.11).

metric_modif3.png Fig.11 Click on save

Creating virtual metric

A virtual metric is a combination of two or more metrics. The SSB will fetch the source metrics, and it will store the combined status of all the metrics. To define a new 'virtual metric', you have to log in to the application. Then, click on 'Metric -> New virtual metric', and fill in the form.

Creating views

Creating a new view is also easy. Again, you will need write access to the application. Once you have it, follow these steps:

  • Log in to the application (http://dashb-ssb.cern.ch or one of the other two instances, for ALICE and LHCb)
  • Click on 'View -> List'

view_list.png Fig.12 Click on View -> List

  • Fill in the form at the bottom with the name of the view.

view_add.png Fig.13 Fill in the form and click on Create view

You can also uncheck the 'visible' checkbox, to make sure that, for the time being, you are the only one who can see the view.

  • After that, you should see the view in the table above, and you can click on 'modify' to change its configuration.

view_modify.png Fig.14 Find your newly created view in the list and click on modify

  • Each editable value has a box where you can type the new value, and each row has its own save button.

  • In the middle of the page there is a selector split into two boxes (Fig.15). The right box shows the list of existing metrics. Choose a metric and push "+". The name of the metric will be moved to the left box (Fig.16). Choose as many metrics as you need. Do not forget to push save at the bottom of the page (Fig.17).

view_modify1.png Fig.15 Find the needed metric in the list and click on "+"

view_modify2.png Fig.16 The metric is moved to the left box. If you want to remove it, push "-"

view_modify3.png Fig.17 Push save at the bottom of the page

Notification when a site changes color in particular view

Authorized users can register for notifications when a site changes color (Fig.18).

sitenotifications.png Fig.18 Register for site notifications

  • The 'Register site notifications' link takes you to the page listing the people registered for notifications. You can add, edit or delete only your own entries (you are logged in with your certificate). You can define for which view and for which sites (using an SQL LIKE-style search pattern) you want to be notified. Once the color of a site changes in that view, you will receive an email with a link to the site history for this view (Fig.19).

sitehistory.png

Fig.19 Site history for a view
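To get a feeling for how an SQL LIKE pattern selects sites, you can try patterns locally. The sketch below uses an in-memory SQLite database purely as a stand-in; the SSB itself may evaluate patterns differently (for example regarding upper and lower case). The site names are taken from the feed example earlier on this page, and the function name is hypothetical.

```python
import sqlite3

SITES = [
    "LCG.CERN.ch", "LCG.CNAF.it", "LCG.GRIDKA.de", "LCG.NIKHEF.nl",
    "LCG.PIC.es", "LCG.IN2P3.fr", "LCG.RAL.uk",
]


def matching_sites(pattern, sites=SITES):
    """Return the site names that match an SQL LIKE pattern, sorted by name."""
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute("CREATE TABLE sites (name TEXT)")
        connection.executemany(
            "INSERT INTO sites VALUES (?)", [(site,) for site in sites]
        )
        rows = connection.execute(
            "SELECT name FROM sites WHERE name LIKE ? ORDER BY name", (pattern,)
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


# '%' matches any sequence of characters, '_' matches exactly one character.
print(matching_sites("LCG.C%"))
print(matching_sites("%.de"))
```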

For site readiness calculations, a 'loop of information' approach is used. An SSB admin inserts data into the SSB for a set of metrics, an algorithm then analyzes and combines the data of these metrics into a single value, and that value is fed back into the SSB as a new metric. An SSB admin can then register to receive notifications when a site changes color in a view where the newly created metric is critical.

Information for the SSB administrators

At the moment, there are four instances of the SSB installed:

  • dashb-ssb.cern.ch. It connects to the cms_dashboard database. There are two collectors on that machine, siteviewcms and siteviewcmstext. The first one creates the metrics that we generate ourselves (like SAM, maintenance, job efficiency, etc.), and the second one collects the URLs published by the metric providers. To start/stop these collectors:
ssh -l root dashb-ssb /opt/dashboard/bin/dashb-agent-start siteviewcms
ssh -l root dashb-ssb /opt/dashboard/bin/dashb-agent-start siteviewcmstext
There is a cronjob that checks every hour if the collectors are running.

  • dashb-ssb-devel.cern.ch. At the moment, no collectors are running on this machine. It should be configured to connect to the CMS development database.

  • dashb-alice-ssb (aka dashboard06). ALICE only needs one collector:
ssh -l root dashb-alice-ssb sudo -u dashbop /opt/dashboard/bin/dashb-agent-start siteviewg_alice

  • dashb-lhcb-ssb (aka dashboard06). LHCb has two collectors:
ssh -l root dashb-alice-ssb sudo -u dashbop /opt/dashboard/bin/dashb-agent-start siteviewtext
ssh -l root dashb-alice-ssb sudo -u dashbop /opt/dashboard/bin/dashb-agent-start siteviewgeneric

Metric data modification from UI

Collected metric data can be modified from the UI. There are some requirements for this:

  • The metric must allow data modifications: go to the modify metric form and set this option to either 'with time update' or 'without time update'.
metric_modify.png

  • You must be an SSB admin with Modify Metrics privileges. If you are not, ask dashboard-support.cern.ch.

  • You must be logged in to the SSB.

When all three criteria are met, you can reach the default plot of the metric in two ways:

  • From the Expanded table, click on a column header.
  • Log in to the SSB, go to Login -> Metric -> List and click the "View history" button of the desired metric.

Once you are on the default plot page of the metric, right-click on a cell of the plot. A popup window will appear from which you can modify the data.

modify_metric_data.png

How to find out which plugin updates a given metric

On the server, go to the following directory:

cd /opt/dashboard/var/log/siteviewgeneric_input/done

Then list the files whose names end with the metric id (ll *.metricId). For example, ll *.35 gives:

Sep 21 16:13 1316614405.ATLASGENERIC_Virtual.35

Sep 22 11:17 1316683051.ATLASGENERIC_Virtual.35

This shows that the Virtual plugin is responsible for metric 35.
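The same lookup can be scripted. The sketch below assumes that the file names follow the pattern <timestamp>.<collector>_<plugin>.<metricId>, which is inferred from the two names shown above and not from any SSB documentation; the function names are hypothetical.

```python
def parse_done_file(filename):
    """Split '<timestamp>.<collector>_<plugin>.<metricId>' into its four parts."""
    timestamp, rest = filename.split(".", 1)
    source, metric_id = rest.rsplit(".", 1)
    collector, _, plugin = source.rpartition("_")
    return int(timestamp), collector, plugin, int(metric_id)


def plugins_for_metric(filenames, metric_id):
    """Return the set of plugin names seen in the file names for one metric id."""
    plugins = set()
    for name in filenames:
        _, _, plugin, found_id = parse_done_file(name)
        if found_id == metric_id:
            plugins.add(plugin)
    return plugins


# File names as they appear in the listing above.
listing = [
    "1316614405.ATLASGENERIC_Virtual.35",
    "1316683051.ATLASGENERIC_Virtual.35",
]
print(plugins_for_metric(listing, 35))  # {'Virtual'}
```

On the server, the list of names could come from os.listdir() on the directory mentioned above.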

Profiles

General help page on Dashboard

-- PabloSaiz - 12 Nov 2008

Topic revision: r19 - 2014-10-14 - PabloSaiz
 