---++ Grid Services Monitoring Working Group

---++ Displaying Aggregated Service Metrics in Ganglia

The Nagios-based monitoring prototype is distributed with a "publisher" .cgi script that calculates aggregate service metrics and publishes them through the web interface as XML. While other Nagios-Ganglia integration methods are possible, this document describes a simple integration of this XML publisher interface that allows aggregated metrics to be displayed in graphical form in the [[http://ganglia.sourceforge.net/][Ganglia]] monitoring system Cluster view, as shown in the screen clip below.

   * ganglia_shot.PNG: <br /> <img src="%ATTACHURLPATH%/ganglia_shot.PNG" alt="ganglia_shot.PNG" width='835' height='659' />

   1. Download the script =wlcg2ganglia.pl= from the repository [[https://www.sysadmin.hep.ac.uk/svn/grid-monitoring/trunk/fm/ganglia/tools/src/wlcg2ganglia.pl][here]] and install it in =/opt/lcg/sbin/=
   1. Install a cron job (e.g. on the Ganglia server) to fetch the aggregate metrics published on the Nagios server and push them into Ganglia. *Note:* Ganglia 3.0.4 or above is required because the gmetric host 'spoofing' feature is used. For example, create =/etc/cron.d/wlcg2ganglia= with the following contents (the entry must be on a single line, since cron does not support backslash line continuation):
<pre>
PATH=/sbin:/bin:/usr/sbin:/usr/bin
* * * * * nobody /opt/lcg/sbin/wlcg2ganglia.pl -s CERN_PPS -u https://pps-monitoring.cern.ch/cgi-bin/publisher.cgi >> /dev/null 2>&1
</pre>
   1. Patch two Ganglia .php files in =/var/www/html/ganglia= using the patch data below. When completed, three additional summary graphs for local and remote (SAM & NPM) metrics should be displayed in the Ganglia Cluster view.
<pre>
*** graph.php 2007-07-30 17:10:16.000000000 +0200
--- graph.php_iann 2007-07-30 17:06:00.000000000 +0200
***************
*** 210,217 ****
--- 210,241 ----
             ."DEF:'bytes_out'='${rrd_dir}/pkts_out.rrd':'sum':AVERAGE "
             ."LINE2:'bytes_in'#$mem_cached_color:'In' "
             ."LINE2:'bytes_out'#$mem_used_color:'Out' ";
      }
+ #
+ # Handle aggregate WLCG metrics
+ # (eg. sam, local and npm, but this code does not care
+ # and looks for org.wlcg.aggregate-status.xxxx_report)
+ #
+     else if (strncmp($graph,"org.wlcg.aggregate-status.",26) == 0 )
+     {
+         # Construct the name of the rrd file from the graph name
+         $metric = substr($graph,0,strlen($graph)-7);
+         # Chop the graph name up to make a useful title string
+         $style = substr($metric,4,strlen($metric)-4);
+         $style = substr($metric,4,strlen($metric)-4);
+
+         $upper_limit = "--upper-limit 10";
+         $lower_limit = "--lower-limit 0";
+
+         $vertical_label = "--vertical-label Services ";
+
+         $series ="DEF:'ok'='${rrd_dir}/${metric}.rrd':'sum':AVERAGE "
+             ."DEF:'total'='${rrd_dir}/${metric}.rrd':'num':AVERAGE "
+             ."CDEF:'crit'=total,ok,- "
+             ."AREA:'ok'#00FF00:'OK' "
+             ."STACK:'crit'#FF0000:'CRITICAL' ";
+     }
      else
      {
          /* Got a strange value for $graph */
          exit();
*** cluster_view.php 2007-07-30 17:18:54.000000000 +0200
--- cluster_view.php_iann 2007-07-30 17:18:14.000000000 +0200
***************
*** 36,44 ****
  #
  $graph_args = "c=$cluster_url&$get_metric_string&st=$cluster[LOCALTIME]";
  $tpl->assign("graph_args", $graph_args);
  if (!isset($optional_graphs))
!   $optional_graphs = array();
  foreach ($optional_graphs as $g) {
    $tpl->newBlock('optional_graphs');
    $tpl->assign('name',$g);
    $tpl->assign('graph_args',$graph_args);
--- 36,46 ----
  #
  $graph_args = "c=$cluster_url&$get_metric_string&st=$cluster[LOCALTIME]";
  $tpl->assign("graph_args", $graph_args);
  if (!isset($optional_graphs))
!   $optional_graphs = array("org.wlcg.aggregate-status.sam",
!                            "org.wlcg.aggregate-status.local",
!                            "org.wlcg.aggregate-status.npm");
  foreach ($optional_graphs as $g) {
    $tpl->newBlock('optional_graphs');
    $tpl->assign('name',$g);
    $tpl->assign('graph_args',$graph_args);
</pre>

-- Main.IanNeilson - 31 Jul 2007
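The naming and arithmetic used by the =graph.php= patch can be sketched in shell, which may help when checking a setup by hand. This is only an illustration under stated assumptions, not the contents of =wlcg2ganglia.pl=: the function names, the example IP address and host name, and the idea that each service is sent as 1 (OK) or 0 (not OK) are all hypothetical. The =gmetric= command is printed, not executed.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the graph.php patch; none of these
# names come from wlcg2ganglia.pl.

# graph.php drops the last 7 characters ("_report") of the graph name
# to obtain the RRD metric name.
metric_from_graph() {
    printf '%s\n' "${1%_report}"
}

# Print (do not run) a gmetric command that publishes one value under a
# spoofed host. Assumption: value 1 means OK, 0 means not OK.
# $1 = metric name, $2 = value, $3 = spoofed IP, $4 = spoofed host name
gmetric_cmd() {
    printf 'gmetric --name=%s --value=%s --type=uint8 --spoof=%s:%s\n' \
        "$1" "$2" "$3" "$4"
}

# Mirror of the CDEF 'crit'=total,ok,- : critical = total - ok.
# $1 = total (the 'num' data source), $2 = ok (the 'sum' data source)
crit_count() {
    echo $(( $1 - $2 ))
}
```

For instance, =metric_from_graph org.wlcg.aggregate-status.sam_report= gives the metric name whose =.rrd= file the patched =graph.php= reads, and =crit_count 10 7= gives the height of the red CRITICAL area stacked on top of 7 OK services.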
Topic revision: r3 - 2007-07-31 - IanNeilson