LCG Grid Deployment - CERN ROC

Service Availability Monitoring for the ROC

Overview

SAM tests are currently submitted from CERN to the Certified and Uncertified sites of the production and PPS grids in the CERN ROC. Results are published in the production database and shown through the following displays.

Client configuration details

The SAM client for the CERN ROC is installed as root on lxb2086.cern.ch, in the directory /opt/lcg/same/client/.

The UI used for test submission is the glite-UI 3.1 installed on the same machine, which runs SLC4. Its configuration is:

  • The default top bdii: roc-bdii.cern.ch
  • The default glite WMS: roc-wms.cern.ch
  • The default myproxy: px02.lip.pt
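
On a gLite UI, defaults like these are usually visible to the client tools through environment variables. The check below is a minimal sketch under stated assumptions: it assumes the top BDII is exported as LCG_GFAL_INFOSYS in host:port form (port 2170 is assumed here) and the MyProxy server as MYPROXY_SERVER. The WMS endpoint lives in the WMS client configuration files and is not covered. The helper name check_ui_default is hypothetical.

```shell
# Sketch only: compare one environment variable with the value expected
# for the CERN ROC UI. check_ui_default is a hypothetical helper.
check_ui_default() {
    local name="$1" expected="$2" actual
    # Portable indirect lookup: read the variable whose name is in $name.
    eval "actual=\${$name:-}"
    if [ "$actual" = "$expected" ]; then
        echo "OK       $name=$actual"
    else
        echo "MISMATCH $name='$actual' (expected '$expected')"
        return 1
    fi
}

# Assumed variable names and BDII port; adjust to the local UI setup.
check_ui_default LCG_GFAL_INFOSYS roc-bdii.cern.ch:2170 || true
check_ui_default MYPROXY_SERVER   px02.lip.pt           || true
```

A MISMATCH line only means that the variable differs from the value listed above; it does not by itself prove that the UI is misconfigured.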

The client configuration files we customized for CERN ROC are:

  • /opt/lcg/same/client/etc/same-dteam.conf , where in particular we changed the value of common_filter to ismonitored=y only (the previous value is kept as a comment), shown below

> cat /opt/lcg/same/client/etc/same-dteam.conf
# Default configuration for SAME
[DEFAULT]
# Settings for locations
workdir=%(home)s/.same
logdir=%(same_home)s/var/log
resdir=%(same_home)s/var/results
secresdir=%(same_home)s/var/results-secure
webdir=%(same_home)s/web
cachedir=%(same_home)s/var/cache
master_vo=dteam

# Logging levels:
# CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET
# Logging level for the log file
loglevel=INFO
# Logging level for console messages
verbosity=CRITICAL

[sensors]
common_attrs="sitename nodename inmaintenance"
#common_filter="type=Production status=Certified ismonitored=y"
common_filter="ismonitored=y"
CE_filter="serviceabbr=CE"
gCE_filter="serviceabbr=gCE"
FTS_filter="serviceabbr=FTS"
FTS_attrs="sitename nodename inmaintenance tier"
SE_filter="serviceabbr=SE"
SRM_filter="serviceabbr=SRM"
LFC_filter="serviceabbr=LFC voname=%(master_vo)s"
host-cert_attrs="nodename serviceabbr"
host-cert_filter="serviceabbr=FTS,gCE,LFC,VOMS,CE,SRM,gRB,MyProxy,RB,SE,RGMA"
VOBOX_filter="serviceabbr=VOBOX voname=%(master_vo)s"

[statuscode]
ok=10
info=20
notice=30
warning=40
error=50
critical=60
maintenance=100

[submission]
vo=%(master_vo)s
test_timeout=600

[scheduler]
max_processes=10
default_timeout=1800
shell=/bin/sh

[webservices]
#publisher_wsdl=%(same_home)s/etc/publisher.wsdl
#query_wsdl=%(same_home)s/etc/query.wsdl
publisher_wsdl=http://lcg-sam.cern.ch:8080/same-ws/services/WebArchiver?wsdl
query_wsdl=http://lcg-sam.cern.ch:8080/same-ws/services/Database?wsdl

  • /opt/lcg/same/client/sensors/common/config.sh , shown below

[malanxin@lxb2086 ~]$ more /opt/lcg/same/client/sensors/common/config.sh
SAME_PREF_SE_LIST="$SAME_HOME/sensors/common/prefSE.lst"
SAME_GOOD_SE_FILTER="serviceabbr=SE type=Production status=Certified servicestatus=ok servicestatusvo=$SAME_VO"
#SAME_GOOD_SE_FILTER="serviceabbr=SE type=Production status=Certified tier=0,1 servicestatus=ok servicestatusvo=$SAME_VO"

SAME_PREF_LFC_LIST="$SAME_HOME/sensors/common/prefLFC.lst"
SAME_GOOD_LFC_FILTER="serviceabbr=LFC type=Production status=Certified servicestatus=ok servicestatusvo=$SAME_VO"
#SAME_GOOD_LFC_FILTER="serviceabbr=LFC type=Production status=Certified tier=0,1 servicestatus=ok servicestatusvo=$SAME_VO"
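
The effect of relaxing common_filter in same-dteam.conf can be illustrated with a small matcher. This is a sketch, not the SAM client's implementation. It assumes that a filter is a space-separated list of key=value terms that must all hold, and that a comma-separated value (as in tier=0,1) means any one of the listed values. The function name matches_filter and the sample site attributes are hypothetical.

```shell
# matches_filter FILTER ATTRS
# Both arguments are space-separated key=value lists. Returns 0 when every
# filter term is satisfied by ATTRS (assumed semantics, see the text above).
matches_filter() {
    local filter="$1" attrs="$2" term key wanted alt ok
    # $filter is split on whitespace on purpose: one term per word.
    for term in $filter; do
        key=${term%%=*}
        wanted=${term#*=}
        ok=no
        # Try each comma-separated alternative for this key.
        for alt in $(printf '%s' "$wanted" | tr ',' ' '); do
            case " $attrs " in
                *" $key=$alt "*) ok=yes ;;
            esac
        done
        [ "$ok" = yes ] || return 1
    done
    return 0
}

# Hypothetical attributes of a monitored, uncertified PPS site.
site="type=PPS status=Uncertified ismonitored=y"

old_filter="type=Production status=Certified ismonitored=y"
new_filter="ismonitored=y"

matches_filter "$old_filter" "$site" && echo "old: selected" || echo "old: skipped"
matches_filter "$new_filter" "$site" && echo "new: selected" || echo "new: skipped"
```

Under these assumptions the sample site is skipped by the original filter and selected by the customised one, which is the reason for the change: Uncertified and PPS sites stay in the test set as long as they are flagged as monitored.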

Sensor configuration details

The overall sensor configuration is kept in the SAM client directory:

/opt/lcg/same/client

The submission framework for all sites in the CERN ROC has been customised to use a particular SE.

| File | Function | Notes |
| ./sensors/CE/config.sh | Commands and filters to be used by the CE sensor | Compared with the "standard" CE sensor configuration, the edg-* commands have been replaced by the glite-* ones in order to use the gLite WMS, and a particular SE, which belongs to both the production and PPS grids, has been taken as the reference SE |
| ./sensors/gCE/config.sh | Commands and filters to be used by the gCE sensor | Compared with the "standard" gCE sensor configuration, a particular SE, which belongs to both the production and PPS grids, has been taken as the reference SE |
| ./cron/same-cron-dteam.sh | Submits all the supported SAM tests, in sequence, to all sites | Suitable for use in cron jobs. It produces detailed logs of all the operations performed. Currently supports the LFC, SE, SRM, FTS, CE, gCE and host-cert sensors |
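
Because the CE and gCE sensors call the glite-wms-* commands instead of the edg-* ones, it can be useful to confirm that these commands are on the PATH of the submitting account before relying on the cron job. The following is a minimal sketch: missing_commands is a hypothetical helper, and the command names are the ones that appear in the two config.sh files on this page.

```shell
# Print every command name given as an argument that is not found on PATH.
# Returns non-zero if at least one command is missing.
missing_commands() {
    local cmd rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return $rc
}

missing=$(missing_commands glite-wms-job-submit glite-wms-job-status \
    glite-wms-job-output glite-wms-job-logging-info glite-wms-job-cancel) || true
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All gLite WMS commands found"
fi
```

The check only looks at PATH; it does not verify that the commands can reach the WMS.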

  • /opt/lcg/same/client/sensors/CE/config.sh , shown below
cat /opt/lcg/same/client/sensors/CE/config.sh
JOB_OUTPUT_CMD="glite-wms-job-output"
JOB_STATUS_CMD="glite-wms-job-status"
JOB_SUBMIT_CMD="glite-wms-job-submit -a --vo $SAME_VO"
JOB_LOGGING_INFO_CMD="glite-wms-job-logging-info -v 2"
JOB_CANCEL_CMD="glite-wms-job-cancel --noint"
JOB_SENSOR_NAME="testjob"
JOB_TEST_NAME="CE-sft-job"
CE_PREFSE_FILE="$SAME_SENSOR_HOME/prefSE.lst"
CE_CENTRALSE_FILTER="nodename=grid007g.cnaf.infn.it"

  • /opt/lcg/same/client/sensors/gCE/config.sh , shown below
cat /opt/lcg/same/client/sensors/gCE/config.sh
JOB_OUTPUT_CMD="glite-wms-job-output"
JOB_STATUS_CMD="glite-wms-job-status"
JOB_SUBMIT_CMD="glite-wms-job-submit -a --vo $SAME_VO"
JOB_CANCEL_CMD="glite-wms-job-cancel --noint"
JOB_LOGGING_INFO_CMD="glite-wms-job-logging-info -v 2"
JOB_SENSOR_NAME="glite-testjob"
JOB_TEST_NAME="gCE-sft-job"
CE_PREFSE_FILE="$SAME_SENSOR_HOME/prefSE.lst"
CE_CENTRALSE_FILTER="nodename=grid007g.cnaf.infn.it"

  • /opt/lcg/same/client/cron/same-cron-dteam.sh , shown below
[malanxin@lxb2086 ~]$ cat /opt/lcg/same/client/cron/same-cron-dteam.sh
#!/bin/bash --login
log=$HOME/same-dteam/log/same-cron-dteam.log
#myproxy=myproxy-fts.cern.ch
publish_sensors="CE gCE FTS"
sensors="LFC SE SRM FTS CE gCE host-cert"

echo "" >> $log
echo "" >> $log
echo -n "----------------------[ " >> $log
date >> $log
#myproxy-get-delegation -s $myproxy -d -S < $HOME/.myproxy_passphrase >> $log 2>&1
#voms-proxy-init -voms dteam -noregen >> $log 2>&1
voms-proxy-init -voms dteam -cert $HOME/.globusdteam/usercert.pem -key $HOME/.globusdteam/userkey.pem -pwstdin < $HOME/.cert_passphrase >> $log 2>&1

for sensor in $publish_sensors ; do
    echo "" >> $log
    echo "----------" >> $log
    echo "Publishing $sensor sensor: " >> $log
    echo "" >> $log
    /opt/lcg/same/client/bin/same-exec -c /opt/lcg/same/client/etc/same-dteam.conf --publish $sensor regionname=CERN >> $log 2>&1
done

for sensor in $sensors ; do
    echo "" >> $log
    echo "----------" >> $log
    echo "Executing $sensor sensor: " >> $log
    echo "" >> $log
    /opt/lcg/same/client/bin/same-exec -c /opt/lcg/same/client/etc/same-dteam.conf $sensor regionname=CERN >> $log 2>&1
done
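
The order of operations in the script above is easier to check when the same-exec calls are printed instead of executed. The sketch below reproduces the two loops with a dry-run switch. SAME_EXEC, SAME_CONF, DRY_RUN and run_sensors are names introduced here for illustration and are not part of the SAM client; only the same-exec invocations already shown in the script are used.

```shell
# Sketch of the two loops in same-cron-dteam.sh with a dry-run mode.
SAME_EXEC=${SAME_EXEC:-/opt/lcg/same/client/bin/same-exec}
SAME_CONF=${SAME_CONF:-/opt/lcg/same/client/etc/same-dteam.conf}
DRY_RUN=${DRY_RUN:-1}     # 1 = only print the commands (default in this sketch)

# run_sensors EXTRA_OPTION SENSOR...
# EXTRA_OPTION is "--publish" for the publishing pass, "" for the execution pass.
run_sensors() {
    local option="$1" sensor
    shift
    for sensor in "$@"; do
        # $option is left unquoted on purpose so that an empty value vanishes.
        if [ "$DRY_RUN" = 1 ]; then
            echo "$SAME_EXEC" -c "$SAME_CONF" $option "$sensor" regionname=CERN
        else
            "$SAME_EXEC" -c "$SAME_CONF" $option "$sensor" regionname=CERN
        fi
    done
}

run_sensors --publish CE gCE FTS
run_sensors "" LFC SE SRM FTS CE gCE host-cert
```

In dry-run mode this prints ten command lines: three publishing calls followed by seven execution calls, in the same order as the cron script. Proxy creation and logging are deliberately left out of the sketch.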

Use and Operation

The "official" user reference for SAM is

http://goc.grid.sinica.edu.tw/gocwiki/Service_Availability_Monitoring, maintained by Piotr Nyczyk.

The regular submission of SAM tests to all sites in the CERN ROC is scheduled by a cron job in Lanxin's crontab:

0 */2 * * * /opt/lcg/same/client/cron/same-cron-dteam.sh

In this case the tests are run in sequence and applied to all sites in the CERN ROC where Monitoring=Y (the ismonitored=y filter in same-dteam.conf).
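
The schedule 0 */2 * * * fires at minute 0 of every second hour, that is, twelve times a day. The loop below simply enumerates those start times as a check of the cron expression; it does not query cron itself, and cron_start_times is a name chosen for this illustration.

```shell
# Enumerate the start times implied by "0 */2 * * *":
# minute 0 of each hour in the range 0-23 that is divisible by 2.
cron_start_times() {
    local hour=0
    while [ "$hour" -le 23 ]; do
        printf '%02d:00\n' "$hour"
        hour=$((hour + 2))
    done
}

cron_start_times
```

The first run of the day starts at 00:00 and the last at 22:00.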

-- Main.malanxin - 17 Jul 2007


Topic revision: r1 - 2007-07-17 - LanxinMA