LCG Grid Deployment - gLite PPS

Service Availability Monitoring in Pre-Production

Overview

The submission of SAM tests to PPS sites is currently done at CERN. Results are published in the production database and shown through the following displays

Client configuration details

The SAM client for PPS is installed on lxb1908.cern.ch in the directory /opt/lcg/same/client/

While the SAM client is installed in root space on lxb1908, the UI used for test submission is not the one available on that machine.

We use instead the AFS UI, whose environment is set up with:

> source /afs/cern.ch/project/gd/egee/glite/ui_PPS/etc/profile.d/grid_env.csh

The client configuration files we customized for PPS are:

  • /opt/lcg/same/client/etc/same.conf, where in particular we changed the values of common_filter, publisher_wsdl and query_wsdl

> cat /opt/lcg/same/client/etc/same.conf
# Default configuration for SAME

[DEFAULT]
# Settings for locations
workdir=%(home)s/.same
logdir=%(same_home)s/var/log
resdir=%(same_home)s/var/results
secresdir=%(same_home)s/var/results-secure
webdir=%(same_home)s/web
cachedir=%(same_home)s/var/cache

# Logging levels:
# CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET
# Logging level for the log file
loglevel=INFO
# Logging level for console messages
verbosity=CRITICAL

[sensors]
common_attrs="sitename nodename inmaintenance"
#common_filter="type=PPS ismonitored=y"
common_filter="ismonitored=y"
CE_filter="serviceabbr=CE"
gCE_filter="serviceabbr=gCE"
FTS_filter="serviceabbr=FTS"
FTS_attrs="sitename nodename inmaintenance tier"
SE_filter="serviceabbr=SE"
SRM_filter="serviceabbr=SRM"
LFC_filter="serviceabbr=LFC voname=ops"
host-cert_attrs="nodename serviceabbr"
host-cert_filter="serviceabbr=FTS,gCE,LFC,VOMS,CE,SRM,gRB,MyProxy,RB,SE,RGMA"

[statuscode]
ok=10
info=20
notice=30
warning=40
error=50
critical=60
maintenance=100

[submission]
vo=ops
test_timeout=300

[scheduler]
max_processes=10
default_timeout=1800
shell=/bin/sh

[webservices]
publisher_wsdl=http://lcg-sam.cern.ch:8080/same-ws/services/WebArchiver?wsdl
query_wsdl=http://lcg-sam.cern.ch:8080/same-ws/services/Database?wsdl
# edit the line below if you want to publish to old-style SFT webservice
sft_publisher_url=http://lxb2070.cern.ch:8083/sft/publishTuple

  • /opt/lcg/same/client/sensors/common/config.sh, shown below
/opt/lcg/same/client/sensors/common
[root@lxb1908 common]# cat config.sh
SAME_PREF_SE_LIST="$SAME_HOME/sensors/common/prefSE.lst"
SAME_GOOD_SE_FILTER="nodename=grid007g.cnaf.infn.it"

SAME_PREF_LFC_LIST="$SAME_HOME/sensors/common/prefLFC.lst"
SAME_GOOD_LFC_FILTER="serviceabbr=LFC type=PPS status=Certified tier=0,1 servicestatus=ok servicestatusvo=$SAME_VO"
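
To check which values a given client installation actually carries, the customised settings can be read back with a short script. The following is only a minimal sketch in Python, working on an excerpt of same.conf copied from the listing above. Interpolation is switched off because `home` and `same_home` are not defined in the file itself and are presumably supplied by the SAM client at run time. The helper names `read_customised` and `parse_filter` are hypothetical and are not part of the SAM client.

```python
import configparser

# Excerpt of /opt/lcg/same/client/etc/same.conf, copied from the listing above.
SAME_CONF_EXCERPT = """
[DEFAULT]
workdir=%(home)s/.same

[sensors]
common_filter="ismonitored=y"
LFC_filter="serviceabbr=LFC voname=ops"

[webservices]
publisher_wsdl=http://lcg-sam.cern.ch:8080/same-ws/services/WebArchiver?wsdl
query_wsdl=http://lcg-sam.cern.ch:8080/same-ws/services/Database?wsdl
"""


def read_customised(text):
    """Return the three settings that were changed for PPS (hypothetical helper)."""
    # interpolation=None: %(home)s is left untouched instead of being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        # Values in [sensors] are quoted in the file, so strip the quotes.
        "common_filter": parser.get("sensors", "common_filter").strip('"'),
        "publisher_wsdl": parser.get("webservices", "publisher_wsdl"),
        "query_wsdl": parser.get("webservices", "query_wsdl"),
    }


def parse_filter(expr):
    """Split a filter such as 'serviceabbr=LFC tier=0,1' into a dict of lists."""
    result = {}
    for token in expr.split():
        key, _, value = token.partition("=")
        result[key] = value.split(",")
    return result


if __name__ == "__main__":
    for name, value in read_customised(SAME_CONF_EXCERPT).items():
        print(name, "=", value)
    # The LFC filter from config.sh, without the $SAME_VO part.
    print(parse_filter("serviceabbr=LFC type=PPS status=Certified tier=0,1"))
```

To inspect the real file, the same parser could be pointed at /opt/lcg/same/client/etc/same.conf with `parser.read(path)` instead of `read_string`.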

Sensor configuration details

The particular sensor configuration for PPS is kept in the AFS directory

/afs/cern.ch/project/gd/egee/sam-pps

The submission framework for PPS has been customised to use different RBs and a different information system, plus a particular SE. Several bash scripts and configuration files are used; all of them are available in /afs/cern.ch/project/gd/egee/sam-pps

| file | function | notes |
| glite_wmsui_cern_pps.conf | Points to the WMS used to submit SAM tests | A gLite WMS is used to submit both to LCG SEs and gLite CEs. Therefore the content of CE-config.sh had to be changed with respect to the distributed version, in order not to use the standard settings of the PPS UI |
| CE-config.sh | Commands and filters to be used by the CE sensor | With respect to the "standard" CE sensor configuration, the glite_wmsui_cern_pps.conf file is used here instead of the default UI settings; edg- commands have been replaced by the glite- ones in order to use the gLite WMS; and a particular SE, which belongs to both the production and PPS grids, has been taken as the reference SE |
| gCE-config.sh | Commands and filters to be used by the gCE sensor | With respect to the "standard" gCE sensor configuration, the glite_wmsui_cern_pps.conf file is used here instead of the default UI settings, and a particular SE, which belongs to both the production and PPS grids, has been taken as the reference SE |
| instance-setenv.sh | Environment settings needed for the submission to PPS | In particular, the "HOME" directory of the PPS installation, the SAM working directory and the UI to use are defined here. The first two have to be changed if a new instance is created from a copy of these scripts |
| instance-setenv.csh | Environment settings needed for the submission to PPS | Same as above. This comes in handy for running manual tests if you use (t)csh |
| apply-sensor-config.sh | Specific environment settings needed by the sensors | So far only the CE and gCE sensors need a variable to be set to make them use, respectively, CE-config.sh and gCE-config.sh instead of the standard sensor configuration shipped with the SAM client |
| create-proxy.sh | Utility to create the 'ops' proxy using the secondary certificate | This has to be changed if you happen not to be Antonio Retico ;-) |
| submit-sam-tests-pps.sh | Submits one-shot SAM tests to all PPS sites | Can be used for all sensors. It takes the sensor code as input |
| publish-submit-sam-tests-pps.sh | Publishes the available results of the previous tests and submits new ones to all PPS sites | Can be used for CE and gCE sensors. It takes the sensor code as input |
| sam-status.sh | Retrieves the status of tests previously submitted to PPS sites | Useful for manual checks. Can be used for CE and gCE sensors. It takes the sensor code as input |
| sam-publish.sh | Publishes tests previously submitted to PPS sites | Useful for manual checks. Can be used for CE and gCE sensors. It takes the sensor code as input |
| one-phase-sam-tests.sh | Submits in sequence all the supported one-shot SAM tests to all PPS sites | Suitable for cronjobs. It produces detailed logs of all the operations done. Currently supports SE, SRM, LFC, host-cert sensors |
| two-phase-sam-tests.sh | Publishes available results and then submits in sequence all the supported SAM tests to all PPS sites | Suitable for cronjobs. It produces detailed logs of all the operations done. Currently supports CE and gCE sensors |
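
The split between the two families of scripts can be summarised as a lookup from sensor code to script. The sketch below is illustrative only: the sensor lists and script names come from the notes column above, while the function names `script_for` and `command_for` are hypothetical, and the idea that a wrapper would pick the per-sensor script in exactly this way is an assumption, not a description of what the cron wrappers actually do.

```python
# Sensors handled by each workflow, as listed in the notes column above.
ONE_PHASE_SENSORS = ("SE", "SRM", "LFC", "host-cert")
TWO_PHASE_SENSORS = ("CE", "gCE")

SCRIPT_DIR = "/afs/cern.ch/project/gd/egee/sam-pps"


def script_for(sensor):
    """Return the per-sensor script suited to this sensor (hypothetical helper)."""
    if sensor in TWO_PHASE_SENSORS:
        # Results arrive later, so publish the old ones before submitting anew.
        return "publish-submit-sam-tests-pps.sh"
    if sensor in ONE_PHASE_SENSORS:
        # Results are available immediately: a plain one-shot submission.
        return "submit-sam-tests-pps.sh"
    raise ValueError("unsupported sensor: %s" % sensor)


def command_for(sensor):
    """Build the argument list: full script path plus the sensor code."""
    return [SCRIPT_DIR + "/" + script_for(sensor), sensor]


if __name__ == "__main__":
    for sensor in ONE_PHASE_SENSORS + TWO_PHASE_SENSORS:
        print(" ".join(command_for(sensor)))
```

Rejecting unknown sensor codes early avoids submitting with a sensor for which no PPS-specific configuration exists.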

Use and Operation

The "official" user reference for SAM is

http://goc.grid.sinica.edu.tw/gocwiki/Service_Availability_Monitoring, maintained by Piotr Nyczyk.

However, when working with the installation described above, it is strongly recommended not to run the same-exec command directly. Because of the several customisations made in PPS with respect to the default settings, doing so with a user environment that is not perfectly set up could cause:

  • results of the tests to be written in wrong directories
  • wrong commands to be run
  • wrong RBs to be used
A practical and quick setup to be sure that nothing is missing is to add an alias to the user's .tcsh file, as follows.
alias sampps "source /afs/cern.ch/project/gd/egee/sam-pps/instance-setenv.csh; \
              cd /afs/cern.ch/project/gd/egee/sam-pps"
and to run the alias each time you start using the PPS instance of SAM.

The regular submission of SAM tests to PPS sites is scheduled by cronjobs run in Antonio's acrontab:

15 * * * * lxb1908.cern.ch /afs/cern.ch/project/gd/egee/sam-pps/two-phase-sam-tests.sh > /afs/cern.ch/project/gd/egee/sam-pps/two-phase-sam-tests.log 2>&1
45 * * * * lxb1908.cern.ch /afs/cern.ch/project/gd/egee/sam-pps/one-phase-sam-tests.sh > /afs/cern.ch/project/gd/egee/sam-pps/one-phase-sam-tests.log 2>&1
  • The one-phase-sam-tests include those tests to be run in a single step (the results of which are immediately available) e.g. LFC, SRM ...
  • The two-phase-sam-tests include those tests to be run in two phases (the results of which are not immediately available) e.g. CE, gCE ...
In both cases the tests are run in sequence and applied to all PPS sites (Certified, Uncertified, Suspended) where Monitoring=Y.
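
The two acrontab entries above run the two-phase tests at minute 15 and the one-phase tests at minute 45 of every hour, both on lxb1908.cern.ch. As a minimal sketch, assuming the seven-field layout visible in those entries (five time fields, then the host, then the command), the lines can be split as follows. The names `AcronEntry` and `parse_acrontab` are hypothetical.

```python
from collections import namedtuple

# Five time fields, then the target host, then the command line.
AcronEntry = namedtuple(
    "AcronEntry", "minute hour day month weekday host command"
)

# The two entries quoted above, verbatim.
ACRONTAB = """\
15 * * * * lxb1908.cern.ch /afs/cern.ch/project/gd/egee/sam-pps/two-phase-sam-tests.sh > /afs/cern.ch/project/gd/egee/sam-pps/two-phase-sam-tests.log 2>&1
45 * * * * lxb1908.cern.ch /afs/cern.ch/project/gd/egee/sam-pps/one-phase-sam-tests.sh > /afs/cern.ch/project/gd/egee/sam-pps/one-phase-sam-tests.log 2>&1
"""


def parse_acrontab(text):
    """Split each non-empty, non-comment line into its seven fields."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # maxsplit=6 keeps the whole command, redirections included, together.
        fields = line.split(None, 6)
        if len(fields) != 7:
            raise ValueError("malformed acrontab line: %r" % line)
        entries.append(AcronEntry(*fields))
    return entries


if __name__ == "__main__":
    for entry in parse_acrontab(ACRONTAB):
        script = entry.command.split()[0]
        print("minute %s on %s: %s" % (entry.minute, entry.host, script))
```

Note that each job overwrites its own log file (`>` rather than `>>`), so only the output of the most recent run is kept.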

Server installation and configuration


DISCLAIMER: The content that follows is neither complete nor reliable. It deals with a previous attempt in PPS to set up a dedicated SAM database and publishing web service, which was later aborted. So please treat it with due caution.

The "official" documentation reference for SAM is

http://www.cern.ch/sam-docs

These guides, maintained by the SAM Team, are addressed to site administrators in the LCG production system.

They contain all the instructions needed to install and configure the SAM client and server.

We used the GOCDB installation, as it's done in production.

In the following procedure, for convenience, we reproduce the command lines we used to set up SAM in PPS. Nevertheless, where links to the original procedures are provided, those procedures should be considered the real reference. Therefore, if you find that the original procedures have changed and that the steps proposed here are obsolete, please feel free to update them.

The overall configuration steps you need to do to implement such an infrastructure are:

  • Install and configure the SAM Server
  • Install and configure the client on the AFS UI
  • Run SAM and publish the results

Install and configure SAM server

Instructions in:

http://goc.grid.sinica.edu.tw/gocwiki/Service_Availability_Monitoring

DB schema

SAM in PPS uses an Oracle account on the same production DB backend that is used for production SAM.

The tns descriptor for the connection details is:

lcg_same =
   (DESCRIPTION =
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr3-v.cern.ch)(PORT = 10121))
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr2-v.cern.ch)(PORT = 10121))
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr1-v.cern.ch)(PORT = 10121))
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr4-v.cern.ch)(PORT = 10121))
     (LOAD_BALANCE = yes)
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = lcg_same.cern.ch)
       (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 200)(DELAY = 15))
     )
   )
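
The descriptor above lists four listener addresses with load balancing enabled and a single service name. As a quick sanity check, the hosts and ports can be extracted with a couple of regular expressions. This is only a sketch that handles the layout shown here, not a general TNS parser. The function names `list_endpoints` and `service_name` are hypothetical.

```python
import re

# The descriptor quoted above, verbatim.
TNS_DESCRIPTOR = """
lcg_same =
   (DESCRIPTION =
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr3-v.cern.ch)(PORT = 10121))
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr2-v.cern.ch)(PORT = 10121))
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr1-v.cern.ch)(PORT = 10121))
     (ADDRESS = (PROTOCOL = TCP)(HOST = lcgr4-v.cern.ch)(PORT = 10121))
     (LOAD_BALANCE = yes)
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = lcg_same.cern.ch)
       (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 200)(DELAY = 15))
     )
   )
"""

# Matches "(HOST = name)(PORT = number)" with optional spaces.
ADDRESS_RE = re.compile(
    r"\(\s*HOST\s*=\s*([^)\s]+)\s*\)\s*\(\s*PORT\s*=\s*(\d+)\s*\)",
    re.IGNORECASE,
)
SERVICE_RE = re.compile(r"\(\s*SERVICE_NAME\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE)


def list_endpoints(descriptor):
    """Return (host, port) pairs in the order they appear."""
    return [(host, int(port)) for host, port in ADDRESS_RE.findall(descriptor)]


def service_name(descriptor):
    """Return the SERVICE_NAME value, or None if it is absent."""
    match = SERVICE_RE.search(descriptor)
    return match.group(1) if match else None


if __name__ == "__main__":
    for host, port in list_endpoints(TNS_DESCRIPTOR):
        print("%s:%d" % (host, port))
    print("service:", service_name(TNS_DESCRIPTOR))
```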

We have four accounts:

EGEE_PPS_SAME (master)

EGEE_PPS_SAME_W (used by applications: bdii2oracle, gridview)

EGEE_PPS_SAME_R (read-only)

EGEE_PPS_SAME_PORTAL (used by SAM Portal)

Before the schema is created, the Gridview team has to grant access to some of their tables:


grant select, references on COUNTRIES         to EGEE_PPS_SAM WITH GRANT OPTION;
grant select, references on MAINTENANCE       to EGEE_PPS_SAM WITH GRANT OPTION;
grant select, references on NODES             to EGEE_PPS_SAM WITH GRANT OPTION;
grant select, references on NODES_MAINTENANCE to EGEE_PPS_SAM WITH GRANT OPTION;
grant select, references on REGIONS           to EGEE_PPS_SAM WITH GRANT OPTION;
grant select, references on SITES             to EGEE_PPS_SAM WITH GRANT OPTION;
grant select, references on SITE_DETAILS      to EGEE_PPS_SAM WITH GRANT OPTION;
grant select, references on VO                to EGEE_PPS_SAM WITH GRANT OPTION;

The schema has to be created (as described in the general documentation) in the account EGEE_PPS_SAME

After the creation of the schema the following actions have to be done on the account:

  • on EGEE_PPS_SAM
    grant select         on vo              to EGEE_PPS_SAM_W ;
    grant select         on nodes           to EGEE_PPS_SAM_W ;
    grant select         on sites           to EGEE_PPS_SAM_W ;
    grant select         on site_details    to EGEE_PPS_SAM_W;
    grant select         on site            to EGEE_PPS_SAM_W ;
    grant select, insert on node            to EGEE_PPS_SAM_W ;
    grant select, insert on seresvo         to EGEE_PPS_SAM_W ;
    grant select, insert on ceresvo         to EGEE_PPS_SAM_W ;
    grant select, insert on nodeidmapping   to EGEE_PPS_SAM_W ;
    grant select, insert on serviceinstance to EGEE_PPS_SAM_W ;
    grant select, insert on ceresource      to EGEE_PPS_SAM_W ;
    grant select, insert on seresource      to EGEE_PPS_SAM_W ;
    grant select, insert on servicevo       to EGEE_PPS_SAM_W ;
    grant select, insert on service         to EGEE_PPS_SAM_W ;
       

  • on EGEE_PPS_SAM_W
    create or replace synonym site_details    for EGEE_PPS_SAM.site_details   ;
    create or replace synonym sites           for EGEE_PPS_SAM.sites          ;
    create or replace synonym vo              for EGEE_PPS_SAM.vo             ;
    create or replace synonym service         for EGEE_PPS_SAM.service        ;
    create or replace synonym node            for EGEE_PPS_SAM.node           ;
    create or replace synonym seresvo         for EGEE_PPS_SAM.seresvo        ;
    create or replace synonym ceresvo         for EGEE_PPS_SAM.ceresvo        ;
    create or replace synonym serviceinstance for EGEE_PPS_SAM.serviceinstance;
    create or replace synonym ceresource      for EGEE_PPS_SAM.ceresource     ;
    create or replace synonym seresource      for EGEE_PPS_SAM.seresource     ;
    create or replace synonym nodes           for EGEE_PPS_SAM.nodes          ;
    create or replace synonym nodeidmapping   for EGEE_PPS_SAM.nodeidmapping  ;
    create or replace synonym servicevo       for EGEE_PPS_SAM.servicevo      ;
    create or replace synonym site            for EGEE_PPS_SAM.site           ;
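
The grants and the synonyms above cover the same fourteen tables, so both lists can be generated from one table list to keep them from drifting apart. The sketch below is a hypothetical helper, not part of the SAM distribution. The account names are taken as written in the statements above, and the split between read-only and read/insert tables follows the grant list.

```python
OWNER = "EGEE_PPS_SAM"      # account holding the schema, as written above
WRITER = "EGEE_PPS_SAM_W"   # account receiving the grants and the synonyms

# Tables granted "select" only.
SELECT_ONLY = ["vo", "nodes", "sites", "site_details", "site"]
# Tables granted "select, insert".
SELECT_INSERT = [
    "node", "seresvo", "ceresvo", "nodeidmapping", "serviceinstance",
    "ceresource", "seresource", "servicevo", "service",
]


def grant_statements():
    """Statements to run as OWNER."""
    stmts = ["grant select on %s to %s;" % (t, WRITER) for t in SELECT_ONLY]
    stmts += [
        "grant select, insert on %s to %s;" % (t, WRITER) for t in SELECT_INSERT
    ]
    return stmts


def synonym_statements():
    """Statements to run as WRITER, one synonym per granted table."""
    return [
        "create or replace synonym %s for %s.%s;" % (t, OWNER, t)
        for t in SELECT_ONLY + SELECT_INSERT
    ]


if __name__ == "__main__":
    print("-- on " + OWNER)
    print("\n".join(grant_statements()))
    print("-- on " + WRITER)
    print("\n".join(synonym_statements()))
```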
       

SAM Server configuration

Accounts to be used in the configuration files

  • egee_pps_sam_w in /opt/lcg/same/server/db/cron/bdii2oracle.conf

  • in /opt/lcg/same/server/ws/gridview.properties
    ora_driver=oracle.jdbc.driver.OracleDriver
    ora_url=jdbc:oracle:oci:@lcg_same
    ora_username=egee_pps_sam_w
    ora_password=********
       

  • in /opt/lcg/same/server/same-ws.xml
      <ResourceParams name="jdbc/samedb">
        <!-- db connection, change this -->
        <parameter>
          <name>url</name>
          <value>jdbc:oracle:oci:@lcg_same</value>
        </parameter>
    
        <parameter>
          <name>username</name>
          <value>egee_pps_sam_r</value>
        </parameter>
        
        <parameter>
          <name>password</name>
          <value>********</value>
        </parameter>
       

-- Main.aretico - 01 Sep 2006

Topic revision: r8 - 2007-04-24 - unknown
 