gLite 3.0.2 PPS-Update 31 to PPS (Wed, 30 May 2007) - Full Report

glite-CE

Site Name: PPS-IFIC
Site Admin: Alvaro Fernandez
  • Reconfigured gliteCE node.
  • Reconfigured and restarted:
  • Job submission is not working. This needs further investigation (it may be a parameter configuration issue rather than a release problem).

LFC-mysql

Site Name: PPS-LIP
Site Admin: Mario David
The LFC was "almost" not affected by this release. Patch 1165 of glite-yaim-3.0.1-17 says "No pool accounts are created on LFC", but looking at node-info.def we have:
BASE2_FUNCTIONS="
config_host_certs
config_users
.......
and
LFC_mysql_FUNCTIONS="${BASE1_FUNCTIONS} ${BASE2_FUNCTIONS}
LFC_oracle_FUNCTIONS="${BASE1_FUNCTIONS} ${BASE2_FUNCTIONS}
So in principle the function config_users is still run during the configuration of both LFC nodes (mysql and oracle). I will investigate further to check whether this is true.
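
One way to check this directly is to source node-info.def and test whether config_users appears in the LFC function list. The sketch below assumes a bash shell. The helper name function_in_list and the node-info.def path in the usage comment are illustrative assumptions, not part of yaim.

```shell
#!/bin/bash
# Return 0 if the function name $2 appears as a whole word in the
# whitespace-separated list $1, and 1 otherwise.
# function_in_list is a hypothetical helper, not part of yaim.
function_in_list() {
    local list="$1" name="$2" item
    for item in $list; do            # unquoted on purpose: split on whitespace
        [ "$item" = "$name" ] && return 0
    done
    return 1
}

# Usage (the path is an assumption; adjust to where node-info.def lives):
#   . /opt/glite/yaim/scripts/node-info.def
#   function_in_list "$LFC_mysql_FUNCTIONS" config_users && echo "config_users is run"
```

The comparison is on whole words, so a similarly named function does not produce a false match.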

MON and IC

Site Name: pps-mon.egee.cesga.es
Site Admin: asimon,esfreire

It seems all is working fine.

SE-dpm-mysql

Site Name: RU-Moscow-KIAM-PPS
Site Admin: Eugenia Kovalenko

and

Site Name: prague_cesnet_pps
Site Admin: Jan Svec

There were no errors during the upgrade. However, two new variables are required in site-info.def: DPM_INFO_USER and DPM_INFO_PASS. Without these variables the value of "Total Available SE" is zero.
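
A quick pre-upgrade check for the two variables could look like the sketch below. It assumes bash and plain NAME=value assignments in site-info.def. The helper name missing_vars is hypothetical, and the path in the usage comment is the one used at another site in this report, so it may differ elsewhere.

```shell
#!/bin/bash
# Print the name of every variable from the argument list that has no
# "NAME=" assignment in the given file. Commented-out assignments do not
# count. missing_vars is a hypothetical helper, not part of yaim.
missing_vars() {
    local file="$1" var
    shift
    for var in "$@"; do
        grep -q "^[[:space:]]*${var}=" "$file" || echo "$var"
    done
}

# Usage (path is an assumption):
#   missing_vars /etc/yaim/site-info.def DPM_INFO_USER DPM_INFO_PASS
```

The check only tests that an assignment exists; it does not verify that the value is non-empty or correct.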

KIAM had some initial problems, which PRAGUE helped to solve: after the upgrade and re-configuration, they took an additional step of running the /etc/cron.monthly/create-default-dirs-DPM.sh script.

glite-UI

Site Name: RU-Moscow-KIAM-PPS
Site Admin: Eugenia Kovalenko

We upgraded the UI successfully. General tests passed without any errors.

glite-WMSLB

Site Name: PPS-IFIC
Site Admin: Alvaro Fernandez

When restarting the WMSLB service, several environment variables are reported as undefined (configuration missing in yaim?):

[Thu May 31 10:18:03 2007] [warn] PassEnv variable GLITE_WMS_WMPROXY_WEIGHTS_UPPER_LIMIT was undefined
[Thu May 31 10:18:03 2007] [warn] PassEnv variable GLITE_SD_VO was undefined
[Thu May 31 10:18:03 2007] [warn] PassEnv variable LCMAPS_LOG_LEVEL was undefined
[Thu May 31 10:18:03 2007] [warn] PassEnv variable LCMAPS_DEBUG_LEVEL was undefined
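
To collect the affected variable names from a longer log, a small filter along these lines may help. It is a sketch assuming bash, sed and sort, and assuming the warnings keep the exact wording shown above. The name undefined_passenv is hypothetical.

```shell
#!/bin/bash
# Read log lines from the given files (or from stdin when no file is
# given) and print, once each, every variable named in a
# "PassEnv variable X was undefined" warning.
# undefined_passenv is a hypothetical helper name.
undefined_passenv() {
    sed -n 's/.*PassEnv variable \([A-Za-z_][A-Za-z0-9_]*\) was undefined.*/\1/p' "$@" |
        LC_ALL=C sort -u
}

# Usage:
#   undefined_passenv /path/to/error_log
```

The resulting list can then be compared against the variables that yaim actually exports for the service.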

First restart after upgrade:

Could not stop the LB Server service             [FAILED]
An unrecoverable error occurred while stopping the gLite Logging and Bookkeeping  [FAILED]
The gLite LB Server service is already running. Restarting...
Stopping glite-lb-interlogd ...glite-lb-notif-interlogd: no process killed 
done
/var/glite/glite-lb-bkserverd.pid does not exist - glite-lb-bkserverd not running?
Error at stop, but starting the services seems fine.

glite-WN

Site Name: UKI-SOUTHGRID-BHAM-PPS
Site Admin: Yves Coppens

SL4 compatibility mode

  • I am using yum. I got the following complaints from yum update:
    --> Running transaction check
    --> Processing Dependency: lcg-sam-client-sensors = 0.1-13 for package: lcg-sam-client-WNconfig-EGEE
    --> Finished Dependency Resolution
    Error: Missing Dependency: lcg-sam-client-sensors = 0.1-13 is needed by package lcg-sam-client-WNconfig-EGEE
    I had to exclude the lcg-sam-client-sensors-1.0.1-0 and lcg-sam-client-1.0.0-0 packages to resolve the dependency problem.

  • Old style and new style yaim reconfiguration worked.

  • On my production site, the yaim function config_sw_dir is run on every worker and tries to recursively change the permissions of the software area. The problem is that every worker does this, when it should only be done once. The workers got permission denied and I had to override the config_sw_dir function. I reported this in ggus ticket 22090.
    Due to the absence of VO software on my PPS site, this is difficult to catch there, so I compared the old and new (3.0.1-17) functions: they are identical, so this is still a problem.

  • yaim configuration still throws an old but inoffensive error message:
    /usr/lib/python2.3/site-packages/Ft/Xml/InputSource.py:203:
    RuntimeWarning: Creation of InputSource without a URI
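
The "only once" behaviour wanted for config_sw_dir in the list above could be sketched with a marker directory, as below. This is not the override actually used at the site, nor yaim code: run_once and the marker path are hypothetical, and the sketch assumes bash and a location that all workers can see.

```shell
#!/bin/bash
# Run a command only if no earlier caller has claimed the marker directory.
# mkdir either creates the directory or fails, so only one caller proceeds.
# run_once and the marker path are hypothetical, not part of yaim.
run_once() {
    local marker="$1"
    shift
    if mkdir "$marker" 2>/dev/null; then
        "$@"                 # first caller: do the work
    else
        return 0             # marker exists or cannot be created: skip
    fi
}

# Usage (placeholders, not real paths or commands):
#   run_once /path/to/shared/area/.perms-set <permission-changing command>
```

A caller without write access to the shared area cannot create the marker and therefore skips the command silently. If the command itself fails, the marker stays in place and has to be removed by hand before a retry.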

Site Name: PPS-LIP
Site Admin: Mario David

WNs (native glite 3.1 SLC4) are not affected, but dcache-client-1.7.0-35 should be installed/upgraded (see the notes on patch 1151). This version is not in the SLC4 glite 3.1 repository. The RPM is in the dCache repository:
http://cvs.dcache.org/repository/apt/sl4.4.0/i386/RPMS.testing/dcache-client-1.7.0-35.i586.rpm
so it's just a matter of fetching it

lcg-CE_torque

Site Name: UKI-SOUTHGRID-BHAM-PPS
Site Admin: Yves Coppens

Updated fine.
Configuration of the CE using new yaim does not return:

/opt/glite/yaim/bin/yaim -c -s /etc/yaim/site-info.def -n CE...
Starting edg-gatekeeper:                                   [  OK  ]
Configuration Complete
-> requires control-C. I had observed this in the past and submitted a ticket (ggus 22090). Apparently, yaim developers were aware of this problem.

[ Note this worked with RB/WN ]

I successfully tested the job submission chain UI/RB/CE/WN. I have not yet had time to look carefully at the accounting.

(Received after report) Hi Mario,
I've just noticed a potential problem with the info provider, possibly linked to the introduction of the DENY attribute in the VOVIEW config (function config_gip in yaim). Below is some sample output which at first sight does not seem correct. I'll keep looking into it.

Yves

dn: GlueVOViewLocalID=alice,GlueCEUniqueID=epbf005.ph.bham.ac.uk:2119/jobmanager-lcgpbs-alice,mds-vo-name=local,o=grid
objectClass: GlueCETop
objectClass: GlueVOView
objectClass: GlueCEInfo
objectClass: GlueCEState
objectClass: GlueCEAccessControlBase
objectClass: GlueCEPolicy
objectClass: GlueKey
objectClass: GlueSchemaVersion
GlueVOViewLocalID: alice
GlueVOViewLocalID: atlas
GlueVOViewLocalID: biomed
GlueVOViewLocalID: calice
GlueVOViewLocalID: cms
GlueVOViewLocalID: dteam
GlueVOViewLocalID: lhcb
GlueVOViewLocalID: ops
GlueCEAccessControlBaseRule: VO:alice
GlueCEAccessControlBaseRule: VO:atlas
GlueCEAccessControlBaseRule: VO:biomed
GlueCEAccessControlBaseRule: VO:calice
GlueCEAccessControlBaseRule: VO:cms
GlueCEAccessControlBaseRule: VO:dteam
GlueCEAccessControlBaseRule: VO:lhcb
GlueCEAccessControlBaseRule: VO:ops
GlueCEStateRunningJobs: 0
GlueCEStateRunningJobs: 0
GlueCEStateRunningJobs: 0
GlueCEStateRunningJobs: 0
GlueCEStateRunningJobs: 0
GlueCEStateRunningJobs: 0
GlueCEStateRunningJobs: 0
GlueCEStateRunningJobs: 0
GlueCEStateWaitingJobs: 0
GlueCEStateWaitingJobs: 0
GlueCEStateWaitingJobs: 0
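
To make repeats like those in the output above stand out, the attribute names in a single entry can be counted. The sketch below assumes bash and awk, and input in the "name: value" form shown. The name count_repeated_attrs is hypothetical. objectClass is skipped because it is expected to appear several times; whether any other repeat is legitimate depends on the schema.

```shell
#!/bin/bash
# For one LDIF-style entry read from the given files (or stdin), print
# "<count> <attribute>" for every attribute that occurs more than once.
# objectClass is ignored. count_repeated_attrs is a hypothetical helper.
count_repeated_attrs() {
    awk -F': ' '
        NF >= 2 && $1 != "objectClass" { count[$1]++ }
        END { for (a in count) if (count[a] > 1) print count[a], a }
    ' "$@" | LC_ALL=C sort -k2
}

# Usage (entry.ldif is a placeholder for a file holding one saved entry):
#   count_repeated_attrs entry.ldif
```

The counter does not separate entries, so it should be fed one entry at a time.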

Site Name: PPS-LIP
Site Admin: Mario David

Updated fine.

Annoying:

Executing config_ldconf 
/sbin/ldconfig: /opt/glite/externals/lib/libswigpy.so.0 is not a symbolic link
/sbin/ldconfig: /opt/glite/externals/lib/libswigpl.so.0 is not a symbolic link
/sbin/ldconfig: /opt/glite/externals/lib/libswigtcl8.so.0 is not a symbolic link
.....
This also happens when upgrading the RPMs.

>>>> Errors:

Executing config_gip_scheduler_plugin 
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.

Starting LocalLogger: edg-wl-interlogd and edg-wl-logd.    [  OK  ]
Failed to initialize input queue: input_queue_attach: error binding socket: Permission denied
This is LocalLogger, part of Workload Management System in EU DataGrid.
Copyright (c) 2002 CERN, INFN and CESNET on behalf of the EU DataGrid.

Starting MAUI Scheduler:                                   [  OK  ]
Configuration Complete

The command does not return the prompt. This problem has been present for some time; I had to press ^C to get the prompt back.

[root@pprod03 ~]# /etc/init.d/edg-wl-locallogger status
edg-wl-logd (pid 24627) is running...
edg-wl-interlogd is stopped

A restart does not work, even after removing /var/tmp/dg20*.
The status is still the same even after a reboot. The CE is running on SLC4, which might be the cause of some of the problems found during configuration.

Site Name: prague_cesnet_pps
Site Admin: Jan Svec

Upgrade: Everything went fine.

lcg-RB

Site Name: UKI-SOUTHGRID-BHAM-PPS
Site Admin: Yves Coppens

Old and new style configuration with:

/opt/glite/yaim/bin/yaim -c -s /etc/yaim/site-info.def -n RB

worked

-- Main.thackray - 01 Jun 2007
