Release notes for the gLite 3.0.2 WMS Checkpoint release

These notes describe the WMS checkpoint release with VDT 1.2.x for SL3.

Release notes for the gLite 3.0.2 WMS Checkpoint release - patch 1251

These notes describe the WMS checkpoint release - patch 1251 with VDT 1.2.x for SL3.

Installation

Please install from the following repository:

http://grid-deployment.web.cern.ch/grid-deployment/glite/public/cert/3.0/wms-pretest-patch1251/rhel30/

The following meta-packages are available:

  • glite-WMS
  • glite-LB
  • glite-WMSLB

To use apt-get, create glite.list in /etc/apt/sources.list.d with the following contents:

rpm http://grid-deployment.web.cern.ch/grid-deployment/glite/public/cert/3.0/wms-pretest-patch1251/rhel30/ i386 externals Release3.1 updates

For CAs, you may need the following apt-get repository (for example, in lcg-ca.list):

rpm http://linuxsoft.cern.ch/ LCG-CAs/current production
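
As a minimal sketch, the two apt source files described above could be written with a small shell function. The function name write_apt_sources and its optional directory argument are illustrative assumptions and are not part of the release; the repository lines are copied from these notes.

```shell
# Sketch only: write the two apt source files described in these notes.
# write_apt_sources and its optional directory argument are hypothetical;
# the repository lines themselves are copied verbatim from the notes.
write_apt_sources() {
    local dir="${1:-/etc/apt/sources.list.d}"

    # gLite WMS/LB pre-test repository (patch 1251)
    cat > "$dir/glite.list" <<'EOF'
rpm http://grid-deployment.web.cern.ch/grid-deployment/glite/public/cert/3.0/wms-pretest-patch1251/rhel30/ i386 externals Release3.1 updates
EOF

    # CA certificates repository
    cat > "$dir/lcg-ca.list" <<'EOF'
rpm http://linuxsoft.cern.ch/ LCG-CAs/current production
EOF
}
```

Run as root, calling write_apt_sources without arguments writes into /etc/apt/sources.list.d; "apt-get update" followed by "apt-get install glite-WMS" (or another meta-package from the list above) would then be the usual next step.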

Configuration

From the configuration point of view, the YAIM configuration for this WMS checkpoint release does not differ from 3.0.1, although all Python scripts have been replaced with bash scripts. All configuration files used with YAIM 3.0.1 should be compatible with YAIM 3.1.0. YAIM 3.1 is currently being finalized, and there is still a list of known problems and imperfections that we are fixing. There are several changes between YAIM 3.0.1 and 3.1:

  • The configure_node, install_node and run_function commands are obsolete. Although they are still located in the "/opt/glite/yaim/scripts" directory, please do not use them: the configuration will very probably fail. All these commands will be removed in the next release. Their functionality has been replaced by the new yaim command, which was introduced in YAIM 3.0.1 and in YAIM 3.1 became the only way to configure gLite middleware using YAIM.
  • The YAIM packaging has changed. YAIM 3.1 has a modular structure (glite-yaim-core, glite-yaim-clients), in contrast to the monolithic distribution of YAIM 3.0.1 (the glite-yaim package).
  • Service-based and node-based configuration has been added.
The detailed documentation for YAIM 3.1 is currently being prepared and will soon be accessible from the YAIM 3.1 page.

Regarding WMS and LB configuration: when LB and WMS are installed on separate nodes, you should add the variable LB_HOST to your site-info.def, for example:

LB_HOST=<LB_HOSTNAME>

or

LB_HOST="<LB1_HOSTNAME> <LB2_HOSTNAME> <...> <...>"

if you have multiple LBs (this may change to require only the LB hostname). The LB server port is assumed to be 9000. If an LB uses a different port, please append the port to that hostname, as in "LB1_HOSTNAME:port". Please configure your WMS and LB as follows:

 
/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n glite-WMS

 
/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n glite-LB

Or combine them in a single command:

/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n glite-WMS -n glite-LB
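
The LB_HOST convention described above (space-separated entries, with port 9000 assumed unless a port is appended) can be illustrated with a short sketch. The helper name lb_endpoints is hypothetical and is not a YAIM function; it only shows how such a value could be interpreted.

```shell
# Sketch only: expand an LB_HOST value into host:port pairs, applying the
# default port 9000 to entries that carry no explicit ":port" suffix.
# lb_endpoints is a hypothetical helper, not part of YAIM.
lb_endpoints() {
    local entry
    # Intentionally unquoted: LB_HOST is a space-separated list.
    for entry in $1; do
        case "$entry" in
            *:*) printf '%s\n' "$entry" ;;        # explicit port given
            *)   printf '%s:9000\n' "$entry" ;;   # fall back to port 9000
        esac
    done
}
```

For example, lb_endpoints "lb1.example.org lb2.example.org:9010" prints lb1.example.org:9000 and lb2.example.org:9010 on separate lines (the hostnames are placeholders).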

Please don't use the glite-WMSLB node type for configuration. If WMS and LB are on separate machines, please add the DN of the WMS to /opt/glite/etc/LB-super-users on the LB and restart the LB server with "/opt/glite/etc/init.d/glite-lb-bkserverd restart". One LB can serve multiple WMSs.
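
The DN can of course be added to the super-users file by hand. The sketch below shows one idempotent way to do it, so that running it again does not duplicate the entry. The function name add_lb_super_user and its optional file argument are illustrative assumptions; the default file path is the one given above.

```shell
# Sketch only: append a DN to the LB super-users file unless it is already
# listed. add_lb_super_user is a hypothetical helper for illustration.
add_lb_super_user() {
    local dn="$1"
    local file="${2:-/opt/glite/etc/LB-super-users}"

    # -x: match the whole line, -F: treat the DN as a fixed string
    if ! grep -qxF -- "$dn" "$file" 2>/dev/null; then
        printf '%s\n' "$dn" >> "$file"
    fi
}
```

On the LB node, as root, this would be called with the host DN of the WMS as its only argument, followed by "/opt/glite/etc/init.d/glite-lb-bkserverd restart".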

Known issues

  • On LB, there is a typo in the logrotate config file /etc/logrotate.d/lb-purger: "missingok" should be on the next line.
  • Don't use the c-ares package from the OS; instead, install the one from the WMS+LB repository. If the OS version is installed, please remove it. It still needs to be checked whether this is CERN-specific.

  • If, for some reason, any of the condorc scripts (condorc-launcher/condorc-advertiser/condorc-authorizer) is held or removed, restarting the gLite service does not bring it back to life. Two workarounds are possible: running the configuration again, or executing 'su $GLITE_USER -c /opt/condor-c/libexec/glite/condorc-initialize'.

Release notes for the gLite 3.0.2 WMS Checkpoint release - patch 1167

Installation

Please install from the following repository:

http://lxb2042.cern.ch/gLite/APT/R3.1-RB-pretest/rhel30/

The following meta-packages are available:

  • glite-WMS
  • glite-LB
  • glite-WMSLB

To use apt-get, create glite.list in /etc/apt/sources.list.d with the following contents:

rpm http://lxb2042.cern.ch/gLite/APT/R3.1-RB-pretest rhel30 externals Release3.1 updates

For CAs, you may need the following apt-get repository (for example, in lcg-ca.list):

rpm http://linuxsoft.cern.ch/ LCG-CAs/current production

Configuration

From the configuration point of view, the YAIM configuration for this WMS checkpoint release does not differ from 3.0.1, although all Python scripts have been replaced with bash scripts. All configuration files used with YAIM 3.0.1 should be compatible with YAIM 3.1.0. YAIM 3.1 is currently being finalized, and there is still a list of known problems and imperfections that we are fixing. There are several changes between YAIM 3.0.1 and 3.1:

  • The configure_node, install_node and run_function commands are obsolete. Although they are still located in the "/opt/glite/yaim/scripts" directory, please do not use them: the configuration will very probably fail. All these commands will be removed in the next release. Their functionality has been replaced by the new yaim command, which was introduced in YAIM 3.0.1 and in YAIM 3.1 became the only way to configure gLite middleware using YAIM.
  • The YAIM packaging has changed. YAIM 3.1 has a modular structure (glite-yaim-core, glite-yaim-clients), in contrast to the monolithic distribution of YAIM 3.0.1 (the glite-yaim package).
  • Service-based and node-based configuration has been added.
The detailed documentation for YAIM 3.1 is currently being prepared and will soon be accessible from the YAIM 3.1 page.

Regarding WMS and LB configuration: when LB and WMS are installed on separate nodes, you should add the variable LB_HOST to your site-info.def, for example:

LB_HOST='"<LB_HOSTNAME>:9000"'

or

LB_HOST='"<LB1_HOSTNAME>:9000","<LB2_HOSTNAME>:9000","...","..."'

if you have multiple LBs. Please configure your WMS and LB as follows:

 
/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n glite-WMS

 
/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n glite-LB

Or combine them in a single command:

/opt/glite/yaim/bin/yaim -c -s <site-info.def> -n glite-WMS -n glite-LB
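
In this patch the LB_HOST value is a comma-separated list of double-quoted host:port entries. As an illustration, such a value could be assembled from plain hostnames as sketched below. The helper build_lb_host is hypothetical, and port 9000 is simply taken from the examples above.

```shell
# Sketch only: build the quoted, comma-separated LB_HOST value used by this
# patch from a list of hostnames. build_lb_host is a hypothetical helper;
# port 9000 is the port shown in the examples in these notes.
build_lb_host() {
    local value="" host
    for host in "$@"; do
        # Prefix a comma for every entry after the first
        value="${value:+$value,}\"$host:9000\""
    done
    printf '%s\n' "$value"
}
```

For example, build_lb_host lb1.example.org lb2.example.org prints "lb1.example.org:9000","lb2.example.org:9000" (placeholder hostnames); in site-info.def that string is then wrapped in single quotes, as in the LB_HOST examples above.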

Known issues

  • LB node: There is a bug in bkserverd which causes a memory leak. It occurs when the LB server DB is configured with transactional database support. The workaround is to modify /opt/glite/etc/init.d/glite-lb-bkserverd by adding "-b 0" to the start procedure.
Ex:
su - $GLITE_USER -c "$GLITE_LOCATION/bin/glite-lb-bkserverd -b 0 \
instead of
su - $GLITE_USER -c "$GLITE_LOCATION/bin/glite-lb-bkserverd \
and then restart bkserverd. This disables transactional database support. See bug #27555.

  • Bug 25932 is still not fixed in this version of the LB server. When you install LB and WMS on the same node, you also need to add the host DN to /opt/glite/etc/LB-super-users, just as for separate LB and WMS nodes.
 
Ex: (for 2 separate machines, lxb7283 - WMS, lxb7026 - LB) 
cat /opt/glite/etc/LB-super-users # On LB
/C=CH/O=CERN/OU=GRID/CN=host/lxb7283.cern.ch
/C=CH/O=CERN/OU=GRID/CN=host/lxb7026.cern.ch

  • The permissions of the directory /var/lib/mysql need to be set manually: chmod og+rx /var/lib/mysql/ (this may only happen on machines at CERN, but it would be good if YAIM could check it). See bug #27653.

  • /etc/glite/profile.d/glite_setenv.sh contains gridpath_append and gridenv_set commands before the file /opt/glite/etc/profile.d/grid-env-funcs.sh is sourced. This is a bug in YAIM core. On the production machines at CERN, LANG is set to C by default, which triggers this bug. It doesn't happen if LANG is set to en_US or some other locales (example: export LANG=en_US.UTF-8). See bug #27577.

  • The packages bdii-3.8.8-1 and glue-schema-1.2.2-1_sl3, installed with the glite-WMS meta-package, are out of date with respect to the YAIM configuration scripts. The only possible workaround is to upgrade manually to versions bdii-3.9.0-1 and glue-schema-1.3.0-2. See bug #27655. Fixed.

  • LB node: The glite-LB meta-package does not install the rpms required by the function config_bdii. This produces the following error when configuring a stand-alone LB:
INFO: Executing function: config_bdii
error reading information on service bdii: No such file or directory
bdii: unrecognized service
bdii: unrecognized service
The workaround is to manually install all the necessary rpms: bdii, glue-schema, lcg-info-templates and lcg-schema. See bug #27656.

  • The directory /opt/glite/var/log/ does not exist on WMS, so the file /opt/glite/var/log/xferlog cannot be created. See bug #18306.

  • Don't use the c-ares package from the OS; instead, install the one from the WMS+LB repository. If the OS version is installed, please remove it. It still needs to be checked whether this is CERN-specific.

  • lb101 is trying to connect to wms101 and the connection is blocked by the firewall on rb101:
Jun 29 19:21:28 wms101 kernel: [DENIED] IN=eth0 OUT= MAC=00:30:48:68:ed:f8:0a:00:30:81:ad:81:08:00 SRC=137.138.4.182 DST=128.142.173.153 LEN=40 TOS=0x00 PREC=0x00 TTL=60 ID=0 DF PROTO=TCP SPT=9001 DPT=40334 WINDOW=0 RES=0x00 RST URGP=0
Is this expected? Mail has been sent to the developers.

  • Some parameters of the MySQL database need to be modified in order to remove the 4GB limitation:
               ALTER TABLE short_fields MAX_ROWS=1000000000; 
               ALTER TABLE long_fields MAX_ROWS=55000000;
               ALTER TABLE states MAX_ROWS=9500000;
               ALTER TABLE events MAX_ROWS=175000000; 
See bug #27658.

  • If, for some reason, any of the condorc scripts (condorc-launcher/condorc-advertiser/condorc-authorizer) is held or removed, restarting the gLite service does not bring it back to life. Two workarounds are possible: running the configuration again, or executing 'su $GLITE_USER -c /opt/condor-c/libexec/glite/condorc-initialize'.

  • The load threshold above which no new job submission is authorized needs to be set to a bigger value (e.g. 15); see the glite_wms.conf file.

  • 2007-07-24: Since yesterday afternoon, I have noticed high CPU utilization on lb101 (~100%), which had not been the case until now (see the lemon monitoring web page). After some investigation with Di, we found that the bkserverd processes are crashing with a SIGSEGV signal. As far as I understand, each time a bkserverd process crashes, a new one is quickly created, which causes the high CPU utilization. Zdenek Salvet has been contacted; he thinks that the crashes are caused by the wrong LB server RPM being installed: the right match for glite-lb-common-5.0.3-1 is glite-lb-server-1.5.5-1. He solved the problem without installing the new rpm, as I had requested.
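
The "-b 0" workaround for the bkserverd memory leak (first known issue in this list) amounts to a one-line change in the init script. A hedged sketch of that change using sed follows. It prints the patched script to standard output rather than editing the file in place, and it assumes that the start line ends in "glite-lb-bkserverd \" exactly as quoted in that issue; the function name add_b0_flag is hypothetical.

```shell
# Sketch only: print an init script with "-b 0" added to the bkserverd start
# line. Assumes the line ends in "/bin/glite-lb-bkserverd \" as quoted in
# these notes. add_b0_flag is a hypothetical helper; the input file is left
# untouched, and lines that already carry "-b 0" do not match.
add_b0_flag() {
    sed 's|\(/bin/glite-lb-bkserverd\) \\$|\1 -b 0 \\|' "$1"
}
```

One possible use is to run add_b0_flag /opt/glite/etc/init.d/glite-lb-bkserverd, review the output, install it in place of the original script and then restart bkserverd.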

Possibly related issues

  • The command glite-wms-job-logging-info may fail with a segmentation fault (LB client) when run from a glite 3.0 UI. This generally happens because the UI and LB rpms differ. We strongly recommend using the glite 3.1 UI on SL4.

-- Main.nvazdasi - 11 Jul 2007

Topic revision: r20 - 2007-08-16 - DiQing
 