EMI 1 (Kebnekaise) - Update 11 (15.12.2011)

The Update contains:

  • minor releases for MPI v. 1.2.0 and UNICORE Gateway6 v. 4.2.0
  • revision releases for StoRM SE v. 1.8.1, UNICORE XUUDB v. 1.3.2-3 and WMS v. 3.3.4

glite-MPI v. 1.2.0, task #24435

What's new:

  • Support for processor and memory affinity for MPICH2
  • Improved configuration: files can now live in several directories and local definitions can be included.
  • Bug Fixes:
    • NFS4 is not detected as shared fs (H)
    • Cleanup not invoked with cptoshared method (M)
    • mpiexec detection needs to be done earlier (M)
    • mpich2 startup ignores MPI_MPICH2_MPIEXEC_PARAMS (M)
    • readlink -f is not portable (used in cptoshared filedist) (L)
  • Updated documentation with new features and configuration changes.

Deployment notes:

  • No special installation requirements, no need for restarting services.
  • A YAIM plugin is available for configuring WNs and CEs; reconfiguring the CE should be enough to pick up the changes (a command sketch follows this list).
  • MPI execution requires the installation of additional MPI software (see the manual) and may also require extra configuration of your CE and WN (e.g. passwordless SSH between nodes); check your batch system and MPI implementation for details.
  • Accounting of MPI jobs requires a MPI implementation that provides tight integration with the batch system: OpenMPI in SGE and Torque/PBS (may require recompilation of packages) or MPICH/MPICH2 with OSC mpiexec in Torque/PBS.
  • Nagios plugins for MPI require the existence of a working mpi compiler.
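
A minimal sketch of the YAIM reconfiguration mentioned above; the MPI_CE/MPI_WN node types and the MPI_* site-info variables are the usual glite-yaim-mpi ones but should be checked against the MPI manual for your site:

  # site-info.def fragment enabling the desired MPI flavours (variable names may differ, see the manual):
  #   MPI_OPENMPI_ENABLE="yes"
  #   MPI_MPICH2_ENABLE="yes"

  # On the CE (add your usual CE node type, e.g. creamCE):
  /opt/glite/yaim/bin/yaim -c -s /root/site-info.def -n MPI_CE -n creamCE

  # On each WN:
  /opt/glite/yaim/bin/yaim -c -s /root/site-info.def -n MPI_WN -n WN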

Known Issues:

  • Fine grained process mapping is not supported with Slurm or Condor schedulers.
  • Maui versions prior to 3.3.4 do not allocate correctly all the nodes for the execution of jobs (see GGUS #67870 and GGUS #57828).
  • MPD startup of MPICH2 jobs may fail if the .mpd.conf file is not properly copied to all WN (bug #49)
  • MPICH2 with hydra only supports ssh (bug #51)

Updated Artefacts

Binary
mpi-start-1.2.0-1.noarch.rpm
glite-yaim-mpi-1.1.10-1.noarch.rpm
Binary tarball
mpi-start-1.2.0.tar.gz
glite-yaim-mpi-1.1.10.tar.gz
Sources RPM
mpi-start-1.2.0-1.src.rpm
glite-yaim-mpi-1.1.10-1.src.rpm
Sources tarball
mpi-start-1.2.0.src.tar.gz
glite-yaim-mpi-1.1.10.src.tar.gz

StoRM SE v. 1.8.1, task #24381

The StoRM service is composed of several sub-components (the BE, FE, GridFTP, GridHTTPS and Checksummer servers, plus a client), which can be deployed on distinct hosts. This is the third update of StoRM-SE in EMI-1: the first release was v1.7.0, the first update v1.7.1 and the second update v1.8.0. This update (v1.8.1) concerns the following components: BE, GridHTTPS-S (server) and GridHTTPS-P (plugin). The other components (Client, GridFTP and FE) are unchanged with respect to the previous version, v1.8.0. The Checksummer service will be released soon.

StoRM-BackEnd-Server

What's new

  • Server version is 1.8.1-2. It is a revision update (only bug fixes)
  • Fixed bugs:
    • @194: StoRM backend stop command fails if a zombie backend process exists
    • @191: Execution of mmlsquota is slow compared to running the same command directly on the command line
    • @190: StoRM BackEnd memory consumption grows indefinitely
    • @189: srmGetSpaceMetadata fails if user proxy has not voms extensions
    • @168: StoRM backend does not update correctly storage space data upon namespace.xml updates
    • @160: BackEnd service name (in service status)
    • @153: StoRM BE info service fails when non mandatory parameters are missing
  • New (minor) Feature:
    • @192: Smart mmlsquota execution (features introduced during the fix of @191)

Installation and configuration

  • To install storm-backend-server, install its metapackage RPM: emi-storm-backend-server-mp
  • To configure storm-backend-server use yaim on the node se_storm_backend.
  • The System Administration guide contains all the useful information about installation and configuration.
  • Moreover, a good explanation of the required YAIM variables is available in /opt/glite/yaim/examples/siteinfo/services/se_storm_backend
    • If you configure other services with YAIM on the same host, always also specify the se_storm_backend node
  • To upgrade storm-backend-server (a command sketch follows this list):
    • stop the backend service,
    • yum update storm-backend-server
    • service storm-backend-server restart.
    • restart all the other storm services (not required, but recommended)
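
As an illustration of the upgrade steps above, a minimal command sketch (the names of the "other StoRM services" are assumptions; adjust them to the components actually installed on your hosts):

  service storm-backend-server stop
  yum update storm-backend-server
  service storm-backend-server restart
  # recommended: restart the other StoRM services, e.g.
  service storm-frontend-server restart
  service storm-gridhttps-server restart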

Known issues

  • None

StoRM-GridHTTPS-Server

What's new

  • Server version is 1.0.5-2. It is a revision update (only bug fixes).
  • Fixed bugs:
    • @199: service initialization can fail due to a deadlock
  • New Feature (packaging, not functional):
    • @85: modify etics configuration to provide source rpm

Installation and configuration

  • To install storm-gridhttps-server install its metapackage RPM: emi-storm-gridhttps-mp
  • To configure storm-gridhttps-server use yaim on the node se_storm_gridhttps. A good explanation of the required YAIM variables is available in /opt/glite/yaim/examples/siteinfo/services/se_storm_gridhttps
  • The service uses ports 8443 and (by default) 8080, so open them on your firewall.
  • The service needs to be installed on a machine on which the StoRM file system is mounted.
  • To upgrade storm-gridhttps-server (a command sketch follows this list):
    • stop the service,
    • yum update storm-gridhttps-server
    • service storm-gridhttps-server restart.
    • restart all the other storm services (not required, but recommended)
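
A sketch of the firewall opening and the upgrade sequence described above, assuming an iptables-based firewall (adapt to your site's firewall management):

  # Open the service ports:
  iptables -A INPUT -p tcp --dport 8443 -j ACCEPT
  iptables -A INPUT -p tcp --dport 8080 -j ACCEPT
  service iptables save

  # Upgrade sequence:
  service storm-gridhttps-server stop
  yum update storm-gridhttps-server
  service storm-gridhttps-server restart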

Known issues

  • None

StoRM-GridHTTPS-Plugin

What's new

  • Plugin version is 1.0.3-4. It is a revision update.
  • The changes affect only the build phase.
  • New Feature (building, not functional):
    • @200: this component uses a beta version of maven-assembly-plugin

Installation and configuration

  • This component is installed as a dependency of the BE; you should not install it explicitly.
  • To upgrade storm-gridhttps-plugin:
    • stop the backend service,
    • yum update storm-backend-server
    • service storm-backend-server restart.
    • restart all the other storm services (not required, but recommended)

Known issues

  • None

Documentation

What's new

Updated Artefacts

Binary
storm-backend-server-1.8.1-2.sl5.x86_64.rpm
storm-gridhttps-plugin-1.0.3-4.sl5.noarch.rpm
storm-gridhttps-server-1.0.5-2.sl5.noarch.rpm
Binary tarball
storm-backend-server-1.8.1-2.sl5.x86_64.tar.gz
storm-gridhttps-plugin-1.0.3-4.sl5.noarch.tar.gz
storm-gridhttps-server-1.0.5-2.sl5.noarch.tar.gz
Sources RPM
storm-backend-server-1.8.1-2.sl5.src.rpm
storm-gridhttps-plugin-1.0.3-4.sl5.src.rpm
storm-gridhttps-server-1.0.5-2.sl5.src.rpm
Sources tarball
storm-backend-server-1.8.1.tar.gz
storm-gridhttps-plugin-1.0.3.tar.gz
storm-gridhttps-server-1.0.5.tar.gz

UNICORE Gateway6 v. 4.2.0, task #24173

What's new:

  • Fixed logging of connection errors (more details in case of failures, clear expiration messages)
  • MDC context is used in the default log configuration
  • Enhancement: publish version on monkey page (SF feature #3368939)
  • Fix: throw a fault if the vsite returns an HTTP error (SF bug #3314648)
  • Improvement: auto-detect keystore/truststore type
  • Added a possibility to configure the maximum SOAP header size.
  • Documentation was updated.

Installation and configuration:

  • This update changes the package name and version: unicore-gateway-6.4.x is now released as unicore-gateway6-4.x.0. After the update it is advised to always use the 'unicore-gateway6' package name.
  • This update changes some of the default configuration files (in particular gateway.properties and logging.properties). After the update the administrator must, as always, check for .rpmnew and .rpmsave files in the /etc/unicore/gateway directory and merge the local settings (a sketch follows this list).
  • After the update a service restart is necessary.
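
A sketch of the post-update checks described above; the init-script name is an assumption, adjust it to the one actually shipped by the unicore-gateway6 package:

  # Spot packaged configuration files that need merging with local settings:
  find /etc/unicore/gateway -name '*.rpmnew' -o -name '*.rpmsave'
  # Compare and merge, e.g.:
  diff /etc/unicore/gateway/gateway.properties /etc/unicore/gateway/gateway.properties.rpmnew

  # Restart the gateway (service name assumed):
  service unicore-gateway6 restart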

Known issues:

  • NONE

Updated Artefacts

Binary
unicore-gateway6-4.2.0-0.noarch.rpm
Binary tarball
unicore-gateway6-4.2.0-0.tar.gz
Sources RPM
unicore-gateway6-4.2.0-0.src.rpm
Sources tarball
unicore-gateway6-4.2.0-0.src.tar.gz

UNICORE XUUDB v. 1.3.2-3, task #24178

What's new:

  • The default logging configuration file has been fixed, so log files are rotated every day and changes to the configuration are properly detected at run time.

Installation and configuration:

  • The default logging.properties file has changed, so it is very likely that previous local changes need to be merged manually. As always, check for any .rpmnew and .rpmsave files in the /etc/unicore/xuudb directory (see the example below).
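
As with the Gateway above, a quick way to spot and merge the changed files (the exact .rpmnew/.rpmsave file names depend on what rpm leaves behind):

  find /etc/unicore/xuudb -name '*.rpmnew' -o -name '*.rpmsave'
  diff /etc/unicore/xuudb/logging.properties /etc/unicore/xuudb/logging.properties.rpmnew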

Known issues:

  • NONE

Updated Artefacts

Binary
unicore-xuudb-1.3.2-3.noarch.rpm
Binary tarball
unicore-xuudb-1.3.2-3.tar.gz
Sources RPM
unicore-xuudb-1.3.2-3.src.rpm
Sources tarball
unicore-xuudb-1.3.2-3.src.tar.gz

WMS v. 3.3.4, task #22847

What's new

This WMS update essentially fixes the following:

  • A problem with the proxy purger cron job which prevented, on ext3, new delegations for a certain DN after its 31999th submission
  • A glitch in querying data catalogues that was also present in the gLite versions
  • Wrong ownership of /var and /var/log directories which were owned by the glite user. Please note that the fix applies to new installations. In case of update the ownership of these directories must be fixed manually (see below)
  • Issues with sandbox purging by the LogMonitor
  • An issue in ICE concerning job status change detection
  • GLUE2 publication. This also allows the publication of the WMS and EMI middleware release
  • JeMalloc, an optimized memory allocator, is automatically installed and used by the WM module
  • This update also introduces the Nagios probe for the WMS; documentation is available at WMSProbe

Installation and configuration

Yaim (re)configuration is required after installation/update. In case of co-location with the L&B, make sure that yaim is run also for the latter.

  • In case of update, stop all the services before applying it. After the update (after yum update and after the YAIM configuration), do the following (see the command sketch after these notes):
    • set the ownership of the directories /var and /var/log to root.root
    • execute the cron job /etc/cron.d/glite-wms-create-host-proxy.cron
  • In case of a clean install, after yaim configuration:
    • execute the cron job /etc/cron.d/glite-wms-create-host-proxy.cron

  • In the configuration file /etc/glite-wms/glite_wms.conf the following two parameters for the load_monitor must always have the same values, even if different from the default ones:
jobSubmit  = "${WMS_LOCATION_SBIN}/glite_wms_wmproxy_load_monitor --oper jobSubmit --load1 22 --load5 20 --load15 18 --memusage 99 --diskusage 95 --fdnum 1000 --jdnum 150000  --ftpconn 300";
jobRegister  =  "${WMS_LOCATION_SBIN}/glite_wms_wmproxy_load_monitor --oper jobRegister --load1 22 --load5 20 --load15 18 --memusage 99 --diskusage 95 --fdnum 1000 --jdnum 150000 --ftpconn 300";
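
A condensed sketch of the update sequence described above (the emi-wms metapackage name is taken from the artefact list below; the YAIM node type and the site-info.def location should be checked against your installation):

  # Before the update: stop all the WMS services (use the init scripts installed on your node)

  # Update and reconfigure with YAIM (run YAIM also for the L&B if co-located):
  yum update emi-wms
  /opt/glite/yaim/bin/yaim -c -s /root/site-info.def -n WMS

  # After the update:
  chown root:root /var /var/log
  # Execute by hand the command scheduled in /etc/cron.d/glite-wms-create-host-proxy.cron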

The admin guide has been updated with the following information:

  • The location of the drain file has changed with respect to the gLite version
  • Configuration tips in case of co-location with the L&B
Check out the guide for more details.

Known issues

As mentioned above, this update fixes a problem on ext3 with the proxy purger that prevented new delegations for a given user after their 31999th submission. Even with this fix, some minor issues remain on ext3:

  • a: there can be at most 31999 different users submitting to a certain WMS (very unlikely)
  • b: there can be at most 31999 valid (i.e. not expired) proxies for a certain user (the purging of the expired proxy files is done on the WMS by a cron job which runs every 6 hours)

To avoid these issues entirely, ext4 is needed. Nonetheless, re-creating new delegations each time instead of reusing them is a bad and unsupported practice, so what is mentioned here only for the sake of clarity should really be considered a non-issue rather than a known issue.

  • In case of WMS+LB co-location, the error "no state in DB" on glite-wms-job-submit means that the WMS has not been authorized in the L&B authorization file. Refer to the "Installation and configuration" section above to fix this issue.
  • If there are problems with purging, or the first output retrieval gives "Warning - JobPurging not allowed (CA certificate verification failed)", that means that the host proxy certificate has not been installed. Refer to the "Installation and configuration" section above to fix this issue.

Under very high load, e.g. when submitting 1k-2k collections of 25 nodes every 60 seconds, the following call to the L&B (logged in wmproxy.log) might take a non-trivial amount of time:

"WMPEventlogger::registerSubJobs": Registering DAG subjobs to LB Proxy..

If this time is long enough to exceed the mod_fcgid IPCCommTimeout parameter, the request is terminated, leaving the collection pending forever. Besides increasing IPCCommTimeout (see the sketch below), check that the L&B is doing its purging properly. Especially when running in 'both' mode, the admin can tune the purging policy to make it more frequent; in proxy mode, the admin can also decide to turn off automatic purging of jobs in a terminal state (-G option of glite-lb-bkserverd).
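
IPCCommTimeout is a standard mod_fcgid directive; a hedged sketch of how to locate and raise it (the exact configuration file used by the WMProxy is not assumed here, locate it first):

  # Find where the mod_fcgid timeout is set on the WMS node:
  grep -r IPCCommTimeout /etc/httpd/ /etc/glite-wms/ 2>/dev/null
  # Raise the value (in seconds) above the observed registration time, e.g.:
  #   IPCCommTimeout 300
  # then reload the web server running the WMProxy (assumed to be the system httpd):
  service httpd reload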

  • The 'job feedback', a newly introduced feature to re-plan jobs stuck in blocking queues, is far from perfect at this stage; its use is nevertheless encouraged, precisely to give us 'feedback'. The mechanism relies on the existence of a global synchronization token and therefore requires the shallow resubmission feature to be enabled; for this reason, it must not be used with deep resubmission enabled.

Updated Artefacts

Binary
glite-wms-ice-3.3.4-1.sl5.x86_64.rpm
glite-wms-jobsubmission-3.3.2-1.sl5.x86_64.rpm
glite-wms-wmproxy-3.3.4-2.sl5.x86_64.rpm
glite-wms-brokerinfo-3.3.2-1.sl5.x86_64.rpm
glite-wms-configuration-3.3.2-1.sl5.x86_64.rpm
glite-wms-ism-3.3.2-1.sl5.x86_64.rpm
glite-wms-matchmaking-3.3.2-1.sl5.x86_64.rpm
glite-wms-purger-3.3.2-1.sl5.x86_64.rpm
glite-yaim-wms-4.1.4-3.sl5.noarch.rpm
emi-wms-nagios-1.0.0-1.noarch.rpm
glite-jdl-api-cpp-3.2.6-1.sl5.x86_64.rpm
emi-wms-1.0.2-0.sl5.x86_64.rpm
Binary tarball
glite-wms-ice-3.3.4-1.tar.gz
glite-wms-jobsubmission-3.3.2-1.tar.gz
glite-wms-wmproxy-3.3.4-2.tar.gz
glite-wms-brokerinfo-3.3.2-1.tar.gz
glite-wms-configuration-3.3.2-1.tar.gz
glite-wms-ism-3.3.2-1.tar.gz
glite-wms-matchmaking-3.3.2-1.tar.gz
glite-wms-purger-3.3.2-1.tar.gz
glite-yaim-wms-4.1.4-3.tar.gz
glite-jdl-api-cpp-3.2.6-1.tar.gz
emi-wms-1.0.2-0.tar.gz
Sources RPM
glite-wms-ice-3.3.4-1.sl5.src.rpm
glite-wms-jobsubmission-3.3.2-1.sl5.src.rpm
glite-wms-wmproxy-3.3.4-2.sl5.src.rpm
glite-wms-brokerinfo-3.3.2-1.sl5.src.rpm
glite-wms-configuration-3.3.2-1.sl5.src.rpm
glite-wms-ism-3.3.2-1.sl5.src.rpm
glite-wms-matchmaking-3.3.2-1.sl5.src.rpm
glite-wms-purger-3.3.2-1.sl5.src.rpm
glite-yaim-wms-4.1.4-3.sl5.src.rpm
emi-wms-nagios-1.0.0-1.src.rpm
glite-jdl-api-cpp-3.2.6-1.sl5.src.rpm
emi-wms-1.0.2-0.sl5.src.rpm
Sources tarball
glite-wms-ice-3.3.4-1.src.tar.gz
glite-wms-jobsubmission-3.3.2-1.src.tar.gz
glite-wms-wmproxy-3.3.4-2.src.tar.gz
glite-wms-brokerinfo-3.3.2-1.src.tar.gz
glite-wms-configuration-3.3.2-1.src.tar.gz
glite-wms-ism-3.3.2-1.src.tar.gz
glite-wms-matchmaking-3.3.2-1.src.tar.gz
glite-wms-purger-3.3.2-1.src.tar.gz
glite-yaim-wms-4.1.4-3.src.tar.gz
emi-wms-nagios-1.0.0.src.tgz
glite-jdl-api-cpp-3.2.6-1.src.tar.gz
emi-wms-1.0.2-0.src.tar.gz

-- DoinaCristinaAiftimiei - 14-Dec-2011
