Proposal for parallel installation of middleware clients
Objective
Make it as easy as possible for a site to install different versions of the middleware in parallel on the WNs and advertise their presence to the InfoSys.
Advantages
- Parallel Versions
- Allows easy rollback of problematic updates
- Supports the case where a particular update suits one experiment but not another (or even different applications within one VO).
- Example: patch #1641 introduced gfal-1.10.7, which
- Fixed some segfault problems
- Introduced bug #33288 (creation of non-existent sub-directories)
- Finer grained publishing of what’s installed
- Allows experiments to match properly and know what they're going to get when they land on a WN
- Helps manage the introduction of support for other platforms, compilers, etc.
- Timely deployment of updates
- This proposal does not explicitly address speedy rollout of updates, but it should still help: updating a WN becomes less risky, and operations can track which versions of the WN are deployed.
Mechanism
Supply the entire glite-WN as a single rpm, fully relocated and with the version embedded in the name (thus allowing parallel installations).
If the path can be properly fixed, e.g. /opt/glite-WN-3.1.7-2, then the rpm can be made "zero-config", with the exception of maintaining a single set of site-wide defaults in the environment, such as LCG_GFAL_INFOSYS.
Publishing the availability of glite-WN-3.1.7-2 would need an info provider, installed on the node which is publishing subcluster information, which checks each subcluster for the presence of new updates.
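The core of such an info provider could look roughly like the Python sketch below. It is only an illustration under stated assumptions: releases are assumed to live in directories named glite-WN-<version> under a prefix such as /opt, the provider is assumed to be able to list that prefix for the subcluster (for example on a shared file system or a representative WN), and the function names are hypothetical. The tag format follows the GLITE-WN-3_1_7_2 example used under Matching.

```python
import os
import re

# Assumed naming convention for a relocated release: glite-WN-3.1.7-2
_DIR_PATTERN = re.compile(r"^glite-WN-(\d+)\.(\d+)\.(\d+)-(\d+)$")


def release_tag(dirname):
    """Map a release directory name to a tag, or None if it is not a release."""
    match = _DIR_PATTERN.match(dirname)
    if match is None:
        return None
    # glite-WN-3.1.7-2 -> GLITE-WN-3_1_7_2
    return "GLITE-WN-" + "_".join(match.groups())


def installed_tags(dirnames):
    """Return the sorted tags for every release found among dirnames."""
    tags = [release_tag(name) for name in dirnames]
    return sorted(tag for tag in tags if tag is not None)


def glue_lines(prefix="/opt"):
    """List prefix and format one attribute line per installed release."""
    return [
        "GlueHostApplicationSoftwareRunTimeEnvironment: " + tag
        for tag in installed_tags(os.listdir(prefix))
    ]
```

Calling glue_lines() on a node with /opt/glite-WN-3.1.7-2 installed would yield one attribute line for that release; directories that do not match the naming pattern are ignored.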
We are not proposing to retire the current distribution method (single rpms), nor the tarball.
The integrated WN releases will also be made available to the application area, as simple versioned directory structures in the usual AFS space, thus allowing them to independently distribute the releases if they wish.
Matching
An application could check for
GlueHostApplicationSoftwareRunTimeEnvironment: GLITE-WN-3_1_7_2
if it needed a particular version of a library or client which was contained in that release. Frameworks could implement the "> 3.1.7-2" case, i.e. match any release newer than a given one.
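A framework could implement the "> 3.1.7-2" case along the lines of the following Python sketch. It assumes the published tags follow the GLITE-WN-3_1_7_2 pattern shown above and that versions compare component by component as integers; the function names are hypothetical. Comparing integer tuples rather than strings matters, because as strings "3_1_10_1" would sort before "3_1_7_2".

```python
import re

# Assumed tag format, taken from the example above: GLITE-WN-3_1_7_2
_TAG_PATTERN = re.compile(r"^GLITE-WN-(\d+)_(\d+)_(\d+)_(\d+)$")


def parse_tag(tag):
    """Return the version in a tag as a tuple of ints, or None for other tags."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def parse_version(version):
    """Turn a version string such as '3.1.7-2' into (3, 1, 7, 2)."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def tags_newer_than(tags, minimum):
    """Keep only the WN release tags strictly newer than minimum."""
    floor = parse_version(minimum)
    newer = []
    for tag in tags:
        parsed = parse_tag(tag)
        # Unrelated runtime environment tags parse to None and are skipped.
        if parsed is not None and parsed > floor:
            newer.append(tag)
    return newer
```

Given the list of tags advertised by a site, tags_newer_than(tags, "3.1.7-2") returns the releases the framework could then request explicitly.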
rpm lists for WN versions are already available here:
http://glite.web.cern.ch/glite/packages/R3.1/deployment/glite-WN/glite-WN.asp
To use a certain version of the Worker Node you would add two attributes to the JDL:
MWVersion = "<WN_version>";
Requirements = Member("<WN_version>",other.GlueHostApplicationSoftwareRunTimeEnvironment);
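A submission tool could generate these two attributes with a small helper such as the Python sketch below. The function name is hypothetical, and it assumes that <WN_version> is the tag exactly as published in GlueHostApplicationSoftwareRunTimeEnvironment (for example GLITE-WN-3_1_7_2), which this proposal does not state explicitly.

```python
def wn_jdl_attributes(wn_version):
    """Return the two JDL lines that request a specific WN release."""
    if '"' in wn_version:
        # A quote would break out of the JDL string literal.
        raise ValueError("WN version must not contain a double quote")
    return (
        'MWVersion = "{0}";\n'
        'Requirements = Member("{0}",'
        'other.GlueHostApplicationSoftwareRunTimeEnvironment);'
    ).format(wn_version)
```

The returned text can be appended to an existing JDL file; a job that already has a Requirements expression would need the two expressions combined rather than duplicated.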
An extension of this mechanism could be to allow the user to request 'no environment', which would result in the job having no grid-specific settings in the environment. The application could then manage the environment itself.
Support for alternative python versions and compilers
We intend to introduce support for alternative python versions and compilers in the standard WN release. The recompiled libraries would be relocated and would not be configured by default, but would nevertheless be available for use. gLite would not distribute the interpreters or compile chain tools themselves.
Given that, for example, support for LFC bindings to python2.5 would be introduced in a particular glite-WN release, its availability at a site could be determined simply by checking the versions advertised as being available.
Requirements
A script, called a1_grid_env.sh, must be preinstalled on the WNs in /etc/profile.d. This script ensures that the correct environment is set up for the job, and has been part of the release for a while. It would not be part of the proposed rpm.
Concerns
There would be inefficient use of disk storage as many packages would be duplicated.