LCGSCM Deployment Status

Certification

You can consult our wiki pages at any time for the work in progress on integration and module development:

Preproduction

Please find the latest status report in our WorkLog (Last Edit: 2011-06-21 - 15:14 by AndresAeschlimann) (Last Edit: 2010-03-25 - 10:25 by UnknownUser)

April 30th 2008

Patches released to production with Update21:

  • PATCH:1800: New vdt_globus_jobmanager_common to fix globus-gass-cache problem on WN

Update20 was accelerated due to requirements coming from CCRC08. The installation issue with the CE was due to a mistake in the release preparation. The issue was fixed by Update21.

The dependency of the installation function on the new version of yaim-core was not set correctly. Because the correct version of yaim-core was already deployed in pre-production (but not in production), the issue was not visible to the pre-deployment testers in PPS. It could only have been caught by a deployment test in production, which the release procedure does not currently foresee. When a release to production is prepared, patches are selected from pre-production and bundled together in a production update. This is generally a subset of the patches installed in pre-production, and at this stage dependencies among patches are recognised only through the documentation, so errors in the documentation cannot be caught. Strengthening the process would require inserting a further checkpoint in the release procedure, which has always been rated too expensive in terms of elapsed time.
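The kind of checkpoint discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual release tooling: the `Patch` record, the patch number, the package versions and the purely numeric version format are all hypothetical. The idea is to evaluate the requirements of the selected patches against what production would contain after the update, rather than against what is already installed in pre-production.

```python
from dataclasses import dataclass, field


def parse_version(text):
    """Turn a purely numeric version such as '4.0.4-1' into (4, 0, 4, 1)."""
    return tuple(int(part) for part in text.replace("-", ".").split("."))


@dataclass
class Patch:
    """Hypothetical record of one patch selected for a production update."""
    number: int
    provides: dict = field(default_factory=dict)  # package -> version shipped
    requires: dict = field(default_factory=dict)  # package -> minimum version needed


def unmet_dependencies(selected, production):
    """Return (patch number, package, needed, available) for each unmet requirement.

    'available' is the version production would hold after the update: the
    current production versions overlaid with what the selected patches ship.
    """
    after_update = dict(production)
    for patch in selected:
        for package, version in patch.provides.items():
            current = after_update.get(package)
            if current is None or parse_version(version) > parse_version(current):
                after_update[package] = version

    problems = []
    for patch in selected:
        for package, needed in patch.requires.items():
            available = after_update.get(package)
            if available is None or parse_version(available) < parse_version(needed):
                problems.append((patch.number, package, needed, available))
    return problems


# Hypothetical data: production still holds an older yaim-core, and the
# selected CE patch needs a newer one that is only present in pre-production.
production = {"yaim-core": "4.0.3-13"}
ce_patch = Patch(number=9001, requires={"yaim-core": "4.0.4-1"})
core_patch = Patch(number=9002, provides={"yaim-core": "4.0.4-1"})

print(unmet_dependencies([ce_patch], production))
print(unmet_dependencies([ce_patch, core_patch], production))
```

With the CE patch alone the check reports the missing yaim-core version; adding the patch that ships yaim-core clears the report. Such a check still relies on the requirements being declared correctly for each patch, so it complements a deployment test in production rather than replacing it.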

BTW: yaim-core was being held back in PPS because it forced a change in the permissions schema for site-info.def and its containing directory to be implemented at all sites, which was not rated acceptable for operations. The issue found later in production, affecting submission from CE3.0 to WN3.1, has another explanation. The CE at version 3.1 has been in production for more than two months, which means that regression tests are not being done in certification. Pre-production runs, by mandate, the top version of the services.

April 23rd 2008

Patches released with Update20:

Summary

  • UI/WN/VOBOX: As new features, the new glite-data-gfal version (1.10.11-1) provides new functions gfal_abortrequest and gfal_abortfilesseveral, and the new glite-data-dm-util (lcg_util) version (1.6.11-1) now prints the SE type (SRMv1, SRMv2, Classic SE) in verbose mode (when relevant). They also fix several bugs, such as:
    • lcg-ls does not work for the classic SE
    • lcg-cr glibc memory corruption
    • gfal_stat seg. fault with dummy LFN
    • lcg-sd doesn't work with SRMv2 request token
    • lcg-gt segmentation fault

  • DPM/LFC v1.6.10: The new DPM/LFC version provides several new features and bug fixes, for example:
    • fix problem of replication of a zero-length file
    • improve logging of updatefilestatus method
    • DICOM back-end service for DPM
    • producing re-buildable source RPMs
    • group writable directories when SRM started with umask 0
    • DPM-DSI: DPM's gridftp does not allow for ':' in SURL (GGUS ticket #32335)
    • support for CKSM (md5 only so far)

  • lcg-CE: Changes in Globus jobmanager and GASS cache. These modifications improve the performance of the lcg-CE by a factor of two to three.

April 16th 2008

  • gLite 3.1 Update19 was released to production yesterday with HIGH priority.
    All details of the update can be found in the release notes
    Of particular relevance:
    • update of the gLite WMS client introducing a security fix
    • new version of lcg-vomscert for version 3.1

  • gLite 3.0 Update48 to production is in preparation for this week
    The update will contain in particular:
    • new version of FTA changing the gridFTP session handling
    • new version of lcg-vomscert for version 3.0

  • A meeting was held with Atlas and CMS to kick off a pilot WMS 3.1 service at CERN-PROD. Details and timelines are available at PPIslandKickOff2008x04x15.

January 15th 2008

Certification

  • Nothing new to report.

PPS

  • SL4 VO Box will be released to production repositories next week.

December 12th 2007

Certification

  • Nothing new to report.

PPS

  • SL4 (32 bit) VObox goes into the PPS this week.
  • SL4 (32 bit) DPM & LFC released to production this week.
  • gLite 3.1 Update 7 went into production this week but there was a problem with the new version of GFAL/LCG Utils which segfaults when used with a classic SE. The patch has been withdrawn and a recipe for downgrading given to those sites which have already upgraded. This was not caught in the PPS because this patch skipped the PPS and went straight to production.

November 14th 2007

Certification

  • Nothing new to report.

PPS

  • Released to production Monday this week:
    • gLite 3.1 (SL4) LCG CE
      • Gstat monitoring shows error state for new CE. This is a problem with Gstat and not the service. Will be fixed by the Gstat team within the next few days.
    • VOMS server certificate
      • Release was too close to the certificate expiry date. The reason for this is under investigation.
      • ALL middleware services and clients which access voms.cern.ch need to install the new certificate.

October 10th 2007

Certification

  • Nothing new to report.

PPS

  • Nothing new to report.

October 10th 2007

Certification

  • Certification of LCG CE for SL4 continues. Short delay due to the necessity of implementing functionality to handle grid-map files (it had been thought that only VOMS proxies would be needed).

PPS

  • Nothing new to report.

August 22nd 2007

Certification

  • Stress testing SL4 lcg-CE (3000 jobs submitted with 100% success).

PPS

  • New gLite 3.1 (SL3) WMS is available in the PPS repository. Currently undergoing deployment testing. Instances should be ready for use by the experiments this week.
  • DPM security patch will be released today/tomorrow. The vulnerability has been classified as high risk by the security groups so the patch will go to PPS and production at the same time.

August 15th 2007

Certification

  • VOMS-admin testing has revealed some problems with the latest patch, mainly relating to the upgrade of the db schema and the functioning of the web interface.
  • A new, high priority DPM security patch is in certification.

PPS

  • Latest gLite 3.0 'experimental' WMS patch is being installed at CERN (see agenda of this week's SCM).
  • PPS UI on LXPLUS now points to the CNAF WMS. This was due to a hardware failure on the node hosting the WMS at the CERN PPS site.
  • Problems with setting up the UI on LXPLUS as LXPLUS runs on 64 bit machines which have no 32 bit libraries installed. The gLite 3.1 UI is built on SL4/32 bit. Currently identifying the missing libraries and installing them for the UI.
  • Fix released for FTS show-stopper bug (problem with service discovery). Released to PPS and prod service.
  • This week will also see the release of:
    • YAIM 3.1.1 for gLite 3.0
    • GFAL client 1.9.3-1 and LCG_util 1.5.2-1 (patch 1220)
    • Several R-GMA fixes
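
For the LXPLUS item above, one way of identifying the missing 32 bit libraries is to run `ldd` on the UI binaries and collect the entries reported as "not found". The sketch below is only an illustration, not the procedure actually used: the helper names and the library names in the sample are hypothetical, and it assumes the usual `name => not found` lines printed by `ldd`.

```python
import subprocess


def ldd_report(binary_path):
    """Run ldd on one binary and return its text output (requires ldd on the host)."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=False
    )
    return result.stdout


def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found" and name not in missing:
            missing.append(name)
    return missing


# Hypothetical ldd output for a 32 bit binary on a 64 bit host.
sample = """\
\tlinux-gate.so.1 =>  (0xffffe000)
\tlibexample_a.so.0 => not found
\tlibc.so.6 => /lib/tls/libc.so.6 (0x00a2b000)
\tlibexample_b.so.0 => not found
"""

print(missing_libraries(sample))
```

On a real node, `missing_libraries(ldd_report(path))` could be applied to each installed UI binary and the results merged into one list of libraries to install.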

July 11th 2007

Certification

  • gLite 3.1 / SL4
    • 32bit
      • Work currently ongoing to package glite-SE_dpm_disk, glite-MON, glite-AMGA_mysql.
      • Testing of glite-BDII and glite-CE.
    • 64bit
      • Nothing to report. Still various packages missing for the Worker Node.
  • Configuration
    • Finalisation of modular yaim-3.1.1.
  • Testing
    • Support for 3.1 WMS (SL3) as released to CERN for production testing. A repository and configuration update will follow.
    • Testing of Cream CE.
    • Stress testing of 'glexec' glite-CE.
  • Patches currently with the certification team:
    • #735 lcg-mon-job-status, now uses stateEnterTime
    • #894 Rebranded GIP that includes improved LDIF parsing
    • #901 Update of c-ares rpm
    • #911 VOMS Configuration update
    • #953 Configuration changes for VOMS
    • #970 [ VOMS ] New voms parameter -skipcacheck to be included in the config files
    • #1029 EGEE tomcat5 package security update
    • #1093 R-GMA gin fix for Bug #17323
    • #1123 Allow multi-node Normal jobs on lcg-RB and UI
    • #1132 Update lcg-mon-job-status to include new UI field
    • #1138 tool to renew voms proxies via myproxy server
    • #1164 APEL Update (glite-apel_R_2_0_17)
    • #1167 The necessary tags for a good wms updated to 15/05/07
    • #1173 Back-porting: blah 3.1 in branch 3.0 and fix for #26493
    • #1177 glite-gatekeeper startup fix on glite-CE
    • #1183 Top level BDII for glite 3.1
    • #1184 Site level BDII for glite 3.1
    • #1185 Replace GRIS with BDII for the lcg-CE
    • #1190 Gridview Web Service Client for SE (Castor, DPM, Classic)
    • #1197 VOMS admin server 2.0.3
    • #1199 Removal of CASTOR client dependencies
    • #1203 The necessary tags for a good wms updated to 21/06/07
    • #1211 R-GMA CLI fix for Bug #27504 Removed getAllTable call from init
    • #1212 Optimisations to the R-GMA registry as a fix for bug #27510
    • #1215 R3.1/SLC4/i386: LFC/DPM 1.6.5-3
    • #1219 fix for DENY tags to lcg-info-dynamic-scheduler
    • #1220 R3.0: GFAL/lcg_util release (glite-data_R_3_1_32_1)
    • #1221 R3.1: GFAL/lcg_util release (glite-data_R_3_1_32_1)
    • #1225 R3.1: proper service discovery packages for SLC4

PPS

  • Issues with gLite 3.1 UI.
  • glite3.0.2 PPS Update 34 released to PPS
    • It contains an urgent security patch for the DPM
  • PPS-RAL started submitting SAM tests. Together with PPS-CYFRONET they have now taken over CERN_PPS.
  • SRM 2.2 testing in progress.
  • FTS 2.0 testing in progress.
  • Patches currently in the PPS:
    • #1202 Package version update for gLite 3.1
    • #1208 CGSI_gSOAP 1.1.17-2

June 20th 2007

Certification

  • Several of the SA3 management team are at the JRA1 all hands meeting, so reporting is light this week.
  • Patches currently in certification:
    • #735 lcg-mon-job-status, now uses stateEnterTime
    • #980 Glue 1.3 Schema
    • #1093 R-GMA gin fix for Bug #17323
    • #1123 Allow multi-node Normal jobs on lcg-RB and UI
    • #1132 Update lcg-mon-job-status to include new UI field
    • #1138 tool to renew voms proxies via myproxy server
    • #1149 gfal 1.9.1
    • #1164 APEL Update (glite-apel_R_2_0_17)
    • #1167 The necessary tags for a good wms updated to 15/05/07
    • #1179 LFC/DPM 1.6.5-1
    • #1183 Top level BDII for glite 3.1
    • #1184 Site level BDII for glite 3.1
    • #1188 R3.1: LFC/DPM 1.6.5-1
    • #1195 R3.1: LFC/DPM 1.6.5-2
    • #1199 Removal of CASTOR client dependencies

PPS

  • YAIM 3.0.1-22 expected to be ready to go to PPS end of this week / beginning of next. Note: This is the last release of YAIM until 3.1.1 which is due to be ready for the PPS on 30th June.
  • Linking of the PPS to the OSG grid to create an interoperability "test-bed" is in progress. Expect to send first test jobs next week.
  • SRM 2.2 testing in progress.
  • FTS 2.0 testing in progress.
  • Patches currently in the PPS:
    • #1040 lcg-info-dynamic-lsf new version
    • #1044 lsf_local_submit_attributes.sh
    • #1117 New DPM GIP plugin.
    • #1151 dcache 1.7.0-35 upgrade including widely requested
    • #1152 glite-yaim-3.0.1-16
    • #1153 Prevent syslog writing to another filedescriptor in glite-lb-bkserverd
    • #1155 d-cache-lcg-6.2.0 fixes account mapping bug
    • #1174 New BDII with indexing

May 30th 2007

Certification

  • SA3 all hands meeting so a "light" week in testing
  • Stress testing Condor 6.8.5 and the glite-CE
  • Patches currently in certification:
    • #1177 glite-gatekeeper startup fix on glite-CE
    • #1175 YAIM function for configuring MPI
    • #1174 New BDII with indexing
    • #1173 Back-porting: blah 3.1 in branch 3.0 and fix for #26493
    • #1167 The necessary tags for a good wms updated to 15/05/07
    • #1165 glite-yaim-3.0.1-17
    • #1164 APEL Update (glite-apel_R_2_0_17)
    • #1155 d-cache-lcg-6.2.0 fixes account mapping bug
    • #1153 Prevent syslog writing to another filedescriptor in glite-lb-bkserverd
    • #1152 glite-yaim-3.0.1-16
    • #1151 dcache 1.7.0-35 upgrade including widely requested
    • #1149 gfal 1.9.1
    • #1144 R-GMA Server fix for bugs #21558, #20090 and #23052
    • #1138 tool to renew voms proxies via myproxy server
    • #1132 Update lcg-mon-job-status to include new UI field
    • #1124 R-GMA Server fix for NumberFormatError
    • #1123 Allow multi-node Normal jobs on lcg-RB and UI
    • #1117 New DPM GIP plugin.
    • #1098 util-java and trustmanager update
    • #1093 R-GMA gin fix for Bug #17323
    • #1079 Missing dependency for glite-CE
    • #1062 Condor 6.8.4
    • #1046 Condor plugin for lcg-info-dynamic-scheduler
    • #1044 lsf_local_submit_attributes.sh
    • #1040 lcg-info-dynamic-lsf new version
    • #1029 EGEE tomcat5 package security update
    • #980 Glue 1.3 Schema
    • #970 [ VOMS ] New voms parameter -skipcacheck to be included in the config files
    • #953 Configuration changes for VOMS
    • #911 VOMS Configuration update
    • #901 Update of c-ares rpm
    • #898 LCG-CE modifications for DGAS support
    • #894 Rebranded GIP that includes improved LDIF parsing
    • #766 Reassigned item: edg-utils is now a meta-package with a dependency on fetch-crl
    • #735 lcg-mon-job-status, now uses stateEnterTime

PPS

May 16th 2007

Certification

  • gLite 3.1 / SL4
    • 32bit
      • two new UI bugs found - 26188, 26220 - both dependency problems, both being addressed urgently by developers.
      • Work on relocatable (tarball) UI progressing.
    • 64bit
      • build at 79%, no client nodes complete (therefore no testing)
    • Work on documentation and web area for production release of 3.1 WN
  • Other Testing
    • VOMS-admin 2.0 in testing
  • Patches certified since last report:
    • #1079 Missing dependency for glite-CE
    • #1121 LFC/DPM 1.6.4-3
    • #1124 R-GMA Server fix for NumberFormatError
    • #1144 R-GMA Server fix for bugs #21558, #20090 and #23052
    • #1156 lcg-vomscerts-4.5.0 has new cert for lcg-voms.cern.ch
  • Patches currently in certification:
    • #735 lcg-mon-job-status, now uses stateEnterTime
    • #1044 lsf_local_submit_attributes.sh
    • #1117 New DPM GIP plugin.
    • #1123 Allow multi-node Normal jobs on lcg-RB and UI
    • #1132 Update lcg-mon-job-status to include new UI field
    • #1138 tool to renew voms proxies via myproxy server
    • #1155 d-cache-lcg-6.2.0 fixes account mapping bug
    • #1164 APEL Update (glite-apel_R_2_0_17)

PPS

  • Integration of SRM2.2 test SEs into the PPS progressing:
    • CERN_PPS is for the time being publishing end-points in US in the information system
    • SAM tests are being submitted to all published SRMs.
    • Atlas transmitted some requirements on FTS channels for preliminary tests.
  • gLite 3.0 PPS-Update 30 released to PPS on Tuesday. This contained the following patches:
    • #1156 lcg-vomscerts-4.5.0 has new cert for lcg-voms.cern.ch
  • gLite 3.0 Update 24 released to production. This contains:
    • #1086 RGMA Client Exception Addition
    • #1089 Removal of incorrect apel-* deps from metapackages
    • #1113 lcg-infosites obsoleting lcg-info-api-ldap
    • #1120 FTS 2.0 (update)
    • #1121 LFC/DPM 1.6.4-3
    • #1133 glite-yaim 3.0.1-13
    • #1147 glite-yaim-3.0.1-15
    • #1156 lcg-vomscerts-4.5.0 has new cert for lcg-voms.cern.ch
  • Patches currently in the PPS:
    • #898 LCG-CE modifications for DGAS support
    • #1046 Condor plugin for lcg-info-dynamic-scheduler
    • #1079 Missing dependency for glite-CE
    • #1124 R-GMA Server fix for NumberFormatError
    • #1144 R-GMA Server fix for bugs #21558, #20090 and #23052

May 9th 2007

Certification

  • gLite 3.1 / SL4
    • All node types still have packaging problems. We are awaiting a bugfix for issues with the ETICS client to resolve some of these problems.
    • Some dependencies on Globus still need to be updated (eg #25703).
    • A new version of the WN is being prepared which addresses all problems discovered in PPS.
    • The UI has only one outstanding bug left (25819) before a release can be made to PPS.
    • Progress on all node types can now be followed from the integration dashboard: http://grid-deployment.web.cern.ch/grid-deployment//cgi-bin/reports.cgi?action=index
  • Other Testing
    • Testing gLiteCE under production environment - 20 WN in a total of 200 Virtual processors handled by a single gLiteCE.
    • Testing voms-admin 2.0.2 installation in SLC4 and SLC3.
  • Patches certified since last report:
    • #1040 lcg-info-dynamic-lsf new version
    • #1086 RGMA Client Exception Addition
    • #1089 Removal of incorrect apel-* deps from metapackages
    • #1113 lcg-infosites obsoleting lcg-info-api-ldap
    • #1120 FTS 2.0 (update)
    • #1079 Missing dependency for glite-CE
    • #1121 LFC/DPM 1.6.4-3
    • #1124 R-GMA Server fix for NumberFormatError
    • #1133 glite-yaim 3.0.1-13
    • #1144 R-GMA Server fix for bugs #21558, #20090 and #23052
    • #1147 YAIM patch for DPM
  • Patches currently in certification:
    • #735 lcg-mon-job-status, now uses stateEnterTime - ON HOLD
    • #1044 lsf_local_submit_attributes.sh
    • #1047 APEL Update (glite-apel_R_2_0_12)
    • #1117 New DPM GIP plugin.
    • #1123 Allow multi-node Normal jobs on lcg-RB and UI
    • #1132 Update lcg-mon-job-status to include new UI field
    • #1138 tool to renew voms proxies via myproxy server
    • #1145 glite-yaim-3.0.1-14
    • #1151 dcache 1.7.0-35 upgrade including widely requested
    • #1152 glite-yaim-3.0.1-16
    • #1153 Prevent syslog writing to another filedescriptor in glite-lb-bkserverd
    • #1155 d-cache-lcg-6.2.0 fixes account mapping bug
    • #1156 lcg-vomscerts-4.5.0 has new cert for lcg-voms.cern.ch

PPS

  • Configuration of PPS for SRM2.2 testing is well under way.
  • The PPS site at Valencia (IFIC) has installed a patch to allow ATLAS to certify the fix for the DENY tag in VOViews.
  • Work in progress to link the PPS to the OSG ITB (integration testbed) for interoperability testing.
  • gLite 3.0 PPS-Updates 28 and 29 deployed to the PPS. These updates contain the following patches:
    • #898 LCG-CE modifications for DGAS support
    • #1046 Condor plugin for lcg-info-dynamic-scheduler
    • #1079 Missing dependency for glite-CE
    • #1086 RGMA Client Exception Addition
    • #1089 Removal of incorrect apel-* deps from metapackages
    • #1113 lcg-infosites obsoleting lcg-info-api-ldap
    • #1120 FTS 2.0 (update)
    • #1121 LFC/DPM 1.6.4-3
    • #1124 R-GMA Server fix for NumberFormatError
    • #1133 glite-yaim 3.0.1-13
    • #1144 R-GMA Server fix for bugs #21558, #20090 and #23052
    • #1147 glite-yaim-3.0.1-15
  • gLite 3.0 Update 23 released to production. This contains:
    • #1118 lcg-vomscerts-4.4.1 has correct cert for biomed/egeode
    • #1115 New version of lcg-info with support for VOViews, sites and services
    • #1110 Dcache 1.7.0-34 upgrade with GridFTP bug fixes
    • #1108 glite-yaim 3.0.1-12
    • #1069 edg-mkgridmap-2.9.0 fixes bug 24343
  • This week the SL4_compat WN (SL3 WN made to run on SL4) was released to production.

April 25th 2007

Certification

  • Patches certified since last report:
    • #1086: RGMA Client Exception Addition
    • #1040: lcg-info-dynamic-lsf new version (performance improvements for CEs using LSF)
  • Patches currently in certification:
    • #1138 tool to renew voms proxies via myproxy server
    • #1133 glite-yaim 3.0.1-13
    • #1128 Wmproxy authZ/authN issues
    • #1124 R-GMA Server fix for NumberFormatError
    • #1123 Allow multi-node Normal jobs on lcg-RB and UI
    • #1121 LFC/DPM 1.6.4-3
    • #1116 The necessary tags for a good wms
    • #1107 Change periodic_hold expression for condor-c/gliteCE jobs
    • #1096 fixes in wms.client
    • #1089 Removal of incorrect apel-* deps from metapackages
    • #1079 Missing dependency for glite-CE
    • #1075 list match works fine
    • #1044 lsf_local_submit_attributes.sh
    • #938 a better dependencies handling in wms and wms-ui
    • #735 lcg-mon-job-status, now uses stateEnterTime

PPS

  • PPS-Update 28 not yet released to the PPS. The reason for this is being investigated with the integration team.
  • gLite 3.0 Update 22 released to production contained the following patches:
    • #1101 GFAL 1.9.0-2/lcg_utils 1.5.1-1
    • #1088 Removal of incorrect myproxy deps in glite-SE* metapackages
    • #1066 yaim 3.0.1-10
    • #1052 fix reading LB super-users file
    • #915 Voms 1.7 (update from 1.6)
  • Patches currently in the PPS:
    • #1118 lcg-vomscerts-4.4.1 has correct cert for biomed/egeode
    • #1115 New version of lcg-info with support for VOViews, sites and services
    • #1110 Dcache 1.7.0-34 upgrade with GridFTP bug fixes
    • #1108 glite-yaim 3.0.1-12
    • #1069 edg-mkgridmap-2.9.0 fixes bug 24343

April 18th 2007

Certification

  • gLite 3.1 / SL4
    • The gLite 3.1 / SL4 Worker Node was released to PPS.
    • Work is now concentrating on the UI, where there are 15 config bugs, 4 packaging errors and 4 runtime bugs.
    • Work has also begun on installation tests for the ETICS built gLite 3.1 WMS. There are 4 outstanding runtime issues.
  • Patches certified since last report:
    • #1115: New version of lcg-info with support for VOViews, sites and services
    • #1110: Dcache 1.7.0-34 upgrade with GridFTP bug fixes
    • #1108: glite-yaim 3.0.1-12
    • #1046: Condor plugin for lcg-info-dynamic-scheduler
    • #898: LCG-CE modifications for DGAS support
  • Patches currently in certification:
    • #1124 R-GMA Server fix for NumberFormatError
    • #1123 Allow multi-node Normal jobs on lcg-RB and UI
    • #1121 LFC/DPM 1.6.4-2
    • #1120 FTS 2.0 (update)
    • #1089 Removal of incorrect apel-* deps from metapackages - ON HOLD
    • #1086 RGMA Client Exception Addition
    • #1079 Missing dependency for glite-CE
    • #1075 list match works fine
    • #1044 lsf_local_submit_attributes.sh
    • #1040 lcg-info-dynamic-lsf new version
    • #938 a better dependencies handling in wms and wms-ui
    • #735 lcg-mon-job-status, now uses stateEnterTime - ON HOLD

PPS

  • Problems found with the data management clients of the native SL4 WN. GGUS tickets and Savannah bugs submitted.
  • The upgrade path from the interim "SL3 compiled WN on SL4" to the natively compiled SL4 WN does not work. Discussions are under way as to whether to release the interim WN to production. The consequent rework of gLite Update 21 (which contains the interim WN and was due to be released to production on Monday 16 April) has caused a delay in the release of this update.
  • gLite 3.0 Update 21 will probably contain the following patches:
    • #1085 Missing package python-fpconst for SL3 installation
    • #1084 Missing dependency on lcg-expiregridmapdir for glite-WMS
    • #1077 glite-yaim-3.0.1-9 update
    • #1074 Missing dependency on glite-yaim in metapackages
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug #23636
  • gLite 3.0 PPS-update 27 deployed to the PPS. This update contains the following patches:
    • #1069 edg-mkgridmap-2.9.0 fixes bug 24343
    • #1108 glite-yaim 3.0.1-12
    • #1110 Dcache 1.7.0-34 upgrade with GridFTP bug fixes
    • #1115 New version of lcg-info with support for VOViews, sites and services
    • #1118 lcg-vomscerts-4.4.1 has correct cert for biomed/egeode
  • Patches currently in the PPS:
    • #1118 lcg-vomscerts-4.4.1 has correct cert for biomed/egeode
    • #1115 New version of lcg-info with support for VOViews, sites and services
    • #1110 Dcache 1.7.0-34 upgrade with GridFTP bug fixes
    • #1108 glite-yaim 3.0.1-12
    • #1101 GFAL 1.9.0-2/lcg_utils 1.5.1-1
    • #1088 Removal of incorrect myproxy deps in glite-SE* metapackages
    • #1069 edg-mkgridmap-2.9.0 fixes bug 24343
    • #1066 yaim 3.0.1-10
    • #1052 fix reading LB super-users file
    • #915 Voms update to 1.7 branch

April 11th 2007

Certification

  • gLite 3.1 / SL4
  • Patches certified since last report:
    • #1069 edg-mkgridmap-2.9.0 fixes bug 24343
    • #1118 lcg-vomscerts-4.4.1 has correct cert for biomed/egeode
  • Patches currently in certification:
    • #1115 New version of lcg-info with support for VOViews, sites and services
    • #1110 Dcache 1.7.0-34 upgrade with GridFTP bug fixes
    • #1108 glite-yaim 3.0.1-12
    • #1089 Removal of incorrect apel-* deps from metapackages - ON HOLD
    • #1086 RGMA Client Exception Addition
    • #1079 Missing dependency for glite-CE
    • #1075 list match works fine
    • #938 a better dependencies handling in wms and wms-ui
    • #898 LCG-CE modifications for DGAS support
    • #735 lcg-mon-job-status, now uses stateEnterTime - ON HOLD

PPS

  • No gLite 3.0 Updates released to production this week due to the Easter holidays.
  • No gLite 3.0 PPS-Updates released to PPS this week due to the Easter holidays.
  • Contacted the WLCG experiments to arrange for the testing of the experiment applications on the native SL4 WN. This testing will start after Easter.
  • The new version of YAIM (YAIM 3.0.1-9) which came with PPS-update 25 has caused some problems due to documentation bugs - there were several new configuration parameters which were not documented. The Integration & Test team are working to resolve these bugs.
  • Patches currently in PPS:
    Note that this includes the SL4 natively compiled WN
    • #1101 GFAL 1.9.0-2/lcg_utils 1.5.1-1
    • #1088 Removal of incorrect myproxy deps in glite-SE* metapackages
    • #1085 Missing package python-fpconst for SL3 installation
    • #1084 Missing dependency on lcg-expiregridmapdir for glite-WMS
    • #1077 glite-yaim-3.0.1-9 update
    • #1074 Missing dependency on glite-yaim in metapackages
    • #1066 yaim 3.0.1-10
    • #1052 fix reading LB super-users file
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug #23636
    • #915 Voms update to 1.7 branch

April 4th 2007

Certification

  • gLite 3.1 / SL4
    • Testing continues on the gLite3.1/SL4 WN. Installation of rpms still needs to be forced in some cases, work continues on cleaning up dependencies.
    • The gLite 3.1 UI currently has no working edg-job-* commands; earlier problems with glite-wms-* commands have been fixed.
  • Patches certified since last report:
    • #1101 GFAL 1.9.0-2/lcg_utils 1.5.1-1
    • #1088 Removal of incorrect myproxy deps in glite-SE* metapackages
    • #1085 Missing package python-fpconst for SL3 installation
    • #1077 glite-yaim-3.0.1-9 update
    • #1074 Missing dependency on glite-yaim in metapackages
    • #1066 yaim 3.0.1-10
    • #1061 Pull-in recent LB 3.0 bug fixes
    • #1052 fix reading LB super-users file
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug #23636
    • #979 changes on wms component files
    • #915 Voms update to 1.7 branch
  • Patches currently in certification:
    • #1108 glite-yaim 3.0.1-12
    • #1105 glite-yaim-3.0.1-11 update
    • #1100 LFC/DPM 1.6.4
    • #1097 FTS 2.0 (updated)
    • #1093 R-GMA gin fix for Bug #17323
    • #1079 Missing dependency for glite-CE
    • #1046 Condor plugin for lcg-info-dynamic-scheduler
    • #1044 lsf_local_submit_attributes.sh
    • #1040 lcg-info-dynamic-lsf new version
    • #980 Glue 1.3 Schema
    • #970 [ VOMS ] New voms parameter -skipcacheck to be included in the config files
    • #938 a better dependencies handling in wms and wms-ui
    • #898 LCG-CE modifications for DGAS support

PPS

  • gLite 3.0 Updates 19 and 20 deployed to production. These updates contain the following patches:
    • #1065 Missing dependency on glite-security-voms-admin-client for glite-UI metapackage
    • #1059 gsiopenssh-VDT1.2.6rhas_3-2 fixes vulnerability
    • #1056 Dcache 1.7.0-29 upgrade with GridFTP bug fixes
    • #1048 bdii-3.8.8 has larger slapd cache and other improvements
  • gLite 3.0 PPS-updates 25 and 26 deployed to the PPS. These updates contain the following patches:
    • #1085 Missing package python-fpconst for SL3 installation
    • #1084 Missing dependency on lcg-expiregridmapdir for glite-WMS
    • #1078 GFAL 1.5.0 and lcg_utils 1.9.0
    • #1077 glite-yaim-3.0.1-9 update
    • #1074 Missing dependency on glite-yaim in metapackages
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug #23636
  • Patches Rejected:
    • Patch #1078 (GFAL 1.5.0 and lcg_utils 1.9.0) had to be withdrawn from PPS as it was found that lcg-del does not work on a classic SE. A follow up patch is already available.
  • Patches currently in PPS:
    Note that this includes the SL4 natively compiled WN
    • #1101 GFAL 1.9.0-2/lcg_utils 1.5.1-1
    • #1088 Removal of incorrect myproxy deps in glite-SE* metapackages
    • #1085 Missing package python-fpconst for SL3 installation
    • #1084 Missing dependency on lcg-expiregridmapdir for glite-WMS
    • #1077 glite-yaim-3.0.1-9 update
    • #1074 Missing dependency on glite-yaim in metapackages
    • #1066 yaim 3.0.1-10
    • #1052 fix reading LB super-users file
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug #23636
    • #915 Voms update to 1.7 branch

March 21st 2007

Certification

  • gLite 3.1 / SL4
    • Work continues testing the ETICS built native SL4 (32bit) binaries for the gLite 3.1 UI and WN.
    • Some problems with failing commands (bugs #24557 and #24556) are being investigated with the developers.
    • Updated configuration is being worked on for Globus and for setting the environment.
    • Problems with packaging of various components are still being solved.
  • Certified patches
    • #1059 (gsiopenssh-VDT1.2.6rhas_3-2 fixes vulnerability)
    • #1048 (bdii-3.8.8 has larger slapd cache and other improvements)
    • #1031 (Back-porting: blah 3.1 in branch 3.0)
  • Patches currently in certification:
    • #1089 Removal of incorrect apel-* deps from metapackages
    • #1088 Removal of incorrect myproxy deps in glite-SE* metapackages
    • #1086 RGMA Client Exception Addition
    • #1085 Missing package python-fpconst for SL3 installation
    • #1077 glite-yaim-3.0.1-9 update
    • #1075 list match works fine
    • #1074 Missing dependency on glite-yaim in metapackages
    • #1069 edg-mkgridmap-2.9.0 fixes bug 24343
    • #1052 fix reading LB super-users file
    • #1039 wms scripts changing
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug #23636
    • #1026 WMProxy memory allocation doesn't increase anymore
    • #1020 changes in wms-ui commands
    • #1015 UI Python commands
    • #993 Fixed the problem in loading a perl module for the limiter script
    • #980 Glue 1.3 Schema
    • #979 changes on wms component files
    • #978 Changes in wms script files
    • #976 OutputSandboxBaseDestURI (inheritance) works fine for Collections and DAGs
    • #938 a better dependencies handling in wms and wms-ui
    • #919 No Done event from LRMS
    • #918 LM dies with bad dagCondorLog file
    • #915 Voms update to 1.7 branch
    • #884 libtar fix for gLite 3.1
    • #735 lcg-mon-job-status, now uses stateEnterTime [ON HOLD]

PPS

  • gLite 3.0 PPS-update 24 just deployed. This update contains the following patches:
    • #1048 bdii-3.8.8 has larger slapd cache and other improvements
    • #1059 gsiopenssh-VDT1.2.6rhas_3-2 fixes vulnerability
  • Patches currently in PPS:
    • #1065 Missing dependency on glite-security-voms-admin-client for glite-UI metapackage
    • #1059 gsiopenssh-VDT1.2.6rhas_3-2 fixes vulnerability
    • #1056 Dcache 1.7.0-29 upgrade with GridFTP bug fixes
    • #1048 bdii-3.8.8 has larger slapd cache and other improvements
  • CERN_PPS Site:
    • Site upgraded to gLite 3.0.2 UPDATE 23. dist-upgrade failed for SLC4 WN due to unresolved package dependencies.
    • Information Provider on LSF CEs didn't return correct information at CERN_PPS. Some hacks are needed to the LSF configuration file.
    • Created test script to verify the status of all computing elements deployed on the site.

March 14th 2007

Certification

  • SL4/gLite3.1
    • Continuing tests on the gLite 3.1 (natively compiled on SL4 with ETICS) UI and WN.
    • Installation is still 'forced' at the rpm level and packaging needs to be cleaned up.
    • Some configuration still has to be done manually.
    • Problems found with a segfault from glite-job-list-match on the UI; under investigation.
  • Patches currently in certification:
    • #1059 gsiopenssh-VDT1.2.6rhas_3-2 fixes vulnerability
    • #1057 new version of lcg-infosites solving several bugs
    • #1052 fix reading LB super-users file
    • #1048 bdii-3.8.8 has larger slapd cache and other improvements
    • #1039 wms scripts changing
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug#23636
    • #1031 Back-porting: blah 3.1 in branch 3.0
    • #1026 WMProxy memory allocation doesn't increase anymore
    • #1020 changes in wms-ui commands
    • #1015 UI Python commands
    • #993 Fixed the problem in loading a perl module for the limiter script
    • #980 Glue 1.3 Schema
    • #979 changes on wms component files
    • #978 Changes in wms script files
    • #976 OutputSandboxBaseDestURI (inheritance) works fine for Collections and DAGs
    • #938 a better dependencies handling in wms and wms-ui
    • #919 No Done event from LRMS
    • #918 LM dies with bad dagCondorLog file
    • #884 libtar fix for gLite 3.1
    • #735 lcg-mon-job-status, now uses stateEnterTime

PPS

  • SAM for PPS: client and sensors upgraded to version 1.1.0
  • CERN_PPS site: Problems found in reproducing the special configuration of LSF CE previously done on ce110 and ce111.
  • gLite 3.0 PPS-update 22 deployed. This contains:
  • gLite 3.0 PPS-update 23 just released. This contains:
    • Dcache 1.7.0-29 upgrade with GridFTP bug fixes
    • lcg-vomscerts-4.4.0 (new certificates for biomed and egeode VOMS servers)
    • Other minor bug fixes
  • Patches currently in PPS:
    • #1068 lcg-vomscerts-4.4.0 has new cert for biomed/egeode
    • #1065 Missing dependency on glite-security-voms-admin-client for glite-UI metapackage
    • #1056 Dcache 1.7.0-29 upgrade with GridFTP bug fixes
    • #1051 yaim 3.0.0-38
    • #1045 Matchmaking fix for field containing a "."
    • #1036 gLite WMS respects $EDG_WL_SCRATCH
    • #1014 UI Python commands
    • #1013 WMProxy does not create all the needed job directories
    • #1003 Dcache 1.7 upgrade
    • #996 edg-mkgridmap 2.8.1 more robust

February 28th 2007

Certification

  • SL4
    • The first binaries for a natively compiled SL4 WN were produced.
    • Installation testing has started. Configuration is now being worked on.
    • Preparation of a repository for the SL3 WN binaries to be installed on SL4. This will solve earlier problems with maintaining such installations and has been released to PPS.
  • Development of tests for submitting jobs from multiple users to the glite-CE.
  • Addressing issues with tarball installation of UI (and WN) on lxplus.
  • #1013, #1014, #1036, #1045 (all WMS) certified
  • Patches in process:
    • #1042 DPM/LFC 1.6.3 - certification nearly done, progress reported in the patch; https://savannah.cern.ch/patch/?func=detailitem&item_id=1042
    • #884 libtar fix for gLite 3.1
    • #894 Rebranded GIP that includes improved LDIF parsing
    • #898 LCG-CE modifications for DGAS support
    • #901 Update of c-ares rpm
    • #911 VOMS Configuration update
    • #918 LM dies with bad dagCondorLog file
    • #919 No Done event from LRMS
    • #930 lcg-info-dynamic-scheduler release 2.0.1
    • #938 a better dependencies handling in wms and wms-ui
    • #953 Configuration changes for VOMS
    • #980 Glue 1.3 Schema
    • #996 edg-mkgridmap 2.8.1 more robust
    • #1003 Dcache 1.7 upgrade
    • #1031 Back-porting: blah 3.1 in branch 3.0
    • #1038 lcg-info-dynamic-scheduler performance improvement for bug #23636
    • #1040 lcg-info-dynamic-lsf new version
    • #1044 lsf_local_submit_attributes.sh
    • #1046 Condor plugin for lcg-info-dynamic-scheduler
    • #1048 bdii-3.8.8 has larger slapd cache and other improvements
    • #1051 yaim 3.0.0-38
    • #1055 glite-yaim-3.0.1-8 update

PPS

  • Deployment of gLite 3.0 PPS-update 20 completed. This contains:
    • patch #950 "New updated Torque (2.1.6-1cri_sl3_2st) and Maui ( 3.2.6p17-1_sl3)". Unfortunately problems have already been found with this patch.
  • Deployment of gLite 3.0 PPS-update 21 just announced. This contains:
    • #996 edg-mkgridmap 2.8.1 more robust
    • #1003 Dcache 1.7 upgrade

February 21st 2007

Certification

  • WN now builds natively on SLC4. Certification will now start.
  • dCache 1.7 (patch #1003) certified.
  • DPM/LFC 1.6.2 (patch #1010) rejected - a new patch has been produced (#1028). Unfortunately problems have also been found with this patch and we are now waiting for a new patch from DPM/LFC.
  • A WMS test for parametric jobs is available.
  • New tests for the information system related commands on the UI are available.
  • Investigation of problems found with stress tests against 3.1 WMS.
  • Work on yaim 3.0.1-7 finished; now in certification.
  • Another gLite 3.1 WMS has been installed for CMS.
  • Investigation of problems encountered with the TAR_UI on lxplus (SLC4/x86_64).
  • Patches currently in certification:
    • 735 lcg-mon-job-status, now uses stateEnterTime
    • 884 libtar fix for gLite 3.1
    • 894 Rebranded GIP that includes improved LDIF parsing
    • 898 LCG-CE modifications for DGAS support
    • 901 Update of c-ares rpm
    • 911 VOMS Configuration update
    • 918 LM dies with bad dagCondorLog file
    • 919 No Done event from LRMS
    • 930 lcg-info-dynamic-scheduler release 2.0.1
    • 938 a better dependencies handling in wms and wms-ui
    • 953 Configuration changes for VOMS
    • 969 glite-yaim-3.0.1-7 update
    • 980 Glue 1.3 Schema
    • 996 edg-mkgridmap 2.8.1 more robust
    • 1003 Dcache 1.7 upgrade

PPS

  • Deployment of gLite 3.0 PPS-update 20 started. This contains patch 950 which introduces Torque 2.
  • New site RAL-PPS joins the PPS.
  • Patch 1010 was introduced into the PPS in PPS-update 19 but quickly rejected due to deployment issues. However, it caused some DPM instabilities at 2 sites leading to complete re-installation and scratching of the database.
  • Patches currently in PPS:
    • 849 Support of SLC4 in Python configuration parser
    • 985 correct return code of edg_wll_QueryJobs for empty result
    • 991 Patch for org.glite.ce release 1.5.20
    • 950 New updated Torque (2.1.6-1cri_sl3_2st) and Maui ( 3.2.6p17-1_sl3)

January 31st 2007

Certification

  • Patch #907: Dcache 1.7.0
  • Patch #983: DPM/LFC 1.6.1 (glite-data_R_1_6_41) is on the certification testbed. It is being tested with the updated yaim configuration.
  • Condor 6.8.3 was installed on a glite-CE and WMS - confirmed fixed 100 job limit on glite-CE, no problems seen on WMS.
  • Blah patch 982 (org.glite.ce release 1.5.19) in certification. CERN LSF with old blah did not show this bug. Once upgraded to patched version, the bug appears. Maybe because log parser is running on a separate node and could not be upgraded at the same time. To be verified.
  • Debugging done on results of WMS 3.1 tests. Proxy renewal is not working.
  • VOMS - patches 869, 910 and 915 are being certified with developer assistance. 910 is close. A VOMS bug cleanup was also done.
  • Tarball for SL4 UI given to PPS and FIO.
  • WN/UI cleanup and build for 3.1 still in progress.
  • Work continues on building edg workload on SL4.
  • Patches currently in certification:
    • 884 libtar fix for gLite 3.1
    • 918 LM dies with bad dagCondorLog file
    • 919 No Done event from LRMS
    • 930 lcg-info-dynamic-scheduler release 2.0.1
    • 950 New updated Torque (2.1.6-1cri_sl3_2st) and Maui ( 3.2.6p17-1_sl3)

PPS

  • Feedback received from Alice after testing SL4 WNs in CERN_PPS: "The whole Alice software is perfectly running in SLC4 WNs. Jobs are successfully finishing through the corresponding VOBOX that we set up at CERN for these tests"
  • Feedback from ATLAS is that they have successfully completed their testing on the SLC4 WNs in the PPS.
  • Patches currently in PPS:
    • 849 Support of SLC4 in Python configuration parser 1 - On hold until cleared for release by the experiments
    • 874 WMS limiter (tcg item #301) 1 - On hold due to dependency on patch 941
    • 843 soappy addition to python api 7 - HIGH PRIORITY - Due to be released to production.
    • 968 condor-lcgrb improves Condor on lcg-RB 1 - Returned to certification due to packaging issue.
    • 869 Voms admin update 5 - Due to be released to production.
    • 941 config-glite 1.8 SLC3 fix 5 - Installing on PPS this week.
    • 910 VOMS Configuration update 7 - HIGH PRIORITY - Installing on PPS this week.

December 13th 2006

  • New version of APEL.
  • Correction of LFC installation script.
  • New release of condor for WMS.
  • Relations with external partners really improved with the last all-hands meetings.
  • New CERN CA has been tested without any problems.
  • Test of the new torque is OK.

November 15th 2006

  • Updated WMS (from CMS testing) released to production
  • job wrapper monitoring released to production
  • APEL 2 released to PPS
  • dCache with pnfs/postgresql in certification
  • gLite 3.0 WN on SL4 (compat mode) released to PPS
  • Still suffering on testbed from IP renumbering and firewalling

November 8th 2006

  • SL4 WN version delivered to PPS today for testing by the experiments.
  • Certification of modular YAIM.
  • New Dcache meta-package with postgres-based pnfs.
  • APEL is used for accounting and the latest version has been certified for the lcgCE. APEL works for both CEs and is DGAS compatible.

November 1st 2006

  • We delivered the job wrapper (SAM client) to PPS
  • A new version of the WMS
  • IP nightmare

October 25th 2006

  • An emergency security patch for Torque was made last Friday.
  • Normal evolution of the certification process.

October 18th 2006

  • Development of SAM modules is growing.
  • The procedure for scheduling patch priority from JRA1 to PPS is now handled completely in Savannah.
  • Testing by external sites of SGE, Condor and LSF starts in the testbed.
  • Very slow answer from CNAF SA3 team.
  • Site BDII is now separate from lcg-CE.
  • IP change preparation.

October 11th 2006

  • Progress on WMS certification for gLite 3.1.
  • LFC / DPM delivered to preproduction
  • Progress on the development of a new YAIM
  • Close cooperation with SAM developers (new test modules, new client package on the WN)

September 21st 2006

  • Progress in cooperation with external sites.
  • The LFC Oracle version has been given to us
  • A new machine is used for building the packages

September 13th 2006

  • 3 external sites will provide certification for condor and SGE.
  • SAM is the testing framework for the certification test suite (the VOMS test suite is now included).
  • A VOMS server is included in the testbed and managed by Maria Alandes.

September 6th 2006

  • On glite WMS and CE, applied the latest patches and installed condor 6.7.19
  • Tuned a lot of condor parameters following the suggestions of the developers.
  • Started certification process of gLite release 3.0.3
  • A lot of work has been done for SGE support.
  • Writing voms admin tests and updating the test plan for voms admin and voms core.
  • Decision to use the SAM portal as the framework for our test suites.
  • Started to implement the 'VOMS role'-based user configuration support in YAIM

August 16th 2006

  • We have a faster way to install and upgrade the nodes patch by patch: 4 huge machines are ready and everything is reinstallable in 20 minutes.
  • A bug in Linux with the id command causes errors in the execution of a few yaim scripts.
  • 2 external sites provide certification for the Sun Grid Engine and Dcache.

August 2nd 2006

  • Release 3.0.2 has been certified and delivered
  • We are currently working with a new process to accelerate the certification (installation / configuration).
  • Problems keeping the test suite synchronized with new functionality.
  • External sites are beginning to be really involved in our process (Sun Grid Engine, Dcache).
  • We have to reinstall a testbed supporting LSF.

June 19th 2006

  • We are waiting for a few new updates of WMS to begin the certification of the next release candidate; the last one was refused.
  • We set up a new certification testbed update procedure with more virtualization involved.
  • We now have an external certification site

June 1st 2006

  • New version of the configuration modules -config.py and YAIM to solve a lot of problems:
    • Information system of the gliteCE.
    • Possible Condor configuration problem
    • Few updates on WMS for middleware bugs (proxy renewal).
    • fetch-crl unification from IGTF

May 14th 2006

  • The main problems of the deployment are due to the new way to configure the CE.
  • The immature state of the gliteCE.
  • The confusion created by the lack of knowledge of the glite middleware and of its integration on an existing LCG site.
  • The old middleware has no major problems.

May 16th 2006

  • glite3.0.0: the Glite tarball WN / UI is ready and distributed. It is installable on SL3 and on Debian. A new YAIM arrived with a few bug fixes in the configuration. A lot of sysadmins are really slow to install due to the change in the configuration of Glite services.
  • A major upgrade with a lot of bug fixes, called Glite 3.0.1, is on the certification repository. Tests will run for a few days and adjustments will be made before any upgrade.

May 10th 2006

April 19th 2006

  • Testing of RC2 is pretty clean. On the cert TB the PPS is installing. A few fixes to do for RC3 in configuration, an upgrade of DPM/LFC, and bug fixes on glite WMS.

March 21st 2006

  • Glite3.0 RC1 has been delivered to the PPS. 14 sites have installed it (not all services on all sites). RC2 is currently in preparation and the results are pretty clean, with some bugs due to a limitation of the CA with gridftp, and proxy renewal. All meta-rpm names have been changed to glite-ROLE.

Feb 22nd 2006

  • Certification/Configuration process of Glite3.0 in progress

January 31st 2006

  • Release 2.7.0 is out. Sites are beginning to install it. One important thing is missing: the VOMS-enabled DPM. It will be updated as soon as the new VOMS and the VOMS-enabled DPM are certified.

November 30th 2005

  • Date for the release fixed to the 5th of January

List of new components

  • Last release of LFC and RLS
  • new version of lcgutils
  • FTS client
  • R-GMA 5
  • New YAIM with new rgma and new SE management

Work in progress

  • Testing and integration of the new components
  • Creation of a branch to reduce the number of new components to integrate and to continue the certification of new components that will not be integrated in LCG-2_7_0
  • Setting the specification of the new build system; all packages will be more uniform

November 16th 2005

Activities completed since last week

  • Integration of the new Grid Info Provider (easier to install), and info about running jobs.
  • New Workload management (various bug fixes)
  • New RGMA
  • Integration of new monitoring tools based on R-GMA.
  • FTS client

Work in progress

  • Check compatibility with SPI tools.

Issues

  • No new Glite tools integrated
  • Change of the SE management in YAIM not yet perfect

Dash Status

  • We decided not to expect new glite components.
  • Waiting for new storage tools around DPM.
  • YAIM new conf in progress.

-- TimBell - 02 Oct 2005


Topic revision: r69 - 2008-04-29 - AntonioRetico