LCG-2_6_0 Post Mortem

The tests by the ROCs have been very valuable as the first tests by users. They have to be part of every release from now on. The 3 ROCs needed 5 working days to deploy and give feedback. We need at least one additional week to implement all fixes and clean up the release. The final packaging, which gets the software into a shape in which it can be given to the ROCs for pre-release tests, takes about a week. As a consequence, we have to stop integration and initial testing 3 weeks before the release date.
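The three-week figure is simply the sum of the three phases above. A minimal sketch of that arithmetic, assuming each phase is rounded to one calendar week and the phases run back to back; the function names and the example release date are hypothetical:

```python
from datetime import date, timedelta

# Phase lengths in weeks, taken from the estimates in the text.
# The ordering (packaging, then ROC tests, then fixes) is an assumption.
PHASES = [
    ("final packaging for the ROCs", 1),
    ("ROC deployment and feedback", 1),  # 5 working days
    ("fixes and release cleanup", 1),
]


def integration_freeze(release_date):
    """Return the date on which integration and initial testing must stop."""
    total_weeks = sum(weeks for _, weeks in PHASES)
    return release_date - timedelta(weeks=total_weeks)


def schedule(release_date):
    """Return (phase, start, end) tuples leading up to the release date."""
    start = integration_freeze(release_date)
    plan = []
    for name, weeks in PHASES:
        end = start + timedelta(weeks=weeks)
        plan.append((name, start, end))
        start = end
    return plan


if __name__ == "__main__":
    release = date(2005, 9, 26)  # hypothetical release date
    print("freeze on", integration_freeze(release))
    for name, start, end in schedule(release):
        print(f"{start} -> {end}: {name}")
```

If any phase turns out to need more than a week, only the `PHASES` table changes and the freeze date moves accordingly.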

Problems that have been identified with the 2.6 release

1) The local deployment tests had not been finished before the release candidate was sent to the "3 ROCs" for testing.

The result was duplicated work and a potential loss of confidence caused by trivial problems surfacing.

2) The current test suites (Gilbert's stress tests and Piotr's SFT) are not following the evolution of new functions closely enough:

a) Gilbert can't know about changes, and in addition he works on the maintenance of the tests only on a "voluntary" basis.

b) Many of the tests still assume that the RLS is being used.

3) We have been bitten again by junk (partial) data in the information system.

4) It is very hard to close the door on changes, new components, and fixes in time to get the release integrated and tested. The reason is the long interval between releases and the lack of an easy way to provide updates.

5) In addition to an overhaul of the stress tests, we need to add performance tests.

6) We released with a known bug on the RB that could block RBs.

Proposed solutions:

Well, sort of...

3) There is not much that we can do here: you can't cover the complete junk space. Obvious errors are filtered by the BDII.

6) We should define a set of core services on the RB, BDII, CE, ... Before we release, all open bugs related to core services have to be reviewed by another team member, and the severity level has to be adjusted.

About all the rest

We need a test person who coordinates the test evolution and maintains the tests (or individuals for certain areas). The role of the SFT-2 in certification has to be understood.

It is clear that we have to wait until the tests end before we (pre)release.

The information about changes should be collected in the patch submission step via Savannah.

We think that information like this should be provided whenever a patch introduces changes:

cron-jobs

requirement for GSI infrastructure

host/service cert needs to be registered with

ports

log files

configuration parameters and their meaning

location of conf. files

Where state is kept (db, files)

uses information system for?

depends on other services

changes in usage

suggested tests

............ and ????

The list is certainly incomplete. For patches that don't change anything in the above list, the developer can just check a box declaring that there are no changes.
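The change information above could be captured as a simple structured declaration attached to each patch. The following is only a sketch of how such a check might look: the field names, the `no_changes` flag, and the `check_declaration` helper are hypothetical and are not part of Savannah.

```python
# Field names are hypothetical; they mirror the items listed above.
CHANGE_FIELDS = [
    "cron_jobs",
    "gsi_requirements",
    "cert_registration",
    "ports",
    "log_files",
    "config_parameters",
    "config_file_locations",
    "state_storage",
    "information_system_usage",
    "service_dependencies",
    "usage_changes",
    "suggested_tests",
]


def check_declaration(declaration):
    """Return a list of problems; an empty list means the declaration is acceptable."""
    problems = []

    # Reject keys that are neither a known field nor the "no changes" box.
    unknown = sorted(set(declaration) - set(CHANGE_FIELDS) - {"no_changes"})
    for key in unknown:
        problems.append(f"unknown field: {key}")

    described = [f for f in CHANGE_FIELDS if declaration.get(f)]
    no_changes = bool(declaration.get("no_changes"))

    if no_changes and described:
        problems.append(
            "'no_changes' is checked but changes are described: "
            + ", ".join(described)
        )
    if not no_changes and not described:
        problems.append("describe at least one change or check 'no_changes'")
    return problems


if __name__ == "__main__":
    print(check_declaration({"no_changes": True}))
    print(check_declaration({"log_files": "adds a new log file"}))
    print(check_declaration({}))
```

A patch with nothing to declare passes by setting only `no_changes`, which corresponds to the check box mentioned above.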

Performance Tests

We'll start with the RB. Together with the EIS people (Andrea + ?), we have to create a workload that reproduces the behavior observed during the DCs. We can then use this as a standardized benchmark.

Other tests are probably needed for data management.

We realized that the long release intervals are at the root of many problems.

Plan: Release every 3 months as before, add releases for special activities, and provide updates for the current release.

Current definition for the term "middleware update":

A middleware update is a change of the software that can be applied to the existing systems without changing the configuration.

This can be translated to: whenever you can get the job done with APT alone, it is an update!

YAIM is not seen as being part of the middleware, and bug fixes are released as soon as they are available, independent of the need for a change.

The big question is who will be the test maintainer?

Topic revision: r2 - 2006-11-28 - LaurenceField
 