Backwards compatibility in LHCb software

1. Introduction

The terms used in this TWiki are defined in the attached document, LHCb compatibility, which should be read before continuing on this TWiki.

In this TWiki we discuss the maintenance, implementation and testing of backwards compatibility in our software, along with what aspects of compatibility we feel we have the manpower to enforce.

2. Contents

3. Obligations for compatibility

As a scientific community which is both publicly funded and open-source, we are obliged to maintain the reproducibility of results we use for publications long after the original publication, and in some cases long after the experiment has ended. A commonly accepted best practice is to maintain reproducibility for ten years after publication.

As a collaboration which is working towards a common goal, we are encouraged to simplify the use of our software as much as possible and thus give the most productive division of manpower and expertise.

Both of these assertions imply:

  • we should maintain compatibility of our software, for future-proofing against unexpected use-cases, simplicity and reproducibility of analyses.

4. Statement about manpower

Unfortunately manpower is limited. Not every possible backwards compatibility can be checked or maintained. However, use cases which only become apparent in the future will require manpower in the future, so a certain level of backward compatibility now will reduce the manpower needed later. We choose to enforce backward compatibility wherever there is an already identified use case, and in aspects where compatibility problems have already arisen in the past, so as to avoid them in the future.

  • we enforce compatibility to reduce the manpower overhead in the future, where we have very good reasons for doing so

5. Compatibility which is maintained

The two systems (A and B) described in the attached document can be taken to refer to one of the following situations:

  1. Data as it comes from the experiment, the raw files (A), and the software stack of applications (B)
  2. Two different software applications (A and B) (e.g. Boole and Brunel, or Brunel and DaVinci)
  3. A software stack of all applications (A) and a specific tag in a specific version of the LHCb conditions database (B)

5.1 Compatibility of Raw Files

  • We will maintain the ability to fully process, with the latest software, raw files used for any analysis, beyond the lifetime of the experiment

5.2 Compatibility of applications

Compatibility of applications is maintained through the format of the intermediate files which are exchanged between them. For the Brunel/DaVinci combination, these are DSTs. It is essential that new code is able to process or re-process older files, but it is not strictly necessary for old code to be able to process new files.

Keeping old code able to process new files is often very useful, but requires a lot of manpower. It is therefore maintained only on a best-effort basis, or where there is an identified specific need in production. At the moment we do not have any standardized tests for this, nor do we enforce or guarantee backwards compatibility in this case.

  • New code will always be compatible with both new and old files

  • Old code is not always compatible with new files, but all incompatibilities are well documented and advertised

5.3 Compatibility of applications with the database

The conditions database is a complicated way to hold a set of numbers. A tag of the database is a semi-stable, fixed set of numbers which usually produces a valid result. Improvements to the database simply refine existing numbers or add new numbers, and tag the changes. It follows that there is almost never any justification for introducing an incompatibility in the database. Despite this, incompatibilities may be introduced very regularly if care is not taken during development.

Since manpower is still limited, it is not possible to check and maintain backwards compatibility for every version of every application. Instead we choose to maintain:

  • the versions which were used for major yearly reprocessings
  • the versions of the trigger software which were used in the Pit for a large fraction of the data

It is then feasible to maintain compatibility between this limited set of versions and every self-consistent set of database tags, through nightly testing.

Old and new software combined with any self-consistent point in the database should always be valid.

  • New code will always be compatible with both old and new database tags
  • Old code will be compatible with both old and new database tags, patching only where strictly necessary. If patches are required they will be applied to a given subset of applications only.

6. Tests

6.1 Fixed-file tests

To check our ability to process old files we have standardized tests each night. These are the so-called 'fixed file' tests, where for each production application the new software is run over old files and the outcome is checked to be essentially the same as it was last time. The output is not expected to be exactly the same, since that would mean we had made no progress, but any changes should always be improvements.
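The comparison made by such a test can be sketched as follows. This is a minimal illustration in Python, not the actual nightly test code: the function name, the counter names, the dictionary layout and the 1% tolerance are all assumptions chosen for the example.

```python
import math


def compare_to_reference(reference, current, rel_tol=0.01):
    """Return a list of human-readable differences between two sets of counters.

    `reference` and `current` map a counter name to a number, for example the
    number of reconstructed tracks. The names and the tolerance are hypothetical.
    """
    problems = []
    for name, ref_value in sorted(reference.items()):
        if name not in current:
            problems.append(f"{name}: missing from new output")
        elif not math.isclose(current[name], ref_value, rel_tol=rel_tol):
            problems.append(f"{name}: {ref_value} -> {current[name]}")
    return problems


# Hypothetical counters from the previous and the current nightly run.
reference = {"events_processed": 1000, "tracks": 45210, "vertices": 1873}
current = {"events_processed": 1000, "tracks": 45302, "vertices": 2100}

for line in compare_to_reference(reference, current):
    print(line)
```

Differences are reported rather than rejected automatically, because a person still has to judge whether a change is an improvement.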

6.2 Latest database testing

To check the compatibility of the database with the latest software we also have standardized tests each night. These are also part of the 'fixed file' tests, where for each production application the new software is run over old files with different sets of database tags: the tag which was used to create the file, the latest global tag, and all of the latest local tags. Again the outcome is checked to be essentially the same as it was last time.
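The set of jobs this implies can be sketched as a small test matrix. The sketch below is only illustrative: the function names, file names and tag names are hypothetical, and the grouping of the tags into three configurations per file is an assumption, since the exact grouping used by the nightly tests is not specified here.

```python
from itertools import product


def tag_configurations(creation_tag, latest_global, latest_locals):
    """Return the database configurations to test for one file.

    Each configuration is a (global_tag, local_tags) pair. The grouping into
    three configurations is an assumption made for this sketch.
    """
    return [
        (creation_tag, ()),                     # tag used to create the file
        (latest_global, ()),                    # latest global tag alone
        (latest_global, tuple(latest_locals)),  # latest global plus local tags
    ]


def build_test_matrix(applications, files, latest_global, latest_locals):
    """Yield one (application, file name, global tag, local tags) per job."""
    for app, (file_name, creation_tag) in product(applications, files.items()):
        for global_tag, local_tags in tag_configurations(
            creation_tag, latest_global, latest_locals
        ):
            yield app, file_name, global_tag, local_tags


# Hypothetical file names and tag names, for illustration only.
files = {"old_2010.dst": "tag-2010", "old_2011.dst": "tag-2011"}
jobs = list(build_test_matrix(["Brunel", "DaVinci"], files, "tag-latest", ["local-a"]))
print(len(jobs))
```

The number of jobs grows as applications times files times configurations, which is one reason the set of maintained versions has to stay small.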

6.3 lhcb-compatibility nightly

To check the compatibility of the database with the older software we have the lhcb-compatibility nightly. The lhcb-compatibility nightly slot runs patched versions of old software against the latest local and global database tags to ensure the latest tags are backward compatible. These tests are very simple: they only check that no warnings or errors are produced during processing, and so represent the bare minimum of database backwards compatibility.
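A check of this kind can be sketched as a scan of the job log. This is a minimal illustration, not the code used in the nightly slot: the log format, the severity keywords and the component names in the excerpt are assumptions.

```python
import re

# Assumption: each log line carries a severity word such as WARNING, ERROR or
# FATAL. The real log format is not specified in this document.
PROBLEM_PATTERN = re.compile(r"\b(WARNING|ERROR|FATAL)\b")


def find_problems(log_text):
    """Return the log lines that contain a warning or error severity."""
    return [line for line in log_text.splitlines() if PROBLEM_PATTERN.search(line)]


# Hypothetical log excerpt with made-up component names.
log = """\
ComponentA    INFO Job started
ComponentB WARNING Condition not found, using default
ComponentA    INFO Job finished
"""

problems = find_problems(log)
print("PASS" if not problems else "FAIL")
```

Because the test only looks for warnings and errors, it says nothing about whether the physics output is still correct.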

7. Modifications and consequences

7.1 When to patch?

In the attached document a patch is described as an improvement which is not necessarily backwards-compatible. Patching should only be done when absolutely necessary, because it requires a lot of manpower and documentation to ensure the users are aware of the changes made. This is discussed in much more detail in the attached document.

  • Patches are required if true bugs are noticed in existing code
  • Patches are desirable when the manpower of back-porting the patch is less than the manpower of fixing the compatibility

Examples of patches include:

  • backwards-incompatible event model improvement
  • fixing a C++ bug which was only exposed with a new database version

7.2 Schema evolution in Root

If an event model class has only simple additions of member variables or modifications to member functions, the Root schema evolution ensures backwards compatibility, provided that all the constructors properly initialize all member variables; otherwise the behavior can be unpredictable.

If a new class is added, or if the type of member variables is changed then the Root schema evolution will probably be unable to handle this and new files will be unreadable without a patch to the old software.
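The point about constructors can be illustrated with an analogy in Python. Root itself is not used here, and the class and its members are hypothetical. The sketch only shows why a member added in a newer version needs a well-defined initial value: when an old record is read, nothing in the file sets it.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical event model class (newer version, with one added member)."""

    momentum: float = 0.0
    charge: int = 0
    quality: float = -1.0  # added in the newer version; -1.0 means "not stored"


# A record written by older software, before `quality` existed.
old_record = {"momentum": 12.5, "charge": 1}

# The added member is not in the record, so it keeps its initial value.
track = Track(**old_record)
print(track)
```

In C++ an uninitialized member would hold an arbitrary value instead of a defined default, which is consistent with the erratic behavior described above.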

7.3 Forbidden database actions

Many backwards-incompatible changes are made because the meaning of a given condition in the database is changed. Say, for example, there is an entry which corresponds to the list X,Y,Z. If this is changed so that it is from now on interpreted as 2X,Y,Z, or Z,Y,X, or maybe even A,B,C,X,Y,Z, the previous code becomes physically incorrect. If there is some advantage to changing the parameters, it is always better to define a new set of parameters which hold the new values and are used by the new software.

So, the following actions are generally forbidden:

  • removal of a condition which has been used
  • changing the meaning of a condition which has been used

Rather than removing a condition, it costs nothing to keep it in the database so that old software can still run.
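The two forbidden actions can be partly checked by comparing two tags. The sketch below is an illustration under stated assumptions: a tag is represented as a plain dictionary from condition name to a list of numbers, all condition names are hypothetical, and a change in the number of entries is used as a rough proxy for a change of meaning.

```python
def forbidden_changes(old_tag, new_tag):
    """Compare two database snapshots and list generally forbidden actions.

    Changed values are allowed, since they are refinements. Removed conditions
    are reported, as are conditions whose number of entries has changed.
    """
    problems = []
    for name, old_values in sorted(old_tag.items()):
        if name not in new_tag:
            problems.append(f"removed: {name}")
        elif len(new_tag[name]) != len(old_values):
            problems.append(f"layout changed: {name}")
    return problems


# Hypothetical conditions. "Offsets" holds X, Y, Z in the old tag.
old_tag = {"Offsets": [1.0, 2.0, 3.0], "Gain": [0.98]}

# Forbidden: "Offsets" reinterpreted as A, B, C, X, Y, Z and "Gain" removed.
bad_tag = {"Offsets": [7.0, 8.0, 9.0, 1.0, 2.0, 3.0]}

# Preferred: old conditions kept (values may be refined), a new one added.
good_tag = {
    "Offsets": [1.0, 2.1, 3.0],
    "Gain": [0.98],
    "OffsetsExtended": [7.0, 8.0, 9.0, 1.0, 2.1, 3.0],
}

print(forbidden_changes(old_tag, bad_tag))
print(forbidden_changes(old_tag, good_tag))
```

A reinterpretation that keeps the same number of entries, such as 2X,Y,Z or Z,Y,X, cannot be detected this way and still relies on the discipline of the developer.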

7.4 Best practice

It is always a good idea to protect code against backwards-incompatible changes, so that it keeps working if such changes are made in the future. Usually that means protecting C++ snippets against missing conditions and falling back on a previous default value, but it also extends to choosing the initial parameters for the database carefully, with backwards compatibility in mind, to reduce headaches in the future.

8. Coded examples
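
The protection described in section 7.4, falling back on a previous default when a condition is missing, can be sketched as follows. The sketch is written in Python for brevity, although production code would typically be C++. The condition names, the default values and the position of X, Y, Z within the newer parameter set are all hypothetical.

```python
def read_condition(conditions, name, default):
    """Return the named condition, or `default` if the tag does not have it."""
    return conditions.get(name, default)


def offsets(conditions):
    """Prefer the newer parameter set, then the older one, then a default."""
    extended = read_condition(conditions, "OffsetsExtended", None)
    if extended is not None:
        return extended[-3:]  # assumption: X, Y, Z are the last three entries
    return read_condition(conditions, "Offsets", [0.0, 0.0, 0.0])


# Hypothetical tags: an old one, a new one, and one missing both conditions.
old_tag = {"Offsets": [1.0, 2.0, 3.0]}
new_tag = {"Offsets": [1.0, 2.0, 3.0], "OffsetsExtended": [7.0, 8.0, 9.0, 1.0, 2.1, 3.0]}

print(offsets(old_tag))
print(offsets(new_tag))
print(offsets({}))
```

The same code then runs against old and new tags, and a missing condition leads to a known default rather than a failure.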

9. Summary

In general:

  • we should maintain compatibility of our software, for future-proofing against unexpected use-cases, simplicity and reproducibility of analyses.
  • we enforce compatibility to reduce the manpower overhead in the future, where we have very good reasons for doing so

For raw files:

  • We will maintain the ability to fully process, with the latest software, raw files used for any analysis, beyond the lifetime of the experiment

Between files and software:

  • New code will always be compatible with both new and old files
  • Old code is not always compatible with new files

Between software and database:

  • New code will always be compatible with both old and new database tags
  • Old code will be compatible with both old and new database tags, patching only where strictly necessary. If patches are required they will be applied to a given subset of applications only.

So, the following database actions are generally forbidden:

  • removal of a condition which has been used
  • changing the meaning of a condition which has been used

Backwards-incompatible changes are foreseen, but they are always to be avoided where possible:

  • Patches are required if true bugs are noticed in existing code
  • Patches are desirable when the manpower of back-porting the patch is less than the manpower of fixing the compatibility

-- RobLambert - 22-Feb-2011

Topic revision: r4 - 2011-02-24 - RobLambert
 