Condition Data Access in AthenaMT

Introduction

This page describes the mechanism of accessing condition data by serial and multi-threaded Athena jobs and contains real-world examples demonstrating changes in the user code (Algorithms, Tools, Services) required for the migration to the new condition access infrastructure in AthenaMT.

Condition data access in Athena (serial and MT)

Condition data access in Serial Athena

In serial Athena, Conditions are managed by the Interval of Validity Service (IOVSvc). At the start of a job, the IOVSvc is configured to manage a number of objects in an associated Conditions Database (COOL Conditions DB), which stores the value of each object for each IOV. At the start of each event, the IOVSvc examines the validity of each registered object. Objects that are no longer valid are re-read through the IOVDbSvc, which reads them either directly from the COOL DB or from its local cache, and any required post-processing of the data is performed by an associated callback function. The processed objects are then placed in the Transient Detector Store (StoreGate), from which they can be retrieved by a user Algorithm.

This workflow fails when multiple events are processed concurrently. Since only a single instance of the condition data can be held in the detector store at any one time, if two events are processed concurrently, with associated condition data from different IOVs, one will overwrite the other. Furthermore, neither the IOVSvc itself nor any of the condition callback functions were designed to be thread safe, and since these are shared instances, threading problems are bound to occur. A major rewrite of the entire infrastructure is therefore required.

Condition data access in AthenaMT

Several different designs for the condition handling were examined, with two key requirements in mind: minimize changes to client code, and minimize memory usage.

Processing Barrier. The first considered design was to use a processing barrier, such that only events where all Condition objects were unchanged were processed concurrently. No new events would be scheduled until all events within the same Condition region had finished processing. Then the condition store would be updated using the IOVSvc machinery, and new events could be scheduled. By utilizing this technique, very few changes would need to be made to the client code, and there is no extra memory usage, as there is only one instance of the condition store. The majority of the work would be in making the scheduler aware of the Condition boundaries, and doing the appropriate filling and draining of associated events. The fundamental problem with this method, however, is that it assumes that Condition boundaries are infrequent, so that the loss of concurrency when the scheduler is drained and refilled is minimal. On ATLAS, however, Condition changes can sometimes occur very rapidly, for example as frequently as once per event in the Muon subsystem. This would have the effect of serializing event processing, with complete loss of concurrency. Another problem is that it assumes that all events are processed in sequence. If events are out of order near a Condition boundary, then the processing barrier could be triggered multiple times, once again resulting in a significant loss of concurrency.
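To make the cost of the barrier concrete, here is a minimal sketch that counts how often the scheduler would have to drain and refill. It is a toy model and not Athena code: the function name count_barrier_drains and the idea of labelling each event with an integer Condition-region index are assumptions made purely for illustration.

```python
def count_barrier_drains(region_per_event):
    """Count how many times the scheduler must drain before admitting the next event.

    region_per_event: Condition-region index of each event, in arrival order.
    A drain is needed whenever an event belongs to a different region than
    the event that arrived just before it.
    """
    drains = 0
    for previous, current in zip(region_per_event, region_per_event[1:]):
        if current != previous:
            drains += 1
    return drains


in_order = [0, 0, 0, 1, 1, 1]        # one boundary, events in sequence
out_of_order = [0, 0, 1, 0, 1, 1]    # same events, two swapped near the boundary
per_event = [0, 1, 2, 3, 4, 5]       # conditions change at every event

print(count_barrier_drains(in_order))      # 1
print(count_barrier_drains(out_of_order))  # 3
print(count_barrier_drains(per_event))     # 5
```

In this model a single pair of swapped events turns one drain into three, and per-event Condition changes force a drain before every event, which is the serialization described above.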

Multiple Condition Stores. Another proposed design was to use multiple condition stores, one per concurrent event, in the same manner as the Event Stores are duplicated for each concurrent event. The mechanism by which data is retrieved from the condition store would be modified, such that clients would associate with the correct Store. Impact on client code would be small: only the condition data retrieval syntax would need to be updated. However, beyond merely ensuring thread safety of the IOVSvc and the callback functions, there are two significant problems with this design: the memory usage would balloon, as objects would be duplicated between each Store instance; and also the execution of the callback functions that are used to process data would be duplicated, resulting in extra CPU overhead.

Multi-Cache Condition Store. The chosen solution is a hybrid of the two preceding designs, with a single condition store that holds containers of condition data objects, where the elements in each container correspond to individual IOVs. Clients access Condition objects via smart references, called ConditionHandles, which implement the logic to determine which element in any ConditionContainer is appropriate for a given event. The callback functions are migrated to fully-fledged Condition Algorithms, which are managed by the framework like any other Algorithm, but only executed on demand when the Condition objects they create need to be updated.
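The idea of one store holding per-IOV containers can be sketched as follows. This is an illustrative toy, not the Athena implementation: the ConditionContainer class here, its insert and find methods, the "PixReco" key, the integer timestamps and the half-open validity convention (start <= t < stop) are all assumptions chosen for the example.

```python
class ConditionContainer:
    """Toy container: one payload per Interval of Validity (IOV)."""

    def __init__(self):
        # List of (start, stop, payload); a payload is valid for start <= t < stop.
        self._entries = []

    def insert(self, start, stop, payload):
        self._entries.append((start, stop, payload))

    def find(self, event_time):
        """Return the payload whose IOV contains event_time, or None."""
        for start, stop, payload in self._entries:
            if start <= event_time < stop:
                return payload
        return None


# A single store maps keys to containers; nothing is duplicated per event slot.
store = {"PixReco": ConditionContainer()}
store["PixReco"].insert(0, 100, "calibration A")
store["PixReco"].insert(100, 200, "calibration B")

# Two events in flight at the same time, on either side of an IOV boundary.
print(store["PixReco"].find(99))   # calibration A
print(store["PixReco"].find(100))  # calibration B
print(store["PixReco"].find(250))  # None: no element covers this event yet
```

Because each lookup is keyed on the event's own time, concurrent events from different IOVs read different elements of the same container without overwriting each other.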

Building blocks of the Condition Data Access infrastructure in AthenaMT

Condition Handles

One of the fundamental changes in the client code needed for the migration to AthenaMT is that all access to event data must be done via smart references, called DataHandles. We capitalized on the migration to DataHandles by requiring that all access to Condition data be done via related ConditionHandles. There are two types of ConditionHandles:

  • ReadCondHandle is used for read-only access to condition objects. Upon dereferencing, this handle uses the current event and run numbers to look inside the Container and find the object corresponding to the current Interval of Validity.
  • WriteCondHandle is used by Condition Algorithms for creating new condition objects and storing them in the corresponding Condition Container.

Condition Store and Condition Containers

The Condition Store holds containers of Condition Data Objects called Condition Containers. Elements in each Condition Container correspond to individual Intervals of Validity. Condition Objects are inserted into Condition Containers by Condition Algorithms.

Condition Algorithms

Condition Algorithms create new Condition Objects and store them into Condition Containers. By using ConditionHandles in the Condition Algorithms to write data to the Condition Store, the framework solves the problem of Algorithm ordering for us, ensuring that the Condition Algorithm is executed, and the updated Condition Objects are written to the Condition Store, before any downstream Algorithm which needs to use them is executed.

When a Condition Algorithm is executed, it queries the Conditions Database for data corresponding to the current event, as well as its associated IOV, creates the new object for which it is responsible, and adds a new entry in the Condition Container that is associated with a WriteCondHandle. When a downstream Algorithm that needs to read this object from the Condition Store via ReadCondHandle is executed, the data is guaranteed to be present. The ReadCondHandle uses the current IOV to identify which element in the container is the appropriate one, and returns its value.

Condition Service

All Condition Handles are registered with the Condition Service. At the start of the event, the Condition Service analyzes the subset of the objects held in the condition store that have been registered with it, and determines which are valid or invalid for the current event. If an object is found to be invalid, the Condition Algorithm that produces that object will be scheduled. If an object is found to be valid, the Scheduler will be informed that this object is present, and the object will be placed in the registry of existing objects. In this case the Condition Algorithm will not execute.
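The per-event decision described above can be sketched as a small function. This is a simplified model under stated assumptions and not the real Condition Service interface: the function algorithms_to_schedule, the two dictionaries and the keys "RawObject" and "DerivedObject" are hypothetical, and IOVs are taken to be half-open integer ranges.

```python
def algorithms_to_schedule(valid_ranges, producers, event_time):
    """Return the names of the Condition Algorithms that must run for this event.

    valid_ranges: key -> list of (start, stop) IOVs already in the store
    producers:    key -> name of the Condition Algorithm that writes that key
    """
    to_run = []
    for key, algorithm in producers.items():
        ranges = valid_ranges.get(key, [])
        is_valid = any(start <= event_time < stop for start, stop in ranges)
        if not is_valid and algorithm not in to_run:
            to_run.append(algorithm)
    return to_run


producers = {"RawObject": "CondInputLoader", "DerivedObject": "MyCondAlg"}
valid_ranges = {"RawObject": [(0, 100)], "DerivedObject": [(0, 50)]}

print(algorithms_to_schedule(valid_ranges, producers, 10))   # []
print(algorithms_to_schedule(valid_ranges, producers, 70))   # ['MyCondAlg']
print(algorithms_to_schedule(valid_ranges, producers, 120))  # ['CondInputLoader', 'MyCondAlg']
```

Only the producers of objects that are invalid for the event are returned; for everything else the existing object is simply reported as present.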

Client Condition Algorithms need to interact with the Condition Service only for registering Write Condition Handles. For using Read Condition Handles there is no need to interact with the Condition Service.

CondInputLoader

CondInputLoader is a special Condition Algorithm which plays a key role in the AthenaMT Condition Access infrastructure. All condition data objects which are retrieved from the Condition Database (aka Raw Condition Objects) need to be declared to the CondInputLoader at the configuration step. CondInputLoader creates a Write Condition Handle dependency on all these Raw Condition Objects, such that downstream clients can access them via Read Condition Handles. As a result of this data dependency, at every event the CondInputLoader is executed prior to any other Condition Algorithm. It interacts with the Condition Database (via the IOVSvc and IOVDbSvc mechanisms) and makes sure that, for the given event, all valid Raw Condition Objects are accessible to downstream clients via the Condition Store.

Enabling MT Condition Access in Reconstruction

The new Condition Access Infrastructure is expected to seamlessly work in serial Athena and in AthenaMT. Hence, in order to allow adiabatic migration of the client code, the following changes have been introduced into IOVDbSvc/CondDB.py :

from AthenaCommon.AppMgr import ServiceMgr as svcMgr 
from IOVSvc.IOVSvcConf import CondSvc 
from IOVSvc.IOVSvcConf import CondInputLoader 
from AthenaCommon.AlgSequence import AthSequencer 
import StoreGate.StoreGateConf as StoreGateConf 

condInputLoader = CondInputLoader( "CondInputLoader") 
condSeq = AthSequencer("AthCondSeq") 

svcMgr += CondSvc() 
svcMgr += StoreGateConf.StoreGateSvc("ConditionStore") 
condSeq += condInputLoader

As a result of this, if some Athena job wants to use the new Condition Access mechanism it should do:

from IOVDbSvc.CondDB import conddb

While in AthenaMT the correct execution order of algorithms is guaranteed by the Gaudi Scheduler, we need to make sure Condition Algorithms are executed prior to other condition clients in serial Athena as well. For this purpose we have introduced a new Condition Sequence which is executed by serial Athena prior to Top Sequence.

Migration of client code

General strategy

In serial Athena, Condition Data clients usually register callback functions on Condition Data objects. Inside the callback function the client retrieves the data object from the Detector Store (Store Gate) and caches it in a private data member (Raw Condition Data Object). Sometimes, in addition to object retrieval, the client also performs some computation using the condition data object as input and caches the results of these computations (Derived Condition Data Object).

In order to migrate such code to the new Condition Data Access infrastructure the following needs to be done:

  • Raw Condition Data Object needs to be declared to CondInputLoader;
  • Condition Data Objects need to be accessed in the client code via Read Condition Handles;
  • Callbacks must be dropped;
  • If a callback produces Derived Condition Data Objects, it needs to be transformed into a Condition Algorithm. The latter will read its input objects via Read Condition Handles and store its outputs into the Condition Store using Write Condition Handles.

Simple example (does not require Condition Algorithm)

A simple example of Condition Data Client migration can be illustrated by CaloLCClassificationTool from Calorimeter/CaloUtils package, which was migrated back in the SVN era. The code before migration is in CaloUtils-01-00-64, and after migration in CaloUtils-01-01-02.

Changes in the header file (CaloLCClassificationTool.h)

1. Added include statement for ReadCondHandleKey

#include "StoreGate/ReadCondHandleKey.h"

2. Dropped the callback declaration

virtual StatusCode LoadConditionsData(IOVSVC_CALLBACK_ARGS) override;

3. Replaced DataHandle to the condition object

const DataHandle<CaloLocalHadCoeff> m_data;

with ReadCondHandleKey

SG::ReadCondHandleKey<CaloLocalHadCoeff> m_key;

Changes in the source file (CaloLCClassificationTool.cxx)

1. Added include statement for ReadCondHandle

#include "StoreGate/ReadCondHandle.h"

2. Constructed ReadCondHandleKey (in the constructor of the tool):

m_key("EMFracClassify"),

Note: This is an example where, for the given Condition Data Object, the COOL folder name is different from the Store Gate key. We need to pass the latter ("EMFracClassify") as a parameter to the constructor of the ReadCondHandleKey.

3. Changes in initialize()

3.1. Dropped callback registration

3.2. Initialized ReadCondHandleKey

ATH_CHECK( m_key.initialize() );

4. Changes in the method which uses condition data object (CaloLCClassificationTool::classify())

4.1. Access condition object through ReadCondHandle

const CaloLocalHadCoeff* condObject(0);
SG::ReadCondHandle<CaloLocalHadCoeff> rch(m_key);
condObject = *rch;
if(condObject==0) {
  ATH_MSG_ERROR("Unable to access conditions object");
  return CaloRecoStatus::TAGGEDUNKNOWN;
}

4.2. Throughout the method replaced m_data with condObject.

5. Dropped the definition of the callback function.

CONDCONT_DEF for the Condition Container

There is a special version of the CLASS_DEF macro, called CONDCONT_DEF, which must be used for Condition Objects. It declares the ClassID of the container and enables loading by the CondInputLoader. In order to be able to store the Condition Object (CaloLocalHadCoeff in this particular example) in the Condition Store, the following was added to the CaloLocalHadCoeff.h header file (package Calorimeter/CaloConditions):

#include "AthenaKernel/CondCont.h"
CONDCONT_DEF( CaloLocalHadCoeff , 82862607 );

The two clid commands below both print the ClassID (82862607 in the above example) of the templated container, not of the CaloLocalHadCoeff class itself.

> clid -cs CaloLocalHadCoeff
82862607
> clid -s "CondCont<CaloLocalHadCoeff>"
82862607

Changes in configuration

We need to inform CondInputLoader that it needs to manage data of the appropriate COOL folder, as described above. This can be done through the updated interface of the conddb python object. Previously, to declare a COOL folder to IOVDbSvc we had to do:

conddb.addFolder("CALO","/CALO/HadCalibration2/CaloEMFrac")

Now, in order to declare the same COOL folder to the CondInputLoader as well, we need to do:

conddb.addFolder("CALO","/CALO/HadCalibration2/CaloEMFrac", className='CaloLocalHadCoeff')

More complex example (requires Condition Algorithm)

Client code migration to the new Condition Data Access infrastructure becomes more complex when the client code inside its condition callback function not only retrieves Raw Condition Data object(s) but, in addition to that, performs some computation using this object as input and caches the results locally. In such cases the callback function needs to be transformed into a new Condition Algorithm class and the computation results need to be stored into Condition Object (using WriteCondHandle), which will be accessible to the original client via ReadCondHandle.

One example of such condition client is PixelConditionsTools/PixelRecoDbTool. The code prior to the migration can be accessed in SVN at PixelConditionsTools-00-07-01. In this implementation, PixelRecoDbTool contains a private data member

mutable PixelCalib::PixelOfflineCalibData* m_calibData;

which is used to store the results of computations implemented in the callback function PixelRecoDbTool::IOVCallBack().

Migration Strategy

New Condition Object

Given that the old implementation already uses a well-defined object, PixelOfflineCalibData, for caching computation results in the callback function (this is not the case for all Condition Clients!), we don't need to introduce a new Condition Object for the migration, but can simply use PixelOfflineCalibData for this purpose. The only change that needs to be implemented is the addition of CONDCONT_DEF, as described in the CONDCONT_DEF section above.

New Condition Algorithm

A brand new condition algorithm has been introduced for this migration: PixelCalibAlgs/PixelCalibCondAlg. The algorithm has been added to an existing package only to avoid introduction of a new package for this migration. In the long run subsystems/groups may want to consider creating dedicated new packages for hosting conditions algorithms.

Migrated Code

The results of the migration of PixelRecoDbTool are available as a new tool in the Git multithreading branch.

Condition Algorithm

InnerDetector/InDetCalibAlgs/PixelCalibCondAlg.h/.cxx is basically a copy of the original callback function PixelRecoDbTool::IOVCallBack(). Some important points to note:

1. It accesses the Raw Condition Data object (DetCondCFloat) via ReadCondHandle and stores the Derived Condition Data object PixelOfflineCalibData into the Condition Container using WriteCondHandle.

2. The following code snippet creates a new Derived Condition Data Object, defines an IOV for it and records it into the Condition Container:

PixelCalib::PixelOfflineCalibData* writeCdo = new PixelCalib::PixelOfflineCalibData();

// ... fill in the contents of writeCdo

// Define validity of the output cond object and record it
EventIDRange rangeW;
if(!readHandle.range(rangeW)) {
  ATH_MSG_ERROR("Failed to retrieve validity range for " << readHandle.key());
  delete writeCdo; // free the object on this error path to avoid a memory leak (also flagged by Coverity)
  return StatusCode::FAILURE;
}
if(writeHandle.record(rangeW,writeCdo).isFailure()) {
  ATH_MSG_ERROR("Could not record PixelCalib::PixelOfflineCalibData " << writeHandle.key()
                << " with EventRange " << rangeW
                << " into Conditions Store");
  return StatusCode::FAILURE;
}

In this particular example, the Derived Object depends on only one Raw Object, which makes it rather trivial to define validity of the Derived Object. In more complex cases the resulting validity should be an intersection of the IOVs of the objects the Derived Object depends on. For more details see the example below.

Condition Objects

In this example we have two Condition Objects

1. Raw Condition Object is DetectorDescription/DetDescrCond/DetDescrConditions/DetCondCFloat. This is the object retrieved from the Condition Database and then accessed by PixelCalibCondAlg and PixelRecoDbTool via ReadCondHandle. In order to be able to store/read this object to/from Condition Store the following has been added to DetDescrConditions/DetCondCFloat.h header file:

#include "AthenaKernel/CondCont.h" 
#include "SGTools/BaseInfo.h" 
...
CONDCONT_DEF( DetCondCFloat , 85257013 );

2. Derived Condition Object is InnerDetector/InDetConditions/PixelConditionsData/PixelOfflineCalibData. This is the object which is created and stored into Condition Store using WriteCondHandle by the Condition Algorithm PixelCalibCondAlg, and then accessed by PixelRecoDbTool via ReadCondHandle. In order to be able to store/read this object to/from Condition Store the following has been added to PixelConditionsData/PixelOfflineCalibData.h:

#include "AthenaKernel/CondCont.h" 
#include "SGTools/BaseInfo.h"
...
CONDCONT_DEF( PixelCalib::PixelOfflineCalibData , 213651723 );

Condition Client

Changes in the Condition Client tool (PixelRecoDbTool) are similar to those applied to the CaloLCClassificationTool discussed above. To summarize:

1. Private data members corresponding to Condition Data Objects were replaced with ReadCondHandleKeys;

2. The callback function was dropped;

3. The ReadCondHandleKeys are initialized in initialize() of the Tool;

4. Condition Data Objects are accessed throughout the code via ReadCondHandles.

Changes in configuration

In order to declare the Raw Condition Data Object to CondInputLoader we had to implement changes in InnerDetector/InDetExample/InDetRecExample/share/InDetRecConditionsAccess.py similar to those described in the simple example above. In particular this line

conddb.addFolder("PIXEL_OFL","/PIXEL/PixReco")

was replaced with

conddb.addFolder("PIXEL_OFL","/PIXEL/PixReco",className='DetCondCFloat')

In addition to the above we also add the new Condition Algorithm to the AthCondSeq sequencer:

from AthenaCommon.AlgSequence import AthSequencer             
condSequence = AthSequencer("AthCondSeq")             
from PixelCalibAlgs.PixelCalibAlgsConf import PixelCalibCondAlg             
condSequence += PixelCalibCondAlg( "PixelCalibCondAlg" )

Extra Material (not covered in above examples)

Validity check for Write Condition Handle

Unlike AthenaMT, which schedules the execution of a Condition Algorithm only when at least one of its Read Condition Objects is not valid for the given event, serial Athena treats Condition Algorithms like other "regular" algorithms, i.e. they are executed unconditionally at every event. Constructing and filling Derived Condition objects can be resource-hungry, so it should be done only when needed. To avoid making a new Derived Condition object, and eventually discarding it, when the corresponding Write Condition Handle is already valid for the given event, it is necessary to check the validity of Write Condition Handles at an early stage of the execute() method of Condition Algorithms. For example:

StatusCode MyCondAlg::execute()
{
  // Setup WriteCondHandle
  SG::WriteCondHandle<WT1> wchWT1{m_wKey1};
  if (wchWT1.isValid()) {
    ATH_MSG_DEBUG("Found valid write handle");
    return StatusCode::SUCCESS;
  }

  ...
}

Such a check can also protect us from recreating a valid Derived Condition Object in AthenaMT in cases when multiple concurrent events are being processed out of order.
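The effect of the early validity check can be illustrated with a toy model. This is not Athena code: the ToyCondAlg class, the fixed IOV length of 10 time units and the integer event times are assumptions chosen only to show how many times the expensive construction runs with and without the check.

```python
class ToyCondAlg:
    """Toy Condition Algorithm producing one derived object per block of 10 time units."""

    def __init__(self, check_validity):
        self.check_validity = check_validity
        self.ranges = []   # IOVs (start, stop) already recorded, valid for start <= t < stop
        self.builds = 0    # number of times the expensive construction ran

    def execute(self, event_time):
        if self.check_validity and any(a <= event_time < b for a, b in self.ranges):
            return  # a valid object already exists: nothing to do
        self.builds += 1  # stands in for the resource-hungry construction
        start = (event_time // 10) * 10
        new_range = (start, start + 10)
        if new_range not in self.ranges:  # a duplicate would simply be discarded
            self.ranges.append(new_range)


events = [1, 2, 3, 12, 4]  # event 4 arrives after the IOV boundary at 10 was crossed

with_check = ToyCondAlg(check_validity=True)
without_check = ToyCondAlg(check_validity=False)
for t in events:
    with_check.execute(t)
    without_check.execute(t)

print(with_check.builds)     # 2
print(without_check.builds)  # 5
```

With the check, the out-of-order event at time 4 finds the object recorded earlier for the range starting at 0 and returns immediately.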

Intersection of the Intervals of Validity

If a Condition Algorithm needs to produce a Derived Condition Object which depends on more than one input Condition Object (either Raw or Derived), then the IOV of the resulting object needs to be the intersection of the IOVs of the objects it depends on. In order to properly construct an IOV for the Derived Condition Object, the corresponding Write Condition Handle must explicitly declare its dependency on one or several Read Condition Handles. The following code example demonstrates how to intersect three input IOVs:

StatusCode MyCondAlg::execute()
{
  /* Suppose we have three ReadCondHandleKeys and their corresponding condition objects
     Object Type: RT1, ReadCondHandleKey: m_rKey1
     Object Type: RT2, ReadCondHandleKey: m_rKey2 ----> CAN BE NULL
     Object Type: RT3, ReadCondHandleKey: m_rKey3 ----> CAN BE NULL

     And we need to produce one Derived Object of type WT1 with WriteCondHandleKey m_wKey1.
  */

  // Setup WriteCondHandle
  SG::WriteCondHandle<WT1> wchWT1{m_wKey1};
  if (wchWT1.isValid()) {
    ATH_MSG_DEBUG("Found valid write handle");
    return StatusCode::SUCCESS;
  }

  // Deal with first ReadCondHandle
  SG::ReadCondHandle<RT1> rchRT1{m_rKey1};
  wchWT1.addDependency(rchRT1);

  // Declare dependency on other two ReadCondHandles, but first check if the keys are not empty
  if (!m_rKey2.key().empty()) {
    SG::ReadCondHandle<RT2> rchRT2{m_rKey2};
    wchWT1.addDependency(rchRT2);
  }

  if (!m_rKey3.key().empty()) {
    SG::ReadCondHandle<RT3> rchRT3{m_rKey3};
    wchWT1.addDependency(rchRT3);
  }

  // Construct the Derived Condition Object
  std::unique_ptr<WT1> pWT1 = std::make_unique<WT1>(...);

  // Fill its content ....

  // Record the Derived Condition Object into Condition Store
  //
  // NB: the recorded object will have an IOV equal to the intersection of all IOVs for which addDependency() was called
  //
  ATH_CHECK(wchWT1.record(std::move(pWT1)));

  return StatusCode::SUCCESS;
}

If it is known at compile time which input Condition Objects the Derived Object depends on, then the dependency declaration can be done in one go, as opposed to the incremental dependency declaration shown in the above code fragment. For example, if we knew that none of our Read Condition Handle Keys can be empty, then we could replace all of the above calls to WriteCondHandle::addDependency() with the one shown below:

wchWT1.addDependency(rchRT1,rchRT2,rchRT3);
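The intersection rule used in this section amounts to taking the latest start and the earliest stop of the input IOVs. A minimal sketch of that arithmetic follows. It is not the Athena implementation: the function intersect_iovs and the representation of an IOV as a half-open (start, stop) pair of integers are assumptions made for illustration.

```python
def intersect_iovs(*ranges):
    """Intersect IOVs given as (start, stop) pairs, each valid for start <= t < stop."""
    if not ranges:
        raise ValueError("at least one IOV is required")
    start = max(r[0] for r in ranges)  # latest start among the inputs
    stop = min(r[1] for r in ranges)   # earliest stop among the inputs
    if start >= stop:
        raise ValueError("the input IOVs do not overlap")
    return (start, stop)


iov_rt1 = (0, 100)
iov_rt2 = (50, 150)
iov_rt3 = (20, 80)

print(intersect_iovs(iov_rt1))                    # (0, 100)
print(intersect_iovs(iov_rt1, iov_rt2))           # (50, 100)
print(intersect_iovs(iov_rt1, iov_rt2, iov_rt3))  # (50, 80)
```

Adding inputs one at a time or all at once gives the same result, which is why the incremental and the single-call forms of the dependency declaration are interchangeable.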

Inheritance of conditions objects

A conditions object may inherit from another, in a similar manner as for DataVector. To declare such an inheritance relationship, use the CONDCONT_DEF macro with an extra argument naming the base class:

class Base { ... };
class Derived : public Base { ... };
CONDCONT_DEF ( Derived, 243030042, Base );

Then you can retrieve a conditions object of type Derived as type Base.


Major updates:
-- AndreasSalzburger - 2020-4-22 -- VakhoTsulaia - 2017-02-28

Responsible: VakhoTsulaia
Last reviewed by: Never reviewed

Topic revision: r27 - 2020-05-24 - VakhoTsulaia
 