IOV and IOV Metadata In a Nutshell
Introduction
An Interval of Validity (IOV) is a contiguous (in time) set of events for which given calibration or alignment data are to be used in reconstruction. In the CMS conditions model, the IOV is a pure offline concept. The conditions data, which are stored in the offline database as POOL-ORA objects, are also called payload objects. Each payload object is indexed by its IOV, while the data objects themselves do not contain any validity-related information. The IOV index (implemented as an IOVSequence class) can be deleted and recreated independently of the data objects it points to. The IOV index is annotated with metadata, called the IOV metadata. Payload data may be rewritten as new objects after processing, but updating or overwriting existing payload data is not allowed. A description of the general concept of IOV used in the CMS offline conditions model can be found here.
Current implementation
In the current implementation, the IOVSequence is a POOL-ORA object. Here is an example layout of an IOVSequence:
| SINCETIME | PAYLOAD OBJECT TOKEN |
| 1 | 5CBD2141-68B7-DA11-ABCE-00A0C9DA776A |
| 4 | 16956E49-68B7-DA11-99EF-00A0C9DA776A |
| 8 | 56AC6956-68B7-DA11-A05E-00A0C9DA776A |
In this example, the IOVSequence object is an index on three payload objects. The first object, which is identified by its unique POOL object token 5CBD2141-68B7-DA11-ABCE-00A0C9DA776A, is valid from run 1 to run 3; the second object 16956E49-68B7-DA11-99EF-00A0C9DA776A is valid from run 4 to run 7;
the third object is valid from run 8 on.
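The lookup semantics of such a sequence can be pictured with a small plain-Python sketch (a hypothetical helper, not the actual IOVSequence API): for a given run number, the valid payload is the entry with the largest SINCETIME not exceeding that run.

```python
import bisect

# Toy stand-in for the IOVSequence above: a sorted list of
# (since_time, payload_token) rows.
iov_sequence = [
    (1, "5CBD2141-68B7-DA11-ABCE-00A0C9DA776A"),
    (4, "16956E49-68B7-DA11-99EF-00A0C9DA776A"),
    (8, "56AC6956-68B7-DA11-A05E-00A0C9DA776A"),
]

def payload_for_run(sequence, run):
    """Return the token whose since-time is the largest value <= run."""
    sinces = [since for since, _ in sequence]
    i = bisect.bisect_right(sinces, run) - 1
    if i < 0:
        raise LookupError("run %d precedes the first IOV" % run)
    return sequence[i][1]

payload_for_run(iov_sequence, 3)    # first payload: valid for runs 1-3
payload_for_run(iov_sequence, 7)    # second payload: valid for runs 4-7
payload_for_run(iov_sequence, 100)  # third payload: valid from run 8 on
```

Note that the last entry is open-ended: any run at or above its since-time resolves to it, matching the "valid from run 8 on" behaviour described above.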
Since the IOVSequence is a POOL object, it is identified by a unique but not human-friendly token. We use IOV metadata to assign meaning to, and categorise, the IOV tokens. The current implementation of IOV metadata is a very simple one: it associates a human-readable name, also known as a
"tag", with the token string. This association is stored in a simple two-column database table mapping each tag name to its IOV token.
In CMSSW, the package CondCore/IOVService is responsible for IOV index management, and the package CondCore/MetaDataService for IOV metadata operations.
When writing a new IOV index, we must assign a name (a tag) to the token of the IOV object through the MetaDataService interface. All POOL-ORA objects, such as the payload objects and the IOV objects, can be written into the database using the interface in CondCore/DBCommon/DbSession.
For reading conditions data from the database, the EventSetup source module PoolDBESSource should be used. The following IOV and IOV metadata interactions are encapsulated in this module: the metadata service translates the tag name to the corresponding IOV token, then the IOV service retrieves the right payload object using the token. PoolDBESSource parameters are described here.
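As a hedged sketch of the read side, a configuration fragment requesting a tag might look as follows. The record name "MyCalibRcd", the connection string, and the tag are placeholders, and the exact set of required parameters depends on the CMSSW release; consult the PoolDBESSource parameter documentation referred to above.

```python
import FWCore.ParameterSet.Config as cms

process = cms.Process("Demo")

# Hypothetical record name and SQLite test database; substitute the
# record appropriate to your payload type and the real connection string.
process.PoolDBESSource = cms.ESSource("PoolDBESSource",
    connect = cms.string("sqlite_file:mycalib.db"),
    toGet = cms.VPSet(
        cms.PSet(
            record = cms.string("MyCalibRcd"),
            tag = cms.string("mycalib")
        )
    )
)
```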
Typical calibration usecases by example
This section describes by example how to solve some typical calibration use cases using the Conditions system in CMSSW. It is not intended to be an exhaustive description of the calibration and alignment workflow and data flow. We address the three most important use cases:
- Offline calibration with event data
- Online to Offline transfer of conditions data
- Re-analysis of an old IOV range
In these examples, we assume that conditions data relevant to offline reconstruction and data reprocessing are stored in the offline master database ORCOFF, and that data relevant to HLT operations are stored in ORCON and permanently in ORCOFF. The online data used in the calibration procedure are stored in the online master database OMDS. For testing, SQLite files and Oracle development accounts can be used.
1. Event data based calibration
- In the calibration job, event data collected in the calibration run are the input source. In the calibration algorithm, we write the calibration constants into POOL-ORA objects using the PopCon interface. For each payload object, we assign a validity interval to its token; PopCon takes care of the proper management of the IOVSequence and tag. For examples, please refer to the detailed documentation in the PopCon section of the software guide here.
- In the RecHit-building reconstruction job, event data collected in the physics run are the input source. We configure the PoolDBESSource to read the calibration constants from the database by requesting the tag "mycalib". Inside the RecHit builder, we retrieve the calibration constants as objects from the EventSetup handle, use the constants to build, and finally deliver the calibrated RecHits. A tutorial on DT calibration from real data using PoolDBESSource can be found here.
- In this kind of calibration procedure, the databases and database software involved are considered "pure offline": data are stored directly in the offline databases ORCON and/or ORCOFF, and no database knowledge is required.
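As a hedged sketch of the write side of the calibration job above: PopCon applications typically write payloads and their IOVs through the PoolDBOutputService. The record name "MyCalibRcd", the tag, and the SQLite connection string are placeholders, and the required parameters vary between CMSSW releases; see the PopCon documentation for the actual configuration.

```python
import FWCore.ParameterSet.Config as cms

process = cms.Process("Write")

# Hypothetical record name and SQLite test database; a real job would
# point "connect" at ORCON/ORCOFF or a development account.
process.PoolDBOutputService = cms.Service("PoolDBOutputService",
    connect = cms.string("sqlite_file:mycalib.db"),
    toPut = cms.VPSet(
        cms.PSet(
            record = cms.string("MyCalibRcd"),
            tag = cms.string("mycalib")
        )
    )
)
```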
2. Calibration using IP5 defined conditions data
- For this kind of use case, data stored in the purely relational OMDS must be transformed into C++ objects, which are stored with POOL in the offline databases such as ORCON and ORCOFF. A PopCon application shall be constructed that takes as input the data stored in OMDS. The queries needed in this procedure should be constructed by the detector-specific online database experts, who have knowledge of SQL and of the schema design of their online database.
3. Re-analyze calibration data
- Configure the PoolDBESSource to read the existing calibration data with the old tag "myoldcalib". Note: It makes no difference here whether the existing data and tag come from an online procedure or from offline analysis.
- In an EDAnalyzer, we get the conditions data as objects from the EventSetup handle, re-analyze these data together with other data, and at the end of the job write out new payload objects using the PopCon interface, assigning the proper validity intervals and writing into a new tag "mynewcalib". NOTE: the IOV sequence "myoldcalib", as well as the old payload data it points to, is kept.
- It is also possible that we only need to create a new IOV index on existing data, without creating new payload objects.
- This kind of procedure is considered "pure offline"; no database knowledge is required.
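The re-tagging case in the bullets above can be pictured with a toy Python sketch (a hypothetical helper, not the CondCore API), reusing the payload tokens from the IOVSequence example earlier: a new index is created under a new tag with different since-times, the payload objects are reused, and the old index stays untouched.

```python
# Toy model of the IOV metadata: tag name -> list of (since, token) rows.
metadata = {
    "myoldcalib": [
        (1, "5CBD2141-68B7-DA11-ABCE-00A0C9DA776A"),
        (4, "16956E49-68B7-DA11-99EF-00A0C9DA776A"),
    ],
}

def retag(metadata, old_tag, new_tag, new_sinces):
    """Create a new IOV index over the same payload tokens.

    The old tag and the payloads it points to are kept; only a new
    index (a new tag) is added.
    """
    tokens = [token for _, token in metadata[old_tag]]
    if len(new_sinces) != len(tokens):
        raise ValueError("need one since-time per payload")
    metadata[new_tag] = list(zip(new_sinces, tokens))
    return metadata[new_tag]

retag(metadata, "myoldcalib", "mynewcalib", [1, 10])
```

The payloads themselves are never modified, in line with the rule that existing payload data may not be updated or overwritten.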
IOV extension and tagging policy
We distinguish low-, middle- and high-level calibration and alignment tasks, whose IOV and tagging policies are described in general here. Depending on the source of the conditions data, the following policy applies:
1. Any process running in a controlled environment is allowed to perform IOV extensions.
2. All other processes are not allowed to make IOV extensions; they must instead write a new IOV index with a new tag. Moving tags and generic names (like "latest_calibration", which might point to different IOV indices as a function of time) are not allowed.
In the above statements:
- Case 1) covers all known cases of prompt calibration, with latencies < 1 week.
- Case 2) covers all offline calibrations, which usually need large samples, i.e. longer time. The new IOV index might be a) a replica of the old one with one row added, or b) a totally new one. Use case for a): the calibrations are really changing in time and need correction versus time. Use case for b): the calibrations are derived more precisely and have retroactive validity.
- In the case of re-processing, "high-level" calibrations, like jet energy and b tagging, and "middle-level" calibrations, like precise calorimeter energy calibration, keep their tag collection untouched, while the frequent low-level calibration (i.e. prompt calibration) IOV tags can be extended. Therefore, new incoming data will get the proper low-level conditions data while the high-level conditions data remain the same. In fact, the fixed tag collection of these offline-defined high- (and middle-) level calibrations defines the re-processing procedure, i.e. one consistent set of tags is used for re-processing.
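The append-only character of an allowed IOV extension (case 1, and case 2a above) can be sketched as follows; `extend_iov` and the token names are hypothetical, not the IOVSequence API.

```python
def extend_iov(sequence, since, token):
    """Append one (since, token) row to an IOV index.

    Extension is append-only: the new since-time must exceed the last
    one already in the index. Anything else requires writing a new
    index under a new tag.
    """
    if sequence and since <= sequence[-1][0]:
        raise ValueError("since-time must increase; write a new index instead")
    sequence.append((since, token))
    return sequence

seq = [(1, "tokenA"), (8, "tokenB")]  # "tokenA"/"tokenB" are placeholders
extend_iov(seq, 20, "tokenC")         # allowed: appends one row at the end
```

An attempt to insert a row before the last since-time is rejected, which is exactly the case where policy demands a new IOV index with a new tag.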
The reason behind this policy is that an uncontrolled environment should not interfere with existing tags, which are used for prompt calibration, prompt reconstruction and production. New tags go into "standard" production only after validation and scrutiny.
In the above statements, a "controlled environment" is any process which is centrally managed. It includes:
- DAQ, online, DQM (runs in IP5)
- prompt calibration (IP5/Tier0).
For the prompt calibration, a general infrastructure will be provided in order to harmonize the interfaces of the single-subdetector calibration codes. Given the relevance of this task, the code for the prompt calibration, once released, will follow the same rules and restrictions as the official production code.
From the end-user point of view, all the machinery regarding the IOV and calibration payload tags is hidden, i.e. a standard cmsRun job of a user running an H → γγ analysis will always get access to the latest consistent tag.
Review Status
Responsible:
ZhenXie
Last reviewed by: Reviewer