Chapter 2: The Basics of CMS Offline Computing



2.1 Introduction to the Basics of CMS Offline Computing


This chapter introduces you to the CMS computing environment and to CMSSW, the CMS software framework used to perform analysis using collision data and MC data samples.

The CMS Computing Model (WorkBookComputingModel) section describes how computing centres available to CMS around the world are distributed and configured in a tiered architecture that functions as a single coherent system. It also describes how detector data and MC data travel through the tiers, and how data are distributed, stored, and accessed.

CMSSW consists of over a thousand subpackages, created to provide an extensive toolkit with which users can carry out analyses of data and perform other software-related tasks while writing only a small amount of code themselves. The CMSSW Application Framework (WorkBookCMSSWFramework) section describes:

  • the modular architecture of the CMSSW framework
  • what comprises an Event
  • how data are uniquely identified in an Event
  • how Event data are processed
  • the basics of provenance tracking.

Review status

Reviewer/Editor and Date    Comments
KatiLassilaPerini - 06 Dec 2009 Updated page

Responsible: SudhirMalik
Last reviewed by: KatiLassilaPerini - 02 Feb 2011



2.2 CMS Computing Model


Goals of this workbook page

When you finish this page, you should understand:
  • the tier structure of the CMS computing model.
  • how detector data and MC data travel through the tiers.
  • how data are distributed, stored, and accessed.

Introduction

CMS presents challenges not only in terms of the physics to discover and the detector to build and operate, but also in terms of the data volume and the necessary computing resources. Data sets and resource requirements are at least an order of magnitude larger than in previous experiments.

CMS computing and storage requirements would be difficult to fulfill at any one place, for both technical and funding reasons. Additionally, most CMS collaborators are not CERN-based, and have access to significant non-CERN resources, which it is advantageous to harness for CMS computing. Therefore, the CMS computing environment has been constructed as a distributed system of computing services and resources that interact with each other as Grid services. The set of services and their behaviour together comprise the computing, storage and connectivity resources that CMS uses to do data processing, data archiving, Monte Carlo event generation, and all kinds of computing-related activities.

The computational infrastructure is intended to be available to CMS collaborators, independently of their physical locations, and on a fair share basis.

Tier architecture of computing resources

The computing centres available to CMS around the world are distributed and configured in a tiered architecture that functions as a single coherent system. Each of the three tier levels provides different resources and services:

Tier-0 (T0)

The first tier in the CMS model, for which there is only one site, CERN, is known as Tier-0 (T0). The T0 performs several functions. The standard workflow is as follows:

  1. accepts RAW data from the CMS Online Data Acquisition and Trigger System (TriDAS)
  2. repacks the RAW data received from the DAQ into primary datasets based on trigger information (immutable trigger bits). Roughly 10 primary datasets are expected in the 7 TeV run once there is sufficient luminosity, eventually growing to about 15.
  3. archives the repacked RAW data to tape.
  4. distributes the RAW datasets among the next tier of resources (Tier-1) so that two copies of every piece of RAW data are saved, one at CERN and another at a Tier-1.
  5. performs PromptCalibration in order to get the calibration constants needed to run the reconstruction.
  6. feeds the RAW datasets to reconstruction.
  7. performs prompt first-pass reconstruction, which writes the RECO data and extracts the Analysis Object Data (AOD) and mini-AOD.
  8. distributes the RECO datasets among Tier-1 centers, such that the RAW and RECO match up at each Tier-1.
  9. distributes full AOD/mini-AOD to all Tier-1 centers.

The T0 does not provide analysis resources and only operates scheduled activities.

The T0 merges output files if they are too small. (This will affect RECO and AOD, and maybe AlcaReco; under certain repacker scenarios one could even imagine merging RAW data files, but this will be avoided as much as possible.) The goal of CMS is to write appropriately sized data into the tape robots. Currently CMS typically imports 2-3 GB files, though 5-10 GB files are technically possible and are desirable for tape system performance.

At CERN, though logically separated from the T0, is the CMS-CAF (CERN Analysis Facility). The CAF offers services associated with T1 and T2 centers and performs latency-critical, non-automated activities. The CAF is not needed for normal Tier-0 operation; it is intended for short-term, high-priority, human-operated calibration, physics validation and analysis. For example, the CAF would be used for very fast physics validation and analysis of the Express Stream (a subset of the data that is tagged by Online and then processed as quickly as possible).

Tier-1 (T1)

There is a set of seven Tier-1 (T1) sites, which are large centers in CMS collaborating countries (large national labs such as FNAL and RAL). Tier-1 sites will in general be used for large-scale, centrally organized activities and can provide data to and receive data from all Tier-2 sites. Each T1 center:

  1. receives a subset of the data from the T0, in proportion to the size of its pledged resources in the WLCG MoU
  2. provides tape archive of part of the RAW data (secure second copy) which it receives as a subset of the datasets from the T0
  3. provides substantial CPU power for scheduled:
    • re-reconstruction
    • skimming
    • calibration
    • AOD extraction
  4. stores an entire copy of the AOD
  5. distributes RECOs, skims and AOD to the other T1 centers and CERN as well as the associated group of T2 centers
  6. provides secure storage and redistribution for MC events generated by the T2s (described below)

Tier-2 (T2)

A more numerous set of smaller Tier-2 (T2) centres ("small" centres at universities, but with substantial CPU resources) provides capacity for user analysis, calibration studies, and Monte Carlo production. T2 centres provide limited disk space and no tape archiving. T2 centres rely upon T1s for access to large datasets and for secure storage of the new data (generally Monte Carlo) produced at the T2. MC production at the Tier-2s will in general be centrally organized, with generated MC samples being sent to an associated Tier-1 site for distribution within the CMS community. All other Tier-2 activities will be user driven, with data placed to match resources and needs: tape, disk, manpower, and the needs of local communities. Tier-2 activities will be organized by the Tier-2 responsibles in collaboration with physics groups, regional associations and local communities.

In summary, the Tier-2 sites provide:

  1. services for local communities
  2. grid-based analysis for the whole experiment (Tier-2 resources available to whole experiment through the grid)
  3. Monte Carlo simulation for the whole experiment

As of July 2018 there are about 55 T2 sites, each associated with one of the seven T1 sites or directly with CERN (the following image does not represent the actual T2 groupings under the T1s):

(Figure: tier structure of the CMS computing centres)

One can refer to the Dashboard site status monitoring page for the most up-to-date information about available sites along with their status.

Data Organization

Data a physicist wants to see

To extract a physics message for a high energy physics analysis, a physicist has to combine a variety of information:

  • reconstructed information from the recorded detector data, specified by a combination of trigger paths and possibly further selected by cuts on reconstructed quantities (e.g., two jets),
  • MC samples which simulate the physics signal under investigation, and
  • background samples (specified by the simulated physics process).

The physics abstractions physicists use to request these items are datasets and event collections. The datasets are split off at the T0 and distributed to the T1s, as described above. An event collection is the smallest unit within a dataset that a user can select. Typically, the reconstructed information needed for the analysis, as in the first bullet above, would all be contained in one or a few event collection(s). The expectation is that the majority of analyses should be able to be performed on a single primary dataset.

Data are stored as ROOT files. The smallest unit in computing space is the file block, which corresponds to a group of ROOT files likely to be accessed together. This requires a mapping from the physics abstraction (event collection) to the file location. CMS has a global data catalog, the Data Aggregation System (DAS), which provides the mapping between the physics abstraction (dataset or event collection) and the list of file blocks corresponding to that abstraction. It also gives the user an overview of what is available for analysis, as it holds the complete catalog. The locations of these file blocks within the CMS grid (several centres can provide access to the same file block) are resolved by PhEDEx, the Physics Experiment Data EXport service. PhEDEx is responsible for transporting data around the CMS sites and keeps track of which data exist at which site. The mapping thus occurs in two steps, in DAS and in PhEDEx. See WorkBookAnalysisWorkFlow for an illustration (note that, in that illustration, the role of the data-location service is represented by 'DLS', which was eliminated as being functionally redundant with the information contained in PhEDEx).

The CMS Data Hierarchy

CMS data are arranged in a hierarchy of data tiers. Each physics event is written into each data tier, with each tier containing a different level of detail about the event and serving different uses. The three main data tiers written in CMS are:

  1. RAW: full event information from the Tier-0 (i.e. from CERN), containing 'raw' detector information (detector element hits, etc)
    • RAW is not used directly for analysis
  2. RECO ("RECOnstructed data"): the output from first-pass processing by the Tier-0. This layer contains reconstructed physics objects, but it's still very detailed
    • RECO can be used for analysis, but is too big for frequent or heavy use when CMS has collected a substantial data sample.
  3. AOD ("Analysis Object Data"): this is a "distilled" version of the RECO event information, and is expected to be used for most analyses
    • AOD provides a trade-off between event size and complexity of the available information to optimize flexibility and speed for analyses

The data tiers are described in more detail in a dedicated WorkBook chapter on Data Formats and Tiers. It is the desire of CMS that the data tiers are written into separate files, though applications will be able to access more than one file simultaneously (an application will be able to access RECO and the corresponding RAW events from separate files).

Detector data flow through Hardware Tiers

The following diagram shows the flow of CMS detector data through the tiers.

(Figure: detector data flow through the hardware tiers; Oli's data flow tier diagram, 1/12/06)

The essential elements of the flow of real physics data through the hardware tiers are:

  • T0 to T1:
    • Scheduled, time-critical, will be continuous during data-taking periods
    • reliable transfer needed for fast access to new data, and to ensure that data is stored safely
  • T1 to T1:
    • redistributing data, generally after reprocessing (e.g. processing with improved algorithms)
  • T1 to T2:
    • Data for analysis at Tier-2s

Monte Carlo data flow through Hardware Tiers

Monte Carlo generated data are typically produced at a T2 center, archived at its associated T1, and made available to the whole CMS collaboration.

(Figure: MC data flow through the tiers)

Workflows in CMS Computing

A workflow can be described simply as "what we do to the data". There are three principal areas of workflow in CMS:

  1. At Tier-2 centres: Monte Carlo events are generated, detector interactions are simulated, events are reconstructed in the same manner as will be applied to real data, and the events are then moved to tape storage for later use
  2. At the Tier-0 centre: data are received from the CMS detector and "repacked", i.e. events from the unsorted online streams are sorted into physics streams of events with similar characteristics. Reconstruction algorithms are run, AOD is produced, and RAW, RECO and AOD are exported to Tier-1 sites
  3. The user - i.e. YOU!: prepare analysis code, send code to site where there is appropriate data, then run your code on the data and collect the results
    • the process of finding the sites with data and CPU, running the jobs, and collecting the results is all managed for you (via the grid) by CRAB

Managing Grid Jobs

The management of grid jobs is handled by a series of systems, described in WorkBookAnalysisWorkFlow. The goal is to schedule jobs onto resources according to the policy and priorities of CMS, to assist in monitoring the status of those jobs, and to guarantee that site-local services can be accurately discovered by the application once it starts executing in a batch slot at the site. As a user, you should find these issues invisible.

The datasets are tracked as they are distributed around the globe by the CMS Data Aggregation System (DAS), while the Physics Experiment Data EXport service (PhEDEx) moves data around the CMS sites.

A major bottleneck in the data analysis process can be retrieval of data from tape stores, so storage and retrieval are major factors in optimising analysis speed.

Information Sources

  • CMS Computing Project Technical Design Report at http://cms.cern.ch/iCMS/ (select CPT on left menu, find Technical Design Reports underneath the table in main section of page).

  • Material on this page is also taken from Tony Wildish's Computing Model lecture, given in the 2007 CERN Summer Studentship Programme.

Review status

Reviewer/Editor and Date    Comments
JennyWilliams - 27 Sep 2007 updates based on Tony Wildish's summer students lecture
PeterElmer - 19 Feb 2007 updates and improvement in description of roles of Tier1 and Tier2 centres
TonyWildish - 28 Jan 2008 correct description of T0 workflow
IanFisk - 21 Feb 2008 Clarifications to the number of Tier-1s and a few minor corrections and additions. Added concept of 2 file reading


Responsible: TonyWildish
Last reviewed by: PatriciaMcBride - 22 Feb 2008



2.3 CMSSW Application Framework


Goals of this page

When you finish this page, you should understand:
  • the modular architecture of the CMSSW framework
  • what comprises an Event
  • how data are uniquely identified in an Event
  • how Event data are processed
  • the basics of provenance tracking

Introduction: CMSSW and Event Data Model (EDM)

The overall collection of software, referred to as CMSSW, is built around a Framework, an Event Data Model (EDM), and Services needed by the simulation, calibration and alignment, and reconstruction modules that process event data so that physicists can perform analysis. The primary goal of the Framework and EDM is to facilitate the development and deployment of reconstruction and analysis software.

The CMSSW event processing model consists of one executable, called cmsRun, and many plug-in modules which are managed by the Framework. All the code needed in the event processing (calibration, reconstruction algorithms, etc.) is contained in the modules. The same executable is used for both detector and Monte Carlo data.

The CMSSW executable, cmsRun, is configured at run time by the user's job-specific configuration file (a minimal example is sketched after this list). This file tells cmsRun:

  • which data to use
  • which modules to execute
  • which parameter settings to use for each module
  • the order of execution of the modules, called a path
  • how the events are filtered within each path, and
  • how the paths are connected to the output files
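
In recent CMSSW releases the configuration file is a Python script. The sketch below is an illustration only (not taken from this page): the analyzer plugin name DemoAnalyzer, its tracks parameter and the input file myEvents.root are hypothetical, while PoolSource and the configuration language itself are standard framework pieces.

    # Minimal cmsRun configuration sketch (plugin name and file names are illustrative)
    import FWCore.ParameterSet.Config as cms

    process = cms.Process("Demo")

    # which data to use
    process.source = cms.Source("PoolSource",
        fileNames = cms.untracked.vstring("file:myEvents.root")
    )
    process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(100))

    # which module to execute, and its parameter settings
    process.demoAnalyzer = cms.EDAnalyzer("DemoAnalyzer",
        tracks = cms.InputTag("generalTracks")
    )

    # the order of execution of modules: the path
    process.p = cms.Path(process.demoAnalyzer)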

Unlike the previous event processing frameworks, cmsRun is extremely lightweight: only the required modules are dynamically loaded at the beginning of the job.

The CMS Event Data Model (EDM) is centered around the concept of an Event. An Event is a C++ object container for all RAW and reconstructed data related to a particular collision. During processing, data are passed from one module to the next via the Event, and are accessed only through the Event. All objects in the Event may be individually or collectively stored in ROOT files, and are thus directly browsable in ROOT. This allows tests to be run on individual modules in isolation. Auxiliary information needed to process an Event is called Event Setup, and is accessed via the EventSetup.

You will find more information on the CMSSW Framework in WorkBookMoreOnCMSSWFramework.

Events in CMSSW

Events as formed in the DAQ and Trigger System (TriDAS)

Physically, an event is the result of a single readout of the detector electronics, containing the signals that will (in general) have been generated by particles, tracks and energy deposits present in a number of bunch crossings. The task of the online Trigger and Data Acquisition System (TriDAS) is to select, out of the millions of events occurring in the detector, the most interesting 100 or so per second, and then store them for further analysis. An event has to pass two independent sets of tests, or Trigger Levels, in order to qualify. The tests range from simple and of short duration (Level-1) to sophisticated ones requiring significantly more time to run (the higher levels 2 and 3, collectively called the High Level Trigger, HLT). In the end, the HLT system creates RAW data events containing:

  • the detector data,
  • the level 1 trigger result
  • the result of the HLT selections (HLT trigger bits)
  • and some of the higher-level objects created during HLT processing.

Events from a software point of view: The Event Data Model (EDM)

In software terms, an Event starts as a collection of the RAW data from a detector or MC event, stored as a single entity in memory in a C++ type-safe container called edm::Event. Any C++ class can be placed in an Event; there is no requirement to inherit from a common base class. As the event data are processed, products (of producer modules) are stored in the Event as reconstructed (RECO) data objects. The Event thus holds all data that were taken during a triggered physics event as well as all data derived from them. The Event also contains metadata describing the configuration of the software used for the reconstruction of each contained data object and the conditions and calibration data used for that reconstruction. The Event data are output to files browsable by ROOT, so an event can be analyzed with ROOT and used like an n-tuple for final analysis.

Products in an Event are stored in separate containers, organizational units within an Event used to collect particular types of data separately. There are particle containers (one per particle type), hit containers (one per subdetector), and service containers for things like provenance tracking.

The full event data (FEVT) in an Event is the RAW plus the RECO data. Analysis Object Data (AOD) is a subset of the RECO data in an event; AOD alone is sufficient for most kinds of physics analysis. RAW, AOD and FEVT are described further in the introduction to the analysis. The tier-structured CMS Computing Model governs which portions of the Event data are available at a given tier. For event grouping, the model supports both physicist abstractions, such as dataset and event collection, and physical packaging concepts native to the underlying computing and Grid systems, such as files. This is described in Data Organization. Here is a framework diagram illustrating how an Event changes as data processing occurs:

(Figure: how an Event changes as data processing occurs)

Modular Event Content

It is important to emphasize that the event data architecture is modular, just as the framework is. Different data layers (using different data formats) can be configured, and a given application can use any layer or layers. The branches (which map one to one with event data objects) can be loaded or dropped on demand by the application. The following diagram illustrates this concept:

(Figure: modular event products; modular_event_products.gif)

Identifying Data in the Event

Data within the Event are uniquely identified by four quantities:

C++ class type of the data
E.g., edm::PSimHitContainer or reco::TrackCollection.
module label
the label that was assigned to the module that created the data. E.g., "SimG4Objects" or "TrackProducer".
product instance label
the label assigned to the object from within the module (defaults to an empty string). This is convenient if many objects of the same C++ type are being put into the edm::Event from within a single module.
process name
the process name as set in the job that created the data
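
As a sketch of how these four quantities appear in practice (not part of the original text), the FWLite Python bindings let you request a product by giving the C++ class type to a Handle and the labels to getByLabel. The file name myEvents.root, the module label generalTracks and the process name RECO are assumptions used only for illustration.

    # FWLite sketch: the four identifying quantities in use
    from DataFormats.FWLite import Events, Handle

    events = Events("myEvents.root")                 # assumed input file
    handle = Handle("std::vector<reco::Track>")      # 1. C++ class type of the data
    label  = ("generalTracks", "", "RECO")           # 2. module label, 3. product instance label, 4. process name

    for event in events:
        event.getByLabel(label, handle)              # look the product up by its identifiers
        tracks = handle.product()
        print("tracks in this event: %d" % tracks.size())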

Modular Architecture of the Framework

A module is a piece (or component) of CMSSW code that can be plugged into the CMSSW executable cmsRun. Each module encapsulates a unit of clearly defined event-processing functionality. Modules are implemented as plug-ins (core libraries and services). They are compiled into fully-bound shared libraries and must be declared to the plug-in manager in order to be registered with the framework. The framework takes care of loading the plug-in and instantiating the module when it is requested by the job configuration (sometimes called a "card file"). There is no need to build binary executables for user code!

When preparing an analysis job, the user selects which module(s) to run, and specifies a ParameterSet for each via a configuration file. The module is called for every event according to the path statement in the configuration file.

There are six types of modules, whose interface is specified by the framework:

Source
Reads in an Event from a ROOT file, or creates empty Events that generator filters later fill with content (see WorkBookGeneration). A Source gives the Event status information (such as the Event number), and can add data directly or set up a call-back system to retrieve the data on the first request. Examples include the DaqSource, which reads in Events from the global DAQ, and the PoolSource, which reads Events from a ROOT file.
EDProducer
CMSSW uses the concept of producer modules and products, where producer modules (EDProducers) read in data from the Event in one format, produce something from the data, and output the product, in a different format, into the Event. A succession of modules used in an analysis may produce a series of intermediate products, all stored in the Event. An EDProducer example is the RoadSearchTrackCandidateMaker module; it creates a TrackCandidateCollection which is a collection of found tracks that have not yet had their final fit.
EDFilter
Reads data from the Event and returns a Boolean value that is used to determine whether processing of that Event should continue for that path. An example is the StopAfterNEvents filter, which halts processing of events after the module has processed a set number of events.
EDAnalyzer
Studies properties of the Event. An EDAnalyzer reads data from the Event but is neither allowed to add data to the Event nor affect the execution of the path. Typically an EDAnalyzer writes output, e.g., to a ROOT Histogram.
EDLooper
A module which can be used to control 'multi-pass' looping over an input source's data. It can also modify the EventSetup at well-defined times. This type of module is used in the track-based alignment procedure.
OutputModule
Reads data from the Event, and once all paths have been executed, stores the output to external media. An example is PoolOutputModule which writes data to a standard CMS format ROOT file.

The following sketch shows an example path composed of a Source ("POOL Source"), two EDProducers ("Digitizer" and "Tracker"), an EDFilter ("NTrack Filter"), and an OutputModule ("POOL Output"):

(Figure: processing model; an example path)
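
Expressed as a configuration sketch, the path in the figure might be wired up roughly as follows. This is an illustration only: the plugin names ExampleDigitizer, ExampleTrackProducer and ExampleNTrackFilter (and its minTracks parameter) are placeholders rather than actual CMSSW plugins, while PoolSource and PoolOutputModule are real framework components.

    import FWCore.ParameterSet.Config as cms

    process = cms.Process("Example")

    process.source = cms.Source("PoolSource",                        # "POOL Source"
        fileNames = cms.untracked.vstring("file:input.root")
    )

    process.digitizer    = cms.EDProducer("ExampleDigitizer")        # "Digitizer"     (EDProducer)
    process.tracker      = cms.EDProducer("ExampleTrackProducer")    # "Tracker"       (EDProducer)
    process.nTrackFilter = cms.EDFilter("ExampleNTrackFilter",       # "NTrack Filter" (EDFilter)
        minTracks = cms.uint32(2)
    )

    process.out = cms.OutputModule("PoolOutputModule",               # "POOL Output"
        fileName = cms.untracked.string("file:output.root")
    )

    # producers and the filter run in order along the path; the filter's
    # return value decides whether the Event continues along this path
    process.p = cms.Path(process.digitizer * process.tracker * process.nTrackFilter)

    # output modules run in an EndPath, once all paths have executed
    process.e = cms.EndPath(process.out)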

The user configures the modules in the job configuration file using the module-specific ParameterSets. ParameterSets may hold other ParameterSets. Modules cannot be reconfigured during the lifetime of the job.
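
For example, a ParameterSet nested inside a module's ParameterSet could look like the following sketch (the plugin and parameter names here are hypothetical):

    import FWCore.ParameterSet.Config as cms

    process = cms.Process("Demo")
    process.demoAnalyzer = cms.EDAnalyzer("DemoAnalyzer",
        tracks    = cms.InputTag("generalTracks"),
        selection = cms.PSet(          # a ParameterSet nested inside the module's ParameterSet
            minPt  = cms.double(5.0),
            maxEta = cms.double(2.4)
        )
    )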

Once a job is submitted, the Framework takes care of instantiating the modules. Each bit of code described in the config file is dynamically loaded. The process is as follows:

  1. First cmsRun reads in the config file and creates a string for each class that needs to be dynamically loaded.
  2. It passes this string to the plug-in manager (the program used to manage the plug-in functionality).
  3. The plug-in manager consults the string-to-library mapping, and delivers to the framework the libraries that contain the requested C++ classes.
  4. The framework loads these libraries.
  5. The framework creates a parameter set (PSet) object from the contents of the (loaded) process block in the config file, and hands it to the module's constructor.
  6. The constructor of each module is called to create an instance of that module.
  7. The executable cmsRun then runs each module in the order specified in the config file.

(Figure: fw_edm.gif; the framework and the Event Data Model)

In the second figure below, we see a Source that provides the Event to the framework. (The standard source, which uses POOL, is shown; it combines C++ object streaming technology, such as ROOT I/O, with a transaction-safe relational database store.) The Event is then passed to the execution paths. The paths can be ordered into a list that makes up the schedule for the process. Note that the same module may appear in multiple paths, but the framework guarantees that a module is executed only once per Event: since it will ask for exactly the same products from the Event and produce the same result independent of which path it is in, it makes no sense to execute it twice. On the other hand, a user designing a trigger path should not have to worry about the full schedule (which could involve hundreds of modules). Each path should be executable by itself, in that modules within the path ask only for things they know have been produced by a previous module in the same path or by the input source. Ideally, the order of execution of the paths should not matter; however, due to the existence of bugs it is always possible that there is an order dependence. Such dependencies should be removed during validation of the job.

(Figure: processing model; paths and the process schedule)


Provenance Tracking

To aid in understanding the full history of an analysis, the framework accumulates provenance for all data stored in the standard ROOT output files. The provenance is recorded in a hierarchical fashion:

  • configuration information for each Producer in the job (and all previous jobs contributing data to this job) is stored in the output file. The configuration information includes
    • the ParameterSet used to configure the Producer,
    • the C++ type of the Producer and
    • the software version.
  • each datum stored in the Event is associated with the Producer which created it.
  • for each Event, the data requested by each Producer (when running its algorithm) are recorded. In this way the actual interdependencies between data in the Event are captured.

Access CMSSW Code

The CMSSW code is contained in the GitHub CMSSW repository. You can browse this huge amount of code or search it using the CMSSW Software Cross-Reference (http://cmslxr.fnal.gov/lxr/). You can also access it directly under /cvmfs/cms.cern.ch/slc6_amd64_gcc700/cms/cmssw/. Packages are organized by functionality; this is a change from the older CMS framework, in which they were organized by subdetector component.

Information Sources

  • CMS computing TDR 4.1 -- cmsdoc.cern.ch/cms/ppt/tdr; discussion with J Yarba 3/17/06 and with Liz Sexton-Kennedy and Oliver Gutsche.
  • ECAL CMSSW Tutorial (Paolo Meridiani, INFN Roma, ECAL week April 27, 2006)
  • Framework tutorial
  • Physics TDR
  • Discussion with Luca Lista, and his presentation RECO/AOD Issues for Analysis Tools.
  • Recovering data from CMS software black hole, G. Zito

Review status

Reviewer/Editor and Date    Comments
XuanChen - 16 Jul 2014 changed access CMSSW code from cvs to github
PetarMaksimovic - 23 Jul 2009 Drastic simplification to jibe with the new Chapters 3 and 4
Main.Aresh - 19 Feb 2008 change in the index (table tags inserted into TWiki conditionals for printable version) and in "ExamineOutput", where "PyROOT" causes problems for the printable version
JennyWilliams - 23 Oct 2007 review, minor editing
Main.lsexton - 18 Dec 2006 Answered some questions, small changes in phrasing, and corrected the use of the deprecated Event interface found in the G.Zito material
AnneHeavey - 06 and 07 Nov 2006 Reworked "ROOT browsing" section; pulled in info from WorkBookMakeAnalysis; fixed up in prep for mass printing; used G.Zito's EDAnalyzer example


Responsible: SudhirMalik
Last reviewed by: DavidDagenhart - 25 Feb 2008
