Difference: WorkBookComputingModel (56 vs. 57)

Revision 57  2017-11-16 - NitishDhingra

Line: 1 to 1
 
META TOPICPARENT name="WorkBook"

2.2 CMS Computing Model

Complete: 5
Line: 109 to 109
  The physics abstractions physicists use to request these items are datasets and event collections. The datasets are split off at the T0 and distributed to the T1s, as described above. An event collection is the smallest unit within a dataset that a user can select. Typically, the reconstructed information needed for the analysis, as in the first bullet above, would all be contained in one or a few event collections. The expectation is that the majority of analyses can be performed on a single primary dataset.
Changed:
<
<
Data are stored as ROOT files. The smallest unit in computing space is the fileblock, which corresponds to a group of ROOT files likely to be accessed together. This requires a mapping from the physics abstraction (event collection) to the file location. CMS has a global data catalog called the Dataset Bookkeeping System (DBS), which provides the mapping between the physics abstraction (dataset or event collection) and the list of fileblocks corresponding to it. It also gives the user an overview of what is available for analysis, as it holds the complete catalog. The locations of these fileblocks within the CMS grid (several centers can provide access to the same fileblock) are resolved by PhEDEx, the Physics Experiment Data EXport service. PhEDEx is responsible for transporting data around the CMS sites and keeps track of which data exist at which site. The mapping thus occurs in two steps, in DBS and PhEDEx. See WorkBookAnalysisWorkFlow for an illustration (note that, in that illustration, the role of the data-location service is represented by 'DLS', which was eliminated as being functionally redundant with the information contained in PhEDEx).
>
>
Data are stored as ROOT files. The smallest unit in computing space is the fileblock, which corresponds to a group of ROOT files likely to be accessed together. This requires a mapping from the physics abstraction (event collection) to the file location. CMS has a global data catalog called the Data Aggregation System (DAS), which provides the mapping between the physics abstraction (dataset or event collection) and the list of fileblocks corresponding to it. It also gives the user an overview of what is available for analysis, as it holds the complete catalog. The locations of these fileblocks within the CMS grid (several centers can provide access to the same fileblock) are resolved by PhEDEx, the Physics Experiment Data EXport service. PhEDEx is responsible for transporting data around the CMS sites and keeps track of which data exist at which site. The mapping thus occurs in two steps, in DAS and PhEDEx. See WorkBookAnalysisWorkFlow for an illustration (note that, in that illustration, the role of the data-location service is represented by 'DLS', which was eliminated as being functionally redundant with the information contained in PhEDEx).
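
As a rough illustration of this two-step mapping, the short Python sketch below first asks DAS for the fileblocks of a dataset and then for the sites hosting each block. It assumes the dasgoclient command-line tool found in a CMSSW environment; the dataset name is only an example and the exact query grammar may differ between client versions.

import subprocess

def das_query(query):
    # Run a DAS query and return one entry per line of output.
    out = subprocess.check_output(["dasgoclient", "--query", query])
    return out.decode().split()

dataset = "/SingleMuon/Run2017B-17Nov2017-v1/MINIAOD"   # example dataset name

# Step 1: physics abstraction (dataset) -> list of fileblocks (catalog lookup).
blocks = das_query("block dataset=%s" % dataset)

# Step 2: fileblock -> sites that currently host it (location info from PhEDEx).
for block in blocks[:3]:
    sites = das_query("site block=%s" % block)
    print(block, "->", ", ".join(sites))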
 

The CMS Data Hierarchy

Line: 161 to 161
 

Managing Grid Jobs

The management of grid jobs is handled by a series of systems, described in WorkBookAnalysisWorkFlow. The goal is to schedule jobs onto resources according to the policy and priorities of CMS, to assist in monitoring the status of those jobs, and to guarantee that site-local services can be accurately discovered by the application once it starts executing in a batch slot at the site. As a user, you should not normally have to deal with these issues directly.
Changed:
<
<
The datasets are tracked as they are distributed around the globe by the CMS Dataset Bookkeeping Service (DBS), while the Physics Experiment Data Export service (PhEDEx) moves data around CMS.
>
>
The datasets are tracked as they are distributed around the globe by the CMS Data Aggregation System (DAS), while the Physics Experiment Data Export service (PhEDEx) moves data around CMS.
  A major bottleneck in the data analysis process can be retrieval of data from tape stores, so storage and retrieval are major factors in optimising analysis speed.
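
How a user actually hands work over to these systems is described in WorkBookAnalysisWorkFlow; as a rough sketch under those conventions, a CRAB configuration only has to name the input dataset and a few bookkeeping parameters, and the grid tools take care of data location, scheduling and monitoring. The request name, CMSSW configuration file, dataset and storage site below are placeholders.

from CRABClient.UserUtilities import config

config = config()
config.General.requestName = 'myAnalysis_test'       # label for this task
config.JobType.pluginName  = 'Analysis'
config.JobType.psetName    = 'myAnalysis_cfg.py'     # your CMSSW configuration file
config.Data.inputDataset   = '/SingleMuon/Run2017B-17Nov2017-v1/MINIAOD'
config.Data.splitting      = 'FileBased'             # jobs split along files/fileblocks
config.Data.unitsPerJob    = 10
config.Site.storageSite    = 'T2_XX_Example'         # placeholder site for the output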
 