5.2 Grid Computing Context
Goals of this page:
This page describes the general context in which the CMS distributed analysis infrastructure operates. The key term here is "Grid computing", and this page gives a rudimentary introduction to its terminology, to the extent that it is relevant to CMS.
Introduction
To satisfy emerging IT needs in the scientific, industrial, governmental and commercial arenas, Grid computing has been conceived as an expansion of distributed computing. Grid computing involves the distribution of computing resources among geographically separated sites (creating a "grid" of resources), all of which are configured with specialized software for routing jobs, authenticating users, monitoring resources, and so on. Shared, site-based computing resources may include computing and/or storage nodes, software, data, a variety of scientific instruments, and so on.
Grid computing aims to provide reliable and secure access to widely scattered resources for authorized users located virtually anywhere in the world. When a user submits a job, the Grid software controls where the job gets sent for processing. Think of a Grid as a utility, much like the electrical utility grid. A company may buy electric power from a variety of physically separate sources, pool it, and distribute it to all its customers with high reliability. The customers do not need to know where their electricity originates, just that their wall sockets always work. In Grid computing, the end user does not need to know where particular resources reside, just that they are available with high reliability.
For CMS, it is virtually impossible to store all of the data in a single location, or to concentrate at one site all of the CPU power needed for analysis and Monte Carlo production. For this reason, Grid technology is used. A three-level Tier structure of computing resources has been organized to handle the vast storage and computational requirements of the CMS experiment. A CMS physicist may use Grid tools to submit a CMS analysis job to a "Workload Management System" (WMS) and does not need to worry about details such as the location of the data and the available computing power, which are handled transparently.
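The idea of transparent job routing can be illustrated with a minimal sketch. This is not the actual WMS algorithm: the `Site` class, the `choose_site` function, the site names and the dataset names are all hypothetical, and the selection rule (pick the site that hosts the requested dataset and has the most free CPUs) is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical Grid site; the fields are illustrative only."""
    name: str
    free_cpus: int
    datasets: set = field(default_factory=set)


def choose_site(sites, dataset):
    """Return the site that hosts `dataset` and has the most free CPUs.

    Returns None when no site with free CPUs hosts the dataset.
    """
    candidates = [s for s in sites if dataset in s.datasets and s.free_cpus > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.free_cpus)


# Made-up sites and dataset names, for illustration only.
sites = [
    Site("T2_Example_A", free_cpus=120, datasets={"/Example/DatasetA"}),
    Site("T2_Example_B", free_cpus=300, datasets={"/Example/DatasetB"}),
    Site("T1_Example_C", free_cpus=40,
         datasets={"/Example/DatasetA", "/Example/DatasetB"}),
]

chosen = choose_site(sites, "/Example/DatasetA")
print(chosen.name if chosen else "no matching site")
```

In this toy setup the user only names the dataset; the choice of site is made by the matching function, which mirrors the way the user is shielded from the location of data and computing power.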
Worldwide LHC Computing Grid Project (WLCG)
The mission of the Worldwide LHC Computing Grid Project (WLCG) is to build and maintain a data storage and analysis infrastructure for the entire high energy physics community that will use the LHC. The WLCG project aims to collaborate and interoperate with other major Grid development projects and production environments around the world.
As such, WLCG has established relationships with regional computing centres that act as T1 centres. These centres exist in a number of different countries in Europe, North America and Asia. Each T1 centre is part of at least one of the Grids (EGEE, OSG, NorduGrid, and potentially others) and provides sharable resources. These resources become accessible to CMS through WLCG User Interfaces, so that any CMS user can potentially use their facilities.
Enabling Grids for E-sciencE (EGEE)
EGEE is a project of the European Union which provides a world-wide Grid infrastructure for several scientific communities, including High Energy Physics and the LHC experiments. The vast majority of the WLCG sites outside the US are part of EGEE. EGEE provides not only the infrastructure, but also a complete Grid middleware stack, gLite.
Open Science Grid (OSG)
The Open Science Grid is a US Grid computing infrastructure that supports scientific computing via an open collaboration of science researchers, software developers and computing, storage and network providers. CMS researchers (US-based or not) working from a WLCG User Interface (UI) can access both EGEE and OSG Tier1 and Tier2 resources in a fully transparent way. More information is available at US CMS Grid Services and Open Science Grid.
NorduGrid is a project by several countries (mostly in Northern Europe) to develop and operate a Grid infrastructure.
Grid Security, Authentication and Authorization
Maintaining security within a Grid is very important. Grid user authentication is based on Digital Certificates (sometimes called "Grid certificates" or just "certificates"). A digital certificate is a specialized file, issued by a trusted authority, that is used to verify a user's identity on a computer and/or over a computer network (e.g., on a Grid). Authenticated users must obtain authorisation to use particular Grid resources.
Authorisation is provided by membership in a Virtual Organisation (VO). A VO is a group of individuals or institutions who share the computing resources of a Grid for a common goal, e.g., the CMS collaboration. The VO must be able to verifiably identify applicants and members (i.e., to trust the certificates), control which individuals join the organisation, control what they are allowed to do, and make its list of members available to the software that controls and monitors the Grid.
The VO keeps the list of authorized users in a VOMS server. When a user presents a valid certificate, a special file called a "proxy" is created on the user's disk. The proxy has a limited validity period (whereas certificates usually last one year) and is used to issue Grid commands.
In an analogy with traveling, the certificate acts as your passport (thus providing authentication). Your VO "stamps a visa in your passport", thus saying "I know who you are and where you come from, and I authorize you to visit such-and-such places and to do such-and-such things." The proxy is the equivalent of a residence permit that lets you do some of those things for a limited time. Renewing the proxy is as important in Grid computing as renewing a residence permit is for a migrant worker.
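The relationship between certificate, VO membership and proxy described in this section can be sketched as a toy model. This is only an illustration of the logic, not the VOMS implementation: the class names, the certificate subject string and the 12-hour default proxy lifetime are assumptions introduced here, and no real cryptography is performed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Certificate:
    """Stands in for a digital certificate (the 'passport')."""
    subject: str
    not_after: datetime


@dataclass(frozen=True)
class Proxy:
    """Short-lived credential (the 'residence permit')."""
    subject: str
    vo: str
    expires: datetime

    def is_valid(self, now):
        return now < self.expires


class VirtualOrganisation:
    """Toy VO that keeps its member list in memory (the 'visa office')."""

    def __init__(self, name, members):
        self.name = name
        self.members = set(members)

    def issue_proxy(self, cert, now, lifetime=timedelta(hours=12)):
        # Authentication: the certificate itself must still be valid.
        if now >= cert.not_after:
            raise PermissionError("certificate has expired")
        # Authorisation: the subject must be a member of the VO.
        if cert.subject not in self.members:
            raise PermissionError("subject is not a member of the VO")
        # A proxy can never outlive the certificate it derives from.
        return Proxy(cert.subject, self.name, min(now + lifetime, cert.not_after))


# Hypothetical user and dates, for illustration only.
now = datetime(2009, 12, 4, 9, 0)
cert = Certificate("/DC=org/DC=example/CN=Jane Doe", now + timedelta(days=365))
cms = VirtualOrganisation("cms", members=[cert.subject])

proxy = cms.issue_proxy(cert, now)
print(proxy.is_valid(now + timedelta(hours=1)))
print(proxy.is_valid(now + timedelta(hours=13)))
```

The first check falls inside the assumed 12-hour lifetime and the second falls outside it, which is why a long-running session needs the proxy to be renewed even though the certificate is still valid.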
Review status
Complete review. No changes. The page accomplishes its goal.
Responsible:
StefanoBelforte
Last reviewed by:
StefanoBelforte - 28 Feb 2008
--
FrankWuerthwein - 04-Dec-2009