NA3.3.2 - EMI Standardization Article Website October/November 2011
Version 1.1
For several years there have been many issues surrounding the integration of
Open Grid Services Architecture (OGSA) concepts in Distributed Computing Infrastructures (DCIs)
and standardisation of Grid services within Grid middleware systems provided by EMI.
OGSA represents a massive architecture in the context of distributed systems based on the concept
of Grid services that cover functionality of many technical areas such as compute, data, and security.
The issues can be partly explained by the fact that working interactions among numerous Grid services
as envisaged by OGSA are non-trivial when we consider their implementations based on
Web services message exchanges. These exchanges essentially represent an XML-based Remote
Procedure Call (RPC) where each single difference in the correspondingly used XML-based
protocol or schema can break the interconnection between EMI services and their clients.
Standardisation of these XML-based protocols or schemas bears the potential to enable more functioning
and stable interconnections between the EMI Grid services that form together a DCI thus enabling
interoperability with other DCIs and similiar infrastructures that adopt the same standards.
Such aforementioned common open standards can be defined as follows:
A common open standard is any standard developed by a standardization
development organisation (SDO) following an open process
and being commonly relevant to the
e-Science community. Those standards are typically normatively defined and publicly available.
In principle, EMI works with standards that are normatively defined in specifications
while some of them are still considered as so-called emerging open standards
when these are not implemented by many technology providers yet.
SDOs relevant in the context of EMI are most notably:
- Open Grid Forum (OGF)
- e.g. GLUE2, SRM, GridFTP, WS-DAI, ByteIO, PGI/OGSA-BES/JSDL, etc.
- Organization for the Advancement of Structured Information Standards (OASIS)
- e.g. SAML, XACML, WS-Trust, etc.
- Internet Engineering Task Force (IETF)
- e.g. X.509-based Public Key Infrastructure (PKI), etc.
One of the key standards in EMI is the OGF specification GLUE2.
This standard represents a Grid information model that
normatively defines attributes (i.e. properties) for certain
important entities like computing or storage resources
and EMI Grid services that are deployed in DCIs.
The GLUE2 schema is an evolution from the proprietary GLUE1.3
schema that essentially was created during the course
of the EGEE series of projects.
One of the goals of EMI is to describe each service with this standard
in order to achieve a common EMI information ecosystem.
Another key standard is the OGF Usage Record Format (UR)
that is a normative schema for tracking
resource usage.
This standard not only bears the potential to enable cross-infrastructure accounting, but
also set the foundation for higher-level billing and pricing services and techniques that
are relevant for using EMI products outside the academic field in commercial environments.
Since it essentially stands for computational resource usage information,
EMI works not only on its evolution,
but also on its extension towards storage resource tracking.
Members of EMI chair this activity in order to contribute the EMI proprietary
extensions to standardization.
In terms of security, the Security Assertion Markup Language (SAML)
from OASIS is very well known in the industry,
but also recently more and more in the scientific domain and thus relevant for EMI.
It is a very extensive standard but mostly used in our given context for the transfer of security
attributes that state the VO or project membership as well as the role posessions of end-users.
SAML has much potential to be the next generation e-science security standard and as such it is
one cornerstone of the security harmonization activities within EMI.
Apart from the commercial field, SAML-based security solutions also gained a considerable profile in the
academic field, for instance by Shibboleth Federations in Europe and
InCommon in the US.
Also many EMI products adopt this promising security standard thus paving the way of being used
by new user communities interesting in modern security setups.
The eXtensible Access Control Markup Language (XACML)
is the counterpart to SAML providing a very strong language for the definition
of security policies used during authorization decisions.
XACML is developed by OASIS
and is also relevant to EMI especially in defining common attribute-based
authorization policies across the different Grid middleware systems in EMI.
Another important open standards schema is the so-called Job Submission and Description
Language (JSDL).
This prominent OGF standard describes how computationally-driven
jobs can be submitted to EMI Grid middleware systems.
This standard is already used in production setups
within international DCIs and several extensions have been already defined in the past.
These are the Single-Program-Multiple-Data (SPMD) JSDL extension, the JSDL Parameter
Sweep Extension and the HPC Profile Application Extensions. Nevertheless, there are several
improvements proposed by EMI in order to extend JSDL. In order to bring these
proprietary extensions into standardization,
EMI actively steers this standardization activities through the
Production Grid Infrastructure (PGI) working group.
Closely related to the JSDL standard is the OGF OGSA-Basic Execution Service (BES) specification
that makes use of JSDL in order to submit jobs to computational resources.
OGSA-BES specifies exactly those operations that are required to submit and manage computational
activities within Grid middleware adoptions.
Initial OGSA-BES adoptions have been used in
production setups leading to several additional requirement in terms of functionality.
Therefore, EMI worked on an EMI execution services specification
with numerous improvements in order to contribute to an improved version of this standard
for production DCIs.
The standardization of EMI execution services concepts is an activity performed
within the PGI working group that is chaired by EMI members.
The Storage Resource Manager (SRM) is a storage management specification that is very
well adopted by over five different implementations that serve different end-user needs.
Three implementations are part of the EMI releases.
Complementary to this storage standard is the access
to relational databases and such like using the
WS-Data Access and Integration (WS-DAI) specification.
Metadata catalogs such as
AMGA in EMI rely on this specification.
In contrast, the OGF
GridFTP specification defines mechanisms for the use
of a FTP for very large data amounts making it the de-facto large-scale data transfer standard
available in distributed computing today.
Another data transfer standard that is relevant is the
ByteIO specification that offers access to files with remote POSIX-based methods.
Both of the aforementioned standards are also implemented in EMI products.
Apart from the specific standards above, EMI also observes new emerging standard specifications.
Examples are the OGF WS-Agreement specification, the Distributed Resource Management
Application API (DRMAA) version 2 specification or the OGSA - Resource Usage Service specification.
In addition, EMI also observes standardization activities in the emerging cloud domain such as the
OGF Open Cloud Computing Interface (OCCI) or the Cloud Data Management API (CDMI) from SNIA.
Finally, EMI is not only adopting a wide variety of open standards,
but also actively contributing to the process of defining evolutions of them.
By chairing and strong contributions in the open standard community,
EMI is a key central player in Europe to transfer lessons learned from
DCI production experience into the international standardization process.