Consolidation and Harmonization of Compute Area Clients and APIs

Alvise Dorigo, Martin Skou Andersen, Björn Hagemeier

Goals

The goal of the task force on consolidation and harmonization of compute clients is to determine possible ways of harmonizing the existing clients, also in light of the EMI Execution Service (EMI-ES) interfaces developed by EMI. The main aspects to improve are:

  • usability
  • maintainability
  • portability to different architectures and operating systems

So far, we have evaluated several approaches for achieving this goal. Choosing an approach is not simple, as each has its own set of advantages and disadvantages. We have therefore prepared this document describing the approaches, their pros and cons, and the consequences of each choice.

Making the cut between CLI and API

Each of the three middlewares involved in the compute area has its own approach to where to make the cut between the command line interface (CLI) for end users and the client library for developers. Where ARC inherently follows the approach of abstracting the underlying client library from the command line interface, the CREAM CLI operates directly on the underlying library. UNICORE does a bit of both by providing the UNICORE command line client, which offers UNICORE specific commands that are implemented by directly using the UNICORE client library. Because this client is extensible, it also supports access to OGSA-BES services.

Also from the UNICORE world, there is the HiLA library and its respective command line client, HiLA Shell. Like the ARC client, HiLA is intended for accessing multiple implementations through the same client. It is an abstraction layer above specific Grid middleware clients. Currently, there is an implementation for UNICORE 6 only, but one for EMI-ES can be envisioned. The HiLA Shell client was conceived as a demonstrator for the library and is thus not as sophisticated as the UNICORE command line client or the ARC client.
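The cut described above can be illustrated with a minimal sketch in which the CLI code depends only on an abstract interface and each middleware plugs in behind it. All names here (`ComputeBackend`, `InMemoryBackend`, `cli_submit`) are hypothetical and chosen for illustration; they are not the ARC, HiLA, or UNICORE APIs, and the backend only records jobs locally.

```python
from abc import ABC, abstractmethod


class ComputeBackend(ABC):
    """Hypothetical middleware-neutral interface (not the ARC or HiLA API)."""

    @abstractmethod
    def submit(self, executable):
        """Submit a job and return a job identifier."""

    @abstractmethod
    def status(self, job_id):
        """Return the current state of a job."""


class InMemoryBackend(ComputeBackend):
    """Stand-in backend: records jobs locally instead of contacting a service."""

    def __init__(self, name):
        self.name = name
        self._jobs = {}

    def submit(self, executable):
        job_id = "{}-{}".format(self.name, len(self._jobs) + 1)
        self._jobs[job_id] = "QUEUED"
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]


def cli_submit(backend, executable):
    """CLI-level code: it only knows the abstract interface."""
    job_id = backend.submit(executable)
    return "{} {}".format(job_id, backend.status(job_id))


if __name__ == "__main__":
    print(cli_submit(InMemoryBackend("emi-es"), "/bin/hostname"))
```

A CLI that calls a middleware library directly, as the CREAM CLI does, would skip the abstract class and bind `cli_submit` to one concrete implementation.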

  • Where to make the cut between CLI and API
    • CREAM has some lower level libs
  • Martin thought of the process as merging all clients into one

MC: 1) We have to take into account that each middleware has preferences towards a given programming language. C and Java must be taken into account, as they cover the biggest part, but for example the gLite WMS offers Python bindings as well. 2) There are preferences in the job description language as well. Would EMI-ES be able to cover all the cross-middleware production needs? If this could be done in some way, how do we map all the various job description dialects into EMI-ES?

    • What can they do?
    • Unification?
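To make the dialect question more concrete, the following sketch translates attribute names from two existing dialects into one neutral set of fields and reports what cannot be mapped. The JDL and xRSL attribute names are the commonly used ones; the neutral field names and the function `to_neutral` are assumptions for illustration and are not taken from the EMI-ES specification.

```python
# Left side: attribute names as used in gLite JDL and ARC xRSL.
# Right side: hypothetical neutral field names, NOT the EMI-ES schema.
DIALECT_TO_NEUTRAL = {
    "jdl": {
        "Executable": "executable",
        "Arguments": "arguments",
        "StdOutput": "stdout",
    },
    "xrsl": {
        "executable": "executable",
        "arguments": "arguments",
        "stdout": "stdout",
    },
}


def to_neutral(dialect, attributes):
    """Translate dialect-specific attributes.

    Returns a pair (neutral, unmapped): the translated fields and the
    list of attribute names that have no neutral counterpart.
    """
    table = DIALECT_TO_NEUTRAL[dialect]
    neutral, unmapped = {}, []
    for key, value in attributes.items():
        if key in table:
            neutral[table[key]] = value
        else:
            unmapped.append(key)
    return neutral, unmapped


if __name__ == "__main__":
    print(to_neutral("jdl", {"Executable": "/bin/echo", "Rank": "other.GlueCEStateFreeCPUs"}))
```

The `unmapped` list is where the open question shows up: every attribute that lands there is either dropped, or needs an extension of the common description.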

Command Line Client (CLI)

Current command line clients support a number of features. These should roughly be available in any harmonized or consolidated client, so that users can easily pick up how to use the new clients. The essential question here is whether to integrate the functionality to access EMI-ES into all existing middleware clients, or to write a new middleware client allowing access to EMI-ES, thus consolidating functionality in a single client.

MC: What about providing hooks, in case of a unified client? The other way around, things would be easier anyway.

As an addition, functionality to access EMI-ES could be integrated in the existing clients, but we do not see this as a goal for EMI. The integration in the existing clients would also depend on the decisions regarding the client library (see below).

  • Features of the client
    • submission
    • job monitoring
    • match-making
    • file staging
    • ...
  • How to implement
    • Single client for all EMI-ES implementations
    • vs. Integrate functionality in existing clients
    • We would like a PTB decision about whether it is acceptable to only offer a client accessing EMI-ES or whether we need something more, i.e. accessing all previously existing services.
    • ARC library is generic and can be used in a sense as a client library for EMI-ES
    • The HiLA approach is similar
    • We intend to evaluate existing generic logic in ARC and HiLA for which one to use
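As a rough illustration of the "single client" option, the feature list above could map onto subcommands of one executable. The sketch below uses Python's standard `argparse` module; the program name `emi-client`, the subcommand names, and the `--endpoint` option are hypothetical and do not describe any existing client.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per client feature."""
    parser = argparse.ArgumentParser(prog="emi-client")  # hypothetical name
    sub = parser.add_subparsers(dest="command", required=True)

    # Submission: takes a job description file and a target service.
    submit = sub.add_parser("submit", help="submit a job description")
    submit.add_argument("description")
    submit.add_argument("--endpoint", required=True)

    # Job monitoring: takes the identifier returned at submission.
    status = sub.add_parser("status", help="query the state of a job")
    status.add_argument("job_id")

    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Match-making and file staging would be further subcommands or options in the same structure; whether they belong in the client at all depends on the decisions listed above.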

We will discuss the individual proposed solutions in the following sections.

Single EMI-ES command line client

Pros

  1. New users would only have to learn a single interface in order to access all EMI middleware, making an even larger set of resources available to them.
  2. EMI-ES contains the essential Grid interfaces and is thus clean and lean. It is not laden with legacy functions that are only rarely needed. The single EMI-ES client would make this visible.
  3. Reduction of active code, as only the single client would need to be maintained. This is one of the key EMI targets.
  4. Some clients are currently available on multiple platforms, while others are only available on selected platforms. Starting with a new client, it would be possible to aim for platform independence, thus allowing users to use the platform of their choice rather than the developer's choice.

Cons

  1. A lot of effort has been spent over the years to develop the existing clients. They do their tasks and they do them well.
  2. Users are used to the existing clients and know their interfaces in terms of options, arguments, and their use.
  3. Users may have developed scripts that use the existing clients, so abandoning the existing clients would lead to user disappointment. Users would have to adapt their scripts if they want to use the new single EMI command line client and its potential new features.

Consequences

Integration of EMI-ES functionality in existing clients

Pros

  1. Users would be able to keep using the clients they have always been using, with the additional benefit of being able to access additional middleware through the EMI-ES interfaces.
  2. Less initial overhead, as no new client framework would have to be implemented.

Cons

  1. This approach requires at least three (ARC, gLite, UNICORE) clients to be maintained, not leading to a reduction of code. On the contrary, code will even increase, as the EMI-ES client library, which we assume will be developed, will have to be integrated in multiple clients.
  2. Users may be distracted by the additional features of their clients, not knowing what to do with them.
  3. There may even be potential conflicts within the clients, e.g. when the existing client so far required a library that conflicts with a library required for accessing the EMI-ES interfaces.
  4. The uptake of the new EMI-ES functionality would be hampered, leading to poor acceptance right from the start.

Consequences

Alignment of existing clients

From the point of view of a user, it would already be an improvement in usability if the existing clients followed similar rules for their user interfaces, e.g. provided similar if not identical names for commands and options. A comparison of commands and their options has already been done and can be found in EmiJra1T2ComputeClientOpportunities. It would be a minor effort to harmonize the existing tools with respect to command names and their options.

Pros

  1. Enables users to easily switch clients and thereby make use of different middleware.

Cons

  1. Requires (minor) changes in some of the clients. The changes themselves will be easy to do. However, they may irritate users. While in many cases the aligned user interface will only be added on top of the existing interfaces, there may be cases where the aligned interface conflicts with the existing one, e.g. an aligned option is used for a different purpose in one of the existing clients.
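The conflict case mentioned above can be checked mechanically once the options of each client are tabulated. The following is a minimal sketch under the assumption that each client's options are available as a mapping from flag to purpose; the flags and purposes in the example are made up and do not belong to any of the existing clients.

```python
def plan_alignment(existing, aligned):
    """Classify each aligned option against one client's existing options.

    Both arguments map an option flag to the purpose it serves.
    """
    plan = {"unchanged": [], "add": [], "conflict": []}
    for flag, purpose in aligned.items():
        if flag not in existing:
            plan["add"].append(flag)        # can be added on top
        elif existing[flag] == purpose:
            plan["unchanged"].append(flag)  # already aligned
        else:
            plan["conflict"].append(flag)   # same flag, different meaning
    return plan


if __name__ == "__main__":
    # Hypothetical flags, for illustration only.
    existing = {"-o": "output file", "-d": "debug level"}
    aligned = {"-o": "output file", "-d": "job description file", "-e": "endpoint"}
    print(plan_alignment(existing, aligned))
```

Options that end up under `add` can be introduced without affecting existing scripts; only those under `conflict` force a change that users will notice.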

Consequences

Client Library

From the point of view of the client library, the question is a little more difficult to answer. Whereas end users do not care much in which programming language their clients are written, developers are more limited in this regard. They want to solve their problems in the language they use every day.

One interesting approach to solve the difficulty of choosing the right library is SAGA, the Simple API for Grid Applications. It defines abstract interfaces independent of the programming language and provides bindings for various languages. There are two benefits in this. The interfaces are abstract and thus provide access to different Grid implementations. Additionally, with the interfaces being defined independently of the programming language, the switch from one language to another becomes quite easy.

On the other hand, we found SAGA not to be very widespread, and the implementation of SAGA adaptors for EMI-ES may become difficult, as they would have to be maintained for each programming language implementation.

At the same time, there is already a similar approach available from the ARC middleware. The API is defined in an abstract manner and multiple programming languages are supported via an automated language wrapper.
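The adaptor idea behind such abstract libraries can be sketched as a small registry that picks an implementation from the service URL. This is an illustration only: the registry, the `emies` and `local` scheme names, and the adaptor classes are assumptions and do not reproduce the SAGA or ARC interfaces.

```python
from urllib.parse import urlparse

# Maps a URL scheme to the adaptor class that handles it.
_ADAPTORS = {}


def register(scheme):
    """Class decorator that registers an adaptor for a URL scheme."""
    def decorator(cls):
        _ADAPTORS[scheme] = cls
        return cls
    return decorator


@register("emies")  # hypothetical scheme name
class EmiEsAdaptor:
    """Placeholder; a real adaptor would talk to an EMI-ES endpoint."""

    def __init__(self, url):
        self.url = url


@register("local")  # hypothetical scheme name
class LocalAdaptor:
    """Placeholder for a second implementation behind the same entry point."""

    def __init__(self, url):
        self.url = url


def connect(url):
    """Return an adaptor instance for the given service URL."""
    scheme = urlparse(url).scheme
    cls = _ADAPTORS.get(scheme)
    if cls is None:
        raise ValueError("no adaptor registered for scheme {!r}".format(scheme))
    return cls(url)


if __name__ == "__main__":
    print(type(connect("emies://ce.example.org/service")).__name__)
```

In this arrangement, application code only calls `connect`, and supporting a further middleware means registering one more adaptor. The maintenance concern raised for SAGA is that such adaptors would have to exist once per language implementation, whereas an automated language wrapper exposes a single set of adaptors to several languages.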

  1. SAGA or not SAGA: for now I'm in favor of not using SAGA, as it doesn't seem to be widespread.
  2. If not SAGA, what else? Martin suggested reworking of existing libraries.
  3. From whose point of view should the requirements be driven? Martin wrote an interesting email on February 3rd (see below).

From the EMI point of view, as I see it, the goal would be to have one CLI and library (API) supporting all the Execution Services in EMI. With the EMI-ES agreement, this should be a CLI and library able to interact with an EMI-ES. That is only the minimal solution, since it could be extended to support the full features of the respective ESs in EMI.

From the developers' point of view: Since the compute functionality that is to be supported in the end product already exists in each of the 3 compute clients/APIs, the goal would be to reuse functional, stable code which has proved its worth. This of course means that some components/parts must be dropped (3->1).

From the existing users' and third party developers'* point of view, which I think is the most important: Here I am only concerned about existing users and third party developers, and in this respect users are using the CLI, while third party developers are using the API. Users would expect a CLI to be similar to the one they are already using, and it should at least deliver the same level of functionality. Users can be convinced to use a different CLI, but only if it does not limit their functionality. The same holds for third party developers. Additionally, they would also expect some backwards compatibility, and concepts not changing too much.

* Users and third party developers already using any of the 3 compute area CLIs/APIs.

For me these three points of view cannot be unified.

The following questions should be answered next:

  • Compare the capabilities of current client libraries in a similar manner as we have done for the command line interfaces.
  • Consider whether a common CLI implementation would be based on a unified library.
  • Which programming languages need to be supported? Evaluation of existing libs will help this decision.
  • As a first rule, avoid adding more functionality to the client or libraries, so as not to overload them.

Next steps

  • CLI
    • Evaluation of ARC and HiLA abstract libraries and the clients based on them
    • Comparison of available commands in the existing clients

  • Libs
    • Look at capabilities of current libraries and compare them similar to CLIs

-- BjoernHagemeier - 18-Mar-2011

Topic revision: r5 - 2011-04-13 - unknown
 