Please refer to the updated version at: https://twiki.cern.ch/twiki/bin/view/EGEE/EGEEgLiteWorkPlans

Information and Monitoring work plan

Requirements

There are only two requirements directly related to the JRA1-UK work. Under the security heading:

  • #101 - asks us to make use of VOMS groups and roles

and under the Information System heading:

  • #111 - we understand that the real requirement is to provide fast and reliable access to service end points. The authors of the requirement appear to assume that this requires some form of caching.

Effort

We are currently very low on manpower, having just had 4 resignations from the team. By early July this leaves:

Djaoui, Abdeslem: Development

Duncan, Alastair: Development and integration

Fisher, Steve: Cluster leader

Wilson, Antony: Development and support

We are recruiting at least 3.5 developers to provide the 168 PM over the 2 year period.

R-GMA Plans

R-GMA is now working fairly well for experienced users. However, the design has some limitations which are hard to work around. We are currently working on a new design which includes authz, multiple VDBs (name spaces), registry and schema replication, and support for Oracle and other RDBMSs.

The first release of this new design will contain no new functionality but will be a sound basis to work from. Significant changes in the design include:

- The registry no longer sends out notifications. This should increase reliability and makes registry replication much easier to implement.

- There will only be one socket open for streaming from one machine to another

- regular handling of remote invocation (time-outs etc)

- database independence

- managed tuple stores - essential to support authz

This first release will probably also include multiple VDB support but not the ability to issue a query spanning more than one VDB.

In subsequent releases we will provide, in this order:

1 Queries over multiple VDBs

2 Authz by VDB. This will make use of VOMS Groups and Roles (or any other certificate attributes)

3 Registry replication

4 Schema replication

5 Oracle support

Service Discovery (SD) Plans

We will work on more responsive SD by invoking the plug-ins in parallel.

We are developing a "configuration-free" SD which is useful as a bootstrap mechanism as it can locate the R-GMA server on the local subnet.

Time scales and assignment of people

We are currently unable to make good predictions, as considerable effort can sometimes be "lost" in support. In addition, JRA1-UK has a temporary critical shortage of manpower. We hope to achieve by the end of this year:

- queries over multiple VDBs and authz (using VOMS attributes) by VDB

- R-GMA clients with the option of automatic configuration

- Parallel invocation of SD plugins

However, this depends on being able to get new people employed and working effectively in a rather short time.

Cache

SD Performance and requirement 111

We understand from Erwin that requirement 111 is to provide fast and reliable access to service end points. The authors of the requirement appear to assume that this requires some form of client side caching.

We had originally intended to provide client caching in the Service Discovery APIs. However we are no longer convinced of the benefit.

Clients run on UIs, within other services and on WNs. Were client caching to be provided, it could be by process, by user or by host. The cache could be read first, which is best for performance, or used only as a last resort, which is best for getting the correct results.

Per process caching would be fairly easy to provide - but it is much better provided by the application code. The application makes a single call to get the SD end points it needs, and only goes back to SD if all the end points prove to be inoperative.
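The application-side pattern described above can be sketched roughly as follows. This is only an illustration, not part of the SD API: the class name EndpointSource, the lookup callable standing in for the single SD call, and the use of ConnectionError to signal an inoperative end point are all assumptions made for the example.

```python
from typing import Callable, List, Optional


class EndpointSource:
    """Keeps the end points from one SD lookup and reuses them.

    `lookup` is a hypothetical stand-in for the application's single call
    to SD; it is invoked again only when every known end point has failed.
    """

    def __init__(self, lookup: Callable[[], List[str]]):
        self._lookup = lookup
        self._endpoints: Optional[List[str]] = None
        self.lookups = 0  # number of times SD was actually consulted

    def call(self, try_endpoint: Callable[[str], object]):
        # At most two passes: one with the held end points, one after a refresh.
        for _ in range(2):
            if self._endpoints is None:
                self._endpoints = list(self._lookup())
                self.lookups += 1
            for endpoint in self._endpoints:
                try:
                    return try_endpoint(endpoint)
                except ConnectionError:
                    continue  # assumed failure signal; try the next end point
            # Every end point proved inoperative: drop them and go back to SD.
            self._endpoints = None
        raise ConnectionError("no operative end point found")
```

With this shape, repeated requests from the same process cost one SD lookup for as long as at least one of the returned end points keeps working.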

If caching were by user it would be necessary to store the information between jobs. This could easily be done with a $HOME for each user - but this does not work on the WNs with any kind of dynamic account system.

The third option of making the cache host-wide seems impractical for security reasons, as different users may have the rights to see different services. It would require a privileged daemon, rather than the client API, to collect information from all services, which is effectively what already happens with an R-GMA secondary producer.

So what is the solution? I suggest that the best approach is to modify the SD API implementation so that it invokes plug-ins in parallel rather than sequentially. Once a plausible response is obtained, the other threads can be ignored or killed. This should reduce the time that SD takes to respond. Note that if R-GMA and BDII were each only available for 90% of the time then, assuming they fail independently, both would be down together only 0.1 x 0.1 = 1% of the time, so the pair should give 99% availability. In fact both are much better than 90%.
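A minimal sketch of the parallel invocation idea, in Python, is given here for illustration only. The function discover, the plug-in callables and the example.org end points are hypothetical stand-ins and not the real SD plug-in interface, and "plausible" is taken to mean simply a non-empty list. Python threads cannot be killed, so the remaining plug-ins are ignored rather than stopped.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout


def discover(plugins, service_type, timeout=5.0):
    """Run all plug-ins at once; return (plug-in name, end points) from the
    first one that gives a non-empty answer, or (None, []) if none does.

    `plugins` maps a name to a callable taking the service type
    (an assumed interface for this sketch).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(plugins)))
    futures = {pool.submit(fn, service_type): name for name, fn in plugins.items()}
    try:
        for future in as_completed(futures, timeout=timeout):
            try:
                endpoints = future.result()
            except Exception:
                continue  # this plug-in failed; wait for the others
            if endpoints:  # assumed plausibility check: non-empty result
                return futures[future], endpoints
    except FuturesTimeout:
        pass  # no plug-in answered in time
    finally:
        # Do not wait for slower plug-ins; cancel_futures needs Python 3.9+.
        pool.shutdown(wait=False, cancel_futures=True)
    return None, []


# Hypothetical stand-in plug-ins, used only to exercise the sketch.
def slow_plugin(service_type):
    time.sleep(0.5)
    return ["https://slow.example.org/" + service_type]


def fast_plugin(service_type):
    return ["https://fast.example.org/" + service_type]


def broken_plugin(service_type):
    raise ConnectionError("information system unreachable")
```

Calling discover({"slow": slow_plugin, "broken": broken_plugin, "fast": fast_plugin}, "rgma") returns the answer from fast_plugin without waiting for slow_plugin, and the failure of broken_plugin does not prevent a result.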

Services and users should use SD sensibly by minimising calls to it. Neither R-GMA nor BDII should be used directly to obtain service end points.

Sufficient secondary producers (aka archivers) should be installed to obtain good R-GMA response. The right number should be determined in consultation with SA1.

Work is already going on to bring the BDII and R-GMA SD in line and to use the same configuration files. This ensures that both systems will give the same answer for service versions and will minimise the configuration effort.

Finally please note that the service described at https://savannah.cern.ch/task/?func=detailitem&item_id=3069 could very easily be provided by R-GMA once the authz is in place. This would require a single primary producer for the "constants" and a few secondary producers.

-- Main.grandic - 27 Jun 2006

Topic revision: r4 - 2008-01-21 - LaurenceField