SD Performance and requirement 111

We understand from Erwin that requirement 111 is to provide fast and reliable access to service end points. The authors of the requirement appear to assume that this requires some form of client-side caching.

We had originally intended to provide client caching in the Service Discovery (SD) APIs. However, we are no longer convinced of the benefit.

Clients run on UIs, within other services and on WNs. If client caching were provided, it could be per process, per user or per host. The cache could be read first, which is best for performance, or used only as a last resort, which is best for getting correct results.

Per-process caching would be fairly easy to provide, but it is much better handled by the application code: the application makes a single call to SD to get the end points it needs, and only goes back to SD if all of those end points prove to be inoperative.
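The application-side pattern described above can be sketched roughly as follows. This is only an illustration in Python: the class name `EndpointSelector`, the `lookup` callable (standing in for whatever SD call the application uses) and the choice of `ConnectionError` to signal an inoperative end point are all assumptions, not part of the SD API.

```python
from typing import Callable, List, Optional


class EndpointSelector:
    """Keeps the end points from one SD lookup; re-queries SD only when all fail."""

    def __init__(self, lookup: Callable[[], List[str]]):
        # 'lookup' is a hypothetical wrapper around the application's SD call.
        self._lookup = lookup
        self._endpoints: Optional[List[str]] = None

    def call(self, operation: Callable[[str], object]) -> object:
        fresh = False  # True once the list has been fetched during this call
        while True:
            if self._endpoints is None:
                self._endpoints = list(self._lookup())
                fresh = True
            for endpoint in self._endpoints:
                try:
                    return operation(endpoint)
                except ConnectionError:
                    continue  # this end point is inoperative, try the next one
            # Every known end point failed: discard the list.
            self._endpoints = None
            if fresh:
                # A freshly fetched list also failed, so give up.
                raise ConnectionError("no operative end point found")
```

With this arrangement a long-running process contacts SD once at start-up and again only after every end point it holds has failed, which keeps the load on SD low without any cache inside the SD client itself.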

If caching were per user, it would be necessary to store the information between jobs. This could easily be done in each user's $HOME, but that does not work on WNs with any kind of dynamic account system.

The third option, a host-wide cache, seems impractical for security reasons, as different users may have rights to see different services. It would require a privileged daemon, rather than the client API, to collect information from all services, which is effectively what already happens with an R-GMA secondary producer.

So what is the solution? I suggest that the best approach is to modify the SD API implementation so that it invokes plug-ins in parallel rather than sequentially. Once a plausible response is obtained, the other threads can be ignored or killed. This should reduce the time that SD takes to respond. Note that if R-GMA and BDII were each available only 90% of the time, the pair would give 99% availability (1 - 0.1 x 0.1 = 0.99, assuming they fail independently); in fact both are much better than 90%.
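A minimal sketch of the parallel invocation, written in Python purely for illustration, might look like the following. The function name `query_plugins`, the plug-in signature (a callable taking a service type and returning a list of end points) and the rule that a "plausible" response is simply a non-empty list are all assumptions; the real SD plug-in interface may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, List, Optional


def query_plugins(plugins: Dict[str, Callable[[str], List[str]]],
                  service_type: str,
                  timeout: Optional[float] = None) -> List[str]:
    """Run every plug-in at once and return the first plausible answer."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(plugins)))
    futures = [pool.submit(plugin, service_type) for plugin in plugins.values()]
    try:
        for future in as_completed(futures, timeout=timeout):
            try:
                endpoints = future.result()
            except Exception:
                continue  # this plug-in failed; wait for the others
            if endpoints:  # assumption: "plausible" means non-empty
                return endpoints
        return []  # every plug-in failed or returned nothing
    except FutureTimeout:
        return []  # no plug-in answered within the timeout
    finally:
        # Do not wait for slower plug-ins. Python threads cannot be killed,
        # so the remaining ones are simply ignored and left to finish.
        pool.shutdown(wait=False)
```

The response time is then set by the fastest plug-in that answers rather than by the sum of all of them, and a failure in one back end is masked as long as another responds.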

Services and users should use SD sensibly by minimising calls to it. Neither R-GMA nor BDII should be used directly to obtain service end points.

Sufficient secondary producers (aka archivers) should be installed to obtain good R-GMA response. The right number should be determined in consultation with SA1.

Work is already under way to bring the BDII and R-GMA SD into line with each other and to have them use the same configuration files. This will ensure that both systems give the same answer for service versions and will minimise the configuration effort.

Finally, please note that the service described at https://savannah.cern.ch/task/?func=detailitem&item_id=3069 could very easily be provided by R-GMA once the authz is in place. This would require a single primary producer for the "constants" and a few secondary producers.

-- Main.jwhite - 05 Oct 2006


Topic revision: r2 - 2008-01-21 - LaurenceField
 