This page summarizes the outcome of meetings and discussions, together with pieces of existing documentation, related to this subject:

Are the experiments happy with a single algorithm for VO-specific availability calculation that is consistent with the GridView algorithm?

In general, people tend to agree that we should keep a single algorithm (based on the discussion at the meeting on 23.06.2010).

Could the GridView people please briefly describe the algorithm which is currently used here.....

What should be done differently from the current algorithm is the following:

1). If a site has two services of the same type, one down and the other in maintenance, GridView currently considers the service type to be in maintenance. It should be the other way around: the service type should be considered down rather than in maintenance, because otherwise a site could register one fake service, keep it in maintenance all the time, and never be counted as down. While writing this page, I asked myself what should happen if there are 10 services of the same type, 9 of them in maintenance and 1 down: should the overall state of the service type then be considered down? Should one not rather take the state of the majority of services of the given type?

2). When there are several services of the same flavour, a logical 'OR' is applied in the availability calculation. VOs should be able to override this default behaviour if they need to, replacing 'OR' with 'AND'. This possibility should be foreseen in the UI where VOs define a profile.
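
The two proposed changes can be sketched in Python as follows. This is only an illustration of the rules discussed in points 1 and 2, not the actual GridView or SAM code: the `Status` names, the `combine_service_type` function and its `mode` parameter are hypothetical. The sketch assumes that under 'OR' one working instance is enough, under 'AND' all instances must be working, and that otherwise "down" wins over "maintenance". The open question about a majority rule is deliberately not implemented.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    DOWN = "down"
    MAINTENANCE = "maintenance"


def combine_service_type(statuses, mode="OR"):
    """Combine the statuses of all instances of one service type.

    mode="OR"  (assumed default): the type is OK if at least one instance is OK.
    mode="AND" (VO override):     the type is OK only if every instance is OK.
    If the type is not OK, DOWN takes precedence over MAINTENANCE (point 1).
    """
    statuses = list(statuses)
    if not statuses:
        raise ValueError("no service instances given")
    if mode not in ("OR", "AND"):
        raise ValueError("mode must be 'OR' or 'AND'")

    ok_flags = [s is Status.OK for s in statuses]
    if (mode == "OR" and any(ok_flags)) or (mode == "AND" and all(ok_flags)):
        return Status.OK

    # A single instance that is down makes the whole type down, so a fake
    # instance kept permanently in maintenance cannot hide a real outage.
    if Status.DOWN in statuses:
        return Status.DOWN
    return Status.MAINTENANCE


# One instance down, one in maintenance: the type is down, not in maintenance.
print(combine_service_type([Status.DOWN, Status.MAINTENANCE]))
```

With this rule the case of 9 instances in maintenance and 1 down also evaluates to down; a majority rule would give maintenance instead.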

How to handle VO-specific service types, such as the CRAB server

David said that there was an agreement, recorded in the document approved by the MB, that all services tested by SAM should be registered in GOCDB. Alessandro mentioned that registration in GOCDB is not a straightforward process and pointed to the corresponding Savannah bug, https://savannah.cern.ch/support/?113592. Andrea expressed some doubts that the experiments would be happy to register experiment-specific services in GOCDB. It was suggested to re-discuss this question inside the experiments in order to understand whether they agree with the statement that all experiment-specific services which need to be tested by SAM should be registered in GOCDB.

Some remarks:

The validity of a test should be defined at the test level. The default is 24 hours. Where and how should it be defined?

If no critical tests are defined for a site, the site will always be green.

SAM won't use the BDII for availability calculation, since the information that can be taken from the BDII (which services are used by a VO) should come with the VO topology description. However, if no topology description is provided, the BDII can be used for this purpose.

Whether the whole site or only a particular service type is considered to be in downtime is defined by the site admin. VOs are free to define different profiles in order to decouple the various functionalities of the sites into separate profiles.

GridView considers only scheduled downtime as maintenance; during unscheduled downtime the site is regarded as down.
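
Some of these remarks can be illustrated with a short Python sketch. Only the 24-hour default validity, the "always green without critical tests" behaviour and the scheduled/unscheduled distinction come from the notes above. Everything else is an assumption made for illustration: the names `MetricResult`, `site_status` and `downtime_status`, the status strings, and in particular the choice to report "unknown" when a critical result has expired.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Default validity of a test result, as stated in the remarks above.
DEFAULT_VALIDITY = timedelta(hours=24)


@dataclass
class MetricResult:
    """One test result (hypothetical structure)."""
    name: str
    passed: bool
    timestamp: datetime
    critical: bool = True
    validity: timedelta = DEFAULT_VALIDITY  # defined per test

    def is_valid(self, now):
        return now - self.timestamp <= self.validity


def site_status(results, now):
    """Derive a site colour from its critical test results."""
    critical = [r for r in results if r.critical]
    if not critical:
        # No critical tests defined: the site is always green.
        return "green"
    valid = [r for r in critical if r.is_valid(now)]
    if any(not r.passed for r in valid):
        return "red"
    if len(valid) < len(critical):
        # Assumption: an expired critical result makes the state unknown.
        return "unknown"
    return "green"


def downtime_status(scheduled):
    """Scheduled downtime counts as maintenance, unscheduled as down."""
    return "maintenance" if scheduled else "down"


now = datetime(2010, 6, 23, 12, 0)
print(site_status([], now))
print(downtime_status(scheduled=False))
```

Keeping the validity on each result, rather than as a global setting, matches the remark that validity should be defined at the test level; where that value would be configured is still the open question raised above.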

Meetings

Meetings are recorded on this twiki only starting from 23.06.2010; there were many discussions before this date.

Meeting 23.06.2010

David Collados, Phool Chand, Andrea Sciaba, Roberto Santinelli, Alessandro Di Girolamo, Akshat Kakkar, Pablo Saiz, Julia Andreeva

The goal of the meeting was to understand the requirements of the experiments for VO-specific availability calculations. The starting point is the document created by William Ollivier. The document is old: it dates from two years ago. The issues discussed are described above.

-- JuliaAndreeva - 23-Jun-2010
