Grid Monitoring Data Exchange Standard (Draft)

General concepts

Data types

To avoid compatibility problems when web services written in different programming languages are called from various clients (web browser, application, etc.), we suggest narrowing the list of data types down to the following set:
  • Scalar values:
    • string - quite intuitive
    • number - optionally two sub-types: int, float
    • boolean - can be represented as string "true" or "false"
    • timestamp - W3C date and time, default precision: up to seconds (optionally fractions of seconds)
  • Structures:
    • list - sequence of elements of any type, preserving order of elements (optionally random access with indexing from 0)
    • dictionary - unordered associative array (hash) with key of any scalar type and value of any type (optionally may preserve order of pairs)

Conventions

Passing list in URL

A list of values should be passed as a single parameter in the URL using the following rules:
  • parameter name MAY have [] suffix (for compatibility with PHP)
  • individual values from the list should be passed as separate param=value or param[]=value expressions separated by & sign (CGI multiple selection box form)
Example: ...?calculatedFor_VO_name[]=OPS&calculatedFor_VO_name[]=Atlas
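
The list-passing rules above can be sketched in Python with the standard urllib.parse module. The helper names encode_list_param and decode_list_param are hypothetical and not part of this standard; the sketch only illustrates that both the plain and the [] form carry the same list.

```python
from urllib.parse import parse_qs, urlencode


def encode_list_param(name, values, php_suffix=True):
    """Encode a list as repeated name=value (or name[]=value) pairs."""
    key = name + "[]" if php_suffix else name
    # safe="[]" keeps the brackets literal instead of percent-encoding them
    return urlencode([(key, value) for value in values], safe="[]")


def decode_list_param(query, name):
    """Collect values passed as either name=value or name[]=value."""
    parsed = parse_qs(query)
    return parsed.get(name, []) + parsed.get(name + "[]", [])


query = encode_list_param("calculatedFor_VO_name", ["OPS", "Atlas"])
values = decode_list_param(query, "calculatedFor_VO_name")
```

A receiving service that accepts both forms, as decode_list_param does, stays compatible with PHP and non-PHP clients.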

Timestamp format

Suggestions for all cases (including XML):
  • W3C date and time format (based on ISO8601)
  • all components from year to second (no fractions if possible)
  • all values in UTC ("Z" as timezone)
  • Example of this format convention would be: ...?startTime=2007-02-09T14:42:20Z
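
As a minimal sketch of the timestamp convention, the following Python code formats and parses second-precision UTC timestamps. The function names and the W3C_UTC_FORMAT constant are illustrative assumptions; fractions of seconds are deliberately not handled here.

```python
from datetime import datetime, timedelta, timezone

# Year to second, no fractions, "Z" as the timezone designator
W3C_UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_timestamp(moment):
    """Render a timezone-aware datetime in UTC with second precision."""
    return moment.astimezone(timezone.utc).strftime(W3C_UTC_FORMAT)


def parse_timestamp(text):
    """Parse the second-precision UTC form into an aware datetime."""
    parsed = datetime.strptime(text, W3C_UTC_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


# A local time one hour ahead of UTC is converted before formatting
cet = timezone(timedelta(hours=1))
local = datetime(2007, 2, 9, 15, 42, 20, tzinfo=cet)
start_time = format_timestamp(local)
```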

Boolean value

A boolean value in a URL or in XML should be passed as one of the strings: true, false
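
A minimal Python sketch of this convention follows. Rejecting any other spelling (for example "True" or "1") is an assumption of this sketch; the text above only names the two accepted strings.

```python
def parse_boolean(text):
    """Map the strings "true" and "false" to Python booleans."""
    if text == "true":
        return True
    if text == "false":
        return False
    # Assumption: anything else is treated as an error, not as false
    raise ValueError("expected 'true' or 'false', got %r" % (text,))


def format_boolean(value):
    """Render a Python boolean as the string used in URLs and XML."""
    return "true" if value else "false"
```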

Data model

The following picture shows the data model used for standardised request and response formats:

The data model introduces three categories:

  • class
  • attribute
  • role (marked as a label over associations)

All services following the standard should comply with the following rules:

  • accept parameters related to the data model (request)
  • build a query over the service specific data repository using the given parameters as a filter
  • deliver a result of the query in standard format (response)

Request format

As the most typical request format for all web services we suggest an HTTP URL with parameters encoded in GET or POST query string:

  • should be supported by all components that don't require complex types for parameters (flat lists at most)
  • procedure name can be encoded either as part of URL (URI path element) or one of parameters
  • all parameters related to the data model (common parameters) should be constructed according to the recommendations given in the next section of this document
  • in addition, a service can expect any number of service specific parameters outside the common data model

Common parameters

All parameters related to the data model defined in this document should be constructed using one of two notations (note the underscore used as a separator and the capitalisation of identifiers):
  • ClassName_attributeName - for example Site_name
  • roleName_ClassName_attributeName - for example criticalFor_VO_name
The optional role in the parameter has to be used when the class alone for a given request is not enough to construct a query without ambiguity.
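
The two notations can be told apart mechanically. The Python sketch below splits a parameter name into role, class and attribute; the CommonParameter type and the function name are hypothetical, and the sketch assumes that identifiers themselves contain no underscore and follow the suggested capitalisation (ClassName, attributeName, roleName).

```python
from typing import NamedTuple, Optional


class CommonParameter(NamedTuple):
    role: Optional[str]
    class_name: str
    attribute: str


def parse_common_parameter(name):
    """Split ClassName_attributeName or roleName_ClassName_attributeName."""
    if name.endswith("[]"):
        name = name[:-2]  # drop the optional PHP list suffix
    parts = name.split("_")
    if len(parts) == 2:
        role = None
        class_name, attribute = parts
    elif len(parts) == 3:
        role, class_name, attribute = parts
    else:
        raise ValueError("not a common parameter: %r" % (name,))
    # Class names start upper case; attribute and role names start lower case
    well_formed = class_name[:1].isupper() and attribute[:1].islower()
    if role is not None:
        well_formed = well_formed and role[:1].islower()
    if not well_formed:
        raise ValueError("unexpected capitalisation in %r" % (name,))
    return CommonParameter(role, class_name, attribute)
```

With this sketch, Site_name yields no role, while criticalFor_VO_name yields the role criticalFor; a name such as startTime is rejected because it is not related to the data model.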

In addition, the current standard defines a set of parameters that are not directly related to the data model but are used to narrow down a query:

  • startTime - beginning of the time range for historical queries
  • endTime - end of the time range for historical queries

Composing request URL

The URL composed as a valid request must contain the following components:
  • base url containing host and path components (may contain procedure name)
  • query string, which starts with ? for the GET method and is passed as POST data for the POST method:
    • service specific parameters (may contain procedure name)
    • subset of common parameters

The exact list of service specific parameters and supported common parameters together with the exact semantics should be defined in the specification of a given service.

Example:

  • base URL: http://server.org/service
  • service specific parameters:
    • proc=dailyPlot (name of procedure)
  • common parameters:
    • Site_name=CERN-PROD
    • ServiceMetric_name=site-daily-avail
    • startTime=2007-02-01T00:00:00Z
    • endTime=2007-02-10T00:00:00Z
  • final URL:
    http://server.org/service?proc=dailyPlot&Site_name=CERN-PROD&ServiceMetric_name=site-daily-avail& \\
    startTime=2007-02-01T00:00:00Z&endTime=2007-02-10T00:00:00Z
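
The composition above can be reproduced with a short Python sketch for the GET case. The function name compose_request_url is hypothetical; the host, procedure name and parameter values are the ones from the example.

```python
from urllib.parse import urlencode


def compose_request_url(base_url, service_params, common_params):
    """Join the base URL with service specific and common parameters."""
    pairs = list(service_params.items()) + list(common_params.items())
    # doseq=True expands list values into repeated pairs;
    # safe=":" keeps the colons in timestamps readable
    return base_url + "?" + urlencode(pairs, doseq=True, safe=":")


url = compose_request_url(
    "http://server.org/service",
    {"proc": "dailyPlot"},
    {
        "Site_name": "CERN-PROD",
        "ServiceMetric_name": "site-daily-avail",
        "startTime": "2007-02-01T00:00:00Z",
        "endTime": "2007-02-10T00:00:00Z",
    },
)
```

For the POST method the same urlencode output would be sent as the request body instead of being appended after the ? sign.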
    

Response Format

Examples

Current status of services

Request URL:
http://server.org/current_status?return=criticalFor_VO_name&Region_name=CERN&Site_name=CERN-PROD& \\
Service_endpoint=https%3A%2F%2Fce101.cern.ch%3A2119%2F&calculatedFor_VO_name[]=OPS&calculatedFor_VO_name[]=Atlas

In the URL above the meaning of the parameters is the following:

  • return - additional role_Class_attribute to be included in the output, here: return for which VO name each metric result is critical
  • Region_name - selected value for region.name, triggers the output of the region element
  • Site_name - selected value for site.name, triggers the output of the site element
  • Service_endpoint - selected value for service.endpoint (percent-encoded in the URL)
  • calculatedFor_VO_name[] - list of selected VO names for which metric results should be returned

Response XML:

<?xml version="1.0"?>

<root xmlns="http://cern.ch/grid-mon/2007/05/mon-exchange-schema/">
  <Region name="CERN">
    <Site name="CERN-PROD">
      <type>Production</type>
      <status>Certified</status>
      <SiteMetric name="site-daily-avail">
        <measurement>
          <status>ok</status>
          <summary>0.3</summary>
          <timestamp>2007-02-25T00:00:00Z</timestamp>
        </measurement>
      </SiteMetric>
      <Service endpoint="https://ce101.cern.ch:2119/" type="CE">
        <isMonitored>true</isMonitored>
        <inMaintenance>false</inMaintenance>
        <metricGroup groupBy="calculatedForVO" value="OPS">
          <ServiceMetric name="service-daily-avail">
            <measurement>
              <status>ok</status>
              <summary>0.3</summary>
              <timestamp>2007-02-25T00:00:00Z</timestamp>
            </measurement>
          </ServiceMetric>
          <ServiceMetric name="CE-sft-job">
            <measurement>
              <status>ok</status>
              <criticalForVO>OPS</criticalForVO>
              <criticalForVO>Atlas</criticalForVO>
              <timestamp>2007-02-26T13:00:00Z</timestamp>
            </measurement>
          </ServiceMetric>
          <ServiceMetric name="CE-totalcpus">
            <measurement>
              <status>ok</status>
              <summary>2433</summary>
              <criticalForVO>OPS</criticalForVO>
              <timestamp>2007-02-26T13:20:00Z</timestamp>
            </measurement>
          </ServiceMetric>
          <ServiceMetric name="CE-freecpus">
            <measurement>
              <status>ok</status>
              <summary>200</summary>
              <criticalForVO>Atlas</criticalForVO>
              <timestamp>2007-02-26T11:30:00Z</timestamp>
            </measurement>
          </ServiceMetric>
        </metricGroup>
        <metricGroup groupBy="calculatedForVO" value="Atlas">
          <ServiceMetric name="CE-sft-job">
            <measurement>
              <status>ok</status>
              <criticalForVO>Atlas</criticalForVO>
              <timestamp>2007-02-26T12:11:00Z</timestamp>
            </measurement>
          </ServiceMetric>
        </metricGroup>
      </Service>
    </Site>
  </Region>
</root>
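
A client could read such a response with Python's standard xml.etree.ElementTree module. The sketch below works on an abbreviated copy of the response above and flattens it into one row per measurement; the function name and the row layout are assumptions of this sketch, not part of the standard.

```python
import xml.etree.ElementTree as ET

# Prefix "mon" is local to this sketch; only the namespace URI matters
NS = {"mon": "http://cern.ch/grid-mon/2007/05/mon-exchange-schema/"}

SAMPLE = """<?xml version="1.0"?>
<root xmlns="http://cern.ch/grid-mon/2007/05/mon-exchange-schema/">
  <Region name="CERN">
    <Site name="CERN-PROD">
      <Service endpoint="https://ce101.cern.ch:2119/" type="CE">
        <metricGroup groupBy="calculatedForVO" value="OPS">
          <ServiceMetric name="CE-totalcpus">
            <measurement>
              <status>ok</status>
              <summary>2433</summary>
              <timestamp>2007-02-26T13:20:00Z</timestamp>
            </measurement>
          </ServiceMetric>
        </metricGroup>
        <metricGroup groupBy="calculatedForVO" value="Atlas">
          <ServiceMetric name="CE-sft-job">
            <measurement>
              <status>ok</status>
              <timestamp>2007-02-26T12:11:00Z</timestamp>
            </measurement>
          </ServiceMetric>
        </metricGroup>
      </Service>
    </Site>
  </Region>
</root>"""


def service_metric_rows(xml_text):
    """Return (endpoint, VO, metric name, status, timestamp) tuples."""
    root = ET.fromstring(xml_text)
    rows = []
    for service in root.iterfind(".//mon:Service", NS):
        for group in service.iterfind("mon:metricGroup", NS):
            for metric in group.iterfind("mon:ServiceMetric", NS):
                for measurement in metric.iterfind("mon:measurement", NS):
                    rows.append((
                        service.get("endpoint"),
                        group.get("value"),
                        metric.get("name"),
                        measurement.findtext("mon:status", namespaces=NS),
                        measurement.findtext("mon:timestamp", namespaces=NS),
                    ))
    return rows


rows = service_metric_rows(SAMPLE)
```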

History of selected test for a service

Request URL:
http://server.org/metric_history?Service_endpoint=https%3A%2F%2Fce101.cern.ch%3A2119%2F& \\
calculatedFor_VO_name=OPS&ServiceMetric_name=CE-sft-job

Response XML:

<?xml version="1.0"?>

<root xmlns="http://cern.ch/grid-mon/2007/05/mon-exchange-schema/">
  <Service endpoint="https://ce101.cern.ch:2119/" type="CE">
    <metricGroup groupBy="calculatedForVO" value="OPS">
      <ServiceMetric name="CE-sft-job">
        <measurement>
          <timestamp>2007-02-26T10:00:00Z</timestamp>
          <status>ok</status>
        </measurement>
        <measurement>
          <timestamp>2007-02-26T11:00:00Z</timestamp>
          <status>error</status>
        </measurement>
        <measurement>
          <timestamp>2007-02-26T12:00:00Z</timestamp>
          <status>ok</status>
        </measurement>
      </ServiceMetric>
    </metricGroup>
  </Service>
</root>
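
Because every history entry is a measurement element with a timestamp and a status, a client can derive status transitions from the response. The Python sketch below does this for an abbreviated copy of the response above; read_history and status_changes are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"mon": "http://cern.ch/grid-mon/2007/05/mon-exchange-schema/"}

SAMPLE = """<?xml version="1.0"?>
<root xmlns="http://cern.ch/grid-mon/2007/05/mon-exchange-schema/">
  <Service endpoint="https://ce101.cern.ch:2119/" type="CE">
    <metricGroup groupBy="calculatedForVO" value="OPS">
      <ServiceMetric name="CE-sft-job">
        <measurement>
          <timestamp>2007-02-26T10:00:00Z</timestamp>
          <status>ok</status>
        </measurement>
        <measurement>
          <timestamp>2007-02-26T11:00:00Z</timestamp>
          <status>error</status>
        </measurement>
        <measurement>
          <timestamp>2007-02-26T12:00:00Z</timestamp>
          <status>ok</status>
        </measurement>
      </ServiceMetric>
    </metricGroup>
  </Service>
</root>"""


def read_history(xml_text):
    """Return (timestamp, status) pairs sorted by time of measurement."""
    root = ET.fromstring(xml_text)
    history = []
    for measurement in root.iterfind(".//mon:measurement", NS):
        text = measurement.findtext("mon:timestamp", namespaces=NS)
        stamp = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
        stamp = stamp.replace(tzinfo=timezone.utc)
        status = measurement.findtext("mon:status", namespaces=NS)
        history.append((stamp, status))
    history.sort(key=lambda entry: entry[0])
    return history


def status_changes(history):
    """Return (timestamp, old status, new status) for each transition."""
    changes = []
    for (_, previous), (stamp, current) in zip(history, history[1:]):
        if current != previous:
            changes.append((stamp, previous, current))
    return changes


history = read_history(SAMPLE)
changes = status_changes(history)
```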

History of selected test for a host

Request URL:
http://server.org/metric_history?Host_name=bdii001.cern.ch&HostMetric_name=host-cpu-load

Response XML:

<?xml version="1.0"?>

<root xmlns="http://cern.ch/grid-mon/2007/05/mon-exchange-schema/">
  <Host name="bdii001.cern.ch">
    <HostMetric name="host-cpu-load">
      <measurement>
        <timestamp>2007-02-26T10:00:00Z</timestamp>
        <summary>0.1</summary>
        <status>ok</status>
      </measurement>
      <measurement>
        <timestamp>2007-02-26T11:00:00Z</timestamp>
        <summary>10.4</summary>
        <status>warning</status>
      </measurement>
      <measurement>
        <timestamp>2007-02-26T12:00:00Z</timestamp>
        <summary>0.8</summary>
        <status>ok</status>
      </measurement>
    </HostMetric>
  </Host>
</root>

TODO

  • the whole Response format section; the current examples are a good starting point
  • semantics of the data model: how it maps to GOC DB, Glue Schema, SAM and the authoritative sources of values
  • naming convention of metric names, suggestion type:name format, examples:
    • I decided not to use the prefixes and leave it to be service (repository) specific. In future if we are observing problems with metric names clashing we can introduce namespacing with the default namespace (no namespace prefix) being of local scope (referring to this monitoring tool) [-- PiotrNyczyk - 04 Apr 2007]
  • decide on capitalization of identifiers, suggestion: ClassName, attributeName, roleName

Comments

Please feel free to add any comments below:
  • -- Emir Imamagic - 17 May 2007
    • I'm aware that the semantics of data is still in the TODO list, but I would like to add a comment regarding the attribute type in class Service. I see in the examples that for type values you suggest using nodetype values (e.g. CE, SE, MON). I think that nodetype isn't the best source of values for service type because it doesn't give a clear description of what the service really is (e.g. the service on port 2119 is either a standard or a gLiteCE-flavour Gatekeeper). Also, if you have multiple nodetypes deployed on a single host, a single service can be associated with multiple nodetypes (e.g. the MDS service is deployed on CE, SE and MON). My suggestion would be to use real service names (e.g. GridFTP, MDS, ...) or GlueServiceType values (where applicable).
    • If I understood correctly, nodetype values are associated with hosts. Shouldn't the class Host include a value (or an additional class) which defines which nodetype that host is? Or do you plan to exclude it from the standard completely, because it doesn't have to be used in general (e.g. some grid implementations might not use it at all)?
  • -- IanNeilson - 06 Mar 2007
    • What is the string encoding used? Should this be defined?
      • IMHO we don't have to define that, as there are standards for both URIs and XML. In URIs it is clear that everything outside the accepted set of characters should be sent as %hex (for example %20 for space). In XML you have a default character set or the encoding attribute at the top (<?xml ...) [-- PiotrNyczyk - 04 Apr 2007]
    • Why is it necessary to "recommend" alternatives to XML-RPC? (just asking)
      • No, and that's why I removed all references to XML-RPC, SOAP, etc. [-- PiotrNyczyk - 04 Apr 2007]
  • -- IanNeilson - 27 Mar 2007
    • A 'pseudo'-metric is needed in the data model to account for the derived (from critical tests) overall status of the service.
      • I think we shouldn't introduce a new class for that as it doesn't change anything in attributes or relations to other classes. This is just a matter of semantics: one metric uses values measured on a "real" system and another uses the results coming from other metrics. [-- PiotrNyczyk - 04 Apr 2007]
    • What is the timestamp of the above metric? (now, of latest metric, of earliest metric ..... )
      • I would say the timestamp should be always "time of measurement" and the details can be only defined for a particular metric individually. [-- PiotrNyczyk - 04 Apr 2007]

Change Log

  • -- PiotrNyczyk - 04 Apr 2007
    • answered the comments.
    • modified a bit the XML format according to Paul Millar's comments (by mail)
    • removed prefixes for metric names
  • -- PiotrNyczyk - 12 Apr 2007 - modifications after exchange of ideas with Paul Millar
    • changed the namespace in XML to a working one
    • changed the structure of history XML: additional element "historyEntry"
  • -- PiotrNyczyk - 16 May 2007 - modifications after phone conference (Piotr, James, Paul)
    • changed data model: Service identified by an endpoint
    • changed the namespace in XML to 2007/05
    • changed the structure of XMLs:
      • replaced element "historyEntry" with "measurement"
      • "measurement" obligatory even for current status query
      • added example of SiteMetric in current status response
      • example of HostMetric response

Topic attachments

  • data-model-2007-05-16.png (r1, 23.7 K, 2007-05-16 15:20, PiotrNyczyk)
  • data-model.png (r4, 23.9 K, 2007-04-04 16:44, PiotrNyczyk)
Topic revision: r13 - 2007-05-23 - EmirImamagic
 