Topology information integration - Proposal 1


This proposal is based on:
  • the observation that topology information sources are numerous and distributed,
  • current experience with designing and maintaining the SAM/GridView database model that contains the grid topology information,
  • brief research into existing technologies for integrating distributed, multi-domain information systems.

The suggested solution is to use a Semantic Web approach, or similar technologies, to build an integration and data exchange platform for all grid monitoring and operations management tools that need topology information. This contrasts with the existing approach in the SAM/GridView system, which uses a number of protocols and information access methods (HTTP/XML, direct Oracle connections, flat text files, etc.) to build a single, monolithic topology model of the grid.

Consequently, the basic guidelines for the new approach are the following:

  • define a core vocabulary (namespace or ontology) for concepts common to most grid tools, such as Service, VO, etc.
  • define namespaced vocabularies for the individual sources of topology information: BDII (Glue), GOCDB, VO-specific sources, etc.
  • expose the information provided by the topology data sources as RDF
  • use the messaging system (MSG) to publish and subscribe to instantaneous topology changes
  • use local caching wherever possible (local RDF stores or an equivalent in the monitoring tools)
  • use the core vocabulary and, in the future, ontology specifications (OWL, reasoning) to 'glue' together information coming from various sources
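
As an illustration of the first three guidelines, the following minimal sketch describes one service with terms from a core vocabulary and from a source-specific vocabulary, and serializes the result as N-Triples. The namespace URIs, the property names (hostname, supportsVO) and the example service are assumptions made for illustration only; the proposal does not define them yet.

```python
# Hypothetical namespace URIs; the proposal does not define them.
CORE = "http://example.org/grid/core#"
GOCDB = "http://example.org/grid/gocdb#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(term):
    """Wrap an absolute URI in N-Triples angle brackets."""
    return "<" + term + ">"


def literal(value):
    """Encode a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def service_triples(service_uri, hostname, vo_uri):
    """Describe one service using core and source-specific terms."""
    return [
        # Core vocabulary: the concept shared by most grid tools.
        (uri(service_uri), uri(RDF_TYPE), uri(CORE + "Service")),
        # Source-specific vocabulary: a property as one source might publish it.
        (uri(service_uri), uri(GOCDB + "hostname"), literal(hostname)),
        # Core vocabulary again: link from the service to a VO.
        (uri(service_uri), uri(CORE + "supportsVO"), uri(vo_uri)),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = service_triples(
    "http://example.org/grid/service/ce01",
    "ce01.example.org",
    "http://example.org/grid/vo/atlas",
)
print(to_ntriples(triples))
```

A real deployment would more likely rely on an existing RDF library and store; the point here is only that a shared core namespace lets tools recognise a Service regardless of which source described it.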

Information representation and annotation

The topology information can be easily represented as RDF triples. However, because the validity lifetime, the level of authoritativeness, and other factors differ with the source and type of information, special care has to be taken to provide additional annotation or meta-data. This meta-data should contain at least the following information:
  • original source of the information - who produced the information (used to identify authoritativeness)
  • assertion time - when the information was actually produced (freshness)
  • declared validity time - until when the producer declares the information to be valid
  • imposed validity time - how long after the assertion time information from a given source and of a given type should be considered valid, according to an ontology-level policy and regardless of the declared validity; this type of meta-data can be defined at the ontology level as an inferable rule
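
The interplay of the four meta-data items can be sketched as a small validity check. This is only one possible reading of the list above: the class name, the policy table with its lifetimes, and the choice to treat information without any validity data as invalid are all assumptions, not part of the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    source: str                                      # who produced the information
    asserted_at: datetime                            # when it was produced
    declared_valid_until: Optional[datetime] = None  # the producer's own claim


# Hypothetical policy: imposed lifetime per (source, information type).
IMPOSED_LIFETIME = {
    ("BDII", "service-status"): timedelta(minutes=10),
    ("GOCDB", "site-membership"): timedelta(days=1),
}


def is_valid(annotation, info_type, now):
    """Return True if the annotated information is still usable at `now`."""
    imposed = IMPOSED_LIFETIME.get((annotation.source, info_type))
    if imposed is not None:
        # The imposed validity applies regardless of the declared validity.
        return now <= annotation.asserted_at + imposed
    if annotation.declared_valid_until is not None:
        return now <= annotation.declared_valid_until
    # Assumption: without any validity information the data is not trusted.
    return False


t0 = datetime(2008, 3, 12, 12, 0, tzinfo=timezone.utc)
status = Annotation("BDII", t0, declared_valid_until=t0 + timedelta(hours=1))
# The producer declares one hour, but the policy imposes ten minutes.
print(is_valid(status, "service-status", t0 + timedelta(minutes=30)))
```

In an RDF setting the policy table would correspond to the inferable rule mentioned above, evaluated by a reasoner rather than by application code.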

There are several ways to represent this kind of meta-data in RDF:

  • using RDF reification - quite complex to maintain and query, and can be heavy on storage (triple store bloat)
  • using contexts or sub-graphs - specific to the RDF store implementation
  • using 'fake' annotation - additional properties or annotation objects pointing to the resources
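
To make the trade-offs between these three options more concrete, the sketch below attaches the same two meta-data properties to one statement in each style, using plain tuples instead of a real RDF store. The example.org namespace, the meta#source, meta#assertedAt and meta#annotates properties, and the function names are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/grid/"  # hypothetical namespace

statement = (EX + "service/ce01", EX + "core#supportsVO", EX + "vo/atlas")
metadata = {
    EX + "meta#source": "GOCDB",
    EX + "meta#assertedAt": "2008-03-12T12:00:00Z",
}


def reify(stmt, stmt_id, meta):
    """RDF reification: four triples describe the statement, then the meta-data."""
    s, p, o = stmt
    triples = [
        (stmt_id, RDF + "type", RDF + "Statement"),
        (stmt_id, RDF + "subject", s),
        (stmt_id, RDF + "predicate", p),
        (stmt_id, RDF + "object", o),
    ]
    triples += [(stmt_id, key, value) for key, value in meta.items()]
    return triples


def in_context(stmt, graph_id, meta):
    """Context or sub-graph: the statement becomes a quad, meta-data describes the graph."""
    quads = [stmt + (graph_id,)]
    triples = [(graph_id, key, value) for key, value in meta.items()]
    return quads, triples


def annotate(resource, annotation_id, meta):
    """'Fake' annotation: a separate object that points to the annotated resource."""
    triples = [(annotation_id, EX + "meta#annotates", resource)]
    triples += [(annotation_id, key, value) for key, value in meta.items()]
    return triples


reified = reify(statement, EX + "stmt/1", metadata)
quads, graph_meta = in_context(statement, EX + "graph/gocdb", metadata)
annotated = annotate(statement[0], EX + "annotation/1", metadata)
print(len(reified), len(quads) + len(graph_meta), len(annotated))
```

In this sketch reification needs six triples for a single statement, which illustrates the storage bloat. The context variant needs one quad plus two triples, and its meta-data can be shared by every statement placed in the same graph. The annotation variant needs three triples but describes the resource as a whole rather than one specific statement.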

Core vocabulary

Information transport

Query/response paradigm

Publish/subscribe paradigm

Local information caching

Information integration and equivalence

-- PiotrNyczyk - 12 Mar 2008

Topic revision: r2 - 2008-03-13 - PiotrNyczyk