GGUS <-> Nagios interactions
Proposed design (draft)
Progress on the Nagios side
Current tasks:
- Provide an addressing mechanism to identify the unique destination service/host check (= Nagios_service_URI)
- Implement a basic msg-to-queue -> "service comment update" mechanism
- Investigate the mapping between GGUS vs Nagios fields
GGUS -> Nagios (1)
- A message is sent to the topic _grid.ticket.(name of the GGUS instance).notification
- A specific msg-to-queue handler listens on this topic, instantiates the appropriate objects, and creates new Nagios services or modifies existing ones
- The content of the message itself should contain at least the Nagios_service_URI (= Nagios hostname + affected host + affected service)
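A minimal sketch of how the Nagios_service_URI could be composed and parsed. The "/" separator, the percent-encoding and the class name are assumptions made for illustration; the design only states that the URI combines the Nagios hostname, the affected host and the affected service.

```python
from typing import NamedTuple
from urllib.parse import quote, unquote


class NagiosServiceURI(NamedTuple):
    """Hypothetical container for the three parts named in the design."""
    nagios_host: str  # the Nagios server that runs the check
    host: str         # the affected (monitored) host
    service: str      # the affected service description

    def encode(self) -> str:
        # Percent-encode each part so that a "/" inside a service name
        # cannot be mistaken for the separator (assumed format).
        return "/".join(quote(part, safe="") for part in self)

    @classmethod
    def decode(cls, text: str) -> "NagiosServiceURI":
        parts = text.split("/")
        if len(parts) != 3:
            raise ValueError(f"expected 3 parts, got {len(parts)}: {text!r}")
        return cls(*(unquote(part) for part in parts))
```

Encoding each part separately keeps the round trip lossless even when a service description contains the separator character.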
GGUS -> Nagios (reply-to) (2)
- A message is sent to the queue _grid.ticket.ack.(hash(nagios hostname))
- A specific msg-to-queue handler listens on this queue and updates the "Service comment" field of the affected service. The content of the comment is up to the GGUS team. It may be an HTML-formatted message containing the GGUS ticket URL, a specific error message, etc. (See screenshot)
- The content of the message itself should contain at least the Nagios_service_URI (= Nagios hostname + affected host + affected service) and a return message (URL, error code, etc.)
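The reply-to step above could be sketched as follows. The hash function is not specified in the design, so SHA-1 is an assumption here, as are the function names and the "return_message" key.

```python
import hashlib
from typing import Dict


def ack_queue_name(nagios_hostname: str) -> str:
    # SHA-1 is an assumption; the design only says "hash(nagios hostname)".
    # Hostnames are case-insensitive, so normalise before hashing.
    normalised = nagios_hostname.strip().lower()
    digest = hashlib.sha1(normalised.encode("utf-8")).hexdigest()
    return f"_grid.ticket.ack.{digest}"


def build_ack(nagios_hostname: str, nagios_service_uri: str,
              return_message: str) -> Dict[str, object]:
    """Assemble the acknowledgement sent back to one Nagios instance."""
    return {
        "destination": ack_queue_name(nagios_hostname),
        "body": {
            "Nagios_service_URI": nagios_service_uri,
            # URL, error code, etc. The key name is hypothetical.
            "return_message": return_message,
        },
    }
```

Whatever hash is finally chosen, both sides have to normalise the hostname in the same way, otherwise the sender and the listener end up on different queues.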
Nagios -> GGUS (3)
- A notification handler is developed to send a message to the topic _grid.ticket.(name of the GGUS instance).notification. This notification handler is configured via NCG.
- The content of the message itself should contain at least the Nagios_service_URI (= Nagios hostname + affected host + affected service) and a reply-to field
- Some intelligence is needed to check the service comments before sending a message (e.g. has a GGUS ticket already been opened?)
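One possible shape for the Nagios -> GGUS handler logic is sketched below. It assumes that the acknowledgement handler records the ticket as "ticket_id=<digits>" in the service comment. That marker is hypothetical, since the comment format is left to the GGUS team, and the function names are illustrative only.

```python
import re
from typing import Dict, Iterable, Optional

# Hypothetical marker; the real comment format is up to the GGUS team.
TICKET_MARKER = re.compile(r"ticket_id=(\d+)")


def open_ticket_id(comments: Iterable[str]) -> Optional[str]:
    """Return the first GGUS ticket ID found in the service comments."""
    for comment in comments:
        match = TICKET_MARKER.search(comment)
        if match:
            return match.group(1)
    return None


def build_notification(ggus_instance: str, service_uri: str, reply_to: str,
                       comments: Iterable[str]) -> Optional[Dict[str, object]]:
    """Build the outgoing message, or None if a ticket is already recorded."""
    if open_ticket_id(comments) is not None:
        return None  # a ticket exists, so do not open a second one
    return {
        "destination": f"_grid.ticket.{ggus_instance}.notification",
        "body": {
            "Nagios_service_URI": service_uri,
            "reply-to": reply_to,
        },
    }
```

Returning None instead of raising lets the notification handler treat "ticket already open" as a normal outcome rather than an error.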
Others
- Nagios could also send a reply-to to messages marked as (1) on the diagram.
- The actual mapping between GGUS and Nagios fields is still not clear.
- Message signature is to be discussed. The "service comments" being remotely modified should not be a backdoor to add arbitrary HTML (or worse) tags on the Nagios server. (public keys could be exchanged via the MSG?)
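Independently of how signing is settled, the receiving handler could refuse to store raw remote HTML. The sketch below escapes the incoming text and only builds a link itself when the URL uses http or https. It is a complementary precaution under these assumptions, not a replacement for message signatures, and the function name is hypothetical.

```python
import html
from urllib.parse import urlsplit


def render_comment(text: str, ticket_url: str = "") -> str:
    """Turn untrusted message content into a safe service comment."""
    # Escape everything, so "<script>" arrives as inert text.
    safe_text = html.escape(text, quote=True)
    if not ticket_url:
        return safe_text
    # Only plain web links are accepted; "javascript:" and similar
    # schemes are rejected outright.
    if urlsplit(ticket_url).scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme in {ticket_url!r}")
    safe_url = html.escape(ticket_url, quote=True)
    return f'{safe_text} <a href="{safe_url}">{safe_url}</a>'
```

With this approach the only markup that reaches the Nagios page is the anchor tag generated locally, so the remote side controls the text and the link target but never the tags.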
Progress on the GGUS side
Purpose
Up to now, the GGUS system has offered various kinds of interfaces to remote systems, and their number has grown over the last years. To avoid a further increase in interface types, GGUS wants to offer one generic, standard interface in the future. This interface will be Grid Messaging. It will be used only for communication between systems/servers.
Grid Messaging Clients
Grid Messaging requires two different clients: a publisher and a listener. The publisher creates messages and pushes them to a topic or queue on the server. The listener receives messages from the server. GGUS implemented its clients in PHP so that they integrate easily into the GGUS system, as nearly all of its components are written in PHP. CERN is using Python, but other programming languages are also possible.
Workflows
A. Creating a new ticket in GGUS (Remote -> GGUS)
To create a new ticket in the GGUS system, send a create message to topic /topic/Grid.Ticketing.GGUS.Create. In the header of the create message, use the “reply-to” option to specify the topic to which the GGUS ticket ID should be returned. The GGUS system returns the ID by posting a message to the specified topic. For the possible values of the fields, please see item 35407 of this document.
Mandatory fields for creating a new GGUS ticket are “Short Description” and “Last Modifier”.
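As an illustration, a publisher could validate the mandatory fields before sending anything. The sketch below uses the field names exactly as written above; the key spelling actually expected by GGUS is defined in the templates and may differ. The function name and the header layout are assumptions.

```python
from typing import Dict, Tuple

CREATE_TOPIC = "/topic/Grid.Ticketing.GGUS.Create"
# Names as given in this page; the templates may spell the keys differently.
MANDATORY_FIELDS = ("Short Description", "Last Modifier")


def build_create_message(fields: Dict[str, str],
                         reply_to: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, body) for a create message, or raise ValueError."""
    missing = [name for name in MANDATORY_FIELDS
               if not str(fields.get(name, "")).strip()]
    if missing:
        raise ValueError("missing mandatory fields: " + ", ".join(missing))
    headers = {
        "destination": CREATE_TOPIC,
        # Topic on which GGUS posts the new ticket ID.
        "reply-to": reply_to,
    }
    return headers, dict(fields)
```

Rejecting incomplete tickets on the publisher side avoids a round trip through the broker for a message that GGUS could not accept anyway.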
A.a. Creating a new ticket relying on TPM
To create a new ticket when the appropriate support unit is not known, or when the ticket should be handled by the TPM, do not specify any value in the fields
- Responsible_Unit,
- Notify_Site and
- Status
This leads to tickets created with default values for these fields. All such tickets are handled by the TPM.
A.b. Creating a new ticket and assigning it to a support unit directly
To assign a new ticket to a support unit directly, specify the support unit in field 'Responsible_Unit' or the site in field 'Notify_Site'. If a site name is specified in field 'Notify_Site', the related ROC is set in field 'Responsible_Unit' automatically. This workflow is described in a twiki page at
https://twiki.cern.ch/twiki/bin/view/EGEE/SA1_USAG#Direct_routing_to_sites_faq.
Information about the support units implemented in GGUS is available at the GGUS Responsible Unit Info.
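The two routing variants A.a and A.b can be summarised in a small helper. This is a sketch with a hypothetical function name. Refusing a support unit and a site at the same time is a cautious choice made here, not something this page mandates.

```python
from typing import Dict, Optional


def routing_fields(support_unit: Optional[str] = None,
                   site: Optional[str] = None) -> Dict[str, str]:
    """Return the routing-related fields to add to a create message."""
    if support_unit and site:
        raise ValueError("give either a support unit or a site, not both")
    if support_unit:
        return {"Responsible_Unit": support_unit}
    if site:
        # GGUS sets Responsible_Unit to the related ROC automatically.
        return {"Notify_Site": site}
    # TPM case: Responsible_Unit, Notify_Site and Status are left unset,
    # so GGUS applies its default values.
    return {}
```

The returned dictionary can be merged into the other ticket fields before the message is published.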
B. Updating an existing ticket in GGUS (Remote -> GGUS)
Ticket updates are sent to topic /topic/Grid.Ticketing.GGUS.Update. They must contain the GGUS ticket ID as a mandatory field. For the possible values of the fields, please see item 35407 of this document.
C. Creating a new ticket in a remote system (GGUS -> Remote)
GGUS sends create messages to topic /topic/Grid.Ticketing.Create to create new tickets in remote systems. In the header of the create message, GGUS uses the “reply-to” option to specify the topic to which the ticket ID should be returned. Usually the topic for returning the ticket ID is /topic/Grid.Ticketing.GGUS.Reply.
The templates used are linked from item 35408 of this document.
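On the remote side, workflow C might be handled roughly as follows. The reply key "Remote_Ticket_ID", the callback signature and the fallback to the usual reply topic when no “reply-to” header is present are all assumptions; the real reply layout is defined by the templates.

```python
from typing import Callable, Dict, Tuple

# The usual reply topic named in this page, used here only as a fallback.
DEFAULT_REPLY_TOPIC = "/topic/Grid.Ticketing.GGUS.Reply"


def handle_create(headers: Dict[str, str], fields: Dict[str, str],
                  create_ticket: Callable[[Dict[str, str]], str]
                  ) -> Tuple[str, Dict[str, str]]:
    """Create a local ticket and return (destination, reply body)."""
    # create_ticket stands in for the remote system's own ticket API.
    remote_id = create_ticket(fields)
    destination = headers.get("reply-to") or DEFAULT_REPLY_TOPIC
    # Hypothetical key name for the ID handed back to GGUS.
    return destination, {"Remote_Ticket_ID": remote_id}
```

Passing the ticket-creation function in as a parameter keeps the messaging logic independent of any particular remote ticketing system.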
D. Updating an existing ticket in a remote system (GGUS -> remote)
Updates on existing tickets in remote systems are pushed to topic /topic/Grid.Ticketing.Update.
The templates used are linked from item 35408 of this document.
Message Formats
It is agreed that all messages should consist of key-value pairs. The easiest way to reach this goal is to use templates. The templates shown below are drafts and may change.
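To make the key-value idea concrete, the sketch below serialises and parses a message body. The "key: value" line format is purely an assumption for illustration, since the real layout is given by the templates. Lines starting with '//' are skipped as comments.

```python
from typing import Dict


def dump_pairs(fields: Dict[str, str]) -> str:
    """Serialise fields as one 'key: value' line each (assumed format)."""
    lines = []
    for key, value in fields.items():
        if ":" in key or "\n" in key or "\n" in value:
            raise ValueError(f"cannot serialise field {key!r}")
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


def parse_pairs(body: str) -> Dict[str, str]:
    """Parse a body written by dump_pairs, ignoring blanks and comments."""
    fields: Dict[str, str] = {}
    for raw in body.splitlines():
        line = raw.strip()
        # Only whole-line comments are skipped, so "https://" in a value
        # is left untouched.
        if not line or line.startswith("//"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key-value line: {raw!r}")
        fields[key.strip()] = value.strip()
    return fields
```

Splitting on the first colon only means that values such as URLs survive the round trip unchanged.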
Templates
The templates for the messages are described in document 3541_Templates_for_GridMessaging.pdf, available at
https://gus.fzk.de/pages/ggus-docs/interfaces/docu_ggus_interfaces.php
Meanings of the various fields in GGUS
The meanings of the various fields are described in document 3541_Templates_for_GridMessaging.pdf, available at
https://gus.fzk.de/pages/ggus-docs/interfaces/docu_ggus_interfaces.php
All comments are marked with '//'.
Possible values for the various fields in GGUS system
Possible values of the GGUS drop-down list fields are described in document 3542_Dropdown_Listvalues.pdf, available at
https://gus.fzk.de/pages/ggus-docs/interfaces/docu_ggus_interfaces.php
Topics to be used
There are different topics in use depending on the workflows. For the communication from a remote system to the GGUS system (Remote -> GGUS) the topics are:
- /topic/Grid.Ticketing.GGUS.Create for creating new GGUS tickets
- /topic/Grid.Ticketing.GGUS.Update for updating existing tickets in GGUS
- /topic/Grid.Ticketing.GGUS.Reply for returning IDs of remote systems to GGUS
For the communication from GGUS to remote systems (GGUS -> Remote) the topics are:
- /topic/Grid.Ticketing.Create for creating new tickets in remote systems
- /topic/Grid.Ticketing.Update for updating existing tickets in remote systems
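A client could keep the five topics in a single lookup table so that the direction and the action select the destination. The direction labels and the function name below are hypothetical; the topic names are the ones listed above.

```python
from typing import Dict, Tuple

TOPICS: Dict[Tuple[str, str], str] = {
    # Remote -> GGUS
    ("to_ggus", "create"): "/topic/Grid.Ticketing.GGUS.Create",
    ("to_ggus", "update"): "/topic/Grid.Ticketing.GGUS.Update",
    ("to_ggus", "reply"): "/topic/Grid.Ticketing.GGUS.Reply",
    # GGUS -> Remote
    ("to_remote", "create"): "/topic/Grid.Ticketing.Create",
    ("to_remote", "update"): "/topic/Grid.Ticketing.Update",
}


def topic_for(direction: str, action: str) -> str:
    """Look up the topic for a workflow, rejecting unknown combinations."""
    try:
        return TOPICS[(direction, action)]
    except KeyError:
        raise ValueError(
            f"no topic defined for {direction!r}/{action!r}") from None
```

There is no reply topic in the GGUS -> Remote direction in the list above, so that combination is reported as an error instead of being guessed.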
Open Issues
* a. Signing messages with a valid grid certificate: this still has to be implemented on the GGUS side and also by the remote system.
* b. Authorization of publishers and consumers: how could the system avoid access of people that are not part of the project and thus not authorized (-> privacy issue)?
* c. Handling of attachments (-> virus protection!).
* d. Providing an ActiveMQ server at FZK in production mode (as far as I know this was already discussed). This is a decision at the political level.
--
RomainWartel - 07 Jul 2009