Gridview Service at CERN

This page documents the deployment of the Gridview service at CERN.

Gridview Related Links:

  1. Gridview Twiki Page
  2. Gridview Service Dash Board Notes
  3. Gridview Deployment at CERN
  4. Gridview Admin Guide
  5. Quattor Installation of Gridview
  6. Gridview Software Release Status
  7. Installation and Configuration of Gridview Publisher
  8. Installation and Configuration of Gridview Web Service Clients
  9. Using Gridview XML Interface
  10. URL for Excel Report generation
  11. Downloads
  12. New Gridview Interface
  13. Gridview Monthly Availabilities and Reports

Gridview Components

The following diagram shows all the different modules of the Gridview application and their relation to each other.

[Figure: gridview_service.jpg (Gridview modules and their relationships)]

Each of the above modules is described here:

SAM/Gridview Data Repository:

This repository contains all raw data collected by the SAM and Gridview archivers for the Service Availability, Data Transfer and Job Status monitoring modules. It also contains the summary data computed by the Gridview summarizers. The repository is hosted on Oracle 10g on RAC.

Archivers:

Archivers are programs that receive raw monitoring data published from sites all over the grid and store it in the Data Repository. Currently Gridview archives 4 kinds of data:
  • Gridftp logs: Data transfer logs collected from Gridftp servers in LCG/EGEE.
  • Job status logs: Logs collected from Resource Brokers.
  • SAM test results: Results of the SAM tests launched by the SAM framework on LCG/EGEE sites.
  • FTS logs: File transfer logs collected from FTS servers.
These data are collected in the following ways:
  • R-GMA: Gridftp and job status logs from non-CERN sites are transported by means of R-GMA and archived by Gridview's R-GMA based archivers.
  • Web services: Gridftp and job status logs from CERN, and all FTS logs, are transported by means of Gridview's web services.
  • SAM test results: These data are archived by the SAM service's web service archivers and written to the SAM/Gridview repository.
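
The collection paths above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name `transport_for` and the string labels are assumptions made for this example and are not part of the Gridview software.

```python
def transport_for(log_kind, from_cern):
    """Return the transport that carries a given kind of data to the repository.

    log_kind  -- one of "gridftp", "job-status", "fts", "sam" (hypothetical labels)
    from_cern -- True if the data originates at CERN
    """
    if log_kind == "sam":
        # SAM test results are archived by the SAM service's web service archivers.
        return "sam-web-service"
    if log_kind == "fts":
        # FTS logs travel over Gridview's web services.
        return "gridview-web-service"
    if log_kind in ("gridftp", "job-status"):
        # Logs from CERN use web services; logs from other sites use R-GMA.
        return "gridview-web-service" if from_cern else "r-gma"
    raise ValueError("unknown log kind: %r" % (log_kind,))
```

For example, `transport_for("gridftp", from_cern=False)` yields the R-GMA path, while the same log kind from CERN yields the web service path.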

Summarizers:

Summarizers are programs that read raw monitoring data from the repository, analyze it and compute summaries. Gridview has separate summarization programs for data transfer, job status, SAM and FTS summaries.

Synchronizers:

To compute summaries from raw data, Gridview needs various pieces of information that are held in databases across LCG/EGEE. For better performance, Gridview uses synchronizer modules that copy this information from the remote databases into the data repository and keep the two copies synchronized. There are currently two synchronizers running:
  • GOCDB Synchronizer: Gridview reads the GOCDB tables containing site and node information and scheduled shutdown information.
  • CIC Portal Synchronizer: Information about VOs is read from the CIC database.
At present, both the GOCDB and CIC databases are Oracle based, and direct read-only access is provided to the Gridview servers.

Frontend:

The frontend is the interface through which users access Gridview. It is provided by the Gridview presentation module, which consists of a web interface and programs that generate the wide variety of Gridview graphs.

Gridview module inter-dependencies:

Gridview has been developed as a set of independent modules with the flexibility of deploying them in various ways. Though it is possible to install all Gridview modules on a single server, it is preferred to split them across multiple servers in order to increase efficiency and reduce response time. The module inter-dependencies are as listed below:

  • The gridview-common module has to be installed on every Gridview server.
  • All frontend modules (data transfer, job status, FTS and service availability) have to be installed on the same server.
  • Each archiver can be installed on a different machine if required.
  • The gridview-arch-common module has to be installed on each archiver machine.
  • Each summarizer can be installed on a different machine if required.
  • Each synchronizer can be installed on a different machine if required.

It is the responsibility of the system admin installing the modules to ensure that multiple instances of the same module do not run on different servers. Exceptions are: gridview-common and gridview-arch-common modules.
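
The single-instance rule can be checked mechanically before a deployment. The Python sketch below is one possible way to do so, not an existing Gridview tool: the function name, the server names and every module name other than gridview-common and gridview-arch-common are hypothetical.

```python
from collections import defaultdict

# Modules that may legitimately be installed on several servers.
SHARED_MODULES = {"gridview-common", "gridview-arch-common"}


def find_duplicate_modules(layout):
    """Map each non-shared module deployed more than once to its servers.

    layout -- dict mapping a server name to the set of modules installed on it
    """
    hosts = defaultdict(list)
    for server, modules in layout.items():
        for module in modules:
            if module not in SHARED_MODULES:
                hosts[module].append(server)
    # Keep only the modules that appear on more than one server.
    return {m: sorted(s) for m, s in hosts.items() if len(s) > 1}


# Hypothetical layout: the same summarizer is installed on two servers.
layout = {
    "server-a": {"gridview-common", "example-summarizer"},
    "server-b": {"gridview-common", "example-summarizer"},
}
print(find_duplicate_modules(layout))
```

An empty result means that no module other than the two shared ones is installed on more than one server.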

Further note that:

  • All Gridview modules installed on a server share a common directory tree under /opt/gridview.
  • The configuration files are always stored in the /opt/gridview/etc directory.
  • The logs are always written to the /var/log/gridview directory. The only exception is the frontend modules, which run under the Apache web server and therefore log to the Apache log files in the /var/log/httpd directory.

Gridview dependency on middleware components

Gridview depends on the functioning of various other software components as listed below:

| Module                | Dependency                     |
| Common                | Oracle instant client 10.0.023 |
| Frontend              | Apache, PHP 4.3.9              |
| R-GMA Archivers       | Java 1.5, R-GMA consumer       |
| Web Service Archivers | Java 1.5, Tomcat 5.5           |
| Summarizers           | Cron, PHP 4.3.9                |
| Synchronizers         | Cron, PHP 4.3.9                |

Disk Space Requirement:

All Gridview monitoring data is stored in the data repository (Oracle database), so the local disks of the machines are used lightly. The only data stored on the local disks are the log files, and these are rotated automatically with logrotate (retention is 7 or 90 days depending on the module; see the table of disk space requirements below).

Partitioning:

All Gridview code resides in the /opt directory, so a separate partition mounted at /opt is required. The preferred size of this partition is 2 GB or more. A sufficiently large /var partition (8 GB or more) is also required to store logs.
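
These sizes can be verified on a candidate server with a short script. The Python sketch below is an illustration under stated assumptions: the helper names are invented for this example, and "GB" is interpreted as GiB (1024^3 bytes). It uses the standard library call `shutil.disk_usage`, which reports the filesystem containing a path, so it does not by itself confirm that /opt is a separate partition.

```python
import shutil

GIB = 1024 ** 3

# Minimum partition sizes stated above, interpreted as GiB (an assumption).
MIN_PARTITION_GIB = {"/opt": 2, "/var": 8}


def is_large_enough(total_bytes, minimum_gib):
    """Return True if a filesystem of total_bytes meets the minimum size."""
    return total_bytes >= minimum_gib * GIB


def undersized_partitions(requirements=MIN_PARTITION_GIB):
    """Return the mount points whose filesystem is smaller than required."""
    return [
        path
        for path, minimum in requirements.items()
        if not is_large_enough(shutil.disk_usage(path).total, minimum)
    ]
```

Calling `undersized_partitions()` on a server returns an empty list when both filesystems meet the stated minimums.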

Disk space requirements for different modules:

The following table lists the disk space requirements for the different modules of Gridview.

| Module        | Local disk requirement | Type of information stored on the disk | Remarks                                      |
| Frontend      | 8 GB on /var           | Apache web server logs                 | Logs kept for 90 days with default logrotate |
| Archivers     | 8 GB on /var           | Archiver logs                          | Logs kept for 90 days with logrotate         |
| Summarizers   | 8 GB on /var           | Summarizer logs                        | Logs kept for 7 days with logrotate          |
| Synchronizers | 8 GB on /var           | Synchronizer logs                      | Logs kept for 7 days with logrotate          |

Oracle database requirements:

All Gridview data resides in the central data repository housed in Oracle 10g RAC.

Connectivity requirements:

The following table shows the TCP ports that need to be open for Gridview to run. It lists only the ports for which access to or from outside CERN is needed.

| Module        | Incoming port no(s)  | Outgoing port no(s) |
| Frontend      | 80 (HTTP), 443 (SSL) | None                |
| Archivers     | 8080 (Tomcat)        | None                |
| Summarizers   | None                 | None                |
| Synchronizers | None                 | 1521 (Oracle)       |
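
When several modules share a server, the ports to open are the union of the rows for those modules. The Python sketch below encodes the table as data and computes that union. The dictionary and function names are assumptions made for this example.

```python
# External ports per module, taken from the connectivity table.
INCOMING = {
    "frontend": {80, 443},
    "archivers": {8080},
    "summarizers": set(),
    "synchronizers": set(),
}
OUTGOING = {
    "frontend": set(),
    "archivers": set(),
    "summarizers": set(),
    "synchronizers": {1521},
}


def ports_for(modules):
    """Return (incoming, outgoing) sorted port lists for a server's modules."""
    incoming, outgoing = set(), set()
    for module in modules:
        incoming |= INCOMING[module]
        outgoing |= OUTGOING[module]
    return sorted(incoming), sorted(outgoing)


# A server hosting both summarizers and synchronizers.
print(ports_for(["summarizers", "synchronizers"]))
```

The output can serve as a starting point for a firewall request. It covers only the external access listed in the table, not traffic inside CERN.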

Backup Objects:

Database Backup:

The Gridview data repository resides in an Oracle database on Oracle 10g RAC. Backup procedures are handled by PSS.

Disk Backups

Disk-level backups are needed to preserve the logs of the different modules. The logs are stored in the directories mentioned above in the section on disk space requirements.

Current Gridview deployment scenario:

Gridview presently runs on 3 mid-range servers. Each server performs different duties and runs different Gridview modules, so each has a different configuration, as detailed below.

| Machine | Alias (if any) | Gridview component(s) running | Dependencies                |
| grvw001 | gridview       | Frontend                      | Apache, PHP, Oracle         |
| grvw002 | gvarch         | Archivers                     | Java, Tomcat, R-GMA, Oracle |
| grvw004 | -              | Summarizers and Synchronizers | PHP, Oracle, cron           |

Server capabilities needed:

Although all Gridview modules can be installed on a single system, we prefer to spread them across multiple systems to improve response time and limit the effect of a single system failing. Each mid-range server is a dual-processor Xeon (2.4, 3 GHz) server with 2 GB of memory and a 160 GB hard disk. It is recommended to install the Gridview services on at least 4 systems, as detailed below:

  1. Frontend: This is the user interface module, where good response time to the user is needed. The preferred way of deploying the frontend is on two machines that share load by means of DNS lookups.
  2. Archivers: All archiver modules run here. These modules run continuously and archive a large number (hundreds of thousands) of monitoring records per day.
  3. Summarizers/Synchronizers: These services can share a single system because they run as a set of cron jobs at intervals of 30 minutes to an hour.

Topic revision: r13 - 2008-05-19 - KislayBhatt
 