Procedures and Tools for gLite Nodes

This section describes the procedures and tools adopted for the deployment and maintenance of gLite nodes. These nodes are expected to join the EGEE production infrastructure, where the set of gLite nodes of one D4Science site is seen as one gLite site. As part of the EGEE production infrastructure, gLite nodes must follow the procedures defined for it. At the same time, they can profit from the tools already provided by EGEE. This section therefore summarizes the EGEE procedures and tools that D4Science has adopted for its gLite nodes.

As introduced in the overview section, the operation of the infrastructure is based on five main areas of work. The procedures and tools for the gLite nodes are organised according to the same areas: software installation and upgrade, certification, user and operational support, security, and monitoring and QoS.


Software Installation and Upgrade

Procedures

gLite is composed of several components that provide the different grid services (computing element, storage element, information system, etc.). The most recent release of gLite is version 3.1, which includes almost all gLite components and has been certified on Scientific Linux 4 (SL4). Some gLite components are still at version 3.0, which is certified on Scientific Linux 3 (SL3). All gLite components run on i386; a small subset also supports x86_64.

gLite nodes are expected to run on dedicated machines. However, depending on the specific gLite component installed, a gLite node may co-exist with a gCube node. For example, the gLite Computing Element (CE) or the gLite Storage Element (SE) should each run on a dedicated gLite node, whereas the gLite Worker Node (WN) can be installed together with other gLite components or even on the same machine as a gCube node.

The default installation method on SL4 is the YUM tool. All gLite components have associated YUM meta-packages and can therefore be installed with it. APT can be used as an alternative. The configuration of gLite is performed by a set of shell scripts built within the YAIM framework (YAIM does not support the installation of gLite). Site Managers can use the provided configuration scripts without in-depth knowledge of specific middleware configuration details; they only need to adapt some configuration files, following the provided examples. The result is a default site configuration. Local customizations and tuning of the middleware, if needed, can then be done manually.
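
As a rough illustration of the two-stage flow described above (install with YUM, then configure with YAIM), the following Python sketch only assembles the two command lines and prints them; it does not execute anything. The meta-package name `glite-WN`, the YAIM path `/opt/glite/yaim/bin/yaim`, its `-c -s -n` options, and the `site-info.def` file name are assumptions based on common gLite 3.1 setups and should be checked against the installation guide for the component being deployed.

```python
import shlex

# Assumed default location of the YAIM launcher; verify on the target node.
YAIM_BIN = "/opt/glite/yaim/bin/yaim"


def build_commands(node_type, site_info="site-info.def"):
    """Return the install and configure commands for one gLite node type.

    node_type is assumed to be both the YUM meta-package name and the
    YAIM node type (for example "glite-WN"); this is an illustration only.
    """
    install = ["yum", "install", "-y", node_type]
    configure = [YAIM_BIN, "-c", "-s", site_info, "-n", node_type]
    return [install, configure]


if __name__ == "__main__":
    for cmd in build_commands("glite-WN"):
        # Dry run: print the command instead of executing it.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the commands as argument lists makes it easy to review them before handing them to a real runner, which matters because the site configuration file usually needs manual adaptation first.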

gLite updates are released regularly (usually every 15 days or less). Upgrades to a full new version of gLite are not foreseen; updates are done on a per-component basis. Details on how to apply a given gLite update can be found on the web page specific to that update. Sites are asked to keep their installations up to date with the latest update release. Every update is announced via the EGEE CIC portal broadcast tool, to which site managers are expected to subscribe.

Further documentation and instructions for installing and upgrading gLite nodes can be found on the gLite website.

Tools

  • YUM/APT: Package manager for the gLite software
  • YAIM: Configuration tool for the gLite software
  • CIC Portal: EGEE operations portal used to announce gLite upgrades


Links
gLite 3.1 Release http://glite.web.cern.ch/glite/packages/R3.1
gLite 3.1 Updates http://glite.web.cern.ch/glite/packages/R3.1/updates.asp
gLite 3.1 Installation and Configuration https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide310
YUM http://www.linux.duke.edu/projects/yum
YAIM Tool https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide400
CIC Portal https://cic.gridops.org


Certification

Procedures

The procedure for a given site to be certified as part of the EGEE production infrastructure depends on the requirements of its EGEE federation. Each EGEE federation is represented by one Regional Operations Centre (ROC). In general, however, the certification process includes the following steps:

  1. Site: requests X.509 user certificates from its national Certification Authority (CA) for all site managers;
  2. Site: contacts its ROC to find out which site-specific information and which policy acceptance statement it has to provide;
  3. ROC: validates the submitted information and adds the site to the EGEE GOCDB database, setting its certification status to "candidate";
  4. Site: adds the missing information to the GOCDB (security contacts, additional site administrators, etc.);
  5. ROC: validates the submitted information and changes the site certification status to "uncertified";
  6. Site: requests membership of the dteam and ops Virtual Organisations (VOs) and subscribes to the relevant mailing lists;
  7. Site: installs gLite (with guidance and support of its ROC support contacts);
  8. ROC: starts the execution of certification tests via SAM;
  9. ROC: sets the certification status of the site to "certified" and the production status to "production".

Due to the previous participation of DILIGENT in the EGEE Pre-Production Service infrastructure, the gLite sites already registered in EGEE do not need to provide the registration information. These sites can start the certification process directly from step 7.
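
The certification steps and the status changes they trigger can be summarised in a small Python sketch. It is only a model of the nine steps listed above, not a client for GOCDB or SAM; the names `STEPS`, `status_after`, and `remaining_steps` are hypothetical and introduced here for illustration.

```python
# Each entry: (step number, actor, certification status set by the step or None).
STEPS = [
    (1, "Site", None),
    (2, "Site", None),
    (3, "ROC", "candidate"),
    (4, "Site", None),
    (5, "ROC", "uncertified"),
    (6, "Site", None),
    (7, "Site", None),
    (8, "ROC", None),
    (9, "ROC", "certified"),
]


def status_after(step_number, initial=None):
    """Return the certification status once step_number has been completed."""
    status = initial
    for number, _actor, new_status in STEPS:
        if number > step_number:
            break
        if new_status is not None:
            status = new_status
    return status


def remaining_steps(first_step=1):
    """Return (number, actor) pairs still to be executed from first_step on."""
    return [(number, actor) for number, actor, _ in STEPS if number >= first_step]


if __name__ == "__main__":
    # A site that is already registered starts at step 7.
    print(remaining_steps(7))
    print(status_after(5))
```

For a site that is already registered, `remaining_steps(7)` returns the installation step for the site followed by the two ROC steps, and `status_after(5)` returns `"uncertified"`.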

Tools

  • SAM: Certification, monitoring, and site availability tests for the gLite software
  • GOCDB: Database of sites, resources and site managers of the EGEE production infrastructure (certificate required)


Links
SAM https://lcg-sam.cern.ch:8443/sam/sam.py
GOCDB https://goc.gridops.org
Documentation http://egee-sa1.web.cern.ch/egee-sa1/certification.html


User and Operational Support

Procedures

Operational tickets should be raised via the local ROC helpdesk. Each ROC helpdesk is connected to the central EGEE helpdesk known as GGUS. The ROCs can then decide whether to escalate the ticket to GGUS for EGEE-wide resolution. User tickets can be created directly in GGUS and allocated to a choice of support units.
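
The routing rules above can be written down as a short Python sketch. It is a simplified model under the assumption that a ticket is either "operational" or "user"; the function name `route_ticket` and the ROC label are hypothetical and do not correspond to any GGUS interface.

```python
def route_ticket(kind, roc="local", escalate=False):
    """Return the ordered list of helpdesks a ticket passes through.

    kind: "operational" or "user"
    roc: label of the ROC helpdesk (free-form in this sketch)
    escalate: whether the ROC decides to escalate an operational ticket
    """
    if kind == "operational":
        path = [f"{roc} ROC helpdesk"]
        if escalate:
            # The ROC forwards the ticket for EGEE-wide resolution.
            path.append("GGUS")
        return path
    if kind == "user":
        # User tickets can be created directly in GGUS.
        return ["GGUS"]
    raise ValueError(f"unknown ticket kind: {kind!r}")
```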

Unofficial support is also available through the gLite-discuss mailing list.

Tools

  • GGUS: Issue tracking system for EGEE production infrastructure operations
  • gLite-discuss Mailing List: EGEE mailing list for gLite related questions


Links
GGUS https://gus.fzk.de/pages/home.php
gLite-discuss Mailing List glite-discuss@cern.ch
Documentation https://twiki.cern.ch/twiki/bin/view/EGEE/EGEEROperationalProcedures#7_Reporting_Problems


Security

Procedures

The security of gLite nodes is based on X.509 certificates. User and host certificates must be issued by a EUGridPMA Certification Authority; no other Certification Authorities are trusted.

The management of Virtual Organizations is done using the gLite component Virtual Organization Membership Service (VOMS). The VOMS server of the production infrastructure must provide a "d4science" VO, which all gLite nodes are expected to support. The "d4science" VO must have a VO Manager who is responsible for managing the VO membership and assigning the correct roles.
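
The requirement that every gLite node supports the "d4science" VO lends itself to a simple consistency check. The Python sketch below assumes the per-node VO lists have already been collected by some other means (for example from each node's configuration); the host names and the function `nodes_missing_vo` are hypothetical and used for illustration only.

```python
REQUIRED_VO = "d4science"


def nodes_missing_vo(supported_vos, required=REQUIRED_VO):
    """Return the sorted names of nodes that do not list the required VO.

    supported_vos maps a node name to the VOs it is configured for.
    """
    return sorted(
        node for node, vos in supported_vos.items() if required not in vos
    )


if __name__ == "__main__":
    # Hypothetical host names and VO lists, for illustration only.
    site = {
        "ce.example.org": ["d4science", "dteam", "ops"],
        "se.example.org": ["dteam", "ops"],
    }
    print(nodes_missing_vo(site))
```

With the sample data, only `se.example.org` is reported, since it does not list the "d4science" VO.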

Tools

  • X.509 Certificates: Issued by EUGridPMA CAs
  • VOMS: Management tool for "d4science" VO


Links
EUGridPMA http://www.eugridpma.org/members/worldmap
VOMS User Guide https://edms.cern.ch/file/571991/1/voms-guide.pdf
VOMS Admin Guide https://edms.cern.ch/file/572406/1/user-guide.pdf


Monitoring and QoS

Procedures

There are several tools for monitoring the EGEE production infrastructure nodes (and consequently the D4Science gLite nodes). Many of them share the same information but provide different ways to access it. Together, this large number of tools covers many monitoring possibilities.

Tools

  • SAM
  • GStat
  • GridView
  • GridMap
  • RTM


Links
SAM https://lcg-sam.cern.ch:8443/sam/sam.py
GStat http://goc.grid.sinica.edu.tw/gstat
GridView http://gridview.cern.ch/GRIDVIEW/
GridMap http://lxb2003.cern.ch/gm/gridmap.html
RTM http://gridportal.hep.ph.ic.ac.uk/rtm



-- PedroAndrade - 25 Jan 2008

Topic revision: r12 - 2008-02-11 - PedroAndrade
 