HA Test Bed Setup

Proposal

The HA test bed provides the FIO and ADC groups with a test environment for evaluating High Availability software and hardware. It can also be used, outside the application test environments, to test operations such as switchover from master to slave and recovery.

The configuration required can model the basic building blocks of the SC4 infrastructure proposal, namely

  • Two machines in master/slave mode with the slave monitoring the master's application. The resources are shared such that a switch from the master to the slave involves taking over the shared disk and shared network address.

  • Two machines running a MySQL database performing replication.
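The master/slave building block can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Linux-HA implementation: the class name `StandbyNode`, the `deadtime` value and the resource labels are hypothetical. The slave records heartbeats from the master and, once the master has been silent for longer than the dead time, takes over the shared disk and the shared network address.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyNode:
    """Toy model of a slave that monitors a master and takes over on failure.

    All names and values are illustrative assumptions, not Linux-HA APIs.
    """
    deadtime: float = 10.0       # seconds of silence before takeover (assumed value)
    last_heartbeat: float = 0.0  # time the last heartbeat from the master was seen
    role: str = "slave"
    resources: list = field(default_factory=list)

    def heartbeat_received(self, now: float) -> None:
        """Record a heartbeat from the master."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over if the master has been silent for longer than deadtime.

        Returns True only on the call where the takeover happens.
        """
        if self.role == "slave" and now - self.last_heartbeat > self.deadtime:
            # Disk first, then address, so that clients only reach the
            # service once its data is available.
            self.resources = ["shared-disk", "shared-address"]
            self.role = "master"
            return True
        return False


node = StandbyNode(deadtime=10.0)
node.heartbeat_received(now=0.0)
assert node.check(now=5.0) is False   # master still considered alive
assert node.check(now=11.0) is True   # master silent too long: takeover
print(node.role, node.resources)
```

The sketch ignores the case where the master is still running but unreachable. A real setup has to prevent both nodes from using the shared disk at the same time.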

The applications to be tested are

  • MyProxy
  • CE
  • RB

Overview

Drawing: `HaTestBedDrawing`

Hardware Required

| Component | Number | Purpose |
| Servers | 2 | Master and Slave. The servers do not need to be high end or ultra reliable but should be running the same OS level as the production systems. At least 20GB internal disk is required. Mirroring of internal disks is not necessary. |
| Network Interfaces | 4 | Two networks are required. One for the standard access, the other for monitoring purposes. |
| FC HBAs | 2 cards, 1-2 ports/card | Fibre Channel cards for master and slave to connect to the FC switch or disks. Ideally, these would be two port cards and the same as the version used in production but this is not essential since it is expected that the HBA drivers will perform in a similar fashion. |
| FC Cables | 2-4 | Cables for HBA to switch or disks |
| FC Switch | 1 with at least 4 ports | Connection from machines to disks. The switch will be logically split in two with zoning to simulate two switches. |
| Disk Array | 1-2 | One disk array would be sufficient for many of the tests. Two arrays would be simulated using 2 LUNs on the same device. |

Software Required

A basic lxdev-like configuration with SLC3 would be sufficient to build the base service.

The following topics define the other components used and their setup and configuration.

| Component | Description |
| Linux-HA | HaTestLinuxHaSetup |
| Fibre Channel | HaTestFcSetup for Fibre Channel HBA and Disks |

Resources

The following resources would be required for the setup and development of the deliverables

| Resource | Time (days) | Activity |
| Andras Horvath | 15 | HA Setup; RPM packaging; Switchover scripts |
| Tim Bell | 15 | Quattor/NCM components; Lemon Sensors; Working Instructions; Test Plan; MyProxy install and test |
| Thorsten Kleinwort | 2 | Co-ordination of machines setup and installation |
| FIO System Administrator | 1 | Linux installation of machines |
| Tim Wibley | | Physical Installation of machines |
| David Asbury | 2 | Fibre Channel Switch Setup |
| German Cancio | 0.5 | Consulting on appropriate Quattor parameters |

Deliverables

The test will produce the following deliverables

  • A hardware recommendation for the Grid SC4/production services
  • Validated levels of drivers for HBAs and disk arrays
  • A set of middleware packaged as RPMs which will provide the basic HA services
  • Quattor/NCM components and PAN variables for configuring master/slave servers, FC connections and HA.
  • Lemon sensors for Linux HA to report alarms (switch) and monitor state (master/slave)
  • Switchover scripts for shared disks and network
  • Working Instructions for administrators for unplanned and planned switchover
  • A test plan for regression testing in the event of new hardware/software
  • Certification of at minimum the MyProxy configuration within an HA environment

MySQL is currently not included in the scope since it requires agreement on the organisational structure and services for this database.

A request to test out DRBD has also been received. This will be investigated after the initial tests have been completed since it may be an interesting option for the less critical services.

Tasks

| Nr | Description | Status | Open Date | Who | Log |
| 1 | Visit Test Lab | closed | 2005/09/18 | All | Done. |
| 2 | Arrange machine hardware installation in test lab rack | closed | 2005/09/18 | Thorsten | Done. |
| 3 | LANDB definitions | closed | 2005/09/19 | Tim | lxdev13, lxdev14 |
| 4 | Obtain FC switch from David | closed | 2005/09/19 | Tim | Done. |
| 5 | Network cabling | closed | 2005/09/19 | Andras | Done. |
| 6 | O/S install | closed | 2005/09/19 | Thorsten | lxdev13 done. lxdev14 waiting. |
| 7 | F/C Switch install | closed | 2005/09/19 | Tim, Andras | Switch installed and on network. Problem with cable availability for lxdev13. |
| 8 | FC setup | closed | 2005/09/26 | Andras | Zoning and cabling done, both nodes see the shared disk. |
| 9 | linux-ha basics | closed | 2005/09/26 | Andras | Recompiled RPMs, simple FS mount test and takeover work. |
| 10 | NCM | open | 2005/10/25 | Andras | Wrote initial candidates for NCM component and template (see attachments). |

Setup log

See HaTestBedSetupLog.

Configuration Details

| Machine | CERN Network | Monitoring |
| lxdev13 | 137.138.8.149 | 192.168.0.151 |
| lxdev14 | 137.138.8.150 | 192.168.0.152 |
| qlogic switch | N/A | 192.168.0.150 |
| service IP | none yet | 192.168.0.160 |

| RPM | Version | Purpose |
| sansurfer | 5.00.01-1.cern | Graphical interface for SAN switch management |
| heartbeat | 2.0.1-1.cern.1 | Core services for linux-ha |
| heartbeat-pils | 2.0.1-1.cern.1 | Plugin framework for linux-ha |
| heartbeat-stonith | 2.0.1-1.cern.1 | STONITH services - not used yet (needs special hw) but depended upon |

-- TimBell - 14 Sep 2005

-- AndrasHorvath - 25 Oct 2005

Topic attachments
| Attachment | Revision | Size | Date | Who | Comment |
| HaTestBedDrawing.draw | r4 | 2.6 K | 2005-09-19 - 16:35 | TimBell | TWiki Draw draw file |
| HaTestBedDrawing.gif | r4 | 2.3 K | 2005-09-19 - 16:36 | TimBell | TWiki Draw GIF file |
| linuxha.pm | r1 | 4.8 K | 2005-10-25 - 16:14 | UnknownUser | would-be ncm component to configure linux-ha |
| pro_linuxha.tpl | r1 | 0.9 K | 2005-10-25 - 16:15 | UnknownUser | would-be CDB template for linux-ha config |
Topic revision: r10 - 2005-10-25 - TimBell
 