High Availability Implementation for MyProxy


The CERN requirements for the MyProxy service call for a highly available configuration. As discussed in PxNotes, the standard MyProxy implementation provides a high availability function through myproxy_replicate, which creates a read-only replica of the data. The replica supports retrieve operations only; it does not cover user-oriented actions such as init or destroy.

To provide full high availability, the following approach was taken:

  • Master/slave setup using Linux-HA and a shared service IP address
  • The master stores data in /var/proxy and replicates it with myproxy_replicate to /var/proxy.slave on the slave
  • The master also rsyncs data from /var/proxy to the /var/proxy directory on the slave
  • The slave myproxy server is started in slave mode to read from /var/proxy.slave (i.e. read-only mode)
  • In the event of a master failure, as detected by Linux-HA, the daemon on the slave is stopped and then restarted with the read-write copy from /var/proxy
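
The repository layout above can be summarised in a short sketch. It is an illustrative model only, not the deployed configuration: the `Repository` type, the `repository_for` function and the role labels are hypothetical, and only the two directory paths come from the list above.

```python
from dataclasses import dataclass

# Paths taken from the list above.
READ_WRITE_DIR = "/var/proxy"        # master's store; rsync'ed copy on the slave
READ_ONLY_DIR = "/var/proxy.slave"   # replica written by myproxy_replicate


@dataclass(frozen=True)
class Repository:
    path: str
    read_only: bool


def repository_for(role: str, master_failed: bool = False) -> Repository:
    """Pick the repository a node's myproxy daemon should serve from.

    `role` is "master" or "slave" (hypothetical labels for this sketch).
    """
    if role == "master":
        return Repository(READ_WRITE_DIR, read_only=False)
    if role == "slave":
        if master_failed:
            # Takeover: restart on the read-write copy received via rsync.
            return Repository(READ_WRITE_DIR, read_only=False)
        # Normal operation: serve retrieve requests from the replica.
        return Repository(READ_ONLY_DIR, read_only=True)
    raise ValueError(f"unknown role: {role!r}")
```

The point of keeping two directories on the slave is visible here: the read-only replica serves requests in normal operation, while the rsync'ed copy is only opened once the slave takes over.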

Linux-HA with a small myproxy resource script (start/stop/monitor/status) provides this function. The takeover time is around 2 seconds once a failure has been detected, but there may be a substantial delay between the occurrence of a failure and its detection. If the client configuration is later extended so that a replica server can be queried, the replica will cover this window.
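
The resource script itself is not reproduced on this page. As a rough sketch of the shape such a script might take, the dispatcher below maps the four actions to exit codes. The exit codes follow the LSB init-script convention (0 for success or running, 3 for not running), which is an assumption, and `make_resource_script` and `FakeDaemon` are hypothetical names used only for illustration.

```python
from typing import Callable

# Exit codes following the LSB init-script convention (an assumption here).
OK = 0
INVALID_ACTION = 2
NOT_RUNNING = 3


def make_resource_script(
    start: Callable[[], None],
    stop: Callable[[], None],
    is_running: Callable[[], bool],
) -> Callable[[str], int]:
    """Build a dispatcher mapping an action name to an exit code."""

    def dispatch(action: str) -> int:
        if action == "start":
            if not is_running():   # starting twice must be harmless
                start()
            return OK
        if action == "stop":
            if is_running():       # stopping a stopped daemon is not an error
                stop()
            return OK
        if action in ("status", "monitor"):
            return OK if is_running() else NOT_RUNNING
        return INVALID_ACTION

    return dispatch


class FakeDaemon:
    """In-memory stand-in for the myproxy daemon, for demonstration only."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def is_running(self) -> bool:
        return self.running
```

A real script would replace `FakeDaemon` with code that starts and stops the myproxy server and checks that it is alive. Making start and stop idempotent matters because the cluster manager may repeat an action during a takeover.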

The HA configuration has been implemented as follows:

[Figure PxWlcgHaNormal: HA configuration in normal operation]

In the event of a failure, or of an operator-initiated switch for planned maintenance, the configuration changes as follows:

  • Service IP now points to slave server
  • Slave myproxy started with /var/proxy as repository (which was being received via rsync from master)
  • Master adopts slave role (if it is able to)
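
The role change described above can be modelled as a small state transition. This is a sketch under stated assumptions rather than the Linux-HA logic itself: the `Cluster` type, the `switch_over` function and the node names used with it are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cluster:
    master: str             # node serving read-write requests
    slave: Optional[str]    # node holding the replicas, if any
    service_ip_on: str      # node currently holding the shared service IP


def switch_over(cluster: Cluster, old_master_usable: bool) -> Cluster:
    """Return the cluster state after a failure or planned switch."""
    if cluster.slave is None:
        raise RuntimeError("no slave available to take over")
    new_master = cluster.slave
    # The old master only becomes the new slave if it is able to.
    new_slave = cluster.master if old_master_usable else None
    return Cluster(master=new_master, slave=new_slave, service_ip_on=new_master)
```

For example, with hypothetical nodes `px1` (master) and `px2` (slave), `switch_over(Cluster("px1", "px2", "px1"), old_master_usable=False)` leaves `px2` as master holding the service IP and no slave, so the service keeps running without redundancy until `px1` is repaired.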

[Figure PxWlcgHaFail: HA configuration after a failure or switch]


For the cost of two machines with a small amount of disk space, a highly available MyProxy implementation can be built that is resilient to network, machine and storage failures.

-- TimBell - 05 Oct 2005
