MyProxy Notes

Introduction

MyProxy is a credential repository for the Grid. Storing your Grid credentials in a MyProxy repository allows you to retrieve a proxy credential whenever and wherever you need one, without worrying about managing private key and certificate files. You can allow trusted servers to renew your proxy credential using MyProxy, so, for example, your long-running tasks don't fail because of an expired proxy credential. A professionally managed MyProxy server can provide a more secure storage location for Grid credentials than typical end-user systems.

MyProxy participates in the following flows

Certificates

A certificate has a fixed lifetime, decided when the proxy is initialised. These certificates are usually requested for short periods to reduce the risk of someone stealing the certificate and using it to impersonate the user. However, short lifetimes cause problems when
  • Jobs are long running. The certificate needs to be renewed while the job is running.
  • Jobs wait in the queue. The certificate needs to be renewed when the job starts (if it has expired).

There are several parts to a certificate

  • User check - is the user really who they claim to be?
  • Roles check - which virtual organisation is the user in, and what rights does the user have?

Both of these parts can expire and need to be renewed. These notes cover the user check; the roles check is covered by the VOMS service.

Configuration

The combination of MyProxy/ResourceBroker/VOMS server must be configured consistently. Since the MyProxy server only accepts renewal requests from a small set of machines, each site must be carefully controlled to keep the configuration consistent.

Job Submission

When a job is submitted, its certificates are sent with the job. The renewal process is run from the RB (Resource Broker) and contacts the MyProxy server to obtain a fresh certificate.

In addition to the job, the worker node runs a process which requests a new certificate when the old one is due to expire. The first request is made 90 minutes before expiry and, if the operation fails, it is repeated frequently after that.
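The renewal timing described above can be sketched as follows. This is only an illustration of the schedule, not the actual renewal daemon: the 90-minute lead time comes from these notes, while the 5-minute retry interval and the function names are assumptions, since the notes only say that retries happen "frequently".

```python
from datetime import datetime, timedelta

# From the notes: the first renewal attempt is made 90 minutes before expiry.
FIRST_ATTEMPT_LEAD = timedelta(minutes=90)
# Assumption: the notes only say "frequently"; 5 minutes is an illustrative value.
RETRY_INTERVAL = timedelta(minutes=5)


def renewal_attempts(expiry, retry_interval=RETRY_INTERVAL):
    """Return the attempt times up to expiry, assuming every attempt fails."""
    attempts = []
    t = expiry - FIRST_ATTEMPT_LEAD
    while t < expiry:
        attempts.append(t)
        t += retry_interval
    return attempts


def next_attempt(now, expiry, retry_interval=RETRY_INTERVAL):
    """Return the next scheduled attempt at or after `now`, or None after expiry."""
    for t in renewal_attempts(expiry, retry_interval):
        if t >= now:
            return t
    return None
```

With these assumed values, a proxy that cannot be renewed is retried 18 times before it expires, which gives a rough idea of how long a MyProxy outage a running job can tolerate.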

High Availability

A failure of the MyProxy service has the following impact

  • New requests using myproxy-init will fail with an error
  • Running jobs may fail within 90 minutes due to certificate expiry

A solution to improve availability is therefore required in order to reach the class C level, which is the service level requested in ScFourServiceDefinition.

The MyProxy application does support aliases. This permits a local change of server to be made in the event of hardware failure if the state of the server can be maintained.

The state information is kept in /var/myproxy. The state data can be replicated using the myproxy_replicate command, which copies it from the master server to the slave servers.

Warning: The replicate function is not available in the current EDG version. It has not yet been tested in the CERN environment, although the release notes for GT4.0.2 (June 2005) and the myproxy_replicate man page suggest that it should work. Since the function is relatively stand-alone, it may be possible to back-port it.
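One way to decide when a replication run is worthwhile is to watch the state directory for changes. The sketch below compares two snapshots of a directory such as /var/myproxy and reports which files were added, removed or modified. The function names are hypothetical, and the sketch deliberately does not invoke the replicate command itself, since its options have not been verified here.

```python
import os


def snapshot(state_dir):
    """Map each regular file in state_dir to its (mtime_ns, size)."""
    snap = {}
    with os.scandir(state_dir) as entries:
        for entry in entries:
            if entry.is_file():
                st = entry.stat()
                snap[entry.name] = (st.st_mtime_ns, st.st_size)
    return snap


def changed_files(before, after):
    """Return the names added, removed or modified between two snapshots."""
    names = set(before) | set(after)
    return sorted(n for n in names if before.get(n) != after.get(n))
```

A wrapper script could take a snapshot at each interval and start a replication run only when changed_files returns a non-empty list.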

Load balancing may be a possibility for the slave servers (i.e. send renewal requests to the slaves but keep a single master for the proxy init operations). This is currently not being considered since the client software does not distinguish between renewal request servers and other operations.

Approach MyProxy 1

An IP alias, myproxy.cern.ch, points to the master server. Regular updates are sent to the slave servers using the myproxy_replicate procedure. If the master fails, heartbeat changes the IP alias and the master role moves to one of the slaves. This would cause an outage of less than 5 minutes and the loss of any new certificates initialised between the last replication and the switchover.
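The loss window in this approach can be made concrete with a small sketch. Given the time at which each credential was initialised on the master, it lists the credentials that would be missing on the new master after a switchover. The function name and the shape of the input are assumptions made for illustration only.

```python
from datetime import datetime


def lost_on_failover(init_times, last_replication, switchover):
    """Return the credentials initialised after the last replication
    and no later than the switchover, i.e. those never copied to a slave."""
    return sorted(
        name
        for name, t in init_times.items()
        if last_replication < t <= switchover
    )
```

The worst case is bounded by the replication interval: the more often replication runs, the fewer users have to repeat myproxy-init after a failure.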

[Drawings: MypxHaDrawing1, MypxHaDrawing1fail]

Approach MyProxy 2

If the replication procedure does not meet the objective, a shared-disk approach with a warm standby machine could be used instead (see MypxHaDrawing2).

[Drawings: MypxHaDrawing2, MypxHaDrawing2fail]

Equipment required

Approach 1

Approach 1 has two independent servers with local disks. Given the very small disk space requirements, there is no need for external disk as long as the internal disks are mirrored.

Component         Number   Purpose
Midrange Server   2        MyProxy master and slave machines

Approach 2

Approach 2 would be used if recovery also had to cover the small window between the last replication and the switchover. This would lead to a substantial cost increase.

Component         Number   Purpose
Midrange Server   2        MyProxy master and slave machines
FC HBA            2        Fibre channel connectivity
FC Switch Ports   2        Connectivity for the two servers
FC Disk space     20GB     Storage for credentials (2x10GB on different disk subsystems)

Engineering required

Development                          Purpose
Start/Stop/Status procedure          Scripts for MyProxy operations
MyProxy Replication procedure        On a regular interval, maybe even driven by a dirwatch or logwatch to trigger off the proxy-init operations
Lemon MyProxy availability test      A Lemon-aware sensor which can be used for reporting availability. This should monitor the number of myproxy-server processes which are running
Linux Heartbeat availability test    A Linux-HA aware sensor which would activate the procedure for automatic switch from master to slave
Switch procedure                     Automatic switch from master to slave changing the DNS alias, disabling the master, enabling the slave in its new master role
Capacity Metric                      Capacity metrics defined for: number of renewals / second; number of inits / second
Quattor configuration for Linux-HA   NCM component to configure Linux-HA/Heartbeat
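The availability test in the list above amounts to counting myproxy-server processes. A minimal sketch of that check is given below. It is not a Lemon sensor and does not use any Lemon API; the function names and the minimum of one process are assumptions, and reading /proc/<pid>/comm only works on Linux kernels that provide that file.

```python
import os


def running_process_names(proc_root="/proc"):
    """Yield the command name of each process listed under proc_root (Linux only)."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                yield f.read().strip()
        except OSError:
            # The process exited between listing and reading; skip it.
            continue


def count_processes(names, target="myproxy-server"):
    """Count how many of the given process names match the target."""
    return sum(1 for name in names if name == target)


def is_available(names, minimum=1):
    """Report the service as available if enough server processes are running."""
    return count_processes(names) >= minimum
```

On a server this would be called as is_available(running_process_names()); a real sensor would still need to report the result in whatever format the monitoring system expects.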

Other Items to Consider

Access to the MyProxy server should be kept carefully controlled, since the server can be used to obtain user certificates.

Open Items

Nr   Description                                                         Status        Open Date    Who   Log
1    What myproxy version is included in gLite? Will we get an update?   in progress   2005/09/20   Tim
2    How to use restart procedure from crontab chk-myproxy in /root      open          2005/09/20   Tim

Discussion Log

MyProxy Replication

That is the expected behavior.

The only functions that should work on a secondary (slave) server are
myproxy-login and myproxy-retrieve.  All other myproxy client operations
should fail when accessing a secondary.

Thanks

Patrick

At 10:09 AM 10/3/2005, Tim Bell wrote:

>I have managed to get the myproxy-replicate function working so that the
>credentials are copied to the slave server.
>
>However, the owner field in the .data file is being set to the master
>machine rather than the user.
>
>For example, on the master
>
># grep OWNER /var/myproxy/*.data
>OWNER=/C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
>
>When I do the replicate, the same file on the slave has
>
># grep OWNER /var/myproxy/*.data
>OWNER=/C=CH/O=CERN/OU=GRID/CN=host/lxdev13.cern.ch
>
>Does anyone know how I can make it so that the owner is correct?
>Without this, any attempt to access the data (e.g. myproxy-info) is failing.
>
>--
>Tim Bell
>CERN Div IT/FIO, CH-1211 Geneva 23
>Phone: +41 22 76 72370
>E-Mail: tim.bell@cern.ch
>Office: 31 1-008

As Patrick describes, the primary server is the "owner" of the
credentials on the secondary servers.  This allows the primary to update
the credentials as necessary and keeps users from making changes to
credentials directly on the secondaries.

We're working on better client-side handling for replication for an
upcoming release.  We're also working on better documentation...

Cheers,
Jim
-- TimBell - 12 Sep 2005
Topic revision: r6 - 2005-10-03 - TimBell
 