MyProxy
This page is work in progress during installation and will be tidied up on completion.
Architecture
The High Availability configuration which is implemented for the CERN MyProxy service is described at
PxWlcgHa.
Installation
The high level installation process for LCG 2.6 is defined
here.
Get a Host Certificate
~gridca/scripts/cert-req -host lxdev13.cern.ch -dir `pwd`
This command will create a request for a host or service digital
certificate from the CERN Certification Authority.
When this command completes the certificate request should be emailed
to the CERN CA for validation. Instructions for this are included in
the certificate request file.
A summary of the command options can be obtained using the command:
cert-req -help
See also the CERN CA website:
http://service-grid-ca.web.cern.ch/service-grid-ca/ for more details.
Do you want to continue [y]/n
Running OPENSSL to generate private key and certificate request files.
Generating a 2048 bit RSA private key
.............................+++
.......................................................................+++
writing new private key to '/afs/cern.ch/group/c3/grid/hostcert/hostkey.pem'
-----
Your certificate request is saved in: /afs/cern.ch/group/c3/grid/hostcert/hostreq.pem
* If you own a CERN CA personal certificate you can sign this
* request with your certificate and send it for approval.
* Doing this will assist in the processing of your request.
* Use the following command to sign and send your request
* substituting the correct paths for your certificate and
* private key files if the .globus locations given below are
* not correct. You will be prompted for the passphrase of
* your private key:
openssl smime -sign -to service-grid-ca@cern.ch -subject "Certificate Request" -signer ~/.globus/usercert.pem -inkey ~/.globus/userkey.pem < /afs/cern.ch/group/c3/grid/hostcert/hostreq.pem | /usr/sbin/sendmail -t
* If you do not own a CERN CA personal certificate use
* the following command to send this request for approval:
mail service-grid-ca@cern.ch -s "Certificate Request" < /afs/cern.ch/group/c3/grid/hostcert/hostreq.pem
Mail the certificate request to the CA
Assuming you have a personal certificate, sign and send the request:
openssl smime -sign -to service-grid-ca@cern.ch -subject "Certificate Request" -signer ~/.globus/usercert.pem -inkey ~/.globus/userkey.pem < /afs/cern.ch/group/c3/grid/hostcert/hostreq.pem | /usr/sbin/sendmail -t
Copy certificates to machine
On completion of the certificate request, an e-mail will be sent to you containing the signed host certificate.
The hostkey.pem and the hostcert.pem files should be copied to the /etc/grid-security directory.
The hostkey.pem file comes from the original directory where you made the certificate request:
$ cp hostkey.pem /etc/grid-security/hostkey.pem
$ chmod 400 /etc/grid-security/hostkey.pem
The hostcert.pem file comes from the e-mail:
$ cp host_lxdev13.cern.ch.cert /etc/grid-security/hostcert.pem
$ chmod 644 /etc/grid-security/hostcert.pem
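Before restarting any services, it is worth confirming that the copied certificate and key actually belong together. The following sketch (a helper of our own, not part of the grid tools, assuming openssl is installed) compares the RSA moduli of the two files:

```shell
# check_pair: verify that a certificate and private key form a matching pair.
# Local helper, not part of the MyProxy distribution.
check_pair() {
    # $1 = certificate file, $2 = private key file
    cert_mod=$(openssl x509 -noout -modulus -in "$1") || return 1
    key_mod=$(openssl rsa -noout -modulus -in "$2" 2>/dev/null) || return 1
    if [ "$cert_mod" = "$key_mod" ]; then
        echo "MATCH"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Typical use on the installed files:
# check_pair /etc/grid-security/hostcert.pem /etc/grid-security/hostkey.pem
```

A MISMATCH usually means the certificate from the e-mail was paired with a key from a different, earlier request.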
Configuration
The CDB profile is set up as follows. Currently, this is linked to the machine profile but a new pro_type should be created before production.
include pro_type_gridpx_slc3;
include pro_linuxha_prodpx;
Internal HA Network
There is a private ethernet connection between the two machines in the HA cluster. This is set up in the host's =netinfo= template.
The following lines are an example
"/system/network/interfaces/eth0/ip" = "128.142.160.73";
"/system/network/interfaces/eth0/gateway" = "128.142.1.1";
"/system/network/interfaces/eth0/netmask" = "255.255.0.0";
"/system/network/interfaces/eth1/ip" = "192.168.1.101";
"/system/network/interfaces/eth1/gateway" = "128.142.1.1";
"/system/network/interfaces/eth1/netmask" = "255.255.255.0";
If the machine is already installed, the file /etc/sysconfig/network-scripts/ifcfg-eth1 can be updated as follows
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.1.101
NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
This should be followed by a
service network restart
Shutting down interface eth0: [ OK ]
Shutting down interface eth1: [ OK ]
Shutting down loopback interface: [ OK ]
Setting network parameters: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Bringing up interface eth1: [ OK ]
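As a sanity check before restarting the network, the private-interface file can be validated mechanically. This is a sketch of our own (it assumes the Red Hat KEY=value ifcfg syntax) and only checks that IPADDR is in the 192.168 private range used above:

```shell
# check_ifcfg: report whether an ifcfg file carries a 192.168.x.x address.
# Local helper, assuming Red Hat style KEY=value ifcfg syntax.
check_ifcfg() {
    # $1 = path to an ifcfg-* file
    addr=$(sed -n 's/^IPADDR=//p' "$1")
    case "$addr" in
        192.168.*) echo "OK $addr" ;;
        *)         echo "BAD $addr"; return 1 ;;
    esac
}

# Typical use:
# check_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth1
```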
Setting up the shared host certificate
The Linux HA configuration has an additional host associated with the service, that of the shared IP address.
The shared address also needs a certificate in the same way as the host. This ensures that when a client asks for the myproxy service, it really gets the server it asked for and not a machine pretending to be the server.
This certificate is only used by the myproxy server and is set up within the rc.ha-gridpx script so that it is the default certificate for the service.
If you get the message below, this is because the myproxy server has been started without using this shared certificate.
Server authorization failed. Server identity
(/C=CH/O=CERN/OU=GRID/CN=host/px101.cern.ch)
does not match expected identities
myproxy@prod-px.cern.ch or host@prod-px.cern.ch.
If the server identity is acceptable, set
MYPROXY_SERVER_DN="/C=CH/O=CERN/OU=GRID/CN=host/px101.cern.ch"
and try again.
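For reference, the identity in the message has the fixed shape /C=CH/O=CERN/OU=GRID/CN=host/&lt;hostname&gt;. The sketch below (a helper of our own, not part of the MyProxy client tools) builds the DN string a client would need to export if it decides the server identity is acceptable:

```shell
# expected_dn: build the CERN host DN string for a given host name.
# Local helper, not part of the MyProxy client tools.
expected_dn() {
    echo "/C=CH/O=CERN/OU=GRID/CN=host/$1"
}

# A client accepting the identity would then run, before retrying:
# export MYPROXY_SERVER_DN="$(expected_dn px101.cern.ch)"
```

The proper fix on the server side remains starting myproxy with the shared certificate, as described above.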
Unit Testing
Create a proxy
Logging into lxplus
$ . /afs/cern.ch/project/gd/LCG-share/sl3/etc/profile.d/grid_env.sh
$ export MYPROXY_SERVER=prod-px
$ grid-proxy-init
Your identity: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
Enter GRID pass phrase for this identity:
Creating proxy ............................................. Done
Your proxy is valid until: Sat Jan 28 23:22:06 2006
$ myproxy-init
Your identity: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
Enter GRID pass phrase for this identity:
Creating proxy ................................................................................................. Done
Proxy Verify OK
Your proxy is valid until: Wed Oct 5 09:46:09 2005
Enter MyProxy pass phrase:
Verifying password - Enter MyProxy pass phrase:
A proxy valid for 168 hours (7.0 days) for user timbell now exists on lxdev13.
Check the proxy has been stored by logging into the prod-px machine and listing the repository:
# ls -ltr /var/myproxy
total 8
-rw------- 1 root root 96 Sep 28 09:46 timbell.data
-rw------- 1 root root 3332 Sep 28 09:46 timbell.creds
Check that the proxy information can be retrieved
$ myproxy-info
username: timbell
owner: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
timeleft: 167:55:41 (7.0 days)
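The timeleft field can also be checked mechanically, for example from a cron job that warns when a stored proxy is close to expiry. The helpers below are our own sketch (awk does the arithmetic, so values with leading zeros are safe):

```shell
# timeleft_seconds: convert the H:MM:SS "timeleft" value printed by
# myproxy-info / grid-proxy-info into seconds. Local helper.
timeleft_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# check_timeleft: print RENEW when less than one day remains, OK otherwise.
check_timeleft() {
    if [ "$(timeleft_seconds "$1")" -lt 86400 ]; then
        echo "RENEW"
    else
        echo "OK"
    fi
}

# Example: check_timeleft 167:55:41   -> OK
```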
Checking Lemon Sensor
The basic Lemon sensor can be checked as follows
/usr/bin/perl -I/opt/edg/lib/perl -MGridPx /usr/libexec/sensors/edg-fmon-sensor.pl
INI 1 GridPx::PxStatus
GET 1
PUT 01 1 1138200170 0.234219
INI 2 GridPx::PxLoad
GET 2
PUT 01 2 1138200182 7
The full Lemon definition can be tested by asking Lemon for the value of the metric.
# lemon-cli -m 804
Retrieving samples from spool directory '/var/spool/edg-fmon-agent'
Nodes: px101
Metrics: 804
Start: (null) - (-1)
End: (null) - (-1)
local: px101 804 Wed Jan 25 22:19:52 2006 4
Total: 1 result
Test proxy
Problems
Server Setup
No host certificate
The host certificate needs to be set up in /etc/grid-security/ as described above. The server log shows:
Sep 28 08:57:22 lxdev13 myproxy-server: <1667> Problem with server credentials.
GSS Major Status: General failure
GSS Minor Status Error Chain:
acquire_cred.c:125: gss_acquire_cred: Error with GSI credential
globus_i_gsi_gss_utils.c:1310: globus_i_gsi_gss_cred_read: Error with gss credential handle
globus_gsi_credential.c:721: globus_gsi_cred_read: Valid credentials could not be found in any of the possible locations specified by the credential search order.
globus_gsi_credential.c:447: globus_gsi_cred_read: Error reading host credential
globus_gsi_system_config.c:4118: globus_gsi_sysconfig_get_host_cert_filename_unix: Could not find a valid certificate file:
The host cert could not be found in:
1) env. var. X509_USER_CERT=NULL
2) /etc/grid-security/hostcert.pem
3) /opt/globus/etc/hostcert.pem
4) /root/.globus/hostcert.pem
The host key could not be found in:
1) env. var. X509_USER_KEY=NULL
2) /etc/grid-security/hostkey.pem
3) /opt/globus/etc/hostkey.pem
4) /root/.globus/hostkey.pem
globus_gsi_credential.c:2
User Problems
Expired Certificate
$ myproxy-info
Error authenticating: GSS Major Status: General failure
GSS Minor Status Error Chain:
acquire_cred.c:125: gss_acquire_cred: Error with GSI credential
globus_i_gsi_gss_utils.c:1310: globus_i_gsi_gss_cred_read: Error with gss credential handle
globus_gsi_credential.c:306: globus_gsi_cred_read: Error with credential: The proxy credential: /tmp/x509up_u19580
with subject: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176/CN=proxy
expired 0 minutes ago.
This is because the underlying grid proxy has expired:
$ grid-proxy-info
subject : /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176/CN=proxy
issuer : /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
identity : /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
type : full legacy globus proxy
strength : 512 bits
path : /tmp/x509up_u19580
timeleft : 0:00:00
No user certificate
$ myproxy-info
Error authenticating: GSS Major Status: General failure
GSS Minor Status Error Chain:
acquire_cred.c:125: gss_acquire_cred: Error with GSI credential
globus_i_gsi_gss_utils.c:1310: globus_i_gsi_gss_cred_read: Error with gss credential handle
globus_gsi_credential.c:721: globus_gsi_cred_read: Valid credentials could not be found in any of the possible locations specified by the credential search order.
globus_gsi_credential.c:447: globus_gsi_cred_read: Error reading host credential
globus_gsi_system_config.c:3977: globus_gsi_sysconfig_get_host_cert_filename_unix: Error with certificate filename
globus_gsi_system_config.c:380: globus_i_gsi_sysconfig_create_cert_string: Error with certificate filename: /etc/grid-security/hostcert.pem not owned by current user.
globus_gsi_credential.c:239: globus_gsi_cred_read: Error reading proxy credential
globus_gsi_system_config.c:4660: globus_gsi_sysconfig_get_proxy_filename_unix: Could not find a valid proxy certificate file location
globus_gsi_system_config.c:4657: globus_gsi_sysconfig_get_
Check if user certificate is valid
$ grid-proxy-info
ERROR: Couldn't find a valid proxy.
Use -debug for further information.
If this error is received, run
grid-proxy-init
RPMs
The RPMs from the GD group are stored in /afs/cern.ch/project/gd/RpmDir_i386-sl3/external. The initial rpm used is myproxy-VDT1.2.0rh9_LCG-2.i386.rpm.
Grid Configuration
The grid configuration steps are manual.
# ncm-ncd --co yaim
[INFO] NCM-NCD version 1.2.3 started by root at: Fri Feb 3 16:24:12 2006
[INFO] executing configure on components....
[INFO] running component: yaim
---------------------------------------------------------
[INFO] updated /etc/lcg-quattor-site-info.def
[WARN] configure = false => Do not execute : "/opt/lcg/yaim/scripts//configure_node /etc/lcg-quattor-site-info.def PX".
[INFO] configure on component yaim executed, 0 errors, 1 warnings
=========================================================
[WARN] 0 errors, 1 warnings executing configure
The list of the trusted hosts should be obtained from the Grid Developers. These values should be entered into the /etc/lcg-quattor-site-info.def file. They can be copied from another working myproxy machine if required.
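As an illustration only, the trusted hosts are normally expressed as a list of DNs in the site-info file. The variable name and DN below are assumptions, not authoritative, and must be replaced by the values obtained from the Grid Developers:

```
# Hypothetical example entry in /etc/lcg-quattor-site-info.def --
# the variable name and the DN are illustrative placeholders.
GRID_TRUSTED_BROKERS="/C=CH/O=CERN/OU=GRID/CN=host/rb101.cern.ch"
```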
The following steps should then be run
[root@px102 init.d]# /opt/lcg/yaim/scripts//configure_node /etc/lcg-quattor-site-info.def PX
Configuring config_upgrade ...
Configuring config_ldconf ...
/sbin/ldconfig: /opt/glite/externals/lib/libswigpy.so.0 is not a symbolic link
/sbin/ldconfig: /opt/glite/externals/lib/libswigpl.so.0 is not a symbolic link
/sbin/ldconfig: /opt/glite/externals/lib/libswigtcl8.so.0 is not a symbolic link
Configuring config_sysconfig_edg ...
Configuring config_sysconfig_globus ...
Configuring config_sysconfig_lcg ...
Configuring config_crl ...
Configuring config_rfio ...
rfiod already stopped: [FAILED]
Configuring config_host_certs ...
Configuring config_edgusers ...
Configuring config_java ...
Configuring config_gip ...
Setting up an R-GMA Gin...
- Configuring a gip information provider
- Not configuring an fmon information provider
- Not configuring a glite-ce information provider
Wrote configuration to: /opt/glite/etc/rgma-gin/gin.conf
All done
Stopping rgma-gin: [ OK ]
Starting rgma-gin: Too many logins for 'rgma'.
[FAILED]
For more details check /var/log/glite/rgma-gin.log
Configuring config_globus ...
creating globus-sh-tools-vars.sh
creating globus-script-initializer
creating Globus::Core::Paths
checking globus-hostname
Done
Creating...
/opt/globus/etc/grid-info.conf
Done
Creating...
/opt/globus/sbin/SXXgris
/opt/globus/libexec/grid-info-script-initializer
/opt/globus/libexec/grid-info-mds-core
/opt/globus/libexec/grid-info-common
/opt/globus/libexec/grid-info-cpu*
/opt/globus/libexec/grid-info-fs*
/opt/globus/libexec/grid-info-mem*
/opt/globus/libexec/grid-info-net*
/opt/globus/libexec/grid-info-platform*
/opt/globus/libexec/grid-info-os*
/opt/globus/etc/grid-info-resource-ldif.conf
/opt/globus/etc/grid-info-resource-register.conf
/opt/globus/etc/grid-info-resource.schema
/opt/globus/etc/grid.gridftpperf.schema
/opt/globus/etc/gridftp-resource.conf
/opt/globus/etc/gridftp-perf-info
/opt/globus/etc/grid-info-slapd.conf
/opt/globus/etc/grid-info-site-giis.conf
/opt/globus/etc/grid-info-site-policy.conf
/opt/globus/etc/grid-info-server-env.conf
/opt/globus/etc/grid-info-deployment-comments.conf
Done
Creating gatekeeper configuration file...
Done
Creating grid services directory...
Done
Creating state file directory.
Done.
Reading gatekeeper configuration file...
Determining system information...
Creating job manager configuration file...
Done
Setting up fork gram reporter in MDS
-----------------------------------------
Done
Setting up pbs gram reporter in MDS
----------------------------------------
loading cache /dev/null
checking for qstat... no
Setting up condor gram reporter in MDS
----------------------------------------
loading cache /dev/null
checking for condor_q... no
Setting up lsf gram reporter in MDS
----------------------------------------
loading cache /dev/null
checking for lsload... no
loading cache ./config.cache
checking for mpirun... /usr/bin/mpirun
updating cache ./config.cache
creating ./config.status
creating fork.pm
loading cache /dev/null
checking for mpirun... /usr/bin/mpirun
checking for qdel... no
loading cache /dev/null
checking for condor_submit... no
loading cache /dev/null
loading cache ./config.cache
creating ./config.status
creating grid-cert-request-config
creating grid-security-config
Stopping Globus MDS [FAILED]
Starting Globus MDS (gcc32dbgpthr) [ OK ]
Configuring config_proxy_server ...
MyProxy not running
Starting up MyProxy
Configuration Complete
Setting up replication
The HA MyProxy configuration will:
- Use myproxy-replicate to copy the read-only keys to the slave machine
- Use rsync to sync the /var/proxy directories
This requires an entry in root's authorized_keys to permit ssh from the master to the slave.
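As an illustration, the two replication steps could be driven from root's crontab on the master. Everything below is an assumption: the slave host name, the schedule, and the paths are placeholders, and the exact myproxy-replicate invocation depends on the installed version:

```
# Hypothetical root crontab on the master. px102, the times and the paths
# are placeholders; check the myproxy-replicate options of your install.
0 * * * *  myproxy-replicate
5 * * * *  rsync -a -e ssh /var/proxy/ px102:/var/proxy/
```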
In a Quattorised environment, the root authorized_keys are managed by Quattor and will be overwritten if modified. Therefore, a user key needs to be set up so that this operation will work.
On the master and slave machines, generate a key.
$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
fc:30:3a:01:a3:23:ac:f1:58:80:40:bf:99:13:ce:8b root@px101.cern.ch
Take the .pub files from /root/.ssh on both machines and copy them into /afs/cern.ch/project/core/conf/ssh. The name of the shared IP address should be used for the key file (e.g. prod-px_root.key).
In CDB, the access needs to be configured. This is currently put in the machine profile. For example,
#
# Access from other grid servers to sync proxies
#
"/software/components/access_control/privileges/acl_root/role/gridpx_root/0/targets"=list("+node::px101","+node::px103");
"/software/components/access_control/roles/gridpx_root/0"="px101_root";
"/software/components/access_control/roles/gridpx_root/1"="px103_root";
The NCM component should then be re-run to load the access control into /root/.ssh/authorized_keys.
# ccm-fetch
# ncm-ncd --co access_control
Check for the hosts in the authorized keys and then try an ssh as root from one machine to the other to test.
This operation needs to be repeated if any of the hosts are re-installed.
CDB Configuration
The following files were defined
- pro_monitoring_cos_gridpx.tpl
The following rpms were defined
- lemon-sensor-grid-px (elfms/lemon/sensors/sensor-px)
- heartbeat (from linux HA web site)
- CERN-CC-gridpx (fio/fabric/gridpx)
The user definitions for
were set to be allowed to login so that the 'su -' operations in the grid software would work.
--
TimBell - 23 Sep 2005