---+ !MyProxy

This page is a work in progress during installation and will be tidied up on completion.

---++ Architecture

The High Availability configuration implemented for the CERN !MyProxy service is described at PxWlcgHa.

<img src="%ATTACHURLPATH%/PX-Production.gif" alt="PX-Production.gif" />

---++ Installation

The high-level installation process for LCG 2.6 is described [[http://grid-deployment.web.cern.ch/grid-deployment/documentation/LCG2-Manual-Install/][here]].

---+++ Get a Host Certificate

   * Get a host certificate from the LCG web page http://service-grid-ca.web.cern.ch/service-grid-ca/help/server_req.html

<verbatim>
~gridca/scripts/cert-req -host lxdev13.cern.ch -dir `pwd`

This command will create a request for a host or service digital
certificate from the CERN Certification Authority.

When this command completes the certificate request should be emailed
to the CERN CA for validation. Instructions for this are included in
the certificate request file.

A summary of the command options can be obtained using the command:

  cert-req -help

See also the CERN CA website:

  http://service-grid-ca.web.cern.ch/service-grid-ca/

for more details.

Do you want to continue [y]/n

Running OPENSSL to generate private key and certificate request files.
Generating a 2048 bit RSA private key
.............................+++
.......................................................................+++
writing new private key to '/afs/cern.ch/group/c3/grid/hostcert/hostkey.pem'
-----

Your certificate request is saved in:
/afs/cern.ch/group/c3/grid/hostcert/hostreq.pem

* If you own a CERN CA personal certificate you can sign this
* request with your certificate and send it for approval.
* Doing this will assist in the processing of your request.
* Use the following command to sign and send your request
* substituting the correct paths for your certificate and
* private key files if the .globus locations given below are
* not correct. You will be prompted for the passphrase of
* your private key:

openssl smime -sign -to service-grid-ca@cern.ch -subject "Certificate Request" -signer ~/.globus/usercert.pem -inkey ~/.globus/userkey.pem < /afs/cern.ch/group/c3/grid/hostcert/hostreq.pem | /usr/sbin/sendmail -t

* If you do not own a CERN CA personal certificate use
* the following command to send this request for approval:

mail service-grid-ca@cern.ch -s "Certificate Request" < /afs/cern.ch/group/c3/grid/hostcert/hostreq.pem
</verbatim>

---+++ Mail the certificate request to the CA

Assuming you have a personal certificate yourself:

<verbatim>
openssl smime -sign -to service-grid-ca@cern.ch -subject "Certificate Request" -signer ~/.globus/usercert.pem -inkey ~/.globus/userkey.pem < /afs/cern.ch/group/c3/grid/hostcert/hostreq.pem | /usr/sbin/sendmail -t
</verbatim>

---+++ Copy certificates to machine

On completion of the certificate request, an e-mail will be sent to you containing the host certificate. The hostkey.pem and the hostcert.pem files should be copied to the /etc/grid-security directory.

The hostkey.pem file comes from the original directory where you made the certificate request:

<verbatim>
$ cp hostkey.pem /etc/grid-security/hostkey.pem
$ chmod 400 /etc/grid-security/hostkey.pem
</verbatim>

The hostcert.pem file comes from the e-mail:

<verbatim>
$ cp host_lxdev13.cern.ch.cert /etc/grid-security/hostcert.pem
$ chmod 644 /etc/grid-security/hostcert.pem
</verbatim>
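Before continuing, it is worth verifying that the installed certificate is readable and matches the host. A minimal check, assuming =openssl= is available on the node:

<verbatim>
$ openssl x509 -in /etc/grid-security/hostcert.pem -noout -subject -dates
</verbatim>

The subject should be of the form /C=CH/O=CERN/OU=GRID/CN=host/<i>hostname</i> and the notBefore/notAfter dates should cover the intended service period.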
---++ Configuration

The CDB profile is set up as follows. Currently, this is linked to the machine profile but a new pro_type should be created before production.

<verbatim>
include pro_type_gridpx_slc3;
include pro_linuxha_prodpx;
</verbatim>

---+++ Internal HA Network

There is a private ethernet connection between the two machines in the HA cluster. This is set up in the =netinfo.=<i>hostname</i> template. The following lines are an example:

<verbatim>
"/system/network/interfaces/eth0/ip" = "128.142.160.73";
"/system/network/interfaces/eth0/gateway" = "128.142.1.1";
"/system/network/interfaces/eth0/netmask" = "255.255.0.0";
"/system/network/interfaces/eth1/ip" = "192.168.1.101";
"/system/network/interfaces/eth1/gateway" = "128.142.1.1";
"/system/network/interfaces/eth1/netmask" = "255.255.255.0";
</verbatim>

If the machine is already installed, the script in /etc/sysconfig/network-scripts/ifcfg-eth1 can be updated as follows:

<verbatim>
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.1.101
NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
</verbatim>

This should be followed by a network restart:

<verbatim>
service network restart
Shutting down interface eth0:  [  OK  ]
Shutting down interface eth1:  [  OK  ]
Shutting down loopback interface:  [  OK  ]
Setting network parameters:  [  OK  ]
Bringing up loopback interface:  [  OK  ]
Bringing up interface eth0:  [  OK  ]
Bringing up interface eth1:  [  OK  ]
</verbatim>

---++ Setting up the shared host certificate

The Linux HA configuration has an additional host associated with the service, that of the shared IP address. The shared address also needs a certificate in the same way as the host. This ensures that when a client asks for the myproxy service, it really gets the server it asked for and not a machine pretending to be the server.

This certificate is only used by the myproxy server and is set up within the rc.ha-gridpx script so that it is the default certificate for the service. If you get the message below, this is because the myproxy server has been started without using this shared certificate.

<verbatim>
Server authorization failed. Server identity
(/C=CH/O=CERN/OU=GRID/CN=host/px101.cern.ch) does not match expected
identities myproxy@prod-px.cern.ch or host@prod-px.cern.ch.
If the server identity is acceptable, set
MYPROXY_SERVER_DN="/C=CH/O=CERN/OU=GRID/CN=host/px101.cern.ch"
and try again.
</verbatim>

---++ Unit Testing

---+++ Create a proxy

Logging into lxplus:

<verbatim>
$ . /afs/cern.ch/project/gd/LCG-share/sl3/etc/profile.d/grid_env.sh
$ export MYPROXY_SERVER=prod-px
$ grid-proxy-init
Your identity: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
Enter GRID pass phrase for this identity:
Creating proxy ............................................. Done
Your proxy is valid until: Sat Jan 28 23:22:06 2006
$ myproxy-init
Your identity: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
Enter GRID pass phrase for this identity:
Creating proxy ................................................................................................. Done
Proxy Verify OK
Your proxy is valid until: Wed Oct 5 09:46:09 2005
Enter MyProxy pass phrase:
Verifying password - Enter MyProxy pass phrase:
A proxy valid for 168 hours (7.0 days) for user timbell now exists on lxdev13.
</verbatim>
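Once stored, the credential can be retrieved by a service, or by hand for testing. A hedged sketch using =myproxy-get-delegation= (the =-l= username option is assumed; you will be prompted for the MyProxy pass phrase chosen above):

<verbatim>
$ export MYPROXY_SERVER=prod-px
$ myproxy-get-delegation -l timbell
</verbatim>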
Check the proxy has been stored by logging into the prod-px machine and listing the repository:

<verbatim>
# ls -ltr /var/myproxy
total 8
-rw-------    1 root     root           96 Sep 28 09:46 timbell.data
-rw-------    1 root     root         3332 Sep 28 09:46 timbell.creds
</verbatim>

Check that the proxy information can be retrieved:

<verbatim>
$ myproxy-info
username: timbell
owner: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
timeleft: 167:55:41  (7.0 days)
</verbatim>

---+++ Checking Lemon Sensor

The basic Lemon sensor can be checked as follows:

<verbatim>
/usr/bin/perl -I/opt/edg/lib/perl -MGridPx /usr/libexec/sensors/edg-fmon-sensor.pl
INI 1 GridPx::PxStatus
GET 1
PUT 01 1 1138200170 0.234219
INI 2 GridPx::PxLoad
GET 2
PUT 01 2 1138200182 7
</verbatim>

Testing the full Lemon definition can be done by asking Lemon for the value of the sensor:

<verbatim>
# lemon-cli -m 804
Retrieving samples from spool directory '/var/spool/edg-fmon-agent'
Nodes: px101
Metrics: 804
Start: (null) - (-1)
End: (null) - (-1)
local: px101 804 Wed Jan 25 22:19:52 2006 4
Total: 1 result
</verbatim>

---+++ Test proxy

---++ Problems

---+++ Server Setup

---++++ No host certificate

The host certificate needs to be set up in =/etc/grid-security/=:

<verbatim>
Sep 28 08:57:22 lxdev13 myproxy-server: <1667> Problem with server credentials.
GSS Major Status: General failure
GSS Minor Status Error Chain:
acquire_cred.c:125: gss_acquire_cred: Error with GSI credential
globus_i_gsi_gss_utils.c:1310: globus_i_gsi_gss_cred_read: Error with gss credential handle
globus_gsi_credential.c:721: globus_gsi_cred_read: Valid credentials could not be found in any of the possible locations specified by the credential search order.
globus_gsi_credential.c:447: globus_gsi_cred_read: Error reading host credential
globus_gsi_system_config.c:4118: globus_gsi_sysconfig_get_host_cert_filename_unix: Could not find a valid certificate file: The host cert could not be found in:
1) env. var. X509_USER_CERT=NULL
2) /etc/grid-security/hostcert.pem
3) /opt/globus/etc/hostcert.pem
4) /root/.globus/hostcert.pem
The host key could not be found in:
1) env. var. X509_USER_KEY=NULL
2) /etc/grid-security/hostkey.pem
3) /opt/globus/etc/hostkey.pem
4) /root/.globus/hostkey.pem
globus_gsi_credential.c:2
</verbatim>

---+++ User Problems

---++++ Expired Certificate

<verbatim>
$ myproxy-info
Error authenticating: GSS Major Status: General failure
GSS Minor Status Error Chain:
acquire_cred.c:125: gss_acquire_cred: Error with GSI credential
globus_i_gsi_gss_utils.c:1310: globus_i_gsi_gss_cred_read: Error with gss credential handle
globus_gsi_credential.c:306: globus_gsi_cred_read: Error with credential: The proxy credential: /tmp/x509up_u19580 with subject: /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176/CN=proxy expired 0 minutes ago.
</verbatim>

because

<verbatim>
$ grid-proxy-info
subject  : /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176/CN=proxy
issuer   : /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
identity : /C=CH/O=CERN/OU=GRID/CN=Tim Bell 6176
type     : full legacy globus proxy
strength : 512 bits
path     : /tmp/x509up_u19580
timeleft : 0:00:00
</verbatim>
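The remedy is simply to create a fresh proxy, repeating the unit test commands above, and then to re-register it with the server if the stored copy is also close to expiry:

<verbatim>
$ grid-proxy-init
$ myproxy-init
</verbatim>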
---++++ No user certificate

<verbatim>
$ myproxy-info
Error authenticating: GSS Major Status: General failure
GSS Minor Status Error Chain:
acquire_cred.c:125: gss_acquire_cred: Error with GSI credential
globus_i_gsi_gss_utils.c:1310: globus_i_gsi_gss_cred_read: Error with gss credential handle
globus_gsi_credential.c:721: globus_gsi_cred_read: Valid credentials could not be found in any of the possible locations specified by the credential search order.
globus_gsi_credential.c:447: globus_gsi_cred_read: Error reading host credential
globus_gsi_system_config.c:3977: globus_gsi_sysconfig_get_host_cert_filename_unix: Error with certificate filename
globus_gsi_system_config.c:380: globus_i_gsi_sysconfig_create_cert_string: Error with certificate filename: /etc/grid-security/hostcert.pem not owned by current user.
globus_gsi_credential.c:239: globus_gsi_cred_read: Error reading proxy credential
globus_gsi_system_config.c:4660: globus_gsi_sysconfig_get_proxy_filename_unix: Could not find a valid proxy certificate file location
globus_gsi_system_config.c:4657: globus_gsi_sysconfig_get_
</verbatim>

Check if the user certificate is valid:

<verbatim>
$ grid-proxy-info
ERROR: Couldn't find a valid proxy.
Use -debug for further information.
</verbatim>

If this error is received, run =grid-proxy-init=.

---++ Account Setup

The gridpx cluster has a set of users dedicated to it. Follow the SINDES procedure at [[https://twiki.cern.ch/twiki/bin/view/ELFms/PasswordHeader][PasswordHeader]] to define the local users.

---++ RPMs

The RPMs from the GD group are stored in =/afs/cern.ch/project/gd/RpmDir_i386-sl3/external=. The initial RPM used is =myproxy-VDT1.2.0rh9_LCG-2.i386.rpm=.

---++ Grid Configuration

The grid configuration steps are manual.

<verbatim>
# ncm-ncd --co yaim
[INFO] NCM-NCD version 1.2.3 started by root at: Fri Feb 3 16:24:12 2006
[INFO] executing configure on components....
[INFO] running component: yaim
---------------------------------------------------------
[INFO] updated /etc/lcg-quattor-site-info.def
[WARN] configure = false => Do not execute : "/opt/lcg/yaim/scripts//configure_node /etc/lcg-quattor-site-info.def PX".
[INFO] configure on component yaim executed, 0 errors, 1 warnings
=========================================================
[WARN] 0 errors, 1 warnings executing configure
</verbatim>

The list of trusted hosts should be obtained from the Grid Developers. These values should be entered into the =/etc/lcg-quattor-site-info.def= file. They can be copied from another working myproxy machine if required.
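For illustration only, the trusted hosts are entered as distinguished names in the site configuration file. The sketch below assumes the yaim variable =GRID_TRUSTED_BROKERS= is the relevant one for PX nodes and uses a placeholder DN, not a real broker:

<verbatim>
GRID_TRUSTED_BROKERS="/C=CH/O=CERN/OU=GRID/CN=host/rb-example.cern.ch"
</verbatim>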
The following steps should then be run:

<verbatim>
[root@px102 init.d]# /opt/lcg/yaim/scripts//configure_node /etc/lcg-quattor-site-info.def PX
Configuring config_upgrade ...
Configuring config_ldconf ...
/sbin/ldconfig: /opt/glite/externals/lib/libswigpy.so.0 is not a symbolic link
/sbin/ldconfig: /opt/glite/externals/lib/libswigpl.so.0 is not a symbolic link
/sbin/ldconfig: /opt/glite/externals/lib/libswigtcl8.so.0 is not a symbolic link
Configuring config_sysconfig_edg ...
Configuring config_sysconfig_globus ...
Configuring config_sysconfig_lcg ...
Configuring config_crl ...
Configuring config_rfio ...
rfiod already stopped:  [FAILED]
Configuring config_host_certs ...
Configuring config_edgusers ...
Configuring config_java ...
Configuring config_gip ...
Setting up an R-GMA Gin...
 - Configuring a gip information provider
 - Not configuring an fmon information provider
 - Not configuring a glite-ce information provider
Wrote configuration to: /opt/glite/etc/rgma-gin/gin.conf
All done
Stopping rgma-gin:  [  OK  ]
Starting rgma-gin: Too many logins for 'rgma'.  [FAILED]
For more details check /var/log/glite/rgma-gin.log
Configuring config_globus ...
creating globus-sh-tools-vars.sh
creating globus-script-initializer
creating Globus::Core::Paths
checking globus-hostname
Done
Creating... /opt/globus/etc/grid-info.conf
Done
Creating...
/opt/globus/sbin/SXXgris
/opt/globus/libexec/grid-info-script-initializer
/opt/globus/libexec/grid-info-mds-core
/opt/globus/libexec/grid-info-common
/opt/globus/libexec/grid-info-cpu*
/opt/globus/libexec/grid-info-fs*
/opt/globus/libexec/grid-info-mem*
/opt/globus/libexec/grid-info-net*
/opt/globus/libexec/grid-info-platform*
/opt/globus/libexec/grid-info-os*
/opt/globus/etc/grid-info-resource-ldif.conf
/opt/globus/etc/grid-info-resource-register.conf
/opt/globus/etc/grid-info-resource.schema
/opt/globus/etc/grid.gridftpperf.schema
/opt/globus/etc/gridftp-resource.conf
/opt/globus/etc/gridftp-perf-info
/opt/globus/etc/grid-info-slapd.conf
/opt/globus/etc/grid-info-site-giis.conf
/opt/globus/etc/grid-info-site-policy.conf
/opt/globus/etc/grid-info-server-env.conf
/opt/globus/etc/grid-info-deployment-comments.conf
Done
Creating gatekeeper configuration file...
Done
Creating grid services directory...
Done
Creating state file directory.
Done.
Reading gatekeeper configuration file...
Determining system information...
Creating job manager configuration file...
Done
Setting up fork gram reporter in MDS
-----------------------------------------
Done
Setting up pbs gram reporter in MDS
----------------------------------------
loading cache /dev/null
checking for qstat... no
Setting up condor gram reporter in MDS
----------------------------------------
loading cache /dev/null
checking for condor_q... no
Setting up lsf gram reporter in MDS
----------------------------------------
loading cache /dev/null
checking for lsload... no
loading cache ./config.cache
checking for mpirun... /usr/bin/mpirun
updating cache ./config.cache
creating ./config.status
creating fork.pm
loading cache /dev/null
checking for mpirun... /usr/bin/mpirun
checking for qdel... no
loading cache /dev/null
checking for condor_submit... no
loading cache /dev/null
loading cache ./config.cache
creating ./config.status
creating grid-cert-request-config
creating grid-security-config
Stopping Globus MDS  [FAILED]
Starting Globus MDS (gcc32dbgpthr)  [  OK  ]
Configuring config_proxy_server ...
MyProxy not running
Starting up MyProxy
Configuration Complete
</verbatim>

---++ Setting up replication

The HA !MyProxy configuration will:

   * Use =myproxy-replicate= to copy the read-only keys to the slave machine
   * Use rsync to sync the =/var/proxy= directories

This requires a setup in the root =authorized_keys= to permit the ssh from the master to the slave. In a Quattorised environment, the root authorized_keys are managed by Quattor and will be overwritten if modified. Therefore, a user key needs to be set up so that this operation will work.

On the master and slave machines, generate a key:

<verbatim>
$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
fc:30:3a:01:a3:23:ac:f1:58:80:40:bf:99:13:ce:8b root@px101.cern.ch
</verbatim>

Take the .pub files from =/root/.ssh= on both machines and copy them into =/afs/cern.ch/project/core/conf/ssh=. The name of the shared IP address should be used for the key file (e.g. prod-px_root.key).
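For reference, the sync operation that this key enables has roughly the following shape. This is a sketch only: it assumes the access control described next has been loaded into =authorized_keys=, and that the repository path matches the installation (the unit test above used =/var/myproxy=):

<verbatim>
# rsync -a -e ssh /var/myproxy/ px103:/var/myproxy/
</verbatim>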
In CDB, the access needs to be configured. This is currently put in the machine profile. For example:

<verbatim>
#
# Access from other grid servers to sync proxies
#
"/software/components/access_control/privileges/acl_root/role/gridpx_root/0/targets"=list("+node::px101","+node::px103");
"/software/components/access_control/roles/gridpx_root/0"="px101_root";
"/software/components/access_control/roles/gridpx_root/1"="px103_root";
</verbatim>

The NCM component should then be re-run to load the access control into =/root/.ssh/authorized_keys=:

<verbatim>
# ccm-fetch
# ncm-ncd --co access_control
</verbatim>

Check for the hosts in the authorized keys and then try an ssh as root from one machine to the other to test. This operation needs to be repeated if any of the hosts are re-installed.

---++ CDB Configuration

The following files were defined:

   * pro_monitoring_cos_gridpx.tpl

The following rpms were defined:

   * lemon-sensor-grid-px (elfms/lemon/sensors/sensor-px)
   * heartbeat (from the Linux HA web site)
   * CERN-CC-gridpx (fio/fabric/gridpx)

The user definitions for

   * rgma
   * edguser

were set to be allowed to login so that the 'su -' operations in the grid software would work.

-- Main.TimBell - 23 Sep 2005