YAIM 4.0.0 Certification (third phase)
Developers testing
No testing is done by the developers at this third stage.
Fresh install testing
rpm list
The list of rpms can be found in etics in /afs/cern.ch/project/gd/www/yaim:
- glite-yaim-core-4.0.0-10.noarch.rpm
- glite-yaim-core-4.0.0-11.noarch.rpm: The core rpm had to be regenerated because yaim lfc now contains both the 3.0 and 3.1 configuration, and the configure_node command was only taking UI and WN into account.
- glite-yaim-clients-4.0.0-3.noarch.rpm
- glite-yaim-clients-4.0.0-4.noarch.rpm: The clients rpm was regenerated to configure a single sgm account for vobox and to include the _30 functions in its function list (this was detected in GGUS ticket 26303).
- glite-yaim-clients-4.0.0-5.noarch.rpm: The clients rpm was regenerated because part of a fix was introduced in yaim core by mistake and it required matching changes in clients, so we decided to include them. This modification affects only config_wn and fixes bugs #28082 and #27896.
- glite-yaim-dcache-4.0.0-2.noarch.rpm
- glite-yaim-dpm-4.0.0-2.noarch.rpm
- glite-yaim-fts-4.0.0-5.noarch.rpm
- glite-yaim-lfc-4.0.0-4.noarch.rpm
- glite-yaim-myproxy-4.0.0-2.noarch.rpm
The main differences from the rpms of the second phase of certification are:
- config_bdii: the detection of the openldap version was removed, but it is still required. Add the following line before the line "version=openldap-2.1":
result=`ls -d /usr/share/doc/openldap-servers-* | cut -d"-" -f3`
- Changes for the LFC node-info.d, since the 3.1 configuration is now in the node.
- Remove "+" from line 5 of function config_seclassic.
- Add the line "mkdir -p /var/log/glite/rgma-server" before the line "chown tomcat4:tomcat4 /var/log/glite/rgma-server/" in config_rgma_server.
- Remove config_bdii_only from the function list of the CE.
- In config_globus, put back "GLOBUS_MDS=yes" for the lcg CE.
- clients contains a fix for TAR WN in gLite 3.0.
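The openldap detection in the config_bdii item above works by cutting the version out of the documentation directory name. A minimal sketch of how that pipeline behaves, assuming a directory named like openldap-servers-2.2.13 (the version number and the helper name openldap_version_from_path are illustrative and not taken from YAIM):

```shell
#!/bin/sh
# Hypothetical helper: extract the version from a path such as
# /usr/share/doc/openldap-servers-2.2.13 (the example version is an assumption).
openldap_version_from_path() {
    # Field 3 of the "-"-separated path is the version string,
    # the same cut used in the line added to config_bdii.
    echo "$1" | cut -d"-" -f3
}

# On a real node the argument would come from:
#   ls -d /usr/share/doc/openldap-servers-*
openldap_version_from_path /usr/share/doc/openldap-servers-2.2.13
```

Note that the cut only behaves this way when no other "-" appears earlier in the path, which holds for /usr/share/doc/openldap-servers-*.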
For a list of the bugs that this version of YAIM intends to fix, please see the following Savannah patches:
CERN testing report
General remark:
- For all resources, we need to uninstall lcg-info-generic and install glite-info-generic-2.0.2-1. In addition, BDII needs to be upgraded to 3.9.1-3.
3.0 UI
- Calling config_globus prints the following errors:
/opt/globus/sbin/globus-initialization.sh: line 61: ./setup-globus-mds-gris: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 62: ./setup-globus-gatekeeper: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 63: ./setup-globus-gram-job-manager: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 64: ./setup-globus-gram-reporter-fork: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 65: ./setup-globus-gram-reporter-pbs: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 66: ./setup-globus-gram-reporter-condor: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 67: ./setup-globus-gram-reporter-lsf: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 68: ./setup-globus-job-manager-fork: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 69: ./setup-globus-job-manager-pbs: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 70: ./setup-globus-job-manager-condor: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 71: ./setup-globus-job-manager-lsf: No such file or directory
These errors are harmless, but annoying.
- Calling config_glite_ui_30 prints the following warning:
setup-ssl-utils: Complete
..Done
WARNING: The following packages were not set up correctly:
globus_trusted_ca_42864e48_setup-noflavor-pgm
Check the package documentation or run postinstall -verbose to see what happened
This can be ignored; it is listed here for completeness.
3.0 WN
- Calling config_globus prints the following output:
/opt/globus/sbin/globus-initialization.sh: line 65: ./setup-globus-gram-reporter-pbs: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 66: ./setup-globus-gram-reporter-condor: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 67: ./setup-globus-gram-reporter-lsf: No such file or directory
loading cache ./config.cache
checking for mpirun... /usr/bin/mpirun
updating cache ./config.cache
creating ./config.status
creating fork.pm
/opt/globus/sbin/globus-initialization.sh: line 69: ./setup-globus-job-manager-pbs: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 70: ./setup-globus-job-manager-condor: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 71: ./setup-globus-job-manager-lsf: No such file or directory
This output can be ignored.
3.0 LCG CE
- When calling config_fmon_client, it complains:
INFO: Executing function: config_fmon_client
JM.conf file parsing: [FAILED]
Batch system log dir not defined in JM.conf!
JM.conf file parsing: [FAILED]
Batch system log dir not defined in JM.conf!
WMS+LB
The test is based on the current checkpoint release, 1251.
- node-info should be updated to use config_gip instead of config_gip_services; config_gip should be updated for the WMS port and to add the LB node type.
VOBOX
- Calling config_vobox prints the following output:
ls: /opt/vobox/dteam/proxy_repository: Permission denied
Starting ProxyRenewal Daemon: vobox-renewd [ OK ]
Process does not exist ... [FAILED]
ls: /opt/vobox/dteam/proxy_repository: Permission denied
Starting ProxyRenewal Daemon: vobox-renewd [ OK ]
Process does not exist ... [FAILED]
ls: /opt/vobox/dteam/proxy_repository: Permission denied
Starting ProxyRenewal Daemon: vobox-renewd [ OK ]
Process does not exist ... [FAILED]
ls: /opt/vobox/dteam/proxy_repository: Permission denied
Starting ProxyRenewal Daemon: vobox-renewd [ OK ]
.......
It also tried to configure VOBOX for all sgm pool accounts, and it keeps printing the above error message on the console after configuration. Note: in the grid-mapfile all sgm DNs are mapped to sgm01, so we don't need daemons or cron jobs for all sgm pool accounts.
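The claim that all sgm DNs share one account can be checked directly against a grid-mapfile. The following is a minimal sketch, assuming each line has the form "DN" account; the helper name distinct_sgm_accounts, the DNs and the account name dteamsgm01 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct sgm accounts used in a grid-mapfile.
# Assumes each line has the form: "<certificate DN>" <local account>
distinct_sgm_accounts() {
    # The local account is the last whitespace-separated field,
    # even when the quoted DN itself contains spaces.
    awk '{ print $NF }' "$1" | grep 'sgm' | sort -u
}

# Demo on a throwaway file; the DNs and account names are made up.
demo_mapfile=$(mktemp)
cat > "$demo_mapfile" <<'EOF'
"/C=XX/O=Example/CN=Alice Admin" dteamsgm01
"/C=XX/O=Example/CN=Bob Admin" dteamsgm01
"/C=XX/O=Example/CN=Carol User" .dteam
EOF
distinct_sgm_accounts "$demo_mapfile"
rm -f "$demo_mapfile"
```

A single line of output per VO means that every sgm DN of that VO shares one account, which is the situation described above.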
Upgrade testing
In order to test YAIM 4.0.0, you have to install patch 1238 for 3.0 or patch 1239 for 3.1. Since YAIM 4.0.0 depends on a new version of the GIP, you also need to apply patch 1363. This will install a new rpm called glite-info-generic.
3.0 APT repository
In order to test the upgrade to the new YAIM 4.0.0 in gLite 3.0 node types, please apply patch 1238 and patch 1363 by editing your apt string in the following way:
rpm http://lxb2042.cern.ch/gLite/APT/R3.0-cert rhel30 externals Release3.0 updates updates.certified internal patch1238.uncertified patch1363.uncertified
3.1 YUM repository
In order to test the upgrade to the new YAIM 4.0.0 in gLite 3.1 node types (only WN and UI), please apply patch 1239 by editing your yum string in the following way:
[patch 1239]
name=gLite 3.1 patch 1239
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/integration/cert/3.1/patches/1239/sl4/i386/
enabled=1
External Sites testing reports
INFN testing report
By Elisabetta Molinari
See comments in task #5523.
TCD testing report
By John Walsh
http://www.cs.tcd.ie/~walshj1/EGEE-SA3/Certification/patch-testing/patch1238/
GRNET testing report
By Dimitrios Apostolou
See comments in patch #1238.
CESGA testing report
By Esteban Freire
3.0 LCG CE
- Installation from scratch on SL 3
- We need to add SITE_BDII_HOST=sa3-ce.$MY_DOMAIN into site-info.def file
- Package glite-info-generic-2.0.2-1.noarch.rpm installed by hand
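Since SITE_BDII_HOST had to be added to site-info.def by hand, it can help to check for it before running yaim. A minimal sketch, assuming the variable is assigned on its own line in the usual shell style; the helper name has_site_bdii_host is hypothetical and not part of YAIM:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given site-info.def assigns SITE_BDII_HOST.
has_site_bdii_host() {
    # Match an assignment at the start of a line, allowing leading whitespace.
    # Commented-out lines (starting with "#") do not match.
    grep -q '^[[:space:]]*SITE_BDII_HOST=' "$1"
}

# Example usage (the file name is whatever is passed to yaim with -s):
#   has_site_bdii_host site-info.def || echo "SITE_BDII_HOST is missing"
```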
We got these errors:
INFO: Now creating the grid-mapfile - this may take a few minutes...
voms search(https://voms.cern.ch:8443/voms/ops/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fops%2FRole%3Dproduction): Internal Server Error
voms search(https://test01.egee.cesga.es:8443/voms/cesga/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fcesga%2FRole%3Dlcgadmin): Internal Server Error
voms search(https://lxb1928.cern.ch:8443/voms/dteam/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fdteam%2FRole%3Dlcgadmin): Connect failed: connect: Connection timed out; Connection timed out
voms search(https://lxb1928.cern.ch:8443/voms/dteam/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fdteam%2FRole%3Dproduction): Connect failed: connect: Connection timed out; Connection timed out
voms search(https://lxb1928.cern.ch:8443/voms/dteam/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fdteam): Connect failed: connect: Connection timed out; Connection timed out
voms search(https://voms.cnaf.infn.it:8443/voms/compchem/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fcompchem%2FRole%3Dlcgadmin): Internal Server Error
voms search(https://voms.cnaf.infn.it:8443/voms/compchem/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fcompchem%2FRole%3Dproduction): Internal Server Error
voms search(https://swevo.ific.uv.es:8443/voms/fusion/services/VOMSCompatibility?method=getGridmapUsers&container=%2Ffusion%2FRole%3Dlcgadmin): Internal Server Error
INFO: Configuration Complete.
- Globus-gatekeeper is up and running after yaim configuration
- All services are up and running on lcg-CE
- job submission is working fine
3.0 WN
- Installation from scratch on SL 3
- We need to add SITE_BDII_HOST=sa3-ce.$MY_DOMAIN into site-info.def file
We got these errors:
[ .... ]
Warning: Host cert file: /etc/grid-security/hostcert.pem not found. Re-run setup-globus-gram-job-manager after installing host cert file.
Determining system information...
Creating job manager configuration file...
Done
Setting up fork gram reporter in MDS
-----------------------------------------
Done
/opt/globus/sbin/globus-initialization.sh: line 65: ./setup-globus-gram-reporter-pbs: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 66: ./setup-globus-gram-reporter-condor: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 67: ./setup-globus-gram-reporter-lsf: No such file or directory
loading cache ./config.cache
checking for mpirun... (cached) /usr/bin/mpirun
creating ./config.status
creating fork.pm
/opt/globus/sbin/globus-initialization.sh: line 69: ./setup-globus-job-manager-pbs: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 70: ./setup-globus-job-manager-condor: No such file or directory
/opt/globus/sbin/globus-initialization.sh: line 71: ./setup-globus-job-manager-lsf: No such file or directory
loading cache ./config.cache
creating ./config.status
creating grid-cert-request-config
creating grid-security-config
INFO: Executing function: config_lcgenv_30
INFO: Executing function: config_replica_manager
----------------------------------------------------------------------
CONFIGURATION SUCCESSFUL!
----------------------------------------------------------------------
- globus-url-copy to sa3-ce.egee.cesga.es works
- All services are up and running on WN
SE_dpm_mysql
- Upgrading from yaim 3.1 through the apt repository, with the patch repository added, works fine.
- We use the following repository:
rpm http://lxb2042.cern.ch/gLite/APT/R3.0-cert rhel30 externals Release3.0 updates updates.certified internal patch1238.uncertified
We got these errors:
[root@sa3-se etc]# /opt/glite/yaim/bin/yaim -c -s site-info.def -n glite-SE_dpm_mysql
voms search(https://voms.cern.ch:8443/voms/ops/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fops%2FRole%3Dproduction): Internal Server Error
voms search(https://test01.egee.cesga.es:8443/voms/cesga/services/VOMSCompatibility?method=getGridmapUsers&container=%2Fcesga%2FRole%3Dlcgadmin): Internal Server Error
....
The same error occurs with all VOMS servers.
[ .... ]
chown: failed to get attributes of `/opt/glite/var/lock/gip': No such file or directory
chown: failed to get attributes of `/opt/glite/var/cache/gip': No such file or directory
chown: failed to get attributes of `/opt/glite/etc/gip/ldif': No such file or directory
chmod: failed to get attributes of `/opt/glite/var/lock/gip': No such file or directory
chmod: failed to get attributes of `/opt/glite/var/cache/gip': No such file or directory
chmod: failed to get attributes of `/opt/glite/etc/gip/ldif': No such file or directory
/opt/glite/yaim/bin/../libexec/configure_node: line 721: /opt/glite/etc/gip/ldif/static-file-dSE.ldif: No such file or directory
/opt/glite/yaim/bin/../libexec/configure_node: line 730: /opt/glite/etc/gip/plugin/glite-info-dynamic-se: No such file or directory
chmod: failed to get attributes of `/opt/glite/etc/gip/plugin/glite-info-dynamic-se': No such file or directory
/opt/glite/yaim/bin/../libexec/configure_node: line 847: /opt/glite/etc/gip/ldif/static-file-SE.ldif: No such file or directory
/opt/glite/yaim/bin/../libexec/configure_node: line 961: /opt/glite/etc/gip/glite-info-generic.conf.tmp: No such file or directory
[ ... ]
- globus-url-copy to sa3-ce.egee.cesga.es works
- All services are up and running on SE_dpm_mysql
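The chown and chmod errors in the SE_dpm_mysql log above all point at GIP directories that did not exist when configure_node ran. A minimal sketch for listing which of those directories are absent on a node; the helper name missing_dirs is hypothetical, and the report does not say why the directories were missing or how they should be created:

```shell
#!/bin/sh
# Hypothetical helper: print each argument that is not an existing directory.
missing_dirs() {
    for d in "$@"; do
        [ -d "$d" ] || echo "$d"
    done
}

# Directories named in the chown/chmod errors of the log above.
missing_dirs /opt/glite/var/lock/gip \
             /opt/glite/var/cache/gip \
             /opt/glite/etc/gip/ldif
```

Empty output means that all three directories are present.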
MON
- Upgrading from yaim 3.1 through the apt repository, with the patch repository added, works fine.
- We use the following repository:
rpm http://lxb2042.cern.ch/gLite/APT/R3.0-cert rhel30 externals Release3.0 updates updates.certified internal patch1238.uncertified
- We got a dependency error when installing glite-info-generic-2.0.2-1.noarch.rpm:
[root@sa3-mon root]# rpm -vhi glite-info-generic-2.0.2-1.noarch.rpm
error: Failed dependencies:
perl(Net::LDAP::LDIF) is needed by glite-info-generic-2.0.2-1
- We add SITE_BDII_HOST in site-info.def
tomcat5 ok
gridice-mds OK
rgma-gin OK
rgma-publish-site OK
rgma-servicetool OK
[root@sa3-mon log]# rgma-server-check
*** Running R-GMA server tests on sa3-mon.egee.cesga.es ***
Checking Tomcat is running on the local machine...
Successfully connected to Tomcat.
Java VM version: 1.4.2_14 (OK)
Connecting to https://lxb2019.cern.ch:8443/R-GMA/SchemaServlet...
Successfully connected to Schema.
Using PongServlet (1) on https://lxb2019.cern.ch:8443/R-GMA/PongServlet.
Using certificate /var/lib/tomcat5/conf/hostcert.pem.
Using key /var/lib/tomcat5/conf/hostkey.pem.
Checking other servlets...
Connecting to https://sa3-mon.egee.cesga.es:8443/R-GMA/PrimaryProducerServlet:OK
Checking clock synchronization: OK
Connecting to https://sa3-mon.egee.cesga.es:8443/R-GMA/SecondaryProducerServlet:OK
Checking clock synchronization: OK
Connecting to https://sa3-mon.egee.cesga.es:8443/R-GMA/OnDemandProducerServlet:OK
Checking clock synchronization: OK
Connecting to https://sa3-mon.egee.cesga.es:8443/R-GMA/ConsumerServlet:OK
Connecting to streaming port 8088 on sa3-mon.egee.cesga.es:OK
Checking clock synchronization: OK
*** R-GMA server test successful ***
PIC testing report
The following bug has been found:
I'm experiencing some minor problems with the infosystem. I'm a little confused about which info is published by the wrapper. Here are the relevant outputs:
[root@ce03 root]# /opt/glite/libexec/glite-info-wrapper
dn: GlueCEUniqueID=ce03.pic.es:2119/jobmanager-lcgpbs-pps,mds-vo-name=local,o=grid
glueceinfototalcpus: 0
[root@ce03 root]# /opt/glite/etc/gip/plugin/glite-info-dynamic-ce
dn: GlueCEUniqueID=ce03.pic.es:2119/jobmanager-lcgpbs-pps,mds-vo-name=local,o=grid
GlueCEInfoTotalCPUs: 4
I would guess that the info from the plugin should overwrite the static info, so glite-info-wrapper should give the correct value of 4, shouldn't it? And why are the variables of the wrapper all in lower case? Could that be a problem?
The following patch has been created to fix this: patch #1363
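On the lower-case question in the report above: LDAP attribute names are case-insensitive, so glueceinfototalcpus and GlueCEInfoTotalCPUs name the same attribute, and the discrepancy that matters is the value (0 versus 4). A minimal sketch for comparing the two outputs regardless of case; the helper name ldif_value is hypothetical, and the two sample lines are copied from the report:

```shell
#!/bin/sh
# Hypothetical helper: print the value of attribute $1 (matched
# case-insensitively) from LDIF-style "name: value" lines on stdin.
ldif_value() {
    awk -F': ' -v want="$1" 'tolower($1) == tolower(want) { print $2 }'
}

# Sample lines taken from the wrapper and plugin outputs in the report.
wrapper_out='glueceinfototalcpus: 0'
plugin_out='GlueCEInfoTotalCPUs: 4'

echo "wrapper: $(echo "$wrapper_out" | ldif_value GlueCEInfoTotalCPUs)"
echo "plugin:  $(echo "$plugin_out" | ldif_value GlueCEInfoTotalCPUs)"
```

On a node, the same function could be fed the real output of glite-info-wrapper and glite-info-dynamic-ce instead of the sample strings.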
UCY testing report
By Asterios Katsifodimos
See report in http://docs.google.com/View?docid=dgskzrkt_16hkmtbk
CERN testing report
See comments in patch #1238 and patch #1239.
--
MariaALANDESPRADILLO - 24 Aug 2007