Installation of GLite 3.0.2 middleware on Scientific Linux 3.0.8 and CentOS 4.4 for the UCY certification testbed
The NFS-shared directory plethon.grid.ucy.ac.cy:/data1/others/certificationtestbed contains
config files, ready-configured VMware machines, ssh keys and other useful material.
The hardware setup
The following hardware was used:
2 x IBM xSeries 335 (2x Intel Xeon 2.8GHz CPUs, 3GB RAM, 40GB HDD)
1 x Dell PowerEdge (2x Intel P3 800MHz CPUs, 768MB RAM, 2x35GB 1x70GB SCSI HDD)
1 x Dell Precision 330 (1x Intel P4 1.5GHz CPU, 512MB RAM, 40GB HDD)
1 x Generic PC (1x Intel P4 1.7GHz CPU, 256MB RAM, 40GB HDD)
1 x Generic PC (1x Intel P3 800MHz CPU, 384MB RAM, 40GB HDD)
Hardware setup:
Network interface: eth0
IP address prefix: 194.42.27
DNS name suffix: grid.ucy.ac.cy
All machines use static IP addresses. The IP values listed below are the last octet of each address.
| sysId | Name | IP | MAC | Mem | HDD |
| odin (IBM) | ce201 | 239 | | 768MB | 7GB |
| odin (IBM) | wn202 | 228 | | 768MB | 7GB |
| odin (IBM) | wn201 | 231 | | 768MB | 5GB |
| tor (IBM) | wmslb201 | 235 | | 768MB | 5GB |
| tor (IBM) | mon201 | 237 | | 768MB | 5GB |
| tor (IBM) | bdii201 | 234 | | 768MB | 5GB |
| Dell | se201 | 238 | | 768MB | 120GB |
| Gen P3 | lfc201 | 232 | | 256MB | 40GB |
| Gen P4 | amga201 | 233 | | 384MB | 40GB |
| Dell Prec.330 | ui201 | 236 | | 512MB | 40GB |
The software installed
Operating system: Scientific Linux 3.0.8 and CentOS 4.4
GLite middleware version: 3.0.2
Documentation used
The following documents were used:
Generic Installation and Configuration (LCG-GIS-MI), V3.0.0
Installation and Configuration Guide, V3.0(rev.2)
GLite 3.0 User Guide Manuals Series (CERN-LCG-GDEIS-722398) V0.1
http://wiki.egee-see.org/index.php/GLite30
The installation procedure
Installed the Linux operating system (SL3.0.8)
Installed Java version 1.4.2_13
Installed GLite version 3.0.2
Environments that had to be changed
The following settings had to be changed on all the machines.
Changing the path
By default the PATH environment variable does not include the directory /usr/sbin, so the yaim configuration scripts fail when they try to run the useradd program to create user accounts.
In the file .bashrc add the following line:
PATH=$PATH:/usr/sbin:/sbin
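A minimal sketch of an idempotent variant of the line above, so that sourcing .bashrc more than once does not keep lengthening PATH. The helper name add_to_path is hypothetical and is not part of yaim.

```shell
# Hypothetical helper (not part of yaim): append a directory to PATH
# only if it is not already listed there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path /usr/sbin
add_to_path /sbin
export PATH
```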
Setting the IP address
Since the machines use static IP addresses, the following changes must be made to the file /etc/hosts
Modify the first line in the file from
127.0.0.1 machineName localhost.localdomain localhost
to
127.0.0.1 localhost.localdomain localhost
where machineName is the name of the machine (like ce201). In other words, remove the machine name from the loopback line.
Then add the following as the new first line of the file (FQDN = Fully Qualified Domain Name):
MachineIP MachineFQDN MachineShortName
Example:
194.42.27.239 ce201.grid.ucy.ac.cy ce201
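The entry format can be sketched as a small shell function that builds the line from the IP prefix and DNS suffix given in the hardware setup. The function name hosts_entry and the two variable names are hypothetical.

```shell
# Values taken from the hardware setup section of this page.
IP_PREFIX=194.42.27
DNS_SUFFIX=grid.ucy.ac.cy

# Hypothetical helper: print the /etc/hosts line for a machine, given
# the last octet of its IP address and its short name.
hosts_entry() {
    printf '%s.%s %s.%s %s\n' "$IP_PREFIX" "$1" "$2" "$DNS_SUFFIX" "$2"
}

hosts_entry 239 ce201
```

The printed line still has to be placed at the top of /etc/hosts by hand or by a script of your choice.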
Setting the hostname
When the program hostname is called it must return the FQDN, which it does not do by default. In the file /etc/sysconfig/network make sure that HOSTNAME is set to the FQDN.
Example:
HOSTNAME=ce201.grid.ucy.ac.cy
The change does not take effect until the machine is restarted. To check that the hostname has been updated, run the program /bin/hostname
Example:
[root@ce201 root]# /bin/hostname
ce201.grid.ucy.ac.cy
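The check above can also be scripted. The following is only a sketch: the function name is_fqdn is hypothetical, and it merely tests that the name ends in the expected DNS suffix.

```shell
# Hypothetical helper: succeed only if NAME is a non-empty label
# followed by ".SUFFIX" (for example ce201.grid.ucy.ac.cy).
is_fqdn() {
    case "$1" in
        ?*".$2") return 0 ;;
        *) return 1 ;;
    esac
}

if is_fqdn "$(hostname)" grid.ucy.ac.cy; then
    echo "hostname is fully qualified"
else
    echo "hostname is NOT fully qualified; check /etc/sysconfig/network"
fi
```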
Setting the time server for the machines
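Every machine has to be synchronized with a time server.
Accurate clocks are essential for the middleware: certificates and proxies carry validity times, so clock skew between machines can make authentication fail. For the CentOS worker node described later on this page, ntpdate -u 1.gr.pool.ntp.org was used.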
The site-info.def file
Here are some things to keep in mind when making a site-info.def file.
Do not use an '@' symbol in any of the passwords.
The DPM_DB_PASSWORD should be in double quotes like this: "passwd"
For a glite CE its BDII_CE_URL should be like this:
BDII_CE_URL="ldap://CE_HOST:2170/mds-vo-name=resource,o=grid"
Note especially the port 2170 instead of 2135, and resource instead of local.
For a glite DPM its BDII_SE_URL should be like this:
BDII_SE_URL="ldap://SE_HOST:2170/mds-vo-name=resource,o=grid"
The example file values for CE_OS, CE_OS_RELEASE, and CE_OS_VERSION are wrong.
In the example file CE_DATADIR is set to "". This will not work due to a bug; it should be set to unset, without any quotes.
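Two of the rules above lend themselves to an automated check. The sketch below is not part of yaim; the function name lint_site_info is hypothetical, and treating every variable whose name contains PASSWORD as a password is an assumption about how the file is written.

```shell
# Hypothetical helper (not part of yaim): check a site-info.def file
# against two of the rules listed above. Prints one line per problem
# and returns non-zero if any problem was found.
lint_site_info() {
    file=$1
    status=0

    # Rule: no '@' in any password value. Matching every variable whose
    # name contains PASSWORD is an assumption about the naming.
    if grep -E '^[A-Za-z0-9_]*PASSWORD[A-Za-z0-9_]*=.*@' "$file" >/dev/null; then
        echo "a password value contains '@'"
        status=1
    fi

    # Rule: BDII_CE_URL and BDII_SE_URL use port 2170 and "resource".
    if grep -E '^BDII_(CE|SE)_URL=' "$file" |
        grep -v ':2170/mds-vo-name=resource,o=grid' >/dev/null; then
        echo "BDII_CE_URL or BDII_SE_URL does not use :2170/mds-vo-name=resource"
        status=1
    fi

    return "$status"
}
```

It would be called as lint_site_info /opt/glite/yaim/etc/site-info.def before running the configure step.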
Commands to run after installing the machines
These commands have to be run after every machine installation:
root@machine# apt-get update
root@machine# apt-get install ca_BitFace ctb-vomscerts
The following package is needed on the UI ONLY, from which the SAM tests are submitted:
root@machine# apt-get install lcg-sam-client-sensors-ctb
Installation of the different node types
The following procedures were followed to install the different components on the different machines.
Copy the configuration files groups.conf, users.conf, wn-list.conf and site-info.def
into:
/opt/glite/yaim/etc
Copy the directory vo.d, which contains one config file for each supported VO, into: /opt/glite/yaim/etc
After configuration, copy the certificates of the supported VOs into: /etc/grid-security/vomsdir/
SEE (voms.grid.auth.gr) VO: from
http://www.grid.auth.gr/pki/hellasgrid-ca-2002/cacert/
geclipse VO: from
https://dgrid-voms.fzk.de:8443/voms/geclipse/
Adding the host keys and certificates for some nodes
For the CE, MON, WMSLB, SE and LFC, first run install_node (or ./bin/yaim -i -s). Before continuing with configure_node (or ./bin/yaim -c -s), add the host certificates to these machines by placing the hostcert.pem and hostkey.pem files in the directory /etc/grid-security/
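A minimal sketch of this copy step follows. The function name install_hostcert is hypothetical, and the file modes (644 for the certificate, 400 for the key) are the usual convention for host credentials rather than something stated on this page.

```shell
# Hypothetical helper: copy hostcert.pem and hostkey.pem from SRC into
# DEST (by default /etc/grid-security) and restrict their permissions.
# The modes 644 and 400 are an assumed convention.
install_hostcert() {
    src=$1
    dest=${2:-/etc/grid-security}
    mkdir -p "$dest" || return 1
    cp -f "$src/hostcert.pem" "$src/hostkey.pem" "$dest/" || return 1
    chmod 644 "$dest/hostcert.pem" || return 1
    chmod 400 "$dest/hostkey.pem"
}
```

On a real node it would be run as root, for example install_hostcert /root/certs, where /root/certs is only a placeholder for wherever the issued files were saved.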
Configuration of the CE
The following commands were issued to configure the CE
[root@ce201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-CE glite-torque-server-config
OR
./install_node site-info.def glite-CE glite-torque-server-config
In the file /var/spool/maui/maui.cfg, set both SERVERHOST and ADMINHOST to the FQDN of the CE machine and add edguser tomcat4 to the ADMIN3 line. If the line does not exist, add it.
Example:
SERVERHOST ce201.grid.ucy.ac.cy
ADMIN1 root
ADMINHOST ce201.grid.ucy.ac.cy
ADMIN3 edginfo rgma edguser tomcat4
[root@ce201 root]# ./bin/yaim -c -s etc/site-info.def -n gliteCE TORQUE_server BDII_site
OR
./configure_node site-info.def gliteCE TORQUE_server BDII_site
The last line of the file /opt/lcg/etc/lcg-info-dynamic-scheduler.conf must have -h in front of the CE hostname, like this:
/opt/lcg/libexec/vomaxjobs-maui -h ce201.grid.ucy.ac.cy
If the line does not exist, run:
[root@ce201 root]# ./bin/yaim -c -s etc/site-info.def -n gliteCE TORQUE_server
[root@ce201 root]# ./bin/yaim -c -s etc/site-info.def -n BDII_site
Then check the last line of the file again.
After reconfiguring, apply the changes described above to /var/spool/maui/maui.cfg again.
Then restart the maui server:
[root@ce201 root]# service maui restart
[root@ce201 root]# touch /opt/globus/var/log/xferlog
NOT NEEDED ANYMORE: Start the log parser (HAS TO BE DONE AFTER EVERY REBOOT):
[root@ce201 root]# /opt/glite/bin/BLParserPBS -s /var/spool/pbs -p 33332&
After every reconfigure, fix the maui config file again:
[root@ce201 root]# nano /var/spool/maui/maui.cfg
(add edguser tomcat4 to the line ADMIN3)
ADMIN3 edginfo rgma edguser tomcat4
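Because this fix has to be repeated after every reconfigure, it can be scripted. The sketch below is an assumption about how one might do it, not a tool shipped with the middleware; the function name fix_maui_cfg is hypothetical. It keeps any users already on the ADMIN3 line and only appends the missing ones.

```shell
# Hypothetical helper: set SERVERHOST and ADMINHOST to FQDN and make
# sure the ADMIN3 line contains edguser and tomcat4. Missing lines are
# appended at the end of the file.
fix_maui_cfg() {
    cfg=$1
    fqdn=$2
    tmp=$(mktemp) || return 1
    awk -v fqdn="$fqdn" '
        $1 == "SERVERHOST" { print "SERVERHOST " fqdn; s = 1; next }
        $1 == "ADMINHOST"  { print "ADMINHOST " fqdn; a = 1; next }
        $1 == "ADMIN3" {
            line = $0
            if (line !~ /[ \t]edguser([ \t]|$)/) line = line " edguser"
            if (line !~ /[ \t]tomcat4([ \t]|$)/) line = line " tomcat4"
            print line; m = 1; next
        }
        { print }
        END {
            if (!s) print "SERVERHOST " fqdn
            if (!a) print "ADMINHOST " fqdn
            if (!m) print "ADMIN3 edginfo rgma edguser tomcat4"
        }
    ' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

For the CE on this testbed the call would be fix_maui_cfg /var/spool/maui/maui.cfg ce201.grid.ucy.ac.cy, followed by service maui restart. Running it twice leaves the file unchanged the second time.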
After reinstalling the CE on a new machine, follow
https://wiki.egi.eu/wiki/Tools/Manuals/TS60
for every worker node. This is necessary because the CE's ssh key changes
and the worker nodes can no longer communicate with it.
Configuration of the WMSLB
The following commands were issued to configure the WMSLB
[root@wmslb201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-WMSLB
OR
./install_node site-info.def glite-WMSLB
[root@wmslb201 root]# ./bin/yaim -c -s etc/site-info.def -n WMSLB
OR
./configure_node site-info.def WMSLB
[root@wmslb201 root]# touch /var/log/glite/lcmaps.log
[root@wmslb201 root]# chown glite.glite /var/log/glite/lcmaps.log
[root@wmslb201 root]# touch /opt/glite/var/log/xferlog
The GLITE_LOCATION environment variable is not defined for some of the services. The solution is to modify the scripts in /opt/glite/etc/init.d/ so that GLITE_LOCATION defaults to /opt/glite. Currently the default is /home/glbuild/GLITE_3_0_0/stage/.
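As an illustration only, a default of this kind is commonly written with the shell's default-value expansion. Whether the init scripts use exactly this idiom is an assumption; the line below simply shows the intended behaviour.

```shell
# Fall back to /opt/glite when GLITE_LOCATION is unset or empty;
# an existing non-empty value is left untouched.
GLITE_LOCATION=${GLITE_LOCATION:-/opt/glite}
export GLITE_LOCATION
echo "GLITE_LOCATION=$GLITE_LOCATION"
```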
Restart the gLite service (HAS TO BE DONE AFTER EVERY REBOOT OR RECONFIGURATION). This also has to be done every time the CE is restarted or reconfigured.
[root@wmslb201 root]# service gLite stop ; service gLite stop ; service gLite start
Configuration of the BDII
The following commands were issued to configure the BDII
[root@bdii201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-BDII
OR
./install_node site-info.def glite-BDII
[root@bdii201 root]# ./bin/yaim -c -s etc/site-info.def -n BDII
OR
./configure_node site-info.def BDII
[ // For unregistered sites, i.e. sites that do not have a top-level BDII
If the site is not registered, the following has to be done so that the WMSLB can find the local CE.
In the file /opt/bdii/etc/bdii.conf set BDII_AUTO_UPDATE=no
In the file /opt/bdii/etc/bdii-update.conf add the BDII ldap information for the local CE.
Example:
CY-02-CYGRID-CERT ldap://ce201.grid.ucy.ac.cy:2170/mds-vo-name=CY-02-CYGRID-CERT,o=grid
Then restart the BDII service:
[root@bdii201 root]# service bdii restart
OR ... [root@bdii201 root]# service BDII restart
After the service is restarted, wait about 10 minutes for the information to propagate.
]
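To see whether the CE information has arrived, the URL from bdii-update.conf can be queried with an LDAP client. The sketch below only prints a suitable ldapsearch command line and does not execute it. The function name bdii_query_cmd is hypothetical, and the use of the OpenLDAP client with the -x, -H and -b options is an assumption about what is installed.

```shell
# Hypothetical helper: turn a URL from bdii-update.conf into the
# matching ldapsearch command line and print it (it is not executed).
bdii_query_cmd() {
    rest=${1#ldap://}          # host:port/base
    hostport=${rest%%/*}       # host:port
    base=${rest#*/}            # mds-vo-name=...,o=grid
    echo "ldapsearch -x -H ldap://$hostport -b $base"
}

bdii_query_cmd "ldap://ce201.grid.ucy.ac.cy:2170/mds-vo-name=CY-02-CYGRID-CERT,o=grid"
```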
Configuration of the SE
The following commands were issued to configure the SE
In order to use DPM, the disk must have separate partitions for the storage. Make two partitions, mount them on /dpm1 and /dpm2, then change the permissions on them:
chmod 770 /dpm1/
chmod 770 /dpm2/
Make sure that you have the following entry in users.conf:
151:dpmmgr:151:dpmmgr:x:dpm:
[root@se201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-SE_dpm_mysql
OR
./install_node site-info.def glite-SE_dpm_mysql
[root@se201 root]# ./bin/yaim -c -s etc/site-info.def -n SE_dpm_mysql
OR
./configure_node site-info.def SE_dpm_mysql
[root@se201 root]# touch /var/log/dpm-gsiftp/gridftp.log
Configuration of the LFC
The following commands were issued to configure the LFC
[root@lfc201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-LFC_mysql
OR
./install_node site-info.def glite-LFC_mysql
[root@lfc201 root]# ./bin/yaim -c -s etc/site-info.def -n LFC_mysql
OR
./configure_node site-info.def LFC_mysql
Configuration of the WN (SL 3.0.8)
The following commands were issued to configure the WN
[root@wn201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-WN glite-torque-client-config
OR
./install_node site-info.def glite-WN glite-torque-client-config
[root@wn201 root]# ./bin/yaim -c -s etc/site-info.def -n WN_torque
OR
./configure_node site-info.def WN_torque
Edit the following two files and change /usr/java/j2sdk1.4.2_08 to the path of whichever version of Java is installed.
/opt/edg/etc/profile.d/edg-wl-ui-gui-env.csh
/opt/edg/etc/profile.d/edg-wl-ui-gui-env.sh
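This path substitution can be sketched as a small function. The name set_java_path is hypothetical. Note that the old path is used as a sed pattern, so its dots match any character, which is harmless for this purpose.

```shell
# Hypothetical helper: replace the hard-coded Java path OLD with NEW in
# each of the given files. A temporary file is used instead of
# "sed -i" so that the sketch does not depend on GNU sed.
set_java_path() {
    old=$1
    new=$2
    shift 2
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        sed "s|$old|$new|g" "$f" > "$tmp" && cat "$tmp" > "$f"
        rc=$?
        rm -f "$tmp"
        [ "$rc" -eq 0 ] || return "$rc"
    done
}
```

An example call, assuming Java 1.4.2_13 was installed under /usr/java/j2sdk1.4.2_13 (the actual install path is not stated on this page), would be: set_java_path /usr/java/j2sdk1.4.2_08 /usr/java/j2sdk1.4.2_13 /opt/edg/etc/profile.d/edg-wl-ui-gui-env.csh /opt/edg/etc/profile.d/edg-wl-ui-gui-env.sh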
Configuration of the MON
The following commands were issued to configure the MON
[root@mon201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-MON
OR
./install_node site-info.def glite-MON
[root@mon201 root]# ./bin/yaim -c -s etc/site-info.def -n MON
OR
./configure_node site-info.def MON
Configuration of the UI
The following commands were issued to configure the UI
[root@ui201 root]# ./bin/yaim -i -s etc/site-info.def -m glite-UI
OR
./install_node site-info.def glite-UI
[root@ui201 root]# ./bin/yaim -c -s etc/site-info.def -n UI
OR
./configure_node site-info.def UI
Edit the following four files and change /usr/java/j2sdk1.4.2_08 to the path of whichever version of Java is installed.
/opt/edg/etc/profile.d/edg-wl-ui-gui-env.csh
/opt/edg/etc/profile.d/edg-wl-ui-gui-env.sh
/opt/edg/var/etc/profile.d/edg-wl-ui-gui-env.csh
/opt/edg/var/etc/profile.d/edg-wl-ui-gui-env.sh
To set the default VO for all the users, modify the file /opt/glite/etc/glite_wmsui_cmd_var.conf and set the variable
DefaultVo = "yourDefaultVO" ;
Set the default LFC for the users, so that they don't need to specify it:
[root@ui201 root]# echo "export LFC_HOST=lfc201.grid.ucy.ac.cy" > /etc/profile.d/dteam.sh
[root@ui201 root]# echo "export LCG_CATALOG_TYPE=lfc" >> /etc/profile.d/dteam.sh
[root@ui201 root]# chown root:root /etc/profile.d/dteam.sh
[root@ui201 root]# chmod 755 /etc/profile.d/dteam.sh
Do the same for each VO that you support, for instance by using see.sh instead of dteam.sh.
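Repeating these commands for several VOs can be sketched as a function plus a loop. The function name write_vo_profile is hypothetical. It writes the same two exports as the dteam.sh commands above and leaves out the chown, which requires root.

```shell
# Hypothetical helper: write DIR/VO.sh with the same two exports that
# the commands above put into dteam.sh, and make it mode 755.
write_vo_profile() {
    dir=$1
    vo=$2
    lfc_host=$3
    {
        echo "export LFC_HOST=$lfc_host"
        echo "export LCG_CATALOG_TYPE=lfc"
    } > "$dir/$vo.sh" || return 1
    chmod 755 "$dir/$vo.sh"
}

# Example for VOs named on this page (run as root on the UI):
# for vo in dteam see geclipse; do
#     write_vo_profile /etc/profile.d "$vo" lfc201.grid.ucy.ac.cy
# done
```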
Configuration of the AMGA
Add the following to the file /opt/glite/yaim/script/node-info.def to add an AMGA node
AMGA_FUNCTIONS="${BASE1_FUNCTIONS} ${BASE2_FUNCTIONS}
config_lcmaps
config_lcas"
Starting the AMGA server after a reboot:
/etc/init.d/rhdb start
/etc/init.d/mdservice start
Configuration of a CentOS 4.4 worker node (might also apply to SL 4.4)
Preparation
Download Java 1.5 and install it.
Download yaim and install it.
Copy gLite.repo and dag.repo to /etc/yum/repos.d (see the Appendix).
Copy everything from SA3_site_info/general/site_config/etc to /opt/glite/yaim/etc
Make sure that the time is synchronized, using: ntpdate -u 1.gr.pool.ntp.org
In the /opt/glite/yaim/etc/site-info.def file:
change JAVA_LOCATION to the path where Java was installed, for example: JAVA_LOCATION="/usr/java/jdk1.5.0_11"
Installation - Configuration
yum update
yum install glite-WN_sl4compat
yum install edg* glite* torque*
./configure_node ../etc/site-info.def WN_torque
After Installation
AFTER 5 MINUTES: Edit the two following files and change the /usr/java/j2sdk... to whichever version of java is installed.
/opt/edg/etc/profile.d/j2.csh
/opt/edg/etc/profile.d/j2.sh
Install missing packages
[root@wn202 yaim]# yum install apt-get
[root@wn202 yaim]# apt-get update
[root@wn202 yaim]# apt-get install lcg-profile edg-profile globus-config
Send some jobs to the WN
If the jobs enter the RUN state and after a few seconds go back to the QUEUED state, run the following:
On the WN: [root@wn202]# /opt/edg/sbin/edg-pbs-knownhosts
On the CE: [root@ce201]# /opt/edg/sbin/edg-pbs-knownhosts; /opt/edg/sbin/edg-pbs-shostsequiv
This happens after a full reinstallation of a machine, because the host key of the WN changes.
The new yaim 3.0.1-x configuration
VO specific files in the vo.d folder
The new yaim uses a different way of defining VO-specific values. A vo.d folder has to be created in /opt/glite/yaim/etc/.
It must contain one file per VO, and each file must be named after the VO it defines.
For example:
[root@ce201 root]# cat /opt/glite/yaim/etc/vo.d/see
SW_DIR=$VO_SW_DIR/see
DEFAULT_SE=$DPM_HOST
STORAGE_DIR=$DPM_STORAGE_DIR/see
QUEUES="see"
VOMS_SERVERS="vomss://voms.grid.auth.gr:8443/edg-voms-admin/see?/see"
VOMSES="see voms.grid.auth.gr 15010 /C=GR/O=HellasGrid/OU=grid.auth.gr/CN=voms.grid.auth.gr see"
Note that for each VO the variable $(vo)_GROUP_ENABLE=$(vo) has to be defined in the site-info.def file.
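Since every supported VO needs its own file in vo.d, a quick consistency check can be sketched as follows. The function name missing_vo_files is hypothetical, and the list of VOs is passed explicitly rather than read from site-info.def.

```shell
# Hypothetical helper: print every VO from the argument list that has
# no definition file in the given vo.d directory. Returns non-zero if
# at least one file is missing.
missing_vo_files() {
    vod=$1
    shift
    rc=0
    for vo in "$@"; do
        if [ ! -f "$vod/$vo" ]; then
            echo "$vo"
            rc=1
        fi
    done
    return "$rc"
}
```

For this testbed a call could look like missing_vo_files /opt/glite/yaim/etc/vo.d see geclipse, which prints nothing when both files exist.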
Appendix
gLite.repo:
[glite-base]
name=Glite
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/apt/cert/3.1/glite-WN/sl4/i386/
enabled=1
[torque-base]
name=Torque
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/apt/cert/3.1/glite-TORQUE_client/sl4/i386/
enabled=1
[CA]
name=CAs
baseurl=http://linuxsoft.cern.ch/LCG-CAs/current
enabled=1
protect=1
[jpackage]
name=jpackage
baseurl=http://linuxsoft.cern.ch/jpackage/1.6/redhat-el-4.0/free
enabled=1
[jpackage-generic]
name=jpackage
baseurl=http://linuxsoft.cern.ch/jpackage/1.6/generic/free
enabled=1
dag.repo:
[dag]
name=DAG rpms
baseurl=ftp://ftp.scientificlinux.org/linux/extra/dag/redhat/el4/en/$basearch/dag/
enabled=1
--
LouisPoncet - 15 May 2007