Deploying Sun Grid Engine in an LCG Computing Element

Disclaimer

This software is considered beta; use it at your own risk. It may not be fully optimized or correct and should therefore be considered experimental. There is no guarantee that it is compatible with the way your site is configured.

About

Author: Gonçalo Borges, goncalo@lipNOSPAMPLEASE.pt (reviewed by CESGA)

Version: 4.0.1-4

Abstract: SGE Yaim integration Manual for lcg-CE 3.1 and glite-WN 3.1

RPMS Description:

sge-V61u3-1.i386.rpm: Contains the binaries and libraries needed to run sge commands;

sge-utils-V61u3-1.i386.rpm: Installation scripts and SGE utilities;

sge-daemons-V61u3-1.i386.rpm: The SGE daemons;

sge-ckpt-V61u3-1.i386.rpm: For checkpointing purposes;

sge-parallel-V61u3-1.i386.rpm: For running parallel environments such as Open MPI, MPICH, etc.;

sge-docs-V61u3-1.i386.rpm: Documentation, manuals and examples;

sge-qmon-V61u3-1.i386.rpm: The SGE graphical user interface (qmon);

RPMS Download:

The SGE server and SGE client software must be downloaded from an external source (provided by LIP):

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-ckpt-V61u3-1.i386.rpm

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-daemons-V61u3-1.i386.rpm

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-devel-V61u3-1.i386.rpm

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-docs-V61u3-1.i386.rpm

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-parallel-V61u3-1.i386.rpm

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-qmon-V61u3-1.i386.rpm

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-utils-V61u3-1.i386.rpm

http://www.lip.pt/grid/sgeV61u3toSLC4/sge-V61u3-1.i386.rpm
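
The list above can be fetched in one go. The following is only a sketch: the base URL and package names are copied from the list, while the SGE_DOWNLOAD switch is a hypothetical variable introduced here so that nothing is downloaded unless you ask for it.

```shell
#!/bin/sh
# Base URL and package names are taken from the download list in this guide.
SGE_BASE_URL="http://www.lip.pt/grid/sgeV61u3toSLC4"
SGE_PACKAGES="sge sge-utils sge-daemons sge-ckpt sge-parallel sge-docs sge-qmon sge-devel"

# Print one download URL per package.
sge_rpm_urls() {
    for pkg in $SGE_PACKAGES; do
        echo "${SGE_BASE_URL}/${pkg}-V61u3-1.i386.rpm"
    done
}

# SGE_DOWNLOAD is a hypothetical switch: fetch only when it is set to "yes".
if [ "${SGE_DOWNLOAD:-no}" = "yes" ]; then
    sge_rpm_urls | while read -r url; do
        wget --continue "$url" || echo "download failed: $url" >&2
    done
fi
```

Running the script with SGE_DOWNLOAD=yes in the environment downloads the RPMs into the current directory; without it, only the sge_rpm_urls function is defined.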

Prerequisites:

glite-info-dynamic-sge-1.0.0-1.noarch.rpm depends on perl-XML-Simple >= 2.14 available from the DAG repository.
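
If you want to confirm the perl-XML-Simple requirement by hand, something along these lines may help. It is a sketch: the helper names version_ge and check_perl_xml_simple are made up for this example, and the comparison only handles plain dotted numeric versions such as 2.14 or 2.17.

```shell
# Succeed when dotted numeric version $1 is greater than or equal to $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Report whether the installed perl-XML-Simple satisfies the 2.14 minimum.
check_perl_xml_simple() {
    installed=$(rpm -q --queryformat '%{VERSION}' perl-XML-Simple 2>/dev/null) || {
        echo "perl-XML-Simple is not installed" >&2
        return 1
    }
    if version_ge "$installed" "2.14"; then
        echo "perl-XML-Simple $installed is new enough"
    else
        echo "perl-XML-Simple $installed is older than 2.14" >&2
        return 1
    fi
}
```

Calling check_perl_xml_simple on the CE before installing glite-info-dynamic-sge shows whether the DAG repository package is already in place.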

If we want to use the SGE X Window tools, we also need:

yum install openmotif
yum install pdksh
yum install xorg-x11-xauth

https://twiki.cern.ch/twiki/bin/view/EGEE/CertTestBedWorld

Check that your yum repositories are set up as follows:

On CE 3.1

cat /etc/yum.repos.d/glite.repo

[lcg-CE]
name=lcg-CE 3.1
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/cert/3.1/lcg-CE/sl4/i386/
enabled=1
protected=1

cat /etc/yum.repos.d/lcg-CA.repo

 
[CA]
name=CAs
baseurl=http://linuxsoft.cern.ch/LCG-CAs/current

cat /etc/yum.repos.d/jpackage5.0.repo

 
[jpackage17-generic]
name=JPackage 1.7, generic
baseurl=http://mirrors.dotsrc.org/jpackage/1.7/generic/free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

[jpackage17-generic-nonfree]
name=JPackage 1.7, generic non-free
baseurl=http://mirrors.dotsrc.org/jpackage/1.7/generic/non-free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

[jpackage5-generic]
name=JPackage 5, generic
baseurl=http://mirrors.dotsrc.org/jpackage/5.0/generic/free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

[jpackage5-generic-nonfree]
name=JPackage 5, generic non-free
baseurl=http://mirrors.dotsrc.org/jpackage/5.0/generic/non-free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

cat /etc/yum.repos.d/dag.repo

[dag]
name=DAG (http://dag.wieers.com) additional RPMS repository
baseurl=http://linuxsoft.cern.ch/dag/redhat/el4/en/$basearch/dag
gpgkey=http://linuxsoft.cern.ch/cern/slc4X/$basearch/docs/RPM-GPG-KEY-dag
gpgcheck=1
enabled=1
protect=0

cat /etc/yum.repos.d/patch1474.repo

[patch1474]
name=patch #1474, Patch to enable Sun Grid
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/cert/3.1/patches/1474/sl4/i386/
enabled=1

cat /etc/yum.repos.d/patch1661.repo

[patch1661]
name=patch #1661, Patch to enable Sun Grid YAIM glite-yaim-lcg-ce
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/cert/3.1/patches/1661/sl4/i386/
enabled=1
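
A quick way to see whether any of the CE repository files listed above is missing or empty is sketched below. The function name missing_repos and the CE_REPOS variable are illustrative only; the file names are the ones used in this section, and the same function can be called with the WN file names.

```shell
# Repository files this guide expects on the CE.
CE_REPOS="glite.repo lcg-CA.repo jpackage5.0.repo dag.repo patch1474.repo patch1661.repo"

# Print each file from the argument list that is missing or empty in
# directory $1, and return non-zero if at least one was reported.
missing_repos() {
    dir=$1
    shift
    rc=0
    for f in "$@"; do
        if [ ! -s "$dir/$f" ]; then
            echo "$f"
            rc=1
        fi
    done
    return $rc
}

# Example:
#   missing_repos /etc/yum.repos.d $CE_REPOS
```

This only checks that the files exist and are non-empty; it does not validate their contents against the listings above.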

On WN 3.1

cat /etc/yum.repos.d/glite.repo

 
[glite-WN]
name=gLite 3.1 Worker Node
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/cert/3.1/glite-WN/sl4/i386/
enabled=1

cat /etc/yum.repos.d/lcg-CA.repo

[CA]
name=CAs
baseurl=http://linuxsoft.cern.ch/LCG-CAs/current

cat /etc/yum.repos.d/dag.repo

 
[dag]
name=DAG (http://dag.wieers.com) additional RPMS repository
baseurl=http://linuxsoft.cern.ch/dag/redhat/el4/en/$basearch/dag
gpgkey=http://linuxsoft.cern.ch/cern/slc4X/$basearch/docs/RPM-GPG-KEY-dag
gpgcheck=1
enabled=1
protect=0

cat /etc/yum.repos.d/jpackage5.0.repo

[jpackage17-generic]
name=JPackage 1.7, generic
baseurl=http://mirrors.dotsrc.org/jpackage/1.7/generic/free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

[jpackage17-generic-nonfree]
name=JPackage 1.7, generic non-free
baseurl=http://mirrors.dotsrc.org/jpackage/1.7/generic/non-free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

[jpackage5-generic]
name=JPackage 5, generic
baseurl=http://mirrors.dotsrc.org/jpackage/5.0/generic/free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

[jpackage5-generic-nonfree]
name=JPackage 5, generic non-free
baseurl=http://mirrors.dotsrc.org/jpackage/5.0/generic/non-free/
enabled=1
protect=1
gpgkey=http://www.jpackage.org/jpackage.asc
gpgcheck=1

cat /etc/yum.repos.d/patch1474.repo

[patch1474]
name=patch #1474, Patch to enable Sun Grid
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/glite/cert/3.1/patches/1474/sl4/i386/
enabled=1

Please ensure that passwordless ssh works from a WN pool account to a CE pool account. This requirement is not specific to this deployment; it is needed by all grid infrastructures.
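
One possible way to test this from a WN is sketched here. The function name is made up for the example, and the account and host you pass to it are site specific. BatchMode=yes makes ssh fail instead of prompting for a password, which is exactly the situation we want to detect.

```shell
# Try a non-interactive ssh login as pool account $1 on CE host $2.
# Intended to be run on a WN while logged in as a WN pool account.
check_passwordless_ssh() {
    user=$1
    host=$2
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$user@$host" /bin/true; then
        echo "OK: passwordless ssh to $user@$host works"
    else
        echo "FAIL: ssh to $user@$host needs a password or is unreachable" >&2
        return 1
    fi
}

# Example (hypothetical account and host names):
#   check_passwordless_ssh dteam001 ce.example.org
```

A failure can also mean that the host key is not yet known or that the CE is unreachable, so check the ssh error message before changing the key setup.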

CE 3.1 gatekeeper Installation

  • First we install lcg-CE base packages:
yum install lcg-CA cert-lcg-CE

  • Download SGE packages from LIP:
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-ckpt-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-daemons-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-devel-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-docs-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-parallel-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-qmon-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-utils-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-V61u3-1.i386.rpm

rpm -vhi sge-ckpt-V61u3-1.i386.rpm sge-daemons-V61u3-1.i386.rpm sge-docs-V61u3-1.i386.rpm sge-parallel-V61u3-1.i386.rpm sge-qmon-V61u3-1.i386.rpm sge-utils-V61u3-1.i386.rpm sge-V61u3-1.i386.rpm

Preparing...                ########################################### [100%]
   1:sge                    ########################################### [ 14%]
   2:sge-utils              ########################################### [ 29%]
   3:sge-ckpt               ########################################### [ 43%]
   4:sge-daemons            ########################################### [ 57%]
   5:sge-docs               ########################################### [ 71%]
   6:sge-parallel           ########################################### [ 86%]
   7:sge-qmon               ########################################### [100%]
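
After the rpm step it can be useful to confirm that every package really ended up installed. The sketch below assumes the package names shown in the rpm output of this guide; the function names and the "ce"/"wn" role labels are invented for the example.

```shell
# Print the SGE package names this guide installs for a node role.
sge_packages_for() {
    case "$1" in
        ce) echo "sge sge-utils sge-ckpt sge-daemons sge-docs sge-parallel sge-qmon" ;;
        wn) echo "sge sge-utils sge-daemons sge-docs sge-parallel" ;;
        *)  echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print "missing: <name>" for each package rpm does not know about,
# and return non-zero if any package is missing.
verify_sge_packages() {
    pkgs=$(sge_packages_for "$1") || return 1
    rc=0
    for pkg in $pkgs; do
        rpm -q "$pkg" >/dev/null 2>&1 || { echo "missing: $pkg"; rc=1; }
    done
    return $rc
}

# Example:
#   verify_sge_packages ce
```

The same check with the wn role applies to the worker node installation described later.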

yum install cert-glite-SGE_utils

Dependencies Resolved

=============================================================================
 Package                 Arch       Version          Repository        Size 
=============================================================================
Installing:
 cert-glite-SGE_utils    noarch     3.1.0-0          patch             1.5 k
Installing for dependencies:
 glite-apel-sge          noarch     2.0.5-1          patch              28 k
 glite-info-dynamic-sge  noarch     1.0.0-1          patch              30 k
 glite-yaim-sge-utils    noarch     4.0.1-1          patch              10 k
 lcg-jobmanager-sge      noarch     1.0.0-1          patch              20 k
 perl-XML-Simple         noarch     2.17-1.el4.rf    dag                71 k

Transaction Summary
=============================================================================
Install      6 Package(s)         
Update       0 Package(s)         
Remove       0 Package(s)         
Total download size: 161 k

  • Download YAIM_SGE packages from LIP:
wget http://www.lip.pt/grid/sge2lcgce/glite-yaim-sge-server-4.0.1-4.noarch.rpm

rpm -ivh glite-yaim-sge-server-4.0.1-4.noarch.rpm

Preparing...                ########################################### [100%]
   1:glite-yaim-sge-server  ########################################### [100%]

  • Check your site-info.def and configure CE SGE batch server using yaim functions:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n lcg-CE -n SGE_server -n SGE_utils
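
Since YAIM reads site-info.def as a shell file, a syntax slip such as an unbalanced quote can break the configuration run. A cheap sanity check before running yaim could look like this; check_site_info is a name chosen for this sketch, not part of YAIM.

```shell
# Check that the site configuration file (default: site-info.def) is
# readable and parses as shell. "bash -n" only parses the file and
# reports syntax errors; it does not execute anything in it.
check_site_info() {
    file=${1:-site-info.def}
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    bash -n "$file"
}

# Example:
#   check_site_info site-info.def && echo "syntax looks fine"
```

This catches syntax errors only; it cannot tell whether the variables needed by the SGE YAIM functions are set to sensible values.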

WN 3.1 SGE node Installation

  • First we install glite-WN base packages:
yum install cert-glite-WN

  • Download SGE packages from LIP:
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-daemons-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-docs-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-parallel-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-utils-V61u3-1.i386.rpm
http://www.lip.pt/grid/sgeV61u3toSLC4/sge-V61u3-1.i386.rpm
http://www.lip.pt/grid/sge2lcgce/glite-yaim-sge-client-4.0.1-3.noarch.rpm

rpm -vhi sge-daemons-V61u3-1.i386.rpm sge-docs-V61u3-1.i386.rpm sge-parallel-V61u3-1.i386.rpm sge-utils-V61u3-1.i386.rpm sge-V61u3-1.i386.rpm

Preparing...                ########################################### [100%]
   1:sge                    ########################################### [ 20%]
   2:sge-utils              ########################################### [ 40%]
   3:sge-daemons            ########################################### [ 60%]
   4:sge-docs               ########################################### [ 80%]
   5:sge-parallel           ########################################### [100%]

rpm -vhi glite-yaim-sge-client-4.0.1-3.noarch.rpm

Preparing...                ########################################### [100%]
   1:glite-yaim-sge-client  ########################################### [100%]

  • Finally check your site-info.def and configure using SGE Yaim functions:
/opt/glite/yaim/bin/yaim -c -s site-info.def -n WN -n SGE_client

Testing:

Test the information system using the following commands:

ldapsearch -x -h <your_ce> -p 2135 -b "mds-vo-name=local,o=grid"
ldapsearch -x -h <your_ce> -p 2170 -b "mds-vo-name=resource,o=grid"
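
The ldapsearch output is long, so it may help to reduce it to the queue identifiers. The filter below is a sketch that assumes the information system publishes GLUE 1.x GlueCEUniqueID attributes; the function name is invented here. Note that ldapsearch can wrap long values over several lines, in which case an identifier would appear truncated.

```shell
# Read LDIF on stdin and print each distinct GlueCEUniqueID value once.
# Only attribute lines are matched, not the "dn:" lines that also
# contain the string GlueCEUniqueID.
extract_ce_ids() {
    sed -n 's/^GlueCEUniqueID: *//p' | sort -u
}

# Example (replace <your_ce> as in the commands above):
#   ldapsearch -x -h <your_ce> -p 2135 -b "mds-vo-name=local,o=grid" | extract_ce_ids
```

Each printed line should name one of the SGE queues published by your CE.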

  • Check if it is returning the proper queue names and available resources.

  • Try to submit a simple script from a given pool account on your CE. This test checks that the SGE command-line tools (such as qsub and qstat) are working. If the job finishes successfully, the stdout and stderr files will not be available on the CE because, in a normal grid job, they would be transferred directly from the WN to the RB using GSIFTP. Check the stderr/stdout files on the WN instead.
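
A minimal job for this local test might look like the sketch below. The job name sge_smoke_test and the helper write_test_job are made up for this example; lines starting with "#$" are options that qsub reads from the script.

```shell
# Write a minimal SGE test job to the file named in $1.
write_test_job() {
    cat > "$1" <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -N sge_smoke_test
/bin/hostname
/bin/date
EOF
}

# Typical use from a pool account on the CE:
#   write_test_job test_job.sh
#   qsub test_job.sh      # submit the job
#   qstat                 # watch the job state
```

Since no output paths are given, SGE uses its default output file names, so look for files named after the job on the WN that ran it.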

  • Try to do a globus-job-run using fork from a UI (you have to start your proxy first):

[goncalo@ui01]$ globus-job-run ce03.lip.pt:2119/jobmanager-fork /bin/uname -a
Linux ce03.lip.pt 2.6.9-34.EL.cern #1 Sun Mar 12 12:19:53 CET 2006 i686 athlon i386 GNU/Linux

  • Try to do a globus-job-run using lcgsge from a UI:

[goncalo@ui01]$ globus-job-run ce03.lip.pt:2119/jobmanager-lcgsge /bin/uname -a
Linux sgewn01.lip.pt 2.6.9-34.EL.cern #1 Sun Mar 12 12:19:53 CET 2006 i686 i686 i386 GNU/Linux

  • Try to submit a job through the RB from a UI:

[goncalo@ui01]$ edg-job-submit -r ce03.lip.pt:2119/jobmanager-lcgsge-dteamgrid well.jdl
Selected Virtual Organisation name (from proxy certificate extension): dteam
Connecting to host rb02.lip.pt, port 7772
Logging to host rb02.lip.pt, port 9002
*********************************************************************************************
                               JOB SUBMIT OUTCOME
 The job has been successfully submitted to the Network Server.
 Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:
 - https://rb02.lip.pt:9000/Ab0W2EpWMPkpJKjAMpRCsQ
*********************************************************************************************

[goncalo@ui01 ce02]$ edg-job-status https://rb02.lip.pt:9000/Ab0W2EpWMPkpJKjAMpRCsQ
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://rb02.lip.pt:9000/Ab0W2EpWMPkpJKjAMpRCsQ
Current Status:     Done (Success)
Exit code:          0
Status Reason:      Job terminated successfully
Destination:        ce03.lip.pt:2119/jobmanager-lcgsge-dteamgrid
reached on:         Fri Feb  2 18:42:38 2007
*************************************************************

[goncalo@ui01 ce02]$ cat /tmp/jobOutput/goncalo_Ab0W2EpWMPkpJKjAMpRCsQ/well.out
One Perl out of the sea!
This is Linux sgewn01.lip.pt 2.6.9-34.EL.cern #1 Sun Mar 12 12:19:53 CET 2006 i686 i686 i386 GNU/Linux
Fri Feb  2 18:31:14 WET 2007
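
When repeating this test it can be handy to pull the status out of the edg-job-status output automatically. The following sketch relies only on the "Current Status:" line shown in the output above; the function names and the JOB_ID variable are hypothetical.

```shell
# Read edg-job-status output on stdin and print the "Current Status:" value.
job_current_status() {
    sed -n 's/^Current Status:[[:space:]]*//p'
}

# Succeed only when the reported status is "Done (Success)".
job_succeeded() {
    [ "$(job_current_status)" = "Done (Success)" ]
}

# Example, with JOB_ID set to the identifier printed by edg-job-submit:
#   edg-job-status "$JOB_ID" | job_succeeded && echo "job finished OK"
```
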
Topic revision: r3 - 2008-02-29 - unknown
 