SGE_Yaim_Version2 Web > WebHome (2007-05-21, JavierLopezCacheiro)
%TOC%

---+++!! Deploying Sun Grid Engine in a LCG Computing Element

---++ Disclaimer

This software is considered *beta*; use it at your own risk. It may not be fully optimized or correct and should therefore be considered experimental. There is no guarantee that it is compatible with the way your site is configured.

---++ About

_Author_: Gonçalo Borges, goncalo@lip.pt

_Version_: 0.0.0-2

_Abstract_: SGE Yaim integration manual for lcg-CE and glite-WN

---++ RPMS Description:

_gliteWN-yaimtosge-0.0.0-2.i386.rpm_: Modification to the standard glite yaim tool for glite-WN integration using SGE as the scheduler system. It will install:

{{{
/etc/profile.d/sge.sh (csh): To set the proper environment;
/opt/glite/yaim/scripts/configure_sgeclient.pm: SGE installation directories;
/opt/glite/yaim/scripts/nodesge-info.def: SGE nodes functions definition;
/opt/glite/yaim/functions/config_sge_client: Configures SGE exec host;
}}}

_lcgCE-yaimtosge-0.0.0-2.i386.rpm_: Modification to the standard glite yaim tool for lcg-CE integration using SGE as the scheduler system. It will install:

{{{
/etc/profile.d/sge.sh (csh): To set the proper environment;
/opt/glite/yaim/scripts/configure_sgeserver.pm: SGE installation directories;
/opt/glite/yaim/scripts/nodesge-info.def: SGE nodes functions definition;
/opt/glite/yaim/functions/config_sge_server: Configures SGE QMASTER
/opt/globus/lib/perl/Globus/GRAM/JobManager/lcgsge.pm: The SGE jobmanager;
/opt/lcg/libexec/lcg-info-dynamic-sge: The SGE CE GRIS/GIIS perl script.
}}}

   * _sge-V60u7_1-3.i386.rpm_: Contains the binaries and libraries needed to run SGE commands;
   * _sge-utils-V60u7_1-3.i386.rpm_: Installation scripts and SGE utilities;
   * _sge-daemons-V60u7_1-3.i386.rpm_: The SGE daemons;
   * _sge-ckpt-V60u7_1-3.i386.rpm_: For checkpointing purposes;
   * _sge-parallel-V60u7_1-3.i386.rpm_: For running parallel environments, such as OpenMpi, Mpich, etc;
   * _sge-docs-V60u7_1-3.i386.rpm_: Documentation, manuals and examples;
   * _sge-qmon-V60u7_1-3.i386.rpm_: The SGE graphical interface;

---++ RPMS Download:

   * http://www.lip.pt/grid/gliteWN-yaimtosge-0.0.0-2.i386.rpm
   * http://www.lip.pt/grid/lcgCE-yaimtosge-0.0.0-2.i386.rpm
   * http://www.lip.pt/grid/sge-V60u7_1-3.i386.rpm
   * http://www.lip.pt/grid/sge-utils-V60u7_1-3.i386.rpm
   * http://www.lip.pt/grid/sge-daemons-V60u7_1-3.i386.rpm
   * http://www.lip.pt/grid/sge-ckpt-V60u7_1-3.i386.rpm
   * http://www.lip.pt/grid/sge-parallel-V60u7_1-3.i386.rpm
   * http://www.lip.pt/grid/sge-docs-V60u7_1-3.i386.rpm
   * http://www.lip.pt/grid/sge-qmon-V60u7_1-3.i386.rpm

---++ Prerequisites:

The SGE rpm packages delivered together with this manual were built under SLC4, with the libdb-4.2.so library packaged in addition so that they also work in SLC3. Please report problems to goncalo@lip.pt.

We will assume that the standard “lcg-CE” and “glite-WN” software is already installed (but not configured) on the proper machines.
The installation should have been performed using the instructions in:

   * http://grid-deployment.web.cern.ch/grid-deployment/documentation/LCG2-Manual-Install/
   * https://twiki.cern.ch/twiki/bin/view/EGEE/CertTestBedWorld

Check that your apt repositories are properly set to:

{{{
[root@ce03 root]# cat /etc/apt/sources.list.d/lcg-ca.list
rpm http://linuxsoft.cern.ch/ LCG-CAs/current production
[root@ce03 root]# cat /etc/apt/sources.list.d/lcg.list
rpm http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/ rhel30 externals Release3.0 updates
[root@ce03 root]# cat /etc/apt/sources.list.d/cern.list
rpm http://linuxsoft.cern.ch cern/slc30X/i386/apt os updates extras
rpm-src http://linuxsoft.cern.ch cern/slc30X/i386/apt os updates extras
}}}

You should stop following the LCG manual and switch to this one right before you reach the Middleware Configuration section.

Please ensure that “passwordless ssh” works from a WN pool account to a CE pool account. This is not specific to this deployment; it is needed by all grid infrastructures.

---++ CE gatekeeper Installation

Sun Grid Engine needs a Qmaster machine which, in this manual, we assume will be installed on the CE Gatekeeper.

The SGE rpms will deploy all files under /usr/local/sge/V60u7_1 and link that directory to /usr/local/sge/pro. Later on, $SGE_ROOT will be defined as /usr/local/sge/pro, so that old SGE versions can be kept and used when needed.
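
The versioned-directory layout described above can be checked with a small shell helper. This is only a sketch: the function name =sge_active_version= is hypothetical (it is not shipped by the SGE rpms), and it assumes that /usr/local/sge/pro is a plain symbolic link, as stated above.

```shell
# Hypothetical helper (not part of the SGE rpms): print the name of the
# versioned directory that a "pro"-style symbolic link points to.
sge_active_version() {
    link="$1"
    if [ -L "$link" ]; then
        # readlink prints the link target; basename keeps only its last path component
        basename "$(readlink "$link")"
    else
        echo "$link is not a symbolic link" >&2
        return 1
    fi
}
```

On a machine installed as described here, =sge_active_version /usr/local/sge/pro= would be expected to report the V60u7_1 directory. Under the same assumption, switching to another SGE version would only require repointing the link.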
Please install the following SGE packages (they require the “openmotif (>= 2.2.3-5)” package which, if not already installed, you may find in the SLC repositories):

{{{
sge-V60u7_1-3.i386.rpm
sge-utils-V60u7_1-3.i386.rpm
sge-daemons-V60u7_1-3.i386.rpm
sge-qmon-V60u7_1-3.i386.rpm
sge-ckpt-V60u7_1-3.i386.rpm
sge-parallel-V60u7_1-3.i386.rpm
sge-docs-V60u7_1-3.i386.rpm
}}}

{{{
[root@<your_ce> ~]# rpm -ivh sge-V60u7_1-3.i386.rpm sge-utils-V60u7_1-3.i386.rpm sge-daemons-V60u7_1-3.i386.rpm sge-qmon-V60u7_1-3.i386.rpm sge-ckpt-V60u7_1-3.i386.rpm sge-parallel-V60u7_1-3.i386.rpm sge-docs-V60u7_1-3.i386.rpm
Preparing...                ########################################### [100%]
   1:sge                    ########################################### [ 14%]
   2:sge-utils              ########################################### [ 29%]
   3:sge-daemons            ########################################### [ 43%]
   4:sge-qmon               ########################################### [ 57%]
   5:sge-ckpt               ########################################### [ 71%]
   6:sge-parallel           ########################################### [ 86%]
   7:sge-docs               ########################################### [100%]
}}}

   * Install lcgCE-yaimtosge-0.0.0-2.i386.rpm, which includes the modifications to the standard yaim tool allowing the SGE scheduler configuration. This rpm requires the “perl-XML-Simple >= 2.14-2.2” package (download: http://rpmfind.net/linux/rpm2html/search.php?query=perl-XML-Simple ). It also requires glite-yaim >= 3.0.0-34. (!) Please upgrade your yaim version to the latest release.

{{{
[root@<your_ce> ~]# rpm -ivh lcgCE-yaimtosge-0.0.0-2.i386.rpm
Preparing...                ########################################### [100%]
   1:lcgCE-yaimtosge        ########################################### [100%]
}}}

   * Add the following values to your site-info.def file:

{{{
SGE_QMASTER=$CE_HOST
DEFAULT_DOMAIN=$MY_DOMAIN
ADMIN_MAIL=<your_admin_email>
}}}

   * Check that the “WN_LIST”, “USERS_CONF”, “VOS” and “QUEUES” variables are also properly defined in your site-info.def file.
     The content of these variables will be used to build the SGE exec node list, the SGE user sets and the SGE local queues. For the time being, VO users in the USERS_CONF file have to be defined in the same order as in the QUEUES definition. Otherwise, the VO SGE userset will not correspond to the correct VO QUEUE. This will be fixed in the future.
   * Configure the CE running SGE using the “CE_sge” node definition.

{{{
[root@<your_ce> ~]# /opt/glite/yaim/scripts/configure_node <path_to_your_site-info.def_file> CE_sge BDII_site
}}}

   * The CE configuration must always be run before the WN configurations; otherwise the SGE daemons on the WNs will not be started, since there is no Qmaster host associated with them.
   * The SGE command-line tools will be accessible after a new login (which sources the /etc/profile.d/ scripts).
   * To start the SGE GUI with the “qmon” command, you need to install “xorg-x11-xauth >= 6.8.2-1”. Unfortunately, this package is not available in the SLC3 repository and you have to download it from the SLC4 one: http://linuxsoft.cern.ch/cern/slc4X/i386/SL/RPMS/xorg-x11-xauth-6.8.2-1.EL.13.37.i386.rpm

If you have configured your CE with wrong values for the “WN_LIST”, “USERS_CONF”, “VOS” and “QUEUES” variables, an easy way to solve the problem is to delete the /usr/local/sge/pro/default directory and run the CE configuration again.

---++ WN Installation

Please install the following SGE packages:

{{{
sge-V60u7_1-3.i386.rpm
sge-utils-V60u7_1-3.i386.rpm
sge-daemons-V60u7_1-3.i386.rpm
sge-parallel-V60u7_1-3.i386.rpm
sge-docs-V60u7_1-3.i386.rpm
}}}

{{{
[root@<your_wn> ~]# rpm -ivh sge-V60u7_1-3.i386.rpm sge-utils-V60u7_1-3.i386.rpm sge-daemons-V60u7_1-3.i386.rpm sge-parallel-V60u7_1-3.i386.rpm sge-docs-V60u7_1-3.i386.rpm
Preparing...                ########################################### [100%]
   1:sge                    ########################################### [ 20%]
   2:sge-utils              ########################################### [ 40%]
   3:sge-daemons            ########################################### [ 60%]
   4:sge-parallel           ########################################### [ 80%]
   5:sge-docs               ########################################### [100%]
}}}

   * Install gliteWN-yaimtosge-0.0.0-2.i386.rpm, which includes the modifications to the standard yaim tool allowing the SGE client configuration.

{{{
[root@<your_wn> ~]# rpm -ivh gliteWN-yaimtosge-0.0.0-2.i386.rpm
Preparing...                ########################################### [100%]
   1:gliteWN-yaimtosge      ########################################### [100%]
}}}

   * Use the same site-info.def file as in the Gatekeeper case. This file should already include definitions for the “SGE_QMASTER”, “DEFAULT_DOMAIN” and “ADMIN_MAIL” variables.
   * Configure the WN using the “WN_sge” node definition.

{{{
[root@<your_wn> ~]# /opt/glite/yaim/scripts/configure_node <path_to_your_site-info.def_file> WN_sge
}}}

---++ Testing:

Test the information system using the following commands:

{{{
ldapsearch -x -h <your_ce> -p 2135 -b "mds-vo-name=local,o=grid"
ldapsearch -x -h <your_ce> -p 2170 -b "mds-vo-name=<site_name>,o=grid"
}}}

   * Check that it returns the proper queue names and available resources.
   * Try to submit a simple script from a given pool account on your CE. This test checks whether the SGE command-line tools (like qsub or qstat) are working. If the job finishes successfully, the stdout and stderr files won't be available on the CE since, in normal grid operation, they would be transferred directly from the WN to the RB using GSIFTP. Try checking the stderr/stdout files on the WN instead.
   * Try to do a globus-job-run using fork from a UI (you have to create your proxy first):

{{{
[goncalo@ui01]$ globus-job-run ce03.lip.pt:2119/jobmanager-fork /bin/uname -a
Linux ce03.lip.pt 2.6.9-34.EL.cern #1 Sun Mar 12 12:19:53 CET 2006 i686 athlon i386 GNU/Linux
}}}

   * Try to do a globus-job-run using lcgsge from a UI:

{{{
[goncalo@ui01]$ globus-job-run ce03.lip.pt:2119/jobmanager-lcgsge /bin/uname -a
Linux sgewn01.lip.pt 2.6.9-34.EL.cern #1 Sun Mar 12 12:19:53 CET 2006 i686 i686 i386 GNU/Linux
}}}

   * Try to submit a job through the RB from a UI:

{{{
[goncalo@ui01]$ edg-job-submit -r ce03.lip.pt:2119/jobmanager-lcgsge-dteamgrid well.jdl
Selected Virtual Organisation name (from proxy certificate extension): dteam
Connecting to host rb02.lip.pt, port 7772
Logging to host rb02.lip.pt, port 9002
*********************************************************************************************
JOB SUBMIT OUTCOME
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:
- https://rb02.lip.pt:9000/Ab0W2EpWMPkpJKjAMpRCsQ
*********************************************************************************************

[goncalo@ui01 ce02]$ edg-job-status https://rb02.lip.pt:9000/Ab0W2EpWMPkpJKjAMpRCsQ
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://rb02.lip.pt:9000/Ab0W2EpWMPkpJKjAMpRCsQ
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: ce03.lip.pt:2119/jobmanager-lcgsge-dteamgrid
reached on: Fri Feb 2 18:42:38 2007
*************************************************************

[goncalo@ui01 ce02]$ cat /tmp/jobOutput/goncalo_Ab0W2EpWMPkpJKjAMpRCsQ/well.out
One Perl out of the sea!
This is Linux sgewn01.lip.pt 2.6.9-34.EL.cern #1 Sun Mar 12 12:19:53 CET 2006 i686 i686 i386 GNU/Linux
Fri Feb 2 18:31:14 WET 2007
}}}
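
When the RB test is repeated for several jobs, the status can be pulled out of the edg-job-status output shown above with standard tools. A minimal sketch, assuming the output contains a =Current Status:= line formatted as in the example; the function name =job_current_status= is hypothetical and not part of the edg tools.

```shell
# Hypothetical helper: read edg-job-status output on stdin and print
# only the text that follows "Current Status:".
job_current_status() {
    # -n suppresses default output; the p flag prints only lines where the substitution matched
    sed -n 's/^[[:space:]]*Current Status:[[:space:]]*//p'
}
```

It could be used as =edg-job-status <edg_jobId> | job_current_status=. Lines without the =Current Status:= prefix are dropped, so the helper prints nothing if the output format differs from the one assumed here.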
Topic revision: r8 - 2007-05-21 - JavierLopezCacheiro