---+ YAIM cluster configuration: phase 1

%TOC%

---++ Introduction

This wiki page describes the steps needed to test the new yaim cluster module, which contains the configuration of the Glue cluster and Glue subcluster entities. In phase 1, the idea is to configure one cluster, one subcluster and one lcg CE on the same host.

The relevant yaim modules needed to test the new cluster configuration are:
   * yaim lcg ce: this module has been modified to include new variables and to remove the code that configures the Glue cluster and Glue subcluster entities. Some functions have been transferred from the lcg CE to the CLUSTER node type (namely =config_gip_software_plugin= and =config_info_service_rtepublish=).
   * yaim cluster: this is a new module that contains the configuration of the Glue cluster and Glue subcluster entities.
   * yaim torque server: it hasn't been changed. While the lcg CE has been changed to use the new variables, the Torque server still uses the old ones, so new and old variables have to coexist at this point.
   * yaim torque utils: it hasn't been changed, but it's not affected by the new cluster configuration.

The goal is to install and configure an lcg CE with the new yaim cluster module and perform the usual lcg CE tests.

---++ Installation instructions

---+++ Clean installation

In order to test the new cluster configuration, install the following metapackages:
<verbatim>
lcg-CE
glite-TORQUE_server
glite-TORQUE_utils
</verbatim>
Optionally, you can also install a =glite-BDII= if you want to run a site BDII.

Then upgrade the glite-yaim-lcg-ce rpm by running:
<verbatim>rpm -U /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-lcg-ce-5.0.0-1.noarch.rpm</verbatim>
or
<verbatim>rpm -U http://grid-deployment.web.cern.ch/grid-deployment/yaim/testing/cluster-testing/glite-yaim-lcg-ce-5.0.0-1.noarch.rpm</verbatim>

And install the new cluster configuration yaim module by running:
<verbatim>rpm -U /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-cluster-1.0.0-2.noarch.rpm</verbatim>
or
<verbatim>rpm -U http://grid-deployment.web.cern.ch/grid-deployment/yaim/testing/cluster-testing/glite-yaim-cluster-1.0.0-2.noarch.rpm</verbatim>

Now follow the configuration instructions.

---+++ Upgrade

In order to test the new cluster configuration starting from an existing configuration, first install the following metapackages:
<verbatim>
lcg-CE
glite-TORQUE_server
glite-TORQUE_utils
</verbatim>
Optionally, you can also install a =glite-BDII= if you want to run a site BDII.

Run yaim to configure your services with the current yaim version:
<verbatim>./yaim -c -s site-info.def -n lcg-CE (-n BDII_site) -n TORQUE_server -n TORQUE_utils</verbatim>

Then upgrade the glite-yaim-lcg-ce rpm by running:
<verbatim>rpm -U /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-lcg-ce-5.0.0-1.noarch.rpm</verbatim>

And install the new cluster configuration yaim module by running:
<verbatim>rpm -i /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-cluster-1.0.0-2.noarch.rpm</verbatim>

Now follow the configuration instructions.
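Whichever installation path you followed, before configuring it can be useful to check that the intended versions of the two yaim modules are installed. A minimal check (the version strings are the ones referenced above; the exact output format depends on your rpm version):
<verbatim>
rpm -q glite-yaim-lcg-ce glite-yaim-cluster
# Expected versions, as referenced above:
#   glite-yaim-lcg-ce-5.0.0-1
#   glite-yaim-cluster-1.0.0-2
</verbatim>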
---++ Configuration instructions

Since there is a set of new variables, you will need to change your usual site-info.def.

---+++ lcg CE

*Mandatory variables* for the lcg CE: you'll find them under =/opt/glite/yaim/examples/services/lcg-ce=.

The new variable names follow this syntax:
   * In general, '.' and '-' in hostnames, queues or VOViews should be transformed into '_' in the variable name.
   * <host-name>: identifier that corresponds to the CE hostname, in lower case. Example: ctb-generic-1.cern.ch -> ctb_generic_1_cern_ch
   * <queue-name>: identifier that corresponds to the queue, in upper case. Example: dteam -> DTEAM
   * <voview-name>: identifier that corresponds to the VOView id, in upper case. '/' and '=' should also be transformed into '_'. Example: /dteam/Role=admin -> DTEAM_ROLE_ADMIN

|Variable Name |Description |Value type |Version |
| =CE_HOST_<host-name>_CLUSTER_UniqueID= | UniqueID of the cluster the CE belongs to | string | glite-yaim-lcg-ce 4.0.5-1 |
| =CE_InfoApplicationDir= | Prefix of the experiment software directory in a site. This variable has been renamed in the new infosys configuration. The old variable name was =VO_SW_DIR=. This parameter can be defined per CE, queue, site or voview. See =/opt/glite/yaim/examples/services/lcg-ce= for examples. | string | glite-yaim-lcg-ce 4.0.5-1 |

The following variables will in the future be distributed in site-info.def, since they affect other yaim modules. At the moment we are in a transition phase to migrate to the new variable names.

|Variable Name |Description |Value type |Version |
| =CE_HOST_<host-name>_CE_TYPE= | CE type: 'jobmanager' for the lcg CE and 'cream' for the cream CE | string | glite-yaim-lcg-ce 4.0.5-1 |
| =CE_HOST_<host-name>_QUEUES= | Space separated list of the queue names configured in the CE. This variable has been renamed in the new infosys configuration. The old variable name was =QUEUES= | string | glite-yaim-lcg-ce 4.0.5-1 |
| =CE_HOST_<host-name>_QUEUE_<queue-name>_CE_AccessControlBaseRule= | Space separated list of FQANs and/or VO names which are allowed to access the queues configured in the CE. This variable has been renamed in the new infosys configuration. The old variable name was =<queue-name>_GROUP_ENABLE= | string | glite-yaim-lcg-ce 4.0.5-1 |
| =CE_HOST_<host-name>_CE_InfoJobManager= | The name of the job manager used by the gatekeeper. This variable has been renamed in the new infosys configuration. The old variable name was =JOB_MANAGER=. Please define one of: lcgpbs, lcglsf, lcgsge or lcgcondor | string | glite-yaim-lcg-ce 4.0.5-1 |
| =JOB_MANAGER= | The old variable is still needed since =config_jobmanager= in yaim core hasn't been modified to use the new variable. To be done. | string | OLD variable |
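As an illustration of the naming rules above, this is how the new mandatory variables could look in site-info.def for a hypothetical CE vtb-generic-17.cern.ch with a single dteam queue. The hostname and all the values are only an example, not a recommended setup:
<verbatim>
# Hypothetical CE host vtb-generic-17.cern.ch: '.' and '-' become '_' in the variable names
CE_HOST_vtb_generic_17_cern_ch_CE_TYPE=jobmanager
CE_HOST_vtb_generic_17_cern_ch_QUEUES="dteam"
CE_HOST_vtb_generic_17_cern_ch_QUEUE_DTEAM_CE_AccessControlBaseRule="dteam"
CE_HOST_vtb_generic_17_cern_ch_CE_InfoJobManager=lcgpbs
CE_HOST_vtb_generic_17_cern_ch_CLUSTER_UniqueID="mycluster"
CE_InfoApplicationDir=/opt/exp_soft
# Old variable still needed by config_jobmanager in yaim core
JOB_MANAGER=lcgpbs
</verbatim>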
*Default variables* for the lcg CE: you'll find them under:

   * =/opt/glite/yaim/defaults/lcg-ce.pre=: it contains a list of =CE_*= variables with some default values. These are the Glue schema properties belonging to the Computing Element and VOView entities. By default these variables are specified per CE, but they can also be specified per queue or per VOView, depending on whether we want all the VOViews of a queue to share a specific value or a particular VOView to have its own value.

For example, if I define in site-info.def:
<verbatim>
# In the CE vtb-generic-17.cern.ch, in the queue dteam, in the VOView dteam,
# I want the default value of StateWaitingJobs to be 666666
CE_HOST_vtb_generic_17_cern_ch_QUEUE_DTEAM_VOVIEW_DTEAM_CE_StateWaitingJobs=666666
</verbatim>
Or I can also define:
<verbatim>
# In the CE vtb-generic-17.cern.ch, in the queue dteam, in all the supported VOViews,
# I want the default value of StateWaitingJobs to be 666666
CE_HOST_vtb_generic_17_cern_ch_QUEUE_DTEAM_CE_StateWaitingJobs=666666
</verbatim>
If none of the above is defined, the default value for the whole CE, defined in =/opt/glite/yaim/defaults/lcg-ce.pre=, is taken.

The variables that can be redefined per CE-queue are:
<verbatim>
CE_VAR="
ImplementationName ImplementationVersion
InfoGatekeeperPort InfoLRMSType InfoLRMSVersion InfoJobManager
InfoApplicationDir InfoDataDir InfoDefaultSE InfoTotalCPUs
StateEstimatedResponseTime StateRunningJobs StateStatus StateTotalJobs
StateWaitingJobs StateWorstResponseTime StateFreeJobSlots StateFreeCPUs
PolicyMaxCPUTime PolicyMaxObtainableCPUTime PolicyMaxRunningJobs
PolicyMaxWaitingJobs PolicyMaxTotalJobs PolicyMaxWallClockTime
PolicyMaxObtainableWallClockTime PolicyPriority PolicyAssignedJobSlots
PolicyMaxSlotsPerJob PolicyPreemption"
</verbatim>

The variables that can moreover also be redefined per CE-queue-VOView are:
<verbatim>
VOVIEW_VAR="
StateRunningJobs StateWaitingJobs StateTotalJobs StateFreeJobSlots
StateEstimatedResponseTime StateWorstResponseTime
InfoDefaultSE InfoApplicationDir InfoDataDir"
</verbatim>

If the [[http://forge.cnaf.infn.it/plugins/scmsvn/viewcvs.php/*checkout*/v_1_3/spec/pdf/GLUESchema.pdf?rev=49&root=glueschema][Glue schema]] supports other variables than the ones defined here, you can add new ones by redefining =CE_VAR= and/or =VOVIEW_VAR= in site-info.def. It is the list of variables contained in =CE_VAR= and =VOVIEW_VAR= that YAIM uses to create the ldif file.

   * =/opt/glite/yaim/defaults/lcg-ce.post=: it defines some auxiliary variables:

| Variable Name | Description | Value type | Default Value | Version |
| =CE_ImplementationVersion= | The version of the implementation. This should probably be in .pre instead of .post | version | =3.1= | glite-yaim-lcg-ce 4.0.5-1 |
| =CE_InfoLRMSType= | Type of the underlying Resource Management System | string | =${CE_BATCH_SYS}= | glite-yaim-lcg-ce 4.0.5-1 |
| =STATIC_CREATE= | Path to the script that creates the ldif file | path | =${INSTALL_ROOT}/glite/sbin/glite-info-static-create= | glite-yaim-lcg-ce 4.0.5-1 |
| =TEMPLATE_DIR= | Path to the ldif templates directory | path | =${INSTALL_ROOT}/glite/etc= | glite-yaim-lcg-ce 4.0.5-1 |
| =CONF_DIR= | Path to the temporary configuration directory | path | =${INSTALL_ROOT}/glite/var/tmp/gip= | glite-yaim-lcg-ce 4.0.5-1 |
| =LDIF_DIR= | Path to the ldif directory | path | =${INSTALL_ROOT}/glite/etc/gip/ldif= | glite-yaim-lcg-ce 4.0.5-1 |
| =GlueCE_ldif= | Path to the GlueCE ldif file | path | =${LDIF_DIR}/static-file-CE.ldif= | glite-yaim-lcg-ce 4.0.5-1 |
| =GlueCESEBind_ldif= | Path to the GlueCESEBind ldif file | path | =${LDIF_DIR}/static-file-CESEBind.ldif= | glite-yaim-lcg-ce 4.0.5-1 |
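With the default paths above, and assuming =INSTALL_ROOT=/opt= (as in the paths used throughout this page), a quick way to check after configuration that an override like the =StateWaitingJobs= example actually ends up in the generated GlueCE ldif file is:
<verbatim>
grep GlueCEStateWaitingJobs /opt/glite/etc/gip/ldif/static-file-CE.ldif
</verbatim>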
---+++ Cluster

*Mandatory variables* for the cluster: you'll find them under =/opt/glite/yaim/examples/services/glite-cluster=.

The new variable names follow this syntax:
   * In general, '.' and '-' in hostnames, queues or VOViews should be transformed into '_' in the variable name.
   * <host-name>: identifier that corresponds to the CE hostname, in lower case. Example: ctb-generic-1.cern.ch -> ctb_generic_1_cern_ch
   * <cluster-name>: identifier that corresponds to the cluster name, in upper case. Example: my_cluster -> MY_CLUSTER
   * <subcluster-name>: identifier that corresponds to the subcluster name, in upper case. Example: my_subcluster -> MY_SUBCLUSTER

|Variable Name |Description |Value type |Version |
| =CLUSTERS= | Space separated list of your cluster names, e.g. ="cluster1 [cluster2 [...]]"= | string list | glite-yaim-cluster 1.0.0-1 |
| =CLUSTER_<cluster-name>_CLUSTER_UniqueID= | Cluster UniqueID | string | glite-yaim-cluster 1.0.0-1 |
| =CLUSTER_<cluster-name>_CLUSTER_Name= | Cluster human readable name | string | glite-yaim-cluster 1.0.0-1 |
| =CLUSTER_<cluster-name>_SITE_UniqueID= | Name of the site the cluster belongs to. It should be consistent with your =SITE_NAME= variable. NOTE: this may be changed to SITE_UniqueID when the GlueSite is configured with the new infosys variables | string | glite-yaim-cluster 1.0.0-1 |
| =CLUSTER_<cluster-name>_CE_HOSTS= | Space separated list of CE hostnames configured in the cluster | hostname list | glite-yaim-cluster 1.0.0-1 |
| =CLUSTER_<cluster-name>_SUBCLUSTERS= | Space separated list of your subcluster names, e.g. ="subcluster1 [subcluster2 [...]]"= | string list | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_SUBCLUSTER_UniqueID= | Subcluster UniqueID | string | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_ApplicationSoftwareRunTimeEnvironment= | Pipe separated list of software runtime environment tags. Old variable: =CE_RUNTIMEENV= | string list | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_ArchitectureSMPSize= | old =CE_SMPSIZE= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_ArchitecturePlatformType= | old =CE_OS_ARCH= | string | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_BenchmarkSF00= | old =CE_SF00= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_BenchmarkSI00= | old =CE_SI00= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_MainMemoryRAMSize= | old =CE_MINPHYSMEM= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_MainMemoryVirtualSize= | old =CE_MINVIRTMEM= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_NetworkAdapterInboundIP= | old =CE_INBOUNDIP= | boolean | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_NetworkAdapterOutboundIP= | old =CE_OUTBOUNDIP= | boolean | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_OperatingSystemName= | old =CE_OS= | OS name | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_OperatingSystemRelease= | old =CE_OS_RELEASE= | OS release | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_OperatingSystemVersion= | old =CE_OS_VERSION= | OS version | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_ProcessorClockSpeed= | old =CE_CPU_SPEED= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_ProcessorModel= | old =CE_CPU_MODEL= | string | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_HOST_ProcessorVendor= | old =CE_CPU_VENDOR= | string | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_SUBCLUSTER_Name= | Subcluster human readable name | string | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_SUBCLUSTER_PhysicalCPUs= | old =CE_PHYSCPU= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_SUBCLUSTER_LogicalCPUs= | old =CE_LOGCPU= | number | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_SUBCLUSTER_TmpDir= | tmp directory | path | glite-yaim-cluster 1.0.0-1 |
| =SUBCLUSTER_<subcluster-name>_SUBCLUSTER_WNTmpDir= | WN tmp directory | path | glite-yaim-cluster 1.0.0-1 |
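Continuing the hypothetical example above (one CE vtb-generic-17.cern.ch, one cluster, one subcluster), the corresponding cluster variables in site-info.def could look like the sketch below. Only a few of the =SUBCLUSTER_*_HOST_*= attributes are shown; in a real setup all the mandatory ones from the table apply, and all names and values here are illustrative only (the cluster UniqueID matches the one used in the lcg CE example above):
<verbatim>
# One cluster with one subcluster, both served by the CE vtb-generic-17.cern.ch
CLUSTERS="mycluster"
CLUSTER_MYCLUSTER_CLUSTER_UniqueID="mycluster"
CLUSTER_MYCLUSTER_CLUSTER_Name="My test cluster"
CLUSTER_MYCLUSTER_SITE_UniqueID="MY-SITE"          # must be consistent with SITE_NAME
CLUSTER_MYCLUSTER_CE_HOSTS="vtb-generic-17.cern.ch"
CLUSTER_MYCLUSTER_SUBCLUSTERS="mysubcluster"
SUBCLUSTER_MYSUBCLUSTER_SUBCLUSTER_UniqueID="vtb-generic-17.cern.ch"
SUBCLUSTER_MYSUBCLUSTER_SUBCLUSTER_Name="My test subcluster"
SUBCLUSTER_MYSUBCLUSTER_HOST_OperatingSystemName="ScientificCERNSLC"
SUBCLUSTER_MYSUBCLUSTER_HOST_OperatingSystemRelease="4.7"
SUBCLUSTER_MYSUBCLUSTER_SUBCLUSTER_PhysicalCPUs=2
SUBCLUSTER_MYSUBCLUSTER_SUBCLUSTER_LogicalCPUs=4
</verbatim>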
*Default variables* for the cluster: you'll find them under:

   * =/opt/glite/yaim/defaults/glite-cluster.pre=: it contains the list of variables that can be configured per subcluster. They belong to the _Host_ and _Subcluster_ entities in the Glue schema:

<verbatim>
HOST_VAR="
ApplicationSoftwareRunTimeEnvironment ArchitectureSMPSize ArchitecturePlatformType
BenchmarkSF00 BenchmarkSI00
MainMemoryRAMSize MainMemoryVirtualSize
NetworkAdapterInboundIP NetworkAdapterOutboundIP
OperatingSystemName OperatingSystemRelease OperatingSystemVersion
ProcessorClockSpeed ProcessorModel ProcessorVendor"

SUBCLUSTER_VAR="
Name UniqueID PhysicalCPUs LogicalCPUs TmpDir WNTmpDir"
</verbatim>

If the [[http://forge.cnaf.infn.it/plugins/scmsvn/viewcvs.php/*checkout*/v_1_3/spec/pdf/GLUESchema.pdf?rev=49&root=glueschema][Glue schema]] supports other variables than the ones defined here, you can add new ones by redefining =HOST_VAR= and/or =SUBCLUSTER_VAR= in site-info.def. It is the list of variables contained in =HOST_VAR= and =SUBCLUSTER_VAR= that YAIM uses to create the ldif file.

   * =/opt/glite/yaim/defaults/glite-cluster.post=: it defines some auxiliary variables:

| Variable Name | Description | Value type | Default Value | Version |
| =STATIC_CREATE= | Path to the script that creates the ldif file | path | =${INSTALL_ROOT}/glite/sbin/glite-info-static-create= | glite-yaim-cluster 1.0.0-1 |
| =TEMPLATE_DIR= | Path to the ldif templates directory | path | =${INSTALL_ROOT}/glite/etc= | glite-yaim-cluster 1.0.0-1 |
| =CONF_DIR= | Path to the temporary configuration directory | path | =${INSTALL_ROOT}/glite/var/tmp/gip= | glite-yaim-cluster 1.0.0-1 |
| =LDIF_DIR= | Path to the ldif directory | path | =${INSTALL_ROOT}/glite/etc/gip/ldif= | glite-yaim-cluster 1.0.0-1 |
| =GlueCluster_OUTFILE= | Path to the temporary file that is used to create the ldif file | path | =${CONF_DIR}/glite-info-static-cluster.conf= | glite-yaim-cluster 1.0.0-1 |
| =GlueCluster_ldif= | Path to the Glue Cluster ldif file | path | =${LDIF_DIR}/static-file-Cluster.ldif= | glite-yaim-cluster 1.0.0-1 |

---+++ Torque server

Since I haven't modified the code of this yaim module, the following variables are still needed, even if there are new variables replacing them:
   * =QUEUES=
   * =<queue-name>_GROUP_ENABLE=
   * =CE_SMPSIZE=

---+++ YAIM command

Once you have defined all the needed variables, configure the lcg CE by running:
<verbatim>./yaim -c -s site-info.def -n lcg-CE -n glite-CLUSTER (-n BDII_site) -n TORQUE_server -n TORQUE_utils</verbatim>
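Before going through the checklist below, a quick sanity check of what gets published can be done by querying the information system on the CE directly. This is only a sketch, assuming the resource BDII runs on the CE host on the standard port 2170; replace the hostname with your own:
<verbatim>
# Cluster and subcluster entries published by the CE's resource BDII
ldapsearch -x -h vtb-generic-17.cern.ch -p 2170 -b mds-vo-name=resource,o=grid \
    '(|(objectClass=GlueCluster)(objectClass=GlueSubCluster))'
</verbatim>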
---++ What to test

   * Define only one cluster, one subcluster and one CE.
   * It's important to test both an upgrade and a clean installation.
   * Test basic job submission and the usual lcg CE related tests that would be executed to certify a new release of the lcg CE.
   * Define new Glue schema variables for the CE, VOView, Host and Subcluster entities (not sure if YAIM already defines all the existing ones). Are they included in the ldif file?
   * Define CE and VOView entity variables also per queue and per queue-voview to test that this feature actually works. Are they really taken into account? Check the ldif file.

---++ Feedback

   * Is it easy to use the new variables?
   * Comments on the complexity of the new way to configure the information system.
   * Reports on bugs and other issues.

-- Main.MariaALANDESPRADILLO - 01 Sep 2008