YAIM cluster configuration: phase 1

Introduction

This wiki page describes the steps needed to test the new yaim cluster module, which contains the configuration of the Glue cluster and Glue subcluster entities. In phase 1, the idea is to configure one cluster, one subcluster and one lcg CE on the same host.

The relevant yaim modules that are needed to test the new cluster configuration are:

  • yaim lcg ce: this module has been modified to include new variables and to remove the code that configures the Glue cluster and Glue subcluster entities. Some functions have been transferred from the lcg CE to the CLUSTER node type (i.e. config_gip_software_plugin and config_info_service_rtepublish).
  • yaim cluster: this is a new module that contains the configuration of the Glue cluster and Glue subcluster entities.
  • yaim torque server: it hasn't been changed. While the lcg CE now uses the new variables, the torque server still uses the old ones, so new and old variables have to coexist at this point.
  • yaim torque utils: it hasn't been changed either, but it isn't affected by the new cluster configuration anyway.

The goal is to install and configure an lcg CE with the new yaim cluster module and perform the usual lcg CE tests.

Installation instructions

Clean installation

In order to test the new cluster configuration, you can install the following metapackages:

lcg-CE
glite-TORQUE_server
glite-TORQUE_utils

Optionally, you can also install a glite-BDII, if you want to run a site BDII.

Then you should upgrade the glite-yaim-lcg-ce rpm by running:

rpm -U /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-lcg-ce-5.0.0-1.noarch.rpm
or
rpm -U http://grid-deployment.web.cern.ch/grid-deployment/yaim/testing/cluster-testing/glite-yaim-lcg-ce-5.0.0-1.noarch.rpm

And install the new cluster configuration yaim module by running:

rpm -U /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-cluster-1.0.0-2.noarch.rpm
or
rpm -U http://grid-deployment.web.cern.ch/grid-deployment/yaim/testing/cluster-testing/glite-yaim-cluster-1.0.0-2.noarch.rpm

Now follow the configuration instructions.

Upgrade

In order to test the upgrade path, first install the following metapackages:

lcg-CE
glite-TORQUE_server
glite-TORQUE_utils

Optionally, you can also install a glite-BDII, if you want to run a site BDII.

Run yaim to configure your services:

./yaim -c -s site-info.def -n lcg-CE (-n BDII_site) -n TORQUE_server -n TORQUE_utils

Then upgrade the glite-yaim-lcg-ce rpm by running:

rpm -U /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-lcg-ce-5.0.0-1.noarch.rpm

And install the new cluster configuration yaim module by running:

rpm -i /afs/cern.ch/project/gd/www/yaim/testing/cluster-testing/glite-yaim-cluster-1.0.0-2.noarch.rpm

Now follow the configuration instructions.

Configuration instructions

Since there is a set of new variables, you will need to modify your usual site-info.def:

lcg CE

Mandatory variables for the lcg CE: You'll find them under /opt/glite/yaim/examples/services/lcg-ce:

The new variable names follow this syntax:

  • In general, in variable names based on hostnames, queues or VOViews, the characters '.' and '-' should be transformed into '_'.
  • <host-name>: identifier that corresponds to the CE hostname in lower case. Example: ctb-generic-1.cern.ch -> ctb_generic_1_cern_ch
  • <queue-name>: identifier that corresponds to the queue in upper case. Example: dteam -> DTEAM
  • <voview-name>: identifier that corresponds to the VOView id in upper case. '/' and '=' should also be transformed into '_'. Example: /dteam/Role=admin -> DTEAM_ROLE_ADMIN (see the shell sketch after this list)
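
The transformations can be expressed in shell as follows. This is just an illustrative sketch of the naming convention, not YAIM code; the hostname, queue and FQAN values are examples:

echo "ctb-generic-1.cern.ch" | tr '.-' '__'
# -> ctb_generic_1_cern_ch

echo "dteam" | tr '[:lower:]' '[:upper:]'
# -> DTEAM

echo "/dteam/Role=admin" | tr '/=.-' '____' | tr '[:lower:]' '[:upper:]' | sed 's/^_//'
# -> DTEAM_ROLE_ADMIN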

| Variable Name | Description | Value type | Version |
| CE_HOST_<host-name>_CLUSTER_UniqueID | UniqueID of the cluster the CE belongs to | string | glite-yaim-lcg-ce 4.0.5-1 |
| CE_InfoApplicationDir | Prefix of the experiment software directory in a site. This variable has been renamed in the new infosys configuration. The old variable name was VO_SW_DIR. This parameter can be defined per CE, queue, site or VOView. See /opt/glite/yaim/examples/services/lcg-ce for examples. | string | glite-yaim-lcg-ce 4.0.5-1 |

The following variables will in the future be distributed in site-info.def, since they affect other yaim modules. At the moment we are in a transition phase to migrate to the new variable names.

| Variable Name | Description | Value type | Version |
| CE_HOST_<host-name>_CE_TYPE | CE type: 'jobmanager' for the lcg CE and 'cream' for the cream CE | string | glite-yaim-lcg-ce 4.0.5-1 |
| CE_HOST_<host-name>_QUEUES | Space separated list of the queue names configured in the CE. This variable has been renamed in the new infosys configuration. The old variable name was QUEUES. | string | glite-yaim-lcg-ce 4.0.5-1 |
| CE_HOST_<host-name>_QUEUE_<queue-name>_CE_AccessControlBaseRule | Space separated list of FQANs and/or VO names which are allowed to access the queues configured in the CE. This variable has been renamed in the new infosys configuration. The old variable name was <queue-name>_GROUP_ENABLE. | string | glite-yaim-lcg-ce 4.0.5-1 |
| CE_HOST_<host-name>_CE_InfoJobManager | The name of the job manager used by the gatekeeper. This variable has been renamed in the new infosys configuration. The old variable name was JOB_MANAGER. Please define one of: lcgpbs, lcglsf, lcgsge or lcgcondor. | string | glite-yaim-lcg-ce 4.0.5-1 |
| JOB_MANAGER | The old variable is still needed since config_jobmanager in yaim core hasn't been modified to use the new variable. To be done. | string | OLD variable |
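
For example, a minimal site-info.def excerpt for a hypothetical CE ctb-generic-1.cern.ch with a single dteam queue could look like this (all values are illustrative, not defaults):

CE_HOST_ctb_generic_1_cern_ch_CLUSTER_UniqueID="my-cluster"
CE_HOST_ctb_generic_1_cern_ch_CE_TYPE="jobmanager"
CE_HOST_ctb_generic_1_cern_ch_QUEUES="dteam"
CE_HOST_ctb_generic_1_cern_ch_QUEUE_DTEAM_CE_AccessControlBaseRule="dteam"
CE_HOST_ctb_generic_1_cern_ch_CE_InfoJobManager="lcgpbs"
# Old variable, still required by config_jobmanager in yaim core:
JOB_MANAGER="lcgpbs"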

Default variables for the lcg CE: You'll find them under:

  • /opt/glite/yaim/defaults/lcg-ce.pre:

It contains a list of CE_* variables with default values. These are the Glue schema properties belonging to the Computing Element and VOView entities. By default, these variables are specified per CE, but they can also be specified per queue or per VOView, depending on whether all the VOViews of a queue should share a specific value or whether a certain VOView should have a value of its own. For example, if I define in site-info.def:

# In the CE vtb-generic-17.cern.ch, in the queue dteam, in the VOView dteam,
# I want the default value of StateWaitingJobs to be 666666
CE_HOST_vtb_generic_17_cern_ch_QUEUE_DTEAM_VOVIEW_DTEAM_CE_StateWaitingJobs=666666

Or I can also define:

# In the CE vtb-generic-17.cern.ch, in the queue dteam, in all the supported VOViews,
# I want the default value of StateWaitingJobs to be 666666
CE_HOST_vtb_generic_17_cern_ch_QUEUE_DTEAM_CE_StateWaitingJobs=666666

If none of the above is defined, the default value for the whole CE, defined in /opt/glite/yaim/defaults/lcg-ce.pre, is taken.

The variables that can be redefined per CE-queue are:

CE_VAR="
ImplementationName
ImplementationVersion
InfoGatekeeperPort
InfoLRMSType
InfoLRMSVersion
InfoJobManager
InfoApplicationDir
InfoDataDir
InfoDefaultSE
InfoTotalCPUs
StateEstimatedResponseTime
StateRunningJobs
StateStatus
StateTotalJobs
StateWaitingJobs
StateWorstResponseTime
StateFreeJobSlots
StateFreeCPUs
PolicyMaxCPUTime
PolicyMaxObtainableCPUTime
PolicyMaxRunningJobs
PolicyMaxWaitingJobs
PolicyMaxTotalJobs
PolicyMaxWallClockTime
PolicyMaxObtainableWallClockTime
PolicyPriority
PolicyAssignedJobSlots
PolicyMaxSlotsPerJob
PolicyPreemption"

The variables that can additionally be redefined per CE-queue-VOView are:

VOVIEW_VAR="
StateRunningJobs
StateWaitingJobs
StateTotalJobs
StateFreeJobSlots
StateEstimatedResponseTime
StateWorstResponseTime
InfoDefaultSE
InfoApplicationDir
InfoDataDir
"

If the Glue schema supports variables other than the ones defined here, you can add new ones by redefining CE_VAR and/or VOVIEW_VAR in site-info.def. YAIM uses the list of variables contained in CE_VAR and VOVIEW_VAR to create the ldif file.
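
For example, to publish one extra VOView attribute, redefine the whole list in site-info.def. NewGlueAttribute below is a made-up placeholder; use a real Glue schema property name. Note that the redefinition replaces the default list from lcg-ce.pre rather than merging with it, so the full list must be repeated:

VOVIEW_VAR="
StateRunningJobs
StateWaitingJobs
StateTotalJobs
StateFreeJobSlots
StateEstimatedResponseTime
StateWorstResponseTime
InfoDefaultSE
InfoApplicationDir
InfoDataDir
NewGlueAttribute
"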

  • /opt/glite/yaim/defaults/lcg-ce.post:

It defines some auxiliary variables:

| Variable Name | Description | Value type | Default Value | Version |
| CE_ImplementationVersion | The version of the implementation. This should probably be in .pre instead of .post. | version | 3.1 | glite-yaim-lcg-ce 4.0.5-1 |
| CE_InfoLRMSType | Type of the underlying Resource Management System | string | ${CE_BATCH_SYS} | glite-yaim-lcg-ce 4.0.5-1 |
| STATIC_CREATE | Path to the script that creates the ldif file | path | ${INSTALL_ROOT}/glite/sbin/glite-info-static-create | glite-yaim-lcg-ce 4.0.5-1 |
| TEMPLATE_DIR | Path to the ldif templates directory | path | ${INSTALL_ROOT}/glite/etc | glite-yaim-lcg-ce 4.0.5-1 |
| CONF_DIR | Path to the temporary configuration directory | path | ${INSTALL_ROOT}/glite/var/tmp/gip | glite-yaim-lcg-ce 4.0.5-1 |
| LDIF_DIR | Path to the ldif directory | path | ${INSTALL_ROOT}/glite/etc/gip/ldif | glite-yaim-lcg-ce 4.0.5-1 |
| GlueCE_ldif | Path to the GlueCE ldif file | path | ${LDIF_DIR}/static-file-CE.ldif | glite-yaim-lcg-ce 4.0.5-1 |
| GlueCESEBind_ldif | Path to the GlueCESEBind ldif file | path | ${LDIF_DIR}/static-file-CESEBind.ldif | glite-yaim-lcg-ce 4.0.5-1 |

Cluster

Mandatory variables for the cluster: You'll find them under /opt/glite/yaim/examples/services/glite-cluster:

The new variable names follow this syntax:

  • In general, in variable names based on hostnames, queues or VOViews, the characters '.' and '-' should be transformed into '_'.
  • <host-name>: identifier that corresponds to the CE hostname in lower case. Example: ctb-generic-1.cern.ch -> ctb_generic_1_cern_ch
  • <cluster-name>: identifier that corresponds to the cluster name in upper case. Example: my_cluster -> MY_CLUSTER
  • <subcluster-name>: identifier that corresponds to the subcluster name in upper case. Example: my_subcluster -> MY_SUBCLUSTER

| Variable Name | Description | Value type | Version |
| CLUSTERS | Space separated list of your cluster names, e.g. "cluster1 [cluster2 [...]]" | string list | glite-yaim-cluster 1.0.0-1 |
| CLUSTER_<cluster-name>_CLUSTER_UniqueID | Cluster UniqueID | string | glite-yaim-cluster 1.0.0-1 |
| CLUSTER_<cluster-name>_CLUSTER_Name | Cluster human readable name | string | glite-yaim-cluster 1.0.0-1 |
| CLUSTER_<cluster-name>_SITE_UniqueID | Name of the site the cluster belongs to. It should be consistent with your SITE_NAME variable. NOTE: this may be changed to SITE_UniqueID when the GlueSite is configured with the new infosys variables. | string | glite-yaim-cluster 1.0.0-1 |
| CLUSTER_<cluster-name>_CE_HOSTS | Space separated list of CE hostnames configured in the cluster | hostname list | glite-yaim-cluster 1.0.0-1 |
| CLUSTER_<cluster-name>_SUBCLUSTERS | Space separated list of your subcluster names, e.g. "subcluster1 [subcluster2 [...]]" | string list | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_SUBCLUSTER_UniqueID | Subcluster UniqueID | string | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_ApplicationSoftwareRunTimeEnvironment | Pipe separated list of software tags: "sw1 [| sw2 [| ...]]". Old variable: CE_RUNTIMEENV | string list | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_ArchitectureSMPSize | Old variable: CE_SMPSIZE | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_ArchitecturePlatformType | Old variable: CE_OS_ARCH | string | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_BenchmarkSF00 | Old variable: CE_SF00 | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_BenchmarkSI00 | Old variable: CE_SI00 | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_MainMemoryRAMSize | Old variable: CE_MINPHYSMEM | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_MainMemoryVirtualSize | Old variable: CE_MINVIRTMEM | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_NetworkAdapterInboundIP | Old variable: CE_INBOUNDIP | boolean | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_NetworkAdapterOutboundIP | Old variable: CE_OUTBOUNDIP | boolean | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_OperatingSystemName | Old variable: CE_OS | OS name | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_OperatingSystemRelease | Old variable: CE_OS_RELEASE | OS release | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_OperatingSystemVersion | Old variable: CE_OS_VERSION | OS version | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_ProcessorClockSpeed | Old variable: CE_CPU_SPEED | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_ProcessorModel | Old variable: CE_CPU_MODEL | string | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_HOST_ProcessorVendor | Old variable: CE_CPU_VENDOR | string | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_SUBCLUSTER_Name | Subcluster human readable name | string | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_SUBCLUSTER_PhysicalCPUs | Old variable: CE_PHYSCPU | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_SUBCLUSTER_LogicalCPUs | Old variable: CE_LOGCPU | number | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_SUBCLUSTER_TmpDir | tmp directory | path | glite-yaim-cluster 1.0.0-1 |
| SUBCLUSTER_<subcluster-name>_SUBCLUSTER_WNTmpDir | WN tmp directory | path | glite-yaim-cluster 1.0.0-1 |
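
A minimal phase 1 example (one cluster, one subcluster, one CE) in site-info.def might look like this; all names and values are illustrative:

CLUSTERS="mycluster"
CLUSTER_MYCLUSTER_CLUSTER_UniqueID="ctb-generic-1.cern.ch"
CLUSTER_MYCLUSTER_CLUSTER_Name="My test cluster"
CLUSTER_MYCLUSTER_SITE_UniqueID="MY-SITE"   # must be consistent with SITE_NAME
CLUSTER_MYCLUSTER_CE_HOSTS="ctb-generic-1.cern.ch"
CLUSTER_MYCLUSTER_SUBCLUSTERS="mysubcluster"
SUBCLUSTER_MYSUBCLUSTER_SUBCLUSTER_UniqueID="mysubcluster"
SUBCLUSTER_MYSUBCLUSTER_SUBCLUSTER_Name="My test subcluster"
SUBCLUSTER_MYSUBCLUSTER_HOST_OperatingSystemName="ScientificSL"
SUBCLUSTER_MYSUBCLUSTER_HOST_ArchitectureSMPSize=2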

Default variables for the cluster: You'll find them under

  • /opt/glite/yaim/defaults/glite-cluster.pre:

It contains the list of variables that can be configured per Subcluster. They belong to the Host and Subcluster entities in the Glue schema:

HOST_VAR="
ApplicationSoftwareRunTimeEnvironment 
ArchitectureSMPSize 
ArchitecturePlatformType 
BenchmarkSF00 
BenchmarkSI00 
MainMemoryRAMSize 
MainMemoryVirtualSize 
NetworkAdapterInboundIP 
NetworkAdapterOutboundIP 
OperatingSystemName 
OperatingSystemRelease 
OperatingSystemVersion 
ProcessorClockSpeed 
ProcessorModel 
ProcessorVendor"

SUBCLUSTER_VAR="
Name 
UniqueID 
PhysicalCPUs 
LogicalCPUs 
TmpDir 
WNTmpDir"

If the Glue schema supports variables other than the ones defined here, you can add new ones by redefining HOST_VAR and/or SUBCLUSTER_VAR in site-info.def. YAIM uses the list of variables contained in HOST_VAR and SUBCLUSTER_VAR to create the ldif file.
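
The pattern is the same as for CE_VAR and VOVIEW_VAR above. For instance, to publish one extra subcluster attribute (NewGlueSubClusterAttribute is a made-up placeholder; the full default list must be repeated since the redefinition replaces it):

SUBCLUSTER_VAR="
Name
UniqueID
PhysicalCPUs
LogicalCPUs
TmpDir
WNTmpDir
NewGlueSubClusterAttribute"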

  • /opt/glite/yaim/defaults/glite-cluster.post:

It defines some auxiliary variables:

| Variable Name | Description | Value type | Default Value | Version |
| STATIC_CREATE | Path to the script that creates the ldif file | path | ${INSTALL_ROOT}/glite/sbin/glite-info-static-create | glite-yaim-cluster 1.0.0-1 |
| TEMPLATE_DIR | Path to the ldif templates directory | path | ${INSTALL_ROOT}/glite/etc | glite-yaim-cluster 1.0.0-1 |
| CONF_DIR | Path to the temporary configuration directory | path | ${INSTALL_ROOT}/glite/var/tmp/gip | glite-yaim-cluster 1.0.0-1 |
| LDIF_DIR | Path to the ldif directory | path | ${INSTALL_ROOT}/glite/etc/gip/ldif | glite-yaim-cluster 1.0.0-1 |
| GlueCluster_OUTFILE | Path to the temporary file that will be used to create the ldif file | path | ${CONF_DIR}/glite-info-static-cluster.conf | glite-yaim-cluster 1.0.0-1 |
| GlueCluster_ldif | Path to the Glue Cluster ldif file | path | ${LDIF_DIR}/static-file-Cluster.ldif | glite-yaim-cluster 1.0.0-1 |

Torque server

Since I haven't modified the code of this yaim module, the following variables are still needed, even though there are new variables replacing them (see the example after this list):

  • QUEUES
  • <queue-name>_GROUP_ENABLE
  • CE_SMPSIZE
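
For instance, with a single dteam queue, the old-style definitions coexist with the new ones in site-info.def (values are illustrative):

QUEUES="dteam"
DTEAM_GROUP_ENABLE="dteam"
CE_SMPSIZE=2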

YAIM command

Once you have defined all the needed variables, configure the lcg CE by running:

./yaim -c -s site-info.def -n lcg-CE -n glite-CLUSTER (-n BDII_site) -n TORQUE_server -n TORQUE_utils

What to test

  • Define only one cluster, one subcluster and one CE.
  • It's important to test both an upgrade and a clean installation.
  • Test basic job submission and the usual lcg CE related tests that would be executed to certify a new release of the lcg CE.
  • Define new Glue schema variables for the CE, VOView, Host and Subcluster entities (not sure whether YAIM already defines all the existing ones). Are they included in the ldif file?
  • Define CE and VOView entity variables also per queue and per queue-VOView to test that this feature actually works. Are they really taken into account? Check the ldif file (see the sketch after this list).
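
A sketch of how to check what has been published. The ldif paths follow the .post defaults above (assuming INSTALL_ROOT=/opt); the ldapsearch port and base DN are assumptions that depend on your BDII setup:

# Inspect the generated ldif files:
grep GlueSubClusterUniqueID /opt/glite/etc/gip/ldif/static-file-Cluster.ldif
grep GlueCEUniqueID /opt/glite/etc/gip/ldif/static-file-CE.ldif

# If a site BDII is running (commonly on port 2170), query the published objects:
ldapsearch -x -h <bdii-host> -p 2170 -b o=grid '(objectClass=GlueSubCluster)'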

Feedback

  • Is it easy to use the new variables?
  • Comments on the complexity of the new way to configure the information system.
  • Reports on bugs and other issues.

-- MariaALANDESPRADILLO - 01 Sep 2008
