CORAL cluster configuration using Puppet
The Quattor-managed persistency cluster of physical and virtual machines was migrated in 2014 to a Puppet-managed cluster of virtual machines. The new cluster (host group in Puppet terminology) is named the coral host group, but is still used for both CORAL and COOL services.
CORAL host group (aicoral51, aicoral61, aicoral62, aicoral71, aihepos6, aihepos7)
The CORAL host group includes the following nodes:
- aicoral51 is an SLC5 build node used for the local build and validation of a legacy software branch (LCG61g) that was only supported on SLC5
- aicoral61 is an SLC6 build node used for the local build and validation of the software trunk on SLC6
- aicoral62 is an SLC6 server hosting a CoralServer instance for tests (see the PersistencyTestServers page)
- aicoral71 is a CC7 build node used for the local build and validation of the software trunk on CC7
- aihepos6 is an SLC6 node that is used to test the development and deployment of the HEP_OSlibs package for SLC6
- aihepos7 is a CC7 node that is used to test the development and deployment of the HEP_OSlibs package for CC7
Useful links
Creating and maintaining a VM in the CERN AI (Agile Infrastructure) involves interacting with two complementary systems:
- the CERN Cloud Infrastructure (Openstack), providing the virtualization layer to instantiate the VMs
- the CERN Configuration Management System (Puppet/Foreman, GIT), providing the management layer to configure the VMs
CERN AI official documentation
- CERN Cloud Infrastructure User Guide - clouddocs
- CERN Configuration Management System User Guide - configdocs
- CERN IT Monitoring User Documentation - itmon
CERN AI how-to - twiki (obsolete)
- Module and manifest development cycle (git repositories, branches, shared modules, Foreman environments) - twiki (obsolete?)
CERN AI web services
CERN AI SNOW categories
- Use "Cloud infrastructure" for Openstack related issues and to request VMs and disk space for a service related cluster
- Use "Configuration service" for Puppet/Foreman related issues
Internal documentation for IT-SDC services
- IT-SDC service organization - twiki
Manage the VMs in the CORAL cluster using Puppet
Prepare a local clone of the GIT repository
- Clone the it-puppet-hostgroup-coral repository and check out the qa branch:
  - git clone https://:@gitlab.cern.ch:8443/ai/it-puppet-hostgroup-coral.git
  - cd it-puppet-hostgroup-coral; git checkout qa
- Use git branch -a to list both local and remote branches
- The repository contains two directories, as documented in the twiki:
  - code stores Puppet manifests, files and templates
  - data stores Hiera data in yaml format
The simplest workflow is the following (a consolidated sketch of the full cycle is shown after this list):
- Modify files locally in the qa branch, then commit them locally and finally push the commits to the server:
  - git commit -a
  - git pull --rebase && git push
- Then connect to the node as root and run the puppet agent:
  - puppet agent -t
  - You may also do this after changing Foreman parameters
  - This step is strictly speaking unnecessary, as the puppet agent runs automatically every hour
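As an illustration, the full edit-commit-push-apply cycle might look as follows; the manifest path and the node name aicoral61 are just examples:

  # on a machine with access to gitlab.cern.ch: edit, commit and push in the qa branch
  cd it-puppet-hostgroup-coral
  git checkout qa
  vim code/manifests/init.pp          # example file, edit whatever is relevant
  git commit -a -m "Describe the change"
  git pull --rebase && git push

  # on the target node (e.g. aicoral61), as root: apply the change without waiting for the hourly run
  puppet agent -t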
System updates
- On SLC6, unlike with Quattor, there is no need to periodically update the O/S by hand, as system updates are performed automatically (yum distro-sync is executed in a daily cron job)
- On SLC5, you must manually run yum update (see also these slides and the sketch after this list)
  - If this gives problems, you may need to run yum clean all after the failed update and then retry
  - There is presently an issue with the latest lemon-forwarder-0.25 version: update using yum remove lemon-forwarder; yum update; yum install lemon-forwarder-0.24
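As an example, the manual update sequence on an SLC5 node, including the recovery step and the lemon-forwarder workaround mentioned above, might look like this (run as root):

  # manual O/S update (only needed on SLC5; SLC6 nodes update themselves via the daily cron job)
  yum update

  # if the update fails, clean the yum caches and retry
  yum clean all
  yum update

  # workaround for the lemon-forwarder-0.25 issue: go back to the 0.24 version
  yum remove lemon-forwarder
  yum update
  yum install lemon-forwarder-0.24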
Additional useful documentation about Puppet
Seeing what others are doing with Puppet:
- Browse the installed Puppet modules and host groups under /mnt/puppetnfsdir on aiadm (where it is mounted from puppet-nfs.cern.ch:/PuppetEnv)
- Browse the GIT repositories of all Puppet modules and host groups at https://gitlab.cern.ch/ai/it-puppet-*
Debugging Puppet:
- Check the installed top-level manifest for the CORAL cluster at /mnt/puppetnfsdir/environments/qa/hostgroups/hg_coral/manifests/init.pp
- Check the installed shared modules at /mnt/puppetnfsdir/environments/qa/modules/
- Use puppet agent -t --debug to display a lot of very verbose (not necessarily very useful) information (see the example session after this list)
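For example, a quick debugging session could look like this (the paths are those listed above; keeping a copy of the debug log in /tmp is just a suggestion):

  # on aiadm: inspect what is actually deployed for the qa environment
  less /mnt/puppetnfsdir/environments/qa/hostgroups/hg_coral/manifests/init.pp
  ls /mnt/puppetnfsdir/environments/qa/modules/

  # on the node, as root: run the agent with full debug output and keep a copy of the log
  puppet agent -t --debug 2>&1 | tee /tmp/puppet-debug.log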
Using global (built-in, Facter facts, Foreman ENC) and local variables within Puppet manifests
- Puppet variables may be local or global
  - Within a Puppet manifest, a global variable gvar can be referred to as $::gvar and a local variable lvar as $lvar
  - A global variable may also be referred to as $gvar, but this may lead to confusion and should be avoided: always use $::gvar to refer to the global variable gvar
  - Within a Puppet manifest, a global variable may never be assigned a new value (puppet runtime error), while a local variable may be assigned a value only once
- Puppet global variables may come from built-ins and Facter facts or from Foreman ENC parameters (but not from Hiera data)
- Facter facts are automatically available as Puppet variables
  - To start with, note that the facter command prints out only the core Facter facts, while facter -p also prints out additional custom facts, for instance from Puppet plugins; neither command, however, lists Foreman parameters or Hiera data (see also the command-line sketch after this list)
  - A fact from Facter is available within a Puppet manifest as the global variable $::fact, but is not available as hiera('fact') (puppet runtime error, unless a Hiera key called fact also exists)
- Foreman parameters are also automatically available as Puppet variables, through the ENC mechanism
  - If a variable var is both a fact and a Foreman parameter, the Foreman parameter is visible as $::var in Puppet manifests, but the facter command shows the unmodified fact value (this was tested using manufacturer)
- Hiera variables are not visible as Puppet variables in manifests, but they are available through an explicit hiera('var') lookup
- Note that the CERN kerberos module explicitly looks for rootegroups and rootusers first as Puppet variables (via Foreman ENC) and then as Hiera variables
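The difference between these sources can be explored from the command line. Below is a minimal sketch, assuming an interactive root shell on one of the nodes; the local variable lvar and the notify message are purely illustrative:

  # core Facter facts only (e.g. the manufacturer fact)
  facter manufacturer

  # core facts plus custom facts from Puppet plugins; still no Foreman ENC parameters or Hiera data
  facter -p

  # throw-away manifest combining a global variable ($::osfamily, from Facter) and a local one ($lvar)
  puppet apply -e '$lvar = "hello" notify { "osfamily=${::osfamily} lvar=${lvar}": }'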
Using Puppet variables within Hiera yaml files
Defining Hiera variables within Hiera yaml files
- Hiera is a hierarchical key/value lookup tool that uses a well-defined lookup order (machine, subhostgroup, environment within hostgroup, OS within hostgroup, hostgroup, common defaults)
- To debug the values of Hiera variables as seen by Puppet, you must use the ai-hiera command on aiadm, as documented in the section on debugging tools (see also the example session after this list)
  - Example for coral: try ai-hiera -n aihepos6.cern.ch sssd::interactiveallowgroups --trace
- Note that up-to-date yaml files are pulled in to aiadm as soon as git pull --rebase && git push has been executed, but you should allow a few minutes before ai-hiera returns the right values
  - Example for coral: check the dates of the files in /etc/puppet/environments/qa/hieradata/fqdns/coral/ and /etc/puppet/environments/qa/hieradata/hostgroups/coral/ to see if they have been updated
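Putting the two checks together, a typical Hiera debugging session on aiadm might be (the sssd key is the same example as above):

  # resolve a Hiera key as Puppet would see it for a given node, tracing the lookup hierarchy
  ai-hiera -n aihepos6.cern.ch sssd::interactiveallowgroups --trace

  # check whether the yaml files for the coral hostgroup have already been refreshed
  ls -l /etc/puppet/environments/qa/hieradata/fqdns/coral/
  ls -l /etc/puppet/environments/qa/hieradata/hostgroups/coral/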
Configuring user access and sudo directives
- Read the documentation about configuring user access
- To configure sudo, you may use the sudo module (installed at /mnt/puppetnfsdir/environments/qa/modules/sudo/), as described in AI-3788
- Alternatively, you may configure sudo using the file_line resource from the stdlib base module (see some examples)
- In both cases, keep the default sudo secure_path and add privileges for the relevant users
Create Puppet-managed Openstack VMs in the CORAL cluster
Before you start
Create a new Openstack project (tenant)
Create a new hostgroup in Puppet/Foreman and in GIT and configure root access
- Read the documentation about host groups and about user access
- Submit a JIRA request for a new hostgroup (Add configuration Element)
  - Host group coral has been created in Puppet via CRM-399
  - Access to the coral host group in Puppet/Foreman has been granted to additional individual users via RQF0371311 (egroups cannot be used for this purpose)
- Submit a SNOW request for a GIT repository for the new hostgroup
- Connect to the Puppet/Foreman dashboard at https://judy.cern.ch/dashboard
- Create the coral host group and all relevant subgroups in Foreman (More -> Configuration -> Host Groups)
  - Initially a coral/spare subgroup had been created, but hosts have since been moved to more specific subgroups
Configure root access for the new hostgroup via Puppet/Foreman
Choose a Puppet/Foreman environment (GIT branch)
- Read the documentation about environments (also on the twiki)
- Puppet environments correspond to GIT branches
  - Two golden environments, with their corresponding GIT branches, are always defined: master and qa
- You can start by using the qa environment
Choose an Openstack image
- Read the documentation about managing images
- Supported images on SLC6 include "CERN Server" and "Server" variants. Use "Server" variants for Puppet-managed machines (see also the slides from the 20131010 VOC meeting)
- The IT-supplied images are updated every few months to include the latest patches, therefore it is normal to use an image that is a few months old
- Access to CC7 images is restricted and can be requested via SNOW as described in the ConfigCentOS7 twiki
  - Access to CC7 images for project CORAL has been requested via RQF0378056
- List the available images on aiadm using the openstack tools:
  - . openstack-CORAL.sh; nova image-list
- Alternatively, list the available images on aiadm using the EC2 tool suite:
  - . ec2rc-CORAL.sh; euca-describe-images
- Alternatively, list the available images at https://openstack.cern.ch/dashboard/project/images_and_snapshots
Choose an Openstack flavor
- Read the documentation about flavors
- The default flavor is m1.small (search here for AIBS_VMFLAVOR_NAME)
- List the available flavors on aiadm using the openstack tools:
  - . openstack-CORAL.sh; nova flavor-list
Choose an Openstack availability zone
- Read the documentation about availability zones
- List the available availability zones on aiadm using the openstack tools:
  - . openstack-CORAL.sh; nova availability-zone-list
Create your Puppet-managed Openstack VMs using CERN AI tools
- Read the documentation about creating Puppet-managed VMs
- Create your VMs on aiadm using the CERN AI tool ai-bs-vm, after setting up the runtime environment for the Openstack CORAL project using . openstack-CORAL.sh (see also the example after this list):
  - Set AIBS_HOSTGROUP_NAME to specify the Puppet/Foreman host group
  - Set AIBS_ENVIRONMENT_NAME to specify the Puppet/Foreman environment
  - Set AIBS_VMIMAGE_NAME to specify the Openstack image
  - Set AIBS_VMFLAVOR_NAME to specify the Openstack flavor
  - Set AIBS_VMAVAILZONE_NAME to specify the Openstack availability zone
- The AI command ai-bs-vm does two things: it creates the Openstack VM and then registers it in Puppet/Foreman
  - The Openstack command nova boot (doc) or the EC2 command euca-run-instances (doc) alone would only create the VM, without registering it in Puppet/Foreman
- When the VMs are ready, you should be able to ssh in as root if your account is included in the rootusers (and/or rootegroups) variable in Foreman
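As an example, creating a new node in the coral host group with the qa environment might look as follows on aiadm; the host name aicoral99 and the exact image, flavor and availability zone strings are placeholders to be replaced with the values chosen in the previous steps:

  # set up the Openstack runtime environment for the CORAL project
  . openstack-CORAL.sh

  # describe the VM to be created (all values below are examples)
  export AIBS_HOSTGROUP_NAME=coral/hgbuild
  export AIBS_ENVIRONMENT_NAME=qa
  export AIBS_VMIMAGE_NAME="SLC6 Server - x86_64"
  export AIBS_VMFLAVOR_NAME=m1.small
  export AIBS_VMAVAILZONE_NAME=cern-geneva-a

  # create the Openstack VM and register it in Puppet/Foreman
  ai-bs-vm aicoral99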
Resize the swap and root partitions
- Read the documentation about swap and about resizing disks
- Check the current status on your VMs using the commands df -H, fdisk -l and swapon -s
- Resize the swap partition and grow the root partition using the growpart command, as described in the documentation about swap (a hedged example is sketched after this list)
- [TODO: this can also be done in puppet rather than manually, how?]
- [TODO: the growpart command does not exist on SLC5, what should be done there?]
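A possible manual sequence is sketched below; it assumes the usual image layout with the root filesystem on /dev/vda2, does not cover shrinking or recreating the swap partition, and should only be copied after checking the fdisk -l and swapon -s output on the actual VM:

  # inspect the current disk, partition and swap layout
  df -H
  fdisk -l
  swapon -s

  # grow partition 2 of /dev/vda into the free space, then grow the filesystem on it
  growpart /dev/vda 2
  resize2fs /dev/vda2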
Add storage volumes to the Openstack VMs
- Read the documentation about adding disk volumes
- Create storage volumes and attach them to your VMs on aiadm using the Openstack commands cinder create and nova volume-attach, after setting up the runtime environment using . openstack-CORAL.sh (see the sketch after this list)
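A minimal sketch, assuming a 100 GB volume called aicoral62-data attached to aicoral62 as /dev/vdb and formatted as ext4 (names, size, device and mount point are all examples):

  # on aiadm: create the volume and attach it to the VM
  . openstack-CORAL.sh
  cinder create --display-name aicoral62-data 100
  nova volume-attach aicoral62 <volume-id> /dev/vdb

  # on the VM, as root: format and mount the new volume
  mkfs.ext4 /dev/vdb
  mkdir -p /data
  mount /dev/vdb /data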
Manage the VMs using Puppet
- You are now ready to manage the VMs using Puppet as described in the section above
Configure backups for storage volumes attached to Puppet nodes
- Read the documentation about backup node registration and user responsibilities (and more generally about the CERN backup service)
- Submit a SNOW request for backup node registration
  - Node aicoral61 has been registered in TSM via RQF0397907
  - Nodes aicoral62, aicoral51 and aicoral71 have been registered in TSM via RQF0398897
- Change the password on each node using dsmc set password (this will then be needed as a Teigi secret; see also the consolidated sketch after this list)
- Read the documentation about configuring TSM for Puppet nodes
  - Backups have been configured for the coral/hgbuild sub-hostgroup with git changes to the manifests and yaml data
- Read the documentation about adding secrets in Teigi for Puppet
  - On aiadm, run the command tbag set --hg coral/hgbuild tsmclient_password and enter the secret
  - To show the secret, on aiadm run the command tbag show --hg coral/hgbuild tsmclient_password
- If needed, read the documentation about restoring data from TSM backups
- To display a summary of your last backups, run the commands dsmc query backup -querysummary -inactive /home/ or dsmc query backup -querysummary -inactive /etc/ (note the trailing "/")
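In summary, once a node has been registered in TSM, the per-node setup sequence might look like this (coral/hgbuild as in the example above):

  # on the node, as root: set a new TSM client password
  dsmc set password

  # on aiadm: store the same password as a Teigi secret for the coral/hgbuild hostgroup, then verify it
  tbag set --hg coral/hgbuild tsmclient_password
  tbag show --hg coral/hgbuild tsmclient_password

  # later, on the node: check that backups are actually being taken
  dsmc query backup -querysummary -inactive /home/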
Network aliases
- Network aliases must be added via nova (setting LANDB fields) on aiadm, and no longer via LANDB directly
- The aliases for node aicoral62 were set with nova meta aicoral62 set landb-alias="CORALMYSQL,CORALPRX01,CORALSRV01"
- The current aliases for aicoral62 can be checked with nova show aicoral62 | grep landb, or also via the network.cern.ch interface to LANDB (see the cross-check sketch after this list)
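As a quick cross-check (assuming the LANDB aliases have already propagated to DNS, which may take some time), both the Openstack metadata and the name resolution can be verified:

  # on aiadm: check the landb-alias metadata of the VM
  . openstack-CORAL.sh
  nova show aicoral62 | grep landb

  # from inside CERN: check that an alias resolves to the VM
  host coralsrv01.cern.ch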
Advanced topics and pending actions