Integrating OpenStack with Ceph

In this document we describe how to integrate OpenStack Cinder with Ceph to enable volume-based back-end storage, as well as image management and storage.

We assume that you do not have administrative permissions to the Ceph service and that the 'volumes' and 'images' pools have been created for you. To access these pools, we also assume the users 'volumes' and 'images' have been created and that the pools can be accessed with the provided keyrings (ceph.client.volumes.keyring, ceph.client.images.keyring) and a Ceph configuration file (ceph.conf).
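
As a reference, a keyring file such as ceph.client.volumes.keyring normally contains a single section holding the client's key (the value below is a placeholder, not a real key):

[client.volumes]
    key = <base64 key supplied with your keyring>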

Sources of Information

Good documentation on integrating Ceph with OpenStack already exists; however, depending on the package versions installed, the integration may not succeed, which is why our approach is outlined here. The documentation from other sources can be found at:

Installing Ceph and Required Packages

Firstly, it is unlikely that the correct QEMU version is installed on your compute nodes. This needs to be QEMU version 0.12.1, installed from the Ceph repository; other builds of 0.12.1 exist but produce errors when used with Ceph. If you believe you have the correct packages installed, please jump to the 'Cinder / Volume Configuration' section, otherwise remove qemu and related packages:
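
A quick way to check what is currently installed on a node:

rpm -qa | grep -E 'ceph|librados|librbd|qemu'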

Note: the following command will remove openstack-cinder on the controller node and openstack-nova-compute on compute nodes, so these must be re-installed afterwards. It is therefore a good idea to copy the configuration files to a safe place so they can be restored after re-installation.
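
For example, the relevant configuration files could be copied aside with something like the following (paths assume a default installation; copy whichever of these exist on the node in question):

mkdir -p /root/openstack-config-backup
cp /etc/cinder/cinder.conf /etc/nova/nova.conf /root/openstack-config-backup/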

yum remove ceph librados2 librbd1 qemu-guest-agent qemu-guest-agent-win32 qemu-img qemu-kvm qemu-kvm-tools libcephfs1

Install the Ceph dependency packages:

yum install leveldb snappy xfsprogs boost gdisk python-lockfile gperftools-libs python-flask

and then install the correct QEMU and Ceph packages, downloaded from the respective Ceph repositories:

rpm -i \
librados2-0.67.2-0.el6.x86_64.rpm \
librbd1-0.67.2-0.el6.x86_64.rpm \
libcephfs1-0.67.2-0.el6.x86_64.rpm \
qemu-img-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm \
qemu-kvm-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm \
qemu-kvm-tools-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm \
qemu-guest-agent-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm \
qemu-guest-agent-win32-0.12.1.2-2.355.el6.2.cuttlefish.x86_64.rpm \
ceph-0.67.2-0.el6.x86_64.rpm

Cinder / Volume Configuration

On each of the OpenStack nodes (controller and compute) copy the ceph configuration file and keyrings you were given to /etc/ceph/. Make sure the configuration file has the correct permissions for the cinder service to access it:

chown cinder:cinder /etc/ceph/ceph.conf

Also check that the keyring files are readable by all; if not, make them readable:

chmod +r ceph.client.volumes.keyring
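
For reference, the configuration file and keyrings might be pushed out to a node with something like the following, where the hostname is a placeholder for your controller or compute node:

scp ceph.conf ceph.client.volumes.keyring ceph.client.images.keyring root@<node>:/etc/ceph/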

In order to access the 'volumes' pool, we need to tell Ceph which keyring to use and where to find it. We can do this by adding the following to the Ceph configuration file:

[client.volumes]
keyring = /etc/ceph/ceph.client.volumes.keyring
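
For reference, the resulting /etc/ceph/ceph.conf might look roughly like the following; the [global] section comes from the ceph.conf you were given and will differ in your deployment:

[global]
mon_host = <monitor addresses from the ceph.conf you were given>

[client.volumes]
keyring = /etc/ceph/ceph.client.volumes.keyring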

In order to use the command line tools 'rbd' and 'rados' we need to specify the arguments to Ceph via the CEPH_ARGS environment variable:

export CEPH_ARGS="--id volumes"

This specifies that the user 'volumes' should be used to access the Ceph service; in combination with the addition to ceph.conf above, we should now be able to successfully use Ceph. Test this by trying:

ceph health
rados lspools
rbd -p volumes ls
rbd -p images ls

Now we need to re-install openstack-cinder and openstack-nova-compute on the appropriate nodes and ensure that the relevant services are running again:

yum install openstack-cinder openstack-nova-compute
/etc/init.d/libvirtd start
/etc/init.d/openstack-nova-compute start
/etc/init.d/openstack-cinder-api start
/etc/init.d/openstack-cinder-scheduler start
/etc/init.d/openstack-cinder-volume start

It would be wise to check that the service is functioning normally by executing:

nova-manage service list

If not, fix these problems before proceeding with the integration. You may find a solution within OpenStackErrors.

Now we need to export the user that the Cinder and Nova services, on the controller and compute nodes respectively, must use to access Ceph. We do this by adding export CEPH_ARGS="--id volumes" to /etc/init.d/openstack-cinder-volume and /etc/init.d/openstack-nova-compute and then restarting these services:

/etc/init.d/openstack-nova-compute restart
/etc/init.d/openstack-cinder-volume restart
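
As a sketch, the export line can be added just after the shebang in each init script with GNU sed, for example (or simply edit the files by hand):

sed -i '/^#!/a export CEPH_ARGS="--id volumes"' /etc/init.d/openstack-cinder-volume
sed -i '/^#!/a export CEPH_ARGS="--id volumes"' /etc/init.d/openstack-nova-compute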

For hosts running nova-compute (controller and compute nodes in our case), these services do not actually need a keyring; instead they store a secret key in libvirt. On the controller host, first create the secret.xml file containing:

<secret ephemeral='no' private='no'>
  <usage type='ceph'>
    <name>client.volumes secret</name>
  </usage>
</secret>

and then define the secret; virsh will output the secret's UUID (referred to below as the_secret_key):

virsh secret-define --file secret.xml

Copy this secret key and set its value to the client key in libvirt by executing:

virsh secret-set-value --secret the_secret_key --base64 the_client_key

where the_client_key is the key found within ceph.client.volumes.keyring.
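
The client key itself (the_client_key above) can be read straight out of the keyring file, for example:

grep 'key = ' /etc/ceph/ceph.client.volumes.keyring | awk '{print $3}'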

On the OpenStack compute nodes, we ensure that each host has the same secret key by creating the following secret.xml file:

<secret ephemeral='no' private='no'>
  <uuid>the_secret_key</uuid>
  <usage type='ceph'>
    <name>client.volumes secret</name>
  </usage>
</secret>

and then execute the same virsh commands as above.

Now Ceph, rbd and rados should work successfully from the command line. Finally, in order to give OpenStack the ability to create, attach and delete volumes via the Horizon interface, we must add the following to the Cinder configuration file (/etc/cinder/cinder.conf):

volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
rbd_user=volumes
rbd_secret_uuid=the_secret_key
glance_api_version=2

The pool name and user to access the pool may be different in your case. After restarting the Cinder volume service, it should be successfully integrated with Ceph.

Glance / Image Configuration

On the controller node (where Glance is installed), we must install the Python libraries for RBD; these are available from the Ceph repository:
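
Assuming the bindings are packaged as python-ceph, as they were for this Ceph release series (check your Ceph repository for the exact package name), the installation is simply:

yum install python-ceph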

We must also edit the /etc/glance/glance-api.conf file to include the following variables within the [DEFAULT] section:

default_store=rbd
rbd_store_user=images
rbd_store_pool=images
rbd_store_ceph_conf=/etc/ceph/ceph.conf

In order to access the 'images' pool, we need to tell Ceph which keyring to use and where to find it. We can do this by adding the following to the Ceph configuration file:

[client.images]
keyring = /etc/ceph/ceph.client.images.keyring

Restart the Glance API to pick up the configuration changes:

/etc/init.d/openstack-glance-api restart

Important: note that some of these variables may already exist and may simply need changing to reflect your pool name and the username used to access RBD; adding these variables before RBD variables that already exist will result in your changes being overwritten, and connection errors are then likely to occur.
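
A quick way to check whether any RBD-related settings are already present before editing:

grep -nE 'rbd|default_store' /etc/glance/glance-api.conf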

Also ensure that ceph.client.images.keyring is within the /etc/ceph directory and has permissions that allow the Glance service to read it (for example, readable by all, as set earlier).

Now test image creation via Glance (an example):

glance image-create --name "Fedora 19 x86_64" --disk-format qcow2 --container-format bare --is-public true < Fedora-x86_64-19-20130627-sda.qcow2

Finally, check that this image appears within the pool 'images':

rbd -p images ls

Note: you may have to export CEPH_ARGS="--id images" to get the above command to work.

You should now be able to create images and boot instances from these images!

Troubleshooting

Error Deleting Volume

This error typically occurs when Ceph is incorrectly configured; even when the configuration is correct, deleting a large volume can cause this error to appear both within the logs and on the Horizon interface. To remove the volume from Ceph, we need to obtain the ID of the volume, which can be found by executing:

cinder list

and copying the ID of the volume where the state is error_deleting.

We then remove the Ceph volume:

rbd -p volumes rm volume-the_id_from_cinder_list

Although the volume is now deleted from Ceph, it will still appear on the Horizon interface, so we need to manually remove the volume entry from the MySQL database:

use cinder;
delete from volumes where status='error_deleting';
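
These statements assume an open session on the Cinder database; if needed, one can be started with something like the following (credentials depend on your deployment):

mysql -u root -p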

It is likely that when such an error occurs, the recorded number of volumes and the total volume GB reserved are not updated, so the error described below may be seen in the future. It would therefore be wise to check that the number of volumes and volume GB in use are recorded correctly after such an error, by following the solution to the error below.

VolumeLimitExceeded: Maximum number of volumes allowed (10) exceeded

When deleting a Ceph volume, the deletion may fail as described above, so the recorded number of volumes and total volume GB reserved may not be updated. This will result in the "VolumeLimitExceeded" error even when fewer than 10 (the default limit) volumes are actually in use.

Check that the correct number of volumes and GB are recorded in the cinder.quota_usages table:

use cinder;
select resource, in_use from quota_usages;

If the figures are incorrect, there are two ways to solve this problem:

1. Increase the quota of allowed volumes:

cinder quota-update <tenant_id> --volumes 20

2. Update the database with the correct number of volumes and volume GB currently in use. This is the preferred approach, as method (1) does not provide a long-term solution but merely delays this error from appearing again:

update quota_usages set in_use=XXX where resource='gigabytes';

update quota_usages set in_use=X where resource='volumes';
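
As a sketch, the correct figures can also be derived from the volumes table itself (column names as found in the Cinder database schema of this era):

select count(*), sum(size) from volumes where deleted=0;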

The correct number of volumes and volume GB in use can also be seen from the Horizon interface when one selects 'Create Volume'.

Caught signal (Segmentation fault)

When executing a ceph, rbd or rados command, one may see the output of a core dump specifying a segmentation fault as the cause of the problem. This was noticed when Ceph was installed via the method outlined above but was later changed to an older version, likely by automatic configuration changes; for example, Ceph 0.61.7 was initially installed but was then downgraded to version 0.56.3. Ensure the correct version of Ceph is installed to solve this problem.
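
A quick check of the installed Ceph version:

ceph --version
rpm -q ceph librados2 librbd1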
