Storage Options for VM
OpenStack supports multiple storage backends. This document describes how storage is implemented in SWITCHengines, how to make the right choice, and how to back up your data. Three storage options are available:
- Ceph: Backed by SATA disks (default)
- Ceph: Backed by NVMe
- Local NVMe
| Option | Speed | Capacity | Reliability | Price |
|---|---|---|---|---|
| Ceph (SATA) | - | ++ | ++ | $ |
| Ceph (NVMe) | + | - | ++ | $$$ |
| Local NVMe | ++ | - | + | $$$ |
In OpenStack, a virtual machine is called an instance. An instance can boot from an ephemeral disk or from a volume. Ephemeral disks are meant to be short-lived and are deleted when the instance is deleted. Volumes are meant to be persistent and have their own lifecycle, independent of the instance.
Tip: when you boot an instance, make sure you understand whether you are booting from an ephemeral disk or from a volume. In the web interface you must “Select Boot Source”. If you select «Volume» as the boot source, or if you enable the «Create New Volume» flag, you will boot from a persistent volume; otherwise you will boot from a new ephemeral disk.
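The same choice exists on the command line. The sketch below wraps the two variants of openstack server create in small shell functions; the function names are made up for illustration, and the image, volume, flavor, network and server names are placeholders you pass as arguments.

```shell
# Sketch only: function names are illustrative, all arguments are placeholders.

# Boot with an ephemeral disk: the root disk is created from the image
# and is deleted together with the instance.
# $1 = image, $2 = flavor, $3 = network, $4 = server name
boot_ephemeral() {
    openstack server create \
        --image "$1" --flavor "$2" --network "$3" "$4"
}

# Boot from an existing bootable volume: the root disk is the volume,
# which has its own lifecycle.
# $1 = volume, $2 = flavor, $3 = network, $4 = server name
boot_from_volume() {
    openstack server create \
        --volume "$1" --flavor "$2" --network "$3" "$4"
}
```

The only difference between the two calls is --image versus --volume, which is the command-line counterpart of the «Select Boot Source» choice in the web interface.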
The main storage backend for SWITCHengines is Ceph. Your instance is attached to a virtual block device over the network. The Ceph cluster is a distributed system that spans many servers. Your virtual disk is spread across the servers of the cluster, and each block is replicated three times.
Considerations about data loss:
If one physical disk fails in our infrastructure the data is preserved, because the Ceph cluster keeps three copies of each block of user data on three different servers.
In case of a software problem (Ceph bug, human error) there is still a risk of data loss.
Considerations about performance:
The majority of disks in our Ceph cluster are not SSD disks. The I/O latency is the sum of the latency of the network access to the remote server containing the disk, and the latency of the disk itself. For most of the use cases we know of, this type of storage is fast enough. We have successfully run database servers on this configuration.
If you need more IOPS or higher read/write bandwidth consider one of the SSD/NVMe options.
Ceph (NVMe)
When creating a volume, you can select the SSD-backed type. This gives you the benefits of Ceph (reliability, live migration, resizing) together with the benefits of NVMe (lower latency, higher read/write speed). Note that due to the nature of Ceph, latency is higher than with locally attached SSDs, because all traffic goes through the network.
To boot from an SSD-backed Ceph volume, create a volume and specify the OS image that will serve as the basis for your VM. Then create the VM instance and select "boot from volume".
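The two steps above can be sketched on the command line as follows. This is a minimal sketch, not the official procedure: the function name and the "-root" suffix for the volume name are invented here, and the name of the SSD-backed volume type is not given in this document, so look it up with openstack volume type list and pass it as the first argument.

```shell
# Sketch only: the function name and the "-root" naming are illustrative.
# $1 = volume type, $2 = image, $3 = size in GB,
# $4 = server name, $5 = flavor, $6 = network
boot_from_new_volume() {
    vol_type=$1; image=$2; size_gb=$3; name=$4; flavor=$5; network=$6
    volume="${name}-root"

    # Step 1: create a bootable volume of the chosen type from the OS image.
    openstack volume create --type "$vol_type" --image "$image" \
        --size "$size_gb" "$volume" || return 1

    # Copying the image into the volume takes a while, so wait until
    # the volume is available before booting from it.
    while :; do
        status=$(openstack volume show -f value -c status "$volume") || return 1
        case "$status" in
            available) break ;;
            error) echo "volume $volume is in error state" >&2; return 1 ;;
        esac
        sleep 5
    done

    # Step 2: create the instance with the volume as its boot disk.
    openstack server create --volume "$volume" \
        --flavor "$flavor" --network "$network" "$name"
}
```

A call could look like boot_from_new_volume TYPE IMAGE 20 my-server FLAVOR NETWORK, with the upper-case words replaced by names that exist in your project.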
SSD Hypervisors
We operate some hypervisors that have a local SSD so that we can offer SSD-backed instances. To use the local disk instead of the default Ceph backend, we create a special flavor in the SWITCHengines project. When you select this flavor, the ephemeral disks of the instance are backed by a partition of the SSD that is local to the hypervisor running it.
Warning: resizing an instance between flavors with different storage backends will fail. You will find an instance resize operation in error state in the action log. It is not possible to copy the data between different backends with the resize operation.
Warning: instances running with local SSD disks cannot be live migrated, so there could be downtime during hardware maintenance or after a hardware failure.
Be aware that if you later attach additional volumes to a running instance, the new volumes will be created in the Ceph backend. This is also useful for making backups, as we will see later.
Considerations about data loss:
If the physical SSD disk breaks, your data is lost.
The best way to back up the data is to attach an additional volume, whose data is stored on the Ceph backend, and copy the data there.
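A minimal sketch of this backup approach follows. The function names are illustrative, and the device name /dev/vdb, the mount point and the /data directory in the comments are assumptions; check the actual device with lsblk inside the instance before formatting anything.

```shell
# Sketch only: function names, device name and paths are assumptions.

# Create an empty volume; volumes are stored on the Ceph backend.
# $1 = volume name, $2 = size in GB
create_backup_volume() {
    openstack volume create --size "$2" "$1"
}

# Attach the volume to the running instance once its status is "available".
# $1 = server, $2 = volume name
attach_backup_volume() {
    openstack server add volume "$1" "$2"
}

# Then, inside the instance:
#   lsblk                              # find the new device, e.g. /dev/vdb
#   sudo mkfs.ext4 /dev/vdb            # first use only, erases the volume
#   sudo mkdir -p /mnt/backup
#   sudo mount /dev/vdb /mnt/backup
#   sudo rsync -a /data/ /mnt/backup/  # copy from the local SSD to Ceph
```

Because the copy lives on a volume, it survives the loss of the local SSD and can be attached to another instance later.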
Snapshots
In OpenStack the word Snapshot appears both on the instances screen and on the volumes screen, but on the volumes screen it refers to a Volume Snapshot. It is very important to understand the difference.
When we boot from an ephemeral disk, we can take a Snapshot of the instance. This will create an Image with the content of the ephemeral disk. You can reuse that image to boot a new instance.
Warning: if your special flavor creates a machine with two ephemeral disks, the snapshot will save an image only with the first disk, and the data on the second disk is not saved at all.
When we have a volume, we make a Volume Snapshot. This will be listed in the volume page or with the command openstack volume snapshot list.
Volumes in SWITCHengines are always implemented with the Ceph backend. The snapshot of a volume is a “Ceph snapshot”. Creating such a snapshot is very fast, regardless of the size of the volume.
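For reference, a volume snapshot can also be created from the command line. The sketch below is illustrative: the function names are made up, and whether --force is needed for a volume that is attached to an instance depends on the deployment, which this document does not specify.

```shell
# Sketch only: function names are illustrative, arguments are placeholders.

# Snapshot a volume that is not attached to any instance.
# $1 = volume, $2 = snapshot name
snapshot_volume() {
    openstack volume snapshot create --volume "$1" "$2"
}

# Snapshot a volume that is attached to an instance. Flush or stop writes
# inside the instance first, since the snapshot captures the disk as it is.
# $1 = volume, $2 = snapshot name
snapshot_attached_volume() {
    openstack volume snapshot create --force --volume "$1" "$2"
}
```

The new snapshot then shows up in the output of openstack volume snapshot list.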
The ephemeral disks in SWITCHengines may be on local SSD. In this case the snapshot triggers a copy of all the data from the SSD to the Ceph cluster. This is a slow operation, and the required time grows with the size of the ephemeral disk.
We suggest creating this kind of snapshot from the command line to avoid authentication timeouts in the web interface.
openstack server image create --name <name> <instance_uuid>
Tip: after creating a snapshot, always check that its final status is active. If the status is stuck in queued or uploading, you are still at risk of data loss.
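The status check can be automated with a small polling loop. This is a sketch under assumptions: the function name and the default 30-second interval are arbitrary choices, and the list of in-progress statuses follows the usual image lifecycle (queued, saving, uploading, importing) rather than anything specific to SWITCHengines.

```shell
# Sketch only: function name and polling interval are illustrative.
# $1 = image name or ID, $2 = seconds between checks (default 30)
wait_for_image_active() {
    image=$1
    interval=${2:-30}
    while :; do
        # Ask only for the status field, printed as a bare value.
        status=$(openstack image show -f value -c status "$image") || return 2
        case "$status" in
            active)
                echo "image $image is active"
                return 0 ;;
            queued|saving|uploading|importing)
                # Still in progress: wait and check again.
                sleep "$interval" ;;
            *)
                echo "image $image ended in status: $status" >&2
                return 1 ;;
        esac
    done
}
```

Run it right after openstack server image create, passing the snapshot name. It returns 0 once the image is active and a non-zero code if the image ends in any other final status or the query fails. It does not time out on its own, so a snapshot that stays stuck in queued or uploading keeps it waiting and you will need to interrupt it.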