Storage for Kubernetes: Getting to the Bottom of Persistent Volumes
Where would software development be in 2022 without Kubernetes? Containers and K8s have truly been game-changers. Unless you’re still reminiscing about the good old days of cgroups, I don’t think anyone would say otherwise. But that’s not to say that containers and K8s come without their own challenges and potential pitfalls.
One of the often-overlooked frustrations of using containers is storage, or, more specifically, how to use legacy storage architectures that are complex and lack the API functionality to support modern automation.
The default storage for containers is ephemeral, which means you cannot retrieve data in a later session: the data is destroyed along with the volume once the container shuts down. While there is a range of solutions to ensure persistence, some lack the functionality to support automation, while others are limited in elasticity and aren’t equipped to withstand large fluctuations in data.
In this article, we review the commonly used options for data persistence, then dig a little deeper to discover storage solutions that are more dynamic and offer greater elasticity. With the aim of achieving better data consistency and improved performance, we look into the ideal cloud storage solution for your containerized environments.
From the Top: The Ephemeral
As we mentioned earlier, the default storage for containers is ephemeral volumes that last only as long as the container is active. If the container is shut down, or it crashes, (poof!) your data is lost. This is a problem for non-trivial applications (such as databases) where, if a container crashes, it restarts in a clean state with all previous data gone. The Kubernetes volume abstraction helps mitigate this common challenge.
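To make the ephemeral behavior concrete, here’s a minimal sketch of a pod using an `emptyDir` volume (the pod and volume names are illustrative). Anything written to `/cache` vanishes when the pod is deleted:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod            # illustrative name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo hello > /cache/data.txt && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /cache
  volumes:
    - name: scratch
      emptyDir: {}             # lives only as long as this pod
```

Delete the pod and recreate it, and `/cache/data.txt` is gone, which is exactly the behavior that makes ephemeral storage unsuitable for databases.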
To avoid this situation, volumes backed by more persistent cloud storage need to be mounted into pods using plugins provided by Kubernetes.
Edging Down: Persistent Volumes
You can’t really talk about persistent storage in containerized environments without first talking about Persistent Volumes (PVs). A PV is the abstraction layer used to connect cloud storage volumes to K8s clusters. Because it is decoupled from the pod’s life cycle, the data it stores can exist for long periods of time.
These volumes persist irrespective of where the container is moved or how it’s used. They are also more flexible: there is a range of PV types to choose from, and the user specifies attributes such as size and performance requirements.
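As a sketch of how those attributes are specified, a PersistentVolumeClaim requests a size and storage class, and a pod then mounts the claim (the claim name and storage class here are assumptions; the class available depends on your cluster):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim             # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3        # assumed EBS-backed class; varies per cluster
  resources:
    requests:
      storage: 100Gi           # the size attribute the user specifies
---
apiVersion: v1
kind: Pod
metadata:
  name: db-pod
spec:
  containers:
    - name: db
      image: postgres:15
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim  # binds the pod to the PV behind the claim
```

If the pod is rescheduled to another node, the claim (and the data behind it) follows, which is the persistence guarantee the ephemeral default lacks.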
With a K8s pod’s lifecycle being typically unpredictable, PVs are widely used to achieve the necessary level of data persistence. Tasks where data preservation is essential, such as running databases, would be impossible without them.
Slipping Further: Drawbacks of PVs
So if PVs can be detached from K8s pods to enable performance flexibility and data persistence, all our problems are solved, right?
Well, not exactly. PVs are not 100% reliable in all scenarios, especially when scaling and elasticity are required.
An auto-scaler for PVs is available through the Container Storage Interface (CSI); however, it has some limitations when applied to EBS volumes. First and foremost, a volume can be extended only once every six hours. This means you cannot scale on demand if application usage exceeds the volume’s capacity. And while it’s possible to launch another volume to extend your filesystem further, once you’ve extended your PV, it can’t be shrunk back down again. This can be costly if your K8s cluster experiences a temporary fluctuation in demand: you keep paying for peak capacity, no matter how fleeting the spike was, even after usage returns to normal.
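For reference, online expansion only works if the StorageClass permits it. A minimal sketch of an expandable EBS-backed class (the class name is illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-expandable          # illustrative name
provisioner: ebs.csi.aws.com    # AWS EBS CSI driver
allowVolumeExpansion: true      # required before any PVC resize is accepted
parameters:
  type: gp3
```

Growing a claim is then just a matter of raising `spec.resources.requests.storage` on the PVC (e.g. with `kubectl edit pvc` or `kubectl patch`), but the two constraints above still apply: EBS accepts roughly one modification per volume every six hours, and the requested size can never be reduced afterwards.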
Another problem, arguably more concerning, is that PVs are prone to filling up when there’s a high level of data ingestion. Projects that involve large amounts of data, such as ETLs, machine learning pipelines, and databases, all cause data spikes. In this situation, the available capacity of a PV can be exceeded, potentially causing the application to crash.
Diving Deep: Volume Autoscalers
To avoid applications crashing because a PV runs out of space, you need a solution that provides all the flexibility and performance of PVs but with the ability to scale to meet even the most demanding and sudden influxes of data.
That’s where Zesty Disk comes in.
Zesty Disk is a storage auto-scaler that expands and shrinks filesystems so they can scale up with sudden data surges and scale back down once this data is removed.
For K8s, this can be set up with a DaemonSet, which ensures that every node in the cluster runs an “auto-scalable filesystem” that interacts with the cluster and the node.
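As an illustration of the pattern (not Zesty’s actual manifest; the image and names here are hypothetical), a per-node storage agent deployed as a DaemonSet looks roughly like this:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: storage-autoscaler                          # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: storage-autoscaler
  template:
    metadata:
      labels:
        app: storage-autoscaler
    spec:
      containers:
        - name: agent
          image: example.com/storage-autoscaler:v1  # hypothetical image
          securityContext:
            privileged: true                        # agent must manage block devices on the node
```

Because a DaemonSet schedules exactly one agent pod per node, every node that joins the cluster automatically gets the auto-scalable filesystem, with no per-node setup.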
How has this worked in the real world?
One of our customers uses Zesty Disk for their data analytics cluster. A new process fires every few hours, causing the number of pods in a given workload to grow. Initially, the number of pods always grew with the workload, and there was always enough CPU and memory to handle the ingested data (since the cluster was configured to grow with demand). A problem only appeared when a significant percentage of jobs started failing with an “out of disk” error. On further investigation, they discovered that their customers’ data is not homogeneous: while some customers’ data analysis needed only 100GB of disk space, others required as much as 750GB. Yet all their machines were provisioned with a single volume size, inherited from the original volume’s configuration.
To ensure the app’s stability, the cluster disk size was initially set to the maximum observed disk size, 750GB. This turned out to be a fragile safety net, as the maximum size tends to gradually increase as new customers arrive.
Zesty Disk solved this problem by dynamically managing disk sizes at runtime. Using the storage auto-scaler, they were able to launch clusters with as little as 15GB of disk space and rely on the auto-scaler to adjust capacity as needs grow.
The technology was truly a game-changer for the company as they are now able to easily prevent out-of-disk errors, stand by their SLAs, and in the process, cut their EBS costs!
The Down Low
For Kubernetes environments, the elasticity of Zesty Disk prevents PVs from filling up due to traffic spikes, keeping data persistent by removing capacity limitations. At the same time, it enables volumes to “shrink” back down once demand decreases, meaning you get exactly the amount of storage you need, when you need it.
This next generation of flexibility, persistence, elasticity, and performance is exactly what’s needed to take your K8s environment to the next level.