How to Improve Stateful Deployments on Kubernetes

At the beginning of the year, a client came to me looking to query blockchain data. While this type of data is usually publicly available, it’s not too easy to access in bulk; and even if you have it, it might not be in an easily digestible format. You either need to fetch the data from a node or run a node yourself. Then, once you have all the data, you need to index it in a way that makes it easily queryable and to store the indexes somewhere you can access them quickly.

To resolve these issues, I had planned to build an indexer based on SQLite, which would run locally and wouldn’t incur any network latency while still giving me SQL features for the indexes. I had hoped to package everything into a Docker image and deploy it to a managed Kubernetes (K8s) service, like AWS EKS. That way, I could scale horizontally while not getting preoccupied with K8s itself. Since tools like LiteFS allow replicating the SQLite database to multiple pods, only one would have to do the actual indexing work, but all would have their index on a fast local disk.

But the ever-growing blockchain data, along with the need to add and remove indexes, made capacity planning a challenge. So here's my experience for anyone looking to solve similar problems.

Challenges of Stateful K8s Deployments

Stateless deployments are great. You just scale your pods up and down according to the load. And nothing of value is lost if they get destroyed since they don’t host any important data.

Stateful Nodes Have State

While this may seem obvious, your pods have to store this state somewhere; and if it’s not there, they have to create it somehow.

They need a volume that keeps data between restarts; and since the pod writes the index, you can only mount it to one pod at a time. To scale up, you must ensure enough volumes are available for all your pods. You can create enough volumes beforehand to get around this issue, but the reason for using volumes in the first place was the state they stored. New volumes don’t have that state.
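On K8s, each of those volumes is requested through a PersistentVolumeClaim. Here's a minimal sketch (the storage class and size are placeholders, not my actual values); note the ReadWriteOnce access mode: block storage like EBS can only be attached to one node at a time, which is why every indexing pod needs its own claim.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: indexer-data
spec:
  # Block storage (e.g., EBS) can only be mounted read-write by a single node,
  # so each indexing pod needs its own claim.
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3   # assumption: the EBS CSI storage class on EKS
  resources:
    requests:
      storage: 500Gi      # placeholder size
```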

The new node starts and has no state, so it has to start indexing again. Depending on the size of the chain, this can take anywhere from a few days to weeks. But you don’t want to be waiting a week for a new node to be ready when scaling your system up. A spike might only last a few minutes or hours.

Stateful Nodes Need Backups

If, for any reason, the data gets corrupted or the volumes are lost, you need backups to restore the index. The underlying chain data is still publicly available, after all, and you only need to rebuild an index, not run a full node; but constructing these indexes from scratch would still take quite some time.

Data Growth

This is a pretty specific challenge. Blockchain data usually only grows, meaning I needed to plan enough capacity on my volumes to avoid running out of space. Then, there’s the space required for the indexes. Whenever I add a new one, it needs storage space too; and while I wasn’t planning for it, there could also be a situation where I’d need to delete faulty or unused indexes again.

Addressing the Challenges

Replicating the State with LiteFS

My issue was that new volumes would come up without the indexed data, so new pods had to start indexing again, which took way too long. I solved this with LiteFS, a replication system for SQLite.

LiteFS allowed my pods to sync the databases holding the chain data and indexes via the cluster-internal network; a new node could just copy the finished database without redoing all the indexing work from the beginning.
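For reference, the LiteFS setup is driven by a litefs.yml in each pod. The sketch below is trimmed down and based on the LiteFS documentation rather than my exact config; it uses a static lease so one well-known pod acts as the primary, and the hostnames and paths are placeholders.

```yaml
# litefs.yml - runs in every indexer pod
fuse:
  dir: "/litefs"            # applications open the SQLite database through this mount
data:
  dir: "/var/lib/litefs"    # LiteFS keeps its internal copy here (the persistent volume)
lease:
  type: "static"            # one fixed primary; all other pods are read-only replicas
  advertise-url: "http://indexer-0.indexer.default.svc.cluster.local:20202"
  candidate: true           # set to false on the replica pods
```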

I also looked into K8s StatefulSets, a stateful alternative to Deployments. A StatefulSet ensures each pod gets the same volume on every restart. But my pods don’t care which volume they get; they only need enough of them and should have as much index data as possible. I mention this here because it might be relevant to your use case.
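For completeness, a StatefulSet with volumeClaimTemplates would look roughly like this sketch (image, names, and sizes are placeholders). Each replica gets its own claim that follows it across restarts.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: indexer
spec:
  serviceName: indexer          # headless service giving pods stable DNS names
  replicas: 3
  selector:
    matchLabels:
      app: indexer
  template:
    metadata:
      labels:
        app: indexer
    spec:
      containers:
        - name: indexer
          image: registry.example.com/indexer:latest   # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/litefs
  volumeClaimTemplates:
    - metadata:
        name: data              # yields PVCs named data-indexer-0, data-indexer-1, ...
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 500Gi      # placeholder size
```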

Back Up the State to S3

Now, replication is nice to save on work, but when things get ugly, I still want some way to get my indexes back. I found Litestream, a tool that continuously backs up SQLite data to S3, so storage is pretty cheap. To use Litestream with LiteFS, make sure you have a single, static LiteFS primary.
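A Litestream configuration for that setup can be as small as the sketch below; it runs only next to the primary and streams the database behind the LiteFS mount to a bucket (database name, bucket, and region are placeholders).

```yaml
# litestream.yml - run alongside the single static LiteFS primary
dbs:
  - path: /litefs/index.db           # the database as exposed through the LiteFS mount (placeholder name)
    replicas:
      - type: s3
        bucket: my-index-backups     # placeholder bucket
        path: index.db
        region: us-east-1
```

Restoring after a disaster is then a `litestream restore` onto a fresh volume for the primary, after which LiteFS distributes the database to the other pods again.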

Dynamically Allocate Block Storage

To tackle my data growth issue, I used Zesty Disk, which provisions block storage dynamically. This meant I didn't need to build buffers into my capacity planning. I could add and remove indexes when required; and even if the chain in question saw a spike in new transactions, I could be sure the storage would grow with it.

Zesty Disk creates multiple volumes in the background that my pods see as just one file system. When my pods need more storage space, volumes are added; when less capacity is needed, the extra volumes are simply removed. This all happens automatically, without any interaction from me or my team, and without causing any downtime.

Take the Headache Out of Managing Stateful Applications

Stateful deployments with K8s are more involved than stateless ones. If you can get away without state, go for it! It'll definitely save you time, effort, and money.

But if there’s no way around it, understand how your state is used. You might need a StatefulSet to ensure each pod receives the same volume every time. Even if not, you might have to think about replication and backups.

In any case, Zesty Disk takes capacity planning off your plate, which is one less thing to worry about. I recommend checking it out if your storage requirements are highly dynamic.