How we stopped manual pod tuning and shrunk our Kubernetes clusters by 43%
I lead the Cloud Platform team at Sennder, a digital freight forwarder. We operate a marketplace that connects shippers moving goods across Europe with carriers who own trucks. We’re a small team, seven people, supporting more than 100 engineers.
Our infrastructure runs on AWS EKS. We operate eight clusters across different environments, with more than 150 nodes at any given time. It’s not the biggest setup out there, but it’s dynamic. We use Karpenter and KEDA, and depending on the time of day, traffic, or background jobs, the platform scales up and down constantly.
Our problem wasn’t growth. It was resource utilization.
As the company grew, traffic increased, and we added more services. Things scaled up more or less the way we expected.
What we started to see was workloads requesting large amounts of CPU while using very little in practice. Some teams would simply set very high CPU requests to avoid issues. In some cases, services were reserving around 90% of a node’s CPU and using about 10%, even in production with real traffic.
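To make that gap concrete, here is a quick back-of-the-envelope calculation. The pod names and numbers below are illustrative, not from our clusters; in practice we pulled requests from the API server and usage from our metrics stack.

```python
# Illustrative only: compare CPU requests against observed usage per pod.
# Pod names and figures are hypothetical examples, in millicores.
pods = [
    # (pod, CPU requested, average CPU actually used)
    ("orders-api", 3600, 400),       # ~90% of a 4-core node requested, ~10% used
    ("pricing-worker", 1000, 120),
    ("notifications", 500, 450),
]

for name, requested, used in pods:
    utilization = used / requested
    idle = requested - used
    print(f"{name}: {utilization:.0%} of request used, {idle}m reserved but idle")
```

Reserved-but-idle CPU like this still forces the scheduler to hold node capacity, so the cluster grows even though actual load doesn’t.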
On the other side, when requests were too low, we occasionally ran into CPU throttling. It’s not just about things being a bit slower. It can cascade. A service might stop responding to health checks, which then affects other services. For users, that could mean slow-loading operations boards or, in more critical cases, being unable to notify clients or accept offers. It didn’t happen often, but when it did, it disrupted the operations team’s daily work.
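Throttling is visible in the kernel’s CFS accounting. The sketch below parses a made-up cgroup v2 `cpu.stat` snippet; on a real node you would read the file from the container’s cgroup directory.

```python
# Sketch: how CPU throttling shows up in cgroup v2 accounting.
# The cpu.stat contents are a fabricated example for illustration.
sample_cpu_stat = """\
usage_usec 905000
nr_periods 1000
nr_throttled 240
throttled_usec 1800000
"""

stats = dict(
    (key, int(value))
    for key, value in (line.split() for line in sample_cpu_stat.splitlines())
)

# Fraction of CFS scheduling periods in which the container hit its CPU limit.
throttle_ratio = stats["nr_throttled"] / stats["nr_periods"]
print(f"Throttled in {throttle_ratio:.0%} of scheduling periods")
```

A container throttled in a quarter of its scheduling periods can easily miss a health-check deadline even though average CPU usage looks modest.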
Covering nodes with commitments doesn’t mean you’re using them efficiently.
We had been using Zesty Commitment Manager for a long time; that’s how we started working with Zesty. We covered our infrastructure with Reserved Instances, and later Savings Plans, and that helped a lot with cost control.
But it was very hard to be efficient at the workload level. We had all the observability in place. We even ran training sessions and webinars to explain things like CPU throttling and memory limits. Still, educating everyone to regularly check their metrics and adjust requests was hard.
Some teams tried. They ran a few tuning cycles, and things looked better. Then, a few months later, traffic or workloads changed, and the settings were wrong again. It wasn’t dynamic, and because it was repetitive and manual, it was easy to forget. Teams often drifted back to over-provisioning to avoid problems.
Solutions for pod rightsizing were starting to appear on the market, but many options weren’t suitable. Kubernetes VPA, for example, doesn’t work well alongside KEDA and can introduce conflicts. Other tools were too simple and didn’t keep long-term historical data. Since we were already working with Zesty, we learnt about their pod rightsizing solution.
From a platform engineering perspective, Pod Rightsizing was exactly what we needed.
We deployed Zesty’s solution as a Helm chart, like any other EKS add-on. Installation was straightforward. Pod Rightsizing needed a couple of hours to collect enough data to show something meaningful in the UI.
In non-critical environments, we enable it by default. In production, teams opt in through a simple flag in our Helm chart, and we use labels to decide exactly which workloads are right-sized. Developers don’t need to define scaling logic or tune parameters themselves. We keep enough buffers to make sure scaling down doesn’t hurt applications.
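The rollout pattern looks roughly like the commands below. Every name here is a placeholder: the chart repository, release name, namespace, and label key are illustrative, not Zesty’s actual interface.

```shell
# Placeholder names throughout; this sketches the pattern, not the real chart.
# Install the rightsizing agent like any other EKS add-on.
helm repo add zesty https://example.com/zesty-charts   # placeholder URL
helm install pod-rightsizing zesty/pod-rightsizing \
  --namespace zesty-system --create-namespace

# In production, opt a workload in via a label (label key is hypothetical).
kubectl label deployment orders-api rightsizing/enabled=true
```

The useful property of the label-based approach is that opt-in lives next to the workload definition, so teams can adopt it per service without touching platform-level configuration.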
That translated directly into lower monthly bills.
In non-critical environments, we reduced cluster size by around 30–40%, which is already significant for us. In environments where Pod Rightsizing was applied more fully, reductions reached around 43%.
When we shared the savings numbers internally, the reaction was very positive. Saving that much without having to worry about it made the value very clear.
We don’t need to focus on repetitive, non-scalable tasks anymore.
Beyond cost, the biggest change for us is operational. We no longer spend time chasing people to adjust requests. Even if someone deploys a service with old or poorly configured values, nothing breaks. Pod Rightsizing catches it behind the scenes.
We still monitor high-level metrics like average CPU usage and node counts, and we control the rollout strategy. But wherever Pod Rightsizing is active, it’s not something we really need to think about. It just does the work.
We are now covering both workload efficiency and long-term cost control.
Pod Rightsizing helps us reduce waste by right-sizing workloads, and Commitment Manager optimizes commitments on top of that lower baseline, so together they compound into better overall savings.
My advice to anyone looking to optimize is to start by right-sizing workloads to match real usage, and then use commitments to optimize that baseline. That’s how they work best together.