How we cut EC2 costs by 40% while staying fully flexible
As Program Operations Manager at Printify, I deal every day with highly dynamic systems.
We run a global print-on-demand platform that lets anyone (YouTubers, brands, or just someone with a creative idea) start earning money without worrying about inventory, equipment, or delivery. You design something, and we take care of getting it printed and shipped.
On the engineering side, we are about 99% in AWS. We run eight Kubernetes clusters, and hundreds of nodes are spinning up and down at any moment. With the growing use of GPUs for image generation and mockups, our workloads are extremely image-intensive.
Manual commitments just couldn’t keep up
A few years back, everything was growing at once, and traffic became unpredictable. A new social media trend could trigger thousands of image generations within minutes, in different colors, sizes, angles, everything. And then traffic could vanish again just as fast.
Teams were focused on delivering features, not monitoring costs. They scaled Reserved Instances and Savings Plans as high as needed to avoid outages, and honestly, that’s normal when the infrastructure is so dynamic and the whole setup is so sensitive to performance.
But as we kept growing year after year, with clusters shifting constantly between instance families, sizes, and regions, it was impossible to stay optimized with the right commitments manually. We simply didn’t have the capacity for that. Inefficiencies became basically inevitable, and cloud costs got out of hand very quickly.
We also started building clusters on Spot Instances to reduce costs. In theory, Spot is great: up to 90% savings. In practice, AWS often runs out of Spot capacity in certain regions, forcing us back to on-demand. And some of our workloads load massive Docker images, which takes longer than the two-minute interruption notice. So for our most intensive jobs, Spot was way too risky.
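To show why big images and the two-minute notice don't mix, here is a rough back-of-the-envelope check in Python. It is only a sketch: the function names, image sizes, pull throughput, and boot time are illustrative assumptions, not measurements from our clusters.

```python
# AWS sends a Spot interruption notice two minutes before reclaiming an instance.
SPOT_NOTICE_SECONDS = 120


def startup_seconds(image_gb: float, pull_mb_per_s: float, boot_seconds: float) -> float:
    """Estimate the time for a replacement node to boot and pull its image.

    All inputs are hypothetical; real pull times depend on registry,
    network, layer caching, and decompression.
    """
    return boot_seconds + image_gb * 1024 / pull_mb_per_s


def fits_in_notice(image_gb: float, pull_mb_per_s: float, boot_seconds: float) -> bool:
    """True if a replacement could be ready before the notice runs out."""
    return startup_seconds(image_gb, pull_mb_per_s, boot_seconds) <= SPOT_NOTICE_SECONDS


# Assumed numbers: 100 MB/s effective pull speed, 45 s node boot time.
small_ok = fits_in_notice(image_gb=2, pull_mb_per_s=100, boot_seconds=45)
large_ok = fits_in_notice(image_gb=20, pull_mb_per_s=100, boot_seconds=45)
```

With these assumed numbers, the small image fits inside the notice and the large one does not, which is the situation we kept running into with our heaviest jobs.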
At the same time, we wanted no upfront commitments, minimum lock-in, and the ability to move quickly to new instance families. Flexibility was crucial for us.
Zesty brought automation that fits our dynamics
That’s when we met Zesty, about four years ago. And pretty fast, we saw this could be a game-changer. Commitment Manager became a core part of our cloud cost strategy.
Zesty offered automation that removed the whole burden of managing commitments. You turn it on, and it keeps coverage high automatically. We don’t need daily oversight anymore. It just works.
When AWS started phasing out Reserved Instances, Zesty started a smooth transition to Savings Plans. What really clicked for us was the concept of micro purchases. Instead of one giant commitment you’re stuck with, Zesty builds a portfolio of tiny ones that adjust as our demand changes. If usage goes up, it grows. If usage drops, old ones expire on their own. No panic, no cleanup, no lock-in. That level of granularity was exactly what we needed.
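To make the micro-purchase idea concrete, here is a minimal Python sketch. It illustrates the general concept under simplified assumptions and is not Zesty's actual algorithm: the Commitment class, the top_up function, the 30-day term, the 90% target, and all dollar figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    hourly_usd: float  # committed spend per hour (hypothetical figure)
    start_day: int     # day the commitment begins
    term_days: int     # hypothetical short term, chosen only for illustration

    def active_on(self, day: int) -> bool:
        return self.start_day <= day < self.start_day + self.term_days


def committed_on(portfolio, day):
    """Total hourly commitment still active on a given day."""
    return sum(c.hourly_usd for c in portfolio if c.active_on(day))


def top_up(portfolio, day, demand_usd, target=0.9, term_days=30):
    """Buy one small commitment if coverage is below target.

    Nothing is ever sold or cancelled: when demand drops, older
    commitments simply run out at the end of their term.
    """
    gap = target * demand_usd - committed_on(portfolio, day)
    if gap > 0:
        portfolio.append(Commitment(gap, day, term_days))
    return portfolio


portfolio = []
# (day, hourly on-demand-equivalent spend): demand rises, then drops.
for day, demand in [(0, 100.0), (10, 150.0), (35, 40.0)]:
    top_up(portfolio, day, demand)
```

In this toy run, the first two days each trigger a small purchase as demand grows. On day 35 the first commitment has already expired, the second still more than covers the lower demand, and nothing new is bought. By day 40 the second one has expired too, so the portfolio shrinks without any cleanup.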
And this fits perfectly with our Spot-first strategy. When AWS runs out of Spot capacity and we fall back to on-demand, Commitment Manager catches it and covers that usage with micro Savings Plans. That flexibility is something we could never achieve manually.
And I have to say, Zesty didn’t just give us a tool, they became partners. They told us what to be careful about, when to act, and why. When you’re running a fast-moving platform, that kind of guidance matters more than people realize. You don’t get that from most vendors.
Our setup became both cost-efficient and flexible
Once we had Zesty’s Commitment Manager in place, things went from daily checks and hours of adjustments every week to maybe an hour a week, sometimes less. I just look at the coverage, make sure nothing unexpected is coming, and that’s it. The tool handles the rest.
We sit comfortably above 90% coverage across our compute. Over time, our mix of commitments naturally shifts toward micro Savings Plans. As clusters scale, change families or sizes, or fall back from Spot to on-demand, Commitment Manager keeps coverage high without us touching a thing.
For us, the result is an infrastructure that behaves the same way our business does. It means we’re no longer paying for things we don’t use. And if a creator runs a viral campaign or the holiday season hits, those spikes don’t blow up the bill. Commitments stay aligned with real usage, and that alone enabled us to cut EC2 costs by around 40%.
And because everything is built around flexibility, we’re not locked into the wrong instance types. We can move to newer, faster, cheaper generations without breaking our coverage.
We are already testing what comes next
Right now, we’re extending into Zesty’s Kubernetes optimization tools. FastScaler interested us because it can hibernate nodes with container images already pre-loaded. For us, that’s huge. Our workloads use a lot of big images, so bringing nodes online fast is essential.
So far, the results look really promising. Nodes come online well within the two-minute interruption window. That means we might finally be able to run even our heaviest, most image-intensive workloads on Spot.
What we want next is full integration with Zesty’s other Kubernetes optimization tools, all working together. One system that understands what’s happening in the clusters in real time and adjusts automatically: Spot, right-sizing, commitments, everything.
Hopefully, we’ll see it running in our environment soon.





