If you’ve been around Kubernetes long enough, you’ve probably heard people discuss KEDA vs. Karpenter like it’s some kind of showdown.
But here’s the truth from someone who’s spent way too much time battling scaling issues in production:
It’s not KEDA vs Karpenter. It’s KEDA and Karpenter.
They do completely different jobs, and when you combine them, you unlock smoother, smarter, and more cost-efficient scaling—without the headaches of manual tuning.
So, let’s clear this up once and for all.
What is KEDA?
KEDA (Kubernetes Event-Driven Autoscaling) is all about scaling your workloads based on external events.
Think of it like this:
While traditional Horizontal Pod Autoscalers (HPA) scale your pods based on CPU or memory usage, KEDA can scale based on real-world signals like:
- Messages piling up in a RabbitMQ queue
- Pending jobs in AWS SQS
- Lag in a Kafka topic
- Or nearly 60 other sources (seriously, it supports a ton)
Why use KEDA?
Because real traffic patterns don’t always spike CPU or memory. Imagine you’ve got a queue stacking up overnight. KEDA notices and spins up more pods to work through the backlog—even if your nodes look quiet otherwise.
And the best part?
KEDA scales pods down to zero when there’s no work to do. Total cost saver.
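To make that concrete, here's a minimal sketch of a KEDA ScaledObject watching a Kafka topic. The Deployment name, broker address, and thresholds are placeholders, not recommendations:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer-scaler
spec:
  scaleTargetRef:
    name: orders-consumer        # the Deployment KEDA will scale
  minReplicaCount: 0             # scale all the way to zero when the topic is quiet
  maxReplicaCount: 20
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.my-namespace.svc:9092
        consumerGroup: orders
        topic: orders
        lagThreshold: "50"       # target lag per replica

With minReplicaCount: 0, the consumer disappears entirely while the topic is idle and comes back as soon as lag shows up.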
What is Karpenter?
Karpenter handles the other side of the scaling equation: the nodes themselves.
While Kubernetes is great at adding pods when demand increases, it still needs somewhere to put them. That’s where Karpenter steps in.
Instead of manually setting up complicated Node Groups and guessing how many nodes you might need, Karpenter:
- Launches the right types of nodes exactly when you need them
- Shuts down unneeded nodes automatically
- Picks the most cost-efficient instance types, sizes, and zones
- Supports a mix of Spot and On-Demand instances to save cash
Why use Karpenter?
Because manually managing node groups and scaling policies is painful. Karpenter figures it out in real time based on the actual resource requests of your pods.
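To give you a feel for it, here's a rough Provisioner sketch on the older karpenter.sh/v1alpha5 API (newer Karpenter releases use a NodePool object with similar fields, and the providerRef name here is just a placeholder for your cloud-specific template):

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]   # allow a mix of Spot and On-Demand
  limits:
    resources:
      cpu: "1000"                     # never provision more than 1000 vCPUs in total
  providerRef:
    name: default                     # points at cloud-specific settings (e.g. an AWSNodeTemplate)

That's essentially it: you describe what's allowed, and Karpenter picks the cheapest nodes that satisfy your pending pods within those constraints.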
KEDA and Karpenter: The Dream Team
Here’s where the magic happens.
KEDA scales pods based on demand. Karpenter scales nodes to fit those pods.
They complement each other perfectly.
Here’s a real-world example:
- You’ve got a queue (like Kafka) that’s backing up.
- KEDA detects the backlog and increases the replica count of your consumer pods from 1 to 20.
- Kubernetes now needs to find room for those 20 pods.
- Karpenter notices there’s not enough capacity and spins up new nodes to fit the pods.
- Pods get scheduled, the backlog clears.
- KEDA scales your pods back down.
- Karpenter sees the nodes are empty and shuts them down.
Zero queue backlog. No wasted resources or manual intervention.
That’s the dream, right?
How to Use KEDA and Karpenter Together (The Right Way)
When you pair KEDA and Karpenter, it feels like you’ve unlocked next-level Kubernetes scaling. But to keep things running smoothly (and avoid some frustrating surprises), here’s how to fine-tune the setup with real-world best practices.
1. Define Your Pod Resource Requests Wisely
Here’s the deal: Karpenter provisions nodes based on what your pods say they need—specifically the CPU and memory requests (not limits).
If your pod requests are too low (or missing entirely), Karpenter might spin up a node that’s way too small, and your pods will struggle to run.
Example:
resources:
  requests:
    cpu: "500m"
    memory: "2Gi"
  limits:
    cpu: "1"
    memory: "2Gi"
In this example, Karpenter makes sure the node it provisions has at least 500 millicores of CPU and 2Gi of memory available for this pod.
Tip: Don’t guess your resource needs. Use tools like Vertical Pod Autoscaler (VPA) in recommendation mode to analyze past usage and suggest accurate requests.
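If you want to go that route, a recommendation-only VPA is a small manifest. This assumes the VPA components are installed in your cluster, and the Deployment name is a placeholder:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: orders-consumer-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders-consumer
  updatePolicy:
    updateMode: "Off"   # only recommend requests, never evict or update pods

Then kubectl describe vpa orders-consumer-vpa shows the recommended requests, which you can copy into your Deployment yourself.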
- Kubernetes Resource Management Docs
- VPA Setup Guide
2. Choose KEDA Scalers That Match Real Demand
The key with KEDA is choosing the right scaler for the actual pressure point of your workload.
For instance:
- For queue processing? Use the RabbitMQ, Kafka, or AWS SQS scaler.
- For APIs? Try the HTTP scaler to monitor request volume.
- For databases? Use PostgreSQL or Redis scalers to watch row counts or keyspace.
Example with the AWS SQS scaler in a ScaledObject:
triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/my-queue
      queueLength: "10"
      awsRegion: "us-east-1"
    authenticationRef:
      name: keda-aws-credentials
This tells KEDA to target roughly 10 messages per consumer pod, so as the queue backs up, more replicas get added.
Tip: Align KEDA's polling interval with how quickly your system needs to react to load spikes.
The default is every 30 seconds; for faster-moving workloads, reduce it.
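In a ScaledObject spec, that's a one-line change (the values below are illustrative, not recommendations):

spec:
  pollingInterval: 10   # check the queue every 10 seconds instead of the 30-second default
  cooldownPeriod: 300   # how long to wait after the last trigger before scaling back to zero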
- Full List of KEDA Scalers
- KEDA HTTP Add-on
3. Watch Out for Cold Starts
Scaling from zero sounds perfect, right? Until you realize the cold start tax.
Here’s what really happens when demand spikes:
- KEDA detects an event (e.g., queue backlog).
- It scales the deployment from 0 to N pods.
- Karpenter notices nodes are needed.
- New nodes get provisioned, which can take 60-120 seconds.
- The pods pull their images and initialize.
During this time, your customers are waiting (or worse, errors are stacking up).
How to minimize cold starts:
- Set minReplicaCount: 1 in your KEDA ScaledObject to keep a small baseline running instead of scaling all the way to zero.
- Pre-warm your cluster by keeping a minimum number of nodes active using Karpenter’s consolidation settings to avoid scaling down absolutely everything.
- Consider smaller, faster nodes that spin up quicker.
Tip: Run a light “canary” workload that forces at least one node to stick around.
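One concrete knob for that pre-warming idea is how aggressively Karpenter reclaims empty nodes. On the older Provisioner API that's a single field (newer releases express this through the NodePool's disruption and consolidation settings instead); the value below is illustrative:

spec:
  ttlSecondsAfterEmpty: 300   # keep empty nodes alive for 5 minutes before removing them

A longer window means back-to-back bursts land on nodes that are already warm, at the price of a little idle capacity.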
- KEDA Scaling Behavior
- Karpenter Consolidation Docs
4. Let Karpenter Handle Node Types
Karpenter’s secret sauce is its ability to dynamically choose the best node type based on your pod requests, availability, and pricing.
But it only works if you guide it with labels, requirements, and constraints.
Example:
Let’s say some workloads need GPUs. You can add a node selector to your deployment:
nodeSelector:
  karpenter.sh/capacity-type: "on-demand"
  accelerator: "nvidia"
And in Karpenter's Provisioner object, you can define which instance types support GPUs:
requirements:
  - key: "accelerator"
    operator: In
    values: ["nvidia"]
Now Karpenter will automatically choose the cheapest available GPU instance type that satisfies the pod’s needs.
Tip: Use topologySpreadConstraints to distribute workloads across AZs for resilience, and let Karpenter balance nodes accordingly.
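If you haven't used them before, a typical constraint on the pod spec looks like this (the app label is a placeholder):

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway   # prefer spreading across zones, but don't block scheduling
    labelSelector:
      matchLabels:
        app: orders-consumer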
- Karpenter Node Requirements
- Kubernetes Topology Spread Constraints
5. Monitor Everything
Even with automation, blind scaling is dangerous. You need visibility into how your scaling layers are behaving together.
What to monitor:
- KEDA: Are scalers triggering correctly? How long are workloads staying at max replicas?
- Karpenter: How fast are new nodes coming online? Are pods stuck pending?
- Pod health: Are there restarts, image pull errors, or resource throttling?
- Node usage: Are nodes underused or over-provisioned?
Tools:
- Prometheus + Grafana: For visual dashboards and custom alerts.
- CloudWatch (EKS) or GKE Monitoring: For deeper infrastructure-level insights.
- Karpenter Events: inspect them with kubectl get events --sort-by='.lastTimestamp'
Tip: Set up alerts for pods stuck in Pending for longer than 30 seconds. It's usually the first sign your Karpenter settings need tweaking.
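If you're on the Prometheus route, a rough alert for that looks something like this (it assumes kube-state-metrics and the Prometheus Operator are in the cluster; tune the thresholds to your own setup):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pending-pods
spec:
  groups:
    - name: scaling
      rules:
        - alert: PodsPendingTooLong
          expr: sum(kube_pod_status_phase{phase="Pending"}) > 0
          for: 1m   # tighten this toward 30s if your rule evaluation interval allows
          labels:
            severity: warning
          annotations:
            summary: Pods stuck in Pending; check Karpenter provisioning and limits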
- KEDA Metrics Exporter
- Karpenter Troubleshooting Guide
Stop Choosing, Start Combining
People still tend to compare KEDA and Karpenter as if you have to pick one.
Both handle scaling, both save money, and both make Kubernetes life easier.
But after running real workloads in production, I can tell you—this isn’t an either/or situation.
KEDA scales what you run.
Karpenter scales where you run it.
Use them together, and you’ll spend less time worrying about traffic spikes, node shortages, or wasting money on idle resources.
They’re not rivals.
They’re the perfect team.