Think of it as your way of telling Kubernetes:

Hey, this app needs at least this much CPU and memory to work properly—please make sure it gets that.

Without them, Kubernetes has no clue how big or small your workloads are, which can lead to overloaded nodes and crashing pods.

Use cases

When you define resource requests, you help Kubernetes:

  • Place pods on nodes with enough available resources.
  • Prevent resource starvation between different workloads.
  • Optimize cluster capacity and balance workloads more effectively.
  • Avoid performance issues caused by over-scheduling pods on crowded nodes.

If you skip setting requests, the scheduler treats them as zero—it assumes the pod barely needs any resources, until it suddenly does and overwhelms the node.
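One way to guard against this zero-default trap is a namespace-level LimitRange, which applies default requests to any container that doesn't declare its own. A minimal sketch—the 200m/128Mi values are illustrative, not a recommendation:

```yaml
# Applies default requests to any container in this namespace
# that omits its own. The values below are illustrative only.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-requests
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: "200m"     # used when resources.requests.cpu is not set
      memory: "128Mi" # used when resources.requests.memory is not set
```

Defaults like these are a safety net, not a substitute for setting requests that reflect each workload's real needs.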

How Do Resource Requests Work?

Each container in a pod can declare its own resource requests. These requests are set in the pod spec under the resources.requests field.

Kubernetes tracks two key resource types:

  • CPU (measured in cores or millicores).
  • Memory (measured in bytes, like Mi or Gi).

When the scheduler assigns a pod to a node, it makes sure the node has enough unallocated CPU and memory to satisfy the pod’s requests.

Example:

apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
  - name: backend
    image: my-backend:latest
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"

This container asks for:

  • 500 millicores (half a CPU core).
  • 256 MiB of memory.

If no node has enough free CPU and memory to meet those minimums, the pod will stay in a Pending state until resources free up.

Resource Requests vs. Limits

It’s easy to mix these up, but they play different roles:

| Feature | Resource Requests | Resource Limits |
| --- | --- | --- |
| What it sets | Minimum guaranteed resources | Maximum allowed resource usage |
| Scheduler uses it? | ✅ Yes | ❌ No |
| What happens if exceeded? | Nothing (the container can use more if available) | Container is throttled (CPU) or killed (memory) |

Tip: You can (and should) set both requests and limits to protect your workloads and your nodes.
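Building on the earlier pod spec, this is roughly what setting both looks like—the limit values here are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
  - name: backend
    image: my-backend:latest
    resources:
      requests:         # minimum the scheduler reserves on a node
        cpu: "500m"
        memory: "256Mi"
      limits:           # runtime ceiling (illustrative values)
        cpu: "1"        # CPU usage beyond this is throttled
        memory: "512Mi" # exceeding this gets the container OOM-killed
```

Keeping limits above requests gives the container headroom for bursts while still capping runaway usage.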

How Do Resource Requests Affect Node Scaling?

If you’re using tools like Karpenter, Cluster Autoscaler, or EKS Auto Mode, resource requests are critical. These autoscalers look at pending pods and their resource requests to decide:

  • When to add more nodes.
  • What size those nodes should be.
  • How to pack pods efficiently across nodes.

Warning: If you under-request resources, Kubernetes might schedule your pod on a node that can’t actually handle it under load. If you over-request, you might waste cluster capacity and overpay for unused resources.

Best Practices

  • Start with real data. Use tools like Vertical Pod Autoscaler (VPA) or Prometheus to analyze actual resource usage.
  • Set requests based on steady-state needs. Think about what the app typically consumes, not just its peaks.
  • Balance requests across pods. Avoid a few heavy pods hogging all the resources while others starve.
  • Review and adjust regularly. Usage changes over time—don’t set requests once and forget them.
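To put the first best practice into play, a Vertical Pod Autoscaler can turn observed usage into recommended requests. A minimal sketch, assuming the VPA components are installed in the cluster and a Deployment named backend exists:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: backend-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend     # hypothetical workload to analyze
  updatePolicy:
    updateMode: "Off" # only emit recommendations; don't rewrite pods
```

With updateMode set to "Off", the recommendations appear in the VPA object's status (for example via kubectl describe) without Kubernetes modifying running pods—useful for informing the requests you then set by hand.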

Similar Concepts

| Concept | Purpose |
| --- | --- |
| Resource Limits | Define the maximum resources a container can use. |
| Vertical Pod Autoscaler (VPA) | Suggests resource requests based on historical usage. |
| Horizontal Pod Autoscaler (HPA) | Scales pods based on metrics like CPU usage. |
| Karpenter | Provisions nodes to meet the resource requests of pending pods. |