In-Place Pod Resize is a Kubernetes feature that allows a running container’s CPU and memory requests and limits to be changed without restarting the Pod. This enables real-time vertical scaling for running workloads, removing the traditional requirement to evict or relaunch Pods whenever their resource specifications change.

Launch Context

In-Place Pod Resize entered alpha in Kubernetes v1.27 and graduated to beta (enabled by default) in v1.33, addressing a long-standing challenge in Kubernetes vertical scaling. Prior to this, adjusting a Pod’s resource requests required a full restart, an approach that introduced downtime and complexity for many production workloads.

How It Works

With In-Place Pod Resize, updates to CPU and memory requests (and limits) are applied live by patching the Pod’s resize subresource rather than recreating the Pod. The kubelet then attempts to modify the container’s resources in place, allowing the Pod to continue running without disruption.
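
A minimal sketch of such a patch, written in Go with client-go, is shown below; the namespace, Pod name ("my-api-pod"), container name ("app"), and resource values are illustrative, and it assumes a v1.33+ cluster reachable through the local kubeconfig.

```go
// Minimal sketch: raise a Pod's CPU and memory requests in place by patching
// the "resize" subresource (Kubernetes v1.33+). The namespace, Pod name,
// container name, and resource values are illustrative placeholders.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (assumes running outside the cluster).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Strategic merge patch raising the "app" container's requests.
	patch := []byte(`{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"750m","memory":"512Mi"}}}]}}`)

	// The trailing argument targets the "resize" subresource instead of the main Pod resource.
	_, err = client.CoreV1().Pods("default").Patch(
		context.TODO(), "my-api-pod",
		types.StrategicMergePatchType, patch,
		metav1.PatchOptions{}, "resize",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println("in-place resize requested")
}
```

kubectl exposes the same operation through its --subresource flag, so the patch can also be issued from the command line.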

If a resized Pod no longer fits on its current node, the kubelet marks the pending resize as “infeasible” and leaves the Pod running with its original resources rather than applying the change. Advanced implementations, like Zesty’s Pod Rightsizing, address this by:

  1. Detecting infeasible Pods,
  2. Gradually evicting them in a controlled rollout to minimize disruption,
  3. Recreating them on nodes with sufficient capacity,
  4. Applying updated resource requests via a mutation webhook.

This approach ensures minimal service interruption while maintaining optimal resource allocation.
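
As a rough illustration of steps 1-3 (and not Zesty’s actual implementation), the Go sketch below lists Pods whose pending resize was reported as infeasible and evicts them through the Eviction API so the scheduler can place them on nodes with sufficient capacity; the namespace is a placeholder, and the condition and reason names assume Kubernetes v1.33.

```go
// Generic sketch (not Zesty's implementation): find Pods whose in-place
// resize was reported as infeasible and evict them through the Eviction API
// so the scheduler can place them on a node with enough capacity.
// Condition and reason names follow Kubernetes v1.33; the namespace is a placeholder.
package main

import (
	"context"
	"fmt"

	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	pods, err := client.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	for _, pod := range pods.Items {
		for _, cond := range pod.Status.Conditions {
			// A pending resize that cannot fit on the current node is surfaced
			// as condition "PodResizePending" with reason "Infeasible".
			if string(cond.Type) != "PodResizePending" || cond.Reason != "Infeasible" {
				continue
			}
			fmt.Printf("evicting %s: resize is infeasible on its current node\n", pod.Name)
			// The Eviction API respects PodDisruptionBudgets, which helps keep
			// the rollout controlled.
			evErr := client.PolicyV1().Evictions(pod.Namespace).Evict(context.TODO(), &policyv1.Eviction{
				ObjectMeta: metav1.ObjectMeta{Name: pod.Name, Namespace: pod.Namespace},
			})
			if evErr != nil {
				fmt.Printf("eviction of %s failed: %v\n", pod.Name, evErr)
			}
		}
	}
}
```

In practice, a controller would pace these evictions over time and rely on PodDisruptionBudgets (which the Eviction API honors) to keep the rollout gradual.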

Value Proposition

In-Place Pod Resize brings significant operational and financial benefits to Kubernetes environments:

  1. Zero Downtime Scaling: Eliminate disruptions tied to Pod restarts, which is critical for stateful applications and SLAs.
  2. Smarter Resource Allocation: Fine-tune CPU and memory resources in real time based on current needs.
  3. Improved Cost Efficiency: Avoid overprovisioning by dynamically adjusting resources instead of reserving excess capacity.
  4. Compatibility with HPA: Because vertical adjustments no longer force Pod restarts, In-Place Pod Resize removes a key blocker to running the Vertical Pod Autoscaler (VPA) alongside the Horizontal Pod Autoscaler (HPA).

Benefits

  1. No Service Disruption: Maintain availability during scaling operations.
  2. Operational Simplicity: Reduce the need for manual resource tuning and restart logic.
  3. More Precise Scaling: Enable per-container tuning, especially useful in multi-container Pods.
  4. FinOps-Aligned Optimization: Move from static provisioning to demand-based scaling.

Use Cases

  1. Production APIs: Adjust memory or CPU on live services under varying load.
  2. Stateful Workloads: Scale vertically without risking application state loss.
  3. VPA + HPA Environments: Combine vertical and horizontal scaling strategies effectively.
  4. Bursting Workloads: React to short-term spikes in usage without service disruption.

Challenges

  1. Node Fit Limitations: Enlarged Pods may not fit on their current nodes, requiring rescheduling.
  2. Operational Complexity: Coordinating evictions, rollout strategy, and webhook mutation may require additional tooling.
  3. Visibility & Controls: Teams need observability into scaling behavior and mechanisms for safe policy enforcement.

Integration with Zesty

Zesty’s Pod Rightsizing solution now supports In-Place Pod Resize, further enhancing its real-time optimization capabilities. When additional CPU or memory is needed:

  1. Zesty patches the resource requests without restarting the Pod.
  2. If the Pod becomes infeasible on its current node, Zesty gradually evicts and reschedules it in a controlled manner.
  3. The rescheduled Pod passes through a mutation webhook, ensuring updated resource requests are applied seamlessly.

This integration enables fully automated, zero-downtime vertical scaling as part of Zesty’s broader Kubernetes optimization toolkit.
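
For illustration only, here is a bare-bones sketch of the kind of mutating admission webhook described in step 3. It is not Zesty’s code: the endpoint path, container index, and hard-coded resource values are hypothetical stand-ins for whatever recommendation source feeds the webhook.

```go
// Generic sketch of a mutating admission webhook handler (not Zesty's
// implementation): when a Pod is (re)created, it returns a JSON patch that
// sets the first container's resource requests to recommended values.
// The endpoint path, container index, and values are placeholders.
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"

	admissionv1 "k8s.io/api/admission/v1"
	corev1 "k8s.io/api/core/v1"
)

func mutatePods(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	var review admissionv1.AdmissionReview
	if err := json.Unmarshal(body, &review); err != nil || review.Request == nil {
		http.Error(w, "malformed AdmissionReview", http.StatusBadRequest)
		return
	}

	var pod corev1.Pod
	if err := json.Unmarshal(review.Request.Object.Raw, &pod); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	resp := &admissionv1.AdmissionResponse{UID: review.Request.UID, Allowed: true}

	// Only mutate Pods that actually have containers. A real controller would
	// look up recommended values per workload; they are hard-coded here.
	if len(pod.Spec.Containers) > 0 {
		patch := []map[string]interface{}{{
			"op":    "add", // "add" creates or replaces the requests object
			"path":  "/spec/containers/0/resources/requests",
			"value": map[string]string{"cpu": "750m", "memory": "512Mi"},
		}}
		patchBytes, _ := json.Marshal(patch)
		patchType := admissionv1.PatchTypeJSONPatch
		resp.Patch = patchBytes
		resp.PatchType = &patchType
	}

	review.Response = resp
	out, _ := json.Marshal(review)
	w.Header().Set("Content-Type", "application/json")
	w.Write(out)
}

func main() {
	http.HandleFunc("/mutate", mutatePods)
	// Admission webhooks must be served over TLS; the certificate paths are placeholders.
	fmt.Println(http.ListenAndServeTLS(":8443", "tls.crt", "tls.key", nil))
}
```

A production webhook would be registered through a MutatingWebhookConfiguration and would derive its values from observed usage rather than constants.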

