Kubernetes 1.35: In-Place Pod Resize Graduates to Stable
Summary
Kubernetes v1.35 graduates the In-Place Pod Resize feature to Stable (GA). The feature lets you change the CPU and memory requests and limits of a running Pod, so vertical scaling no longer requires deleting and recreating the Pod.
Key Points
- In-Place Pod Resize transitioned from Beta (v1.33) to Stable (GA) in Kubernetes v1.35.
- The feature enables the mutation of spec.containers[*].resources for CPU and memory via the resize subresource.
- In v1.35, decreasing memory limits is now permitted; the Kubelet performs a best-effort check to prevent OOM-kills by verifying that current usage is below the new limit.
- When a node lacks sufficient capacity, deferred resize requests are reattempted based on PriorityClass, QoS class, and request duration (FIFO).
- Support for Pod Level Resources has been introduced as an Alpha feature in v1.35.
- The feature remains incompatible with swap, the static CPU Manager policy, and the static Memory Manager policy.
- New Kubelet metrics and Pod events are available in v1.35 to monitor and debug resource changes.
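As a rough illustration of the resize subresource mentioned in the list above, the following Python sketch builds a patch body that sets new CPU and memory values for one container. The helper name build_resize_patch, the container name "app", and the values are hypothetical, and setting requests equal to limits is only a simplifying assumption. The kubectl invocation in the trailing comment assumes a kubectl version that supports --subresource resize.

```python
import json


def build_resize_patch(container, cpu=None, memory=None):
    """Build a patch body that sets requests and limits for one container.

    Hypothetical helper: requests and limits are set to the same values
    purely to keep the example short.
    """
    resources = {}
    if cpu is not None:
        resources["cpu"] = cpu
    if memory is not None:
        resources["memory"] = memory
    if not resources:
        raise ValueError("specify cpu, memory, or both")
    return {
        "spec": {
            "containers": [
                {
                    "name": container,
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ]
        }
    }


patch = build_resize_patch("app", cpu="800m", memory="256Mi")
print(json.dumps(patch, indent=2))

# The resulting JSON could then be sent to the resize subresource, e.g.:
#   kubectl patch pod <pod-name> --subresource resize --patch '<json>'
```

Only CPU and memory appear in the patch because, as noted later in this summary, other resource types remain immutable.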
Technical Details
The In-Place Pod Resize mechanism functions by decoupling the desired resource state from the actual runtime state. The spec.containers[*].resources field serves as the desired configuration, while the status.containerStatuses[*].resources field reflects the resources currently applied to the running container. Resource adjustments are triggered by updating the desired requests and limits through the resize subresource.
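The desired-versus-actual split described above can be inspected by comparing the two fields on a Pod object. The sketch below works on a plain dictionary shaped like the Pod's JSON representation; the function name pending_resizes and the sample Pod are hypothetical. The comparison is a naive string comparison and assumes both fields use the same quantity notation, which may not hold if the API server normalizes values (for example "1" versus "1000m").

```python
def pending_resizes(pod):
    """Return {container_name: (desired, actual)} for every container whose
    spec resources differ from the resources reported in its status."""
    actual_by_name = {
        cs["name"]: cs.get("resources", {})
        for cs in pod.get("status", {}).get("containerStatuses", [])
    }
    pending = {}
    for container in pod.get("spec", {}).get("containers", []):
        desired = container.get("resources", {})
        actual = actual_by_name.get(container["name"], {})
        if desired != actual:
            pending[container["name"]] = (desired, actual)
    return pending


# Hypothetical Pod: "app" was resized from 500m to 800m CPU, but the
# status still reports the old value, so the resize has not been applied yet.
pod = {
    "spec": {
        "containers": [
            {"name": "app", "resources": {"requests": {"cpu": "800m"}, "limits": {"cpu": "800m"}}},
            {"name": "sidecar", "resources": {"requests": {"cpu": "100m"}, "limits": {"cpu": "100m"}}},
        ]
    },
    "status": {
        "containerStatuses": [
            {"name": "app", "resources": {"requests": {"cpu": "500m"}, "limits": {"cpu": "500m"}}},
            {"name": "sidecar", "resources": {"requests": {"cpu": "100m"}, "limits": {"cpu": "100m"}}},
        ]
    },
}

pending = pending_resizes(pod)
for name, (desired, actual) in pending.items():
    print(f"{name}: desired={desired} actual={actual}")
```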
In v1.35, the logic for handling resource decreases was expanded to allow memory limit reductions. The Kubelet tries to prevent OOM-kills by checking that current usage is below the new limit, but this check is best-effort and does not guarantee stability. When a node lacks capacity for a resize, the Kubelet keeps a queue of deferred resizes and retries them in order of PriorityClass, then QoS class, and finally the age of the request. Only CPU and memory are mutable; all other resource types remain immutable. Certain runtime environments (such as specific Java and Python configurations) may still require a container restart to recognize memory changes.
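The retry ordering and the memory check described above can be modeled in a few lines. This is an illustrative sketch, not the Kubelet's implementation: the DeferredResize type, the QOS_RANK table, and both function names are hypothetical. The ranking Guaranteed, then Burstable, then BestEffort is an assumption, as is the reading that higher PriorityClass values and longer-deferred requests go first.

```python
from dataclasses import dataclass

# Assumed ranking: lower number is retried earlier.
QOS_RANK = {"Guaranteed": 0, "Burstable": 1, "BestEffort": 2}


@dataclass
class DeferredResize:
    pod: str
    priority: int            # numeric value of the Pod's PriorityClass
    qos_class: str           # "Guaranteed", "Burstable", or "BestEffort"
    deferred_seconds: float  # how long the resize has been waiting


def retry_order(queue):
    """Sort by priority (highest first), then QoS rank, then age (oldest first)."""
    return sorted(
        queue,
        key=lambda r: (-r.priority, QOS_RANK[r.qos_class], -r.deferred_seconds),
    )


def memory_decrease_looks_safe(current_usage_bytes, new_limit_bytes):
    """Best-effort check: usage can still grow after this returns True."""
    return current_usage_bytes < new_limit_bytes


queue = [
    DeferredResize("a", priority=100, qos_class="Burstable", deferred_seconds=30),
    DeferredResize("b", priority=1000, qos_class="Burstable", deferred_seconds=5),
    DeferredResize("c", priority=100, qos_class="Guaranteed", deferred_seconds=10),
    DeferredResize("d", priority=100, qos_class="Burstable", deferred_seconds=90),
]
ordered = [r.pod for r in retry_order(queue)]
print(ordered)  # ['b', 'c', 'd', 'a']

MIB = 1024 * 1024
print(memory_decrease_looks_safe(200 * MIB, 256 * MIB))  # True
print(memory_decrease_looks_safe(300 * MIB, 256 * MIB))  # False
```

Pod "b" comes first because of its higher priority; among the remaining Pods with equal priority, the Guaranteed Pod "c" precedes the Burstable ones, and "d" precedes "a" because it has been deferred longer.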
Impact / Why It Matters
This feature enables vertical autoscaling without recreating Pods and, in many cases, without restarting containers, which reduces operational disruption for stateful, batch, and latency-sensitive workloads. It provides the necessary infrastructure for advanced patterns such as CPU Startup Boost and more efficient node bin-packing.