Kubernetes v1.35: New level of efficiency with in-place Pod restart
Summary
Kubernetes v1.35 introduces the RestartAllContainers alpha feature, which enables the in-place restart of all containers within a Pod without requiring the Pod to be deleted and recreated. This mechanism allows for a rapid reset of a Pod's state by re-executing the startup sequence while preserving the Pod's underlying network and storage resources.
Key Points
- Introduced as an alpha feature in Kubernetes v1.35 via the RestartAllContainersOnContainerExits feature gate.
- Extends the ContainerRestartRules feature, which graduated to beta in version 1.35.
- Preserves the Pod's UID, IP address, network namespace, sandbox, attached devices, and all volumes (including emptyDir and PVCs).
- Re-executes the entire startup sequence, including all init containers, followed by sidecars and regular containers.
- Reduces recovery time for large-scale AI/ML workloads (e.g., 1,000+ nodes) from minutes to seconds.
- Provides a new Pod condition, AllContainersRestarting, to track the restart lifecycle.
- Does not execute preStop hooks during the in-place restart process.
Technical Details
The RestartAllContainers action is triggered when a container exits with an exit code that matches a rule defined in that container's restartPolicyRules configuration. When the action is invoked, the kubelet terminates all running containers, including ephemeral containers, and re-runs the Pod's startup sequence from the beginning. Any init containers responsible for environment setup or credential fetching therefore run again, giving the application a clean state.
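The matching step described above can be modeled in a few lines of Python. This is only an illustrative sketch of the semantics, not kubelet code: the RestartRule class and action_for_exit function are hypothetical names, the In and NotIn operators are assumed from the ContainerRestartRules API, and first-match-wins ordering is an assumption to check against the API reference.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RestartRule:
    """Hypothetical in-memory model of one restartPolicyRules entry."""
    action: str            # e.g. "RestartAllContainers"
    operator: str          # "In" or "NotIn" (assumed operator names)
    values: Sequence[int]  # exit codes the rule refers to


def action_for_exit(exit_code: int, rules: Sequence[RestartRule]) -> Optional[str]:
    """Return the action of the first rule that matches exit_code, else None.

    First-match-wins is an assumption made for this sketch.
    """
    for rule in rules:
        if rule.operator == "In":
            matched = exit_code in rule.values
        elif rule.operator == "NotIn":
            matched = exit_code not in rule.values
        else:
            raise ValueError(f"unknown operator: {rule.operator!r}")
        if matched:
            return rule.action
    return None


# Example: exit code 88 is reserved (by convention in this sketch) to request
# a full in-place restart; any other exit code matches no rule.
rules = [RestartRule(action="RestartAllContainers", operator="In", values=(88,))]
print(action_for_exit(88, rules))
print(action_for_exit(1, rules))
```

When no rule matches, the function returns None, which here stands in for "fall back to the container's or Pod's ordinary restart policy".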
To utilize this feature, the RestartAllContainersOnContainerExits feature gate must be enabled on both the API server and the kubelet. The configuration is implemented within the Pod specification, allowing developers to map specific exitCodes to the RestartAllContainers action. Because the kubelet bypasses preStop hooks during this process, all containers must be designed to be reentrant and capable of handling abrupt termination. For observability, the container's restart count is incremented, and the AllContainersRestarting condition becomes True until the new startup sequence is complete.
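As a rough illustration of that configuration, the following Python sketch builds a Pod manifest as a dictionary and prints it as JSON, which kubectl also accepts. Only restartPolicyRules, exitCodes, RestartAllContainers, and AllContainersRestarting come from the text above. The action, operator, and values field names are assumed from the ContainerRestartRules API, and the Pod name, image references, and exit code 88 are placeholders, so verify the shape against the v1.35 API reference before relying on it.

```python
import json


def build_pod_manifest(trigger_exit_code: int) -> dict:
    """Build a Pod manifest whose worker container requests an in-place
    restart of all containers when it exits with trigger_exit_code."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "restart-all-demo"},  # placeholder name
        "spec": {
            "restartPolicy": "Never",
            # Init containers re-run on every in-place restart, so they
            # must be safe to execute more than once.
            "initContainers": [
                {"name": "setup", "image": "example.invalid/setup:1.0"},
            ],
            "containers": [
                {
                    "name": "worker",
                    "image": "example.invalid/worker:1.0",
                    # Assumption: a container-level restartPolicy is set
                    # alongside restartPolicyRules.
                    "restartPolicy": "Never",
                    "restartPolicyRules": [
                        {
                            "action": "RestartAllContainers",
                            "exitCodes": {
                                "operator": "In",
                                "values": [trigger_exit_code],
                            },
                        }
                    ],
                }
            ],
        },
    }


def is_restarting(pod: dict) -> bool:
    """Return True while the AllContainersRestarting condition is True."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "AllContainersRestarting" and c.get("status") == "True"
        for c in conditions
    )


manifest = build_pod_manifest(88)
print(json.dumps(manifest, indent=2))
```

The is_restarting helper expects a Pod object in the shape returned by the API server (for example, the parsed output of kubectl get pod -o json) and can be polled to tell when the new startup sequence has finished.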
Impact / Why It Matters
This feature significantly reduces the computational overhead and cost associated with recovering large-scale, synchronous workloads, such as AI/ML training, where deleting and rescheduling thousands of Pods can lead to massive resource waste. It also allows developers to implement robust, Kubernetes-native recovery strategies for complex inter-container dependencies without needing custom external controllers.