Kubernetes v1.35: Introducing Workload-Aware Scheduling
Summary
Kubernetes v1.35 introduces workload-aware scheduling features designed to manage multi-Pod applications, such as machine learning batch jobs, more efficiently. The update implements the new Workload API and gang scheduling to enable all-or-nothing Pod placement, alongside optimizations for scheduling identical Pods.
Key Points
- Introduces the `scheduling.k8s.io/v1alpha1` API group to define structured scheduling requirements for multi-Pod applications.
- Implements gang scheduling via the `GangScheduling` plugin, ensuring a group of Pods is only scheduled if the `minCount` requirement is met.
- Features "opportunistic batching" (Beta) to reduce scheduling latency by reusing feasibility calculations for Pods with identical resource requests, images, and affinities.
- Includes a 5-minute timeout for gang scheduling; if the full group cannot be assigned to nodes within this window, all Pods in the group are rejected and returned to the queue.
- Requires enabling the `GenericWorkload` feature gate on both `kube-apiserver` and `kube-scheduler` to use the Workload API.
- The `OpportunisticBatching` feature is enabled by default in v1.35 but can be managed via the `OpportunisticBatching` feature gate on `kube-scheduler`.
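As a sketch of the rollout steps above, the feature gates could be set like this (standard `--feature-gates` flag syntax; adapt to however you launch the components, e.g. static Pod manifests or a managed control plane):

```shell
# Alpha Workload API: enable GenericWorkload on both components.
kube-apiserver --feature-gates=GenericWorkload=true ...
kube-scheduler --feature-gates=GenericWorkload=true ...

# OpportunisticBatching is on by default in v1.35; to opt out:
kube-scheduler --feature-gates=OpportunisticBatching=false ...
```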
Technical Details
The Workload API allows users to define a Workload resource that specifies a podGroup and a scheduling policy. Pods are linked to this workload using the workloadRef and podGroup fields in their specification. The GangScheduling plugin manages the lifecycle of these Pods by blocking them from scheduling until the referenced Workload object exists and the number of pending Pods in the group meets the defined minCount. Once the minCount is reached, the scheduler uses a "Permit" gate to verify that valid assignments exist for the entire group. If the scheduler cannot find valid assignments for the whole group within five minutes, it rejects all Pods in that group to prevent resource deadlocks and wastage.
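A minimal sketch of the pairing described above, assuming a `Workload` kind in the `scheduling.k8s.io/v1alpha1` group; the exact field layout (`podGroups`, `policy`, the shape of `workloadRef`) is illustrative, not authoritative:

```yaml
# Hypothetical Workload declaring a gang of 4 worker Pods.
apiVersion: scheduling.k8s.io/v1alpha1
kind: Workload
metadata:
  name: training-job
spec:
  podGroups:
  - name: workers
    policy:
      gang:
        minCount: 4   # all-or-nothing: schedule only when 4 Pods are pending
---
# A member Pod links back to the Workload and names its group.
apiVersion: v1
kind: Pod
metadata:
  name: worker-0
spec:
  workloadRef:
    name: training-job
  podGroup: workers
  containers:
  - name: trainer
    image: example.com/trainer:latest   # placeholder image
```

The `GangScheduling` plugin holds `worker-0` in the queue until the `training-job` object exists and four pending Pods reference the `workers` group.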
To optimize performance, the OpportunisticBatching feature identifies Pods that share identical scheduling criteria, such as container images, resource requests, and affinity rules. When the scheduler processes a Pod, it can reuse the feasibility calculations from previous identical Pods in the queue, significantly speeding up the scheduling process for large, identical workloads. However, this mechanism is disabled if any scheduling-relevant fields differ between Pods or if specific features are used that interfere with the batching logic.
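The reuse check can be sketched in Go. This is a simplified model, not the scheduler's actual implementation: `schedulingKey` stands in for whatever set of scheduling-relevant Pod fields the real plugin compares, and reuse is allowed only when every field is identical.

```go
package main

import (
	"fmt"
	"reflect"
)

// schedulingKey models the Pod fields that must match exactly for
// feasibility results to be reused (simplified for illustration).
type schedulingKey struct {
	Images    []string // container images
	CPUMillis int64    // CPU request in millicores
	MemBytes  int64    // memory request in bytes
	Affinity  string   // serialized affinity rules
}

// canReuseFeasibility reports whether feasibility calculations computed
// for prev can be reused for next: any differing scheduling-relevant
// field disables batching for that pair.
func canReuseFeasibility(prev, next schedulingKey) bool {
	return reflect.DeepEqual(prev, next)
}

func main() {
	a := schedulingKey{
		Images:    []string{"example.com/trainer:latest"},
		CPUMillis: 1000,
		MemBytes:  1 << 30,
		Affinity:  "zone=a",
	}

	b := a // identical Pod in the queue: reuse the previous result
	fmt.Println(canReuseFeasibility(a, b)) // true

	c := a
	c.CPUMillis = 2000 // different resource request: no reuse
	fmt.Println(canReuseFeasibility(a, c)) // false
}
```

The all-fields-equal rule mirrors the behavior described above: batching is a fast path only for truly identical Pods, so a single differing field falls back to a full feasibility computation.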
Impact / Why It Matters
These updates reduce resource deadlocks and wastage in large-scale, multi-Pod workloads by ensuring that interdependent Pods are scheduled as a single unit. Additionally, the introduction of opportunistic batching provides a performance boost for high-scale, identical workloads without requiring manual configuration.