diff --git a/keps/sig-node/5419-pod-level-resources-in-place-resize/README.md b/keps/sig-node/5419-pod-level-resources-in-place-resize/README.md
index f424c4fae40..78457eb408b 100644
--- a/keps/sig-node/5419-pod-level-resources-in-place-resize/README.md
+++ b/keps/sig-node/5419-pod-level-resources-in-place-resize/README.md
@@ -5,8 +5,10 @@
 - [Summary](#summary)
 - [Motivation](#motivation)
   - [Goals](#goals)
+  - [Non Goals](#non-goals)
 - [Proposal](#proposal)
   - [Notes/Constraints/Caveats](#notesconstraintscaveats)
+  - [Risks and Mitigations](#risks-and-mitigations)
 - [Design Details](#design-details)
   - [Design Principles](#design-principles)
   - [Components/Features changes](#componentsfeatures-changes)
@@ -102,7 +104,29 @@ This proposal aims to:
 1. Extend the In-Place Pod Resize (IPPR) functionality to support dynamic adjustments of pod-level CPU and Memory resources.
 2. Ensure compatibility and proper interaction between pod-level IPPR and existing container-level IPPR mechanisms.
-3. Provide clear mechanisms for tracking and reporting the actual allocated pod-level resources in PodStatus
+3. Provide clear mechanisms for tracking and reporting the actual allocated
+   pod-level resources in PodStatus.
+
+### Non Goals
+This KEP focuses solely on extending IPPR to pod-level resources, so the non-goals
+are largely the same as [IPPR's
+non-goals](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources#non-goals).
+These include:
+
+1. This KEP addresses only in-place resizing of core compute resources (CPU and
+   Memory) at the pod level. Extending this functionality to other resource types
+   (e.g., GPUs, network bandwidth) is outside the current scope.
+
+2. This KEP does not aim to implement dynamic changes to a pod's QoS class based on
+   in-place resource resize operations.
+
+3. No dynamic adjustments for Init Containers that have already finished and can't
+   be restarted.
+
+4. No automatic eviction of lower-priority pods to make room for a pod that is resizing its resources.
+
+5. This KEP doesn't aim to fix the complex timing issues that can arise between
+   the Kubelet and the scheduler during resizes; these issues already exist in [KEP#1287](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md).
 
 ## Proposal
 
 ### Notes/Constraints/Caveats
@@ -118,6 +142,41 @@ This proposal aims to:
 3. This feature relies on the PodLevelResources, InPlacePodVerticalScaling and
    InPlacePodLevelResourcesVerticalScaling feature gates being enabled.
 
+### Risks and Mitigations
+This KEP focuses solely on extending IPPR to pod-level resources, so the risks
+are largely the same as [IPPR's
+risks and mitigations](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources#risks-and-mitigations)
+and [Pod-Level Resources' risks and
+mitigations](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2837-pod-level-resource-spec/README.md#risks-and-mitigations).
+These include:
+
+1. Backward compatibility: For pods with pod-level resources, Pod.Spec.Resources
+   becomes a representation of desired state, while the pod's actual resource
+   configuration is tracked in Pod.Status.Resources. Applications that query
+   PodSpec and rely on Resources in PodSpec to determine a pod's resource
+   configuration may therefore see values that do not reflect the actual
+   configuration (illustrated in the sketch at the end of this section).
+   As a mitigation, this change needs to be documented and highlighted in the
+   release notes and in top-level Kubernetes documents.
+
+2. Scheduler race condition: If a resize happens concurrently with the scheduler
+   evaluating the node where the pod is being resized, the node can end up
+   over-scheduled, which will cause the pod to be rejected with an OutOfCPU or
+   OutOfMemory error. Solving this race condition is out of scope for this KEP,
+   but a general solution may be considered in the future.
+
+3. Since Pod Level Resource Specifications is an opt-in feature, merging the feature-related changes won't impact existing workloads. Moreover, the feature will be rolled out gradually, beginning with an alpha release for testing and gathering feedback. This will be followed by beta and GA releases as the feature matures and potential problems and improvements are addressed.
+
+4. While this feature doesn't alter the existing cgroups structure, it does change how pod-level cgroup values are determined. Currently, Kubernetes derives these values from the container-level cgroup settings. However, with Pod Level Resource Specifications enabled, pod-level cgroup settings will be set directly based on the values specified in the pod's resource spec stanza, if set. This change in behavior could potentially affect:
+
+   - Workloads or tools that rely on reading cgroup values: any workloads or tools that depend on reading or interpreting these cgroup values might observe different derived values if pod-level resources are specified without container-level settings.
+
+   - Third-party schedulers or tools that make assumptions about pod-level resource calculation: these tools might require adjustments to accommodate the new way pod-level resources are determined.
+
+   To mitigate potential issues, the feature documentation will clearly highlight this change and its potential impact. This will allow users to:
+
+   - Adjust their pod-level and container-level resource settings as needed.
+   - Modify any custom schedulers or tools to align with the new resource calculation method.
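+
+To make the backward-compatibility risk above concrete, the following is a rough
+sketch of how a pod with pod-level resources might look in the middle of an
+in-place resize, assuming the Pod.Status.Resources reporting described in this
+KEP. The pod name, image, and resource values are illustrative only, not
+normative API output.
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: pod-level-resize-demo      # hypothetical example pod
+spec:
+  # Desired state: pod-level resources after an in-place resize request.
+  # With pod-level resources specified, pod-level cgroup settings are taken
+  # from these values rather than being derived from container-level settings.
+  resources:
+    requests:
+      cpu: "2"
+      memory: 2Gi
+    limits:
+      cpu: "2"
+      memory: 2Gi
+  containers:
+  - name: app
+    image: registry.k8s.io/pause:3.9   # placeholder image
+status:
+  # Actual state: the pod-level resources currently applied, as tracked in
+  # Pod.Status.Resources. Until the Kubelet actuates the resize, these values
+  # can lag behind spec.resources, which is why tools that read only
+  # spec.resources may not see the pod's real configuration.
+  resources:
+    requests:
+      cpu: "1"
+      memory: 1Gi
+    limits:
+      cpu: "1"
+      memory: 1Gi
+```
+
 ## Design Details
 
 ### Design Principles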