1. Extend the In-Place Pod Resize (IPPR) functionality to support dynamic adjustments of pod-level CPU and Memory resources.
2. Ensure compatibility and proper interaction between pod-level IPPR and existing container-level IPPR mechanisms.
3. Provide clear mechanisms for tracking and reporting the actual allocated pod-level resources in PodStatus (a minimal manifest sketch follows this list).
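
To make the goals concrete, the purely illustrative manifest below shows a pod declaring pod-level resources via `spec.resources`, the field introduced by the PodLevelResources feature; under this KEP those values become resizable in place, with the actually allocated values reported in PodStatus. The pod name, image, and resource amounts are arbitrary placeholders.

```yaml
# Illustrative example: a pod with pod-level resources that this KEP would
# allow to be resized in place.
apiVersion: v1
kind: Pod
metadata:
  name: pod-level-resize-demo    # placeholder name
spec:
  resources:                     # pod-level requests/limits (PodLevelResources)
    requests:
      cpu: "1"
      memory: 1Gi
    limits:
      cpu: "2"
      memory: 2Gi
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
```
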
### Non Goals

1. This KEP focuses solely on in-place resizing of core compute resources (CPU and Memory) at the pod level. Extending this functionality to other resource types (e.g., GPUs, network bandwidth) is outside the current scope.

2. This KEP does not aim to implement dynamic changes to a pod's QoS class based on in-place resource resize operations.

3. No dynamic adjustments for Init Containers that have already finished and can't be restarted.

4. No automatic removal of lower-priority pods to make room for a pod that's resizing its resources.

5. This KEP doesn't aim to fix every complex timing issue between the Kubelet and the scheduler during resizes; these issues already exist today with container-level resizes under [KEP#1287](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md).

## Proposal
### Notes/Constraints/Caveats

3. This feature relies on the PodLevelResources, InPlacePodVerticalScaling, and InPlacePodLevelResourcesVerticalScaling feature gates being enabled (see the example configuration below).
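As a rough, non-prescriptive sketch (cluster setups vary, and the same gates also need to be enabled on the kube-apiserver, e.g. via its `--feature-gates` flag), the Kubelet side could be configured like this:

```yaml
# Sketch: enabling the required feature gates through the Kubelet
# configuration file. Gate names are taken from the text above.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodLevelResources: true
  InPlacePodVerticalScaling: true
  InPlacePodLevelResourcesVerticalScaling: true
```
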
### Risks and Mitigations

1. Backward compatibility: For pods with pod-level resources, once Pod.Spec.Resources represents only the desired state and the pod's actual resource configuration is tracked in Pod.Status.Resources, applications that read Resources from the PodSpec to determine the resource configuration may see values that do not reflect the actual configuration (see the illustrative sketch after this list). As a mitigation, this change needs to be documented and highlighted in the release notes and in top-level Kubernetes documentation.

2. Resizing memory lower: Lowering cgroup memory limits may not take effect while pages are still in use, so approaches such as setting the limit close to current usage may be required. This issue needs further investigation.

3. Scheduler race condition: If a resize happens concurrently with the scheduler evaluating the node where the pod is being resized, the node can end up over-scheduled, causing the pod to be rejected with an OutOfCPU or OutOfMemory error. Solving this race condition is out of scope for this KEP, but a general solution may be considered in the future.
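
To illustrate the backward-compatibility risk (item 1 above), the sketch below shows the divergence an application reading only PodSpec might observe while a resize is in flight. It assumes pod-level actuals surface under `status.resources` as stated above; the exact status layout is defined elsewhere in this proposal, so treat this as a non-normative example.

```yaml
# Non-normative sketch of a pod mid-resize: spec carries the desired values,
# status carries the values actually applied on the node.
spec:
  resources:
    limits:
      memory: 2Gi   # desired state after a resize request
status:
  resources:
    limits:
      memory: 1Gi   # actual allocation until the Kubelet actuates the resize
```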