Commit 3bfbc18

Merge pull request #3914 from wangchen615/customizable_recommender_kep
Add enhancement proposal for feature request #3913
2 parents 7e8972f + 680a094 commit 3bfbc18

2 files changed: +298, -0 lines changed

# Support Customized Recommenders for Vertical Pod Autoscalers

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
  - [Implementation Details](#implementation-details)
  - [Deployment Details](#deployment-details)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Alternatives](#alternatives)
- [Out of Scope](#out-of-scope)
- [Implementation History](#implementation-history)
<!-- /toc -->

## Summary

Today, VPA recommends CPU/memory requests based on a single recommender,
which predicts future requests from the historical usage observed in a
rolling time window. Since there is no universal recommendation policy that applies to all
types of workloads, this KEP proposes supporting multiple customized recommenders in VPA.
Users can then run different recommenders for different workloads, as workloads may exhibit
very distinct resource usage behaviors.

## Motivation

A VPA is used to recommend the requested resources of containers in pods when the actual CPU/memory usage of a container
differs significantly from the resources requested. Resource-usage-based recommendation
is the basic approach that resizes containers according to the actual usage observed, and it is
implemented in the default VPA recommender. Users can configure the time window and the
percentile of observed past usage that is used as the prediction of future requests/limits for CPU/memory.

However, because containers running different types of workloads may have different resource usage patterns,
there is no universal policy that applies to all. The existing VPA recommender may not accurately
predict future resource usage when containers exhibit certain resource usage behaviors,
such as trending, periodic change, or occasional spikes, resulting in significant
over-provisioning and OOM kills for microservices. Learning the different types of resource usage
behaviors of workloads and applying different algorithms to improve resource utilization
(CPU and memory) predictions can significantly reduce over-provisioning and OOM kills in VPA.

### Goals

- Allow the VPA object to specify a customized recommender to use.
- Allow the VPA object to use the default recommender when no recommender is specified.

### Non-Goals

- We assume no pod uses two recommenders at the same time.
- We do not resolve conflicts between recommenders.

## Proposal

### User Stories

#### Story 1

- Containers with Cyclic Patterns in Resource Usage

Containers used in monitoring may receive load to process only periodically but need to be long-running
to listen for incoming traffic. These containers therefore usually exhibit cyclic patterns, alternating
between usage spikes and idling. Resizing containers according to the usage observed in the previous
time window may always lead to under-provisioning for a short period when a load spike first arrives.
The problem occurs for memory if the cyclic pattern length is longer than 8 days. For CPU, the problem may
be visible, for example, as lower usage on weekends. The problem can even lead to frequent pod evictions
when a pod was resized according to its idling period and the host's resources have since been taken by other pods.

#### Story 2

- Containers with Different but Recurrent Behaviors in Resource Usage

Containers running Spark or deep learning training workloads are known to show recurring and repeating
patterns in resource usage. Prior research has shown that different but recurrent behaviors occur
for different containerized tasks, such as Spark or deep learning training. These common patterns can
be represented by phases, which display similar usage of computational resources over time.
There are common sequences of patterns across different executions of a workload, and they can be used
to proactively predict future resource usage more accurately. The default recommender in the current
VPA adopts a reactive approach, so a more proactive recommender is needed for these types of workloads.

### Implementation Details

The following describes the details of a first-class approach to supporting customized
recommenders. Namely, a dedicated field `recommenderName` is added to the VPA CRD definition in
`deploy/vpa-v1.crd.yaml`. Since the proposal defines `recommenderName` as an array (see
[Deployment Details](#deployment-details)), the schema declares it as an array of strings.

```yaml
validation:
  # openAPIV3Schema is the schema for validating custom objects.
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        required: []
        properties:
          recommenderName:
            type: array
            items:
              type: string
          targetRef:
            type: object
          updatePolicy:
            type: object
```

Correspondingly, the `VerticalPodAutoscalerSpec` in `pkg/apis/autoscaling.k8s.io/v1/types.go`
should be updated to include the `recommenderName` field.

```golang
// VerticalPodAutoscalerSpec is the specification of the behavior of the autoscaler.
type VerticalPodAutoscalerSpec struct {
    // TargetRef points to the controller managing the set of pods for the
    // autoscaler to control - e.g. Deployment, StatefulSet. VerticalPodAutoscaler
    // can be targeted at controller implementing scale subresource (the pod set is
    // retrieved from the controller's ScaleStatus) or some well known controllers
    // (e.g. for DaemonSet the pod set is read from the controller's spec).
    // If VerticalPodAutoscaler cannot use specified target it will report
    // ConfigUnsupported condition.
    // Note that VerticalPodAutoscaler does not require full implementation
    // of scale subresource - it will not use it to modify the replica count.
    // The only thing retrieved is a label selector matching pods grouped by
    // the target resource.
    TargetRef *autoscaling.CrossVersionObjectReference `json:"targetRef" protobuf:"bytes,1,name=targetRef"`

    // Describes the rules on how changes are applied to the pods.
    // If not specified, all fields in the `PodUpdatePolicy` are set to their
    // default values.
    // +optional
    UpdatePolicy *PodUpdatePolicy `json:"updatePolicy,omitempty" protobuf:"bytes,2,opt,name=updatePolicy"`

    // Controls how the autoscaler computes recommended resources.
    // The resource policy may be used to set constraints on the recommendations
    // for individual containers. If not specified, the autoscaler computes recommended
    // resources for all containers in the pod, without additional constraints.
    // +optional
    ResourcePolicy *PodResourcePolicy `json:"resourcePolicy,omitempty" protobuf:"bytes,3,opt,name=resourcePolicy"`

    // Name of the recommender responsible for generating recommendations for this object.
    // Defined as an array to allow multiple recommenders in the future, but at most one
    // element is supported in this proposal. If empty, the default recommender is used.
    // +optional
    RecommenderName []string `json:"recommenderName,omitempty" protobuf:"bytes,4,opt,name=recommenderName"`
}
```
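
For illustration, a VPA object that selects a non-default recommender could then be built in Go roughly as follows. This is a minimal sketch only: the recommender name `my-recommender`, the `hamster` target, the helper name, and the package name are placeholders, and the import paths reflect the usual autoscaler layout rather than anything mandated by this proposal.

```golang
package example // illustrative snippet only

import (
    autoscaling "k8s.io/api/autoscaling/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    vpa_types "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1"
)

// exampleVPA builds a VPA object that is handled by a hypothetical customized
// recommender named "my-recommender" instead of the default recommender.
func exampleVPA() *vpa_types.VerticalPodAutoscaler {
    return &vpa_types.VerticalPodAutoscaler{
        ObjectMeta: metav1.ObjectMeta{Name: "hamster-vpa", Namespace: "default"},
        Spec: vpa_types.VerticalPodAutoscalerSpec{
            TargetRef: &autoscaling.CrossVersionObjectReference{
                APIVersion: "apps/v1",
                Kind:       "Deployment",
                Name:       "hamster",
            },
            // At most one element is allowed; leaving this empty selects the default recommender.
            RecommenderName: []string{"my-recommender"},
        },
    }
}
```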

When creating a recommender object for recommendations, the recommender main routine should initialize itself
with a predefined recommender name, which can be defined as a constant in the `pkg/recommender/main.go` routine:

```golang
const RecommenderName = "default"

recommender := routines.NewRecommender(config, *checkpointsGCInterval, useCheckpoints, RecommenderName, *vpaObjectNamespace)
```

where `routines.NewRecommender` passes the `RecommenderName` on to the `clusterState` object.

```golang
// NewRecommender creates a new recommender instance.
// Dependencies are created automatically.
// Deprecated; use RecommenderFactory instead.
func NewRecommender(config *rest.Config, checkpointsGCInterval time.Duration, useCheckpoints bool, recommenderName string, namespace string) Recommender {
    clusterState := model.NewClusterState(recommenderName)
    return RecommenderFactory{
        ClusterState:           clusterState,
        ClusterStateFeeder:     input.NewClusterStateFeeder(config, clusterState, *memorySaver, namespace),
        CheckpointWriter:       checkpoint.NewCheckpointWriter(clusterState, vpa_clientset.NewForConfigOrDie(config).AutoscalingV1()),
        VpaClient:              vpa_clientset.NewForConfigOrDie(config).AutoscalingV1(),
        PodResourceRecommender: logic.CreatePodResourceRecommender(),
        CheckpointsGCInterval:  checkpointsGCInterval,
        UseCheckpoints:         useCheckpoints,
    }.Make()
}

// NewClusterState returns a new ClusterState with no pods.
func NewClusterState(recommenderName string) *ClusterState {
    return &ClusterState{
        RecommenderName:   recommenderName,
        Pods:              make(map[PodID]*PodState),
        Vpas:              make(map[VpaID]*Vpa),
        EmptyVPAs:         make(map[VpaID]time.Time),
        aggregateStateMap: make(aggregateContainerStatesMap),
        labelSetMap:       make(labelSetMap),
    }
}
```

Therefore, when loading VPA objects into the `clusterStateFeeder`, the feeder can select only the VPA CRDs whose
`recommenderName` matches the current `clusterState`'s `RecommenderName`.

```golang
// Fetch VPA objects and load them into the cluster state.
func (feeder *clusterStateFeeder) LoadVPAs() {
    // List VPA API objects.
    allVpaCRDs, err := feeder.vpaLister.List(labels.Everything())
    if err != nil {
        klog.Errorf("Cannot list VPAs. Reason: %+v", err)
        return
    }

    // Keep only the VPAs whose recommenderName matches this recommender's name or is unset.
    var vpaCRDs []*vpa_types.VerticalPodAutoscaler
    for _, vpaCRD := range allVpaCRDs {
        currentRecommenderName := feeder.clusterState.RecommenderName
        if len(vpaCRD.Spec.RecommenderName) > 0 && vpaCRD.Spec.RecommenderName[0] != currentRecommenderName {
            klog.V(6).Infof("Ignoring vpaCRD because its recommender name %v does not match the current recommender's name %v", vpaCRD.Spec.RecommenderName[0], currentRecommenderName)
            continue
        }
        vpaCRDs = append(vpaCRDs, vpaCRD)
    }

    klog.V(3).Infof("Fetched %d VPAs.", len(vpaCRDs))
    // Add or update existing VPAs in the model.
    vpaKeys := make(map[model.VpaID]bool)

    // ... (existing logic for adding/updating VPAs and garbage-collecting stale ones) ...

    feeder.clusterState.ObservedVpas = vpaCRDs
}
```

Accordingly, the VPA object yaml should include the `recommenderName`; here it is set to the default `RecommenderName`:

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  recommenderName:
    - default
  targetRef:
    apiVersion: "apps/v1"
    ... ...
```

### Deployment Details

A customized recommender is deployed as a separate Deployment that is chosen
by a different set of VPA objects. Each VPA object is supposed to choose only one recommender at a time.
How the default recommender and customized recommenders run and interact with VPA objects
is shown in the following drawing.

<img src="images/deployment.png" alt="deployment" width="720" height="360"/>
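
To make the deployment model concrete, a customized recommender can reuse the recommender main routine and simply register itself under its own name. The sketch below is illustrative only: it assumes the modified `routines.NewRecommender` signature shown above, and the name `my-recommender`, the hard-coded settings, and the plain run loop are simplifications rather than the actual recommender main loop.

```golang
// main.go of a hypothetical customized recommender, built and deployed as its own
// Deployment. It differs from the default recommender mainly in the name it registers
// under, so its feeder only loads VPA objects whose spec.recommenderName selects it.
package main

import (
    "time"

    "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/routines"
    kube_restclient "k8s.io/client-go/rest"
)

// Placeholder name for the customized recommender; VPA objects select it by
// listing this value in spec.recommenderName.
const RecommenderName = "my-recommender"

func main() {
    config, err := kube_restclient.InClusterConfig()
    if err != nil {
        panic(err)
    }

    // Simplified settings; the real binary reads these from command-line flags.
    checkpointsGCInterval := 10 * time.Minute
    useCheckpoints := true
    namespace := "" // all namespaces

    // The customized prediction logic would be plugged in via RecommenderFactory;
    // this sketch only shows how the recommender name is wired through.
    recommender := routines.NewRecommender(config, checkpointsGCInterval, useCheckpoints, RecommenderName, namespace)
    for {
        recommender.RunOnce()
        time.Sleep(time.Minute)
    }
}
```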

Although we do not support a VPA object using multiple recommenders in this proposal, we leave open the possibility of
supporting multiple recommenders in the future. Namely, we define `recommenderName` to be an array instead of a string, but we support only one element in this proposal. We modify the admission controller to validate that the array has at most one element.

We will add the following check to the `func validateVPA(vpa *vpa_types.VerticalPodAutoscaler, isCreate bool)` function.

```golang
if len(vpa.Spec.RecommenderName) > 1 {
    return fmt.Errorf("VPA object shouldn't specify more than one recommender name")
}
```
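
A unit-level check of this rule might look like the following sketch. It is illustrative only: it assumes `validateVPA` returns an error as in the snippet above, and the package name and import path are hypothetical stand-ins for the admission controller's actual layout.

```golang
package vpa // hypothetical: the package that defines validateVPA

import (
    "testing"

    vpa_types "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1"
)

func TestValidateVPARejectsMultipleRecommenderNames(t *testing.T) {
    vpa := &vpa_types.VerticalPodAutoscaler{
        Spec: vpa_types.VerticalPodAutoscalerSpec{
            // Two recommender names should be rejected by the new check.
            RecommenderName: []string{"default", "my-recommender"},
        },
    }
    if err := validateVPA(vpa, false); err == nil {
        t.Error("expected an error for a VPA that specifies more than one recommender name, got nil")
    }
}
```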

## Design Details

### Test Plan

- Add an e2e test demonstrating that the default recommender ignores a VPA which specifies an alternate recommender; a simplified sketch of the behavior to verify follows.
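
The core behavior the test needs to establish is that a recommender named `default` drops a VPA whose `recommenderName` points elsewhere while still loading VPAs that specify no recommender. Below is a minimal, unit-level sketch of that check; `isHandledByRecommender` is a hypothetical helper that mirrors the filtering condition in `LoadVPAs` above, and the actual e2e test would instead create real VPA objects and observe which ones receive recommendations.

```golang
package model_test // hypothetical test package; not part of the proposal itself

import "testing"

// isHandledByRecommender mirrors the filtering condition used in LoadVPAs:
// a VPA is kept when it names the current recommender or names no recommender at all.
func isHandledByRecommender(vpaRecommenderName []string, currentRecommender string) bool {
    return len(vpaRecommenderName) == 0 || vpaRecommenderName[0] == currentRecommender
}

func TestDefaultRecommenderIgnoresAlternateRecommender(t *testing.T) {
    cases := []struct {
        name            string
        recommenderName []string
        want            bool
    }{
        {"no recommender specified falls back to default", nil, true},
        {"explicitly selects the default recommender", []string{"default"}, true},
        {"selects an alternate recommender, so default ignores it", []string{"my-recommender"}, false},
    }
    for _, c := range cases {
        if got := isHandledByRecommender(c.recommenderName, "default"); got != c.want {
            t.Errorf("%s: got %v, want %v", c.name, got, c.want)
        }
    }
}
```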

### Upgrade / Downgrade Strategy

For cluster upgrades, the VPA from the previous version will continue working as before.
There is no change in behavior, and no flags have to be enabled or disabled.

## Alternatives

### Develop a plugin framework for customizable recommenders.

Add a webhook system for customized recommendations. The default VPA recommender would
call any available recommendation webhooks, and if any of them make a recommendation,
the recommender would use that recommendation instead of making its own. If none make
a recommendation, it would make its own recommendation as it currently does. The plugin alternative
is rejected because it involves far more design and code changes. It might be considered in the future if there are
more use cases where running multiple recommenders for the same VPA object is needed.

### Develop a label selector approach.

Add a label to the CRD object to denote the recommender's name. When making
recommendations in the recommender, only the VpaCrds with the label
`recommender=default` would be loaded and updated by the existing recommender.
The label selector approach is rejected because labels are too easy to get wrong: users can
easily omit or mislabel them and misconfigure the VPA objects.

## Out of Scope

- Although this proposal will enable alternate recommenders, no alternate recommenders
will be created as part of this proposal.
- This proposal will not support running multiple recommenders for the same VPA object. Each VPA object
is supposed to use only one recommender.

## Implementation History

<!--
Major milestones in the lifecycle of a KEP should be tracked in this section.
Major milestones might include:
- the `Summary` and `Motivation` sections being merged, signaling SIG acceptance
- the `Proposal` section being merged, signaling agreement on a proposed design
- the date implementation started
- the first Kubernetes release where an initial version of the KEP was available
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded
-->