|  | 
|  | 1 | +# Support Customized Recommenders for Vertical Pod Autoscalers | 
|  | 2 | + | 
|  | 3 | +<!-- toc --> | 
|  | 4 | +- [Summary](#summary) | 
|  | 5 | +- [Motivation](#motivation) | 
|  | 6 | +  - [Goals](#goals) | 
|  | 7 | +  - [Non-Goals](#non-goals) | 
|  | 8 | +- [Proposal](#proposal) | 
|  | 9 | +  - [User Stories](#user-stories) | 
|  | 10 | +    - [Story 1](#story-1) | 
|  | 11 | +    - [Story 2](#story-2) | 
|  | 12 | +  - [Implementation Details](#implementation-details) | 
|  | 13 | +  - [Deployment Details](#deployment-details) | 
|  | 14 | +- [Design Details](#design-details) | 
|  | 15 | +  - [Test Plan](#test-plan) | 
|  | 16 | +  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) | 
|  | 17 | +- [Alternatives](#alternatives) | 
|  | 18 | +- [Out of Scope](#out-of-scope) | 
|  | 19 | +- [Implementation History](#implementation-history) | 
|  | 20 | +<!-- /toc --> | 
|  | 21 | + | 
|  | 22 | +## Summary | 
|  | 23 | + | 
|  | 24 | +Today, VPA recommends CPU/memory requests using a single recommender, | 
|  | 25 | +which predicts future requests from the historical usage observed in a | 
|  | 26 | +rolling time window. As no single recommendation policy fits all | 
|  | 27 | +types of workloads, this KEP proposes supporting multiple customized recommenders in VPA. | 
|  | 28 | +Users can then run different recommenders for different workloads, which may exhibit | 
|  | 29 | +very distinct resource usage behaviors. | 
|  | 30 | + | 
|  | 31 | +## Motivation | 
|  | 32 | + | 
|  | 33 | +A VPA is used to recommend the requested resources of containers in pods when the actual CPU/memory usage of a container  | 
|  | 34 | +is significantly different from the resources requested. Resource usage-based recommendation  | 
|  | 35 | +is the basic approach that resizes containers according to the actual usage observed and is  | 
|  | 36 | +implemented in the default VPA recommender. Users can configure the time window and a certain  | 
|  | 37 | +percentile of observed usage in the past as the prediction of future requests/limits for CPU/memory. | 
|  | 38 | + | 
|  | 39 | +However, as containers running different types of workloads may have different resource usage patterns,  | 
|  | 40 | +there is no universal policy that applies to all. The existing VPA recommender may not accurately  | 
|  | 41 | +predict future resource usage when containers exhibit certain resource usage behaviors,  | 
|  | 42 | +such as trending, periodically changing, or occasional spikes, resulting in significant  | 
|  | 43 | +over-provisioning and OOM kills for microservices. Learning different types of resource usage  | 
|  | 44 | +behaviors for workloads and applying different algorithms to improve resource utilization  | 
|  | 45 | +(CPU and Memory) predictions can significantly reduce over-provisioning and OOM kills in VPA. | 
|  | 46 | + | 
|  | 47 | +### Goals | 
|  | 48 | + | 
|  | 49 | +- Allow the VPA object to specify a customized recommender to use.  | 
|  | 50 | +- Allow the VPA object to use the default recommender when no recommender is specified. | 
|  | 51 | + | 
|  | 52 | +### Non-Goals | 
|  | 53 | + | 
|  | 54 | +- We assume no pod uses two recommenders at the same time. | 
|  | 55 | +- We do not resolve conflicts between recommenders. | 
|  | 56 | + | 
|  | 57 | +## Proposal | 
|  | 58 | + | 
|  | 59 | +### User Stories | 
|  | 60 | + | 
|  | 61 | +#### Story 1  | 
|  | 62 | + | 
|  | 63 | +- Containers with Cyclic Patterns in Resource Usage | 
|  | 64 | + | 
|  | 65 | +Containers used in monitoring may receive load periodically to process but need to be long-running  | 
|  | 66 | +to listen to incoming traffic. Thus, these containers usually exhibit cyclic patterns, alternating  | 
|  | 67 | +between usage spikes and idling. Resizing containers according to usage observed in the previous  | 
|  | 68 | +time window may repeatedly leave them under-provisioned for a short period just after a load spike arrives. | 
|  | 69 | +For memory, the problem occurs if the cycle length is >8 days. For CPU, the problem may | 
|  | 70 | +be visible, for example, when usage is lower on the weekend. It can even lead to frequent pod evictions | 
|  | 71 | +when the pod was resized according to the idling period and the host resources have since been taken by other pods. | 
|  | 72 | + | 
|  | 73 | +#### Story 2  | 
|  | 74 | + | 
|  | 75 | +- Containers with Different but Recurrent Behaviors in Resource Usage | 
|  | 76 | + | 
|  | 77 | +Containers running Spark or deep learning training workloads are known to show recurring | 
|  | 78 | +patterns in resource usage. Prior research has shown that different but recurrent behaviors occur | 
|  | 79 | +for different containerized tasks. These common patterns can | 
|  | 80 | +be represented by phases, each of which displays similar usage of computational resources over time. | 
|  | 81 | +There are common sequences of patterns for different executions of the workload and they can be used  | 
|  | 82 | +to proactively predict future resource usage more accurately. The default recommender in the current  | 
|  | 83 | +VPA adopts a reactive approach so a more proactive recommender is needed for these types of workload. | 
|  | 84 | + | 
|  | 85 | +### Implementation Details | 
|  | 86 | + | 
|  | 87 | +The following describes a first-class approach to supporting customized | 
|  | 88 | +recommenders. Namely, a dedicated field `recommenderName` is added to the VPA CRD definition in | 
|  | 89 | +`deploy/vpa-v1.crd.yaml`. | 
|  | 90 | + | 
|  | 91 | +```yaml | 
|  | 92 | +validation: | 
|  | 93 | + # openAPIV3Schema is the schema for validating custom objects. | 
|  | 94 | + openAPIV3Schema: | 
|  | 95 | +   type: object | 
|  | 96 | +   properties: | 
|  | 97 | +     spec: | 
|  | 98 | +       type: object | 
|  | 99 | +       required: [] | 
|  | 100 | +       properties: | 
|  | 101 | +         recommenderName: | 
|  | 102 | +           type: string | 
|  | 103 | +         targetRef: | 
|  | 104 | +           type: object | 
|  | 105 | +         updatePolicy: | 
|  | 106 | +           type: object | 
|  | 107 | +``` | 
|  | 108 | +
 | 
|  | 109 | +Correspondingly, the `VerticalPodAutoscalerSpec` in `pkg/apis/autoscaling.k8s.io/v1/types.go`  | 
|  | 110 | +should be updated to include the `recommenderName` field. | 
|  | 111 | + | 
|  | 112 | +```golang | 
|  | 113 | +// VerticalPodAutoscalerSpec is the specification of the behavior of the autoscaler. | 
|  | 114 | +type VerticalPodAutoscalerSpec struct { | 
|  | 115 | +	// TargetRef points to the controller managing the set of pods for the | 
|  | 116 | +	// autoscaler to control - e.g. Deployment, StatefulSet. VerticalPodAutoscaler | 
|  | 117 | +	// can be targeted at controller implementing scale subresource (the pod set is | 
|  | 118 | +	// retrieved from the controller's ScaleStatus) or some well known controllers | 
|  | 119 | +	// (e.g. for DaemonSet the pod set is read from the controller's spec). | 
|  | 120 | +	// If VerticalPodAutoscaler cannot use specified target it will report | 
|  | 121 | +	// ConfigUnsupported condition. | 
|  | 122 | +	// Note that VerticalPodAutoscaler does not require full implementation | 
|  | 123 | +	// of scale subresource - it will not use it to modify the replica count. | 
|  | 124 | +	// The only thing retrieved is a label selector matching pods grouped by | 
|  | 125 | +	// the target resource. | 
|  | 126 | +	TargetRef *autoscaling.CrossVersionObjectReference `json:"targetRef" protobuf:"bytes,1,name=targetRef"` | 
|  | 127 | + | 
|  | 128 | +	// Describes the rules on how changes are applied to the pods. | 
|  | 129 | +	// If not specified, all fields in the `PodUpdatePolicy` are set to their | 
|  | 130 | +	// default values. | 
|  | 131 | +	// +optional | 
|  | 132 | +	UpdatePolicy *PodUpdatePolicy `json:"updatePolicy,omitempty" protobuf:"bytes,2,opt,name=updatePolicy"` | 
|  | 133 | + | 
|  | 134 | +	// Controls how the autoscaler computes recommended resources. | 
|  | 135 | +	// The resource policy may be used to set constraints on the recommendations | 
|  | 136 | +	// for individual containers. If not specified, the autoscaler computes recommended | 
|  | 137 | +	// resources for all containers in the pod, without additional constraints. | 
|  | 138 | +	// +optional | 
|  | 139 | +	ResourcePolicy *PodResourcePolicy `json:"resourcePolicy,omitempty" protobuf:"bytes,3,opt,name=resourcePolicy"` | 
|  | 140 | +   | 
|  | 141 | +	// Name of the recommender responsible for generating recommendations for this object. | 
|  | 142 | +	RecommenderName []string `json:"recommenderName,omitempty" protobuf:"bytes,4,opt,name=recommenderName"` | 
|  | 143 | +} | 
|  | 144 | +``` | 
|  | 145 | + | 
|  | 146 | +When creating a recommender object, the recommender's main routine should initialize it | 
|  | 147 | +with a predefined recommender name, which can be defined as a constant in `pkg/recommender/main.go`, | 
|  | 148 | + | 
|  | 149 | +```golang | 
|  | 150 | +const RecommenderName = "default" | 
|  | 151 | + | 
|  | 152 | +recommender := routines.NewRecommender(config, *checkpointsGCInterval, useCheckpoints, RecommenderName, *vpaObjectNamespace) | 
|  | 153 | +``` | 
|  | 154 | + | 
|  | 155 | +where `routines.NewRecommender` passes the `RecommenderName` to the `clusterState` object. | 
|  | 156 | + | 
|  | 157 | +```golang | 
|  | 158 | +// NewRecommender creates a new recommender instance. | 
|  | 159 | +// Dependencies are created automatically. | 
|  | 160 | +// Deprecated; use RecommenderFactory instead. | 
|  | 161 | +func NewRecommender(config *rest.Config, checkpointsGCInterval time.Duration, useCheckpoints bool, recommenderName string, namespace string) Recommender { | 
|  | 162 | +  clusterState := model.NewClusterState(recommenderName) | 
|  | 163 | +  return RecommenderFactory{ | 
|  | 164 | +     ClusterState:           clusterState, | 
|  | 165 | +     ClusterStateFeeder:     input.NewClusterStateFeeder(config, clusterState, *memorySaver, namespace), | 
|  | 166 | +     CheckpointWriter:       checkpoint.NewCheckpointWriter(clusterState, vpa_clientset.NewForConfigOrDie(config).AutoscalingV1()), | 
|  | 167 | +     VpaClient:              vpa_clientset.NewForConfigOrDie(config).AutoscalingV1(), | 
|  | 168 | +     PodResourceRecommender: logic.CreatePodResourceRecommender(), | 
|  | 169 | +     CheckpointsGCInterval:  checkpointsGCInterval, | 
|  | 170 | +     UseCheckpoints:         useCheckpoints, | 
|  | 171 | +  }.Make() | 
|  | 172 | +} | 
|  | 173 | + | 
|  | 174 | + | 
|  | 175 | +// NewClusterState returns a new ClusterState with no pods. | 
|  | 176 | +func NewClusterState(recommenderName string) *ClusterState { | 
|  | 177 | +  return &ClusterState{ | 
|  | 178 | +     RecommenderName:   recommenderName, | 
|  | 179 | +     Pods:              make(map[PodID]*PodState), | 
|  | 180 | +     Vpas:              make(map[VpaID]*Vpa), | 
|  | 181 | +     EmptyVPAs:         make(map[VpaID]time.Time), | 
|  | 182 | +     aggregateStateMap: make(aggregateContainerStatesMap), | 
|  | 183 | +     labelSetMap:       make(labelSetMap), | 
|  | 184 | +  } | 
|  | 185 | +} | 
|  | 186 | +``` | 
|  | 187 | + | 
|  | 188 | +Therefore, when loading VPA objects, the `clusterStateFeeder` can filter them, keeping only those VPA objects whose | 
|  | 189 | +`recommenderName` is unset or equal to the current clusterState’s `RecommenderName`. | 
|  | 190 | +```golang | 
|  | 191 | +// Fetch VPA objects and load them into the cluster state. | 
|  | 192 | +func (feeder *clusterStateFeeder) LoadVPAs() { | 
|  | 193 | +  // List VPA API objects. | 
|  | 194 | +  allVpaCRDs, err := feeder.vpaLister.List(labels.Everything()) | 
|  | 195 | +  if err != nil { | 
|  | 196 | +     klog.Errorf("Cannot list VPAs. Reason: %+v", err) | 
|  | 197 | +     return | 
|  | 198 | +  } | 
|  | 199 | + | 
|  | 200 | +  var vpaCRDs []*vpa_types.VerticalPodAutoscaler | 
|  | 201 | +  for _, vpaCRD := range allVpaCRDs { | 
|  | 202 | +     currentRecommenderName := feeder.clusterState.RecommenderName | 
|  | 203 | +     if names := vpaCRD.Spec.RecommenderName; len(names) > 0 && names[0] != currentRecommenderName { | 
|  | 204 | +        klog.V(6).Infof("Ignoring the vpaCRD as its recommender name %v is not equal to the current recommender's name %v", names, currentRecommenderName) | 
|  | 205 | +        continue | 
|  | 206 | +     } | 
|  | 207 | +     vpaCRDs = append(vpaCRDs, vpaCRD) | 
|  | 208 | +  } | 
|  | 209 | +  klog.V(3).Infof("Fetched %d VPAs.", len(vpaCRDs)) | 
|  | 210 | +  // Add or update existing VPAs in the model. | 
|  | 211 | +  vpaKeys := make(map[model.VpaID]bool) | 
|  | 212 | + | 
|  | 213 | +  … | 
|  | 214 | + | 
|  | 215 | +  feeder.clusterState.ObservedVpas = vpaCRDs | 
|  | 216 | +} | 
|  | 217 | +``` | 
|  | 218 | +
 | 
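The filtering rule can also be read in isolation. The following is a minimal, self-contained sketch and not the actual VPA code: `vpaSpec`, `selectsRecommender`, and `filterVPAs` are hypothetical names introduced for illustration, and the sketch assumes `recommenderName` is the list-valued field with at most one entry described in this proposal. A VPA with no recommender name is kept by whichever recommender runs the filter, matching the unset case in `LoadVPAs`.

```go
package main

import "fmt"

// vpaSpec is a hypothetical stand-in for VerticalPodAutoscalerSpec,
// reduced to the fields this sketch needs.
type vpaSpec struct {
	Name            string
	RecommenderName []string
}

// selectsRecommender reports whether a VPA with the given recommender
// names should be handled by the recommender called current. An empty
// list is never filtered out.
func selectsRecommender(names []string, current string) bool {
	return len(names) == 0 || names[0] == current
}

// filterVPAs returns the VPAs handled by the recommender called current,
// preserving their input order.
func filterVPAs(all []vpaSpec, current string) []vpaSpec {
	var kept []vpaSpec
	for _, v := range all {
		if selectsRecommender(v.RecommenderName, current) {
			kept = append(kept, v)
		}
	}
	return kept
}

func main() {
	all := []vpaSpec{
		{Name: "unset-vpa"},
		{Name: "default-vpa", RecommenderName: []string{"default"}},
		{Name: "custom-vpa", RecommenderName: []string{"custom"}},
	}
	for _, v := range filterVPAs(all, "default") {
		fmt.Println(v.Name)
	}
}
```

Keeping the predicate separate from the loop makes the selection rule easy to unit test without a running cluster.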
|  | 219 | +Accordingly, a VPA object's yaml selects a recommender through the `recommenderName` field; the following example selects the default recommender. | 
|  | 220 | +```yaml | 
|  | 221 | +apiVersion: "autoscaling.k8s.io/v1" | 
|  | 222 | +kind: VerticalPodAutoscaler | 
|  | 223 | +metadata: | 
|  | 224 | + name: hamster-vpa | 
|  | 225 | +spec: | 
|  | 226 | + recommenderName: ["default"] | 
|  | 227 | + targetRef: | 
|  | 228 | +   apiVersion: "apps/v1" | 
|  | 229 | +   ... ...  | 
|  | 230 | +``` | 
|  | 231 | +
 | 
|  | 232 | +### Deployment Details | 
|  | 233 | +A customized recommender is expected to run as a separate deployment that is selected | 
|  | 234 | +by its own set of VPA objects. Each VPA object chooses only one recommender at a time. | 
|  | 235 | +How the default recommender and a customized recommender run and interact with VPA objects | 
|  | 236 | +is shown in the following drawing. | 
|  | 237 | +
 | 
|  | 238 | +<img src="images/deployment.png" alt="deployment" width="720" height="360"/> | 
|  | 239 | +
 | 
|  | 240 | +Though this proposal does not support a VPA object using multiple recommenders, we leave room for | 
|  | 241 | +supporting them in the future. Namely, we define `recommenderName` to be an array instead of a string, but we support only one element in this proposal. We modify the admission controller to validate that the array has <= 1 elements. | 
|  | 242 | +
 | 
|  | 243 | +We will add the following check in the `func validateVPA(vpa *vpa_types.VerticalPodAutoscaler, isCreate bool)` function.  | 
|  | 244 | +```golang | 
|  | 245 | +	if len(vpa.Spec.RecommenderName) > 1 { | 
|  | 246 | +		return fmt.Errorf("VPA object must not specify more than one recommenderName") | 
|  | 247 | +	} | 
|  | 248 | +``` | 
|  | 249 | +
 | 
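The admission check can be exercised on its own. This is a minimal sketch under stated assumptions: `validateRecommenderName` is a hypothetical helper that takes the list directly, whereas the proposal places the check inside `validateVPA`.

```go
package main

import (
	"errors"
	"fmt"
)

// validateRecommenderName is a hypothetical helper mirroring the check
// proposed for validateVPA: a VPA may name at most one recommender.
func validateRecommenderName(names []string) error {
	if len(names) > 1 {
		return errors.New("VPA object must not specify more than one recommenderName")
	}
	return nil
}

func main() {
	cases := [][]string{
		nil,                   // unset
		{"default"},           // exactly one recommender
		{"default", "custom"}, // rejected: more than one recommender
	}
	for _, names := range cases {
		fmt.Printf("%v -> %v\n", names, validateRecommenderName(names))
	}
}
```

Because an empty list passes the check, existing VPA objects that never set the field remain valid.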
|  | 250 | +
 | 
|  | 251 | +## Design Details | 
|  | 252 | +
 | 
|  | 253 | +### Test Plan | 
|  | 254 | +- Add e2e test demonstrating the default recommender ignores a VPA which specifies an alternate recommender. | 
|  | 255 | +
 | 
|  | 256 | +### Upgrade / Downgrade Strategy | 
|  | 257 | +For cluster upgrades, the VPA from the previous version will continue working as before.  | 
|  | 258 | +There is no change in behavior or flags which have to be enabled or disabled. | 
|  | 259 | +
 | 
|  | 260 | +## Alternatives | 
|  | 261 | +
 | 
|  | 262 | +### Develop a plugin framework for customizable recommenders. | 
|  | 263 | +Add a webhook system for customized recommendations. The default VPA recommender would  | 
|  | 264 | +call any available recommendation webhooks, and if any of them make a recommendation,  | 
|  | 265 | +the recommender would use that recommendation instead of making its own. If none make  | 
|  | 266 | +a recommendation, it would make its recommendation as it currently does. The plugin alternative | 
|  | 267 | +is rejected because it requires many more design and code changes. It might be considered in the future if there are | 
|  | 268 | +more use cases where running multiple recommenders for the same VPA object is needed. | 
|  | 269 | +
 | 
|  | 270 | +### Develop a label selector approach. | 
|  | 271 | +Add a label to the VPA object to denote the recommender’s name. When the existing recommender makes | 
|  | 272 | +recommendations, it loads and updates only the VPA objects with the label | 
|  | 273 | +`recommender=default`. | 
|  | 274 | +A label selector approach is rejected because it is too powerful and users can easily | 
|  | 275 | +overlook those labels and misconfigure the VPA objects. | 
|  | 276 | +
 | 
|  | 277 | +## Out of Scope | 
|  | 278 | +
 | 
|  | 279 | +- Although this proposal will enable alternate recommenders, no alternate recommenders  | 
|  | 280 | +will be created as part of this proposal. | 
|  | 281 | +- This proposal will not support running multiple recommenders for the same VPA object. Each VPA object  | 
|  | 282 | +is supposed to use only one recommender.  | 
|  | 283 | +
 | 
|  | 284 | +## Implementation History | 
|  | 285 | +
 | 
|  | 286 | +<!-- | 
|  | 287 | +Major milestones in the lifecycle of a KEP should be tracked in this section. | 
|  | 288 | +Major milestones might include: | 
|  | 289 | +- the `Summary` and `Motivation` sections being merged, signaling SIG acceptance | 
|  | 290 | +- the `Proposal` section being merged, signaling agreement on a proposed design | 
|  | 291 | +- the date implementation started | 
|  | 292 | +- the first Kubernetes release where an initial version of the KEP was available | 
|  | 293 | +- the version of Kubernetes where the KEP graduated to general availability | 
|  | 294 | +- when the KEP was retired or superseded | 
|  | 295 | +--> | 
|  | 296 | +
 | 
|  | 297 | +
 | 
|  | 298 | +
 | 