Commit 3bfbc18

Merge pull request #3914 from wangchen615/customizable_recommender_kep
Add enhancement proposal for feature request #3913
2 parents 7e8972f + 680a094 commit 3bfbc18

2 files changed: +298, -0 lines changed

# Support Customized Recommenders for Vertical Pod Autoscalers

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
  - [Implementation Details](#implementation-details)
  - [Deployment Details](#deployment-details)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Alternatives](#alternatives)
- [Out of Scope](#out-of-scope)
- [Implementation History](#implementation-history)
<!-- /toc -->

## Summary

Today, VPA recommends CPU/memory requests based on a single recommender,
which predicts future requests from the historical usage observed in a
rolling time window. Since there is no universal recommendation policy that applies to all
types of workloads, this KEP proposes supporting multiple customized recommenders in VPA.
Users can then run different recommenders for different workloads, as workloads may exhibit
very distinct resource usage behaviors.

## Motivation

A VPA is used to recommend the requested resources of containers in pods when the actual CPU/memory usage of a container
differs significantly from the resources requested. Resource-usage-based recommendation
is the basic approach that resizes containers according to the actual usage observed, and it is
implemented in the default VPA recommender. Users can configure the time window and the
percentile of observed past usage that is used as the prediction of future requests/limits for CPU/memory.

However, because containers running different types of workloads may have different resource usage patterns,
there is no universal policy that applies to all. The existing VPA recommender may not accurately
predict future resource usage when containers exhibit certain resource usage behaviors,
such as trending, periodic change, or occasional spikes, resulting in significant
over-provisioning and OOM kills for microservices. Learning the different types of resource usage
behaviors of workloads and applying different algorithms to improve resource utilization
(CPU and memory) predictions can significantly reduce over-provisioning and OOM kills in VPA.

### Goals

- Allow the VPA object to specify a customized recommender to use.
- Allow the VPA object to use the default recommender when no recommender is specified.

### Non-Goals

- We assume no pod uses two recommenders at the same time.
- We do not resolve conflicts between recommenders.

## Proposal

### User Stories

#### Story 1

- Containers with Cyclic Patterns in Resource Usage

Containers used in monitoring may receive load to process only periodically but need to be long-running
to listen for incoming traffic. These containers therefore usually exhibit cyclic patterns, alternating
between usage spikes and idling. Resizing containers according to the usage observed in the previous
time window may always lead to under-provisioning for a short period when a load spike first arrives.
The problem occurs for memory if the cyclic pattern length is longer than 8 days. For CPU, the problem may
be visible, for example, as lower usage on weekends. The problem can even lead to frequent pod evictions
when a pod was resized according to its idling period and the host's resources have since been taken by other pods.

#### Story 2

- Containers with Different but Recurrent Behaviors in Resource Usage

Containers running Spark or deep learning training workloads are known to show recurring and repeating
patterns in resource usage. Prior research has shown that different but recurrent behaviors occur
for different containerized tasks, such as Spark or deep learning training. These common patterns can
be represented by phases, which display similar usage of computational resources over time.
There are common sequences of patterns across different executions of a workload, and they can be used
to proactively predict future resource usage more accurately. The default recommender in the current
VPA adopts a reactive approach, so a more proactive recommender is needed for these types of workloads.

### Implementation Details

The following describes the details of a first-class approach to supporting customized
recommenders. Namely, a dedicated field `recommenderName` is added to the VPA CRD definition in
`deploy/vpa-v1.crd.yaml`. Since the proposal defines `recommenderName` as an array (see
[Deployment Details](#deployment-details)), the schema declares it as an array of strings.

```yaml
validation:
  # openAPIV3Schema is the schema for validating custom objects.
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        required: []
        properties:
          recommenderName:
            type: array
            items:
              type: string
          targetRef:
            type: object
          updatePolicy:
            type: object
```

Correspondingly, the `VerticalPodAutoscalerSpec` in `pkg/apis/autoscaling.k8s.io/v1/types.go`
should be updated to include the `recommenderName` field.

```golang
// VerticalPodAutoscalerSpec is the specification of the behavior of the autoscaler.
type VerticalPodAutoscalerSpec struct {
    // TargetRef points to the controller managing the set of pods for the
    // autoscaler to control - e.g. Deployment, StatefulSet. VerticalPodAutoscaler
    // can be targeted at controller implementing scale subresource (the pod set is
    // retrieved from the controller's ScaleStatus) or some well known controllers
    // (e.g. for DaemonSet the pod set is read from the controller's spec).
    // If VerticalPodAutoscaler cannot use specified target it will report
    // ConfigUnsupported condition.
    // Note that VerticalPodAutoscaler does not require full implementation
    // of scale subresource - it will not use it to modify the replica count.
    // The only thing retrieved is a label selector matching pods grouped by
    // the target resource.
    TargetRef *autoscaling.CrossVersionObjectReference `json:"targetRef" protobuf:"bytes,1,name=targetRef"`

    // Describes the rules on how changes are applied to the pods.
    // If not specified, all fields in the `PodUpdatePolicy` are set to their
    // default values.
    // +optional
    UpdatePolicy *PodUpdatePolicy `json:"updatePolicy,omitempty" protobuf:"bytes,2,opt,name=updatePolicy"`

    // Controls how the autoscaler computes recommended resources.
    // The resource policy may be used to set constraints on the recommendations
    // for individual containers. If not specified, the autoscaler computes recommended
    // resources for all containers in the pod, without additional constraints.
    // +optional
    ResourcePolicy *PodResourcePolicy `json:"resourcePolicy,omitempty" protobuf:"bytes,3,opt,name=resourcePolicy"`

    // Name of the recommender responsible for generating recommendations for this object.
    // Defined as an array to allow multiple recommenders in the future, but at most one
    // element is supported in this proposal. If empty, the default recommender is used.
    // +optional
    RecommenderName []string `json:"recommenderName,omitempty" protobuf:"bytes,4,opt,name=recommenderName"`
}
```
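
For illustration, a VPA object that selects a non-default recommender could then be built in Go roughly as follows. This is a minimal sketch only: the recommender name `my-recommender`, the `hamster` target, the helper name, and the package name are placeholders, and the import paths reflect the usual autoscaler layout rather than anything mandated by this proposal.

```golang
package example // illustrative snippet only

import (
    autoscaling "k8s.io/api/autoscaling/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    vpa_types "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1"
)

// exampleVPA builds a VPA object that is handled by a hypothetical customized
// recommender named "my-recommender" instead of the default recommender.
func exampleVPA() *vpa_types.VerticalPodAutoscaler {
    return &vpa_types.VerticalPodAutoscaler{
        ObjectMeta: metav1.ObjectMeta{Name: "hamster-vpa", Namespace: "default"},
        Spec: vpa_types.VerticalPodAutoscalerSpec{
            TargetRef: &autoscaling.CrossVersionObjectReference{
                APIVersion: "apps/v1",
                Kind:       "Deployment",
                Name:       "hamster",
            },
            // At most one element is allowed; leaving this empty selects the default recommender.
            RecommenderName: []string{"my-recommender"},
        },
    }
}
```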

When creating a recommender object for recommendations, the recommender main routine should initialize itself
with a predefined recommender name, which can be defined as a constant in the `pkg/recommender/main.go` routine:

```golang
const RecommenderName = "default"

recommender := routines.NewRecommender(config, *checkpointsGCInterval, useCheckpoints, RecommenderName, *vpaObjectNamespace)
```

where `routines.NewRecommender` passes the `RecommenderName` on to the `clusterState` object.

```golang
// NewRecommender creates a new recommender instance.
// Dependencies are created automatically.
// Deprecated; use RecommenderFactory instead.
func NewRecommender(config *rest.Config, checkpointsGCInterval time.Duration, useCheckpoints bool, recommenderName string, namespace string) Recommender {
    clusterState := model.NewClusterState(recommenderName)
    return RecommenderFactory{
        ClusterState:           clusterState,
        ClusterStateFeeder:     input.NewClusterStateFeeder(config, clusterState, *memorySaver, namespace),
        CheckpointWriter:       checkpoint.NewCheckpointWriter(clusterState, vpa_clientset.NewForConfigOrDie(config).AutoscalingV1()),
        VpaClient:              vpa_clientset.NewForConfigOrDie(config).AutoscalingV1(),
        PodResourceRecommender: logic.CreatePodResourceRecommender(),
        CheckpointsGCInterval:  checkpointsGCInterval,
        UseCheckpoints:         useCheckpoints,
    }.Make()
}

// NewClusterState returns a new ClusterState with no pods.
func NewClusterState(recommenderName string) *ClusterState {
    return &ClusterState{
        RecommenderName:   recommenderName,
        Pods:              make(map[PodID]*PodState),
        Vpas:              make(map[VpaID]*Vpa),
        EmptyVPAs:         make(map[VpaID]time.Time),
        aggregateStateMap: make(aggregateContainerStatesMap),
        labelSetMap:       make(labelSetMap),
    }
}
```

Therefore, when loading VPA objects into the `clusterStateFeeder`, the feeder can select only the VPA CRDs whose
`recommenderName` matches the current `clusterState`'s `RecommenderName`.

```golang
// Fetch VPA objects and load them into the cluster state.
func (feeder *clusterStateFeeder) LoadVPAs() {
    // List VPA API objects.
    allVpaCRDs, err := feeder.vpaLister.List(labels.Everything())
    if err != nil {
        klog.Errorf("Cannot list VPAs. Reason: %+v", err)
        return
    }

    // Keep only the VPAs whose recommenderName matches this recommender's name or is unset.
    var vpaCRDs []*vpa_types.VerticalPodAutoscaler
    for _, vpaCRD := range allVpaCRDs {
        currentRecommenderName := feeder.clusterState.RecommenderName
        if len(vpaCRD.Spec.RecommenderName) > 0 && vpaCRD.Spec.RecommenderName[0] != currentRecommenderName {
            klog.V(6).Infof("Ignoring vpaCRD because its recommender name %v does not match the current recommender's name %v", vpaCRD.Spec.RecommenderName[0], currentRecommenderName)
            continue
        }
        vpaCRDs = append(vpaCRDs, vpaCRD)
    }

    klog.V(3).Infof("Fetched %d VPAs.", len(vpaCRDs))
    // Add or update existing VPAs in the model.
    vpaKeys := make(map[model.VpaID]bool)

    // ... (existing logic for adding/updating VPAs and garbage-collecting stale ones) ...

    feeder.clusterState.ObservedVpas = vpaCRDs
}
```

Accordingly, the VPA object yaml should include the `recommenderName`; here it is set to the default `RecommenderName`:

```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  recommenderName:
    - default
  targetRef:
    apiVersion: "apps/v1"
    ... ...
```

### Deployment Details

A customized recommender is deployed as a separate Deployment that is chosen
by a different set of VPA objects. Each VPA object is supposed to choose only one recommender at a time.
How the default recommender and customized recommenders run and interact with VPA objects
is shown in the following drawing.

<img src="images/deployment.png" alt="deployment" width="720" height="360"/>
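
To make the deployment model concrete, a customized recommender can reuse the recommender main routine and simply register itself under its own name. The sketch below is illustrative only: it assumes the modified `routines.NewRecommender` signature shown above, and the name `my-recommender`, the hard-coded settings, and the plain run loop are simplifications rather than the actual recommender main loop.

```golang
// main.go of a hypothetical customized recommender, built and deployed as its own
// Deployment. It differs from the default recommender mainly in the name it registers
// under, so its feeder only loads VPA objects whose spec.recommenderName selects it.
package main

import (
    "time"

    "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/routines"
    kube_restclient "k8s.io/client-go/rest"
)

// Placeholder name for the customized recommender; VPA objects select it by
// listing this value in spec.recommenderName.
const RecommenderName = "my-recommender"

func main() {
    config, err := kube_restclient.InClusterConfig()
    if err != nil {
        panic(err)
    }

    // Simplified settings; the real binary reads these from command-line flags.
    checkpointsGCInterval := 10 * time.Minute
    useCheckpoints := true
    namespace := "" // all namespaces

    // The customized prediction logic would be plugged in via RecommenderFactory;
    // this sketch only shows how the recommender name is wired through.
    recommender := routines.NewRecommender(config, checkpointsGCInterval, useCheckpoints, RecommenderName, namespace)
    for {
        recommender.RunOnce()
        time.Sleep(time.Minute)
    }
}
```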

Although we do not support a VPA object using multiple recommenders in this proposal, we leave open the possibility of
supporting multiple recommenders in the future. Namely, we define `recommenderName` to be an array instead of a string, but we support only one element in this proposal. We modify the admission controller to validate that the array has at most one element.

We will add the following check to the `func validateVPA(vpa *vpa_types.VerticalPodAutoscaler, isCreate bool)` function.

```golang
if len(vpa.Spec.RecommenderName) > 1 {
    return fmt.Errorf("VPA object shouldn't specify more than one recommender name")
}
```
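
A unit-level check of this rule might look like the following sketch. It is illustrative only: it assumes `validateVPA` returns an error as in the snippet above, and the package name and import path are hypothetical stand-ins for the admission controller's actual layout.

```golang
package vpa // hypothetical: the package that defines validateVPA

import (
    "testing"

    vpa_types "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1"
)

func TestValidateVPARejectsMultipleRecommenderNames(t *testing.T) {
    vpa := &vpa_types.VerticalPodAutoscaler{
        Spec: vpa_types.VerticalPodAutoscalerSpec{
            // Two recommender names should be rejected by the new check.
            RecommenderName: []string{"default", "my-recommender"},
        },
    }
    if err := validateVPA(vpa, false); err == nil {
        t.Error("expected an error for a VPA that specifies more than one recommender name, got nil")
    }
}
```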

## Design Details

### Test Plan

- Add an e2e test demonstrating that the default recommender ignores a VPA which specifies an alternate recommender; a simplified sketch of the behavior to verify follows.
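
The core behavior the test needs to establish is that a recommender named `default` drops a VPA whose `recommenderName` points elsewhere while still loading VPAs that specify no recommender. Below is a minimal, unit-level sketch of that check; `isHandledByRecommender` is a hypothetical helper that mirrors the filtering condition in `LoadVPAs` above, and the actual e2e test would instead create real VPA objects and observe which ones receive recommendations.

```golang
package model_test // hypothetical test package; not part of the proposal itself

import "testing"

// isHandledByRecommender mirrors the filtering condition used in LoadVPAs:
// a VPA is kept when it names the current recommender or names no recommender at all.
func isHandledByRecommender(vpaRecommenderName []string, currentRecommender string) bool {
    return len(vpaRecommenderName) == 0 || vpaRecommenderName[0] == currentRecommender
}

func TestDefaultRecommenderIgnoresAlternateRecommender(t *testing.T) {
    cases := []struct {
        name            string
        recommenderName []string
        want            bool
    }{
        {"no recommender specified falls back to default", nil, true},
        {"explicitly selects the default recommender", []string{"default"}, true},
        {"selects an alternate recommender, so default ignores it", []string{"my-recommender"}, false},
    }
    for _, c := range cases {
        if got := isHandledByRecommender(c.recommenderName, "default"); got != c.want {
            t.Errorf("%s: got %v, want %v", c.name, got, c.want)
        }
    }
}
```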

### Upgrade / Downgrade Strategy

For cluster upgrades, the VPA from the previous version will continue working as before.
There is no change in behavior, and no flags have to be enabled or disabled.

## Alternatives

### Develop a plugin framework for customizable recommenders.

Add a webhook system for customized recommendations. The default VPA recommender would
call any available recommendation webhooks, and if any of them make a recommendation,
the recommender would use that recommendation instead of making its own. If none make
a recommendation, it would make its own recommendation as it currently does. The plugin alternative
is rejected because it involves far more design and code changes. It might be considered in the future if there are
more use cases where running multiple recommenders for the same VPA object is needed.

### Develop a label selector approach.

Add a label to the CRD object to denote the recommender's name. When making
recommendations in the recommender, only the VpaCrds with the label
`recommender=default` would be loaded and updated by the existing recommender.
The label selector approach is rejected because labels are too easy to get wrong: users can
easily omit or mislabel them and misconfigure the VPA objects.

## Out of Scope

- Although this proposal will enable alternate recommenders, no alternate recommenders
will be created as part of this proposal.
- This proposal will not support running multiple recommenders for the same VPA object. Each VPA object
is supposed to use only one recommender.

## Implementation History

<!--
Major milestones in the lifecycle of a KEP should be tracked in this section.
Major milestones might include:
- the `Summary` and `Motivation` sections being merged, signaling SIG acceptance
- the `Proposal` section being merged, signaling agreement on a proposed design
- the date implementation started
- the first Kubernetes release where an initial version of the KEP was available
- the version of Kubernetes where the KEP graduated to general availability
- when the KEP was retired or superseded
-->