From f58c046fa1bc6f48c1865d3d306990f36d86b0bd Mon Sep 17 00:00:00 2001 From: kerthcet Date: Thu, 11 May 2023 11:28:12 +0800 Subject: [PATCH 1/5] Adapt to the latest template Signed-off-by: kerthcet --- .../README.md | 441 +++++++++++++++--- 1 file changed, 365 insertions(+), 76 deletions(-) diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md index fbbef782646..7183961cfe4 100644 --- a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md +++ b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md @@ -1,4 +1,97 @@ -# Implement maxUnavailable in StatefulSet +# KEP-961: Implement maxUnavailable in StatefulSet + + + + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories (Optional)](#user-stories-optional) + - [Story 1](#story-1) + - [Story 2](#story-2) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+ +- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [x] (R) KEP approvers have approved the KEP status as `implementable` +- [x] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [x] "Implementation History" section is up-to-date for milestone +- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website ## Table of Contents @@ -32,56 +125,153 @@ ## Summary -The purpose of this enhancement is to implement maxUnavailable for StatefulSet during RollingUpdate. -When a StatefulSet’s `.spec.updateStrategy.type` is set to `RollingUpdate`, the StatefulSet controller -will delete and recreate each Pod in the StatefulSet. The updating of each Pod currently happens one at a time. With support for `maxUnavailable`, the updating will proceed `maxUnavailable` number of pods at a time. + + +The purpose of this enhancement is to implement maxUnavailable for StatefulSet during RollingUpdate. +When a StatefulSet’s `.spec.updateStrategy.type` is set to `RollingUpdate`, the StatefulSet controller +will delete and recreate each Pod in the StatefulSet. The updating of each Pod currently happens one at a time. +With support for `maxUnavailable`, the updating will proceed `maxUnavailable` number of pods at a time. ## Motivation + + Consider the following scenarios:- -1. My containers publish metrics to a time series system. If I am using a Deployment, each rolling -update creates a new pod name and hence the metrics published by this new pod starts a new time series -which makes tracking metrics for the application difficult. While this could be mitigated, it requires -some tricks on the time series collection side. It would be so much better, If we could use a +1. My containers publish metrics to a time series system. If I am using a Deployment, each rolling +update creates a new pod name and hence the metrics published by this new pod starts a new time series +which makes tracking metrics for the application difficult. While this could be mitigated, it requires +some tricks on the time series collection side. It would be so much better, If we could use a StatefulSet object so my object names doesnt change and hence all metrics goes to a single time series. 
This will be easier if StatefulSet is at feature parity with Deployments. -2. My Container does some initial startup tasks like loading up cache or something that takes a lot of -time. If we used StatefulSet, we can only go one pod at a time which would result in a slow rolling -update. If StatefulSet supported maxUnavailable with value greater than 1, it would allow for a faster +2. My Container does some initial startup tasks like loading up cache or something that takes a lot of +time. If we used StatefulSet, we can only go one pod at a time which would result in a slow rolling +update. If StatefulSet supported maxUnavailable with value greater than 1, it would allow for a faster rollout since a total of maxUnavailable number of pods could be loading up the cache at the same time. -3. My Stateful clustered application, has followers and leaders, with followers being many more than 1. My application can tolerate many followers going down at the same time. I want to be able to do faster -rollouts by bringing down 2 or more followers at the same time. This is only possible if StatefulSet +3. My Stateful clustered application, has followers and leaders, with followers being many more than 1. My application can tolerate many followers going down at the same time. I want to be able to do faster +rollouts by bringing down 2 or more followers at the same time. This is only possible if StatefulSet supports maxUnavailable in Rolling Updates. -4. Sometimes I just want easier tracking of revisions of a rolling update. Deployment does it through -ReplicaSets and has its own nuances. Understanding that requires diving into the complicacy of hashing -and how ReplicaSets are named. Over and above that, there are some issues with hash collisions which -further complicate the situation(I know they were solved). StatefulSet introduced ControllerRevisions -in 1.7 which are much easier to think and reason about. They are used by DaemonSet and StatefulSet for -tracking revisions. It would be so much nicer if all the use cases of Deployments can be met in -StatefulSet's and additionally we could track the revisions by ControllerRevisions. Another way of +4. Sometimes I just want easier tracking of revisions of a rolling update. Deployment does it through +ReplicaSets and has its own nuances. Understanding that requires diving into the complicacy of hashing +and how ReplicaSets are named. Over and above that, there are some issues with hash collisions which +further complicate the situation(I know they were solved). StatefulSet introduced ControllerRevisions +in 1.7 which are much easier to think and reason about. They are used by DaemonSet and StatefulSet for +tracking revisions. It would be so much nicer if all the use cases of Deployments can be met in +StatefulSet's and additionally we could track the revisions by ControllerRevisions. Another way of saying this is, all my Deployment use cases are easily met by StatefulSet, and additionally I can enjoy easier revision tracking only if StatefulSet supported `maxUnavailable`. -With this feature in place, when using StatefulSet with maxUnavailable >1, the user is making a -conscious choice that more than one pod going down at the same time during rolling update, would not -cause issues with their Stateful Application which have per pod state and identity. 
Other Stateful +With this feature in place, when using StatefulSet with maxUnavailable >1, the user is making a +conscious choice that more than one pod going down at the same time during rolling update, would not +cause issues with their Stateful Application which have per pod state and identity. Other Stateful Applications which cannot tolerate more than one pod going down, will resort to the current behavior of one pod at a time Rolling Updates. ### Goals -StatefulSet RollingUpdate strategy will contain an additional parameter called `maxUnavailable` to + + + +StatefulSet RollingUpdate strategy will contain an additional parameter called `maxUnavailable` to control how many Pods will be brought down at a time, during Rolling Update. ### Non-Goals -NA + + + +N/A ## Proposal + + ### User Stories + + #### Story 1 -As a User of Kubernetes, I should be able to update my StatefulSet, more than one Pod at a time, in a -RollingUpdate manner, if my Stateful app can tolerate more than one pod being down, thus allowing my -update to finish much faster. +As a User of Kubernetes, I should be able to update my StatefulSet, more than one Pod at a time, in a +RollingUpdate manner, if my Stateful app can tolerate more than one pod being down, thus allowing my +update to finish much faster. + +### Notes/Constraints/Caveats (Optional) + + + +No. + +### Risks and Mitigations + + + +We are proposing a new field called `maxUnavailable` whose default value will be 1. In this mode, StatefulSet will behave exactly like its current behavior. +Its possible we introduce a bug in the implementation. The mitigation currently is that is disabled by default in Alpha phase for people to try out and give +feedback. +In Beta phase when its enabled by default, people will only see issues or bugs when `maxUnavailable` is set to something greater than 1. Since people have +tried this feature in Alpha, we would have time to fix issues. + +## Design Details + + ### Implementation Details @@ -106,62 +296,62 @@ type RollingUpdateStatefulSetStrategy struct { // Defaults to 1. // +optional MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty" protobuf:"bytes,2,opt,name=maxUnavailable"` - + ... } ``` -- By Default, if maxUnavailable is not specified, its value will be assumed to be 1 and StatefulSets +- By Default, if maxUnavailable is not specified, its value will be assumed to be 1 and StatefulSets will follow their old behavior. This will also help while upgrading from a release which doesnt support maxUnavailable to a release which supports this field. - If maxUnavailable is specified, it cannot be greater than total number of replicas. - If maxUnavailable is specified and partition is also specified, MaxUnavailable cannot be greater than `replicas-partition` -- If a partition is specified, maxUnavailable will only apply to all the pods which are staged by the -partition. Which means all Pods with an ordinal that is greater than or equal to the partition will be +- If a partition is specified, maxUnavailable will only apply to all the pods which are staged by the +partition. Which means all Pods with an ordinal that is greater than or equal to the partition will be updated when the StatefulSet’s .spec.template is updated. Lets say total replicas is 5 and partition is set to 2 and maxUnavailable is set to 2. If the image is changed in this scenario, following are the possible behavior choices we have:- - 1. Pods with ordinal 4 and 3 will start Terminating at the same time(because of maxUnavailable). 
Once they are both running and ready, pods with ordinal 2 will start Terminating. Pods with ordinal 0 and 1 -will remain untouched due the partition. In this choice, the number of pods terminating is not always -maxUnavailable, but sometimes less than that. For e.g. if pod with ordinal 3 is running and ready but 4 is not, we still wait for 4 to be running and ready before moving on to 2. This implementation avoids + 1. Pods with ordinal 4 and 3 will start Terminating at the same time(because of maxUnavailable). Once they are both running and ready, pods with ordinal 2 will start Terminating. Pods with ordinal 0 and 1 +will remain untouched due the partition. In this choice, the number of pods terminating is not always +maxUnavailable, but sometimes less than that. For e.g. if pod with ordinal 3 is running and ready but 4 is not, we still wait for 4 to be running and ready before moving on to 2. This implementation avoids out of order Terminations of pods. - 2. Pods with ordinal 4 and 3 will start Terminating at the same time(because of maxUnavailable). When any of 4 or 3 are running and ready, pods with ordinal 2 will start Terminating. This could violate -ordering guarantees, since if 3 is running and ready, then both 4 and 2 are terminating at the same + 2. Pods with ordinal 4 and 3 will start Terminating at the same time(because of maxUnavailable). When any of 4 or 3 are running and ready, pods with ordinal 2 will start Terminating. This could violate +ordering guarantees, since if 3 is running and ready, then both 4 and 2 are terminating at the same time out of order. If 4 is running and ready, then both 3 and 2 are Terminating at the same time and no ordering guarantees are violated. This implementation, guarantees, that always there are maxUnavailable number of Pods Terminating except the last batch. - 3. Pod with ordinal 4 and 3 will start Terminating at the same time(because of maxUnavailable). When 4 is running and ready, 2 will start Terminating. At this time both 2 and 3 are terminating. If 3 is -running and ready before 4, 2 wont start Terminating to preserve ordering semantics. So at this time, + 3. Pod with ordinal 4 and 3 will start Terminating at the same time(because of maxUnavailable). When 4 is running and ready, 2 will start Terminating. At this time both 2 and 3 are terminating. If 3 is +running and ready before 4, 2 wont start Terminating to preserve ordering semantics. So at this time, only 1 is unavailable although we requested 2. 4. Introduce a field in Rolling Update, which decides whether we want maxUnavailable with ordering or without ordering guarantees. Depending on what the user wants, this Choice can either choose behavior 1 or 3 if ordering guarantees are needed or choose behavior 2 if they dont care. To simplify this further PodManagementPolicy today supports `OrderedReady` or `Parallel`. The `Parallel` mode only supports scale up and tear down of StatefulSets and currently doesnt apply to Rolling Updates. So instead of coming up with a new field, we could use the PodManagementPolicy to choose the behavior the User wants. 1. PMP=Parallel will now apply to RollingUpdate. This will choose behavior described in 2 above. - This means always maxUnavailable number of Pods are terminating at the same time except in + This means always maxUnavailable number of Pods are terminating at the same time except in the last case and no ordering guarantees are provided. - 2. PMP=OrderedReady with maxUnavailable can choose one of behavior 1 or 3. + 2. 
PMP=OrderedReady with maxUnavailable can choose one of behavior 1 or 3. -NOTE: The goal is faster updates of an application. In some cases , people would need both ordering -and faster updates. In other cases they just need faster updates and they dont care about ordering as +NOTE: The goal is faster updates of an application. In some cases , people would need both ordering +and faster updates. In other cases they just need faster updates and they dont care about ordering as long as they get identity. -Choice 1 is simpler to reason about. It does not always have maxUnavailable number of Pods in -Terminating state. It does not guarantee ordering within the batch of maxUnavailable Pods. The maximum +Choice 1 is simpler to reason about. It does not always have maxUnavailable number of Pods in +Terminating state. It does not guarantee ordering within the batch of maxUnavailable Pods. The maximum difference in the ordinals which are Terminating out of Order, cannot be more than maxUnavailable. -Choice 2 always offers maxUnavailable number of Pods in Terminating state. This can sometime lead to +Choice 2 always offers maxUnavailable number of Pods in Terminating state. This can sometime lead to pods terminating out of order. This will always lead to the fastest rollouts. The maximum difference in the ordinals which are Terminating out of Order, can be more than maxUnavailable. -Choice 3 always guarantees than no two pods are ever Terminating out of order. It sometimes does that, -at the cost of not being able to Terminate maxUnavailable pods. The implementationg for this might be +Choice 3 always guarantees than no two pods are ever Terminating out of order. It sometimes does that, +at the cost of not being able to Terminate maxUnavailable pods. The implementationg for this might be complicated. -Choice 4 provides a choice to the users and hence takes the guessing out of the picture on what they +Choice 4 provides a choice to the users and hence takes the guessing out of the picture on what they will expect. Implementing Choice 4 using PMP would be the easiest. #### Implementation The alpha release we are going with Choice 4 with support for both PMP=Parallel and PMP=OrderedReady. For PMP=Parallel, we will use Choice 2 -For PMP=OrderedReady, we will use Choice 3 to ensure we can support ordering guarantees while also +For PMP=OrderedReady, we will use Choice 3 to ensure we can support ordering guarantees while also making sure the rolling updates are fast. @@ -202,7 +392,7 @@ https://github.com/kubernetes/kubernetes/blob/v1.13.0/pkg/controller/statefulset // wait for unhealthy Pods on update if !isHealthy(replicas[target]) { - // If this Pod is unhealthy regardless of revision, count it in + // If this Pod is unhealthy regardless of revision, count it in // unavailable pods unavailablePods = append(unavailablePods, replicas[target].Name) klog.V(4).Infof( @@ -211,7 +401,7 @@ https://github.com/kubernetes/kubernetes/blob/v1.13.0/pkg/controller/statefulset set.Name, replicas[target].Name) } - + // NEW CODE HERE // If at anytime, total number of unavailable Pods exceeds maxUnavailable, // we stop deleting more Pods for Update @@ -229,29 +419,33 @@ https://github.com/kubernetes/kubernetes/blob/v1.13.0/pkg/controller/statefulset ... ``` -### Risks and Mitigations -We are proposing a new field called `maxUnavailable` whose default value will be 1. In this mode, StatefulSet will behave exactly like its current behavior. -Its possible we introduce a bug in the implementation. 
The mitigation currently is that is disabled by default in Alpha phase for people to try out and give -feedback. -In Beta phase when its enabled by default, people will only see issues or bugs when `maxUnavailable` is set to something greater than 1. Since people have -tried this feature in Alpha, we would have time to fix issues. +### Test Plan + -Downgrades +[x] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. -When downgrading from a release with this feature, to a release without maxUnavailable, there are two cases - - If maxUnavailable is greater than 1, there are two more cases:- - - If you're rolling back to a release that doesn't have this field - then there is even no way to discover it - - If you're just disabling the feature (either together with downgrade to a release that has a field or without downgrade),the field should remain set - (unless someone will explicitly delete it later), but controller should ignore its behavior (and there shouldn't be a way to set it if the feature gate - is switched off). - - If maxUnavailable is less than equal to 1 -- in this case user wont see any difference in behavior +##### Prerequisite testing updates + + -### Tests +No. + +#### Tests - maxUnavailable =1, Same behavior as today with PodManagementPolicy as `OrderedReady` or `Parallel` - Each of these Tests can be run in PodManagementPolicy = `OrderedReady` or `Parallel` and the Update @@ -263,17 +457,107 @@ When downgrading from a release with this feature, to a release without maxUnava - maxUnavailable greater than 1 with partition and staged pods greater than maxUnavailable - maxUnavailable greater than 1 with partition and maxUnavailable greater than replicas -## Test Plan +#### Test Plan For `Alpha`, unit tests and e2e tests will be added to test functionality at both with feature flag enabled and disabled. Defaults will be verified so that users who donot set this flag are not surprised at all. - ## Graduation Criteria + + - Alpha: Initial support for maxUnavailable in StatefulSets added. Disabled by default with default value of 1. -- Beta: Enabled by default with default value of 1 with upgrade downgrade testedd at least manually. +- Beta: Enabled by default with default value of 1 with upgrade downgrade tested at least manually. +### Upgrade / Downgrade Strategy + + + +We will default to 1 for maxUnavailable field in StatefulSet for backward compatibility + +Downgrades + +When downgrading from a release with this feature, to a release without maxUnavailable, there are two cases +- If maxUnavailable is greater than 1, there are two more cases:- + - If you're rolling back to a release that doesn't have this field - then there is even no way to discover it + - If you're just disabling the feature (either together with downgrade to a release that has a field or without downgrade),the field should remain set + (unless someone will explicitly delete it later), but controller should ignore its behavior (and there shouldn't be a way to set it if the feature gate + is switched off). + - If maxUnavailable is less than equal to 1 -- in this case user wont see any difference in behavior --> + +### Version Skew Strategy + +No. ## Production Readiness Review Questionnaire @@ -296,18 +580,18 @@ revert to the old behavior where rolling update will proceed one pod at a time. 
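For illustration, a sketch of how the gate would be turned off (the `MaxUnavailableStatefulSet` gate name and the two components come from this KEP's `kep.yaml`; the exact invocation depends on how the control plane is deployed):

```shell
# Turn the gate off on both components that act on the new field.
kube-apiserver --feature-gates=MaxUnavailableStatefulSet=false ...
kube-controller-manager --feature-gates=MaxUnavailableStatefulSet=false ...
```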
###### What happens if we reenable the feature if it was previously rolled back? -We will restore the desired behavior for StatefulSets for which the maxunavailable field wasn't deleted after +We will restore the desired behavior for StatefulSets for which the maxunavailable field wasn't deleted after the feature gate was disabled. ###### Are there any tests for feature enablement/disablement? -yes, there are unit tests which make sure the field is correctly dropped +yes, there are unit tests which make sure the field is correctly dropped on feature enable and disabled ### Rollout, Upgrade and Rollback Planning ###### How can a rollout or rollback fail? Can it impact already running workloads? -A rollout or rollback of this feature can fail if there is a bug which causes the kube-apiserver or +A rollout or rollback of this feature can fail if there is a bug which causes the kube-apiserver or the kube-controller-manager to start crashing when the feature flag is enabled. @@ -375,6 +659,8 @@ No ###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? The controller-manager will see very negligible and almost un-notoceable increase in cpu usage. +###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)? + ### Troubleshooting ###### How does this feature react if the API server and/or etcd is unavailable? @@ -384,6 +670,8 @@ hence this feature will also be not be able to be used. ###### What are other known failure modes? NA +###### What steps should be taken if SLOs are not being met to determine the problem? + ## Implementation History - KEP Started on 1/1/2019 @@ -398,6 +686,7 @@ NA - Users who need StatefulSets stable identity and are ok with getting a slow rolling update will continue to use StatefulSets. Users who are not ok with a slow rolling update, will continue to use Deployments with workarounds for the scenarios mentioned in the Motivations section. -- Another alternative would be to use OnDelete and deploy your own Custom Controller on top of StatefulSet Pods. There you can implement +- Another alternative would be to use OnDelete and deploy your own Custom Controller on top of StatefulSet Pods. There you can implement your own logic for deleting more than one pods in a specific order. This requires more work on the user but give them ultimate flexibility. 
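As a sketch of this alternative (an illustrative snippet, not a prescribed setup), the StatefulSet would opt out of controller-driven updates entirely, leaving pod deletion to the custom controller:

```yaml
spec:
  updateStrategy:
    # With OnDelete, the StatefulSet controller does not delete pods when
    # .spec.template changes; the custom controller decides how many pods
    # to delete at once and in which order.
    type: OnDelete
```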
+## Infrastructure Needed (Optional) \ No newline at end of file From 21a765b0090bfcaf70ab21107f8834952d62ec93 Mon Sep 17 00:00:00 2001 From: kerthcet Date: Thu, 11 May 2023 15:32:10 +0800 Subject: [PATCH 2/5] Bump to beta Signed-off-by: kerthcet --- keps/prod-readiness/sig-apps/961.yaml | 2 + .../README.md | 84 ++++++++++++++----- .../kep.yaml | 45 ++++++---- 3 files changed, 96 insertions(+), 35 deletions(-) diff --git a/keps/prod-readiness/sig-apps/961.yaml b/keps/prod-readiness/sig-apps/961.yaml index de2d414190d..3531e9cdaf0 100644 --- a/keps/prod-readiness/sig-apps/961.yaml +++ b/keps/prod-readiness/sig-apps/961.yaml @@ -1,3 +1,5 @@ kep-number: 961 alpha: approver: "@wojtek-t" +beta: + approver: "@wojtek-t" diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md index 7183961cfe4..13fdeb49e9a 100644 --- a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md +++ b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md @@ -1,6 +1,5 @@ # KEP-961: Implement maxUnavailable in StatefulSet - - [Release Signoff Checklist](#release-signoff-checklist) +- [Table of Contents](#table-of-contents) - [Summary](#summary) - [Motivation](#motivation) - [Goals](#goals) - [Non-Goals](#non-goals) - [Proposal](#proposal) - - [User Stories (Optional)](#user-stories-optional) + - [User Stories](#user-stories) - [Story 1](#story-1) - - [Story 2](#story-2) - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) - [Risks and Mitigations](#risks-and-mitigations) - [Design Details](#design-details) + - [Implementation Details](#implementation-details) + - [API Changes](#api-changes) + - [Implementation](#implementation) - [Test Plan](#test-plan) - [Prerequisite testing updates](#prerequisite-testing-updates) - - [Unit tests](#unit-tests) - - [Integration tests](#integration-tests) - - [e2e tests](#e2e-tests) - - [Graduation Criteria](#graduation-criteria) + - [Tests](#tests) + - [Test Plan](#test-plan-1) +- [Graduation Criteria](#graduation-criteria) - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) - [Version Skew Strategy](#version-skew-strategy) - [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) @@ -206,7 +207,7 @@ What is out of scope for this KEP? Listing non-goals helps to focus discussion and make progress. --> -N/A +None. ## Proposal @@ -458,9 +459,12 @@ No. - maxUnavailable greater than 1 with partition and maxUnavailable greater than replicas #### Test Plan -For `Alpha`, unit tests and e2e tests will be added to test functionality at both + +For `Alpha`, unit tests and integration tests will be added to test functionality at both with feature flag enabled and disabled. Defaults will be verified so that users -who donot set this flag are not surprised at all. +who do not set this flag are not surprised at all. + +For `Beta`, add e2e tests. ## Graduation Criteria @@ -604,11 +608,16 @@ maxUnavailable to a number greater than 1, but the invariants and the logic wil maxUnavailable pods with the same identity and never more than maxUnavailable being deleted. ###### What specific metrics should inform a rollback? -TODO when we reach Beta + +When feature enabled but rolling update in a unexpected phenomenon like the update pods at a time is not equal to the +`maxUnavailable` value or rolling update in a unexpected order. 
+
+Or we can refer to the `rolling-update-duration` metric for observation: if it didn't decrease after setting `maxUnavailable`
+greater than 1, or the duration increased abnormally, then we should roll back.

 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

-Will be tested when graduating to Beta.
+No, but it will be tested manually before merging the PR.

 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
 No

@@ -623,24 +632,36 @@ The below command should show maxUnavailable value:
 kubectl get statefulsets -o yaml | grep maxUnavailable
 ```

+Or refer to the new metric `rolling-update-duration`; it should exist.
+
 ###### How can someone using this feature know that it is working for their instance?
-TODO when we reach Beta
+
+With the feature enabled, set `maxUnavailable` greater than 1 and pay attention to how many pods are updated at a time;
+it should equal the `maxUnavailable` value.
+Or, when setting `maxUnavailable` greater than 1, the `rolling-update-duration` should decrease.

 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

+I think it has little relevance to SLOs, though a rolling update proceeding at a very low speed impacts the running services.
+
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

+None.
+
 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?

+None.
+
 ### Dependencies

 ###### Does this feature depend on any specific services running in the cluster?
-NA
+No.

 ### Scalability

 ###### Will enabling / using this feature result in any new API calls?
-It doesnt make any extra API calls.
+
+It doesn't make any extra API calls.

 ###### Will enabling / using this feature result in introducing new API types?
 No

@@ -648,7 +669,6 @@ No
 ###### Will enabling / using this feature result in any new calls to the cloud provider?
 No
-
 ###### Will enabling / using this feature result in increasing size or count of the existing API objects?
 A struct gets added to every StatefulSet object which has three fields, one 32-bit integer and two fields of type string.
 The struct in question is IntOrString.
@@ -661,6 +681,8 @@ The controller-manager will see very negligible and almost un-notoceable increas
 ###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

+No.
+
 ### Troubleshooting

 ###### How does this feature react if the API server and/or etcd is unavailable?
@@ -668,7 +690,28 @@ The RollingUpdate will fail or will not be able to proceed if etcd or apiserver
 hence this feature will also not be able to be used.

 ###### What are other known failure modes?
-NA
+
+<!--
+For each of them, fill in the following information by copying the below template:
+- [Failure mode brief description]
+  - Detection: How can it be detected via metrics? Stated another way:
+    how can an operator troubleshoot without logging into a master or worker node?
+  - Mitigations: What can be done to stop the bleeding, especially for already
+    running user workloads?
+  - Diagnostics: What are the useful log messages and their required logging
+    levels that could help debug the issue?
+    Not required until feature graduated to beta.
+  - Testing: Are there any tests for failure mode? If not, describe why.
+-->
+
+In a multi-master setup, when the cluster has skewed kube-controller-manager, the behaviors may differ.
+
+- [Failure mode brief description]
+  - Detection: the `rolling-update-duration` didn't decrease after setting `maxUnavailable` greater than 1, or it increased abnormally.
+  - Mitigations: Disable the feature.
+  - Diagnostics: Set the logger level greater than 4.
+  - Testing: No tests, because the rolling update duration is hard to measure; it can be impacted by many things,
+    like the master performance.

 ###### What steps should be taken if SLOs are not being met to determine the problem?
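As a first pass (a sketch based on the mitigations listed above), confirm what the controller is being asked to do, then raise verbosity and, if needed, disable the feature:

```shell
# Check the configured value on the affected StatefulSets.
kubectl get statefulsets -o yaml | grep maxUnavailable

# Then raise the kube-controller-manager log level above 4 to trace the
# StatefulSet controller's update decisions, and disable the feature gate
# if the rollout still does not match the configured maxUnavailable.
```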
@@ -676,10 +719,11 @@ NA - KEP Started on 1/1/2019 - Implementation PR and UT by 8/30 +- Bump to beta at 2023-05-11 ## Drawbacks -NA +Slow rolling update. ## Alternatives @@ -689,4 +733,6 @@ section. - Another alternative would be to use OnDelete and deploy your own Custom Controller on top of StatefulSet Pods. There you can implement your own logic for deleting more than one pods in a specific order. This requires more work on the user but give them ultimate flexibility. -## Infrastructure Needed (Optional) \ No newline at end of file +## Infrastructure Needed (Optional) + +No. diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml b/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml index d0acec5ea22..5ea9291dcc8 100644 --- a/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml +++ b/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml @@ -2,30 +2,43 @@ title: Implement maxUnavailable for StatefulSets kep-number: 961 authors: - "@krmayankk" + - "@kerthcet" owning-sig: sig-apps -participating-sigs: - - sig-apps +participating-sigs: [] +status: implementable +creation-date: 2018-12-29 reviewers: - "@janetkuo" - "@kow3ns" approvers: - "@janetkuo" - "@kow3ns" -editor: TBD -creation-date: 2018-12-29 -last-updated: 2022-01-01 -status: implementable -see-also: - - n/a -replaces: - - n/a -superseded-by: - - n/a + +see-also: [] +replaces: [] + +# The target maturity stage in the current dev cycle for this KEP. +stage: beta + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.28" # The milestone at which this feature was, or is targeted to be, at each stage. milestone: alpha: "v1.24" - beta: "v1.25" - stable: "v1.26" -latest-milestone: "v1.24" -stage: "alpha" + beta: "v1.28" + stable: TBD + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: MaxUnavailableStatefulSet + components: + - kube-controller-manager + - kube-apiserver + +# The following PRR answers are required at beta release +metrics: + - rolling-update-duration From 301e2d81b98dcb3d39667a16e5b4afc200e9b08a Mon Sep 17 00:00:00 2001 From: kerthcet Date: Tue, 16 May 2023 17:17:22 +0800 Subject: [PATCH 3/5] address comments Signed-off-by: kerthcet --- keps/sig-apps/961-maxunavailable-for-statefulset/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md index 13fdeb49e9a..4748c7772ad 100644 --- a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md +++ b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md @@ -598,7 +598,6 @@ on feature enable and disabled A rollout or rollback of this feature can fail if there is a bug which causes the kube-apiserver or the kube-controller-manager to start crashing when the feature flag is enabled. - Yes, it can impact already running workloads. If a rolling update is in progress for a StatefulSet, while this feature is being enabled in kube-apiserver @@ -610,7 +609,8 @@ maxUnavailable pods with the same identity and never more than maxUnavailable be ###### What specific metrics should inform a rollback? 
When feature enabled but rolling update in a unexpected phenomenon like the update pods at a time is not equal to the
-`maxUnavailable` value or rolling update in a unexpected order.
+`maxUnavailable` value (other factors may influence this, but it should hold most of the time) or
+the rolling update proceeds in an unexpected order.

 Or we can refer to the `rolling-update-duration` metric for observation: if it didn't decrease after setting `maxUnavailable`
 greater than 1, or the duration increased abnormally, then we should roll back.

From ddd8c0cd636eef2dadb5e7a9417b69c9106c125c Mon Sep 17 00:00:00 2001
From: kerthcet
Date: Wed, 31 May 2023 19:50:21 +0800
Subject: [PATCH 4/5] Address comments

Signed-off-by: kerthcet
---
 .../sig-apps/3335-statefulset-slice/README.md |   6 +-
 .../README.md | 208 ++++++++++++++----
 .../kep.yaml |   2 +
 3 files changed, 171 insertions(+), 45 deletions(-)

diff --git a/keps/sig-apps/3335-statefulset-slice/README.md b/keps/sig-apps/3335-statefulset-slice/README.md
index 490a3277c9e..eb39a0c4ec9 100644
--- a/keps/sig-apps/3335-statefulset-slice/README.md
+++ b/keps/sig-apps/3335-statefulset-slice/README.md
@@ -139,10 +139,10 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 - [X] (R) Design details are appropriately documented
 - [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
   - [ ] e2e Tests for all Beta API Operations (endpoints)
   - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
   - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
 - [X] (R) Graduation criteria is in place
   - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
 - [ ] (R) Production readiness review completed
 - [ ] (R) Production readiness review approved
 - [X] "Implementation History" section is up-to-date for milestone
@@ -390,7 +390,7 @@ In the main control loop, StatefulSet will attempt to create pod replicas `[k, N
 * When scaling down pods: If ordinal `j` exists but is not in range `[k, N+k-1]`, pod `j` will be terminated.
 
 **RollingUpdate Partition Changes**
Since `ordinals.start` changes the offset of the replica ordinals, this affects the `partition` field used for RollingUpdate. As `partition` specifies an ordinal index, the `partition` field must be in the range `[k, N+k-1]` to be valid.
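For instance (an illustrative manifest, not part of this change), with `replicas: 3` and `ordinals.start: 5` the pods are numbered 5 through 7, so a valid `partition` must fall within `[5, 7]`:

```yaml
spec:
  replicas: 3
  ordinals:
    start: 5        # pods receive ordinals 5, 6 and 7
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 6  # valid: within [5, 7]; pods 6 and 7 update, pod 5 does not
```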
### Test Plan diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md index 4748c7772ad..a766d05671c 100644 --- a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md +++ b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md @@ -34,10 +34,15 @@ tags, and then generate with `hack/update-toc.sh`. - [Implementation](#implementation) - [Test Plan](#test-plan) - [Prerequisite testing updates](#prerequisite-testing-updates) - - [Tests](#tests) - - [Test Plan](#test-plan-1) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) - [Graduation Criteria](#graduation-criteria) + - [Alpha](#alpha) + - [Beta](#beta) - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Upgrade](#upgrade) + - [Downgrade](#downgrade) - [Version Skew Strategy](#version-skew-strategy) - [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) - [Feature Enablement and Rollback](#feature-enablement-and-rollback) @@ -444,9 +449,28 @@ Based on reviewers feedback describe what additional tests need to be added prio implementing this enhancement to ensure the enhancements have also solid foundations. --> -No. +##### Unit tests + + + + -#### Tests +Testcases: - maxUnavailable =1, Same behavior as today with PodManagementPolicy as `OrderedReady` or `Parallel` - Each of these Tests can be run in PodManagementPolicy = `OrderedReady` or `Parallel` and the Update @@ -458,13 +482,42 @@ No. - maxUnavailable greater than 1 with partition and staged pods greater than maxUnavailable - maxUnavailable greater than 1 with partition and maxUnavailable greater than replicas -#### Test Plan +Coverage: -For `Alpha`, unit tests and integration tests will be added to test functionality at both -with feature flag enabled and disabled. Defaults will be verified so that users -who do not set this flag are not surprised at all. +- `pkg/apis/apps/v1`: `2023-05-26` - `71.7%` +- `pkg/apis/apps/v1beta2`: `2023-05-26` - `76.7%` +- `pkg/apis/apps/validation`: `2023-05-26` - `92.3%` +- `pkg/controller/statefulset`: `2023-05-26` - `85.7%` +- `pkg/registry/apps/statefulset`: `2023-05-26` - `63.9%` -For `Beta`, add e2e tests. +##### Integration tests + + + +We have controller tests in `pkg/controller/statefulset` cover all cases. What's more, +rolling update is hard to simulate in integration tests because we want to verify the pod +status but actually they're faked. But we'll have e2e tests instead. + +##### e2e tests + + + +- `test/e2e/apps: we'll add a test to verify that the number of pods brought down each +time is equal to configured maxUnavailable. ## Graduation Criteria @@ -530,8 +583,13 @@ in back-to-back releases. - Deprecate the flag --> -- Alpha: Initial support for maxUnavailable in StatefulSets added. Disabled by default with default value of 1. -- Beta: Enabled by default with default value of 1 with upgrade downgrade tested at least manually. +#### Alpha + +- Initial support for maxUnavailable in StatefulSets added. Disabled by default with default value of 1. + +#### Beta + +- Enabled by default with default value of 1 with upgrade downgrade tested at least manually. ### Upgrade / Downgrade Strategy @@ -547,21 +605,43 @@ enhancement: cluster required to make on upgrade, in order to make use of the enhancement? 
--> -We will default to 1 for maxUnavailable field in StatefulSet for backward compatibility +#### Upgrade -Downgrades +- What changes (in invocations, configurations, API use, etc.) is an existing + cluster required to make on upgrade, in order to maintain previous behavior? + - Nothing special, it's backward compatibility +- What changes (in invocations, configurations, API use, etc.) is an existing + cluster required to make on upgrade, in order to make use of the enhancement? + - This feature is enabled by default in beta so you can configure the maxUnavailable to meet your requirements + +#### Downgrade + +when downgrade from a release `maxUnavailable` enabled to a release it's disabled: + +- If we downgrade to a release even doesn't have this field, then there's no way to discover it. +- Else, if we create statefulset with maxUnavailable configured, it will be dropped to nil automatically. -When downgrading from a release with this feature, to a release without maxUnavailable, there are two cases -- If maxUnavailable is greater than 1, there are two more cases:- - - If you're rolling back to a release that doesn't have this field - then there is even no way to discover it - - If you're just disabling the feature (either together with downgrade to a release that has a field or without downgrade),the field should remain set - (unless someone will explicitly delete it later), but controller should ignore its behavior (and there shouldn't be a way to set it if the feature gate - is switched off). - - If maxUnavailable is less than equal to 1 -- in this case user wont see any difference in behavior --> +Anyway, we'll back to that time rolling update with a single unavailable pod. ### Version Skew Strategy -No. + + +- If both api-server and kube-controller-manager enabled this feature, then everything works as expected +- If only api-server enabled this feature, then we can configure the `maxUnavailable` field but it will not work. +- If only kube-controller-manager enabled this feature, then for newly created statefulSet, the `maxUnavailable` will default to 1. +- If both api-server and kube-controller-manager disabled this feature, then everything works as expected. ## Production Readiness Review Questionnaire @@ -584,36 +664,56 @@ revert to the old behavior where rolling update will proceed one pod at a time. ###### What happens if we reenable the feature if it was previously rolled back? -We will restore the desired behavior for StatefulSets for which the maxunavailable field wasn't deleted after +We will restore the desired behavior for StatefulSets for which the maxUnavailable field wasn't deleted after the feature gate was disabled. ###### Are there any tests for feature enablement/disablement? -yes, there are unit tests which make sure the field is correctly dropped -on feature enable and disabled + + + +Yes, there are unit tests which make sure the field is correctly dropped +on feature enable and disabled, see [strategy tests](https://github.com/kubernetes/kubernetes/blob/23698d3e9f4f3b9738ba5a6fcefd17894a00624f/pkg/registry/apps/statefulset/strategy_test.go#L391-L417). ### Rollout, Upgrade and Rollback Planning ###### How can a rollout or rollback fail? Can it impact already running workloads? -A rollout or rollback of this feature can fail if there is a bug which causes the kube-apiserver or -the kube-controller-manager to start crashing when the feature flag is enabled. + -Yes, it can impact already running workloads. 
+This could happen when we downgrade the kube-controller-manager from maxUnavailable-enabled release to a disabled one, +but it will not impact running workloads, because `maxUnavailable` only works in rolling update. -If a rolling update is in progress for a StatefulSet, while this feature is being enabled in kube-apiserver -and kube-controller-manager, the StatefulSet controller can run into corner cases where it will take longer -for the controller to converge. This will only happen if after enabling the feature, the customer also sets -maxUnavailable to a number greater than 1, but the invariants and the logic will ensure that there are never more than -maxUnavailable pods with the same identity and never more than maxUnavailable being deleted. +But if a statefulset has a plenty of replicas, when rolling update, it will take more time comparing to +`maxUnavailable` enabled with a number greater than 1. ###### What specific metrics should inform a rollback? -When feature enabled but rolling update in a unexpected phenomenon like the update pods at a time is not equal to the -`maxUnavailable` value (we may have some other factors impacting this, but this should apply for most of the time) or -rolling update in a unexpected order. + -Or we can refer to the `rolling-update-duration` metric for observation, if it didn't decrease when setting the `maxUnavailable` -great than 1 or the duration increased abnormally, then we should rollback. +When `maxUnavailable` enabled, and we're rolling update a statefulset, the number of pods brought down should +equal to the `maxUnavailable`, if not, we should rollback. ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? @@ -642,10 +742,34 @@ Or when setting the `maxUnavailable` great than 1, the `rolling-update-duration` ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + I think it has little relevance with SLOs, but rolling update at a very low speed which impacts the running services. ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + +- [x] Metrics + - Component exposing the metric: kube-controller-manager + - Metric name: `rolling-update-duration{plugin="PodTopologySpread"}` + - Metric name: `schedule_attempts_total{result="error|unschedulable"}` + None. ###### Are there any missing metrics that would be useful to have to improve observability of this feature? @@ -704,7 +828,7 @@ For each of them, fill in the following information by copying the below templat - Testing: Are there any tests for failure mode? If not, describe why. --> -In a multi-master setup, when the cluster has skewed CCM, the behaviors may different. +In a multi-master setup, when the cluster has skewed kube-controller-manager, the behaviors may differ. - [Failure mode brief description] - Detection: the `rolling-update-duration` didn't decrease when setting the `maxUnavailable` great than 1 or increased abnormally. @@ -717,13 +841,13 @@ In a multi-master setup, when the cluster has skewed CCM, the behaviors may diff ## Implementation History -- KEP Started on 1/1/2019 -- Implementation PR and UT by 8/30 -- Bump to beta at 2023-05-11 +- 2019-01-01: KEP created. +- 2019-08-30: PR Implemented with tests covered. +- 2023-05-11: Bump to Beta. ## Drawbacks -Slow rolling update. +It's an opt-in feature so you can choose to use it or not, then no drawbacks I can think of. 
## Alternatives diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml b/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml index 5ea9291dcc8..97317de0602 100644 --- a/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml +++ b/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml @@ -10,9 +10,11 @@ creation-date: 2018-12-29 reviewers: - "@janetkuo" - "@kow3ns" + - "@soltysh" approvers: - "@janetkuo" - "@kow3ns" + - "@soltysh" see-also: [] replaces: [] From c9a8b37765c807b1b9d3fc9abdfa8d6a2d33270d Mon Sep 17 00:00:00 2001 From: kerthcet Date: Thu, 15 Jun 2023 20:41:00 +0800 Subject: [PATCH 5/5] Add new metric rollingUpdateDurationSeconds to statefulset Signed-off-by: kerthcet --- .../README.md | 163 +++++++++++++----- .../kep.yaml | 3 +- 2 files changed, 121 insertions(+), 45 deletions(-) diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md index a766d05671c..e9cdd2953e4 100644 --- a/keps/sig-apps/961-maxunavailable-for-statefulset/README.md +++ b/keps/sig-apps/961-maxunavailable-for-statefulset/README.md @@ -32,6 +32,7 @@ tags, and then generate with `hack/update-toc.sh`. - [Implementation Details](#implementation-details) - [API Changes](#api-changes) - [Implementation](#implementation) + - [Metrics](#metrics) - [Test Plan](#test-plan) - [Prerequisite testing updates](#prerequisite-testing-updates) - [Unit tests](#unit-tests) @@ -79,16 +80,16 @@ Items marked with (R) are required *prior to targeting to a milestone / release* - [x] (R) KEP approvers have approved the KEP status as `implementable` - [x] (R) Design details are appropriately documented - [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) - - [ ] e2e Tests for all Beta API Operations (endpoints) + - [x] e2e Tests for all Beta API Operations (endpoints) - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free - [ ] (R) Graduation criteria is in place - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) -- [ ] (R) Production readiness review completed +- [x] (R) Production readiness review completed - [ ] (R) Production readiness review approved - [x] "Implementation History" section is up-to-date for milestone -- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] -- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes #### Story 1 + As a User of Kubernetes, I should be able to update my StatefulSet, more than one Pod at a time, in a RollingUpdate manner, if my Stateful app can tolerate more than one pod being down, thus allowing my update to finish much faster. @@ -248,7 +250,9 @@ Go in to as much detail as necessary here. 
This might be a good place to talk about core concepts and how they relate. --> -No. +With `maxUnavailable` feature enabled, we'll bring down more than one pod at a time, if your application can't tolerate +this behavior, you should absolutely disable this feature or leave the field unconfigured, which will behave as the default +behavior. ### Risks and Mitigations @@ -425,6 +429,53 @@ https://github.com/kubernetes/kubernetes/blob/v1.13.0/pkg/controller/statefulset ... ``` +#### Metrics + +We'll add a new metric named `rolling-update-duration-seconds`, it tracks how long a statefulset takes to finish a rolling-update, +the general logic is: + +when [performing update](https://github.com/kubernetes/kubernetes/blob/c984d53b31655924b87a57bfd4d8ff90aaeab9f8/pkg/controller/statefulset/stateful_set_control.go#L97-L138), +if the `currentRevision` != `updateRevision`, then we take it as starting to rolling update, but because rolling update can not +finish in one reconciling loop, then we may have to track it, we have two approaches: + +- One is add a new field in the `defaultStatefulSetControl`, like below: + + ```golang + type defaultStatefulSetControl struct { + podControl *StatefulPodControl + statusUpdater StatefulSetStatusUpdaterInterface + controllerHistory history.Interface + recorder record.EventRecorder + + // + // rollingUpdateStartTimes records when each statefulset starts to perform rolling update. + // key is the statefulset name, value is the start time. + rollingUpdateStartTimes map[string]time.Time + } + ``` + + Then when `currentRevision` != `updateRevision`, we'll check whether the statefulset name + exists in the `rollingUpdateStartTimes`, if exists, skip, if not, write the + start time. + + If we found rolling update completed in [updateStatefulSetStatus](https://github.com/kubernetes/kubernetes/blob/c984d53b31655924b87a57bfd4d8ff90aaeab9f8/pkg/controller/statefulset/stateful_set_control.go#L682-L701), + then we'll report the metrics here and clear the statefulset from the `rollingUpdateStartTimes`. + +- Another approach is add a new field to `StatefulSetStatus` as below: + + ```golang + type StatefulSetStatus struct { + + // rollingUpdateStartTime will record when the statefulSet starts to rolling update. + rollingUpdateStartTime *time.Time + } + ``` + + The logic general looks the same, we'll update the time together with status update, so there's no + extra api calls. + + Prefer Opt-2 as we already record the rolling update messages in the status. + ### Test Plan -We have controller tests in `pkg/controller/statefulset` cover all cases. What's more, -rolling update is hard to simulate in integration tests because we want to verify the pod -status but actually they're faked. But we'll have e2e tests instead. +Controller tests in `pkg/controller/statefulset` cover all cases. Since pod statuses +are faked in integration tests, simulating rolling update is difficult. We're opting +for e2e tests instead. ##### e2e tests @@ -590,6 +646,8 @@ in back-to-back releases. #### Beta - Enabled by default with default value of 1 with upgrade downgrade tested at least manually. +- Additional e2e tests listed in the test plan should be added. +- No critical bugs reported and no bad feedbacks collected. ### Upgrade / Downgrade Strategy @@ -609,19 +667,19 @@ enhancement: - What changes (in invocations, configurations, API use, etc.) is an existing cluster required to make on upgrade, in order to maintain previous behavior? 
- - Nothing special, it's backward compatibility + - This is a new field, it will only be enabled after an upgrade. To maintain the + previous behavior, you can disable the feature gate manually in Beta or leave + the field unconfigured. - What changes (in invocations, configurations, API use, etc.) is an existing cluster required to make on upgrade, in order to make use of the enhancement? - - This feature is enabled by default in beta so you can configure the maxUnavailable to meet your requirements + - This feature is enabled by default in Beta so you can configure the maxUnavailable + to meet your requirements #### Downgrade -when downgrade from a release `maxUnavailable` enabled to a release it's disabled: - -- If we downgrade to a release even doesn't have this field, then there's no way to discover it. -- Else, if we create statefulset with maxUnavailable configured, it will be dropped to nil automatically. - -Anyway, we'll back to that time rolling update with a single unavailable pod. +- If we try to set this field, it will be silently dropped. +- If the object with this field already exists, we maintain is as it is, but the controller + will just ignore this field. ### Version Skew Strategy @@ -638,10 +696,13 @@ enhancement: CRI or CNI may require updating that component before the kubelet. --> -- If both api-server and kube-controller-manager enabled this feature, then everything works as expected -- If only api-server enabled this feature, then we can configure the `maxUnavailable` field but it will not work. -- If only kube-controller-manager enabled this feature, then for newly created statefulSet, the `maxUnavailable` will default to 1. -- If both api-server and kube-controller-manager disabled this feature, then everything works as expected. +- If both api-server and kube-controller-manager enabled this feature, then `maxUnavailable` will take effect. +- If only api-server enabled this feature, because statefulset controller is unaware of this feature, then we'll + fall into the default rolling-update behavior, one pod at a time. +- If only kube-controller-manager enabled this feature, then this field is invisible to users, we'll keep the default + rolling-update behavior, one pod at a time. +- If both api-server and kube-controller-manager disabled this feature, then only one Pod will be updated + at a time as the default behavior. ## Production Readiness Review Questionnaire @@ -659,13 +720,12 @@ No, the default behavior remains the same. ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? -Yes this feature can be disabled. Once disabled, all existing StatefulSet will -revert to the old behavior where rolling update will proceed one pod at a time. +Yes, you can disable the feature-gate manually once it's in Beta. ###### What happens if we reenable the feature if it was previously rolled back? -We will restore the desired behavior for StatefulSets for which the maxUnavailable field wasn't deleted after -the feature gate was disabled. +If we disable the feature-gate and reenable it again, then the `maxUnavailable` will take effect again, +that's say we'll restart to respect the value of `maxUnavailable` when rolling update. ###### Are there any tests for feature enablement/disablement? 
### Rollout, Upgrade and Rollback Planning

###### How can a rollout or rollback fail? Can it impact already running workloads?

<!--
Try to be as paranoid as possible - e.g., what if some components will restart
mid-rollout.

Similarly, consider large clusters and how enablement/disablement will
rollout across nodes.
-->

-This could happen when we downgrade the kube-controller-manager from maxUnavailable-enabled release to a disabled one,
-but it will not impact running workloads, because `maxUnavailable` only works in rolling update.
+It depends on the situation, for example:
+
+- In an HA cluster, if some of the API servers have this feature gate enabled but others do not,
+  already running workloads will not be impacted during a rolling update, because the process is
+  controlled by the statefulset controller. But for newly created statefulsets, the `maxUnavailable`
+  field will be dropped whenever the request lands on an api-server with the feature gate disabled.

-But if a statefulset has a plenty of replicas, when rolling update, it will take more time comparing to
-`maxUnavailable` enabled with a number greater than 1.
+- If a statefulset is in the middle of a rolling update with this feature enabled and we then
+  disable the feature gate, that statefulset will be impacted: it falls back to the default
+  behavior of updating one pod at a time.

###### What specific metrics should inform a rollback?

<!--
What signals should users be paying attention to when the feature is young
that might indicate a serious problem?
-->

-When `maxUnavailable` enabled, and we're rolling update a statefulset, the number of pods brought down should
+- As a normal user, with `maxUnavailable` enabled and a statefulset rolling update in progress, the
+  number of pods brought down at a time should
equal the `maxUnavailable`; if not, we should roll back.
+- As an administrator (and also as a normal user), I can check the new metric
+  `rolling-update-duration-seconds`: if we enabled the feature with `maxUnavailable` greater than 1
+  for our statefulsets but didn't see the duration fall, we may have a problem. This is not
+  precise, because the rolling-update duration can be impacted by several factors, e.g. pulling a
+  big new image, but it can serve as an indicator.
+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

No, but it will be tested manually before merging the PR.

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
+
No

### Monitoring Requirements

###### How can an operator determine if the feature is in use by workloads?
+
If their StatefulSet rollingUpdate section has the field maxUnavailable specified with a value different than 1.
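+For example, a StatefulSet using the feature would carry the following (trimmed to the relevant
+fields; the numbers are illustrative):
+
+```yaml
+apiVersion: apps/v1
+kind: StatefulSet
+spec:
+  replicas: 5
+  updateStrategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxUnavailable: 2   # up to 2 pods may be unavailable at once during the update
+```
+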
The below command should show maxUnavailable value:

```
kubectl get statefulsets -o yaml | grep maxUnavailable
```

@@ -732,13 +805,14 @@
-Or refer to the new metric `rolling-update-duration`, it should exist.
+Or refer to the new metric `rolling-update-duration-seconds`; it should exist.

###### How can someone using this feature know that it is working for their instance?

With the feature enabled, set `maxUnavailable` greater than 1 and pay attention to how many pods are
updated at a time during a rolling update; it should equal the `maxUnavailable`.
-Or when setting the `maxUnavailable` great than 1, the `rolling-update-duration` should decrease.
+Or refer to `rolling-update-duration-seconds`, which can give a general indication: when setting
+`maxUnavailable` greater than 1, the duration should decrease.

###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

@@ -757,7 +831,9 @@
These goals will help you determine what you need to measure (SLIs) in the next
question.
-->

-I think it has little relevance with SLOs, but rolling update at a very low speed which impacts the running services.
+Rolling update duration is impacted by many factors, such as container hooks, image size, network,
+and container logic, so we can only reason roughly about each individual statefulset. Generally, if
+we set `maxUnavailable` to X, the rolling update duration should shrink by roughly a factor of X.
+Keep in mind that this is not precise.

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

<!--
Pick one more of these and delete the rest.
-->

- [x] Metrics
-  - Component exposing the metric: kube-controller-manager
-  - Metric name: `rolling-update-duration{plugin="PodTopologySpread"}`
-  - Metric name: `schedule_attempts_total{result="error|unschedulable"}`
-
-None.
+  - Metric name: `rolling-update-duration-seconds`
+  - Components exposing the metric: kube-controller-manager

###### Are there any missing metrics that would be useful to have to improve observability of this feature?

@@ -797,11 +870,14 @@
No

A struct gets added to every StatefulSet object which has three fields: an enum-typed discriminator,
a 32 bit integer, and a string. The struct in question is `IntOrString`.
+If we introduce the new metric with Opt-2, then we'll also have a new field `rollingUpdateStartTime`
+of type `*Time` in the `StatefulSetStatus`.
+
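+For reference, a sketch of the shapes involved (field tags omitted; comments are illustrative):
+
+```golang
+package apps
+
+import "k8s.io/apimachinery/pkg/util/intstr"
+
+// IntOrString, defined in k8s.io/apimachinery/pkg/util/intstr, is the struct
+// in question; Type is an integer-backed enum discriminating the two variants:
+//
+//	type IntOrString struct {
+//		Type   Type   // Int or String
+//		IntVal int32  // e.g. maxUnavailable: 2
+//		StrVal string // e.g. maxUnavailable: "10%"
+//	}
+
+// RollingUpdateStatefulSetStrategy gains the new optional field; a nil value
+// keeps the default of one unavailable pod at a time.
+type RollingUpdateStatefulSetStrategy struct {
+	Partition      *int32
+	MaxUnavailable *intstr.IntOrString
+}
+```
+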
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

No

###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?

-The controller-manager will see very negligible and almost un-notoceable increase in cpu usage.
+The controller-manager will see a negligible, almost unnoticeable increase in CPU usage.

###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

@@ -828,14 +904,13 @@ For each of them, fill in the following information by copying the below templat
- Testing: Are there any tests for failure mode? If not, describe why.
-->

-In a multi-master setup, when the cluster has skewed kube-controller-manager, the behaviors may differ.
+The feature gate is disabled while a rolling update is in progress.
- [Failure mode brief description]
-  - Detection: the `rolling-update-duration` didn't decrease when setting the `maxUnavailable` great than 1 or increased abnormally.
-  - Mitigations: Disable the feature.
+  - Detection: The `maxUnavailable` is not respected.
+  - Mitigations: If updating pods in parallel is tolerable for the application, set
+    `podManagementPolicy=Parallel`.
  - Diagnostics: Set the logger level greater than 4.
-  - Testing: No testing, because the rolling update duration is hard to measure, it can be impact by a lot of things,
-    like the master performance.
+  - Testing: We have e2e tests covering the expected behavior.

###### What steps should be taken if SLOs are not being met to determine the problem?

diff --git a/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml b/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml
index 97317de0602..dfa21492280 100644
--- a/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml
+++ b/keps/sig-apps/961-maxunavailable-for-statefulset/kep.yaml
@@ -40,7 +40,8 @@ feature-gates:
components:
- kube-controller-manager
- kube-apiserver
+disable-supported: true

# The following PRR answers are required at beta release
metrics:
-- rolling-update-duration
+- rolling-update-duration-seconds