Commit 1185959

committed
KEP-3015: PreferSameNode traffic distribution
1 parent 9f7ee23 commit 1185959

File tree: 3 files changed (+440 -0 lines)
Lines changed: 6 additions & 0 deletions
# The KEP must have an approver from the
# "prod-readiness-approvers" group
# of http://git.k8s.io/enhancements/OWNERS_ALIASES
kep-number: 3015
alpha:
  approver: "@johnbelamaric"
Lines changed: 390 additions & 0 deletions
# KEP-3015: PreferSameNode Traffic Distribution

<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [DNS](#dns)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
    - [Prerequisite testing updates](#prerequisite-testing-updates)
    - [Unit tests](#unit-tests)
    - [Integration tests](#integration-tests)
    - [e2e tests](#e2e-tests)
  - [Graduation Criteria](#graduation-criteria)
    - [Alpha](#alpha)
    - [Beta](#beta)
    - [GA](#ga)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
  - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
  - [Monitoring Requirements](#monitoring-requirements)
  - [Dependencies](#dependencies)
  - [Scalability](#scalability)
  - [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
<!-- /toc -->

## Release Signoff Checklist

<!--
**ACTION REQUIRED:** In order to merge code into a release, there must be an
issue in [kubernetes/enhancements] referencing this KEP and targeting a release
milestone **before the [Enhancement Freeze](https://git.k8s.io/sig-release/releases)
of the targeted release**.

For enhancements that make changes to code or processes/procedures in core
Kubernetes—i.e., [kubernetes/kubernetes], we require the following Release
Signoff checklist to be completed.

Check these off as they are completed for the Release Team to track. These
checklist items _must_ be updated for the enhancement to be released.
-->

Items marked with (R) are required *prior to targeting to a milestone / release*.

- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
- [ ] (R) Design details are appropriately documented
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
  - [ ] e2e Tests for all Beta API Operations (endpoints)
  - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
  - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [ ] (R) Graduation criteria is in place
  - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Production readiness review completed
- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

<!--
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
-->

[kubernetes.io]: https://kubernetes.io/
[kubernetes/enhancements]: https://git.k8s.io/enhancements
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
[kubernetes/website]: https://git.k8s.io/website

## Summary

This KEP extends KEP-4444 `TrafficDistribution` with a new value,
`PreferSameNode`, indicating traffic for a service should
preferentially be routed to endpoints on the same node as the client.

(This is the third attempt at this feature, which was previously
suggested as [`internalTrafficPolicy: PreferLocal`] and [Node-level
topology].)

[`internalTrafficPolicy: PreferLocal`]: https://github.com/kubernetes/enhancements/pull/3016
[Node-level topology]: https://github.com/kubernetes/enhancements/pull/3293

## Motivation

### Goals

- Allow configuring a service so that connections will be delivered to
  a local endpoint when possible, and a remote endpoint if not.

### Non-Goals

N/A

## Proposal

### User Stories

#### DNS

As a cluster administrator, I plan to run a DNS pod on each node, and
would like DNS requests from other pods to always go to the local DNS
pod, for efficiency. However, if no local DNS pod is available, DNS
requests should go to a remote pod instead so that DNS keeps working.
There should never be enough DNS traffic to overload any one endpoint,
so it is safe to use a `TrafficDistribution` mode that does not guard
against endpoint overload.

### Risks and Mitigations

This is similar to the existing `PreferClose` mode (possibly to be
renamed `PreferSameZone`) and has the same sorts of risks. We only use
the new traffic distribution mode if the user explicitly requests it,
and in that case, the user is responsible for ensuring that clients
and servers are distributed in a way such that the traffic
distribution mode makes sense.

## Design Details

We will add a new field to `discoveryv1.EndpointHints`:

```golang
// EndpointHints provides hints describing how an endpoint should be consumed.
type EndpointHints struct {
	...

	// forNodes indicates the node(s) this endpoint should be targeted by.
	// +listType=atomic
	ForNodes []string `json:"forNodes,omitempty" protobuf:"bytes,2,name=forNodes"`
}
```

When updating EndpointSlices, if the EndpointSlice controller sees a
service with `PreferSameNode` traffic distribution, then for each
endpoint in the slice, it will add a `ForNodes` hint including the
name of the endpoint's node. (The field is an array for future
extensibility, but initially it will always have either 0 or 1
elements.) In addition, it will set the `ForZones` hint as it would
with `TrafficDistribution: PreferClose`, to allow older service
proxies to fall back to at least same-zone behavior.

When kube-proxy sees an endpoint with the `ForNodes` hint set, it will
use that endpoint if the hint includes its own node name, and ignore
it otherwise, similarly to the `ForZones` hint.

### Test Plan

[X] I/we understand the owners of the involved components may require updates to
existing tests to make this code solid enough prior to committing the changes necessary
to implement this enhancement.

##### Prerequisite testing updates

N/A

##### Unit tests

Tests of validation, endpointslice-controller, and kube-proxy will be
updated.

<!--
Additionally, for Alpha try to enumerate the core package you will be touching
to implement this enhancement and provide the current unit coverage for those
in the form of:
- <package>: <date> - <current test coverage>
The data can be easily read from:
https://testgrid.k8s.io/sig-testing-canaries#ci-kubernetes-coverage-unit

This can inform certain test coverage improvements that we want to do before
extending the production code to implement this enhancement.
-->

- `<package>`: `<date>` - `<test coverage>`

##### Integration tests

N/A

##### e2e tests

E2E tests will be added, similar to the existing traffic distribution
tests, to cover the new option.

- <test>: <link to test coverage>

### Graduation Criteria

#### Alpha

- Feature implemented behind a feature flag
- Unit tests for API enablement and endpoint selection

#### Beta

- E2E tests completed and enabled
- Enough time has passed since Alpha to avoid version skew issues

#### GA

- Time passes, no major objections

### Upgrade / Downgrade Strategy

No real issues, other than dealing with skew.

### Version Skew Strategy

In a skewed cluster, it may not be possible for kube-controller-manager
to set the new EndpointSlice hint, or kube-proxy may not be able to
see the hint. In either case, the service will fall back to
prefer-same-zone semantics rather than prefer-same-node. Users can
avoid this by not using the feature until their cluster is fully
upgraded to a version that supports it.

## Production Readiness Review Questionnaire

### Feature Enablement and Rollback

###### How can this feature be enabled / disabled in a live cluster?

- [X] Feature gate (also fill in values in `kep.yaml`)
  - Feature gate name: PreferSameNodeTrafficDistribution
  - Components depending on the feature gate:
    - kube-apiserver
    - kube-controller-manager
    - kube-proxy

###### Does enabling the feature change any default behavior?

No.

###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes.

###### What happens if we reenable the feature if it was previously rolled back?

It starts working again.

###### Are there any tests for feature enablement/disablement?

No.

### Rollout, Upgrade and Rollback Planning

###### How can a rollout or rollback fail? Can it impact already running workloads?

An initial rollout cannot fail and won't impact already-running
workloads, because at the time of the initial rollout, there cannot
already be any `TrafficDistribution: PreferSameNode` services.

A rollback has reasonable fallback behavior (as with downgrades), and
a re-rollout simply updates the behavior of existing `PreferSameNode`
services in the expected way.

###### What specific metrics should inform a rollback?

There are no metrics that would indicate that the feature was failing,
but since the feature is opt-in, individual users can simply stop
using it if it is not working for them.

###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

No.

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

No.

### Monitoring Requirements

###### How can an operator determine if the feature is in use by workloads?

By checking whether any Service has `TrafficDistribution: PreferSameNode`.

###### How can someone using this feature know that it is working for their instance?

As with other topology features, there is no easy way for an end user
to reliably confirm that it is working correctly, other than by
sniffing the network traffic, or by looking at the logs of each
endpoint to confirm that they are receiving the expected connections
and not receiving unexpected ones.

###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

The implementation of the feature itself has no SLOs. The effect it
has on the performance of end-user workloads that use the feature
depends on those workloads.

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

The implementation of the feature itself has no SLIs, other than the
generic kube-proxy metrics. User workloads that use the feature may
expose SLI information that the user can examine to determine how well
the feature is working for their workload.

###### Are there any missing metrics that would be useful to have to improve observability of this feature?

Not really; we don't know how fast the user's services are supposed to
be, so we can't tell whether we are improving them as much as the user
hoped.

### Dependencies

###### Does this feature depend on any specific services running in the cluster?

It depends on a service proxy that recognizes the new traffic
distribution value. We will update `kube-proxy` ourselves, but network
plugins / Kubernetes distributions that ship their own alternative
service proxies will also need to be updated to support the new value
before their users can make use of it. (Until then, `TrafficDistribution:
PreferSameNode` would be implemented as `TrafficDistribution:
PreferClose`.)

### Scalability

###### Will enabling / using this feature result in any new API calls?

No.

###### Will enabling / using this feature result in introducing new API types?

No.

###### Will enabling / using this feature result in any new calls to the cloud provider?

No.

###### Will enabling / using this feature result in increasing size or count of the existing API objects?

No (other than that it means people may set `TrafficDistribution` on
Services where they were not previously setting it).

###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

No.

###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?

No.

### Troubleshooting

###### How does this feature react if the API server and/or etcd is unavailable?

No change from existing service/proxy behavior.

###### What are other known failure modes?

None known.

###### What steps should be taken if SLOs are not being met to determine the problem?

N/A

## Implementation History

- Initial proposal as `internalTrafficPolicy: PreferLocal`: 2021-10-21
- Initial proposal as "Node-level topology": 2022-01-15
- Initial proposal as `TrafficDistribution: PreferSameNode`: 2025-02-06

## Drawbacks

## Alternatives

As noted, this is the third attempt at this feature.

The initial proposal ([#3016]) was for `internalTrafficPolicy:
PreferLocal`, but we decided that traffic policy was for
semantically-significant changes to how traffic was distributed,
whereas this is just a hint, like topology.

That led to the second attempt ([#3293]), which never got as far as
defining a specific API, but reframed the problem as a kind of
topology hint. This eventually fizzled out because of people's
opinions at the time about how topology ought to work in Kubernetes.

However, KEP-4444 (`TrafficDistribution`) represents an updated
understanding of topology in Kubernetes, which makes the idea of
node-level topology more palatable.

[#3016]: https://github.com/kubernetes/enhancements/pull/3016
[#3293]: https://github.com/kubernetes/enhancements/pull/3293
