@@ -170,13 +170,53 @@ The targets are defined by the below suggested maximum limits, which are organiz
170170
171171- Since custom resources can be arbitrarily large, we have broken down the limit by custom resource object size.
172172
173+ **Custom Resource Definitions:**
174+
175+ | Suggested Maximum Limit: scope=cluster |
176+ | --- |
177+ | 500 |
178+
179+ _Note: The Custom Resource Definition suggested maximum limit was selected not
180+ due to the above SLI/SLOs, but instead due to the latency of OpenAPI publishing,
181+ which is a background process that occurs asynchronously each time a Custom
182+ Resource Definition schema is updated. For 500 Custom Resource Definitions it takes
183+ slightly over 35 seconds for a definition change to be visible via the OpenAPI
184+ spec endpoint._
185+
186+ **Custom Resources, Cluster Wide:**
187+
188+ Cluster-wide limits for custom resources are storage bound, and custom resources
189+ share the storage space with all other objects. While determining the
190+ appropriate storage limit for a cluster is out of scope for this document, once
191+ an etcd storage limit is selected, the suggested maximum limits for custom resources
192+ are:
193+
194+ | etcd storage limit | Suggested Maximum Limit: scope=cluster |
195+ | --- | --- |
196+ | 4GB | 40000 |
197+ | 8GB | 80000 |
198+
199+ These limits aim to keep custom resource storage usage to less than half of the
200+ total cluster storage capacity for custom resources of 50kb or less in size.
201+
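The "less than half" claim can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that "GB" and "kb" are binary units (GiB and KiB), which the text does not state, and the names `CLUSTER_WIDE_LIMITS` and `worst_case_fraction` are hypothetical.

```python
# Assumption: "GB" and "kb" are read as binary units (GiB, KiB);
# the document itself does not say which convention it uses.
GIB = 1024 ** 3
KIB = 1024

# Rows from the table above: etcd storage limit (GB) -> suggested cluster-wide limit.
CLUSTER_WIDE_LIMITS = {4: 40_000, 8: 80_000}
MAX_OBJECT_SIZE_KB = 50  # largest object size the limits are stated for


def worst_case_fraction(storage_gb: int) -> float:
    """Fraction of etcd storage used if every custom resource is 50kb."""
    used_bytes = CLUSTER_WIDE_LIMITS[storage_gb] * MAX_OBJECT_SIZE_KB * KIB
    return used_bytes / (storage_gb * GIB)


for gb in CLUSTER_WIDE_LIMITS:
    print(f"{gb}GB: {worst_case_fraction(gb):.1%} of storage in the worst case")
```

Under the binary-unit assumption, both rows stay just under one half of the storage limit; with decimal units the worst case would land at exactly one half.
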
173202**Custom Resources per Definition:**
174203
204+ For each custom resource definition, the limit on the number of custom resources
205+ can be found by taking the (median) object size of the custom resource and finding
206+ the matching row in this table:
207+
175208| Object size | Suggested Maximum Limit: scope=namespace (5s p99 SLO) | Suggested Maximum Limit: scope=cluster (30s p99 SLO) |
176209| --- | --- | --- |
177- | 10kb | 1500 | 10000 |
178- | 25kb | 600 | 4000 |
179- | 50kb | 300 | 2000 |
210+ | <=10kb | 1500 | 10000 |
211+ | (10kb - 25kb] | 600 | 4000 |
212+ | (25kb - 50kb] | 300 | 2000 |
213+
214+ The cluster scope indicates the total number of custom resources for that
215+ definition allowed in the entire cluster.
216+
217+ The namespace scope indicates the total number of custom resources for that
218+ definition allowed in any particular namespace. The cumulative count of the
219+ custom resource across all namespaces must not exceed the cluster limit.
180220
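The table lookup and the namespace/cluster rule described above can be sketched as follows. This is a minimal illustration, not part of any Kubernetes API: `LIMITS`, `limits_for`, and `within_limits` are hypothetical names, and objects larger than 50kb are treated as out of range because the table gives no row for them.

```python
from typing import Dict, Optional, Tuple

# (upper bound of object size in kb, namespace limit, cluster limit),
# copied from the table above.
LIMITS = [(10, 1500, 10_000), (25, 600, 4_000), (50, 300, 2_000)]


def limits_for(median_size_kb: float) -> Optional[Tuple[int, int]]:
    """Return (namespace limit, cluster limit) for a median object size."""
    for upper_kb, ns_limit, cluster_limit in LIMITS:
        if median_size_kb <= upper_kb:
            return ns_limit, cluster_limit
    return None  # larger than 50kb: the table offers no guidance


def within_limits(median_size_kb: float, counts_by_namespace: Dict[str, int]) -> bool:
    """Check per-namespace counts and their cumulative total for one definition."""
    found = limits_for(median_size_kb)
    if found is None:
        return False
    ns_limit, cluster_limit = found
    per_namespace_ok = all(c <= ns_limit for c in counts_by_namespace.values())
    cumulative_ok = sum(counts_by_namespace.values()) <= cluster_limit
    return per_namespace_ok and cumulative_ok
```

For example, `within_limits(20, {"team-a": 600, "team-b": 600})` passes both checks, while seven namespaces each holding 1500 objects of 8kb pass the namespace check but exceed the cluster limit of 10000.
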
181221Since, in practice, custom resources scale farther without conversion webhooks
182222within the SLI/SLOs (roughly 2x according to our scale tests), custom resource
@@ -190,19 +230,6 @@ and the scope=cluster suggested maximum limit indicates how many custom resource
190230be in the cluster total. For custom resources of custom resource definitions using `scope: Cluster`: only
191231the scope=cluster suggested maximum limit applies._
192232
193- ** Custom Resource Definitions:**
194-
195- | Suggested Maximum Limit: scope=cluster |
196- | --- |
197- | 500 |
198-
199- _ Note: The Custom Resource Definition suggested maximum limit was selected not
200- due to the above SLI/SLOs, but instead due to the latency OpenAPI publishing,
201- which is a background process that occurs asychroniously each time a Custom
202- Resource Definition schema is updated. For 500 Custom Resource Definitions it takes
203- slightly over 35 seconds for a definition change to be visible via the OpenAPI
204- spec endpoint._
205-
206233**Conversion Webhooks:**
207234
208235Conversion Webhook SLOs are defined from the perspective of the conversion
@@ -211,14 +238,17 @@ making the request to the webhook, but it does include network latency.
211238
212239Given that the performance and scalability of conversion webhooks are the
213240responsibility of their author, Custom resource scale targets are applied only for
214- conversion webhooks that are within the follow latencies for the above suggested
241+ conversion webhooks that are within the following latencies for the above suggested
215242maximum limits.
216243
217- | scope | Expected conversion Webhook SLO: p99 latency |
244+ | scope | object count limit | Expected conversion Webhook SLO: p99 latency |
218245| --- | --- | --- |
219- | resource | 50ms |
220- | namespace | 1 seconds |
221- | cluster | 6 seconds |
246+ | resource | 1 | 50ms |
247+ | namespace | 1500 (<=10kb), 600 (10-25kb) or 300 (25-50kb) | 1 second |
248+ | cluster | 10000 (<=10kb), 4000 (10-25kb) or 2000 (25-50kb) | 6 seconds |
249+
250+ The scope=resource's higher "per-object" latency (50ms vs ~1.5ms for namespace
251+ and cluster scope) is to accommodate a constant request-serving cost.
222252
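The "per-object" comparison can be reproduced by dividing each latency SLO by its object count limit. The sketch below is one possible reading of the table, not a definition from the proposal; `SLOS` and `per_object_ms` are hypothetical names.

```python
# scope -> (object count limits for the <=10kb, 10-25kb, 25-50kb rows,
#           p99 latency SLO in seconds), copied from the table above.
SLOS = {
    "resource": ([1], 0.050),
    "namespace": ([1500, 600, 300], 1.0),
    "cluster": ([10_000, 4_000, 2_000], 6.0),
}


def per_object_ms(latency_s: float, object_count: int) -> float:
    """Latency budget per converted object, in milliseconds."""
    return latency_s * 1000 / object_count


for scope, (counts, latency_s) in SLOS.items():
    budgets = [round(per_object_ms(latency_s, n), 2) for n in counts]
    print(scope, budgets)
```

Under this reading, the namespace and cluster budgets fall in the range of roughly 0.6ms to 3.3ms per object depending on the size row, with the cluster-scope 10-25kb row working out to 1.5ms, compared with 50ms for a single-resource request.
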
223253The above object size and suggested maximum limits in the Custom Resources per
224254Definition table applies to these conversion webhook SLOs. For example, for a