
Conversation

@caesarxuchao
Contributor

@caesarxuchao caesarxuchao commented Oct 29, 2020

This PR enables the HTTP/2 connection health check for all clients that call SetTransportDefaults, which includes all clients created via k8s.io/client-go. Without the health check, a broken transport-layer connection can stay in the HTTP/2 connection pool for a long time, causing requests to the same host to fail. For example, a broken TCP connection can linger for 15 minutes before being closed.

The HTTP/2 health check exposes two parameters (see the sketch after this list):

  • ReadIdleTimeout: if the HTTP/2 client has not received any frame on the connection for this amount of time, it starts sending periodic pings to the server. The ping period is also ReadIdleTimeout.
  • PingTimeout: if the client does not receive an ACK to a ping within this amount of time, the connection is considered broken and the client closes it.
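
For illustration, here is a minimal sketch (assuming golang.org/x/net/http2; the exact wiring in SetTransportDefaults may differ) of how these two parameters map onto the HTTP/2 transport:

```go
package example

import (
	"net/http"
	"time"

	"golang.org/x/net/http2"
)

// newTransportWithHealthCheck returns an HTTP/1.1 transport upgraded to
// HTTP/2 with the connection health check enabled.
func newTransportWithHealthCheck() (*http.Transport, error) {
	t := &http.Transport{}
	// ConfigureTransports enables HTTP/2 on t and returns the underlying
	// http2.Transport so the health-check knobs can be set on it.
	t2, err := http2.ConfigureTransports(t)
	if err != nil {
		return nil, err
	}
	t2.ReadIdleTimeout = 30 * time.Second // start pinging after 30s with no frame read
	t2.PingTimeout = 15 * time.Second     // close the connection if a ping ACK takes longer than 15s
	return t, nil
}
```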

For Kubernetes, I default the ReadIdleTimeout to 30s, which I don't expect to cause performance issues even in large clusters, because

  • the pings are carried over existing, persistent connections,
  • the pings are handled at the HTTP/2 layer and never surface to the Kubernetes layer.

I'll wait for results from pull-kubernetes-e2e-gce-large-performance to confirm.

I default the PingTimeout to 15s, which should be lenient enough for most network conditions.

Although I think most users won't need to tune these two parameters, I added environment variables so that users can tune them, or even disable the health check entirely by setting HTTP2_READ_IDLE_TIMEOUT_SECONDS=0.
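
A rough sketch of how such an override could be read from the environment (the helper name and logging below are illustrative, not necessarily the exact code in this PR):

```go
package example

import (
	"log"
	"os"
	"strconv"
	"time"
)

// readIdleTimeoutSeconds returns the ReadIdleTimeout to use: 30s by default,
// overridable (including 0 to disable the health check) via the
// HTTP2_READ_IDLE_TIMEOUT_SECONDS environment variable.
func readIdleTimeoutSeconds() time.Duration {
	seconds := 30
	if s := os.Getenv("HTTP2_READ_IDLE_TIMEOUT_SECONDS"); s != "" {
		if v, err := strconv.Atoi(s); err == nil {
			seconds = v
		} else {
			log.Printf("ignoring invalid HTTP2_READ_IDLE_TIMEOUT_SECONDS=%q, keeping default %ds", s, seconds)
		}
	}
	return time.Duration(seconds) * time.Second
}
```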

Fixes https://github.com/kubernetes/client-go/issues/374 and
#87615.

Ref. #95898
#94844

/kind bug
/sig api-machinery

HTTP/2 connection health check is enabled by default in all Kubernetes clients. The feature should work out-of-the-box. If needed, users can tune the feature via the HTTP2_READ_IDLE_TIMEOUT_SECONDS and HTTP2_PING_TIMEOUT_SECONDS environment variables. The feature is disabled if HTTP2_READ_IDLE_TIMEOUT_SECONDS is set to 0.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes area/kubectl sig/cli Categorizes an issue or PR as relevant to SIG CLI. labels Oct 29, 2020
@k8s-ci-robot k8s-ci-robot requested review from a team, andrewsykim and andyzhangx October 29, 2020 08:15
@k8s-ci-robot k8s-ci-robot added sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Oct 29, 2020
@caesarxuchao
Contributor Author

caesarxuchao commented Oct 29, 2020

I disagree with setting ReadIdleTimeout to 1/2 * IdleTimeout as proposed in #95898. A connection is considered "idle" if no stream is using it (see here), whereas it is considered "readIdle" if no frame has been read from it (see here). The two concepts are independent of each other, so we shouldn't derive one value from the other.

Update: I think the IdleTimeout in HTTP/1 behaves more like what's described in #95898: a connection is closed if there is no traffic for the IdleTimeout duration.
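
To make the distinction concrete, here is an illustrative sketch (assuming golang.org/x/net/http2, with made-up values): IdleConnTimeout lives on the HTTP/1 transport and closes connections that have no active streams, while ReadIdleTimeout lives on the HTTP/2 transport and only controls when health-check pings start:

```go
package example

import (
	"net/http"
	"time"

	"golang.org/x/net/http2"
)

func configureTimeouts() (*http.Transport, error) {
	t1 := &http.Transport{
		// "idle": no stream (in-flight request) is using the connection.
		// Connections that stay idle this long are closed.
		IdleConnTimeout: 90 * time.Second,
	}
	t2, err := http2.ConfigureTransports(t1)
	if err != nil {
		return nil, err
	}
	// "readIdle": no frame has been read on the connection, even though
	// streams may still be open. After this long the client starts sending
	// health-check pings rather than closing the connection outright.
	t2.ReadIdleTimeout = 30 * time.Second
	return t1, nil
}
```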

@caesarxuchao
Contributor Author

@cheftako, this health check is very similar to the gRPC health check we want to add to the apiserver-network-proxy to solve kubernetes-sigs/apiserver-network-proxy#152. The performance implications are also similar.

@caesarxuchao
Contributor Author

@cheftako
I'm not too worried about the performance cost of sending the periodic pings, because

  • the pings are carried over existing, persistent connections
  • the pings are handled at the HTTP/2 layer and never surface to the Kubernetes layer
  • a ping is only sent if there has been no traffic on a connection for the ReadIdleTimeout (30s) duration.

It would be more convincing if we could test the behavior in a large cluster. @kubernetes/sig-scalability, is there an easy way to test this PR in a large cluster?

@JensErat do you mind sharing the size of the clusters where you turned on the HTTP/2 health check feature? Thanks.

@BenTheElder
Member

/test
[intentionally triggering list of available tests]

@k8s-ci-robot
Contributor

@BenTheElder: The /test command needs one or more targets.
The following commands are available to trigger jobs:

  • /test pull-kubernetes-bazel-build
  • /test pull-kubernetes-bazel-test
  • /test pull-kubernetes-conformance-image-test
  • /test pull-kubernetes-conformance-kind-ipv6-parallel
  • /test pull-kubernetes-dependencies
  • /test pull-kubernetes-dependencies-canary
  • /test pull-kubernetes-e2e-ipvs-azure-dualstack
  • /test pull-kubernetes-e2e-iptables-azure-dualstack
  • /test pull-kubernetes-e2e-aws-eks-1-13-correctness
  • /test pull-kubernetes-files-remake
  • /test pull-kubernetes-e2e-gce
  • /test pull-kubernetes-e2e-gce-no-stage
  • /test pull-kubernetes-e2e-gce-kubetest2
  • /test pull-kubernetes-e2e-gce-canary
  • /test pull-kubernetes-e2e-gce-ubuntu
  • /test pull-kubernetes-e2e-gce-ubuntu-containerd
  • /test pull-kubernetes-e2e-gce-ubuntu-containerd-canary
  • /test pull-kubernetes-e2e-gce-rbe
  • /test pull-kubernetes-e2e-gce-alpha-features
  • /test pull-kubernetes-e2e-gce-device-plugin-gpu
  • /test pull-kubernetes-integration
  • /test pull-kubernetes-cross
  • /test pull-kubernetes-e2e-kind
  • /test pull-kubernetes-e2e-kind-canary
  • /test pull-kubernetes-e2e-kind-ipv6
  • /test pull-kubernetes-e2e-kind-ipv6-canary
  • /test pull-kubernetes-conformance-kind-ga-only
  • /test pull-kubernetes-conformance-kind-ga-only-parallel
  • /test pull-kubernetes-e2e-kops-aws
  • /test pull-kubernetes-bazel-build-canary
  • /test pull-kubernetes-bazel-test-canary
  • /test pull-kubernetes-bazel-test-integration-canary
  • /test pull-kubernetes-local-e2e
  • /test pull-publishing-bot-validate
  • /test pull-kubernetes-e2e-gce-network-proxy-http-connect
  • /test pull-kubernetes-e2e-gce-network-proxy-grpc
  • /test pull-kubernetes-e2e-gci-gce-autoscaling
  • /test pull-kubernetes-e2e-aks-engine-azure
  • /test pull-kubernetes-e2e-azure-disk
  • /test pull-kubernetes-e2e-azure-disk-vmss
  • /test pull-kubernetes-e2e-azure-file
  • /test pull-kubernetes-e2e-kind-dual-canary
  • /test pull-kubernetes-e2e-kind-ipvs-dual-canary
  • /test pull-kubernetes-e2e-gci-gce-ipvs
  • /test pull-kubernetes-node-e2e
  • /test pull-kubernetes-e2e-containerd-gce
  • /test pull-kubernetes-node-e2e-containerd
  • /test pull-kubernetes-node-e2e-alpha
  • /test pull-kubernetes-node-kubelet-serial-cpu-manager
  • /test pull-kubernetes-node-kubelet-serial-topology-manager
  • /test pull-kubernetes-node-kubelet-serial-hugepages
  • /test pull-kubernetes-node-crio-cgrpv2-e2e
  • /test pull-kubernetes-node-crio-e2e
  • /test pull-kubernetes-e2e-gce-100-performance
  • /test pull-kubernetes-e2e-gce-big-performance
  • /test pull-kubernetes-e2e-gce-correctness
  • /test pull-kubernetes-e2e-gce-large-performance
  • /test pull-kubernetes-kubemark-e2e-gce-big
  • /test pull-kubernetes-kubemark-e2e-gce-scale
  • /test pull-kubernetes-e2e-gce-storage-slow
  • /test pull-kubernetes-e2e-gce-storage-snapshot
  • /test pull-kubernetes-e2e-gce-storage-slow-rbe
  • /test pull-kubernetes-e2e-gce-csi-serial
  • /test pull-kubernetes-e2e-gce-iscsi
  • /test pull-kubernetes-e2e-gce-iscsi-serial
  • /test pull-kubernetes-e2e-gce-storage-disruptive
  • /test pull-kubernetes-e2e-aks-engine-azure-windows
  • /test pull-kubernetes-e2e-azure-disk-windows
  • /test pull-kubernetes-e2e-azure-file-windows
  • /test pull-kubernetes-typecheck
  • /test pull-kubernetes-verify
  • /test pull-kubernetes-e2e-windows-gce

Use /test all to run the following jobs:

  • pull-kubernetes-bazel-build
  • pull-kubernetes-bazel-test
  • pull-kubernetes-dependencies
  • pull-kubernetes-e2e-gce-ubuntu-containerd
  • pull-kubernetes-integration
  • pull-kubernetes-e2e-kind
  • pull-kubernetes-e2e-kind-ipv6
  • pull-kubernetes-conformance-kind-ga-only-parallel
  • pull-kubernetes-node-e2e
  • pull-kubernetes-e2e-gce-100-performance
  • pull-kubernetes-e2e-azure-disk-windows
  • pull-kubernetes-e2e-azure-file-windows
  • pull-kubernetes-typecheck
  • pull-kubernetes-verify

In response to this:

/test
[intentionally triggering list of available tests]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@BenTheElder
Member

I think https://github.com/kubernetes/test-infra/blob/b40bcac24d0ec96cc7d440ca5ece690cce7a62fe/config/jobs/kubernetes/sig-scalability/sig-scalability-presubmit-jobs.yaml#L67 is what you want; there are some optional scale presubmits.

On second thought though, even if that were the right one, we should check with someone from the scale team before using these resources 🙃

@caesarxuchao
Contributor Author

Thanks, @BenTheElder! The pull-kubernetes-e2e-gce-large-performance job uses 2000 nodes, so that seems to be the right test. I'll check with SIG Scalability.

@jkaniuk
Contributor

jkaniuk commented Oct 29, 2020

Seems relatively harmless from a scalability point of view.

I'm fine with running the 2k-node presubmit to verify, though our approval is not needed:
/test pull-kubernetes-e2e-gce-large-performance
(fingers crossed, it's been a while since I've run this)

@tosi3k - for visibility as next oncaller

@liggitt
Member

liggitt commented Mar 18, 2021

Backport to 1.18 is riskier, as it is using v0.0.0-20191004110552-13f9640d40b9, not sure if a backport will introduce incompatible changes.

I think we should backport to 1.18, given:

  • how unpredictable the stuck connection issue is
  • how many components it affects
  • how problematic a stuck connection is for informer clients
  • that we've successfully used these sys and net levels in 1.19 for O(months) now
  • that other consumers have successfully backported these changes to 1.18 and used them in production

I opened #100376 to pick this to 1.18

@deads2k
Contributor

deads2k commented Mar 24, 2021

Given the consequences when this fails, I'm in favor of backporting this change if we can get an ack (ideally from @caesarxuchao) that the backport is fairly safe. My knowledge of the golang stack at this depth is fairly limited.

Is there a particular minimum golang level that is required or was it just the golang.org/x/net level?

@liggitt
Member

liggitt commented Mar 24, 2021

Given the consequences when this fails, I'm in favor of backporting this change if we can get an ack (ideally from @caesarxuchao) that the backport is fairly safe.

@caesarxuchao acked the backport PR (#100376)

Is there a particular minimum golang level that is required or was it just the golang.org/x/net level?

It was contained to golang.org/x/{net,sys}

k8s-ci-robot added a commit that referenced this pull request Mar 25, 2021
Cherrypick #95981 to 1.18, Enables HTTP/2 health check
christarazi added a commit to christarazi/cilium that referenced this pull request Mar 31, 2022
Following up from cilium#19259, this
commit implements the same logic as kubelet [1] to close all idle
connections. We can rely on HTTP/2 to determine which connections are
idle (HTTP/2 pings), instead of closing all connections indiscriminately
(when the heartbeat fails) via the custom dialer in setDialer().

We can assume HTTP/2 support as it has been supported since K8s 1.20
[2]. Users can disable HTTP/2 by setting the DISABLE_HTTP2 environment
variable; however, this is only really meant for very advanced users,
hence it is not documented. Kubernetes has been running with HTTP/2 as
the default for a long time (~since Dec 2020) without major issues.

CloseIdleConnectionsFor() was introduced in K8s 1.23 [3], hence this commit
cannot be backported to any earlier version of Cilium which doesn't
support K8s 1.23.

[1]: kubernetes/kubernetes#104844
[2]: kubernetes/kubernetes#95981
[3]: kubernetes/kubernetes@b9d865a

Suggested-by: André Martins <[email protected]>
Signed-off-by: Chris Tarazi <[email protected]>
pchaigno pushed a commit to cilium/cilium that referenced this pull request Apr 4, 2022
pchaigno pushed a commit to pchaigno/cilium that referenced this pull request Apr 5, 2022
pchaigno pushed a commit to cilium/cilium that referenced this pull request Apr 6, 2022
serathius pushed a commit to serathius/kubernetes that referenced this pull request Mar 14, 2024
The env vars were implemented in kubernetes#95981.
Effectively no impacts if using the default values here.

BUG=255296578

Change-Id: I2441ef1a181a0fb202af7893c25cff5ff15ef17a
hoskeri pushed a commit to hoskeri/kubernetes that referenced this pull request Jul 23, 2024
The env vars were implemented in kubernetes#95981.
Effectively no impacts if using the default values here.

BUG=255296578

Change-Id: I2441ef1a181a0fb202af7893c25cff5ff15ef17a

Remove GODEBUG=x509sha1=1 workaround

Bug: 227456358a

Revert changes made in I5165e6c2fe73e8e1b2a617ced591133228b6d275

Change-Id: I7dd1ca8f9796c542c89074703e44e6b6f4b6edab
hoskeri pushed a commit to hoskeri/kubernetes that referenced this pull request Jul 23, 2024
aojea pushed a commit to aojea/kubernetes that referenced this pull request Jul 24, 2025
kube-apiserver: Change kube-apiserver liveness probe config setup
Prior OSS attempt: kubernetes#94076
Bug: b/164955956
Change-Id: Ia2f75ac5e97f4eb6aa4dc9b00fc2bb3112db9360

kube-apiserver: Set profiling false for kube-apiserver
Change-Id: I052a53d72acfc3c5aa8d03f695beb93cadc0be87

kube-apiserver: allow adding memory limit to kube-apiserver via the KUBE_APISERVER_MEMORY_LIMIT environment variable
bug: 181991030
Change-Id: I1a01766e6f2415eba31d6d865f7f39d7855f969e

kube-apiserver: Drop removed insecure --address and --insecure-port flags
GKE config script version of kubernetes#106859
Bug: 209962139
Change-Id: I551cfc12fd94a927383fcff89013a3a78fb61592

kube-apiserver: Readd KUBE_APISERVER_EVENT_TTL_SEC env for kube-apiserver --event-ttl
Bug: b/176806841
Change-Id: I5c651a555287af0e891e9592dfb7c172b6bb24a2

kube-apiserver: Re-enable the deprecated behavior of CommonName on X.509 as a diagnoser to detect the fleet level impact on the deprecation
Change-Id: If4962b6ecb3294878bfa37ab0ebcd4441fae7140

kube-apiserver: Update permissions of /etc/srv/kubernetes/abac-authz-policy.jsonl if kube-apiserver is running as non-root.
BUG=199669294
Change-Id: Ic9ae9add0a35b7df99781e470c917b2427bee289

kube-apiserver: Remove reference to deleted target-ram-mb flag
Change-Id: I57d0c80eb2285ec9d93b075b1845f1805e0fd748

kube-apiserver: Re-enable SHA1 signatures with GODEBUG=x509sha1=1
Bug: b/227456358
Bug: b/226424430
Change-Id: I5165e6c2fe73e8e1b2a617ced591133228b6d275

kube-apiserver: use --api-audiences as --service-account-api-audiences is deprecated

Copy of kubernetes#103078

Change-Id: I88c2f2eb8bde4378b115e01cbbe9700c27f03955

Expose UDS with profiling data for kube-apiserver.

Added in kubernetes#114191.

See go/no-gke-tcp-pprof for policy.

Refs b/273485199

Change-Id: Ic772c9249468fdf7f516d62bd2a0bfbfe933c1bb

Allow specifying terminationGracePeriodSeconds for kube-apiserver.

Refs b/252987333

Change-Id: I20699501f429630fe74531cb086091eb9ed3611c

Run kube-apiserver with cloud-provider=external

Bug: 299159412
Change-Id: I8db8e72377f1c63a3874ee9f0567be264c40ba58

Allow specifying timeout seconds for HTTP/2 health check

The env vars were implemented in kubernetes#95981.
Effectively no impacts if using the default values here.

BUG=255296578

Change-Id: I2441ef1a181a0fb202af7893c25cff5ff15ef17a

Remove GODEBUG=x509sha1=1 workaround

Bug: 227456358a

Revert changes made in I5165e6c2fe73e8e1b2a617ced591133228b6d275

Change-Id: I7dd1ca8f9796c542c89074703e44e6b6f4b6edab

Use structured authorization configuration

Bug: b/333034514
Bug: b/172290849
Bug: b/175273696
Bug: b/332956373
Change-Id: If16e5dd6f9a4da93fb8623e27a9b9fda959eea83

gke/cluster: make kube-apiserver explicitly set enable-logs-handler

Preserves existing implicit behavior with 1.31 default changing in kubernetes#125787

Change-Id: I6380b46cef32c7aba4ea7c0ea7e014e18c8f09d0