Increase default keepAliveTimeout to 65s #59203
Conversation
Agreed, this should have been raised a long time ago.
What are the http keep-alive timeouts in browsers, smartphones etc? Can you provide references?
I did a quick check and Chrome would default to 300s. @nodejs/tsc can you weigh in if we want to increase it up to 350s?
Fastify uses 72 seconds to avoid any issues with ALB/ELBs and other load balancers that expect 60s (having the same timeout is slightly problematic).
Increasing this exposes users to the possibility of DoS attacks via file descriptor exhaustion - just keeping the sockets open would prevent new connections from arriving.
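(For context, a minimal sketch of an existing knob that bounds this risk regardless of the default - `net.Server#maxConnections` - with an illustrative value, not a recommendation:)

```js
const http = require('node:http');

const server = http.createServer((req, res) => {
  res.end('ok');
});

// Once this many sockets are open, additional connections are rejected,
// which bounds file-descriptor consumption even if idle keep-alive sockets
// linger for a long time.
server.maxConnections = 10_000; // illustrative cap

server.listen(8080);
```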
Codecov Report: all modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main   #59203      +/-   ##
==========================================
+ Coverage   90.04%   90.06%   +0.02%
==========================================
  Files         648      648
  Lines      191041   191041
  Branches    37448    37455       +7
==========================================
+ Hits       172026   172065      +39
+ Misses      11651    11601      -50
- Partials     7364     7375      +11
```
Hey @mcollina, thanks for your feedback on the keep-alive timeout. I've chosen 60 seconds to align with defaults used by major cloud load balancers like AWS ELB/ALB, ensuring broad compatibility and preventing unexpected connection drops. Increasing the timeout to 72 seconds, as seen in Fastify and NGINX, could slightly boost connection reuse, but it also raises the risk of file descriptor exhaustion under DoS scenarios. Overall, 60 seconds provides a safer balance between interoperability and security for most environments. I'm open to discussing further if there's a strong case for a higher value.
@mcollina, I've resolved this issue and removed the breaking change alert regarding http.Server.keepAliveTimeout. The documentation now reflects the correct status, in line with the release team's guidance. Please review my pull request when you have a chance. Thank you!
I've seen 60 seconds cause problems due to network delays - I've actually seen this happen. Going over 60 seconds is necessary to account for that.
Okay, sure - that makes sense. Is 72 seconds sufficient to account for those network delays, or do you recommend going even higher? Happy to adjust as needed based on your experience.
I literally picked 72 with no reasoning. We could do 65.
Thanks for clarifying. I'm fine with 65 seconds if that works better - happy to update the default to 65 to provide a bit more buffer over 60. Let me know if you have any concerns or if you think another value might be preferable.
@pras529 there are 4 failing tests and linting to be fixed, could you address them?
@mcollina Yes, I'll address the failing tests and linting issues - just need a bit of time to work through them.
I'm very happy to increase this as well, but it's worth noting that the description in the linked issue doesn't quite match the reality of what's going on. The example reproduction in #59193 specifically isn't correct - in fact the response there is delivered just fine AFAICT. The keepAliveTimeout
we're talking about here is how long the server will wait between requests, to keep a connection alive, when there's no response pending. In cases like that repro, it's not relevant: the response will be delivered to the client, and then the connection will close 5 seconds later (11 seconds after the initial connection started, in that example).
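(A minimal sketch of that distinction using the public node:http API; the 6-second handler delay below is an illustrative stand-in for the slow response in that repro:)

```js
const http = require('node:http');

const server = http.createServer((req, res) => {
  // A response slower than keepAliveTimeout is still delivered in full:
  // the keep-alive timer only counts idle time *between* requests, after the
  // previous response has finished, not time spent producing a response.
  setTimeout(() => res.end('done'), 6_000);
});

// With the current 5 s default, the socket closes 5 s after the response
// above finishes (if no further request arrives), i.e. roughly 11 s after
// the connection opened. This PR only widens that idle window to 65 s.
server.listen(8080);
```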
I've done some quick research, there's a wide split in defaults elsewhere:
- Reverse proxies & standalone server tools generally all have long multi-minute keep-alive timeouts, e.g.:
  - Nginx uses 75 seconds
  - Caddy uses 300 seconds
  - Traefik uses 180 seconds
  - AWS ELBs use 60 seconds
  - Google Cloud uses 620 seconds
  - Cloudflare uses 900 seconds
- Meanwhile, most other programming languages and backends tend to use fairly short keep-alive timeouts:
  - Uvicorn defaults to 5 seconds
  - Gunicorn defaults to 2 seconds
  - Puma defaults to 20 seconds
  - Apache defaults to 5 seconds
  - Jetty defaults to 30 seconds
> I did a quick check and Chrome would default to 300s.
I don't think browsers' configuration matters as much as LBs & reverse proxies. Given an RST when reusing an idle connection, they'll retry transparently without exposing the error (I've just tested Chrome & Firefox and confirmed this), and I've seen them apply heuristics and various connection-pooling games to manage keep-alive and idle sockets in the past - it's not a fixed value.
Of course, it's still nice for performance to keep the connection open, but in browser-facing performance-sensitive cases you'll probably want HTTP/2 anyway, which this PR doesn't touch.
The only case where this is going to cause an actual problem (e.g. those 502 errors) is when a) the client doesn't gracefully handle this (i.e. it's not a browser) and b) you hit the race condition where the server closes the socket, but the client sends the next request before it processes the FIN packet.
Unless the Node timeout is greater than the LB timeout (i.e. many minutes), that race condition is always possible, but any increase will exponentially decrease the probability of this race happening (particularly on production sites, where I'd strongly expect most LB->backend connections are rarely idle for more than a few seconds anyway).
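(A sketch of the usual alignment under those assumptions - a load balancer with a 60 s idle timeout in front of Node; the constant names here are made up purely for illustration:)

```js
const http = require('node:http');

// Assumption for illustration: the load balancer in front of this server
// closes idle connections after 60 s (an ALB-style default).
const LB_IDLE_TIMEOUT_MS = 60_000;

const server = http.createServer((req, res) => res.end('ok'));

// Keep the backend's idle timeout above the LB's so the LB closes first and
// the backend never FINs a socket the LB is just about to reuse.
server.keepAliveTimeout = LB_IDLE_TIMEOUT_MS + 5_000; // 65 s, the proposed default

// headersTimeout is commonly kept slightly above keepAliveTimeout so a request
// whose headers arrive just as the idle timer fires is not rejected.
server.headersTimeout = server.keepAliveTimeout + 1_000;

server.listen(8080);
```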
Anyway, that's a long way of saying: given the above I think boosting this to 30+ seconds is a good idea, happy to go with 65 seconds as a balanced option 👍 but I don't think we should worry about going beyond that - we're balancing protection against resource usage vs race condition resistance, and there's diminishing returns.
Thanks for referencing this and working on a fix!
Force-pushed from e59f1f3 to 695d5d0
Code all looks good to me 👍
The first commit message is failing the linting - you'll need to use git to modify that. It's probably best to just squash all your commits and change the resulting commit message to describe the end result of the whole change here, because that message is what will be shown in the CHANGELOG when this is released.
http: Requested changes addressed and resolved
http: Increase default keepAliveTimeout to 65s
http: explicitly set keepAliveTimeout to 65s in tests
http: increase keepAliveTimeout default

Increase the default keepAliveTimeout to 65 seconds in the http module. Tests have been updated to explicitly set the new timeout value. All requested changes have been addressed and resolved.

PR-URL: nodejs#59203
Force-pushed from 695d5d0 to 33c1556
Hello @mcollina, Thank you for your review. I've addressed the changes you requested and updated the commit message to fully comply with our guidelines. I'm still a bit uncertain about the Linux and macOS test results - could you please take a look and let me know if anything else needs adjustment? Thanks for your help!
one more nit, sorry.
I'm on the fence about whether to consider this a breaking change. If so, this change will probably go out in v25. @pimterry wdyt?
I think 65 is a much more reasonable default than 5. Apache is the only one I found that reduced it, and that was with a particular threading model (open idle connections take a thread and are more expensive) at the time.
The nginx 75s keepalive timeout sounds a lot more reasonable to me ( https://nginx.org/en/docs/http/ngx_http_core_module.html )
Especially since reverse proxying Node behind nginx is common so I suspect gateway timeouts are needlessly common now for people using node:http directly.
Increase the default keepAliveTimeout to 65 seconds in the http module. Tests have been updated to explicitly set the new timeout value.

PR-URL: nodejs#59203
Force-pushed from 33c1556 to b33c716
@benjamingr any opinion on semversiness?
I think it could be semver minor. This will change some behaviour (more connection reuse, subtly different default response headers, could very mildly increase resource usage in scenarios with many keep-alive-but-no-requests clients) but that's all quite limited, and nothing that would break any reasonable code I can imagine.
I don't mind going to 75s if people prefer, but I don't think fine-tuning matters much in practice: the important thing for actual failures (as opposed to peak performance) is the relative difference between the race condition window (the client sending a packet before receiving the FIN, roughly one RTT, say 5ms) and the total timeout. Node closing before the LB isn't a big problem, as long as the LB isn't sending a request at exactly the same time. With this change, to hit an error a client needs to send a first request, wait exactly 64.995 seconds, and then send another request just before the FIN comes in.

For any likely traffic distribution, I think the odds go down exponentially as the timeout goes up, so at 65s this is already extremely unlikely to ever appear in practice. Meanwhile resource usage (no threads for us, but still handles plus some memory at least) goes up linearly. For failures specifically I don't think tuning individual seconds matters much here (for maximising performance in some envs fine-tuning definitely does matter, but it's env-specific and I'd expect 60s+ to already deliver pretty good results for typical Node deployments).
Summary
This PR increases the default value of http.Server.keepAliveTimeout from 5000ms (5s) to 65000ms (65s). This change aligns Node.js with the keep-alive timeout expectations of most browsers, proxies, and mobile clients, reducing the risk of subtle long-lived connection failures due to premature socket closure.

Details
The keepAliveTimeout default is now set to 65 seconds (65000ms) instead of the previous 5 seconds (5000ms). All relevant documentation and code comments have been updated to reflect the new default.

Motivation
The previous default of 5 seconds is shorter than the defaults used by common infrastructure, which can lead to unexpected connection terminations or failures. Setting the default to 65 seconds brings Node.js in line with industry standards and improves interoperability.

References
Please review carefully as this changes default server behavior for all applications that do not explicitly set keepAliveTimeout.
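For applications that prefer the previous behaviour, the old value can still be pinned explicitly; a minimal sketch, assuming no other server options need to change:

```js
const http = require('node:http');

const server = http.createServer((req, res) => res.end('ok'));

// Applications that already set keepAliveTimeout are unaffected by the new
// default; anyone relying on the old behaviour can pin it explicitly.
server.keepAliveTimeout = 5_000; // previous default

server.listen(8080);
```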