Description
Version
v18.15.0
Platform
Running this in a Kubernetes cluster inside a nodejs:lts container. Details for the K8s node:

"nodeInfo": {
  "architecture": "amd64",
  "bootID": "a5847523-d870-4bc0-9f1a-7a4ed9885ca3",
  "containerRuntimeVersion": "docker://20.10.12",
  "kernelVersion": "3.10.0-1160.53.1.el7.x86_64",
  "kubeProxyVersion": "v1.21.12",
  "kubeletVersion": "v1.21.12",
  "machineID": "",
  "operatingSystem": "linux",
  "osImage": "Red Hat Enterprise Linux",
  "systemUUID": "290B2642-439A-2099-6F2B-7ED84D7C284B"
}
Subsystem
http or net
What steps will reproduce the bug?
Run non-stop HTTP/1.1 requests over a keepAlive connection pool, using node-fetch or axios, against any server that closes the connection from its side after N seconds. In my case the server closes after 20 seconds: there may be multiple requests over the connection, but once 20 seconds have elapsed it closes the connection after the last HTTP response is sent. I have a default Node.js client configuration (I haven't assigned it more threads or anything). When configuring the custom HTTP agent on the client side, I supplied { keepAlive: true, maxSockets: 50 }.
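A rough reproduction sketch (the target URL, the request pacing, and the server's 20-second idle close are assumptions specific to my setup):

```js
// repro.js - hammer a keep-alive pool against a server that closes idle
// connections after ~20 seconds (hypothetical URL and timings).
const http = require('http');
const axios = require('axios');

const agent = new http.Agent({ keepAlive: true, maxSockets: 50 });

async function loop() {
  for (;;) {
    try {
      await axios.get('http://my-server.internal/api/ping', { httpAgent: agent });
    } catch (err) {
      // ECONNRESET shows up here once traffic slows down and a request is
      // scheduled onto a socket the server has already half-closed.
      console.error(err.code, err.message);
    }
    // With a short delay everything is fine; a longer "lull" makes the next
    // request far more likely to land on a stale socket.
    await new Promise((resolve) => setTimeout(resolve, Math.random() * 5000));
  }
}

loop();
```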
How often does it reproduce? Is there a required condition?
When there are a lot of requests being sent constantly, things are fine; but if there is a slowdown (less work happening, hence fewer requests going out), the next request usually ends up getting an ECONNRESET.
Based on the TCP dump I have, when there is a good load of requests over the connection pool and the server sends a
[FIN, ACK], the client sends a [FIN, ACK], the server sends an ACK back, and the connection closes successfully.
But when there is a "lull" later and there are not enough requests over the pool, the server sends a [FIN, ACK] for an unused connection in the pool, the Node.js client responds with just an [ACK], and the next request in the queue goes out on that socket, causing the server to respond with a RESET. (Rightly so, because the server wanted to close the connection.)
Now I believe the reason the next request goes on the socket that just received the FIN probably has to do with the connection-choosing strategy. I think the default in both these frameworks is LIFO, and the ACK (without a FIN) that gets sent moves that connection to the top of the pool for the next request.
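For reference, this is the knob I mean by the connection-choosing strategy; the option names come from the Node http.Agent documentation, and the comments are my reading of the behaviour:

```js
const http = require('http');

// Default scheduling is 'lifo': the most recently released socket is reused
// first - which, during a lull, is exactly the socket most likely to have
// just received the server's FIN.
const lifoAgent = new http.Agent({ keepAlive: true, maxSockets: 50 });

// 'fifo' hands out the oldest free socket instead. It changes which socket
// the next request lands on but doesn't remove half-closed sockets from the
// pool, so the race is still there.
const fifoAgent = new http.Agent({
  keepAlive: true,
  maxSockets: 50,
  scheduling: 'fifo',
});
```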
What is the expected behavior? Why is that the expected behavior?
A socket closed from the server side (FIN, ACK sent by the server) must be removed from the connection pool instead of being kept in there, regardless of whether a FIN was sent back, and no future requests should go on it.
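A rough user-land approximation of what I'd expect, sketched as a workaround rather than a proposed fix (the `_finWatcherAttached` marker is something I made up for the example):

```js
const http = require('http');

const agent = new http.Agent({ keepAlive: true, maxSockets: 50 });

// When a socket is returned to the free pool, watch for the server's FIN
// ('end' on our side) and destroy the socket so the agent evicts it instead
// of handing it to the next request. This narrows the window but cannot
// close it completely - a FIN can still arrive after a request has already
// been assigned to the socket.
agent.on('free', (socket) => {
  if (!socket._finWatcherAttached) {
    socket._finWatcherAttached = true; // hypothetical marker property
    socket.once('end', () => socket.destroy());
  }
});
```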
What do you see instead?
The connection stays in the pool if the FIN wasn't sent back. The next request goes out on that connection, and the server forcibly closes it with a RESET.
Additional information
I tried a few other frameworks apart from node-fetch and axios (same issue, which makes me think it's a Node core problem), but I can't use them in my code so I'm not mentioning them.
When I reduced maxSockets from 50 to 20, the issue happened less frequently, which is why I think it is related to the amount of activity on those sockets. I switched to the keepaliveagent
package, which has a SocketTTL feature - it helps, but doesn't solve the problem. Resets still happen (same issue). It seems like this issue was reported there and they tried to handle it there (still a problem though). I'm assuming that issue has the same root cause as what I'm facing - they were using keepaliveagent
as well.
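Roughly what I configured for that, assuming "keepaliveagent" here refers to the agentkeepalive npm package and its freeSocketTimeout / socketActiveTTL options; the numbers are just my values, tuned to stay under the server's 20-second idle close:

```js
const Agent = require('agentkeepalive');

// Drop idle sockets well before the server's ~20s idle close so the agent
// rarely reuses a half-closed connection. This reduces the resets but does
// not eliminate them: a FIN can still arrive between socket selection and
// the write of the next request.
const agent = new Agent({
  keepAlive: true,
  maxSockets: 20,
  freeSocketTimeout: 15000, // shorter than the server's 20s idle timeout
  socketActiveTTL: 60000,   // recycle sockets even if they stay busy
});

// Used the same way as a core http.Agent, e.g. with axios:
// axios.get('http://my-server.internal/api/ping', { httpAgent: agent });
```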