-
Notifications
You must be signed in to change notification settings - Fork 4.6k
clientconn: Wait for all goroutines on close #8666
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #8666 +/- ##
==========================================
+ Coverage 83.21% 83.22% +0.01%
==========================================
Files 419 419
Lines 32427 32468 +41
==========================================
+ Hits 26985 27023 +38
- Misses 4054 4058 +4
+ Partials 1388 1387 -1
🚀 New features to boost your workflow:
|
balancer_wrapper.go
Outdated
| healthData *healthData | ||
|
|
||
| shutdownMu sync.Mutex | ||
| shutdown chan struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you change this to shutdownCh to indicate it is a channel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
balancer_wrapper.go
Outdated
|
|
||
| func (acbw *acBalancerWrapper) UpdateAddresses(addrs []resolver.Address) { | ||
| acbw.ac.updateAddrs(addrs) | ||
| acbw.goFunc(func(shutdown <-chan struct{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we changed this to run the acbw.ac.updateAddrs(addrs) function in a go routine??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's basically a "bubbled-up" goroutine. Previously, the goroutine was spawned in updateAddrs itself (line 1021). But, as we now need to track those, I figured it would be most appropriate to do it here. Another option would be to somehow push this down into updateAddrs itself, by passing the acBalancerWrapper pointer, or a function pointer to acbw.goFunc or sth. along those lines, and then use that to spawn the goroutine there:
| acbw.goFunc(func(shutdown <-chan struct{}) { | |
| acbw.ac.updateAddrs(acbw, addrs) |
Then we could write line 1021 of updateAddrs like so:
acbw.goFunc(ac.resetTransportAndUnlock)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it this way. Seems to be clearer.
| ac.mu.Lock() | ||
| defer ac.mu.Unlock() | ||
| acMu := &ac.mu | ||
| acMu.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason the original code did not work? this seems unnecessarily complicated and does the same thing. Or am I missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a way to make the defer do the unlock conditionally, as the code might need to unlock it before returning (see lines 1441 and 1442). We can achieve the same e.g. with a boolean variable, if you prefer.
I think using a pointer for this is good, because, if you forget the nil check, it will panic with an easy to understand stack trace. Whereas if you forget to check a boolean, you'd do a double-unlock and there's a chance that you get weirder problems.
e45e9d7 to
e8bb2b4
Compare
| gsb.pendingSwaps.Add(1) | ||
| go func() { | ||
| defer gsb.pendingSwaps.Done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using the newly added Go method on the sync.WaitGroup to run goroutines on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would require a Go bump to 1.25. The go.mod file still states go 1.24.0.
| // balancerCurrent. | ||
| currentMu sync.Mutex | ||
|
|
||
| pendingSwaps sync.WaitGroup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name this something else? This is actually waiting for the swap, but for the swapped-out balancer to be closed. I can't think of something nice. If you can't come up with something nicer either, then a comment would be useful. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with a generic "activeGoroutines" and added a comment.
| cur := gsb.balancerCurrent | ||
| gsb.balancerCurrent = gsb.balancerPending | ||
| gsb.balancerPending = nil | ||
| gsb.pendingSwaps.Add(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you are here, would you mind adding a comment to balancerWrapper.UpdateState (which is the only place from which swap is called), where it is currently checking if the update if from the current or pending balancer, and returning early if that is not the case.
I'm talking about this check:
| if !bw.gsb.balancerCurrentOrPending(bw) { |
This check ensures that when there is a race between UpdateState and Close, and if Close wins that race and runs first, UpdateState does not actually swap balancers. This guarantees that swap does not spawn a goroutine after the gracefulswitch balancer is closed. So, if we add a comment near that check explaining this, that would be great.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to add something helpful there.
Three goroutines could outlive a call to ClientConn.close(). Add mechanics to cancel them and wait for them to complete when closing a client connection. RELEASE NOTES: - Closing a client connection will cancel all pending goroutines and block until they complete. Signed-off-by: Tom Wieczorek <[email protected]>
e8bb2b4 to
3fa606c
Compare
Three goroutines could outlive a call to ClientConn.close(). Add mechanics to cancel them and wait for them to complete when closing a client connection.
Fixes #8655.
RELEASE NOTES: