Conversation

@twz123
Contributor

@twz123 twz123 commented Dec 20, 2025

The health check goroutine could outlive a call to ClientConn.close(). Add a done channel that will be waited on when closing the transport.

See:

RELEASE NOTES:

  • Closing a client connection will now block until the health check goroutine completes.
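
As a rough illustration of the pattern described above (not the actual clientconn.go change), the sketch below shows a close path that blocks on a done channel until a background goroutine has returned. The conn type, newConn, and the exited field are hypothetical names invented for this example.

```go
package main

import (
	"context"
	"fmt"
)

// conn is a hypothetical stand-in for a connection that owns a background
// health check goroutine. It is not the real addrConn.
type conn struct {
	cancel context.CancelFunc
	done   chan struct{} // closed when the goroutine returns
	exited bool          // written by the goroutine before done is closed
}

func newConn() *conn {
	ctx, cancel := context.WithCancel(context.Background())
	c := &conn{cancel: cancel, done: make(chan struct{})}
	go func() {
		defer close(c.done) // runs last, after exited is set
		<-ctx.Done()        // stands in for the health check loop
		c.exited = true
	}()
	return c
}

// Close cancels the goroutine and blocks until it has returned.
func (c *conn) Close() {
	c.cancel()
	<-c.done
}

func main() {
	c := newConn()
	c.Close()
	// Safe to read: the close of c.done happens after the write to exited.
	fmt.Println("health check goroutine exited:", c.exited)
}
```

Without the receive on done in Close, the goroutine could still be running after Close returns, which is the situation the description refers to.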

@codecov

codecov bot commented Dec 20, 2025

Codecov Report

❌ Patch coverage is 68.42105% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.29%. Comparing base (319a0fa) to head (599fb48).
⚠️ Report is 49 commits behind head on master.

Files with missing lines | Patch % | Lines
clientconn.go | 68.42% | 5 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8783      +/-   ##
==========================================
- Coverage   83.33%   83.29%   -0.04%     
==========================================
  Files         418      418              
  Lines       32910    32921      +11     
==========================================
- Hits        27424    27421       -3     
- Misses       4088     4095       +7     
- Partials     1398     1405       +7     
Files with missing lines | Coverage Δ
clientconn.go | 89.98% <68.42%> (-0.46%) ⬇️

... and 27 files with indirect coverage changes

@mbissa mbissa assigned mbissa and eshitachandwani and unassigned mbissa Dec 24, 2025
@eshitachandwani eshitachandwani added this to the 1.79 Release milestone Jan 2, 2026
clientconn.go Outdated
func (ac *addrConn) createTransport(ctx context.Context, addr resolver.Address, copts transport.ConnectOptions, connectDeadline time.Time) error {
addr.ServerName = ac.cc.getServerName(addr)

var healthCheckDone <-chan struct{}
Member

I think healthCheckCh and healthCheckDoneCh might be slightly better variable names to indicate that these are channels.

Contributor Author

Done.


ac.mu.Lock()
defer ac.mu.Unlock()
defer func() {
Member

Can you add a small note stating where healthCheckComplete is set and why we wait on healthCheckComplete instead of healthCheckDone?

Contributor Author

Added some comments. Hope they're helpful.

The health check goroutine could outlive a call to ClientConn.close().
Add a done channel that will be waited on when closing the transport.

RELEASE NOTES:
- Closing a client connection will now block until the health check
  goroutine completes.

Signed-off-by: Tom Wieczorek <[email protected]>
@twz123 twz123 force-pushed the clientconn-wait-for-healthcheck branch from bfd0214 to 599fb48 Compare January 5, 2026 09:05
@github-actions

This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

Member

@eshitachandwani eshitachandwani left a comment

Adding @easwars as a second reviewer

func (ac *addrConn) createTransport(ctx context.Context, addr resolver.Address, copts transport.ConnectOptions, connectDeadline time.Time) error {
addr.ServerName = ac.cc.getServerName(addr)

var healthCheckDoneCh <-chan struct{}
Contributor

Instead of having two channels (where one is a copy of another, and we depend on nil checks to figure out if the associated event is something that will happen and therefore we should wait for it), what do you think about the following approach, which uses the grpcsync.Event type defined here:

type Event struct {

The grpcsync.Event is simply a wrapper around a channel and a boolean. So, you can check if the event has fired with one of its methods and you can wait for an event to fire with another of its methods.
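
For readers unfamiliar with this kind of type, here is a minimal sketch of a one-shot event. It is a local stand-in written for illustration, not grpcsync.Event itself (which lives in an internal gRPC-Go package); the names event, newEvent, fire, done, and hasFired are assumptions. This version uses a sync.Once and a channel rather than a channel and a boolean, and answers "has it fired?" with a non-blocking receive.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a hypothetical one-shot event: it can be fired once, checked
// without blocking, and waited on.
type event struct {
	once sync.Once
	c    chan struct{}
}

func newEvent() *event { return &event{c: make(chan struct{})} }

// fire marks the event as fired. Calling it again is a no-op.
func (e *event) fire() { e.once.Do(func() { close(e.c) }) }

// done returns a channel that is closed once the event has fired.
func (e *event) done() <-chan struct{} { return e.c }

// hasFired reports, without blocking, whether fire has been called.
func (e *event) hasFired() bool {
	select {
	case <-e.c:
		return true
	default:
		return false
	}
}

func main() {
	e := newEvent()
	fmt.Println(e.hasFired()) // false: nothing has fired it yet
	go e.fire()
	<-e.done() // blocks until the goroutine fires the event
	fmt.Println(e.hasFired()) // true
}
```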

  • Make startHealthCheck return two grpcsync.Events
    • One for the health check goroutine started event
      • startHealthCheck will fire this event right before starting the health check goroutine
    • Another for the health check goroutine completed event
      • startHealthCheck will fire this event as a deferred statement from inside the health check goroutine
  • In createTransport
    • In a defer, we can use the first of the above two events to check whether the health check goroutine was started, and if it was not started at all, we can cancel the hctx
    • In onClose, we will wait for the health check goroutine completed event if and only if the health check goroutine was started.

Let me know what you think about this approach. Thanks.
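
The approach proposed in this comment could look roughly like the sketch below. It is a simplified, self-contained model and not the gRPC-Go code: the event type is a local stand-in for grpcsync.Event, startHealthCheck takes a made-up enabled flag to decide whether a goroutine is launched, and run collapses the createTransport and onClose steps into one hypothetical function.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// event is a local stand-in for a one-shot event type.
type event struct {
	once sync.Once
	c    chan struct{}
}

func newEvent() *event                 { return &event{c: make(chan struct{})} }
func (e *event) fire()                 { e.once.Do(func() { close(e.c) }) }
func (e *event) done() <-chan struct{} { return e.c }
func (e *event) hasFired() bool {
	select {
	case <-e.c:
		return true
	default:
		return false
	}
}

// startHealthCheck (hypothetical signature) returns a "started" event, fired
// right before the goroutine is launched, and a "completed" event, fired from
// a defer inside the goroutine. If enabled is false, neither event fires.
func startHealthCheck(ctx context.Context, enabled bool) (started, completed *event) {
	started, completed = newEvent(), newEvent()
	if !enabled {
		return started, completed
	}
	started.fire()
	go func() {
		defer completed.fire()
		<-ctx.Done() // stands in for the health check loop
	}()
	return started, completed
}

// run models both steps and reports whether the close path had to wait.
func run(enabled bool) (waited bool) {
	hctx, hcancel := context.WithCancel(context.Background())
	started, completed := startHealthCheck(hctx, enabled)

	if !started.hasFired() {
		// createTransport side: no goroutine was started, so release hctx.
		hcancel()
		return false
	}
	// onClose side: stop the goroutine and wait for it to complete.
	hcancel()
	<-completed.done()
	return true
}

func main() {
	fmt.Println(run(true), run(false)) // true false
}
```

The point of the two events is that the close path never needs a nil check: it asks the started event whether there is anything to wait for, and only then blocks on the completed event.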

@github-actions

github-actions bot commented Feb 3, 2026

This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions github-actions bot added stale and removed stale labels Feb 3, 2026
@twz123
Contributor Author

twz123 commented Feb 3, 2026

Will revisit this next week, hopefully.
