CASSGO-41 Deadlock in refreshDebouncer when reconnection fails #1752

### What version of Cassandra are you using?
astra-classic

### What version of Gocql are you using?
v1.6.0


### What version of Go are you using?
1.21

### What did you do?
Connection errors, which I think were caused by overload, led to frequent reconnection attempts and failures.

### What did you expect to see?
The driver should keep retrying until a connection succeeds.


### What did you see instead?
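A deadlock. The goroutine below has been blocked on a channel send in `(*refreshDebouncer).stop()` for 113 minutes: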

```
goroutine 1324045437 [chan send, 113 minutes]:
github.com/gocql/gocql.(*refreshDebouncer).stop(0xc0b826a7c0)
	/go/pkg/mod/github.com/gocql/gocql@v1.6.0/host_source.go:848 +0x8c
github.com/gocql/gocql.(*Session).Close(0xc03efb0c00)
	/go/pkg/mod/github.com/gocql/gocql@v1.6.0/session.go:494 +0x105
github.com/gocql/gocql.NewSession({{0xc24be58930, 0x3, 0x3}, {0x2ef55cf, 0x5}, 0x4, 0x12a05f200, 0x12a05f200, 0x0, 0x755a, ...})
	/go/pkg/mod/github.com/gocql/gocql@v1.6.0/session.go:180 +0x98d
github.com/gocql/gocql.(*ClusterConfig).CreateSession(...)
	/go/pkg/mod/github.com/gocql/gocql@v1.6.0/cluster.go:289
```

It looks like a race condition between `(*refreshDebouncer).stop()` and `(*refreshDebouncer).flusher()`:
1. `stop()` acquires `d.mu` and sets `d.stopped` to true
2. `flusher()` exits the `select` at the top of the loop and blocks on acquiring `d.mu`
3. `stop()` releases `d.mu` and tries to send on `d.quit`
4. `flusher()` acquires `d.mu` and returns because `d.stopped` is true
5. `stop()` blocks forever because `d.quit` is unbuffered and its only reader has returned
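
The interleaving above can be forced deterministically in a small standalone model. This is only a sketch, not the gocql source: the `debouncer` type, the `fire` channel (standing in for the real timer) and the `shutdown` helper are hypothetical names made up for illustration. It also tries one possible mitigation, closing `quit` instead of sending on it, which may or may not match whatever fix the project adopts.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// debouncer is a simplified, hypothetical model of the pattern in this issue.
// It is NOT the gocql implementation.
type debouncer struct {
	mu      sync.Mutex
	stopped bool
	fire    chan struct{} // assumption: stands in for the debounce timer channel
	quit    chan struct{} // unbuffered, as described in the issue
}

func newDebouncer() *debouncer {
	return &debouncer{fire: make(chan struct{}), quit: make(chan struct{})}
}

// flusher leaves the select, then checks stopped while holding the mutex.
func (d *debouncer) flusher() {
	for {
		select {
		case <-d.fire:
		case <-d.quit:
		}
		d.mu.Lock()
		if d.stopped {
			d.mu.Unlock()
			return // step 4: returns without ever reading from quit
		}
		d.mu.Unlock()
	}
}

// shutdown forces steps 1-5 in order and reports whether the final
// operation on quit (send or close) completed before the timeout.
func shutdown(useClose bool) bool {
	d := newDebouncer()
	go d.flusher()

	d.mu.Lock()          // step 1: stop() takes the lock...
	d.stopped = true     // ...and marks the debouncer as stopped
	d.fire <- struct{}{} // step 2: flusher leaves the select and waits for mu
	d.mu.Unlock()        // step 3: stop() releases the lock

	done := make(chan struct{})
	go func() {
		defer close(done)
		if useClose {
			close(d.quit) // closing never blocks, reader or not
		} else {
			d.quit <- struct{}{} // step 5: blocks, the reader has returned
		}
	}()

	select {
	case <-done:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("send on quit returns:", shutdown(false)) // false
	fmt.Println("close(quit) returns:", shutdown(true))   // true
}
```

In this model the send variant never returns because `flusher()` has already exited by the time the send happens, while `close(d.quit)` completes whether or not a reader is still present. Giving `quit` a buffer of one would also unblock `stop()` in the model, assuming it is only ever called once per debouncer.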
