
Goroutine Leak on Thanos Receive (Ingestor&Router) #8557


Description

@xvzf

Thanos, Prometheus and Golang version used:
Thanos: v0.39.2 (judging from historical data, we've seen this behavior on previous releases as well)

Object Storage Provider:
Azure (should not be relevant)

What happened:
We see an exponential increase in active goroutines, probably linked to a mutex within the receive router service. The growth in ingestor goroutines resets whenever the router deployments are restarted, so it is most likely caused by the routers.
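
The growth itself is easy to confirm from the standard go_goroutines metric that the receive pods already export. Below is a minimal, hedged sketch using the Prometheus Go client; the Prometheus address and the job label selector are assumptions about our setup, not anything defined by Thanos. Querying this repeatedly (or graphing it) is how the climb and the reset after router restarts can be tracked.

package main

// Hedged sketch: query go_goroutines for the receive pods via the Prometheus
// HTTP API. The Prometheus address and the job label selector below are
// assumptions about the local setup.

import (
	"context"
	"fmt"
	"time"

	"github.com/prometheus/client_golang/api"
	v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	client, err := api.NewClient(api.Config{Address: "http://localhost:9090"}) // hypothetical Prometheus address
	if err != nil {
		panic(err)
	}
	promAPI := v1.NewAPI(client)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Per-pod goroutine counts for the receive components (label selector is
	// an assumption about how the pods are scraped in this cluster).
	query := `go_goroutines{job=~"thanos-receive.*"}`
	result, warnings, err := promAPI.Query(ctx, query, time.Now())
	if err != nil {
		panic(err)
	}
	if len(warnings) > 0 {
		fmt.Println("warnings:", warnings)
	}
	fmt.Println(result)
}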

Here's an example goroutine:

goroutine 1838951702 [sync.Cond.Wait, 24 minutes]:
sync.runtime_notifyListWait(0xc000d93d50, 0x0)
	/go/pkg/mod/golang.org/[email protected]/src/runtime/sema.go:597 +0x159
sync.(*Cond).Wait(0xc005790f98?)
	/go/pkg/mod/golang.org/[email protected]/src/sync/cond.go:71 +0x85
google.golang.org/grpc/internal/transport.(*http2Client).keepalive(0xc004116488)
	/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:1710 +0x225
created by google.golang.org/grpc/internal/transport.newHTTP2Client in goroutine 1838951656
	/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:399 +0x1dab
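
For context on that stack: every established gRPC client transport runs a keepalive goroutine, and it only exits once the connection is closed, so client connections that get created but never closed each leave one of these goroutines parked on a condition variable. Below is a minimal standalone sketch of that mechanism (plain grpc-go, not Thanos code; the local listener and keepalive settings are made up for illustration). Running it shows the goroutine count rising while the 50 connections are open and falling again once they are closed. This only illustrates the general mechanism; it is not a claim about where in the receive path the unclosed connections come from.

package main

// Minimal standalone sketch (not Thanos code): every established gRPC client
// transport owns a keepalive goroutine that only exits when the connection is
// closed. Creating connections without ever closing them accumulates
// goroutines parked like the one in the dump above.

import (
	"fmt"
	"net"
	"runtime"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// Throwaway in-process gRPC server so the clients have something to dial.
	lis, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	srv := grpc.NewServer()
	go srv.Serve(lis)
	defer srv.Stop()

	before := runtime.NumGoroutine()

	conns := make([]*grpc.ClientConn, 0, 50)
	for i := 0; i < 50; i++ {
		conn, err := grpc.NewClient(lis.Addr().String(),
			grpc.WithTransportCredentials(insecure.NewCredentials()),
			// Keepalive settings are arbitrary; they just ensure the
			// keepalive goroutine is running on each transport.
			grpc.WithKeepaliveParams(keepalive.ClientParameters{Time: 10 * time.Second}),
		)
		if err != nil {
			panic(err)
		}
		conn.Connect() // force the transport (and its keepalive goroutine) to start
		conns = append(conns, conn)
	}
	time.Sleep(2 * time.Second) // give the transports time to come up

	fmt.Printf("goroutines: before=%d, with 50 open connections=%d\n", before, runtime.NumGoroutine())

	for _, c := range conns {
		c.Close() // closing the connection is what lets the keepalive goroutine exit
	}
	time.Sleep(2 * time.Second)
	fmt.Printf("goroutines after closing: %d\n", runtime.NumGoroutine())
}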

How to reproduce it (as minimally and precisely as possible):
We have not found a consistent way to reproduce it yet, though it happens fairly frequently on our Thanos installation.
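
Since there is no reliable reproducer, one way to catch the next occurrence is to periodically pull the goroutine profile from a receive pod and group goroutines by the function they were started for; the entry functions whose counts keep growing point at the leak. A rough sketch follows, assuming /debug/pprof is reachable on the receiver's HTTP port; the URL is a hypothetical local port-forward. Two runs taken some time apart make it obvious which entry functions account for the growth.

package main

// Rough sketch: fetch /debug/pprof/goroutine?debug=1 once and count goroutines
// by their entry function. Re-running this over time shows which creation
// sites are growing. The target URL is an assumption (e.g. a port-forwarded
// Thanos Receive pod exposing its HTTP port locally).

import (
	"bufio"
	"fmt"
	"net/http"
	"sort"
	"strconv"
	"strings"
)

const target = "http://localhost:10902/debug/pprof/goroutine?debug=1" // hypothetical port-forward

func main() {
	resp, err := http.Get(target)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// debug=1 groups identical stacks as "N @ 0x... 0x...", followed by
	// "#\t0xADDR\tFUNC+0xOFF\tFILE:LINE" frames from innermost to outermost;
	// the last frame of a group is the goroutine's entry function.
	byEntry := map[string]int{}
	total := 0
	n := 0
	entry := ""
	flush := func() {
		if n > 0 && entry != "" {
			byEntry[entry] += n
			total += n
		}
		n, entry = 0, ""
	}

	sc := bufio.NewScanner(resp.Body)
	sc.Buffer(make([]byte, 0, 1<<20), 1<<20)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "#") {
			if f := strings.Fields(line); len(f) >= 3 {
				entry = f[2] // keep overwriting; the last frame of the group wins
			}
		} else if f := strings.Fields(line); len(f) >= 2 && f[1] == "@" {
			flush()
			n, _ = strconv.Atoi(f[0])
		}
	}
	flush()
	if err := sc.Err(); err != nil {
		panic(err)
	}

	type kv struct {
		entry string
		count int
	}
	sorted := make([]kv, 0, len(byEntry))
	for e, c := range byEntry {
		sorted = append(sorted, kv{e, c})
	}
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].count > sorted[j].count })

	fmt.Printf("total goroutines: %d\n", total)
	for _, s := range sorted {
		fmt.Printf("%8d  %s\n", s.count, s.entry)
	}
}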

Full logs to relevant components:
Logs look normal; there are no warnings or errors beyond what is expected.

Anything else we need to know:

goroutines-receive-ingestor.txt
goroutines-receive-router.txt


After rotating all receive routers (ingestors have not been restarted):
(screenshot: ingestor goroutine count resets)
