Retention service hangs and does not remove old shards #25054

Open
@gwossum

Description

Under certain conditions, the retention service can hang while waiting for a shard's reference count to drop to zero. While it is hung, it cannot remove any other shards, so expired shards accumulate and disk usage can eventually become high.

The attached goroutine trace shows a system exhibiting the issue. The retention service is stuck waiting on the WaitGroup that signals when all references to the shard have been released.
goroutine.txt
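
For readers unfamiliar with this failure mode, here is a minimal Go sketch of how a single unreleased reference can stall a sequential cleanup loop. It is not taken from the InfluxDB code base: `shard`, `acquire`, `release`, `waitUnreferenced`, `retentionPass`, and `demo` are hypothetical names used only for illustration. The timeout exists only so the demo terminates; the behavior described above corresponds to waiting with no timeout at all.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// shard is a hypothetical stand-in for a storage shard whose in-use
// references are tracked with a sync.WaitGroup.
type shard struct {
	id   int
	refs sync.WaitGroup
}

// acquire registers one reference to the shard.
func (s *shard) acquire() { s.refs.Add(1) }

// release drops one reference to the shard.
func (s *shard) release() { s.refs.Done() }

// waitUnreferenced reports whether the reference count reached zero within
// timeout. On timeout the helper goroutine stays blocked in Wait, which is
// acceptable for a short demo but would be a leak in a long-running process.
func (s *shard) waitUnreferenced(timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		s.refs.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

// retentionPass walks expired shards in order and "removes" each one once it
// is unreferenced. It returns the IDs removed and the ID of the shard it
// stalled on (-1 if none). A service that calls Wait directly would never
// return from the stalled shard, so later shards would never be examined.
func retentionPass(shards []*shard, timeout time.Duration) ([]int, int) {
	var removed []int
	for _, s := range shards {
		if !s.waitUnreferenced(timeout) {
			return removed, s.id
		}
		removed = append(removed, s.id)
	}
	return removed, -1
}

// demo builds three expired shards and leaks one reference on shard 2.
func demo() ([]int, int) {
	shards := []*shard{{id: 1}, {id: 2}, {id: 3}}

	shards[0].acquire()
	shards[0].release() // balanced: shard 1 can be removed

	shards[1].acquire() // never released: shard 2 can never be removed

	return retentionPass(shards, 50*time.Millisecond)
}

func main() {
	removed, stalled := demo()
	fmt.Printf("removed=%v stalledOn=%d\n", removed, stalled)
	// prints: removed=[1] stalledOn=2
}
```

In this sketch shard 3 has no references at all, yet it is never removed because the loop never gets past shard 2. That matches the symptom described above, where one stuck shard prevents all further removals.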
