Huge collection and performance for Qdrant cluster #7984

kimmy-github · 2026-01-26T03:51:49Z

kimmy-github
Jan 26, 2026

Recently, I ran into an issue in my Qdrant cluster. One of the collections has grown to nearly 400 GB. Every day at 2:00 a.m., I trigger a backup, and during that process the system reports warnings like these:

qdrant | 2026-01-25T18:03:54.811658Z WARN storage::content_manager::consensus_manager: Failed to apply collection meta operation entry with user error: Bad request: There is no transfer for shard 1 from 3903034538002768 to 7761821500842248
qdrant | 2026-01-25T18:04:03.460369Z WARN storage::content_manager::consensus_manager: Failed to send message to http://10.10.1.2:6335/ with error: Error in closure supplied to transport channel pool: status: Cancelled, message: "Timeout expired", details: [], metadata: MetadataMap { headers: {} }

After checking the monitoring data, I noticed that during the backup window the combined disk read and write throughput exceeds 800 MB/s, and disk I/O wait time can peak around 13 ms.
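As a rough sanity check on those numbers, the snippet below estimates a lower bound on how long the disk stays saturated. It is only a back-of-the-envelope sketch: it assumes the full 400 GB lives on one node, that a snapshot reads the data once and writes it once (the `io_amplification` factor of 2 is a guess, not a measured value), and that the 800 MB/s combined throughput is sustained for the whole run.

```python
def min_snapshot_seconds(data_gb: float,
                         combined_mb_per_s: float,
                         io_amplification: float = 2.0) -> float:
    """Lower bound on snapshot I/O time in seconds.

    data_gb           -- size of the data being snapshotted (decimal GB)
    combined_mb_per_s -- combined read + write throughput of the disk
    io_amplification  -- total bytes moved per byte of data; 2.0 assumes
                         one full read plus one full write (an assumption)
    """
    total_mb = data_gb * 1000 * io_amplification
    return total_mb / combined_mb_per_s


if __name__ == "__main__":
    seconds = min_snapshot_seconds(400, 800)
    print(f"at least {seconds / 60:.1f} minutes of saturated disk I/O")
```

Under these assumptions the disk is fully busy for at least a quarter of an hour, which is the window in which the consensus messages would have to compete for I/O.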

So I have a few questions. First, how can I check the timeout setting for this kind of consensus_manager—in other words, where is the timeout configured and how can I inspect it? Second, is a 400 GB collection considered too large for Qdrant in practice? Finally, are there any good approaches to optimize the system so backups don’t trigger these errors or timeouts?

generall · 2026-01-26T09:21:19Z

generall
Jan 26, 2026
Maintainer

What exactly do you call a backup? Are you using managed/hybrid cloud?

2 replies

kimmy-github Jan 26, 2026
Author

I use https://qdrant.tech/documentation/database-tutorials/create-snapshot/ to back up a 3-node Qdrant cluster deployed on our own Linux servers. We are not using managed/hybrid cloud.
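For context, a backup job along these lines might look roughly like the sketch below. It calls the collection snapshot endpoint (`POST /collections/{collection_name}/snapshots`) on each node in turn, so that only one node is under snapshot I/O at a time. The collection name and the second and third node addresses are hypothetical, port 6333 is assumed to be the REST port, and running the nodes one after another is an illustrative choice, not something recommended in this thread.

```python
import json
import urllib.parse
import urllib.request

REST_PORT = 6333  # assumed: Qdrant's default REST port


def snapshot_url(host: str, collection: str, port: int = REST_PORT) -> str:
    """Build the per-node URL for creating a collection snapshot."""
    name = urllib.parse.quote(collection, safe="")
    return f"http://{host}:{port}/collections/{name}/snapshots"


def create_snapshot(host: str, collection: str, timeout_s: float = 3600.0) -> dict:
    """Ask one node to snapshot its part of the collection; blocks until done."""
    req = urllib.request.Request(snapshot_url(host, collection), method="POST")
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.load(resp)["result"]


def snapshot_cluster(hosts: list[str], collection: str) -> dict:
    """Snapshot every node sequentially so the I/O load is not simultaneous."""
    return {host: create_snapshot(host, collection) for host in hosts}


if __name__ == "__main__":
    # Hypothetical node list and collection name; adjust to the real cluster.
    nodes = ["10.10.1.1", "10.10.1.2", "10.10.1.3"]
    for host, info in snapshot_cluster(nodes, "my_collection").items():
        print(host, info)
```

In a distributed deployment each node only snapshots the shards it holds, which is why the request is repeated per node.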

timvisee Jan 28, 2026
Maintainer

> After checking the monitoring data, I noticed that during the backup window the combined disk read and write throughput exceeds 800 MB/s, and disk I/O wait time can peak around 13 ms.

Given the amount of data, it makes sense that it'll be using a lot of I/O.

> Second, is a 400 GB collection considered too large for Qdrant in practice?
> Finally, are there any good approaches to optimize the system so backups don’t trigger these errors or timeouts?

No, that isn't too large for Qdrant, but it may be very large for Qdrant snapshots. I'd recommend using a filesystem-based snapshot technique instead, like I've described here.

timvisee · 2026-01-26T09:44:25Z

timvisee
Jan 26, 2026
Maintainer

I'm not sure yet what environment you run in, but on big deployments it's recommended to create a disk-level snapshot as a backup. That should be supported on all major cloud providers. Note that I do not mean a Qdrant snapshot, but a disk/filesystem-level snapshot.
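On self-hosted Linux servers, one possible way to get a disk-level snapshot is LVM. The sketch below only illustrates the idea and makes several assumptions: that the Qdrant storage directory sits on an LVM logical volume, that the volume group has free space for the snapshot, and that the names `vg0`, `qdrant`, and `qdrant-backup` are placeholders. It defaults to a dry run that prints the command instead of executing it. Whether a snapshot taken under write load restores cleanly is not covered in this thread, so test a restore before relying on it.

```python
import subprocess


def lvm_snapshot_cmd(volume_group: str, logical_volume: str,
                     snapshot_name: str, cow_size: str) -> list[str]:
    """Build an `lvcreate` command for a copy-on-write LVM snapshot.

    cow_size is the space reserved for blocks that change while the
    snapshot exists (for example "50G"); it is not the size of the data.
    """
    return [
        "lvcreate",
        "--snapshot",
        "--name", snapshot_name,
        "--size", cow_size,
        f"{volume_group}/{logical_volume}",
    ]


def take_snapshot(volume_group: str, logical_volume: str,
                  snapshot_name: str, cow_size: str,
                  dry_run: bool = True) -> list[str]:
    """Print the command, and run it (needs root) only when dry_run is False."""
    cmd = lvm_snapshot_cmd(volume_group, logical_volume, snapshot_name, cow_size)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Placeholder names; replace with the real volume group and volume.
    take_snapshot("vg0", "qdrant", "qdrant-backup", "50G")
```

The snapshot volume can then be mounted read-only and copied to backup storage at a throttled rate, which keeps the heavy I/O off the live data path.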

0 replies
