Add IO concurrency limiting for shard cache reads by wolfv · Pull Request #2163 · conda/rattler

wolfv · 2026-03-04T20:48:14Z

Description

This PR adds a separate concurrency limiter for IO operations (specifically reading shard cache files from disk) to prevent exhausting the OS file-descriptor limit when many packages are queried concurrently.

Previously, only HTTP request concurrency was limited via max_concurrent_requests. When querying for many packages at once (e.g., with wildcard patterns), the gateway could attempt to open many shard cache files simultaneously, leading to file-descriptor exhaustion.

Changes:

- Added a `max_concurrent_io` field to `GatewayBuilder`, with builder methods `with_max_concurrent_io()` and `set_max_concurrent_io()`
- Added `io_concurrency_semaphore` to `GatewayInner` to track the IO concurrency limit
- Updated `ShardedSubdir` to acquire an IO semaphore permit before reading cached shard files from disk
- Updated `SubdirBuilder` to pass the IO semaphore to `ShardedSubdir` during construction
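
As a rough illustration of the `with_` / `set_` builder pair listed above, here is a minimal sketch. `SketchGatewayBuilder` and this `MaxConcurrency` enum are hypothetical stand-ins written for this example; they are not rattler's actual definitions, and the real signatures may differ.

```rust
/// Hypothetical stand-in for the concurrency setting (not rattler's type).
/// `Unlimited` is the default, matching the "unlimited when not configured"
/// behaviour described in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum MaxConcurrency {
    #[default]
    Unlimited,
    Limited(usize),
}

/// Toy builder that carries only the IO limit (an assumption for
/// illustration; the real `GatewayBuilder` has many more fields).
#[derive(Debug, Default)]
pub struct SketchGatewayBuilder {
    max_concurrent_io: MaxConcurrency,
}

impl SketchGatewayBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Consuming variant, convenient for call chaining.
    pub fn with_max_concurrent_io(mut self, max: MaxConcurrency) -> Self {
        self.set_max_concurrent_io(max);
        self
    }

    /// In-place variant, for when the builder is held by mutable reference.
    pub fn set_max_concurrent_io(&mut self, max: MaxConcurrency) -> &mut Self {
        self.max_concurrent_io = max;
        self
    }

    /// Accessor used here only to inspect the configured value.
    pub fn max_concurrent_io(&self) -> MaxConcurrency {
        self.max_concurrent_io
    }
}
```

Under these assumptions, `SketchGatewayBuilder::new().with_max_concurrent_io(MaxConcurrency::Limited(8))` yields a builder limited to eight concurrent IO operations, while a freshly constructed builder stays unlimited.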

The IO semaphore is acquired before opening shard cache files, ensuring that concurrent file operations are bounded independently from HTTP request concurrency.
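
The permit-before-read idea can be sketched without any dependencies. The PR itself works with async code and `tokio_fs::read`; the example below instead uses OS threads and a small hand-written blocking semaphore, purely as an assumption to keep it self-contained. `IoSemaphore`, `IoPermit`, and `max_observed_concurrency` are hypothetical names, and a short sleep stands in for reading a shard file.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal blocking counting semaphore (illustrative, std only).
struct IoSemaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

/// RAII guard: the permit is returned when this value is dropped.
struct IoPermit<'a> {
    sem: &'a IoSemaphore,
}

impl IoSemaphore {
    fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Blocks until a permit is free, then takes it.
    fn acquire(&self) -> IoPermit<'_> {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
        IoPermit { sem: self }
    }
}

impl Drop for IoPermit<'_> {
    fn drop(&mut self) {
        *self.sem.permits.lock().unwrap() += 1;
        self.sem.available.notify_one();
    }
}

/// Runs `tasks` simulated cache reads gated by `limit` permits and returns
/// the highest number of reads observed in flight at the same time.
/// `limit` must be at least 1, otherwise every task blocks forever.
pub fn max_observed_concurrency(limit: usize, tasks: usize) -> usize {
    let sem = Arc::new(IoSemaphore::new(limit));
    let in_flight = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let sem = Arc::clone(&sem);
            let in_flight = Arc::clone(&in_flight);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                // Acquire the permit *before* touching the (simulated) file.
                let _permit = sem.acquire();
                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5)); // stand-in for the read
                in_flight.fetch_sub(1, Ordering::SeqCst);
                // `_permit` is dropped here, releasing the slot.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Calling `max_observed_concurrency(4, 32)` starts 32 simulated reads but never lets more than four run at once, which is the property the IO semaphore is meant to provide for open shard files.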

How Has This Been Tested?

The changes follow the existing pattern used for concurrent_requests_semaphore and integrate with the existing concurrency control infrastructure. Existing tests should continue to pass as the new semaphore is optional and defaults to unlimited when not configured.

Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas

https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4

When querying for `*` (all packages), `fetch_package_records` is called concurrently for every package name. The existing `concurrent_requests_semaphore` already gates HTTP downloads, but cache reads (`tokio_fs::read`) were completely unthrottled — potentially opening thousands of file descriptors simultaneously and hitting the OS "too many open files" limit. Reuse the same semaphore to also gate the cache-read path, so the number of concurrently open shard files is bounded by the same limit as HTTP requests. https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4

… shard cache reads

When querying for `*` all shard cache reads fire concurrently, each opening a file descriptor. Previously the fix incorrectly reused the HTTP `concurrent_requests_semaphore` for IO, conflating two different resource types.

Follow the same pattern as `rattler::install::InstallDriver`, which keeps a separate `io_concurrency_semaphore` for file operations:

- `GatewayBuilder`: add `max_concurrent_io: MaxConcurrency` field with `with_max_concurrent_io` / `set_max_concurrent_io` builder methods.
- `GatewayInner`: store the resulting `io_concurrency_semaphore`.
- Thread it through `SubdirBuilder::build_sharded` → `ShardedSubdir::new`.
- `ShardedSubdir::fetch_package_records`: acquire the IO permit before `tokio_fs::read` so the number of concurrently open shard files is bounded independently of the HTTP request limit.

https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4

…st call sites

Two test helpers in `sharded_subdir/mod.rs` call `ShardedSubdir::new` directly and were missing the new `io_concurrency_semaphore` parameter added in the previous commit.

https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4

claude added 2 commits March 4, 2026 15:59

github-actions bot added the vouched label Mar 4, 2026

baszalmstra approved these changes Mar 4, 2026

wolfv wants to merge 3 commits into main from claude/fix-repodata-open-files-TRv0Y