
Add IO concurrency limiting for shard cache reads #2163

Open
wolfv wants to merge 3 commits into main from
claude/fix-repodata-open-files-TRv0Y

Conversation


@wolfv wolfv commented Mar 4, 2026

Description

This PR adds a separate concurrency limiter for IO operations (specifically reading shard cache files from disk) to prevent exhausting the OS file-descriptor limit when many packages are queried concurrently.

Previously, only HTTP request concurrency was limited via max_concurrent_requests. When querying for many packages at once (e.g., with wildcard patterns), the gateway could attempt to open many shard cache files simultaneously, leading to file-descriptor exhaustion.

Changes:

  • Added a max_concurrent_io field to GatewayBuilder, exposed through the with_max_concurrent_io() and set_max_concurrent_io() builder methods
  • Added io_concurrency_semaphore to GatewayInner to track the IO concurrency limit
  • Updated ShardedSubdir to acquire an IO semaphore permit before reading cached shard files from disk
  • Updated SubdirBuilder to pass the IO semaphore to ShardedSubdir during construction

The IO semaphore is acquired before opening shard cache files, ensuring that concurrent file operations are bounded independently of HTTP request concurrency.
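The bounding pattern described above can be sketched with a std-only stand-in. The real gateway code would use an async semaphore (e.g. tokio's), and the simulated "shard read" and names below are illustrative assumptions, not the actual rattler implementation:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal counting semaphore (std-only stand-in for an async semaphore).
struct Semaphore {
    permits: Mutex<usize>,
    cvar: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), cvar: Condvar::new() }
    }
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.cvar.wait(permits).unwrap();
        }
        *permits -= 1;
    }
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cvar.notify_one();
    }
}

fn main() {
    let io_limit = 4;
    let sem = Arc::new(Semaphore::new(io_limit));
    // Track (currently in-flight reads, maximum ever observed).
    let in_flight = Arc::new(Mutex::new((0usize, 0usize)));

    // Simulate many packages queried at once, each wanting to open a shard file.
    let handles: Vec<_> = (0..32)
        .map(|_| {
            let sem = Arc::clone(&sem);
            let in_flight = Arc::clone(&in_flight);
            thread::spawn(move || {
                sem.acquire(); // permit held for the duration of the "read"
                {
                    let mut f = in_flight.lock().unwrap();
                    f.0 += 1;
                    f.1 = f.1.max(f.0);
                }
                thread::sleep(Duration::from_millis(5)); // simulated file read
                in_flight.lock().unwrap().0 -= 1;
                sem.release();
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    let max_seen = in_flight.lock().unwrap().1;
    // The semaphore bounds how many "files" are ever open at once.
    assert!(max_seen <= io_limit);
    println!("max concurrent reads: {max_seen} (limit {io_limit})");
}
```

The key property is that the permit is held across the whole read, so at most `io_limit` file descriptors are open simultaneously regardless of how many queries are in flight.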

How Has This Been Tested?

The changes follow the existing pattern used for concurrent_requests_semaphore and integrate with the existing concurrency control infrastructure. Existing tests should continue to pass as the new semaphore is optional and defaults to unlimited when not configured.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas

https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4

claude added 2 commits March 4, 2026 15:59
When querying for `*` (all packages), `fetch_package_records` is called
concurrently for every package name. The existing `concurrent_requests_semaphore`
already gates HTTP downloads, but cache reads (`tokio_fs::read`) were
completely unthrottled — potentially opening thousands of file descriptors
simultaneously and hitting the OS "too many open files" limit.

Reuse the same semaphore to also gate the cache-read path, so the number
of concurrently open shard files is bounded by the same limit as HTTP
requests.

https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4
… shard cache reads

When querying for `*`, all shard cache reads fire concurrently, each
opening a file descriptor. Previously the fix incorrectly reused the
HTTP `concurrent_requests_semaphore` for IO — conflating two different
resource types.

Follow the same pattern as `rattler::install::InstallDriver`, which
keeps a separate `io_concurrency_semaphore` for file operations:

- `GatewayBuilder`: add `max_concurrent_io: MaxConcurrency` field with
  `with_max_concurrent_io` / `set_max_concurrent_io` builder methods.
- `GatewayInner`: store the resulting `io_concurrency_semaphore`.
- Thread it through `SubdirBuilder::build_sharded` → `ShardedSubdir::new`.
- `ShardedSubdir::fetch_package_records`: acquire the IO permit before
  `tokio_fs::read` so the number of concurrently open shard files is
  bounded independently of the HTTP request limit.
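The builder shape described in this commit might look roughly like the following std-only sketch. The `Option`-based "unlimited by default" behavior and the stand-in `Semaphore` type are assumptions for illustration; the actual code uses the crate's own concurrency types:

```rust
use std::sync::Arc;

// Stand-in for an async semaphore; only the shape matters here.
struct Semaphore {
    permits: usize,
}
impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits }
    }
}

/// Hypothetical mirror of the builder described in the commit:
/// `None` means "unlimited", so existing callers are unaffected.
#[derive(Default)]
struct GatewayBuilder {
    max_concurrent_io: Option<usize>,
}

impl GatewayBuilder {
    fn with_max_concurrent_io(mut self, limit: usize) -> Self {
        self.set_max_concurrent_io(limit);
        self
    }
    fn set_max_concurrent_io(&mut self, limit: usize) -> &mut Self {
        self.max_concurrent_io = Some(limit);
        self
    }
    fn finish(self) -> GatewayInner {
        GatewayInner {
            // Only create a semaphore when a limit was configured.
            io_concurrency_semaphore: self
                .max_concurrent_io
                .map(|n| Arc::new(Semaphore::new(n))),
        }
    }
}

struct GatewayInner {
    io_concurrency_semaphore: Option<Arc<Semaphore>>,
}

fn main() {
    // Default: no IO limit, matching existing behavior.
    let unlimited = GatewayBuilder::default().finish();
    assert!(unlimited.io_concurrency_semaphore.is_none());

    // Configured: the inner gateway holds a bounded semaphore.
    let limited = GatewayBuilder::default().with_max_concurrent_io(64).finish();
    assert_eq!(limited.io_concurrency_semaphore.as_ref().unwrap().permits, 64);
    println!("ok");
}
```

The `Option` wrapper is what keeps the change backward compatible: code paths that never configured an IO limit skip permit acquisition entirely.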

https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4
…st call sites

Two test helpers in sharded_subdir/mod.rs call ShardedSubdir::new directly
and were missing the new io_concurrency_semaphore parameter added in the
previous commit.

https://claude.ai/code/session_018WhkrxiAzPjv4ByCqhXKg4