feat: implement in-flight capacity tracking and backpressure for storage provider selection #672

Open
alikhere wants to merge 8 commits into storacha:main from alikhere:feat/in-flight-capacity

@alikhere
Contributor

Summary

Implements in-flight capacity tracking and a backpressure mechanism so that overloaded storage nodes are not selected. Providers with insufficient capacity are automatically excluded during selection, and both used and claimed (in-flight) capacity are tracked per provider.

Core Features

  • Provider Capacity Tracking: Tracks usedCapacity, claimedCapacity, and maxCapacity per provider
  • Capacity-Aware Router: Automatically filters out providers without available capacity
  • In-Flight Tracking: Claims capacity during allocation, finalizes on success, releases on failure
  • Allocation Tracker: Monitors in-flight allocations with timeout-based cleanup for abandoned uploads
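
For reviewers, here is a rough sketch of the claim → finalize/release lifecycle and the capacity filter described above. It is illustrative only and not the code in this PR: the class name `InMemoryCapacityStore`, the `register` and `snapshot` helpers, the `filterByCapacity` function, and all method signatures are assumptions. The check follows the rule used by the router: used + claimed + upload size must not exceed max capacity.

```typescript
// Hypothetical sketch: names and signatures are illustrative, not the PR's actual code.
interface ProviderCapacity {
  usedCapacity: number
  claimedCapacity: number
  maxCapacity: number
}

class InMemoryCapacityStore {
  private providers = new Map<string, ProviderCapacity>()

  /** Assumed helper: start tracking a provider with a fixed maximum. */
  register(provider: string, maxCapacity: number): void {
    this.providers.set(provider, { usedCapacity: 0, claimedCapacity: 0, maxCapacity })
  }

  /** True when used + claimed + size still fits within maxCapacity. */
  hasAvailableCapacity(provider: string, size: number): boolean {
    const c = this.providers.get(provider)
    if (!c) return false
    return c.usedCapacity + c.claimedCapacity + size <= c.maxCapacity
  }

  /** Check and claim in one synchronous step, so two callers cannot both pass the check. */
  claimCapacity(provider: string, size: number): boolean {
    if (!this.hasAvailableCapacity(provider, size)) return false
    const c = this.providers.get(provider)
    if (!c) return false
    c.claimedCapacity += size
    return true
  }

  /** Upload failed or was abandoned: give the claimed bytes back. */
  releaseClaimed(provider: string, size: number): void {
    const c = this.providers.get(provider)
    if (c) c.claimedCapacity = Math.max(0, c.claimedCapacity - size)
  }

  /** Upload accepted: move bytes from claimed to used. */
  finalizeAllocation(provider: string, size: number): void {
    const c = this.providers.get(provider)
    if (!c) return
    const moved = Math.min(size, c.claimedCapacity)
    c.claimedCapacity -= moved
    c.usedCapacity += moved
  }

  /** Assumed helper: read-only copy of a provider's counters. */
  snapshot(provider: string): ProviderCapacity | undefined {
    const c = this.providers.get(provider)
    return c ? { ...c } : undefined
  }
}

/** Keep only the candidates that can still take an upload of `size` bytes. */
function filterByCapacity(
  store: InMemoryCapacityStore,
  candidates: string[],
  size: number
): string[] {
  return candidates.filter((p) => store.hasAvailableCapacity(p, size))
}
```

In this sketch the in-memory check-and-claim is atomic only because it runs synchronously on a single thread; a persistent store would presumably need a conditional update to get the same guarantee.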

Implementation Details

  1. Provider Capacity Storage (ProviderCapacityStorage)

    • In-memory implementation for testing
    • Atomic capacity operations to prevent race conditions
    • Methods: claimCapacity(), releaseClaimed(), finalizeAllocation(), hasAvailableCapacity()
  2. Capacity-Aware Router (createCapacityAwareRouter)

    • Wraps existing router to filter providers by capacity
    • Checks used_capacity + claimed_capacity + upload_size <= max_capacity
    • Works for both primary allocation and replication
  3. Allocation Flow Integration

    • Allocation: Claims capacity immediately after provider selection
    • Acceptance: Finalizes allocation (claimed → used) on success, releases on failure
    • Failure Handling: Releases claimed capacity on all failure paths
  4. Allocation Tracker

    • Tracks in-flight allocations with timestamps
    • Provides cleanup mechanism for abandoned allocations (5-minute default timeout)
    • Prevents memory leaks in allocation accounting
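
As a companion to item 4, this is one way the timeout-based cleanup could look. It is a sketch under assumptions rather than the tracker in this PR: `AllocationTracker`, `track`, `complete`, `cleanup`, and the `release` callback are hypothetical names, and only the 5-minute default comes from the description above.

```typescript
// Hypothetical sketch: names and signatures are illustrative, not the PR's actual code.
interface InFlightAllocation {
  provider: string
  size: number
  startedAt: number // epoch milliseconds
}

// 5-minute default timeout, as stated in the PR description.
const DEFAULT_TIMEOUT_MS = 5 * 60 * 1000

class AllocationTracker {
  private inFlight = new Map<string, InFlightAllocation>()
  private release: (provider: string, size: number) => void
  private timeoutMs: number

  /** `release` is called for each expired allocation, e.g. to return claimed capacity. */
  constructor(
    release: (provider: string, size: number) => void,
    timeoutMs: number = DEFAULT_TIMEOUT_MS
  ) {
    this.release = release
    this.timeoutMs = timeoutMs
  }

  /** Record an allocation when capacity is claimed. */
  track(id: string, provider: string, size: number, now: number = Date.now()): void {
    this.inFlight.set(id, { provider, size, startedAt: now })
  }

  /** Stop tracking once the allocation is finalized or explicitly released. */
  complete(id: string): boolean {
    return this.inFlight.delete(id)
  }

  /** Release and forget every allocation at least `timeoutMs` old; returns the expired ids. */
  cleanup(now: number = Date.now()): string[] {
    const expired: string[] = []
    this.inFlight.forEach((allocation, id) => {
      if (now - allocation.startedAt >= this.timeoutMs) {
        this.release(allocation.provider, allocation.size)
        this.inFlight.delete(id)
        expired.push(id)
      }
    })
    return expired
  }

  /** Number of allocations still in flight. */
  get size(): number {
    return this.inFlight.size
  }
}
```

Passing `now` explicitly keeps the sketch deterministic in tests; a caller would typically run `cleanup()` on an interval and pass a callback that releases the provider's claimed capacity.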

Testing

  • ✅ Provider capacity storage operations
  • ✅ Prevention of over-allocation
  • ✅ Capacity-aware router selection
  • ✅ Capacity lifecycle (claim → finalize/release)

Closes #654

@alikhere alikhere requested a review from travis as a code owner February 19, 2026 17:48
@alikhere
Contributor Author

Hi @alanshaw, I’ve implemented in-flight capacity tracking and backpressure as discussed in #654.
Please take a look when you have time.

Development

Successfully merging this pull request may close these issues.

Add in flight counting (enables backpressure)
