fix(sa-bench): shard high-concurrency loadgen across frontends#154
Open
YAMY1234 wants to merge 4 commits into
Support direct multi-frontend request sharding to avoid single-destination ephemeral port exhaustion at high concurrency while keeping shared aiohttp sessions and nginx keepalive tuning.
Honor benchmark.env for all benchmark runners so sa-bench recipes can enable frontend sharding through environment flags.
Keep the sharded SA-Bench client changes passing the repository lint rules before opening the PR.
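As a rough illustration of how environment flags could reach a runner and select sharded targets, here is a minimal sketch. The helper names `build_runner_env` and `resolve_api_urls` are hypothetical. The comma-separated format for `SA_BENCH_API_URLS` is an assumption and is not confirmed by this PR. Only the two variable names come from the PR description.

```python
from __future__ import annotations

from typing import Mapping


def build_runner_env(base: Mapping[str, str], benchmark_env: Mapping[str, object]) -> dict[str, str]:
    """Overlay recipe-level benchmark.env values on a base environment (hypothetical helper)."""
    merged = dict(base)
    for key, value in benchmark_env.items():
        # Environment values must be strings; map YAML-style booleans to "true"/"false".
        if isinstance(value, bool):
            merged[key] = "true" if value else "false"
        else:
            merged[key] = str(value)
    return merged


def resolve_api_urls(env: Mapping[str, str], default_url: str) -> list[str]:
    """Return the frontend targets to shard across, or just the default URL."""
    if env.get("SA_BENCH_SHARD_FRONTENDS", "").lower() != "true":
        return [default_url]
    # Assumption: SA_BENCH_API_URLS is a comma-separated list of base URLs.
    raw = env.get("SA_BENCH_API_URLS", "")
    urls = [u.strip() for u in raw.split(",") if u.strip()]
    return urls or [default_url]
```

With sharding disabled or no explicit URL list, the sketch falls back to the single default URL, so an unsharded recipe behaves as before.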
Codecov Report
❌ Patch coverage is …

Coverage diff, main vs. #154:

| Metric | main | #154 |
|---|---|---|
| Coverage | ? | 65.10% |
| Files | ? | 67 |
| Lines | ? | 8228 |
| Branches | ? | 0 |
| Hits | ? | 5357 |
| Misses | ? | 2871 |
| Partials | ? | 0 |
Summary
- Add `--api-url` support to SA-Bench and round-robin requests across multiple frontend endpoints while sharing one aiohttp session.
- Update `bench.sh` to populate direct frontend targets from `SA_BENCH_API_URLS` or `/logs/nginx.conf` when `SA_BENCH_SHARD_FRONTENDS=true`.
- Pass `benchmark.env` through benchmark runners so recipes can enable sharded SA-Bench without using custom commands.

Why
High-concurrency disaggregated serving runs can exhaust ephemeral ports when a single load generator connects to one `host:port` tuple. Sharding request traffic across the deployed frontend endpoints avoids that client-side collapse without changing backend topology.

For conc=32768

Before
After
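To make the port-exhaustion argument concrete, the following minimal sketch shows round-robin target selection and the resulting per-target connection count. The class `RoundRobinTargets` and the function `peak_connections_per_target` are hypothetical names for illustration, not the PR's actual implementation. The sketch also assumes one open connection per in-flight request.

```python
from __future__ import annotations

import itertools
from typing import Sequence


class RoundRobinTargets:
    """Hand out frontend base URLs in strict rotation (hypothetical helper)."""

    def __init__(self, api_urls: Sequence[str]) -> None:
        if not api_urls:
            raise ValueError("at least one API URL is required")
        self._cycle = itertools.cycle(list(api_urls))

    def next_url(self) -> str:
        # Each call returns the next target, wrapping around at the end.
        return next(self._cycle)


def peak_connections_per_target(concurrency: int, num_targets: int) -> int:
    """Upper bound on simultaneous connections to any one host:port.

    Assumes one connection per in-flight request and an even round-robin split.
    """
    if num_targets < 1:
        raise ValueError("num_targets must be >= 1")
    return -(-concurrency // num_targets)  # ceiling division
```

Under these assumptions, `peak_connections_per_target(32768, 1)` is 32768, which exceeds the 28232 ports in the default Linux ephemeral range (`net.ipv4.ip_local_port_range` of 32768 to 60999). With 9 targets it drops to 3641 per `host:port`. Because the ephemeral-port limit applies per destination tuple, spreading requests this way leaves ample headroom even when one client session is shared.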

Validation
- `uv run ruff check src/srtctl/benchmarks/base.py src/srtctl/benchmarks/custom.py src/srtctl/benchmarks/scripts/sa-bench/backend_request_func.py src/srtctl/benchmarks/scripts/sa-bench/benchmark_serving.py`
- `uv run python -m py_compile src/srtctl/benchmarks/scripts/sa-bench/backend_request_func.py src/srtctl/benchmarks/scripts/sa-bench/benchmark_serving.py`
- `uv run pytest tests/test_benchmarks.py tests/test_frontends.py`
- `api_urls=9`.