Conversation
* Added CONTEXT_LENGTH and MAX_PREFILL_TOKENS variables for better configuration. * Updated launch_server command with new options: --tokenizer-worker-num, --enable-aiter-allreduce-fusion, --cuda-graph-max-bs, --context-length, --disable-radix-cache, --max-prefill-tokens, and --scheduler-recv-interval.
… benchmark configurations for MI355X, enhancing performance with updated CLI arguments.
….yaml to v0.5.9, ensuring compatibility with recent changes.
… and BF16 SGLang benchmarks on MI355X, ensuring accurate tracking of performance enhancements.
… configurations and adjust perf-changelog.yaml to reflect the changes, ensuring accurate performance tracking and compatibility.
…ngelog.yaml to reflect improved CLI arguments for MI355X, ensuring better performance tracking.
…ter and adjusting memory fraction. Updated launch_server command to include data-parallel-size and improved context length handling for better performance.
…chmarks, increasing conc-end values and adding new entries for improved performance tuning on MI355X and MI300X.
…cripts for MI355X to streamline configuration and improve performance.
… EP_SIZE parameter for search-space entries, enhancing performance tuning for MI355X and MI300X. Adjusted perf-changelog.yaml to reflect updated image tag for better performance tracking.
…F16 and FP8 to improve performance tuning. Adjusted search-space configurations in amd-master.yaml to increase conc-end values for MI355X and MI300X.
…0rc0-rocm700 for MI355X and MI300X configurations, ensuring compatibility and improved performance tracking.
… MI355X to v0.5.10rc0-rocm700 and MI300X to v0.5.9-rocm720, ensuring compatibility and consistency across configurations.
…from 0.75 to 0.8 in Qwen3.5 BF16 and FP8 benchmark scripts to enhance performance tuning.
…chmarks, adjusting parameters to optimize performance for MI355X. Update perf-changelog.yaml to remove an outdated entry.
…3.5 benchmarks, replacing outdated sglang image references with the latest version to ensure consistency and improved performance.
…k script to simplify configuration and enhance performance tuning.
… the latest version (20260414) for improved consistency and performance.
… amd-master.yaml to streamline configuration.
… amd-master.yaml to enhance configuration clarity.
…md-master.yaml by adjusting parameters to improve performance tuning.
…l to the latest version (20260415) for improved consistency and performance.
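The first commit message above lists the new launch_server options and the two variables that feed them. As a rough sketch only (not the benchmark script from this PR), the command line could be assembled as below. The flag names are taken from that commit message. Everything else is an assumption made for illustration: the helper name `build_launch_args`, the parameter values, the model path, and the guess that `--enable-aiter-allreduce-fusion` and `--disable-radix-cache` are value-less switches.

```python
import os


def build_launch_args(model_path, tokenizer_worker_num, cuda_graph_max_bs,
                      scheduler_recv_interval, env=None):
    """Assemble an argv list for `python3 -m sglang.launch_server`.

    Hypothetical helper: only the flag names come from the commit message;
    the values are supplied by the caller.
    """
    env = os.environ if env is None else env
    # CONTEXT_LENGTH and MAX_PREFILL_TOKENS are the variables the commit adds.
    context_length = int(env["CONTEXT_LENGTH"])
    max_prefill_tokens = int(env["MAX_PREFILL_TOKENS"])
    return [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tokenizer-worker-num", str(tokenizer_worker_num),
        "--enable-aiter-allreduce-fusion",   # assumed to be a boolean switch
        "--cuda-graph-max-bs", str(cuda_graph_max_bs),
        "--context-length", str(context_length),
        "--disable-radix-cache",             # assumed to be a boolean switch
        "--max-prefill-tokens", str(max_prefill_tokens),
        "--scheduler-recv-interval", str(scheduler_recv_interval),
    ]


# Placeholder values, not the ones used in the PR.
args = build_launch_args(
    "/path/to/model", tokenizer_worker_num=4, cuda_graph_max_bs=128,
    scheduler_recv_interval=10,
    env={"CONTEXT_LENGTH": "8192", "MAX_PREFILL_TOKENS": "16384"},
)
print(" ".join(args))
```

Passing the environment in as a dictionary keeps the sketch testable without touching the real process environment.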
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.
…hangelog.yaml to the latest version (v0.5.10rc0-rocm720-mi35x-20260415) for enhanced performance.
…ipt for enhanced performance tuning.
…onfigurations to improve performance tuning.
…g.yaml to enhance configuration clarity and maintain consistency.
@claude can u add these links to changelog yaml update image, include pr changes:
Claude finished @functionstackx's task in 50s. Adding SGLang PR links to perf-changelog.yaml
Added the three upstream SGLang PR links to the existing
Commit: |
Additional findings (outside current diff — PR may have been updated during review):
- 🟡 perf-changelog.yaml:1362-1369: The new perf-changelog entry has three copy-paste errors: (1) qwen3.5-fp8-mi355x-sglang is listed twice in config-keys (positions 0 and 2) while qwen3.5-bf16-mi355x-sglang appears only once; the correct list should be [qwen3.5-bf16-mi355x-sglang, qwen3.5-fp8-mi355x-sglang]; (2) pr-link points to PR #980 instead of the current PR #1036; (3) the description says lmsysorg/sglang:v0.5.10-rocm720-mi35x but the actual image in amd-master.yaml is lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260415 (different repo, RC version, date suffix). Please fix all three before merging.

Extended reasoning...

Bug 1: Duplicate config-key

The config-keys list in the new perf-changelog entry reads:

    - qwen3.5-fp8-mi355x-sglang   # index 0
    - qwen3.5-bf16-mi355x-sglang  # index 1
    - qwen3.5-fp8-mi355x-sglang   # index 2 ← duplicate

The FP8 key is present at both index 0 and index 2; the BF16 key is present once. This is a copy-paste artifact: the entry was likely templated from the analogous MI300X/MI325X entry (PR #986) and the first or last line was not changed from FP8 to BF16. The correct list should be [qwen3.5-bf16-mi355x-sglang, qwen3.5-fp8-mi355x-sglang]. If any tooling processes config-keys to build a changelog database, verify coverage, or detect regressions, the duplicate will cause the FP8 config to be processed twice while the BF16 config is silently missing from the record.

Bug 2: Wrong PR link

The entry sets pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/980. This PR is #1036; PR #980 is a completely different, unrelated PR that predates this one by over 50 numbers. Looking at every other entry in perf-changelog.yaml, each pr-link references the specific PR that introduced those changes; the link is the primary audit trail connecting a config change back to a code review. The analogous MI300X/MI325X entry (PR #986) was clearly used as a template and the link was not updated. The correct value is https://github.com/SemiAnalysisAI/InferenceX/pull/1036.

Bug 3: Inaccurate image tag in description

The description bullet reads "Use lmsysorg/sglang:v0.5.10-rocm720-mi35x" but the image actually configured in amd-master.yaml for both qwen3.5-bf16-mi355x-sglang and qwen3.5-fp8-mi355x-sglang is lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260415. There are three discrepancies: (a) the Docker Hub repository is sglang-rocm, not sglang; (b) the version is v0.5.10rc0 (a release candidate), not the stable v0.5.10; (c) the date suffix -20260415 is omitted. For comparison, the analogous MI300X/MI325X entry (PR #986) correctly documents lmsysorg/sglang:v0.5.10-rocm720-mi30x, which matches the actual image used there; the MI355X description appears to have been copied from that entry without updating for the different RC image. The PR description itself also contains this inaccuracy.

Step-by-step proof

- Open perf-changelog.yaml, lines 1362–1369 (new entry added by this PR).
- config-keys list: fp8, bf16, fp8. FP8 duplicated, BF16 appears once. ✗
- pr-link: .../pull/980. Current PR is #1036. ✗
- Description image: lmsysorg/sglang:v0.5.10-rocm720-mi35x.
- Open amd-master.yaml, qwen3.5-bf16-mi355x-sglang entry: image: lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260415. Mismatch on repo name, RC suffix, and date suffix. ✗
- Same image is set for qwen3.5-fp8-mi355x-sglang in the same file. ✗

All three errors are documentation-only and have no impact on runtime behavior, but they create a misleading changelog entry that should be corrected before merge.
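
The three checks in the review above are mechanical enough to automate. The following is a minimal sketch of such a check, assuming a changelog entry has already been parsed into a plain dictionary with the config-keys, description, and pr-link fields quoted in the review. The function name `check_changelog_entry` and its parameters are hypothetical and not part of the repository's tooling.

```python
import re
from collections import Counter


def check_changelog_entry(entry, current_pr, expected_image):
    """Return a list of problems found in one parsed changelog entry."""
    problems = []

    # 1. Any config-key listed more than once is reported.
    counts = Counter(entry.get("config-keys", []))
    dupes = sorted(key for key, n in counts.items() if n > 1)
    if dupes:
        problems.append(f"duplicate config-keys: {dupes}")

    # 2. The pr-link must end in /pull/<current PR number>.
    match = re.search(r"/pull/(\d+)/?$", entry.get("pr-link", ""))
    if match is None or int(match.group(1)) != current_pr:
        problems.append(f"pr-link does not point to PR #{current_pr}")

    # 3. The description must mention the exact image string in use.
    description = entry.get("description", [])
    if isinstance(description, str):
        description = [description]
    if not any(expected_image in line for line in description):
        problems.append(f"description does not mention {expected_image}")

    return problems


# The entry as the review describes it.
entry = {
    "config-keys": [
        "qwen3.5-fp8-mi355x-sglang",
        "qwen3.5-bf16-mi355x-sglang",
        "qwen3.5-fp8-mi355x-sglang",
    ],
    "description": ["Use lmsysorg/sglang:v0.5.10-rocm720-mi35x"],
    "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/980",
}
problems = check_changelog_entry(
    entry,
    current_pr=1036,
    expected_image="lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260415",
)
for problem in problems:
    print(problem)
```

The image check is a plain substring match on the full image string, so a description that names the wrong repository, drops the rc0 suffix, or omits the date suffix is flagged in each case.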
…X entry Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260414
Co-authored-by: @chunfangamd @1am9trash