Merged
Changes from 31 commits
33 commits
10f16e6
feat: enhance Qwen benchmark scripts with additional parameters
Apr 15, 2026
b22010e
Update perf-changelog.yaml to include new Qwen3.5 FP8 and BF16 SGLang…
Apr 15, 2026
22d9500
Update SGLang image versions for Qwen3.5 configurations in amd-master…
Apr 15, 2026
6261169
use 0327 build
Apr 15, 2026
559daa3
Update perf-changelog.yaml to reflect the new PR link for Qwen3.5 FP8…
Apr 15, 2026
8be5c3b
Update Qwen3.5 image tags in amd-master.yaml to v0.5.10rc0 for MI355X…
Apr 15, 2026
c4120a8
Update Qwen3.5 FP8 and BF16 SGLang benchmark descriptions in perf-cha…
Apr 15, 2026
5760086
Enhance Qwen3.5 benchmark scripts for MI355X by adding EP_SIZE parame…
Apr 15, 2026
df4c673
Update search-space configurations in amd-master.yaml for Qwen3.5 ben…
Apr 15, 2026
4367ae0
Remove context length parameter from Qwen3.5 BF16 and FP8 benchmark s…
Apr 15, 2026
37406ec
update to 5.10 rocm for qwen35
zhentaocc Apr 15, 2026
f5279e5
Update Qwen3.5 benchmark configurations in amd-master.yaml to include…
Apr 15, 2026
da6b5ac
Update context length calculations in Qwen3.5 benchmark scripts for B…
Apr 15, 2026
b318fee
Update image tags in amd-master.yaml for Qwen3.5 benchmarks to v0.5.1…
Apr 15, 2026
54b94f1
Update image tags in amd-master.yaml for Qwen3.5 benchmarks, changing…
Apr 15, 2026
0e0d51f
Remove data-parallel-size parameter and increase mem-fraction-static …
Apr 15, 2026
44cccfc
Update sglang image for qwen3.5 mi355x configs to fix shared memory c…
chunfangamd Apr 15, 2026
ce131cc
Refine search-space configurations in amd-master.yaml for Qwen3.5 ben…
Apr 15, 2026
f9f7f29
Update image tags in amd-master.yaml and perf-changelog.yaml for Qwen…
Apr 15, 2026
0fc9c12
Remove aiter allreduce fusion option from Qwen3.5 FP8 MI355X benchmar…
Apr 15, 2026
fe7672b
optimize the search space
chunfangamd Apr 15, 2026
7776a07
Upgrade image to 20260413
chunfangamd Apr 15, 2026
99b69b3
Update sglang image tags for Qwen3.5 benchmarks in amd-master.yaml to…
Apr 15, 2026
0a35926
Remove redundant search-space entry for qwen3.5-bf16-mi355x-sglang in…
Apr 15, 2026
f7a59b6
Remove duplicate search-space entry for qwen3.5-bf16-mi355x-sglang in…
Apr 15, 2026
b1bd701
Refine search-space configurations for qwen3.5-fp8-mi355x-sglang in a…
Apr 15, 2026
2915be4
Update sglang image tags for qwen3.5 configurations in amd-master.yam…
Apr 15, 2026
81c8886
Update sglang image tag for Qwen3.5 FP8 and BF16 benchmarks in perf-c…
Apr 16, 2026
1f12ab3
Add aiter allreduce fusion option to Qwen3.5 FP8 MI355X benchmark scr…
Apr 16, 2026
319ed68
Adjust CONTEXT_LENGTH in Qwen3.5 benchmark scripts for BF16 and FP8 c…
Apr 16, 2026
ee837ec
Remove duplicate entry for qwen3.5-fp8-mi355x-sglang in perf-changelo…
Apr 16, 2026
8c51a63
Add upstream SGLang PR links to perf-changelog.yaml for qwen3.5 MI355…
github-actions[bot] Apr 16, 2026
f0a8e2d
Downgrade the image for Qwen3.5-FP8-MI355X-SGLang to 20260414
chunfangamd Apr 16, 2026
15 changes: 9 additions & 6 deletions .github/configs/amd-master.yaml
@@ -114,7 +114,7 @@ dsr1-fp8-mi355x-sglang:
     - { tp: 8, conc-start: 4, conc-end: 64 }

 qwen3.5-bf16-mi355x-sglang:
-  image: rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260215
+  image: lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260415
   model: Qwen/Qwen3.5-397B-A17B
   model-prefix: qwen3.5
   runner: mi355x
@@ -125,11 +125,11 @@ qwen3.5-bf16-mi355x-sglang:
   - isl: 1024
     osl: 1024
     search-space:
-    - { tp: 8, conc-start: 4, conc-end: 64 }
+    - { tp: 8, ep: 1, conc-start: 4, conc-end: 256 }
   - isl: 8192
     osl: 1024
     search-space:
-    - { tp: 8, conc-start: 4, conc-end: 64 }
+    - { tp: 8, ep: 1, conc-start: 4, conc-end: 256 }

 qwen3.5-bf16-mi300x-sglang:
   image: lmsysorg/sglang:v0.5.10-rocm720-mi30x
@@ -186,7 +186,7 @@ qwen3.5-fp8-mi325x-sglang:
     - { tp: 8, conc-start: 4, conc-end: 64 }

 qwen3.5-fp8-mi355x-sglang:
-  image: rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260218
+  image: lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260415
   model: Qwen/Qwen3.5-397B-A17B-FP8
   model-prefix: qwen3.5
   runner: mi355x
@@ -197,11 +197,14 @@ qwen3.5-fp8-mi355x-sglang:
   - isl: 1024
     osl: 1024
     search-space:
-    - { tp: 8, conc-start: 4, conc-end: 64 }
+    - { tp: 8, ep: 1, conc-start: 4, conc-end: 32 }
+    - { tp: 8, ep: 8, conc-start: 64, conc-end: 256 }
+    - { tp: 2, ep: 2, conc-start: 128, conc-end: 256 }
   - isl: 8192
     osl: 1024
     search-space:
-    - { tp: 8, conc-start: 4, conc-end: 64 }
+    - { tp: 2, ep: 2, conc-start: 4, conc-end: 32 }
+    - { tp: 4, ep: 1, conc-start: 32, conc-end: 256 }

 qwen3.5-fp4-mi355x-sglang:
   image: lmsysorg/sglang:v0.5.10-rocm720-mi35x
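Each search-space entry in amd-master.yaml pairs a (tp, ep) layout with a concurrency range. As a rough sketch of how one entry might expand into individual benchmark points, the hypothetical helper `expand_conc` below doubles the concurrency from conc-start up to conc-end. The doubling step is an assumption made for illustration; this diff does not show how the harness actually walks the range.

```sh
#!/bin/sh
# Hypothetical helper, not part of this PR: list the concurrency values
# between conc-start ($1) and conc-end ($2), doubling at each step.
# ASSUMPTION: the sweep doubles; the real step rule is not shown in the diff.
expand_conc() (
    start=$1
    end=$2
    # Guard against a zero or negative start, which would never reach the end.
    [ "$start" -gt 0 ] || { echo "conc-start must be positive" >&2; exit 1; }
    conc=$start
    points=""
    while [ "$conc" -le "$end" ]; do
        points="${points:+$points }$conc"
        conc=$((conc * 2))
    done
    echo "$points"
)

# FP8, isl 1024 / osl 1024 entries from the diff:
expand_conc 4 32     # tp=8, ep=1 -> prints "4 8 16 32"
expand_conc 64 256   # tp=8, ep=8 -> prints "64 128 256"
```

Under that assumption, the FP8 1024/1024 configuration moves from a single tp=8 sweep to three layouts that split the low and high concurrency ranges between them.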
3 changes: 1 addition & 2 deletions benchmarks/single_node/qwen3.5_bf16_mi355x.sh
@@ -39,15 +39,14 @@ python3 -m sglang.launch_server \
     --port $PORT \
     --tensor-parallel-size $TP \
     --ep-size $EP_SIZE \
-    --data-parallel-size 1 \
     --trust-remote-code \
     --tokenizer-worker-num 6 \
     --enable-aiter-allreduce-fusion \
     --cuda-graph-max-bs $CONC \
     --disable-radix-cache \
     --max-prefill-tokens $MAX_PREFILL_TOKENS \
     --scheduler-recv-interval 30 \
-    --mem-fraction-static 0.75 $EVAL_CONTEXT_ARGS > $SERVER_LOG 2>&1 &
+    --mem-fraction-static 0.8 $EVAL_CONTEXT_ARGS > $SERVER_LOG 2>&1 &

 SERVER_PID=$!

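Raising --mem-fraction-static from 0.75 to 0.8 gives SGLang's static allocation (model weights plus the KV cache pool) a larger share of each GPU's memory. The back-of-the-envelope sketch below assumes 288 GB of HBM per MI355X, a figure taken from AMD's published specification and not stated in this PR, and it ignores allocator and runtime overhead. `static_pool_gb` is a hypothetical helper used only for the arithmetic.

```sh
#!/bin/sh
# Hypothetical helper, not part of this PR: approximate per-GPU memory
# covered by --mem-fraction-static.
#   $1 = fraction passed to --mem-fraction-static
#   $2 = total HBM per GPU in GB (ASSUMPTION: 288 for MI355X)
static_pool_gb() {
    awk -v frac="$1" -v hbm="$2" 'BEGIN { printf "%.1f\n", frac * hbm }'
}

static_pool_gb 0.75 288   # old setting -> prints 216.0
static_pool_gb 0.8 288    # new setting -> prints 230.4
```

On those assumptions the change frees roughly 14 GB more per GPU for weights and KV cache, which fits the wider concurrency ranges (up to 256) introduced in amd-master.yaml.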
3 changes: 1 addition & 2 deletions benchmarks/single_node/qwen3.5_fp8_mi355x.sh
@@ -39,15 +39,14 @@ python3 -m sglang.launch_server \
     --port $PORT \
     --tensor-parallel-size $TP \
     --ep-size $EP_SIZE \
-    --data-parallel-size 1 \
     --trust-remote-code \
     --tokenizer-worker-num 6 \
     --enable-aiter-allreduce-fusion \
     --cuda-graph-max-bs $CONC \
     --disable-radix-cache \
     --max-prefill-tokens $MAX_PREFILL_TOKENS \
     --scheduler-recv-interval 30 \
-    --mem-fraction-static 0.75 $EVAL_CONTEXT_ARGS > $SERVER_LOG 2>&1 &
+    --mem-fraction-static 0.8 $EVAL_CONTEXT_ARGS > $SERVER_LOG 2>&1 &

 SERVER_PID=$!

8 changes: 8 additions & 0 deletions perf-changelog.yaml
@@ -1358,3 +1358,11 @@
   description:
     - "Enable SGLANG_ENABLE_SPEC_V2=1 for Qwen3.5 FP8 H200 SGLang MTP"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1017
+
+- config-keys:
+    - qwen3.5-fp8-mi355x-sglang
+    - qwen3.5-bf16-mi355x-sglang
+  description:
+    - "Update cli args of Qwen3.5 FP8 and BF16 SGLang benchmarks for MI355X to achieve better performance"
+    - "Use lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260415"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1036