[AMD] Add Minimax tp8 with ep and update vllm image for MI355x by benenzhu · Pull Request #927 · SemiAnalysisAI/InferenceX

benenzhu · 2026-03-23T10:20:51Z

Add TP8 with EP (expert parallelism) for concurrency 32 to 256 for MiniMax on MI355X.

Made-with: Cursor

# Conflicts:
#   benchmarks/single_node/minimaxm2.5_fp8_mi355x.sh
#   perf-changelog.yaml

github-actions · 2026-03-23T10:21:01Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

benenzhu · 2026-03-23T10:22:55Z

/sweep test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm

github-actions · 2026-03-23T10:23:04Z

@benenzhu Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23432526160
Command: test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm
Pinned ref: 9d2248d
Approval: not required (trusted collaborator).

claude

LGTM — straightforward config and script extension; the only issue is the nit flagged inline (wrong PR link in changelog).

Extended reasoning...

Overview

This PR touches three files: the AMD benchmark config (amd-master.yaml), the MiniMax MI355X benchmark shell script, and perf-changelog.yaml. The config adds { tp: 8, ep: 8, conc-start: 32, conc-end: 256 } to all three seq-len configs for minimaxm2.5-fp8-mi355x-vllm and bumps the image from v0.15.1 to v0.18.0. The shell script adds EP_SIZE as a required env var, conditionally passes --enable-expert-parallel to vllm serve, and removes the previously-present --disable-log-requests flag (consistent with a recent standalone commit c35e700 that removed it). The changelog gets one new entry.
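The conditional flag described above can be sketched roughly as follows. This is a minimal illustration, not the script from the PR: the helper name `build_parallel_args` is hypothetical, and the condition (enable expert parallelism only when `EP_SIZE` is greater than 1) is an assumption, since the review only says the flag is passed conditionally. The flags `--tensor-parallel-size` and `--enable-expert-parallel` are existing `vllm serve` options.

```shell
#!/bin/sh
# Hypothetical helper: builds the parallelism flags that would be appended
# to a `vllm serve` command line. Not the actual code from the PR.
build_parallel_args() {
    tp="$1"
    ep_size="$2"
    args="--tensor-parallel-size $tp"
    # Assumption: expert parallelism is switched on only when EP_SIZE > 1.
    if [ "$ep_size" -gt 1 ]; then
        args="$args --enable-expert-parallel"
    fi
    printf '%s\n' "$args"
}

# Print the flags instead of launching a server, so the sketch runs anywhere.
# TP and EP_SIZE default to 8 here only for illustration.
build_parallel_args "${TP:-8}" "${EP_SIZE:-8}"
```

With the new `{ tp: 8, ep: 8 }` search-space entries, a helper like this would emit both flags; with an EP size of 1 it would emit only the tensor-parallel flag.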

Security Risks

No security-sensitive code paths are touched. This is purely benchmark configuration and a shell launch script for an inference server. No auth, crypto, or permission logic is involved.

Level of Scrutiny

Low scrutiny is warranted. All three changes are mechanical and follow well-established patterns already present in the repo (the EP conditional mirrors the same pattern used in other MiniMax scripts). The image bump and new TP=8/EP=8 search-space entries are additive and consistent with peer configs (e.g., dsr1-fp4-mi355x-atom uses identical ep-based entries).

Other Factors

The only bug found is a nit: the new perf-changelog.yaml entry references /pull/868 instead of /pull/927. This is a documentation/traceability issue with no functional impact. The functional changes are correct and complete. No outstanding human reviewer comments exist in the timeline.

perf-changelog.yaml

cquil11 · 2026-03-23T13:11:00Z

/sweep test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm --evals-only

github-actions · 2026-03-23T13:11:10Z

@cquil11 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23438981061
Command: test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm --evals-only
Pinned ref: 9d2248d
Approval: not required (trusted collaborator).

cquil11 · 2026-03-23T13:38:16Z

Once accuracy check looks good then we can merge.

benenzhu · 2026-03-23T13:41:37Z

The vLLM recipes PR for this change: vllm-project/recipes#300

chunfangamd

1K1K: +21.3%
8K1K: +14.3%

benenzhu and others added 7 commits March 5, 2026 07:55

add minimax tp8 with ep and remove tp-4

1ca7a2b

update changelog

4edc25b

Update amd-master.yaml for 1k8k & 8k1k CONC to 256

8ae8114

Merge remote-tracking branch 'upstream/main' into minimax-mi355-opt

1f108c5

Made-with: Cursor

# Conflicts:
#   benchmarks/single_node/minimaxm2.5_fp8_mi355x.sh
#   perf-changelog.yaml

fix

26c5c5f

Update perf-changelog.yaml

d6e9723

Update minimaxm2.5_fp8_mi355x.sh to remove log requests

c35e700

benenzhu requested a review from a team March 23, 2026 10:20

benenzhu requested review from billishyahao and chunfangamd as code owners March 23, 2026 10:20

github-project-automation bot added this to InferenceMAX Board Mar 23, 2026

Update perf-changelog.yaml

9d2248d

claude bot reviewed Mar 23, 2026

perf-changelog.yaml (review thread resolved)

Merge branch 'main' into minimax-mi355-opt

db03dc0

chunfangamd approved these changes Mar 23, 2026

chunfangamd enabled auto-merge (squash) March 23, 2026 14:16

cquil11 approved these changes Mar 23, 2026

chunfangamd merged commit 0d571de into main Mar 23, 2026
13 checks passed

chunfangamd deleted the minimax-mi355-opt branch March 23, 2026 14:38

github-project-automation bot moved this to Done in InferenceMAX Board Mar 23, 2026

cquil11 added the AMD label Apr 8, 2026

chunfangamd merged 9 commits into main from minimax-mi355-opt

3 participants