[AMD] Optimize search space and upgrade Image to 0.19.0 for MiniMax-M2.5#1003
Conversation
Running on fewer GPUs reduces inter-GPU communication overhead, and MoE expert parallelism across 2 GPUs is very efficient for this model.
Enable FP8 KV cache + AITER FA for minimaxm2.5-fp8-mi355x-vllm
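The FP8 KV cache and AITER settings mentioned above could be enabled with a launch command along these lines. This is an illustrative sketch only: the model path and the exact flag combination for this recipe are assumptions, not taken from the PR.

```shell
# Hypothetical sketch: serve an FP8 MoE model on 2 MI355X GPUs with an
# FP8 KV cache and AMD AITER kernels enabled. The model name below is an
# assumption for illustration.
export VLLM_ROCM_USE_AITER=1   # enable AMD AITER kernels on ROCm builds

vllm serve MiniMaxAI/MiniMax-M2.5 \
  --tensor-parallel-size 2 \
  --kv-cache-dtype fp8
```

`--tensor-parallel-size 2` keeps the model on 2 GPUs (less inter-GPU traffic), and `--kv-cache-dtype fp8` stores the KV cache in FP8 to cut memory use; the concrete values tuned in this PR's search space may differ.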
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you
functionstackx left a comment:
lgtm once validation passes
vLLM recipes PR updated: vllm-project/recipes#300
End-to-end test run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23987768210
co-author: @benenzhu