Commit 0ee5ff3
Optimize search space and upgrade Image to 0.19.0 for MiniMax-M2.5 (#1003)
* Add TP2EP2 for minimaxm2.5-fp8-mi355x-vllm
Fewer GPUs means less inter-GPU communication overhead, and MoE
expert parallelism across 2 GPUs is very efficient for this model.
* Optimize config for minimaxm2.5-fp8-mi355x-vllm
* Update perf-changelog for minimaxm2.5-fp8-mi355x-vllm
* Upgrade minimaxm2.5-fp8-mi355x-vllm Image to v0.19.0
Enable FP8 KV cache + AITER FA for minimaxm2.5-fp8-mi355x-vllm
* optimize all reduce
* fix pr
* Update perf-changelog
* Fix the perf-changelog
---------
Co-authored-by: zhutaoyu <zhutaoyu97@gmail.com>
1 parent bddbf40 commit 0ee5ff3
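The TP2EP2 point and the FP8 KV cache setting described in the commit message can be illustrated with a minimal sketch of how one such search-space entry might map to a `vllm serve` command line. This is not the repository's actual launch script, whose contents are not shown here. The helper `build_vllm_serve_args` and the model identifier are hypothetical, and the sketch assumes that enabling expert parallelism with a tensor-parallel size of 2 (and no data parallelism) yields an expert-parallel size of 2. AITER-related settings are left out because the commit message does not say how they are enabled.

```python
import shlex


def build_vllm_serve_args(model, tp_size, expert_parallel, kv_cache_dtype=None):
    """Build an argv list for `vllm serve` for one search-space point.

    Hypothetical helper for illustration only; it is not taken from the
    benchmark scripts touched by this commit.
    """
    if tp_size < 1:
        raise ValueError("tp_size must be >= 1")
    args = ["vllm", "serve", model, "--tensor-parallel-size", str(tp_size)]
    if expert_parallel:
        # Assumption: with this flag set and no data parallelism, MoE experts
        # are sharded across the same tp_size GPUs (TP2 -> EP2).
        args.append("--enable-expert-parallel")
    if kv_cache_dtype is not None:
        args += ["--kv-cache-dtype", kv_cache_dtype]
    return args


# Assumed model identifier; substitute the path the benchmark actually uses.
args = build_vllm_serve_args(
    "MiniMaxAI/MiniMax-M2.5",
    tp_size=2,
    expert_parallel=True,
    kv_cache_dtype="fp8",
)
print(shlex.join(args))
```

Building the arguments as a list rather than a formatted string keeps each search-space dimension (parallelism, KV cache dtype) independently toggleable, which is convenient when sweeping configurations.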
3 files changed, +24 -7 lines changed
- .github/configs
- benchmarks/single_node
[Diff contents were not preserved in this extract; only line numbers survive, and file names are not shown.]
- File 1: line 337, lines 348-350, and lines 354-356 replaced (+7, -7)
- File 2: lines added at new positions 28, 53, and 56 (+3)
- File 3: lines added at new positions 1247-1252 and 1254-1261 (+14)