docs(moonshotai): remove --compilation_config.pass_config.fuse_allreduce_rms from Kimi-K2.5 recipe #325
Conversation
remove --compilation_config.pass_config.fuse_allreduce_rms from Kimi-K2.5 recipe

This flag is no longer needed as of vLLM v0.17: fuse_allreduce_rms is now enabled by default for MoE models on Hopper hardware. Removes the flag from all three command examples (Hopper Docker, Blackwell Docker, and vllm serve).

Closes vllm-project#324

Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
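For reference, the simplified `vllm serve` invocation after the removal might look like the following sketch. Only the model name and flags quoted in this PR are shown; any other options the recipe passes are omitted here.

```shell
# Sketch only: flags besides those quoted in this PR are omitted.
vllm serve moonshotai/Kimi-K2.5 \
  --tensor-parallel-size 4 \
  --mm-encoder-tp-mode data
```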
Code Review
This pull request removes the --compilation_config.pass_config.fuse_allreduce_rms flag from the Kimi-K2.5 deployment documentation for various environments. A review comment suggests that while this flag is default on Hopper hardware, it should be retained for Blackwell (aarch64) examples to ensure optimal performance unless its default status on that architecture is confirmed.
```diff
 vllm/vllm-openai:v0.17.0-aarch64-cu130 moonshotai/Kimi-K2.5 \
   --tensor-parallel-size 4 \
   --mm-encoder-tp-mode data \
-  --compilation_config.pass_config.fuse_allreduce_rms true \
```
The justification for removing this flag is that it is enabled by default on Hopper hardware, but this example targets Blackwell (aarch64), a different architecture on which the optimization may not be on by default. Consider retaining the flag here unless it is confirmed to also be enabled by default on Blackwell.
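If the default status on Blackwell is unconfirmed, the flag could be kept explicitly in the Blackwell example. A sketch of the command tail follows; the preceding `docker run` options from the recipe are elided here, and only the flags visible in this diff are shown.

```shell
# Tail of the Blackwell command only; preceding `docker run` options elided.
vllm/vllm-openai:v0.17.0-aarch64-cu130 moonshotai/Kimi-K2.5 \
  --tensor-parallel-size 4 \
  --mm-encoder-tp-mode data \
  --compilation_config.pass_config.fuse_allreduce_rms true
```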
Summary
Removes `--compilation_config.pass_config.fuse_allreduce_rms true` from all three command examples in the Kimi-K2.5 recipe (Hopper Docker, Blackwell Docker, and `vllm serve`).

Closes #324
Test plan