
Hybrid MoE, combining EP and TP #2001

Open

zhangamy-crypto wants to merge 7 commits into vllm-project:main from zhangamy-crypto:hybrid-moe-branch

Conversation

@zhangamy-crypto

Description

Adds hybrid expert-parallel (EP) and tensor-parallel (TP) support for MoE layers; currently, expert sharding is mapped onto the attention layers' data-parallel (DP) sharding.
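To make the mapping concrete, below is a minimal sketch of how such a hybrid EP + TP layout could be derived, assuming attention runs TP within each DP replica so that the EP degree equals the attention DP degree. All names (`hybrid_moe_layout`, `tp_size`, `ep_rank`, etc.) are illustrative, not vLLM's actual API.

```python
# Hypothetical sketch of the hybrid EP + TP layout described above.
# Experts are partitioned across EP shards (one per attention DP replica,
# matching "expert sharding follows attention DP sharding"), while each
# expert's weights would additionally be sharded across the TP group.

def hybrid_moe_layout(world_size: int, tp_size: int, num_experts: int):
    """Map each global rank to (ep_rank, tp_rank) and an owned expert range."""
    assert world_size % tp_size == 0
    ep_size = world_size // tp_size       # one EP shard per attention DP rank
    assert num_experts % ep_size == 0
    experts_per_rank = num_experts // ep_size

    layout = {}
    for rank in range(world_size):
        ep_rank = rank // tp_size         # which expert shard this rank holds
        tp_rank = rank % tp_size          # position inside the TP shard
        first = ep_rank * experts_per_rank
        layout[rank] = {
            "ep_rank": ep_rank,
            "tp_rank": tp_rank,
            "experts": range(first, first + experts_per_rank),
        }
    return layout

# Example: 8 GPUs, TP=2 -> EP=4; 64 experts -> 16 experts per EP shard.
for rank, info in hybrid_moe_layout(8, tp_size=2, num_experts=64).items():
    print(rank, info["ep_rank"], info["tp_rank"], list(info["experts"])[:2])
```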

Tests

Tested on Qwen3 Coder; for prefill, both throughput and latency improved.

Signed-off-by: zhangamy-crypto <zhangamy@google.com>
