
[FDConfig] Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility#7509

Open
xyxinyang wants to merge 1 commit into PaddlePaddle:develop from xyxinyang:develop-zc

Conversation

@xyxinyang
Collaborator

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)


Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility. This enables R3 support for models like Qwen3VLMOE, DeepSeek V3, and other MoE models that use num_experts_per_tok instead of moe_k in their config.

Modifications

  • Added unified field mapping in ModelConfig (fastdeploy/config.py)
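The field mapping described above can be sketched as a minimal standalone example. `ModelConfig` and `override_name_from_config()` are named in this PR and its review; the body below is an illustrative guess at the mechanism, not the actual FastDeploy implementation in fastdeploy/config.py.

```python
# Hypothetical sketch of unifying num_experts_per_tok to moe_k in a
# ModelConfig-like class. Not the real FastDeploy code.
class ModelConfig:
    def __init__(self, args: dict):
        self.moe_k = None
        for key, value in args.items():
            setattr(self, key, value)
        self.override_name_from_config(args)

    def override_name_from_config(self, args: dict):
        # Models such as Qwen3VLMOE and DeepSeek V3 expose the MoE top-k
        # as num_experts_per_tok; map it onto the unified moe_k field.
        if "num_experts_per_tok" in args and self.moe_k is None:
            self.moe_k = args["num_experts_per_tok"]


cfg = ModelConfig({"num_experts_per_tok": 8})
print(cfg.moe_k)  # 8
```

With this mapping in place, downstream code can read `moe_k` regardless of which key the model's config file used.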

Usage or Command

N/A

Accuracy Tests

N/A

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code; run pre-commit before committing.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Apr 20, 2026

Thanks for your contribution!


@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 AI Code Review | 2026-04-20 14:57 CST

📋 Review Summary

PR overview: adds a num_experts_per_tok → moe_k field mapping in ModelConfig.override_name_from_config() to support Qwen3VLMOE, DeepSeek V3, and other MoE models that use num_experts_per_tok.
Scope: fastdeploy/config.py — Config field mapping
Impact tag: FDConfig

Issues

Level | File | Summary
🟡 Suggestion | multiple model files | Unified mapping is incomplete; downstream consumers still use num_experts_per_tok directly

🟡 Suggestion: after unifying the mapping, downstream consumers have not been switched to moe_k

This PR maps num_experts_per_tok to moe_k in override_name_from_config(), and the mapping logic itself is correct. However, several places in the codebase still use fd_config.model_config.num_experts_per_tok directly instead of the unified moe_k:

File | Line | Current usage
model_executor/models/qwen3moe.py | 69 | top_k=fd_config.model_config.num_experts_per_tok
model_executor/models/deepseek_v3.py | 167 | top_k=fd_config.model_config.num_experts_per_tok
model_executor/models/gpt_oss.py | 120 | top_k=fd_config.model_config.num_experts_per_tok
model_executor/models/glm4_moe.py | 176 | top_k=fd_config.model_config.num_experts_per_tok
model_executor/layers/moe/routing_indices_cache.py | 170 | special-case branch for Glm4MoeForCausalLM uses num_experts_per_tok

Suggestions:

  1. Replace fd_config.model_config.num_experts_per_tok with fd_config.model_config.moe_k in the files above so the "Unify" goal is fully achieved.
  2. Once the mapping is unified, the hard-coded special case for Glm4MoeForCausalLM in routing_indices_cache.py:169-172 can be removed; use moe_k directly to simplify the logic:
# Current (can be simplified):
if fd_config.model_config.architectures[0] == "Glm4MoeForCausalLM":
    self.moe_top_k = fd_config.model_config.num_experts_per_tok
else:
    self.moe_top_k = fd_config.model_config.moe_k

# Suggested:
self.moe_top_k = fd_config.model_config.moe_k

Overall assessment

The mapping logic is correct and will not introduce bugs. But for a PR titled "Unify", the downstream consumers should also be updated to complete the unification; otherwise future maintainers may be unsure whether to use moe_k or num_experts_per_tok.

Collaborator

@gongshaotian gongshaotian left a comment


LGTM
