[Cherry-Pick] [FDConfig] Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility(#7509)#7517

Open
xyxinyang wants to merge 1 commit into PaddlePaddle:release/2.6 from xyxinyang:release/2.6

Conversation

@xyxinyang
Collaborator

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility. This enables R3 support for models like Qwen3VLMOE, DeepSeek V3, and other MoE models that use num_experts_per_tok instead of moe_k in their config.

Modifications

  • Added unified field mapping in ModelConfig (fastdeploy/config.py)
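The mapping described above can be sketched as follows. This `ModelConfig` is a minimal stand-in for the real class in fastdeploy/config.py, not its actual code; only the guard condition is taken from the PR diff quoted below.

```python
# Minimal sketch of the num_experts_per_tok -> moe_k alias. This ModelConfig
# is a stand-in for the real class in fastdeploy/config.py, not its code.
class ModelConfig:
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)
        self.override_name_from_config()

    def override_name_from_config(self):
        # Some MoE configs (e.g. DeepSeek V3) name the routed-expert count
        # num_experts_per_tok; alias it to moe_k when moe_k is absent.
        if hasattr(self, "num_experts_per_tok") and not hasattr(self, "moe_k"):
            self.moe_k = self.num_experts_per_tok


cfg = ModelConfig(num_experts_per_tok=8)
print(cfg.moe_k)  # 8
```

Because the alias only fires when `moe_k` is absent, a config that already sets `moe_k` explicitly is left untouched.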

Usage or Command

N/A

Accuracy Tests

N/A

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code; run pre-commit before committing.
  • Add unit tests; if none are added, explain why in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Apr 20, 2026

Thanks for your contribution!


@PaddlePaddle-bot left a comment


🤖 AI Code Review | 2026-04-20 16:03 CST

📋 Review Summary

PR overview: map num_experts_per_tok in MoE model configs to the unified moe_k field, for compatibility with models such as Qwen3VLMOE and DeepSeek V3.
Scope of change: the ModelConfig.override_name_from_config method in fastdeploy/config.py
Impact tag: FDConfig

Issues

Severity | File | Summary
🟡 Suggestion | multiple model files | Downstream consumers still read num_experts_per_tok directly; recommend migrating them to the unified moe_k

Overall assessment

The change logic is correct and consistent with the moe_num_experts/moe_num_shared_experts mapping pattern already used in the same method. The not hasattr(self, "moe_k") guard is also appropriate (moe_k is not in PRETRAINED_INIT_CONFIGURATION, so hasattr is a better check than is None). A follow-up is recommended to migrate downstream code that still reads num_experts_per_tok directly over to moe_k, to keep the codebase consistent.

Comment thread on fastdeploy/config.py
# Because the ERNIE 4.5 config.json contains two sets of keys, adaptation is required.
self.moe_num_shared_experts = self.n_shared_experts

if hasattr(self, "num_experts_per_tok") and not hasattr(self, "moe_k"):

🟡 Suggestion: the mapping logic itself is correct, but several downstream files still read fd_config.model_config.num_experts_per_tok directly rather than the unified moe_k:

  • fastdeploy/model_executor/models/deepseek_v3.py:173
  • fastdeploy/model_executor/models/qwen3moe.py:68
  • fastdeploy/model_executor/models/gpt_oss.py:119
  • fastdeploy/model_executor/models/glm4_moe.py:178
  • fastdeploy/model_executor/layers/moe/routing_indices_cache.py:170 (the Glm4Moe-specific branch)

Because the mapping only assigns self.moe_k = self.num_experts_per_tok and does not delete the original attribute, these references will not break, so this is non-blocking for merging. Still, a follow-up PR migrating the references above to moe_k would keep them consistent with models such as ernie4_5_moe.py that already use moe_k, and reduce maintenance ambiguity.
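The non-blocking nature of the alias can be checked directly: since the mapping is additive, both attribute names resolve to the same value afterward. A hypothetical illustration, using SimpleNamespace as a stand-in for the real ModelConfig object:

```python
from types import SimpleNamespace

# SimpleNamespace stands in for the real ModelConfig object here.
model_config = SimpleNamespace(num_experts_per_tok=8)

# The same guard as in the PR: alias moe_k without removing the original field.
if hasattr(model_config, "num_experts_per_tok") and not hasattr(model_config, "moe_k"):
    model_config.moe_k = model_config.num_experts_per_tok

# Legacy reads of num_experts_per_tok keep working, and moe_k is now available.
print(model_config.num_experts_per_tok, model_config.moe_k)  # 8 8
```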
