[FDConfig] Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility #7509
Open
xyxinyang wants to merge 1 commit into PaddlePaddle:develop from
Conversation
PaddlePaddle-bot left a comment:
Thanks for your contribution!

🤖 AI Code Review | 2026-04-20 14:57 CST
📋 Review Summary
PR overview: adds a num_experts_per_tok → moe_k field mapping in ModelConfig.override_name_from_config() to support MoE models such as Qwen3VLMOE and DeepSeek V3, which use num_experts_per_tok in their configs.
Scope of change: fastdeploy/config.py — config field mapping
Impact tag: FDConfig
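The mapping itself is a small change. Below is a minimal sketch of how such a field alias could work; the method name override_name_from_config comes from the PR description, but the class body is an illustrative assumption, not FastDeploy's actual implementation:

```python
class ModelConfig:
    """Illustrative stand-in for FastDeploy's ModelConfig (not the real class)."""

    # External-name -> canonical-name aliases; this PR adds the
    # num_experts_per_tok -> moe_k entry.
    _NAME_ALIASES = {"num_experts_per_tok": "moe_k"}

    def __init__(self, config_dict):
        self.moe_k = None
        self.override_name_from_config(config_dict)

    def override_name_from_config(self, config_dict):
        # Copy every key, rewriting aliased names to canonical ones so MoE
        # models that ship num_experts_per_tok still populate moe_k.
        for key, value in config_dict.items():
            setattr(self, self._NAME_ALIASES.get(key, key), value)


# A DeepSeek-V3-style config that uses num_experts_per_tok:
cfg = ModelConfig({"num_experts_per_tok": 8, "hidden_size": 4096})
print(cfg.moe_k)  # 8
```

With this shape of mapping, downstream code can read fd_config.model_config.moe_k regardless of which name the upstream model config used.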
Issues

| Level | File | Summary |
|---|---|---|
| 🟡 Suggestion | multiple model files | the unified mapping is incomplete; downstream consumers still use num_experts_per_tok directly |

🟡 Suggestion: after unifying the mapping, downstream consumers were not switched to moe_k
This PR maps num_experts_per_tok to moe_k in override_name_from_config(), and the mapping logic itself is correct. However, several places in the codebase still use fd_config.model_config.num_experts_per_tok directly instead of the unified moe_k:
| File | Line | Current usage |
|---|---|---|
| model_executor/models/qwen3moe.py | 69 | top_k=fd_config.model_config.num_experts_per_tok |
| model_executor/models/deepseek_v3.py | 167 | top_k=fd_config.model_config.num_experts_per_tok |
| model_executor/models/gpt_oss.py | 120 | top_k=fd_config.model_config.num_experts_per_tok |
| model_executor/models/glm4_moe.py | 176 | top_k=fd_config.model_config.num_experts_per_tok |
| model_executor/layers/moe/routing_indices_cache.py | 170 | special-case branch for Glm4MoeForCausalLM uses num_experts_per_tok |
Suggestions:
- Replace fd_config.model_config.num_experts_per_tok with fd_config.model_config.moe_k in the files above, so that the "Unify" goal is actually achieved.
- The hard-coded special-case branch for Glm4MoeForCausalLM at routing_indices_cache.py:169-172 can be removed once the mapping is unified; using moe_k directly simplifies the logic:
```python
# Current (can be simplified):
if fd_config.model_config.architectures[0] == "Glm4MoeForCausalLM":
    self.moe_top_k = fd_config.model_config.num_experts_per_tok
else:
    self.moe_top_k = fd_config.model_config.moe_k

# Suggested:
self.moe_top_k = fd_config.model_config.moe_k
```

Overall assessment
The mapping logic is correct and will not introduce bugs. But for a PR titled "Unify", the downstream consumers should be updated as well to complete the unification; otherwise future maintainers may be unsure whether to use moe_k or num_experts_per_tok.
Motivation
Unify num_experts_per_tok to moe_k in ModelConfig for MoE model compatibility. This enables R3 support for models like Qwen3VLMOE, DeepSeek V3, and other MoE models that use num_experts_per_tok instead of moe_k in their config.
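To illustrate the compatibility goal, here is a hedged sketch (normalize_moe_config is a hypothetical helper for illustration, not FastDeploy's real API) showing that configs naming the top-k field either way resolve to the same moe_k:

```python
def normalize_moe_config(raw: dict) -> dict:
    """Sketch: alias num_experts_per_tok to moe_k; configs already
    using moe_k pass through unchanged."""
    out = dict(raw)
    if "moe_k" not in out and "num_experts_per_tok" in out:
        out["moe_k"] = out.pop("num_experts_per_tok")
    return out


# DeepSeek-V3-style config (num_experts_per_tok) vs. one already using moe_k:
print(normalize_moe_config({"num_experts_per_tok": 8}))  # {'moe_k': 8}
print(normalize_moe_config({"moe_k": 4}))                # {'moe_k': 4}
```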
Modifications
Usage or Command
N/A
Accuracy Tests
N/A
Checklist
- Add at least one module tag to the PR title: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- Run pre-commit before commit.
- For a release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.