[DO NOT MERGE] GLM-4.5 MoE: fix softmax/bf16 config for alignment testing by zhanghonggeng · Pull Request #4640 · PaddlePaddle/PaddleFormers

zhanghonggeng · 2026-06-09T11:18:35Z

…ting

Before submitting

Lint code. If there are lint issues, please format the code first.

# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py

Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description


Paddle-CI-Bot · 2026-06-09T12:03:17Z

PaddleFormers Log Analysis

Run #27204580702 · Attempt 1

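Log Analysis Report

| Pipeline name | Issue label | Suggested fix | Log snippet |
| --- | --- | --- | --- |
| Unittest GPU CI | Other (dtype mismatch) | Add the missing bf16 cast before softmax/bmm in the `glm4_moe_dsa` modeling code so that attention_probs and value have the same dtype | Error code |
| Model Unittest GPU CI | Other (generated output differs from baseline) | Update the `glm4_moe` baselines for each lora/lora_tp_pp train_type, or revert this PR's softmax/bf16 config changes in `glm4_moe/modeling.py` and realign | Error code |

Failed test cases: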

# Unittest GPU CI
tests/transformers/glm_moe_dsa/test_modeling.py::GlmMoeDsaModelTest::test_GlmMoeDsa_lm_head_model

# Model Unittest GPU CI
scripts/regression/test_models.py::TestTrain::test_lora[glm4_moe-sft]
scripts/regression/test_models.py::TestTrain::test_lora[glm4_moe-dpo]
scripts/regression/test_models.py::TestTrain::test_lora[glm4_moe-pt]
scripts/regression/test_models.py::TestTrain::test_lora_tp_pp[glm4_moe-sft]
scripts/regression/test_models.py::TestTrain::test_lora_tp_pp[glm4_moe-pt]
scripts/regression/test_models.py::TestTrain::test_lora_tp_pp[glm4_moe-dpo]
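
To reproduce the failures listed above locally, the node IDs can be passed directly to pytest. The following is a minimal sketch: the helper name `build_pytest_command` is hypothetical, and it assumes pytest is installed in the active environment and that the command is run from the repository root.

```python
import sys
from typing import List, Sequence

# Node IDs copied verbatim from the CI report above.
FAILED_NODE_IDS = [
    "tests/transformers/glm_moe_dsa/test_modeling.py::GlmMoeDsaModelTest::test_GlmMoeDsa_lm_head_model",
    "scripts/regression/test_models.py::TestTrain::test_lora[glm4_moe-sft]",
    "scripts/regression/test_models.py::TestTrain::test_lora[glm4_moe-dpo]",
    "scripts/regression/test_models.py::TestTrain::test_lora[glm4_moe-pt]",
    "scripts/regression/test_models.py::TestTrain::test_lora_tp_pp[glm4_moe-sft]",
    "scripts/regression/test_models.py::TestTrain::test_lora_tp_pp[glm4_moe-pt]",
    "scripts/regression/test_models.py::TestTrain::test_lora_tp_pp[glm4_moe-dpo]",
]


def build_pytest_command(node_ids: Sequence[str]) -> List[str]:
    """Return an argv list that runs only the given pytest node IDs."""
    return [sys.executable, "-m", "pytest", *node_ids]


# Usage (not executed here):
#   import subprocess
#   subprocess.run(build_pytest_command(FAILED_NODE_IDS), check=False)
```

Passing the arguments as a list rather than a shell string keeps the square brackets in the parametrized IDs from being interpreted as shell glob patterns.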

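Root cause analysis:

This PR's commit message is [DO NOT MERGE] GLM-4.5 MoE: fix softmax/bf16 config for alignment testing. It modifies paddleformers/transformers/glm4_moe/modeling.py (+12/-0 lines) and the glm4_moe_dsa-related code.

Unittest GPU CI: in glm_moe_dsa's test_GlmMoeDsa_lm_head_model, paddle.bmm(attention_probs, value) is called with attention_probs in bfloat16, but the underlying bmm kernel tries to read it as float32, so the dtype mismatch raises InvalidArgument. The root cause is that this PR changed the softmax/dtype configuration in the DSA attention path, so attention_probs is no longer implicitly converted to float32.
Model Unittest GPU CI: for glm4_moe lora / lora_tp_pp (sft/dpo/pt, 6 cases in total), the tokens produced by generate do not match the baseline (e.g. actual [44551,44551,...] vs. expected [10564,10564,...]). This follows directly from the changed softmax/bf16 behavior in modeling, which alters the output of the model's inference path, while the baseline was not updated to match.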
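
Why a precision change can alter generated tokens can be illustrated without any framework. The sketch below does not reproduce this PR's actual change; it only shows that rounding scores to bfloat16 (which keeps 8 significant bits) can collapse two nearby values into a tie and change the result of a greedy argmax. The helper names and the example scores are made up for illustration, and resolving ties to the lowest index is an assumption of this sketch.

```python
import struct
from typing import List


def round_to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even)."""
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round-to-nearest-even on the 16 low bits that bfloat16 discards.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def greedy_token(scores: List[float]) -> int:
    """Index of the largest score; ties resolve to the lowest index."""
    return max(range(len(scores)), key=lambda i: scores[i])


scores_full = [1.001, 1.002]                        # hypothetical scores
scores_bf16 = [round_to_bf16(s) for s in scores_full]

print(greedy_token(scores_full))   # 1
print(scores_bf16)                 # [1.0, 1.0]
print(greedy_token(scores_bf16))   # 0
```

Once a single token flips during greedy decoding, every later step conditions on a different prefix, which is consistent with an entire generated sequence diverging from the baseline.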

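Suggested fixes:

Fix the GlmMoeDsa dtype issue: in paddleformers/transformers/glm_moe_dsa/modeling.py, add an explicit dtype alignment before paddle.bmm in core_attention so that attention_probs and value are both bf16:
```
# Cast before bmm so that both operands share the same dtype
value = value.cast(attention_probs.dtype)
context = paddle.bmm(attention_probs, value)
```
Alternatively, check whether the softmax configuration mistakenly sets compute_type=float32, which would cause the dtype mismatch.
Update the glm4_moe baseline: if this PR's softmax/bf16 change is the intended behavior, update the excepted_result baseline token ids in scripts/regression/test_models.py for each glm4_moe training mode (sft/dpo/pt × lora/lora_tp_pp) to match. You can run locally with --update-baseline=true to regenerate them, then commit the result.
Keep the DO NOT MERGE marker: the PR title already carries it. Remove it only after both fixes above have been verified.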
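
Before regenerating a baseline, it can help to see where the generated token ids first diverge from the expected ones. The following is a minimal sketch of such a comparison. The function `first_mismatch` is hypothetical and not part of scripts/regression/test_models.py, and the sample ids are only the two values visible in the log excerpt above (the rest of each sequence is elided there).

```python
from typing import Optional, Sequence, Tuple

Mismatch = Tuple[int, Optional[int], Optional[int]]


def first_mismatch(actual: Sequence[int], expected: Sequence[int]) -> Optional[Mismatch]:
    """Return (position, actual_id, expected_id) for the first difference, or None."""
    for pos, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            return (pos, a, e)
    if len(actual) != len(expected):
        # One sequence is a strict prefix of the other.
        pos = min(len(actual), len(expected))
        a = actual[pos] if pos < len(actual) else None
        e = expected[pos] if pos < len(expected) else None
        return (pos, a, e)
    return None


# Only the ids shown in the log; the full sequences are not in the report.
actual_ids = [44551, 44551]
expected_ids = [10564, 10564]
print(first_mismatch(actual_ids, expected_ids))  # (0, 44551, 10564)
```

A mismatch at position 0, as in the logged example, suggests the divergence is already present at the first generated token rather than accumulating over many decoding steps.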


[DO NOT MERGE] GLM-4.5 MoE: fix softmax/bf16 config for alignment tes…

ee84a46

…ting

zhanghonggeng force-pushed the glm_1 branch from 8d1fe46 to ee84a46 Compare June 9, 2026 11:19

zhanghonggeng wants to merge 1 commit into PaddlePaddle:develop from zhanghonggeng:glm_1
