add muon by xxyux · Pull Request #4231 · PaddlePaddle/PaddleFormers

xxyux · 2026-04-07T15:15:40Z

Before submitting

Lint code. If there are lint issues, please format the code first.

# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py

Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

New features

PR changes

This PR adapts the Muon optimizer.
Work in progress.

Description

paddle-bot · 2026-04-07T15:15:51Z

Thanks for your contribution!

CLAassistant · 2026-04-24T05:55:25Z

All committers have signed the CLA.

xxyux · 2026-04-24T10:33:09Z

/re-run all-failed

xxyux · 2026-04-25T04:15:35Z

/re-run all-failed

codecov-commenter · 2026-04-25T04:53:06Z

Codecov Report

❌ Patch coverage is 4.91228% with 271 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@73a1b60).

Files with missing lines	Patch %	Lines
paddleformers/transformers/minimax_m2/modeling.py	0.00%	136 Missing ⚠️
paddleformers/trainer/utils/offload_optimizer.py	0.00%	70 Missing ⚠️
paddleformers/trainer/trainer_utils.py	4.76%	40 Missing ⚠️
paddleformers/trainer/trainer.py	0.00%	18 Missing ⚠️
paddleformers/trainer/training_args.py	73.33%	4 Missing ⚠️
paddleformers/trainer/utils/reshard/common.py	25.00%	3 Missing ⚠️

❌ Your patch status has failed because the patch coverage (4.91%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #4231   +/-   ##
==========================================
  Coverage           ?   38.89%           
==========================================
  Files              ?      474           
  Lines              ?    90061           
  Branches           ?        0           
==========================================
  Hits               ?    35029           
  Misses             ?    55032           
  Partials           ?        0


GuoxiaWang

See the review comments. The existing issues need to be fixed in the next version; this version can be merged first.

GuoxiaWang · 2026-04-25T05:39:24Z

+
+        optimizer._create_accumulators(paddle.base.framework.default_main_program().global_block(), parameter_list)
+        return
+


The next version needs to remove the MoE vs. non-MoE assumption.

GuoxiaWang · 2026-04-25T05:40:17Z

        default="adamw",
        metadata={"help": "The optimizer to use."},
    )
+    muon_exclude_patterns: Optional[List[str]] = field(


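The next version needs to support regular expressions rather than only a string `in` check; otherwise it is hard to pinpoint parameters accurately in multimodal hybrid models.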
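
To illustrate the concern, here is a minimal sketch comparing a substring `in` check with `re.search`. The parameter names and both helper functions are hypothetical and are not taken from this PR; only `muon_exclude_patterns` appears in the diff, and how it is consumed is assumed here.

```python
import re
from typing import Iterable, List


def exclude_by_substring(name: str, patterns: Iterable[str]) -> bool:
    """Substring check: True if any pattern occurs anywhere in the name."""
    return any(p in name for p in patterns)


def exclude_by_regex(name: str, patterns: Iterable[str]) -> bool:
    """Regex check: True if any pattern matches somewhere in the name."""
    return any(re.search(p, name) is not None for p in patterns)


# Hypothetical parameter names for a model with a vision tower and a language model.
param_names: List[str] = [
    "vision_tower.layers.0.mlp.fc1.weight",
    "language_model.layers.0.mlp.gate_proj.weight",
    "language_model.embed_tokens.weight",
    "language_model.lm_head.weight",
]

# Goal (assumed for illustration): exclude only the vision tower's MLP weights.
# The substring "mlp" also catches the language model's MLP weights.
substring_hits = [n for n in param_names if exclude_by_substring(n, ["mlp"])]

# An anchored regex can require both the module prefix and the submodule.
regex_hits = [
    n for n in param_names if exclude_by_regex(n, [r"^vision_tower\..*\.mlp\."])
]
```

With a substring check, narrowing the match means listing more and more literal fragments; a regex can express "this prefix and this submodule" in one pattern.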

GuoxiaWang · 2026-04-25T05:44:42Z


+    # Step 4: mock Muon._muon_update and Muon._apply_optimize
+    # Muon's _muon_update is pure Python (paddle.lerp + paddle.assign),
+    # so it bypasses the _C_ops.adamw_ patch above. We need explicit


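The version currently merged into Paddle has already been modified to support adamw_; a muon_ kernel still needs to be written later. The master weight reload and offload implemented here cannot offload tensor by tensor. The current implementation reloads all master weights up front, so once GPU memory usage is high it simply runs out of memory, which defeats the purpose. This has to be changed in the next version.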
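
The difference between the two strategies can be sketched with a small simulation of peak device memory. Nothing below is Paddle code: `DeviceMemoryTracker`, both step functions, and the tensor sizes are hypothetical stand-ins chosen only to show why reloading everything up front raises the peak.

```python
from typing import Callable, Dict, List


class DeviceMemoryTracker:
    """Counts bytes currently 'on device' and the peak; a stand-in for GPU memory."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0

    def reload(self, nbytes: int) -> None:
        self.current += nbytes
        self.peak = max(self.peak, self.current)

    def offload(self, nbytes: int) -> None:
        self.current -= nbytes


def step_reload_all(sizes: Dict[str, int], tracker: DeviceMemoryTracker,
                    update: Callable[[str], None]) -> None:
    """Reload every master weight first, update all, then offload all."""
    for nbytes in sizes.values():
        tracker.reload(nbytes)
    for name in sizes:
        update(name)
    for nbytes in sizes.values():
        tracker.offload(nbytes)


def step_per_tensor(sizes: Dict[str, int], tracker: DeviceMemoryTracker,
                    update: Callable[[str], None]) -> None:
    """Reload, update, and offload one master weight at a time."""
    for name, nbytes in sizes.items():
        tracker.reload(nbytes)
        update(name)
        tracker.offload(nbytes)


# Hypothetical master weight sizes (arbitrary units).
master_weight_sizes = {"w0": 4, "w1": 8, "w2": 2}
updated: List[str] = []

all_tracker = DeviceMemoryTracker()
step_reload_all(master_weight_sizes, all_tracker, updated.append)

per_tracker = DeviceMemoryTracker()
step_per_tensor(master_weight_sizes, per_tracker, updated.append)
```

In this sketch the reload-all peak equals the sum of all master weights, while the per-tensor peak equals the largest single tensor. A real implementation would presumably also need to overlap host-device copies with the update to hide transfer latency, which this simulation does not model.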

GuoxiaWang · 2026-04-25T05:52:20Z

    config: MiniMaxM2Config

+    @classmethod
+    def _build_muon_slice_config(cls, model, config) -> dict:


Why is a default function defined here? It has not been polished yet; wouldn't a default be prone to causing problems?

xxyux force-pushed the add_muon branch 2 times, most recently from 78430a6 to f6698f4 · April 7, 2026 15:25

add muon

2779545

xxyux force-pushed the add_muon branch from 75ae6d4 to 7d78d0d · April 24, 2026 06:04

add muon optimizer slice config for MiniMaxM2 model

df9fbc0

xxyux force-pushed the add_muon branch from 7d78d0d to df9fbc0 · April 24, 2026 10:13

tianlef added the skip-ci: fleet-model-test label Apr 25, 2026

risemeup1 approved these changes Apr 25, 2026

risemeup1 merged commit c0b7e21 into PaddlePaddle:develop Apr 25, 2026
26 of 31 checks passed

GuoxiaWang approved these changes Apr 25, 2026

risemeup1 merged 2 commits into PaddlePaddle:develop from xxyux:add_muon

6 participants

