
Add Muon Optimizer [cherry-pick from dev]#78679

Merged
sneaxiy merged 7 commits into PaddlePaddle:release/3.3 from xxyux:release/3.3 on Apr 15, 2026
Conversation

@xxyux
Contributor

@xxyux xxyux commented Apr 14, 2026

PR Category

Execute Infrastructure

PR Types

New features

Description

Add Muon Optimizer
dev PR: #78335

Does this cause precision changes?

@paddle-bot

paddle-bot bot commented Apr 14, 2026

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

xxyux and others added 6 commits April 14, 2026 20:57
Add Muon optimizer implementation with Newton-Schulz orthogonalization
for distributed training:

- Muon optimizer (python/paddle/optimizer/muon.py):
  - Newton-Schulz iteration for orthogonal gradient updates
  - QKV split modes: per_head, qkv_sep, full
  - FFN gate_up split support
  - Multiple NS coefficient types: simple, quintic, polar_express, aol

- MuonShardingOptimizer:
  - Whole-tensor assignment for 2D parameters (Muon)
  - Element-wise sharding for non-2D parameters (AdamW)
  - Hybrid memory balancing across ranks

- Test coverage:
  - All 24 parameter combinations tested
  - 2-GPU sharding validation against single-GPU reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
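For reference, the Newton-Schulz orthogonalization at the core of Muon can be sketched in a few lines of numpy. This is an illustrative sketch, not the PR's implementation; the quintic coefficients are the widely cited (3.4445, -4.7750, 2.0315) tuple, and the step count is a common default:

```python
import numpy as np

def newton_schulz(g, steps=5, eps=1e-7):
    """Quintic Newton-Schulz iteration: drives the singular values of g
    toward 1, approximating the orthogonal polar factor of the gradient."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g / (np.linalg.norm(g) + eps)  # normalize so all singular values < 1
    transposed = x.shape[0] > x.shape[1]
    if transposed:                     # iterate on the short-fat orientation
        x = x.T
    for _ in range(steps):
        s = x @ x.T                    # small Gram matrix (rows x rows)
        x = a * x + (b * s + c * (s @ s)) @ x
    return x.T if transposed else x
```

After a few iterations the singular values oscillate in a narrow band around 1, which is why Muon-style updates behave like a whitened / orthogonalized gradient step.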
MuonShardingOptimizer is a dygraph-only optimizer designed for Muon
optimizer with distributed sharding support. It should not be loaded
by the static graph meta optimizer factory, similar to:
- HybridParallelOptimizer
- HeterParallelOptimizer
- DGCMomentumOptimizer

This fix prevents static graph tests (e.g., test_static_model_parallel,
test_raw_program_optimizer) from crashing when MuonShardingOptimizer
tries to access dynamic-graph-only attributes like _parameter_list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
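The hybrid balancing described in the first commit message (whole 2D tensors assigned to single ranks for Muon, element-wise sharding for everything else) can be sketched as a greedy size-balancing pass. All names and the plan structure here are hypothetical illustrations, not the actual MuonShardingOptimizer API:

```python
import numpy as np

def assign_params(params, num_ranks):
    """Sketch of hybrid sharding: each 2D (Muon) parameter goes whole to
    the currently lightest rank (Muon needs the full matrix for its
    orthogonal update); non-2D (AdamW) parameters are split element-wise
    evenly across all ranks. `params` is a list of (name, shape) tuples."""
    load = [0] * num_ranks                  # elements currently owned per rank
    plan = {}
    # place the big 2D tensors first, largest to smallest, to keep the balance tight
    two_d = sorted((p for p in params if len(p[1]) == 2),
                   key=lambda p: -int(np.prod(p[1])))
    for name, shape in two_d:
        rank = load.index(min(load))        # lightest rank gets the whole tensor
        plan[name] = ("whole", rank)
        load[rank] += int(np.prod(shape))
    for name, shape in params:
        if len(shape) != 2:                 # 1D biases/norms: shard element-wise
            n = int(np.prod(shape))
            plan[name] = ("elementwise", n // num_ranks)
            for r in range(num_ranks):
                load[r] += n // num_ranks
    return plan, load
```

The design point this illustrates: Muon's orthogonal update is a whole-matrix operation, so 2D parameters cannot be split the way AdamW state can, and memory balance has to come from how whole tensors are distributed.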
…grad

- Use MixPrecisionLayer + MixPrecisionOptimizer pattern so params have
  main_grad before MuonShardingOptimizer init, enabling the safe
  main_grad path in the refactored clear_grad (which now iterates over
  all parameters instead of only 2D params)
- Add paddle.amp.auto_cast in train_batch for proper BF16 forward pass
- Use np.random.randn for weight init (zero-centered, better NS stability)
- Cast params to float32 before numpy comparison to avoid BF16 uint16
  bit-pattern comparison issues

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l parameter list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add MLAInfo dataclass and MLA split_head orthogonal update in muon.py
- Add clear_param_storage/reset_param_storage methods in MuonShardingOptimizer
- Support MoE expert param storage management via _color_to_comm_buffer_list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
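The per_head / split_head modes mentioned in the commit messages apply the orthogonal update independently to each attention head's slice of the weight, rather than to the fused matrix as a whole. A minimal numpy sketch, using an exact SVD polar factor as a stand-in for Muon's Newton-Schulz approximation (function names are illustrative, not Paddle's API):

```python
import numpy as np

def orthogonalize(g):
    """Exact polar factor via SVD; Muon approximates this with Newton-Schulz."""
    u, _, vt = np.linalg.svd(g, full_matrices=False)
    return u @ vt

def per_head_update(grad, num_heads):
    """grad: [hidden_in, num_heads * head_dim] fused projection gradient.
    Split along the output dimension into per-head blocks, orthogonalize
    each block independently, then re-fuse."""
    blocks = np.split(grad, num_heads, axis=1)
    return np.concatenate([orthogonalize(b) for b in blocks], axis=1)
```

The motivation for splitting: a fused QKV or multi-head weight is really several logically independent matrices, and orthogonalizing them jointly mixes their spectra.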
@xxyux
Contributor Author

xxyux commented Apr 14, 2026

/re-run all-failed

@codecov-commenter

Codecov Report

❌ Patch coverage is 12.99886% with 763 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release/3.3@2115d0a). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...d/fleet/meta_optimizers/muon_sharding_optimizer.py 7.64% 495 Missing ⚠️
python/paddle/optimizer/muon.py 19.09% 267 Missing ⚠️
...ers/dygraph_optimizer/hybrid_parallel_optimizer.py 85.71% 1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (12.99%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@              Coverage Diff               @@
##             release/3.3   #78679   +/-   ##
==============================================
  Coverage               ?   12.99%           
==============================================
  Files                  ?        7           
  Lines                  ?      877           
  Branches               ?        0           
==============================================
  Hits                   ?      114           
  Misses                 ?      763           
  Partials               ?        0           


Collaborator

@sneaxiy sneaxiy left a comment


LGTM for coverage due to lack of BF16.

@sneaxiy sneaxiy merged commit a12dc4d into PaddlePaddle:release/3.3 Apr 15, 2026
134 of 158 checks passed


4 participants