
feat: support pytorch-optimizer training optimizers #1006

Open

mfazrinizar wants to merge 15 commits into roboflow:develop from mfazrinizar:feat/pytorch-optimizer-train

Conversation

@mfazrinizar (Contributor)

What does this PR do?

This PR adds configurable optimizer support to RF-DETR training while keeping the existing AdamW behavior as the default. optimizer="adamw" continues to use RF-DETR's built-in fused torch.optim.AdamW path, and non-default optimizer names are resolved through pytorch-optimizer, for example optimizer="lion" or optimizer="pytorch_optimizer:adamw".

The implementation preserves RF-DETR parameter groups and layer-wise learning rates by building optimizers from the existing get_param_dict() output. It also adds optimizer_kwargs so users can pass optimizer-specific arguments such as AdamW betas or Lion weight_decouple without overriding RF-DETR-managed values like params, lr, weight_decay, or fused.
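As a usage sketch (the RFDETRBase entry point, dataset path, and epoch count below are assumptions for illustration, not taken from this PR), selecting an external optimizer from the public training API could look like:

from rfdetr import RFDETRBase  # assumed entry point; any RF-DETR model class would work the same way

model = RFDETRBase()

# Default: optimizer="adamw" keeps RF-DETR's built-in fused torch.optim.AdamW path.
model.train(dataset_dir="path/to/dataset", epochs=10)

# Non-default names resolve through pytorch-optimizer; optimizer_kwargs passes
# optimizer-specific arguments while RF-DETR keeps control of params, lr,
# weight_decay, and fused.
model.train(
    dataset_dir="path/to/dataset",
    epochs=10,
    optimizer="lion",
    optimizer_kwargs={"weight_decouple": True},
)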

It also wires the options through the public training API, keeps PTL-only optimizer config out of the legacy namespace, adds focused tests, documents the new parameters, and includes pytorch-optimizer in the training extra.

Related Issue(s): Closes #89

Type of Change

  • New feature (non-breaking change that adds functionality)

Testing

  • I have tested this change locally
  • I have added/updated tests for this change

Test details:

Added and updated tests cover:

  • TrainConfig defaults, accepted values, empty-name rejection, and reserved optimizer_kwargs rejection
  • default AdamW behavior remaining backward compatible
  • optimizer_kwargs forwarding to RF-DETR's default AdamW optimizer
  • custom optimizer loading through pytorch-optimizer
  • RF-DETR parameter groups and layer-wise learning rates being preserved for custom optimizers
  • custom optimizer kwargs forwarding
  • explicit pytorch_optimizer: prefix support, including opting into the external AdamW implementation
  • missing dependency and invalid optimizer-name error handling
  • real pytorch-optimizer Lion construction smoke test
  • RFDETR.train(optimizer=..., optimizer_kwargs=...) forwarding into get_train_config()
  • optimizer-only config staying out of the legacy namespace

Local validation was run with PYTHONPATH=src in the rfdetr conda environment, using that environment's Python executable.

Real optimizer smoke tests passed for:

  • direct pytorch-optimizer Lion construction and one optimizer step with a PyTorch parameter group
  • RF-DETR's _build_pytorch_optimizer() constructing Lion with RF-DETR-style parameter groups and running one optimizer step with optimizer_kwargs={"weight_decouple": True} (a standalone sketch of this follows below)
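A minimal standalone sketch of that Lion smoke test, assuming pytorch-optimizer's load_optimizer() and Lion's weight_decouple argument; the parameter shapes and learning rates are illustrative, not RF-DETR's defaults:

import torch
from pytorch_optimizer import load_optimizer

# RF-DETR-style parameter groups: separate groups with their own learning rates.
backbone_param = torch.nn.Parameter(torch.randn(4, 4))
head_param = torch.nn.Parameter(torch.randn(4, 4))
param_groups = [
    {"params": [backbone_param], "lr": 1e-5},  # e.g. backbone group
    {"params": [head_param], "lr": 1e-4},      # e.g. detection head group
]

lion_cls = load_optimizer("lion")
optimizer = lion_cls(param_groups, weight_decay=1e-4, weight_decouple=True)

# One optimizer step to confirm construction and parameter updates work.
loss = (backbone_param.sum() + head_param.sum()) ** 2
loss.backward()
optimizer.step()
optimizer.zero_grad()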

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

The implementation intentionally uses pytorch_optimizer.load_optimizer() instead of create_optimizer() so RF-DETR keeps its existing parameter grouping, backbone learning rates, and scheduler behavior. Some specialized optimizers may still require optimizer-specific kwargs; initialization errors include a hint to check the selected optimizer's supported arguments.
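As a sketch of that error-handling shape (the helper name _instantiate_optimizer appears later in this thread, but its signature here is an assumption; only the idea of wrapping TypeError with a kwargs hint is taken from the PR):

from pytorch_optimizer import load_optimizer

def _instantiate_optimizer(optimizer_cls, param_groups, **optimizer_kwargs):
    # Wrap construction so unsupported optimizer-specific kwargs surface with an
    # RF-DETR hint rather than a bare TypeError.
    try:
        return optimizer_cls(param_groups, **optimizer_kwargs)
    except TypeError as err:
        raise TypeError(
            f"Failed to construct {optimizer_cls.__name__} with "
            f"optimizer_kwargs={optimizer_kwargs!r}; check the selected "
            "optimizer's supported arguments."
        ) from err

# e.g. the class is resolved by name via load_optimizer(), and RF-DETR's own
# param groups are passed through unchanged:
# optimizer = _instantiate_optimizer(load_optimizer("lion"), param_groups, weight_decouple=True)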

@codecov

codecov Bot commented Apr 28, 2026

Codecov Report

❌ Patch coverage is 84.68468% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 80%. Comparing base (2f81ac0) to head (1c4c735).

❌ Your patch check has failed because the patch coverage (85%) is below the target coverage (95%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (80%) is below the target coverage (95%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop   #1006    +/-   ##
========================================
  Coverage       80%     80%            
========================================
  Files          100     100            
  Lines         8457    8564   +107     
========================================
+ Hits          6784    6875    +91     
- Misses        1673    1689    +16     

@Alarmod (Contributor)

Alarmod commented Apr 28, 2026

I tested YOLO26 with AdamW and MuSGD:
ultralytics/ultralytics#23789

from ultralytics.optim.muon import MuSGD
# Initialize optimizer with specific parameter groups
# Use 'use_muon=True' only for 2D+ tensors for the hybrid effect
optimizer = MuSGD(model.parameters(), lr=0.01, momentum=0.9)

Will it be compatible as an external optimizer?

@mfazrinizar (Contributor, Author)

@Alarmod it's compatible with external optimizers provided by pytorch-optimizer that accept normal PyTorch param groups. However, it's not automatically compatible with arbitrary imported optimizer classes like Ultralytics MuSGD yet. That could be a follow-up with a small adapter/registry and dedicated tests; it would be interesting to support, and I'm already working on it.
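A rough illustration of what such an adapter/registry could look like (entirely hypothetical and not part of this PR; EXTERNAL_OPTIMIZERS, register_external_optimizer, and build_external_optimizer are invented names for the example):

import torch

# Hypothetical registry mapping names to externally imported optimizer classes
# that accept standard PyTorch param groups (e.g. Ultralytics MuSGD).
EXTERNAL_OPTIMIZERS: dict[str, type[torch.optim.Optimizer]] = {}

def register_external_optimizer(name: str, cls: type[torch.optim.Optimizer]) -> None:
    EXTERNAL_OPTIMIZERS[name.lower()] = cls

def build_external_optimizer(name: str, param_groups, **kwargs) -> torch.optim.Optimizer:
    # Any optimizer taking (param_groups, **kwargs) slots in without extra glue.
    return EXTERNAL_OPTIMIZERS[name.lower()](param_groups, **kwargs)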

@Borda requested a review from Copilot April 28, 2026 20:42
@Borda added the enhancement (New feature or request) label Apr 28, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Adds configurable optimizer selection to RF-DETR training while preserving the existing fused torch.optim.AdamW behavior by default, and enabling external optimizers via pytorch-optimizer or importable Python optimizer classes.

Changes:

  • Extend TrainConfig with optimizer, optimizer_kwargs, and rank-based optimizer_param_group_overrides validation.
  • Update RFDETRModelModule.configure_optimizers() to build either the default fused AdamW, a pytorch-optimizer optimizer, or a python: imported optimizer while preserving RF-DETR param groups / LRs. (A rough dispatch sketch follows this list.)
  • Add tests and documentation covering new optimizer configuration and ensuring optimizer-only fields don’t leak into the legacy namespace.
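A rough sketch of that dispatch, assuming helper names mentioned elsewhere in this thread (_split_optimizer_name, _build_pytorch_optimizer) and config attributes that may not match the actual code; later commits drop the python: branch, so only the two-branch form is sketched:

import torch

def configure_optimizers(self):
    # Sketch only: real signatures, config attributes, and return shape may differ.
    provider, name = self._split_optimizer_name(self.config.optimizer)
    param_groups = self.model.get_param_dict()  # RF-DETR param groups with layer-wise LRs

    if provider == "adamw":
        # Default path: RF-DETR's built-in fused torch.optim.AdamW.
        return torch.optim.AdamW(
            param_groups,
            lr=self.config.lr,
            weight_decay=self.config.weight_decay,
            fused=self.config.fused_optimizer,
            **self.config.optimizer_kwargs,
        )

    # pytorch_optimizer:<name> (or a bare non-default name) resolves through
    # pytorch-optimizer; the fused flag only applies to the AdamW path.
    return self._build_pytorch_optimizer(name, param_groups, **self.config.optimizer_kwargs)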

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Summary per file:

  • src/rfdetr/training/module_model.py: Implements provider parsing, optimizer loading/instantiation, param-group overrides, and updated optimizer construction in configure_optimizers().
  • src/rfdetr/config.py: Adds OptimizerParamGroupOverride model and new TrainConfig fields + validators for optimizer configuration.
  • src/rfdetr/_namespace.py: Ensures optimizer-only config stays out of the legacy namespace mapping.
  • pyproject.toml: Adds pytorch-optimizer to the train extra.
  • tests/training/test_module_model.py: Adds unit tests for optimizer selection, kwargs forwarding, param-group preservation, overrides, and error handling.
  • tests/training/test_detr_shim.py: Verifies RFDETR.train(...) forwards optimizer config through to get_train_config().
  • tests/training/test_args.py: Verifies optimizer fields are not forwarded into the legacy namespace.
  • tests/models/test_config.py: Adds config default/validation tests for new optimizer-related fields.
  • docs/learn/train/training-parameters.md: Documents new optimizer parameters and provides usage examples.
  • docs/learn/train/customization.md: Updates lifecycle hook documentation to reflect configurable optimizer + overrides.

Three review comment threads on src/rfdetr/training/module_model.py (outdated)
Borda and others added 11 commits April 28, 2026 23:32
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> (3 commits)
The python:/import: provider called importlib.import_module() on a
dotted path originating from TrainConfig.optimizer — an unconstrained
import-time code-execution surface reachable from LightningCLI YAML
configs. Removed _load_python_optimizer and _build_python_optimizer;
collapsed configure_optimizers to the two-branch dispatch (built-in
fused AdamW / pytorch-optimizer). The pytorch_optimizer: provider and
OptimizerParamGroupOverride are unaffected.

- Remove import importlib (now unused)
- Remove _load_python_optimizer(), _build_python_optimizer()
- Update _split_optimizer_name() — only adamw and pytorch_optimizer: valid
- Retarget rank-aware override tests to mock _load_pytorch_optimizer

[resolve roboflow#4] /review finding by foundry:sw-engineer + Codex co-review (report: .temp/output-review-develop-2026-04-28.md):
"C1: _load_python_optimizer performs unbounded importlib.import_module() on config-string path"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Co-authored-by: OpenAI Codex <codex@openai.com>
The API call site uses load_optimizer() and get_supported_optimizers()
from the 3.x series. An unpinned dep risks a silent breaking change on
the next major release.

[resolve roboflow#5] /review finding by foundry:linting-expert (report: .temp/output-review-develop-2026-04-28.md):
"S1: pytorch-optimizer unpinned — fast release cadence risks install breakage"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
The fused CUDA kernel path only applies to optimizer='adamw'. Users who
set model_config.fused_optimizer=True and switch to a pytorch-optimizer
optimizer would previously get no feedback. Add a logger.info message
in the non-default branch so the configuration mismatch is visible.

[resolve roboflow#7] /review finding by foundry:sw-engineer (report: .temp/output-review-develop-2026-04-28.md):
"H2: fused=True silently dropped for non-AdamW — add logger.info"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Add a '### Custom optimizer' section to customization.md documenting
the optimizer=/optimizer_kwargs= TrainConfig fields and linking forward
to training-parameters.md. Add a !!! warning admonition listing SAM,
Lookahead, Ranger, PCGrad, and GradientCentralization as incompatible
with PTL automatic_optimization=True and explaining why.

[resolve roboflow#9] /review finding by foundry:sw-engineer (report: .temp/output-review-develop-2026-04-28.md):
"H4: Wrapping optimizers (SAM/Lookahead) incompatible with PTL; add docs warning"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
The pytorch-optimizer path already wraps TypeError from _instantiate_optimizer
with a hint about optimizer_kwargs. The built-in AdamW branch was unguarded,
so unknown kwargs (e.g. weight_decouple passed to torch AdamW) surfaced as a
bare TypeError with no RF-DETR context. Wrap it consistently.

[resolve roboflow#21] /review finding by foundry:sw-engineer (report: .temp/output-review-develop-2026-04-28.md):
"M5: Built-in AdamW path lacks the _instantiate_optimizer-style TypeError context"

---
Co-authored-by: Claude Code <noreply@anthropic.com>
- test_detr_shim.py: replace python:external_optimizers.HybridOptimizer
  with pytorch_optimizer:lion (valid provider after security removal)
- test_module_model.py: rename unused result/_call_kwargs variables (lint)
- module_model.py: getattr(model, 'num_classes') → direct attr access; fix
  unicode × in comment (lint)

---
Co-authored-by: Claude Code <noreply@anthropic.com>

Labels

enhancement (New feature or request)

Development

Successfully merging this pull request may close these issues.

Support arbitrary optimizer from pytorch-optimizer
