Integration of dinov3, including optional train and inference scripts #324
S-Mahoney wants to merge 7 commits into roboflow:develop
Conversation
I have read the CLA Document and I sign the CLA. |
Changing the backbone to DINOv3 gives much better object detection, though there might be a latency issue...
I'd add the option to use the ConvNeXt variant. Still, I think the main issue is the license of the DINOv3 model.
Hey there, I wonder: do you just load the weights of dinov3_convnext_tiny, or do you load a different model's weights?
Hi @S-Mahoney, thank you for the work! Could you sync the pull request to include the newest changes from the main repo? |
Pull request overview
This pull request integrates DINOv3 (the latest version of Meta's DINO vision transformer) as an alternative backbone encoder into the RF-DETR object detection framework. The integration allows users to choose between DINOv2 and DINOv3 encoders through configuration, with support for loading weights from either HuggingFace Hub or local PyTorch Hub repositories.
Changes:
- Added DINOv3 wrapper class with flexible weight loading (HuggingFace or local repo)
- Extended configuration system to support DINOv3 encoder variants (small, base, large) with automatic parameter validation and adjustment
- Improved training engine with better AMP handling and gradient context management
- Added example training and inference scripts demonstrating v2/v3 encoder selection
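As a quick usage sketch of the encoder selection (the parameter and alias names here are assumptions based on the PR description, not a confirmed API):

```python
# Hypothetical usage sketch: choosing the encoder through configuration.
# The exact encoder strings accepted by the final API are not confirmed here.
from rfdetr import RFDETRBase

model_v2 = RFDETRBase(encoder="dinov2_windowed_small")  # existing DINOv2 path
model_v3 = RFDETRBase(encoder="dinov3_small")           # new DINOv3 path from this PR
```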
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 31 comments.
| File | Description |
|---|---|
| rfdetr/models/backbone/dinov3.py | New DINOv3 wrapper implementing multi-path forward logic with HF and PyTorch Hub support |
| rfdetr/models/backbone/backbone.py | Extended to branch between dinov2 and dinov3 encoders based on the model name prefix (sketched below) |
| rfdetr/models/backbone/dinov3_configs/*.json | Configuration files for the dinov3 small/base/large model architectures |
| rfdetr/config.py | Added the EncoderName type, DINOv3 config fields, and validators for automatic parameter adjustment |
| rfdetr/engine.py | Updated AMP context manager logic and replaced inference_mode with no_grad for interpolation |
| rfdetr/train_v2_or_v3.py | New training script with encoder aliasing and environment-based configuration |
| rfdetr/inference_test.py | New demo script for testing DINOv2/v3 inference with URL-based image input |
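For orientation, the prefix-based branching described for backbone.py might look roughly like this (a minimal sketch; the function, module, and class names are illustrative, not the PR's actual code):

```python
# Minimal sketch of prefix-based encoder branching (names are illustrative).
def build_encoder(encoder_name: str, **kwargs):
    if encoder_name.startswith("dinov3"):
        from rfdetr.models.backbone.dinov3 import DinoV3  # new wrapper in this PR
        return DinoV3(name=encoder_name, **kwargs)
    from rfdetr.models.backbone.dinov2 import DinoV2  # existing DINOv2 path
    return DinoV2(name=encoder_name, **kwargs)
```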
```python
torch.set_grad_enabled(True)  # safety
with torch.inference_mode(False):
    with autocast(**get_autocast_args(args)):
        outputs = model(new_samples, new_targets)
        loss_dict = criterion(outputs, new_targets)
        weight_dict = criterion.weight_dict
        losses = sum(
            (1 / args.grad_accum_steps) * loss_dict[k] * weight_dict[k]
            for k in loss_dict.keys()
            if k in weight_dict
        )
```
The indentation of the autocast block and its contents appears to be changed. While the torch.inference_mode(False) wrapper was added for safety, the call to torch.set_grad_enabled(True) at line 133 is redundant since torch.inference_mode(False) already ensures gradients are enabled. Consider removing the torch.set_grad_enabled(True) call to simplify the code.
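The simplification the review proposes would look roughly like this (a sketch reusing the names from the snippet above):

```python
# Per the review: inference_mode(False) already guarantees gradient tracking
# here, so the separate set_grad_enabled(True) call can be dropped.
with torch.inference_mode(False):
    with autocast(**get_autocast_args(args)):
        outputs = model(new_samples, new_targets)
        loss_dict = criterion(outputs, new_targets)
```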
```diff
 num_select: int = 300
 projector_scale: List[Literal["P3", "P4", "P5"]] = ["P4"]
-out_feature_indexes: List[int] = [2, 5, 8, 11]
+out_feature_indexes: List[int] = [2, 4, 5, 9]
```
The default out_feature_indexes is changed from [2, 5, 8, 11] to [2, 4, 5, 9], but there's a validator at lines 101-108 that forces it to [8, 11] when using dinov3 encoders. This creates inconsistency and makes it unclear what the actual indexes will be. Consider documenting why these specific indexes were chosen and whether the default should be different for v2 vs v3.
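One way to resolve that ambiguity, sketched with illustrative names, is to keep explicit per-encoder defaults rather than a single default that a validator later overrides:

```python
# Illustrative sketch: per-encoder defaults make the effective indexes
# visible at the definition site instead of inside a validator.
DEFAULT_OUT_FEATURE_INDEXES = {
    "dinov2": [2, 5, 8, 11],  # existing default
    "dinov3": [8, 11],        # what the validator currently forces
}

def default_out_feature_indexes(encoder_name: str) -> list[int]:
    family = "dinov3" if encoder_name.startswith("dinov3") else "dinov2"
    return list(DEFAULT_OUT_FEATURE_INDEXES[family])
```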
```python
if cand is not None:
    if torch.is_tensor(cand) and cand.dim() == 4:
        # Already a spatial map [B, C, Hp, Wp]
        # Repeat to match requested out_feature_indexes count
        C = cand.shape[1]
        if C != self.hidden_size:
            self.hidden_size = int(C)
            self._out_feature_channels = [self.hidden_size] * len(self.out_feature_indexes)
        return [cand for _ in self.out_feature_indexes]
    # Otherwise assume tokens
    tokens = cand
    # If [HW, C] or [B*HW, C], _tokens_to_map will handle reshape
    feats = [self._tokens_to_map(tokens, B, H, W) for _ in self.out_feature_indexes]
    return feats
```
In the forward_features fallback path, when no suitable candidate is found in the dictionary (line 179), the code falls through without raising an error. This could lead to unexpected behavior. Consider raising a more informative error if cand remains None after attempting to extract features from the dictionary.
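A guard along the lines the review suggests might look like this (a sketch; the output-dict variable name is an assumption):

```python
# Hypothetical guard for the fallback path: fail loudly instead of silently
# falling through when no usable tensor is found in the backbone output dict.
if cand is None:
    raise RuntimeError(
        "forward_features: no spatial map or token tensor found in backbone "
        f"output; available keys: {list(out_dict.keys())}"
    )
```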
```python
# auto-fit out_feature_indexes to avoid projector shape mismatches
@field_validator("out_feature_indexes", mode="after")
def _coerce_out_feats_for_backbone(cls, v, info: ValidationInfo):
```
Normal methods should have 'self', rather than 'cls', as their first parameter.
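Worth noting: pydantic v2's field_validator implicitly turns the decorated function into a classmethod, so cls is in fact the conventional first parameter here. The usual way to make that explicit (and quiet linters like this one) is an explicit @classmethod decorator, as in this minimal sketch (the class and field shown are illustrative):

```python
from typing import List
from pydantic import BaseModel, ValidationInfo, field_validator

class ModelConfig(BaseModel):
    out_feature_indexes: List[int] = [2, 5, 8, 11]

    # auto-fit out_feature_indexes to avoid projector shape mismatches
    @field_validator("out_feature_indexes", mode="after")
    @classmethod
    def _coerce_out_feats_for_backbone(cls, v: List[int], info: ValidationInfo) -> List[int]:
        return v  # adjustment logic elided in this sketch
```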
- Remove leftover ChatGPT appendix (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>)
- Remove unused import 'Field' (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>)
- Remove unused import 'List' (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>)
- Remove unused import 'platform' (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>)
- Remove unused import 'nullcontext' (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>)
- Remove commented-out code clutter (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>)
Force-pushed from 60b16c1 to 523f9df.
Description
Integration of the DINOv3 wrapper into the RF-DETR pipeline.
Included are scripts that allow training with both the v2 and v3 encoders using rfdetr, as well as inference test scripts.
Because DINOv3 was only recently released, access to its weights is gated: set your HUGGINGFACE_HUB_TOKEN for private Hub access (requires permission), or request access to the DINOv3 weights from Meta and clone the DINOv3 repo locally.
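The two access paths might be exercised roughly like this (a sketch; the entrypoint name and file paths are illustrative, and the DINOv3 hub entrypoint signature should be checked against Meta's repo):

```python
import os
import torch

# Path 1: gated HuggingFace Hub access (token must have DINOv3 permission).
os.environ["HUGGINGFACE_HUB_TOKEN"] = "hf_..."  # do not hard-code in real use

# Path 2: local clone of Meta's DINOv3 repo, loaded via torch.hub.
model = torch.hub.load(
    "/path/to/dinov3",                  # local clone of the DINOv3 repo
    "dinov3_vits16",                    # hub entrypoint name (illustrative)
    source="local",
    weights="/path/to/checkpoint.pth",  # checkpoint obtained from Meta
)
```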
Type of change
How has this change been tested? Please provide a test case or example of how you tested the change.
Tested via the supplied training and inference scripts. The change still allows activation and use of the v2 encoder while also enabling v3. Further training of the pre-trained weights is required for inference quality to be on par with the DINOv2 pre-trained RF-DETR.
Any specific deployment considerations
Licensing requirements need to be checked by the RF-DETR owners before this branch is deployed.
I have read the CLA Document and I sign the CLA.