RF-DETR does not automatically resize input images during training (resolution parameter seems ineffective) #979

@universeliang

Description

Search before asking

  • I have searched the RF-DETR issues and found no similar bug report.

Bug

I am running into a problem when training RF-DETR on images of varying resolutions. My dataset contains images of different sizes, such as 960×540 and 1920×1080. Models such as YOLO and RT-DETR resize images automatically during training (e.g., to 640×640) when an image_size or similar parameter is set. In RF-DETR, however, the input images do not appear to be resized, which leads to the following error:

AssertionError: Backbone requires input shape to be divisible by 32, but got torch.Size([8, 3, 1080, 1920])
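
To make the failure concrete, the divisibility check can be reproduced in isolation. The sketch below is plain Python and not RF-DETR code; `is_divisible` and `nearest_multiples` are hypothetical helper names introduced only for illustration. It suggests that only the height (1080) violates the constraint, while the width (1920) already satisfies it:

```python
def is_divisible(size: int, block: int = 32) -> bool:
    """Return True if `size` is an exact multiple of `block`."""
    return size % block == 0


def nearest_multiples(size: int, block: int = 32) -> tuple[int, int]:
    """Return the closest multiples of `block` at or below and at or above `size`."""
    lower = (size // block) * block
    upper = lower if lower == size else lower + block
    return lower, upper


# Shape taken from the error message: [8, 3, 1080, 1920]
height, width = 1080, 1920

print(is_divisible(height), nearest_multiples(height))  # False (1056, 1088)
print(is_divisible(width), nearest_multiples(width))    # True (1920, 1920)
```

So a tensor of this shape would pass the check only if the height were changed to 1056 or 1088, or if both sides were resized to a multiple of 32 such as 640.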

I attempted the following approaches:

model.train(resolution=640) 

model = RFDETRNano(resolution=640)

However, neither had any effect: the input images were still passed to the model at their original resolution, and the same error persisted.
Does RF-DETR support automatic image resizing during training (similar to YOLO or RT-DETR)?
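
For comparison, the behavior expected here can be sketched as an aspect-preserving resize followed by padding to a square (often called letterboxing), which is one common approach in other detection pipelines. This is only an illustration of the geometry under that assumption, not RF-DETR's API; `letterbox_geometry` is a hypothetical helper and the actual pixel resampling would be done by an image library:

```python
def letterbox_geometry(width: int, height: int, target: int = 640) -> tuple[int, int, int, int]:
    """Compute the resized size and the padding needed to reach a target x target canvas.

    Returns (new_width, new_height, pad_width, pad_height), where the padding
    is the total number of pixels to add along each axis.
    """
    # Scale so that the longer side exactly fits the target.
    scale = min(target / width, target / height)
    new_width = round(width * scale)
    new_height = round(height * scale)
    return new_width, new_height, target - new_width, target - new_height


# The two image sizes mentioned in this report share the same 16:9 aspect ratio,
# so both end up with identical geometry on a 640x640 canvas.
for size in [(960, 540), (1920, 1080)]:
    print(size, "->", letterbox_geometry(*size))
```

With a 640×640 target, both dataset sizes would be resized to 640×360 and padded by 280 pixels vertically, giving a shape that is divisible by 32 on both axes.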

Environment

  • RF-DETR: 0.1.0
  • OS: Windows 10
  • Python: 3.10.0
  • PyTorch: 2.5.1
  • GPU: NVIDIA RTX 4070 Ti SUPER

Minimal Reproducible Example

from rfdetr import RFDETRBase, RFDETRNano, RFDETRSmall, RFDETRMedium, RFDETRLarge

if __name__ == '__main__':
    model = RFDETRNano(resolution=640, num_classes=10)
    model.train(
        resolution=640
    )

Additional

No response

Are you willing to submit a PR?

  • Yes, I'd like to help by submitting a PR!

Labels: bug (Something isn't working)
