@Borda Borda commented Jan 21, 2026

This pull request adds robust support for both COCO and YOLO Roboflow datasets by auto-detecting the dataset format and delegating to the appropriate loader. It introduces utility functions for format detection, improves class name extraction for both formats, and simplifies argument handling in dataset builders.

Roboflow Dataset Format Detection and Loading:

  • Added detect_roboflow_format and a new build_roboflow function in rfdetr/datasets/__init__.py to automatically detect whether a Roboflow dataset is in COCO or YOLO format and call the correct builder (build_roboflow_from_coco or build_roboflow_from_yolo).
  • Updated imports and function references to support the new detection and builder logic, and included the YoloDetection class for type checking.
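The detect-then-delegate flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the function names `detect_format_sketch` and `build_roboflow_sketch`, the marker files checked (`train/_annotations.coco.json` for COCO, `data.yaml` plus `train/labels` for YOLO), and the `builders` mapping are all assumptions made for the example.

```python
from pathlib import Path


def detect_format_sketch(dataset_dir):
    """Return "coco" or "yolo" based on marker files (assumed layout)."""
    root = Path(dataset_dir)
    # Assumption: COCO exports keep annotations in train/_annotations.coco.json.
    if (root / "train" / "_annotations.coco.json").is_file():
        return "coco"
    # Assumption: YOLO exports ship data.yaml (or data.yml) plus train/labels.
    has_yaml = any((root / name).is_file() for name in ("data.yaml", "data.yml"))
    if has_yaml and (root / "train" / "labels").is_dir():
        return "yolo"
    raise ValueError(f"Could not detect a COCO or YOLO layout in {root}")


def build_roboflow_sketch(image_set, dataset_dir, builders):
    """Dispatch to the builder registered for the detected format."""
    fmt = detect_format_sketch(dataset_dir)
    return builders[fmt](image_set, dataset_dir)
```

In this sketch, `builders` stands in for a mapping such as `{"coco": build_roboflow_from_coco, "yolo": build_roboflow_from_yolo}`; the real builders may take different arguments.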

Class Name Extraction Improvements:

  • Refactored rfdetr/detr.py to add a _load_classes static method that loads class names from either COCO or YOLO-style datasets (supporting both JSON and YAML formats), with error handling for ambiguous or missing files.
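As a rough illustration of this kind of class-name loading, the sketch below reads names from a COCO JSON file if one exists and otherwise falls back to a single `data*.yaml` file. It is not the `_load_classes` method itself: the function name `load_classes_sketch`, the file locations, and the `names` key layout are assumptions, and the YAML branch relies on the third-party PyYAML package.

```python
import json
from pathlib import Path


def load_classes_sketch(dataset_dir):
    """Return class names from a COCO- or YOLO-style dataset (assumed layout)."""
    root = Path(dataset_dir)

    # Assumption: COCO-style datasets store categories in this file.
    coco_file = root / "train" / "_annotations.coco.json"
    if coco_file.is_file():
        with open(coco_file) as f:
            categories = json.load(f)["categories"]
        return [c["name"] for c in sorted(categories, key=lambda c: c["id"])]

    # Assumption: YOLO-style datasets keep names in one data*.yaml / data*.yml file.
    yaml_files = sorted(
        p for p in root.iterdir()
        if p.suffix in {".yaml", ".yml"} and p.name.startswith("data")
    )
    if not yaml_files:
        raise FileNotFoundError(f"No COCO JSON or data*.yaml found in {root}")
    if len(yaml_files) > 1:
        raise ValueError(f"Ambiguous YAML files: {[p.name for p in yaml_files]}")

    import yaml  # PyYAML, imported lazily so the JSON branch has no extra dependency

    with open(yaml_files[0]) as f:
        names = yaml.safe_load(f)["names"]
    # "names" may be a list or an {index: name} mapping.
    if isinstance(names, dict):
        return [names[k] for k in sorted(names)]
    return list(names)
```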

Builder and Argument Handling Simplification:

  • Split the Roboflow COCO builder into a dedicated build_roboflow_from_coco function, and replaced repeated try/except blocks with getattr for cleaner argument extraction.
  • Removed unused or redundant try/except blocks for argument extraction in both COCO and Roboflow dataset builders.
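The try/except to getattr change can be illustrated with a small, self-contained comparison. The attribute name `square_resize_div_64` comes from the review comment further down; the two helper functions are hypothetical and exist only to show that both forms return the same value.

```python
from argparse import Namespace


def read_flag_try(args):
    """Older pattern: fall back to a default when the attribute is missing."""
    try:
        return args.square_resize_div_64
    except AttributeError:
        return False


def read_flag_getattr(args):
    """Refactored pattern: getattr with a default does the same in one line."""
    return getattr(args, "square_resize_div_64", False)


with_flag = Namespace(square_resize_div_64=True)
without_flag = Namespace()
```

Both helpers return `True` for `with_flag` and `False` for `without_flag`. Unlike a bare `except:`, the `getattr` form only covers the missing-attribute case and does not hide unrelated errors.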

Minor Code Quality Improvements:

  • Removed empty docstrings and unused imports for better readability and maintainability.

Borda added 7 commits January 21, 2026 14:48
- Introduce `ConvertYolo` for converting YOLO detections to RF-DETR's target format.
- Add `YoloDetection` class for handling YOLO-format datasets with optional mask support.
…ow` logic, reduce redundancy by reusing shared functionality.
…Roboflow dataset structure handling

- Introduce `build_roboflow_from_yolo` for handling YOLO-format datasets.
- Refactor `build_roboflow_from_coco` to streamline attribute handling and adjust Roboflow's dataset structure mapping.
- Introduce `detect_roboflow_format` to auto-detect COCO or YOLO format.
- Refactor `build_roboflow` to delegate to the appropriate builder based on detected format.
- Update `build_dataset` to use the new unified `build_roboflow` function.
- Introduce `YoloCoco` as a minimal wrapper for YOLO datasets to enable COCO-style evaluation.
- Update `YoloDetection` to initialize the `YoloCoco` wrapper for seamless integration with COCO evaluators.
- Modify dataset initialization logic to support both COCO and YOLO formats.
Copilot AI review requested due to automatic review settings January 21, 2026 16:46
@Borda Borda requested a review from Matvezy as a code owner January 21, 2026 16:46
codecov bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 15.13944% with 213 lines in your changes missing coverage. Please review.
✅ Project coverage is 20%. Comparing base (aa90fc6) to head (bd5dfcc).

Additional details and impacted files
@@           Coverage Diff           @@
##           develop   #569    +/-   ##
=======================================
- Coverage       21%    20%    -0%     
=======================================
  Files           45     46     +1     
  Lines         5960   6196   +236     
=======================================
+ Hits          1225   1261    +36     
- Misses        4735   4935   +200     
Copilot AI left a comment

Pull request overview

This pull request adds comprehensive support for YOLO format datasets to RF-DETR, complementing the existing COCO format support. The implementation includes automatic format detection, unified dataset loading, and improved class name extraction that works with both formats.

Changes:

  • Added YOLO dataset loader with supervision integration and COCO API compatibility wrapper
  • Implemented automatic format detection for Roboflow datasets to distinguish between COCO and YOLO formats
  • Refactored class name extraction into a unified method supporting both COCO JSON and YOLO YAML formats

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.

| File | Description |
|---|---|
| rfdetr/datasets/yolo.py | New file implementing YOLO dataset support with ConvertYolo, YoloCoco wrapper, YoloDetection dataset class, and builder function |
| rfdetr/datasets/__init__.py | Added format detection logic and dispatcher function to route to appropriate builder based on detected format |
| rfdetr/datasets/coco.py | Renamed build_roboflow to build_roboflow_from_coco, replaced try-except blocks with getattr for cleaner argument handling, removed empty docstring |
| rfdetr/detr.py | Added _load_classes static method to load class names from either COCO or YOLO datasets, with support for both JSON and YAML formats |

Comments suppressed due to low confidence (1)

rfdetr/datasets/coco.py:249

  • The build function for COCO still uses a try-except block for square_resize_div_64 (lines 246-249), while the refactored build_roboflow_from_coco function uses getattr for the same purpose (line 289). For consistency with the refactoring approach described in the PR, consider updating this try-except block to use getattr as well.
    try:
        square_resize_div_64 = args.square_resize_div_64
    except:
        square_resize_div_64 = False

Comment on lines 242 to 258
def getCatIds(self, catNms=None, supNms=None, catIds=None):
    """Get category IDs that satisfy given filter conditions."""
    if catNms is None:
        catNms = []
    if supNms is None:
        supNms = []
    if catIds is None:
        catIds = []

    cats = self.dataset["categories"]

    if len(catNms) > 0:
        cats = [cat for cat in cats if cat["name"] in catNms]
    if len(catIds) > 0:
        cats = [cat for cat in cats if cat["id"] in catIds]

    return [cat["id"] for cat in cats]
Copilot AI Jan 21, 2026

The getCatIds method accepts a supNms parameter (line 242) but never uses it in the filtering logic. If supercategory name filtering is not needed for YOLO datasets (which always use "none" as supercategory), consider adding a comment explaining this, or implement the filtering for API completeness.
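One way the second option (implementing the filter) could look is sketched below as a standalone function. It is an assumption-laden illustration, not the PR's code: the name `get_cat_ids_sketch` and its snake_case parameters are hypothetical, and it takes the category list directly instead of reading `self.dataset`.

```python
def get_cat_ids_sketch(categories, cat_names=None, sup_names=None, cat_ids=None):
    """Return IDs of categories matching every non-empty filter."""
    cat_names = cat_names or []
    sup_names = sup_names or []
    cat_ids = cat_ids or []

    cats = categories
    if cat_names:
        cats = [cat for cat in cats if cat["name"] in cat_names]
    if sup_names:
        # The filter that the reviewed method accepts but never applies.
        cats = [cat for cat in cats if cat.get("supercategory") in sup_names]
    if cat_ids:
        cats = [cat for cat in cats if cat["id"] in cat_ids]
    return [cat["id"] for cat in cats]


# Hypothetical categories; the review notes YOLO data always uses "none".
example_categories = [
    {"id": 0, "name": "cat", "supercategory": "none"},
    {"id": 1, "name": "dog", "supercategory": "none"},
]
```

With these categories, filtering on `sup_names=["none"]` keeps both IDs, while any other supercategory name yields an empty list.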

Comment on lines +138 to +139
# any YAML file starting with data e.g. data.yaml, dataset.yaml
yaml_data_files = [yp for yp in yaml_paths if os.path.basename(yp).startswith("data")]
Copilot AI Jan 21, 2026

The comment on line 138 describes filtering for YAML files starting with "data" but the actual examples given (e.g., "data.yaml, dataset.yaml") show that "dataset.yaml" doesn't start with "data". This creates a mismatch between the comment and the code behavior. The code correctly filters for files starting with "data", but the comment and error message on line 156 should be updated to reflect this accurately.
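A quick check of what the quoted prefix filter accepts, using hypothetical paths. Note that `"dataset.yaml".startswith("data")` is true in Python, so both file names mentioned in the code comment pass the filter, along with any other name beginning with `data`.

```python
import os

# Hypothetical candidate paths; only the base name matters to the filter.
yaml_paths = [
    "/ds/data.yaml",
    "/ds/dataset.yaml",
    "/ds/database.yml",
    "/ds/config.yaml",
]

# Same expression as in the reviewed snippet.
yaml_data_files = [yp for yp in yaml_paths if os.path.basename(yp).startswith("data")]
```

Here `yaml_data_files` keeps the first three paths and drops `/ds/config.yaml`. If only `data.yaml` and `dataset.yaml` are intended, an explicit allow-list of file names would be stricter than a prefix match.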

Borda and others added 6 commits January 21, 2026 17:56
- Add `_MockSvDataset` for testing `YoloCoco` functionality.
- Extend docstrings with examples and argument details for key `YoloCoco` methods.