
[RFC] Support YOLOX detection model #6341

Open
@zhiqwang


🚀 The feature

YOLO (You Only Look Once) is a vibrant series of object detection models that began with Joseph Redmon's paper "You Only Look Once: Unified, Real-Time Object Detection".

So far, some of the more notable implementations are as follows (all in PyTorch):

  • YOLOv3 (2018). Cited by 14369. [1][2]
  • YOLOv4 (2020). Cited by 4905.
  • YOLOv5 (2020). 29.4k stars on GitHub. [3]
  • YOLOX (2021). Cited by 299. [4]
  • YOLOv7 (2022). [5]

Motivation, pitch

Until now, one of the most successful is probably YOLOv5. YOLOv5 is great, and its authors have also built a very friendly community and ecosystem. We don't intend to copy YOLOv5 into TorchVision; our main goal here is to make training SoTA models easier and to share reusable subcomponents for building the next SoTA models in the same/proxy family. [6]

YOLOX is a high-performance anchor-free YOLO that strikes a good balance between licensing and code quality. From the community's perspective, having a YOLOX implementation is enough.
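For readers unfamiliar with the term, here is a minimal sketch of what "anchor-free" decoding means in practice: each grid cell predicts a box directly from an offset and a log-scale size, with no predefined anchor shapes. The function name `decode_anchor_free` and its argument layout are hypothetical, chosen for illustration only; this is not TorchVision API and is not copied from the YOLOX code base.

```python
import math


def decode_anchor_free(raw, grid_x, grid_y, stride):
    """Decode one raw prediction at a grid cell into a pixel-space box.

    raw is (tx, ty, tw, th): center offsets relative to the cell and
    log-scale width/height. Returns (cx, cy, w, h) in pixels.
    Hypothetical helper for illustration, not a library function.
    """
    tx, ty, tw, th = raw
    # The center is the cell index plus the predicted offset, scaled by the stride.
    cx = (tx + grid_x) * stride
    cy = (ty + grid_y) * stride
    # The size is predicted in log space and scaled by the stride; no anchor prior is used.
    w = math.exp(tw) * stride
    h = math.exp(th) * stride
    return cx, cy, w, h


# A prediction at cell (3, 2) on a stride-8 feature map.
box = decode_anchor_free((0.5, 0.5, 0.0, 0.0), grid_x=3, grid_y=2, stride=8)
print(box)  # (28.0, 20.0, 8.0, 8.0)
```

An anchor-based head would instead multiply the exponentiated size by a per-anchor width and height, which is the extra set of hyperparameters that an anchor-free design avoids.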

The License

YOLOv5 and YOLOv7 are released under the GPL-3.0 license, while YOLOX is released under the Apache-2.0 license.

More context

I have previously rewritten the inference code of YOLOv5 to follow the style and conventions of torchvision [7], and I can relicense that part under the BSD-3-Clause license. With the YOLOX code base to build on, the model inference part does not involve much work.

Data augmentation and a new trainer engine will be the core of what we will do here.

Data augmentation is on the planning list in #6224, and we have already merged some augmentation methods such as #5825. I think this would help us build the next SoTA models with new primitives, as was done for the classification models. [8]

As TorchVision adds more and more models, it may be time to abstract out a simple trainer engine for sharing reusable subcomponents. It might be more appropriate to open a new thread to discuss whether this is necessary and what the specific steps would be.

cc @datumbox @YosuaMichael @oke-aditya

Footnotes

  1. https://github.com/ultralytics/yolov3/tree/v9.1

  2. https://github.com/open-mmlab/mmdetection/tree/master/configs/yolo

  3. https://github.com/ultralytics/yolov5

  4. https://github.com/Megvii-BaseDetection/YOLOX

  5. https://github.com/WongKinYiu/yolov7

  6. https://github.com/keras-team/keras-cv/issues/622#issuecomment-1198063712

  7. https://github.com/zhiqwang/yolov5-rt-stack/tree/main/yolort/models

  8. https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/
