open-mmlab · okotaku · Sep 13, 2022 · Sep 18, 2022
diff --git a/README.md b/README.md
@@ -184,6 +184,7 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
             <li><a href="configs/deformable_detr">Deformable DETR (ICLR'2021)</a></li>
             <li><a href="configs/tood">TOOD (ICCV'2021)</a></li>
             <li><a href="configs/ddod">DDOD (ACM MM'2021)</a></li>
+            <li><a href="configs/yoloxpai">YOLOX-PAI (ArXiv'2022)</a></li>
       </ul>
       </td>
       <td>

diff --git a/README_zh-CN.md b/README_zh-CN.md
@@ -183,6 +183,7 @@ MMDetection 是一个基于 PyTorch 的目标检测开源工具箱。它是 [Ope
             <li><a href="configs/deformable_detr">Deformable DETR (ICLR'2021)</a></li>
             <li><a href="configs/tood">TOOD (ICCV'2021)</a></li>
             <li><a href="configs/ddod">DDOD (ACM MM'2021)</a></li>
+            <li><a href="configs/yoloxpai">YOLOX-PAI (ArXiv'2022)</a></li>
       </ul>
       </td>
       <td>

diff --git a/configs/yoloxpai/README.md b/configs/yoloxpai/README.md
@@ -0,0 +1,37 @@
+# YOLOX-PAI
+
+> [YOLOX-PAI: An Improved YOLOX, Stronger and Faster than YOLOv6](https://arxiv.org/abs/2208.13040)
+
+<!-- [ALGORITHM] -->
+
+## Abstract
+
+We develop an all-in-one computer vision toolbox named EasyCV to facilitate the use of various SOTA computer vision methods. Recently, we add YOLOX-PAI, an improved version of YOLOX, into EasyCV. We conduct ablation studies to investigate the influence of some detection methods on YOLOX. We also provide an easy use for PAI-Blade which is used to accelerate the inference process based on BladeDISC and TensorRT. Finally, we receive 42.8 mAP on COCO dataset within 1.0 ms on a single NVIDIA V100 GPU, which is a bit faster than YOLOv6. A simple but efficient predictor API is also designed in EasyCV to conduct end2end object detection.
+
+<div align=center>
+<img src="https://user-images.githubusercontent.com/24734142/189808824-094c66f7-f95c-4e31-8a1e-50515fce545d.png"/>
+</div>
+
+## Results and Models
+
+|  Backbone   | ASFF | TOOD | box AP |                                                         Config                                                          |         Download         |
+| :---------: | :--: | :--: | :----: | :---------------------------------------------------------------------------------------------------------------------: | :----------------------: |
+| YOLOX-PAI-s |  N   |  N   |  41.8  |      [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yoloxpai/yolox_pai_s_8x8_300e_coco.py)      | [model](<>) \| [log](<>) |
+| YOLOX-PAI-s |  Y   |  N   |  42.8  |   [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yoloxpai/yolox_pai_asff_s_8x8_300e_coco.py)    | [model](<>) \| [log](<>) |
+| YOLOX-PAI-s |  Y   |  Y   |  43.6  | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/yoloxpai/yolox_pai_asff_tood_s_8x8_300e_coco.py) | [model](<>) \| [log](<>) |
+
+## Usage
+
+### Install additional requirements
+
+The RepVGG backbone comes from [MMClassification](https://github.com/open-mmlab/mmclassification), which provides many backbones for downstream tasks, so MMClassification must be installed first.
+If you have already installed the requirements for mmdet, run
+
+```shell
+pip install 'mmcls>=0.24.0'
+```
+
+See [this document](https://mmclassification.readthedocs.io/en/latest/install.html) for the details of MMClassification installation.
+
+The minimum required version of MMCV is `1.6.3`.
+See [this document](https://mmcv.readthedocs.io/en/latest/get_started/installation.html) for the details of MMCV installation.
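
The version requirements in the README above can be checked before training. The following is a minimal sketch, not part of this PR: it assumes the packages are published under the distribution names `mmcls` and `mmcv-full` (an assumption; a given environment may use `mmcv` instead), and it uses a deliberately simple version parser that ignores pre-release suffixes.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.6.3' or '0.24.0rc1' into a 3-tuple of ints, ignoring suffixes."""
    parts = []
    for piece in text.split('.')[:3]:
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions quoted in the README; the distribution names are assumptions.
REQUIREMENTS = {'mmcls': '0.24.0', 'mmcv-full': '1.6.3'}


def check_requirements(requirements=REQUIREMENTS):
    """Map each distribution name to 'ok', 'not installed', or a mismatch note."""
    report = {}
    for dist, minimum in requirements.items():
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            report[dist] = 'not installed'
            continue
        if meets_minimum(installed, minimum):
            report[dist] = 'ok'
        else:
            report[dist] = f'{installed} < {minimum}'
    return report


if __name__ == '__main__':
    for dist, status in check_requirements().items():
        print(f'{dist}: {status}')
```

A tuple comparison is used instead of a third-party version library so the check has no dependencies of its own.
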
diff --git a/configs/yoloxpai/metafile.yml b/configs/yoloxpai/metafile.yml
@@ -0,0 +1,58 @@
+Collections:
+  - Name: YOLOX-PAI
+    Metadata:
+      Training Data: COCO
+      Training Techniques:
+        - SGD with Nesterov
+        - Weight Decay
+        - Cosine Annealing Lr Updater
+      Training Resources: 8x TITANXp GPUs
+      Architecture:
+        - RepVGG
+        - PAFPN
+    Paper:
+      URL: https://arxiv.org/abs/2208.13040
+      Title: 'YOLOX-PAI: An Improved YOLOX, Stronger and Faster than YOLOv6'
+    README: configs/yoloxpai/README.md
+    Code:
+      URL:
+      Version:
+
+
+Models:
+  - Name: yolox_pai_s_8x8_300e_coco
+    In Collection: YOLOX-PAI
+    Config: configs/yoloxpai/yolox_pai_s_8x8_300e_coco.py
+    Metadata:
+      Training Memory (GB):
+      Epochs: 300
+    Results:
+      - Task: Object Detection
+        Dataset: COCO
+        Metrics:
+          box AP: 41.8
+    Weights:
+  - Name: yolox_pai_asff_s_8x8_300e_coco
+    In Collection: YOLOX-PAI
+    Config: configs/yoloxpai/yolox_pai_asff_s_8x8_300e_coco.py
+    Metadata:
+      Training Memory (GB):
+      Epochs: 300
+    Results:
+      - Task: Object Detection
+        Dataset: COCO
+        Metrics:
+          box AP: 42.8
+    Weights:
+  - Name: yolox_pai_asff_tood_s_8x8_300e_coco
+    In Collection: YOLOX-PAI
+    Config: configs/yoloxpai/yolox_pai_asff_tood_s_8x8_300e_coco.py
+    Metadata:
+      Training Memory (GB):
+      Epochs: 300
+    Results:
+      - Task: Object Detection
+        Dataset: COCO
+        Metrics:
+          box AP: 43.6
+    Weights:
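
The model names in this metafile follow the MMDetection config naming convention, where `8x8` means 8 GPUs with 8 samples per GPU and `300e` means 300 epochs. As a rough illustration, the sketch below decodes those fields from a name; `describe_config` is a hypothetical helper, not an MMDetection function, and the box AP values are copied from the results table in the README.

```python
import re


def describe_config(name):
    """Decode schedule and component flags from a config name.

    Hypothetical helper for illustration; it only understands names of the
    form used in this PR, e.g. 'yolox_pai_asff_s_8x8_300e_coco'.
    """
    match = re.search(r'_(\d+)x(\d+)_(\d+)e_', name)
    if match is None:
        raise ValueError(f'unrecognized config name: {name}')
    gpus, samples_per_gpu, epochs = (int(g) for g in match.groups())
    tokens = name.split('_')
    return dict(
        asff='asff' in tokens,
        tood='tood' in tokens,
        gpus=gpus,
        samples_per_gpu=samples_per_gpu,
        total_batch_size=gpus * samples_per_gpu,
        epochs=epochs,
    )


# box AP per model, as listed in the README table and this metafile.
BOX_AP = {
    'yolox_pai_s_8x8_300e_coco': 41.8,
    'yolox_pai_asff_s_8x8_300e_coco': 42.8,
    'yolox_pai_asff_tood_s_8x8_300e_coco': 43.6,
}

if __name__ == '__main__':
    for name, ap in BOX_AP.items():
        info = describe_config(name)
        print(name, info['asff'], info['tood'], info['epochs'], ap)
```
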
diff --git a/configs/yoloxpai/yolox_pai_asff_s_8x8_300e_coco.py b/configs/yoloxpai/yolox_pai_asff_s_8x8_300e_coco.py
@@ -0,0 +1,3 @@
+_base_ = './yolox_pai_s_8x8_300e_coco.py'
+
+model = dict(neck=dict(type='YOLOXASFFPAFPN'))
diff --git a/configs/yoloxpai/yolox_pai_asff_tood_s_8x8_300e_coco.py b/configs/yoloxpai/yolox_pai_asff_tood_s_8x8_300e_coco.py
@@ -0,0 +1,3 @@
+_base_ = './yolox_pai_asff_s_8x8_300e_coco.py'
+
+model = dict(bbox_head=dict(type='YOLOXTOODHead'))
diff --git a/configs/yoloxpai/yolox_pai_s_8x8_300e_coco.py b/configs/yoloxpai/yolox_pai_s_8x8_300e_coco.py
@@ -0,0 +1,14 @@
+_base_ = '../yolox/yolox_s_8x8_300e_coco.py'
+custom_imports = dict(imports=['mmcls.models'], allow_failed_imports=False)
+
+model = dict(
+    backbone=dict(
+        _delete_=True,
+        type='mmcls.RepVGG',
+        arch='yolox-pai-small',
+        add_ppf=True,
+        norm_cfg=dict(type='BN', eps=0.001, momentum=0.03),
+        out_indices=(1, 2, 3),
+    ),
+    neck=dict(act_cfg=dict(type='SiLU')),
+    bbox_head=dict(act_cfg=dict(type='SiLU')))
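
The three configs above form an inheritance chain through `_base_`, and the backbone override relies on `_delete_=True` to replace the inherited backbone instead of merging into it. The sketch below imitates that merge in plain Python to show what each config resolves to. It is a simplified stand-in, not mmcv's `Config` implementation, and `BASE` is an abbreviated version of the YOLOX-s model settings written from memory, so treat its values as assumptions.

```python
import copy

DELETE_KEY = '_delete_'


def merge(base, override):
    """Recursively merge `override` into a copy of `base`.

    A dict carrying `_delete_=True` replaces the inherited dict instead of
    being merged into it. Simplified imitation of config inheritance.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict):
            value = copy.deepcopy(value)
            replace = value.pop(DELETE_KEY, False)
            if not replace and isinstance(merged.get(key), dict):
                merged[key] = merge(merged[key], value)
            else:
                merged[key] = merge({}, value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve(*layers):
    """Apply config layers in order, from the root base to the leaf config."""
    cfg = {}
    for layer in layers:
        cfg = merge(cfg, layer)
    return cfg


# Abbreviated stand-in for the model dict of ../yolox/yolox_s_8x8_300e_coco.py
# (assumed values, not copied from this PR).
BASE = dict(
    backbone=dict(type='CSPDarknet', deepen_factor=0.33, widen_factor=0.5),
    neck=dict(type='YOLOXPAFPN', in_channels=[128, 256, 512], out_channels=128),
    bbox_head=dict(type='YOLOXHead', num_classes=80, in_channels=128),
)

# The `model` overrides from the three configs in this PR.
PAI_S = dict(
    backbone=dict(
        _delete_=True,
        type='mmcls.RepVGG',
        arch='yolox-pai-small',
        add_ppf=True,
        norm_cfg=dict(type='BN', eps=0.001, momentum=0.03),
        out_indices=(1, 2, 3),
    ),
    neck=dict(act_cfg=dict(type='SiLU')),
    bbox_head=dict(act_cfg=dict(type='SiLU')))
PAI_ASFF = dict(neck=dict(type='YOLOXASFFPAFPN'))
PAI_ASFF_TOOD = dict(bbox_head=dict(type='YOLOXTOODHead'))

pai_s = resolve(BASE, PAI_S)
pai_asff = resolve(BASE, PAI_S, PAI_ASFF)
pai_asff_tood = resolve(BASE, PAI_S, PAI_ASFF, PAI_ASFF_TOOD)

if __name__ == '__main__':
    print(pai_asff_tood['neck']['type'], pai_asff_tood['bbox_head']['type'])
```

Because the TOOD config inherits from the ASFF config rather than from the plain YOLOX-PAI-s config, the resolved model keeps the `YOLOXASFFPAFPN` neck while swapping only the head.
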
diff --git a/mmdet/models/dense_heads/__init__.py b/mmdet/models/dense_heads/__init__.py
@@ -41,6 +41,7 @@
 from .yolo_head import YOLOV3Head
 from .yolof_head import YOLOFHead
 from .yolox_head import YOLOXHead
+from .yolox_tood_head import YOLOXTOODHead
 
 __all__ = [
     'AnchorFreeHead', 'AnchorHead', 'GuidedAnchorHead', 'FeatureAdaption',
@@ -54,5 +55,5 @@
     'DETRHead', 'YOLOFHead', 'DeformableDETRHead', 'SOLOHead',
     'DecoupledSOLOHead', 'CenterNetHead', 'YOLOXHead',
     'DecoupledSOLOLightHead', 'LADHead', 'TOODHead', 'MaskFormerHead',
-    'Mask2FormerHead', 'SOLOV2Head', 'DDODHead'
+    'Mask2FormerHead', 'SOLOV2Head', 'DDODHead', 'YOLOXTOODHead'
 ]
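
The import added to this `__init__.py` matters because the `@HEADS.register_module()` decorator only runs when the module is imported; until then a config string such as `type='YOLOXTOODHead'` cannot be resolved. The toy registry below sketches that mechanism. It is a simplified stand-in for the real registry, and the `Toy*` classes are hypothetical placeholders rather than the actual heads.

```python
class Registry:
    """Minimal name-to-class registry, for illustration only."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        """Return a class decorator that records the class under its name."""

        def decorator(cls):
            if cls.__name__ in self._modules:
                raise KeyError(f'{cls.__name__} is already registered')
            self._modules[cls.__name__] = cls
            return cls

        return decorator

    def build(self, cfg):
        """Instantiate the class named by cfg['type'] with the other keys."""
        args = dict(cfg)
        type_name = args.pop('type')
        if type_name not in self._modules:
            raise KeyError(f'{type_name} is not in the {self.name} registry')
        return self._modules[type_name](**args)


HEADS = Registry('head')


@HEADS.register_module()
class ToyYOLOXHead:
    def __init__(self, num_classes=80):
        self.num_classes = num_classes


@HEADS.register_module()
class ToyYOLOXTOODHead(ToyYOLOXHead):
    # Extra keyword arguments are consumed here; the rest go to the parent,
    # mirroring the *args/**kwargs pattern of the head added in this PR.
    def __init__(self, *args, tood_stacked_convs=3, **kwargs):
        super().__init__(*args, **kwargs)
        self.tood_stacked_convs = tood_stacked_convs


head = HEADS.build(dict(type='ToyYOLOXTOODHead', num_classes=80))

if __name__ == '__main__':
    print(type(head).__name__, head.num_classes, head.tood_stacked_convs)
```
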
diff --git a/mmdet/models/dense_heads/yolox_tood_head.py b/mmdet/models/dense_heads/yolox_tood_head.py
@@ -0,0 +1,103 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from mmcv.cnn import ConvModule
+
+from mmdet.core import multi_apply
+from mmdet.models.builder import HEADS
+from .tood_head import TaskDecomposition
+from .yolox_head import YOLOXHead
+
+
+@HEADS.register_module()
+class YOLOXTOODHead(YOLOXHead):
+    """YOLOXTOOD head used in `YOLOX-PAI <https://arxiv.org/abs/2208.13040>`_.
+
+    Args:
+        tood_stacked_convs (int): Number of conv layers in TOOD head.
+            Default: 3.
+        la_down_rate (int): Downsample rate of layer attention.
+            Default: 32.
+        tood_norm_cfg (dict): Config dict for normalization layer in TOOD head.
+            Default: dict(type='GN', num_groups=32, requires_grad=True).
+    """
+
+    def __init__(self,
+                 *args,
+                 tood_stacked_convs=3,
+                 la_down_rate=32,
+                 tood_norm_cfg=dict(
+                     type='GN', num_groups=32, requires_grad=True),
+                 **kwargs):
+        super().__init__(*args, **kwargs)
+        self.tood_stacked_convs = tood_stacked_convs
+        self.la_down_rate = la_down_rate
+        self.tood_norm_cfg = tood_norm_cfg
+
+        self._build_tood_layers()
+
+    def _build_tood_layers(self):
+        self.inter_convs = nn.ModuleList()
+        for _ in range(self.tood_stacked_convs):
+            self.inter_convs.append(
+                ConvModule(
+                    self.in_channels,
+                    self.in_channels,
+                    3,
+                    stride=1,
+                    padding=1,
+                    conv_cfg=self.conv_cfg,
+                    norm_cfg=self.tood_norm_cfg))
+
+        self.multi_level_cls_decomps = nn.ModuleList()
+        self.multi_level_reg_decomps = nn.ModuleList()
+        for _ in self.strides:
+            self.multi_level_cls_decomps.append(
+                TaskDecomposition(self.in_channels, self.tood_stacked_convs,
+                                  self.tood_stacked_convs * self.la_down_rate,
+                                  self.conv_cfg, self.tood_norm_cfg))
+            self.multi_level_reg_decomps.append(
+                TaskDecomposition(self.in_channels, self.tood_stacked_convs,
+                                  self.tood_stacked_convs * self.la_down_rate,
+                                  self.conv_cfg, self.tood_norm_cfg))
+
+    def forward_single(self, x, cls_convs, reg_convs, conv_cls, conv_reg,
+                       conv_obj, cls_decomp, reg_decomp):
+        """Forward feature of a single scale level."""
+
+        inter_feats = []
+        for inter_conv in self.inter_convs:
+            x = inter_conv(x)
+            inter_feats.append(x)
+        feat = torch.cat(inter_feats, 1)
+
+        avg_feat = F.adaptive_avg_pool2d(feat, (1, 1))
+        cls_x = cls_decomp(feat, avg_feat)
+        reg_x = reg_decomp(feat, avg_feat)
+
+        cls_feat = cls_convs(cls_x)
+        reg_feat = reg_convs(reg_x)
+
+        cls_score = conv_cls(cls_feat)
+        bbox_pred = conv_reg(reg_feat)
+        objectness = conv_obj(reg_feat)
+
+        return cls_score, bbox_pred, objectness
+
+    def forward(self, feats):
+        """Forward features from the upstream network.
+
+        Args:
+            feats (tuple[Tensor]): Features from the upstream network, each is
+                a 4D-tensor.
+        Returns:
+            tuple[Tensor]: A tuple of multi-level prediction maps, each is a
+                4D-tensor of shape (batch_size, 5+num_classes, height, width).
+        """
+
+        return multi_apply(
+            self.forward_single, feats, self.multi_level_cls_convs,
+            self.multi_level_reg_convs, self.multi_level_conv_cls,
+            self.multi_level_conv_reg, self.multi_level_conv_obj,
+            self.multi_level_cls_decomps, self.multi_level_reg_decomps)
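
The channel widths inside the new head follow from its constructor arguments. The sketch below works through that arithmetic without building any layers. It assumes that `TaskDecomposition` concatenates `stacked_convs` feature maps, squeezes them to `in_channels // la_down_rate` channels in its layer attention, emits one weight per stacked conv, and reduces back to `feat_channels`; this is recalled from the existing TOOD head, not shown in this diff. `tood_branch_channels` is a hypothetical helper, and the default `in_channels=128` is the assumed YOLOX-s head width.

```python
def tood_branch_channels(in_channels=128,
                         tood_stacked_convs=3,
                         la_down_rate=32,
                         gn_groups=32):
    """Compute channel widths along one TOOD branch of the head.

    Illustrative arithmetic only; no layers are built.
    """
    if in_channels % gn_groups != 0:
        # GroupNorm requires the channel count to be divisible by num_groups.
        raise ValueError('in_channels must be divisible by gn_groups')

    # forward_single concatenates the output of every inter conv.
    concat_channels = in_channels * tood_stacked_convs

    # _build_tood_layers passes tood_stacked_convs * la_down_rate as the
    # down rate, so the stacking factor cancels out in the hidden width.
    effective_down_rate = tood_stacked_convs * la_down_rate
    attention_hidden = concat_channels // effective_down_rate
    if attention_hidden < 1:
        raise ValueError('la_down_rate is too large for in_channels')

    return dict(
        concat_channels=concat_channels,
        effective_down_rate=effective_down_rate,
        attention_hidden=attention_hidden,
        attention_out=tood_stacked_convs,  # assumed: one weight per conv
        decomposed_channels=in_channels,   # assumed: reduced to feat width
    )


if __name__ == '__main__':
    print(tood_branch_channels())
```

With the defaults, the concatenated feature has 384 channels and the attention bottleneck has 4, so a much narrower head would need a smaller `la_down_rate` or a different `tood_norm_cfg`.
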
diff --git a/mmdet/models/necks/__init__.py b/mmdet/models/necks/__init__.py
@@ -14,10 +14,11 @@
 from .rfp import RFP
 from .ssd_neck import SSDNeck
 from .yolo_neck import YOLOV3Neck
+from .yolox_asff_pafpn import YOLOXASFFPAFPN
 from .yolox_pafpn import YOLOXPAFPN
 
 __all__ = [
     'FPN', 'BFP', 'ChannelMapper', 'HRFPN', 'NASFPN', 'FPN_CARAFE', 'PAFPN',
     'NASFCOS_FPN', 'RFP', 'YOLOV3Neck', 'FPG', 'DilatedEncoder',
-    'CTResNetNeck', 'SSDNeck', 'YOLOXPAFPN', 'DyHead'
+    'CTResNetNeck', 'SSDNeck', 'YOLOXPAFPN', 'DyHead', 'YOLOXASFFPAFPN'
 ]
Review comments on configs/yoloxpai/yolox_pai_asff_tood_s_8x8_300e_coco.py

shinya7y (Contributor) · Sep 18, 2022
Doesn't this config use `neck=dict(type='YOLOXASFFPAFPN')`?

okotaku (Contributor, Author) · Sep 18, 2022
I made a mistake. I fixed the base config.