Commit 542457c

Support H2RBox-v2 (#805)
* Support H2RBox-v2
* Update README
* Add figure in README
* Update figure in README
* Fix README lint problem
1 parent 8b30525 commit 542457c

19 files changed

Lines changed: 1628 additions & 5 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -172,6 +172,7 @@ A summary can be found in the [Model Zoo](docs/en/model_zoo.md) page.
 - [x] [H2RBox](configs/h2rbox/README.md) (ICLR'2023)
 - [x] [PSC](configs/psc/README.md) (CVPR'2023)
 - [x] [RTMDet](configs/rotated_rtmdet/README.md) (arXiv)
+- [x] [H2RBox-v2](configs/h2rbox_v2/README.md) (arXiv)

 </details>

README_zh-CN.md

Lines changed: 1 addition & 0 deletions
@@ -168,6 +168,7 @@ https://user-images.githubusercontent.com/10410257/154433305-416d129b-60c8-44c7-
 - [x] [H2RBox](configs/h2rbox/README.md) (ICLR'2023)
 - [x] [PSC](configs/psc/README.md) (CVPR'2023)
 - [x] [RTMDet](configs/rotated_rtmdet/README.md) (arXiv)
+- [x] [H2RBox-v2](configs/h2rbox_v2/README.md) (arXiv)

 </details>

configs/h2rbox_v2/README.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
# H2RBox-v2

> [H2RBox-v2: Boosting HBox-supervised Oriented Object Detection via Symmetric Learning](https://arxiv.org/pdf/2304.04403)

<!-- [ALGORITHM] -->

## Abstract

<div align=center>
<img src="https://raw.githubusercontent.com/zytx121/image-host/main/imgs/h2rbox_v2.png" width="800"/>
</div>

With the increasing demand for oriented object detection e.g. in autonomous driving and remote sensing, the oriented annotation has become a labor-intensive work. To make full use of existing horizontally annotated datasets and reduce the annotation cost, a weakly-supervised detector H2RBox for learning the rotated box (RBox) from the horizontal box (HBox) has been proposed and received great attention. This paper presents a new version, H2RBox-v2, to further bridge the gap between HBox-supervised and RBox-supervised oriented object detection. While exploiting axisymmetry via flipping and rotating consistencies is available through our theoretical analysis, H2RBox-v2, using a weakly-supervised branch similar to H2RBox, is embedded with a novel self-supervised branch that learns orientations from the symmetry inherent in the image of objects. Complemented by modules to cope with peripheral issues, e.g. angular periodicity, a stable and effective solution is achieved. To our knowledge, H2RBox-v2 is the first symmetry-supervised paradigm for oriented object detection. Compared to H2RBox, our method is less susceptible to low annotation quality and insufficient training data, which in such cases is expected to give a competitive performance much closer to fully-supervised oriented object detectors. Specifically, the performance comparison between H2RBox-v2 and Rotated FCOS on DOTA-v1.0/1.5/2.0 is 72.31%/64.76%/50.33% vs. 72.44%/64.53%/51.77%, 89.66% vs. 88.99% on HRSC, and 42.27% vs. 41.25% on FAIR1M.
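
The flipping and rotating consistencies mentioned in the abstract can be illustrated with a minimal sketch. It assumes that rotating the input view by `rot` should shift the predicted angle by `rot`, that a vertical flip should negate it, and that residuals are wrapped to a period of π to cope with angular periodicity. The function names are hypothetical and this is not the repository's `H2RBoxV2ConsistencyLoss`; only the weights (1.0 and 0.05) and `beta=0.1` are borrowed from the `loss_symmetry_ss` settings in the configs of this commit.

```python
import math


def wrap_to_half_pi(delta: float) -> float:
    """Wrap an angle difference into [-pi/2, pi/2), i.e. a period of pi."""
    return (delta + math.pi / 2) % math.pi - math.pi / 2


def smooth_l1(x: float, beta: float = 0.1) -> float:
    """Smooth L1 penalty on a scalar residual."""
    x = abs(x)
    return 0.5 * x * x / beta if x < beta else x - 0.5 * beta


def symmetry_loss(theta, theta_rot, theta_flp, rot, w_rot=1.0, w_flp=0.05):
    """Hypothetical per-object consistency loss (angles in radians).

    theta:     angle predicted on the original view
    theta_rot: angle predicted on the view rotated by `rot`
    theta_flp: angle predicted on the vertically flipped view
    """
    # Rotation consistency: theta_rot - theta should equal rot (mod pi).
    d_rot = wrap_to_half_pi(theta_rot - theta - rot)
    # Flip consistency: theta_flp + theta should equal 0 (mod pi).
    d_flp = wrap_to_half_pi(theta_flp + theta)
    return w_rot * smooth_l1(d_rot) + w_flp * smooth_l1(d_flp)
```

Predictions that respect both symmetries give a loss close to zero, including when they differ by a multiple of π; a prediction that ignores the rotation is penalised in proportion to the wrapped residual.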

## Results and models

DOTA1.0

| Backbone | AP50 | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ResNet50 (1024,1024,200) | 72.59 | 1x | 10.10 | 29.1 | - | 2 | [h2rbox_v2-le90_r50_fpn-1x_dota](./h2rbox_v2-le90_r50_fpn-1x_dota.py) | [model](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-1x_dota/h2rbox_v2-le90_r50_fpn-1x_dota-fa5ad1d2.pth) \| [log](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-1x_dota/h2rbox_v2-le90_r50_fpn-1x_dota-20230313_103051.json) |
| ResNet50 (1024,1024,200) | 78.25 | 1x | 10.33 | 29.1 | MS+RR | 2 | [h2rbox_v2-le90_r50_fpn_ms_rr-1x_dota](./h2rbox_v2-le90_r50_fpn_ms_rr-1x_dota.py) | [model](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn_ms_rr-1x_dota/h2rbox_v2-le90_r50_fpn_ms_rr-1x_dota-5e0e53e1.pth) \| [log](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn_ms_rr-1x_dota/h2rbox_v2-le90_r50_fpn_ms_rr-1x_dota-20230324_011934.json) |

DOTA1.5

| Backbone | AP50 | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ResNet50 (1024,1024,200) | 64.76 | 1x | 10.95 | 29.1 | - | 2 | [h2rbox_v2-le90_r50_fpn-1x_dotav15](./h2rbox_v2-le90_r50_fpn-1x_dotav15.py) | [model](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-1x_dotav15/h2rbox_v2-le90_r50_fpn-1x_dotav15-3adc0309.pth) \| [log](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-1x_dotav15/h2rbox_v2-le90_r50_fpn-1x_dotav15-20230316_192940.json) |

DOTA2.0

| Backbone | AP50 | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ResNet50 (1024,1024,200) | 50.33 | 1x | 11.02 | 29.1 | - | 2 | [h2rbox_v2-le90_r50_fpn-1x_dotav2](./h2rbox_v2-le90_r50_fpn-1x_dotav2.py) | [model](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-1x_dotav2/h2rbox_v2-le90_r50_fpn-1x_dotav2-b1ec4d3c.pth) \| [log](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-1x_dotav2/h2rbox_v2-le90_r50_fpn-1x_dotav2-20230316_200353.json) |

HRSC

| Backbone | AP50 | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ResNet50 (1024,1024,200) | 89.66 | 1x | 5.50 | 45.9 | - | 2 | [h2rbox_v2-le90_r50_fpn-6x_hrsc](./h2rbox_v2-le90_r50_fpn-6x_hrsc.py) | [model](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-6x_hrsc/h2rbox_v2-le90_r50_fpn-6x_hrsc-b3b5e06b.pth) \| [log](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn-6x_hrsc/h2rbox_v2-le90_r50_fpn-6x_hrsc-20230312_073744.json) |
| ResNet50 (1024,1024,200) | 89.56 | 1x | 5.50 | 45.9 | RR | 2 | [h2rbox_v2-le90_r50_fpn_rr-6x_hrsc](./h2rbox_v2-le90_r50_fpn_rr-6x_hrsc.py) | [model](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn_rr-6x_hrsc/h2rbox_v2-le90_r50_fpn_rr-6x_hrsc-ee6e851a.pth) \| [log](https://download.openmmlab.com/mmrotate/v1.0/h2rbox_v2/h2rbox_v2-le90_r50_fpn_rr-6x_hrsc/h2rbox_v2-le90_r50_fpn_rr-6x_hrsc-20230312_073800.json) |

## Citation

```
@misc{yu2023h2rboxv2,
      title={H2RBox-v2: Boosting HBox-supervised Oriented Object Detection via Symmetric Learning},
      author={Yi Yu and Xue Yang and Qingyun Li and Yue Zhou and Gefan Zhang and Feipeng Da and Junchi Yan},
      year={2023},
      eprint={2304.04403},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
_base_ = [
    '../_base_/datasets/dota.py', '../_base_/schedules/schedule_1x.py',
    '../_base_/default_runtime.py'
]
angle_version = 'le90'

# model settings
model = dict(
    type='H2RBoxV2Detector',
    crop_size=(1024, 1024),
    view_range=(0.25, 0.75),
    data_preprocessor=dict(
        type='mmdet.DetDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True,
        pad_size_divisor=32,
        boxtype2tensor=False),
    backbone=dict(
        type='mmdet.ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='mmdet.FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_output',
        num_outs=5,
        relu_before_extra_convs=True),
    bbox_head=dict(
        type='H2RBoxV2Head',
        num_classes=15,
        in_channels=256,
        angle_version='le90',
        stacked_convs=4,
        feat_channels=256,
        strides=[8, 16, 32, 64, 128],
        center_sampling=True,
        center_sample_radius=1.5,
        norm_on_bbox=True,
        centerness_on_reg=True,
        use_hbbox_loss=False,
        scale_angle=False,
        rotation_agnostic_classes=[1, 9, 11],
        agnostic_resize_classes=[1],
        use_circumiou_loss=True,
        use_standalone_angle=True,
        use_reweighted_loss_bbox=False,
        angle_coder=dict(
            type='PSCCoder',
            angle_version=angle_version,
            dual_freq=False,
            num_step=3,
            thr_mod=0),
        bbox_coder=dict(
            type='DistanceAnglePointCoder', angle_version=angle_version),
        loss_cls=dict(
            type='mmdet.FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='mmdet.IoULoss', loss_weight=1.0),
        loss_centerness=dict(
            type='mmdet.CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_symmetry_ss=dict(
            type='H2RBoxV2ConsistencyLoss',
            use_snap_loss=True,
            loss_rot=dict(
                type='mmdet.SmoothL1Loss', loss_weight=1.0, beta=0.1),
            loss_flp=dict(
                type='mmdet.SmoothL1Loss', loss_weight=0.05, beta=0.1))),
    # training and testing settings
    train_cfg=None,
    test_cfg=dict(
        nms_pre=2000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms_rotated', iou_threshold=0.1),
        max_per_img=2000))

# load hbox annotations
train_pipeline = [
    dict(type='mmdet.LoadImageFromFile', backend_args={{_base_.backend_args}}),
    dict(type='mmdet.LoadAnnotations', with_bbox=True, box_type='qbox'),
    # Horizontal GTBox, (x1,y1,x2,y2)
    dict(type='ConvertBoxType', box_type_mapping=dict(gt_bboxes='hbox')),
    # Horizontal GTBox, (x,y,w,h,theta)
    dict(type='ConvertBoxType', box_type_mapping=dict(gt_bboxes='rbox')),
    dict(type='mmdet.Resize', scale=(1024, 1024), keep_ratio=True),
    dict(
        type='mmdet.RandomFlip',
        prob=0.75,
        direction=['horizontal', 'vertical', 'diagonal']),
    dict(type='mmdet.PackDetInputs')
]

train_dataloader = dict(dataset=dict(pipeline=train_pipeline))

# optimizer
optim_wrapper = dict(
    optimizer=dict(
        _delete_=True,
        type='AdamW',
        lr=0.00005,
        betas=(0.9, 0.999),
        weight_decay=0.05))

train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=12, val_interval=6)
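
The `angle_coder` entry in the config above selects a `PSCCoder` with `num_step=3` and `dual_freq=False`. As a rough illustration of what a phase-shifting angle code looks like, the sketch below encodes an angle with period π as three phase-shifted cosines and decodes it with `atan2`. It is a simplified, single-frequency version written for this note: the function names are hypothetical, `thr_mod` is ignored, and it should not be read as the MMRotate implementation.

```python
import math


def psc_encode(theta: float, num_step: int = 3) -> list[float]:
    """Encode an angle (radians, period pi) as num_step phase-shifted cosines."""
    phase = 2.0 * theta  # map the pi-periodic angle onto a full 2*pi cycle
    return [math.cos(phase + 2.0 * math.pi * k / num_step)
            for k in range(num_step)]


def psc_decode(code: list[float]) -> float:
    """Recover the angle in (-pi/2, pi/2] from its phase-shifted code."""
    n = len(code)
    s = sum(x * math.sin(2.0 * math.pi * k / n) for k, x in enumerate(code))
    c = sum(x * math.cos(2.0 * math.pi * k / n) for k, x in enumerate(code))
    # For n >= 3 the sums reduce to s = -(n/2)*sin(phase), c = (n/2)*cos(phase).
    phase = -math.atan2(s, c)
    return phase / 2.0
```

Because the code is a continuous function of the angle and has the same period, regressing the three cosines avoids the discontinuity at the ±π/2 boundary that direct angle regression runs into.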
Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
_base_ = [
    '../_base_/datasets/dotav15.py', '../_base_/schedules/schedule_1x.py',
    '../_base_/default_runtime.py'
]
angle_version = 'le90'

# model settings
model = dict(
    type='H2RBoxV2Detector',
    crop_size=(1024, 1024),
    view_range=(0.25, 0.75),
    data_preprocessor=dict(
        type='mmdet.DetDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True,
        pad_size_divisor=32,
        boxtype2tensor=False),
    backbone=dict(
        type='mmdet.ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='mmdet.FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_output',
        num_outs=5,
        relu_before_extra_convs=True),
    bbox_head=dict(
        type='H2RBoxV2Head',
        num_classes=16,
        in_channels=256,
        angle_version='le90',
        stacked_convs=4,
        feat_channels=256,
        strides=[8, 16, 32, 64, 128],
        center_sampling=True,
        center_sample_radius=1.5,
        norm_on_bbox=True,
        centerness_on_reg=True,
        use_hbbox_loss=False,
        scale_angle=False,
        rotation_agnostic_classes=[1, 9, 11],
        agnostic_resize_classes=[1],
        use_circumiou_loss=True,
        use_standalone_angle=True,
        use_reweighted_loss_bbox=False,
        angle_coder=dict(
            type='PSCCoder',
            angle_version=angle_version,
            dual_freq=False,
            num_step=3,
            thr_mod=0),
        bbox_coder=dict(
            type='DistanceAnglePointCoder', angle_version=angle_version),
        loss_cls=dict(
            type='mmdet.FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='mmdet.IoULoss', loss_weight=1.0),
        loss_centerness=dict(
            type='mmdet.CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_symmetry_ss=dict(
            type='H2RBoxV2ConsistencyLoss',
            use_snap_loss=True,
            loss_rot=dict(
                type='mmdet.SmoothL1Loss', loss_weight=1.0, beta=0.1),
            loss_flp=dict(
                type='mmdet.SmoothL1Loss', loss_weight=0.05, beta=0.1))),
    # training and testing settings
    train_cfg=None,
    test_cfg=dict(
        nms_pre=2000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms_rotated', iou_threshold=0.1),
        max_per_img=2000))

# load hbox annotations
train_pipeline = [
    dict(type='mmdet.LoadImageFromFile', backend_args={{_base_.backend_args}}),
    dict(type='mmdet.LoadAnnotations', with_bbox=True, box_type='qbox'),
    # Horizontal GTBox, (x1,y1,x2,y2)
    dict(type='ConvertBoxType', box_type_mapping=dict(gt_bboxes='hbox')),
    # Horizontal GTBox, (x,y,w,h,theta)
    dict(type='ConvertBoxType', box_type_mapping=dict(gt_bboxes='rbox')),
    dict(type='mmdet.Resize', scale=(1024, 1024), keep_ratio=True),
    dict(
        type='mmdet.RandomFlip',
        prob=0.75,
        direction=['horizontal', 'vertical', 'diagonal']),
    dict(type='mmdet.PackDetInputs')
]

train_dataloader = dict(dataset=dict(pipeline=train_pipeline))

# optimizer
optim_wrapper = dict(
    optimizer=dict(
        _delete_=True,
        type='AdamW',
        lr=0.00005,
        betas=(0.9, 0.999),
        weight_decay=0.05))

train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=12, val_interval=6)
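
The two `ConvertBoxType` steps in `train_pipeline` are what make the training HBox-supervised: the quadrilateral annotation is first reduced to its horizontal bounding box and then re-expressed in rotated-box form, so the original orientation never reaches the detector. A minimal sketch of that idea follows. The helper names are hypothetical, the zero angle in the second step is an assumption based on the comments in the config, and the real transforms operate on MMRotate box types rather than plain tuples.

```python
def qbox_to_hbox(qbox):
    """Circumscribe a quadrilateral (x1, y1, ..., x4, y4) with an axis-aligned box."""
    xs, ys = qbox[0::2], qbox[1::2]
    return (min(xs), min(ys), max(xs), max(ys))


def hbox_to_rbox(hbox):
    """Rewrite (x1, y1, x2, y2) as (cx, cy, w, h, theta) with theta fixed to 0."""
    x1, y1, x2, y2 = hbox
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1, 0.0)


# A square rotated by 45 degrees: its orientation is discarded by the conversion.
diamond = (2, 0, 4, 2, 2, 4, 0, 2)
hbox = qbox_to_hbox(diamond)
rbox = hbox_to_rbox(hbox)
```

Here `hbox` is `(0, 0, 4, 4)` and `rbox` is `(2.0, 2.0, 4, 4, 0.0)`: the ground truth carries only the circumscribed horizontal extent of the object, which is the only box supervision the model receives.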
