[Bug] Half and Float tensors conflict when using AMP for training in v3.0.0rc5  #9727

Open
@wusize

Description

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

3.x branch https://github.com/open-mmlab/mmdetection/tree/3.x

Environment

TorchVision: 0.12.0+cu113
OpenCV: 4.6.0
MMEngine: 0.5.0
MMDetection: 3.0.0rc5+9981107

Reproduces the problem - code sample

In the post-processing step of the RPN:

det_bboxes, keep_idxs = batched_nms(bboxes, results.scores, results.level_ids, cfg.nms)

Under AMP, torch.exp in the box decoder promotes torch.half tensors to torch.float, so bboxes becomes a float tensor while results.scores remains half. The two tensors passed to batched_nms therefore have conflicting dtypes.
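
One possible workaround is to cast both tensors to a common dtype just before the NMS call. The sketch below only illustrates that idea and is not the MMDetection fix: `align_nms_inputs` is a hypothetical helper name, and the random tensors stand in for the decoded bboxes and results.scores.

```python
import torch


def align_nms_inputs(bboxes: torch.Tensor, scores: torch.Tensor):
    """Hypothetical helper: cast boxes and scores to a common dtype before NMS."""
    # promote_types picks the wider of the two dtypes (float16 + float32 -> float32)
    common = torch.promote_types(bboxes.dtype, scores.dtype)
    return bboxes.to(common), scores.to(common)


# Stand-ins for the tensors in the RPN post-processing step (assumed shapes)
bboxes = torch.rand(4, 4, dtype=torch.float32)  # decoded boxes ended up as float32
scores = torch.rand(4).half()                   # scores stayed float16

bboxes, scores = align_nms_inputs(bboxes, scores)
print(bboxes.dtype, scores.dtype)
```

Promoting to the wider dtype avoids losing precision in the box coordinates. Casting bboxes down to the dtype of scores would also remove the mismatch, at the cost of half-precision coordinates.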

The config is

_base_ = [
    '../_base_/models/faster-rcnn_r50-caffe-c4.py',
    '../_base_/datasets/coco_detection.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

# optimizer
optim_wrapper = dict(
    type='AmpOptimWrapper',
    clip_grad=dict(max_norm=35, norm_type=2),
)

Reproduces the problem - command or script

None

Reproduces the problem - error message

None

Additional information

None

Labels

bug (Something isn't working), fp16 (mixed precision training), v-3.x
