Skip to content

tue-p8n/deformops

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Deformable grid sampling operations

This PyTorch extension implements efficient custom operators for deformable grid sampling with bilinear interpolation. This is used in the deformable convolutions and multi-scale deformable attention modules commonly found in dense vision networks.

The deformable sampling operation is implemented as a CUDA extension, which is JIT-compiled when the operation is first called.

Installation

The recommended method for installing deformops is to simply install the package from PyPI.

pip install deformops

Usage

Basic modules are provided in deformops.nn:

  • DeformConv2d (deformable convolution), and
  • DeformAttn2d (multi-scale deformable attention).

The deformable convolution follows a similar interface to torch.nn.Conv2d. The following initializes the module and passes random initialized data to it:

from deformops.nn import DeformConv2d
from torch import randn

dconv = DeformConv2d(16)

x = randn((4, 16, 32, 64))  # B, C, H, W
y = dconv(x)  # run forward pass

Refer to the documentation for more information.

About

The core operation in this package -- deformable grid sampling -- was first proposed for convolutions in the paper Deformable Convolutional Networks (ICCV 2017) by Dai et al. with Microsoft Research Asia. The paper proposed two key innovations:

  1. Deformable convolution, which adds learnable 2D offsets to the regular grid sampling locations in standard convolution, and
  2. Deformable RoI pooling, which adds learnable offsets to each bin position in regular RoI pooling.

The method uses bilinear interpolation to handle the fractional offsets that arise from the deformed sampling grid. This enabled CNNs to better model geometric transformations by learning the spatial sampling locations adaptively from the target tasks.

Multi-scale deformable attention was first introduced in the paper Deformable DETR: Deformable Transformers for End-to-End Object Detection (ICLR 2020) by Zhu et al. This paper adapted the deformable convolution concept to the attention mechanism used in (vision) transformers, using offsets in the attention matrix to handle multiple feature scales simultaneously.

About

Deformable grid sampling operations implemented in PyTorch and CUDA.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors