Support for ONNX's ConvTranspose, DepthToSpace and MaxUnpool operators (AIS-2146)

### Checklist

- [x] Checked the issue tracker for similar issues to ensure this is not a duplicate.
- [x] Described the feature in detail and justified the reason for the request.
- [x] Provided specific use cases and examples.

### Feature description

I suggest adding implementations of the ONNX operators [ConvTranspose](https://onnx.ai/onnx/operators/onnx__ConvTranspose.html), [DepthToSpace](https://onnx.ai/onnx/operators/onnx__DepthToSpace.html) and [MaxUnpool](https://onnx.ai/onnx/operators/onnx__MaxUnpool.html) to ESP-DL. ConvTranspose and MaxUnpool could also be hardware-accelerated, like their counterparts Conv and MaxPool.

### Use cases

The requested operators improve ESP-DL's usability for machine learning tasks that require dense per-pixel outputs, such as semantic segmentation, monocular depth estimation and scene flow. The deep learning models used for these tasks require an upsampling operation. Currently, the only upsampling operator supported by ESP-DL is Resize. This restriction significantly limits which model architectures can run on ESP-DL.

### Alternatives

I have attempted to rewrite the ONNX graph generated by PyTorch so that ConvTranspose and DepthToSpace nodes are replaced by a sequence of implemented operators which approximate the original operation.

[Dumoulin & Visin (2018, pp. 20-27)](https://doi.org/10.48550/arXiv.1603.07285) state that it is possible to implement a transposed convolution using a regular convolution. However, zeros must be inserted between the input elements to emulate a fractional stride, which makes this approach inefficient in terms of memory and FLOPs.
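To illustrate the equivalence and its cost, here is a minimal 1D sketch in plain Python. The function names are hypothetical and this is not ESP-DL code; it assumes no padding, no output padding and a single channel. The emulated version inserts `stride - 1` zeros between input elements, pads with `k - 1` zeros on each side, and runs a stride-1 convolution with the flipped kernel.

```python
def conv_transpose_1d(x, w, stride):
    """Direct transposed convolution: scatter each input into the output."""
    k = len(w)
    out = [0] * ((len(x) - 1) * stride + k)
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            out[i * stride + j] += xi * wj
    return out


def conv_transpose_1d_emulated(x, w, stride):
    """Same result using only zero insertion, padding and a regular convolution."""
    k = len(w)
    # Insert (stride - 1) zeros between neighbouring input elements.
    dilated = [0] * ((len(x) - 1) * stride + 1)
    dilated[::stride] = x
    # Pad (k - 1) zeros on both sides ("full" convolution).
    padded = [0] * (k - 1) + dilated + [0] * (k - 1)
    # Stride-1 cross-correlation with the flipped kernel.
    flipped = w[::-1]
    out = [
        sum(padded[t + j] * flipped[j] for j in range(k))
        for t in range(len(padded) - k + 1)
    ]
    return out, len(padded)


x, w = [1, 2, 3], [1, 10, 100]
direct = conv_transpose_1d(x, w, stride=2)
emulated, padded_len = conv_transpose_1d_emulated(x, w, stride=2)
print(direct)                    # [1, 10, 102, 20, 203, 30, 300]
print(emulated == direct)        # True
print(len(x), "->", padded_len)  # 3 -> 9
```

In this example the regular convolution reads 9 elements, only 3 of which are non-zero, so most multiply-accumulates are spent on inserted zeros. A native ConvTranspose kernel could skip them.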

[PyTorch's C++ source code](https://github.com/pytorch/pytorch/blob/3a66a1cb99d7e1c1fe7abf8f390e177b9183a436/aten/src/ATen/native/PixelShuffle.cpp#L13-L58) for the `torch.nn.functional.pixel_shuffle` function and the [ONNX specification](https://onnx.ai/onnx/operators/onnx__DepthToSpace.html) use Reshape and Transpose operators to describe DepthToSpace's functionality. Both operators are already implemented in ESP-DL and can be used to create a new module.
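The Reshape/Transpose decomposition can be sketched with NumPy as follows. This follows the pseudocode in the ONNX specification for both modes; the function name `depth_to_space` is hypothetical and the sketch assumes an NCHW input. The `CRD` mode uses the same channel ordering as `pixel_shuffle`, while `DCR` is the ONNX default.

```python
import numpy as np


def depth_to_space(x, blocksize, mode="DCR"):
    """DepthToSpace on an NCHW array using only reshape and transpose."""
    b, c, h, w = x.shape
    r = blocksize
    c_out = c // (r * r)
    if mode == "DCR":
        # Channel index is interpreted as (block_row, block_col, c_out).
        tmp = x.reshape(b, r, r, c_out, h, w)
        tmp = tmp.transpose(0, 3, 4, 1, 5, 2)
    elif mode == "CRD":
        # Channel index is interpreted as (c_out, block_row, block_col).
        tmp = x.reshape(b, c_out, r, r, h, w)
        tmp = tmp.transpose(0, 1, 4, 2, 5, 3)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # tmp now has shape (b, c_out, h, r, w, r); merge the block axes into H and W.
    return tmp.reshape(b, c_out, h * r, w * r)


x = np.arange(4).reshape(1, 4, 1, 1)
y = depth_to_space(x, blocksize=2, mode="CRD")
print(y.shape)          # (1, 1, 2, 2)
print(y[0, 0].tolist())  # [[0, 1], [2, 3]]
```

Because the operation is a pure data rearrangement, it does not change any values, which suggests the quantization parameters of the input could be reused for the output.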

### Additional context

I have started to implement an ESP-DL module for DepthToSpace myself, but I have not been able to test it yet because ESP-PPQ v1.0.0 fails to quantize a PyTorch model consisting of a single `torch.nn.PixelShuffle` layer. It prints the following messages:

```sh
[ERROR][ESPDL]:    Can not reset DepthToSpace:/DepthToSpace layout
[INFO][ESPDL]:     skip not QuantableOperation
[WARNING][ESPDL]:  onnx::DepthToSpace_0 does not bind perm parameter
[WARNING][ESPDL]:  1 does not bind perm parameter
```

ESP-PPQ may need to be modified as well to fully support the DepthToSpace operator across the entire model deployment process.
