## Overview
Add a Conv2D to Img2Col + Matmul transformation pass in the global optimization phase to enable backends with optimized matmul implementations to leverage them for both regular and quantized convolution operations.
## Motivation
This proposal originated from discussions in PR #23278.
Convolution operations are critical for computer vision workloads. The img2col (image-to-column) transformation is a well-established technique that converts convolutions into matrix multiplications, enabling backends with highly optimized GEMM (General Matrix Multiply) implementations to leverage them for convolution operations.
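To make the technique concrete, here is a minimal NumPy sketch (illustrative only, not IREE code): it flattens each input patch into a row of a 2D matrix, performs a single matmul against the reshaped filter, and checks the result against a direct convolution. The function names and the stride-1/no-padding simplification are assumptions made for clarity.

```python
import numpy as np

def im2col_conv2d(x, w):
    """Stride-1, no-padding 2D convolution (NHWC input, HWCF filter)
    computed by flattening image patches into a matrix and doing one
    matmul -- the im2col technique."""
    n, h, wd, c = x.shape
    kh, kw, _, f = w.shape
    oh, ow = h - kh + 1, wd - kw + 1
    # Gather every kh x kw x c patch into a row of a 2D matrix.
    cols = np.empty((n * oh * ow, kh * kw * c))
    row = 0
    for b in range(n):
        for i in range(oh):
            for j in range(ow):
                cols[row] = x[b, i:i + kh, j:j + kw, :].ravel()
                row += 1
    # The convolution is now a single GEMM against the reshaped filter.
    out = cols @ w.reshape(kh * kw * c, f)
    return out.reshape(n, oh, ow, f)

def direct_conv2d(x, w):
    """Reference direct convolution for comparison."""
    n, h, wd, c = x.shape
    kh, kw, _, f = w.shape
    oh, ow = h - kh + 1, wd - kw + 1
    out = np.zeros((n, oh, ow, f))
    for b in range(n):
        for i in range(oh):
            for j in range(ow):
                # Contract the patch's kh, kw, c axes against the filter.
                out[b, i, j] = np.tensordot(x[b, i:i + kh, j:j + kw, :], w, axes=3)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 5, 3))
w = rng.standard_normal((3, 3, 3, 4))
assert np.allclose(im2col_conv2d(x, w), direct_conv2d(x, w))
```

The payoff is that the inner loop of the convolution becomes one large matrix multiply, which is exactly the shape backends with tuned GEMM kernels optimize best.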
## Current State and Problem
IREE currently has an img2col transformation pass in the preprocessing phase, but quantized convolution lowering happens in the global optimization phase via `LinalgQuantizedConvToConvPass`. This creates a pipeline ordering issue:

Preprocessing Phase:
- `ConvertConv2DToImg2ColPass` (if enabled)

Global Optimization Phase:
- `LinalgQuantizedConvToConvPass` ← converts quantized conv to regular conv
- `LinalgQuantizedMatmulToMatmulPass`
**The Problem:** When quantized convolutions are lowered to regular convolutions in global optimization, they cannot benefit from the img2col transformation, because that pass has already run in preprocessing. This means:
- Quantized convolutions (common in mobile/edge models) miss out on the img2col optimization
- The img2col transformation cannot be reused for both regular and quantized convolutions
- Duplicating the pass in both phases is not maintainable
**Alternative Workaround:** Users can manually specify a preprocessing pipeline:

```
iree-preprocessing-pass-pipeline="builtin.module(util.func(iree-global-opt-quantized-conv-to-conv, iree-preprocessing-convert-conv2d-to-img2col))"
```
While this is technically possible, it has some practical limitations:
- Requires understanding of internal pass dependencies and ordering
- Pass ordering needs careful maintenance (e.g., must run after quantized conv lowering)
- May miss necessary cleanup passes like canonicalization and CSE
- Can become outdated as the compiler evolves
- Not ideal for general users who may not be familiar with compiler internals
Integrating the pass into the standard pipeline provides a more robust and user-friendly solution.
## Proposal
Move `ConvertConv2DToImg2ColPass` from the preprocessing phase into the global optimization pipeline, placing it after quantized convolution lowering, so that it transforms both regular and (lowered) quantized linalg convolution operations into img2col + matmul form:
Global Optimization Phase:
- `LinalgQuantizedConvToConvPass` ← lowers quantized conv first
- `LinalgQuantizedMatmulToMatmulPass`
- `ConvertConv2DToImg2ColPass` ← now applies to both regular AND quantized convs
This enables:

- **Unified img2col transformation:** Both regular and quantized convolutions use the same img2col pass, improving code reuse and maintainability.
- **Better quantized model performance:** Quantized convolutions (after lowering) can now benefit from img2col + optimized matmul, which is critical for edge deployment scenarios.
- **Leverage of optimized matmul implementations:** Backends with highly tuned matmul kernels (vendor libraries, ukernels, etc.) benefit from converting convolutions to matmul operations.
- **Improved inference performance:** Particularly beneficial for:
  - Quantized models on edge devices (MobileNet, EfficientNet, etc.)
  - Deployments where matmul implementations are more optimized than direct convolution
  - Inference workloads (batch=1 or small batches are common in deployment)
  - Modern CNN architectures, which typically rely on small convolutional kernels as their core spatial operation
- **Better integration with the IREE pipeline:** Matmul operations integrate more naturally with dispatch formation and fusion heuristics in the global optimization phase.
- **No impact on other backends:** The transformation is opt-in via a command-line flag, allowing backends to preserve direct convolution form for their own specialized optimizations when img2col is not beneficial.
## Design
**Supported Operations:**

- `linalg.conv_2d_nhwc_hwcf` → img2col + `linalg.matmul`
- `linalg.conv_2d_nchw_fchw` → img2col + `linalg.matmul`
- `linalg.depthwise_conv_2d_nhwc_hwc` → img2col + depthwise matmul
- Quantized convolutions (after lowering via `LinalgQuantizedConvToConvPass`) → same transformation path
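As a shape-level illustration of what the `linalg.conv_2d_nhwc_hwcf` rewrite produces, the sketch below (a hypothetical helper, not part of the pass) computes the matmul dimensions. The actual pass may fold dimensions differently (for example, emitting a batched matmul rather than collapsing the batch into rows), so treat this as a sketch of the general shape arithmetic.

```python
def img2col_matmul_shapes(n, h, w, c, kh, kw, f, stride=1):
    """Shapes of the matmul produced by img2col for an NHWC/HWCF conv:
    an (N*OH*OW) x (KH*KW*C) patch matrix times a (KH*KW*C) x F
    filter matrix, ignoring padding for simplicity."""
    oh = (h - kh) // stride + 1
    ow = (w - kw) // stride + 1
    lhs = (n * oh * ow, kh * kw * c)   # im2col patch matrix
    rhs = (kh * kw * c, f)             # reshaped filter
    return lhs, rhs, (n, oh, ow, f)    # matmul operands + output tensor

# A 3x3 conv over a 1x56x56x64 NHWC tensor producing 128 output
# channels becomes a 2916x576 times 576x128 matmul.
lhs, rhs, out = img2col_matmul_shapes(1, 56, 56, 64, 3, 3, 128)
```

This also shows why the rewrite pays off on GEMM-oriented backends: a small-kernel convolution over a typical feature map turns into one large, regular matmul.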
**Pipeline Integration:**

The pass runs in `buildGlobalOptimizationPassPipeline()` after quantized convolution lowering:

```cpp
FunctionLikeNest(mainPassManager)
    .addPass(createLinalgQuantizedConvToConvPass)     // 1. Lower quantized conv
    .addPass(createLinalgQuantizedMatmulToMatmulPass) // 2. Lower quantized matmul
    .addPass(createConvertConv2DToImg2ColPass)        // 3. Apply img2col (NEW, opt-in)
    .addPass(IREE::Flow::createCanonicalizePass)
```
**Critical ordering:** The pass must run after `LinalgQuantizedConvToConvPass` so that:
- Quantized convolutions are first lowered to regular `linalg.conv_2d_*` operations
- The img2col transformation can then apply to both originally-regular and originally-quantized convolutions
- All convolutions benefit from the same img2col + matmul optimization path
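To illustrate why this ordering works, a zero-point-adjusted convolution `sum((x - zx) * (w - zw))` decomposes algebraically into a regular convolution plus zero-point correction terms, and that regular convolution is exactly what img2col can then rewrite into a matmul. Below is a minimal 1D NumPy sketch of the decomposition (illustrative only; the actual pass rewrites linalg named ops in MLIR, and the function name is made up):

```python
import numpy as np

def quantized_conv_via_regular_conv(x, w, zx, zw):
    """Rewrite sum_j (x[i+j] - zx) * (w[j] - zw) as a regular
    convolution plus zero-point corrections (1D, stride 1, no
    padding, for clarity):
      conv(x, w) - zw * sliding_sum(x) - zx * sum(w) + k * zx * zw
    """
    k = len(w)
    conv = np.correlate(x, w, mode="valid")             # regular conv
    x_sums = np.correlate(x, np.ones(k), mode="valid")  # sliding sums of x
    return conv - zw * x_sums - zx * w.sum() + k * zx * zw

x = np.array([3.0, 5.0, 7.0, 9.0])
w = np.array([2.0, 4.0])
zx, zw = 1.0, 2.0
# Reference: the quantized conv computed directly per window.
ref = np.array([sum((x[i + j] - zx) * (w[j] - zw) for j in range(2))
                for i in range(3)])
assert np.allclose(quantized_conv_via_regular_conv(x, w, zx, zw), ref)
```

After this lowering, the remaining `conv` term is an ordinary convolution, so placing img2col after the lowering lets quantized models reach the matmul path too.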
**Opt-in via flag:** The pass is controlled by a command-line option (e.g., `--iree-global-opt-enable-conv2d-to-img2col`) and is disabled by default. This allows:
- Backends to opt-in when img2col provides better performance for their target architecture
- Backends to preserve direct convolution form when they have specialized optimizations
- Flexibility for different deployment scenarios and hardware targets
This placement also ensures:
- Transformation occurs before dispatch region formation
- Downstream passes can optimize the resulting matmul operations
- Matmul fusion opportunities are preserved
- No impact on backends that prefer direct convolution lowering