Add torchvision to benchmarks.

## Benchmark torchvision as a potential backend in `albucore` low-level ops

### Context

In `albucore` we maintain a set of low-level helpers (e.g. `hflip`, `multiply_by_constant`) that dynamically select the fastest implementation depending on **shape, dtype, and layout**, currently choosing between:

* OpenCV
* NumPy
* SimSIMD (where applicable)

These helpers are performance-critical and sit at the bottom of the AlbumentationsX stack, so backend selection is intentionally pragmatic and benchmark-driven.
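
To make the idea of property-based selection concrete, here is a minimal sketch for `hflip` on a single `(H, W)` or `(H, W, C)` image. The names `hflip_numpy`, `hflip_cv2`, and `select_hflip`, as well as the selection rule itself, are hypothetical and are not the actual `albucore` logic; OpenCV is treated as optional.

```python
import numpy as np

try:  # OpenCV is optional in this sketch
    import cv2
except ImportError:
    cv2 = None


def hflip_numpy(img: np.ndarray) -> np.ndarray:
    """Reverse the width axis (axis 1) and return a contiguous copy."""
    return np.ascontiguousarray(img[:, ::-1, ...])


def hflip_cv2(img: np.ndarray) -> np.ndarray:
    """Flip around the vertical axis with OpenCV (flipCode=1)."""
    return cv2.flip(img, 1)


def select_hflip(img: np.ndarray):
    """Pick a backend from the array's dtype, shape, and layout.

    The rule below is an illustrative assumption, not a measured result.
    """
    use_cv2 = (
        cv2 is not None
        and img.dtype == np.uint8
        and img.ndim == 3
        and img.shape[2] <= 4
        and img.flags["C_CONTIGUOUS"]
    )
    return hflip_cv2 if use_cv2 else hflip_numpy


def hflip(img: np.ndarray) -> np.ndarray:
    """Dispatch to whichever backend the rule selects."""
    return select_hflip(img)(img)
```

A torchvision backend would slot in as one more branch in `select_hflip`, which is why the benchmark needs to report results per shape, dtype, and layout rather than as a single number.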

Given how widely **torchvision** is used — and the fact that many users already depend on it transitively — it’s worth evaluating whether torchvision ops should be considered as an **additional backend** for some of these primitives.

This issue proposes a **systematic benchmark**, not a commitment.

---

### Goal

Answer a single question with data:

> Are there specific low-level operations, shapes, or dtypes where torchvision is meaningfully faster or more robust than our existing OpenCV / NumPy / SimSIMD paths?

If yes → we can consider adding torchvision as an optional backend.
If no → we document the result and move on.

---

### Scope of benchmarking

Suggested initial candidates (non-exhaustive):

* `hflip` / `vflip`
* elementwise ops (e.g. multiply/add by constant)
* simple type-preserving transforms (output dtype equals input dtype)
* batched inputs (N, H, W, C)
* multi-channel images (C > 4)
* common dtypes: `uint8`, `float32`

Dimensions to explicitly test:

* contiguous vs non-contiguous memory
* small vs large images
* CPU execution only (no CUDA)
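
The dimensions above can be turned into a reproducible input grid. The following is a sketch only: the concrete sizes in `SHAPES` and the helper names `make_input` and `iter_cases` are assumptions chosen for illustration. Non-contiguous inputs are produced by slicing every other column out of a wider array.

```python
from itertools import product

import numpy as np

# Hypothetical sizes; adjust to the regimes worth measuring.
SHAPES = {
    "small": (32, 32, 3),
    "large": (1024, 1024, 3),
    "multi_channel": (256, 256, 8),
    "batched": (4, 256, 256, 3),
}
DTYPES = (np.uint8, np.float32)


def make_input(shape, dtype, contiguous, seed=0):
    """Create a random array of `shape`, optionally as a strided view."""
    rng = np.random.default_rng(seed)
    full_shape = list(shape)
    if not contiguous:
        full_shape[-2] *= 2  # double the width axis, then subsample it
    if np.dtype(dtype) == np.uint8:
        arr = rng.integers(0, 256, size=full_shape, dtype=np.uint8)
    else:
        arr = rng.random(size=full_shape, dtype=np.float32)
    return arr if contiguous else arr[..., ::2, :]


def iter_cases():
    """Yield (name, dtype, contiguous, array) for every combination."""
    for (name, shape), dtype, contiguous in product(
        SHAPES.items(), DTYPES, (True, False)
    ):
        yield name, dtype, contiguous, make_input(shape, dtype, contiguous)
```

Fixing the seed keeps inputs identical across backends, so timing differences are not caused by different data.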

---

### Comparison targets

For each operation:

* OpenCV implementation
* NumPy implementation
* SimSIMD (where available)
* torchvision functional equivalent
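
For the torchvision entry, the conversion between NumPy arrays and tensors is part of what gets measured. Below is a minimal sketch of an adapter for `hflip`, assuming a single `(H, W, C)` input; the wrapper name `hflip_torchvision` is hypothetical. It relies on `torch.from_numpy` and `torchvision.transforms.functional.hflip`, which flips the last dimension of a `[..., H, W]` tensor.

```python
import numpy as np

try:  # torch and torchvision are optional in this sketch
    import torch
    from torchvision.transforms import functional as F
except ImportError:
    torch = None
    F = None


def hflip_torchvision(img: np.ndarray) -> np.ndarray:
    """Flip an (H, W, C) array horizontally via torchvision.

    torch.from_numpy shares memory with the input; it does not accept
    arrays with negative strides, so such inputs need a copy first.
    """
    if torch is None:
        raise RuntimeError("torch and torchvision are required for this backend")
    tensor = torch.from_numpy(img).permute(2, 0, 1)  # (C, H, W) view, no copy
    flipped = F.hflip(tensor)  # flips the last (width) dimension
    # Back to (H, W, C); .contiguous() forces a dense layout for NumPy callers.
    return flipped.permute(1, 2, 0).contiguous().numpy()
```

Whether the layout round trip cancels out any kernel-level advantage is exactly the kind of question the benchmark should answer.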

Metrics:

* wall-clock time
* allocation behavior (e.g. whether the op returns a view or makes extra copies)
* constraints on shape / dtype
* semantic equivalence (exactness vs approximation)
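
A sketch of how these metrics could be collected for one backend on one input, using only `timeit` and NumPy. The function name `measure` and the default `repeat` / `number` values are assumptions; `reference` is the output of whichever implementation is treated as ground truth.

```python
import timeit

import numpy as np


def measure(fn, img, reference, repeat=5, number=20):
    """Time `fn(img)` and compare its output against `reference`."""
    out = fn(img)  # also serves as a warm-up call
    times = timeit.repeat(lambda: fn(img), repeat=repeat, number=number)
    same_shape = out.shape == reference.shape
    return {
        # Best-of-N per-call wall-clock time in milliseconds.
        "best_ms": min(times) / number * 1e3,
        # Bit-exact match, including dtype.
        "exact": bool(
            same_shape
            and out.dtype == reference.dtype
            and np.array_equal(out, reference)
        ),
        # Match within NumPy's default floating-point tolerances.
        "close": bool(same_shape and np.allclose(out, reference)),
        # Conservative bounds check: True suggests a view, False a fresh buffer.
        "shares_input_memory": bool(np.may_share_memory(out, img)),
        "c_contiguous": bool(out.flags["C_CONTIGUOUS"]),
    }
```

Taking the minimum over repeats reduces noise from other processes; reporting `exact` and `close` separately distinguishes bit-identical backends from ones that only agree approximately.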

---

### Non-goals

* No GPU / CUDA benchmarks in this issue
* No API changes proposed yet
* No commitment to add torchvision as a hard dependency

---

### Acceptance criteria

This issue is considered resolved when:

* Benchmarks are reproducible and documented
* We have a clear table of “torchvision wins / losses”
* A decision is made: **add torchvision backend** or **explicitly reject**

Either outcome is useful.
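
One possible way to produce the wins / losses table from collected timings is sketched below. The function name `wins_losses_table`, the input layout (`{case: {backend: best_ms}}`), and the `min_speedup` threshold of 1.1 are all assumptions; the threshold is a stand-in for whatever "meaningfully faster" ends up meaning.

```python
def wins_losses_table(results, candidate="torchvision", min_speedup=1.1):
    """Render a Markdown table comparing `candidate` to the best other backend.

    `results` maps a case name to {backend_name: best_time_ms}.
    """
    lines = [
        "| case | best existing backend | best existing (ms) | candidate (ms) | verdict |",
        "|---|---|---|---|---|",
    ]
    for case, timings in results.items():
        existing = {k: v for k, v in timings.items() if k != candidate}
        best_name = min(existing, key=existing.get)
        best = existing[best_name]
        cand = timings.get(candidate)
        if cand is None:
            cand_text, verdict = "n/a", "n/a"  # op not supported by candidate
        else:
            cand_text = f"{cand:.3f}"
            if best / cand >= min_speedup:
                verdict = "win"
            elif cand / best >= min_speedup:
                verdict = "loss"
            else:
                verdict = "tie"
        lines.append(
            f"| {case} | {best_name} | {best:.3f} | {cand_text} | {verdict} |"
        )
    return "\n".join(lines)
```

Treating differences below the threshold as a tie keeps small, noisy gaps from being counted as a reason to add a backend.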

---

### Why this matters

At this level of the stack, performance differences compound.
If torchvision provides a measurable win in specific regimes (e.g. large tensors, specific layouts), we should know — and if it doesn’t, we should stop wondering.

Benchmarks > intuition.

---

### References

* Existing backend selection logic in `albucore`
* torchvision functional transforms documentation
* prior OpenCV / NumPy / SimSIMD benchmarks in the repo

