Add Initial Qwen-2.5VL Image Processor Support by sayanshaw24 · Pull Request #1008 · microsoft/onnxruntime-extensions

sayanshaw24 · 2025-11-12T00:19:11Z

Description

This PR introduces support for Qwen2.5-VL image preprocessing in ONNX Runtime Extensions. It implements the full resize (including smart resize) → rescale → normalize → patching pipeline required by Qwen2.5-VL-7B-Instruct.

Key updates:

- New Qwen2.5-VL–compatible PatchImage op
  - Converts RGB HWC to CHW channel ordering
  - Performs accurate temporal padding and 9-D reshape/transpose logic
  - Produces Python-aligned patch embeddings
- Smart Resize support in the Resize op
  - Implements Qwen2.5-VL’s shortest-edge and longest-edge constraints
  - Supports pixel-count–based resizing (min/max pixels)
  - Matches smart_resize behavior from HF transformers
- Qwen2.5-VL Normalization handling
  - Fixes channel-order expectations across resize → normalize → patch
  - Ensures exact mean/std behavior equivalent to HF transformers
  - Includes C++ normalization adjustments when Qwen2.5-VL mode is enabled
- Qwen2.5-VL processor JSON
  - Includes Qwen-specific resize parameters
  - Correct mean/std normalization
  - Patch dimensions (patch_size, merge_size, temporal_patch_size)
- RGB Correctness Guarantees
  - Fixes the BGR→RGB mismatch observed in PatchImage
  - Ensures alignment with Qwen2.5-VL
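For readers unfamiliar with the smart-resize rule referenced above, here is a minimal sketch of the target-size computation, modeled on the general shape of smart_resize in HF transformers. It is not the code added in this PR. The function name `SmartResizeSketch` is hypothetical, and the assumption that `factor` equals patch_size * merge_size, as well as any concrete min/max pixel values passed in, should be checked against the processor JSON.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical sketch: returns {height, width}, each a multiple of `factor`,
// with the total pixel count pushed into [min_pixels, max_pixels].
// Assumption: `factor` is patch_size * merge_size.
std::pair<int64_t, int64_t> SmartResizeSketch(int64_t height, int64_t width,
                                              int64_t factor,
                                              int64_t min_pixels,
                                              int64_t max_pixels) {
  assert(height > 0 && width > 0 && factor > 0);
  assert(min_pixels <= max_pixels);
  const double h = static_cast<double>(height);
  const double w = static_cast<double>(width);
  const double f = static_cast<double>(factor);

  // Round each side to the nearest multiple of `factor`. std::nearbyint
  // follows the current rounding mode (round-half-to-even by default),
  // which mirrors Python's round().
  int64_t h_bar = static_cast<int64_t>(std::nearbyint(h / f)) * factor;
  int64_t w_bar = static_cast<int64_t>(std::nearbyint(w / f)) * factor;

  if (h_bar * w_bar > max_pixels) {
    // Too large: shrink both sides by a common scale, rounding down.
    const double beta = std::sqrt((h * w) / static_cast<double>(max_pixels));
    h_bar = std::max<int64_t>(
        factor, static_cast<int64_t>(std::floor(h / beta / f)) * factor);
    w_bar = std::max<int64_t>(
        factor, static_cast<int64_t>(std::floor(w / beta / f)) * factor);
  } else if (h_bar * w_bar < min_pixels) {
    // Too small: grow both sides by a common scale, rounding up.
    const double beta = std::sqrt(static_cast<double>(min_pixels) / (h * w));
    h_bar = static_cast<int64_t>(std::ceil(h * beta / f)) * factor;
    w_bar = static_cast<int64_t>(std::ceil(w * beta / f)) * factor;
  }
  return {h_bar, w_bar};
}
```

With factor 28 and an illustrative pixel range of 3136 to 1003520, a 100 x 100 input maps to 112 x 112, while a 1000 x 2000 input is scaled down to 700 x 1400 so that it fits under the maximum.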

This enables image preprocessing for Qwen2.5-VL vision and multimodal models in ORT Extensions, ready to use in ORT GenAI. Note that only single-image input is currently supported; video input is not.
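To make the role of the patch dimensions (patch_size, temporal_patch_size) concrete, the sketch below computes the patch grid and the flattened pixel_values shape for a resized input. It assumes the HF-style layout in which each output row is one spatial patch holding channels * temporal_patch_size * patch_size * patch_size values; the struct and function names are hypothetical and are not taken from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the patch grid and flattened output shape.
struct PatchGridSketch {
  int64_t grid_t;  // temporal groups after padding the frame count
  int64_t grid_h;  // patches along the height
  int64_t grid_w;  // patches along the width
  int64_t rows;    // grid_t * grid_h * grid_w
  int64_t cols;    // channels * temporal_patch_size * patch_size^2
};

// Assumption: height and width were already resized to multiples of
// patch_size (smart resize guarantees this when factor is a multiple of it).
PatchGridSketch ComputePatchGridSketch(int64_t frames, int64_t height,
                                       int64_t width, int64_t channels,
                                       int64_t patch_size,
                                       int64_t temporal_patch_size) {
  assert(frames > 0 && channels > 0);
  assert(patch_size > 0 && temporal_patch_size > 0);
  assert(height % patch_size == 0 && width % patch_size == 0);

  // Temporal padding: round the frame count up to a multiple of
  // temporal_patch_size (a single image becomes one temporal group).
  const int64_t padded_frames =
      (frames + temporal_patch_size - 1) / temporal_patch_size *
      temporal_patch_size;

  PatchGridSketch g{};
  g.grid_t = padded_frames / temporal_patch_size;
  g.grid_h = height / patch_size;
  g.grid_w = width / patch_size;
  g.rows = g.grid_t * g.grid_h * g.grid_w;
  g.cols = channels * temporal_patch_size * patch_size * patch_size;
  return g;
}
```

For example, a single 112 x 112 RGB image with patch_size 14 and temporal_patch_size 2 (illustrative values) gives an 8 x 8 grid, that is, 64 rows of 1176 values each.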

Validation

- C++ unit tests verify that pixel_values match the reference output, with a mean squared error (MSE) of ≤ 1e-3
- Step-by-step parity checks across:
  - Resize
  - Rescale
  - Normalize
  - Patch extraction
- Regression-tested with Phi-3-V, Phi-4, MLlama, Gemma, etc.; their outputs are unchanged
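The MSE check mentioned above can be sketched as follows. This is an illustrative helper, not the test code from the PR: the name `MeanSquaredErrorSketch` is hypothetical, and the reference values would come from the Python processor output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean squared error between two equally sized buffers,
// accumulated in double to limit rounding error on large tensors.
double MeanSquaredErrorSketch(const std::vector<float>& actual,
                              const std::vector<float>& expected) {
  assert(actual.size() == expected.size());
  assert(!actual.empty());
  double sum = 0.0;
  for (std::size_t i = 0; i < actual.size(); ++i) {
    const double diff =
        static_cast<double>(actual[i]) - static_cast<double>(expected[i]);
    sum += diff * diff;
  }
  return sum / static_cast<double>(actual.size());
}
```

A unit test would then compare the result against the stated tolerance, for example by checking that `MeanSquaredErrorSketch(pixel_values, reference) <= 1e-3`, where both buffers are supplied by the test.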

tianleiwu · 2025-12-05T18:09:08Z

shared/api/image_transforms_qwen2_5.hpp

+    }
+
+    // Add batch dimension (frames = 1 initially) and prepare patches vector
+    std::vector<float> patches = chw;


We should avoid this vector copy.

I think we can allocate the buffer at its padded size, fill it with the padding value, and then apply the HWC -> CHW conversion directly into it (computing each destination offset from the padded height and width).
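One way to read this suggestion is sketched below: allocate the destination once at the padded size, pre-filled with the padding value, and write each source pixel straight to its planar position. This illustrates the idea only and is not the change that was merged; the function name and parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: convert an interleaved H x W x C image into a planar
// C x padded_h x padded_w buffer in a single pass, with no intermediate copy.
std::vector<float> HwcToPaddedChwSketch(const std::vector<float>& hwc,
                                        std::size_t h, std::size_t w,
                                        std::size_t c, std::size_t padded_h,
                                        std::size_t padded_w,
                                        float pad_value) {
  assert(hwc.size() == h * w * c);
  assert(padded_h >= h && padded_w >= w);

  // Single allocation at the final size; padding cells keep pad_value.
  std::vector<float> chw(c * padded_h * padded_w, pad_value);

  for (std::size_t y = 0; y < h; ++y) {
    for (std::size_t x = 0; x < w; ++x) {
      for (std::size_t ch = 0; ch < c; ++ch) {
        // Destination offset uses the padded height and width as strides.
        chw[(ch * padded_h + y) * padded_w + x] = hwc[(y * w + x) * c + ch];
      }
    }
  }
  return chw;
}
```

The same indexing extends to temporal padding by adding a leading frame dimension to the destination offset, so the padded frames can also be written without copying the CHW buffer.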

Sayan Shaw added 2 commits November 11, 2025 16:17

add initial qwen2.5vl image processor support

cdec3e5

Merge branch 'main' of https://github.com/microsoft/onnxruntime-exten…

301b442

…sions into sayanshaw/qwen2-5-vl

apsonawane reviewed Nov 14, 2025

shared/api/image_transforms_qwen2_5.hpp (outdated, resolved)

Sayan Shaw added 2 commits November 19, 2025 15:49

refactor and improve processor

fea695e

Merge branch 'main' of https://github.com/microsoft/onnxruntime-exten…

3cf6969

…sions into sayanshaw/qwen2-5-vl

sayanshaw24 marked this pull request as ready for review November 20, 2025 00:05

sayanshaw24 requested a review from a team as a code owner November 20, 2025 00:05

Merge branch 'main' of https://github.com/microsoft/onnxruntime-exten…

892e608

…sions into sayanshaw/qwen2-5-vl

sayanshaw24 enabled auto-merge (squash) November 20, 2025 18:41

Merge branch 'main' of https://github.com/microsoft/onnxruntime-exten…

f4ddc1e

…sions into sayanshaw/qwen2-5-vl

hanbitmyths reviewed Nov 24, 2025

shared/api/image_transforms.hpp (resolved)

hanbitmyths reviewed Nov 24, 2025

shared/api/image_transforms.hpp (outdated, resolved)

resolve comments

d1a6068

hanbitmyths approved these changes Nov 26, 2025

apsonawane approved these changes Nov 26, 2025

sayanshaw24 merged commit ccab2e5 into main Nov 26, 2025
37 checks passed

sayanshaw24 deleted the sayanshaw/qwen2-5-vl branch November 26, 2025 19:48

tianleiwu reviewed Dec 5, 2025

sayanshaw24 merged 7 commits into main from sayanshaw/qwen2-5-vl

4 participants