Change input shape for 3D position_ids of Qwen 2.5 VL with M-RoPE #3400
yatarkan merged 4 commits into openvinotoolkit:master
Conversation
Pull request overview
This PR modifies the handling of 3D position_ids in continuous batching for models that use M-RoPE (Multi-dimensional Rotary Position Embedding), specifically Qwen2-VL models. The change replaces the incorrect flattening of position_ids ([3, 1, N] → [3*N]) with a proper squeeze of the pseudo-batch dimension ([3, 1, N] → [3, N]).
Changes:
- Fixed M-RoPE position_ids handling by squeezing the pseudo-batch dimension instead of flattening all dimensions
- Added more specific shape validation (checking for shape[0]==3 and shape[1]==1) before reshaping
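The difference between the old flattening and the new squeeze can be illustrated with a minimal standalone sketch. Note this uses a plain `std::vector<std::size_t>` as a stand-in for `ov::Shape`, not the actual `ov::Tensor` API the PR touches:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ov::Shape; the real code calls position_ids.get_shape() and
// position_ids.set_shape() on an ov::Tensor.
using Shape = std::vector<std::size_t>;

// Squeeze the pseudo-batch dimension of an M-RoPE position_ids shape:
// [3, 1, total_token_num] -> [3, total_token_num].
// Any other shape is returned unchanged.
Shape squeeze_mrope_position_ids(const Shape& shape) {
    if (shape.size() == 3 && shape[0] == 3 && shape[1] == 1) {
        return {shape[0], shape[2]};
    }
    return shape;
}
```

For example, `squeeze_mrope_position_ids({3, 1, 17})` yields `{3, 17}`, preserving the three M-RoPE components as a separate dimension, whereas the previous flattening would have produced a rank-1 `{51}` and lost that structure.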
```cpp
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();
    position_ids.set_shape({position_ids_shape[0], position_ids_shape[2]});
}
```
This change affects M-RoPE position_ids handling for models like Qwen2-VL, but there's no test coverage for Qwen2-VL models in continuous batching mode. The test `test_vlm_continuous_batching_generate_vs_add_request` uses `@parametrize_one_model_pa`, which only tests `MODEL_IDS[0]` (`tiny-random-minicpmv-2_6`), not the Qwen2-VL models that use M-RoPE with 3D position_ids.
According to custom coding guideline 1000000 rule 9: "Tests have been updated or added to cover the new code." Since this fix addresses M-RoPE-specific position_ids handling, tests should be added or updated to verify that Qwen2-VL models work correctly with continuous batching, particularly in the generation phase where 3D position_ids [3, 1, N] are created.
```cpp
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();
```
The hardcoded check for shape[0] == 3 and shape[1] == 1 is specific to M-RoPE (Multi-dimensional Rotary Position Embedding) used by Qwen2-VL models. This is fragile because:
- It assumes M-RoPE always uses exactly 3 dimensions and a batch size of 1
- Other models with 3D position_ids but different shapes would bypass this logic
- There's no documentation explaining why these specific values are checked
Consider:
- Adding a comment explaining that these values are M-RoPE specific constants
- Adding a model capability flag or configuration to indicate M-RoPE support rather than relying on shape inference
- Validating that these constants match the actual M-RoPE implementation in Qwen2VL (currently hardcoded as shape [3, 1, N] in qwen2vl/classes.cpp:1208)
Suggested change:

```cpp
// Qwen2-VL M-RoPE compatibility:
// Qwen2-VL uses Multi-dimensional Rotary Position Embedding (M-RoPE), where position_ids
// are produced with a fixed 3D layout [3, 1, total_token_num]. The leading dimension (3)
// corresponds to M-RoPE's internal components and the middle dimension is a pseudo-batch
// of size 1. The model implementation (see qwen2vl/classes.cpp, around line 1208) expects
// a squeezed layout [3, total_token_num] at inference time.
//
// To support Qwen2-VL without changing other models, handle only this specific M-RoPE
// shape by removing the pseudo-batch dimension.
constexpr std::size_t qwen2vl_mrope_components = 3;  // M-RoPE internal components
constexpr std::size_t qwen2vl_mrope_batch = 1;       // pseudo-batch dimension
const auto& position_ids_shape = position_ids.get_shape();
if (position_ids_shape.size() == 3 &&
    position_ids_shape[0] == qwen2vl_mrope_components &&
    position_ids_shape[1] == qwen2vl_mrope_batch) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    position_ids.set_shape({position_ids_shape[0], position_ids_shape[2]});
}
```
```cpp
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();
    position_ids.set_shape({position_ids_shape[0], position_ids_shape[2]});
}
```
The PR description is incomplete and does not follow the template. According to the custom coding guideline 1000000, rule 1: "PR description must be aligned with pull_request_template.md and its checklist must be filled out."
The description should include:
- A summary of the change (M-RoPE position_ids shape handling fix)
- Why this change is needed
- Whether tests have been added or updated
- Ticket information if applicable
- All checklist items must be completed
Requires: openvinotoolkit/openvino#34365

The tests will pass once we merge the runtime PR: openvinotoolkit/openvino#34365
```diff
  // TODO: remove it when GEMMA3 ticket-171180 is fixed
  bool requires_sdpa(const std::filesystem::path& models_dir) {
      auto vlm_config = utils::from_config_json_if_exists<VLMConfig>(models_dir, "config.json");
-     return vlm_config.model_type == VLMModelType::GEMMA3;
+     return vlm_config.model_type == VLMModelType::QWEN2_VL ||
+            vlm_config.model_type == VLMModelType::QWEN2_5_VL ||
+            vlm_config.model_type == VLMModelType::GEMMA3;
  }
```
PR description does not follow the repository PR template: the checklist items are missing and the ticket placeholder format (CVS-###) is not used. Please update the PR description to match .github/pull_request_template.md (include the checklist with checked/unchecked items and the CVS ticket line).
```cpp
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();
    position_ids.set_shape({position_ids_shape[0], position_ids_shape[2]});
}
```
The new M-RoPE position_ids shape adjustment ([3, 1, total_token_num] -> [3, total_token_num]) is a behavior change in the ContinuousBatching inference path, but there does not appear to be any C++ test coverage asserting the expected position_ids rank/shape for VLM/Qwen CB runs. Please add a regression test that fails if 3D position_ids are flattened or otherwise reshaped incorrectly.
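A shape-level regression check could look like the following sketch. It is standalone and uses a `std::vector<std::size_t>` stand-in for `ov::Shape` rather than the real `ov::Tensor` and the continuous-batching test fixtures, which are assumptions here; a real test would hook into the CB pipeline:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Reproduces the reshape under test: squeeze [3, 1, N] -> [3, N],
// leaving other shapes untouched.
Shape reshape_position_ids(const Shape& shape) {
    if (shape.size() == 3 && shape[0] == 3 && shape[1] == 1) {
        return {shape[0], shape[2]};
    }
    return shape;
}

// Regression check: M-RoPE position_ids must come out rank-2 with a
// leading dimension of 3, and must not be flattened to rank-1 [3 * N].
void check_mrope_not_flattened(std::size_t total_token_num) {
    const Shape squeezed = reshape_position_ids({3, 1, total_token_num});
    assert(squeezed.size() == 2);                    // rank 2, not flattened to rank 1
    assert(squeezed[0] == 3);                        // M-RoPE components preserved
    assert(squeezed[1] == total_token_num);          // token dimension preserved
    assert(squeezed != Shape{3 * total_token_num});  // not the old [3*N] flattening
}
```

Such a check fails immediately if the reshape regresses to flattening, which is exactly the behavior this PR fixes.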
603d66f
The analysis has shown that the correct shape for the 3D position_ids tensor of Qwen 2.5 VL in ContinuousBatching mode is not a flattened one, but [3, total_token_num], which preserves the correct 3D semantics of the model's M-RoPE. Instead of reshaping on the transformation side, it is correct and logical to provide the proper tensor from the GenAI side.
Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>