
Change input shape for 3D position_ids of Qwen 2.5 VL with M-RoPE #3400

Merged
yatarkan merged 4 commits into openvinotoolkit:master from CuriousPanCake:CVS-167316_genai
Mar 12, 2026
Conversation

Contributor

@CuriousPanCake CuriousPanCake commented Feb 26, 2026

Analysis has shown that, in ContinuousBatching mode, the 3D position_ids tensor of Qwen 2.5 VL should not be flattened; its correct shape is [3, total_token_num]. This shape preserves the 3D semantics that the model's M-RoPE relies on. Rather than reshaping on the transformation side, it is more logical to provide a correctly shaped tensor from the GenAI side.

Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>

Copilot AI review requested due to automatic review settings February 26, 2026 14:28
@github-actions github-actions bot added the category: continuous batching Continuous batching label Feb 26, 2026
@CuriousPanCake CuriousPanCake changed the title wip Change input shape for 3D position_ids of Qwen 2.5 VL with M-RoPE Feb 26, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR modifies the handling of 3D position_ids in continuous batching for models that use M-RoPE (Multimodal Rotary Position Embedding), specifically Qwen2-VL models. It fixes incorrect reshaping of position_ids: instead of flattening [3, 1, N] → [3*N], the pseudo-batch dimension is now squeezed, [3, 1, N] → [3, N].

Changes:

  • Fixed M-RoPE position_ids handling by squeezing the pseudo-batch dimension instead of flattening all dimensions
  • Added more specific shape validation (checking for shape[0]==3 and shape[1]==1) before reshaping
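
The difference between flattening and squeezing can be illustrated with a minimal sketch. It assumes a row-major layout and uses plain `std::vector` shapes rather than `ov::Tensor`; the helper names (`squeeze_mrope_shape`, `element_count`, `mrope_offset`) are hypothetical and not part of GenAI.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Hypothetical helper: drop the pseudo-batch axis of an M-RoPE shape,
// [3, 1, N] -> [3, N]. Any other shape is returned unchanged.
Shape squeeze_mrope_shape(const Shape& shape) {
    if (shape.size() == 3 && shape[0] == 3 && shape[1] == 1) {
        return {shape[0], shape[2]};
    }
    return shape;
}

// Total number of elements described by a shape.
std::size_t element_count(const Shape& shape) {
    std::size_t count = 1;
    for (std::size_t dim : shape) {
        count *= dim;
    }
    return count;
}

// Row-major offset of (component, token) in a squeezed [3, N] tensor.
// The component axis stays addressable, which a flat [3*N] shape would hide.
std::size_t mrope_offset(const Shape& squeezed, std::size_t component, std::size_t token) {
    assert(squeezed.size() == 2);
    assert(component < squeezed[0] && token < squeezed[1]);
    return component * squeezed[1] + token;
}
```

Squeezing changes only the shape metadata: the element count and the underlying buffer stay the same, so a shape-only update is sufficient. For N = 5, component 2 and token 4 land at offset 14 in the squeezed tensor.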

Comment on lines +502 to 507
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();
    position_ids.set_shape({position_ids_shape[0], position_ids_shape[2]});
}

Copilot AI Feb 26, 2026

This change affects M-RoPE position_ids handling for models like Qwen2-VL, but there's no test coverage for Qwen2-VL models in continuous batching mode. The test test_vlm_continuous_batching_generate_vs_add_request uses @parametrize_one_model_pa which only tests MODEL_IDS[0] (tiny-random-minicpmv-2_6), not the Qwen2-VL models that use M-RoPE with 3D position_ids.

According to custom coding guideline 1000000 rule 9: "Tests have been updated or added to cover the new code." Since this fix addresses M-RoPE-specific position_ids handling, tests should be added or updated to verify that Qwen2-VL models work correctly with continuous batching, particularly in the generation phase where 3D position_ids [3, 1, N] are created.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +502 to +505
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();

Copilot AI Feb 26, 2026

The hardcoded check for shape[0] == 3 and shape[1] == 1 is specific to M-RoPE (Multimodal Rotary Position Embedding) used by Qwen2-VL models. This is fragile because:

  1. It assumes M-RoPE always uses exactly 3 dimensions and a batch size of 1
  2. Other models with 3D position_ids but different shapes would bypass this logic
  3. There's no documentation explaining why these specific values are checked

Consider:

  • Adding a comment explaining that these values are M-RoPE specific constants
  • Adding a model capability flag or configuration to indicate M-RoPE support rather than relying on shape inference
  • Validating that these constants match the actual M-RoPE implementation in Qwen2VL (currently hardcoded as shape [3, 1, N] in qwen2vl/classes.cpp:1208)
Suggested change
- if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
-     position_ids.get_shape()[1] == 1) {
-     // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
-     const auto& position_ids_shape = position_ids.get_shape();
+ // Qwen2-VL M-RoPE compatibility:
+ // Qwen2-VL uses Multimodal Rotary Position Embedding (M-RoPE), where position_ids
+ // are produced with a fixed 3D layout [3, 1, total_token_num]. The leading dimension (3)
+ // corresponds to M-RoPE's internal components and the middle dimension is a pseudo-batch
+ // of size 1. The model implementation (see qwen2vl/classes.cpp, around line 1208) expects
+ // a squeezed layout [3, total_token_num] at inference time.
+ //
+ // To support Qwen2-VL without changing other models, handle only this specific M-RoPE
+ // shape by removing the pseudo-batch dimension.
+ constexpr std::size_t qwen2vl_mrope_components = 3; // M-RoPE internal components
+ constexpr std::size_t qwen2vl_mrope_batch = 1;      // pseudo-batch dimension
+ const auto& position_ids_shape = position_ids.get_shape();
+ if (position_ids_shape.size() == 3 &&
+     position_ids_shape[0] == qwen2vl_mrope_components &&
+     position_ids_shape[1] == qwen2vl_mrope_batch) {
+     // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
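
The capability-flag alternative mentioned in the comment above could look roughly like the following sketch. It is only an illustration under stated assumptions: `PositionIdsLayout`, its fields, and `prepare_position_ids_shape` are hypothetical names that do not exist in GenAI, and shapes are modeled as plain `std::vector` values instead of `ov::Shape`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Hypothetical per-model description; not part of the GenAI API.
struct PositionIdsLayout {
    bool uses_mrope = false;           // assumption: filled in from the model config
    std::size_t mrope_components = 3;  // temporal, height, width
};

// Squeeze [components, 1, N] -> [components, N] only for models that are
// explicitly flagged as M-RoPE, instead of inferring it from the shape alone.
Shape prepare_position_ids_shape(const Shape& shape, const PositionIdsLayout& layout) {
    assert(layout.mrope_components > 0);
    const bool is_mrope_shape = shape.size() == 3 &&
                                shape[0] == layout.mrope_components &&
                                shape[1] == 1;
    if (layout.uses_mrope && is_mrope_shape) {
        return {shape[0], shape[2]};
    }
    return shape;
}
```

With an explicit flag, a tensor from a different model that happens to have shape [3, 1, N] is left untouched. The trade-off is that the flag has to be plumbed from the model configuration down to the batching code.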

Copilot uses AI. Check for mistakes.
Comment on lines +502 to 507
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();
    position_ids.set_shape({position_ids_shape[0], position_ids_shape[2]});
}

Copilot AI Feb 26, 2026

The PR description is incomplete and does not follow the template. According to the custom coding guideline 1000000, rule 1: "PR description must be aligned with pull_request_template.md and its checklist must be filled out."

The description should include:

  • A summary of the change (M-RoPE position_ids shape handling fix)
  • Why this change is needed
  • Whether tests have been added or updated
  • Ticket information if applicable
  • All checklist items must be completed

Copilot generated this review using guidance from repository custom instructions.
@CuriousPanCake
Contributor Author

Requires: openvinotoolkit/openvino#34365

@CuriousPanCake
Contributor Author

The tests will pass once we merge the runtime PR: openvinotoolkit/openvino#34365

@yatarkan yatarkan mentioned this pull request Mar 6, 2026
3 tasks
@yatarkan yatarkan added this to the 2026.1 milestone Mar 9, 2026
Copilot AI review requested due to automatic review settings March 11, 2026 09:14
@github-actions github-actions bot added the category: visual language Visual language pipeline label Mar 11, 2026
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment on lines +640 to 644
  // TODO: remove it when GEMMA3 ticket-171180 is fixed
  bool requires_sdpa(const std::filesystem::path& models_dir) {
      auto vlm_config = utils::from_config_json_if_exists<VLMConfig>(models_dir, "config.json");
-     return vlm_config.model_type == VLMModelType::QWEN2_VL ||
-            vlm_config.model_type == VLMModelType::QWEN2_5_VL ||
-            vlm_config.model_type == VLMModelType::GEMMA3;
+     return vlm_config.model_type == VLMModelType::GEMMA3;
  }

Copilot AI Mar 11, 2026

PR description does not follow the repository PR template: the checklist items are missing and the ticket placeholder format (CVS-###) is not used. Please update the PR description to match .github/pull_request_template.md (include the checklist with checked/unchecked items and the CVS ticket line).

Copilot uses AI. Check for mistakes.
Comment on lines +517 to 522
if (position_ids.get_shape().size() == 3 && position_ids.get_shape()[0] == 3 &&
    position_ids.get_shape()[1] == 1) {
    // M-RoPE: squeeze pseudo-batch dim [3, 1, total_token_num] -> [3, total_token_num]
    const auto& position_ids_shape = position_ids.get_shape();
    position_ids.set_shape({position_ids_shape[0], position_ids_shape[2]});
}

Copilot AI Mar 11, 2026

The new M-RoPE position_ids shape adjustment ([3, 1, total_token_num] -> [3, total_token_num]) is a behavior change in the ContinuousBatching inference path, but there does not appear to be any C++ test coverage asserting the expected position_ids rank/shape for VLM/Qwen CB runs. Please add a regression test that fails if 3D position_ids are flattened or otherwise reshaped incorrectly.
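
One possible shape for the check requested above is sketched here. It is not tied to the repository's test framework and does not run a real pipeline: `mrope_layout_preserved` is a hypothetical predicate that a regression test could assert on, given the [3, 1, N] shape created on the GenAI side and the shape actually handed to the model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Hypothetical predicate for a regression test: returns true only if the
// shape passed to the model is the squeezed M-RoPE layout [3, N] that
// corresponds to the produced [3, 1, N] shape.
bool mrope_layout_preserved(const Shape& produced, const Shape& passed_to_model) {
    // Precondition: the producer side emits the documented [3, 1, N] layout.
    assert(produced.size() == 3 && produced[0] == 3 && produced[1] == 1);
    return passed_to_model.size() == 2 &&
           passed_to_model[0] == produced[0] &&
           passed_to_model[1] == produced[2];
}
```

The predicate rejects both the old flattened shape [3*N] and an unsqueezed [3, 1, N], so a test built on it would fail if either regression were reintroduced.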

Copilot generated this review using guidance from repository custom instructions.
@yatarkan yatarkan enabled auto-merge March 11, 2026 11:37
@yatarkan yatarkan added this pull request to the merge queue Mar 12, 2026
Merged via the queue into openvinotoolkit:master with commit 603d66f Mar 12, 2026
173 of 178 checks passed