[Bug]: NPU compile_model fails with "to_shape was called on a dynamic shape" for Qwen3-0.6B INT4 (Lunar Lake) #34617

### OpenVINO Version

2026.0.0-20965-c6d6a13a886-releases/2026/0

### Operating System

Windows System

### Device used for inference

NPU

### Framework

None

### Model used

Qwen/Qwen3-0.6B

### Issue description

`ov.Core().compile_model(model, "NPU")` fails with a `RuntimeError` when compiling
a Qwen3-0.6B INT4 OpenVINO IR model on NPU (Intel NPU 4000, Lunar Lake). The VPUX
compiler encounters a Gather node with dynamic input dimensions (upper bounds set to
`INT64_MAX`) and raises a "non broadcastable dimensions" diagnostic, followed by a
`to_shape was called on a dynamic shape` exception.

The identical model IR compiles and runs correctly on both GPU and CPU.

### Analysis

The model IR exported by optimum-intel uses dynamic dimensions for batch and sequence
length. The compiler reports the upper bounds of these dimensions as `INT64_MAX`
(9223372036854775807), the default value used when no upper bound has been specified,
so the export left them unbounded rather than resolving them to concrete values.
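
To confirm which inputs carry unbounded dimensions before compiling, the partial shapes can be inspected directly. The following is a minimal sketch, assuming the OpenVINO Python `Dimension` accessors `is_static`, `get_length()` and `get_max_length()`, with the last one returning `-1` when no upper bound is set. `describe_dim` and `report_inputs` are hypothetical helper names, not part of OpenVINO.

```python
def describe_dim(dim):
    """Return a short label for one dimension of a partial shape."""
    if dim.is_static:
        return str(dim.get_length())
    upper = dim.get_max_length()  # assumption: -1 means no upper bound is set
    return "dynamic, unbounded" if upper < 0 else f"dynamic, <= {upper}"


def report_inputs(model):
    """Map each model input name to the labels of its dimensions."""
    report = {}
    for inp in model.inputs:
        shape = inp.get_partial_shape()
        report[inp.get_any_name()] = [
            describe_dim(shape[i]) for i in range(len(shape))
        ]
    return report
```

Calling `report_inputs(ov.Core().read_model("<output_dir>/openvino_model.xml"))` should list every input dimension that the NPU compiler would see as unbounded.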

The VPUX compiler encounters these unresolved dynamic dimensions at multiple nodes:
- `Convert_1` (embedding input)
- `Gather` (embedding lookup) — **first fatal diagnostic**
- `Power`, `ReduceMean`, `Multiply` (layer norm computations)

The Gather node triggers the critical error: the VPUX compiler attempts to broadcast
dimensions `9223372036854775807` (`INT64_MAX`) and `-9223372036854775808` (`INT64_MIN`,
consistent with `INT64_MAX + 1` wrapping around in signed 64-bit arithmetic), and the
broadcast fails. The compiler then calls `to_shape()` on the resulting dynamic
`PartialShape`; `to_shape()` requires a static shape, so it raises the exception.
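
The relationship between the two values in the diagnostic can be checked with plain integer arithmetic. This sketch only shows that the second value is what `INT64_MAX + 1` becomes under two's-complement wraparound. Which operation inside the compiler produces it is not visible in the logs, and `wrap_int64` is a hypothetical helper written for illustration.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)


def wrap_int64(value):
    """Reduce a Python int to its two's-complement signed 64-bit value."""
    return (value + 2**63) % 2**64 - 2**63


print(wrap_int64(INT64_MAX))      # 9223372036854775807
print(wrap_int64(INT64_MAX + 1))  # -9223372036854775808
```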

**Key source locations from the exception chain:**
- `src\core\src\partial_shape.cpp:266` — `to_shape()` called on dynamic shape
- `src\plugins\intel_npu\src\compiler_adapter\src\ze_graph_ext_wrappers.cpp:405` — L0 graph creation failure
- `src\plugins\intel_npu\src\plugin\src\plugin.cpp:879` — NPU plugin propagation

### Cross-Device Comparison

| Device | Result |
|---|---|
| GPU (`compile_model(model, "GPU")`) | **PASS** — compiles and runs correctly |
| CPU (`compile_model(model, "CPU")`) | **PASS** — compiles and runs correctly |
| NPU (`compile_model(model, "NPU")`) | **FAIL** — `RuntimeError: to_shape was called on a dynamic shape` |

The GPU and CPU plugins handle the dynamic dimensions correctly. The NPU plugin does not.
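
Because the NPU failure surfaces as a catchable `RuntimeError`, an application can fall back to GPU or CPU until the NPU path works. The following is a minimal sketch: `compile_with_fallback` is a hypothetical helper, and `compile_fn` is assumed to be any callable with the shape of `core.compile_model(model, device)`.

```python
def compile_with_fallback(compile_fn, model, devices=("NPU", "GPU", "CPU")):
    """Try each device in order; return (device, compiled) for the first success."""
    errors = {}
    for device in devices:
        try:
            return device, compile_fn(model, device)
        except RuntimeError as exc:
            # Keep only the last line of the nested exception chain.
            lines = str(exc).strip().splitlines()
            errors[device] = lines[-1] if lines else "unknown error"
    raise RuntimeError(f"compilation failed on all devices: {errors}")
```

With OpenVINO installed, this would be called as `device, compiled = compile_with_fallback(ov.Core().compile_model, model)`. For the model in this report it is expected to return `"GPU"` as the device.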

### Environment

| Component | Version |
|---|---|
| OpenVINO | 2026.0.0-20965-c6d6a13a886-releases/2026/0 |
| OpenVINO GenAI | 2026.0.0.0-2820-dab5b993a38 |
| optimum-intel | 1.27.0 |
| nncf | 3.0.0 |
| transformers | 4.51.3 |
| OS | Windows 11 Build 26200 |
| Hardware | Core Ultra 7 258V (Lunar Lake), Arc 140V (Xe2), NPU 4000 |
| NPU driver | 32.0.100.4514 |
| GPU driver | 32.0.101.6987 |

### Model Details

| Property | Value |
|---|---|
| Model | Qwen/Qwen3-0.6B |
| Format | OpenVINO IR (exported via optimum-intel 1.27.0) |
| Quantization | INT4, per-group (group_size=128), asymmetric |
| Model size on disk | 367.26 MB |
| Inputs | 4 (dynamic batch + sequence dimensions) |
| Outputs | 1 |
| SHA256 (openvino_model.bin) | `9d25652b603f65c5a507ef9c4d35c285bbf94e116b707aa20973cff57c2226fd` |
| SHA256 (openvino_model.xml) | `c568dab243f546ff9f259fdbd1de25091a001949fee197faa1cdd3e784ba7895` |
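
To check that a locally exported IR matches the files used in this report, the digests above can be recomputed. This is a sketch using only the Python standard library; `sha256_of` and `verify_model_files` are hypothetical helper names. A fresh export may produce different digests if tool versions differ from those in the Environment table.

```python
import hashlib
from pathlib import Path

# Digests copied from the Model Details table.
EXPECTED_SHA256 = {
    "openvino_model.bin": "9d25652b603f65c5a507ef9c4d35c285bbf94e116b707aa20973cff57c2226fd",
    "openvino_model.xml": "c568dab243f546ff9f259fdbd1de25091a001949fee197faa1cdd3e784ba7895",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight files are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_files(model_dir):
    """Return {filename: bool} indicating whether each digest matches."""
    return {
        name: sha256_of(Path(model_dir) / name) == expected
        for name, expected in EXPECTED_SHA256.items()
    }
```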

### Related Issues

This issue was discovered while investigating #34450, which affects the same model but
is a distinct bug (see the last entry in the list below).

- **#32466** — Same `INT64_MAX` upper bounds error (`Upper bounds were not specified, got the default value - '9223372036854775807'`) on NPU with SenseVoice (ONNX). Still open, assigned to `@dmatveev`. Different model and export path, same VPUX compiler failure.
- **#26375** — Same `to_shape was called on a dynamic shape` error on NPU with GNN model (dynamic batch). Closed as stale. `@YuChern-Intel` confirmed "only static shapes are supported on NPU."
- **#26357** — Same `to_shape` error on NPU with PixArt model (dynamic input dimensions). Closed as stale. Workaround suggested: reshape to static inputs.
- **#24619** — `get_shape was called on a descriptor::Tensor with dynamic shape` variant on NPU (HuggingFace classification model). Closed. `@avitial` confirmed "no dynamism support on NPU in the driver."
- **#34450** — Separate crash (LLVM ABORT in `as_convolution` pass) affecting the same Qwen3-0.6B model via the heterogeneous `LLMPipeline` path. Different failure mode: #34450 is an uncatchable SIGABRT deeper in the VPUX compiler; this issue is an earlier, catchable `RuntimeError` on the direct `compile_model("NPU")` path.
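
The workaround suggested in #26357, reshaping to static inputs, can be sketched as follows. This is an untested sketch for this particular model: it assumes that a dynamic dimension at index 0 is the batch and any other dynamic dimension is the sequence length, and it does not account for how a stateful LLM uses its attention mask across decoding steps. `to_static_dims` and `reshape_to_static` are hypothetical helpers, and whether NPU compilation succeeds afterwards has not been verified here.

```python
def to_static_dims(dims, batch, seq_len):
    """Replace dynamic entries (None) with fixed sizes.

    Assumption: a dynamic dimension at index 0 is the batch; any other
    dynamic dimension is the sequence length.
    """
    return [
        d if d is not None else (batch if i == 0 else seq_len)
        for i, d in enumerate(dims)
    ]


def reshape_to_static(model, batch=1, seq_len=128):
    """Reshape every model input in place to a fully static shape."""
    import openvino as ov  # imported lazily; only needed when a real model is passed

    new_shapes = {}
    for inp in model.inputs:
        shape = inp.get_partial_shape()
        dims = [
            shape[i].get_length() if shape[i].is_static else None
            for i in range(len(shape))
        ]
        new_shapes[inp.get_any_name()] = ov.PartialShape(
            to_static_dims(dims, batch, seq_len)
        )
    model.reshape(new_shapes)
    return model
```

The values `batch=1` and `seq_len=128` are placeholders; a fixed sequence length also means inputs must be padded or truncated to that length at inference time.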

### Step-by-step reproduction

**1. Export the model (one-time):**
```bash
optimum-cli export openvino \
  --model Qwen/Qwen3-0.6B \
  --weight-format int4 \
  --group-size 128 \
  --ratio 1.0 \
  <output_dir>
```

**2. Attempt NPU compilation:**
```python
import openvino as ov

core = ov.Core()
model = core.read_model("<output_dir>/openvino_model.xml")
print(f"Model inputs: {len(model.inputs)}, outputs: {len(model.outputs)}")
compiled = core.compile_model(model, "NPU")  # fails here
```

### Expected Behavior

The model should compile for NPU, consistent with GPU and CPU behavior.

### Actual Behavior

The VPUX compiler emits multiple `[ERROR]` diagnostics about unspecified upper bounds
on nodes with dynamic dimensions, then fails with a `RuntimeError`.
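
When triaging a long compiler log, the affected nodes can be extracted from the `[ERROR]` lines with a regular expression. The pattern below is written against the diagnostic wording seen in the log output of this report and may need adjusting for other compiler versions; `parse_bound_diagnostics` is a hypothetical helper.

```python
import re

# Matches the "Upper bounds are not specified" diagnostics from the NPU compiler log.
_BOUNDS_PATTERN = re.compile(
    r"Upper bounds are not specified for node '([^']+)' \(type '([^']+)'\): "
    r"input '(\d+)' bounds are '\[([^\]]*)\]'"
)


def parse_bound_diagnostics(log_text):
    """Return one dict per diagnostic: node name, op type, input index, bounds."""
    results = []
    for node, op_type, index, bounds in _BOUNDS_PATTERN.findall(log_text):
        results.append({
            "node": node,
            "type": op_type,
            "input": int(index),
            "bounds": [int(b) for b in bounds.split(",") if b.strip()],
        })
    return results
```

Feeding the captured stdout to this function gives a compact list of nodes whose input bounds contain `9223372036854775807`, which is easier to compare across models than the raw log.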

### Relevant log output

```shell
Full stdout:

OpenVINO: 2026.0.0-20965-c6d6a13a886-releases/2026/0
Available devices: ['CPU', 'GPU', 'NPU']
Reading model...
Model read: 4 inputs, 1 outputs
Compiling for NPU...
[ERROR] 10:33:48.885 [IE::FrontEnd::importNetwork]   Upper bounds are not specified for node '__module.model.embed_tokens/ov_ext::embedding/Convert_1' (type 'Convert'): input '0' bounds are '[9223372036854775807, 9223372036854775807]'
[ERROR] 10:33:48.885 [IE::FrontEnd::importNetwork]   Upper bounds are not specified for node '__module.model.embed_tokens/ov_ext::embedding/Gather' (type 'Gather'): input '1' bounds are '[9223372036854775807, 9223372036854775807]'
[ERROR] 10:33:48.885 [IE::FrontEnd::importNetwork]   Upper bounds are not specified for node '__module.model.layers.0.input_layernorm/aten::pow/Power' (type 'Power'): input '0' bounds are '[9223372036854775807, 9223372036854775807, 1024]'
[ERROR] 10:33:48.885 [IE::FrontEnd::importNetwork]   Upper bounds are not specified for node '__module.model.layers.0.input_layernorm/aten::mean/ReduceMean' (type 'ReduceMean'): input '0' bounds are '[9223372036854775807, 9223372036854775807, 1024]'
[ERROR] 10:33:48.885 [vpux-compiler] Got Diagnostic at loc(fused<{name = "__module.model.embed_tokens/ov_ext::embedding/Gather", type = "Gather"}>["__module.model.embed_tokens/ov_ext::embedding/Gather"]) : Got non broadcastable dimensions pair : '9223372036854775807' and -9223372036854775808'
[ERROR] 10:33:48.886 [IE::FrontEnd::importNetwork]   Upper bounds are not specified for node '__module.model.layers.0.input_layernorm/aten::mul/Multiply' (type 'Multiply'): input '0' bounds are '[9223372036854775807, 9223372036854775807, 1024]'
Python exception: Exception from src\inference\src\cpp\core.cpp:113:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\plugins\intel_npu\src\plugin\src\plugin.cpp:879:
Exception from src\plugins\intel_npu\src\compiler_adapter\src\ze_graph_ext_wrappers.cpp:405:
L0 pfnCreate2 result: ZE_RESULT_ERROR_INVALID_ARGUMENT, code 0x78000004 - generic error code for invalid arguments . [NPU_VCL] Compiler returned msg:
Exception from src\core\src\partial_shape.cpp:266:
to_shape was called on a dynamic shape.


Full stderr:

loc(fused<{name = "__module.model.embed_tokens/ov_ext::embedding/Gather", type = "Gather"}>["__module.model.embed_tokens/ov_ext::embedding/Gather"]): error: Got non broadcastable dimensions pair : '9223372036854775807' and -9223372036854775808'
Traceback (most recent call last):
  File "npu_compile_attempt.py", line 21, in <module>
    compiled = core.compile_model(model, "NPU")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "openvino/_ov_api.py", line 646, in compile_model
    super().compile_model(model, device_name, {} if config is None else config),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Exception from src\inference\src\cpp\core.cpp:113:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\plugins\intel_npu\src\plugin\src\plugin.cpp:879:
Exception from src\plugins\intel_npu\src\compiler_adapter\src\ze_graph_ext_wrappers.cpp:405:
L0 pfnCreate2 result: ZE_RESULT_ERROR_INVALID_ARGUMENT, code 0x78000004 - generic error code for invalid arguments . [NPU_VCL] Compiler returned msg:
Exception from src\core\src\partial_shape.cpp:266:
to_shape was called on a dynamic shape.
```

### Issue submission checklist

- [x] I'm reporting an issue. It's not a question.
- [x] I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
- [x] There is reproducer code and related data files such as images, videos, models, etc.
