Qwen3 OpenVINO support failing #1446

@tomaarsen

Description

Hello!

Bug Report overview

  • Exporting Qwen/Qwen3-Embedding-0.6B to OpenVINO results in all-NaN outputs and various warnings.

Details

Running the following script results in all NaNs:

from optimum.intel.openvino import OVModelForFeatureExtraction
from transformers import AutoTokenizer

model = OVModelForFeatureExtraction.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")

sentences = ["This is an example sentence", "Each sentence is converted"]

for sentence in sentences:
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state)

Output:

Importing `MambaCache` from `transformers.cache_utils` is deprecated and will be removed in a future version. Please import it from `transformers` or `transformers.models.mamba.cache_mamba` instead.
No OpenVINO files were found for Qwen/Qwen3-Embedding-0.6B, setting `export=True` to convert the model to the OpenVINO IR. Don't forget to save the resulting model with `.save_pretrained()`
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
[sic]\transformers\masking_utils.py:190: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (padding_length := kv_length + kv_offset - attention_mask.shape[-1]) > 0:
[sic]\transformers\masking_utils.py:218: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if padding_mask is not None and padding_mask.shape[-1] > kv_length:
[sic]\transformers\integrations\sdpa_attention.py:82: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!   
  is_causal = query.shape[2] > 1 and attention_mask is None and getattr(module, "is_causal", True)
tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]]])
tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]]])

I would expect this to approximately match the results with AutoModel:

from transformers import AutoTokenizer
from transformers import AutoModel

model = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")

sentences = ["This is an example sentence", "Each sentence is converted"]

for sentence in sentences:
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state)

Output:

tensor([[[  2.8325, -17.7200,   0.1776,  ...,  -6.9551, -12.0463,   0.8086],
         [ -0.2501,  -5.6465,  -1.3577,  ...,   1.5168,  -0.8063,  -3.2892],
         [  1.7796,   0.4740,  -1.3742,  ...,  -1.9026,  -0.6497,  -0.7862],
         [  3.3753,  -5.5741,  -1.3082,  ...,  -2.4205,  -0.3497,  -2.8036],
         [ -0.5573,  -7.5063,  -0.8194,  ...,  -0.1285,   2.5724,  -3.2815],
         [ -4.4455,  -1.7790,  -0.9880,  ...,   1.1086,   4.6749,  -1.2746]]],
       grad_fn=<MulBackward0>)
tensor([[[ 2.5925e+00, -6.6789e+00,  9.9559e-03,  ..., -6.1397e+00,
          -1.2755e+01,  3.6784e-01],
         [-5.9988e-01, -1.1165e+01, -8.8903e-01,  ...,  1.1954e+00,
           4.8131e-01, -9.6972e-01],
         [-1.5776e+00, -7.8559e+00, -1.2252e+00,  ..., -1.7063e+00,
           9.8415e-01,  2.0721e+00],
         [ 2.1419e+00, -1.1065e+01, -1.0917e+00,  ..., -5.8802e-01,
          -3.2069e+00, -4.4548e+00],
         [ 2.0169e-01,  1.6636e+00, -1.0134e+00,  ..., -6.3104e-01,
           4.3270e+00, -1.5783e+00]]], grad_fn=<MulBackward0>)
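
For reference, a quick parity check between the two backends could look like this (a sketch; single sentence, no pooling):

import torch
from optimum.intel.openvino import OVModelForFeatureExtraction
from transformers import AutoModel, AutoTokenizer

model_id = "Qwen/Qwen3-Embedding-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pt_model = AutoModel.from_pretrained(model_id)
ov_model = OVModelForFeatureExtraction.from_pretrained(model_id)

inputs = tokenizer("This is an example sentence", return_tensors="pt")
with torch.no_grad():
    pt_out = pt_model(**inputs).last_hidden_state
ov_out = ov_model(**inputs).last_hidden_state

# The OpenVINO output is currently all NaN; once fixed, the max abs diff should be small
print("NaNs in OpenVINO output:", torch.isnan(ov_out).any().item())
print("max abs diff vs transformers:", (pt_out - ov_out).abs().max().item())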

I'm using transformers==4.55.4 and optimum-intel==1.25.2.
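
One thing that might be worth testing (an assumption on my part, not verified): if the NaNs come from reduced-precision inference, forcing f32 execution via `ov_config` should make them disappear:

from optimum.intel.openvino import OVModelForFeatureExtraction

# Assumption: the NaNs may stem from fp16/bf16 execution; f32 here is a diagnostic, not a fix
model = OVModelForFeatureExtraction.from_pretrained(
    "Qwen/Qwen3-Embedding-0.6B",
    ov_config={"INFERENCE_PRECISION_HINT": "f32"},
)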

See UKPLab/sentence-transformers#3515 for more details. This is affecting an attempt to convert a Sentence Transformer model to OpenVINO.
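
For context, the Sentence Transformers path that surfaces this routes through the same optimum-intel export (a sketch, assuming sentence-transformers >= 3.2 with its OpenVINO backend):

from sentence_transformers import SentenceTransformer

# backend="openvino" loads the model via optimum-intel's OVModelForFeatureExtraction,
# so it hits the same all-NaN outputs as the direct reproduction above
model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B", backend="openvino")
embeddings = model.encode(["This is an example sentence", "Each sentence is converted"])
print(embeddings)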

cc @santhoshtr

- Tom Aarsen
