Hello!
Bug Report overview
- Exporting Qwen/Qwen3-Embedding-0.6B to OpenVINO results in `nan` and various warnings.
Details
Running the following script results in all `nan`'s:
from optimum.intel.openvino import OVModelForFeatureExtraction
from transformers import AutoTokenizer

model = OVModelForFeatureExtraction.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")

sentences = ["This is an example sentence", "Each sentence is converted"]
for sentence in sentences:
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state)
Importing `MambaCache` from `transformers.cache_utils` is deprecated and will be removed in a future version. Please import it from `transformers` or `transformers.models.mamba.cache_mamba` instead.
No OpenVINO files were found for Qwen/Qwen3-Embedding-0.6B, setting `export=True` to convert the model to the OpenVINO IR. Don't forget to save the resulting model with `.save_pretrained()`
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
[sic]\transformers\masking_utils.py:190: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if (padding_length := kv_length + kv_offset - attention_mask.shape[-1]) > 0:
[sic]\transformers\masking_utils.py:218: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if padding_mask is not None and padding_mask.shape[-1] > kv_length:
[sic]\transformers\integrations\sdpa_attention.py:82: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
is_causal = query.shape[2] > 1 and attention_mask is None and getattr(module, "is_causal", True)
tensor([[[nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan]]])
tensor([[[nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan]]])
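One variable that may be worth ruling out is the runtime inference precision: on recent Xeon CPUs, OpenVINO can execute in bf16 by default, which overflows more easily than PyTorch's fp32. A minimal sketch that pins inference to f32, assuming the `ov_config` argument and OpenVINO's standard `INFERENCE_PRECISION_HINT` property apply here (I haven't confirmed that this changes the result):

from optimum.intel.openvino import OVModelForFeatureExtraction
from transformers import AutoTokenizer

# Hypothetical diagnostic: pin OpenVINO inference to f32 to rule out a
# bf16 overflow at runtime. INFERENCE_PRECISION_HINT is a standard
# OpenVINO property; whether it affects this NaN issue is unverified.
model = OVModelForFeatureExtraction.from_pretrained(
    "Qwen/Qwen3-Embedding-0.6B",
    export=True,
    ov_config={"INFERENCE_PRECISION_HINT": "f32"},
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")

inputs = tokenizer("This is an example sentence", return_tensors="pt")
print(model(**inputs).last_hidden_state)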
I would expect this to approximately match the results with `AutoModel`:
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")

sentences = ["This is an example sentence", "Each sentence is converted"]
for sentence in sentences:
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state)
tensor([[[  2.8325, -17.7200,   0.1776,  ...,  -6.9551, -12.0463,   0.8086],
         [ -0.2501,  -5.6465,  -1.3577,  ...,   1.5168,  -0.8063,  -3.2892],
         [  1.7796,   0.4740,  -1.3742,  ...,  -1.9026,  -0.6497,  -0.7862],
         [  3.3753,  -5.5741,  -1.3082,  ...,  -2.4205,  -0.3497,  -2.8036],
         [ -0.5573,  -7.5063,  -0.8194,  ...,  -0.1285,   2.5724,  -3.2815],
         [ -4.4455,  -1.7790,  -0.9880,  ...,   1.1086,   4.6749,  -1.2746]]],
       grad_fn=<MulBackward0>)
tensor([[[ 2.5925e+00, -6.6789e+00,  9.9559e-03,  ..., -6.1397e+00,
          -1.2755e+01,  3.6784e-01],
         [-5.9988e-01, -1.1165e+01, -8.8903e-01,  ...,  1.1954e+00,
           4.8131e-01, -9.6972e-01],
         [-1.5776e+00, -7.8559e+00, -1.2252e+00,  ..., -1.7063e+00,
           9.8415e-01,  2.0721e+00],
         [ 2.1419e+00, -1.1065e+01, -1.0917e+00,  ..., -5.8802e-01,
          -3.2069e+00, -4.4548e+00],
         [ 2.0169e-01,  1.6636e+00, -1.0134e+00,  ..., -6.3104e-01,
           4.3270e+00, -1.5783e+00]]], grad_fn=<MulBackward0>)
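To make the mismatch easy to check programmatically, here is a small sketch that runs the same input through both backends and flags non-finite values (this assumes the OpenVINO model returns torch tensors for torch inputs, which matches what I see above):

import torch
from optimum.intel.openvino import OVModelForFeatureExtraction
from transformers import AutoModel, AutoTokenizer

model_id = "Qwen/Qwen3-Embedding-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pt_model = AutoModel.from_pretrained(model_id)
ov_model = OVModelForFeatureExtraction.from_pretrained(model_id)  # exports on the fly

inputs = tokenizer("This is an example sentence", return_tensors="pt")
with torch.no_grad():
    pt_out = pt_model(**inputs).last_hidden_state
ov_out = ov_model(**inputs).last_hidden_state

# With the outputs above, the PyTorch result is finite while the
# OpenVINO result is all-NaN, so the second line prints False.
print("pt finite:", torch.isfinite(pt_out).all().item())
print("ov finite:", torch.isfinite(ov_out).all().item())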
I'm using transformers==4.55.4 and optimum-intel==1.25.2.
See UKPLab/sentence-transformers#3515 for more details. This is affecting an attempt to convert a Sentence Transformer model to OpenVINO.
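For context, the Sentence Transformers code path that hits this is the OpenVINO backend, which loads the model through OVModelForFeatureExtraction under the hood. A minimal sketch of that path, using the `backend="openvino"` option from the sentence-transformers docs:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B", backend="openvino")
# If the export bug above applies here as well, these embeddings come
# back as NaN instead of finite floats.
embeddings = model.encode(["This is an example sentence"])
print(embeddings)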
cc @santhoshtr
- Tom Aarsen