Describe the issue
If a GPT-like ONNX model contains SkipLayerNormalization nodes, session creation fails with the OpenVINOExecutionProvider; the same model loads and runs correctly with other execution providers such as CPU, CUDA, and DML. The following is the error message:
line 561, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime/onnxruntime/core/providers/openvino/ov_interface.cc:79 std::shared_ptr<ov::Model> onnxruntime::openvino_ep::OVCore::ReadModel(std::string&&, const std::string&) [OpenVINO-EP] [OpenVINO-EP] Exception while Reading network: Check 'onnx_node.get_outputs_size() <= outputs_size' failed at src/frontends/onnx/frontend/src/core/graph.cpp:392:
FrontEnd API failed with GeneralFailure:
Expected output number of SkipLayerNormalization node is 4 while the implementation provides 1 outputs
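To confirm which nodes trigger the failure, a quick tally of op types in the graph can help (a minimal sketch, assuming the `onnx` Python package and the attached model file are available; `count_ops` is my own helper name, not part of any API):

```python
from collections import Counter

def count_ops(graph_nodes):
    """Tally op types from an iterable of ONNX NodeProto-like objects."""
    return Counter(node.op_type for node in graph_nodes)

# Usage against the attached model (requires the `onnx` package):
#   import onnx
#   model = onnx.load("Whisper_Encoder.onnx")
#   print(count_ops(model.graph.node)["SkipLayerNormalization"])
```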
To reproduce
import onnxruntime

onnx_model_A = "Whisper_Encoder.onnx"

provider_options = [
    {
        'device_type': 'CPU',
        'precision': 'ACCURACY',
        'num_of_threads': 1,
        'num_streams': 1,
        'enable_opencl_throttling': True,
        'enable_qdq_optimizer': False,
        'disable_dynamic_shapes': False
    }
]

session_opts = onnxruntime.SessionOptions()
session_opts.log_severity_level = 0

ort_session_A = onnxruntime.InferenceSession(
    onnx_model_A,
    sess_options=session_opts,
    providers=['OpenVINOExecutionProvider'],
    provider_options=provider_options,
)
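Since the check complains that the node declares 4 outputs while the OpenVINO translator provides only 1, one untested workaround may be to strip the optional trailing outputs of each SkipLayerNormalization node when nothing in the graph consumes them. This is a hypothetical graph-surgery sketch with the `onnx` package; `trim_skipln_outputs` and the saved filename are my own names, not part of the report:

```python
def trim_skipln_outputs(model):
    """Drop unused trailing (optional) outputs of SkipLayerNormalization
    nodes. Sketch only: assumes the optional outputs beyond the first are
    not consumed anywhere else in the graph."""
    # Collect every tensor name the graph actually consumes.
    used = set()
    for node in model.graph.node:
        used.update(node.input)
    used.update(out.name for out in model.graph.output)

    for node in model.graph.node:
        if node.op_type != "SkipLayerNormalization":
            continue
        keep = 1  # the first output is required
        for i, name in enumerate(node.output):
            if i > 0 and name in used:
                keep = i + 1  # a later output is consumed; keep up to it
        del node.output[keep:]

# Usage (requires the `onnx` package; output filename is hypothetical):
#   import onnx
#   model = onnx.load("Whisper_Encoder.onnx")
#   trim_skipln_outputs(model)
#   onnx.save(model, "Whisper_Encoder_trimmed.onnx")
```

Whether the OpenVINO frontend then accepts the node is something I have not verified; if not, exporting the encoder without the fused contrib ops would be the remaining option.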
Urgency
No response
Platform
Linux
OS Version
Ubuntu 24.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.22.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
OpenVINO
Execution Provider Library Version
1.22.0
Model File
Whisper_Encoder.zip
Is this a quantized model?
No