Description
Describe the issue
I converted a PyTorch model to ONNX using torch.onnx.export() with the dynamo=True parameter. The ONNX model is generated successfully, but creating an ORT inference session with it crashes with a segmentation fault.
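For reference, the export was done along the following lines (a minimal sketch; the model definition and input shape here are placeholders, not the real architecture):

```python
import torch

# Placeholder model and example input; the real model and shapes differ.
model = torch.nn.Sequential(torch.nn.Linear(16, 16)).eval()
example_input = torch.randn(1, 16)

# dynamo=True selects the torch.export-based ONNX exporter.
onnx_program = torch.onnx.export(model, (example_input,), dynamo=True)
onnx_program.save("model.onnx")
```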
Model files
model.onnx file -> https://drive.google.com/file/d/12FmtoHk7FxK85j-_jKiMs4_aFoVs6M3v/view
export report -> onnx_export_2025-03-18_14-12-53-706073_success.md
Verbose logs from the inference session:
2025-03-18 17:21:34.962599446 [I:onnxruntime:, inference_session.cc:590 TraceSessionOptions] Session Options { execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath:"" enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:0 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: { } }
2025-03-18 17:21:34.962641744 [I:onnxruntime:, inference_session.cc:410 operator()] Flush-to-zero and denormal-as-zero are off
2025-03-18 17:21:34.962653170 [I:onnxruntime:, inference_session.cc:418 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2025-03-18 17:21:34.962661879 [I:onnxruntime:, inference_session.cc:436 ConstructorCommon] Dynamic block base set to 0
Segmentation fault
To reproduce
The following code creates the inference session. It segfaults with CPUExecutionProvider as well:
```python
import onnx
import onnxruntime as ort
from onnxscript import ir

model_name = "model.onnx"

# Report the available device and the model's opset imports.
print(ort.get_device())
print(ir.load(model_name).opset_imports)

# The model passes the ONNX checker.
onnx_model = onnx.load(model_name)
onnx.checker.check_model(onnx_model)
print("Model is valid")

# Creating the session is what segfaults, with CUDAExecutionProvider
# and CPUExecutionProvider alike.
sess_options = ort.SessionOptions()
sess_options.log_severity_level = 0  # verbose logging

ort_sess_gpu = ort.InferenceSession(
    model_name,
    providers=["CUDAExecutionProvider"],
    sess_options=sess_options,
)
```
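One variation that may help narrow this down is creating the session with graph optimizations disabled, to see whether an optimizer pass is involved (a sketch only; I have not verified whether this avoids the crash on this model):

```python
import onnxruntime as ort

so = ort.SessionOptions()
# Disable all graph transformations to test whether an
# optimization pass triggers the crash.
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

sess = ort.InferenceSession(
    "model.onnx",
    providers=["CPUExecutionProvider"],
    sess_options=so,
)
print("Session created with optimizations disabled")
```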
Urgency
We are currently blocked on this, as we are unable to deploy the model to production.
Platform
Linux
OS Version
Debian 5.10.234-1 (2025-02-24) x86_64 GNU/Linux
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.21.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.4