Open
Description
Describe the issue
I converted a PyTorch model to ONNX using torch.onnx.export() with the dynamo=True parameter. The ONNX model is generated successfully, but creating an ORT inference session with it crashes with a segmentation fault.
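For reference, the export call had roughly this shape (a minimal sketch; TinyModel and the input shape are placeholders, not the real model, and report=True is assumed here because a markdown export report was produced):

import torch

class TinyModel(torch.nn.Module):  # placeholder for the actual model
    def forward(self, x):
        return torch.nn.functional.relu(x)

model = TinyModel().eval()
example_inputs = (torch.randn(1, 3, 224, 224),)  # placeholder input shape

torch.onnx.export(
    model,
    example_inputs,
    "model.onnx",
    dynamo=True,  # use the dynamo/torch.export-based exporter
    report=True,  # writes onnx_export_<timestamp>_<status>.md
)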
Model files
model.onnx file -> https://drive.google.com/file/d/12FmtoHk7FxK85j-_jKiMs4_aFoVs6M3v/view
export report -> onnx_export_2025-03-18_14-12-53-706073_success.md
Verbose logs from the inference session:
2025-03-18 17:21:34.962599446 [I:onnxruntime:, inference_session.cc:590 TraceSessionOptions] Session Options { execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath:"" enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:0 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: { } }
2025-03-18 17:21:34.962641744 [I:onnxruntime:, inference_session.cc:410 operator()] Flush-to-zero and denormal-as-zero are off
2025-03-18 17:21:34.962653170 [I:onnxruntime:, inference_session.cc:418 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2025-03-18 17:21:34.962661879 [I:onnxruntime:, inference_session.cc:436 ConstructorCommon] Dynamic block base set to 0
Segmentation fault
To reproduce
Creating the inference session with the following code segfaults. It also segfaults with CPUExecutionProvider (see the CPU-only sketch after the code).
import onnx
import onnxruntime as ort
from onnxscript import ir

model_name = "model.onnx"

print(ort.get_device())

# Inspect the opset versions the exported model imports
ir_model = ir.load(model_name)
print(ir_model.opset_imports)

# The model passes the ONNX checker, so the file itself is well-formed
onnx_model = onnx.load(model_name)
onnx.checker.check_model(onnx_model)
print("Model is valid")

# Creating the session crashes before any inference is run
sess_options = ort.SessionOptions()
sess_options.log_severity_level = 0  # verbose logging
ort_sess_gpu = ort.InferenceSession(
    model_name,
    providers=["CUDAExecutionProvider"],
    sess_options=sess_options,
)
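The crash is not specific to the CUDA EP: the same segmentation fault occurs when only the CPU execution provider is requested, as in this minimal sketch:

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.log_severity_level = 0  # verbose logging, same as above

# Still segfaults at session creation with only the CPU execution provider
ort_sess_cpu = ort.InferenceSession(
    "model.onnx",
    providers=["CPUExecutionProvider"],
    sess_options=sess_options,
)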
Urgency
Currently blocked on this, as we are unable to deploy this model to production.
Platform
Linux
OS Version
Debian 5.10.234-1 (2025-02-24) x86_64 GNU/Linux
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.21.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.4