
Creating ORT inference session from onnx model gives segmentation fault #24087

Description

@jayakommuru

Describe the issue

I have converted a PyTorch model to ONNX using torch.onnx.export() with the dynamo=True parameter. The ONNX model is generated successfully, but creating an ORT inference session from it crashes with a segmentation fault.
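For context, the export was done along these lines (a minimal sketch; the module and input shapes below are placeholders, not the actual model, which is linked under "Model files"):

import torch

# Placeholder module and inputs; the real model is the one attached to this issue.
model = torch.nn.Linear(16, 4).eval()
example_inputs = (torch.randn(1, 16),)

# dynamo=True selects the torch.export-based (dynamo) ONNX exporter.
torch.onnx.export(model, example_inputs, "model.onnx", dynamo=True)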

Model files
model.onnx file -> https://drive.google.com/file/d/12FmtoHk7FxK85j-_jKiMs4_aFoVs6M3v/view
export report -> onnx_export_2025-03-18_14-12-53-706073_success.md

Verbose logs from the inference session:

2025-03-18 17:21:34.962599446 [I:onnxruntime:, inference_session.cc:590 TraceSessionOptions] Session Options {  execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath:"" enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:0 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: {  } }
2025-03-18 17:21:34.962641744 [I:onnxruntime:, inference_session.cc:410 operator()] Flush-to-zero and denormal-as-zero are off
2025-03-18 17:21:34.962653170 [I:onnxruntime:, inference_session.cc:418 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2025-03-18 17:21:34.962661879 [I:onnxruntime:, inference_session.cc:436 ConstructorCommon] Dynamic block base set to 0
Segmentation fault
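To get at least a Python-level traceback past the bare "Segmentation fault", one option is the standard-library faulthandler module (a general debugging aid, not specific to ONNX Runtime):

import faulthandler
faulthandler.enable()  # dump the Python traceback if the process receives SIGSEGV

import onnxruntime as ort
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])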

To reproduce

The following code creates the inference session. It gives a segmentation fault with CPUExecutionProvider as well.

import onnx
import onnxruntime as ort
from onnxscript import ir

model_name = "model.onnx"

sess_options = ort.SessionOptions()
sess_options.log_severity_level = 0  # most verbose logging

print(ort.get_device())

# Print the opset versions the model imports
ir_model = ir.load(model_name)
print(ir_model.opset_imports)

# The model passes the ONNX checker
onnx_model = onnx.load(model_name)
onnx.checker.check_model(onnx_model)
print("Model is valid")

# Creating the session is what segfaults
ort_sess_gpu = ort.InferenceSession(
    model_name,
    providers=["CUDAExecutionProvider"],
    sess_options=sess_options,
)
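As a triage step, it may be worth checking whether the crash still occurs with graph optimizations disabled; if it does not, the fault is likely in one of the graph transforms (this is an assumption to narrow things down, not a fix):

import onnxruntime as ort

sess_options = ort.SessionOptions()
# Disable all graph optimizations to rule out a crashing transform.
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

sess = ort.InferenceSession(
    "model.onnx",
    sess_options=sess_options,
    providers=["CPUExecutionProvider"],
)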

Urgency

Currently blocked on this, as we are unable to deploy the model to production.

Platform

Linux

OS Version

Debian 5.10.234-1 (2025-02-24) x86_64 GNU/Linux

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.21.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 12.4
