This issue tracks the confirmed bug reported in VULN-143221.
Description
A vulnerability exists in ONNX Runtime (tested on versions 1.19.2 and 1.20.1) where setting an explicit graph_optimization_level in onnxruntime.SessionOptions can cause significant inconsistencies in inference results. When running the same model with identical inputs, the outputs differ depending on whether the optimization level is explicitly configured (e.g., ORT_DISABLE_ALL) or left at the default.
This behavior undermines the reliability of ONNX Runtime, particularly in scenarios where consistent outputs are critical for model validation, deployment, and production environments.
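For reference, below is a minimal sketch of the API involved (not part of the original report). The helper run_at_level and its arguments are illustrative; per the ONNX Runtime documentation, the default level when none is set is ORT_ENABLE_ALL, so "default" and ORT_DISABLE_ALL execute different graphs.
import onnxruntime as ort
# The optimization levels exposed by onnxruntime.GraphOptimizationLevel.
LEVELS = [
    ort.GraphOptimizationLevel.ORT_DISABLE_ALL,      # no graph optimizations
    ort.GraphOptimizationLevel.ORT_ENABLE_BASIC,     # basic rewrites (e.g. constant folding)
    ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED,  # adds more aggressive fusions
    ort.GraphOptimizationLevel.ORT_ENABLE_ALL,       # everything, including layout optimizations (default)
]
def run_at_level(model_path, input_data, level):
    # Illustrative helper: run the same model at a given optimization level.
    opts = ort.SessionOptions()
    opts.graph_optimization_level = level
    sess = ort.InferenceSession(model_path, opts)
    names = [o.name for o in sess.get_outputs()]
    return sess.run(names, input_data)
Comparing outputs across all four levels, rather than only default vs. ORT_DISABLE_ALL, can show at which level the divergence first appears.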
Reproduction steps
- Step 1. Download the ONNX model (i.e., model3.onnx) with crafted structures from this link.
- Step 2. Run inference twice with the same input data:
  - Once with default settings (no explicit graph_optimization_level).
  - Once with sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL.
- Step 3. Compare the results using np.testing.assert_allclose. The outputs may differ significantly, triggering an exception if the discrepancy exceeds the specified tolerance (atol=1e-3, rtol=1e-3).
import onnxruntime as ort
import numpy as np

def test_graph_optimization_discrepancy(model_path):
    input_data = {"v10_0": np.random.rand(60).astype(np.float16)}
    # Default session
    session1 = ort.InferenceSession(model_path)
    output_names = [output.name for output in session1.get_outputs()]
    results1 = session1.run(output_names, input_data)
    # Session with explicitly disabled graph optimization
    sess_options = ort.SessionOptions()
    sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    session2 = ort.InferenceSession(model_path, sess_options)
    results2 = session2.run(output_names, input_data)
    # Compare results
    for r1, r2 in zip(results1, results2):
        np.testing.assert_allclose(r1, r2, atol=1e-3, rtol=1e-3)

test_graph_optimization_discrepancy("model3.onnx")
Callstack
Traceback (most recent call last):
  File "D:/code/python/OPTFuzz/ONNX/bugs/bug3.py", line 24, in <module>
    test_graph_optimization_discrepancy("model3.onnx")
  File "D:/code/python/OPTFuzz/ONNX/bugs/bug3.py", line 21, in test_graph_optimization_discrepancy
    np.testing.assert_allclose(r1, r2, atol=1e-3, rtol=1e-3)
  File "C:\software\conda\envs\OPTFuzz\lib\site-packages\numpy\testing\_private\utils.py", line 1592, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "C:\software\conda\envs\OPTFuzz\lib\contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "C:\software\conda\envs\OPTFuzz\lib\site-packages\numpy\testing\_private\utils.py", line 862, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=0.001
Mismatched elements: 95 / 400 (23.8%)
Max absolute difference: 0.9433594
Max relative difference: 99.145134
x: array([[[[ 6.557533e+00, 5.981650e+00, 7.282983e+00, 6.907324e+00,
8.025227e+00],
[ 7.683022e+00, 6.642226e+00, 6.947234e+00, 5.576146e+00,...
y: array([[[[ 6.557533e+00, 6.666800e+00, 7.299584e+00, 6.799902e+00,
7.440510e+00],
[ 7.629311e+00, 6.917006e+00, 7.392470e+00, 5.630254e+00,...