
[Inference Error] The onnx inference result is inconsistent with the numpy inference result #23202

@songqiuyu

Describe the issue

I want to implement inference for an ONNX model in my own C code, but in some layers the C result and the ONNX result differ by 1; for example, C gives 40 where ONNX gives 41.

In the test below, why is the NumPy result -87 while the ONNX result is -88?
In quantized model inference, an error of 1 is fatal! The cumulative error across many layers can reach 4-5 (in 8-bit integers).
Thank you :>

The test code is below ⬇

To reproduce

import onnx
from onnx import helper, TensorProto, numpy_helper
import numpy as np
import onnxruntime as ort

A = 'A'
B = 'B'
C = 'C'


A_scale = 0.008010663092136383
A_zero_point = 7
B_scale = 0.00622713053599
B_zero_point = -128
C_scale = 0.006873490754514933
C_zero_point = -128


input_A = helper.make_tensor_value_info(A, TensorProto.INT8, [1, 1, 1, 1])
input_B = helper.make_tensor_value_info(B, TensorProto.INT8, [1, 1, 1, 1])


output = helper.make_tensor_value_info(C, TensorProto.INT8, [1, 1, 1, 1])


initializer_A_scale = numpy_helper.from_array(np.array(A_scale, dtype=np.float32), name='A_scale')
initializer_A_zero_point = numpy_helper.from_array(np.array(A_zero_point, dtype=np.int8), name='A_zero_point')

initializer_B_scale = numpy_helper.from_array(np.array(B_scale, dtype=np.float32), name='B_scale')
initializer_B_zero_point = numpy_helper.from_array(np.array(B_zero_point, dtype=np.int8), name='B_zero_point')

initializer_C_scale = numpy_helper.from_array(np.array(C_scale, dtype=np.float32), name='C_scale')
initializer_C_zero_point = numpy_helper.from_array(np.array(C_zero_point, dtype=np.int8), name='C_zero_point')



qlinear_add_node = helper.make_node(
    'QLinearAdd',
    inputs=[A, 'A_scale', 'A_zero_point', B, 'B_scale', 'B_zero_point', 'C_scale', 'C_zero_point'],
    outputs=[C],
    name='QLinearAdd',
    domain='com.microsoft'
)
opset_version_ai_onnx = 13
opset_version_com_microsoft = 1

graph = helper.make_graph(
    nodes=[qlinear_add_node],
    name='QLinearAdd_Graph',
    inputs=[input_A, input_B],
    outputs=[output],
    initializer=[
        initializer_A_scale,
        initializer_A_zero_point,
        initializer_B_scale,
        initializer_B_zero_point,
        initializer_C_scale,
        initializer_C_zero_point
    ]
)


model = helper.make_model(
    graph,
    producer_name='onnx-qlinearadd-fixed-params',
    opset_imports=[
        helper.make_opsetid(domain='ai.onnx', version=opset_version_ai_onnx),
        helper.make_opsetid(domain='com.microsoft', version=opset_version_com_microsoft)
    ]
)
onnx.save(model, 'qlinearadd_fixed_params_model.onnx')
print("ONNX MODEL save 'qlinearadd_fixed_params_model.onnx'")


A_int8 = np.array([-8], dtype=np.int8)
B_int8 = np.array([-64], dtype=np.int8)


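# NumPy reference: dequantize both inputs. The scales here are Python floats
# (float64), while the model initializers above store them as float32.
A_real = A_scale * (A_int8.astype(np.int32) - A_zero_point)
B_real = B_scale * (B_int8.astype(np.int32) - B_zero_point)


C_real = A_real + B_real

# Same computation without the int32 cast, printed before rounding.
A1 = A_scale * (A_int8 - A_zero_point)
B1 = B_scale * (B_int8 - B_zero_point)

print((A1 + B1) / C_scale + C_zero_point)

# Requantize: np.round rounds exact halves to the nearest even value.
C_int32 = np.round(C_real / C_scale) + C_zero_point
C_int8 = C_int32.astype(np.int8)
print(C_int8)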
session = ort.InferenceSession('qlinearadd_fixed_params_model.onnx')


output_name = session.get_outputs()[0].name

A_data = np.array([-8], dtype=np.int8).reshape([1, 1, 1, 1])
B_data = np.array([-64], dtype=np.int8).reshape([1, 1, 1, 1])


input_dict = {
    'A': A_data,
    'B': B_data
}


outputs = session.run([output_name], input_dict)


C_output = outputs[0]
print("output C:", C_output)
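
One way to probe a one-step mismatch like this is to repeat the reference arithmetic at different floating-point precisions. With these inputs the unrounded value (A_real + B_real) / C_scale is about 40.5, very close to a rounding tie, so small differences in precision or operation order may change which integer it rounds to. The sketch below is only a diagnostic under assumptions: `qlinear_add_reference` is a hypothetical helper, not an ONNX Runtime API, and whether the QLinearAdd kernel uses this exact order of operations is not established in this report.

```python
import numpy as np

def qlinear_add_reference(a, b, a_scale, a_zp, b_scale, b_zp, c_scale, c_zp, dtype):
    """Hypothetical helper: dequantize, add, and requantize entirely in `dtype`."""
    # Casting every operand to `dtype` keeps the whole chain at that precision.
    a_real = dtype(a_scale) * dtype(int(a) - a_zp)
    b_real = dtype(b_scale) * dtype(int(b) - b_zp)
    scaled = (a_real + b_real) / dtype(c_scale)
    # np.rint rounds exact halves to the nearest even integer.
    q = int(np.rint(scaled)) + c_zp
    # Saturate to the int8 range instead of wrapping around.
    return float(scaled), int(np.clip(q, -128, 127))

# Quantization parameters copied from the script above.
params = dict(
    a_scale=0.008010663092136383, a_zp=7,
    b_scale=0.00622713053599, b_zp=-128,
    c_scale=0.006873490754514933, c_zp=-128,
)

for dtype in (np.float64, np.float32):
    scaled, q = qlinear_add_reference(-8, -64, dtype=dtype, **params)
    print(f"{dtype.__name__}: unrounded={scaled!r} quantized={q}")
```

Comparing the two printed lines shows how close the unrounded value sits to 40.5 at each precision. If the float32 line differs from the float64 line, precision alone could account for a one-step difference; if it does not, the order of operations inside the kernel would be the next thing to check.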

Urgency

No response

Platform

Windows

OS Version

11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

onnxruntime==1.19.2 python

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response


Labels

quantization (issues related to quantization), stale (issues that have not been addressed in a while; categorized by a bot)
