Describe the issue
I want to implement inference for an ONNX model in my own C code, but in some layers the result from my C implementation and the result from ONNX Runtime differ by 1, e.g. my C code gives 40 while onnxruntime gives 41.
I want to know why numpy's result is -87 but onnxruntime's is -88?
In quantized model inference an off-by-one error is fatal: accumulated over many layers it can reach 4-5 (in 8-bit integers).
Thank you :>
The test code to reproduce is below.
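For reference, with these scales and zero points the requantized sum lands almost exactly on the 40.5 rounding boundary (about 40.500005 in float64), so the rounded result is very sensitive to the floating-point precision and rounding mode used. Below is a quick check of my own dequantize/add/requantize arithmetic in float64 and float32; this is only my math, not the actual onnxruntime kernel:
import numpy as np
# Same constants as in the reproduction script below.
A_scale, A_zero_point = 0.008010663092136383, 7
B_scale, B_zero_point = 0.00622713053599, -128
C_scale, C_zero_point = 0.006873490754514933, -128
a, b = -8, -64  # int8 input values
for dtype in (np.float64, np.float32):
    # Dequantize A and B, add, then requantize with the scales cast to this precision.
    real_sum = dtype(A_scale) * (a - A_zero_point) + dtype(B_scale) * (b - B_zero_point)
    requant = real_sum / dtype(C_scale)
    print(dtype.__name__, requant, np.round(requant) + C_zero_point)
(Note that np.round rounds exact halves to the nearest even integer, which also matters this close to the boundary.)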
To reproduce
import onnx
from onnx import helper, TensorProto, numpy_helper
import numpy as np
import onnxruntime as ort
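# Tensor names and fixed quantization parameters (scale / zero point) for A, B and the output C.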
A = 'A'
B = 'B'
C = 'C'
A_scale = 0.008010663092136383
A_zero_point = 7
B_scale = 0.00622713053599
B_zero_point = -128
C_scale = 0.006873490754514933
C_zero_point = -128
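# Build a one-node model containing the com.microsoft QLinearAdd operator.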
input_A = helper.make_tensor_value_info(A, TensorProto.INT8, [1, 1, 1, 1])
input_B = helper.make_tensor_value_info(B, TensorProto.INT8, [1, 1, 1, 1])
output = helper.make_tensor_value_info(C, TensorProto.INT8, [1, 1, 1, 1])
initializer_A_scale = numpy_helper.from_array(np.array(A_scale, dtype=np.float32), name='A_scale')
initializer_A_zero_point = numpy_helper.from_array(np.array(A_zero_point, dtype=np.int8), name='A_zero_point')
initializer_B_scale = numpy_helper.from_array(np.array(B_scale, dtype=np.float32), name='B_scale')
initializer_B_zero_point = numpy_helper.from_array(np.array(B_zero_point, dtype=np.int8), name='B_zero_point')
initializer_C_scale = numpy_helper.from_array(np.array(C_scale, dtype=np.float32), name='C_scale')
initializer_C_zero_point = numpy_helper.from_array(np.array(C_zero_point, dtype=np.int8), name='C_zero_point')
qlinear_add_node = helper.make_node(
    'QLinearAdd',
    inputs=[A, 'A_scale', 'A_zero_point', B, 'B_scale', 'B_zero_point', 'C_scale', 'C_zero_point'],
    outputs=[C],
    name='QLinearAdd',
    domain='com.microsoft'
)
opset_version_ai_onnx = 13
opset_version_com_microsoft = 1
graph = helper.make_graph(
    nodes=[qlinear_add_node],
    name='QLinearAdd_Graph',
    inputs=[input_A, input_B],
    outputs=[output],
    initializer=[
        initializer_A_scale,
        initializer_A_zero_point,
        initializer_B_scale,
        initializer_B_zero_point,
        initializer_C_scale,
        initializer_C_zero_point
    ]
)
model = helper.make_model(
    graph,
    producer_name='onnx-qlinearadd-fixed-params',
    opset_imports=[
        helper.make_opsetid(domain='ai.onnx', version=opset_version_ai_onnx),
        helper.make_opsetid(domain='com.microsoft', version=opset_version_com_microsoft)
    ]
)
onnx.save(model, 'qlinearadd_fixed_params_model.onnx')
print("ONNX MODEL save 'qlinearadd_fixed_params_model.onnx'")
A_int8 = np.array([-8], dtype=np.int8)
B_int8 = np.array([-64], dtype=np.int8)
A_real = A_scale * (A_int8.astype(np.int32) - A_zero_point)
B_real = B_scale * (B_int8.astype(np.int32) - B_zero_point)
C_real = A_real + B_real
A1 = A_scale * (A_int8 - A_zero_point)
B1 = B_scale * (B_int8 - B_zero_point)
print((A1 + B1) / C_scale + C_zero_point)
C_int32 = np.round(C_real / C_scale) + C_zero_point
C_int8 = C_int32.astype(np.int8)
print(C_int8)
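# Run the same int8 inputs through onnxruntime for comparison.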
session = ort.InferenceSession('qlinearadd_fixed_params_model.onnx')
output_name = session.get_outputs()[0].name
A_data = np.array([-8], dtype=np.int8).reshape([1, 1, 1, 1])
B_data = np.array([-64], dtype=np.int8).reshape([1, 1, 1, 1])
input_dict = {
    'A': A_data,
    'B': B_data
}
outputs = session.run([output_name], input_dict)
C_output = outputs[0]
print("output C:", C_output)
Urgency
No response
Platform
Windows
OS Version
11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
onnxruntime==1.19.2 (Python package)
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response