Labels
stale (issues that have not been addressed in a while; categorized by a bot)
Description
Describe the issue
I've noticed mismatches between the outputs of a PyTorch model and the corresponding ONNX model when running inference with ONNX Runtime. Specifically, I'm working in float16 precision, and the results differ between the two frameworks. I'm aware that such mismatches can occur for float32, but should I also expect similar discrepancies with float16 (perhaps because intermediate ops are computed in float32)? If so, what are the potential causes, and how can I resolve or minimize these differences?
Any insights or guidance on this matter would be greatly appreciated!
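To illustrate what I mean by intermediate-precision effects, here is a minimal plain-NumPy sketch (not ONNX Runtime's or PyTorch's actual kernels) showing that the accumulation order and accumulator precision alone can change a float16 reduction:

import numpy as np

# float16 addition is not associative, so the same values summed in a
# different order, or with a different accumulator precision, can give
# slightly different (but equally valid) results.
rng = np.random.default_rng(0)
v = rng.standard_normal(256).astype(np.float16)

seq = np.float16(0.0)
for item in v:                 # strict left-to-right fp16 accumulation
    seq = np.float16(seq + item)
vec = v.sum()                  # NumPy's own fp16 summation (pairwise)
ref = v.astype(np.float32).sum().astype(np.float16)  # fp32 accumulation

print(seq, vec, ref)           # these can legitimately disagree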
To reproduce
import numpy as np
import onnxruntime
import torch
import torch.nn as nn


class Dense(nn.Linear):
    def __init__(self, in_features, out_features):
        super().__init__(in_features=in_features, out_features=out_features,
                         bias=False, device="cpu", dtype=torch.float16)
        self.weight.requires_grad = False

    def forward(self, input):
        return super().forward(input)


def compare_outputs(pytorch_model, onnx_model_path, inputs):
    def _to_numpy(tensor):
        return tensor.cpu().numpy()

    # ONNX Runtime inference
    ort_session = onnxruntime.InferenceSession(onnx_model_path)
    ort_outputs = ort_session.run(None, {'x': _to_numpy(inputs)})

    # PyTorch inference
    pytorch_model.eval()
    torch_outputs = [_to_numpy(pytorch_model(inputs))]

    # This exact-equality assertion fails: the outputs are not bit-identical
    np.testing.assert_array_equal(ort_outputs, torch_outputs)


def main():
    torch.manual_seed(0)

    # Create random float16 inputs spanning nearly the full float16 range
    # [f16_min, f16_max]
    size = (64, 256)
    x_rand_tensor = torch.rand(size, requires_grad=False, dtype=torch.float32)
    f16_min = torch.finfo(torch.float16).min + 1
    f16_max = torch.finfo(torch.float16).max - 1
    scale_factor = f16_max - f16_min
    offset = f16_min
    x = (x_rand_tensor * scale_factor + offset).to(torch.float16)

    # Create and export the model
    dense_model = Dense(256, 1024)
    onnx_model_path = "dense_model.onnx"
    torch.onnx.export(
        dense_model,
        x,
        onnx_model_path,
        opset_version=15,
        input_names=['x'],
        output_names=['output'],
    )
    print(f"[INFO] Model exported to {onnx_model_path}")

    compare_outputs(dense_model, onnx_model_path, x)


if __name__ == "__main__":
    main()
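As a workaround I can compare with a tolerance instead of requiring bit-identical results; a sketch of the replacement assertion (the rtol/atol values are illustrative guesses for float16, not official recommendations):

# Hypothetical replacement for the assertion in compare_outputs: allow
# small numeric differences instead of exact equality. float16 carries
# roughly 3 decimal digits of precision, hence the loose tolerances.
np.testing.assert_allclose(
    ort_outputs[0],
    torch_outputs[0],
    rtol=1e-2,  # illustrative guess, not an official recommendation
    atol=1e-3,
)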
Urgency
No
Platform
Linux
OS Version
Ubuntu 22.04.3 LTS
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.21.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response