Description
Hello, thank you for releasing this model.
While running the provided ONNX model with onnxruntime-web using the WebGPU execution provider, inference fails with the following error:
```
[WebGPU] Kernel "[DequantizeLinear] inner.encoder.conv1.bias_DequantizeLinear" failed.
Error: In the case of dequantizing int32 there is no zero point.
```
This prevents the model from running on WebGPU, even though the same model may run on WASM / CPU execution providers.
⸻
Root Cause (Likely)
The failing node is:
inner.encoder.conv1.bias_DequantizeLinear
This node appears to dequantize a bias tensor stored as int32, which is standard for quantized Conv layers.
However, the exported DequantizeLinear node seems to include a zero_point input for an int32 tensor.
According to the ONNX specification:
• DequantizeLinear must not carry a zero_point input for int32 data; the zero point is implicitly 0.
The WebGPU EP strictly enforces this rule and fails the kernel, whereas other execution providers tolerate the extra input.
⸻
Expected Behavior
The model should be exported such that int32 bias dequantization does not include a zero_point input, or otherwise follows the ONNX spec for DequantizeLinear(int32).
With a corrected export:
• The model should run correctly on WebGPU
• No runtime kernel error should occur
⸻
Request
Would it be possible to:
1. Re-export / re-publish the ONNX model with corrected DequantizeLinear nodes for int32 bias, and
2. Upload the updated .onnx file(s) to this repository?
This change would greatly improve compatibility with:
• onnxruntime-web
• WebGPU execution provider
• browser-based inference use cases
⸻
Additional Notes
• This is not a problem with WebGPU-specific logic in the model; it is strict spec validation performed by the WebGPU EP.
• Many users targeting browser inference (especially WebGPU) are likely to hit this.
• I’m happy to provide a minimal reproducible example or pinpoint the exact node if needed.
Thank you very much for your work and for considering this request.