Description
Hello, thank you for releasing this model.
While running the provided ONNX model with onnxruntime-web using the WebGPU execution provider, inference fails with the following error:
```
[WebGPU] Kernel "[DequantizeLinear] inner.encoder.conv1.bias_DequantizeLinear" failed.
Error: In the case of dequantizing int32 there is no zero point.
```
This prevents the model from running on WebGPU, even though the same model may run on WASM / CPU execution providers.
⸻
Root Cause (Likely)
The failing node is:
inner.encoder.conv1.bias_DequantizeLinear
This node appears to dequantize a bias tensor stored as int32, which is standard for quantized Conv layers.
However, the exported DequantizeLinear node seems to include a zero_point input for an int32 tensor.
According to the ONNX specification:
• DequantizeLinear must not carry a zero_point input for int32 data; the zero point is implicitly 0.
The WebGPU EP strictly enforces this rule and fails the kernel, whereas other execution providers tolerate the extra input.
⸻
Expected Behavior
The model should be exported such that int32 bias dequantization does not include a zero_point input, or otherwise follows the ONNX spec for DequantizeLinear(int32).
With a corrected export:
• The model should run correctly on WebGPU
• No runtime kernel error should occur
⸻
Request
Would it be possible to:
1. Re-export / re-publish the ONNX model with corrected DequantizeLinear nodes for int32 bias, and
2. Upload the updated .onnx file(s) to this repository?
This change would greatly improve compatibility with:
• onnxruntime-web
• WebGPU execution provider
• browser-based inference use cases
⸻
Additional Notes
• This is not a problem with WebGPU-specific logic in the model; it is strict spec validation performed by the WebGPU EP.
• Many users targeting browser inference (especially WebGPU) are likely to hit this.
• I’m happy to provide a minimal reproducible example or pinpoint the exact node if needed.
Thank you very much for your work and for considering this request.