Quantized ONNX model fails on WebGPU due to invalid int32 DequantizeLinear (bias zero_point) #33

@urim-thummim

Description

Hello, thank you for releasing this model.

While running the provided ONNX model with onnxruntime-web using the WebGPU execution provider, inference fails with the following error:

[WebGPU] Kernel "[DequantizeLinear] inner.encoder.conv1.bias_DequantizeLinear" failed.
Error: In the case of dequantizing int32 there is no zero point.

This prevents the model from running on WebGPU, even though the same model may run on WASM / CPU execution providers.

Root Cause (Likely)

The failing node is:

inner.encoder.conv1.bias_DequantizeLinear

This node appears to dequantize a bias tensor stored as int32, which is standard for quantized Conv layers.
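For context, here is a minimal sketch of why a Conv bias typically ends up as int32. It assumes the common static-quantization scheme in which the bias scale is input_scale * weight_scale and the zero point is 0. I do not know how this particular model was quantized, so the function names and scale values below are hypothetical.

```python
from typing import List


def quantize_bias(bias: List[float], input_scale: float, weight_scale: float) -> List[int]:
    """Quantize a float bias to integers with scale = input_scale * weight_scale
    and an implicit zero point of 0 (assumed scheme, see the note above)."""
    bias_scale = input_scale * weight_scale
    return [round(b / bias_scale) for b in bias]


def dequantize(q: List[int], scale: float, zero_point: int = 0) -> List[float]:
    """DequantizeLinear formula: y = (x - zero_point) * scale."""
    return [(v - zero_point) * scale for v in q]


# Hypothetical values; powers of two keep the arithmetic exact.
bias = [0.5, -0.25]
input_scale, weight_scale = 0.5, 0.25

q = quantize_bias(bias, input_scale, weight_scale)
restored = dequantize(q, input_scale * weight_scale)
```

Because the zero point is 0 in this scheme, passing it explicitly adds nothing to the result, which is why an int32 DequantizeLinear can omit it.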

However, the exported DequantizeLinear node seems to include a zero_point input for an int32 tensor.
According to the ONNX operator documentation for DequantizeLinear:
• When dequantizing int32 there is no zero point
(the zero point is supposed to be 0)

The WebGPU execution provider enforces this rule strictly and raises an error, whereas other execution providers appear to tolerate the extra input.
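To confirm which nodes are affected, a check along these lines could be run against the exported file. This is a sketch, not a verified diagnosis: it only looks at int32 inputs stored as graph initializers, and the helper names are my own. The core function takes any graph object with `node` and `initializer` fields, such as `onnx.load(path).graph`.

```python
from typing import List

INT32 = 6  # numeric value of onnx.TensorProto.INT32


def find_int32_dq_with_zero_point(graph) -> List[str]:
    """Return names of DequantizeLinear nodes that dequantize an int32
    initializer and also carry a third (zero_point) input."""
    dtype_by_name = {init.name: init.data_type for init in graph.initializer}
    flagged = []
    for node in graph.node:
        if node.op_type != "DequantizeLinear":
            continue
        inputs = list(node.input)
        # Inputs are (x, x_scale, x_zero_point); the third one is optional.
        has_zero_point = len(inputs) >= 3 and inputs[2] != ""
        if has_zero_point and dtype_by_name.get(inputs[0]) == INT32:
            flagged.append(node.name)
    return flagged


def report(model_path: str) -> List[str]:
    """Load a model from disk and list the suspicious nodes.
    Requires the `onnx` package (pip install onnx)."""
    import onnx  # imported lazily so the helper above has no hard dependency

    return find_int32_dq_with_zero_point(onnx.load(model_path).graph)
```

Calling `report(...)` with the path to the published .onnx file should list inner.encoder.conv1.bias_DequantizeLinear if the diagnosis above is right, along with any other bias nodes exported the same way.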

Expected Behavior

The model should be exported such that int32 bias dequantization does not include a zero_point input, or otherwise follows the ONNX spec for DequantizeLinear(int32).

With a corrected export:
• The model should run correctly on WebGPU
• No runtime kernel error should occur
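As a possible workaround until a corrected export is available, the zero_point input could be dropped from the affected nodes after the fact. The sketch below assumes the zero point is an all-zero initializer stored inline, so removing it does not change the outputs. The helper names are hypothetical, and I have not verified that the patched model then runs end to end on WebGPU.

```python
from typing import List

INT32 = 6  # numeric value of onnx.TensorProto.INT32


def _is_all_zero(tensor) -> bool:
    """True if an integer initializer holds inline data and every value is 0."""
    has_data = len(tensor.raw_data) > 0 or len(tensor.int32_data) > 0
    return has_data and not any(tensor.raw_data) and not any(tensor.int32_data)


def drop_int32_zero_points(graph) -> List[str]:
    """Remove the zero_point input from DequantizeLinear nodes whose x input is
    an int32 initializer. Returns the names of the patched nodes."""
    inits = {init.name: init for init in graph.initializer}
    patched = []
    for node in graph.node:
        if node.op_type != "DequantizeLinear":
            continue
        if len(node.input) < 3 or not node.input[2]:
            continue  # no zero_point input to remove
        x = inits.get(node.input[0])
        zp = inits.get(node.input[2])
        if x is None or zp is None or x.data_type != INT32:
            continue
        if not _is_all_zero(zp):
            continue  # dropping a non-zero zero_point would change the outputs
        del node.input[2]
        patched.append(node.name)
    return patched


def patch_file(src: str, dst: str) -> List[str]:
    """Patch a model on disk. Requires the `onnx` package."""
    import onnx  # imported lazily; only needed for file I/O and validation

    model = onnx.load(src)
    patched = drop_int32_zero_points(model.graph)
    onnx.checker.check_model(model)
    onnx.save(model, dst)
    return patched
```

The now-unused zero-point initializers are left in the graph; they should be harmless, though a cleaner re-export would omit them.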

Request

Would it be possible to:
1. Re-export / re-publish the ONNX model with corrected DequantizeLinear nodes for int32 bias, and
2. Upload the updated .onnx file(s) to this repository?

This change would greatly improve compatibility with:
• onnxruntime-web
• WebGPU execution provider
• browser-based inference use cases

Additional Notes
• The failure is not caused by any WebGPU-specific logic in the model; it comes from strict spec validation in the WebGPU EP.
• Many users targeting browser inference (especially WebGPU) are likely to hit this.
• I’m happy to provide a minimal reproducible example or pinpoint the exact node if needed.

Thank you very much for your work and for considering this request.
