[WebGPU] com.microsoft.QMoE produces invalid results for certain attribute combinations #27220

@xenova

Describe the issue

The com.microsoft.QMoE operation produces results on the WebGPU execution provider that differ substantially from the CPU execution provider for the following attributes:

(Image: screenshot of the affected QMoE attributes)

cc @guschmue

To reproduce

import onnxruntime as ort
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="Xenova/onnxruntime-webgpu-test-data", filename="fail_673_standalone.onnx", repo_type="dataset")
cpu_session = ort.InferenceSession(path, providers=['CPUExecutionProvider'])
cpu_result = cpu_session.run(None, {})

webgpu_session = ort.InferenceSession(path, providers=['WebGpuExecutionProvider'])
webgpu_result = webgpu_session.run(None, {})

diff = cpu_result[0] - webgpu_result[0]
print('Diff stats: ', f'min={diff.min()}, max={diff.max()}, mean={diff.mean()}, std={diff.std()}')

produces this output:

Diff stats:  min=-7.402378082275391, max=8.158369064331055, mean=-0.0036832974292337894, std=0.3670535087585449
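The min/max/mean statistics above show that the outputs diverge, but not how many elements are affected or where the worst one is. A minimal sketch of a tolerance-based comparison follows. The helper name `summarize_mismatch` and the default tolerances (`rtol=1e-3`, `atol=1e-3`) are assumptions for illustration, not values used by ONNX Runtime's own tests.

```python
import numpy as np


def summarize_mismatch(reference, candidate, rtol=1e-3, atol=1e-3):
    """Compare two arrays elementwise and summarize how far apart they are.

    Hypothetical helper; the tolerances are illustrative defaults only.
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    if reference.size == 0:
        raise ValueError("cannot compare empty arrays")

    abs_diff = np.abs(reference - candidate)
    # np.isclose(a, b) tests |a - b| <= atol + rtol * |b|, so the relative
    # tolerance is scaled by the reference values here.
    mismatched = ~np.isclose(candidate, reference, rtol=rtol, atol=atol)
    worst = np.unravel_index(np.argmax(abs_diff), abs_diff.shape)

    return {
        "max_abs_diff": float(abs_diff.max()),
        "mismatched": int(mismatched.sum()),
        "total": int(reference.size),
        "worst_index": tuple(int(i) for i in worst),
    }
```

With the sessions from the reproduction script, this could be called as `summarize_mismatch(cpu_result[0], webgpu_result[0])`, treating the CPU output as the reference. NaN values count as mismatches under this criterion.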

Urgency

medium-high -- blocks certain models which use this operation.

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

main

Execution Provider

'webgpu' (WebGPU)

Labels

ep:WebGPU (ort-web webgpu provider), platform:web (issues related to ONNX Runtime web; typically submitted using template)
