Closed
Labels
ep:WebGPU (ort-web webgpu provider), platform:web (issues related to ONNX Runtime web; typically submitted using template)
Description
Describe the issue
The com.microsoft.QMoE operator produces large correctness errors on the WebGPU execution provider for the following attributes:
cc @guschmue
To reproduce
import onnxruntime as ort
from huggingface_hub import hf_hub_download

# Download the standalone failing model from the test-data repository.
path = hf_hub_download(repo_id="Xenova/onnxruntime-webgpu-test-data", filename="fail_673_standalone.onnx", repo_type="dataset")

# Reference result from the CPU execution provider (run with an empty input feed).
cpu_session = ort.InferenceSession(path, providers=['CPUExecutionProvider'])
cpu_result = cpu_session.run(None, {})

# Same model on the WebGPU execution provider.
webgpu_session = ort.InferenceSession(path, providers=['WebGpuExecutionProvider'])
webgpu_result = webgpu_session.run(None, {})

# Elementwise difference between the first outputs of the two providers.
diff = cpu_result[0] - webgpu_result[0]
print('Diff stats: ', f'min={diff.min()}, max={diff.max()}, mean={diff.mean()}, std={diff.std()}')
produces this output:
Diff stats: min=-7.402378082275391, max=8.158369064331055, mean=-0.0036832974292337894, std=0.3670535087585449
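The signed min/max/mean above can hide how many elements are actually wrong, since positive and negative errors cancel in the mean. A minimal sketch of a tolerance-based summary follows, assuming NumPy is available. The helper name `summarize_mismatch` and the `rtol`/`atol` defaults are hypothetical, chosen for illustration and not taken from the issue or from ONNX Runtime's own tests.

```python
import numpy as np


def summarize_mismatch(expected, actual, rtol=1e-3, atol=1e-3):
    """Summarize how far `actual` deviates from `expected`.

    Hypothetical helper: rtol/atol are illustrative defaults only.
    Assumes both inputs are non-empty and have the same shape.
    """
    expected = np.asarray(expected, dtype=np.float64)
    actual = np.asarray(actual, dtype=np.float64)
    if expected.shape != actual.shape:
        raise ValueError(f"shape mismatch: {expected.shape} vs {actual.shape}")

    abs_diff = np.abs(expected - actual)
    # Elementwise criterion in the style of np.isclose:
    # an element mismatches when |expected - actual| > atol + rtol * |expected|.
    mismatched = abs_diff > (atol + rtol * np.abs(expected))
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "mismatch_fraction": float(mismatched.mean()),
        "all_close": not bool(mismatched.any()),
    }
```

With the reproduction script's variables, this could be called as `summarize_mismatch(cpu_result[0], webgpu_result[0])`, treating the CPU output as the reference.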
Urgency
medium-high -- blocks certain models which use this operation.
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
main
Execution Provider
'webgpu' (WebGPU)