[WebGPU EP] Disable Split-K when use_deterministic_compute is true #27086
guschmue merged 6 commits into microsoft:main
This patch disables `Split-K` when the onnxruntime session is created with `use_deterministic_compute` set to true in `SessionOptions`, because the current implementation of `Split-K` relies only on atomic operations, which can produce non-deterministic results.
Pull request overview
This pull request disables the Split-K optimization in the WebGPU execution provider when the session option use_deterministic_compute is set to true. Split-K accumulates partial results with atomic operations, and the order in which those partial results are summed is not fixed from run to run, so the output can differ slightly between runs. This change ensures deterministic outputs when they are required.
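To illustrate why summation order matters, here is a minimal sketch that is not taken from the patch. The helper `AccumulateInOrder` is hypothetical and runs on the CPU; it only mimics what a sequence of atomic adds would produce if the partial results arrived in a given order. Floating-point addition is not associative, so two arrival orders of the same partial sums can give different totals.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of onnxruntime): adds the partial results in
// the order given, the way successive atomic adds would if the workgroups
// happened to finish in that order.
float AccumulateInOrder(const std::vector<float>& partials) {
  float acc = 0.0f;
  for (float p : partials) {
    acc += p;  // each step rounds to float, so the order affects the result
  }
  return acc;
}
```

With IEEE 754 single-precision arithmetic, `AccumulateInOrder({1e8f, -1e8f, 1.0f})` should give `1.0f`, while `AccumulateInOrder({1e8f, 1.0f, -1e8f})` should give `0.0f`, because `1.0f` is smaller than the spacing between adjacent floats near `1e8f` and is lost when added first.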
Changes:
- Modified `UseSplitK` method to accept a `ComputeContext` parameter and check deterministic compute settings
- Updated all call sites to pass the context parameter
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| onnxruntime/core/providers/webgpu/webgpu_utils.h | Added forward declaration of ComputeContext and updated UseSplitK signature to include context parameter |
| onnxruntime/core/providers/webgpu/webgpu_utils.cc | Added include for compute_context.h and implemented deterministic compute check in UseSplitK |
| onnxruntime/core/providers/webgpu/math/matmul.cc | Updated UseSplitK call to pass context parameter |
| onnxruntime/core/providers/webgpu/math/gemm_packed.cc | Updated UseSplitK call to pass context parameter with address-of operator |
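The shape of the change described in the table can be sketched roughly as follows. This is an assumption-laden illustration, not the actual onnxruntime code: `MockComputeContext`, its `use_deterministic_compute` field, and `UseSplitKSketch` are hypothetical stand-ins, and the real `UseSplitK` takes other inputs that are collapsed here into a single boolean.

```cpp
#include <cassert>

// Hypothetical stand-in for the real ComputeContext; only the one setting
// relevant to this sketch is modelled.
struct MockComputeContext {
  bool use_deterministic_compute;
};

// Hypothetical sketch of the gating logic: the deterministic-compute check
// comes first and overrides every other Split-K heuristic.
bool UseSplitKSketch(const MockComputeContext& context,
                     bool split_k_otherwise_applicable) {
  if (context.use_deterministic_compute) {
    return false;  // atomics-based Split-K could reorder the partial sums
  }
  return split_k_otherwise_applicable;
}
```

Putting the check inside the predicate, rather than at each call site, means every caller picks up the behavior simply by passing the context through, which matches the "updated all call sites" item above.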
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 4 pipeline(s).
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 4 pipeline(s).
change looks good but need to wait for the CI fix (#27235). May need sync to latest main once it's merged.
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 4 pipeline(s).
Description
This patch disables `Split-K` when the onnxruntime session is created with `use_deterministic_compute` set to true in `SessionOptions`; the option currently defaults to false.
In `onnxruntime_provider_tests`, we can set `so.use_deterministic_compute` to true in `BaseTester::Run()` to require that the result be deterministic for the same input.
Motivation and Context
The current implementation of `Split-K` relies only on atomic operations, which can cause non-deterministic results, so we shouldn't use `Split-K` when we always want deterministic output for the same input.
See #27003 for more details.