[WebNN] Support more features for GQA by Honry · Pull Request #27234 · microsoft/onnxruntime

Honry · 2026-02-04T01:58:12Z

Add support for GroupQueryAttention with:

- do_rotary=true (cos_cache/sin_cache inputs)
- Packed QKV (optional key/value inputs)
- Optional past_key/past_value for prefill mode
- Remove fp16->fp32 casting workaround
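
For readers unfamiliar with the packed QKV layout, here is a minimal sketch in plain C++ (not the WebNN graph code from this PR). It assumes the layout described for the GroupQueryAttention contrib op, where the last dimension of the packed tensor is num_heads * head_size + 2 * kv_num_heads * head_size, with Q first, then K, then V. The names SplitQkv and SplitPackedQkv are hypothetical and used only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the three tensors split out of a packed QKV input.
struct SplitQkv {
  std::vector<float> q;  // [batch_size, sequence_length, num_heads * head_size]
  std::vector<float> k;  // [batch_size, sequence_length, kv_num_heads * head_size]
  std::vector<float> v;  // [batch_size, sequence_length, kv_num_heads * head_size]
};

// Splits a row-major packed tensor of shape
// [batch_size, sequence_length, (num_heads + 2 * kv_num_heads) * head_size].
// Assumption: each token row is laid out as Q, then K, then V.
SplitQkv SplitPackedQkv(const std::vector<float>& packed,
                        size_t batch_size, size_t sequence_length,
                        size_t num_heads, size_t kv_num_heads, size_t head_size) {
  const size_t q_width = num_heads * head_size;
  const size_t kv_width = kv_num_heads * head_size;
  const size_t packed_width = q_width + 2 * kv_width;
  const size_t tokens = batch_size * sequence_length;

  SplitQkv out;
  out.q.reserve(tokens * q_width);
  out.k.reserve(tokens * kv_width);
  out.v.reserve(tokens * kv_width);

  for (size_t token = 0; token < tokens; ++token) {
    const float* row = packed.data() + token * packed_width;
    out.q.insert(out.q.end(), row, row + q_width);
    out.k.insert(out.k.end(), row + q_width, row + q_width + kv_width);
    out.v.insert(out.v.end(), row + q_width + kv_width, row + packed_width);
  }
  return out;
}
```

In a graph-building provider the same split would more likely be expressed as slice or split operations on the last axis rather than a copy loop; the loop is only meant to make the offsets explicit.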

Add ApplyRotaryEmbedding helper function.
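
As a rough illustration of what a rotary embedding helper computes, the sketch below applies the rotation on plain float buffers. It is not the ApplyRotaryEmbedding helper from this PR, which builds WebNN operands through emscripten::val. The function name RotateHalfReference is hypothetical, and the sketch assumes the non-interleaved ("rotate half") variant, a position_ids tensor of full shape [batch_size, sequence_length], and the tensor shapes given in the helper's parameter comments.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference rotary embedding on row-major buffers (illustrative only).
// input:        [batch_size, sequence_length, num_heads, head_size]
// cos_cache:    [max_sequence_length, head_size / 2]
// sin_cache:    [max_sequence_length, head_size / 2]
// position_ids: [batch_size, sequence_length] (assumed fully specified)
std::vector<float> RotateHalfReference(const std::vector<float>& input,
                                       const std::vector<float>& cos_cache,
                                       const std::vector<float>& sin_cache,
                                       const std::vector<int64_t>& position_ids,
                                       size_t batch_size, size_t sequence_length,
                                       size_t num_heads, size_t head_size) {
  const size_t half = head_size / 2;
  std::vector<float> output(input.size());

  for (size_t b = 0; b < batch_size; ++b) {
    for (size_t s = 0; s < sequence_length; ++s) {
      // Look up the cos/sin rows for this token's absolute position.
      const size_t pos = static_cast<size_t>(position_ids[b * sequence_length + s]);
      const float* cos_row = cos_cache.data() + pos * half;
      const float* sin_row = sin_cache.data() + pos * half;

      for (size_t h = 0; h < num_heads; ++h) {
        const size_t base = ((b * sequence_length + s) * num_heads + h) * head_size;
        for (size_t i = 0; i < half; ++i) {
          // Pair element i of the first half with element i of the second half.
          const float x0 = input[base + i];
          const float x1 = input[base + half + i];
          output[base + i] = x0 * cos_row[i] - x1 * sin_row[i];
          output[base + half + i] = x0 * sin_row[i] + x1 * cos_row[i];
        }
      }
    }
  }
  return output;
}
```

The interleaved variant pairs adjacent elements (2i, 2i + 1) instead of (i, i + half); only the indexing changes.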

Fix the decode stage by using qkv_sequence_length instead of has_past_key to distinguish prefill from decode, and use the runtime seqlens_k instead of the static past_sequence_length for the rotary position calculation.
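
The decode fix can be pictured with a small plain C++ sketch. It is an illustration under stated assumptions, not the code merged here: it assumes that a decode step is the case where qkv_sequence_length is 1, and that seqlens_k holds total_sequence_length - 1 for each batch entry, as documented for the GroupQueryAttention contrib op. The names IsDecodeStep and RotaryPositionIds are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: decode processes exactly one new token per batch entry.
bool IsDecodeStep(size_t qkv_sequence_length) {
  return qkv_sequence_length == 1;
}

// Builds position ids of shape [batch_size, qkv_sequence_length].
// Prefill: positions are 0..qkv_sequence_length-1 for every batch entry.
// Decode:  the position of the new token comes from the runtime seqlens_k
//          value (assumed to be total_sequence_length - 1), not from a
//          static past_sequence_length baked into the graph.
std::vector<int64_t> RotaryPositionIds(const std::vector<int32_t>& seqlens_k,
                                       size_t qkv_sequence_length) {
  const bool decode = IsDecodeStep(qkv_sequence_length);
  std::vector<int64_t> position_ids;
  position_ids.reserve(seqlens_k.size() * qkv_sequence_length);

  for (size_t b = 0; b < seqlens_k.size(); ++b) {
    for (size_t s = 0; s < qkv_sequence_length; ++s) {
      position_ids.push_back(decode ? static_cast<int64_t>(seqlens_k[b])
                                    : static_cast<int64_t>(s));
    }
  }
  return position_ids;
}
```

Reading the position from seqlens_k at run time lets the same graph serve every decode step, since the value changes per step while the graph's static shapes do not.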

Honry · 2026-02-04T02:00:00Z

@fdwr, @guschmue, PTAL, thanks!

fdwr

Minor comment, else LGTM.

onnxruntime/core/providers/webnn/builders/impl/attention_helper.h

guschmue · 2026-02-05T18:15:49Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

azure-pipelines · 2026-02-05T18:16:11Z

Azure Pipelines successfully started running 4 pipeline(s).

guschmue · 2026-02-05T18:33:12Z

run 'lintrunner -a' to make the CI happy

fdwr

👍

fdwr · 2026-02-06T01:14:30Z

Hmm, linter issues. I can't tell what it's complaining about though (why can't this linter be clearer? 🤨):

-    emscripten::val input,// Shape: [batch_size, sequence_length, num_heads, head_size]
-    emscripten::val cos_cache,// Shape: [max_sequence_length, head_size / 2]
-    emscripten::val sin_cache,// Shape: [max_sequence_length, head_size / 2]
-    emscripten::val position_ids,// Shape: [batch_size, sequence_length] or [1]
+    emscripten::val input,// Shape: [batch_size, sequence_length, num_heads, head_size]
+    emscripten::val cos_cache,// Shape: [max_sequence_length, head_size / 2]
+    emscripten::val sin_cache,// Shape: [max_sequence_length, head_size / 2]
+    emscripten::val position_ids,// Shape: [batch_size, sequence_length] or [1]

Honry · 2026-02-06T01:29:58Z

Thanks much @fdwr, @guschmue, lint error fixed, please help retrigger the CI. Thanks!

fdwr

👍

fdwr · 2026-02-07T02:08:10Z

/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline,Windows GPU WebGPU CI Pipeline,Windows OpenVINO CI Pipeline

fdwr · 2026-02-07T02:08:13Z

/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline

fdwr · 2026-02-07T02:08:16Z

/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI

fdwr · 2026-02-07T02:08:19Z

/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models

azure-pipelines · 2026-02-07T02:08:20Z

Azure Pipelines successfully started running 1 pipeline(s).

fdwr · 2026-02-07T02:08:22Z

/azp run Test Linux CUDA x64 Release,Test Linux TensorRT x64 Release,web_Debug / build_onnxruntime_web,web_Release / build_onnxruntime_web

azure-pipelines · 2026-02-07T02:08:22Z

Azure Pipelines successfully started running 1 pipeline(s).

fdwr · 2026-02-07T02:08:23Z

/azp run Linux QNN CI Pipeline

azure-pipelines · 2026-02-07T02:08:25Z

No pipelines are associated with this pull request.

azure-pipelines · 2026-02-07T02:08:28Z

No pipelines are associated with this pull request.

azure-pipelines · 2026-02-07T02:08:28Z

Azure Pipelines successfully started running 2 pipeline(s).

azure-pipelines · 2026-02-07T02:08:32Z

Azure Pipelines successfully started running 1 pipeline(s).

fdwr previously approved these changes Feb 4, 2026

onnxruntime/core/providers/webnn/builders/impl/attention_helper.h

address the comment

77d2cac

Honry dismissed fdwr’s stale review via 77d2cac February 4, 2026 07:58

guschmue added the ep:WebNN WebNN execution provider label Feb 5, 2026

guschmue previously approved these changes Feb 5, 2026

guschmue enabled auto-merge (squash) February 5, 2026 18:15

fdwr previously approved these changes Feb 6, 2026

Fix lint error

55562e5

auto-merge was automatically disabled February 6, 2026 01:29
Head branch was pushed to by a user without write access

Honry dismissed stale reviews from fdwr and guschmue via 55562e5 February 6, 2026 01:29

fdwr approved these changes Feb 6, 2026

fdwr merged commit 83d11b5 into microsoft:main Feb 7, 2026
94 of 165 checks passed
