[WebGPU] Add --enable-subgroups flag for optional subgroup support#3431
ksgr5566 wants to merge 6 commits into mlc-ai:main
Conversation
Code Review
This pull request introduces a new --enable-subgroups command-line flag to allow enabling WebGPU subgroup shuffle primitives. The changes are well-implemented, adding the argument to the CLI, updating help text, and correctly passing the flag to the TVM target configuration. My review includes one suggestion to simplify the logic for applying this new setting, making it more concise and focused on the enabled case.
Pull request overview
This pull request adds optional WebGPU subgroup support to the MLC LLM compilation pipeline through a new --enable-subgroups CLI flag. When enabled for WebGPU targets, this flag sets the supports_subgroups attribute on the TVM target, which triggers TVM to use subgroup shuffle primitives instead of shared memory barriers for reductions.
Changes:
- Add `--enable-subgroups` CLI argument to the compile command
- Thread the flag through `detect_target_and_host()` to apply WebGPU-specific target modifications
- Add help documentation for the new flag
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/mlc_llm/cli/compile.py | Adds --enable-subgroups argument and passes it to detect_target_and_host() |
| python/mlc_llm/interface/help.py | Adds help text explaining the new flag's purpose |
| python/mlc_llm/support/auto_target.py | Extends detect_target_and_host() signature and implements _apply_webgpu_subgroups() helper |
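The CLI side of the change could be sketched roughly as follows. This is illustrative only: the real compile.py wires the flag through mlc_llm's own argument helpers, and the exact parser setup here is an assumption. The `store_true`/`default=None` combination is one way to distinguish "flag absent" (`None`) from "flag passed" (`True`), matching the `Optional[bool]` parameter downstream.

```python
import argparse

# Hypothetical stand-in for the mlc_llm compile argument parser.
parser = argparse.ArgumentParser(prog="mlc_llm compile")
# store_true with default=None yields None when the flag is absent and
# True when it is passed, so "unset" and "explicitly enabled" differ.
parser.add_argument(
    "--enable-subgroups",
    action="store_true",
    default=None,
    help="Enable WebGPU subgroup shuffle primitives on supported devices.",
)

args = parser.parse_args(["--enable-subgroups"])
```

With this setup, `args.enable_subgroups` is `True` when the flag is given and `None` otherwise.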
Comments suppressed due to low confidence (1)
python/mlc_llm/support/auto_target.py:46
- The docstring is missing documentation for the new `enable_subgroups` parameter. Please add a Parameters section entry describing this parameter, its type (`Optional[bool]`), default value (`None`), and what it does (e.g., "Whether to enable WebGPU subgroup support. When True, sets `supports_subgroups` on WebGPU targets. When None, no modification is made to the target.").
```python
"""Detect the configuration for the target device and its host, for example, target GPU and
the host CPU.

Parameters
----------
target_hint : str
    The hint for the target device.
host_hint : str
    The hint for the host CPU, default is "auto".
"""
```
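A sketch of the amended docstring the comment asks for. The signature shown here is an assumption for illustration; the actual parameters of `detect_target_and_host` in auto_target.py may differ.

```python
from typing import Optional


def detect_target_and_host(
    target_hint: str,
    host_hint: str = "auto",
    enable_subgroups: Optional[bool] = None,
):
    """Detect the configuration for the target device and its host, for example,
    target GPU and the host CPU.

    Parameters
    ----------
    target_hint : str
        The hint for the target device.
    host_hint : str
        The hint for the host CPU, default is "auto".
    enable_subgroups : Optional[bool]
        Whether to enable WebGPU subgroup support. When True, sets
        ``supports_subgroups`` on WebGPU targets. When None (the default),
        no modification is made to the target.
    """
```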
Hi @ksgr5566, thanks! The WASM CI is failing (https://ci.mlc.ai/blue/organizations/jenkins/mlc-llm/detail/PR-3431/1/pipeline), could you check the failures? Also cc @akaashrp for visibility.
Hi @MasterJH5574, this would require a TVM version change. I have raised the TVM PR against Charlie's subgroup PR branch (CharlieFRuan/tvm#1). Once that is merged, the version can be updated here and the build would succeed.
@ksgr5566 I see. Should we upstream to https://github.com/apache/tvm instead of Charlie's fork?
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
```python
    return target, build_func


def _apply_webgpu_subgroups(target: Target, enable_subgroups: Optional[bool]) -> Target:
```
Is it possible to add a test for this behavior (the `enable_subgroups` flag causes the target dict to have a `supports_subgroups` key that is set to True)?
Also, a non-WebGPU target with `enable_subgroups=True` should leave the target unchanged, and any target with `enable_subgroups=False` should leave the target unchanged.
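A self-contained sketch of a test covering these cases. It uses a plain dict as a stand-in for `tvm.target.Target`, since the real helper operates on TVM targets; the stand-in helper here mirrors the described behavior but is illustrative, not the actual implementation.

```python
from typing import Optional


def _apply_webgpu_subgroups(target: dict, enable_subgroups: Optional[bool]) -> dict:
    # Illustrative stand-in: only a WebGPU target with the flag explicitly
    # set to True gains the supports_subgroups attribute; every other
    # combination returns the target unchanged.
    if enable_subgroups and target.get("kind") == "webgpu":
        return {**target, "supports_subgroups": True}
    return target


def test_webgpu_with_flag_sets_attribute():
    target = _apply_webgpu_subgroups({"kind": "webgpu"}, True)
    assert target["supports_subgroups"] is True


def test_non_webgpu_target_unchanged():
    target = _apply_webgpu_subgroups({"kind": "cuda"}, True)
    assert "supports_subgroups" not in target


def test_flag_false_leaves_target_unchanged():
    target = _apply_webgpu_subgroups({"kind": "webgpu"}, False)
    assert "supports_subgroups" not in target
```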
Summary
Adds a `--enable-subgroups` CLI flag to the `mlc_llm compile` command, enabling WebGPU subgroup shuffle primitives on supported devices.

Changes
- Add `--enable-subgroups` argument to compile.py
- Add `_apply_webgpu_subgroups()` helper
- Set the `supports_subgroups` boolean on the target dict for TVM to handle

Details
When the flag is enabled, TVM's target parser automatically sets `thread_warp_size=32` (the default is 1), which triggers warp-reduction lowering. Without the flag, reductions use shared memory barriers (universally compatible).
Related

Companion to the gating logic in the TVM PR (CharlieFRuan/tvm#1). Works with both the baseline (no flag) and subgroup-enabled (with flag) WASM variants.
Testing
- `mlc_llm compile ... --device webgpu` (no subgroups)
- `mlc_llm compile ... --device webgpu --enable-subgroups` (with subgroups)

Tested with Llama-3.2-1B-q4f16_1. The baseline (no flag) uses shared memory reductions; with the flag, it generates `subgroupShuffle*` ops.

Both generated WGSL files are here: https://gist.github.com/ksgr5566/301664a5dda3e46f44092be4d09b2d4f