Fix dark mode link visibility accessibility issue on ONNX Runtime website #25846
Closed
Conversation
Naming conflicts arise when the expand-pool2d-squeeze logic (implemented as reshape) is invoked during ONNX -> QNN op lowering. Models with multiple 1D pool ops would hit this issue.
### Description
Added support for creating QDQ nodes for TopK when quantizing with the ORT static quantization tool:
- Added TopK to registry.py so that QDQ nodes are created for the op
- Ensured that the input and output quantization params are equal
- Added a unit test to verify the creation of QDQ nodes for TopK

### Motivation and Context
There is existing support for forming a node unit for the TopK operator when QDQ nodes are present and the input and output quantization params are equal, but there was no support for creating QDQ nodes for the TopK operator in the ORT static quantization tool.
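A minimal sketch of what this enables, assuming a hypothetical model file with a single input named "X": running the static quantization tool with `QuantFormat.QDQ` should now wrap TopK in DequantizeLinear/QuantizeLinear pairs.

```python
# Hedged sketch: model paths, input name, and shapes are made up for illustration.
import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantFormat, quantize_static

class RandomCalibrationReader(CalibrationDataReader):
    """Feeds a few random samples to the calibrator."""
    def __init__(self, num_samples=8):
        self._samples = iter(
            {"X": np.random.rand(1, 16).astype(np.float32)} for _ in range(num_samples)
        )

    def get_next(self):
        return next(self._samples, None)

quantize_static(
    "model_with_topk.onnx",       # hypothetical float model containing TopK
    "model_with_topk_qdq.onnx",   # output model with QDQ nodes around TopK
    RandomCalibrationReader(),
    quant_format=QuantFormat.QDQ,
)
```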
### Description The parser no longer links against the plugin library but instead loads it dynamically. Because of that, I think we should also make the library optional in ORT. @chilo-ms
…f nodes (#25191) Added an API that creates a sub-graph from a set of nodes in an OrtGraph. This API is needed for the GetCapability EP ABI porting, when an EP wants to check whether a 'sub-graph' of the graph is supported by the hardware backend.
### Description This change is a follow-up to #25130. - consume duktape from vcpkg if --use_vcpkg is specified - ~~add a Windows CI pipeline for dynamic WGSL template~~ (Will do in a separate PR) - upgrade the wgsl-template package from 0.1.10 to 0.1.13 - support adding the contrib op folder as input
Add a build option to enable defaults more appropriate for client/on-device workloads. The initial use case is setting the default thread pool allow_spinning policy, which we want to default to 0/false for builds targeting client/on-device workloads. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
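For reference, the same policy can also be set per session at runtime via a session config entry; a small sketch (the model path is hypothetical):

```python
import onnxruntime as ort

so = ort.SessionOptions()
# "0" disables busy-wait spinning in the intra-op thread pool, trading a bit of
# latency for lower idle CPU use -- the default this build option selects.
so.add_session_config_entry("session.intra_op.allow_spinning", "0")
sess = ort.InferenceSession("model.onnx", sess_options=so)  # hypothetical model
```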
`from` is not an own property of `Float16Array` but an inherited function, so we can use `Float16Array['from']` to check whether it is available.
Add a new API `Node_GetEpType` to get the EP that a node is assigned to run on. This API is needed when porting the plugin TRT EP's `GetCapability`, where the EP needs to know whether the subgraph(s) of a control flow node are assigned to it before adding that control flow op to the supported list.
### Description Enable DSP queue polling when the performance profile is burst.
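For context, a sketch of selecting the burst profile through the QNN EP's `htp_performance_mode` provider option (other options such as the backend path are omitted; the model path is hypothetical):

```python
import onnxruntime as ort

# With this change, the burst profile also enables DSP queue polling.
sess = ort.InferenceSession(
    "model.onnx",  # hypothetical model
    providers=[("QNNExecutionProvider", {"htp_performance_mode": "burst"})],
)
```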
### Description
- Set context priority to low when the workload type is Efficient
- Set context priority to the command-line-configured value when Default
- Error out otherwise (invalid argument)
Add Compile API `ModelCompilationOptions_SetEpContextBinaryInformation` to set the folder path and model name so that the EP knows the right place to dump the [model_name]_[ep].bin file.
### Description Windows WebGPU CI: add build matrix for wgsl template
Use `inputShape.length - 1` instead of `inputShape.length` to avoid out-of-bounds access.
Description (reference: GHSA-5crp-9r3c-p9vr) Newtonsoft.Json prior to version 13.0.1 is vulnerable to Insecure Defaults due to improper handling of expressions with a high nesting level, which leads to a StackOverflow exception or high CPU and RAM usage. Exploiting this vulnerability results in Denial of Service (DoS). To mitigate the issue, either update Newtonsoft.Json to 13.0.1 or set the MaxDepth parameter in the JsonSerializerSettings. ``` JsonConvert.DefaultSettings = () => new JsonSerializerSettings { MaxDepth = 128 }; ``` This file is the only place using `JsonConvert`, so I put the fix here and hope the warning will disappear.
Change to the `Node_GetEpName` API name to avoid confusion. For plugin EPs, the EP factory can use whatever name it registered with ORT, so naming the API `Node_GetEpName` aligns it with `OrtEpFactory::GetName`.
### Description This PR fixes the number of hidden layers used during the export of Whisper by always using the number of hidden layers in the decoder. ### Motivation and Context Most of the Whisper models contain the same number of hidden layers in the encoder and decoder. However, Whisper large v3 turbo contains 32 hidden layers in the encoder and only 4 hidden layers in the decoder. This PR also fixes [this issue](microsoft/onnxruntime-genai#1611).
…transformers/models/llama (#25328) Bumps [transformers](https://github.com/huggingface/transformers) from 4.48.0 to 4.52.1. Highlights from the release notes: patch fixes for flex attention and torch version edge cases, GLM-4 support, a series of Llama 4 fixes (v4.51.1–v4.51.3), and the v4.51.0 model additions (Llama 4, Phi4-Multimodal, DeepSeek-v3, Qwen3). Full list of changes: https://github.com/huggingface/transformers/compare/v4.48.0...v4.52.1 Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.12.2 to 0.12.3. Highlights from the release notes: preview autofixes for many flake8-use-pathlib rules (PTH100–PTH205), fixes for a RET504 false positive, form-feed handling before line continuations, a syntax error introduced by the TC008 fix, UP008 suppression when super receives keyword arguments, and assorted documentation fixes. Full list of changes: https://github.com/astral-sh/ruff/compare/0.12.2...0.12.3 Dependabot will merge this PR once CI passes on it, as requested by @fs-eire. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description Update the QNN default version to 2.36.1.250708. Co-authored-by: Jeff Kilpatrick <jkilpat@qti.qualcomm.com>
…ries (#25365) ### Description Add vendor id to OrtEpFactory; it's easier to get the vendor id than the name on other platforms. Update the selection policy to prefer a match on vendor id, with fallback to vendor name. Add a default ORT logger to CreateEpFactories: the OrtEpFactory currently has no way to log informational messages or issues (CreateEp is given the session logger for use by the OrtEp instance, so that part is fine). Misc cleanups: make usage of ORT_API2_STATUS and ORT_API_T consistent in onnxruntime_ep_c_api.h, and set `ort_version_supported` in some EP factories where it was missed. ### Motivation and Context Vendor id is easier to match against OrtHardwareDevice when doing auto EP selection. OrtEpFactory should have a logger. This is the last chance to clean up APIs before the 1.23 release.
- Add common rank range validation to base_op_builder.cc - Handle op-specific rank range validation for the remaining ops - Remove duplicated input_shape validation - Fix some typos along the way
#25401) ### Description Fix some test setups where both EPs being in the same build wasn't expected.
### Description The SigLIP architecture inside the vision encoder should not use a causal mask on the attention. This change fixes the Phi 4 MM accuracy issues we have seen. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description 1. Add an optional output to the CPU impl of the GQA op for storing attention scores (QK). The buffer is of shape (B, N, S, T) and can be either fp16 or fp32, depending on the type of the other inputs. 2. Add a `qk_output` attribute to GQA, which controls whether attention scores are saved before or after softmax is applied. 3. Add unit tests to cover this use case. 4. Add asserts in other EPs in case this feature is used.
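A hedged sketch of a graph node requesting the new output; the output name `qk`, the attribute value, and the head counts are illustrative assumptions, not the definitive spelling:

```python
from onnx import helper

# Assumed: an extra optional output carries the attention scores, and
# qk_output selects pre-/post-softmax scores. Values here are illustrative.
gqa = helper.make_node(
    "GroupQueryAttention",
    inputs=["query", "key", "value", "past_key", "past_value",
            "seqlens_k", "total_sequence_length"],
    outputs=["output", "present_key", "present_value", "qk"],
    domain="com.microsoft",
    num_heads=8,
    kv_num_heads=8,
    qk_output=1,
)
```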
…docker/inference/aarch64/python/cpu/scripts (#25088) Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 4.21.12 to 4.25.8. Full diff: https://github.com/protocolbuffers/protobuf/commits/v4.25.8 Note: automatic rebases have been disabled on this pull request as it has been open for over 30 days. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description Add QNN support for the Mod op when fmod = 0. ### Motivation and Context QNN doesn't support the Mod op. This PR allows QNN to process the Mod op for the fmod = 0 case. --------- Signed-off-by: Mu-Chein Hsu <quic_muchhsu@quicinc.com>
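For reference, a minimal model exercising the now-supported case (Mod with the default fmod = 0, i.e. integer modulus whose sign follows the divisor):

```python
import onnx
from onnx import TensorProto, helper

mod = helper.make_node("Mod", ["A", "B"], ["C"], fmod=0)  # fmod=0 is the default
graph = helper.make_graph(
    [mod], "mod_fmod0",
    [helper.make_tensor_value_info("A", TensorProto.INT32, [4]),
     helper.make_tensor_value_info("B", TensorProto.INT32, [4])],
    [helper.make_tensor_value_info("C", TensorProto.INT32, [4])],
)
onnx.checker.check_model(helper.make_model(graph))
```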
### Description In PoolOpBuilder, - Revise the check to use ORT macros. - Fix the function invocation for 5D cases. ### Motivation and Context Refer to #25778. The pool builder incorrectly invoked a function that computes a 4D shape on 5D input, although the function originally expects only 3D cases. Moreover, the check used assert to validate the shape, which does not fire in Release or RelWithDebInfo builds.
### Description Add QNN EP support for the ThresholdedRelu op. ### Motivation and Context ThresholdedRelu wasn't previously supported. Signed-off-by: Mu-Chein Hsu <quic_muchhsu@quicinc.com>
…ob (#25794) ### Description Set the iOS simulator runtime version to 18.5 in the mac.yml iphone_simulator job. This job uses Xcode 16.4, and according to this table, the corresponding simulator SDK version is 18.5: https://github.com/actions/runner-images/blob/da7977bf2699f44e70b7d3c3352dedb0da38db9c/images/macos/macos-15-arm64-Readme.md?plain=1#L181 ### Motivation and Context Address intermittent CI build timeouts.
### Description Add a new API `Graph_GetModelMetadata`. ### Motivation and Context The VitisAI EP converts ONNX IR to another IR suitable for AMD AI compilers. The metadata in an OrtModel contains important information produced by other tools, e.g. Olive. This API could also be used by other execution providers that need access to the same information.
### Description Add a HardSwish operator, which is x*HardSigmoid(x). Add bf16 support for HardSigmoid. ### Motivation and Context HardSwish is currently implemented as HardSigmoid + Mul in the CUDA EP. A fused HardSwish should take half the time of HardSigmoid + Mul. --------- Co-authored-by: kaiyu <kaiyu@bytedance.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
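The identity the fused kernel relies on, sketched numerically (ONNX HardSwish fixes alpha = 1/6 and beta = 0.5):

```python
import numpy as np

def hard_sigmoid(x, alpha=1.0 / 6.0, beta=0.5):
    return np.clip(alpha * x + beta, 0.0, 1.0)

def hard_swish(x):
    # Fused form computed in one pass instead of HardSigmoid followed by Mul.
    return x * hard_sigmoid(x)

x = np.linspace(-4.0, 4.0, 9).astype(np.float32)
print(hard_swish(x))  # 0 for x <= -3, x for x >= 3, smooth in between
```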
### Description Fix a build break caused by warning C4702: unreachable code. ``` onnxruntime\contrib_ops\webgpu\quantization\matmul_nbits.cc(95,1): error C2220: the following warning is treated as an error [C:\code\o3\build_main\Debug\onnxruntime_providers_webgpu.vcxproj] onnxruntime\contrib_ops\webgpu\quantization\matmul_nbits.cc(95,1): warning C4702: unreachable code [C:\code\o3\build_main\Debug\onnxruntime_providers_webgpu.vcxproj] ``` It seems the CI pipeline does not catch this.
### Description Add a build flag to enable/disable the mixed gemm cutlass kernel. To disable the kernel, append the following to the build command line: `--cmake_extra_defines onnxruntime_USE_FPA_INTB_GEMM=OFF` ### Motivation and Context The FpA IntB Gemm kernels take a long time to compile. With this option, developers can speed up the build, especially on build machines with limited memory.
* Implements `GetEPContextNodes()` * Enables usage of `AddExternalInitializersFromFilesInMemory` for models that have to be communicated as a byte stream but are larger than 2GB * Adds EP context unit tests for file, byte stream, and both embed modes. NOTE: For large models > 2GB, `embed_mode=0` must be used; `embed_mode=1` fails due to protobuf limitations. --------- Co-authored-by: Maximilian Müller <maximilianm@nvidia.com>
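For orientation, the EP-context dump is driven by session config entries; a sketch with `embed_mode=0`, which the note above says is required for >2GB models (paths are hypothetical):

```python
import onnxruntime as ort

so = ort.SessionOptions()
so.add_session_config_entry("ep.context_enable", "1")      # dump an EP-context model
so.add_session_config_entry("ep.context_embed_mode", "0")  # 0 = keep the EP blob outside the ONNX file
so.add_session_config_entry("ep.context_file_path", "model_ctx.onnx")  # hypothetical output path
```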
### Description upgrade WGSL Template to v0.1.15 Changes: - fs-eire/wgsl-template#21
) This reconfiguration is done to NOT allocate tensors with an exactly matching size. With that strategy, a tensor always triggers an allocation in the arena and never reuses memory, since the memory size has to match exactly. This became a big problem with ORT GenAI: the arena grew constantly when prompting with different prompt lengths, and no arena shrinkage was triggered to return older tensors. @skottmckay I am happy to be educated on a better usage of the allocators. Issues with this: since the arena is no longer used for workspace allocations (using reserve), it will likely not be possible in the future to allocate on a stream and immediately free memory after an enqueue call. That could have enabled workspace sharing in a multi-model pipeline very nicely. @chilo-ms can you help merge this.
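A hedged sketch of the knob in question: `arena_extend_strategy` 1 (kSameAsRequested) is the exact-size behavior described above, while 0 (kNextPowerOfTwo) rounds allocations up so chunks can be reused. The dict key follows the Python OrtArenaCfg binding; treat the exact spelling as an assumption:

```python
import onnxruntime as ort

# 0 = kNextPowerOfTwo (round up, enables reuse); 1 = kSameAsRequested (exact
# size, the strategy that caused the arena to grow without reuse here).
arena_cfg = ort.OrtArenaCfg({"arena_extend_strategy": 0})
```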
### Description
This PR provides C++ interfaces for the following:

Env: CopyTensors, CreateSharedAllocator, GetSharedAllocator, ReleaseSharedAllocator, CreateAndRegisterAllocatorV2, RegisterAllocator, UnregisterAllocator
EpDevice: EpDevice_MemoryInfo, CreateSyncStreamForEpDevice
MemoryInfo: CreateMemoryInfo_V2, MemoryInfoGetName, MemoryInfoGetId, MemoryInfoGetMemType, MemoryInfoGetType, MemoryInfoGetDeviceMemType, MemoryInfoGetVendorId
Session: SessionGetInputName, SessionGetOutputName, SessionGetMemoryInfoForInputs, SessionGetMemoryInfoForOutputs, SessionGetEpDeviceForInputs
SyncStream: SyncStream_GetHandle, ReleaseSyncStream
OrtArenaCfg: CreateArenaCfgV2
TRT: CreateTensorRTProviderOptions (and V2), UpdateTensorRTProviderOptions
SessionOptions: OrtSessionOptionsAppendExecutionProvider_CPU
Prepacked container
CUDA Options V2: OrtCUDAProviderOptionsV2, CreateCUDAProviderOptions, GetCUDAProviderOptionsByName, UpdateCUDAProviderOptionsWithValue, UpdateCUDAProviderOptions, GetCUDAProviderOptionsAsString

### Motivation and Context
Provide a way to write exception-safe code.
### Description
Added the header `<cstdint>` to `semver.h`.
### Motivation and Context
Fixes compilation on Linux systems, preventing the error:
```
/xxx/onnxruntime/core/common/semver.h:18:3: error: »uint32_t« does not name a type
18 | uint32_t major{};
19 | uint32_t minor{};
20 | uint32_t patch{};
```
…y info (#25749) ### Description This pull request introduces a new mechanism for validating compiled model compatibility with execution providers (EPs) in ONNX Runtime. It adds infrastructure for EPs to generate and store compatibility information in model metadata, and for the runtime to enforce compatibility checks during session initialization. ### Motivation and Context The APIs proposed in this PR address two requirements: 1. Apps that have an already pre-compiled model on device need a way to determine whether that pre-compiled model is still valid (given the EPs / drivers / etc. on the system). 2. Apps may have many different pre-compiled versions of a model stored on a remote server, and want to figure out which of those models they should download for the device where they are running. ### Testing Validated that the new suite of tests passes cleanly. Created a private build of this ORT and the AMD Vitis EP. I stepped through the core logic (the EP doesn't have this support wired up yet, so there is no compatibility info written out) and, for regression purposes, confirmed I could compile and run inferences through ResNet. --------- Co-authored-by: Aditya Rastogi <adityar@ntdev.microsoft.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description Disable cpuinfo for ARM64EC builds. There's an error when linking against cpuinfo built for ARM64EC when using `--use_vcpkg`. This issue was exposed by a recent change (#25228), but cpuinfo was actually not being used before for ARM64EC. The macros here don't properly account for ARM64EC: https://github.com/microsoft/onnxruntime/blob/e6d3e085cb0bb96da7c3458b97316ecca234b37a/onnxruntime/core/common/cpuid_arch_definition.h#L8-L14 ### Motivation and Context Fix a packaging pipeline failure. Revert to the old behavior of not calling cpuinfo from the CPUIDInfo ctor for ARM64EC. This PR is just a workaround; the cpuinfo link issue needs more investigation.
### Description Put the flash decoding shader into three template files. ### Motivation and Context Moving to templates will improve code readability.
@copilot the target branch is 'gh-pages', not 'main'
Copilot AI changed the title on Aug 25, 2025:
[WIP] In Dark Mode, "Interested in using other languages? See the many others we support" link is not properly visible: A11y_ONNX Runtime & Ecosystem_Runtime_User Preferences → Fix dark mode link visibility accessibility issue on ONNX Runtime website
The "Interested in using other languages? See the many others we support →" link on the ONNX Runtime website homepage has poor contrast in dark mode, failing WCAG accessibility standards with a contrast ratio of only 1.37:1 (minimum requirement is 4.5:1).
Problem
The link uses a hardcoded `text-blue-800` Tailwind class (#1e40af), which appears as very dark blue text against the dark background in dark mode, making it nearly invisible to users.

Solution
Replaced the hardcoded color with responsive Tailwind classes that adapt to both light and dark themes:
Before: `text-blue-800`
After: `text-blue-700 dark:text-blue-300`
Changes Made
- `text-blue-700` (#1d4ed8) for good contrast on light backgrounds
- `dark:text-blue-300` (#93c5fd) for good contrast on dark backgrounds

Impact
The fix is provided as website source files and a patch ready for deployment to the gh-pages branch.
Fixes #24964.