Announcements

There will be breaking API changes in this release. We'll keep you posted!
Versioning Updates

QNN SDK 2.27 --> 2.28
DirectML 1.15.2 --> 1.16
ONNX 1.17 support will be included in a future release.

Major Updates in 1.22

The current release (1.22.0) includes the following key features and updates:

Python quantization tool updates.
New QNN SDK version support.
ORT API changes - Improved API interface for better usability.
New WebGPU Execution Provider - Enhanced support for web-based ML workloads.
Feature Requests
Note: All timelines and features listed on this page are subject to change.
ONNX Runtime 1.21

Release date: February 2025
Announcements
No large announcements of note this release! We've made a lot of small refinements to streamline your ONNX Runtime experience.

GenAI & Advanced Model Features
Enhanced Decoding & Pipeline Support

Added "chat mode" support for CPU, GPU, and WebGPU.
Provided support for decoder model pipelines.
Added support for Java API for MultiLoRA.

API & Compatibility Updates

Chat mode introduced breaking changes in the API (see migration guide).

Bug Fixes for Model Output

Fixed Phi series garbage output issues with long prompts.
Resolved gibberish issues with top_k on CPU.
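The top_k fix above concerns sampling-time token filtering. As a general illustration of what correct top-k filtering does (plain NumPy, not ONNX Runtime internals; `top_k_sample` is a hypothetical helper name), only the k highest logits may ever be drawn:

```python
import numpy as np

def top_k_sample(logits, k, rng):
    """Sample a token id from the k highest logits; all others are masked out."""
    logits = np.asarray(logits, dtype=np.float64)
    kth_largest = np.sort(logits)[-k]                 # k-th largest logit
    masked = np.where(logits >= kth_largest, logits, -np.inf)
    probs = np.exp(masked - masked.max())             # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

rng = np.random.default_rng(0)
samples = {top_k_sample([2.0, 0.5, -1.0, 3.0], k=2, rng=rng) for _ in range(50)}
print(samples)  # only indices 0 and 3 can ever be drawn
```

A broken top-k mask (e.g. one that lets masked logits leak through with nonzero probability) is exactly the kind of defect that surfaces as gibberish output.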

Core Refinements

Reduced default logger usage for improved efficiency (#23030).
Added support for caching generated CoreML models.
Extensions & Tokenizer Improvements

Expanded Tokenizer Support

Now supports more tokenizer models, including ChatGLM, Baichuan2, Phi-4, etc.
Added full Phi-4 pre/post-processing support for text, vision, and audio.
Introduced RegEx pattern loading from tokenizer.json.
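In the Hugging Face tokenizer.json format, the pre-tokenizer can carry a regex pattern under `pre_tokenizer.pattern.Regex`. A minimal sketch of reading and applying such a pattern (hypothetical file contents; illustrative only, not the onnxruntime-extensions implementation):

```python
import json
import re

# A minimal tokenizer.json-style document whose pre-tokenizer carries a regex.
tokenizer_json = json.loads(r'''
{
  "pre_tokenizer": { "type": "Split", "pattern": { "Regex": "\\w+|\\S" } }
}
''')

pattern = re.compile(tokenizer_json["pre_tokenizer"]["pattern"]["Regex"])
pieces = pattern.findall("Hello, world!")
print(pieces)  # ['Hello', ',', 'world', '!']
```

Note that patterns stored in JSON are double-escaped (`\\w` in the file becomes `\w` after parsing), which is easy to get wrong when loading them by hand.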

Image Codec Enhancements

ImageCodec now links to native APIs if available; otherwise, falls back to built-in libraries.

Unified Tokenizer API

Introduced a new tokenizer op schema to unify the tokenizer codebase.
Added support for loading tokenizer data from a memory blob in the C API.
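Loading from a memory blob matters when tokenizer assets ship inside an archive or are generated at runtime rather than living on disk. A language-agnostic sketch of the idea (plain Python with a hypothetical helper, not the actual C API):

```python
import io
import json

def load_tokenizer_config_from_blob(blob: bytes) -> dict:
    """Parse tokenizer data straight from an in-memory buffer -- no file path."""
    return json.load(io.BytesIO(blob))

# Hypothetical blob, e.g. read from a zip archive or an embedded resource.
blob = b'{"model": {"type": "BPE"}, "added_tokens": []}'
config = load_tokenizer_config_from_blob(blob)
print(config["model"]["type"])  # BPE
```

The C API variant follows the same shape: the caller hands over a pointer and length instead of a filesystem path.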

Infrastructure & Build Improvements

CMake File Changes

CMake Version: Increased the minimum required CMake version from 3.26 to 3.28. Added support for CMake 4.0.
Python Version: Increased the minimum required Python version from 3.8 to 3.10 for building ONNX Runtime from source.
Improved VCPKG support.
Added options for WebGPU EP:
  onnxruntime_USE_EXTERNAL_DAWN
  onnxruntime_CUSTOM_DAWN_SRC_PATH
  onnxruntime_BUILD_DAWN_MONOLITHIC_LIBRARY
  onnxruntime_ENABLE_PIX_FOR_WEBGPU_EP
  onnxruntime_ENABLE_DAWN_BACKEND_VULKAN
  onnxruntime_ENABLE_DAWN_BACKEND_D3D12
Added cmake option onnxruntime_BUILD_QNN_EP_STATIC_LIB for building with QNN EP as a static library.
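These options are normally passed through the repo's build script via --cmake_extra_defines rather than by invoking CMake directly. A hypothetical sketch (flag availability depends on your platform and checkout; verify against your branch's build.sh before use):

```shell
# WebGPU EP build with extra Dawn-related CMake defines (illustrative values).
./build.sh --config Release --parallel --use_webgpu \
  --cmake_extra_defines onnxruntime_BUILD_DAWN_MONOLITHIC_LIBRARY=ON \
                        onnxruntime_ENABLE_DAWN_BACKEND_VULKAN=ON

# Build with the QNN EP compiled as a static library.
./build.sh --config Release --parallel --use_qnn \
  --cmake_extra_defines onnxruntime_BUILD_QNN_EP_STATIC_LIB=ON
```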