diff --git a/src/routes/roadmap/+page.svelte b/src/routes/roadmap/+page.svelte
index c1b28dcb4856f..34124963de845 100644
--- a/src/routes/roadmap/+page.svelte
+++ b/src/routes/roadmap/+page.svelte
@@ -2,7 +2,7 @@
 	let description =
 		'ONNX Runtime Release Roadmap - find the latest release information for ONNX Runtime.';
 	let keywords =
-		'onnx runtime, onnx runtime roadmap, onnx runtime release, onnx runtime 1.20, onnx runtime 1.21, onnx runtime 1.20.1';
+		'onnx runtime, onnx runtime roadmap, onnx runtime release, onnx runtime 1.21, onnx runtime 1.22, onnx runtime 1.23';
@@ -24,6 +24,8 @@

+ +
Previous release
-
1.20.0
-
Release date: 11/1/2024
+
1.21.0
+
Release date: Feb 2025
-
-
In-Progress Release
-
1.20.1
-
Release date: 11/20/2024
+
Current Release
+
1.22.0
+
Release date: May 2025
-
Next release
-
1.21
-
Release date: Feb. 2025
+
1.23
+
Release date: Aug 2025
@@ -94,44 +94,17 @@

Announcements

-

Versioning Updates

+

Major Updates in 1.22

- We are planning to upgrade ONNX Runtime support for the following (where the first value is the
- highest version previously supported and the second value is the version support that will be
- added in ORT 1.20.1):
+ The current release (1.22.0) includes the following key features and updates:

- -

Major Updates

-

- In addition to various bug fixes and performance improvements, ORT 1.20.1 will include the
- following updates:
-

-

Feature Requests

@@ -163,154 +136,203 @@ Note: All timelines and features listed on this page are subject to change.

-

ONNX Runtime 1.20.1

+

ONNX Runtime 1.21

- Tentative release date: 11/20/2024
+ Release date: February 2025

- +
Announcements
-
    -
  • - The onnxruntime-gpu v1.10.0 will be removed from PyPI. We have hit our PyPI
    - project size limit for onnxruntime-gpu, so we will be removing our oldest package version
    - to free up the necessary space.
  • -
+

No large announcements of note this release! We've made a lot of small refinements to streamline your ONNX Runtime experience.

- +
- -
Build System & Packages
+ +
GenAI & Advanced Model Features
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

+

Enhanced Decoding & Pipeline Support

+
    +
  • Added "chat mode" support for CPU, GPU, and WebGPU.
  • +
  • Provided support for decoder model pipelines.
  • +
  • Added support for Java API for MultiLoRA.
  • +
+

API & Compatibility Updates

+
    +
  • Chat mode introduced breaking changes in the API (see migration guide).
  • +
+

Bug Fixes for Model Output

+
    +
  • Fixed Phi series garbage output issues with long prompts.
  • +
  • Resolved gibberish issues with top_k on CPU.
  • +
- +
- -
Core
+ +
Core Refinements
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

+
    +
  • Reduced default logger usage for improved efficiency (#23030).
  • +
  • Fixed a visibility issue in threadpool (#23098).
  • +
- +
- -
Performance
+ +
Execution Provider (EP) Updates
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

+

General

+
    +
  • Removed TVM EP from the source tree (#22827).
  • +
  • Marked NNAPI EP for deprecation (following Google's deprecation of NNAPI).
  • +
  • Fixed a DLL delay-loading issue that impacted the usability of the WebGPU EP and DirectML EP on Windows (#23111, #23227).
  • +
+ +

TensorRT EP

+
    +
  • Added support for TensorRT 10.8.
  • +
  • Users of the open-source onnx-tensorrt parser: please check the documentation for requirements.
  • +
  • Assigned DDS ops (NMS, RoiAlign, NonZero) to TensorRT by default.
  • +
  • Introduced option trt_op_types_to_exclude to exclude specific ops from TensorRT assignment.
  • +
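The new trt_op_types_to_exclude option is passed as a TensorRT EP provider option. A minimal sketch, assuming a GPU build of onnxruntime with the TensorRT EP (the model path and excluded op names are illustrative only):

```python
# Provider options for the TensorRT EP. The op types listed here are the DDS
# ops that 1.21 assigns to TensorRT by default; excluding them restores the
# previous placement behavior.
trt_options = {
    "trt_op_types_to_exclude": "NonMaxSuppression,NonZero,RoiAlign",
}
providers = [
    ("TensorrtExecutionProvider", trt_options),
    "CPUExecutionProvider",  # fallback provider for excluded ops
]

# Session creation sketch (requires onnxruntime-gpu and a local model.onnx):
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=providers)
```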
+ +

CUDA EP

+
    +
  • Added a python API preload_dlls to coexist with PyTorch.
  • +
  • Miscellaneous enhancements for Flux model inference.
  • +
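The preload_dlls API above is meant to be called before importing PyTorch, so that ONNX Runtime's CUDA/cuDNN DLLs are resolved first. A guarded sketch (the getattr guard is defensive, since the API only exists from 1.21 onward):

```python
import importlib.util

def maybe_preload_ort_dlls() -> bool:
    """Call onnxruntime.preload_dlls() when available; return True if it ran.

    Sketch only: assumes onnxruntime >= 1.21; on older versions or when
    onnxruntime is not installed, this is a no-op.
    """
    if importlib.util.find_spec("onnxruntime") is None:
        return False
    import onnxruntime
    preload = getattr(onnxruntime, "preload_dlls", None)
    if preload is None:
        return False
    preload()  # load ORT's CUDA/cuDNN DLLs before torch loads its own copies
    return True

ran = maybe_preload_ort_dlls()
# import torch  # import torch afterwards so it reuses the loaded DLLs
```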
+ +

QNN EP

+
    +
  • Introduced QNN shared memory support.
  • +
  • Improved performance for AI Hub models.
  • +
  • Added support for QAIRT/QNN SDK 2.31.
  • +
  • Added Python 3.13 package.
  • +
  • QNN EP is now built as a shared library/DLL by default. To retain previous build behavior, use build option --use_qnn static_lib.
  • +
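Since QNN EP now defaults to a shared library, retaining the pre-1.21 static-library behavior when building from source looks roughly like the following (a sketch; the SDK path is a placeholder and other required flags are omitted):

```shell
# Sketch: build ONNX Runtime with the QNN EP as a static library instead of
# the new shared-library default.
./build.sh --config Release \
  --use_qnn static_lib \
  --qnn_home /path/to/qairt-sdk
```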
+ +

DirectML EP

+
    +
  • Updated DirectML version from 1.15.2 to 1.15.4 (#22635).
  • +
+ +

OpenVINO EP

+
    +
  • Introduced OpenVINO EP Weights Sharing feature.
  • +
  • Added support for various contrib Ops in OVEP: +
      +
    • SkipLayerNormalization, MatMulNBits, FusedGemm, FusedConv, EmbedLayerNormalization, BiasGelu, Attention, DynamicQuantizeMatMul, FusedMatMul, QuickGelu, SkipSimplifiedLayerNormalization
    • +
    +
  • +
+ +

VitisAI EP

+
    +
  • Miscellaneous bug fixes and improvements.
  • +
- +
- -
Quantization
+ +
Mobile Platform Enhancements
+

CoreML Updates

    -
  • Introduce get_int_qdq_config() helper to get QDQ configurations (#22677).
  • -
  • Update QDQ Pad, Slice, Softmax (#22676).
  • -
  • Handle input models with pre-quantized weights (#22633).
  • -
  • Prevent int32 quantized bias from clipping by adjusting the weight's scale (#22020).
  • +
  • Added support for caching generated CoreML models.
- +
- -
EPs
+ +
Extensions & Tokenizer Improvements
-

CPU

+

Expanded Tokenizer Support

    -
  • Fix CPU FP16 implementations for the following kernels: LayerNormalization, SimplifiedLayerNormalization, SkipLayerNormalization, SkipSimplifiedLayerNormalization.
  • +
  • Now supports more tokenizer models, including ChatGLM, Baichuan2, Phi-4, etc.
  • +
  • Added full Phi-4 pre/post-processing support for text, vision, and audio.
  • +
  • Introduced RegEx pattern loading from tokenizer.json.
-

QNN

+ +

Image Codec Enhancements

    -
  • QNN SDK 2.28.x support.
  • +
  • ImageCodec now links to native APIs if available; otherwise, falls back to built-in libraries.
-

DirectML

+ +

Unified Tokenizer API

    -
  • DirectML 1.16 support.
  • +
  • Introduced a new tokenizer op schema to unify the tokenizer codebase.
  • +
  • Added support for loading tokenizer data from a memory blob in the C API.
- +
- -
Mobile
+ +
Infrastructure & Build Improvements
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

+

CMake File Changes

+
    +
  • CMake Version: Increased the minimum required CMake version from 3.26 to 3.28. Added support for CMake 4.0.
  • +
  • Python Version: Increased the minimum required Python version from 3.8 to 3.10 for building ONNX Runtime from source.
  • +
  • Improved VCPKG support.
  • +
  • Added options for WebGPU EP: +
      +
    • onnxruntime_USE_EXTERNAL_DAWN
    • +
    • onnxruntime_CUSTOM_DAWN_SRC_PATH
    • +
    • onnxruntime_BUILD_DAWN_MONOLITHIC_LIBRARY
    • +
    • onnxruntime_ENABLE_PIX_FOR_WEBGPU_EP
    • +
    • onnxruntime_ENABLE_DAWN_BACKEND_VULKAN
    • +
    • onnxruntime_ENABLE_DAWN_BACKEND_D3D12
    • +
    +
  • +
  • Added cmake option onnxruntime_BUILD_QNN_EP_STATIC_LIB for building with QNN EP as a static library.
  • +
  • Removed cmake option onnxruntime_USE_PREINSTALLED_EIGEN.
  • +
  • Fixed a build issue with Visual Studio 2022 17.3 (#23911).
  • +
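The WebGPU-related CMake options listed above can be passed through the build script via --cmake_extra_defines. A sketch, assuming a source checkout (the Dawn path is a placeholder):

```shell
# Sketch: build the WebGPU EP against a custom Dawn source tree with the
# Vulkan backend enabled.
./build.sh --config Release --use_webgpu \
  --cmake_extra_defines onnxruntime_CUSTOM_DAWN_SRC_PATH=/path/to/dawn \
  --cmake_extra_defines onnxruntime_ENABLE_DAWN_BACKEND_VULKAN=ON
```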
+ +

Modernized Build Tools

+
    +
  • Now using VCPKG for most package builds.
  • +
  • Upgraded Gradle from 7.x to 8.x.
  • +
  • Updated JDK from 11 to 17.
  • +
  • Enabled onnxruntime_USE_CUDA_NHWC_OPS by default for CUDA builds.
  • +
  • Added support for WASM64 (build from source; no package published).
  • +
+ +

Dependency Cleanup

+
    +
  • Removed Google's nsync from dependencies.
  • +
+ +

Others

+
    +
  • Updated the Node.js installation script to support network proxy usage (#23231).
  • +
- +
- +
Web
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

-
-
- - -
- -
generate() API
-
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

-
-
- - -
- -
Extensions
-
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

-
-
- - -
- -
Olive
-
-

No features planned for 1.20.1. Stay tuned for 1.21 features.

+

No updates of note.

-
+