NVIDIA Model Optimizer Changelog (Windows)

0.41 (TBD)

Bug Fixes

Fix ONNX 1.19 compatibility issues with CuPy during ONNX INT4 AWQ quantization. ONNX 1.19 uses ml_dtypes.int4 instead of numpy.int8, which caused CuPy failures.

New Features

Add support for ONNX mixed-precision weight-only quantization using INT4 and INT8 precisions. Refer to the quantization example for GenAI LLMs.
Add support for quantizing some diffusion models on Windows. Refer to the example script for details.
Add Perplexity and KL-Divergence accuracy benchmarks.
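
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of their standard textbook definitions. It is not the benchmark code shipped with Model Optimizer; the function names `perplexity` and `kl_divergence` and the `eps` parameter are hypothetical and chosen only for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-mean of the natural-log token probabilities)."""
    if not token_log_probs:
        raise ValueError("token_log_probs must not be empty")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) = sum_i p_i * ln(p_i / q_i) for two discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    (an illustrative choice) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )
```

In an accuracy comparison, lower perplexity and a KL divergence closer to zero between the baseline and quantized output distributions would both suggest that quantization changed the model's predictions less.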

0.33 (2025-07-21)

New Features

Model Optimizer for Windows now supports the NvTensorRtRtx execution provider.

0.27 (2025-04-30)

New Features

New LLMs such as DeepSeek are now supported with ONNX INT4 AWQ quantization on Windows. Refer to the Windows Support Matrix for details about supported features and models.
Model Optimizer for Windows now supports ONNX INT8 and FP8 quantization (W8A8) of SAM2 and Whisper models. See the example scripts to get started with quantizing these models.

0.19 (2024-11-18)

New Features

This is the first official release of Model Optimizer for Windows.
ONNX INT4 Quantization: :meth:`modelopt.onnx.quantization.quantize_int4 <modelopt.onnx.quantization.int4.quantize>` now supports ONNX INT4 quantization for DirectML and TensorRT* deployment. See :ref:`Support_Matrix` for details about supported features and models.
LLM Quantization with Olive: Enabled LLM quantization through Olive, streamlining model optimization workflows. Refer to the Olive example.
DirectML Deployment Guide: Added a DirectML (DML) deployment guide. Refer to the :ref:`Onnxruntime_Deployment` guide for details.
MMLU Benchmark for Accuracy Evaluations: Introduced MMLU benchmarking for accuracy evaluation of ONNX models on DirectML (DML).
Published quantized ONNX models collection: Quantized ONNX models are now available in the NVIDIA collections on Hugging Face.

* This version includes experimental features such as TensorRT deployment of ONNX INT4 models, PyTorch quantization and sparsity. These are currently unverified on Windows.
