
NVIDIA Model Optimizer Changelog (Windows)

0.41 (TBD)

Bug Fixes

  • Fix ONNX 1.19 compatibility issues with CuPy during ONNX INT4 AWQ quantization. ONNX 1.19 represents INT4 tensors with ml_dtypes.int4 instead of numpy.int8, which caused CuPy failures.
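
The workaround behind this fix can be sketched as a dtype normalization step. This is a minimal illustration, not Model Optimizer's actual implementation; the helper name is hypothetical, and plain int8 arrays stand in for ml_dtypes.int4 in case ml_dtypes is not installed:

```python
import numpy as np

def to_cupy_compatible(weights: np.ndarray) -> np.ndarray:
    """Hypothetical helper: ONNX 1.19 returns INT4 tensors using the
    ml_dtypes.int4 extension dtype, which CuPy cannot consume. Casting
    to plain numpy int8 restores compatibility; it is lossless because
    every int4 value [-8, 7] fits in int8."""
    if weights.dtype != np.int8:            # e.g. ml_dtypes.int4
        weights = weights.astype(np.int8)
    return weights

# Simulated INT4 values (already in [-8, 7]), stored as int8 here.
w = np.array([-8, -1, 0, 7], dtype=np.int8)
print(to_cupy_compatible(w).dtype)  # int8
```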


0.33 (2025-07-21)

New Features

  • Model Optimizer for Windows now supports the NvTensorRtRtx execution provider.

0.27 (2025-04-30)

New Features

  • New LLMs such as DeepSeek are now supported with ONNX INT4 AWQ quantization on Windows. Refer to the Windows Support Matrix for details about supported features and models.
  • Model Optimizer for Windows now supports ONNX INT8 and FP8 quantization (W8A8) of SAM2 and Whisper models. See the example scripts to get started with quantizing these models.
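
The W8A8 scheme quantizes both weights and activations to 8 bits. The core idea can be sketched as symmetric per-tensor INT8 quantization; this is a simplified illustration under that assumption, not Model Optimizer's calibration pipeline:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization sketch: map the value
    range [-max|x|, max|x|] onto [-127, 127] with a single scale."""
    scale = float(np.max(np.abs(x))) / 127.0 or 1.0  # avoid scale of 0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from INT8 codes and the scale."""
    return q.astype(np.float32) * scale

x = np.array([0.1, -0.5, 0.9], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize_int8(q, s)
print(np.max(np.abs(x - x_hat)))  # small quantization error
```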

0.19 (2024-11-18)

New Features

  • This version includes experimental features such as TensorRT deployment of ONNX INT4 models, and PyTorch quantization and sparsity. These are currently unverified on Windows.
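
The sparsity feature mentioned above targets the 2:4 structured pattern that TensorRT's sparse kernels accept: in every group of 4 consecutive weights, at most 2 are nonzero. A minimal magnitude-based pruning sketch (illustrative only, assuming the last dimension is divisible by 4):

```python
import numpy as np

def prune_2_4(w: np.ndarray) -> np.ndarray:
    """Zero the 2 smallest-magnitude weights in each group of 4,
    producing the 2:4 structured-sparsity pattern."""
    groups = w.reshape(-1, 4)
    # Indices of the 2 smallest-magnitude entries in each group.
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    pruned = groups.copy()
    np.put_along_axis(pruned, drop, 0.0, axis=1)
    return pruned.reshape(w.shape)

w = np.array([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.6, -0.3],
             dtype=np.float32)
print(prune_2_4(w))  # exactly 2 nonzeros survive in each group of 4
```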