Releases: NVIDIA-NeMo/Export-Deploy

NVIDIA NeMo-Export-Deploy 0.2.1

22 Oct 23:36
950000c

  • Bug fixes for HuggingFace model deployment (#459)
    • Fixed HuggingFace deployable implementations for both Triton and Ray Serve backends
    • Improved tokenizer handling in HuggingFace deployment scripts
  • Minor fixes for Ray deployment (#464)
    • Additional bug fixes in Ray deployment utilities

NVIDIA NeMo-Export-Deploy 0.2.0

09 Oct 20:01
726695b

  • Megatron-LM and Megatron-Bridge model deployment support with Triton Inference Server and Ray Serve
  • Multi-node, multi-instance Ray Serve-based deployment for NeMo 2, Megatron-Bridge, and Megatron-LM models
  • vLLM export updated to use the NeMo->HF->vLLM export path
  • Multi-Modal deployment for NeMo 2 models with Triton Inference Server
  • NeMo Retriever Text Reranking ONNX and TensorRT export support

NVIDIA NeMo-Export-Deploy 0.2.0rc2

18 Aug 06:32
7867110

Pre-release

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc2 (2025-08-18)

NVIDIA NeMo-Export-Deploy 0.1.1

15 Aug 08:24
ca72da9

ci: Mock DCO check

Signed-off-by: oliver könig <[email protected]>

NVIDIA NeMo-Export-Deploy 0.2.0rc1

14 Aug 15:54
62485cc

Pre-release

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc1 (2025-08-14)

NVIDIA NeMo-Export-Deploy 0.2.0rc0

03 Aug 16:48
657c525

Pre-release

Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc0 (2025-08-03)

NVIDIA NeMo-Export-Deploy 0.1.0

30 Jul 16:01
b6cf209

  • NeMo Export-Deploy Release
  • Pip installers for export and deploy
  • Ray Serve support for multi-instance deployment
  • TensorRT-LLM PyTorch backend
  • Megatron-Core (mcore) inference optimizations