Releases · NVIDIA-NeMo/Export-Deploy
NVIDIA NeMo-Export-Deploy 0.2.1
NVIDIA NeMo-Export-Deploy 0.2.0
- Megatron-LM and Megatron-Bridge model deployment support with Triton Inference Server and Ray Serve
- Multi-node, multi-instance Ray Serve-based deployment for NeMo 2, Megatron-Bridge, and Megatron-LM models
- Updated vLLM export to use the NeMo -> HF -> vLLM export path
- Multi-modal deployment for NeMo 2 models with Triton Inference Server
- NeMo Retriever Text Reranking ONNX and TensorRT export support
NVIDIA NeMo-Export-Deploy 0.2.0rc2
Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc2 (2025-08-18)
NVIDIA NeMo-Export-Deploy 0.1.1
- ci: Mock DCO check
NVIDIA NeMo-Export-Deploy 0.2.0rc1
Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc1 (2025-08-14)
NVIDIA NeMo-Export-Deploy 0.2.0rc0
Prerelease: NVIDIA NeMo-Export-Deploy 0.2.0rc0 (2025-08-03)
NVIDIA NeMo-Export-Deploy 0.1.0
- Initial NeMo Export-Deploy release
- Pip installers for export and deploy
- Ray Serve support for multi-instance deployment
- TensorRT-LLM PyTorch backend
- Megatron Core (mcore) inference optimizations