Releases: NVIDIA-NeMo/Automodel

NVIDIA NeMo-Automodel 0.1.2

23 Oct 19:24
45ad729

  • Features:

    • Added support for limiting the number of samples loaded by ColumnMappedDataset
  • Bug Fixes (step scheduler):

    • Switched the step scheduler to zero-based indexing
    • Epoch length now accounts for gradient-accumulation steps
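The two scheduler fixes interact: with zero-based indexing, the last optimizer step of an epoch is `steps_per_epoch - 1`, and `steps_per_epoch` must be computed in optimizer steps, not micro-batches. A minimal sketch of that arithmetic (hypothetical helper names, not the Automodel API):

```python
import math

def steps_per_epoch(num_batches: int, grad_accum_steps: int) -> int:
    # One optimizer step consumes grad_accum_steps micro-batches,
    # so the epoch length in optimizer steps shrinks accordingly.
    return math.ceil(num_batches / grad_accum_steps)

def is_last_step_of_epoch(step: int, num_batches: int, grad_accum_steps: int) -> bool:
    # Zero-based: steps run from 0 to steps_per_epoch - 1 inclusive.
    return step == steps_per_epoch(num_batches, grad_accum_steps) - 1
```

For example, 1000 micro-batches with 8 accumulation steps yields 125 optimizer steps per epoch, and step 124 (not 125) is the epoch boundary.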

NVIDIA NeMo-Automodel 0.1.0

08 Oct 14:18
7146809

New Features

  • Pretraining support for:
    • Models under 40B parameters with PyTorch FSDP2
    • Larger models via PyTorch pipeline parallelism (PP)
    • Tensor parallelism (TP) for models that provide a TP plan
    • Large MoE models via custom implementations
  • Knowledge distillation for LLMs (requires same tokenizer)
  • FP8 with torchao (requires torch.compile)
  • Parallelism
    • HSDP with FSDP2
    • Auto Pipelining Support
  • Checkpointing
    • Pipeline support (load and save)
    • Parallel load with meta device
  • Data
    • ColumnMappedDataset for single-turn SFT
    • Pretraining data: Megatron-Core and nanoGPT-compatible formats
  • Performance (see the [performance summary](https://docs.nvidia.com/nemo/automodel/latest/performance-summary.html))
    • Pretraining benchmark for large user-defined MoE models
    • Fast DeepSeek v3 implementation with DeepEP
  • Megatron FSDP support
  • Packed sequence support
  • Triton kernels for LoRA
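Of the features above, the knowledge-distillation constraint is worth spelling out: requiring the same tokenizer means teacher and student share one vocabulary, so their logits align position-for-position and a KL-divergence loss is well defined. A minimal sketch of that loss in plain Python (illustrative only, not the Automodel implementation):

```python
import math

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    # Same tokenizer => both models emit logits over the same vocabulary,
    # so the two lists align index-for-index.
    def softmax(xs, t):
        m = max(xs)
        exps = [math.exp((x - m) / t) for x in xs]
        s = sum(exps)
        return [e / s for e in exps]

    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    # KL(p || q), scaled by T^2 as is conventional for distillation losses.
    return temperature ** 2 * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero when the student matches the teacher exactly and strictly positive otherwise; with mismatched tokenizers the index-wise `zip` would pair unrelated tokens, which is why the same-tokenizer requirement exists.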

NVIDIA NeMo-Automodel 0.1.0rc0

17 Sep 13:59
d36402d

Pre-release

Prerelease: NVIDIA NeMo-Automodel 0.1.0rc0 (2025-09-17)