Megatron Discussions

This directory contains in-depth guides, tutorials, and discussions on using and optimizing Megatron for a range of use cases.

Available Guides

Training Guides

  • Megatron-FSDP User Guide

    A practical guide to enabling Megatron-FSDP training. It includes a quick-start example for DeepSeek-V3, the required and recommended configurations, and instructions for converting checkpoints from torch_dist to fsdp_dtensor.

  • Spectral Descent: Orthogonalizing Momentum via Newton-Schulz Iteration

    A discussion of Muon and related higher-order optimizers in Megatron Core, covering layer-wise distributed optimizers, tensor-parallel Newton-Schulz execution modes, and performance results on NVIDIA GB300.

Contributing

If you'd like to contribute a guide or tutorial, please follow this structure:

  1. Create a new directory: docs/discussions/your-guide-name/
  2. Add your main guide: docs/discussions/your-guide-name/your-guide-name.md
  3. Create an images directory: docs/discussions/your-guide-name/images/
  4. Update this README.md with a link to your guide

Each guide should be self-contained with its own images and supporting files.
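
Steps 1 to 3 of the structure above can be scaffolded from the command line. The following is a minimal sketch, assuming it is run from the repository root; `your-guide-name` is the placeholder used in the steps above, and the `GUIDE_NAME` and `GUIDE_DIR` variables are introduced here only for illustration.

```shell
# Placeholder guide name (an assumption for illustration); replace with your own.
GUIDE_NAME="your-guide-name"
GUIDE_DIR="docs/discussions/${GUIDE_NAME}"

# Steps 1 and 3: create the guide directory together with its images subdirectory.
mkdir -p "${GUIDE_DIR}/images"

# Step 2: create an empty main guide file named after the directory.
touch "${GUIDE_DIR}/${GUIDE_NAME}.md"

# Step 4 is manual: add a link to the new guide in this README.md.
```

Deriving the file name from the directory name keeps the two in sync, which matches the `your-guide-name/your-guide-name.md` pattern in step 2.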