End-to-end tutorials covering the LLM customization lifecycle using NeMo AutoModel.
| Tutorial | Dataset | Description | Launch on Brev |
|---|---|---|---|
| Domain Adaptive Pre-Training (DAPT) | Domain-specific text corpus | Continued pre-training of a foundation model on domain data to improve in-domain performance (inspired by ChipNeMo). | 🚧 |
| Supervised Fine-Tuning (SFT) | SQuAD | Full-parameter SFT to adapt a pre-trained model to follow instructions. | 🚧 |
| Parameter-Efficient Fine-Tuning (PEFT) | SQuAD | Memory-efficient LoRA fine-tuning for task adaptation (see the LoRA sketch below the table). | 🚧 |
| Evaluation | Standard benchmarks (MMLU, HellaSwag, IFEval, etc.) | Evaluate AutoModel checkpoints with lm-evaluation-harness. | 🚧 |
| Reasoning SFT | Reasoning instruction data (OpenAI chat format) | Fine-tune a model to selectively enable chain-of-thought reasoning via system prompt control. | 🚧 |
| Nemotron Parse Fine-Tuning | Invoices | Fine-tune Nemotron Parse v1.1 for structured document extraction. | 🚧 |
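
The PEFT tutorial trains small LoRA adapters instead of the full model weights. The following is a minimal sketch of that technique using the Hugging Face `transformers` and `peft` libraries; the model name, target modules, and hyperparameters are illustrative assumptions, and the tutorial's actual NeMo AutoModel recipe may differ.

```python
# Minimal LoRA fine-tuning sketch with Hugging Face transformers + peft.
# Model name, target modules, and hyperparameters are placeholders; the
# AutoModel tutorial may use different recipes and entry points.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.2-1B"  # assumption: any causal LM works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA injects small trainable low-rank matrices into the attention
# projections while the base weights stay frozen -- this is what keeps
# memory usage low compared with full-parameter SFT.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # which layers receive adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params
```

Because only the adapter weights receive gradients, optimizer state and gradient memory shrink dramatically relative to full-parameter SFT, which is what makes LoRA fit on smaller GPUs.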
Prerequisites:

- NeMo AutoModel installed (see the AutoModel README for setup instructions).
- NVIDIA GPU(s) with sufficient memory (specific requirements noted per tutorial).
- Hugging Face account and API token for gated models (e.g., Llama); a login snippet follows the list.
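
Gated checkpoints require authentication before the weights can be downloaded. Here is a minimal sketch using the `huggingface_hub` API; the token value is a placeholder for your own.

```python
# Authenticate with Hugging Face so gated checkpoints (e.g., Llama) can be
# downloaded. The token string below is a placeholder; prefer setting the
# HF_TOKEN environment variable over hard-coding a real token.
from huggingface_hub import login

login(token="hf_...")  # equivalent to running `huggingface-cli login` in a shell
```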
These tutorials cover four stages of the LLM customization lifecycle:
    Foundation Model ──> DAPT ──> SFT / PEFT ──> Evaluation
            |
            └──────────> Reasoning SFT ──────────> Evaluation
- DAPT: Inject domain knowledge via continued pre-training.
- SFT / PEFT: Teach the model to follow instructions or solve specific tasks.
- Reasoning SFT: Teach the model chain-of-thought reasoning with on/off control.
- Evaluation: Measure quality on standard benchmarks after each stage (see the harness sketch below).
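
As a concrete illustration of the Evaluation stage, here is a minimal sketch of the lm-evaluation-harness Python API. The checkpoint path and task list are placeholders, and the tutorial may drive the harness through its CLI (`lm_eval --model hf ...`) rather than from Python.

```python
# Minimal lm-evaluation-harness sketch (pip install lm-eval).
# Checkpoint path and task list are placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face-compatible backend; loads exported checkpoints
    model_args="pretrained=/path/to/finetuned-checkpoint",
    tasks=["mmlu", "hellaswag", "ifeval"],
    batch_size=8,
)
print(results["results"])  # per-task metrics such as accuracy
```
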
For post-training with reinforcement learning or preference optimization (RLHF, PPO, DPO), see NeMo-RL.