A minimal supervised fine-tuning (SFT) setup for QnA-style dialogue models, building upon Karpathy’s nanoGPT.
This project demonstrates how to implement prompt masking and a chat-style template for fine-tuning small GPT models on conversational datasets such as Databricks Dolly-15k.
- QnA prompt–response formatting (`User:`/`Assistant:` template)
- Label masking to train only on assistant responses (see the sketch below)
- Simple PyTorch + Datasets data pipeline
- Configurable logging with Weights & Biases
- AMP training support for faster convergence
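The core idea is the prompt formatting and label masking from the list above. Below is a minimal sketch of how that could look, assuming the Hugging Face GPT-2 tokenizer and the Dolly-15k schema (`instruction`, `context`, `response`); the names here are illustrative and not the actual API of `nanosft.py`:

```python
# Illustrative sketch (not the repository's exact code): format Dolly-15k records
# with the User:/Assistant: template, mask prompt tokens in the labels, and build
# a padded DataLoader.
import torch
from datasets import load_dataset
from transformers import GPT2TokenizerFast

IGNORE_INDEX = -100  # ignored by PyTorch's cross-entropy loss

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def build_example(record, max_len=512):
    # Optional context is appended to the instruction when present.
    user_text = record["instruction"]
    if record.get("context"):
        user_text += "\n\n" + record["context"]
    prompt = f"User: {user_text}\nAssistant: "
    response = record["response"] + tokenizer.eos_token

    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + response_ids)[:max_len]
    # Mask the prompt so only assistant tokens contribute to the loss.
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_len]
    return {"input_ids": input_ids, "labels": labels}

dataset = load_dataset("databricks/databricks-dolly-15k", split="train")
dataset = dataset.map(build_example, remove_columns=dataset.column_names)

def collate(batch):
    # Right-pad input_ids with EOS and labels with IGNORE_INDEX to the batch max length.
    width = max(len(ex["input_ids"]) for ex in batch)
    def pad(seq, value):
        return seq + [value] * (width - len(seq))
    return {
        "input_ids": torch.tensor([pad(ex["input_ids"], tokenizer.eos_token_id) for ex in batch]),
        "labels": torch.tensor([pad(ex["labels"], IGNORE_INDEX) for ex in batch]),
    }

loader = torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=True, collate_fn=collate)
```

Positions labelled `-100` are ignored by PyTorch's cross-entropy loss, so the model is optimized only on the assistant's response tokens.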
Example fine-tuning run on Dolly-15k:

```bash
python nanosft.py \
  --dataset databricks/databricks-dolly-15k \
  --hf_model_type gpt2 \
  --batch_size 8 \
  --grad_accum 4 \
  --epochs 1 \
  --lr 3e-5 \
  --use_amp
```
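The `--use_amp` flag enables automatic mixed precision. Roughly, one AMP training step with gradient accumulation (and an optional W&B log) might look like the following, reusing the `loader` from the sketch above and a Hugging Face `GPT2LMHeadModel` for illustration; the actual loop in `nanosft.py` may differ:

```python
# Illustrative sketch of an AMP training loop with gradient accumulation;
# the real loop lives in nanosft.py and may differ in its details.
import torch
from transformers import GPT2LMHeadModel

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"
grad_accum = 4

model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

model.train()
for step, batch in enumerate(loader):  # `loader` from the data-pipeline sketch above
    with torch.autocast(device_type="cuda", dtype=torch.float16, enabled=use_amp):
        out = model(input_ids=batch["input_ids"].to(device),
                    labels=batch["labels"].to(device))
        loss = out.loss / grad_accum  # scale so accumulated grads match one large batch

    scaler.scale(loss).backward()

    if (step + 1) % grad_accum == 0:
        scaler.step(optimizer)        # unscales gradients, then calls optimizer.step()
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
        # With W&B enabled, one could log here, e.g. wandb.log({"loss": loss.item() * grad_accum})
```

Dividing the loss by `grad_accum` keeps the accumulated gradient comparable to a single large-batch step.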
Planned improvements:

- Optimization for larger-scale runs (gradient checkpointing, mixed precision)
- Multi-GPU and distributed training support
- LoRA-based fine-tuning for efficient parameter updates
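LoRA support is not implemented yet. If it were added via the Hugging Face `peft` library, the wiring could look roughly like this (a hypothetical sketch, not current functionality):

```python
# Hypothetical sketch: LoRA is on the roadmap, not implemented in nanosft.py yet.
# This shows how it could be wired up with the Hugging Face `peft` library.
from peft import LoraConfig, get_peft_model
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
lora_config = LoraConfig(
    r=8,                        # low-rank adapter dimension
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Only the low-rank adapter matrices would be trained, which keeps optimizer state small and pairs well with the larger-scale runs mentioned above.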
Note: this is currently an educational reference implementation, not production-ready.