A root-level, from-scratch character-level GPT notebook that trains on Tiny Shakespeare and supports:
- Jetson Orin Nano (CUDA)
- MacBook Air M4 (PyTorch MPS/Metal)
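Both targets run the same code; only the torch device differs. A minimal device-selection sketch (the notebook may structure this cell differently):

```python
import torch

# Pick the best available backend: CUDA on the Jetson, MPS on Apple Silicon,
# CPU as the fallback. Generic sketch, not the notebook's exact cell.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")
```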
Main notebook: tiny_gpt_from_scratch_v2.ipynb
- Verify PyTorch can see CUDA:
python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'no cuda')" - Optional monitor while training:
tegrastats
- Verify MPS is available:
python3 -c "import torch; print(torch.backends.mps.is_available())"
- Launch the notebook:
  jupyter notebook tiny_gpt_from_scratch_v2.ipynb
  or
  jupyter lab tiny_gpt_from_scratch_v2.ipynb
- Run the notebook training sections through Save Checkpoint to produce gpt_char_ckpt.pt.
- Restart the kernel.
- Run only the ONE-CELL: Load + Chat cell (a sketch of its core steps follows below).
- Chat immediately from terminal-style input in that cell.
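A hedged sketch of what the Load + Chat cell does. The checkpoint key names ("model_state", "stoi", "itos", "config") are assumptions about the layout of gpt_char_ckpt.pt; the notebook's save cell defines the real one.

```python
import torch

# Select the device as in the earlier sketch.
device = "cuda" if torch.cuda.is_available() else (
    "mps" if torch.backends.mps.is_available() else "cpu")

# Assumed checkpoint layout: weights plus the char<->id vocab tables.
ckpt = torch.load("gpt_char_ckpt.pt", map_location=device)
stoi, itos = ckpt["stoi"], ckpt["itos"]  # char -> id and id -> char tables

# Rebuilding the model needs the notebook's own GPT class, roughly:
#   model = GPT(**ckpt["config"])
#   model.load_state_dict(ckpt["model_state"])
#   model.to(device).eval()

def encode(s: str) -> list[int]:
    # Raises KeyError for any character absent from the training vocab
    # (see Troubleshooting below).
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)
```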
Use plain ASCII when possible (no emojis/special symbols) to avoid vocab errors.
Example prompts:
- User: Who are you?\nAssistant:
- User: Explain gravity in simple words.\nAssistant:
- User: Write a short poem about rain.\nAssistant:
- User: Give me three tips to learn Python.\nAssistant:
- KING RICHARD III: What news?\nMESSENGER:
- HAMLET: To be, or not to be?\nHORATIO:
- ROMEO: But, soft! what light through yonder window breaks?\nJULIET:
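Feeding one of these prompts through the terminal-style loop might look like the sketch below, which continues the load sketch above (encode, decode, device, model). model.generate with max_new_tokens is a nanoGPT-style assumption and may not match the notebook's actual API.

```python
import torch

# Terminal-style chat loop; prompt format matches the examples above.
while True:
    user = input("You: ")
    if not user:
        break
    prompt = f"User: {user}\nAssistant:"
    idx = torch.tensor([encode(prompt)], dtype=torch.long, device=device)
    # `generate` is an assumed nanoGPT-style method on the restored model.
    out = model.generate(idx, max_new_tokens=200)[0].tolist()
    # Print only the newly generated continuation, not the echoed prompt.
    print(decode(out)[len(prompt):])
```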
- Missing checkpoint (gpt_char_ckpt.pt):
  - Run the training + save-checkpoint cells first.
  - Confirm the notebook working directory matches the checkpoint location.
- KeyError in encode:
  - Your prompt contains characters not present in the training vocab.
  - Use simpler ASCII prompts, avoid emojis/special characters, or retrain on broader text (see the workaround sketch below).
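If retraining is not an option, one workaround (not part of the notebook) is to drop unknown characters before encoding:

```python
def safe_encode(s: str, stoi: dict) -> list[int]:
    # Skip characters the training vocab never saw instead of raising KeyError.
    return [stoi[c] for c in s if c in stoi]
```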
- Out of memory (OOM):
  - Reduce in order: batch_size -> block_size -> d_model (see the sketch below).
  - Restart the kernel after changing memory-heavy settings.
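A sketch of the kind of settings cell this refers to. The variable names follow the nanoGPT-style layout the notebook appears to use; the values shown are illustrative, not the notebook's defaults.

```python
# Reduce these in order when you hit OOM, restarting the kernel between runs.
batch_size = 32    # 1) halve first: biggest memory lever, smallest quality hit
block_size = 128   # 2) context length: attention memory grows quadratically with it
d_model    = 192   # 3) embedding width: reduce last, since it changes model capacity
```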