tkgaolol/nanogpt
NanoGPT Study Repository

This repository contains my study implementation and notes for @karpathy's build-nanogpt tutorial. It's a from-scratch reproduction of GPT-2.

Repository Structure

Core Training Files

  • train.py - Main training script with complete GPT implementation including:
    • Multi-head self-attention with Flash Attention optimization
    • MLP blocks with GELU activation
    • Layer normalization and residual connections
    • Data loading and distributed training support
    • HellaSwag evaluation integration
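The attention pattern described above can be sketched as follows. This is an illustrative minimal version, not the exact code in train.py: class and layer names are assumptions, and "Flash Attention optimization" is taken to mean PyTorch's fused `F.scaled_dot_product_attention` kernel (available since PyTorch 2.0), which is how the build-nanogpt tutorial enables it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head causal self-attention (illustrative sketch)."""
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.c_attn = nn.Linear(n_embd, 3 * n_embd)  # fused q, k, v projection
        self.c_proj = nn.Linear(n_embd, n_embd)      # output projection

    def forward(self, x):
        B, T, C = x.size()
        q, k, v = self.c_attn(x).split(C, dim=2)
        # reshape each to (B, n_head, T, head_dim)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # fused (Flash) attention kernel with a causal mask
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(B, T, C)  # merge heads
        return self.c_proj(y)

x = torch.randn(2, 8, 64)              # (batch, seq_len, n_embd)
y = CausalSelfAttention(n_embd=64, n_head=4)(x)
print(y.shape)                         # same shape as the input
```

The MLP blocks then follow each attention block (Linear, GELU, Linear), with layer norm and residual connections wrapping both sublayers.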

Data Processing

  • fineweb.py - FineWeb-Edu dataset downloader and tokenizer
    • Downloads the 10B-token dataset sample for pretraining
    • GPT-2 tokenization using tiktoken
    • Efficient data sharding for large-scale training
  • input.txt - Sample text data for quick experiments
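The sharding idea can be sketched with plain numpy. The real fineweb.py tokenizes with tiktoken first; here a plain integer stream stands in for the token ids, and the tiny shard size is made up for the demo (the tutorial uses much larger shards). GPT-2's vocabulary (50257 ids) fits in uint16, which halves disk usage versus int32.

```python
import numpy as np

SHARD_SIZE = 1000  # tokens per shard; illustrative only, real shards are far larger

def write_shards(token_stream, shard_size=SHARD_SIZE):
    """Accumulate GPT-2 token ids into fixed-size uint16 shards."""
    buf = np.empty(shard_size, dtype=np.uint16)
    filled, shards = 0, []
    for tok in token_stream:
        buf[filled] = tok
        filled += 1
        if filled == shard_size:
            # the real script would np.save() each full shard to disk here
            shards.append(buf.copy())
            filled = 0
    if filled:                      # final, partially filled shard
        shards.append(buf[:filled].copy())
    return shards

shards = write_shards(iter(range(2500)))
print([len(s) for s in shards])     # [1000, 1000, 500]
```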

Evaluation

  • hellaswag.py - HellaSwag benchmark evaluation script
    • Common sense reasoning evaluation
    • Multiple choice completion task
    • Model performance comparison utilities
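The scoring rule behind this kind of evaluation: each multiple-choice option is the context plus one candidate completion, and the model's pick is the completion whose tokens get the lowest average loss. A numpy sketch with dummy logits (function names and shapes are illustrative, not hellaswag.py's actual API):

```python
import numpy as np

def avg_nll(logits, targets):
    """Mean negative log-likelihood of `targets` under per-position `logits`.
    logits: (T, vocab) array, targets: (T,) token ids."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -logp[np.arange(len(targets)), targets].mean()

def pick_completion(logits_per_option, targets_per_option):
    """Choose the option whose completion tokens have the lowest average loss."""
    losses = [avg_nll(l, t) for l, t in zip(logits_per_option, targets_per_option)]
    return int(np.argmin(losses))

# four candidate completions with fake model logits over a 10-token vocab
rng = np.random.default_rng(0)
options = [(rng.standard_normal((5, 10)), rng.integers(0, 10, 5)) for _ in range(4)]
best = pick_completion([l for l, _ in options], [t for _, t in options])
print(best)  # index of the chosen completion, 0..3
```

Averaging (rather than summing) the loss avoids penalizing longer completions.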

Exploration

  • play.ipynb - Jupyter notebook for interactive experimentation
    • Model testing and inference
    • Training visualization
    • Architecture exploration
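Inference experiments of the kind the notebook covers typically use top-k sampling for generation. A minimal sketch of one sampling step, assuming GPT-2's vocabulary size and the common default k=50 (not necessarily the notebook's exact settings):

```python
import numpy as np

def sample_top_k(logits, k=50, rng=None):
    """One generation step: keep the k highest logits, renormalize, draw a token."""
    rng = rng or np.random.default_rng()
    topk_idx = np.argpartition(logits, -k)[-k:]  # indices of the k largest logits
    probs = np.exp(logits[topk_idx] - logits[topk_idx].max())
    probs /= probs.sum()                          # softmax over the k survivors
    return int(topk_idx[rng.choice(k, p=probs)])

logits = np.random.default_rng(1).standard_normal(50257)  # fake logits, GPT-2 vocab
tok = sample_top_k(logits, k=50)
print(tok)  # a token id in [0, 50257)
```

In a full generation loop this step runs once per new token, appending each sampled id to the context before the next forward pass.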

References

  • @karpathy's build-nanogpt - the tutorial this repository follows
