A curated collection of GitHub projects with tiny codebases. Most are interesting primarily for educational purposes, but some (e.g. tinygrad) compete with large, complex projects.
- Andrej Karpathy
- Diffusion models
- GPU
- 🤗Huggingface
- Inference engines
- LLMs
- PyTorch Foundation
- RecSys
- Reinforcement learning
- Tabular ML
- ML
- ML & CyberSec
- C
- Low-level
- cryptos - Pure Python from-scratch zero-dependency implementation of Bitcoin for educational purposes.
- llama2.c - Inference Llama 2 in one file of pure C.
- llm.c - LLM training in simple, raw C/CUDA.
- micrograd - A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API.
- minbpe - Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
- minGPT - A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training.
- nanoGPT - The simplest, fastest repository for training/finetuning medium-sized GPT.
- nano-llama31 - nanoGPT style version of Llama 3.1.
- nanochat - The best ChatGPT that $100 can buy.
- diffusion-gpt - An annotated implementation of a character-level discrete diffusion model for text generation. Inspired by nanoGPT.
- micro_diffusion - Micro-budget training of large-scale diffusion models by Sony Research.
- minimal-text-diffusion - A minimal implementation of a diffusion model for text generation. Also contains a basic list of papers/blogs/videos for a deeper dive into diffusion models.
- penny - Hand-written GPU communication library (NCCL).
- tiny-gpu - A minimal GPU design in Verilog to learn how GPUs work from the ground up.
- nanotron - Minimalistic large language model 3D-parallelism training.
- nanoVLM - The simplest, fastest repository for training/finetuning small-sized VLMs.
- picotron - The minimalist & most-hackable repository for pre-training Llama-like models with 4D Parallelism. It is designed with simplicity and educational purposes in mind.
- smolagents - A barebones library for agents that think in code.
- smol-course - A course on aligning smol models.
- smollm - Everything about the SmolLM and SmolVLM family of models.
- flex-nano-vllm - FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference.
- mini-sglang - Mini-Sglang by sgl-project.
- nano-vllm - A lightweight vLLM implementation built from scratch.
- tiny-vllm - You're going to build a high performance LLM inference engine with C++ and CUDA.
- tokasaurus - LLM inference engine optimized for throughput-intensive workloads. On throughput-focused benchmarks, Tokasaurus can outperform vLLM and SGLang by up to 3x+.
- minimind - Trains a super-small language model, MiniMind, completely from scratch in about 2 hours at a cost of only 3 RMB.
- modded-nanogpt - NanoGPT (124M) in 3 minutes on 8xH100.
- modded-nanogpt-rwkv - Modified variant of nanoGPT for RWKV.
- nanoMoE - The simplest, fastest repository for training/finetuning medium-sized MoE-based GPTs. Also, an awesome post by the author about MoE.
- nanoT5 - Fast & Simple repository for pre-training and fine-tuning T5-style models.
- needle - Distillation of Gemini 3.1 into a 26M-parameter model. It works especially well with function calling.
- parameter-golf - OpenAI Model Craft Challenge: train the best language model that fits in a 16MB artifact and trains in under 10 minutes on 8xH100s.
- gpt-fast - Simple and efficient PyTorch-native transformer text generation. LLaMA-like models, GPTQ, tensor parallelism, speculative decoding, etc.
- LeanRL - LeanRL is a fork of CleanRL where hand-picked scripts have been re-written using PyTorch 2 features, mainly torch.compile and cudagraphs.
- segment-anything-fast - Segment Anything over 8x faster using only pure, native PyTorch.
- MiniOneRec - Minimal reproduction of OneRec.
- Mini-R1 - Minimal reproduction of DeepSeek R1-Zero. Code built upon trl.
- minimalRL - Implementations of basic RL algorithms with minimal lines of code.
- nano-aha-moment - Inspired by TinyZero and Mini-R1, but designed to be much simpler, cleaner, and faster, with every line of code visible and understandable.
- nanoRLHF - This project aims to perform RLHF training from scratch, implementing almost all core components manually except for PyTorch and Triton.
- TinyZero - Minimal reproduction of DeepSeek R1-Zero. Code built upon verl.
- nanoTabPFN - Train your own small TabPFN in less than 500 LOC and a few minutes. The purpose of this repository is to be a good starting point for students and researchers that are interested in learning about how TabPFN works under the hood.
- mini-swe-agent - The 100-line AI agent that solves GitHub issues or helps you in your command line.
- nano-graphrag - A simple, easy-to-hack GraphRAG implementation.
- tinygrad - You like pytorch? You like micrograd? You love tinygrad! ❤️
- tinyvector - A tiny nearest-neighbor embedding database built with SQLite and PyTorch.
- subwiz - A nanoGPT-based model trained to discover subdomains.
- agent-c - An ultra-lightweight AI agent written in C that communicates with the OpenRouter API and executes shell commands.
- flux2.c - FLUX.2-klein-4B pure C implementation. Zero external dependencies beyond the C standard library. By the creator of Redis + Claude.
- miniaudio - Audio playback and capture library written in C, in a single source file.
- nanoMPI - A minimal MPI implementation loosely based on OpenMPI. nanoMPI lets newcomers to distributed computing quickly see answers to questions like "how is a ring allreduce implemented?"
- tiny-tpu - A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1.
