
Commit f44ee7a

[feat] update docs

1 parent 941db7a, commit f44ee7a

File tree

8 files changed (+890 additions, -161 deletions)


docs/images/train_grpo_512.png (214 KB)
docs/images/train_grpo_768.png (246 KB)
docs/images/train_ppo_512.png (246 KB)
docs/images/train_ppo_768.png (241 KB)
docs/images/train_spo_768.png (233 KB)

docs/index.md

Lines changed: 106 additions & 35 deletions
@@ -1,4 +1,4 @@
-# <strong>Welcome to MiniMind!</strong>
+# Welcome to MiniMind!
 
 <figure markdown>
 ![logo](images/logo.png)
@@ -7,47 +7,118 @@
 
 ## 📌 Introduction
 
-MiniMind is a super-small language model project trained completely from scratch, requiring **only $0.5 + 2 hours** to train a **26M** language model!
+**MiniMind** is a complete, open-source project for training ultra-small language models from scratch with minimal cost. Train a **26M** ChatBot in just **2 hours** with only **$3** on a single 3090 GPU!
 
 - **MiniMind** series is extremely lightweight, the smallest version is **1/7000** the size of GPT-3
-- The project open-sources the minimalist structure of large models, including:
-    - Mixture of Experts (MoE)
-    - Dataset cleaning
-    - Pretraining
-    - Supervised Fine-Tuning (SFT)
-    - LoRA fine-tuning
-    - Direct Preference Optimization (DPO)
-    - Model distillation
-- All core algorithm code is reconstructed from scratch using native PyTorch, without relying on third-party abstract interfaces
-- This is not only a full-stage open-source reproduction of large language models, but also a tutorial for getting started with LLMs
-
-!!! note "Training Cost"
-    "2 hours" is based on NVIDIA 3090 hardware (single card) testing, "$0.5" refers to GPU server rental cost
-
-## ✨ Key Features
-
-- **Ultra-low cost**: Single 3090, 2 hours, $0.5 to train a ChatBot from scratch
-- **Complete pipeline**: Covers Tokenizer, pretraining, SFT, LoRA, DPO, distillation full process
-- **Education-friendly**: Clean code, suitable for learning LLM principles
-- **Ecosystem compatible**: Supports `transformers`, `llama.cpp`, `vllm`, `ollama` and other mainstream frameworks
-
-## 📊 Model List
-
-| Model (Size) | Inference Memory (Approx.) | Release |
-|------------|----------|---------|
-| MiniMind2-small (26M) | 0.5 GB | 2025.04.26 |
-| MiniMind2-MoE (145M) | 1.0 GB | 2025.04.26 |
-| MiniMind2 (104M) | 1.0 GB | 2025.04.26 |
+- Complete implementation covering:
+    - **Tokenizer training** with custom vocabulary
+    - **Pretraining** (knowledge learning)
+    - **Supervised Fine-Tuning (SFT)** (conversation patterns)
+    - **LoRA fine-tuning** (parameter-efficient adaptation)
+    - **Direct Preference Optimization (DPO)** (human preference alignment)
+    - **RLAIF algorithms** (PPO/GRPO/SPO - reinforcement learning)
+    - **Knowledge distillation** (compress large model knowledge)
+    - **Model reasoning distillation** (DeepSeek-R1 style)
+    - **YaRN algorithm** (context length extrapolation)
+- **Pure PyTorch implementation**: All core algorithms are implemented from scratch using native PyTorch, without relying on third-party abstract interfaces
+- **Educational value**: This is not only a full-stage open-source reproduction of large language models, but also a comprehensive tutorial for getting started with LLMs
+- **Extended capabilities**: MiniMind now supports [MiniMind-V](https://github.com/jingyaogong/minimind-v) for vision multimodal tasks
+
+!!! note "Training Cost & Time"
+    "2 hours" is based on **NVIDIA 3090** hardware (single card) testing
+
+    "$3" refers to GPU server rental cost
+
+    With 8× RTX 4090 GPUs, training time can be compressed to **under 10 minutes**
+
+## ✨ Key Highlights
+
+- **Ultra-low cost**: Single 3090, 2 hours, $3 to train a fully functional ChatBot from scratch
+- **Complete pipeline**: Tokenizer → Pretraining → SFT → LoRA → DPO/RLAIF → Distillation → Reasoning
+- **Latest algorithms**: Implements cutting-edge techniques including GRPO, SPO, and YaRN
+- **Education-friendly**: Clean, well-documented code suitable for learning LLM principles
+- **Ecosystem compatible**: Seamless support for `transformers`, `trl`, `peft`, `llama.cpp`, `vllm`, `ollama`, and `Llama-Factory`
+- **Full capabilities**: Supports multi-GPU training (DDP/DeepSpeed), model visualization (Wandb/SwanLab), and dynamic checkpoint management
+- **Production-ready**: OpenAI API protocol support for easy integration with third-party UIs (FastGPT, Open-WebUI, etc.)
+- **Multimodal extension**: Extended to vision with [MiniMind-V](https://github.com/jingyaogong/minimind-v)
+
+## 📊 Model Series
+
+### MiniMind2 Series (Latest - 2025.04.26)
+
+| Model | Parameters | Vocabulary | Layers | Hidden Dim | Context | Inference Memory |
+|-------|-----------|------------|--------|-----------|---------|-----------------|
+| MiniMind2-small | 26M | 6,400 | 8 | 512 | 2K | ~0.5 GB |
+| MiniMind2-MoE | 145M | 6,400 | 8 | 640 | 2K | ~1.0 GB |
+| MiniMind2 | 104M | 6,400 | 16 | 768 | 2K | ~1.0 GB |
+
+### MiniMind-V1 Series (Legacy - 2024.09.01)
+
+| Model | Parameters | Vocabulary | Layers | Hidden Dim | Context |
+|-------|-----------|------------|--------|-----------|---------|
+| minimind-v1-small | 26M | 6,400 | 8 | 512 | 2K |
+| minimind-v1-moe | 104M | 6,400 | 8 | 512 | 2K |
+| minimind-v1 | 108M | 6,400 | 16 | 768 | 2K |
+
+## 📅 Latest Updates (2025-10-24)
+
+🔥 **RLAIF Training Algorithms**: Native implementation of PPO, GRPO, and SPO
+
+- **YaRN Algorithm**: RoPE length extrapolation for improved long-sequence handling
+- **Adaptive Thinking**: Reasoning models support optional thinking chains
+- **Full template support**: Tool calling and reasoning tags (`<tool_call>`, `<think>`, etc.)
+- **Visualization**: Switched from WandB to [SwanLab](https://swanlab.cn/) (China-friendly)
+- **Reasoning models**: Complete MiniMind-Reason series based on DeepSeek-R1 distillation
+
+## 🎯 Project Contents
+
+- Complete MiniMind-LLM architecture code (Dense + MoE models)
+- Detailed Tokenizer training code
+- Full training pipeline: Pretrain → SFT → LoRA → RLHF/RLAIF → Distillation
+- High-quality, curated and deduplicated datasets at all stages
+- Native PyTorch implementation of key algorithms, minimal third-party dependencies
+- Multi-GPU training support (single-machine multi-card DDP, DeepSpeed, distributed clusters)
+- Visualization with Wandb/SwanLab
+- Model evaluation on third-party benchmarks (C-Eval, C-MMLU, OpenBookQA)
+- YaRN algorithm for RoPE context length extrapolation
+- OpenAI API protocol server for easy integration
+- Streamlit web UI for chat
+- Full compatibility with community tools: llama.cpp, vllm, ollama, Llama-Factory
+- MiniMind-Reason models: Complete open-source data + weights for reasoning distillation
 
 ## 🚀 Quick Navigation
 
-- [Quick Start](quickstart.md) - Environment setup, model download, quick testing
-- [Model Training](training.md) - Pretraining, SFT, LoRA, DPO training process
+- **[Quick Start](quickstart.md)** - Environment setup, model download, quick testing
+- **[Model Training](training.md)** - Pretraining, SFT, LoRA, RLHF, RLAIF, and reasoning training
 
-## 🔗 Related Links
+## 🔗 Links & Resources
 
+**Project Repositories**:
 - **GitHub**: [https://github.com/jingyaogong/minimind](https://github.com/jingyaogong/minimind)
 - **HuggingFace**: [MiniMind Collection](https://huggingface.co/collections/jingyaogong/minimind-66caf8d999f5c7fa64f399e5)
-- **ModelScope**: [MiniMind Models](https://www.modelscope.cn/profile/gongjy)
-- **Online Demo**: [ModelScope Studio](https://www.modelscope.cn/studios/gongjy/MiniMind)
+- **ModelScope**: [MiniMind Profile](https://www.modelscope.cn/profile/gongjy)
+
+**Online Demos**:
+- [ModelScope Studio - Standard Chat](https://www.modelscope.cn/studios/gongjy/MiniMind)
+- [ModelScope Studio - Reasoning Model](https://www.modelscope.cn/studios/gongjy/MiniMind-Reasoning)
+- [Bilibili Video Introduction](https://www.bilibili.com/video/BV12dHPeqE72/)
+
+**Vision Extension**:
+- [MiniMind-V](https://github.com/jingyaogong/minimind-v) - Multimodal vision language models
+
+## 💡 Why MiniMind?
+
+The AI community is flooded with high-cost, complex frameworks that abstract away the fundamentals. MiniMind aims to democratize LLM learning by:
+
+1. **Lowering the barrier**: No need for expensive GPUs or cloud services
+2. **Understanding, not just using**: Learn every detail from tokenization to inference
+3. **End-to-end learning**: Train from scratch, not just fine-tune existing models
+4. **Code clarity**: Pure PyTorch implementations you can read and understand
+5. **Practical results**: Get a working ChatBot with minimal resources
+
+As we say: **"Building a Lego airplane is far more exciting than flying first class!"**
+
+---
+
+Next: [Get Started →](quickstart.md)
 
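The updated index stresses ecosystem compatibility with `transformers`. For orientation, a minimal loading sketch follows; the repo id `jingyaogong/MiniMind2` is inferred from the linked HuggingFace collection, and the prompt and generation settings are illustrative assumptions rather than the project's documented usage.

```python
# Minimal sketch: loading a MiniMind checkpoint through Hugging Face transformers.
# Assumptions: the repo id below and trust_remote_code=True (MiniMind ships custom
# model code); adjust both to the checkpoint you actually download.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jingyaogong/MiniMind2"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Build a chat prompt with the model's own template and generate a short reply.
messages = [{"role": "user", "content": "Hello, who are you?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```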

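The index also lists an OpenAI-API-protocol server for hooking MiniMind into third-party UIs such as FastGPT and Open-WebUI. Below is a hedged client-side sketch using the official `openai` package; the host, port, and model name are placeholders and depend on how you launch the project's server.

```python
# Sketch of calling an OpenAI-compatible MiniMind endpoint from the openai client.
# Assumptions: the server is already running locally and the base_url/model name
# below are placeholders, not values documented by this commit.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8998/v1", api_key="none")
resp = client.chat.completions.create(
    model="minimind",
    messages=[{"role": "user", "content": "Introduce yourself briefly."}],
    temperature=0.7,
)
print(resp.choices[0].message.content)
```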
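Finally, both the key highlights and the project contents call out the YaRN algorithm for RoPE context-length extrapolation. The sketch below follows the published YaRN ("NTK-by-parts") recipe rather than MiniMind's own implementation; the base, ramp thresholds, and 4× scale are illustrative assumptions.

```python
# Simplified YaRN-style "NTK-by-parts" RoPE frequency scaling (per the YaRN paper),
# not MiniMind's code. head_dim, base, alpha, beta, and scale are assumed values.
import math
import torch

def yarn_inv_freq(head_dim: int, base: float = 10000.0, orig_ctx: int = 2048,
                  scale: float = 4.0, alpha: float = 1.0, beta: float = 32.0):
    # Standard RoPE inverse frequencies, one per pair of hidden dimensions.
    inv_freq = base ** (-torch.arange(0, head_dim, 2).float() / head_dim)
    # How many full rotations each dimension completes over the original context.
    rotations = orig_ctx * inv_freq / (2 * math.pi)
    # Ramp from "fully interpolated" (few rotations) to "left unchanged" (many rotations).
    ramp = ((rotations - alpha) / (beta - alpha)).clamp(0.0, 1.0)
    # Blend position-interpolated frequencies with the original ones.
    return inv_freq / scale * (1.0 - ramp) + inv_freq * ramp

inv_freq = yarn_inv_freq(head_dim=64)  # frequencies for a 2K -> 8K extension
mscale = 0.1 * math.log(4.0) + 1.0     # YaRN attention scaling; common implementations
                                       # multiply the RoPE cos/sin tables by this factor.
```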