Learn AutoResearch

Define a metric. Set a target. Let the agent iterate overnight.
A project-based course on autonomous research loops — inspired by Karpathy's self-improving ML training loop.

Chinese version → | Live Docs →

What Is This?

Learn AutoResearch teaches you to automate the research loop: define a measurable metric, let an agent generate hypotheses, implement changes, evaluate results, and commit improvements — then repeat overnight.

The core idea comes from Andrej Karpathy's autoresearch. This course generalizes it beyond ML to any domain where an evaluation script can report {"pass": bool, "score": float}.

How the Loop Works

flowchart LR
    A([Define Metric]) --> B([Predict])
    B --> C([Implement])
    C --> D([Evaluate])
    D -->|score improved| E([Commit])
    D -->|score worse| F([Revert])
    E --> G{Target met?}
    F --> B
    G -->|no| B
    G -->|yes| H([Ship])

Each iteration: one hypothesis, one change, one measurement. Git records every experiment. You wake up to a ranked table of what worked.
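The loop above can be sketched in a few lines of stdlib Python. This is a minimal illustration under stated assumptions, not the course's implementation: the toy metric, the names `evaluate` and `run_loop`, and the random-perturbation "hypothesis" are all hypothetical, and an in-memory keep/discard stands in for the Git commit and revert steps.

```python
import random


def evaluate(x: float) -> dict:
    """Toy metric (hypothetical): the score peaks at 1.0 when x == 3."""
    score = 1.0 / (1.0 + (x - 3.0) ** 2)
    return {"pass": score >= 0.99, "score": score}


def run_loop(start: float, max_iters: int = 500, seed: int = 0) -> tuple[float, list[dict]]:
    """One hypothesis, one change, one measurement per iteration."""
    rng = random.Random(seed)
    best_x = start
    best = evaluate(best_x)
    history = []
    for i in range(max_iters):
        if best["pass"]:
            break  # target met: ship
        # Predict + implement: propose exactly one small change.
        candidate = best_x + rng.uniform(-0.5, 0.5)
        result = evaluate(candidate)  # evaluate
        kept = result["score"] > best["score"]
        if kept:
            best_x, best = candidate, result  # "commit" (a Git commit in a real run)
        # Otherwise "revert": best_x stays unchanged.
        history.append({"iter": i, "score": result["score"], "kept": kept})
    return best_x, history


if __name__ == "__main__":
    best_x, history = run_loop(start=0.0)
    # Ranked table of experiments, best score first.
    for row in sorted(history, key=lambda r: r["score"], reverse=True)[:5]:
        print(row)
```

In a real run the candidate would be a code change produced by the agent, and the history would live in Git rather than in a list.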

What You Will Learn

#	Skill	How You Practice It
1	Measurable goals	Turn "make it faster" into `median_time_s < 0.5`
2	Autonomous loops	One change per iteration, automatic rollback
3	Scientific debugging	Falsifiable hypotheses, evidence-based investigation
4	Predict before acting	Five expert perspectives before any major change
5	Security auditing	STRIDE + OWASP + red-team with code-level evidence
6	Shipping	8-phase pipeline: code → content → deployment
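As an illustration of skill 1, here is one way "make it faster" might become `median_time_s < 0.5`. It is a sketch only: the helper names `median_time_s` and `make_input`, the input size, and the repeat count are assumptions rather than part of the course projects.

```python
import random
import statistics
import time


def make_input(n: int = 50_000) -> list[float]:
    """Build a reproducible list of floats to sort (size is an assumption)."""
    rng = random.Random(42)
    return [rng.random() for _ in range(n)]


def median_time_s(fn, make_data, repeats: int = 7) -> float:
    """Time fn on fresh input several times and return the median in seconds."""
    samples = []
    for _ in range(repeats):
        data = make_data()  # fresh input so earlier runs cannot help later ones
        start = time.perf_counter()
        fn(data)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    t = median_time_s(sorted, make_input)
    print(f"median_time_s = {t:.4f}, target met: {t < 0.5}")
```

The median is used rather than the mean so that a single slow run (for example, a background process waking up) does not decide whether an experiment is kept.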

Curriculum

Phase	Lectures	Project	Goal
1 — Foundations	L01 Why manual iteration fails · L02 Measurable goals	P01 Sort optimization	`median_time_s < 0.5`
2 — Core Loop	L03 Five-stage internals · L04 When stuck	P02 Function fitting	`rmse < 0.05`
3 — Debug & Fix	L05 Scientific debugging · L06 Error-crushing pipeline	P03 FastAPI debugging	`test_pass_rate == 1.0`
4 — Predict & Reason	L07 Five-expert prediction · L08 Adversarial refinement	P04 Architecture debate	`weighted_score ≥ 0.65`
5 — Security & Scenarios	L09 STRIDE+OWASP audit · L10 12-dimension exploration	P05 Security audit	`security_score == 1.0`
6 — Ship & Advanced	L11 Universal ship pipeline · L12 Overnight runs	P06 End-to-end pipeline	`rouge1_recall ≥ 0.60`
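The goals in the table are all simple threshold checks, so they can be written down as data. The sketch below copies the metric names and thresholds from the table; the `TARGETS` mapping and the `target_met` helper are hypothetical names used for illustration only.

```python
import operator

# Metric name -> (comparison, threshold), taken from the curriculum table.
TARGETS = {
    "median_time_s": (operator.lt, 0.5),
    "rmse": (operator.lt, 0.05),
    "test_pass_rate": (operator.eq, 1.0),
    "weighted_score": (operator.ge, 0.65),
    "security_score": (operator.eq, 1.0),
    "rouge1_recall": (operator.ge, 0.60),
}


def target_met(metric: str, value: float) -> bool:
    """Return True when the measured value satisfies the project's goal."""
    compare, threshold = TARGETS[metric]
    return compare(value, threshold)


if __name__ == "__main__":
    print(target_met("rmse", 0.04))            # below the 0.05 threshold
    print(target_met("weighted_score", 0.60))  # below the 0.65 threshold
```

Note that some goals are "lower is better" (`median_time_s`, `rmse`) and others "higher is better", so a loop that compares scores needs to know the direction for each metric.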

Project Code

Every project ships with a runnable starter and reference solution:

projects/
├── project-01/   sort optimization
├── project-02/   function fitting
├── project-03/   FastAPI debugging
├── project-04/   architecture debate
├── project-05/   security audit
└── project-06/   end-to-end pipeline

Each project's starter/evaluate.py prints a single JSON object, where `bool` and `float` below stand for the value types:

print(json.dumps({"pass": bool, "score": float}))
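A concrete instance of this contract might look like the following sketch. The helper names `make_report` and `parse_report` are hypothetical; the only part taken from the course is the shape of the JSON object.

```python
import json


def make_report(passed: bool, score: float) -> str:
    """Serialize one evaluation result as a single JSON line."""
    return json.dumps({"pass": bool(passed), "score": float(score)})


def parse_report(line: str) -> tuple[bool, float]:
    """Validate a line produced by an evaluate.py and return (pass, score)."""
    data = json.loads(line)
    if not isinstance(data, dict) or not isinstance(data.get("pass"), bool):
        raise ValueError("'pass' must be a JSON boolean")
    score = data.get("score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("'score' must be a JSON number")
    return data["pass"], float(score)


if __name__ == "__main__":
    line = make_report(passed=True, score=0.5)
    print(line)
    print(parse_report(line))
```

Validating the output on the consuming side keeps a malformed evaluator from being mistaken for a failed experiment.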

Quick Start

# Install dependencies
npm install

# Start local dev server
npm run dev

# Build static site
npm run build

Tech Stack

Layer	Tool
Site generator	VitePress 1.6+
Diagrams	vitepress-plugin-mermaid
Languages	English (root) + Chinese (`/zh/`)
Project code	Python 3.10+, stdlib only — no pip required

Citation

@software{learn_autoresearch2026,
  title  = {Learn AutoResearch: A Project-Based Course on Autonomous Research Loops},
  author = {Zhao, Zhimin},
  year   = {2026},
  url    = {https://github.com/AI4Scientist/learn-auto-research}
}

License

MIT
