Nano-scientist

Nano. Lean. Four loops, one paper.

An autonomous research agent that turns a topic into a peer-reviewed technical report — within a dollar budget you set.

Built on PocketFlow · Inspired by karpathy/autoresearch

🔥 News

[2026.05] 🎉 Major release! 87 modular skills, CrossRef citation recovery, a multi-model quality gate via `REVIEWER_MODEL`, and MCP server injection. Full BibTeX stub fallback ensures zero undefined citations.
[2026.01] 🚀 Nano-scientist public launch: a budget-first autonomous research pipeline that produces LaTeX, BibTeX, and a PDF from a single `python main.py` call.

✨ Why Nano-scientist


| Feature | Description |
| --- | --- |
| Budget-first control | Fix a dollar limit; the agent adapts research depth automatically. Loops exit as soon as the estimated number of remaining LLM calls falls below a threshold, so no spend is wasted. |
| Four autonomous loops | Literature → Experimentation → Writing → Compiling. Each loop terminates itself on a passed quality gate or budget exhaustion. No central planner. |
| 87 modular skills | From paper search and code generation to grant proposals, patent drafting, and adversarial review. Lazy-loaded; each skill is one `SKILL.md` file. |
| Research-to-PDF pipeline | Produces LaTeX source, deduplicated BibTeX (CrossRef-verified), per-skill artifact files, figures, scripts, and a compiled PDF, all in one run. |
| Zero-drop citations | Entries that fail CrossRef verification are recovered via title lookup or kept as `@misc` stubs; none are silently dropped. |
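
The zero-drop behaviour can be pictured as a three-step fallback: verify, recover by title, or keep a stub. The sketch below is illustrative only. `verify_doi` and `lookup_by_title` are hypothetical callables standing in for the CrossRef queries, and the dictionary layout of an entry is an assumption rather than the project's actual data model.

```python
from typing import Callable, Optional

Entry = dict  # assumed shape: {"key": ..., "title": ..., "doi": ... (optional)}


def recover_entry(
    entry: Entry,
    verify_doi: Callable[[str], bool],                   # hypothetical CrossRef DOI check
    lookup_by_title: Callable[[str], Optional[Entry]],   # hypothetical CrossRef title search
) -> Entry:
    """Return a verified entry, a title-recovered entry, or an @misc stub."""
    doi = entry.get("doi")
    if doi and verify_doi(doi):
        return {**entry, "status": "verified"}

    found = lookup_by_title(entry.get("title", ""))
    if found is not None:
        # Keep the original citation key so existing \cite{...} commands still resolve.
        return {**found, "key": entry["key"], "status": "recovered"}

    # Nothing could be confirmed: keep a minimal stub instead of dropping the entry.
    return {"key": entry["key"], "type": "misc",
            "title": entry.get("title", ""), "status": "stub"}


def to_bibtex_stub(entry: Entry) -> str:
    """Render a stub entry as a minimal @misc record."""
    return f"@misc{{{entry['key']},\n  title = {{{entry['title']}}}\n}}"
```

Because every branch returns an entry with the original key, the bibliography never loses a key that the LaTeX source cites.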

🧪 Showcases

Sample reports generated by Nano-scientist at --budget 1:

🧠 How it works

flowchart TD
    I([Initializer\nzero LLM calls]) -->|literature| LIT

    subgraph LIT_LOOP["  Literature Loop  "]
        LIT[LiteratureReviewLoop\ndecide → skill → quality gate]
        LIT -->|"next iter"| LIT
    end

    LIT -->|"goal met · budget low"| EXP

    subgraph EXP_LOOP["  Experiment Loop  "]
        EXP[ExperimentationLoop\ndecide → skill → quality gate]
        EXP -->|"next iter"| EXP
    end

    EXP -->|"goal met · budget low"| WR

    subgraph WRITE_LOOP["  Writing Loop  "]
        WR[WritingLoop\nwrite sections → review pass → fix]
    end

    WR -->|compile| CT

    subgraph COMPILE["  Compiling Loop  "]
        CT[CompileTeX\npdflatex + bibtex]
        FT[FixTeX\npatch errors]
        CT -->|fix| FT
        FT -->|compile| CT
    end

    CT -->|done| F([Finisher\ncost_log · summary])
    FT -->|done| F
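
The literature and experiment loops in the flowchart share one cycle: decide, run a skill, check the quality gate. A minimal sketch of that control flow follows, written in plain Python rather than PocketFlow. All four callables are hypothetical stand-ins, and the two defaults mirror the documented `MAX_LOOP_ITERATIONS` and `MIN_CALLS_TO_CONTINUE` values.

```python
from typing import Callable


def run_loop(
    decide: Callable[[list], str],          # returns a skill id or "done"
    run_skill: Callable[[str], str],        # executes a skill and returns its output
    goal_met: Callable[[list], bool],       # quality gate over the history so far
    remaining_calls: Callable[[], float],   # current estimate of affordable LLM calls
    max_iterations: int = 20,               # mirrors MAX_LOOP_ITERATIONS
    min_calls: int = 3,                     # mirrors MIN_CALLS_TO_CONTINUE
) -> tuple[list, str]:
    """Run decide -> skill -> gate until a stop condition fires.

    Returns the history and the reason the loop stopped.
    """
    history: list = []
    for _ in range(max_iterations):
        if remaining_calls() < min_calls:
            return history, "budget low"
        skill = decide(history)
        if skill == "done":
            return history, "done"
        history.append((skill, run_skill(skill)))
        if goal_met(history):
            return history, "goal met"
    return history, "max iterations"
```

Each loop owns its stop conditions, which is what lets the pipeline run without a central planner.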

Stage breakdown

| Stage | What happens |
| --- | --- |
| Initializer | Creates `outputs/<uuid>/` and classifies the topic as survey vs. experimental (`is_survey`), using zero LLM calls |
| LiteratureReviewLoop | Each iteration: the LLM picks `skill\|done` → executes the skill → quality gate checks the goal; exits when the goal is met or the budget is low |
| ExperimentationLoop | Survey: synthesis skills (tables, figures from literature). Experimental: `experiment-pipeline`, `experiment-craft`, etc. |
| WritingLoop | Writes all required sections, runs a peer-review pass, addresses major comments, assembles `.tex` |
| CompilingLoop | `pdflatex` + `bibtex`; on error or undefined citations, FixTeX patches and recompiles (up to 2 attempts) |
| Finisher | Writes `cost_log.json` + `summary.json`, prints total cost |
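
The compile/fix cycle in the CompilingLoop row can be sketched as a bounded retry. This is an illustration under stated assumptions: `compile_once` and `fix` are hypothetical callables, "up to 2 attempts" is read here as two fix attempts, and `latex_commands` lists the conventional `pdflatex`/`bibtex` sequence, which may differ from what the project actually runs.

```python
from typing import Callable, Optional


def latex_commands(stem: str = "report") -> list[list[str]]:
    """Conventional sequence: pdflatex, bibtex, then pdflatex twice to settle references."""
    tex = ["pdflatex", "-interaction=nonstopmode", f"{stem}.tex"]
    return [tex, ["bibtex", stem], tex, tex]


def compile_with_fixes(
    compile_once: Callable[[], Optional[str]],  # returns an error message, or None on success
    fix: Callable[[str], None],                 # patches the .tex source given the error
    max_fixes: int = 2,                         # assumed meaning of "up to 2 attempts"
) -> bool:
    """Compile, and on failure alternate fix -> recompile up to max_fixes times."""
    error = compile_once()
    fixes = 0
    while error is not None and fixes < max_fixes:
        fix(error)
        fixes += 1
        error = compile_once()
    return error is None
```

Bounding the retries keeps a stubborn LaTeX error from consuming the remaining budget.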

🚀 Quickstart

# 1) Clone
git clone https://github.com/AI4Scientist/nano-scientist
cd nano-scientist

# 2) Install dependencies
pip install -r requirements.txt

# 3) Add API keys
cp .env.example .env
# edit .env — minimum: OPENROUTER_API_KEY

# 4) Run
python main.py "CRISPR off-target effects in primary T cells" --budget 2.00

# Or pass a research proposal .md file
python main.py proposal.md --budget 0.50

Output lands in `outputs/<uuid>/`:

outputs/
└── <uuid>/
    ├── report.tex         # assembled LaTeX source
    ├── report.pdf         # final PDF (if pdflatex installed)
    ├── references.bib     # deduplicated BibTeX
    ├── artifacts/         # per-skill markdown outputs
    ├── figures/           # generated plots / images
    ├── data/              # collected CSV / JSON data
    ├── scripts/           # executed code blocks
    ├── traj.txt           # full stdout trace
    ├── history.json       # step-by-step execution log
    ├── cost_log.json      # per-step token costs
    └── summary.json       # final run summary
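
To check a finished run against the layout above, a small helper can locate the newest run directory and report which files are missing. This is a convenience sketch, not part of the project: the function names are hypothetical, and only the file names come from the tree above.

```python
from pathlib import Path
from typing import Optional

# File names taken from the output tree; report.pdf exists only if pdflatex is installed.
EXPECTED_FILES = [
    "report.tex", "report.pdf", "references.bib", "traj.txt",
    "history.json", "cost_log.json", "summary.json",
]


def latest_run(output_dir: str = "outputs") -> Optional[Path]:
    """Return the most recently modified run directory, or None if there is none."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


def missing_files(run_dir: Path) -> list[str]:
    """List expected top-level files that the run did not produce."""
    return [name for name in EXPECTED_FILES if not (run_dir / name).exists()]
```

A missing `report.pdf` alongside a present `report.tex` usually points at the LaTeX toolchain rather than the research loops.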

🖥️ CLI reference

python main.py [topic] [options]

Arguments:
  topic                 Research topic — a plain string or path to a .md file.
                        Optional when using --list-skills.

Options:
  -b, --budget FLOAT    Spend limit in USD  (default: $1.00)
  -o, --output DIR      Output directory    (default: outputs/)
  -e, --env FILE        Path to .env file   (default: .env)
  --list-skills         Print available skills and exit

python main.py "CRISPR off-target effects in primary T cells" --budget 1.00
python main.py proposal.md --budget 1.00
python main.py --list-skills
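
For readers who want to see how such an interface is typically declared, here is an `argparse` sketch that reproduces the documented arguments and defaults. It is a reconstruction from the reference above, not the contents of `main.py`; `build_parser` and `parse` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented CLI: one optional positional and four options."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("topic", nargs="?",
                        help="Research topic: a plain string or path to a .md file")
    parser.add_argument("-b", "--budget", type=float, default=1.00,
                        help="Spend limit in USD")
    parser.add_argument("-o", "--output", default="outputs/", metavar="DIR",
                        help="Output directory")
    parser.add_argument("-e", "--env", default=".env", metavar="FILE",
                        help="Path to .env file")
    parser.add_argument("--list-skills", action="store_true",
                        help="Print available skills and exit")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    """Parse argv and enforce that topic is present unless --list-skills is used."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.topic is None and not args.list_skills:
        parser.error("topic is required unless --list-skills is given")
    return args
```

Declaring `topic` with `nargs="?"` is what allows `--list-skills` to run without a topic.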

Budget

Every run targets a full 8-section paper. Budget controls depth, not report type — more budget means more skill calls, more citations, and more revision rounds. Loops terminate when estimated remaining LLM calls drop below a threshold, so the agent always spends as much as it can usefully spend.
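
One plausible way to estimate remaining calls is to divide the unspent budget by the average cost of the calls made so far. The sketch below assumes that approach; the function names and the averaging rule are illustrative, while the per-million token prices and the threshold correspond to `INPUT_TOKEN_COST_PER_MILLION`, `OUTPUT_TOKEN_COST_PER_MILLION`, and `MIN_CALLS_TO_CONTINUE`.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_million: float, output_price_per_million: float) -> float:
    """Cost of one LLM call in USD, given token counts and per-million-token prices."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000


def estimated_remaining_calls(budget: float, spent: float,
                              past_call_costs: list[float]) -> float:
    """Unspent budget divided by the average cost of past calls (assumed heuristic)."""
    remaining = max(budget - spent, 0.0)
    if not past_call_costs:
        return float("inf")  # no history yet, so nothing to extrapolate from
    average = sum(past_call_costs) / len(past_call_costs)
    return remaining / average if average > 0 else float("inf")


def should_continue(budget: float, spent: float,
                    past_call_costs: list[float], min_calls: int = 3) -> bool:
    """Keep looping only while at least min_calls more calls look affordable."""
    return estimated_remaining_calls(budget, spent, past_call_costs) >= min_calls
```

For example, with $0.10 left and an average of $0.02 per call, roughly five calls remain and the loop continues; with $0.04 left, roughly two remain and it stops.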

🧩 Skills

Each skill is a folder under `skills/` with a single `SKILL.md` (lazy-loaded at runtime). Skills with `allowed-tools: Bash` get a real tool-calling loop with bash execution and error feedback.
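
Lazy loading here means a skill's instructions are read from disk only when the skill is first selected. A minimal sketch using `functools.lru_cache` follows; `load_skill` is a hypothetical name and the caching strategy is an assumption about how this could be done, not a description of the project's loader.

```python
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_skill(skill_id: str, skills_dir: str = "skills") -> str:
    """Read <skills_dir>/<skill_id>/SKILL.md on first use; later calls hit the cache."""
    return (Path(skills_dir) / skill_id / "SKILL.md").read_text(encoding="utf-8")
```

With 87 skills available, reading only the files that a run actually selects keeps startup cheap.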

Core research skills

| Skill | What it does |
| --- | --- |
| `paper-navigator` | Find and read academic papers: keyword search, citation traversal, arXiv monitoring, SOTA lookup |
| `research-survey` | Structured literature survey: outline → draft → section expansion → final assembly with dense citations |
| `research-ideation` | Multi-persona ideation → ELO tournament ranking → manuscript-quality proposal |
| `paper-planning` | Story design, experiment planning, figure design, 4-week timeline |
| `experiment-pipeline` | 4-stage experiment execution: baseline → hyperparameter tuning → proposed method → ablation |
| `experiment-craft` | Debug and iterate on existing experiments: 5-step diagnostic flow |
| `experiment-iterative-coder` | Iterative code refinement: plan → code → evaluate → refine cycles with lint/test scoring |
| `paper-writing` | Academic sections: 11-step workflow with LaTeX templates and section-by-section guidance |
| `paper-review` | Pre-submission self-review: 5-aspect checklist, adversarial stress-testing |
| `paper-rebuttal` | Peer-review rebuttals: score diagnosis, comment prioritization, 18 tactical writing rules |
| `evo-memory` | Persistent research memory: Ideation Memory + Experimentation Memory across cycles |
| `study-workflow` | Publication-quality research workflow diagram as PNG via image generation |

Extended skill library — 87 skills total

| Skill | What it does |
| --- | --- |
| `ablation-planner` | Design ablations from a reviewer's perspective for paper submission |
| `alphaxiv` | Quick single-paper lookup via AlphaXiv LLM-optimized summaries |
| `analyze-results` | Compute statistics, generate comparison tables from ML experiment results |
| `arxiv` | Search, download, and summarize papers from arXiv |
| `auto-paper-improvement-loop` | Autonomously improve a generated paper via GPT review → fix → recompile |
| `auto-review-loop` | Autonomous multi-round research review loop via Codex MCP |
| `auto-review-loop-llm` | Same review loop using any OpenAI-compatible LLM API |
| `auto-review-loop-minimax` | Same review loop using MiniMax API |
| `citation-audit` | Verify every bibliographic entry is real, correctly attributed, and in context |
| `claims-drafting` | Draft patent claims for an invention |
| `comm-lit-review` | Communications-domain literature review (wireless, networking, satellite, cellular) |
| `deepxiv` | Search and progressively read open-access papers through DeepXiv |
| `dse-loop` | Autonomous design space exploration for computer architecture and EDA |
| `embodiment-description` | Write detailed embodiment descriptions for patent specifications |
| `exa-search` | AI-powered web search via Exa with content extraction |
| `experiment-audit` | Audit experiment integrity via cross-model review before claiming results |
| `experiment-bridge` | Bridge from EXPERIMENT_PLAN.md to GPU-deployed initial results |
| `experiment-plan` | Turn a proposal into a claim-driven experiment roadmap |
| `experiment-queue` | SSH job queue for multi-seed/multi-config ML experiments with OOM-aware retry |
| `feishu-notify` | Send notifications to Feishu/Lark |
| `figure-description` | Process patent figures and generate formal drawing descriptions |
| `figure-spec` | Deterministic publication-quality diagrams from JSON to editable SVG |
| `formula-derivation` | Structure and derive research formulas into paper-ready derivation packages |
| `gemini-search` | Broad literature discovery via Gemini |
| `grant-proposal` | Draft structured grant proposals (KAKENHI, NSF, NSFC, ERC, DFG, SNSF, ARC, NWO) |
| `idea-creator` | Generate and rank research ideas given a broad direction |
| `idea-discovery` | Full pipeline: lit review → idea generation → novelty check → critical review |
| `idea-discovery-robot` | Same pipeline adapted for robotics and embodied AI |
| `invention-structuring` | Structure a raw invention into a formal invention disclosure |
| `jurisdiction-format` | Compile patent into CN/US/EP jurisdiction-specific filing format |
| `kill-argument` | Adversarial two-thread review: strongest rejection memo → point-by-point defense |
| `mermaid-diagram` | Generate Mermaid diagrams (.mmd + .md) with syntax verification |
| `meta-optimize` | Analyze ARIS usage logs and propose harness optimizations |
| `monitor-experiment` | Monitor running experiments, check progress, collect results |
| `novelty-check` | Verify research idea novelty against recent literature |
| `openalex` | Search via OpenAlex for open citation data and institutional metadata |
| `overleaf-sync` | Two-way sync between local paper directory and Overleaf via Git bridge |
| `paper-claim-audit` | Verify every number and comparison in the paper matches raw result files |
| `paper-compile` | Compile LaTeX to PDF, fix errors, verify output |
| `paper-figure` | Generate publication-quality figures and tables from experiment results |
| `paper-illustration` | AI illustrations for academic papers via Gemini image generation |
| `paper-illustration-image2` | Same via Codex native image generation |
| `paper-plan` | Generate structured outline from review conclusions and experiment results |
| `paper-poster` | Conference poster (A0/A1 PDF + editable PPTX + SVG) from a compiled paper |
| `paper-slides` | Conference slides (Beamer PDF + editable PPTX) with speaker notes |
| `paper-talk` | End-to-end conference talk pipeline: paper → outline → Beamer + PPTX → export |
| `paper-write` | Draft LaTeX section by section from an outline |
| `patent-novelty-check` | Assess patent novelty and non-obviousness against prior art |
| `patent-pipeline` | Full patent drafting: CN/US/EP support, invention patents and utility models |
| `patent-review` | External patent examiner review of a patent application |
| `pixel-art` | Generate pixel art SVG illustrations for READMEs and docs |
| `prior-art-search` | Search patent databases and academic literature for prior art |
| `proof-checker` | Rigorous mathematical proof verification and fixing workflow |
| `proof-writer` | Write rigorous mathematical proofs for ML/AI theory |
| `qzcli` | Manage GPU compute jobs on the Qizhi (启智) platform |
| `rebuttal` | Submission rebuttal pipeline with coverage enforcement and venue limits |
| `research-ideation` | Multi-persona ideation → ELO ranking → manuscript-quality proposal |
| `research-lit` | Search and analyze papers, find related work, summarize key ideas |
| `research-pipeline` | Full pipeline: idea discovery → implementation → review → paper writing |
| `research-refine` | Turn a vague direction into a problem-anchored, frontier-aware method plan |
| `research-refine-pipeline` | Chain research-refine + experiment-plan in one shot |
| `research-review` | Deep critical review of research via Codex MCP |
| `research-wiki` | Persistent research knowledge base across the full research lifecycle |
| `resubmit-pipeline` | Orchestrate text-only paper resubmission to a different venue |
| `result-to-claim` | Judge what claims experiment results support before writing |
| `run-experiment` | Deploy and run ML experiments on local, remote, Vast.ai, or Modal GPU |
| `semantic-scholar` | Search published papers via Semantic Scholar API with citation metadata |
| `serverless-modal` | Run GPU workloads on Modal: training, fine-tuning, inference |
| `slides-polish` | Per-page Codex review + targeted python-pptx / Beamer fixes |
| `specification-writing` | Write full patent specification from claims and invention disclosure |
| `system-profile` | Profile scripts, processes, GPU, memory, and interconnect; produces structured reports |
| `training-check` | Periodic WandB metric checks to catch NaN/divergence/idle GPUs early |
| `vast-gpu` | Rent, manage, and destroy GPU instances on vast.ai |
| `writing-systems-papers` | Paragraph-level blueprint for 10–12 page systems papers (OSDI, SOSP, ASPLOS) |

Add a skill

Create `skills/my-skill/SKILL.md` with YAML frontmatter:

---
id: my-skill
description: One-line description shown in the planner.
allowed-tools: Bash        # grants bash tool-calling with error feedback
required-keys: [HF_TOKEN]  # optional; skill is filtered out if key missing
---

Your skill instructions here.

Register in `skills/skills.json`:

{ "id": "my-skill", "description": "One-line description shown in the planner." }
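
The two manual steps above can be scripted. The helper below is a hypothetical convenience, not something the repository ships: it writes the frontmatter shown earlier and appends the index entry, assuming that `skills.json` is a JSON array of `{id, description}` objects.

```python
import json
from pathlib import Path
from typing import Optional


def register_skill(skills_dir: str, skill_id: str, description: str, body: str,
                   allowed_tools: Optional[str] = None,
                   required_keys: Optional[list[str]] = None) -> None:
    """Write <skills_dir>/<skill_id>/SKILL.md and add the skill to skills.json."""
    root = Path(skills_dir)

    # Build the YAML frontmatter with the documented field names.
    lines = ["---", f"id: {skill_id}", f"description: {description}"]
    if allowed_tools:
        lines.append(f"allowed-tools: {allowed_tools}")
    if required_keys:
        lines.append(f"required-keys: [{', '.join(required_keys)}]")
    lines += ["---", "", body, ""]

    skill_path = root / skill_id / "SKILL.md"
    skill_path.parent.mkdir(parents=True, exist_ok=True)
    skill_path.write_text("\n".join(lines), encoding="utf-8")

    # Assumed index format: a JSON array of {"id", "description"} objects.
    index_path = root / "skills.json"
    index = json.loads(index_path.read_text(encoding="utf-8")) if index_path.exists() else []
    if not any(item["id"] == skill_id for item in index):
        index.append({"id": skill_id, "description": description})
    index_path.write_text(json.dumps(index, indent=2) + "\n", encoding="utf-8")
```

Calling it twice with the same `skill_id` rewrites `SKILL.md` but leaves a single index entry.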

🔐 Environment variables

Required

| Variable | Used for |
| --- | --- |
| `OPENROUTER_API_KEY` | Core LLM inference (all nodes) |

Skill-gated (optional)

| Variable | Skills that use it |
| --- | --- |
| `HF_TOKEN` | Skills accessing Hugging Face Hub |
| `GITHUB_TOKEN` | Skills querying GitHub repos/issues |
| `S2_API_KEY` | Semantic Scholar API |
| `OPENAI_API_KEY` | Skills using OpenAI-compatible endpoints |

Missing skill keys automatically filter out dependent skills at startup.
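
The startup filter can be sketched as: read each skill's `required-keys` from its frontmatter and keep the skill only if every key is set. The code below is an assumption-laden illustration. It handles only the inline-list form shown in the frontmatter example, uses a regular expression instead of a YAML parser, and both function names are hypothetical.

```python
import os
import re
from typing import Mapping


def required_keys(skill_md: str) -> list[str]:
    """Extract required-keys from the frontmatter (inline [A, B] form only)."""
    parts = skill_md.split("---", 2)
    if len(parts) < 3:
        return []  # no frontmatter block found
    match = re.search(r"^required-keys:\s*\[(.*?)\]", parts[1], flags=re.MULTILINE)
    if not match:
        return []
    return [key.strip() for key in match.group(1).split(",") if key.strip()]


def available_skills(skills: Mapping[str, str],
                     env: Mapping[str, str] = os.environ) -> list[str]:
    """Return ids of skills whose required keys are all present and non-empty in env."""
    return [skill_id for skill_id, text in skills.items()
            if all(env.get(key) for key in required_keys(text))]
```

A skill without a `required-keys` line always passes the filter, matching the field's optional status.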

Tuning (all optional)

| Variable | Default | Purpose |
| --- | --- | --- |
| `MODEL_NAME` | (none) | Override the inference model |
| `INFERENCE_BASE_URL` | (none) | Custom OpenAI-compatible endpoint |
| `REVIEWER_MODEL` | (none) | Second model for the quality gate (e.g. `openai/gpt-4o`); falls back to `MODEL_NAME` if unset |
| `INPUT_TOKEN_COST_PER_MILLION` | (none) | Input-token price per million tokens, used to estimate remaining LLM calls |
| `OUTPUT_TOKEN_COST_PER_MILLION` | (none) | Output-token price per million tokens, used to estimate remaining LLM calls |
| `LOOKBACK` | `3` | History steps visible per LLM call |
| `MAX_REVIEW_ROUNDS` | `1` | Writing review/revision passes |
| `MAX_TOOL_ROUNDS` | `16` | Max bash tool-calling rounds per skill |
| `MAX_LOOP_ITERATIONS` | `20` | Max iterations per research loop |
| `MIN_CALLS_TO_CONTINUE` | `3` | Stop a loop when the estimated number of remaining calls falls below this |
| `OUTPUT_LANGUAGE` | auto-detect | Force output language (e.g. `"French"`); ASCII-only topics default to English |
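
The defaults and fallbacks in this table can be read from the environment along the following lines. This is a sketch under assumptions: the `Tuning` dataclass and helper names are hypothetical, and returning `None` from `output_language` simply stands for "fall through to auto-detection", whose mechanism the table does not describe.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Tuning:
    lookback: int = 3
    max_review_rounds: int = 1
    max_tool_rounds: int = 16
    max_loop_iterations: int = 20
    min_calls_to_continue: int = 3


def load_tuning(env: Mapping[str, str] = os.environ) -> Tuning:
    """Read the integer tuning variables, using the documented defaults when unset."""
    def as_int(name: str, default: int) -> int:
        raw = env.get(name)
        return int(raw) if raw else default

    return Tuning(
        lookback=as_int("LOOKBACK", 3),
        max_review_rounds=as_int("MAX_REVIEW_ROUNDS", 1),
        max_tool_rounds=as_int("MAX_TOOL_ROUNDS", 16),
        max_loop_iterations=as_int("MAX_LOOP_ITERATIONS", 20),
        min_calls_to_continue=as_int("MIN_CALLS_TO_CONTINUE", 3),
    )


def reviewer_model(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """REVIEWER_MODEL if set, otherwise fall back to MODEL_NAME."""
    return env.get("REVIEWER_MODEL") or env.get("MODEL_NAME")


def output_language(topic: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Explicit override first; ASCII-only topics default to English; else auto-detect."""
    forced = env.get("OUTPUT_LANGUAGE")
    if forced:
        return forced
    return "English" if topic.isascii() else None
```

`LOOKBACK` would then bound the context with something like `history[-tuning.lookback:]` before each LLM call.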

🗂️ Project layout

nano-scientist/
├── main.py              # CLI entry point
├── src/
│   ├── flow.py          # PocketFlow wiring (3 loops + compile/fix)
│   ├── nodes.py         # 7 nodes + helpers
│   └── utils.py         # LLM client, cost tracking, BibTeX utils
├── skills/              # 87 modular research skills
│   ├── skills.json      # skill index (id + description)
│   └── <skill-name>/
│       └── SKILL.md     # instructions + YAML frontmatter
├── outputs/             # generated reports (git-ignored)
└── .env                 # API keys (git-ignored)

🤝 Join the Community

Open an issue — bug reports, feature requests, skill ideas
Submit a PR — new skills are one SKILL.md file

📌 Citation

If you use Nano-scientist in your research, please cite:

@software{nano_scientist2026,
  title  = {Nano-scientist: Autonomous Research Agent for Budget-Constrained Scientific Reports},
  author = {{AI4Scientist Team}},
  year   = {2026},
  url    = {https://github.com/AI4Scientist/nano-scientist}
}
