Archon v0.1.0. This version introduces a unified
archonCLI, a one-line installer, an auto-launching web dashboard, and an interactive proof graph.Upgrading from a pre-CLI checkout? See MIGRATION.md — it walks through installing the new CLI and reconciling an already-initialized project without losing your local edits.
Full release notes: CHANGELOG.md.
Archon is an agentic system that autonomously formalizes research-level mathematics in Lean 4. A plan agent provides strategic guidance while prover agents write and verify proofs — separating analysis from execution to avoid context explosion. The system handles repository-scale formalization through three phases: scaffolding, proving, and polish. Built on Claude Code and Claude Opus 4.6, with a modified fork of lean-lsp-mcp and lean4-skills. Archon originated from orchestrating Claude Code with OpenClaw — see Standard vs. orchestrator-scheduled mode. See also our blog and announcement.
Archon is designed and optimized for project-level formalization — multi-file repositories with interdependent theorems, not isolated competition problems. As such, single-problem benchmarks are not a specific optimization target. For model choice, Opus 4.6 is strongly recommended; Sonnet also works well but is less capable. Other models have not been tested — weaker models may struggle with the complex skills and prompt structures, in which case Archon's system design could hurt performance rather than help it.
Security note: archon loop runs Claude Code with --dangerously-skip-permissions --permission-mode bypassPermissions, meaning the model can execute arbitrary shell commands, read/write any file the process can access, and make network requests — all without asking for confirmation. This is necessary for unattended operation but carries real risk: a misbehaving model could delete files, overwrite code, or run unintended commands. While Opus 4.6 NEVER caused harm across all of our experiments, the following measures can further reduce exposure:
- Commit and push your project before running Archon, so any unintended changes can be easily reverted.
- Run Archon under a dedicated, low-privilege user that only has access to the project directory
- Run inside a Docker container or VM with no access to sensitive data or credentials
- Avoid running as root or with access to production systems
- Review
.archon/proof-journal/after each run to audit what the agents did
Note:
archon loopruns Claude Code with--dangerously-skip-permissions, which Claude Code refuses when running as root on Linux. Two workarounds:
- Use a non-root account (RECOMMENDED) — e.g. create one with
adduser— so you are not running with excessive root privileges.- Set
export IS_SANDBOX=1so Claude Code is allowed to start with this high-risk option.
To install the CLI tools and system dependencies, run the following command in your terminal (it is recommended, but not required, to run inside a Python virtual environment):
curl -sSL https://raw.githubusercontent.com/frenzymath/Archon/refs/heads/main/install.sh | bashThis fetches the repository, runs pip install ., and executes archon setup to install system-level dependencies (uv, Claude Code) and verify your Lean toolchain. The installation process might be slow the first time.
You should now be able to verify the installation and be guided on its usage with:
archon -hTo update an existing install later:
archon updatearchon setup also checks for API keys used by the informal agent
(OPENAI_API_KEY, GEMINI_API_KEY, or OPENROUTER_API_KEY) — at least one is recommended but not required.
The bundled informal agent is a simplified demonstration: a single API call to an external model for proof sketches. Our internal implementation is more involved but not yet ready for open-sourcing. In practice the one-shot approach does not show an obvious performance drop, likely because Claude Code performs its own verification and refinement on the returned sketches.
| Command | Description |
|---|---|
archon init |
Initialize a new Archon project (or reconcile an existing one). |
archon loop |
Run the automated plan → prove → review loop. |
archon dashboard |
Start the web monitoring interface (auto-launched by loop by default). |
archon doctor |
Verify the full Archon setup and health. |
archon prove |
Directly prove an inline statement. |
archon setup |
Install required system dependencies. |
archon update |
Update Archon to the latest published version. |
Run archon --help or archon <command> --help for details.
The project path must point to the directory containing your lakefile.lean or lakefile.toml — this is what defines a Lean project. Otherwise, it can contain informal content (papers, blueprints, notes) that Archon will use to initialize the project structure and write the first objectives.
To initialize:
archon init /path/to/your-lean-projectIf no path is given, init prompts you for a project name and creates it.
archon init does the following inside your project:
- Creates
.archon/with runtime state files and a copy of Archon's prompts (previous versions symlinked — see MIGRATION.md) - Installs Archon's lean4 skills as the
lean4@archon-localplugin at project scope - Copies the informal agent into
.claude/tools/archon-informal-agent.py - Installs Archon's lean-lsp MCP server as
archon-lean-lspat project scope - Detects and disables any conflicting global
lean4-skills/lean-lspMCP (see Existing lean4-skills and lean-lsp MCP installations) - Launches Claude Code interactively to detect project state, set up lakefile/Mathlib if needed, and write initial objectives
If the project has already been initialized, it might use a different version of Archon or you might have modified the prompts/skills manually, init offers four choices:
- keep — leave your existing setup alone; refresh MCP/plugin registrations only
- merge (recommended) — launch Claude Code in a focused diff session and reconcile each prompt /
CLAUDE.mdfile interactively - overwrite — replace all Archon files with the bundled versions (discards local edits)
- abort — cancel without changes
Init automatically runs /archon-lean4:doctor at the end to verify the full setup (Lean environment, MCP, skills, state files). See MIGRATION.md for details on upgrading an older project.
archon loop /path/to/your-lean-projectThe loop alternates plan and prover agents through stages:
| Stage | What happens |
|---|---|
autoformalize |
Scaffolding — translate informal math into Lean declarations with sorry |
prover |
Proving — fill sorry placeholders with verified proofs |
polish |
Verification and polish — golf, refactor, extract reusable lemmas |
By default, archon loop also launches the web dashboard (see Web Dashboard) in the background on a free port in the range 8080–8099 and prints the URL. The dashboard keeps running after the loop finishes so you can review results; stop it with Ctrl-C or by closing the terminal. Disable it with --no-dashboard, or open a browser automatically with --open.
NOTE: The prover agent is instructed to push formalization as far as possible, so the first few runs typically take several hours as it clears all low-hanging fruits. Once only genuinely difficult sorrys remain, each iteration becomes much shorter. To confirm the agent is running, watch the dashboard or tail .archon/logs/iter-*/provers/*.jsonl; the agent also writes Lean files while running, which you can see directly.
The loop exits automatically when the stage reaches COMPLETE. You can run archon loop on multiple projects in parallel from separate terminals — each project's state is independent.
Archon runs fully autonomously, but guiding it with your expertise will speed it up, align it with your preferred proof style, and help it overcome mathematical and Lean challenges.
There are three ways to influence Archon's behavior. Each serves a different purpose:
| Mechanism | When to use | Lifetime | Who reads it |
|---|---|---|---|
| USER_HINTS.md | Mid-run course corrections | One-shot — cleared after each plan cycle | Plan agent |
| /- USER: ... -/ comments | File-specific proof guidance | Persistent — stays in the .lean file |
Prover agent |
| Prompts and skills | Change how agents think and operate | Permanent — applies every iteration | All agents |
USER_HINTS.md — for things that change between iterations. Examples: "prioritize theorem X next", "stop trying approach Y, it's a dead end". The plan agent reads this once, acts on it, and clears the file. Don't put permanent instructions here — they'll be lost.
/- USER: ... -/ comments — for proof-level guidance tied to a specific .lean file. Examples: "try using Finset.sum_comm here", "this sorry depends on the helper lemma above". These persist in the source file and are visible to whichever prover agent owns that file.
Prompts and skills — for changing how agents behave across all iterations. Edit prompts when you want to change the plan agent's strategy, the prover's proof style, or the review agent's analysis. Create or extend skills for reusable workflows in specific situations. For a deeper treatment — including which changes are short-lived vs. permanent, how skills and prompts differ, the recommended order of adjustments, and how to evolve them as you encounter recurring issues — see Section 5 (Skills and Prompts) in ORCHESTRATOR_GUIDE.md.
Archon has two layers — local overrides global:
| Layer | Location | Scope |
|---|---|---|
| Global | Bundled inside the installed archon package |
All projects (template source) |
| Local | <project>/.archon/prompts/*.md |
One project only |
Starting with v0.1.0, local prompts are copies of the bundled templates rather than symlinks. This means:
- You can edit
<project>/.archon/prompts/*.mdfreely without affecting other projects. - Template updates do not automatically propagate. To pull in newer versions of the bundled prompts, run
archon initagain in that project and choose merge (recommended) or overwrite.
If you're coming from a version that used symlinks, see MIGRATION.md for the one-time migration flow.
Archon ships with a modified fork of lean4-skills, installed as lean4@archon-local (providing /archon-lean4:prove, /archon-lean4:doctor, etc.). Skills are sourced from the installed archon package and registered with Claude Code as a local plugin marketplace.
Modifying global skills: Edit files under the installed package's
skills/lean4/ directory. archon init re-registers the marketplace at the correct path on each run, so your edits take effect after re-init.
Adding new global skills: Create a new directory under the bundled
skills/<your-skill-name>/ with a SKILL.md or .claude-plugin/plugin.json inside, and add it to skills/.claude-plugin/marketplace.json. Run archon init again on your project to pick up the new skill.
We encourage you to customize. If you notice the prover repeatedly making the same mistakes, or a proof strategy that consistently works for your project, codify it — add a skill or adjust a prompt. Archon improves as its skills and prompts accumulate lessons from your specific formalization work.
Adding local skills: Place them in <project>/.claude/skills/<your-skill-name>/SKILL.md. They are discovered by Claude Code automatically and won't conflict with Archon's /archon-lean4:* commands. No re-init needed.
To check how the formalization is going, the easiest starting point is the dashboard (auto-launched by archon loop — visit the URL printed in the terminal, e.g. http://localhost:8080). It shows iteration progress, parallel prover status, a file-centric Diffs view backed by recorded code snapshots, agent logs with live streaming, and proof journal milestones.
The Logs view groups logs by iteration with phase timing (plan → prover → review) and per-prover completion status.
The Journal view tracks proof milestones across sessions — see which theorems were solved, blocked, or retried, with condensed reasoning traces that let you follow how the agents approached each proof.
A new Graph view (v0.1.0) renders the proof dependency graph interactively, so you can see which theorems block which and how the formalization is structured.
See src/archon/ui/README.md for more details on Overview / Diffs / Logs / Journal / Graph and the supporting API surface.
You can also inspect state files directly:
.archon/logs/iter-<N>/**/*.jsonl— running log of agent activity. The latest iteration's files tell you whether agents are still working..archon/PROJECT_STATUS.md— overall progress: total sorries, what's solved, what's blocked, and reusable proof patterns..archon/proof-journal/sessions/session_N/summary.md— detailed record of a specific iteration: what was attempted, what succeeded, what failed, and why.
These are updated automatically by the review agent after each iteration.
If you disabled the auto-launched dashboard, or want to look at a project after the loop has finished and the terminal is gone:
archon dashboard /path/to/your-lean-projectIf you already have lean4-skills or lean-lsp MCP installed globally, archon init detects them and disables them for this project only — so only Archon's modified versions are active. Your global installations are untouched and continue working in all other projects.
To restore the originals in an Archon project:
cd /path/to/your-project
claude plugin enable lean4-skills --scope project # re-enable standard skills
claude mcp add lean-lsp -s project -- uvx lean-lsp-mcp # re-enable standard MCP| Flag | Description |
|---|---|
--max-iterations N / -m N |
Max plan→prover→review cycles (default: 10). Exits early if stage reaches COMPLETE. |
--max-parallel N |
Max concurrent provers in parallel mode (default: 8). |
--stage STAGE / -s STAGE |
Force a stage (autoformalize, prover, polish) instead of reading from PROGRESS.md. |
--serial |
One prover at a time instead of parallel (one per file). |
--verbose-logs |
Save raw Claude stream events to .raw.jsonl for debugging. |
--no-review |
Skip review phase. Saves time/cost; plan agent still works without it. |
--no-dashboard |
Do not auto-start the web dashboard. |
--open |
Open the dashboard in a browser as soon as it starts. |
--dry-run |
Print prompts without launching Claude. |
Formalization quality improves materially when the agents have access to the original informal mathematics. Supply as much source material as you can — place files in the repository root or a clearly documented top-level folder (e.g. references/):
- Papers and manuscripts — the primary text being formalized (PDF, LaTeX source, or both). This is the single most important input after the Lean project itself.
- Blueprints — if you have a LeanBlueprint or similar dependency graph, include it. Blueprints give the agents a clear picture of the logical structure and what depends on what.
- Key definitions and lemma references — for important definitions or lemmas, note where they first appear (e.g. "Definition 3.2 in [Author, Year]" or "Lemma 2 of arXiv:XXXX.XXXXX"). If the main paper cites important theorems whose proofs appear elsewhere, include those papers too — either add them yourself or ask Claude Code to fetch them. This helps the agents choose correct formalizations and find existing Mathlib content instead of reinventing it.
Even rough or incomplete material is valuable — partial references are far better than none. The more context the agents have, the better they can disambiguate notation, pick appropriate Mathlib abstractions, and produce proofs that match the mathematical intent.
archon loop is the standard mode — a fixed plan→prover→review loop that runs unattended. It is sufficient for most formalization tasks.
In our experiments, replacing the fixed loop with an orchestrator-scheduled mode — where an outer orchestrator like OpenClaw drives Claude Code directly — yielded stronger results. Instead of following a rigid pipeline, the orchestrator can freely choose when to plan, prove, or review based on the current state, and can supervise the model continuously to prevent premature termination.
Ensure your orchestrator has access to the project directory, and ask it to read README.md for an overview of the project.
We provide ORCHESTRATOR_GUIDE.md as a companion guide for your orchestrator. It was authored by our own OpenClaw based on its accumulated experience orchestrating Claude Code across multiple formalization projects. The guide covers how to read Archon's state files, decide which stage to run next, compose prompts from .archon/prompts/, and invoke claude -p — including prompt composition, adaptive scheduling logic, failure recovery, and operational rules learned from production use.
In standard mode, archon loop enforces a fixed cycle: plan→prover→review, repeated up to --max-iterations. The orchestrator-scheduled mode differs in several ways:
- Environment management — the orchestrator handles setup and debugging: installing dependencies, resolving Mathlib cache issues, verifying that skills and MCP work correctly. These tasks often require back-and-forth troubleshooting that a fixed script cannot do.
- Flexible phase ordering — the orchestrator decides when to plan, prove, or review based on what it observes, rather than following a fixed sequence. It might skip planning when the current objectives are still valid.
- Real-time intervention — the orchestrator can step in the moment the model is stuck. It detects surrender patterns (e.g., "Mathlib lacks infrastructure") and pushes the prover back in with refined hints or alternative strategies.
- Richer cross-session context — the orchestrator has its own memory. It can retain whatever state matters for adaptive routing — failure histories, proof patterns, mathematical context — accumulating richer context over time than a script that only persists a few markdown artifacts between iterations.
Flexibility — the orchestrator decides when to plan, prove, or review based on current state rather than following a fixed sequence, making it adaptable to a wider range of formalization tasks.
Stability — a supervisor layer catches errors that a fixed loop cannot: crashed sessions, malformed state files, stuck provers, or plan agents that set unreasonable objectives. The orchestrator acts as a safety net that keeps the process running correctly over hours or days without manual intervention.
Evolvability - by design, orchestrators like OpenClaw can author and refine skills and prompts over time. The global/local skill and prompt slots are designed not only for human experts but also for orchestrators: they can analyze failure modes and update skills or prompts accordingly (with your permission), making the system progressively more powerful.

