Lean Swarm

Lean Swarm is a cost-focused multi-agent prediction and simulation engine. It aims to approximate MiroFish-class narrative forecasting while keeping model spend low through aggressive batching, sparse activation, hybrid state, and strict LLM routing guardrails.

Overview

Given a seed document and a prediction question, Lean Swarm builds a simulated world of agents, runs a bounded number of interaction ticks, and returns:

a structured prediction report
a post-simulation world snapshot with agent states and relationship edges

The project is structured for open-source collaboration, MIT licensing, and PyPI publishing from day one.

Quickstart

Install From PyPI

pip install leanswarm

This installs the Python package, API, and CLI. Node.js is not required for the core package.

Alternatively, with uv:

uv pip install leanswarm

Configure Live Providers

Dry-run mode works without any API keys. If you want live model routing, set one of:

export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...

Run

leanswarm smoke
leanswarm simulate --seed examples/seed.txt --question "Will public trust rise this quarter?"
leanswarm api
leanswarm bench

Import In Python

from leanswarm.engine.models import SimulationRequest
from leanswarm.engine.simulator import LeanSwarmEngine

Architecture

Core rules

All model traffic is routed through engine/llm.py.
Every LLM route and simulation tick is logged.
The engine uses Pydantic schemas at every boundary.
LLM calls are retried and concurrency-limited with semaphores.
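
To make the last rule concrete, here is a minimal sketch of retrying a call while capping concurrency with a semaphore. It is not the code in engine/llm.py: the names call_with_guardrails, max_attempts, and base_delay are hypothetical, and the flaky coroutine stands in for a real provider call.

```python
import asyncio


async def call_with_guardrails(fn, semaphore, *, max_attempts=3, base_delay=0.01):
    """Await fn() under a semaphore, retrying failures with exponential backoff.

    Hypothetical helper for illustration; not Lean Swarm's actual API.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            # The semaphore bounds how many calls are in flight at once.
            async with semaphore:
                return await fn()
        except Exception as exc:
            last_error = exc
            # Back off outside the semaphore so waiting does not hold a slot.
            if attempt < max_attempts - 1:
                await asyncio.sleep(base_delay * (2 ** attempt))
    raise last_error


async def demo():
    """Run five stand-in calls that each fail once, with at most two in flight."""
    semaphore = asyncio.Semaphore(2)
    in_flight = 0
    peak = 0
    attempts = {}

    def make_call(name):
        async def flaky():
            nonlocal in_flight, peak
            attempts[name] = attempts.get(name, 0) + 1
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await asyncio.sleep(0.005)  # simulated network latency
                if attempts[name] == 1:
                    raise RuntimeError("transient failure")
                return f"{name}: ok"
            finally:
                in_flight -= 1

        return flaky

    calls = [call_with_guardrails(make_call(f"agent-{i}"), semaphore) for i in range(5)]
    results = await asyncio.gather(*calls)
    return results, peak, attempts


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Sleeping for the backoff after leaving the `async with` block is a deliberate choice in this sketch: a call that is waiting to retry should not occupy one of the limited concurrency slots.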

Current engine shape

Tiered model routing with FLAGSHIP, STANDARD, and CHEAP tiers.
Batched group actions for active agents.
Seed-ingestion and world-building helpers that extract topics, entities, and a world graph from the seed document.
Seed-conditioned population construction with archetype jittering and bounded named-agent counts.
Hybrid numeric state for mood, energy, attention, and relationships.
Sparse activation and trigger heuristics that keep only a subset of agents active per tick.
Hierarchical memory slices for working, episodic, and semantic references backed by SQLite with vector-search support and deterministic offline fallback.
Disk-backed action caching via diskcache.
Early convergence detection on low-delta ticks.
A minimal Next.js viewer under web/ for inspecting pasted simulation JSON and exploring the post-simulation world snapshot.

See docs/architecture.md for more detail.
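
As an illustration of how sparse activation and early convergence detection can work together, the sketch below activates only the highest-attention fraction of agents each tick and stops once the mean change stays small. It is a toy model under stated assumptions, not the engine's implementation: select_active, run_ticks, epsilon, patience, and the mood update rule are all hypothetical.

```python
import math


def select_active(attention, fraction):
    """Return the ids of the top `fraction` of agents by attention (at least one)."""
    k = max(1, math.ceil(len(attention) * fraction))
    # Sort by descending attention, breaking ties by id for determinism.
    ranked = sorted(attention, key=lambda agent: (-attention[agent], agent))
    return ranked[:k]


def run_ticks(mood, attention, *, fraction=0.25, max_ticks=50, epsilon=1e-2, patience=2):
    """Update active agents' mood in place and return the number of ticks run.

    Toy rule (an assumption): each active agent moves halfway toward the
    population mean. The loop stops early after `patience` consecutive ticks
    whose mean absolute change is below `epsilon`.
    """
    quiet_ticks = 0
    for tick in range(1, max_ticks + 1):
        active = select_active(attention, fraction)
        mean_mood = sum(mood.values()) / len(mood)
        total_delta = 0.0
        for agent in active:
            new_value = mood[agent] + 0.5 * (mean_mood - mood[agent])
            total_delta += abs(new_value - mood[agent])
            mood[agent] = new_value
        # Count consecutive low-delta ticks; any large change resets the count.
        if total_delta / len(active) < epsilon:
            quiet_ticks += 1
        else:
            quiet_ticks = 0
        if quiet_ticks >= patience:
            return tick
    return max_ticks
```

With eight agents and fraction=0.25, only two agents are updated per tick, so per-tick work scales with the active subset rather than the whole population. Inactive agents keep their state untouched until something raises their attention.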

Optional web viewer

The web/ app is a separate, optional Next.js inspector for pasted simulation JSON. It is not required to install or use the Python package or leanswarm CLI.

cd web
npm install
npm run dev

CLI Usage

Smoke test

leanswarm smoke

Simulate a scenario

leanswarm simulate \
  --seed examples/seed.txt \
  --question "Will the policy announcement improve sentiment?" \
  --activation-mode lean \
  --active-agent-fraction 0.25

Run the API

leanswarm api --host 127.0.0.1 --port 8000

Run the benchmark harness

leanswarm bench

API Usage

Start server

leanswarm api

Example request

curl -X POST http://127.0.0.1:8000/simulate \
  -H "Content-Type: application/json" \
  -d '{
    "seed_document": "A national survey shows mixed views on new policy proposals.",
    "question": "Will approval improve over the next month?",
    "rounds": 4
  }'
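
The same request can be sent from Python using only the standard library. This is a sketch: the endpoint and payload fields are taken from the curl example above, while the helper names build_request and simulate and the timeout value are assumptions. The response is returned as parsed JSON without assuming any particular field names.

```python
import json
import urllib.request


def build_request(payload, base_url="http://127.0.0.1:8000"):
    """Build a POST request for the /simulate endpoint shown in the curl example."""
    return urllib.request.Request(
        f"{base_url}/simulate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simulate(payload, base_url="http://127.0.0.1:8000", timeout=120.0):
    """Send the request to a running `leanswarm api` server and parse the JSON reply."""
    request = build_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    result = simulate(
        {
            "seed_document": "A national survey shows mixed views on new policy proposals.",
            "question": "Will approval improve over the next month?",
            "rounds": 4,
        }
    )
    print(json.dumps(result, indent=2))
```

Running the script requires the server from the previous step to be listening on 127.0.0.1:8000.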

Benchmarks

leanswarm bench runs the same benchmark cases in both lean and naive activation modes and returns a comparison payload with:

top-level deltas: cost_ratio_naive_to_lean, quality_delta_lean_vs_naive, runtime_ratio_naive_to_lean
per-mode outputs under modes.lean and modes.naive (quality proxy, runtime, cache stats, token and estimated cost totals)
plot_points: per-case points with mode, score, cost_usd, runtime_seconds, token_total, and related fields for quality-vs-cost plotting

This lets you compare lean efficiency against naive full activation and plot quality-vs-cost points directly from benchmark output without extra transforms. The shipped cases are still lightweight proxy benchmarks rather than a full public benchmark pack.
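
As a rough sketch of consuming that output, the snippet below groups plot_points by mode and derives a naive-to-lean cost ratio. It relies only on the mode, score, and cost_usd fields named above; the helper names are hypothetical, and the ratio computed here is a simple total-cost quotient that may not match how the harness defines cost_ratio_naive_to_lean.

```python
from collections import defaultdict


def summarize(plot_points):
    """Aggregate per-case points into mean score and total cost per mode."""
    totals = defaultdict(lambda: {"cases": 0, "score": 0.0, "cost_usd": 0.0})
    for point in plot_points:
        bucket = totals[point["mode"]]
        bucket["cases"] += 1
        bucket["score"] += point["score"]
        bucket["cost_usd"] += point["cost_usd"]
    return {
        mode: {
            "mean_score": bucket["score"] / bucket["cases"],
            "total_cost_usd": bucket["cost_usd"],
        }
        for mode, bucket in totals.items()
    }


def cost_ratio(summary):
    """Return naive total cost divided by lean total cost, or None if lean cost is zero."""
    lean_cost = summary["lean"]["total_cost_usd"]
    if lean_cost == 0:
        return None  # for example, a run where no cost was recorded
    return summary["naive"]["total_cost_usd"] / lean_cost
```

To use it, load the JSON payload produced by the benchmark run, pass its plot_points list to summarize, and feed the result to cost_ratio or to a plotting library of your choice.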

Limitations

Dry-run routing is still heuristic even though it is seed-sensitive and grouped by task type.
The benchmark harness is still a compact proxy suite, not a full public-opinion evaluation set.
The web client is intentionally minimal, focused on inspecting and exploring simulation JSON.

Contributing

Contributions are welcome. Please open a focused PR with a clear summary of the behavior change and the validation you ran locally.
