ProteinOS

An autonomous protein engineering platform. Give it a target protein and an objective, and an LLM agent runs closed‑loop Design → Build → Test → Learn (DBTL) campaigns end‑to‑end — designing variants, simulating assays, fitting fitness models, and proposing the next batch — all visible live in a Cursor‑style dashboard.

The problem

Protein engineering campaigns (boost expression, raise Tm, tune binding, increase solubility) follow the same loop:

1. Design mutants from biology + prior data.
2. Build them in DNA/protein form.
3. Test them in assays.
4. Learn from results, pick the next batch.

Today this loop is fragmented across spreadsheets, Benchling, BLAST tabs, AlphaFold runs, kinetics scripts, and a Slack thread — and it stalls between iterations because a human has to glue everything together.

ProteinOS collapses that loop into one autonomous system:

- A Claude tool‑use agent drives the loop: it calls UniProt/BLAST/PDB/InterPro/ESMFold, proposes rational/directed/ML‑guided variant sets, then learns from per‑round assay outcomes.
- A simulation engine scores each variant (Tm shift, kinetics, expression yield, solubility, binding, structural stability) so iteration time is seconds, not weeks.
- A Gaussian process (GP) or linear fitness model is refit every iteration and used to prioritize the next batch (the exploration vs exploitation balance is tunable).
- A live SSE feed streams every agent thought, tool call, phase transition, and assay datapoint to the UI so a scientist can watch, and intervene in, the run in real time.
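
One common way to express a tunable exploration vs exploitation balance is an upper-confidence-bound (UCB) score. The following is a minimal sketch under that assumption, not ProteinOS's actual selection code: the `Prediction` shape, the `kappa` weight, the helper names, and the variant labels are all hypothetical, and it assumes the fitness model returns a predicted mean and a standard deviation per variant.

```typescript
// Hypothetical shape: one fitness-model prediction per candidate variant.
interface Prediction {
  variantId: string;
  mean: number; // predicted fitness
  std: number;  // model uncertainty (standard deviation)
}

// Upper confidence bound: kappa = 0 is pure exploitation;
// a larger kappa favors uncertain (less explored) variants.
function ucbScore(p: Prediction, kappa: number): number {
  return p.mean + kappa * p.std;
}

// Return the ids of the top `batchSize` candidates by UCB score.
function rankByUcb(preds: Prediction[], kappa: number, batchSize: number): string[] {
  return preds
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => ucbScore(b, kappa) - ucbScore(a, kappa))
    .slice(0, batchSize)
    .map((p) => p.variantId);
}

// Illustrative values only.
const candidates: Prediction[] = [
  { variantId: "A45G", mean: 0.8, std: 0.05 },
  { variantId: "L112F", mean: 0.6, std: 0.4 },
  { variantId: "K7R", mean: 0.7, std: 0.05 },
];

const exploitBatch = rankByUcb(candidates, 0, 2); // ranks by predicted mean only
const exploreBatch = rankByUcb(candidates, 2, 2); // rewards high uncertainty
```

With `kappa = 0` the batch is the two highest predicted means; raising `kappa` pulls the uncertain `L112F` candidate to the front.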

Highlights

🧠 Real agent loop — Claude (claude‑sonnet‑4) with a 30‑step tool budget per iteration. Every agent_thought, tool_called, and tool_result is streamed and persisted.
🔬 Real protein APIs — UniProt, NCBI BLAST, RCSB PDB, ESMFold, InterPro; cached locally with abortable fetch + retry budget so the orchestrator can never hang.
🧪 Simulation‑first assays — thermal shift, enzyme kinetics, expression yield, binding affinity, structural stability, solubility. Replaceable with real cloud‑lab integrations later.
📈 Iterative learning — a fitness model is refit each round; the agent uses model uncertainty to choose between exploit and explore.
🖥 Premium UI — black/white Cursor‑style dashboard: campaigns, variants explorer, reports + analytics, settings (theme/density/health/danger zone), ⌘K command palette.
🛡 Resilient by construction — per‑request abort timeouts on every external call, total wall‑clock retry budget, automatic reaping of zombie "running" campaigns at boot, graceful fallback when a UniProt id 404s.
📦 Single command setup — npm install && npm run db:migrate && npm run dev.

Screenshots

- Dashboard with quick‑launch templates
- Live agent + campaign view
- Reports: KPIs, throughput, top variants
- Variants explorer (cross‑campaign filters)
- Settings: appearance, backend, danger zone
- ⌘K command palette
- Markdown report viewer (Copy / Download .md)

Architecture

                ┌─────────────────────────────────────────┐
                │ React + Vite + Zustand + React Query    │
   Scientist ─► │   Dashboard · Reports · Variants ·      │
                │   Campaign workspace · Live Agent Feed  │
                └────────────────┬────────────────────────┘
                                 │ SSE + REST + ⌘K
                ┌────────────────▼────────────────────────┐
                │ Fastify API (TypeScript, Zod, x-api-key)│
                │  /campaigns · /variants · /events · ...  │
                └────────────────┬────────────────────────┘
                                 │
                ┌────────────────▼────────────────────────┐
                │ BullMQ (Redis) or in‑memory queue        │
                └────────────────┬────────────────────────┘
                                 │
                ┌────────────────▼────────────────────────┐
                │ CampaignOrchestrator (DBTL loop)         │
                │ ├── Claude agent (callAgent + tools)     │
                │ ├── External APIs                        │
                │ │     UniProt · BLAST · PDB · ESMFold    │
                │ ├── Simulation engine                    │
                │ │     Tm · kinetics · solubility · etc.  │
                │ └── Fitness model (GP / linear)          │
                └────────────────┬────────────────────────┘
                                 │
                          ┌──────▼──────┐
                          │ SQLite      │  campaigns · variants · iterations
                          │             │  events · assays · proteins · ...
                          └─────────────┘

Monorepo layout

proteinOS/
├── packages/
│   ├── shared/      # TypeScript types + Zod schemas (agent events, campaigns, ...)
│   ├── backend/     # Fastify API, agent loop, orchestrator, simulation, queue
│   └── frontend/    # React + Vite dashboard
├── scripts/         # setup, db migrate, seed, screenshot capture
├── tests/           # unit + integration + e2e suites
├── docs/screenshots/
├── .env.example
├── docker-compose.yml  # optional Redis
└── turbo.json

Quickstart

1. Prerequisites

- Node.js 20+ and npm 10+
- An Anthropic API key
- Optional: Redis (e.g. docker compose up redis -d). If unavailable, an in‑memory queue is used automatically.

2. Install & configure

git clone https://github.com/arnavbathla/protein-engineering.git
cd protein-engineering
cp .env.example packages/backend/.env
# then edit packages/backend/.env and set:
#   ANTHROPIC_API_KEY=sk-ant-...
npm install
npm run db:migrate

The .env file is git‑ignored. Never commit a real key.

3. Run

npm run dev

This launches three workspaces in parallel:

- @proteinos/shared (TypeScript watch build)
- @proteinos/backend on http://localhost:3001
- @proteinos/frontend on http://localhost:5173

Open http://localhost:5173 and click a template (or "New Campaign") to kick off a run.

4. Optional: seed example data

npm run db:seed

Launching a campaign

The fastest path:

1. Open the dashboard.
2. Click a template (e.g. Improve thermostability).
3. The wizard pre‑fills to the Review step. Change the UniProt id if you want, then click Launch.
4. You'll land on the campaign workspace with:
   - A run‑status strip (iteration, phase chip, elapsed timer, live dot).
   - A PhaseProgress bar advancing through Design → Build → Test → Learn.
   - A right‑rail Reasoning Trail showing live agent_thought, tool_called, and tool_result events streamed over SSE.

Try a deliberately broken run: edit the UniProt id to something bogus (e.g. ZZZZZZ_NOT_REAL) and launch. The orchestrator emits a graceful error event, swaps in a fallback sequence, and continues, which shows that a failed UniProt lookup does not wedge the campaign.
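
The fallback behavior can be pictured as a race between the lookup and a timer. This is an illustrative sketch only: `withTimeout`, `resolveSequence`, the `SequenceLookup` type, and the placeholder fallback string are hypothetical names, and the real orchestrator may structure this differently.

```typescript
// Hypothetical lookup signature: resolves to an amino-acid sequence or rejects.
type SequenceLookup = (uniprotId: string) => Promise<string>;

interface ResolvedSequence {
  sequence: string;
  usedFallback: boolean;
  reason?: string; // why the fallback was used, for the emitted error event
}

// Reject if `work` has not settled within `timeoutMs`.
// Note: this does not cancel the underlying work; it only stops waiting for it.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// On rejection or timeout, return the fallback instead of throwing,
// so the calling loop can keep going.
async function resolveSequence(
  uniprotId: string,
  lookup: SequenceLookup,
  fallbackSequence: string,
  timeoutMs: number,
): Promise<ResolvedSequence> {
  try {
    const sequence = await withTimeout(lookup(uniprotId), timeoutMs);
    return { sequence, usedFallback: false };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { sequence: fallbackSequence, usedFallback: true, reason };
  }
}

// Placeholder string for illustration, not the real built-in sequence.
const PLACEHOLDER_FALLBACK = "ACDEFGHIK";
```

A caller would check `usedFallback` and, when it is true, emit an error event carrying `reason` before continuing with the fallback sequence.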

REST API

All endpoints are prefixed with /api/v1. Every call (except /health and /campaigns/:id/events) requires x-api-key: <API_KEY> from your .env.

| Method | Path | What it does |
| --- | --- | --- |
| `GET` | `/health` | Liveness probe |
| `GET` | `/campaigns` | List campaigns |
| `POST` | `/campaigns` | Create a campaign (Zod‑validated body) |
| `GET` | `/campaigns/:id` | Campaign detail |
| `POST` | `/campaigns/:id/start\|pause\|resume` | Lifecycle controls |
| `DELETE` | `/campaigns/:id` | Delete (cascades) |
| `GET` | `/campaigns/:id/events` | SSE stream of agent events |
| `GET` | `/campaigns/:id/report` | Markdown campaign report |
| `GET` | `/variants` | Cross‑campaign variants with `campaignId`, `status`, `minFitness`, `q`, `sort`, `limit` filters |
| `GET` | `/campaigns/:id/variants[/top]` | Per‑campaign variants |
| `GET` | `/stats/overview` | Totals + 14‑day throughput buckets (used by Reports) |
| `GET` | `/proteins/`, `/iterations/`, `/protocols/`, `/results/` | Per‑resource reads |
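
As a rough illustration of the prefix and header rules above, the sketch below builds a request for the `/variants` endpoint. The `buildRequest` and `requiresApiKey` helpers, the `dev-key` value, and the filter values are hypothetical; only the `/api/v1` prefix, the `x-api-key` header, the two exempt paths, and the filter names come from this README.

```typescript
const API_PREFIX = "/api/v1";

// The two paths this README lists as exempt from the x-api-key header.
function requiresApiKey(path: string): boolean {
  if (path === "/health") return false;
  return !/^\/campaigns\/[^/]+\/events$/.test(path);
}

interface ApiRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the URL and headers for a GET request; query values are stringified.
function buildRequest(
  baseUrl: string,
  path: string,
  apiKey: string,
  query: Record<string, string | number> = {},
): ApiRequest {
  const params = new URLSearchParams();
  for (const key of Object.keys(query)) {
    params.set(key, String(query[key]));
  }
  const qs = params.toString();
  const headers: Record<string, string> = {};
  if (requiresApiKey(path)) headers["x-api-key"] = apiKey;
  return { url: `${baseUrl}${API_PREFIX}${path}${qs ? `?${qs}` : ""}`, headers };
}

// Example: ten variants with fitness of at least 0.5 (illustrative values).
const variantsRequest = buildRequest("http://localhost:3001", "/variants", "dev-key", {
  minFitness: 0.5,
  limit: 10,
});
// variantsRequest.url is "http://localhost:3001/api/v1/variants?minFitness=0.5&limit=10"
// With Node 18+ or a browser: fetch(variantsRequest.url, { headers: variantsRequest.headers })
```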

Agent event stream

The SSE stream emits typed events including:

campaign_started      iteration_started      iteration_complete
phase_started(init|design|build|test|learn)
phase_completed(...)  assay_progress(percent)
agent_thought         tool_called            tool_result
variants_designed     milestone_achieved
error                 heartbeat              campaign_complete

The dashboard groups these by iteration, renders thoughts as italic rows, and turns each tool call into an expandable card with input/output preview and a spinner that resolves on tool_result.
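
The grouping described above could look roughly like the following reducer. It is a sketch under assumptions: only the event `type` names come from the list above, while the payload fields (`iteration`, `callId`, `tool`, `text`, `output`), the tool names, and the `groupByIteration` helper are hypothetical.

```typescript
// Hypothetical payloads for a subset of the event types.
type AgentEvent =
  | { type: "iteration_started"; iteration: number }
  | { type: "agent_thought"; iteration: number; text: string }
  | { type: "tool_called"; iteration: number; callId: string; tool: string }
  | { type: "tool_result"; iteration: number; callId: string; output: string };

interface ToolCard {
  callId: string;
  tool: string;
  pending: boolean; // true while the spinner should show
  output?: string;
}

interface IterationGroup {
  iteration: number;
  thoughts: string[];
  tools: ToolCard[];
}

function groupByIteration(events: AgentEvent[]): IterationGroup[] {
  const groups = new Map<number, IterationGroup>();
  for (const ev of events) {
    let group = groups.get(ev.iteration);
    if (!group) {
      group = { iteration: ev.iteration, thoughts: [], tools: [] };
      groups.set(ev.iteration, group);
    }
    switch (ev.type) {
      case "agent_thought":
        group.thoughts.push(ev.text);
        break;
      case "tool_called":
        group.tools.push({ callId: ev.callId, tool: ev.tool, pending: true });
        break;
      case "tool_result": {
        // Resolve the matching card, if its tool_called event was seen.
        const card = group.tools.find((t) => t.callId === ev.callId);
        if (card) {
          card.pending = false;
          card.output = ev.output;
        }
        break;
      }
      case "iteration_started":
        break; // the group was already created above
    }
  }
  return Array.from(groups.values()).sort((a, b) => a.iteration - b.iteration);
}

const sampleEvents: AgentEvent[] = [
  { type: "iteration_started", iteration: 1 },
  { type: "agent_thought", iteration: 1, text: "Fetch the target sequence first." },
  { type: "tool_called", iteration: 1, callId: "c1", tool: "uniprot_lookup" },
  { type: "tool_result", iteration: 1, callId: "c1", output: "ok" },
  { type: "iteration_started", iteration: 2 },
  { type: "tool_called", iteration: 2, callId: "c2", tool: "blast_search" },
];
const grouped = groupByIteration(sampleEvents);
```

In this sample, iteration 1 ends with one resolved tool card and iteration 2 has one card still pending, which is the state a spinner would reflect.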

Development

npm run typecheck      # all workspaces
npm run build          # production builds (turbo)
npm test               # vitest unit + integration
npm run test:e2e       # playwright e2e

The frontend uses Tailwind plus custom CSS variables; the design system favors flat black/white surfaces with a hint of green for "running" indicators. Theme and density are persisted with Zustand's persist middleware.

Resilience notes

- fetchWithRetry in packages/backend/src/external/http.ts uses a per‑request AbortController (10s default) and a global wall‑clock retry budget (~30s).
- On boot, reapZombieCampaigns() marks any running/pending campaign older than 10 minutes as failed so the dashboard isn't polluted by interrupted runs.
- The orchestrator races UniProt fetches against a 15s timeout and falls back to a built‑in 607‑aa serum albumin sequence so the loop always makes progress.
- All progress fields (current_iteration, total_variants_tested, best_fitness_score) are written at iteration start, not just at end, so the UI ticks immediately.
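
To make the per-request timeout plus wall-clock budget idea concrete, here is a minimal sketch. It is not the code in packages/backend/src/external/http.ts: the function name `fetchWithRetrySketch`, the option names, the fixed backoff, and the choice to retry only network errors and non-4xx failures are all assumptions. The fetch implementation is injected so the sketch can be exercised with a stub.

```typescript
// Minimal shapes so a stub can stand in for the global fetch.
interface MinimalResponse {
  ok: boolean;
  status: number;
}
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<MinimalResponse>;

interface RetryOptions {
  requestTimeoutMs?: number; // per-attempt abort timeout
  totalBudgetMs?: number;    // wall-clock cap across all attempts
  backoffMs?: number;        // fixed pause between attempts (assumption)
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fetchWithRetrySketch(
  url: string,
  fetchImpl: FetchLike,
  { requestTimeoutMs = 10000, totalBudgetMs = 30000, backoffMs = 500 }: RetryOptions = {},
): Promise<MinimalResponse> {
  const deadline = Date.now() + totalBudgetMs;
  let lastError: unknown = new Error("retry budget exhausted before the first attempt");

  while (Date.now() < deadline) {
    const controller = new AbortController();
    // Never let a single attempt outlive the remaining budget.
    const attemptMs = Math.min(requestTimeoutMs, deadline - Date.now());
    const timer = setTimeout(() => controller.abort(), attemptMs);
    try {
      const res = await fetchImpl(url, { signal: controller.signal });
      // Assumption: 4xx responses (such as a 404) go back to the caller without retrying.
      if (res.ok || (res.status >= 400 && res.status < 500)) return res;
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // network failure or abort
    } finally {
      clearTimeout(timer);
    }
    if (Date.now() + backoffMs >= deadline) break; // no time left for another attempt
    await sleep(backoffMs);
  }
  throw lastError;
}

// Usage with the global fetch (Node 18+ or a browser):
//   fetchWithRetrySketch(url, (u, init) => fetch(u, init));
```

Because every attempt is bounded by both its own timeout and the shared deadline, the call settles within roughly `totalBudgetMs` as long as the fetch implementation honors the abort signal, as the built-in `fetch` does.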

Roadmap

- Real cloud‑lab integrations (Emerald, Strateos, Ginkgo) behind the existing assay interface.
- Multi‑objective optimization across competing assays.
- Better fitness models (Bayesian deep ensemble, transformer surrogates).
- Multi‑user auth + per‑user quotas.
- Lab notebook export (Benchling, IDT, Twist plate maps).

Forking + bringing your own key

1. Fork this repo and clone your fork.
2. Copy the example environment file: cp .env.example packages/backend/.env
3. Add your own ANTHROPIC_API_KEY to that file.
4. Install, migrate, and start: npm install && npm run db:migrate && npm run dev

The repo never commits a real Anthropic key — .env files are git‑ignored at every level. Treat your key like a password.

License

MIT. See LICENSE.
