Cell State Compiler

A deterministic, auditable compiler platform for cell-state engineering. Scientists define cell-state transitions, configure biological constraints, and run the compiler to receive ranked candidate intervention designs — each scored by real Evo 2 genome foundation model inference.

Research platform only. All outputs are model-derived research signals. Not biological validation. Not for clinical use. Not for pathogen, toxin, or gain-of-function research.

The Problem

Engineering cell states is one of the most important and difficult problems in modern medicine. The ability to reprogram a cell — say, converting an exhausted T cell back into a functional memory-like state, or pushing a fibroblast toward a cardiomyocyte — would unlock treatments for cancer, autoimmune disease, aging, and tissue regeneration.

But today the process is largely artisanal:

Researchers manually search literature to identify candidate transcription factors or CRISPR targets
Interventions are designed by intuition and prior knowledge
Screening is expensive: a single experiment can take weeks and cost tens of thousands of dollars
There is no principled way to rank candidates before committing to wet lab work
Failed candidates leave no systematic trail — knowledge is lost between labs and experiments

The core bottleneck is the translation gap between a target cell state (what we want) and a ranked set of concrete molecular interventions (what to actually try). That gap is currently filled by expert intuition — brilliant, but slow, unscalable, and hard to audit.

The Solution

Cell State Compiler treats cell-state engineering as a compilation problem.

Just as a software compiler translates high-level source code into optimized machine instructions, this platform takes a high-level biological specification — starting state, target state, constraints — and compiles it into ranked molecular intervention candidates, each scored against the genome itself using a foundation model.

The workflow:

[Starting Cell State]  →  [Compiler]  →  [Ranked Candidates]
[Target Cell State]                       [Evo 2 Scores]
[Constraint Set]                          [Assay Plan]
                                          [Audit Trail]

Each compile job runs a deterministic nine-step pipeline:

Text screening — safety gate blocks disallowed research domains before any computation
State encoding — starting cell state encoded as a 384-dimensional vector (marker profile + pathway scores + state labels)
Candidate generation — systematic enumeration across six intervention modalities (TF payload, CRISPRa, CRISPRi, RNA payload, regulatory context, small molecule context)
Genome context build — each candidate is grounded to a real DNA context sequence
Evo 2 scoring — genome foundation model scores sequence plausibility and produces a dense embedding for each candidate
Trajectory prediction — deterministic model predicts the state transition path from start to target
Risk assessment — flags hard constraint violations, biosecurity concerns, and high-uncertainty candidates
Safety filtering — rejects any candidate that fails any gate; never silently degrades
Ranking — candidates sorted by weighted composite score; assay plan and full report generated

The result is a ranked, explainable, auditable list of intervention candidates grounded in genome-level sequence plausibility — not just literature association.
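
As a rough illustration of the candidate generation step, deterministic enumeration can be as simple as a Cartesian product over modalities and targets, taken in a fixed order. This is a minimal sketch, not the platform's code: the `Candidate` class, the `enumerate_candidates` function, the snake_case modality identifiers, and the use of gene symbols as objectives are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product

# The six modalities named in the pipeline description (identifiers are assumed).
MODALITIES = (
    "tf_payload",
    "crispra",
    "crispri",
    "rna_payload",
    "regulatory_context",
    "small_molecule_context",
)


@dataclass(frozen=True)
class Candidate:
    modality: str
    target_gene: str  # hypothetical field: the objective this candidate acts on


def enumerate_candidates(target_genes, allowed_modalities=MODALITIES):
    """Return every (modality, gene) pair in a stable, reproducible order."""
    allowed_set = set(allowed_modalities)
    # Keep the canonical modality order regardless of how the caller listed them.
    allowed = [m for m in MODALITIES if m in allowed_set]
    # Sorting and de-duplicating genes makes output independent of input order.
    genes = sorted(set(target_genes))
    return [Candidate(m, g) for m, g in product(allowed, genes)]


candidates = enumerate_candidates(["TOX", "TCF7"], allowed_modalities=["crispra", "crispri"])
for c in candidates:
    print(c.modality, c.target_gene)
```

Because both axes are put into a canonical order before the product is taken, the same request yields the same candidate list on every run, which is the property an auditable pipeline needs.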

Why Foundation Models Change Everything

Classical computational biology approaches to cell state analysis rely on curated gene regulatory networks, transcription factor binding databases, and pathway enrichment scores. These methods are powerful but limited: they can only reason about what has already been measured and annotated.

Genome foundation models — large neural networks trained on billions of base pairs of DNA — represent a qualitative shift. By learning the statistical structure of genomic sequences at scale, they develop internal representations that capture:

Sequence plausibility — how likely a given DNA sequence is under the distribution of real genomic sequences
Functional context — which sequence features associate with gene expression, chromatin accessibility, and regulatory activity
Variant sensitivity — how a single nucleotide change alters the model's assessment of a locus
Transferable embeddings — dense vector representations that can be used for downstream prediction tasks with relatively little labeled data

The analogy to large language models is direct. Just as GPT-scale models learn the statistical structure of language and generalize to new tasks, genome foundation models learn the statistical structure of DNA and generalize to regulatory genomics problems they were never explicitly trained on.

Evo 2

Evo 2 is a genome foundation model developed by the Arc Institute, trained on a large corpus of prokaryotic and eukaryotic sequences at single-nucleotide resolution. Its largest checkpoint has 40 billion parameters, making it one of the largest publicly available genome models at the time of its release; smaller 7B and 1B checkpoints are also supported by this platform.

Key capabilities used in this platform:

| Operation | What It Computes | How It's Used |
| --- | --- | --- |
| Sequence scoring | Mean log-likelihood of a DNA sequence under the model | Measures how "native" a candidate context sequence looks to the model. High plausibility suggests a sequence resembling real genomic DNA; low plausibility flags an unusual sequence that may not function as intended |
| Embedding | Dense vector representation of a sequence from an intermediate layer | Used to compute embedding feature scores and stored for future retrieval/comparison |
| Variant effect | Delta log-likelihood between a reference and alternate sequence | Quantifies the effect of a proposed edit relative to the reference context |

These scores enter the ranking formula as explicit weighted terms — Evo 2 is a first-class input to ranking, not an annotation added afterward.
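
To make the scoring and variant-effect rows concrete, the sketch below computes a mean log-likelihood and a delta log-likelihood from per-base log-probabilities. It is only an illustration under a loud assumption: `TOY_BASE_PROBS` is a fixed, context-free stand-in for the model, whereas Evo 2 conditions each base on its sequence context. The function names are hypothetical and are not the genome model service's API.

```python
import math

# Stand-in for a real model: fixed per-base probabilities (illustration only).
# A genome language model would produce a context-dependent probability instead.
TOY_BASE_PROBS = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}


def per_base_log_probs(sequence):
    """Log-probability of each base under the toy model."""
    return [math.log(TOY_BASE_PROBS[base]) for base in sequence.upper()]


def mean_log_likelihood(sequence):
    """Average per-base log-probability; higher means more plausible."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    log_probs = per_base_log_probs(sequence)
    return sum(log_probs) / len(log_probs)


def variant_delta(reference, alternate):
    """Delta log-likelihood: alternate minus reference."""
    return mean_log_likelihood(alternate) - mean_log_likelihood(reference)


ref = "ACGTACGT"
alt = "ACGTACAT"  # single G->A substitution at the seventh base
print(round(mean_log_likelihood(ref), 4))
print(round(variant_delta(ref, alt), 4))
```

A positive delta means the model finds the edited sequence more likely than the reference, and a negative delta means less likely. Averaging over length keeps scores comparable between context sequences of different sizes.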

Ranking weights

target_state_similarity      0.26
identity_preservation        0.18
safety                       0.22
manufacturability            0.10
evo2_sequence_plausibility   0.12   ← Evo 2 score
evo2_context_confidence      0.08   ← derived from Evo 2 uncertainty
evo2_embedding_feature_score 0.04   ← from Evo 2 embedding
uncertainty_penalty         -0.15   ← Evo 2 uncertainty penalizes rank

The goal: candidates that look plausible to the genome itself rank higher than candidates that are merely mechanistically appealing on paper.
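
The weights above can be read as a linear combination: the seven positive weights sum to 1.00, and the uncertainty term subtracts from the total. The sketch below shows one way such a composite could be computed. It assumes every component score and the uncertainty value are already normalized to the range 0 to 1 and that the combination is a plain weighted sum; neither detail is confirmed by the text, and `composite_score` is a hypothetical name.

```python
# Weights copied from the table above.
WEIGHTS = {
    "target_state_similarity": 0.26,
    "identity_preservation": 0.18,
    "safety": 0.22,
    "manufacturability": 0.10,
    "evo2_sequence_plausibility": 0.12,
    "evo2_context_confidence": 0.08,
    "evo2_embedding_feature_score": 0.04,
}
UNCERTAINTY_PENALTY = -0.15


def composite_score(components, uncertainty):
    """Weighted sum of component scores plus the (negative) uncertainty term."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        # Refuse to score rather than silently treating a missing term as zero.
        raise ValueError(f"missing components: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return weighted + UNCERTAINTY_PENALTY * uncertainty


example = {name: 0.5 for name in WEIGHTS}
print(round(composite_score(example, uncertainty=0.2), 4))
```

Raising on a missing component, instead of defaulting it to zero, mirrors the platform's stated preference for failing loudly over degrading silently.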

CPU fallback (development mode)

When no CUDA GPU is available, the genome model service automatically falls back to a CPU composition scorer: a 4-mer background frequency model built from human genome composition. This is genuine bioinformatics computation (k-mer background models of this kind are widely used in motif-analysis tools such as FIMO and HOMER), and it is clearly labeled provider: cpu_composition in all outputs. It is not a mock and it is not silent: every result tells you which scoring method was used.

On GPU hardware with Evo 2 installed, all scoring switches automatically to real neural network inference.
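
For readers unfamiliar with composition scoring, the following sketch shows the general idea of a 4-mer background model: estimate how often each 4-mer occurs in a reference, then score a sequence by the mean log frequency of its 4-mers. It is an illustration only. The toy reference string, the pseudocount smoothing, and the function names are assumptions; the actual `CpuCompositionScorer` and its human-genome background table are not shown in this README.

```python
import math
from collections import Counter
from itertools import product

K = 4
ALPHABET = "ACGT"


def iter_kmers(sequence, k=K):
    """Yield every k-mer made only of A/C/G/T (windows containing N are skipped)."""
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(base in ALPHABET for base in kmer):
            yield kmer


def build_background(reference, k=K, pseudocount=1.0):
    """Smoothed k-mer frequencies over all 4**k possible k-mers."""
    counts = Counter(iter_kmers(reference, k))
    total = sum(counts.values()) + pseudocount * len(ALPHABET) ** k
    return {
        "".join(p): (counts["".join(p)] + pseudocount) / total
        for p in product(ALPHABET, repeat=k)
    }


def composition_score(sequence, background, k=K):
    """Mean log background frequency of the sequence's k-mers."""
    kmers = list(iter_kmers(sequence, k))
    if not kmers:
        raise ValueError("sequence has no scorable k-mers")
    mean_log_freq = sum(math.log(background[kmer]) for kmer in kmers) / len(kmers)
    # Label the provider so the scoring method is always visible downstream.
    return {"provider": "cpu_composition", "score": mean_log_freq}


# Toy reference; the real service is described as using human genome composition.
background = build_background("ACGT" * 50)
familiar = composition_score("ACGTACGTACGT", background)
unusual = composition_score("GGGGGGGGGGGG", background)
print(familiar["score"] > unusual["score"])
```

The pseudocount keeps unseen 4-mers from producing a log of zero, so an unusual sequence receives a low score rather than an error.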

Technical Architecture

┌─────────────────────────────────────────────────────────────┐
│  Browser                                                    │
│  Next.js 14 App Router · TypeScript · Tailwind              │
│  TanStack Query · Recharts · Radix UI                       │
│  localhost:3000                                             │
└───────────────────────┬─────────────────────────────────────┘
                        │ REST / JSON
┌───────────────────────▼─────────────────────────────────────┐
│  FastAPI (Python 3.11)                      localhost:8000  │
│  JWT auth · SQLAlchemy 2 · Alembic                         │
│  Compiler pipeline · Audit logging                          │
└──────┬──────────────────────────┬────────────────────────────┘
       │ RQ job queue             │ httpx calls
       ▼                          ▼
┌──────────────┐    ┌─────────────────────────────────────────┐
│  RQ Worker   │    │  Genome Model Service       :8100       │
│  (Python)    │    │  FastAPI · Safety gateway               │
│              │    │  ┌─────────────────────────────────┐    │
│              │    │  │ Evo 2 (GPU)   OR  CPU scorer    │    │
│              │    │  │ arcinstitute/evo2_40b            │    │
│              │    │  │ arcinstitute/evo2_7b             │    │
│              │    │  │ arcinstitute/evo2_1b_base        │    │
│              │    │  └─────────────────────────────────┘    │
└──────────────┘    └─────────────────────────────────────────┘
       │
       ▼
┌──────────────┐    ┌──────────────┐
│  PostgreSQL  │    │  Redis       │
│  + pgvector  │    │  (RQ broker) │
│  :5432       │    │  :6379       │
└──────────────┘    └──────────────┘

Services

| Service | Technology | Purpose |
| --- | --- | --- |
| `web` | Next.js 14, TypeScript, Tailwind | Full UI: projects, states, compile, candidates, reports |
| `api` | FastAPI, SQLAlchemy 2, pgvector | REST API, auth, compiler pipeline orchestration |
| `worker` | Python, RQ | Async compile job execution |
| `genome-model-service` | FastAPI | Evo 2 inference: scoring, embedding, variant effect |
| `postgres` | PostgreSQL 16 + pgvector | All relational data + 384-dim vector columns |
| `redis` | Redis 7 | RQ job queue |

Database schema (key tables)

users · organizations · organization_members
projects
  cell_states (vector(384))
  target_states (vector(384))
  constraint_sets
  compile_jobs
    genome_assets
    candidate_payloads
      state_trajectories
      risk_assessments
    evo2_model_runs        ← one record per Evo 2 API call
    assay_plans
    reports
  experiments
  audit_logs
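
The `vector(384)` columns hold encoded cell states. The README does not describe the encoder, so the sketch below is purely hypothetical: it uses feature hashing to pack a marker profile, pathway scores, and state labels into a fixed 384-dimensional unit vector, then compares two states with cosine similarity. The function names, the hashing scheme, and the example values are all assumptions for illustration.

```python
import hashlib
import math

DIM = 384


def _slot(feature_name):
    """Map a feature name to a stable index in [0, DIM)."""
    # hashlib is used instead of hash() so the mapping is identical across runs.
    digest = hashlib.sha256(feature_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % DIM


def encode_state(markers, pathways, labels):
    """Hash named features into a fixed-width vector and L2-normalize it."""
    vec = [0.0] * DIM
    for name, value in markers.items():
        vec[_slot(f"marker:{name}")] += float(value)
    for name, value in pathways.items():
        vec[_slot(f"pathway:{name}")] += float(value)
    for label in labels:
        vec[_slot(f"label:{label}")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine_similarity(a, b):
    """Dot product; equals the cosine because inputs are unit length or zero."""
    return sum(x * y for x, y in zip(a, b))


# Illustrative values only, not measured data.
start = encode_state({"PD1": 0.9, "TOX": 0.8, "TCF7": 0.1}, {"exhaustion": 0.7}, ["exhausted"])
target = encode_state({"PD1": 0.2, "TOX": 0.1, "TCF7": 0.9}, {"exhaustion": 0.1}, ["memory_like"])
print(len(start), round(cosine_similarity(start, target), 3))
```

A fixed-width encoding like this is what lets states of any marker panel share one pgvector column and be compared by similarity search.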

Compiler pipeline (detail)

compile_cell_program(request, db):
    1. screen_text_fields()           # biosecurity gate
    2. check_evo2_health()            # hard fail if service down
    3. encode_cell_state() → vec384   # marker + pathway + label encoding
    4. generate_candidates()          # 5-6 modalities × target objectives
    5. for each candidate:
         build_genome_context()       # ground to real DNA sequence
         store GenomeAsset
         score_candidate_with_evo2()  # → ScoreSequenceResponse
         embed_candidate_with_evo2()  # → EmbedSequenceResponse
         store Evo2ModelRun records
         predict_trajectory()         # deterministic state path
         assess_risk()
         apply_safety_filter()
         compute_final_score()        # weighted formula
    6. rank_candidates()
    7. generate_assay_plan()
    8. generate_report()
    9. write_audit_logs()
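
The distinction between rejecting a candidate and merely ranking it lower is easy to show in code. The sketch below filters first and ranks second, so a rejected candidate can never reappear at the bottom of the list. It is an assumption-laden illustration: the `ScoredCandidate` fields, the status strings, and `filter_and_rank` are hypothetical names, and only the three rejection reasons come from the Safety Design section.

```python
from dataclasses import dataclass, field


@dataclass
class ScoredCandidate:
    candidate_id: str
    final_score: float
    evo2_safety_status: str = "passed"  # hypothetical values: "passed" or "blocked"
    hard_constraint_violations: list = field(default_factory=list)
    biosecurity_flagged: bool = False


def rejection_reasons(candidate):
    """Collect every gate the candidate fails; an empty list means it passes."""
    reasons = []
    if candidate.evo2_safety_status == "blocked":
        reasons.append("evo2_safety_blocked")
    if candidate.hard_constraint_violations:
        reasons.append("hard_constraint_violation")
    if candidate.biosecurity_flagged:
        reasons.append("biosecurity_flagged")
    return reasons


def filter_and_rank(candidates):
    """Drop any candidate failing any gate, then sort the rest by score."""
    accepted, rejected = [], {}
    for candidate in candidates:
        reasons = rejection_reasons(candidate)
        if reasons:
            rejected[candidate.candidate_id] = reasons  # kept for the audit trail
        else:
            accepted.append(candidate)
    # Ties are broken by ID so the ordering is deterministic.
    accepted.sort(key=lambda c: (-c.final_score, c.candidate_id))
    return accepted, rejected


pool = [
    ScoredCandidate("c1", 0.71),
    ScoredCandidate("c2", 0.85, evo2_safety_status="blocked"),
    ScoredCandidate("c3", 0.71),
    ScoredCandidate("c4", 0.64, hard_constraint_violations=["forbidden_mechanism"]),
    ScoredCandidate("c5", 0.90),
]
ranked, rejected = filter_and_rank(pool)
print([c.candidate_id for c in ranked], rejected)
```

Note that `c2` has the second-highest score in the pool and is still removed: a high score never overrides a failed gate.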

Deployment

Prerequisites

Docker Desktop 4.x or Docker Engine 24+ with Compose V2
16 GB RAM minimum (CPU-only mode)

GPU requirements for Evo 2:

| Model | VRAM | Notes |
| --- | --- | --- |
| `evo2_1b_base` | ~4 GB | CPU also works (slow, ~60 s per sequence) |
| `evo2_7b` | ~16 GB | Single A100 40 GB or H100 80 GB |
| `evo2_40b` | ~80 GB | Two H100 80 GB, use `docker-compose.gpu.yml` |

1. Clone and configure

git clone <repo>
cd cell-state-compiler
cp .env.example .env

Key .env settings:

# For CPU-only local development:
EVO2_MODEL_NAME=evo2_1b_base
EVO2_DEVICE=cpu

# For GPU with 40B model:
EVO2_MODEL_NAME=evo2_40b
EVO2_DEVICE=cuda:0
HUGGINGFACE_TOKEN=hf_...   # required if repo is gated

2. Install NVIDIA Container Toolkit (GPU only)

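# Ubuntu / Debian (uses NVIDIA's current apt repository; apt-key is deprecated)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker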

3. Start all services

CPU (development, any machine):

docker compose up --build

GPU with Evo 2 40B:

docker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build

The docker-compose.gpu.yml override:

Uses Dockerfile.gpu (CUDA 12.4 + flash-attn + evo2)
Sets EVO2_MODEL_NAME=evo2_40b
Allocates all available GPUs to the genome model service
Evo 2 weights download automatically from HuggingFace on first start (~80 GB for 40B)

4. Seed demo data

# Wait for the api container to be healthy, then:
docker compose exec api python /scripts/seed_demo_data.py

This creates:

Admin user: demo@cellcompiler.local / password123
Demo organization, project, cell state, target state, constraint set

5. Verify

# All services healthy
docker compose ps

# API
curl http://localhost:8000/health

# Evo 2 status
curl http://localhost:8100/v1/evo2/health | python3 -m json.tool

# Frontend (macOS; on Linux use xdg-open, or open the URL in a browser)
open http://localhost:3000

Using the Platform

Endpoints

| URL | Description |
| --- | --- |
| `http://localhost:3000` | Web application |
| `http://localhost:8000/docs` | FastAPI interactive docs |
| `http://localhost:8100/docs` | Genome model service docs |

End-to-end flow

Log in at http://localhost:3000 with demo credentials or create an account
Create a project — name, cell type, disease context
Define the starting state — marker profile (CD8, PD1, TOX, TCF7, ...), pathway scores, state labels
Define the target state — desired markers, functional objectives
Configure constraints — allowed modalities, forbidden mechanisms, risk thresholds
Run compile — click Compile Cell Program; job runs asynchronously in the worker
Review candidates — ranked table with Evo 2 scores, plausibility, uncertainty, trajectory
Inspect each candidate — Overview, Scores, Evo2 Analysis, Trajectory, Risk, Assay Plan, Audit tabs
Export report — full markdown + JSON compile report

Checking Evo 2 health

curl http://localhost:8100/v1/evo2/health | python3 -m json.tool

GPU with Evo 2 loaded:

{
  "healthy": true,
  "provider": "local",
  "model_name": "evo2_40b",
  "device": "cuda:0",
  "cuda_available": true,
  "model_loaded": true,
  "smoke_test_passed": true
}

CPU-only (4-mer composition scoring):

{
  "healthy": true,
  "provider": "cpu_composition",
  "model_name": "4mer_human_background",
  "device": "cpu",
  "cuda_available": false,
  "model_loaded": true,
  "details": {
    "evo2_available": false,
    "scoring_mode": "cpu_composition"
  }
}
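
When scripting against the stack, it can help to turn the health payload into a single scoring-mode label. The sketch below does that for the two response shapes shown above. It assumes only the fields that appear in those examples; `fetch_health` and `describe_scoring_mode` are hypothetical helper names, and the returned label strings are illustrative.

```python
import json
from urllib.request import urlopen

HEALTH_URL = "http://localhost:8100/v1/evo2/health"


def fetch_health(url=HEALTH_URL, timeout=5.0):
    """GET the health endpoint and parse the JSON body (needs a running service)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def describe_scoring_mode(health):
    """Summarize a health payload as a short scoring-mode label."""
    if not health.get("healthy"):
        return "unhealthy"
    if health.get("provider") == "cpu_composition":
        return "cpu_composition"
    if health.get("model_loaded") and health.get("cuda_available"):
        return f"evo2:{health.get('model_name')}"
    return "unknown"


# Payloads trimmed from the two example responses above.
gpu_payload = {"healthy": True, "provider": "local", "model_name": "evo2_40b",
               "cuda_available": True, "model_loaded": True}
cpu_payload = {"healthy": True, "provider": "cpu_composition",
               "model_name": "4mer_human_background", "cuda_available": False,
               "model_loaded": True}
print(describe_scoring_mode(gpu_payload))
print(describe_scoring_mode(cpu_payload))
```

Against a running deployment, `describe_scoring_mode(fetch_health())` gives the same label for the live service, which makes it straightforward to refuse to run an analysis when the stack has fallen back to CPU scoring.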

NVIDIA NIM (alternative to local Evo 2)

If you have access to NVIDIA's hosted Evo 2 NIM endpoint:

EVO2_PROVIDER=nvidia_nim
NVIDIA_NIM_API_KEY=your_key
NVIDIA_NIM_EVO2_URL=https://your-nim-endpoint

The NIM adapter calls the remote endpoint for all scoring and embedding operations. If an operation is unsupported by the endpoint it returns 501 — it never fabricates results.

Safety Design

Safety is structural, not advisory:

Biosecurity text gate: compile requests are rejected before computation if any text field contains pathogen, virus, toxin, virulence, immune evasion, gain of function (hyphenated or not), bioweapon, weapon, or replication competent terms
Sequence safety gateway: the genome model service screens every DNA sequence before it reaches Evo 2; blocked sequences are rejected, never silently passed
Candidate safety filter: candidates with a blocked Evo 2 safety status, hard constraint violations, or a biosecurity-flagged risk class are removed from results; they are not scored lower, they are rejected
Generation disabled by default: the sequence generation endpoint requires ENABLE_EVO2_GENERATION=true AND a purpose-specific safety gate pass before any generation occurs
No mock results: CI has a grep check that fails if MockEvo2 or mock_evo2 appears anywhere in the codebase; there is no path through the system that returns fabricated scores
Full audit log: every action (login, project creation, compile job start/complete/fail, candidate view) is written to audit_logs with user ID, timestamp, and entity reference
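
As an illustration of the biosecurity text gate, the sketch below normalizes each text field and checks it against the blocked terms listed above. It is a simplified stand-in, not the shipped gate: the function signature, the normalization rules, and the substring matching strategy are assumptions, and only the term list comes from this section.

```python
import re

# Terms taken from the biosecurity text gate description above.
BLOCKED_TERMS = (
    "pathogen", "virus", "toxin", "virulence", "immune evasion",
    "gain of function", "bioweapon", "weapon", "replication competent",
)


def _normalize(text):
    """Lowercase and collapse hyphens, underscores, and whitespace to single spaces."""
    return re.sub(r"[\s\-_]+", " ", text.lower()).strip()


def screen_text_fields(fields):
    """Return {field_name: [matched terms]} for every field that trips the gate."""
    hits = {}
    for name, value in fields.items():
        normalized = _normalize(value or "")
        matched = [term for term in BLOCKED_TERMS if term in normalized]
        if matched:
            hits[name] = matched
    return hits


request_fields = {
    "project_name": "T cell rejuvenation",
    "notes": "Gain-of-Function screen",
}
print(screen_text_fields(request_fields))
```

Normalizing hyphens means "gain of function" and "gain-of-function" are caught by one entry. Substring matching is deliberately conservative and will also block words such as "lentivirus"; a production gate may need an explicit policy for such cases.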

Disclaimer shown on every Evo 2 result:

Evo 2 scores are model-derived research signals, not biological validation. Experimental confirmation required before any research decision.

Development

Project structure

cell-state-compiler/
  apps/
    api/              FastAPI backend (Python 3.11)
      app/
        compiler/     Nine-step compile pipeline
        models/       SQLAlchemy ORM models
        api/routes/   REST endpoints
        services/     Evo2 client, audit service
        jobs/         RQ task definitions
        migrations/   Alembic migrations
    worker/           RQ worker process
    genome-model-service/
      app/
        adapters/     LocalEvo2Adapter, CpuCompositionScorer, NvidiaEvo2NimAdapter
        services/     Routing and safety application
        safety.py     SequenceSafetyGateway
        model_health.py Health check with fallback logic
    web/              Next.js 14 frontend
      app/            App Router pages
      components/     Shared components
      hooks/          TanStack Query hooks
      lib/            API client, auth utilities
  scripts/
    seed_demo_data.py
    reset_db.py
    check_evo2_runtime.py
  data/demo_sequences/ Safe synthetic DNA for smoke tests
  docker-compose.yml
  docker-compose.gpu.yml

Running tests

# Backend
cd apps/api
pip install -e ".[test]"
pytest app/tests/ -v

# Genome model service
cd apps/genome-model-service
pip install -e ".[test]"
pytest app/tests/ -v

# Frontend type check
cd apps/web
npm run build

CI mock check

if grep -rn "MockEvo2\|mock_evo2" --include="*.py" --include="*.ts" .; then
  echo "FAIL: mocks found"; exit 1   # non-zero exit so the CI job actually fails
fi
echo "PASS"

Local development without Docker

# Start infra only
docker compose up postgres redis genome-model-service

# API
cd apps/api
pip install -e .
alembic upgrade head
uvicorn app.main:app --reload --port 8000

# Worker (separate terminal)
cd apps/worker
python worker.py

# Frontend
cd apps/web
npm install
npm run dev

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `EVO2_PROVIDER` | `local` | `local` or `nvidia_nim` |
| `EVO2_MODEL_NAME` | `evo2_1b_base` | Evo 2 checkpoint: `evo2_1b_base`, `evo2_7b`, `evo2_40b` |
| `EVO2_DEVICE` | `cpu` | `cpu` or `cuda:0` |
| `EVO2_HEALTH_RUN_SMOKE_TEST` | `true` | Run inference smoke test on health check |
| `EVO2_MAX_CONTEXT_LENGTH` | `8192` | Max tokens per forward pass |
| `HUGGINGFACE_TOKEN` | (none) | HuggingFace token for downloading gated models |
| `ENABLE_EVO2_GENERATION` | `false` | Enable sequence generation endpoint |
| `SEQUENCE_SAFETY_MODE` | `restricted` | Safety screening strictness |
| `MAX_SEQUENCE_LENGTH` | `8192` | Max DNA sequence length accepted |
| `NVIDIA_NIM_API_KEY` | (none) | API key for NVIDIA NIM endpoint |
| `NVIDIA_NIM_EVO2_URL` | (none) | NVIDIA NIM Evo 2 base URL |
| `JWT_SECRET` | (none) | Secret for JWT signing (change in production) |
| `DATABASE_URL` | (none) | PostgreSQL connection string |
| `REDIS_URL` | (none) | Redis connection string |
| `GENOME_MODEL_SERVICE_URL` | `http://genome-model-service:8100` | Internal URL for API → genome service calls |

See .env.example for all variables with comments.
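
For orientation, the sketch below shows how a service might load a subset of these variables with the documented defaults and validate them at startup. It is not the project's settings module: the `GenomeServiceSettings` class and `load_settings` function are hypothetical, and the rule that the `nvidia_nim` provider requires both NIM variables is an assumption.

```python
import os
from dataclasses import dataclass

_TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class GenomeServiceSettings:
    evo2_provider: str
    evo2_model_name: str
    evo2_device: str
    run_smoke_test: bool
    max_context_length: int
    enable_generation: bool
    sequence_safety_mode: str
    max_sequence_length: int


def load_settings(env=None):
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    def flag(name, default):
        return env.get(name, default).strip().lower() in _TRUE_VALUES

    settings = GenomeServiceSettings(
        evo2_provider=env.get("EVO2_PROVIDER", "local"),
        evo2_model_name=env.get("EVO2_MODEL_NAME", "evo2_1b_base"),
        evo2_device=env.get("EVO2_DEVICE", "cpu"),
        run_smoke_test=flag("EVO2_HEALTH_RUN_SMOKE_TEST", "true"),
        max_context_length=int(env.get("EVO2_MAX_CONTEXT_LENGTH", "8192")),
        enable_generation=flag("ENABLE_EVO2_GENERATION", "false"),
        sequence_safety_mode=env.get("SEQUENCE_SAFETY_MODE", "restricted"),
        max_sequence_length=int(env.get("MAX_SEQUENCE_LENGTH", "8192")),
    )
    if settings.evo2_provider not in {"local", "nvidia_nim"}:
        raise ValueError(f"unknown EVO2_PROVIDER: {settings.evo2_provider!r}")
    # Assumption: the NIM provider cannot work without its key and URL.
    if settings.evo2_provider == "nvidia_nim":
        missing = [n for n in ("NVIDIA_NIM_API_KEY", "NVIDIA_NIM_EVO2_URL") if not env.get(n)]
        if missing:
            raise ValueError(f"nvidia_nim provider requires: {', '.join(missing)}")
    return settings


defaults = load_settings({})
print(defaults.evo2_model_name, defaults.evo2_device, defaults.enable_generation)
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the loader easy to test, and validating at startup surfaces a misconfigured provider before the first compile job runs.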

Troubleshooting

`No module named 'flash_attn_2_cuda'`

Evo 2 requires a CUDA GPU and flash-attn. Without GPU hardware, the genome model service automatically falls back to CPU composition scoring. To run neural network inference, use the GPU compose override:

docker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build

Model download is slow or fails

Evo 2 weights download from HuggingFace on first start. Weights are cached in the hf-model-cache Docker volume, so subsequent starts skip the download (loading the model into memory still takes time). Ensure you have enough disk space:

evo2_1b_base: ~2 GB
evo2_7b: ~14 GB
evo2_40b: ~80 GB

For gated repositories, set HUGGINGFACE_TOKEN in .env.

CUDA out of memory

Switch to a smaller model:

EVO2_MODEL_NAME=evo2_7b   # or evo2_1b_base

Or use device_map="auto" across multiple GPUs (enabled automatically when torch.cuda.device_count() > 1).

Compile job stays queued

The worker container must be running and connected to Redis. Check:

docker compose logs worker
docker compose exec redis redis-cli ping

`[object Object]` error in UI

Typically a 422 validation error from the API. Open browser DevTools → Network tab to see the raw error response.

Roadmap

Retrieval-augmented candidate generation using pgvector similarity search across historical experiments
Evo 2 variant effect scoring to rank specific nucleotide edits, not just genomic contexts
Prediction vs. observation comparison with quantitative error analysis once wet lab results are uploaded
Support for additional foundation models (Nucleotide Transformer, HyenaDNA, DNABERT-2) as alternative or ensemble backends
Multi-user organizations with role-based access control
Export candidates as structured protocols for lab automation (Opentrons, Hamilton)
