ouroboros-rs

Spec-driven evolutionary workflow engine: Socratic interview, seed crystallization, Double Diamond execution, 3-stage evaluation, and an evolutionary loop with convergence detection.

Inspired by ouroboros. Reimplemented from scratch in Rust.

Why This Exists

The original ouroboros (Python) treats specifications as an evolving ontology rather than static documents. It combines Socratic questioning, ambiguity scoring, and an evolutionary loop where each generation's evaluation drives ontology mutations until convergence.

ouroboros-rs reimplements the core algorithmic pipeline in Rust, providing:

10-100x faster specification processing and evaluation cycles
Single binary distribution with no Python runtime dependency
Type-safe events: 18 typed event variants replace Python's dict[str, Any]
Null safety: absent values are modeled as Option<T> and checked at compile time, so there are no NoneType crashes (original issue #275)
Efficient persistence: SQLite event store without async overhead
Native Result<T, E>: zero-cost error handling that maps directly from the original's Result monad

Architecture

User Input
    |
[Interview Engine]  -- Multi-perspective Socratic questioning
    |                  (Researcher, Simplifier, Architect, BreadthKeeper, SeedCloser)
    | Ambiguity gate: score <= 0.2
    v
[Seed Generator]    -- LLM extracts structured requirements
    |                  -> immutable Seed with ontology schema
    v
[Evolutionary Loop] -- Up to 30 generations
    |
    +-- [Wonder Engine]     Identify gaps, tensions, assumptions (gen 2+)
    +-- [Reflect Engine]    Propose ontology mutations (add/modify/remove)
    +-- [Seed Evolution]    Apply mutations -> new immutable Seed
    +-- [Double Diamond]    Discover -> Define -> Design -> Deliver
    |     +-- AC Decomposition  (2-5 children, max depth 5)
    |     +-- Topological Sort  (Kahn's algorithm for parallel levels)
    +-- [Evaluation Pipeline]
    |     +-- Stage 1: Mechanical  (lint, build, test, coverage >= 0.7)
    |     +-- Stage 2: Semantic    (LLM compliance, drift, gaming detection)
    |     +-- Stage 3: Consensus   (multi-model voting OR deliberative)
    +-- [Convergence Check]
          +-- Ontology similarity >= 0.95
          +-- Stagnation detection (3-gen window)
          +-- Oscillation detection (period-2 cycling)
          +-- Exit gates (score, AC pass, no regressions)
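
The Topological Sort step in the diagram groups work items into levels that can run in parallel. The sketch below is a minimal, standalone illustration of Kahn's algorithm producing such levels; it is not the crate's `TopoSort` implementation. The function name `parallel_levels` and the dependency-map input shape are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Groups nodes into parallel execution levels using Kahn's algorithm.
/// `deps` maps each node to the nodes it depends on (an assumed input shape).
/// Returns `None` when the graph contains a cycle.
pub fn parallel_levels<'a>(deps: &HashMap<&'a str, Vec<&'a str>>) -> Option<Vec<Vec<String>>> {
    // In-degree = number of dependencies that are not yet satisfied.
    let mut indegree: HashMap<&'a str, usize> = HashMap::new();
    // Reverse edges: for each node, the nodes that depend on it.
    let mut dependents: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for (&node, parents) in deps {
        *indegree.entry(node).or_insert(0) += parents.len();
        for &parent in parents {
            indegree.entry(parent).or_insert(0);
            dependents.entry(parent).or_default().push(node);
        }
    }

    // Level 0 holds every node without dependencies.
    let mut current: Vec<&'a str> = indegree
        .iter()
        .filter(|(_, degree)| **degree == 0)
        .map(|(node, _)| *node)
        .collect();
    let mut levels: Vec<Vec<String>> = Vec::new();
    let mut visited = 0;

    while !current.is_empty() {
        current.sort(); // deterministic order inside a level
        visited += current.len();
        let mut next = Vec::new();
        for &node in &current {
            if let Some(children) = dependents.get(node) {
                for &child in children {
                    let degree = indegree.get_mut(child).expect("child was registered above");
                    *degree -= 1;
                    if *degree == 0 {
                        next.push(child);
                    }
                }
            }
        }
        levels.push(current.iter().map(|n| n.to_string()).collect());
        current = next;
    }

    // Nodes left unvisited can only be part of a cycle.
    if visited == indegree.len() { Some(levels) } else { None }
}
```

Every node in a returned level depends only on nodes from earlier levels, so the items of one level could be dispatched concurrently.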

Quick Start

Installation

# From source
git clone https://github.com/JSLEEKR/ouroboros-rs.git
cd ouroboros-rs
cargo build --release

# Binary is at target/release/ouroboros-rs

CLI Usage

# Show default configuration
ouroboros-rs config

# Start an interview session
ouroboros-rs interview --prompt "Build a REST API for task management"

# Start brownfield interview (existing codebase)
ouroboros-rs interview --brownfield --threshold 0.15

# Run evolutionary loop on a seed
ouroboros-rs evolve --seed seed.json --max-generations 10

# Evaluate artifacts against a seed
ouroboros-rs evaluate --seed seed.json --artifacts output/

# Show version and info
ouroboros-rs info

Library Usage

use ouroboros_rs::{
    InterviewEngine, Seed, EvolutionaryLoop, EvalPipeline, OuroborosConfig,
    llm::{LlmAdapter, CompletionConfig, ProviderError},
    seed::{AcceptanceCriterion, OntologySchema, OntologyField},
};
use async_trait::async_trait;

// 1. Implement the LlmAdapter trait for your provider
struct MyLlmProvider;

#[async_trait]
impl LlmAdapter for MyLlmProvider {
    async fn complete(&self, config: CompletionConfig) -> Result<String, ProviderError> {
        // Your LLM API call here
        todo!()
    }
}

// 2. Run an interview
let mut engine = InterviewEngine::new(0.2);
let llm = MyLlmProvider;

let question = engine.generate_question(&llm, Some("Build a parser")).await?;
let ambiguity = engine.record_answer(&question, "A log file parser", &llm).await?;

if engine.is_ready() {
    let result = engine.result();
    // Generate seed from interview
}

// 3. Create a seed directly
let seed = Seed::new(
    "Parse log files efficiently",
    vec!["Handle files up to 1GB".into()],
    vec![
        AcceptanceCriterion::new("AC-1", "Parse syslog format", 1),
        AcceptanceCriterion::new("AC-2", "Handle malformed lines", 2),
    ],
    OntologySchema::new(vec![
        OntologyField::new("log_entry", "object", "A parsed log entry"),
        OntologyField::new("severity", "enum", "Log severity level"),
    ]),
    vec!["Performance > 100MB/s".into()],
);

// 4. Run the evolutionary loop
let config = OuroborosConfig::default();
let mut evloop = EvolutionaryLoop::new(config);
let final_seed = evloop.run(seed, &llm).await?;

Core Concepts

Immutable Seeds

Seeds are the crystallized specification — frozen after creation. Mutations create new Seeds with incremented generation numbers and parent references.

let seed1 = Seed::new("Original goal", ...);
let seed2 = seed1.evolve(
    Some("Evolved goal".into()),
    None,  // keep constraints
    None,  // keep criteria
    Some(new_ontology),
);
assert_eq!(seed2.metadata.generation, 2);
// Clone the id so seed1 stays usable after the comparison.
assert_eq!(seed2.metadata.parent_id, Some(seed1.metadata.id.clone()));
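
To make the copy-on-evolve idea concrete, here is a simplified, standalone sketch. It is not the crate's `Seed` type: the flat struct layout, the `u64` ids, and the explicit `new_id` parameter are assumptions chosen to keep the example free of dependencies.

```rust
/// A deliberately simplified seed (hypothetical layout for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct Seed {
    pub id: u64,
    pub parent_id: Option<u64>,
    pub generation: u32,
    pub goal: String,
    pub constraints: Vec<String>,
}

impl Seed {
    /// Creates a first-generation seed with no parent.
    pub fn new(id: u64, goal: &str, constraints: Vec<String>) -> Self {
        Seed {
            id,
            parent_id: None,
            generation: 1,
            goal: goal.to_string(),
            constraints,
        }
    }

    /// Borrows the parent immutably and returns a new seed.
    /// `None` means "inherit the parent's value".
    pub fn evolve(
        &self,
        new_id: u64,
        goal: Option<String>,
        constraints: Option<Vec<String>>,
    ) -> Self {
        Seed {
            id: new_id,
            parent_id: Some(self.id),
            generation: self.generation + 1,
            goal: goal.unwrap_or_else(|| self.goal.clone()),
            constraints: constraints.unwrap_or_else(|| self.constraints.clone()),
        }
    }
}
```

Because `evolve` takes `&self` and never `&mut self`, the parent seed cannot change as a side effect of creating a child.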

Ambiguity Scoring

The interview engine scores clarity across dimensions:

Greenfield: ambiguity = 1.0 - (goal*0.40 + constraints*0.30 + criteria*0.30)
Brownfield: ambiguity = 1.0 - (goal*0.35 + constraints*0.25 + criteria*0.25 + context*0.15)

Gate: ambiguity <= 0.2 to proceed
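
The two formulas above can be sketched as a small function. This is an illustration under stated assumptions rather than the crate's `AmbiguityScorer`: the `Clarity` struct, the use of `Option` to select brownfield mode, and the clamping to [0.0, 1.0] are choices made for this example.

```rust
/// Clarity scores in [0.0, 1.0] per dimension (hypothetical struct).
#[derive(Debug, Clone, Copy)]
pub struct Clarity {
    pub goal: f64,
    pub constraints: f64,
    pub criteria: f64,
    /// `Some(..)` selects the brownfield weights, `None` the greenfield ones.
    pub context: Option<f64>,
}

/// Ambiguity is one minus the weighted clarity, using the weights listed above.
pub fn ambiguity(c: Clarity) -> f64 {
    let clarity = match c.context {
        Some(context) => {
            c.goal * 0.35 + c.constraints * 0.25 + c.criteria * 0.25 + context * 0.15
        }
        None => c.goal * 0.40 + c.constraints * 0.30 + c.criteria * 0.30,
    };
    // Clamping is an assumption; it guards against inputs outside [0.0, 1.0].
    (1.0 - clarity).clamp(0.0, 1.0)
}

/// The interview may proceed once ambiguity is at or below the threshold.
pub fn passes_gate(ambiguity: f64, threshold: f64) -> bool {
    ambiguity <= threshold
}
```

For example, greenfield clarity scores of 0.9 (goal), 0.8 (constraints), and 0.8 (criteria) give a weighted clarity of 0.84 and therefore an ambiguity of about 0.16, which passes the 0.2 gate.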

OntologyDelta Similarity

Weighted field comparison for convergence detection:

similarity = name_present * 0.5 + type_match * 0.3 + exact_match * 0.2

where:
  name_present = fields in both / total fields
  type_match   = same-type fields / total fields
  exact_match  = identical fields / total fields
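
A minimal sketch of this weighted comparison follows. It is not the crate's `OntologyDelta` code. Two points are assumptions: "total fields" is read as the number of distinct field names across both schemas, and field names are taken to be unique within a schema. The `Field` struct is a hypothetical stand-in for `OntologyField`.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for an ontology field.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub field_type: String,
    pub description: String,
}

/// similarity = name_present * 0.5 + type_match * 0.3 + exact_match * 0.2
pub fn similarity(old: &[Field], new: &[Field]) -> f64 {
    let old_by_name: HashMap<&str, &Field> =
        old.iter().map(|f| (f.name.as_str(), f)).collect();
    // Assumption: "total fields" is the union of names from both schemas.
    let all_names: BTreeSet<&str> = old
        .iter()
        .chain(new.iter())
        .map(|f| f.name.as_str())
        .collect();
    if all_names.is_empty() {
        return 1.0; // two empty schemas are treated as identical
    }

    let (mut in_both, mut same_type, mut identical) = (0usize, 0usize, 0usize);
    for field in new {
        if let Some(prev) = old_by_name.get(field.name.as_str()) {
            in_both += 1;
            if prev.field_type == field.field_type {
                same_type += 1;
            }
            if *prev == field {
                identical += 1;
            }
        }
    }

    let total = all_names.len() as f64;
    (in_both as f64 / total) * 0.5
        + (same_type as f64 / total) * 0.3
        + (identical as f64 / total) * 0.2
}
```

As a worked case, take an old schema with fields `a: string` and `b: int` and a new schema with `a: string`, `b: float`, and `c: bool`. There are three distinct names, two appear in both, one keeps its type, and one is identical, so the score is (2/3)(0.5) + (1/3)(0.3) + (1/3)(0.2) = 0.5.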

Multi-Signal Convergence

The evolutionary loop stops when any signal fires:

Signal	Condition
Converged	Ontology similarity >= 0.95
Stagnated	Similarity unchanged for 3 consecutive generations
Oscillating	Score alternates A-B-A-B (period-2 cycling)
Gates Met	Eval score >= 0.7 AND all ACs pass
Max Generations	30 generations reached
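
The table can be read as a sequence of checks over the generation history. The sketch below is an illustration, not the crate's `ConvergenceChecker`: the `Generation` struct, the order in which signals are tested, the floating-point tolerance, and the four-sample window for oscillation are all assumptions.

```rust
#[derive(Debug, PartialEq)]
pub enum Signal {
    Converged,
    Stagnated,
    Oscillating,
    GatesMet,
    MaxGenerations,
}

/// Per-generation measurements (hypothetical struct for this sketch).
pub struct Generation {
    pub similarity: f64,
    pub eval_score: f64,
    pub all_acs_pass: bool,
}

/// Tolerance for treating two floating-point values as equal (assumption).
const EPS: f64 = 1e-9;

fn same(a: f64, b: f64) -> bool {
    (a - b).abs() < EPS
}

/// Returns the first signal that fires, or `None` if the loop should continue.
pub fn check(history: &[Generation], max_generations: usize) -> Option<Signal> {
    let last = history.last()?;
    if last.similarity >= 0.95 {
        return Some(Signal::Converged);
    }
    if last.eval_score >= 0.7 && last.all_acs_pass {
        return Some(Signal::GatesMet);
    }
    // Stagnation: similarity unchanged across the last three generations.
    if history.len() >= 3 {
        let window = &history[history.len() - 3..];
        if window.windows(2).all(|p| same(p[0].similarity, p[1].similarity)) {
            return Some(Signal::Stagnated);
        }
    }
    // Oscillation: the last four scores follow an A-B-A-B pattern with A != B.
    if history.len() >= 4 {
        let s: Vec<f64> = history[history.len() - 4..]
            .iter()
            .map(|g| g.eval_score)
            .collect();
        if same(s[0], s[2]) && same(s[1], s[3]) && !same(s[0], s[1]) {
            return Some(Signal::Oscillating);
        }
    }
    if history.len() >= max_generations {
        return Some(Signal::MaxGenerations);
    }
    None
}
```

Calling `check` after each generation with the accumulated history and a limit of 30 would stop the loop on whichever condition is met first under this ordering.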

3-Stage Evaluation Pipeline

Stage 1: Mechanical
  - Runs lint, build, test, coverage checks
  - Early termination: pipeline halts on failure
  - Coverage gate: >= 70%

Stage 2: Semantic (LLM)
  - Per-AC compliance assessment
  - Specification drift detection
  - Reward-hacking / gaming signal detection

Stage 3: Consensus
  - Voting mode: N models vote, threshold approval (default 2/3)
  - Deliberative mode: advocate + devil's advocate + judge
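
The stage ordering and the voting rule can be sketched as follows. This is a simplified illustration, not the crate's `EvalPipeline`: the `Mechanical` and `Verdict` types are hypothetical, Stage 2 is reduced to a single boolean, and treating a Stage 2 failure as a hard stop is an assumption (the text above only states early termination for Stage 1).

```rust
#[derive(Debug, PartialEq)]
pub enum Verdict {
    FailedMechanical,
    FailedSemantic,
    Rejected,
    Approved,
}

/// Stage 1 inputs (hypothetical struct for this sketch).
pub struct Mechanical {
    pub lint_ok: bool,
    pub build_ok: bool,
    pub tests_ok: bool,
    pub coverage: f64,
}

/// Voting mode: the approving share of votes must reach the threshold.
pub fn vote_passes(votes: &[bool], threshold: f64) -> bool {
    if votes.is_empty() {
        return false;
    }
    let approvals = votes.iter().filter(|v| **v).count() as f64;
    approvals / votes.len() as f64 >= threshold
}

/// Runs the three stages in order and stops at the first failure.
pub fn evaluate(m: &Mechanical, semantic_ok: bool, votes: &[bool]) -> Verdict {
    // Stage 1: any mechanical failure halts the pipeline early.
    if !(m.lint_ok && m.build_ok && m.tests_ok && m.coverage >= 0.7) {
        return Verdict::FailedMechanical;
    }
    // Stage 2: collapsed to one flag here; gating on it is an assumption.
    if !semantic_ok {
        return Verdict::FailedSemantic;
    }
    // Stage 3: voting mode with the 2/3 default threshold.
    if vote_passes(votes, 2.0 / 3.0) {
        Verdict::Approved
    } else {
        Verdict::Rejected
    }
}
```

One detail worth noting in this particular comparison: two approvals out of three give a share of about 0.6667, which meets an exact 2/3 threshold but falls just short of a threshold written as 0.67.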

Typed Events

All state changes are captured as typed events (not untyped dictionaries):

pub enum EventPayload {
    InterviewStarted { is_brownfield: bool },
    QuestionAsked { round: usize, perspective: String, question: String },
    SeedGenerated { seed_id: String, generation: u32, ... },
    GenerationCompleted { generation: u32, eval_score: f64, ... },
    ConvergenceDetected { generation: u32, reason: String, similarity: f64 },
    // ... 18 variants total
}

Events are persisted to SQLite for session resume and audit trail.
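
Session resume from an event log amounts to folding the stored events back into state. The sketch below shows that idea in memory only, without SQLite. It is not the crate's replay code: the enum is a trimmed subset of the variants listed above (with `GenerationCompleted` reduced to two fields), and `SessionState` and `replay` are hypothetical names.

```rust
/// Trimmed subset of the event variants, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum EventPayload {
    InterviewStarted { is_brownfield: bool },
    QuestionAsked { round: usize, perspective: String, question: String },
    GenerationCompleted { generation: u32, eval_score: f64 },
    ConvergenceDetected { generation: u32, reason: String, similarity: f64 },
}

/// State recovered by replaying events (hypothetical struct).
#[derive(Debug, Default, PartialEq)]
pub struct SessionState {
    pub is_brownfield: bool,
    pub questions_asked: usize,
    pub last_generation: u32,
    pub last_score: Option<f64>,
    pub converged: bool,
}

/// Rebuilds session state by applying events in the order they were stored.
pub fn replay(events: &[EventPayload]) -> SessionState {
    let mut state = SessionState::default();
    for event in events {
        // The match is exhaustive, so adding a variant forces this code to be updated.
        match event {
            EventPayload::InterviewStarted { is_brownfield } => {
                state.is_brownfield = *is_brownfield;
            }
            EventPayload::QuestionAsked { .. } => {
                state.questions_asked += 1;
            }
            EventPayload::GenerationCompleted { generation, eval_score } => {
                state.last_generation = *generation;
                state.last_score = Some(*eval_score);
            }
            EventPayload::ConvergenceDetected { generation, .. } => {
                state.last_generation = *generation;
                state.converged = true;
            }
        }
    }
    state
}
```

Replaying a prefix of the log yields the state at that point in the session, which is what makes an append-only event store usable as an audit trail as well.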

API Reference

Key Types

Type	Description
`Seed`	Immutable specification with goal, constraints, ACs, ontology
`OntologySchema`	Typed field definitions for the domain model
`OntologyDelta`	Difference between two ontology schemas
`InterviewEngine`	Multi-perspective Socratic interview state machine
`AmbiguityScorer`	Weighted ambiguity scoring (greenfield/brownfield)
`DoubleDiamond`	4-phase execution engine
`AcTree`	Hierarchical acceptance criteria tree
`TopoSort`	Kahn's algorithm for parallel execution levels
`EvalPipeline`	3-stage evaluation orchestrator
`EvolutionaryLoop`	Generation loop with convergence detection
`WonderEngine`	Socratic gap analysis
`ReflectEngine`	Ontology mutation proposer
`SqliteEventStore`	Typed event persistence

Key Traits

/// Implement this for your LLM provider
#[async_trait]
pub trait LlmAdapter: Send + Sync {
    async fn complete(&self, config: CompletionConfig) -> Result<String, ProviderError>;
}

Configuration

{
  "max_generations": 30,
  "ambiguity_threshold": 0.2,
  "convergence_threshold": 0.95,
  "stagnation_window": 3,
  "max_decomposition_depth": 5,
  "min_coverage": 0.7,
  "consensus_threshold": 0.67,
  "consensus_model_count": 3
}

How It Differs from the Original

Aspect	ouroboros (Python)	ouroboros-rs (Rust)
Performance	Python 3.12+ with asyncio	Native binary, zero-cost abstractions
Events	`dict[str, Any]` payloads	18 typed enum variants
Error handling	Custom Result monad	Native `Result<T, E>`
Immutability	Pydantic `frozen=True`	Rust ownership + no `mut`
State	Async SQLite with ORM	rusqlite (sync, no ORM overhead)
Distribution	pip install + Python runtime	Single binary
LLM interface	Coupled to Claude/Codex SDKs	Provider-agnostic trait
Dependencies	15+ (anthropic, textual, litellm...)	9 minimal deps
NoneType safety	Runtime crashes (issue #275)	Compile-time Option checks
Session resume	Fragile (issue #50)	Event replay from SQLite

Features Not Included

The following are intentionally excluded as non-core:

Claude Code / Codex runtime adapters — implement LlmAdapter instead
Product management workflows — secondary feature
TUI dashboard — UI concern, not core algorithm
MCP server — integration concern
Plugin system — distribution concern

Project Structure

src/
  lib.rs              -- Public API and module re-exports
  main.rs             -- CLI entry point (clap)
  config/mod.rs       -- Engine configuration
  llm/
    mod.rs            -- LLM module
    traits.rs         -- LlmAdapter trait, CompletionConfig, ProviderError
    mock.rs           -- MockLlmAdapter for testing
  seed/
    mod.rs            -- Seed module
    schema.rs         -- Seed, OntologySchema, OntologyField, AcceptanceCriterion
    generator.rs      -- SeedGenerator (from interview + evolution)
  interview/
    mod.rs            -- Interview module
    engine.rs         -- InterviewEngine state machine
    ambiguity.rs      -- AmbiguityScorer with weighted components
    perspectives.rs   -- 5 interview perspectives
  execution/
    mod.rs            -- Execution module
    double_diamond.rs -- DoubleDiamond 4-phase engine
    ac_tree.rs        -- AcTree hierarchical decomposition
    decomposition.rs  -- AcDecomposer (2-5 children, max depth 5)
    topo_sort.rs      -- TopoSort (Kahn's algorithm)
  evaluation/
    mod.rs            -- Evaluation module
    pipeline.rs       -- EvalPipeline orchestrator
    mechanical.rs     -- Stage 1: MechanicalEvaluator
    semantic.rs       -- Stage 2: SemanticEvaluator
    consensus.rs      -- Stage 3: ConsensusEvaluator
  evolution/
    mod.rs            -- Evolution module
    evloop.rs         -- EvolutionaryLoop (up to 30 generations)
    wonder.rs         -- WonderEngine (gap analysis)
    reflect.rs        -- ReflectEngine (ontology mutations)
    convergence.rs    -- ConvergenceChecker (multi-signal)
  lineage/
    mod.rs            -- Lineage module
    types.rs          -- OntologyLineage, GenerationRecord, OntologyDelta
    events.rs         -- OuroborosEvent, EventPayload (18 variants)
  persistence/
    mod.rs            -- Persistence module
    sqlite.rs         -- SqliteEventStore

Testing

cargo test           # Run all 199 tests
cargo test --lib     # Library tests only
cargo test -- seed   # Run seed module tests

Tests use MockLlmAdapter — no real LLM calls needed. All modules have comprehensive unit tests covering:

Seed immutability and serialization (11 tests)
Seed generation and JSON extraction (8 tests)
Interview state machine transitions (10 tests)
Ambiguity scoring with edge cases (12 tests)
Perspective selection by round (8 tests)
Evolutionary loop and convergence (3 tests)
Convergence detection signals (9 tests)
Wonder engine gap analysis (6 tests)
Reflect engine mutations (7 tests)
Double Diamond phases (10 tests)
AC tree operations (11 tests)
AC decomposition (6 tests)
Topological sort (9 tests)
Mechanical evaluation (9 tests)
Semantic evaluation (6 tests)
Consensus voting and deliberation (8 tests)
Evaluation pipeline (5 tests)
Event types and serialization (4 tests)
Event store persistence (11 tests)
LLM mock adapter (6 tests)
LLM traits (4 tests)
Configuration (3 tests)

License

MIT License - see LICENSE for details.
