🐍 Zouroboros

Self-learning AI development skills for Zo Computer.

Zouroboros is a self-enhancing AI development toolkit. It starts with specification-first development — Socratic interviews, immutable seed specs, and 3-stage evaluation. Then it closes the loop: the system diagnoses its own health, prescribes improvements, executes them autonomously, and verifies the results.

The snake eats its own tail.

Adapted from Q00/ouroboros. Native TypeScript/Bun, zero Python dependencies, designed as Zo Skills.

What's Included

Foundational Skills

Skill	Description
spec-first-interview	Socratic interview → ambiguity scoring → immutable seed YAML
three-stage-eval	Mechanical → Semantic → Consensus verification pipeline
unstuck-lateral	5 lateral-thinking personas to break through stagnation
autoloop	Autonomous single-metric file optimization loop (inspired by karpathy/autoresearch)

Self-Enhancement Skills (the closed loop)

Skill	Description
zouroboros-introspect	Self-diagnostic health scorecard across 7 system metrics
zouroboros-prescribe	Auto-generates improvement seeds from scorecard, with governor safety gate
zouroboros-evolve	Executes prescriptions, measures delta, reverts regressions

Personas

Persona	Purpose
Zouroboros	The self-enhancement engine — clinical, metric-driven, autonomous
Hacker	Break past constraints creatively
Researcher	Stop and investigate systematically
Simplifier	Cut to MVP ruthlessly
Architect	Fix structural problems
Contrarian	Question the problem itself

Install

Zo Computer (recommended)

git clone https://github.com/marlandoj/Zo-Ouroboros.git /tmp/zouroboros
bash /tmp/zouroboros/install.sh
rm -rf /tmp/zouroboros

One-liner

git clone https://github.com/marlandoj/Zo-Ouroboros.git /tmp/zouroboros && bash /tmp/zouroboros/install.sh && rm -rf /tmp/zouroboros

Environment Variables (optional)

Variable	Default	Description
`ZOUROBOROS_WORKSPACE`	`$HOME`	Root workspace path
`ZOUROBOROS_SKILLS_DIR`	`$HOME/Skills`	Where skills are installed
`ZOUROBOROS_IDENTITY_DIR`	`$HOME/IDENTITY`	Where persona files go
`ZOUROBOROS_SEEDS_DIR`	`$HOME/Seeds/zouroboros`	Where prescriptions are saved
`ZOUROBOROS_MEMORY_DB`	`$HOME/.zo/memory/shared-facts.db`	SQLite memory database
`ZOUROBOROS_MEMORY_SCRIPTS`	`$HOME/Skills/zo-memory-system/scripts`	Memory system CLI
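
As a rough illustration of how the variables in the table fall back to their defaults, the sketch below resolves each path from an environment map. The `resolvePaths` helper and the `ZouroborosPaths` type are hypothetical names used only for this example; the actual logic lives in `zouroboros.config.ts` and may differ.

```typescript
// Hypothetical helper: resolves the documented variables against their defaults.
type ZouroborosPaths = {
  workspace: string;
  skillsDir: string;
  identityDir: string;
  seedsDir: string;
  memoryDb: string;
  memoryScripts: string;
};

function resolvePaths(
  env: Record<string, string | undefined>,
  home: string,
): ZouroborosPaths {
  // Assumption: an unset or empty variable means "use the default".
  const pick = (name: string, fallback: string): string => {
    const value = env[name];
    return value !== undefined && value !== "" ? value : fallback;
  };
  return {
    workspace: pick("ZOUROBOROS_WORKSPACE", home),
    skillsDir: pick("ZOUROBOROS_SKILLS_DIR", `${home}/Skills`),
    identityDir: pick("ZOUROBOROS_IDENTITY_DIR", `${home}/IDENTITY`),
    seedsDir: pick("ZOUROBOROS_SEEDS_DIR", `${home}/Seeds/zouroboros`),
    memoryDb: pick("ZOUROBOROS_MEMORY_DB", `${home}/.zo/memory/shared-facts.db`),
    memoryScripts: pick(
      "ZOUROBOROS_MEMORY_SCRIPTS",
      `${home}/Skills/zo-memory-system/scripts`,
    ),
  };
}
```

Under Bun, a call such as `resolvePaths(process.env, process.env.HOME ?? "")` would produce the effective paths for the current shell.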

Prerequisites

Bun runtime (v1.0+)
SQLite3 CLI
zo-memory-system skill (for memory DB, graph, episodes, procedures)
Optional: Ollama with qwen2.5:1.5b + nomic-embed-text (for memory gate + embeddings)
Autoloop skill (bundled — for file-targeting metric optimization)

Quick Start

1. Score a request

bun Skills/spec-first-interview/scripts/interview.ts score \
  --request "Add rate limiting to the API"

2. Run the self-diagnostic

bun Skills/zouroboros-introspect/scripts/introspect.ts --verbose

Output:

╔════════════════════════════════════════════════════════╗
║  ZOUROBOROS INTROSPECTION SCORECARD                     ║
╠════════════════════════════════════════════════════════╣
║  ✅ Memory Recall          100.0% → score:100%  ║
║  ✅ Graph Connectivity      90.0% → score:100%  ║
║  ⚠️  Routing Accuracy        N/A   → score: 50%  ║
║  ⚠️  Eval Calibration        N/A   → score: 50%  ║
║  ⚠️  Procedure Freshness     N/A   → score: 50%  ║
║  ⚠️  Episode Velocity        N/A   → score: 50%  ║
╠════════════════════════════════════════════════════════╣
║  ⚠️  COMPOSITE HEALTH: 70/100                          ║
║     Weakest: Routing Accuracy                           ║
╚════════════════════════════════════════════════════════╝

3. Run the full self-enhancement pipeline

# Introspect → identify weakest metric
bun Skills/zouroboros-introspect/scripts/introspect.ts --json > /tmp/scorecard.json

# Prescribe → generate improvement seed
bun Skills/zouroboros-prescribe/scripts/prescribe.ts --scorecard /tmp/scorecard.json

# Evolve → execute the prescription
bun Skills/zouroboros-evolve/scripts/evolve.ts --prescription Seeds/zouroboros/rx-*.json

4. Schedule it (Zo Computer)

Create a scheduled agent that runs the pipeline daily. The Zouroboros persona handles the rest autonomously — diagnosing, prescribing, executing, and reporting via email.

The Self-Enhancement Loop

How It Works

Introspect — Measures 7 health metrics across memory, graph, routing, eval, procedures, episode velocity, and skill effectiveness. Outputs a composite score (0–100) and ranks improvement opportunities.
Prescribe — Maps the weakest metric to one of 14 playbooks. Generates a seed YAML (spec-first format) and optionally a program.md (autoloop format). A governor gate blocks high-risk prescriptions.
Evolve — Executes the prescription via autoloop (file-targeting) or script mode (procedural). Captures pre/post scorecards. Reverts on regression.
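
The three steps can be read as one measure, plan, execute, verify cycle. The sketch below shows that control flow with injected functions standing in for the real scripts. The `Steps` type and `runCycle` function are hypothetical, and the regression check assumes the ">2%" rule from the governor section means a drop of more than 2 points on a 0–100 score.

```typescript
// Metric name -> 0-100 score (assumed shape, not the real scorecard JSON).
type Scorecard = Record<string, number>;

// Hypothetical stand-ins for introspect.ts, prescribe.ts and evolve.ts.
type Steps = {
  introspect: () => Scorecard;
  prescribe: (card: Scorecard) => string; // returns a prescription id
  execute: (prescription: string) => void;
  revert: (prescription: string) => void;
};

function runCycle(steps: Steps, maxDrop = 2) {
  const pre = steps.introspect(); // pre-change scorecard
  const prescription = steps.prescribe(pre);
  steps.execute(prescription);
  const post = steps.introspect(); // post-change scorecard

  // A regression is any metric present in both scorecards that fell
  // by more than maxDrop points.
  const regressed = Object.keys(pre).filter(
    (metric) => metric in post && pre[metric] - post[metric] > maxDrop,
  );
  if (regressed.length > 0) steps.revert(prescription);

  return { prescription, pre, post, regressed, reverted: regressed.length > 0 };
}
```

Passing the steps in as functions keeps the sketch testable without touching the memory database or the file system.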

7 Health Metrics

Metric	Source	Target	Weight
Memory Recall	Continuation eval fixture pass rate	≥ 85%	0.22
Graph Connectivity	Knowledge graph orphan fact ratio	≥ 80% linked	0.14
Routing Accuracy	Swarm episode success rate	≥ 85%	0.18
Eval Calibration	Stage 3 override rate	≤ 15%	0.14
Procedure Freshness	Stale procedure ratio (14+ days)	≤ 30%	0.14
Episode Velocity	7-day success trend vs prior 7 days	Positive	0.08
Skill Effectiveness	Per-skill success rate from skill_executions	≥ 85%	0.10
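
One plausible reading of the Weight column is a weighted sum of per-metric scores. The sketch below uses the weights from the table and assumes each metric has already been normalized to a 0–100 score. The function names and the tie-breaking rule for the weakest metric are assumptions, not taken from `introspect.ts`.

```typescript
// Weights copied from the table above; they sum to 1.00.
const METRIC_WEIGHTS: Record<string, number> = {
  "Memory Recall": 0.22,
  "Graph Connectivity": 0.14,
  "Routing Accuracy": 0.18,
  "Eval Calibration": 0.14,
  "Procedure Freshness": 0.14,
  "Episode Velocity": 0.08,
  "Skill Effectiveness": 0.10,
};

// Assumption: the composite is the weighted sum of 0-100 metric scores.
function compositeHealth(scores: Record<string, number>): number {
  let total = 0;
  for (const [metric, weight] of Object.entries(METRIC_WEIGHTS)) {
    const score = scores[metric];
    if (score === undefined) throw new Error(`missing score for ${metric}`);
    total += weight * score;
  }
  return Math.round(total);
}

// Assumption: "weakest" means the lowest score, with ties going to the
// metric that carries the larger weight.
function weakestMetric(scores: Record<string, number>): string {
  let weakest = "";
  let lowest = Infinity;
  let heaviest = -Infinity;
  for (const [metric, weight] of Object.entries(METRIC_WEIGHTS)) {
    const score = scores[metric];
    if (score === undefined) continue;
    if (score < lowest || (score === lowest && weight > heaviest)) {
      weakest = metric;
      lowest = score;
      heaviest = weight;
    }
  }
  return weakest;
}
```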

14 Playbooks

ID	Playbook	Metric	Severity
A	Fixture Expansion	Memory Recall	WARNING
B	Graph-Boost Weight Tuning	Memory Recall	CRITICAL
C	Batch Wikilink Extraction	Graph Connectivity	WARNING
D	Entity Consolidation	Graph Connectivity	CRITICAL
E	Signal Weight Adjustment	Routing Accuracy	WARNING
F	Capability Keyword Expansion	Routing Accuracy	CRITICAL ⚠️
G	Drift Threshold Adjustment	Eval Calibration	WARNING
H	Semantic Fixture Addition	Eval Calibration	CRITICAL ⚠️
I	Batch Procedure Evolution	Procedure Freshness	WARNING
J	Procedure Regeneration	Procedure Freshness	CRITICAL
K	Failure Root-Cause Analysis	Episode Velocity	WARNING
L	Executor Health Check	Episode Velocity	CRITICAL ⚠️
M	Skill Error Pattern Fix	Skill Effectiveness	WARNING ⚠️
N	Tool Call Optimization	Skill Effectiveness	CRITICAL ⚠️

⚠️ = Requires human approval (governor blocks autonomous execution)
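
The table above is effectively a lookup keyed by metric and severity. A minimal sketch of that lookup follows; `findPlaybook` and the `Playbook` type are illustrative names, and the real `getPlaybook()` in `prescribe.ts` may be structured differently.

```typescript
type Severity = "WARNING" | "CRITICAL";

type Playbook = {
  id: string;
  name: string;
  metric: string;
  severity: Severity;
  needsApproval: boolean; // true where the table shows the approval marker
};

// Rows transcribed from the playbook table.
const PLAYBOOKS: Playbook[] = [
  { id: "A", name: "Fixture Expansion", metric: "Memory Recall", severity: "WARNING", needsApproval: false },
  { id: "B", name: "Graph-Boost Weight Tuning", metric: "Memory Recall", severity: "CRITICAL", needsApproval: false },
  { id: "C", name: "Batch Wikilink Extraction", metric: "Graph Connectivity", severity: "WARNING", needsApproval: false },
  { id: "D", name: "Entity Consolidation", metric: "Graph Connectivity", severity: "CRITICAL", needsApproval: false },
  { id: "E", name: "Signal Weight Adjustment", metric: "Routing Accuracy", severity: "WARNING", needsApproval: false },
  { id: "F", name: "Capability Keyword Expansion", metric: "Routing Accuracy", severity: "CRITICAL", needsApproval: true },
  { id: "G", name: "Drift Threshold Adjustment", metric: "Eval Calibration", severity: "WARNING", needsApproval: false },
  { id: "H", name: "Semantic Fixture Addition", metric: "Eval Calibration", severity: "CRITICAL", needsApproval: true },
  { id: "I", name: "Batch Procedure Evolution", metric: "Procedure Freshness", severity: "WARNING", needsApproval: false },
  { id: "J", name: "Procedure Regeneration", metric: "Procedure Freshness", severity: "CRITICAL", needsApproval: false },
  { id: "K", name: "Failure Root-Cause Analysis", metric: "Episode Velocity", severity: "WARNING", needsApproval: false },
  { id: "L", name: "Executor Health Check", metric: "Episode Velocity", severity: "CRITICAL", needsApproval: true },
  { id: "M", name: "Skill Error Pattern Fix", metric: "Skill Effectiveness", severity: "WARNING", needsApproval: true },
  { id: "N", name: "Tool Call Optimization", metric: "Skill Effectiveness", severity: "CRITICAL", needsApproval: true },
];

// Returns the playbook for a metric at a given severity, if one exists.
function findPlaybook(metric: string, severity: Severity): Playbook | undefined {
  return PLAYBOOKS.find((p) => p.metric === metric && p.severity === severity);
}
```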

Governor Safety Rules

The governor prevents runaway self-modification:

Approval gate — Playbooks marked ⚠️ require human approval before execution
Schema protection — Never touches database migrations or structure
Blast radius limit — Max 3 files modified per cycle
Weight bounds — Routing/scoring weights can only change ±10% per cycle
Regression detection — Any metric dropping >2% triggers automatic revert
Audit trail — Every cycle stored as a memory episode with full metadata
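
The first four rules can be checked before anything runs. The sketch below shows one way to express them as a single pre-execution check. `CycleProposal` and `governorViolations` are hypothetical names, and the weight rule assumes "±10%" is relative to the previous weight rather than an absolute change.

```typescript
// Hypothetical description of what a cycle intends to do.
type CycleProposal = {
  playbookNeedsApproval: boolean;
  humanApproved: boolean;
  touchesSchema: boolean; // database migrations or structure
  filesTouched: string[];
  weightChanges: { before: number; after: number }[];
};

// Returns the names of every rule the proposal would break (empty = allowed).
function governorViolations(proposal: CycleProposal): string[] {
  const violations: string[] = [];

  if (proposal.playbookNeedsApproval && !proposal.humanApproved) {
    violations.push("approval gate");
  }
  if (proposal.touchesSchema) {
    violations.push("schema protection");
  }
  if (proposal.filesTouched.length > 3) {
    violations.push("blast radius");
  }
  for (const change of proposal.weightChanges) {
    // Assumption: the bound is 10% of the previous weight. The small epsilon
    // absorbs floating-point noise at the exact boundary.
    const limit = Math.abs(change.before) * 0.1;
    if (Math.abs(change.after - change.before) > limit + 1e-9) {
      violations.push("weight bounds");
      break; // report the rule once, however many weights break it
    }
  }
  return violations;
}
```

Returning the full list of violations, rather than stopping at the first, gives the audit trail a complete reason for any blocked cycle.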

Foundational Skills

Spec-First Interview

Before building anything complex, the interview skill runs a Socratic dialogue to surface ambiguity. It only asks questions — it never promises to build. Once clarity reaches 80% across three dimensions (goal 40%, constraints 30%, success criteria 30%), it generates an immutable seed YAML.

In conversation:

You: "I need better logging in the API."

Zo: "What does 'better' mean here — more log volume, structured formats, or centralized collection? Are there specific endpoints that are hard to debug today?"

You: "Structured JSON logs, mainly for the payment endpoints. We're flying blind on failed transactions."

Zo: "What's the retention requirement? And should these logs include PII like customer IDs, or do we need to mask those?"

(5–8 questions later, clarity ≥ 0.80)

Zo: "Requirements are clear. Generating seed spec..." → outputs seed-structured-logging.yaml

Ambiguity scoring:

bun Skills/spec-first-interview/scripts/interview.ts score --request "Make the site faster"
# → Ambiguity: 0.98 — HIGH AMBIGUITY

bun Skills/spec-first-interview/scripts/interview.ts score \
  --request "Add Redis caching to /api/products with 5-min TTL, invalidate on product updates"
# → Ambiguity: 0.15 — READY
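
The 40/30/30 weighting and the 0.80 threshold described above can be sketched as a weighted sum. The sketch assumes each dimension is rated from 0 to 1 and that ambiguity is simply one minus clarity, which is consistent with the sample outputs but is not confirmed by this README. The names are illustrative.

```typescript
// Each dimension is assumed to be rated on a 0-1 scale.
type ClarityRatings = {
  goal: number;
  constraints: number;
  successCriteria: number;
};

// Weighted clarity: goal 40%, constraints 30%, success criteria 30%.
function clarityScore(r: ClarityRatings): number {
  return 0.4 * r.goal + 0.3 * r.constraints + 0.3 * r.successCriteria;
}

// Assumption: ambiguity is the complement of clarity.
function ambiguityScore(r: ClarityRatings): number {
  return 1 - clarityScore(r);
}

// The interview ends once clarity reaches 0.80.
function readyForSeed(r: ClarityRatings): boolean {
  return clarityScore(r) >= 0.8;
}
```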

Three-Stage Evaluation

After building something, the eval pipeline verifies it against the seed spec in three progressive stages. Each stage is a gate — if it fails, you stop and fix before moving on.

In conversation:

You: "I finished the structured logging. Can you verify it?"

Zo: (Stage 1 — Mechanical) "Running lint, tests, type checks... all passing, 91% coverage. Moving to semantic evaluation."

Zo: (Stage 2 — Semantic) "Checking acceptance criteria against the seed: ✓ JSON format with timestamp, level, context fields ✓ Payment endpoints emit structured logs on failure ✗ PII masking not implemented — customer_id appears in plaintext Score: 0.74 — triggering consensus stage."

Zo: (Stage 3 — Consensus, 3 perspectives) Proposer: "Core logging works. PII issue is a one-line fix." Devil's Advocate: "Shipping unmasked PII to log aggregators is a compliance violation. This isn't a minor gap." Synthesizer: "NEEDS WORK. Fix PII masking, then resubmit."

Stage	Cost	Checks
1. Mechanical	$0	Compile, lint, test, coverage
2. Semantic	Low	AC compliance, goal alignment, drift score
3. Consensus	Medium	3-perspective deliberation (if drift > 0.3 or score in 0.7–0.8)

bun Skills/three-stage-eval/scripts/evaluate.ts \
  --artifact ./my-project/ --seed ./seeds/seed-abc.yaml
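
The trigger condition for Stage 3 in the table can be written as a small predicate. This is a sketch only: `needsConsensus` is a hypothetical name, and treating the 0.7–0.8 band as inclusive at both ends is an assumption.

```typescript
// Stage 3 runs when drift exceeds 0.3 or the semantic score falls in the
// borderline band (assumed inclusive: 0.7 <= score <= 0.8).
function needsConsensus(score: number, drift: number): boolean {
  const borderline = score >= 0.7 && score <= 0.8;
  return drift > 0.3 || borderline;
}
```

With the conversation above, a semantic score of 0.74 falls inside the band, so the consensus stage runs even when drift is low.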

Unstuck Lateral

When you're stuck — same error after multiple attempts, going in circles, or the eval keeps returning NEEDS WORK — the unstuck skill auto-selects a lateral-thinking persona based on your problem description.
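
A naive version of the auto-selection can be built from the signal keywords listed in the table at the end of this section. The sketch below does plain case-insensitive substring matching and returns the first persona whose signal appears. `selectPersona` is a hypothetical name, and the real skill presumably does something richer, since some of the sample messages below contain none of these keywords.

```typescript
// Signals transcribed from the Signal/Persona table, in table order.
const PERSONA_SIGNALS: [string, string[]][] = [
  ["Hacker", ["error", "can't", "constraint"]],
  ["Researcher", ["don't understand", "why"]],
  ["Simplifier", ["too complex", "overwhelming"]],
  ["Architect", ["keeps breaking", "touching everything"]],
  ["Contrarian", ["wrong approach", "step back"]],
];

// Returns the first persona with a matching signal, or undefined if none match.
function selectPersona(problem: string): string | undefined {
  // Lowercase and normalize curly apostrophes so "can’t" matches "can't".
  const text = problem.toLowerCase().replace(/\u2019/g, "'");
  for (const [persona, signals] of PERSONA_SIGNALS) {
    if (signals.some((signal) => text.includes(signal))) return persona;
  }
  return undefined;
}
```

Because matching stops at the first hit, table order acts as the priority order: a message containing both "error" and "why" would go to Hacker.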

In conversation:

You: "I keep hitting a 403 on the webhook endpoint. I've tried 5 different auth approaches."

Zo (auto-selects Hacker): "You're assuming you need direct API access. What if you delegated auth to a service account that already has the permissions? Or proxied through an internal endpoint that bypasses the external auth layer?"

You: "I don't understand why this returns stale data. The cache should have invalidated."

Zo (auto-selects Researcher): "Let's stop guessing. Have we confirmed the invalidation event actually fires? Let me check the event bus logs for the last 24 hours and compare timestamps against the cache TTL."

You: "This feature is ballooning. I'm touching 12 files just to add one field."

Zo (auto-selects Simplifier): "Drop the persistence layer for MVP. Store it in memory only. That eliminates 8 of those 12 files. Ship that, get feedback, then decide if you need the rest."

You: "Every change to the request handler breaks something else."

Zo (auto-selects Architect): "You're hitting high coupling. The handler mixes routing, validation, and response formatting. Extract validation into a middleware layer. Simple changes stop cascading."

You: "Should we even be building this? Nobody asked for it."

Zo (auto-selects Contrarian): "Then don't build it. Run a 5-question user survey first. If fewer than 30% say they want this, you just saved two weeks."

Signal	Persona
"error", "can't", "constraint"	Hacker
"don't understand", "why"	Researcher
"too complex", "overwhelming"	Simplifier
"keeps breaking", "touching everything"	Architect
"wrong approach", "step back"	Contrarian

File Structure

Zouroboros/
├── README.md
├── LICENSE
├── install.sh                          # Installer script
├── zouroboros.config.ts                # Portable path configuration
├── skills/
│   ├── spec-first-interview/           # Socratic interview + seed generation
│   │   ├── SKILL.md
│   │   ├── scripts/interview.ts
│   │   └── references/
│   ├── three-stage-eval/               # 3-stage verification pipeline
│   │   ├── SKILL.md
│   │   ├── scripts/evaluate.ts
│   │   └── references/
│   ├── unstuck-lateral/                # 5 lateral-thinking personas
│   │   ├── SKILL.md
│   │   └── references/
│   ├── autoloop/                       # Autonomous metric optimization loop
│   │   ├── SKILL.md
│   │   ├── scripts/autoloop.ts
│   │   ├── assets/template.program.md
│   │   └── references/
│   ├── zouroboros-introspect/          # Self-diagnostic scorecard
│   │   ├── SKILL.md
│   │   ├── scripts/introspect.ts
│   │   ├── scripts/skill-tracker.ts   # Skill execution recorder
│   │   └── references/metric-thresholds.md
│   ├── zouroboros-prescribe/           # Self-prescription engine
│   │   ├── SKILL.md
│   │   ├── scripts/prescribe.ts
│   │   └── references/playbooks.md
│   └── zouroboros-evolve/              # Evolution executor
│       ├── SKILL.md
│       └── scripts/evolve.ts
└── personas/
    ├── zouroboros.md                   # Self-enhancement persona template
    ├── unstuck-hacker.md
    ├── unstuck-researcher.md
    ├── unstuck-simplifier.md
    ├── unstuck-architect.md
    └── unstuck-contrarian.md

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      USER (Approval)                         │
│   • Reviews high-risk prescriptions                          │
│   • Receives daily email scorecards                          │
│   • Can override governor, adjust thresholds                 │
└────────────────────────┬────────────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
  │  INTROSPECT │→│  PRESCRIBE  │→│   EVOLVE    │
  │  (measure)  │ │  (plan)     │ │  (execute)  │
  │             │ │             │ │             │
  │ 7 metrics   │ │ 14 playbooks│ │ Autoloop or │
  │ Composite   │ │ Governor    │ │ Script mode │
  │ score 0-100 │ │ Seed YAML   │ │ Pre/post    │
  │             │ │ Program.md  │ │ scorecard   │
  └──────┬──────┘ └─────────────┘ └──────┬──────┘
         │                                │
         │        ┌─────────────┐         │
         └───────→│   MEMORY    │←────────┘
                  │             │
                  │ Facts       │
                  │ Episodes    │
                  │ Procedures  │
                  │ Graph       │
                  └─────────────┘

Dependencies

Zouroboros builds on these Zo Computer subsystems:

System	Role	Required?
zo-memory-system	Facts, episodes, procedures, graph, embeddings	Yes
zo-swarm-orchestrator	Parallel task execution with 6-signal routing	For routing metrics
autoloop	Single-metric file optimization (bundled)	For file-targeting playbooks
Ollama	Local inference (memory gate, auto-capture, procedure evolution)	For memory gate

Extending

Adding a New Metric

Add a collector function in introspect.ts (follow the measureMemoryRecall pattern)
Add thresholds in references/metric-thresholds.md
Add playbooks in references/playbooks.md (WARNING + CRITICAL variants)
Register the playbook in prescribe.ts's getPlaybook() switch
Add an executor in evolve.ts if the playbook uses script mode
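
A new collector might have roughly the shape sketched below. The real `measureMemoryRecall` signature is not shown in this README, so `MetricResult` and `scoreMetric` are hypothetical. The sample scorecard in Quick Start shows N/A metrics scored at 50% and metrics at or above target scored at 100%; the proportional scaling below target is an assumption.

```typescript
// Hypothetical result shape for one metric on the scorecard.
type MetricResult = {
  name: string;
  value: number | null; // null when no data has been collected yet
  score: number; // 0-100
  status: "ok" | "warn";
};

// Scores a "higher is better" metric against its target.
function scoreMetric(name: string, value: number | null, target: number): MetricResult {
  if (value === null) {
    // No data: neutral score, flagged for attention.
    return { name, value, score: 50, status: "warn" };
  }
  if (value >= target) {
    return { name, value, score: 100, status: "ok" };
  }
  // Assumption: below target, the score scales with the fraction achieved.
  return { name, value, score: Math.round((value / target) * 100), status: "warn" };
}
```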

Adding a New Playbook

Define it in references/playbooks.md with target file, metric command, and constraints
Add it to the getPlaybook() registry in prescribe.ts
If script mode: add a case in evolve.ts's execution switch
If file mode: autoloop handles it automatically via program.md

Adjusting Thresholds

Edit references/metric-thresholds.md and the corresponding constants in introspect.ts. The system will auto-calibrate — if composite stays above 90 for 2+ weeks, tighten targets.
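
The tightening rule can be sketched as a check over recent composite scores. The sketch assumes one composite score per day with the most recent last, and reads "2+ weeks" as 14 consecutive days; `shouldTightenTargets` is a hypothetical name.

```typescript
// True when the composite stayed above the floor for the whole window.
function shouldTightenTargets(
  dailyComposite: number[],
  days = 14,
  floor = 90,
): boolean {
  if (dailyComposite.length < days) return false; // not enough history yet
  return dailyComposite.slice(-days).every((score) => score > floor);
}
```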

Credits

Adapted from Q00/ouroboros by @Q00 — a specification-first AI development system. Also inspired by potentialInc/claude-ooo and karpathy/autoresearch patterns.

Self-enhancement architecture designed and built on Zo Computer.

License

MIT — see LICENSE.
