
SpineFrame Specification

Authoritative technical specification for SpineFrame v0.1.0. Last updated: March 2026.

Overview

SpineFrame is a checkpointable, replayable, forkable DAG execution kernel for structured AI workflows. It is model-agnostic, framework-agnostic, and treats provenance as a first-class concern.

Architecture Diagram

                           ┌──────────────────────────┐
                           │       CLI / Web UI       │
                           │  (Click / FastAPI+React) │
                           └────────────┬─────────────┘
                                        │
                          ┌─────────────▼─────────────┐
                          │          Engine           │
                          │  run/resume/replay/fork   │
                          │  budget controls          │
                          │  dynamic replanning       │
                          │  graph expansion          │
                          └────┬───────┬─────────┬────┘
                               │       │         │
               ┌───────────────▼─┐  ┌──▼────┐  ┌─▼──────────────────┐
               │  Trust Layer    │  │Stages │  │  Assurance Layer   │
               │                 │  │       │  │                    │
               │ Artifact        │  │ 53    │  │ Tool Policy        │
               │  Isolation      │  │ types │  │ Domain Assertions  │
               │ Graph Hash      │  │       │  │ Signed Provenance  │
               │  Validation     │  │ 3     │  │ Identity & Actor   │
               │                 │  │ pipes │  │  Tracking          │
               └───────┬─────────┘  └──┬────┘  └─────────┬──────────┘
                       │               │                 │
                       └───────┬───────┘                 │
                               │                         │
               ┌───────────────▼─────────────────────────▼──┐
               │            Artifacts & Events              │
               │  Content-addressed JSON (SHA-256)          │
               │  Append-only JSONL event log               │
               │  Immutable checkpoints                     │
               │  Merkle hash provenance chain              │
               └───────────────┬────────────────────────────┘
                               │
               ┌───────────────▼────────────────────────────┐
               │           Provider Layer                   │
               │                                            │
               │  Model: Anthropic, OpenAI (lazy imports)   │
               │  Search: Tavily, Serper, SearXNG, fallback │
               │  Fetch:  HTTP (stdlib)                     │
               │  Extract: Basic HTML (stdlib)              │
               │  Tool:   MCP (stdio + SSE transport)       │
               └────────────────────────────────────────────┘

Data Flow (Research Pipeline Example)

User Query
    │
    ▼
clarify_query ──► plan_query ──► search_web ──► refine_search
                                  (expand)       │
                                  :plan          │ sufficient?
                                  :batch_000     │ no → inject search_r2
                                  :batch_001     │ yes → continue
                                  :collect       ▼
                               fetch_pages ──► extract_chunks ──► extract_claims
                                (expand,        (expand,          (expand,
                                concurrent)     concurrent)       concurrent)
                                    │               │                 │
                                    ▼               ▼                 ▼
                               verify_claims ──► map_claims ──► synthesize
                                                                    │
                                                                    ▼
                                                              Final Report
                                                            + Provenance Chain
                                                            + Signed Bundle

Assurance Flow

Stage Execution
    │
    ├──► Artifact Isolation Check (read/write)
    │       └── pass/warn/fail
    │
    ├──► Tool Policy Check (per tool call)
    │       └── allow/deny + event log
    │
    ├──► Domain Assertions (post-stage)
    │       └── pass/warn/fail/approval_required
    │
    ├──► Event Emission (actor + timestamp)
    │       └── append to events.jsonl
    │
    └──► Provenance Hash Update
            └── merkle chain extension

Core Concepts

Run

A single pipeline execution. Each run gets a unique ID (12-char hex) and a directory under runs_dir/ containing all state.

Run directory layout:

runs/{run_id}/
  run.json              # RunManifest (query, pipeline, status, timestamps)
  graph.json            # GraphDefinition (DAG of StageNodes)
  events.jsonl          # Append-only event log
  artifacts/            # Stage output files (JSON, JSONL, HTML, Markdown)
  stages/{stage_id}/    # Per-stage metadata (meta.json)
  checkpoints/          # Immutable snapshots
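As a concrete sketch of creating this layout (create_run_dir is a hypothetical helper for illustration, not the actual SpineFrame API):

```python
import json
import time
import uuid
from pathlib import Path

def create_run_dir(runs_dir: Path, query: str, pipeline: str) -> Path:
    """Create the on-disk layout for a new run (illustrative sketch)."""
    run_id = uuid.uuid4().hex[:12]               # 12-char hex run ID
    run_dir = runs_dir / run_id
    for sub in ("artifacts", "stages", "checkpoints"):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    manifest = {
        "run_id": run_id,
        "query": query,
        "pipeline": pipeline,
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "status": "created",
    }
    (run_dir / "run.json").write_text(json.dumps(manifest, indent=2))
    (run_dir / "events.jsonl").touch()           # append-only event log starts empty
    return run_dir
```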

Run statuses: created | running | completed | failed | forked | budget_exceeded | cancelled

Stage

A unit of work in the DAG. Each stage has:

  • stage_id -- unique identifier within the graph
  • stage_type -- maps to a registered Stage class via @register_stage("type")
  • depends_on -- list of stage_ids that must complete first
  • input_artifacts / output_artifacts -- declared artifact names
  • phase -- logical grouping for UI display (e.g. "Planning", "Search")

Stage statuses: pending | running | success | failure

Artifact

A file produced by a stage, stored in artifacts/. Every artifact has metadata:

{
  "schema_version": "1",
  "generated_at": "2026-03-07T...",
  "upstream_stage": "plan_query",
  "sha256": "a1b2c3..."
}

Content-addressed via SHA-256. Resume compares input hashes to skip unchanged stages.
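The content-addressing scheme can be sketched as follows; write_artifact and the .meta.json sidecar name are illustrative assumptions, not the documented API:

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def write_artifact(artifacts_dir: Path, name: str, payload: dict, upstream_stage: str) -> dict:
    """Write a JSON artifact plus metadata following the fields shown above."""
    body = json.dumps(payload, sort_keys=True).encode()
    meta = {
        "schema_version": "1",
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "upstream_stage": upstream_stage,
        "sha256": hashlib.sha256(body).hexdigest(),   # content address of the artifact
    }
    (artifacts_dir / name).write_bytes(body)
    (artifacts_dir / (name + ".meta.json")).write_text(json.dumps(meta))
    return meta
```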

Event

An entry in the append-only events.jsonl log:

{
  "timestamp": "2026-03-07T...",
  "event_type": "stage_completed",
  "stage_id": "plan_query",
  "data": {"status": "success", "cost_estimate": 0.015}
}

Event types: run_started, run_completed, run_failed, stage_started, stage_completed, stage_failed, stage_skipped, budget_exceeded, loop_sufficient, loop_max_reached, run_forked, graph_expanded, artifact_access_denied, tool_policy_decision, domain_policy_evaluated, provenance_signed
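A minimal sketch of event emission (emit_event is a hypothetical helper; field names follow the example above):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def emit_event(events_path: Path, event_type: str, stage_id: str = "", **data) -> dict:
    """Append one event to events.jsonl, one JSON object per line."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        "stage_id": stage_id,
        "data": data,
    }
    with events_path.open("a") as fh:   # append-only: history is never rewritten
        fh.write(json.dumps(event) + "\n")
    return event
```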

Data Models

StageNode

stage_id: str                    # unique within graph
stage_type: str                  # maps to @register_stage class
depends_on: list[str]            # upstream stage_ids
input_artifacts: list[str]       # declared input artifact names
output_artifacts: list[str]      # declared output artifact names
expandable: bool                 # can split into plan/batch/collect sub-stages
parent_stage: str                # for sub-stages: parent's stage_id
batch_index: int                 # for batch sub-stages: 0-based index (-1 = not a batch)
concurrent_batches: bool         # batch sub-stages can run in parallel
max_concurrent: int              # cap concurrent batches (0 = use engine's max_workers)
input_artifact_patterns: list[str]   # glob patterns (e.g. ["page_*.html"])
output_artifact_patterns: list[str]  # glob patterns
max_rounds: int                  # >0 enables dynamic replanning
phase: str                       # UI grouping (e.g. "Planning", "Search")

GraphDefinition

A list of StageNodes forming a DAG. Serialized to graph.json.

Validation rules:

  • No duplicate stage_ids
  • All depends_on references must exist in the graph
  • Topological sort must succeed (no cycles)
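These three rules can be checked with a standard topological sort (Kahn's algorithm). The sketch below operates on plain dicts rather than the real StageNode model:

```python
def validate_graph(nodes: list[dict]) -> list[str]:
    """Validate the rules above; return stage_ids in topological order.

    Each node is a dict with 'stage_id' and 'depends_on' (illustrative shape).
    """
    ids = [n["stage_id"] for n in nodes]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate stage_ids")
    id_set = set(ids)
    for n in nodes:
        for dep in n["depends_on"]:
            if dep not in id_set:
                raise ValueError(f"unknown dependency: {dep}")
    # Kahn's algorithm: repeatedly emit nodes whose dependencies are all emitted
    order, remaining = [], {n["stage_id"]: set(n["depends_on"]) for n in nodes}
    while remaining:
        ready = [sid for sid, deps in remaining.items() if not deps]
        if not ready:
            raise ValueError("cycle detected")
        for sid in ready:
            order.append(sid)
            del remaining[sid]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order
```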

RunManifest

run_id: str           # 12-char hex ID (truncated UUID)
query: str            # user's original query
pipeline: str         # pipeline name (e.g. "research")
created_at: str       # ISO 8601 timestamp
status: str           # run status
parent_run_id: str    # set on forked runs
fork_stage: str       # stage the fork branched from
instructions: str     # user instructions to guide LLM stages

Provenance Models

Source:       source_id, url, title, snippet, fetched_at, content_hash, search_score
Chunk:        chunk_id, source_id, text, start_offset, end_offset, content_hash
Claim:        claim_id, text, chunk_ids, confidence, stage_id, content_hash
EvidenceMap:  claim_id, supporting_chunks, source_ids, provenance_hash

Provenance hash chain:

  • chunk.content_hash = SHA-256(source_id + "|" + text)
  • claim.content_hash = SHA-256(text + sorted chunk_id:chunk_hash pairs)
  • evidence.provenance_hash = SHA-256(claim_hash + sorted chunk_hashes + sorted source_hashes)
  • root_hash = SHA-256(sorted provenance_hashes)

Validation: validate_provenance() recomputes all hashes and checks integrity. Exposed via spineframe verify CLI and /api/runs/{id}/provenance/verify endpoint.
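The chain can be reproduced with hashlib. The exact concatenation and encoding below are assumptions; only the structure follows the formulas above:

```python
import hashlib

def h(s: str) -> str:
    """SHA-256 of a UTF-8 string, hex-encoded."""
    return hashlib.sha256(s.encode()).hexdigest()

def chunk_hash(source_id: str, text: str) -> str:
    return h(source_id + "|" + text)

def claim_hash(text: str, chunks: dict[str, str]) -> str:
    """chunks maps chunk_id -> chunk content_hash; pairs sorted for determinism."""
    pairs = sorted(f"{cid}:{ch}" for cid, ch in chunks.items())
    return h(text + "".join(pairs))

def provenance_hash(claim_h: str, chunk_hashes: list[str], source_hashes: list[str]) -> str:
    return h(claim_h + "".join(sorted(chunk_hashes)) + "".join(sorted(source_hashes)))

def root_hash(provenance_hashes: list[str]) -> str:
    return h("".join(sorted(provenance_hashes)))
```

Because every level sorts its inputs, recomputing the chain over the same artifacts always yields the same root, while any tampered chunk changes every hash above it.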

BudgetConfig

max_cost_usd: float       # 0 = unlimited
max_stages: int            # 0 = unlimited
max_time_seconds: float    # 0 = unlimited
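A sketch of how such limits might be checked before each stage (budget_exceeded is illustrative; the engine's actual enforcement point is not specified here):

```python
import time
from dataclasses import dataclass

@dataclass
class BudgetConfig:
    max_cost_usd: float = 0.0       # 0 = unlimited
    max_stages: int = 0             # 0 = unlimited
    max_time_seconds: float = 0.0   # 0 = unlimited

def budget_exceeded(b: BudgetConfig, cost_so_far: float,
                    stages_done: int, started_at: float):
    """Return the name of the limit that tripped, or None."""
    if b.max_cost_usd and cost_so_far >= b.max_cost_usd:
        return "max_cost_usd"
    if b.max_stages and stages_done >= b.max_stages:
        return "max_stages"
    if b.max_time_seconds and time.monotonic() - started_at >= b.max_time_seconds:
        return "max_time_seconds"
    return None
```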

Engine Operations

run

Creates a new run directory, writes manifest and graph, executes stages in topological order. Expandable stages are expanded at execution time (plan_batches -> graph expansion -> batch execution -> collect).

resume

Loads an existing run. For each stage, computes SHA-256 of current input artifacts and compares to the hash recorded at last execution. Skips stages with unchanged inputs.
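The skip decision can be sketched as a fingerprint comparison (input_fingerprint and should_skip are hypothetical names):

```python
import hashlib
from pathlib import Path

def input_fingerprint(paths: list) -> str:
    """Combined SHA-256 over a stage's input artifacts, in a stable order."""
    digest = hashlib.sha256()
    for p in sorted(paths):
        digest.update(p.name.encode())
        digest.update(hashlib.sha256(p.read_bytes()).digest())
    return digest.hexdigest()

def should_skip(paths: list, recorded) -> bool:
    """Skip a stage iff its inputs hash to what was recorded at last execution."""
    return recorded is not None and input_fingerprint(paths) == recorded
```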

replay

Deletes all stage metadata and artifacts downstream of the specified stage, then re-executes from that point. Upstream artifacts are preserved.

fork

Copies the entire run directory to a new run ID. Clears downstream stage metadata and artifacts from from_stage. Optionally sets a new query and/or instructions. The forked run can then be resumed or replayed.

inspect

Returns the current state of a run: stages with status/cost/timing, checkpoints list, manifest metadata.

Expandable Stages

Stages with expandable=True split into sub-stages at execution time:

  1. The stage's plan_batches() method returns a BatchPlan
  2. The engine replaces the expandable node with: {id}:plan -> [{id}:{batch_id}, ...] -> {id}:collect
  3. Sub-stages have parent_stage set to the original stage_id
  4. Batch sub-stages can run concurrently if concurrent_batches=True
  5. Sub-stage IDs use : separator (e.g. fetch_pages:batch_000)
  6. Sub-stages inherit phase from their parent

After expansion, the graph is re-persisted. All engine mechanics (skip, resume, replay, fork) apply to individual sub-stages.
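Steps 1-6 can be sketched as a pure graph transformation (expand_node and the plain-dict node shape are illustrative, not the engine's real code):

```python
def expand_node(graph: list[dict], stage_id: str, batch_ids: list[str]) -> list[dict]:
    """Replace one expandable node with {id}:plan -> batches -> {id}:collect."""
    node = next(n for n in graph if n["stage_id"] == stage_id)
    phase = node.get("phase", "")                     # sub-stages inherit phase
    plan = {"stage_id": f"{stage_id}:plan", "depends_on": node["depends_on"],
            "parent_stage": stage_id, "phase": phase}
    batches = [{"stage_id": f"{stage_id}:{bid}", "depends_on": [plan["stage_id"]],
                "parent_stage": stage_id, "batch_index": i, "phase": phase}
               for i, bid in enumerate(batch_ids)]
    collect = {"stage_id": f"{stage_id}:collect",
               "depends_on": [b["stage_id"] for b in batches],
               "parent_stage": stage_id, "phase": phase}
    out = [n for n in graph if n["stage_id"] != stage_id]
    # rewire downstream consumers to depend on the collect sub-stage
    for n in out:
        n["depends_on"] = [collect["stage_id"] if d == stage_id else d
                           for d in n["depends_on"]]
    return out + [plan] + batches + [collect]
```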

Dynamic Replanning

Stages with max_rounds > 0 trigger adaptive replanning:

  1. After execution, the engine reads the stage's output artifact
  2. If sufficient: false in the output and round < max_rounds:
    • Injects new {base}_r{N}_search and {base}_r{N} stages into the DAG
    • Rewires downstream dependencies
    • Re-persists the graph
  3. Loop round IDs use _r{N} suffix (e.g. refine_search_r2, refine_search_r3)
  4. Injected stages inherit phase from the original stage
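The loop decision can be sketched as follows (next_round_stages is a hypothetical name; the DAG rewiring itself is omitted):

```python
def next_round_stages(base: str, round_n: int, output: dict, max_rounds: int) -> list[str]:
    """Return the stage IDs to inject for the next round, or [] to stop.

    Stops when the stage reports sufficient results or max_rounds is reached.
    """
    if output.get("sufficient", True) or round_n >= max_rounds:
        return []
    n = round_n + 1
    return [f"{base}_r{n}_search", f"{base}_r{n}"]
```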

Stage Isolation

Each stage declares its artifact access via input_artifacts, output_artifacts, and glob patterns. The engine enforces access control via _check_access() with fnmatch:

  • Permissive mode (default): warns on undeclared access
  • Strict mode: raises errors on undeclared access
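A sketch of the access check (check_access mirrors the described behavior of _check_access but is not the real implementation):

```python
from fnmatch import fnmatch

def check_access(name: str, declared: list, patterns: list, strict: bool = False) -> bool:
    """Allow an artifact access iff its name is declared or matches a glob pattern."""
    allowed = name in declared or any(fnmatch(name, p) for p in patterns)
    if not allowed:
        if strict:   # strict mode: undeclared access is an error
            raise PermissionError(f"undeclared artifact access: {name}")
        print(f"warning: undeclared artifact access: {name}")   # permissive mode
    return allowed
```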

Research Pipeline

The reference implementation is an 11-stage web research pipeline:

Phase         Stage            Type                   Expandable
Planning      clarify_query    clarify_query          No
Planning      plan_query       plan_query             No
Search        search_web       search_web             Yes
Search        refine_search    refine_search          No (max_rounds=3)
Search        search_web_r2    search_web_r2          Yes
Collection    fetch_pages      fetch_pages            Yes (concurrent)
Extraction    extract_chunks   extract_chunks         Yes (concurrent)
Extraction    extract_claims   extract_claims         Yes (concurrent, max_concurrent=4)
Verification  verify_claims    verify_claims          No
Verification  map_claims       map_claims_to_sources  No
Synthesis     synthesize       synthesize_report      No

Fast pipeline (research_fast): skips extract_claims, verify_claims, map_claims -- goes directly from extract_chunks to synthesize_direct. No provenance chain.

28 registered research stage types (11 main stages + plan/batch/collect variants for expandable stages + synthesize_direct). 53 total registered stage types across all pipelines.

OSINT Pipeline

A 10-stage domain reconnaissance pipeline with 13 stdlib-only source clients (zero API keys required):

Phase       Stage             Type                    Expandable
Recon       target_parse      osint_target_parse      No
Recon       dns_recon         osint_dns_recon         No
Recon       whois_rdap        osint_whois_rdap        No
Enrichment  subdomain_enum    osint_subdomain_enum    Yes
Enrichment  infrastructure    osint_infrastructure    Yes (concurrent)
Enrichment  web_archive       osint_web_archive       No
Enrichment  tech_fingerprint  osint_tech_fingerprint  No
Identity    identity_recon    osint_identity          No
Analysis    correlate         osint_correlate         No
Analysis    osint_report      osint_report            No

15 registered OSINT stage types (10 main + plan/batch/collect for 2 expandable stages - 1 shared collect). Source clients use only stdlib (urllib.request, subprocess, socket, json, ssl).

Compliance Pipeline

A 6-stage audit evidence pipeline:

Phase       Stage            Type                        Expandable
Collection  target_parse     compliance_target_parse     No
Collection  evidence_gather  compliance_evidence_gather  Yes
Analysis    normalize        compliance_normalize        No
Analysis    map_to_controls  compliance_map_controls     No
Review      gap_analysis     compliance_gap_analysis     No
Export      export_package   compliance_export_package   No

10 registered compliance stage types (6 main + plan/batch/collect for evidence_gather). Supports SOC2 and ISO 27001 control frameworks with keyword-based and LLM-enhanced mapping.

Provider Architecture

Providers are created via create_provider(type, config) with a registry and lazy imports.

Provider types:

  • model -- LLM provider (Anthropic, OpenAI). Must be configured explicitly.
  • search -- Search provider (Tavily, Serper, SearXNG, fallback). Must be configured explicitly.
  • fetch -- HTTP fetch. Defaults to "http" (stdlib).
  • extract -- HTML extraction. Defaults to "basic" (stdlib).
  • tool -- MCP tool provider (stdio + SSE transport). Optional.

Optional dependencies:

  • pip install spineframe[anthropic] -- Anthropic SDK
  • pip install spineframe[openai] -- OpenAI SDK
  • pip install spineframe[tavily] -- Tavily SDK
  • pip install spineframe[mcp] -- MCP tool support
  • pip install spineframe[pdf] -- PyMuPDF for PDF extraction
  • pip install spineframe[signing] -- Ed25519 provenance signing
  • pip install spineframe[web] -- FastAPI + Uvicorn for web UI
  • pip install spineframe[all] -- everything

CLI Commands

14 commands via Click:

Command       Description
run           Execute a full pipeline run
resume        Resume a run, skipping unchanged stages
replay        Re-execute from a specific stage downstream
fork          Clone a run, diverge from a stage with new query/instructions
show          Detailed run info (stages, costs, timing)
status        Visual tree view of run progress
inspect       Raw JSON dump of run state
ls            List all runs
export        Export as JSON, Markdown, or self-contained HTML
diff          Compare artifacts between two runs
verify        Validate provenance chain integrity + signatures
keygen        Generate Ed25519 signing keypair
suggest-fork  LLM-powered fork point suggestion
web           Launch the web UI

Web UI

FastAPI backend + React SPA frontend.

Backend routes:

  • GET /api/runs -- list runs
  • GET /api/runs/{id} -- inspect run (with live status from executor)
  • POST /api/runs -- create new run
  • POST /api/runs/{id}/resume -- resume run
  • POST /api/runs/{id}/replay -- replay from stage
  • POST /api/runs/{id}/fork -- fork run
  • DELETE /api/runs/{id} -- cancel run
  • POST /api/runs/{id}/approve-plan -- approve/modify research plan
  • GET /api/runs/{id}/plan -- get current research plan
  • GET /api/runs/{id}/diff/{other} -- compare runs
  • GET /api/pipelines -- list available pipelines
  • GET /api/runs/{id}/artifacts -- list artifacts
  • GET /api/runs/{id}/artifacts/{name} -- get artifact content
  • GET /api/runs/{id}/events -- get event log
  • POST /api/runs/{id}/suggest-fork -- LLM-powered fork point suggestion
  • POST /api/runs/{id}/approve-diff -- approve diff (actor-stamped)
  • GET /api/runs/{id}/provenance/verify -- verify provenance chain
  • POST /api/runs/{id}/provenance/sign -- sign provenance bundle (Ed25519)
  • POST /api/runs/{id}/policy-check -- evaluate domain policy assertions
  • GET /api/runs/{id}/provenance -- paginated provenance chain
  • GET /api/runs/{id}/artifact-diff/{other_id}/{name} -- unified diff for a single artifact
  • GET /api/runs/{id}/report -- HTML report
  • GET /api/runs/{id}/last-approved -- find most recent approved run for comparison
  • GET /api/health -- health check
  • WS /api/ws -- WebSocket for live progress events (global broadcast; clients filter by run_id in payload)

Frontend features:

  • Run management (create, monitor, resume, replay, fork, cancel)
  • Pipeline selector (research, OSINT, compliance)
  • Live WebSocket progress updates
  • Plan review and editing before execution continues
  • Stage pipeline visualization grouped by phase
  • Smart fork suggestions (prompt-first with LLM analysis)
  • Stage-level fork buttons
  • Embedded HTML report with provenance
  • Artifact browser with costs and timing
  • Run comparison (diff view with executive summary)
  • Provenance chain (paginated, citation back-navigation)

Serialization

All models use explicit to_dict() methods and from_dict() class methods. No pydantic, no ORM. Optional fields are omitted from serialization when at their default value (empty string, False, 0, empty list).
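A sketch of the omit-defaults convention on an illustrative model (not the real RunManifest):

```python
from dataclasses import dataclass, fields

@dataclass
class Manifest:
    """Illustrative model following the to_dict/from_dict convention."""
    run_id: str = ""
    query: str = ""
    status: str = ""
    parent_run_id: str = ""

    def to_dict(self) -> dict:
        # omit fields still at their default (empty string, False, 0, empty list)
        return {f.name: getattr(self, f.name)
                for f in fields(self)
                if getattr(self, f.name) != f.default}

    @classmethod
    def from_dict(cls, d: dict) -> "Manifest":
        # tolerate missing keys: absent fields fall back to defaults
        return cls(**{f.name: d[f.name] for f in fields(cls) if f.name in d})
```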

Graph and manifest are JSON files. Events are JSONL (one JSON object per line). Artifacts are JSON or JSONL depending on the stage.