Architecture Squad — Schema Design

Pipeline Overview

                            ┌─────────────────────────────────────────────┐
                            │              SCRIBE (always-on)             │
                            │  Observes all stages, appends to           │
                            │  decision-log.json after each transition    │
                            └─────────────────────────────────────────────┘
                                              │ writes
                                              ▼
                                    ┌──────────────────┐
                                    │  decision-log    │
                                    │  (shared state)  │
                                    └──────────────────┘
                                              ▲ reads
                                              │
┌──────────┐    ┌───────────┐    ┌───────────┐    ┌───────────────┐    ┌─────────┐    ┌───────────┐    ┌───────────┐
│ Raw      │───▶│ ROSALIND  │───▶│  HAWKING  │───▶│  OPPENHEIMER  │◀──▶│ FEYNMAN │───▶│  TURING   │───▶│ VON BRAUN │
│ Input    │    │           │    │           │    │               │    │         │    │           │    │           │
│          │    │ Normalize │    │ Concept   │    │ Rationalize   │    │Challenge│    │Whitepaper │    │ Delivery  │
│          │    │ & Chunk   │    │ Tree      │    │ with existing │    │& verify │    │ Author    │    │ Plan      │
└──────────┘    └───────────┘    └───────────┘    └───────────────┘    └─────────┘    └───────────┘    └───────────┘
                     │                │                  │                  │               │               │
                     ▼                ▼                  ▼                  ▼               ▼               ▼
               normalized-      concept-tree      rationalized-     challenge-       whitepaper       delivery-
               input.json       .json             architecture      report.json      .json            plan.json
                                                  .json

Schema File Index

| Schema | Producer | Consumer(s) | Purpose |
|---|---|---|---|
| common.schema.json | | All members | Shared types: envelope, evidenceRef, conceptId, severity, status |
| normalized-input.schema.json | Rosalind | All members | Uniform chunked representation of any input format; the shared ground truth |
| concept-tree.schema.json | Hawking | Oppenheimer | Hierarchical concept extraction with evidence links |
| rationalized-architecture.schema.json | Oppenheimer | Feynman, Turing | Merged concept tree with deltas and conflict resolutions |
| challenge-report.schema.json | Feynman | Oppenheimer (loop) | Adversarial review with typed challenges |
| whitepaper.schema.json | Turing | Von Braun, humans | Formal architecture document with ADRs and diagrams |
| delivery-plan.schema.json | Von Braun | Humans | Phased execution plan with dependencies and gates |
| decision-log.schema.json | Scribe | All members (read) | Append-only institutional memory |
| pipeline-manifest.schema.json | | Orchestrator | Squad composition, routing, and loop configuration |

Key Design Decisions

1. Artifact Envelope (Provenance Chain)

Every artifact is wrapped in an artifactEnvelope, which carries:

  • artifact_id — unique instance ID
  • input_artifact_ids — what upstream artifacts produced this one
  • session_id — groups everything from a single processing run

This means you can trace any concept in a delivery plan back through the whitepaper → rationalized tree → concept tree → normalized input → raw source chunk. Full provenance.
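
As a rough illustration of that trace, the sketch below walks input_artifact_ids backwards from a delivery plan to its root input. Only artifact_id, input_artifact_ids, and session_id come from the envelope described above; the in-memory store, the trace_provenance helper, and the example IDs are assumptions made for the example.

```python
from typing import Dict, List


def trace_provenance(artifact_id: str, store: Dict[str, dict]) -> List[str]:
    """Return artifact IDs from the given artifact back to its roots (depth-first)."""
    seen: List[str] = []
    stack = [artifact_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # an artifact can be reachable through more than one path
        seen.append(current)
        # Push parents in reverse so they are visited in their listed order.
        stack.extend(reversed(store[current]["input_artifact_ids"]))
    return seen


# Hypothetical store keyed by artifact_id; the IDs are invented for illustration.
store = {
    "plan-1": {"artifact_id": "plan-1", "input_artifact_ids": ["wp-1"], "session_id": "s-1"},
    "wp-1": {"artifact_id": "wp-1", "input_artifact_ids": ["arch-1"], "session_id": "s-1"},
    "arch-1": {"artifact_id": "arch-1", "input_artifact_ids": ["tree-1"], "session_id": "s-1"},
    "tree-1": {"artifact_id": "tree-1", "input_artifact_ids": ["input-1"], "session_id": "s-1"},
    "input-1": {"artifact_id": "input-1", "input_artifact_ids": [], "session_id": "s-1"},
}

chain = trace_provenance("plan-1", store)
print(" -> ".join(chain))
```

The walk stops at artifacts whose input_artifact_ids list is empty, which in this pipeline would be Rosalind's normalized input.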

2. Evidence References

evidenceRef is the universal "show your work" type. Every concept, every delta, every challenge can point back to the specific chunk in Rosalind's normalized input. This is what makes the system auditable — no concept floats without grounding.
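
One way to enforce that grounding rule is a check that flags any concept whose evidence does not resolve to a known chunk. This is a minimal sketch: the field names evidence, chunk_id, and id, and the chunk IDs themselves, are hypothetical, since the internal shape of evidenceRef is not spelled out here.

```python
from typing import Iterable, List, Set


def ungrounded_concepts(concepts: Iterable[dict], chunk_ids: Set[str]) -> List[str]:
    """Return IDs of concepts with no evidence ref that resolves to a known chunk."""
    missing = []
    for concept in concepts:
        refs = concept.get("evidence", [])  # hypothetical field holding evidenceRef objects
        if not any(ref.get("chunk_id") in chunk_ids for ref in refs):
            missing.append(concept["id"])
    return missing


# Hypothetical data: chunk IDs from normalized-input.json and two extracted concepts.
known_chunks = {"chunk-001", "chunk-002"}
concepts = [
    {"id": "C-event-driven-messaging", "evidence": [{"chunk_id": "chunk-001"}]},
    {"id": "C-zero-trust-boundary", "evidence": []},
]

floating = ungrounded_concepts(concepts, known_chunks)
print(floating)
```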

3. Shared Transcript Access (Lens Model)

Every member has read access to normalized-input.json. The pipeline is sequential, but information access is not. Each member reads the transcript through a different lens:

  • Hawking: "What concepts exist in this input?"
  • Oppenheimer: "Does the transcript support this merge/conflict resolution?" — reads through the concept tree
  • Feynman: "Is the evidence actually as strong as claimed?" — verification against source
  • Turing: "What narrative context makes this whitepaper persuasive?" — stakeholder concerns, debate color
  • Von Braun: "What delivery complexity signals did participants express?" — risk and capacity hints
  • Scribe: "What was the original context for this logged decision?"

This avoids the failure mode where upstream abstraction loses critical nuance. Architecture discussions are rich with subtext — hedging, pushback, confidence levels, role-based authority — that concept trees and rationalized models can flatten. Giving every member access to the source lets them recover that nuance when their role demands it.

The constraint: transcript access is a supplement, not a bypass. Each member's primary input remains the upstream artifact (concept tree, rationalized architecture, whitepaper). The transcript is consulted when the primary input is ambiguous or insufficient.

4. Concept Identity

Concept IDs use the pattern C-{kebab-case} (e.g., C-event-driven-messaging, C-zero-trust-boundary). These are stable across the pipeline — when Oppenheimer merges a concept, the ID persists. When Feynman challenges a concept, he references the same ID. When Turing writes about it, the related_concepts field links back.

This stability is what makes the concept tree a true knowledge graph rather than a series of disconnected documents.
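
A small validator can keep IDs in this shape at every stage. The regular expression below is an assumption about what "kebab-case" allows (lowercase letters and digits, single hyphens between segments); the actual pattern in common.schema.json may differ.

```python
import re

# Assumed reading of C-{kebab-case}: "C-" followed by lowercase alphanumeric
# segments joined by single hyphens.
CONCEPT_ID = re.compile(r"C-[a-z0-9]+(?:-[a-z0-9]+)*")


def is_concept_id(value: str) -> bool:
    """True when the whole string matches the assumed concept ID pattern."""
    return CONCEPT_ID.fullmatch(value) is not None


for candidate in ["C-event-driven-messaging", "C-zero-trust-boundary", "C-Zero_Trust", "C-"]:
    print(candidate, is_concept_id(candidate))
```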

5. The Oppenheimer ↔ Feynman Loop

The challenge-report schema includes:

  • iteration — which pass this is
  • verdict: approved, revise, or blocked
  • Per-challenge status — tracks which challenges were resolved across iterations

The pipeline manifest defines max_iterations (recommended value: 3) and a convergence condition (feynman.verdict == approved). If the loop reaches max_iterations without approval, the pipeline proceeds, but the whitepaper carries a flag indicating unresolved challenges.
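
The loop control can be sketched as follows. The run_challenge_loop function and the rationalize and challenge callables are hypothetical stand-ins for Oppenheimer and Feynman; only iteration, verdict, and max_iterations come from the description above. Treating blocked the same as revise is an assumption of this sketch, not something the schema states.

```python
from typing import Callable, Optional, Tuple


def run_challenge_loop(
    rationalize: Callable[[Optional[dict]], dict],
    challenge: Callable[[dict, int], dict],
    max_iterations: int = 3,
) -> Tuple[dict, dict, bool]:
    """Run the rationalize/challenge loop; return (architecture, report, unresolved)."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback: Optional[dict] = None
    for iteration in range(1, max_iterations + 1):
        architecture = rationalize(feedback)          # stand-in for Oppenheimer
        report = challenge(architecture, iteration)   # stand-in for Feynman
        if report["verdict"] == "approved":
            return architecture, report, False        # converged
        feedback = report                             # feed challenges into the next pass
    return architecture, report, True                 # proceed, flagged as unresolved


# Stub members: the reviewer approves on the second pass.
def stub_rationalize(feedback: Optional[dict]) -> dict:
    return {"revision": 0 if feedback is None else feedback["iteration"]}


def stub_challenge(architecture: dict, iteration: int) -> dict:
    return {"iteration": iteration, "verdict": "approved" if iteration >= 2 else "revise"}


architecture, report, unresolved = run_challenge_loop(stub_rationalize, stub_challenge)
print(report["iteration"], report["verdict"], unresolved)
```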

6. Separation of Structure and Rendering

The whitepaper schema stores structured data (sections, ADRs, diagram definitions), not a rendered document. Turing produces both the JSON and a rendered Markdown or DOCX document from it. This means:

  • The structured data is machine-readable for downstream consumers (Von Braun)
  • The rendered document is human-readable for stakeholders
  • The diagrams are notation-agnostic (mermaid, plantuml, etc.) and renderable separately
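
To illustrate the split, the sketch below renders Markdown from a structured whitepaper object. The title, sections, heading, and body fields are hypothetical (only section_type appears elsewhere in this document), and render_markdown is an illustrative helper rather than Turing's actual renderer.

```python
def render_markdown(whitepaper: dict) -> str:
    """Render a structured whitepaper dict as a Markdown string."""
    lines = [f"# {whitepaper['title']}", ""]
    for section in whitepaper["sections"]:
        lines += [f"## {section['heading']}", "", section["body"], ""]
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical structured data; the same dict stays machine-readable downstream.
doc = {
    "title": "Messaging Architecture",
    "sections": [
        {
            "section_type": "summary",
            "heading": "Summary",
            "body": "Adopt event-driven messaging.",
        }
    ],
}

rendered = render_markdown(doc)
print(rendered)
```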

7. Delivery Plan as Bridge to Execution

Von Braun's schema deliberately avoids hour estimates or sprint assignments. It uses:

  • T-shirt sizing (XS–XL) for work items
  • Phase sequencing with entry/exit criteria
  • Decision gates that explicitly block work items
  • Critical path flagging on dependencies

This keeps the plan at the architecture level. The delivery team converts it to their own planning tools.
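
The gate mechanic can be sketched as a filter over work items. All field names here (id, size, blocked_by, decided) and the example IDs are hypothetical; the delivery-plan schema may model gates differently.

```python
from typing import List


def ready_work_items(work_items: List[dict], gates: List[dict]) -> List[str]:
    """Return IDs of work items that are not blocked by any undecided gate."""
    open_gates = {gate["id"] for gate in gates if not gate["decided"]}
    return [
        item["id"]
        for item in work_items
        if not open_gates.intersection(item.get("blocked_by", []))
    ]


# Hypothetical plan fragment: one decided gate, one still open.
gates = [
    {"id": "G-1", "decided": True},
    {"id": "G-2", "decided": False},
]
work_items = [
    {"id": "W-1", "size": "M", "blocked_by": ["G-1"]},
    {"id": "W-2", "size": "XL", "blocked_by": ["G-2"]},
    {"id": "W-3", "size": "XS"},
]

ready = ready_work_items(work_items, gates)
print(ready)
```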

Encapsulation Strategy

To use this squad in a new context:

  1. Deploy the schemas — they define the interface contract
  2. Instantiate the pipeline manifest — configure members, loops, convergence rules
  3. Seed the decision log — if there's existing architectural knowledge, load it as initial entries
  4. Point Rosalind at the input source — her source.type enum is extensible for new input formats

The schemas are the portable part. The member implementations (prompts, tools) are separate and can be swapped. As long as a member produces valid output conforming to its schema, the pipeline works.

Extension Points

  • Rosalind's source.type: Add new input formats without touching downstream schemas
  • Hawking's concept_type: Add new concept classifications as the domain evolves
  • Feynman's challenge.type: Add domain-specific challenge categories
  • Turing's section_type: Add new document section types for different whitepaper styles
  • Von Braun's approach: Add new delivery strategies
  • Common memberId: Add new squad members

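All of these extension points are enums. Adding a value to an enum is a backward-compatible schema change: artifacts that validated before still validate. Consumers that branch on enum values should have a fallback for values they do not yet recognize.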
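
A consumer written before an enum was extended can stay compatible by routing unfamiliar values to a generic path. In this sketch the source.type values and handler names are invented for illustration; they are not taken from the schemas.

```python
# Hypothetical set of source.type values this consumer was written against.
KNOWN_SOURCE_TYPES = frozenset({"transcript", "document"})


def handler_for(source_type: str) -> str:
    """Pick a handler name, falling back to a generic one for unrecognized values."""
    if source_type in KNOWN_SOURCE_TYPES:
        return f"handle_{source_type}"
    return "handle_generic"  # value added to the enum after this consumer was written


print(handler_for("transcript"))
print(handler_for("whiteboard-photo"))
```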