Connecting specification-driven development with agent memory
This is an experimental project exploring how two incredible open-source tools can work together. The concepts here could be applied to any AI agent that supports GitHub Spec Kit — not just Claude Code.
| 🌱 GitHub Spec Kit | 🔗 Beads by Steve Yegge |
|---|---|
| The Brain — Planning & Principles | The Memory — Execution & Tracking |
SpecBeads is a thin orchestration layer that connects these two tools:
- Spec Kit provides the planning — specs, plans, tasks, and constitutional principles
- Beads provides the execution — issue tracking, dependencies, and agent memory
- SpecBeads provides bidirectional sync — keeping both in harmony
```mermaid
flowchart LR
subgraph SPECKIT["🌱 SPEC KIT"]
direction TB
SPEC["spec.md"]
PLAN["plan.md"]
TASKS["tasks.md"]
end
subgraph BRIDGE["🌉 SPECBEADS"]
direction TB
SYNC["Bidirectional Sync"]
MAP[".beads-mapping.json"]
CONST["constitution.md"]
end
subgraph BEADS["🔗 BEADS"]
direction TB
ISSUES["issues.jsonl"]
READY["bd ready"]
CLOSE["bd close"]
end
TASKS <-->|"/speckit.taskstobeads"| SYNC
SYNC <--> MAP
MAP <-->|"Task ↔ Issue IDs"| ISSUES
CONST -->|"Enforces principles"| IMPL
READY --> IMPL["/speckit.implementwithbeads"]
IMPL --> CLOSE
CLOSE -->|"Status sync"| SYNC
style BRIDGE fill:#2d4a22,stroke:#4ade80,stroke-width:2px
style SPECKIT fill:#1e3a5f,stroke:#60a5fa,stroke-width:2px
style BEADS fill:#5c2d1a,stroke:#fb923c,stroke-width:2px
```
🔄 **Bidirectional Sync**: Create issues from tasks, OR sync completed Beads issues back to update `tasks.md`. Run `/speckit.taskstobeads` anytime to reconcile both directions.
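For example, one round trip (the `bd-42` issue ID is illustrative):

```bash
# Forward: in your agent, run /speckit.taskstobeads after editing tasks.md
# to create or update the corresponding Beads issues.

# Reverse: close finished work in Beads...
bd close bd-42
# ...then run /speckit.taskstobeads again so the completed status
# is written back to tasks.md.
```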
| Gap | SpecBeads Solution |
|---|---|
| Spec Kit lacks execution tracking | `taskstobeads` converts tasks → Beads issues |
| Beads lacks specification context | `implementwithbeads` loads spec/plan/constitution |
| Neither enforces project principles | Constitutional compliance checks after each task |
| Disconnected planning and execution | `taskstobeads` provides bidirectional task ↔ issue sync |
SpecBeads uses psychological prompting techniques that published research suggests can improve AI performance by 40-115% on complex tasks:
| Technique | Impact | Research |
|---|---|---|
| Detailed Personas | 23% → 84% accuracy | ExpertPrompting (Xu et al., 2023) |
| Challenge Framing | +115% on hard tasks | EmotionPrompt (Li et al., 2023) |
| Stakes & Consequences | +10% avg performance | EmotionPrompt (Li et al., 2023) |
| Step-by-Step Reasoning | 34% → 80% accuracy | OPRO (Google DeepMind, 2023) |
| Self-Evaluation | Forces internal validation | Confidence Scoring (Tian et al., 2023) |
How it works:
- Commands use senior engineer personas with 15+ years of experience
- Challenge prompts trigger competitive framing ("I bet you can't...")
- Clear stakes with quantified consequences ("blocks ${N} downstream tasks")
- Deep breath triggers for deliberate step-by-step reasoning
- 0.9 confidence threshold with calibration examples prevents overconfident outputs
Example prompt enhancement:

```
You are a senior software architect with 15+ years in distributed systems.

**CRITICAL**: This blocks 5 downstream tasks if it fails.

**CHALLENGE**: Implement with 80%+ test coverage and <300 lines.

Take a deep breath and work through this step-by-step.
```

See PROMPTING_ENHANCEMENT_PLAN.md for implementation details.
- Convert `tasks.md` to Beads issues with full context
- Sync completed Beads issues back to update `tasks.md`
- Maintain `.beads-mapping.json` for task ↔ issue relationships (illustrated below)
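To make the mapping concrete, here is a hypothetical peek at the file (the `specs/user-auth/` path and the entry shape are assumptions for illustration; the actual schema is whatever `/speckit.taskstobeads` writes):

```bash
# Pretty-print the task ↔ issue mapping (requires jq)
jq . specs/user-auth/.beads-mapping.json
# Illustrative output only -- real field names may differ:
# {
#   "T001": "bd-1",
#   "T002": "bd-2"
# }
```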
- Research-backed prompting techniques (+40-115% performance)
- Detailed expert personas (senior engineers with 15+ years of experience)
- Challenge framing and quantified stakes
- Self-evaluation with 0.9 confidence threshold
- Calibrated confidence ratings prevent overconfident outputs (sketch below)
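A sketch of what the self-evaluation step might ask (wording is illustrative; the actual prompts are inlined in the command files):

```
Rate your confidence that this implementation meets the acceptance criteria (0.0-1.0).
A 0.9 means you would stake the release on it. If you are below 0.9, state what
you would verify or change before marking the task complete.
```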
Beads issues include:
- Relevant spec & plan sections (agents don't need to read full docs)
- Extracted file paths from task descriptions
- Acceptance criteria from task markdown
- Dependency counts ("blocks N downstream tasks")
- Persona labels (`persona:test`, `persona:feature`, etc.)
- Challenge framing to trigger higher-quality implementation
- Automatic validation after each task
- TDD ordering enforcement (tests before implementation)
- SOLID principles checking
- Commit size validation (300-600 lines; see the sketch after this list)
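As a rough illustration, a commit-size gate along these lines could be written as follows (a sketch only; the real check lives in `check-constitutional-compliance.sh` and may differ):

```bash
# Count added + removed lines in the staged diff and warn if the commit
# falls outside the constitution's 300-600 line guidance.
lines_changed=$(git diff --cached --numstat | awk '{ s += $1 + $2 } END { print s + 0 }')
if [ "$lines_changed" -gt 600 ]; then
  echo "Commit is $lines_changed lines; constitution recommends 300-600. Consider splitting it."
fi
```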
- Phase-based dependencies (Foundational → User Stories → Polish)
- Explicit task dependencies from tasks.md
- Related links for context (tasks in same user story)
- `bd ready` shows only unblocked work (see the example session below)
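A typical session (issue IDs and titles are made up for illustration):

```bash
bd ready          # lists only issues whose dependencies are all closed
bd close bd-7     # closing an issue unblocks its dependents
bd ready          # dependents of bd-7 now appear in the ready list
```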
```bash
npm install -g @anthropic-ai/claude-code
```

Or use any supported agent.

```bash
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
specify init . --ai claude
```

```bash
curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash
bd init
```

Copy all three components to your project:

```bash
cp -r .claude/commands/* /your/project/.claude/commands/
```

The commands depend on these automation scripts:

```bash
cp -r .specify/scripts/bash/* /your/project/.specify/scripts/bash/
```

Scripts included:
- `check-prerequisites.sh` — Validates feature directory and tasks.md
- `startup-checks.sh` — Verifies Beads is installed and initialized (can also be run by hand; see below)
- `config-loader.sh` — Loads automation preferences
- `check-constitutional-compliance.sh` — Post-implementation compliance checks
- `detect-already-implemented.sh` — Smart detection of existing implementations
- `common.sh` — Shared utilities
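If you want to sanity-check the setup by hand, the startup checks can presumably be run directly (an assumption; the commands normally invoke this script for you):

```bash
bash .specify/scripts/bash/startup-checks.sh   # verifies Beads is installed and bd init has been run
```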
```bash
# Constitution template
cp .specify/memory/constitution.md /your/project/.specify/memory/

# Psychological prompting templates (reference documentation)
cp -r .specify/prompts /your/project/.specify/
```

Templates included:
- `personas.md` — Role definitions (architect, test engineer, DevOps, reviewer)
- `framing.md` — Stakes and challenge templates
- `self-eval.md` — Confidence rating scales with calibration examples
**Note**: Prompts are inlined in the command files for performance. These templates serve as reference documentation and customization starting points.
Restart your AI agent to load the new commands.
```
# 1. Create specification (Spec Kit)
/speckit.specify "Add user authentication"

# 2. Generate plan (Spec Kit)
/speckit.plan

# 3. Break into tasks (Spec Kit)
/speckit.tasks

# 4. Convert to Beads issues (SpecBeads!)
/speckit.taskstobeads

# 5. Implement with compliance (SpecBeads!)
/speckit.implementwithbeads
```

| Command | Description |
|---|---|
| `/speckit.taskstobeads` | Convert tasks.md ↔ Beads issues (bidirectional) |
| `/speckit.implementwithbeads` | Implement next ready task with constitutional compliance |
```
your-project/
├── .specify/
│ ├── memory/
│ │ └── constitution.md # Project principles
│ ├── prompts/ # Psychological prompting templates
│ │ ├── README.md # Template usage guide
│ │ ├── personas.md # Role definitions
│ │ ├── framing.md # Stakes & challenge templates
│ │ └── self-eval.md # Confidence calibration examples
│ └── scripts/bash/ # SpecBeads automation scripts
│ ├── check-prerequisites.sh
│ ├── startup-checks.sh
│ ├── config-loader.sh
│ ├── check-constitutional-compliance.sh
│ ├── detect-already-implemented.sh
│ └── common.sh
├── .claude/commands/
│ ├── speckit.taskstobeads.md # Tasks ↔ Beads sync
│ └── speckit.implementwithbeads.md # Implement with compliance
├── .beads/
│ └── issues.jsonl # Beads database
└── specs/[feature]/
├── spec.md
├── plan.md
├── tasks.md
    └── .beads-mapping.json           # Task ↔ Issue mapping
```
SpecBeads' psychological prompting techniques are based on published research:
- **EmotionPrompt** — Li et al. (2023). "Large Language Models Understand and Can be Enhanced by Emotional Stimuli." ICLR 2024. arXiv:2307.11760
- **OPRO ("Deep Breath")** — Yang et al. (2023). "Large Language Models as Optimizers." Google DeepMind. arXiv:2309.03409
- **26 Prompting Principles** — Bsharat et al. (2023). "Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4." arXiv:2312.16171
- **ExpertPrompting** — Xu et al. (2023). "ExpertPrompting: Instructing Large Language Models to be Distinguished Experts." arXiv:2305.14688
See PROMPTING_ENHANCEMENT_PLAN.md for full implementation details.
MIT License — see LICENSE
Specification-first. Memory-enabled. Constitution-governed. AI-optimized.


