Connecting specification-driven development with agent memory
This is an experimental project exploring how two incredible open-source tools can work together. The concepts here could be applied to any AI agent that supports GitHub Spec Kit, not just Claude Code.
- **GitHub Spec Kit** (The Brain): Planning & Principles
- **Beads** by Steve Yegge (The Memory): Execution & Tracking
SpecBeads is a thin orchestration layer that connects these two tools:
- **Spec Kit** provides the planning: specs, plans, tasks, and constitutional principles
- **Beads** provides the execution: issue tracking, dependencies, and agent memory
- **SpecBeads** provides bidirectional sync: keeping both in harmony
```mermaid
flowchart LR
    subgraph SPECKIT["SPEC KIT"]
        direction TB
        SPEC["spec.md"]
        PLAN["plan.md"]
        TASKS["tasks.md"]
    end
    subgraph BRIDGE["SPECBEADS"]
        direction TB
        SYNC["Bidirectional Sync"]
        MAP[".beads-mapping.json"]
        CONST["constitution.md"]
    end
    subgraph BEADS["BEADS"]
        direction TB
        ISSUES["issues.jsonl"]
        READY["bd ready"]
        CLOSE["bd close"]
    end
    TASKS <-->|"/speckit.taskstobeads"| SYNC
    SYNC <--> MAP
    MAP <-->|"Task ↔ Issue IDs"| ISSUES
    CONST -->|"Enforces principles"| IMPL
    READY --> IMPL["/speckit.implementwithbeads"]
    IMPL --> CLOSE
    CLOSE -->|"Status sync"| SYNC
    style BRIDGE fill:#2d4a22,stroke:#4ade80,stroke-width:2px
    style SPECKIT fill:#1e3a5f,stroke:#60a5fa,stroke-width:2px
    style BEADS fill:#5c2d1a,stroke:#fb923c,stroke-width:2px
```
**Bidirectional Sync**: Create issues from tasks, OR sync completed Beads issues back to update tasks.md. Run `/speckit.taskstobeads` anytime to reconcile both directions.
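The sync state itself lives in `.beads-mapping.json` next to your spec files. As a rough illustration only (the real schema may differ; the field layout here is an assumption), the file could map task IDs to issue IDs like this:

```json
{
  "T001": "bd-1",
  "T002": "bd-2",
  "T003": "bd-3"
}
```

A flat ID-to-ID map like this is enough for both sync directions: tasks → issues on creation, and issues → tasks.md when statuses change.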
| Gap | SpecBeads Solution |
|---|---|
| Spec Kit lacks execution tracking | `taskstobeads` converts tasks → Beads issues |
| Beads lacks specification context | `implementwithbeads` loads spec/plan/constitution |
| Neither enforces project principles | Constitutional compliance checks after each task |
| Disconnected planning and execution | `taskstobeads` provides bidirectional task ↔ issue sync |
SpecBeads uses psychological prompting techniques shown in research to improve AI performance by 40-115% on complex tasks:

| Technique | Impact | Research |
|---|---|---|
| Detailed Personas | 23% → 84% accuracy | ExpertPrompting (Xu et al., 2023) |
| Challenge Framing | +115% on hard tasks | EmotionPrompt (Li et al., 2023) |
| Stakes & Consequences | +10% avg performance | EmotionPrompt Study |
| Step-by-Step Reasoning | 34% → 80% accuracy | OPRO (Google DeepMind, 2023) |
| Self-Evaluation | Forces internal validation | Confidence Scoring (Tian et al., 2023) |
How it works:
- Commands use senior engineer personas with 15+ years experience
- Challenge prompts trigger competitive framing ("I bet you can't...")
- Clear stakes with quantified consequences ("blocks ${N} downstream tasks")
- Deep breath triggers for deliberate step-by-step reasoning
- 0.9 confidence threshold with calibration examples prevents overconfident outputs
Example prompt enhancement:

```text
You are a senior software architect with 15+ years in distributed systems.
**CRITICAL**: This blocks 5 downstream tasks if it fails.
**CHALLENGE**: Implement with 80%+ test coverage and <300 lines.
Take a deep breath and work through this step-by-step.
```

See PROMPTING_ENHANCEMENT_PLAN.md for implementation details.
- Convert `tasks.md` to Beads issues with full context
- Sync completed Beads issues back to update `tasks.md`
- Maintain `.beads-mapping.json` for task ↔ issue relationships
- Research-backed prompting techniques (+40-115% performance)
- Detailed expert personas (senior engineers with 15+ years experience)
- Challenge framing and quantified stakes
- Self-evaluation with 0.9 confidence threshold
- Calibrated confidence ratings prevent overconfident outputs
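The self-evaluation threshold works as an instruction embedded in the command prompts. A sketch of what such an instruction might look like (wording is illustrative, not quoted from the actual command files):

```text
Before finalizing, rate your confidence in this implementation from 0.0 to 1.0.
A 0.9 means: tests pass, edge cases handled, constitution checked.
If your rating is below 0.9, list what you are unsure about and revise first.
```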
Beads issues include:

- Relevant spec & plan sections (agents don't need to read full docs)
- Extracted file paths from task descriptions
- Acceptance criteria from task markdown
- Dependency counts ("blocks N downstream tasks")
- Persona labels (`persona:test`, `persona:feature`, etc.)
- Challenge framing to trigger higher-quality implementation
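Beads stores issues as JSON Lines, one object per line in `.beads/issues.jsonl`. A sketch of what a synced issue record might carry (field names and values are illustrative, not the actual Beads schema):

```json
{"id": "bd-7", "title": "T007: Add login endpoint", "status": "open", "labels": ["persona:feature"], "description": "Spec/plan excerpt, file paths, acceptance criteria, and a 'blocks 3 downstream tasks' note"}
```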
- Automatic validation after each task
- TDD ordering enforcement (tests before implementation)
- SOLID principles checking
- Commit size validation (300-600 lines)
- Phase-based dependencies (Foundational → User Stories → Polish)
- Explicit task dependencies from tasks.md
- Related links for context (tasks in same user story)
- `bd ready` shows only unblocked work
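The dependency wiring starts from plain text in tasks.md. A hypothetical sketch of the kind of line parsing involved (the `T`-number format and the `depends on:` phrasing are assumptions for illustration, not the real syntax):

```shell
# Parse one hypothetical tasks.md checklist line into a task ID plus its
# dependency references. Format and phrasing are assumed, not the real syntax.
line='- [ ] T004 Wire login endpoint (depends on: T001, T002)'
task_id=$(printf '%s\n' "$line" | grep -oE 'T[0-9]+' | head -n 1)
deps=$(printf '%s\n' "$line" | sed -n 's/.*depends on: \([T0-9, ]*\).*/\1/p' | grep -oE 'T[0-9]+' | tr '\n' ' ')
echo "task=$task_id deps=$deps"
```

Issues created from lines like this would carry those IDs as Beads dependencies, which is what lets `bd ready` hide T004 until T001 and T002 are closed.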
Install Claude Code (or use any supported agent):

```shell
npm install -g @anthropic-ai/claude-code
```

Install Spec Kit:

```shell
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
specify init . --ai claude
```

Install Beads:

```shell
curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash
bd init
```

Copy all three components to your project:
```shell
cp -r .claude/commands/* /your/project/.claude/commands/
```

The commands depend on these automation scripts:

```shell
cp -r .specify/scripts/bash/* /your/project/.specify/scripts/bash/
```

Scripts included:

- `check-prerequisites.sh`: Validates feature directory and tasks.md
- `startup-checks.sh`: Verifies Beads is installed and initialized
- `config-loader.sh`: Loads automation preferences
- `check-constitutional-compliance.sh`: Post-implementation compliance checks
- `detect-already-implemented.sh`: Smart detection of existing implementations
- `common.sh`: Shared utilities
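As a minimal sketch of the kind of guard `startup-checks.sh` performs (the actual script's contents may differ), a Beads availability check can be as simple as:

```shell
# Check whether the bd CLI is on PATH before any sync command runs.
# This mirrors the spirit of startup-checks.sh; the real script may do more.
if command -v bd >/dev/null 2>&1; then
  beads_status="installed"
else
  beads_status="missing"
fi
echo "beads: $beads_status"
```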
```shell
# Constitution template
cp .specify/memory/constitution.md /your/project/.specify/memory/

# Psychological prompting templates (reference documentation)
cp -r .specify/prompts /your/project/.specify/
```

Templates included:

- `personas.md`: Role definitions (architect, test engineer, DevOps, reviewer)
- `framing.md`: Stakes and challenge templates
- `self-eval.md`: Confidence rating scales with calibration examples

Note: Prompts are inlined in command files for performance. These templates serve as reference documentation and customization starting points.
Restart your AI agent to load the new commands.
```shell
# 1. Create specification (Spec Kit)
/speckit.specify "Add user authentication"

# 2. Generate plan (Spec Kit)
/speckit.plan

# 3. Break into tasks (Spec Kit)
/speckit.tasks

# 4. Convert to Beads issues (SpecBeads!)
/speckit.taskstobeads

# 5. Implement with compliance (SpecBeads!)
/speckit.implementwithbeads
```

| Command | Description |
|---|---|
| `/speckit.taskstobeads` | Convert tasks.md ↔ Beads issues (bidirectional) |
| `/speckit.implementwithbeads` | Implement next ready task with constitutional compliance |
```text
your-project/
├── .specify/
│   ├── memory/
│   │   └── constitution.md                 # Project principles
│   ├── prompts/                            # Psychological prompting templates
│   │   ├── README.md                       # Template usage guide
│   │   ├── personas.md                     # Role definitions
│   │   ├── framing.md                      # Stakes & challenge templates
│   │   └── self-eval.md                    # Confidence calibration examples
│   └── scripts/bash/                       # SpecBeads automation scripts
│       ├── check-prerequisites.sh
│       ├── startup-checks.sh
│       ├── config-loader.sh
│       ├── check-constitutional-compliance.sh
│       ├── detect-already-implemented.sh
│       └── common.sh
├── .claude/commands/
│   ├── speckit.taskstobeads.md             # Tasks ↔ Beads sync
│   └── speckit.implementwithbeads.md       # Implement with compliance
├── .beads/
│   └── issues.jsonl                        # Beads database
└── specs/[feature]/
    ├── spec.md
    ├── plan.md
    ├── tasks.md
    └── .beads-mapping.json                 # Task ↔ Issue mapping
```
SpecBeads' psychological prompting techniques are based on peer-reviewed research:
- **EmotionPrompt**: Li et al. (2023). "Large Language Models Understand and Can be Enhanced by Emotional Stimuli." ICLR 2024. arXiv:2307.11760
- **OPRO ("Deep Breath")**: Yang et al. (2023). "Large Language Models as Optimizers." Google DeepMind. arXiv:2309.03409
- **26 Prompting Principles**: Bsharat et al. (2023). "Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4." arXiv:2312.16171
- **ExpertPrompting**: Xu et al. (2023). "ExpertPrompting: Instructing Large Language Models to be Distinguished Experts." arXiv:2305.14688
See PROMPTING_ENHANCEMENT_PLAN.md for full implementation details.
MIT License; see LICENSE
Specification-first. Memory-enabled. Constitution-governed. AI-optimized.


