106 changes: 103 additions & 3 deletions experiments/NOTES.md
@@ -19,7 +19,7 @@ This file tracks patterns under exploration that may eventually be formalized in
- Hands-free code review and exploration

**Tools to Evaluate**:
- [WisprFlow](https://whisperflow.com/) - Voice-to-text for coding
- [WisprFlow](https://wisprflow.ai/) - Voice-to-text for coding
- [Talon Voice](https://talonvoice.com/) - Voice control for development
- [Voice Control for VSCode](https://marketplace.visualstudio.com/items?itemName=pokey.cursorless) - VSCode voice extensions
- Native OS voice control (macOS Voice Control, Windows Speech Recognition)
@@ -40,8 +40,8 @@ This file tracks patterns under exploration that may eventually be formalized in

**Related Patterns**:
- [Tool Integration](../README.md#tool-integration) - Voice as input tool for AI
- [Custom Commands](#custom-commands) - Voice-triggered slash commands
- [Event Automation](#event-automation) - Voice input as lifecycle event
- [Developer Lifecycle](../README.md#developer-lifecycle) - Voice-triggered workflow commands
- [Context Persistence](../README.md#context-persistence) - Voice input as context source

**Anti-patterns to Avoid**:
- Over-reliance on voice for precise code editing (better for high-level commands)
@@ -50,6 +50,106 @@ This file tracks patterns under exploration that may eventually be formalized in

---

### Agentic Loops

**Status**: Early exploration
**Date Added**: 2025-01-11

**Description**: Enable long autonomous coding sessions in which the AI iteratively improves its work until explicit completion criteria are met. A stop hook intercepts exit attempts and feeds the same prompt back, so Claude can self-correct by reading test failures, error messages, and the code it wrote in earlier iterations. See the [Claude Code Ralph Wiggum plugin](https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md).

**Core Mechanics**:
- **Stop hook** intercepts exit attempts and re-injects the original prompt
- **File persistence** allows each iteration to see previous work
- **Completion promise** (e.g., `<promise>COMPLETE</promise>`) signals success
- **Iteration limits** provide safety bounds (e.g., `--max-iterations 50`)
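
The mechanics above can be sketched as a plain loop. This is a minimal simulation, not the plugin's actual hook interface: `run_loop`, `run_agent`, and `fake_agent` are hypothetical names introduced here, and the agent call is replaced by a stand-in so the control flow can be read in isolation.

```python
from __future__ import annotations

import re
from typing import Callable

# Completion promise from the notes above; whitespace inside the tag is tolerated.
PROMISE_RE = re.compile(r"<promise>\s*COMPLETE\s*</promise>")


def run_loop(
    prompt: str,
    run_agent: Callable[[str, int], str],
    max_iterations: int = 50,
) -> tuple[bool, int]:
    """Re-feed the same prompt until the promise appears or the limit is hit.

    Returns (completed, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt, iteration)
        if PROMISE_RE.search(output):
            return True, iteration
    # Safety bound reached without a completion promise.
    return False, max_iterations


def fake_agent(prompt: str, iteration: int) -> str:
    """Hypothetical stand-in for a real agent call: 'succeeds' on the third pass."""
    if iteration < 3:
        return f"iteration {iteration}: 2 tests still failing"
    return "all tests pass <promise>COMPLETE</promise>"


if __name__ == "__main__":
    print(run_loop("Implement the API until tests pass", fake_agent))
```

In a real session the state carried between iterations lives in the files on disk rather than in the loop itself, which is why the sketch passes only the unchanged prompt back in.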

**Potential Use Cases**:
- Greenfield projects you can start and walk away from
- TDD workflows: write failing tests → implement → run tests → fix → repeat
- Multi-phase feature builds with clear success criteria
- Tasks with automatic verification (tests, linters, type checkers)

**Tools to Evaluate**:
- [Ralph Wiggum Plugin](https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md) - Official Claude Code agentic loop implementation
- Custom stop hooks with iteration tracking
- Prompt templates with completion promises

**Research Questions**:
1. How do you craft effective completion promises that prevent false positives?
2. What iteration limits balance thoroughness vs. cost for different task types?
3. How should prompts structure incremental goals for multi-phase work?
4. When should loops include explicit fallback/escape instructions?
5. What metrics distinguish productive iteration from thrashing?

**Next Steps**:
- [ ] Test /ralph-loop with various task types (API builds, test suites, refactoring)
- [ ] Document effective prompt templates with completion promises
- [ ] Measure iteration counts and API costs for common workflows
- [ ] Define prompt patterns for self-correction (TDD cycles, debug loops)
- [ ] Identify tasks unsuitable for agentic loops (design decisions, unclear criteria)

**Related Patterns**:
- [Parallel Agents](../README.md#parallel-agents) - Multiple loops running concurrently
- [Developer Lifecycle](../README.md#developer-lifecycle) - Triggering loops on events
- [CheckPoint](#checkpoint) - Validation criteria within loop iterations

**Anti-patterns to Avoid**:
- Missing iteration limits (runaway costs, infinite loops)
- Vague completion criteria ("make it good" vs. explicit success metrics)
- Tasks requiring human judgment or design decisions
- Prompts without self-correction guidance (test → fix → retry cycles)
- Generating large codebases you don't understand or know how to maintain

---

### CheckPoint

**Status**: Early exploration
**Date Added**: 2025-01-11

**Description**: A systematic validation gate that runs a series of quality checks (refactoring, security, code quality, performance, architecture, documentation) after each development task to ensure continuous quality.
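
One possible shape for such a gate, as a rough sketch only: the `Check` type, the `PROFILES` mapping, and the check names are assumptions made for illustration, and each check is reduced to a callable that returns pass or fail. The `blocking` flag is one way to model the blocking-versus-advisory distinction.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    run: Callable[[], bool]  # returns True when the check passes
    blocking: bool = True    # advisory checks are reported but never fail the gate


# Hypothetical profiles: which check names run at each level.
PROFILES = {
    "quick": ["quality"],
    "standard": ["quality", "security"],
    "thorough": ["quality", "security", "documentation"],
}


def run_checkpoint(checks: list[Check], profile: str) -> tuple[bool, list[str]]:
    """Run the checks selected by the profile.

    Returns (gate_passed, names_of_failed_checks).
    """
    selected = set(PROFILES[profile])
    failures: list[str] = []
    gate_passed = True
    for check in checks:
        if check.name not in selected:
            continue
        if not check.run():
            failures.append(check.name)
            if check.blocking:
                gate_passed = False
    return gate_passed, failures


if __name__ == "__main__":
    demo_checks = [
        Check("quality", lambda: True),
        Check("security", lambda: False),
        Check("documentation", lambda: False, blocking=False),
    ]
    for profile in PROFILES:
        print(profile, run_checkpoint(demo_checks, profile))
```

Returning the failed check names alongside the gate result leaves room for attaching remediation guidance to each failure.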

**Potential Use Cases**:
- Post-commit quality validation before pushing
- Pre-merge checks in pull request workflows
- Continuous compliance verification during development
- Architecture drift detection after feature additions
- Documentation freshness validation

**Tools to Evaluate**:
- Claude Code slash commands (/xsecurity, /xquality, /xrefactor, etc.)
- Pre-commit hooks with multi-check orchestration
- Custom checkpoint scripts with configurable check suites
- CI/CD pipeline quality gates

**Research Questions**:
1. What's the optimal set of checks to run after each task?
2. How do you balance thoroughness vs. developer velocity?
3. Should checkpoints be blocking or advisory?
4. How do you handle check failures mid-workflow?
5. Can AI assistants auto-remediate checkpoint failures?

**Next Steps**:
- [ ] Define standard checkpoint check categories
- [ ] Create configurable checkpoint profiles (quick, standard, thorough)
- [ ] Implement checkpoint as Claude Code custom command
- [ ] Measure impact on code quality metrics over time
- [ ] Document checkpoint integration with CI/CD pipelines

**Related Patterns**:
- [Code Quality Prerequisites](../README.md#code-quality-prerequisites) - CI/CD quality enforcement
- [Security Sandbox](../README.md#security-sandbox) - Running agents in isolated environments
- [Agentic Loops](#agentic-loops) - Long autonomous coding sessions with self-correction
- [Guided Refactoring](../README.md#guided-refactoring) - Code improvement checks

**Anti-patterns to Avoid**:
- Running all checks on every minor change (developer fatigue)
- Checkpoint failures without actionable remediation guidance
- Skipping checkpoints under time pressure (quality debt)
- One-size-fits-all checks regardless of change scope

---

## Notes Template

When adding new pattern explorations, copy this template:
90 changes: 56 additions & 34 deletions experiments/README.md
@@ -1474,52 +1474,72 @@ Simon's caveat: *"They can't prove something is impossible—just because the co
### Centralized Rules

**Maturity**: Advanced
**Description**: Enforce organization-wide AI rules through a central gateway service or shared SDK library rather than distributing configuration files to each repository.
**Description**: Enforce organization-wide AI rules through a central Git repository that syncs to standard AI assistant configuration files (CLAUDE.md, AGENTS.md, .cursorrules) with automatic language and framework detection.

**Related Patterns**: [Codified Rules](../README.md#codified-rules), [Policy Generation](../README.md#policy-generation), [Security Orchestration](../README.md#security-orchestration)
**Related Patterns**: [Codified Rules](../README.md#codified-rules), [Progressive Disclosure](#progressive-disclosure), [Security Orchestration](../README.md#security-orchestration)

#### Core Implementation

Centralize AI rules in a three-layer architecture:
**Sync-based Architecture** (Recommended):

1. **Gateway service** - Internal service that owns all org rules, calls AI providers
2. **Wrapper library** - Shared SDK package that embeds org rules in system prompts
3. **CLI/editor layer** - Developer tools that call gateway or wrapper, never AI providers directly

**Gateway pattern**:
```
Developer tool → Internal gateway → AI provider
Org rules applied
Input/output filtered
Usage logged
Central Rules Repository (Git)
├── base/universal-rules.md
├── languages/ (python.md, typescript.md, go.md)
└── frameworks/ (react.md, django.md, fastapi.md)
[sync-ai-rules.sh]
Project Repository
├── CLAUDE.md (auto-generated)
├── AGENTS.md (auto-generated)
└── .cursorrules (auto-generated)
```

**Wrapper library pattern**:
```
Developer tool → @yourorg/ai-client → AI provider
Org rules in system prompt
Consistent across all repos
**How it works**:

1. **Central repository** stores organization rules organized by language/framework
2. **Sync script** detects project language (Python, TypeScript, Go) and framework (React, Django, FastAPI)
3. **Auto-generates** standard config files (CLAUDE.md, .cursorrules, etc.) with relevant rules
4. **Works offline** - no API calls, no internet dependencies after initial sync

**Example sync**:
```bash
# One-time setup per project
curl -O https://yourorg.com/sync-ai-rules.sh
chmod +x sync-ai-rules.sh

# Run sync (manual or via pre-commit hook)
./sync-ai-rules.sh

# Generates CLAUDE.md with:
# - Universal org rules
# - Python-specific rules (auto-detected from pyproject.toml)
# - FastAPI rules (auto-detected from dependencies)
```
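
The detection and generation steps described above could look roughly like the following. It is written in Python for readability and is not the contents of `sync-ai-rules.sh`: the marker files, the substring-based framework detection, and the function names are all assumptions. Only the directory layout (`base/`, `languages/`, `frameworks/`) and the output file names come from the description above.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical detection tables; a real script may use different signals.
LANGUAGE_MARKERS = {"pyproject.toml": "python", "tsconfig.json": "typescript", "go.mod": "go"}
FRAMEWORK_HINTS = ("fastapi", "django", "react")
MANIFESTS = ("pyproject.toml", "requirements.txt", "package.json")
OUTPUT_FILES = ("CLAUDE.md", "AGENTS.md", ".cursorrules")


def detect(project: Path) -> tuple[list[str], list[str]]:
    """Guess languages from marker files and frameworks from manifest text."""
    languages = sorted(
        {lang for marker, lang in LANGUAGE_MARKERS.items() if (project / marker).exists()}
    )
    manifest_text = "".join(
        (project / name).read_text(encoding="utf-8").lower()
        for name in MANIFESTS
        if (project / name).exists()
    )
    # Crude substring match; it would also fire on names that merely contain a hint.
    frameworks = sorted(fw for fw in FRAMEWORK_HINTS if fw in manifest_text)
    return languages, frameworks


def build_rules(rules_repo: Path, project: Path) -> str:
    """Concatenate universal, language, and framework rules that exist in the repo."""
    languages, frameworks = detect(project)
    parts = [rules_repo / "base" / "universal-rules.md"]
    parts += [rules_repo / "languages" / f"{lang}.md" for lang in languages]
    parts += [rules_repo / "frameworks" / f"{fw}.md" for fw in frameworks]
    return "\n\n".join(p.read_text(encoding="utf-8").strip() for p in parts if p.exists()) + "\n"


def sync(rules_repo: Path, project: Path) -> None:
    """Write the same generated rules into each standard config file."""
    content = build_rules(rules_repo, project)
    for name in OUTPUT_FILES:
        (project / name).write_text(content, encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rules, project = Path(tmp, "rules"), Path(tmp, "project")
        for sub in ("base", "languages", "frameworks"):
            (rules / sub).mkdir(parents=True)
        project.mkdir()
        (rules / "base" / "universal-rules.md").write_text("# Universal rules\n")
        (rules / "languages" / "python.md").write_text("# Python rules\n")
        (rules / "frameworks" / "fastapi.md").write_text("# FastAPI rules\n")
        (project / "pyproject.toml").write_text('dependencies = ["fastapi>=0.110"]\n')
        sync(rules, project)
        print((project / "CLAUDE.md").read_text())
```

Because every step reads local files only, the sketch keeps the offline property: once the rules repository has been cloned, no network access is needed.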

**Governance capabilities**:
- Input filters (block secrets, enforce read-only paths)
- Output filters (scan for banned APIs, license violations)
- Policy-as-code integration (OPA rules before/after AI calls)
- Centralized audit logging (repo, task type, tokens, files touched)
**Key benefits**:
- ✅ **Works with existing AI tools** - Claude Code, Cursor, Gemini all read standard config files
- ✅ **Offline-friendly** - No API gateway, no internet dependencies
- ✅ **Simple** - Single bash script, no Node.js services to deploy
- ✅ **Language-aware** - Auto-detects Python/TypeScript/Go and pulls relevant rules
- ✅ **Version-controlled** - Rules in Git, changes are auditable

**Benefits over distributed config**:
- Change rules once, all tools updated
- Enforceable guardrails (not just suggestions)
- Aggregate metrics across teams
- Model switching without repo changes
**Alternative: Gateway Pattern** (for advanced use cases):

Complete Example: See [examples/centralized-rules/](examples/centralized-rules/) for working gateway, wrapper library, and CLI implementations.
For organizations that need input/output filtering, policy enforcement, or usage logging, see [examples/centralized-rules/gateway-strategy/](examples/centralized-rules/gateway-strategy/) for an API gateway approach with:
- Request/response filtering
- Policy-as-code integration (OPA/Cedar)
- Centralized audit logging
- Usage metrics aggregation
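
To make request filtering concrete, here is a small sketch of an input filter a gateway might apply before forwarding a prompt. It is not taken from the linked example: `filter_request` and the pattern table are hypothetical, and the two patterns are illustrative rather than a complete secret scanner.

```python
from __future__ import annotations

import re

# Illustrative patterns only; a production filter would need a much broader set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def filter_request(prompt: str) -> tuple[bool, list[str]]:
    """Return (allowed, names_of_matched_patterns) for an outgoing prompt."""
    findings = [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(prompt)]
    return (not findings), findings


if __name__ == "__main__":
    print(filter_request("Refactor utils.py to remove duplication"))
    print(filter_request("Use key AKIAIOSFODNN7EXAMPLE to call S3"))
```

The returned pattern names could feed the audit log, so a blocked request records what was matched without storing the secret itself.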

Complete Examples:
- **[Sync Strategy](examples/centralized-rules/sync-strategy/)** - Simple Git-based sync (recommended)
- **[Gateway Strategy](examples/centralized-rules/gateway-strategy/)** - Advanced API gateway pattern

#### Anti-pattern: Scattered Configuration

Copying AI rule files into every repository:
Copying AI rule files into every repository without a central source:

```
repo-a/.cursorrules # v1.2 of org rules
@@ -1529,9 +1549,11 @@ repo-c/.ai/rules/ # Custom fork, diverged

**Problems**:
- Rules drift across repositories
- No enforcement (developers can ignore or modify)
- No visibility into AI usage patterns
- Model/rule changes require updating every repo
- Manual updates required for every repo
- No consistency enforcement
- Difficult to track which repos have current rules

**Solution**: Use a centralized sync approach in which rules are maintained in one place and automatically distributed to projects.

---
