diff --git a/experiments/NOTES.md b/experiments/NOTES.md
index ef1064c..5f1ed72 100644
--- a/experiments/NOTES.md
+++ b/experiments/NOTES.md
@@ -19,7 +19,7 @@ This file tracks patterns under exploration that may eventually be formalized in
 - Hands-free code review and exploration
 
 **Tools to Evaluate**:
-- [WisprFlow](https://whisperflow.com/) - Voice-to-text for coding
+- [WisprFlow](https://wisprflow.ai/) - Voice-to-text for coding
 - [Talon Voice](https://talonvoice.com/) - Voice control for development
 - [Voice Control for VSCode](https://marketplace.visualstudio.com/items?itemName=pokey.cursorless) - VSCode voice extensions
 - Native OS voice control (macOS Voice Control, Windows Speech Recognition)
@@ -40,8 +40,8 @@ This file tracks patterns under exploration that may eventually be formalized in
 
 **Related Patterns**:
 - [Tool Integration](../README.md#tool-integration) - Voice as input tool for AI
-- [Custom Commands](#custom-commands) - Voice-triggered slash commands
-- [Event Automation](#event-automation) - Voice input as lifecycle event
+- [Developer Lifecycle](../README.md#developer-lifecycle) - Voice-triggered workflow commands
+- [Context Persistence](../README.md#context-persistence) - Voice input as context source
 
 **Anti-patterns to Avoid**:
 - Over-reliance on voice for precise code editing (better for high-level commands)
@@ -50,6 +50,106 @@ This file tracks patterns under exploration that may eventually be formalized in
 
 ---
 
+### Agentic Loops
+
+**Status**: Early exploration
+**Date Added**: 2025-01-11
+
+**Description**: Enable long autonomous coding sessions where AI iteratively improves work until explicit completion criteria are met. Uses a stop hook to intercept exit attempts and feed the same prompt back, allowing Claude to self-correct through test failures, error messages, and its own code. See the [Claude Code Ralph Wiggum plugin](https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md).
+
+**Core Mechanics**:
+- **Stop hook** intercepts exit attempts and re-injects the original prompt
+- **File persistence** allows each iteration to see previous work
+- **Completion promise** (e.g., `COMPLETE`) signals success
+- **Iteration limits** provide safety bounds (e.g., `--max-iterations 50`)
+
+**Potential Use Cases**:
+- Greenfield projects you can start and walk away from
+- TDD workflows: write failing tests → implement → run tests → fix → repeat
+- Multi-phase feature builds with clear success criteria
+- Tasks with automatic verification (tests, linters, type checkers)
+
+**Tools to Evaluate**:
+- [Ralph Wiggum Plugin](https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md) - Official Claude Code agentic loop implementation
+- Custom stop hooks with iteration tracking
+- Prompt templates with completion promises
+
+**Research Questions**:
+1. How do you craft effective completion promises that prevent false positives?
+2. What iteration limits balance thoroughness vs. cost for different task types?
+3. How should prompts structure incremental goals for multi-phase work?
+4. When should loops include explicit fallback/escape instructions?
+5. What metrics distinguish productive iteration from thrashing?
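The core mechanics listed in this section (re-running the same prompt, a completion promise, and an iteration limit) can be sketched as a small driver loop. This is a minimal illustration under stated assumptions, not the Ralph Wiggum plugin's implementation: `run_loop` is a hypothetical helper, and the command passed to it stands in for one agent iteration whose work persists on disk between runs.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an agentic loop driver (not the plugin's actual code).
# Usage: run_loop MAX_ITERATIONS PROMISE COMMAND [ARGS...]
# Re-runs COMMAND until its output contains PROMISE or the limit is reached.
run_loop() {
  local max_iterations="$1"
  local promise="$2"
  shift 2
  local iteration=1
  local output
  while [ "$iteration" -le "$max_iterations" ]; do
    # Re-running the same command stands in for re-injecting the original prompt.
    output="$("$@" 2>&1)" || true
    case "$output" in
      *"$promise"*)
        # Completion promise found: report success and stop looping.
        echo "completed after $iteration iteration(s)"
        return 0
        ;;
    esac
    iteration=$((iteration + 1))
  done
  # Safety bound: give up instead of looping forever.
  echo "stopped: reached limit of $max_iterations iteration(s)" >&2
  return 1
}
```

For example, `run_loop 50 COMPLETE ./run-agent.sh` (where `run-agent.sh` is a placeholder) would stop at the first output containing `COMPLETE` or after 50 attempts. Plain substring matching is the simplest possible check and can produce the false positives raised in the first research question, for instance if the agent merely quotes the promise text.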
+
+**Next Steps**:
+- [ ] Test /ralph-loop with various task types (API builds, test suites, refactoring)
+- [ ] Document effective prompt templates with completion promises
+- [ ] Measure iteration counts and API costs for common workflows
+- [ ] Define prompt patterns for self-correction (TDD cycles, debug loops)
+- [ ] Identify tasks unsuitable for agentic loops (design decisions, unclear criteria)
+
+**Related Patterns**:
+- [Parallel Agents](../README.md#parallel-agents) - Multiple loops running concurrently
+- [Developer Lifecycle](../README.md#developer-lifecycle) - Triggering loops on events
+- [CheckPoint](#checkpoint) - Validation criteria within loop iterations
+
+**Anti-patterns to Avoid**:
+- Missing iteration limits (runaway costs, infinite loops)
+- Vague completion criteria ("make it good" vs. explicit success metrics)
+- Tasks requiring human judgment or design decisions
+- Prompts without self-correction guidance (test → fix → retry cycles)
+- Generating large codebases you don't understand or know how to maintain
+
+---
+
+### CheckPoint
+
+**Status**: Early exploration
+**Date Added**: 2025-01-11
+
+**Description**: A systematic validation gate that runs a series of quality checks (refactoring, security, code quality, performance, architecture, documentation) after each development task to ensure continuous quality.
+
+**Potential Use Cases**:
+- Post-commit quality validation before pushing
+- Pre-merge checks in pull request workflows
+- Continuous compliance verification during development
+- Architecture drift detection after feature additions
+- Documentation freshness validation
+
+**Tools to Evaluate**:
+- Claude Code slash commands (/xsecurity, /xquality, /xrefactor, etc.)
+- Pre-commit hooks with multi-check orchestration
+- Custom checkpoint scripts with configurable check suites
+- CI/CD pipeline quality gates
+
+**Research Questions**:
+1. What's the optimal set of checks to run after each task?
+2. How do you balance thoroughness vs. developer velocity?
+3. Should checkpoints be blocking or advisory?
+4. How do you handle check failures mid-workflow?
+5. Can AI assistants auto-remediate checkpoint failures?
+
+**Next Steps**:
+- [ ] Define standard checkpoint check categories
+- [ ] Create configurable checkpoint profiles (quick, standard, thorough)
+- [ ] Implement checkpoint as Claude Code custom command
+- [ ] Measure impact on code quality metrics over time
+- [ ] Document checkpoint integration with CI/CD pipelines
+
+**Related Patterns**:
+- [Code Quality Prerequisites](../README.md#code-quality-prerequisites) - CI/CD quality enforcement
+- [Security Sandbox](../README.md#security-sandbox) - Running agents in isolated environments
+- [Agentic Loops](#agentic-loops) - Long autonomous coding sessions with self-correction
+- [Guided Refactoring](../README.md#guided-refactoring) - Code improvement checks
+
+**Anti-patterns to Avoid**:
+- Running all checks on every minor change (developer fatigue)
+- Checkpoint failures without actionable remediation guidance
+- Skipping checkpoints under time pressure (quality debt)
+- One-size-fits-all checks regardless of change scope
+
+---
+
 ## Notes Template
 
 When adding new pattern explorations, copy this template:
diff --git a/experiments/README.md b/experiments/README.md
index d3fcb6d..4d2b5b7 100644
--- a/experiments/README.md
+++ b/experiments/README.md
@@ -1474,52 +1474,72 @@ Simon's caveat: *"They can't prove something is impossible—just because the co
 ### Centralized Rules
 
 **Maturity**: Advanced
-**Description**: Enforce organization-wide AI rules through a central gateway service or shared SDK library rather than distributing configuration files to each repository.
+**Description**: Enforce organization-wide AI rules through a central Git repository that syncs to standard AI assistant configuration files (CLAUDE.md, AGENTS.md, .cursorrules) with automatic language and framework detection.
 
-**Related Patterns**: [Codified Rules](../README.md#codified-rules), [Policy Generation](../README.md#policy-generation), [Security Orchestration](../README.md#security-orchestration)
+**Related Patterns**: [Codified Rules](../README.md#codified-rules), [Progressive Disclosure](#progressive-disclosure), [Security Orchestration](../README.md#security-orchestration)
 
 #### Core Implementation
 
-Centralize AI rules in a three-layer architecture:
+**Sync-based Architecture** (Recommended):
 
-1. **Gateway service** - Internal service that owns all org rules, calls AI providers
-2. **Wrapper library** - Shared SDK package that embeds org rules in system prompts
-3. **CLI/editor layer** - Developer tools that call gateway or wrapper, never AI providers directly
-
-**Gateway pattern**:
 ```
-Developer tool → Internal gateway → AI provider
-                        ↓
-                 Org rules applied
-                 Input/output filtered
-                 Usage logged
+Central Rules Repository (Git)
+    ├── base/universal-rules.md
+    ├── languages/ (python.md, typescript.md, go.md)
+    └── frameworks/ (react.md, django.md, fastapi.md)
+              ↓
+      [sync-ai-rules.sh]
+              ↓
+      Project Repository
+    ├── CLAUDE.md (auto-generated)
+    ├── AGENTS.md (auto-generated)
+    └── .cursorrules (auto-generated)
 ```
 
-**Wrapper library pattern**:
-```
-Developer tool → @yourorg/ai-client → AI provider
-                        ↓
-                 Org rules in system prompt
-                 Consistent across all repos
+**How it works**:
+
+1. **Central repository** stores organization rules organized by language/framework
+2. **Sync script** detects project language (Python, TypeScript, Go) and framework (React, Django, FastAPI)
+3. **Auto-generates** standard config files (CLAUDE.md, .cursorrules, etc.) with relevant rules
+4. **Works offline** - no API calls, no internet dependencies after initial sync
+
+**Example sync**:
+```bash
+# One-time setup per project
+curl -O https://yourorg.com/sync-ai-rules.sh
+chmod +x sync-ai-rules.sh
+
+# Run sync (manual or via pre-commit hook)
+./sync-ai-rules.sh
+
+# Generates CLAUDE.md with:
+# - Universal org rules
+# - Python-specific rules (auto-detected from pyproject.toml)
+# - FastAPI rules (auto-detected from dependencies)
 ```
 
-**Governance capabilities**:
-- Input filters (block secrets, enforce read-only paths)
-- Output filters (scan for banned APIs, license violations)
-- Policy-as-code integration (OPA rules before/after AI calls)
-- Centralized audit logging (repo, task type, tokens, files touched)
+**Key benefits**:
+- ✅ **Works with existing AI tools** - Claude Code, Cursor, Gemini all read standard config files
+- ✅ **Offline-friendly** - No API gateway, no internet dependencies
+- ✅ **Simple** - Single bash script, no Node.js services to deploy
+- ✅ **Language-aware** - Auto-detects Python/TypeScript/Go and pulls relevant rules
+- ✅ **Version-controlled** - Rules in Git, changes are auditable
 
-**Benefits over distributed config**:
-- Change rules once, all tools updated
-- Enforceable guardrails (not just suggestions)
-- Aggregate metrics across teams
-- Model switching without repo changes
+**Alternative: Gateway Pattern** (for advanced use cases):
 
-Complete Example: See [examples/centralized-rules/](examples/centralized-rules/) for working gateway, wrapper library, and CLI implementations.
+For organizations needing input/output filtering, policy enforcement, or usage logging, see [examples/centralized-rules/gateway-strategy/](examples/centralized-rules/gateway-strategy/) for API gateway approach with:
+- Request/response filtering
+- Policy-as-code integration (OPA/Cedar)
+- Centralized audit logging
+- Usage metrics aggregation
+
+Complete Examples:
+- **[Sync Strategy](examples/centralized-rules/sync-strategy/)** - Simple Git-based sync (recommended)
+- **[Gateway Strategy](examples/centralized-rules/gateway-strategy/)** - Advanced API gateway pattern
 
 #### Anti-pattern: Scattered Configuration
 
-Copying AI rule files into every repository:
+Copying AI rule files into every repository without central source:
 
 ```
 repo-a/.cursorrules # v1.2 of org rules
@@ -1529,9 +1549,11 @@ repo-c/.ai/rules/ # Custom fork, diverged
 
 **Problems**:
 - Rules drift across repositories
-- No enforcement (developers can ignore or modify)
-- No visibility into AI usage patterns
-- Model/rule changes require updating every repo
+- Manual updates required for every repo
+- No consistency enforcement
+- Difficult to track which repos have current rules
+
+**Solution**: Use centralized sync approach where rules are maintained in one place and automatically distributed to projects.
 
 ---
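
The sync steps described in the README changes (detect the project language, then assemble the matching rules into a generated config file) could look roughly like the sketch below. It is an illustration only, not the contents of the real `sync-ai-rules.sh`: the function names are hypothetical, the `tsconfig.json` and `go.mod` markers are assumptions (the diff only mentions `pyproject.toml` for Python), and framework detection is left out.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the detect-and-generate steps; the real script may differ.

# Print the detected language for a project directory, or "unknown".
# Marker files other than pyproject.toml are assumptions for illustration.
detect_language() {
  local project_dir="$1"
  if [ -f "$project_dir/pyproject.toml" ]; then
    echo "python"
  elif [ -f "$project_dir/tsconfig.json" ]; then
    echo "typescript"
  elif [ -f "$project_dir/go.mod" ]; then
    echo "go"
  else
    echo "unknown"
  fi
}

# Write CLAUDE.md from the universal rules plus any language-specific rules.
# Expects the central repository layout shown in the diff (base/, languages/).
generate_claude_md() {
  local rules_dir="$1"
  local project_dir="$2"
  local language
  language="$(detect_language "$project_dir")"
  {
    echo "<!-- auto-generated; edit the central rules repository instead -->"
    cat "$rules_dir/base/universal-rules.md"
    # Append language rules only when a matching file exists.
    if [ -f "$rules_dir/languages/$language.md" ]; then
      cat "$rules_dir/languages/$language.md"
    fi
  } > "$project_dir/CLAUDE.md"
}
```

Calling `generate_claude_md /path/to/rules-checkout .` from a project root would regenerate `CLAUDE.md` in place; the other generated files (`AGENTS.md`, `.cursorrules`) could presumably be written the same way from the same inputs.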