AI Development Patterns

📋 How ready is your team for AI-assisted development? Take the free AI Development Readiness Scorecard to find out — and discover which patterns to adopt first.

A collection of patterns for building software with AI assistance, drawn from my own experience and organized by implementation maturity and development lifecycle phase. These patterns are subject to change as the field evolves.

graph TB
    %% ROW 1: Foundation start (left to right)
    RA([Readiness<br/>Assessment]) --> CR([Codified<br/>Rules])
    CR --> SS([Security<br/>Sandbox])
    SS --> DL([Developer<br/>Lifecycle])
    DL --> TI([Tool<br/>Integration])

    %% ROW 2: Operations & branches (loops back)
    SS --> SO([Security<br/>Orchestration])
    SS --> PG([Policy<br/>Generation])
    SO --> CZR([Centralized<br/>Rules])

    %% ROW 3: Development patterns (flows forward again)
    DL --> OD([Observable<br/>Development])
    DL --> SD([Spec-Driven<br/>Development])
    CR --> GR([Guided<br/>Refactoring])
    CR --> CP([Context<br/>Persistence])
    RA --> IG([Issue<br/>Generation])
    CR --> EA([Event<br/>Automation])
    SS --> EA
    EA --> CC([Custom<br/>Commands])
    SD --> CC
    CP --> PD([Progressive<br/>Disclosure])
    CR --> PD
    SD --> IS([Image<br/>Spec])
    PD --> CZR

    %% ROW 4: Development chain
    PE([Progressive<br/>Enhancement]) --> AD([Atomic<br/>Decomposition])
    AD --> PA([Parallel<br/>Agents])
    PE --> IS

    %% ROW 5: Additional development patterns
    PE --> AE([Adversarial<br/>Evaluator])
    DL --> ER([Error<br/>Resolution])
    OD --> ER
    TI --> ER
    EA --> AR([Autonomous<br/>Remediation])
    CR --> AR
    GR --> AR
    ER --> AR
    PI([Planned<br/>Implementation])

    %% STYLING
    classDef foundation fill:#a8d5ba,stroke:#2d5a3f,stroke-width:2px,color:#1a3a25
    classDef development fill:#f9e79f,stroke:#b7950b,stroke-width:2px,color:#7d6608
    classDef operations fill:#f5b7b1,stroke:#c0392b,stroke-width:2px,color:#78281f

    class RA,CR,SS,DL,TI,IG foundation
    class PE,SD,AD,PA,OD,GR,EA,CC,PD,IS,CP,AE,ER,PI,AR development
    class PG,SO,CZR operations

    %% CLICKABLE LINKS
    click RA "https://github.com/PaulDuvall/ai-development-patterns#readiness-assessment"
    click CR "https://github.com/PaulDuvall/ai-development-patterns#codified-rules"
    click SS "https://github.com/PaulDuvall/ai-development-patterns#security-sandbox"
    click DL "https://github.com/PaulDuvall/ai-development-patterns#developer-lifecycle"
    click TI "https://github.com/PaulDuvall/ai-development-patterns#tool-integration"
    click IG "https://github.com/PaulDuvall/ai-development-patterns#issue-generation"
    click CP "https://github.com/PaulDuvall/ai-development-patterns#context-persistence"
    click PE "https://github.com/PaulDuvall/ai-development-patterns#progressive-enhancement"
    click SD "https://github.com/PaulDuvall/ai-development-patterns#spec-driven-development"
    click AD "https://github.com/PaulDuvall/ai-development-patterns#atomic-decomposition"
    click PA "https://github.com/PaulDuvall/ai-development-patterns#parallel-agents"
    click OD "https://github.com/PaulDuvall/ai-development-patterns#observable-development"
    click GR "https://github.com/PaulDuvall/ai-development-patterns#guided-refactoring"
    click EA "https://github.com/PaulDuvall/ai-development-patterns#event-automation"
    click CC "https://github.com/PaulDuvall/ai-development-patterns#custom-commands"
    click PD "https://github.com/PaulDuvall/ai-development-patterns#progressive-disclosure"
    click IS "https://github.com/PaulDuvall/ai-development-patterns#image-spec"
    click PG "https://github.com/PaulDuvall/ai-development-patterns#policy-generation"
    click SO "https://github.com/PaulDuvall/ai-development-patterns#security-orchestration"
    click CZR "https://github.com/PaulDuvall/ai-development-patterns#centralized-rules"
    click PI "https://github.com/PaulDuvall/ai-development-patterns#planned-implementation"
    click AE "https://github.com/PaulDuvall/ai-development-patterns#adversarial-evaluator"
    click ER "https://github.com/PaulDuvall/ai-development-patterns#error-resolution"
    click AR "https://github.com/PaulDuvall/ai-development-patterns#autonomous-remediation"

Legend: 🟢 Foundation | 🟡 Development | 🔴 Operations

Pattern Organization

This repository provides a structured approach to AI-assisted development through three core pattern categories, plus a set of experimental patterns:

Foundation Patterns - Essential patterns for team readiness and basic AI integration
Development Patterns - Daily practice patterns for AI-assisted coding workflows
Operations Patterns - CI/CD, security, and production management with AI
Experimental Patterns - Advanced and experimental patterns under active development and/or consideration.

Harness Engineering Lens

Birgitta Böckeler's framing — Agent = Model + Harness — describes the controls built around a coding agent (separate from the model) to make its output trustworthy. This is not a pattern to adopt; it is a lens that explains why the patterns below work together. Every control is one of two kinds and runs in one of two ways:

Feedforward (guides) steer the agent before it acts.
Feedback (sensors) observe after it acts so it can self-correct.
Computational controls are deterministic, fast, and reliable — linters, type checks, tests, fitness functions.
Inferential controls are semantic, slower, and probabilistic — AI review, LLM-as-judge.

A healthy harness balances all four: feedforward-only agents never learn whether the rules worked; feedback-only agents repeat the same mistakes. The catalog maps onto the lens as follows.

Pattern	Direction	Execution	Regulates
Codified Rules / Centralized Rules	Feedforward	—	Conventions
Spec-Driven Development	Feedforward	—	Behaviour
Planned Implementation	Feedforward	—	Approach
Custom Commands	Feedforward	—	Workflow
Observable Development	Feedforward + Feedback	Computational + Inferential	Architecture fitness
Guided Refactoring	Feedback	Computational + Inferential	Maintainability
Adversarial Evaluator	Feedback	Inferential	Behaviour
Error Resolution	Feedback	Computational	Runtime
Autonomous Remediation	Feedback	Computational + Inferential	Runtime

Two principles from the source are worth stating directly:

Keep Quality Left — run cheap, fast controls early (linters, basic review pre-commit) and reserve expensive ones (mutation testing, deep AI review) for later, so issues are caught where they cost least.
Steer, don't automate — when the agent repeats a mistake, improve the harness (the guides and sensors), not just the prompt. The human's job is to iterate on the harness.

Source: Birgitta Böckeler, "Harness Engineering", martinfowler.com.
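
As a rough illustration of the Keep Quality Left principle above, the sketch below runs controls in order of cost and stops at the first failure. It is not taken from the source article: the `Control` class, the relative cost numbers, and the lambda checks are hypothetical placeholders standing in for real linters, test runners, and AI reviewers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Control:
    name: str
    kind: str                  # "computational" or "inferential"
    cost: int                  # relative cost; lower runs earlier (assumed scale)
    check: Callable[[], bool]  # returns True when the control passes


def run_controls(controls: list[Control]) -> list[tuple[str, bool]]:
    """Run cheap controls first and stop at the first failure."""
    results = []
    for control in sorted(controls, key=lambda c: c.cost):
        passed = control.check()
        results.append((control.name, passed))
        if not passed:
            break  # expensive controls never run on a change that failed cheaply
    return results


# Hypothetical controls; real checks would invoke a linter, a test suite, and so on.
controls = [
    Control("deep AI review", "inferential", cost=100, check=lambda: True),
    Control("linter", "computational", cost=1, check=lambda: True),
    Control("unit tests", "computational", cost=10, check=lambda: False),
]
results = run_controls(controls)
```

Here the failing unit tests stop the run before the expensive inferential review is ever invoked, which is the cost saving the principle is after.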

Pattern Dependencies & Implementation Order

Important: These phases represent a learning progression for teams new to AI development, not a waterfall approach. Teams with existing DevOps/security expertise should implement patterns continuously across all phases from day one, following a "continuous everything" model.

graph TD
    subgraph "Phase 1: Foundation (Weeks 1-2)"
        A[Readiness Assessment] --> B[Codified Rules]
        B --> C[Security Sandbox]
        C --> D[Developer Lifecycle]
        A --> E[Issue Generation]
        D --> F[Tool Integration]
    end

    subgraph "Phase 2: Development (Weeks 3-4)"
        D --> G[Spec-Driven Development]
        H[Planned Implementation]
        I[Progressive Enhancement]
        I --> J[Adversarial Evaluator]
        Q[Event Automation]
        R[Custom Commands]
        S[Progressive Disclosure]
        U[Image Spec]
        V[Autonomous Remediation]
        I --> K[Atomic Decomposition]
        K --> L[Parallel Agents]
    end

    subgraph "Phase 3: Operations (Weeks 5-6)"
        C --> M[Policy Generation]
        M --> N[Security Orchestration]
        N --> T[Centralized Rules]
    end

    B --> Q
    C --> Q
    Q --> R
    G --> R
    B --> S
    R --> S
    B --> T
    S --> T
    G --> U
    I --> U
    Q --> V
    B --> V

Continuous Implementation Note: Security patterns (Security Sandbox, AI Security & Compliance) and deployment patterns should be implemented continuously throughout development, not delayed until specific phases. The dependencies shown represent learning prerequisites, not deployment gates.

Complete Pattern Reference

Pattern	Maturity	Type	Description	Dependencies
Readiness Assessment	Beginner	Foundation	Systematic evaluation of codebase and team readiness for AI integration	None
Codified Rules	Beginner	Foundation	Version and maintain AI coding standards as explicit configuration files	Readiness Assessment
Security Sandbox	Beginner	Foundation	Run AI tools in isolated environments without access to secrets or sensitive data	Codified Rules
Developer Lifecycle	Intermediate	Workflow	Structured 9-stage process from problem definition through deployment with AI assistance	Codified Rules, Security Sandbox
Tool Integration	Intermediate	Foundation	Connect AI systems to external data sources, APIs, and tools for enhanced capabilities beyond prompt-only interactions	Security Sandbox, Developer Lifecycle
Issue Generation	Intermediate	Foundation	Generate Kanban-optimized work items (4-8 hours max) from requirements using AI to ensure continuous flow with clear acceptance criteria and dependencies	Readiness Assessment
Spec-Driven Development	Intermediate	Development	Use executable specifications to guide AI code generation with clear acceptance criteria before implementation	Developer Lifecycle
Image Spec	Intermediate	Development	Upload images (diagrams, mockups, flows) as primary specifications for AI coding tools to build accurate implementations from visual context	Spec-Driven Development, Progressive Enhancement
Planned Implementation	Beginner	Development	Interview, constrain, and plan before writing code so AI implementation matches actual requirements instead of confident-sounding assumptions	None
Progressive Enhancement	Beginner	Development	Build complex features through small, deployable iterations rather than big-bang generation	None
Adversarial Evaluator	Intermediate	Development	Separate the generating agent from an independent judging agent (ideally a different model) and use adversarial pressure or cross-model divergence as eval signal for high-stakes decisions	Progressive Enhancement
Atomic Decomposition	Intermediate	Development	Break complex features into atomic, independently implementable tasks for parallel AI agent execution	Progressive Enhancement
Parallel Agents	Advanced	Development	Run multiple AI agents concurrently on isolated tasks or environments to maximize development speed and exploration	Atomic Decomposition
Context Persistence	Intermediate	Development	Manage AI context as a finite resource through structured memory schemas, prompt pattern capture, and session continuity protocols	Codified Rules
Event Automation	Intermediate	Development	Execute custom commands automatically at assistant lifecycle events to enforce policies and automate workflows	Codified Rules, Security Sandbox
Custom Commands	Intermediate	Development	Discover and use built-in command vocabularies, then extend them with custom commands that encode domain expertise and sophisticated workflows	Event Automation, Spec-Driven Development, Codified Rules
Progressive Disclosure	Intermediate	Development	Load AI assistant rules incrementally based on task context to prevent instruction saturation and context bloat	Codified Rules, Context Persistence
Observable Development	Intermediate	Development	Logging and tracing as a bidirectional control: feeds forward to steer the agent, feeds back as a sensor it reads to self-correct	Developer Lifecycle
Guided Refactoring	Intermediate	Development	Systematic code improvement using AI to detect and resolve code smells with measurable quality metrics	Codified Rules
Error Resolution	Intermediate	Development	Automatically collect error context from logs, system state, and git history, then use AI to diagnose root causes and generate validated fixes	Developer Lifecycle, Observable Development, Tool Integration
Autonomous Remediation	Intermediate	Development	Pair deterministic rule-based detectors with LLM remediators inside an event-driven loop so codified rule violations are caught and fixed automatically before the AI session continues	Codified Rules, Event Automation
Security & Compliance		Operations	Category containing security and compliance patterns
Policy Generation	Advanced	Operations	Transform compliance requirements into executable Cedar/OPA policy files with AI assistance	Security Sandbox
Security Orchestration	Intermediate	Workflow	Aggregate multiple security tools and use AI to summarize findings for actionable insights	Security Sandbox
Centralized Rules	Advanced	Operations	Enforce organization-wide AI rules through a central Git repository that syncs to standard AI assistant configuration files with automatic language and framework detection	Codified Rules, Progressive Disclosure, Security Orchestration
Deployment Automation		Operations	Category containing deployment and pipeline patterns
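
The Dependencies column can be read as a directed graph, and one way to derive a valid adoption order from it is a topological sort. The sketch below is only an illustration: it copies a subset of the rows above into a dictionary and uses the standard-library `graphlib` module (Python 3.9+). The names `DEPENDENCIES` and `adoption_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# Subset of the table above, written as pattern -> prerequisites.
DEPENDENCIES = {
    "Readiness Assessment": set(),
    "Codified Rules": {"Readiness Assessment"},
    "Security Sandbox": {"Codified Rules"},
    "Developer Lifecycle": {"Codified Rules", "Security Sandbox"},
    "Tool Integration": {"Security Sandbox", "Developer Lifecycle"},
    "Event Automation": {"Codified Rules", "Security Sandbox"},
    "Autonomous Remediation": {"Codified Rules", "Event Automation"},
}


def adoption_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every pattern follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


order = adoption_order(DEPENDENCIES)
```

Several valid orders usually exist; the sort only guarantees that prerequisites come first, which matches the note that dependencies are learning prerequisites rather than deployment gates.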

Pattern Maturity Levels

Patterns are classified by implementation complexity and prerequisite knowledge:

Beginner: Basic AI tool usage with minimal setup required

Prerequisites: Basic programming skills, access to AI tools
Complexity: Single tool usage, straightforward prompts
Examples: Simple code generation, basic constraint setting

Intermediate: Multi-tool coordination and process integration

Prerequisites: Development workflow experience, team coordination
Complexity: Multiple tools, orchestration patterns, quality gates
Examples: Testing strategies, parallel workflows, choice generation

Advanced: Complex systems with enterprise concerns

Prerequisites: Architecture experience, security/compliance knowledge
Complexity: Multi-agent systems, advanced safety, compliance automation
Examples: Enterprise security, compliance automation, chaos engineering

Task Sizing Framework

The patterns use different task sizing approaches based on their purpose and context:

graph TD
    A[Feature Request] --> B[Issue Generation]
    B --> C[4-8 Hour Work Items]
    C --> D{Parallel Implementation?}
    D -->|Yes| E[Atomic Decomposition]
    D -->|No| F[Progressive Enhancement]
    E --> G[1-2 Hour Atomic Tasks]
    F --> H[Daily Deployment Cycles]

    G --> I[Parallel Agent Execution]
    H --> J[Sequential Enhancement]
    C --> K[Standard Kanban Flow]

Task Sizing Hierarchy:

Issue Generation (4-8 hours): Standard Kanban work items for continuous flow and rapid feedback
Atomic Decomposition (1-2 hours): Ultra-small tasks for parallel agent execution without conflicts
Progressive Enhancement (Daily cycles): Deployment-focused iterations that may contain multiple work items

When to Use Each Approach:

Use Issue Generation for standard team development with human developers
Use Atomic Decomposition when implementing with parallel AI agents
Use Progressive Enhancement when prioritizing rapid market feedback over task granularity

Pattern Differentiation:

Issue Generation: Creates Kanban work items (4-8 hours) for human team workflows
Atomic Decomposition: Creates ultra-small tasks (1-2 hours) for parallel AI agents
Progressive Enhancement: Creates deployment cycles (daily) focused on user feedback
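
The selection guidance above can be sketched as a small function. This is a minimal illustration, not a prescribed algorithm: the function name, the two boolean inputs, and the precedence when both are true (parallel agents win) are assumptions made for the example.

```python
def choose_sizing(parallel_agents: bool, rapid_feedback: bool) -> tuple[str, str]:
    """Map team context to a sizing pattern and its target size."""
    if parallel_agents:
        # Parallel AI agents need conflict-free, ultra-small tasks.
        return ("Atomic Decomposition", "1-2 hours")
    if rapid_feedback:
        # Deployment-focused iterations that may bundle several work items.
        return ("Progressive Enhancement", "daily cycles")
    # Default: standard Kanban work items for human team workflows.
    return ("Issue Generation", "4-8 hours")


choice = choose_sizing(parallel_agents=True, rapid_feedback=False)
```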

Pattern Selection Decision Framework

Choose the right patterns based on your team's context, project requirements, and AI development maturity:

Decision Tree

graph TD
    A[Starting AI Development] --> B{Team AI Experience?}
    B -->|New to AI| C[Start with Foundation Patterns]
    B -->|Some Experience| D[Focus on Development Patterns]
    B -->|Advanced| E[Implement Operations Patterns]

    C --> F[Readiness Assessment]
    F --> G[Codified Rules]
    G --> H[Security Sandbox]
    H --> I{Need Structured Development?}
    I -->|Yes| J[Developer Lifecycle]
    I -->|No| K[Planned Implementation]
    K --> L[Progressive Enhancement]

    D --> M{Multiple Developers/Agents?}
    M -->|Yes| N[Parallel Agents]
    M -->|No| O[Spec-Driven Development]
    N --> P[Atomic Decomposition]

    E --> R{Enterprise Requirements?}
    R -->|Compliance| S[Policy Generation]
    R -->|Scale| T[Centralized Rules]
    R -->|Quality| U[Debt Forecasting]

Context-Based Pattern Selection

For New Teams (First 2 weeks):

Readiness Assessment - Evaluate current state
Codified Rules - Establish consistent standards
Security Sandbox - Ensure safe experimentation
Planned Implementation - Learn structured planning approaches
Progressive Enhancement - Start with simple iterations

For Development Teams (Weeks 3-8):

Developer Lifecycle - Structured development process
Spec-Driven Development - Quality-focused development
Issue Generation - Organized work breakdown
Testing Orchestration - Quality assurance

For Parallel Implementation:

Atomic Decomposition - Ultra-small independent tasks
Workflow Orchestration - Agent coordination
Review Automation - Automated integration
Security Sandbox - Enhanced with parallel safety

For Enterprise/Production (Month 2+):

Policy Generation - Compliance automation
Security Orchestration - Integrated security
Centralized Rules - Organization-wide AI standards
Debt Forecasting - Proactive maintenance

Project Type Recommendations

MVP/Startup Projects:

Primary: Progressive Enhancement, Planned Implementation
Secondary: Security Sandbox, Adversarial Evaluator
Avoid: Complex orchestration patterns until scale demands

Enterprise Applications:

Primary: Developer Lifecycle, Policy Generation
Secondary: Spec-Driven Development, Security Orchestration
Essential: All foundation patterns before development patterns

Research/Experimental Projects:

Primary: Adversarial Evaluator, Observable Development
Secondary: Context Persistence, Context Optimization
Focus: Learning and exploration over production readiness

High-Scale Production:

Primary: Parallel Agents, Observable Development
Secondary: Chaos Engineering, Incident Automation
Critical: All security and monitoring patterns

Team Size Considerations

Solo Teams:

Focus on Progressive Enhancement and Adversarial Evaluator
Add Observable Development for debugging
Skip parallel orchestration patterns

Two-Pizza Teams (small, autonomous teams):

Implement Issue Generation for coordination
Use Spec-Driven Development for quality gates and traceability
Consider Tool Integration for role clarity
Full Developer Lifecycle implementation
Parallel Agents for complex features

Multi Two-Pizza Team Organizations:

Atomic Decomposition for parallel work across teams
Spec-Driven Development for coordination at scale via shared specifications
All Operations Patterns for organizational management

Technology Stack Considerations

Cloud-Native Applications:

Emphasize Policy Generation and Evidence Automation
Implement Drift Remediation for infrastructure
Use Deployment Synthesis for safe releases

On-Premise Systems:

Focus on Security Sandbox with network isolation
Implement Context Persistence for institutional knowledge
Use Debt Forecasting for maintenance planning

Microservices Architecture:

Parallel Agents for service coordination
Observable Development across service boundaries
Autonomous Remediation for cross-service code-quality consistency

Monolithic Applications:

Progressive Enhancement for gradual modernization
Guided Refactoring for code quality improvement
Planned Implementation to prevent over-engineering through its constraint phase

Foundation Patterns

Foundation patterns establish the essential infrastructure and team readiness required for successful AI-assisted development. These patterns must be implemented first as they enable all subsequent patterns.

Readiness Assessment

Maturity: Beginner
Description: Systematic evaluation of codebase and team readiness for AI-assisted development before implementing AI patterns.

Related Patterns: Codified Rules, Issue Generation

📋 Quick start: Use the free AI Development Readiness Scorecard to score your team against this framework in about 10 minutes and get a tailored pattern adoption sequence.

Assessment Framework

graph TD
    A[Codebase Assessment] --> B[Team Assessment]
    B --> C[Infrastructure Assessment]
    C --> D[Readiness Score]
    D --> E[Implementation Plan]

Codebase Readiness Checklist

## Code Quality Prerequisites
□ Consistent code formatting and style guide
□ Comprehensive test coverage (>80% for critical paths)
□ Clear separation of concerns and modular architecture
□ Documented APIs and interfaces
□ Version-controlled configuration and secrets management

## Documentation Standards
□ README with setup and development instructions
□ API documentation (OpenAPI/Swagger)
□ Architecture decision records (ADRs)
□ Coding standards and conventions documented
□ Deployment and operational procedures
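
One possible way to turn the checklist into the Readiness Score step of the framework is to count completed items. The sketch below assumes equal weighting for every item, which is an assumption of this example rather than a scoring rule defined by the pattern. The item labels are shortened forms of the checklist entries, and `CHECKLIST` and `readiness_score` are hypothetical names.

```python
# Shortened labels for the checklist items above, grouped by section.
CHECKLIST = {
    "Code Quality Prerequisites": [
        "formatting", "test coverage", "modular architecture",
        "documented APIs", "config and secrets management",
    ],
    "Documentation Standards": [
        "README", "API documentation", "ADRs",
        "coding standards", "operational procedures",
    ],
}


def readiness_score(completed: set[str]) -> float:
    """Return the percentage of checklist items marked complete (equal weights)."""
    items = [item for section in CHECKLIST.values() for item in section]
    done = sum(item in completed for item in items)
    return 100 * done / len(items)


score = readiness_score({"formatting", "README", "ADRs", "test coverage", "documented APIs"})
```

A team could weight critical items such as secrets management more heavily; equal weights simply keep the sketch short.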

Anti-pattern: Premature Adoption

Starting AI adoption without proper assessment leads to inconsistent practices, security vulnerabilities, and team frustration.

Codified Rules

Maturity: Beginner
Description: Version and maintain AI coding standards as explicit configuration files that persist across sessions and team members.

Related Patterns: Developer Lifecycle, Context Persistence, Progressive Disclosure, Event Automation, Custom Commands, Centralized Rules, Harness Engineering Lens

Standardized Project Structure

project/
├── .ai/                          # AI configuration directory
│   ├── rules/                    # Modular rule sets
│   │   ├── security.md          # Security standards
│   │   ├── testing.md           # Testing requirements
│   │   ├── style.md             # Code style guide
│   │   └── architecture.md      # Architectural patterns
│   ├── prompts/                 # Reusable prompt templates
│   │   ├── implementation.md    # Implementation prompts
│   │   ├── review.md            # Code review prompts
│   │   └── testing.md           # Test generation prompts
│   └── knowledge/               # Captured patterns and gotchas
│       ├── successful.md        # Proven successful patterns
│       └── failures.md          # Known failure patterns
├── .cursorrules                 # Cursor IDE configuration
├── CLAUDE.md                    # Claude Code session context
└── .windsurf/                   # Windsurf configuration
    └── rules.md
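
To show how the modular rule files in this layout might be consumed, the sketch below concatenates everything under `.ai/rules/` into a single block of text that could be prepended to a prompt or written into a tool-specific file. The function name `load_rules` and the heading format are assumptions for illustration; they are not part of any AI tool's API.

```python
from pathlib import Path


def load_rules(project_root: Path) -> str:
    """Concatenate every Markdown rule file under .ai/rules/ in a stable order."""
    rules_dir = project_root / ".ai" / "rules"
    sections = []
    for rule_file in sorted(rules_dir.glob("*.md")):
        body = rule_file.read_text(encoding="utf-8").strip()
        # Use the file name (without extension) as the section heading.
        sections.append(f"## {rule_file.stem}\n{body}")
    return "\n\n".join(sections)
```

Sorting the file names keeps the output deterministic, so the generated context only changes when a rule file changes and diffs stay reviewable.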

Complete Example: See examples/codified-rules/ for:

Comprehensive development workflow rules and standards
Pipeline automation and CI/CD rules
Code quality standards and enforcement guidelines
Claude Code configuration for rules-as-code implementation

Anti-pattern: Broken Context

Each developer maintains their own prompts and preferences, leading to inconsistent code across the team.

Security Sandbox

Maturity: Beginner
Description: Run AI tools in isolated environments without access to secrets or sensitive data to prevent credential leaks and maintain security compliance.

Related Patterns: Security & Compliance Patterns, Codified Rules, Event Automation

Core Security Implementation

Claude Code Users: Use the /sandbox command to instantly create isolated environments without manual Docker configuration:

/sandbox
# Creates a secure, isolated environment with:
# - No access to credentials or sensitive files
# - Restricted network access
# - Controlled file system permissions

Docker-Based Implementation: For custom isolation or multi-agent scenarios:

# Basic AI isolation with complete network isolation
services:
  ai-development:
    network_mode: none                    # Zero network access
    cap_drop: [ALL]                       # No system privileges
    volumes:
      - ./src:/workspace/src:ro           # Read-only source code
      # DO NOT mount ~/.aws, .env, secrets/, etc.
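
A configuration like the one above can be checked automatically before an agent starts. The sketch below assumes the service definition has already been parsed into a Python dictionary (the YAML parsing step is omitted). The function name and the list of sensitive path markers are hypothetical and should be adapted to your environment.

```python
# Path components treated as sensitive (an assumed, non-exhaustive list).
SENSITIVE_MARKERS = (".aws", ".ssh", ".env", "secrets")


def sandbox_violations(service: dict) -> list[str]:
    """Return a list of isolation problems found in one compose service."""
    problems = []
    if service.get("network_mode") != "none":
        problems.append("network access is not disabled")
    if "ALL" not in service.get("cap_drop", []):
        problems.append("capabilities are not fully dropped")
    for volume in service.get("volumes", []):
        host_path = volume.split(":")[0]  # "host:container[:mode]" short syntax
        parts = host_path.replace("\\", "/").split("/")
        if any(part in SENSITIVE_MARKERS for part in parts):
            problems.append(f"sensitive path mounted: {host_path}")
    return problems


safe = {"network_mode": "none", "cap_drop": ["ALL"], "volumes": ["./src:/workspace/src:ro"]}
risky = {"network_mode": "bridge", "volumes": ["~/.aws:/root/.aws:ro"]}
```

Running a check like this in CI or a pre-start hook turns the "DO NOT mount" comment into an enforced rule rather than a convention.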

Complete Example: See examples/security-sandbox/ for:

Complete Docker isolation configurations for single and multi-agent setups
Resource locking and emergency shutdown procedures
Security monitoring and violation detection
Multi-agent coordination with conflict resolution

Production Implementations

Modern AI development platforms provide enterprise-grade implementations of these security controls:

Cloud-Based Sandboxes:

Claude Code for the web: Sandboxed AI coding with isolated execution environments
Google Jules: Google's AI coding assistant with secure development environments
OpenAI Codex: Cloud-based AI coding with secure execution environments
Google Vertex AI Agent Engine Code Execution: Managed secure runtimes for AI agent code execution
GitHub Codespaces: Isolated cloud development VMs with configurable security policies
E2B: Specialized AI agent sandboxes with microVM isolation

Cloud & Self-Hosted Options:

Daytona: microVM-based isolation for development environments (available as cloud service or self-hosted)
Coder: Cloud development environments with enterprise security controls (available as cloud service or self-hosted)

Anti-pattern: Unrestricted Access

Allowing AI tools full system access risks credential leaks, data breaches, and security compliance violations.

Anti-pattern: Conflicting Workspaces

Allowing multiple parallel agents to write to the same directories creates race conditions, file conflicts, and unpredictable behavior that can corrupt the development environment.

Developer Lifecycle

Maturity: Intermediate
Description: Structured 9-stage process from problem definition through deployment with AI assistance.

Related Patterns: Codified Rules, Spec-Driven Development, Planned Implementation, Atomic Decomposition, Observable Development

Workflow Interaction Sequence

sequenceDiagram
    participant D as Developer
    participant AI as AI Assistant
    participant S as System/CI
    participant T as Tests
    participant M as Monitoring
    
    Note over D,M: Stage 1-3: Problem → Plan → Requirements
    D->>AI: Problem Definition (e.g., JWT Authentication)
    AI->>D: Technical Architecture Plan
    D->>AI: Requirements Clarification
    AI->>D: API Specs + Kanban Tasks + Security Requirements
    
    Note over D,M: Stage 4-5: Issues → Specifications
    D->>AI: Generate Executable Tests
    AI->>T: Gherkin Scenarios + API Tests + Security Tests
    T->>D: Test Suite Ready (Performance Criteria: <200ms)
    
    Note over D,M: Stage 6: Implementation
    D->>AI: Implement Following Specifications
    AI->>S: Code + Tests + Error Handling + Logging
    S->>D: Implementation Results
    
    Note over D,M: Stage 7-9: Testing → Deployment → Monitoring
    D->>S: Run All Tests
    S->>D: Test Results + Security Scan + Performance Benchmark
    alt Tests Pass
        S->>S: Deploy to Production
        S->>M: Setup Monitoring Alerts
        M->>D: Deployment Complete + Monitoring Active
    else Tests Fail
        S->>D: Failure Report
        D->>AI: Fix Issues
        AI->>S: Updated Implementation
    end
    
    Note over D,M: Continuous Monitoring
    M->>D: Performance Alerts + Security Events

Core Workflow Implementation

# Stage 1-3: Problem → Plan → Requirements
ai "Analyze request → Generate architecture, tasks, API specs"

# Stage 4-5: Issues → Specifications  
ai "Generate executable tests → Gherkin scenarios, API tests, security tests"

# Stage 6: Implementation
ai "Implement following specifications → Use tests as guide, security best practices"

# Stage 7-9: Testing → Deployment → Monitoring  
ai "Complete QA → Run tests, security scan, deploy, monitor"

Complete Implementation: See examples/developer-lifecycle/ for full 9-stage workflow scripts, detailed prompts for each stage, enhanced implementation techniques (Five-Try Rule, markdown iteration, function decomposition), and integration with CI/CD pipelines.

Anti-pattern: Unplanned Development

Jumping straight to coding with AI without proper planning, requirements, or testing strategy. Also avoid continuing with the same AI approach after 3-4 failures without decomposing the problem or changing strategy.

Tool Integration

Maturity: Intermediate
Description: Connect AI systems to external data sources, APIs, and tools for enhanced capabilities beyond prompt-only interactions.

Related Patterns: Security Sandbox, Developer Lifecycle, Observable Development

Core Concept

Modern AI development requires more than chat-based interactions. AI systems become significantly more capable when connected to real-world data sources and tools. This pattern demonstrates the architectural shift from isolated prompt-only AI to tool-augmented AI systems.

Implementation Overview

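# Core tool-augmented AI system with security controls (skeleton)
from typing import Any, Callable


class ToolAugmentedAI:
    def __init__(self, config_path: str = ".ai/tools.json"):
        self.config_path = config_path  # Location of the security configuration
        self.available_tools: dict[str, Callable[..., Any]] = {
            "database_query": self._query_database,     # Read-only SQL queries
            "file_operations": self._file_operations,   # Controlled file access
            "api_requests": self._api_requests,         # Allowlisted HTTP requests
            "system_info": self._system_info            # Safe system information
        }

    def execute_with_tools(self, ai_request: str, tool_calls: list) -> dict:
        """Execute AI request with secure tool access"""
        # Process tool calls with security validation
        # Return structured results with error handling
        raise NotImplementedError("See examples/tool-integration/ for the full implementation")

    def _stub(self, *args: Any, **kwargs: Any) -> Any:
        """Placeholder handler so the skeleton can be instantiated"""
        raise NotImplementedError

    # Each tool handler is a placeholder in this overview
    _query_database = _file_operations = _api_requests = _system_info = _stub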

Tool Categories & Security

Database Access: Read-only queries with operation allowlisting (SELECT, WITH only)
File Operations: Path-restricted read/write within configured directories
API Integration: HTTP requests limited to allowlisted domains with timeouts
System Information: Safe environment data without sensitive details
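
The database and API controls listed above could be approximated with two small validators. This is a hedged sketch rather than the repository's implementation: the function names are hypothetical, requiring `https` is an added assumption, and the SQL check is a coarse first filter only. Some databases allow data-modifying statements inside a `WITH` clause, so it should be paired with read-only database credentials.

```python
import re
from urllib.parse import urlparse

ALLOWED_SQL_PREFIXES = ("SELECT", "WITH")


def is_read_only_query(sql: str) -> bool:
    """Coarse check: a single statement that starts with SELECT or WITH."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # reject stacked statements such as "SELECT 1; DROP ..."
        return False
    first_word = re.match(r"[A-Za-z]+", statement)
    return bool(first_word) and first_word.group(0).upper() in ALLOWED_SQL_PREFIXES


def is_allowed_url(url: str, allowed_hosts: set[str]) -> bool:
    """Allow only https requests whose host is exactly on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in allowed_hosts
```

Matching on the parsed hostname, rather than searching the URL string, prevents an allowlisted domain that appears in a path or query string from slipping through.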

Configuration Example

{
  "allowed_apis": ["api.github.com", "api.openweathermap.org"],
  "file_access_paths": ["./data/", "./logs/", "./generated/"],
  "max_query_results": 100,
  "security": {
    "read_only_database": true,
    "api_rate_limits": true,
    "file_size_limits": "10MB"
  }
}
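
As an illustration of how `file_access_paths` might be enforced, the sketch below resolves a requested path and checks that it falls inside one of the configured directories. The function name, the `base` parameter, and the `/srv/project` root are assumptions for the example. `Path.is_relative_to` requires Python 3.9 or later.

```python
import json
from pathlib import Path

# Subset of the configuration shown above.
CONFIG_TEXT = '{"file_access_paths": ["./data/", "./logs/", "./generated/"]}'


def is_path_allowed(candidate: str, allowed_dirs: list[str], base: Path) -> bool:
    """Return True if candidate resolves to a location inside an allowed directory."""
    target = (base / candidate).resolve()  # collapses ".." before comparing
    return any(target.is_relative_to((base / d).resolve()) for d in allowed_dirs)


config = json.loads(CONFIG_TEXT)
allowed = config["file_access_paths"]
base = Path("/srv/project")  # hypothetical project root
```

Resolving before comparing is what defeats `data/../.env` style traversal, and comparing path components rather than string prefixes keeps a sibling directory such as `database/` from matching `data/`.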

Model Context Protocol (MCP) Integration

This pattern can be implemented using Anthropic's Model Context Protocol (MCP) for standardized tool integration across AI systems:

{
  "mcp_servers": {
    "filesystem": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-filesystem", "./data"]
    },
    "sqlite": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-sqlite", "app_data.db"]
    }
  }
}

What Tool Integration Enables

Real-time data access: AI queries current database state, not training data
File system interaction: Read logs, write generated code, manage project files
API integration: Fetch live data from external services and APIs
System awareness: Access to current environment state and configuration
Enhanced context: AI decisions based on actual system state, not assumptions

Complete Implementation

See examples/tool-integration/ for:

Full Python implementation with security controls
Configuration examples and MCP integration
Usage patterns and deployment guidelines
Integration with Security Sandbox

Anti-pattern: Disconnected Prompting

Attempting to solve complex data analysis, system integration, or real-time problems using only natural language prompts without providing AI access to actual data sources, APIs, or system tools. This leads to hallucinated responses, outdated information, and inability to interact with real systems.

Issue Generation

Maturity: Intermediate
Description: Generate small, deployable work items (<1 hour with AI assistance) from requirements using AI to ensure continuous delivery with clear acceptance criteria and dependency tracking.

Methodology Note: This pattern aligns well with Kanban principles (continuous flow, small batches) but works with any development methodology including Scrum, Scrumban, or ad-hoc workflows.

Related Patterns: Readiness Assessment, Spec-Driven Development

Issue Generation Framework

graph TD
    A[Requirements Document] --> B[AI Feature Analysis]
    B --> C[Work Item Splitting]
    C --> D{<1 hour?}
    D -->|No| E[Split Further]
    E --> C
    D -->|Yes| F[Story Generation]
    F --> G[Acceptance Criteria]
    G --> H[Cycle Time Target]
    H --> I[Dependency Mapping]
    I --> J[Work Item Creation]

Core Principles

Small Batch Sizing: Each work item sized for under 1 hour with AI assistance to enable continuous delivery and rapid feedback
AI-Assisted Decomposition: Use AI to break down requirements into implementable tasks
Traceability Integration: Connect issues to implementation files and CI workflows
Dependency Mapping: Establish clear relationships between work items and epics
Acceptance-Driven: Each task includes specific, testable acceptance criteria

Work Item Attributes

Generated issues must include:

Title: Specific, actionable description of the work
Cycle Time Target: Estimated completion time (<1 hour with AI assistance)
Acceptance Criteria: Testable conditions for completion
File Scope: Which files will be added, updated, or removed
CI Requirements: Test coverage, pipeline steps, quality gates
Dependencies: Blocking and enabling relationships with other issues
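
The attribute list above can be enforced mechanically before an issue is created. The following is a minimal sketch, not the repository's implementation: `WorkItem`, `validation_problems`, the 45-minute estimate, and the sample acceptance criterion are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field

MAX_MINUTES = 60  # the "<1 hour with AI assistance" target from this pattern


@dataclass
class WorkItem:
    title: str
    cycle_time_minutes: int
    acceptance_criteria: list = field(default_factory=list)
    file_scope: list = field(default_factory=list)
    ci_requirements: list = field(default_factory=list)
    dependencies: list = field(default_factory=list)  # ids of blocking issues; may be empty


def validation_problems(item):
    """Return human-readable problems; an empty list means the item is ready."""
    problems = []
    if not item.title.strip():
        problems.append("missing title")
    if item.cycle_time_minutes >= MAX_MINUTES:
        problems.append("cycle time target is not under 1 hour; split further")
    if not item.acceptance_criteria:
        problems.append("no acceptance criteria")
    if not item.file_scope:
        problems.append("no file scope")
    if not item.ci_requirements:
        problems.append("no CI requirements")
    return problems


# Illustrative data: one well-formed item and one vague item.
good = WorkItem(
    title="AUTH-002: Password validation service with CI integration",
    cycle_time_minutes=45,  # assumed estimate
    acceptance_criteria=["Rejects passwords shorter than the configured minimum"],
    file_scope=["src/auth/validators.py", "tests/test_validators.py"],
    ci_requirements=["lint", "test", "security-scan"],
)
vague = WorkItem(title="Fix the login page", cycle_time_minutes=240)
```

Calling `validation_problems(vague)` lists every missing attribute at once, which makes the check usable as a gate in whatever tool creates the issues.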

Epic Relationship Management

Bidirectional Linking: Parent-child references maintained automatically
Progress Tracking: Epic completion calculated from subissue status
Dependency Validation: Automated checking for circular dependencies
Status Propagation: Subissue changes update epic progress
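
Two of the behaviors above, progress tracking and circular-dependency validation, are small graph computations. The sketch below assumes subissue status is available as a plain mapping and dependencies as an adjacency mapping; `epic_progress` and `find_cycle` are hypothetical helper names, not part of any issue tracker's API.

```python
def epic_progress(subissues):
    """Fraction of subissues marked done (0.0 for an epic with no subissues)."""
    if not subissues:
        return 0.0
    done = sum(1 for status in subissues.values() if status == "done")
    return done / len(subissues)


def find_cycle(dependencies):
    """Return one dependency cycle as a list of issue ids, or None if acyclic.

    `dependencies` maps an issue id to the ids it is blocked by.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for dep in dependencies.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:  # dep is already on the current path: cycle found
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in dependencies:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

Running `find_cycle` whenever a dependency is added keeps a circular chain from ever reaching the board.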

Implementation Examples: See examples/issue-generation/ for detailed AI prompts, epic breakdown workflows, CI integration patterns, and traceability implementations. For AI-first workflows, see Beads guide - a git-native issue tracker with CLI access and persistent agent memory.

"Small, frequent deliveries expose issues early and keep teams aligned." – Agile Alliance

Kanban Context: This pattern embodies Kanban principles of continuous flow and small batch sizes. If using Kanban: "If a task takes more than one day, split it." (Kanban Guide, Lean Kanban University). However, the pattern works equally well with Scrum sprints, continuous delivery, or any methodology that values incremental progress.

Anti-pattern: Under-Specified Issues

Creating generic tasks without specific acceptance criteria, proper sizing, or clear dependencies leads to scope creep and estimation errors.

Anti-pattern: Broken Integration

Creating issues without CI workflow integration, file tracking, or traceability requirements leads to disconnected development cycles and poor visibility into implementation progress.

Anti-pattern Examples:

❌ "Fix the login page"
❌ "Make the dashboard better"
❌ "Add some tests"
❌ "AUTH-002: Implement password validation" (no file tracking or CI requirements)

✅ "Add OAuth 2.0 token validation endpoint (<1 hour with AI)"
✅ "Implement dashboard metric WebSocket connection (45 minutes)"
✅ "Write unit tests for user service login method (30 minutes)"
✅ "AUTH-002: Password validation service with CI integration"
   - Files: src/auth/validators.py, tests/test_validators.py
   - Coverage: 95%, unit + integration tests
   - CI: lint, test, security-scan must pass
   - AI-assisted: Use AI for implementation and test generation

Development Patterns

Development patterns provide tactical approaches for day-to-day AI-assisted coding workflows, focusing on quality, maintainability, and team collaboration.

Spec-Driven Development

Maturity: Intermediate
Description: Use executable specifications to guide AI code generation with clear acceptance criteria before implementation.

Core Principle: Precision Enables Productivity

SpecDriven AI combines three key elements:

Machine-readable specifications with unique identifiers and authority levels
Rigorous Test-Driven Development with coverage tracking and automated validation
AI-powered implementation with persistent context through structured specifications

Key Innovation: Authority Level System

Specifications use authority levels to resolve conflicts and establish precedence:

authority=system: Core business logic and security requirements (highest precedence)
authority=platform: Infrastructure and technical architecture decisions
authority=feature: User interface and experience requirements (lowest precedence)

When requirements conflict, higher authority levels take precedence, enabling clear decision-making for AI implementation.
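
The precedence rule can be sketched in a few lines. This is an illustration under stated assumptions: `AUTHORITY_RANK`, `resolve`, and the two sample requirements are hypothetical, and raising an error on a tie is one possible policy that the pattern itself does not prescribe.

```python
AUTHORITY_RANK = {"system": 3, "platform": 2, "feature": 1}


def resolve(requirements):
    """Return the requirement with the highest authority level.

    Raises ValueError when the two highest-ranked requirements tie, because
    the authority system alone cannot decide between them.
    """
    ranked = sorted(
        requirements, key=lambda r: AUTHORITY_RANK[r["authority"]], reverse=True
    )
    if len(ranked) > 1:
        top, runner_up = ranked[0], ranked[1]
        if AUTHORITY_RANK[top["authority"]] == AUTHORITY_RANK[runner_up["authority"]]:
            raise ValueError("equal authority levels conflict; needs a human decision")
    return ranked[0]


# Illustrative conflict: a security rule versus a convenience feature.
CONFLICT = [
    {"anchor": "remember_me", "authority": "feature",
     "text": "Keep users signed in for 30 days"},
    {"anchor": "session_timeout", "authority": "system",
     "text": "Expire sessions after 15 minutes of inactivity"},
]
```

Here `resolve(CONFLICT)` returns the `session_timeout` requirement, since `system` outranks `feature`.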

Related Patterns: Developer Lifecycle, Tool Integration, Custom Commands, Image Spec, Testing Orchestration, Observable Development, Harness Engineering Lens

SpecDriven AI Workflow

graph TD
    A[Machine-Readable Specifications<br/>with Authority Levels] --> B[Coverage Tracking<br/>& Validation]
    B --> C[AI Implementation<br/>with Ephemeral Prompts]
    C --> D[Automated Testing<br/>& Compliance Check]
    D --> E{Specs Pass?}
    E -->|No| F[Refine Prompts<br/>Not Specs]
    F --> C
    E -->|Yes| G[Coverage Report<br/>& Deployment]
    G --> H[Specification Persistence<br/>for Regression]
    
    style A fill:#e1f5e1
    style B fill:#e1f5e1
    style H fill:#e1f5e1
    style C fill:#ffe6e6
    style F fill:#ffe6e6

Core Implementation

Machine-Readable Specification with Authority Levels

# IAM Policy Generator Specification {#iam_policy_gen}

## CLI Requirements {#cli_requirements authority=system}
The system MUST provide a command-line interface that:
- Accepts policy type via `--policy-type` flag
- Validates input parameters against AWS IAM constraints
- Generates syntactically correct IAM policy JSON [^test_iam_syntax]
- Returns exit code 0 for success, 1 for validation errors

## Input Validation {#input_validation authority=platform}  
The system MUST:
- Reject invalid AWS service names with clear error messages
- Validate resource ARN format before policy generation
- Implement rate limiting for API calls [^test_rate_limit]

[^test_iam_syntax]: tests/test_iam_policy_syntax.py
[^test_rate_limit]: tests/test_rate_limiting.py

Automated Coverage Tracking

# Run specification compliance validation
pytest --cov=src --cov-report=html --spec-coverage
python spec_validator.py --check-coverage --authority-conflicts

# Output shows specification coverage
# Specification Coverage Report:
# ✅ cli_requirements: 100% (3/3 tests linked)
# ✅ input_validation: 86% (6/7 tests linked) 
# ⚠️  Missing test: [^test_malformed_arn] in line 23
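
A coverage report like the one above could be derived directly from the specification text. The sketch below is not the repository's `spec_validator.py`; it assumes that a section counts a footnote reference as linked when a matching `[^name]: path` definition exists, and it attaches the `test_malformed_arn` reference to a sample bullet purely for illustration.

```python
import re

# "## Title {#anchor authority=level}" section headings
SECTION = re.compile(r"^## .*\{#(?P<anchor>\w+) authority=(?P<authority>\w+)\}", re.M)
# "[^name]: path" footnote definitions
FOOTNOTE_DEF = re.compile(r"^\[\^(\w+)\]:\s*(\S+)", re.M)
# "[^name]" references, excluding the definitions themselves
FOOTNOTE_REF = re.compile(r"\[\^(\w+)\](?!:)")


def spec_coverage(spec_text):
    """Map each section anchor to (linked_refs, total_refs, missing_names)."""
    defined = dict(FOOTNOTE_DEF.findall(spec_text))
    sections = list(SECTION.finditer(spec_text))
    report = {}
    for i, match in enumerate(sections):
        end = sections[i + 1].start() if i + 1 < len(sections) else len(spec_text)
        refs = FOOTNOTE_REF.findall(spec_text[match.end():end])
        missing = [name for name in refs if name not in defined]
        report[match.group("anchor")] = (len(refs) - len(missing), len(refs), missing)
    return report


SPEC = """\
# IAM Policy Generator Specification {#iam_policy_gen}

## CLI Requirements {#cli_requirements authority=system}
- Generates syntactically correct IAM policy JSON [^test_iam_syntax]

## Input Validation {#input_validation authority=platform}
- Implement rate limiting for API calls [^test_rate_limit]
- Validate resource ARN format before policy generation [^test_malformed_arn]

[^test_iam_syntax]: tests/test_iam_policy_syntax.py
[^test_rate_limit]: tests/test_rate_limiting.py
"""
```

For `SPEC`, the `input_validation` section reports one linked reference out of two and names `test_malformed_arn` as missing.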

Tooling Integration

# Pre-commit hook validates specification compliance
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: spec-coverage
        name: Specification Coverage Check
        entry: python spec_validator.py --check-coverage
        language: python
        pass_filenames: false

# Git workflow with specification traceability  
git commit -m "feat: implement rate limiting [spec:rl2c]
  
Implements rate limiting requirement from input_validation
section. Closes specification anchor #failed_auth.

Coverage: 94% (31/33 spec requirements covered)"

Key Benefits

Authority-based conflict resolution prevents requirement ambiguity
Automated coverage tracking ensures no specifications are missed
AI tool independence through persistent, machine-readable requirements
Precise traceability from specification anchors to test implementations
Living documentation that evolves with the system

Automated Traceability

Specification anchors ({#cli_requirements}) and test footnotes ([^test_iam_syntax]) are the link layer that ties requirements to tests to code to docs. Maintain those links automatically on every change instead of in a spreadsheet:

# After each commit, validate anchor coverage and flag drift
git diff --name-only HEAD~1 | while IFS= read -r file; do
    ai "For $file: confirm referenced spec anchors still exist, propose new
        anchor links for any uncovered behavior, and emit an impact list of
        which downstream tests and docs need updating."
done

When a spec section moves or a test is renamed, the same loop surfaces the broken link before it ships. The result is a living traceability graph that stays accurate without manual upkeep — the alternative (traceability in a spreadsheet) is stale within a week.
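
The AI-driven loop can be paired with a deterministic check that catches the simplest kind of drift: a footnote that points at a test file which no longer exists. The following sketch assumes footnote targets are paths relative to the repository root; `broken_test_links` is a hypothetical helper name.

```python
import re
from pathlib import Path

FOOTNOTE_DEF = re.compile(r"^\[\^(\w+)\]:\s*(\S+)", re.M)


def broken_test_links(spec_text, repo_root):
    """Return footnote names whose target file does not exist under repo_root."""
    root = Path(repo_root)
    return [
        name
        for name, target in FOOTNOTE_DEF.findall(spec_text)
        if not (root / target).is_file()
    ]


SPEC_FOOTNOTES = """\
[^test_iam_syntax]: tests/test_iam_policy_syntax.py
[^test_rate_limit]: tests/test_rate_limiting.py
"""
```

A non-empty result is a reasonable condition for failing a pre-commit hook or CI step, leaving the AI loop to handle the fuzzier question of uncovered behavior.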

Anti-pattern: Broken Traceability

Maintaining requirement-to-test links in spreadsheets or manual documentation that becomes stale and inaccurate within days of being written.

Complete Implementation

See examples/spec-driven-development/ for:

Complete IAM Policy Generator implementation
spec_validator.py tool for automated compliance checking
Pre-commit hooks and Git workflow integration
Full specification examples with authority levels
Coverage tracking and reporting tools

Anti-pattern: Spec-Ignored

Writing code with AI first, then trying to retrofit tests, resulting in tests that mirror implementation rather than specify behavior.

Anti-pattern: Over-Prompting

Saving collections of prompts as if they were specifications. Prompts are implementation details; specifications are behavioral contracts.

Image Spec

Maturity: Intermediate
Description: Upload images (diagrams, mockups, flows) as primary specifications for AI coding tools to build accurate implementations from visual context.

Related Patterns: Spec-Driven Development, Progressive Enhancement, Context Optimization

Core Implementation

Use images as the source of truth for structure and intent, then supplement with minimal text constraints:

# 1. Prepare visual specifications
# - architecture.png (components + labeled ports)
# - data-model.png (fields + relationships)
# - ui-mock.png (layout + key interactions)

# 2. Attach images and provide a minimal build request
cat > build-request.txt << 'EOF'
Build the system from the attached diagrams.
Tech stack: Node.js + Express + PostgreSQL
Start with the User Service exactly as shown.
Include /health endpoints for every service.
EOF

# 3. Iterate with visual feedback
# - Screenshot the running output
# - Annotate with required changes
# - Re-attach and request the next iteration

Complete Implementation

See examples/image-spec/ for prompt templates, diagram checklists, and a repeatable image-first iteration loop.

Anti-pattern: Overwhelming Visuals

Uploading many diagrams at once without hierarchy or a clear starting point overwhelms context, increases contradictions, and reduces accuracy. Start with one high-level diagram, implement one slice, then add more visuals progressively.

Planned Implementation

Maturity: Beginner
Description: Interview, constrain, and plan before writing code so AI implementation matches actual requirements instead of confident-sounding assumptions.

Related Patterns: Developer Lifecycle, Spec-Driven Development, Progressive Enhancement, Adversarial Evaluator, Harness Engineering Lens

Core Principle: Think Before You Code

The costliest bugs come from building the wrong thing, not building it wrong. This pattern front-loads three activities before any code is written:

Interview — have AI ask structured questions to surface tacit knowledge, hidden constraints, and decisions that would otherwise emerge mid-implementation.
Constrain — translate the interview answers into explicit boundaries the AI must respect (line counts, dependencies, performance budgets, prohibited approaches).
Plan — generate an explicit step-by-step implementation plan, review it, and iterate before any code is generated.

Planning Workflow

graph TD
    A[Idea or Request] --> B[AI Interviews You<br/>Clarifying Questions]
    B --> C{Gaps Remain?}
    C -->|Yes| B
    C -->|No| D[Define Explicit Constraints<br/>Line count, deps, perf budget]
    D --> E[Generate Initial Plan]
    E --> F[Review & Refine Plan]
    F --> G{Plan Approved?}
    G -->|No| E
    G -->|Yes| H[Execute Implementation]
    H --> I[Validate Against Plan]
    I --> J{Meets Plan?}
    J -->|No| H
    J -->|Yes| K[Complete]

    style B fill:#e1f5e1
    style D fill:#e1f5e1
    style F fill:#e1f5e1
    style H fill:#ffe6e6
    style K fill:#e1f5e1

Interview Phase

Before writing any plan, have AI act as an interviewer:

ai "I want to build a notification system for our app.

Before writing any code or plan, interview me:
1. Ask clarifying questions about requirements I haven't stated
2. Identify constraints I should decide on upfront
3. Surface assumptions that could cause rework later
4. Group your questions by category (scope, technical, users, edge cases)

Ask one category at a time. Wait for my answers before continuing."

Typical interview output groups questions by category — scope ("Which channels? In-app, email, SMS?"), technical ("Expected volume — 10/day vs 10,000/hour changes architecture"), users ("Can users configure preferences?"), edge cases ("What happens when delivery fails?"). After answers are collected, ask AI to consolidate them into a requirements summary, an explicit non-goals list, and remaining open questions — that document becomes the input to the planning phase.

Constraint Phase

Translate the interview answers into the boundaries the AI must respect. Constraints prevent over-engineering more reliably than instructions to "keep it simple":

Bad:  "Create user service"
Good: "Create user service: <100 lines, 3 methods max, only bcrypt dependency"

Bad:  "Add caching"
Good: "Add caching using Map, max 1000 entries, LRU eviction"

Bad:  "Improve performance"
Good: "Reduce p99 latency to <50ms without new dependencies"

Carry these constraints into every subsequent prompt — they're the steering, not the suggestion.
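
Explicit constraints have the added benefit of being checkable. As a rough sketch, assuming the generated service is Python source, the "<100 lines, 3 methods max, only bcrypt dependency" constraint could be verified with the standard `ast` module. `check_constraints` and `SAMPLE` are hypothetical names, and treating every import outside `allowed_imports` as a violation is a simplification.

```python
import ast


def check_constraints(source, max_lines=100, max_methods=3, allowed_imports=("bcrypt",)):
    """Return a list of constraint violations found in the given source text."""
    violations = []
    line_count = len(source.splitlines())
    if line_count >= max_lines:
        violations.append(f"{line_count} lines (limit: under {max_lines})")
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            methods = [n for n in node.body
                       if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
            if len(methods) > max_methods:
                violations.append(
                    f"class {node.name}: {len(methods)} methods (max {max_methods})")
            continue
        if isinstance(node, ast.Import):
            modules = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            modules = [node.module.split(".")[0]]
        else:
            continue
        for module in modules:
            if module not in allowed_imports:
                violations.append(f"import of {module} is not allowed")
    return violations


# Illustrative generated code that breaks two of the three constraints.
SAMPLE = '''
import bcrypt
import requests

class UserService:
    def create(self): ...
    def get(self): ...
    def update(self): ...
    def delete(self): ...
'''
```

The source is only parsed, never executed, so the check is safe to run on unreviewed AI output.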

Core Implementation

1. Plan Generation Phase

# Example planning prompt structure
CONTEXT: "Building user authentication for SaaS application"
REQUIREMENTS: "JWT tokens, password reset, rate limiting"
CONSTRAINTS: "Must integrate with existing user table, 2-hour time limit"

REQUEST: "Generate step-by-step implementation plan with:
- Database changes needed
- API endpoints to create/modify
- Security considerations
- Testing approach
- Rollback strategy"

2. Plan Review and Iteration

# Generated Plan Review Checklist

### Technical Approach
- [ ] Database schema changes are backwards compatible
- [ ] API design follows existing conventions
- [ ] Security measures address OWASP Top 10
- [ ] Performance impact is minimal

### Implementation Strategy
- [ ] Tasks are broken into deployable increments
- [ ] Dependencies are clearly identified
- [ ] Rollback plan is feasible
- [ ] Testing strategy covers edge cases

### Resource Requirements
- [ ] Time estimate is realistic
- [ ] Required permissions are available
- [ ] External dependencies are identified

3. Execution with Plan Validation

# During implementation, validate against plan
echo "✓ Step 1: Created user_sessions table (matches plan)"
echo "✓ Step 2: Added JWT service (matches plan)"
echo "⚠ Step 3: Rate limiting - using Redis instead of in-memory (plan deviation documented)"

Tool-Agnostic Planning Approach

Planning Session Structure

## 1. Problem Definition (2-3 sentences)
Clear statement of what needs to be built and why

## 2. Constraints & Requirements
- Technical constraints (existing systems, performance, security)
- Business requirements (timeline, user experience, compliance)
- Resource constraints (team size, expertise, budget)

## 3. Implementation Options
- Option A: [Brief description, pros/cons, time estimate]
- Option B: [Brief description, pros/cons, time estimate]
- Recommended: [Choice with justification]

## 4. Detailed Plan
- [ ] Step 1: [Concrete action with acceptance criteria]
- [ ] Step 2: [Concrete action with acceptance criteria]
- [ ] Step 3: [Concrete action with acceptance criteria]

## 5. Validation Strategy
How to verify each step works and overall solution meets requirements

When to Use Plan-First Development

Complex Features: Multi-step implementations requiring coordination
Unknown Domains: Working in unfamiliar technologies or business areas
Team Collaboration: When multiple developers need to understand the approach
High-Stakes Changes: Security, performance, or business-critical modifications
Learning Contexts: When using AI to explore new implementation approaches

Complete Implementation

See examples/planned-implementation/ for:

Tool-specific planning examples (Claude Code, Cursor)
Planning templates and checklists
Markdown iteration techniques and stakeholder review cycles
Integration with existing development workflows
Plan validation and iteration strategies

Anti-pattern: Blind Generation

Jumping straight from a vague idea to code generation without interviewing for requirements or setting constraints. AI fills the gaps with assumptions (often reasonable-sounding but wrong for your context), and you discover requirements through failed implementations instead of conversation.

Anti-pattern: Unconstrained Generation

Skipping the constraint phase. Telling AI to "make it good" or "add features" without explicit boundaries produces over-engineered solutions that are hard to review.

Anti-pattern: Over-Constrained

Stacking so many constraints ("exactly 50 lines, 2 methods, no dependencies, 100% test coverage, sub-10ms response time") that AI can't find a coherent solution. Constraints are budgets, not handcuffs: pick the ones that matter for this task.

Anti-pattern: Over-Analysis

Spending excessive time refining plans without moving to implementation, missing opportunities for rapid feedback and iterative improvement.

Progressive Enhancement

Maturity: Beginner
Description: Build complex features through small, deployable iterations rather than big-bang generation.

Related Patterns: Planned Implementation, Developer Lifecycle, Image Spec, Adversarial Evaluator

Examples

Building authentication progressively:

# Day 1: Minimal login
"Create POST /login that returns 200 for admin/admin, 401 otherwise"
→ Deploy

# Day 2: Real password check
"Modify login to check passwords against users table. Keep existing API."
→ Deploy

# Day 3: Add security
"Add bcrypt hashing to login. Support both hashed and plain passwords temporarily."
→ Deploy

# Day 4: Modern tokens
"Replace session with JWT. Keep session endpoint for backward compatibility."
→ Deploy

Developer Review Required: Each iteration requires developer review and testing of AI-generated code before deployment.
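
The Day 3 step is the one where backward compatibility matters most: existing plain-text records must keep working while new ones are hashed. The sketch below shows one way to do that by upgrading a record on successful login. It is an assumption-laden illustration: it swaps in PBKDF2 from the standard library as a stand-in for bcrypt so that it runs without third-party packages, and `hash_password`, `verify_and_upgrade`, and the `pbkdf2$` prefix are hypothetical.

```python
import hashlib
import hmac
import os

PREFIX = "pbkdf2$"   # marks records that are already hashed
ITERATIONS = 100_000


def hash_password(password, salt=None):
    """Return 'pbkdf2$<salt hex>$<digest hex>' (stand-in for a bcrypt hash)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return f"{PREFIX}{salt.hex()}${digest.hex()}"


def verify_and_upgrade(password, stored):
    """Return (ok, record_to_store); plain-text records are upgraded on success."""
    if stored.startswith(PREFIX):
        _, salt_hex, digest_hex = stored.split("$")
        candidate = hashlib.pbkdf2_hmac(
            "sha256", password.encode(), bytes.fromhex(salt_hex), ITERATIONS)
        return hmac.compare_digest(candidate.hex(), digest_hex), stored
    # Legacy plain-text record: accepted temporarily, then replaced by a hash.
    if hmac.compare_digest(password.encode(), stored.encode()):
        return True, hash_password(password)
    return False, stored
```

Once every active record carries the prefix, the legacy branch can be deleted in a later iteration, which keeps each deploy small and reversible.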

When to Use Progressive Enhancement

MVP Development: When you need to get to market quickly with minimal features
Uncertain Requirements: When requirements are likely to change based on user feedback
Risk Mitigation: When you want to reduce the risk of large, complex implementations
Continuous Delivery: When you have automated deployment and want rapid iterations
Learning Projects: When the team is learning new technologies or domains

Anti-pattern: Monolithic Generation

Asking AI to "create a complete user management system" results in 5000 lines of coupled, untested code that takes days to review and debug.

Adversarial Evaluator

Maturity: Intermediate
Description: Separate the agent that generates work from an independent agent that judges it (ideally a different model), so that adversarial pressure and cross-model divergence, rather than a model grading its own output, become the eval signal for high-stakes decisions.

Related Patterns: Planned Implementation, Progressive Enhancement, Spec-Driven Development, Autonomous Acceptance, Harness Engineering Lens

Core Principle: Separate the Producer from the Judge

Borrowed from GANs (Generative Adversarial Networks): two networks compete — one generates, one judges — and that adversarial tension forces quality up. The same dynamic applies to multi-agent systems. A model asked to grade its own output shares its own blind spots; the confident reasoning that produced a flawed answer also produces a confident review of it. The fix is to separate the generator from the evaluator completely, and to make the evaluator as independent as the stakes require.

Independence is a spectrum, not a switch:

Independence level	How	Strength
Same model, different prompt/role	"Now critique the above as a skeptical reviewer"	Weak — shared training priors, correlated blind spots
Different model as judge	Claude generates, GPT-5 or Gemini judges	Strong — independent training data and failure modes
Different identity + signing key	Judge owned by a separate party, attestation signed	Strongest — see Autonomous Acceptance

The pattern has two topologies. Pick the one that matches the decision: an adversarial judge when you have one candidate and need to know whether it holds up, or cross-model divergence when there is no single right answer and you want to surface the space of reasonable ones.

Topology 1: Adversarial Judge (generate → attack)

Sequential and asymmetric. One agent produces; a second, independent agent is told to find fault — not to summarize, not to agree, but to refute. Surviving that pressure is the quality signal.

graph LR
    G[Generator Agent<br/>Claude] -->|candidate| J[Judge Agent<br/>different model]
    J -->|attack + grade| V{Survives?}
    V -->|Yes| A[Accept]
    V -->|No| R[Return with findings]
    R --> G
    style G fill:#a8d5ba,stroke:#2d5a3f,color:#1a3a25
    style J fill:#f9e79f,stroke:#b7950b,color:#7d6608

# adversarial-judge.sh — generate with one model, attack with another
GENERATOR="claude-opus-4-8"
JUDGE="gpt-5"

llm -m "$GENERATOR" < task.md > candidate.md

llm -m "$JUDGE" <<EOF
You are an adversarial reviewer. Your goal is to REFUTE the work below, not
to praise it. Find every correctness bug, security hole, unhandled edge case,
and unstated assumption. If you cannot break it, say so explicitly.

TASK: $(cat task.md)
CANDIDATE: $(cat candidate.md)
EOF

The judge must run on a different model than the generator. A Claude-generated answer reviewed by Claude inherits Claude's blind spots; the same answer attacked by GPT-5 or Gemini meets a different set of priors. That diversity of training data is what makes the adversary's findings real rather than confirmatory.
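
The return-with-findings loop from the diagram can be expressed independently of any model API. In the sketch below, `generate` and `judge` are caller-supplied functions standing in for calls to two different models; they are stubbed here so the control flow can be run as-is. The function names and the `max_rounds` cap are assumptions, not part of the pattern's definition.

```python
def adversarial_loop(task, generate, judge, max_rounds=3):
    """Regenerate until the judge reports no findings or max_rounds is reached."""
    findings = []
    candidate = None
    for round_number in range(1, max_rounds + 1):
        candidate = generate(task, findings)   # generator sees the last findings
        findings = judge(task, candidate)      # judge returns a list of problems
        if not findings:
            return {"accepted": True, "candidate": candidate, "rounds": round_number}
    return {"accepted": False, "candidate": candidate,
            "findings": findings, "rounds": max_rounds}


# Stubs for illustration only; real versions would call two different models.
def fake_generate(task, findings):
    return "v2 with input validation" if findings else "v1"


def fake_judge(task, candidate):
    return [] if "input validation" in candidate else ["no input validation"]
```

Capping the rounds matters: a candidate that never survives should be escalated to a human with the outstanding findings attached rather than regenerated indefinitely.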

Topology 2: Cross-Model Divergence (fan out → compare)

Parallel and symmetric. Instead of one judge attacking one output, run the same task across several frontier models at once and treat their disagreement as the signal. No model is cast as the judge — the divergence between independent peers is the verdict.

graph TD
    A[High-Stakes Prompt] --> B[Fan Out to N Models]
    B --> C1[Claude Opus]
    B --> C2[GPT-5]
    B --> C3[Gemini]
    C1 --> D[Side-by-Side Outputs]
    C2 --> D
    C3 --> D
    D --> E{Convergent?}
    E -->|Yes| F[Stronger Prior<br/>Proceed With Confidence]
    E -->|No| G[Investigate the Divergence<br/>Disagreement IS the Finding]
    G --> H[Choose, Synthesize, or Re-Prompt]

    style D fill:#a8d5ba,stroke:#2d5a3f,color:#1a3a25
    style G fill:#f9e79f,stroke:#b7950b,color:#7d6608
    style F fill:#f5b7b1,stroke:#c0392b,color:#78281f

#!/usr/bin/env bash
# fan-out.sh — run the same prompt across multiple models
PROMPT_FILE="$1"
# Compute the run directory once so the name cannot change between commands
RUN_DIR=".cross-model/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$RUN_DIR"

for model in \
  "claude-opus-4-8" \
  "gpt-5" \
  "gemini-2.5-pro"; do
    echo "→ $model"
    llm -m "$model" < "$PROMPT_FILE" > "$RUN_DIR/${model}.md"
done

# Diff the outputs to surface divergence quickly
diff -u "$RUN_DIR/claude-opus-4-8.md" "$RUN_DIR/gpt-5.md"     > "$RUN_DIR/claude-vs-gpt.diff"  || true
diff -u "$RUN_DIR/gpt-5.md"           "$RUN_DIR/gemini-2.5-pro.md" > "$RUN_DIR/gpt-vs-gemini.diff" || true

echo "Outputs in $RUN_DIR — review the .diff files first."

Reading the divergence:

Outcome	What it means	Action
All models agree	Stronger prior than any single model alone	Proceed
2 agree, 1 disagrees	The minority report may be catching something the majority missed	Read the dissent carefully before discarding
All three disagree	The prompt is underspecified, the task is genuinely ambiguous, or you're at the frontier of model capability	Re-prompt with sharper constraints, or treat as a human-judgment call

The disagreement IS the signal. Don't reduce three rich outputs to a vote count — investigate why the models split. That investigation is the value the pattern delivers, not the "winning" answer.
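
Before reading the outputs closely, it can help to triage which row of the table a run falls into. The sketch below uses plain text similarity from the standard library as a crude proxy for agreement; that proxy, the 0.8 threshold, and the `triage` function are all assumptions. It only decides what to read first and is not a substitute for investigating why the models split.

```python
from difflib import SequenceMatcher
from itertools import combinations


def triage(outputs, threshold=0.8):
    """Classify a run as 'all agree', 'split', or 'all disagree'.

    `outputs` maps a model name to its output text. Returns the outcome and
    the list of model pairs whose outputs are at least `threshold` similar.
    """
    pairs = list(combinations(sorted(outputs), 2))
    agreeing = [
        (a, b) for a, b in pairs
        if SequenceMatcher(None, outputs[a], outputs[b]).ratio() >= threshold
    ]
    if len(agreeing) == len(pairs):
        outcome = "all agree"
    elif agreeing:
        outcome = "split"
    else:
        outcome = "all disagree"
    return outcome, agreeing
```

On a "split" result, the models missing from every agreeing pair are the dissenters whose output deserves the first careful read.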

When to Use

Independent evaluation costs more than a single pass — an extra model call per judgment, or N parallel calls — so don't apply it to every prompt. Reach for it when:

Irreversible decisions: schema migrations, public API contracts, security model changes
High-stakes reviews: pre-merge architecture review, threat modeling, incident post-mortems
Eval-style spot-checks: validating a single canonical prompt that drives downstream automation
Onboarding a new model: comparing a candidate model's output against your trusted baseline before adopting it

For routine prompts, the single-model degenerate form (below) is sufficient and cheaper.

Single-Model Degenerate Form

When the cost of a second model isn't justified, ask one model to generate and then critique its own work, or to produce multiple options in a single call:

"Generate 3 different authentication approaches. For each: performance profile,
security trade-offs, implementation complexity, and when to choose it.
Then recommend one based on a typical SaaS startup's constraints."

This is cheaper but provides weaker signal — the critique shares the generator's training biases. Modern IDE assistants offer this natively as "alternative completions." It's the budget-friendly cousin of the full pattern, not a substitute for genuine independence on high-stakes calls.

Anti-pattern: Self-Grading

Letting the model that produced the work also certify it — a satisfaction score, a "looks good to me," a self-review — and treating that as independent verification. The reviewer shares every blind spot of the author because it is the author: the confident reasoning that generated a subtly wrong answer generates an equally confident approval of it. Taken to its institutional extreme — a self-signed acceptance score gating a merge — this becomes the Autonomous Acceptance anti-pattern, where the rubber stamp simply moves from a human to a number.

Anti-pattern: Single-Model Bias

Committing irreversible decisions on a single model's output without ever checking whether another frontier model would have made the same call. The decision feels well-reasoned because the model's prose is confident — but confidence is not correctness, and one model's blind spots become the project's blind spots.

Anti-pattern: Voting Theater

Running three models and treating majority rule as truth. Frontier models are trained on overlapping data and exhibit correlated errors; 2-of-3 agreement on a wrong answer is common when the wrong answer is the most plausible-sounding one. Use the votes as a prompt for investigation, never as a verdict.

Parallel Agents

Maturity: Advanced
Description: Run multiple AI agents concurrently on isolated tasks or environments to maximize development speed and exploration.

Related Patterns: Workflow Orchestration, Atomic Decomposition, Security Sandbox

Agent Coordination Lifecycle

sequenceDiagram
    participant M as Manager
    participant A1 as Auth Agent
    participant A2 as API Agent  
    participant A3 as Test Agent
    participant SM as Shared Memory
    participant CS as Conflict Scanner
    
    M->>A1: Start (OAuth2 Task)
    M->>A2: Start (REST API Task)
    M->>A3: Start (Test Suite Task)
    
    par Parallel Development
        A1->>A1: Implement OAuth2 Flow
        A1->>SM: Record Learning
    and
        A2->>A2: Implement REST Endpoints
        A2->>SM: Record API Patterns
    and
        A3->>A3: Generate Integration Tests
        A3->>SM: Record Test Patterns
    end
    
    SM->>CS: Trigger Conflict Analysis
    CS->>M: Report Conflicts/All Clear
    M->>M: Merge Components & Cleanup

Core Implementation Approaches

# Container-based isolation
# docker-compose.parallel-agents.yml
services:
  agent-auth:
    image: ai-dev-environment:latest
    volumes:
      - ./feature-auth:/workspace:rw
      - shared-memory:/shared:ro
    environment:
      - AGENT_ID=auth-feature
      - TASK_ID=implement-oauth2
    networks:
      - agent-network

  agent-api:
    image: ai-dev-environment:latest
    volumes:
      - ./feature-api:/workspace:rw
      - shared-memory:/shared:ro
    environment:
      - AGENT_ID=api-feature
      - TASK_ID=implement-rest-endpoints
    networks:
      - agent-network

volumes:
  shared-memory:
    driver: local
networks:
  agent-network:
    driver: bridge
    internal: true

Git Worktree Parallelization

# Create isolated branches for parallel work
git worktree add -b agent/auth ../agent-auth
git worktree add -b agent/api ../agent-api
git worktree add -b agent/tests ../agent-tests

# Launch agents in parallel
parallel --jobs 3 << EOF
cd ../agent-auth && ai-agent implement-oauth2
cd ../agent-api && ai-agent implement-rest-endpoints
cd ../agent-tests && ai-agent generate-integration-tests
EOF

# Automated conflict detection and merge
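# List local agent branches (the worktrees above create local branches, not remote ones)
for branch in $(git branch --list 'agent/*' --format='%(refname:short)'); do
  git checkout -b temp-merge main
  if git merge --no-commit --no-ff "$branch"; then
    echo "✓ No conflicts in $branch"
  else
    echo "⚠ Conflicts detected - using AI resolution"
    ai-agent resolve-conflicts --branch "$branch"
  fi
  # Discard any trial-merge state so the checkout below cannot fail
  git merge --abort 2>/dev/null || true
  git checkout main && git branch -D temp-merge
done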

# Cleanup
git worktree remove ../agent-auth
git worktree remove ../agent-api
git worktree remove ../agent-tests

Shared Memory & Coordination

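import fcntl
from collections import defaultdict

# Agent coordination with shared knowledge
class AgentMemory:
    def __init__(self, lock_path="agent-memory.lock"):
        self.lock_path = lock_path          # lock file shared by all agents
        self.memory = defaultdict(dict)     # agent_id -> {key: value}

    def record_learning(self, agent_id, key, value):
        """Record a learning while holding an exclusive file lock"""
        with open(self.lock_path, "w") as lock_file:
            # fcntl.flock returns None, so it cannot be used as a context manager
            fcntl.flock(lock_file, fcntl.LOCK_EX)
            try:
                self.memory[agent_id][key] = value
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)

    def get_shared_knowledge(self):
        """Consolidated knowledge from all agents"""
        return {agent_id: dict(entries) for agent_id, entries in self.memory.items()}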

# Task definition
tasks = {
    "auth-service": {
        "agent_count": 1,
        "isolation": "container", 
        "dependencies": [],
        "instructions": "Implement OAuth2 with JWT tokens"
    },
    "api-endpoints": {
        "agent_count": 2,
        "isolation": "worktree",
        "dependencies": ["auth-service"],
        "instructions": "REST endpoints: user mgmt + CRUD"
    }
}
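
The `dependencies` field in a task definition like the one above determines which agents can start together. A minimal sketch using the standard library's `graphlib` (Python 3.9+) groups tasks into launch waves; `launch_waves`, `EXAMPLE_TASKS`, and the extra `docs` task are hypothetical additions for illustration.

```python
from graphlib import TopologicalSorter


def launch_waves(tasks):
    """Group task names into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(
        {name: spec["dependencies"] for name, spec in tasks.items()})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


EXAMPLE_TASKS = {
    "auth-service": {"dependencies": []},
    "api-endpoints": {"dependencies": ["auth-service"]},
    "docs": {"dependencies": []},  # hypothetical extra task with no dependencies
}
```

Every task in a wave can run in parallel, and the next wave starts only once the previous one has finished.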

Complete Implementation: See examples/parallel-agents/ for:

Full Docker isolation and coordination setup
Git worktree management and conflict resolution
Shared memory system with file locking
Emergency shutdown and safety monitoring
Task distribution and dependency management

When to Use Parallel Agents

Complex features requiring multiple specialized implementations
Time-critical projects where speed trumps coordination overhead
Exploration phases testing multiple approaches simultaneously
Large teams with strong DevOps and coordination processes

Source: AI Native Dev - How to Parallelize AI Coding Agents

Anti-pattern: Uncoordinated Agents

Running multiple agents without isolation, shared memory, or conflict resolution leads to race conditions, lost work, and system instability.

Context Persistence

Maturity: Intermediate
Description: Manage AI context as a finite resource through structured memory schemas, prompt pattern capture, and session continuity protocols for efficient multi-session development.

Related Patterns: Codified Rules, Progressive Disclosure, Spec-Driven Development, Parallel Agents

Core Principles

AI context is a finite resource with diminishing returns. Effective context engineering requires:

Minimal High-Signal Tokens: Find the smallest set of information that maximizes outcomes
Just-in-Time Retrieval: Load context dynamically rather than pre-loading everything
Progressive Disclosure: Explore and discover information as needed, not upfront

Structured Memory Schemas

Persist information outside the context window using standardized memory formats:

# TODO.md - Task tracking across sessions
- [ ] Implement JWT middleware (blocked: key rotation design)
- [x] Add bcrypt password hashing (2024-01-15)
- [ ] Rate limiting (next: research token bucket vs sliding window)

# DECISIONS.log - Architectural decisions with timestamp
2024-01-15 10:30: Use RS256 for JWT (not HS256)
Rationale: Asymmetric keys enable better key rotation
Alternatives: HS256 (simpler but less flexible)
Impact: auth-service, api-gateway

# NOTES.md - Session continuity and discoveries
Session 2024-01-15:
  Context: Implementing authentication system
  Discoveries: bcrypt has performance issues >100 req/s
  Blockers: Need to decide on refresh token storage
  Next: Benchmark argon2 as bcrypt alternative

# scratchpad.md - Working memory (cleared after task)
Exploring JWT refresh token flow...
- httpOnly cookies prevent XSS
- Need CSRF protection for cookie-based auth
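
One way to keep DECISIONS.log entries uniform is to generate them from a small helper. The sketch below is illustrative only: `format_decision` and its parameters are hypothetical names, and the layout simply mirrors the entry shown above.

```python
from datetime import datetime


def format_decision(when: datetime, decision: str, rationale: str,
                    alternatives: str, impact: list[str]) -> str:
    """Render one DECISIONS.log entry in the layout shown above (hypothetical helper)."""
    return "\n".join([
        f"{when:%Y-%m-%d %H:%M}: {decision}",
        f"Rationale: {rationale}",
        f"Alternatives: {alternatives}",
        f"Impact: {', '.join(impact)}",
    ])


entry = format_decision(
    datetime(2024, 1, 15, 10, 30),
    "Use RS256 for JWT (not HS256)",
    "Asymmetric keys enable better key rotation",
    "HS256 (simpler but less flexible)",
    ["auth-service", "api-gateway"],
)
print(entry)
```

Appending `entry` to the log file is then a single write, and a fixed layout keeps the log easy to scan or parse later.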

Prompt Pattern Library

Capture successful prompts and failures with success rates for reuse:

# Initialize knowledge structure
./knowledge-capture.sh --init

# Capture successful pattern
./knowledge-capture.sh --success \
  --domain "auth" \
  --pattern "JWT Auth" \
  --prompt "JWT with RS256, 15min access, httpOnly cookie" \
  --success-rate "95%"

# Document failure to avoid repeating
./knowledge-capture.sh --failure \
  --domain "auth" \
  --bad-prompt "Make auth secure" \
  --problem "Too vague → AI over-engineers" \
  --solution "Specify exact JWT requirements"

Context Window Management

Compaction Strategy - When context approaches limits:

Distill critical decisions to DECISIONS.log
Summarize key discoveries in NOTES.md
Update TODO.md with current state and blockers
Create "Previously on..." recap for session continuity

Session Continuity Protocol - Resume work across sessions:

Read NOTES.md for previous session context
Review TODO.md for current tasks and blockers
Check DECISIONS.log for recent architectural choices
Scan scratchpad.md for active explorations

# Compact context when nearing limits
./context-compact.sh --summarize

# Resume from previous session
./session-resume.sh  # Displays TODO + recent decisions + notes recap
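
The session continuity steps above could be sketched in Python as follows. This is not the contents of session-resume.sh; `build_recap` is a hypothetical helper that assumes the memory files live in one directory and use the formats shown earlier.

```python
from pathlib import Path


def build_recap(memory_dir: str, max_lines: int = 5) -> str:
    """Assemble a short recap from the memory files, skipping any that are missing."""
    root = Path(memory_dir)
    sections = []

    # Open tasks are the unchecked checkbox lines in TODO.md
    todo = root / "TODO.md"
    if todo.exists():
        open_items = [line for line in todo.read_text().splitlines()
                      if line.startswith("- [ ]")]
        sections.append("Open tasks:\n" + "\n".join(open_items))

    # For the log-style files, keep only the most recent lines
    for name in ("DECISIONS.log", "NOTES.md"):
        path = root / name
        if path.exists():
            tail = path.read_text().splitlines()[-max_lines:]
            sections.append(f"Recent {name}:\n" + "\n".join(tail))

    return "\n\n".join(sections)
```

Keeping the recap to open tasks plus a short tail of each log follows the minimal high-signal principle: the full files stay on disk and are read only when needed.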

Complete Implementation: See examples/context-persistence/ for:

Memory schema templates (TODO.md, DECISIONS.log, NOTES.md, scratchpad.md)
Context compaction and session resume automation scripts
Prompt pattern capture and maintenance tools
Working examples of memory schemas in use

Anti-pattern: Over-Documentation

Creating extensive knowledge bases that become maintenance burdens instead of accelerating development through selective, actionable knowledge capture.

Why it's problematic:

Knowledge bases become outdated and misleading
Developers spend more time documenting than developing
Overly detailed entries are ignored in favor of quick experimentation
Knowledge becomes siloed and not easily discoverable

Instead, focus on:

Capture only high-impact patterns (>80% success rate)
Document failures that wasted significant time (>30 minutes)
Keep entries concise and immediately actionable
Review and prune knowledge quarterly
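
The capture threshold above can be applied mechanically during a quarterly review. A minimal sketch, assuming each pattern records how often it was tried and how often it worked; `PatternEntry` and `worth_keeping` are hypothetical names, not part of the knowledge-capture script.

```python
from dataclasses import dataclass


@dataclass
class PatternEntry:
    name: str
    successes: int
    attempts: int

    @property
    def success_rate(self) -> float:
        # An entry that was never tried has no evidence behind it
        return self.successes / self.attempts if self.attempts else 0.0


def worth_keeping(entries, min_rate=0.8):
    """Keep only patterns whose observed success rate exceeds min_rate."""
    return [e for e in entries if e.success_rate > min_rate]


library = [
    PatternEntry("JWT Auth", successes=19, attempts=20),
    PatternEntry("Make auth secure", successes=2, attempts=10),
]
kept = worth_keeping(library)
```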

Anti-pattern: Bloated Context

Loading entire codebases, documentation, or conversation history into context rather than using structured memory and just-in-time retrieval.

Why it's problematic:

Wastes tokens on low-signal information
Degrades AI performance due to information overload
Slows interaction latency and increases costs
Buries the few relevant details under irrelevant ones

Instead:

Use lightweight identifiers (file paths, links) rather than full content
Load context progressively as needed
Externalize detailed information to memory schemas
Prefer 3-5 high-quality examples over exhaustive documentation

Event Automation

Maturity: Intermediate
Description: Execute custom commands automatically at assistant lifecycle events (pre/post tool use, session start, prompt submission) for workflow automation, validation, and policy enforcement.

Related Patterns: Codified Rules, Security Sandbox, Custom Commands, Autonomous Remediation

Core Concept

Attach shell commands to AI assistant lifecycle events. Commands receive context (file paths, tool names, user prompts) from the assistant, via environment variables or a JSON payload on stdin depending on the tool, and return exit codes to allow, block, or warn.

Event Flow Example

sequenceDiagram
    participant Dev as Developer
    participant AI as AI Assistant
    participant Pre as PreToolUse Hook
    participant Post as PostToolUse Hook

    Dev->>AI: Edit .env file
    AI->>Pre: Run security-hook.sh
    Pre->>Pre: Check if file is sensitive
    Pre-->>AI: Exit 2 (BLOCK)
    AI->>Dev: ❌ Blocked: Cannot edit sensitive file

    Dev->>AI: Edit src/api.js
    AI->>Pre: Run security-hook.sh
    Pre->>Pre: Check if file is sensitive
    Pre-->>AI: Exit 0 (Allow)
    AI->>AI: Execute file edit
    AI->>Post: Run security-hook.sh
    Post->>Post: Scan for secrets with gitleaks
    alt Secret Found
        Post-->>AI: Exit 1 (Warning)
        AI->>Dev: ⚠️ Secret detected! Review before committing
    else No Secret
        Post-->>AI: Exit 0 (Success)
        AI->>Dev: ✅ Edit complete
    end

Simple Security Example

Prevent editing sensitive files and scan for secrets:

#!/bin/bash
# security-hook.sh

FILE="$TOOL_INPUT_FILE_PATH"

# Block .env and credentials files
if echo "$FILE" | grep -Eq "(\\.env|secrets\\.json|credentials)"; then
  echo "❌ Blocked: Cannot edit sensitive file" >&2
  exit 2
fi

# Scan for hardcoded secrets (requires gitleaks).
# Rely on the exit status: gitleaks exits non-zero when it finds leaks, and
# matching its output for "leaks found" would also match "no leaks found".
if command -v gitleaks > /dev/null && [ -e "$FILE" ]; then
  if ! gitleaks detect --no-git --source="$FILE" > /dev/null 2>&1; then
    echo "⚠️ Secret detected! Review before committing." >&2
    exit 1
  fi
fi

exit 0
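
The path check in the hook is easier to unit test when expressed as a pure function. The sketch below restates it in Python under that assumption; `pre_tool_exit_code` is a hypothetical name, and the exit codes follow the event flow diagram above (0 allows, 2 blocks).

```python
import re

# Same pattern as the grep in security-hook.sh above
SENSITIVE = re.compile(r"(\.env|secrets\.json|credentials)")

ALLOW, BLOCK = 0, 2  # exit codes used in the event flow above


def pre_tool_exit_code(file_path: str) -> int:
    """Return the exit code a pre-edit hook would use for this path."""
    return BLOCK if SENSITIVE.search(file_path) else ALLOW
```

Note that the pattern is a substring match, so a path such as `docs/credentials-guide.md` is blocked as well. Tightening the pattern is a policy decision for each project.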

Configuration Example (Claude Code)

{
  "hooks": {
    "PreToolUse": [{
      "matcher": "Edit",
      "hooks": [{"type": "command", "command": "./security-hook.sh"}]
    }],
    "PostToolUse": [{
      "matcher": "Edit",
      "hooks": [{"type": "command", "command": "./security-hook.sh"}]
    }]
  }
}

Common Use Cases

Auto-format code after edits (prettier, black, gofmt)
Block sensitive file modifications
Log AI interactions for compliance
Run linters before commits

Security Warning

Event commands run with full system access. Always review scripts before enabling. Test in isolated environments first.

Complete Implementation

See examples/event-automation/ for a working implementation with security scanning and hooks.

Anti-pattern: Unchecked Events

Running automation from untrusted sources without review exposes your system to malicious code execution and credential theft. Always audit event scripts before installation.

Custom Commands

Maturity: Intermediate
Description: Discover and use built-in command vocabularies, then extend them with custom commands that encode domain expertise and sophisticated workflows.

Related Patterns: Event Automation, Spec-Driven Development, Codified Rules, Progressive Disclosure, Harness Engineering Lens

Core Concept

AI coding tools provide built-in commands for common operations and support custom commands (markdown files with AI instructions) for project-specific workflows. Commands are manual/on-demand (invoked like /refactor), while events fire automatically (see Event Automation).

Command Discovery

Discover built-in commands first:

# Claude Code
/help /model /clear /review

# Cursor IDE
Cmd+K /edit /chat

# Gemini CLI
/stats /memory /tools /clear

Use Built-in Commands	Create Custom Commands
Generic operations (clear, help, model)	Domain expertise (refactoring, security analysis)
Tool features (review, edit)	Project workflows (deploy, implement-spec)
Universal commands	Team standards and conventions

Example: Refactoring Assistant

Encode Martin Fowler's refactoring catalog for systematic code improvement:

---
description: Interactive refactoring assistant based on Martin Fowler's refactoring catalog
argument-hint: Optional flags (--smell, --duplicates, --suggest)
---

# Refactoring Assistant

You are helping a developer improve code maintainability by identifying code smells and recommending specific refactoring techniques from Martin Fowler's catalog.

# Usage
/refactor              # Full analysis
/refactor --smell      # Code smells only

# Implementation

### 1. Code Smell Detection
- Long methods (>20 lines), duplicate code, complex conditionals
- For each: location (file:line), severity, specific refactoring, effort estimate

### 2. Bloater Detection
- Excessive parameters (>4), data clumps, primitive obsession

### 3. Refactoring Strategy
1. Name the code smell
2. Recommend technique from Fowler's catalog
3. Show before/after example
4. Estimate maintainability improvement

Generate step-by-step refactoring plan prioritized by impact.
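
Command files like the one above open with a frontmatter block delimited by `---` lines. A sketch of reading those fields, for example to list a project's commands with their descriptions, might look like this. `parse_frontmatter` is a hypothetical helper that handles only flat `key: value` pairs, not full YAML.

```python
def parse_frontmatter(text: str) -> dict:
    """Extract flat key/value pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no frontmatter
```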

More Examples

Additional command examples with detailed implementations:

Implement-Spec - Spec-driven implementation with TDD and traceability
Security Review - Multi-layer security analysis (secrets, vulnerabilities, config)
Safe-Refactor - Safe refactoring with automated testing and rollback
Test Runner - Smart test selection with coverage and health monitoring

Tool Support

# Claude Code: .claude/commands/*.md
mkdir -p .claude/commands
cp examples/custom-commands/commands/*.md .claude/commands/

# Cursor IDE: .cursorrules
cat examples/custom-commands/commands/refactor.md >> .cursorrules

# Generic: .ai/commands/ (tool-agnostic)
mkdir -p .ai/commands
cp examples/custom-commands/commands/*.md .ai/commands/

Complete Implementation

See examples/custom-commands/ for ready-to-use commands, configuration files, and setup guide.

Anti-pattern: Redundant Commands

Creating /clear when the tool already provides it. Always discover built-in commands first.

Anti-pattern: Shallow Commands

# Bad: Just wraps shell command
Run: npm run deploy:staging

# Good: Encodes expertise
1. Verify staging environment health
2. Check for active incidents
3. Review recent commits for risk
4. Run deployment with rollback plan

Anti-pattern: Hardcoded Context

# Bad: Hardcoded values
Deploy to prod-db-instance-1.us-east-1.rds.amazonaws.com

# Good: Parameterized
Deploy to database: $1 (default: $STAGING_DB)

Progressive Disclosure

Maturity: Intermediate
Description: Load AI assistant rules incrementally based on task context rather than bundling all instructions upfront, preventing context bloat and improving instruction-following consistency.

Related Patterns: Codified Rules, Context Persistence, Custom Commands, Event Automation, Centralized Rules, Context Optimization

Core Problem

AI coding assistants already consume part of their context window with built-in system instructions. When a project loads a single, monolithic rules file (hundreds of lines) for every task, instruction-following accuracy drops and irrelevant guidance crowds out what the model needs right now.

Implementation Strategy: Three-Tier Rule Architecture

Keep a small universal rules file, and load specialized rules only when the task touches the relevant area:

.ai/
├── CLAUDE.md                    # Universal rules only (<60 lines)
├── rules/                       # Specialized rules loaded on-demand
│   ├── security/                # secrets, auth, dependencies
│   ├── development/             # api-design, database, testing
│   ├── operations/              # deployment, monitoring, cicd
│   └── architecture/            # patterns, performance
└── prompts/                     # Reusable task templates

Main Rules File = Router

The main file should explicitly tell the assistant what to load based on context:

# AI Development Rules

# Universal Principles (Always Apply)
- Follow existing patterns in the codebase
- Never commit secrets or credentials
- Run tests after code changes

# Progressive Disclosure (Context Loading)
- **Security work** (auth/, .env, credentials): Read `.ai/rules/security/`
- **API development** (api/, routes/): Read `.ai/rules/development/api-design.md`
- **Database changes** (migrations/, models/): Read `.ai/rules/development/database.md`
- **Testing** (tests/, specs/): Read `.ai/rules/development/testing.md`
- **CI/CD** (.github/workflows/): Read `.ai/rules/operations/cicd.md`

Automatic Loading with Hooks

Combine with Event Automation to auto-load the right rules before tool use:

#!/bin/bash
# .ai/hooks/auto-load-context.sh

FILE_PATH="$TOOL_INPUT_FILE_PATH"
LOADED_RULES=""

if echo "$FILE_PATH" | grep -Eq "(\\.env|credentials|secrets|auth/)"; then
  LOADED_RULES="$LOADED_RULES .ai/rules/security/"
fi

if echo "$FILE_PATH" | grep -Eq "(api/|routes/|controllers/)"; then
  LOADED_RULES="$LOADED_RULES .ai/rules/development/api-design.md"
fi

if echo "$FILE_PATH" | grep -Eq "(tests?/|spec/|\\.test\\.|\\.spec\\.)"; then
  LOADED_RULES="$LOADED_RULES .ai/rules/development/testing.md"
fi

if [ -n "$LOADED_RULES" ]; then
  echo "AI: Before proceeding, read these files: $LOADED_RULES"
fi
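
The same routing can be kept as a data table, which makes it easy to test and to extend with the database and CI/CD routes from the router file. This is a sketch under the assumption that rule paths match the directory layout above; `ROUTES` and `rules_for` are hypothetical names.

```python
import re

# (path pattern, rules to load); patterns follow the hook and router above
ROUTES = [
    (r"(\.env|credentials|secrets|auth/)", ".ai/rules/security/"),
    (r"(api/|routes/|controllers/)", ".ai/rules/development/api-design.md"),
    (r"(migrations/|models/)", ".ai/rules/development/database.md"),
    (r"(tests?/|spec/|\.test\.|\.spec\.)", ".ai/rules/development/testing.md"),
    (r"\.github/workflows/", ".ai/rules/operations/cicd.md"),
]


def rules_for(file_path: str) -> list[str]:
    """Return every rules path whose pattern matches the file being touched."""
    return [rules for pattern, rules in ROUTES if re.search(pattern, file_path)]
```

A file can match several routes, so `rules_for("src/api/users.test.js")` yields both the API design and the testing rules, while an unmatched path yields an empty list and only the universal rules apply.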

Complete Implementation

See examples/progressive-disclosure/ for templates and a ready-to-adapt rules router + directory layout.

Anti-pattern: Bloated Configuration

Loading a single, massive rules file for every task wastes context and reduces instruction-following accuracy—especially for small edits that need only a handful of universal rules.

Anti-pattern: Missing Guidance

Creating specialized rule files but never documenting when/how to load them forces humans to remember the routing and prevents consistent, automated context loading.

Atomic Decomposition

Maturity: Intermediate
Description: Break complex features into atomic, independently implementable tasks for parallel AI agent execution.

Related Patterns: Developer Lifecycle, Workflow Orchestration, Progressive Enhancement, Issue Generation, Parallel Agents

Atomic Task Criteria

graph TD
    A[Feature Requirement] --> B[Task Analysis]
    B --> C{Atomic Task Check}
    C -->|✓ Independent| D[Can run in parallel]
    C -->|✓ <2 hours| E[Rapid feedback cycle]
    C -->|✓ Clear I/O| F[Testable interface]
    C -->|✓ No shared state| G[Conflict-free]
    C -->|✗ Fails check| H[Split Further]
    H --> B
    
    D --> I[Ready for Agent]
    E --> I
    F --> I
    G --> I

Core Decomposition Process

# Feature: User Authentication System
# Bad: Monolithic task
❌ "Implement complete user authentication with JWT, password hashing, rate limiting, and email verification"

# Good: Atomic breakdown with AI validation
ai_decompose "Break down user authentication into atomic tasks:

Task 1: Password validation service (1.5h)
- Input: plain text password, validation rules
- Output: validation result object
- Dependencies: None (pure function)

Task 2: JWT token generation service (1h)  
- Input: user ID, role, expiration config
- Output: signed JWT token
- Dependencies: None (crypto operations only)

Task 3: Rate limiting middleware (2h)
- Input: request metadata, rate limit config
- Output: allow/deny decision
- Dependencies: None (stateless logic)

Task 4: Login endpoint integration (1h)
- Input: credentials, services from tasks 1-3
- Output: HTTP response with token/error
- Dependencies: Tasks 1-3 (integration only)"

# Validate atomicity
ai_task_validator "Check each task for:
1. <2 hour completion time
2. No shared mutable state
3. Clear input/output contracts
4. Testable in isolation
5. No circular dependencies"

Agent Assignment & Coordination

# .ai/task-assignment.yml
authentication_feature:
  parallel_tasks:
    - id: "auth-001" # Password validation
      agent: "backend-specialist-1"
      estimated_hours: 1.5
      dependencies: []
      
    - id: "auth-002" # JWT generation
      agent: "security-specialist"
      estimated_hours: 1
      dependencies: []
      
    - id: "auth-003" # Rate limiting
      agent: "backend-specialist-2"
      estimated_hours: 2
      dependencies: []
      
  integration_tasks:
    - id: "auth-004" # Login endpoint
      agent: "integration-specialist"
      estimated_hours: 1
      dependencies: ["auth-001", "auth-002", "auth-003"]
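
Given the dependency lists above, the order in which agents can start follows from a topological sort. The sketch below uses the standard library's `graphlib` (Python 3.9+); `execution_waves` is a hypothetical helper, and the task IDs are copied from the assignment file.

```python
from graphlib import TopologicalSorter

# Task IDs and dependencies copied from task-assignment.yml above
DEPENDENCIES = {
    "auth-001": [],
    "auth-002": [],
    "auth-003": [],
    "auth-004": ["auth-001", "auth-002", "auth-003"],
}


def execution_waves(dependencies: dict) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(execution_waves(DEPENDENCIES))
```

For this feature the result is two waves: the three independent tasks first, then the integration task. The `prepare()` call also covers the "no circular dependencies" check from the validation list, since a cycle raises an error before any task is scheduled.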

Task Contract Validation

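# Ensure tasks meet atomic criteria
from dataclasses import dataclass, field

class PasswordRules: ...  # placeholder for the project's rule type

@dataclass
class TaskContract:
    task_id: str
    estimated_hours: float
    inputs: dict = field(default_factory=dict)
    outputs: dict = field(default_factory=dict)
    side_effects: list = field(default_factory=list)

    def has_clear_io_contract(self) -> bool:
        return bool(self.inputs) and bool(self.outputs)

    def validate_atomic(self) -> bool:
        return all([
            len(self.side_effects) == 0,    # No side effects
            self.estimated_hours <= 2,      # Rapid completion
            self.has_clear_io_contract()    # Testable interface
        ])

# Example validation
task = TaskContract("auth-001", estimated_hours=1.5)
task.inputs = {"password": str, "rules": PasswordRules}
task.outputs = {"is_valid": bool, "errors": list[str]}
assert task.validate_atomic()  # ✓ Passes atomic criteria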

Complete Implementation: See examples/atomic-decomposition/ for:

Contract validation system with automated checking
Function-level decomposition techniques and trigger indicators
Task dependency resolution and scheduling
Parallel execution coordination and monitoring
Agent assignment and resource management

When to Use Atomic Decomposition

Parallel Agent Implementation: Multiple AI agents working simultaneously
Complex Feature Development: Large features benefiting from parallel work
Time-Critical Projects: Speed through parallelization essential
Risk Mitigation: Reduce blast radius of individual task failures

Anti-pattern: False Atomicity
Creating tasks that appear independent but secretly share state, require specific execution order, or have hidden dependencies on other concurrent work.

Anti-pattern: Over-Decomposition
Breaking tasks so small that coordination overhead exceeds the benefits of parallelization, leading to more complexity than value.

Observable Development

Maturity: Intermediate
Description: Design systems with logging and tracing that make behavior visible to AI as a bidirectional control — observability feeds forward as a standard that steers the agent while it writes code, and feeds back as a sensor the agent reads (and grades) to self-correct.

Related Patterns: Developer Lifecycle, Tool Integration, Codified Rules, Autonomous Remediation, Testing Orchestration, Spec-Driven Development, Harness Engineering Lens

Observability as feedforward (a guide that steers)

Codify logging and tracing conventions as a Codified Rule so the agent emits AI-legible, structured context as it writes code — before any failure occurs.

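# AI-friendly structured logging (the standard the agent writes against)
import json
import logging
from contextvars import ContextVar
from datetime import datetime, timezone

# Assumption: the correlation ID is held in a context variable set once per request
_correlation_id = ContextVar("correlation_id", default=None)

def current_correlation_id():
    return _correlation_id.get()

def log_operation(operation, **context):
    logging.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "operation": operation,
        "correlation_id": current_correlation_id(),
        "context": context
    }))

# Observable business logic with comprehensive context
# (validate_order, charge_payment, and PaymentError are application code, not shown)
def process_order(order):
    log_operation("order_start", order_id=order.id, total=order.total)
    try:
        validate_order(order)
        log_operation("validation_success")
        result = charge_payment(order)
        log_operation("payment_success", transaction_id=result.id)
        return result
    except PaymentError as e:
        log_operation("payment_error", error=str(e), code=e.code)
        raise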

Observability as feedback (a sensor the agent reads back)

Logs and traces are not write-only. They are a feedback signal the agent consumes to diagnose issues — and, critically, a signal the agent should grade and improve. After the agent resolves a failure, have it reflect on whether the logs it had were sufficient, then feed that judgment back into the logging standard:

ai "You just diagnosed this failure. Rate the logs you actually had to work with.
What context was missing that would have made the root cause obvious in one pass?
Emit the structured-logging changes that close that gap, and propose the update
to .ai/rules/observability.md so the next operation is logged the same way."

This turns a feedback signal into a feedforward improvement — the harness gets better each time it is used, rather than the agent repeating the same blind spots. The human's job is to steer by iterating on the standard, not to hand-write every log line.

Enforce observability as a fitness function (computational sensor)

A standard that lives only in a doc is aspirational. Make it an enforced gate that runs on every change, alongside the agent, so black-box code fails fast (Keep Quality Left):

# Fitness function: every public operation must emit structured context
def test_operations_are_observable():
    for fn in public_operations(module):
        assert emits_structured_log(fn), f"{fn.__name__} is a black box"
        assert propagates_correlation_id(fn), f"{fn.__name__} drops correlation_id"
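
The fitness function above leaves its helpers undefined. One possible shape for `emits_structured_log` is sketched here, assuming the operation can be invoked as a zero-argument callable (for example via `functools.partial` with fixture arguments) and logs through the root logger as `log_operation` does. The required keys are an assumption taken from the logging standard above.

```python
import json
import logging


class _Capture(logging.Handler):
    """Collects log messages emitted while the handler is attached."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def emits_structured_log(call, required=("operation", "correlation_id")) -> bool:
    """Run `call` and report whether it logged a JSON object with the required keys."""
    root = logging.getLogger()
    capture = _Capture()
    previous_level = root.level
    root.addHandler(capture)
    root.setLevel(logging.INFO)
    try:
        call()
    finally:
        # Always restore the logger so other tests are unaffected
        root.removeHandler(capture)
        root.setLevel(previous_level)

    for message in capture.messages:
        try:
            payload = json.loads(message)
        except ValueError:
            continue  # plain-text log line, not structured
        if isinstance(payload, dict) and all(key in payload for key in required):
            return True
    return False
```

This check is deterministic and fast, which is what makes it suitable as a gate. It says nothing about whether the logged context is useful for diagnosis.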

Pair the deterministic check with a probabilistic one. The two answer different questions:

Check	Type	Cost / reliability	Answers
Structured log present, correlation ID propagates	Computational	ms, reliable	Is it logged?
Are these logs actually useful for diagnosis?	Inferential (LLM-as-judge)	slow, probabilistic	Is it useful?

Runtime feedback beyond the change lifecycle

Observability as a sensor does not stop at debug time. Extend it into production so the agent acts on drift: AI judges continuously sampling response quality and flagging log anomalies, and degrading SLOs that prompt the agent to suggest remediations (see Autonomous Remediation).

Complete Implementation: See examples/observable-development/ for:

Full structured logging framework with correlation IDs
Performance monitoring decorators and utilities
AI-friendly debug tools and log analysis scripts
Integration examples for e-commerce and authentication systems

Source: Birgitta Böckeler, "Harness Engineering", martinfowler.com — for the feedforward/feedback (guides vs. sensors) and computational/inferential control framing.

Anti-pattern: Blind Development

Building systems with minimal observability that provide insufficient context for AI to understand system behavior, diagnose issues, or suggest improvements.

Why it's problematic: AI cannot debug systems with generic logs like "Payment failed" or "Something went wrong" - it needs specific context, timing, and error details.

# Bad: Black box logging
def process_payment(amount):
    try:
        result = payment_service.charge(amount)
        logger.info("Payment processed")
        return result
    except Exception:
        logger.error("Payment failed")
        raise

# Good: Observable implementation  
def process_payment(amount):
    log_operation("payment_start", amount=amount)
    try:
        result = payment_service.charge(amount)
        log_operation("payment_success", transaction_id=result.id)
        return result
    except Exception as e:
        log_operation("payment_error", error=str(e), amount=amount)
        raise

Anti-pattern: Write-Only Observability

Rich, structured logs exist, but nothing closes the loop. No agent reads them back to self-correct, no fitness function enforces the standard, and no runtime sensor watches them drift. Observability becomes a checkbox instead of a sensor — the logs feed forward but never feed back, so the harness never improves and the same blind spots recur.

Guided Refactoring

Maturity: Intermediate
Description: Systematic code improvement using AI to detect and resolve code smells with measurable quality metrics, following established refactoring rules and maintaining test coverage throughout the process.

Related Patterns: Codified Rules, Autonomous Remediation, Testing Orchestration, Debt Forecasting, Harness Engineering Lens

Code Smell Detection Framework

graph TD
    A[Code Analysis] --> B[Smell Detection]
    B --> C[Refactoring Strategy]
    C --> D[AI Implementation]
    D --> E[Test Validation]
    E --> F[Quality Metrics]
    F --> G{Improvement?}
    G -->|Yes| H[Commit Changes]
    G -->|No| I[Revert & Retry]
    H --> J[Update Knowledge Base]
    I --> C

Core Workflow

# 1. Define refactoring rules
cat > .ai/rules/refactoring.md << 'EOF'
## Long Method Smell
- Max lines: 20 (excluding docstrings)
- Max cyclomatic complexity: 10
- Detection: flake8 C901, pylint R0915

## Large Class Smell  
- Max class lines: 250, Max methods: 20
- Detection: pylint R0902, R0904
EOF

# 2. Detect code smells with AI
flake8 --select=C901 --max-complexity=10 src/ > smells.txt  # C901 is only reported when --max-complexity is set
pylint src/ --disable=all --enable=R0915,R0902,R0904 >> smells.txt

ai "Analyze smells.txt using .ai/rules/refactoring.md:
1. Prioritize by impact and complexity
2. Suggest specific refactoring strategy for each smell
3. Generate implementation plan with risk assessment"

# 3. Apply refactoring with test preservation
pytest --cov=src tests/  # Baseline coverage

ai "Refactor process_user_data() method (35 lines, exceeds threshold):
- Apply Extract Method pattern for validation, database, notifications
- Maintain test coverage >90% and API contract
- Create atomic commits for each extracted method"

# 4. Validate and track improvements
pytest --cov=src tests/
flake8 src/ && pylint src/

ai "Generate refactoring impact report:
Before: complexity=12, length=35 lines, coverage=85%
After: complexity=4+2+2, length=8+6+7 lines, coverage=92%
Document lessons learned in .ai/knowledge/refactoring.md"

Common Refactoring Patterns

Extract Method: Break down long methods (>20 lines)
Extract Class: Split large classes (>250 lines, >20 methods)
Replace Primitive: Convert strings/dicts to value objects
Consolidate Duplicates: Merge similar code patterns
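
The length threshold for Extract Method can also be checked directly with Python's `ast` module, without a linter. The sketch below is a simplified detector: `long_functions` is a hypothetical name, and unlike the rule file above it counts every line of the function, including the signature and any docstring.

```python
import ast


def long_functions(source: str, max_lines: int = 20) -> list[tuple[str, int]]:
    """Return (name, line_count) for each function longer than max_lines."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno and end_lineno are 1-based and inclusive
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                findings.append((node.name, length))
    return findings
```

The output can be appended to smells.txt alongside the linter findings so the AI sees one consolidated list.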

Complete Implementation: See examples/guided-refactoring/ for:

Automated refactoring pipeline with CI integration
Quality metrics tracking and reporting
Risk assessment guidelines and rollback procedures
Knowledge base templates for refactoring outcomes

Anti-pattern: Scattered Refactoring
Making widespread changes without systematic analysis introduces bugs and degrades code quality.

Anti-pattern: Premature Refactoring
Refactoring code for hypothetical future requirements rather than addressing current code smells and quality issues.

Error Resolution

Maturity: Intermediate
Description: Automatically collect comprehensive error context from logs, system state, and git history, then use AI to diagnose root causes and generate validated fixes.

Related Patterns: Developer Lifecycle, Observable Development, Tool Integration, Autonomous Remediation, Testing Orchestration, Harness Engineering Lens

Error Resolution Workflow

graph TD
    A[Error Occurs] --> B[Collect Error Context]
    B --> C[Enrich with Git History]
    C --> D[AI Diagnosis]
    D --> E[Generate Fix Proposals]
    E --> F{Human Review}
    F -->|Approved| G[Apply Fix]
    F -->|Rejected| H[Refine Context]
    G --> I[Run Tests]
    I --> J{Tests Pass?}
    J -->|Yes| K[Commit Fix]
    J -->|No| L[Rollback]
    H --> D
    L --> D

    style B fill:#e1f5e1
    style D fill:#ffe6e6
    style G fill:#ffe6e6
    style K fill:#e1f5e1

Core Implementation

Step 1: Collect Error Context

# Create comprehensive error context file
cat > .error-context.md << EOF
# Error Analysis

**Error Output:**
[Complete error message, stack trace, and exit codes]

**Recent Changes:**
$(git log --oneline -5)

**Affected Files:**
$(git diff --name-only HEAD~1)

**File Contents:**
$(cat path/to/affected/file.ext)

**Environment:**
- OS: $(uname -s)
- Shell: $SHELL
- Working Directory: $(pwd)
EOF

Step 2: AI-Powered Diagnosis

# Send structured context to AI for analysis
ai "Analyze this error and provide actionable fixes:

CONTEXT:
$(cat .error-context.md)

REQUIRED OUTPUT:
1. Root cause analysis
2. Specific fix commands (copy-pasteable)
3. Prevention strategy (pre-commit hooks, tests, etc.)

Format fixes as executable bash commands."

Step 3: Validate and Apply Fixes

# Create checkpoint before applying fixes.
# Assumes a clean working tree: commit or stash unrelated work first.
CHECKPOINT=$(git rev-parse HEAD)

# Review AI suggestions
cat ai-suggestions.md

# Apply fixes with validation
bash fix-commands.sh

# Verify with tests; commit if successful, roll back if not
if ./run-tests.sh; then
    git add .
    git commit -m "fix: [description based on AI analysis]"
else
    # Discard tracked changes made by the fix; files it created stay untracked
    git reset --hard "$CHECKPOINT"
    echo "Fix failed validation, rolled back"
fi

Practical Workflow Example

# 1. Capture error from any source (CI, terminal, logs)
ERROR_LOG="path/to/error.log"

# 2. Enrich with context
cat > error-analysis.md << EOF
**Error:**
$(cat "$ERROR_LOG")

**Recent Commits:**
$(git log --oneline -3)

**Changed Files:**
$(git diff --name-only HEAD~1)

**File Contents:**
$(git diff --name-only HEAD~1 | while IFS= read -r file; do
    echo "**$file:**"
    cat "$file"
done)
EOF

# 3. AI diagnosis
ai "Diagnose and fix:
$(cat error-analysis.md)

Provide:
1. Root cause
2. Exact fix commands
3. How to prevent recurrence"

# 4. Apply and validate
# [Review AI output]
# [Execute suggested fixes]
# [Run tests]
# [Commit]
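
The context-gathering step can also be scripted in Python, which keeps the formatting testable apart from the git calls. A sketch, with `git_output` and `build_error_context` as hypothetical helper names and the section headings copied from the workflow above:

```python
import subprocess


def git_output(*args: str) -> str:
    """Run a git command and return its stdout, or a placeholder if it fails."""
    result = subprocess.run(["git", *args], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else "(unavailable)"


def build_error_context(error_text: str, recent_commits: str, changed_files: str) -> str:
    """Assemble the error report in the same section order as the shell version."""
    return "\n\n".join([
        f"**Error:**\n{error_text}",
        f"**Recent Commits:**\n{recent_commits}",
        f"**Changed Files:**\n{changed_files}",
    ])
```

A typical call would pass the error log text together with `git_output("log", "--oneline", "-3")` and `git_output("diff", "--name-only", "HEAD~1")`. The placeholder keeps the report usable when a command fails, for instance in a repository with a single commit.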

When to Use Error Resolution

CI/CD Failures: Diagnose build, test, or deployment failures
Local Development Errors: Debug unexpected errors during development
Configuration Issues: Resolve environment or configuration problems
Dependency Conflicts: Analyze and resolve version conflicts
Integration Failures: Debug issues with external services or APIs

Complete Implementation: See examples/error-resolution/ for:

Reusable templates for error context collection and AI prompts
Common error scenarios (test failures, dependency conflicts, configuration errors)
GitHub Actions integration for CI/CD error diagnosis
Automated context extraction and diagnosis workflows

Anti-pattern: Blind Diagnosis

Sending only the error message to AI without surrounding context.

# ❌ Bad: No context
ai "Fix error: Process completed with exit code 1"

Why it's problematic: AI cannot diagnose the issue without seeing:

What command failed
Recent changes that might have caused it
File contents or configuration
System environment

Instead, provide full context:

# ✅ Good: Comprehensive context
ai "Fix this error:

Error: Process completed with exit code 1

Command that failed: terraform fmt -check -recursive
Files affected: main.tf, outputs.tf

Recent change:
$(git log -1 --oneline)

File content:
$(cat terraform/main.tf)

Environment: Terraform v1.6.0"

Anti-pattern: Brittle Fixes

Applying AI-suggested fixes without validation or rollback strategy.

# ❌ Bad: Apply without review or rollback
ai "Fix this error" | bash

Why it's problematic:

AI suggestions may introduce new bugs
May break existing functionality
Could make security or data loss mistakes
No rollback strategy if fix fails

Instead, validate fixes before applying:

# ✅ Good: Validate before applying with rollback
# Assumes a clean working tree: commit or stash unrelated work first.
CHECKPOINT=$(git rev-parse HEAD)

# Generate fix
ai "Fix this error" > proposed-fix.sh

# Review the proposed changes
cat proposed-fix.sh

# Apply fix
bash proposed-fix.sh

# Verify changes
git diff

# Run tests to validate, then commit or roll back
if ./run-tests.sh; then
    git add .
    git commit -m "fix: [description]"
else
    # Discard tracked changes made by the fix; files it created stay untracked
    git reset --hard "$CHECKPOINT"
    echo "Fix failed validation, rolled back"
fi

Autonomous Remediation

Maturity: Intermediate
Description: Pair deterministic rule-based detectors with LLM remediators inside an event-driven loop so codified rule violations are caught and fixed automatically before the AI session continues.

Related Patterns: Codified Rules, Event Automation, Guided Refactoring, Error Resolution, Harness Engineering Lens

Source: Paul Duvall, "Code Quality Gates: Using Claude Code Hooks to Block Code Smells on Every Write", February 24, 2026

Core Loop

graph TD
    A[AI Writes or Edits File] --> B[PostToolUse Hook Fires]
    B --> C[Deterministic Detector<br/>AST, Linter, Scanner, Policy Engine]
    C --> D{Violations?}
    D -->|No| E[AI Continues]
    D -->|Yes| F[Structured Violation Report<br/>+ Prescribed Fix Hint]
    F --> G[LLM Remediator<br/>Same Session Context]
    G --> A
    D -->|Retry Budget Exhausted| H[Escape Hatch<br/>Raise Threshold, Skip, or Suppress]

    style C fill:#a8d5ba,stroke:#2d5a3f,color:#1a3a25
    style G fill:#f9e79f,stroke:#b7950b,color:#7d6608
    style H fill:#f5b7b1,stroke:#c0392b,color:#78281f

Pattern Anatomy

Four components must be present for the loop to close:

Deterministic detector. Codified rules executed by non-LLM logic (AST walker, linter, scanner, type checker, policy engine). Output is reproducible across runs.
Structured violation report. Machine-parseable list of findings with file path, line number, rule ID, severity, and a prescribed fix hint.
LLM remediator. The same model that produced the violation, fed the report as feedback. Runs in the same session so prior context carries forward.
Retry budget and escape hatch. Bounded loop count, per-rule threshold override, or per-file suppression marker. Without this, the loop can run indefinitely on legitimate edge cases.
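To make the "structured violation report" component concrete, here is a minimal sketch of one possible shape for it. The `Violation` class, its field names, and `to_report` are illustrative assumptions, not the repository's actual types; the fields simply mirror the list above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Violation:
    """One finding. Field names are assumed for illustration."""
    file_path: str
    line: int
    rule_id: str
    severity: str  # assumed vocabulary, e.g. "error" or "warning"
    fix_hint: str


def to_report(violations: list[Violation]) -> str:
    """Serialize findings in a stable order so repeated runs give identical output."""
    ordered = sorted(violations, key=lambda v: (v.file_path, v.line, v.rule_id))
    return json.dumps([asdict(v) for v in ordered], indent=2)
```

Sorting before serializing keeps the report reproducible across runs, which matches the determinism expected of the detector itself.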

Core Implementation

PostToolUse hook (the mechanism comes from Event Automation):

#!/usr/bin/env python3
# ~/.claude/hooks/auto-remediate.py
import json, os, sys

# Make the sibling detectors/ package importable regardless of the caller's cwd
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from detectors import run_all_detectors
from detectors.fixes import FIX_HINTS, format_report

# The hook receives the tool event as JSON on stdin
event = json.load(sys.stdin)
file_path = event.get("tool_input", {}).get("file_path", "")
if not file_path or not os.path.isfile(file_path):
    sys.exit(0)  # nothing to check

violations = run_all_detectors(file_path)
if not violations:
    sys.exit(0)  # clean run: AI continues

# Block and hand the structured report back to the model as the reason
report = format_report(file_path, violations, FIX_HINTS)
print(json.dumps({"decision": "block", "reason": report}))
sys.exit(0)

Each detector emits both the finding AND the prescribed fix, so the LLM remediator gets a concrete starting point instead of guessing which technique to apply:

# detectors/fixes.py
FIX_HINTS = {
    "complexity":      "Use extract-method, early returns, guard clauses, or lookup tables.",
    "long_function":   "Extract helper functions for distinct logical steps.",
    "deep_nesting":    "Use guard clauses and early returns to flatten control flow.",
    "duplicate_block": "Extract repeated code into a shared helper function.",
    "vuln_dependency": "Upgrade to the patched version listed in the advisory.",
    "secret_leak":     "Move the value to a secret manager and reference via env var.",
}
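The hook calls `format_report`, which is not shown here. A minimal sketch of what such a function might do follows, assuming each violation is a dict with `rule`, `line`, and `message` keys. That dict shape and the fallback hint text are assumptions for illustration, not the repository's implementation.

```python
# Two entries copied from the dictionary above, so the sketch runs on its own.
FIX_HINTS = {
    "long_function": "Extract helper functions for distinct logical steps.",
    "deep_nesting": "Use guard clauses and early returns to flatten control flow.",
}


def format_report(file_path, violations, fix_hints):
    """Render violations as plain text, pairing each finding with its hint.

    Assumes each violation is a dict with "rule", "line", and "message".
    """
    lines = [f"{len(violations)} violation(s) in {file_path}:"]
    for v in sorted(violations, key=lambda v: v["line"]):
        # Unknown rules still get reported, just without a prescribed fix.
        hint = fix_hints.get(v["rule"], "No prescribed fix; use judgment.")
        lines.append(f"- line {v['line']} [{v['rule']}] {v['message']}")
        lines.append(f"  fix: {hint}")
    return "\n".join(lines)
```

Keeping the finding and the hint on adjacent lines means the model sees the rule ID, the location, and the suggested technique together.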

Loop semantics:

Condition	Behavior
Clean detector run	Exit 0, AI continues
Violations found	Exit with `block` decision and structured report; LLM re-attempts
Same file blocked more than 3 consecutive times	Surface to developer; suggest threshold raise or skip-list entry
Hook itself crashes	Fail open (exit 0); never silently disable a rule without telling the developer
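The hook shown earlier does not track the retry budget. One way the first three rows of this table could be implemented is sketched below. The function names, the `"surface"` decision label, and the JSON state file are assumptions; state is kept on disk on the assumption that each hook invocation runs as a separate process.

```python
import json
from pathlib import Path

RETRY_BUDGET = 3  # mirrors the "more than 3 consecutive times" row


def next_decision(counts: dict, file_path: str, has_violations: bool) -> str:
    """Update the consecutive-block count and return the loop decision."""
    if not has_violations:
        counts.pop(file_path, None)  # a clean run resets the streak
        return "continue"
    counts[file_path] = counts.get(file_path, 0) + 1
    # Blocks 1..RETRY_BUDGET go back to the model; the next one goes to a human.
    return "block" if counts[file_path] <= RETRY_BUDGET else "surface"


def load_counts(state_file: Path) -> dict:
    """Read per-file block counts; a missing file means no blocks so far."""
    return json.loads(state_file.read_text()) if state_file.exists() else {}


def save_counts(state_file: Path, counts: dict) -> None:
    """Persist counts so the next hook invocation can see them."""
    state_file.write_text(json.dumps(counts))
```

A hook would call `load_counts`, then `next_decision`, then `save_counts`, and print a developer-facing message instead of a `block` decision when the result is `"surface"`.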

When to Use

Use Autonomous Remediation when...	Use plain Event Automation when...
Violation has a deterministic fix the LLM can apply	Violation is a hard policy ("never edit .env") with no fix path
Rule is mechanical (complexity, nesting, dep version, secret pattern)	Rule requires human judgment (architecture, business logic, security tradeoff)
Fix can be verified by re-running the same detector	Fix needs out-of-band verification (manual review, production test)

Domain Instances

This pattern recurs across the catalog under domain-specific names. Each is a concrete instantiation of the same detect-fix-verify loop:

Domain	Detector	LLM Remediator Output	Existing Pattern
Code smells	AST or Lizard complexity rules	Refactored function	Guided Refactoring
Runtime errors	Stack trace + log scanner	Validated bug fix	Error Resolution
Infrastructure drift	Terraform plan diff	Corrective patch	Drift Remediation
Flaky tests	Build history analyzer	Stabilization patch	Suite Health
Stale or vulnerable deps	npm audit, pip-audit, dependabot	Staged upgrade PR	Upgrade Advisor
Security findings	Bandit, Semgrep, gitleaks	Patched code	New instance
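As an illustration of the first row, a deterministic code-smell detector can be quite small. The sketch below uses Python's standard `ast` module rather than Lizard, and the thresholds and rule names are assumptions chosen to match the fix-hint keys shown earlier.

```python
import ast

MAX_FUNCTION_LINES = 50  # hypothetical thresholds
MAX_NESTING_DEPTH = 3

_NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def _depth(node: ast.AST, current: int = 0) -> int:
    """Deepest chain of nested control-flow blocks under node."""
    deepest = current
    for child in ast.iter_child_nodes(node):
        # An `elif` is parsed as an If inside orelse; do not count it as deeper.
        is_elif = (isinstance(node, ast.If) and isinstance(child, ast.If)
                   and node.orelse == [child])
        if isinstance(child, _NESTING_NODES) and not is_elif:
            nested = current + 1
        else:
            nested = current
        deepest = max(deepest, _depth(child, nested))
    return deepest


def detect(source: str) -> list[dict]:
    """Return long_function and deep_nesting findings for Python source text."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        length = node.end_lineno - node.lineno + 1
        if length > MAX_FUNCTION_LINES:
            violations.append({"rule": "long_function", "line": node.lineno,
                               "message": f"{node.name} is {length} lines"})
        depth = _depth(node)
        if depth > MAX_NESTING_DEPTH:
            violations.append({"rule": "deep_nesting", "line": node.lineno,
                               "message": f"{node.name} nests {depth} levels deep"})
    return violations
```

Because the same source always yields the same findings, re-running `detect` after the model's edit is enough to verify the fix.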

Complete Example: See examples/autonomous-remediation/ for a working PostToolUse hook with code-smell detectors, fix-hint dictionary, retry budget, multi-language support via Lizard, and skip-list configuration.

Anti-pattern: Manual Remediation

Detecting violations with codified rules but leaving the fix to a human reviewer or a separate CI cycle. AI sessions write dozens of files between commits. Violations compound faster than humans can triage them. By the time a reviewer flags one issue, three more have been built on top of it. This converts AI write velocity from an asset into a liability.

# Bad: detect at commit time, human fixes later
git commit  # pre-commit linter fails, dev manually fixes 12 files

# Good: detect at write time, LLM fixes immediately
# PostToolUse hook blocks, LLM remediates, next file starts clean

Anti-pattern: Unbounded Loop

Configuring the loop without a retry budget or escape hatch. When the LLM and detector legitimately disagree (a 12-state machine that genuinely needs cyclomatic complexity 14, a wrapper function with 6 parameters mapping to an external API), they ping-pong indefinitely. Each retry costs roughly one model turn of tokens and wall-clock time.

# Bad: no exit condition
retry_budget: unlimited
suppression: none

# Good: bounded with explicit escape paths
retry_budget: 3
on_exhaustion: surface_to_developer
suppression:
  directory_skip_list: [tests/, generated/, vendored/]
  threshold_override: .ai/thresholds.yml  # per-file or per-rule
  inline_marker: "# rule: ignore"          # per-call-site
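The suppression settings above could be honored by the hook roughly as follows. This is a sketch under two assumptions: a skip-list entry matches any directory component of the path, and the inline marker applies only to the exact line a violation points at. The function names are hypothetical.

```python
from pathlib import PurePosixPath

SKIP_DIRS = ("tests", "generated", "vendored")
INLINE_MARKER = "# rule: ignore"


def is_skipped(file_path: str) -> bool:
    """True when any directory component of the path is on the skip list."""
    directories = PurePosixPath(file_path).parts[:-1]  # drop the file name
    return any(part in SKIP_DIRS for part in directories)


def drop_suppressed(violations: list[dict], source_lines: list[str]) -> list[dict]:
    """Remove violations whose flagged line carries the inline marker."""
    kept = []
    for v in violations:
        in_range = 0 < v["line"] <= len(source_lines)
        line_text = source_lines[v["line"] - 1] if in_range else ""
        if INLINE_MARKER not in line_text:
            kept.append(v)
    return kept
```

Checking the skip list before running detectors avoids spending retries on files that were never meant to be gated.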

Operations Patterns

Operations patterns focus on CI/CD, security, compliance, and production management with AI assistance, building on the foundation and development patterns.

Security & Compliance Patterns

Policy Generation

Maturity: Advanced
Description: Transform compliance requirements into executable policy files with AI assistance, ensuring regulatory requirements become testable code.

Related Patterns: Security Sandbox, Codified Rules, Centralized Rules

# Transform compliance requirements into executable policies
ai "Convert compliance requirements into Cedar policy code:
SOC 2: Data at rest must be AES-256 encrypted" > encryption.cedar

# Validate generated Cedar policies
cedar validate --schema schema.cedarschema --policies encryption.cedar

Complete Implementation: See examples/policy-generation/ for:

Complete policy generation pipeline with AI assistance
Cedar/OPA policy templates and compliance mapping
Policy testing and validation frameworks
CI/CD integration examples

Anti-pattern: Untested Policies

Hand-coding policies from written requirements introduces inconsistencies and interpretation errors.

Security Orchestration

Maturity: Intermediate
Description: Aggregate multiple security tools and use AI to summarize findings for actionable insights, reducing alert fatigue while maintaining security rigor.

Related Patterns: Policy Generation, Centralized Rules

# Orchestrate multiple security tools
snyk test --json > snyk.json
bandit -r src -f json > bandit.json
trivy fs --format json . > trivy.json

# AI-powered summarization for actionable insights
ai "Summarize security findings; focus on CRITICAL issues" > pr-comment.txt
gh pr comment --body-file pr-comment.txt

Complete Implementation: See examples/security-orchestration/ for:

Complete security scanning pipeline with tool orchestration
AI-powered report summarization and prioritization
CI/CD integration and automated PR commenting
Custom security tool configurations and reporting

Anti-pattern: Over-Alerting

Posting every low-severity finding buries real issues and frustrates developers.
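One way to avoid over-alerting is to filter and de-duplicate findings before they reach the summarization step. The sketch below assumes the scanner outputs have already been mapped into a common shape with `id`, `severity`, and `location` keys. That normalized schema is an assumption; each tool's real JSON layout differs and would need its own mapping.

```python
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def _rank(finding: dict) -> int:
    """Numeric severity; unknown labels are treated as lowest priority."""
    return SEVERITY_RANK.get(finding["severity"].upper(), 0)


def prioritize(findings: list[dict], minimum: str = "HIGH") -> list[dict]:
    """Keep findings at or above `minimum`, most severe first, duplicates removed."""
    threshold = SEVERITY_RANK[minimum]
    seen = set()
    kept = []
    for f in findings:
        key = (f["id"], f["location"])  # same issue reported by two tools
        if _rank(f) >= threshold and key not in seen:
            seen.add(key)
            kept.append(f)
    return sorted(kept, key=_rank, reverse=True)
```

Only the prioritized list would then be passed to the AI summarizer and posted to the pull request.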

Centralized Rules

Maturity: Advanced
Description: Enforce organization-wide AI rules through a central Git repository that syncs language- and framework-specific guidance into standard assistant configuration files.

Related Patterns: Codified Rules, Progressive Disclosure, Security Orchestration, Harness Engineering Lens

Core Implementation

Sync-based Architecture (Recommended):

Central Rules Repository (Git)
  ├── base/universal-rules.md
  ├── languages/ (python.md, typescript.md, go.md)
  └── frameworks/ (react.md, django.md, fastapi.md)
           ↓
    [sync-ai-rules.sh]
           ↓
  Project Repository
    ├── CLAUDE.md (auto-generated)
    ├── AGENTS.md (auto-generated)
    └── .cursorrules (auto-generated)

How it works:

A central repository stores organization rules organized by language/framework.
A sync script detects the project language/framework (from files and dependencies).
The script generates standard configuration files that common assistants read automatically.
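The three steps above can be sketched in a few lines. This is an illustration only, not the contents of `sync-ai-rules.sh`: the marker-file mapping, the generated header, and the function names are assumptions, and framework detection from dependencies is left out for brevity.

```python
from pathlib import Path

# Hypothetical mapping from language to files that signal its presence.
LANGUAGE_MARKERS = {
    "python": ["pyproject.toml", "requirements.txt"],
    "typescript": ["tsconfig.json"],
    "go": ["go.mod"],
}
OUTPUT_FILES = ["CLAUDE.md", "AGENTS.md", ".cursorrules"]


def detect_languages(project: Path) -> list[str]:
    """Return languages whose marker files exist in the project root."""
    return sorted(lang for lang, markers in LANGUAGE_MARKERS.items()
                  if any((project / m).exists() for m in markers))


def sync_rules(rules_repo: Path, project: Path) -> list[str]:
    """Concatenate universal and language rules into each assistant config file."""
    sources = [rules_repo / "base" / "universal-rules.md"]
    sources += [rules_repo / "languages" / f"{lang}.md"
                for lang in detect_languages(project)]
    present = [p for p in sources if p.exists()]
    body = "\n\n".join(p.read_text() for p in present)
    header = "<!-- Auto-generated from the central rules repository; do not edit. -->\n\n"
    for name in OUTPUT_FILES:
        (project / name).write_text(header + body + "\n")
    return [p.name for p in present]
```

Writing the same body to every output file keeps the assistants consistent with each other, at the cost of ignoring any tool-specific formatting a real sync script might apply.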

Key benefits:

✅ Works with existing AI tools (no gateway required)
✅ Offline-friendly after initial sync
✅ Version-controlled and auditable in Git
✅ Language-aware via auto-detection

Alternative: Gateway Strategy (advanced use cases)

For organizations needing request/response filtering, policy enforcement, or usage logging, use the gateway approach:

Gateway Strategy

Complete Implementation

Sync Strategy - Simple Git-based sync (recommended)
Gateway Strategy - Central policy + logging + filters

Anti-pattern: Scattered Configuration

Copying AI rules into every repository without a central source causes drift, inconsistent enforcement, and manual update toil across teams.

Anti-Patterns Reference

Common AI Development Anti-Patterns

Foundation Anti-Patterns

Rushing Into AI: Starting AI adoption without proper assessment
Context Drift: Inconsistent AI rules across team members
Unrestricted Access: Allowing AI tools access to sensitive data
Ad-Hoc Development: Skipping structured development lifecycle

Development Anti-Patterns

Implementation-First AI: Writing code before defining acceptance criteria
Test Generation Without Strategy: Creating tests without coherent quality goals
Big Bang Generation: Attempting complex features in single AI interaction
Uncoordinated Multi-Tool Usage: Using multiple AI tools without orchestration
Black Box Systems: Insufficient logging for AI debugging
Unclear Boundaries: Ambiguous human-AI handoff points

Operations Anti-Patterns

Fragmented Security: Isolated security tools without unified framework
Alert Fatigue: Overwhelming developers with low-priority findings
Static Deployment: Fixed scripts without AI adaptation
Trusting AI Blue-Green Generation: Accepting AI-generated deployment configurations without validating them
Reactive Maintenance: Firefighting instead of proactive AI-assisted management
Blind Chaos Testing: Random fault injection without understanding dependencies

Implementation Guide

Getting Started

Phase 1: Foundation (Weeks 1-2)

Readiness Assessment - Evaluate team and codebase readiness
Codified Rules - Establish consistent AI coding standards
Security Sandbox - Implement secure AI tool isolation
Developer Lifecycle - Define structured development process
Issue Generation - Generate structured work items from requirements

Phase 2: Development (Weeks 3-4)

Spec-Driven Development - Implement specification-first approach
Progressive Enhancement - Practice iterative development
Adversarial Evaluator - Stress-test high-stakes decisions with an independent judge or across multiple frontier models
Atomic Decomposition - Break down complex features

Phase 3: Operations (Weeks 5-6)

Policy Generation - Codify compliance into executable policy files
Security Orchestration - Aggregate scanner findings into actionable summaries
Centralized Rules - Sync organization-wide AI standards from a central Git repo

Note: For teams practicing continuous delivery, implement security (Security Sandbox, Security Orchestration, Policy Generation) from week 1 alongside foundation patterns. The phases represent learning dependencies, not deployment sequences.

Success Metrics

Foundation Metrics

Team readiness score improvement
Consistent AI rule adherence across projects
Zero credential leaks in AI-generated code
Reduced onboarding time for new developers

Development Metrics

Test coverage maintenance (>90% for AI-generated code)
Reduced code review cycles
Faster feature delivery with maintained quality
Decreased debugging time for AI-generated issues

Operations Metrics

Automated policy compliance verification
Reduced deployment failures
Faster incident response with AI-generated runbooks
Proactive technical debt management

Contributing

Have a pattern that's working well for your team? Open an issue or PR to share your experience. The AI development landscape is evolving rapidly, and we're all learning together.

Pattern Contribution Guidelines

Follow the established pattern template (Maturity, Description, Related Patterns, Examples, Anti-patterns)
Include practical, runnable examples
Specify clear success criteria and anti-patterns
Reference related patterns appropriately
Test patterns with multiple AI tools when applicable

License

MIT License - See LICENSE file for details.
