Factory Mode - Agent Guide

Quick Reference for AI Agents

This guide teaches you how to use Ralph CLI's Factory Mode for complex, multi-stage workflows with verification gates.

What is Factory Mode?

Factory Mode is a meta-orchestration layer that lets you chain multiple Ralph commands (PRD → Plan → Build) into declarative pipelines with:

✅ Tamper-resistant verification - Can't fake success by claiming "tests pass!"
✅ Conditional branching - Different paths based on results
✅ Self-correcting loops - Retry failed stages with context
✅ Parallel execution - Run independent stages concurrently

Key principle: Verification gates check actual artifacts (git commits, test results, files) - not agent output text.

When to Use Factory Mode

✅ Use Factory Mode When:

Multi-stage workflows - PRD → Plan → Build → Test → Deploy chains
Quality gates required - Must verify tests pass, builds succeed, commits exist
Branching logic needed - Different paths based on complexity, test results, etc.
Self-correction needed - Auto-retry failed builds with accumulated context
Parallel execution - Build frontend + backend simultaneously
Recursive workflows - Keep trying until tests pass (with max attempts)

❌ Don't Use Factory Mode For:

Simple single PRD - Just use ralph prd, ralph plan, ralph build
Manual workflows - User wants control over each step
Exploratory work - Requirements unclear, need human input
One-off tasks - No need for verification or loops

Quick Start - Decision Tree

User request received
│
├─ "Build X feature" (single feature)
│  └─ Don't use factory - use: ralph prd → ralph plan → ralph build
│
├─ "Build and test X, auto-fix failures"
│  └─ USE FACTORY - needs verification + self-correction loop
│
├─ "Build full app with frontend + backend"
│  └─ USE FACTORY - needs parallel builds + integration tests
│
└─ "Create pipeline for X with quality gates"
   └─ USE FACTORY - needs verification stages

Core Commands

```bash
# Create new factory from scratch
ralph factory init my-factory

# Run factory pipeline
ralph factory run my-factory

# Check execution status
ralph factory status my-factory

# Resume from checkpoint (after failure)
ralph factory resume my-factory

# List all factories
ralph factory list

# Stop running factory
ralph factory stop my-factory

# Visualize dependency graph
ralph factory graph my-factory
```

Factory File Structure

```yaml
# .ralph/factory/my-factory.yaml
version: "1"
name: "my-factory"

# Optional variables (accessible in all stages)
variables:
  max_iterations: 10
  test_command: "npm test"

# Pipeline stages (executed in dependency order)
stages:
  - id: stage_one          # Unique identifier
    type: custom           # Stage type (prd|plan|build|custom|factory)
    command: "..."         # Shell command (for custom type)
    depends_on: []         # Stage IDs that must complete first
    condition: "..."       # Optional: only run if expression is true
    verify: []             # Verification gates (critical!)
    retry:
      max_attempts: 3
      delay: 1000
```
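
The `retry` block names `max_attempts` and `delay`, but this guide does not say how retries are scheduled or what unit `delay` uses. The following is a minimal sketch of one plausible reading, assuming `delay` is in milliseconds and is applied between attempts. The function `run_with_retry` and its `sleep` parameter are hypothetical illustrations, not Ralph internals.

```python
import time
from typing import Callable


def run_with_retry(
    stage: Callable[[], bool],
    max_attempts: int = 3,
    delay_ms: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[bool, int]:
    """Call `stage` until it returns True or attempts run out.

    Returns (succeeded, attempts_used). `delay_ms` is ASSUMED to be
    milliseconds; the guide does not state the unit of `delay`.
    """
    for attempt in range(1, max_attempts + 1):
        if stage():
            return True, attempt
        if attempt < max_attempts:
            sleep(delay_ms / 1000)  # wait before the next attempt
    return False, max_attempts
```

Under these assumptions, a stage that fails twice and then succeeds returns `(True, 3)`, and a stage that never succeeds stops after `max_attempts` calls.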

Stage Types Reference

| Type | Purpose | Required Fields | Example Use |
|------|---------|-----------------|-------------|
| `prd` | Generate PRD | `input.request` | Create requirements document |
| `plan` | Create plan from PRD | `depends_on` (prd stage) | Break PRD into stories |
| `build` | Execute stories | `config.iterations` | Implement features |
| `custom` | Run shell command | `command` | Run tests, deploy, custom scripts |
| `factory` | Nested factory | `factory_name` | Recursive factory workflows |

Verification Gates - THE CRITICAL PART

Why Verification Matters

Problem: Agents can claim success without doing work:

Agent output: "✓ All 21 tests pass!"
Agent exit code: 0

Solution: Verification gates check actual artifacts:

Verifier runs: npm test
Actual result: 8 failed, 13 passed
Verification: FAILED ❌

Verifier Types

| Type | Checks | Use When |
|------|--------|----------|
| `test_suite` | Runs tests, counts pass/fail | Validating implementation |
| `build_success` | Runs build, checks exit code | Ensuring code compiles |
| `git_commits` | Checks git log for commits | Ensuring work committed |
| `git_files_changed` | Checks git diff for file changes | Verifying specific files modified |
| `file_exists` | Checks filesystem for files | Ensuring files created |
| `file_contains` | Reads file, matches pattern | Validating file content |
| `lint_pass` | Runs linter, checks errors/warnings | Code quality |
| `custom` | Runs command, matches success pattern | Custom checks |
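
To make "checks actual artifacts" concrete, here is a rough sketch of what the two filesystem verifiers in the table might amount to. It is an illustration only: the function names `file_exists_gate` and `file_contains_gate`, the `root` parameter, and the use of a regular expression for the pattern are assumptions, not Ralph's actual implementation or configuration keys.

```python
import re
from pathlib import Path


def file_exists_gate(files: list[str], root: str = ".") -> list[str]:
    """Return the listed paths that are missing under `root` (empty list = pass)."""
    return [f for f in files if not (Path(root) / f).is_file()]


def file_contains_gate(path: str, pattern: str) -> bool:
    """Pass only if the file exists and its text matches the regex `pattern`."""
    target = Path(path)
    if not target.is_file():
        return False
    return re.search(pattern, target.read_text(encoding="utf-8")) is not None
```

Both checks read the filesystem directly, so nothing the agent prints can change their outcome.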

Critical Examples

✅ GOOD: Test Suite Verification

```yaml
- id: implement_feature
  type: build
  config:
    iterations: 10
  verify:
    - type: test_suite
      command: "npm test"
      min_passing: 21    # MUST have 21 passing tests
      max_failing: 0     # ZERO failures allowed
```

Why this works: Factory Mode runs npm test itself and parses the Jest/Mocha output, so the agent cannot fake the result.
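
As a hedged illustration of "run the command and parse the output", the sketch below counts passes and failures from a test runner's summary and compares them with `min_passing` and `max_failing`. The summary formats in the comments are assumptions about typical Jest and Mocha output, and `parse_counts`, `check_test_suite`, and `run_and_verify` are hypothetical names, not Ralph's parser.

```python
import re
import subprocess

# ASSUMED summary formats: Jest prints "Tests: 8 failed, 13 passed, 21 total";
# Mocha prints "13 passing" and "8 failing" on separate lines.
_PASS = re.compile(r"(\d+)\s+(?:passed|passing)\b")
_FAIL = re.compile(r"(\d+)\s+(?:failed|failing)\b")


def parse_counts(output: str) -> tuple[int, int]:
    """Return (passed, failed) parsed from test-runner output."""
    # Prefer Jest's "Tests:" line so the "Test Suites:" line is not miscounted.
    jest_line = next(
        (line for line in output.splitlines() if line.strip().startswith("Tests:")),
        None,
    )
    text = jest_line if jest_line is not None else output
    passed = _PASS.search(text)
    failed = _FAIL.search(text)
    return (
        int(passed.group(1)) if passed else 0,
        int(failed.group(1)) if failed else 0,
    )


def check_test_suite(output: str, min_passing: int, max_failing: int = 0) -> bool:
    """Pass only if the parsed counts meet both thresholds."""
    passed, failed = parse_counts(output)
    return passed >= min_passing and failed <= max_failing


def run_and_verify(command: str, min_passing: int, max_failing: int = 0) -> bool:
    """Run the real command and judge its output, ignoring any agent claim."""
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    return check_test_suite(proc.stdout + proc.stderr, min_passing, max_failing)
```

Note that a bare claim such as "All 21 tests pass!" contains no recognizable summary, so it parses as zero passing tests and the gate fails.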

❌ BAD: No Verification

```yaml
- id: implement_feature
  type: build
  config:
    iterations: 10
  # NO VERIFY BLOCK - agent can claim success without testing
```

✅ GOOD: Multi-Gate Verification

```yaml
verify:
  - type: test_suite        # Tests must pass
    command: "npm test"
    min_passing: 20

  - type: build_success     # Code must compile
    command: "npm run build"

  - type: git_commits       # Work must be committed
    min_commits: 1
    pattern: "^(feat|fix):"

  - type: lint_pass         # Code must be clean
    command: "npm run lint"
    max_errors: 0
```
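
The `git_commits` gate above can be pictured as counting commit subjects that match `pattern`. The sketch below assumes the pattern is a regular expression applied to each subject line. How Ralph chooses the baseline commit is not described in this guide, so `since_ref` is a hypothetical parameter, and both function names are illustrative.

```python
import re
import subprocess


def commit_subjects(since_ref: str, repo: str = ".") -> list[str]:
    """Subjects of commits reachable from HEAD but not from `since_ref`."""
    out = subprocess.run(
        ["git", "log", "--format=%s", f"{since_ref}..HEAD"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def check_git_commits(subjects: list[str], min_commits: int, pattern: str) -> bool:
    """Pass if at least `min_commits` subjects match the regex `pattern`."""
    matching = [s for s in subjects if re.search(pattern, s)]
    return len(matching) >= min_commits
```

With `pattern: "^(feat|fix):"` and `min_commits: 1`, a history containing only "wip" commits would fail this check even if the agent reported that work was committed.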

Conditional Execution & Branching

Available Context Variables

| Variable | Type | Description |
|----------|------|-------------|
| `{{ stages.STAGE_ID.passed }}` | Boolean | Stage completed successfully |
| `{{ stages.STAGE_ID.failed }}` | Boolean | Stage failed |
| `{{ stages.STAGE_ID.output }}` | Object | Stage output data |
| `{{ stages.STAGE_ID.exit_code }}` | Number | Exit code |
| `{{ variables.VAR_NAME }}` | Any | Factory variable |
| `{{ learnings }}` | String | Accumulated learnings |
| `{{ recursion_count }}` | Number | Current loop iteration |

Example: Conditional Branching

```yaml
stages:
  - id: run_tests
    type: custom
    command: "npm test"

  # Only runs if tests FAILED
  - id: fix_failures
    type: prd
    depends_on: [run_tests]
    condition: "{{ stages.run_tests.failed }}"
    input:
      request: "Fix test failures: {{ stages.run_tests.error_summary }}"

  # Only runs if tests PASSED
  - id: deploy
    type: custom
    depends_on: [run_tests]
    condition: "{{ stages.run_tests.passed }}"
    command: "npm run deploy"
```
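
One way to think about a `condition` is as a lookup of a dotted path in the run context. The sketch below handles only the simplest form, a single `{{ dotted.path }}`, and deliberately rejects compound expressions such as those using `&&`. It is an assumption-laden illustration; `resolve` and `should_run` are hypothetical names and Ralph's real expression evaluator may behave differently.

```python
import re
from typing import Any, Optional

# Matches exactly one "{{ dotted.path }}" and nothing else.
_EXPR = re.compile(r"^\{\{\s*([A-Za-z_][\w.]*)\s*\}\}$")


def resolve(path: str, context: dict[str, Any]) -> Any:
    """Walk a dotted path such as 'stages.run_tests.failed' through nested dicts."""
    value: Any = context
    for key in path.split("."):
        value = value[key]
    return value


def should_run(condition: Optional[str], context: dict[str, Any]) -> bool:
    """A stage with no condition always runs; otherwise the looked-up value decides."""
    if condition is None:
        return True
    match = _EXPR.match(condition.strip())
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    return bool(resolve(match.group(1), context))
```

Given a context where `run_tests` failed, the `fix_failures` condition evaluates to true and the `deploy` condition to false, so only one branch runs.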

Recursive Loops (Self-Correction)

Pattern: Test → Fix → Retry

```yaml
stages:
  - id: implement
    type: build
    config:
      iterations: 10

  - id: verify_tests
    type: custom
    depends_on: [implement]
    command: "npm test"
    verify:
      - type: test_suite
        command: "npm test"
        min_passing: 20
        max_failing: 0

  # Loop back if tests fail (max 3 attempts)
  - id: fix_failures
    type: prd
    depends_on: [verify_tests]
    condition: "{{ stages.verify_tests.failed && recursion_count < 3 }}"
    input:
      request: "Fix: {{ stages.verify_tests.verification.failures }}"
    loop_to: implement       # Go back to implement stage
    max_loops: 3             # Safety limit
```

Flow:

implement → verify_tests → PASSED → done ✓
                    ↓
                 FAILED
                    ↓
              fix_failures (recursion_count = 1)
                    ↓
         loop_to: implement
                    ↓
         implement → verify_tests → PASSED → done ✓

Safety: max_loops prevents infinite loops. Always set this!
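
The flow above can be sketched as a bounded loop. This is a simplified model, not Ralph's scheduler: it assumes `recursion_count` starts at 0 and is incremented just before each fix pass, and the `self_correct` function with its three callables is a hypothetical stand-in for the `implement`, `verify_tests`, and `fix_failures` stages.

```python
from typing import Callable


def self_correct(
    implement: Callable[[], None],
    verify: Callable[[], bool],
    fix: Callable[[int], None],
    max_loops: int = 3,
) -> tuple[bool, int]:
    """Run implement -> verify, looping through fix at most `max_loops` times.

    Returns (passed, recursion_count), where recursion_count is the number
    of fix passes that were needed.
    """
    recursion_count = 0
    while True:
        implement()
        if verify():
            return True, recursion_count
        if recursion_count >= max_loops:
            return False, recursion_count  # safety limit reached; stop looping
        recursion_count += 1
        fix(recursion_count)
```

Because the limit is checked on every failed verification, the loop always terminates, which is the property `max_loops` is meant to guarantee.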

Common Patterns

Pattern 1: Simple PRD → Plan → Build Pipeline

```yaml
version: "1"
name: "feature-pipeline"

variables:
  feature_request: "Add user authentication"

stages:
  - id: generate_prd
    type: prd
    input:
      request: "{{ variables.feature_request }}"

  - id: create_plan
    type: plan
    depends_on: [generate_prd]

  - id: build_feature
    type: build
    depends_on: [create_plan]
    config:
      iterations: 10
    verify:
      - type: test_suite
        command: "npm test"
        min_passing: 5
```

Pattern 2: Parallel Execution with Merge

```yaml
stages:
  - id: setup
    type: custom
    command: "npm install"

  # These run in PARALLEL (both depend only on setup)
  - id: build_frontend
    type: build
    depends_on: [setup]
    config:
      iterations: 8

  - id: build_backend
    type: build
    depends_on: [setup]
    config:
      iterations: 8

  # Merge point - waits for BOTH to complete
  - id: integration_test
    type: custom
    depends_on: [build_frontend, build_backend]
    command: "npm run test:integration"
```
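
To see why the two build stages can run concurrently, it helps to group stages into "waves" whose dependencies are already satisfied. The sketch below does this with Python's standard `graphlib.TopologicalSorter`. It illustrates the general idea of dependency ordering only; the function `execution_waves` is hypothetical and says nothing about how Ralph actually schedules stages.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def execution_waves(stages: list[dict]) -> list[list[str]]:
    """Group stage ids into waves; every stage in a wave has all deps satisfied."""
    graph = {s["id"]: set(s.get("depends_on", [])) for s in stages}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular depends_on
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # these could run concurrently
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the pipeline above this yields three waves: `setup` alone, then `build_backend` and `build_frontend` together, then `integration_test` as the merge point.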

Pattern 3: Self-Correcting Pipeline

```yaml
stages:
  - id: implement
    type: build
    config:
      iterations: 10

  - id: run_tests
    type: custom
    depends_on: [implement]
    command: "npm test"
    verify:
      - type: test_suite
        command: "npm test"
        min_passing: 20
        max_failing: 0

  - id: fix_issues
    type: prd
    depends_on: [run_tests]
    condition: "{{ stages.run_tests.failed && recursion_count < 3 }}"
    input:
      request: "Fix failures: {{ stages.run_tests.verification.failures }}"
    loop_to: implement
    max_loops: 3

  - id: deploy
    type: custom
    depends_on: [run_tests]
    condition: "{{ stages.run_tests.passed }}"
    command: "npm run deploy"
```

Agent Workflow - Step by Step

When User Requests Factory Workflow:

1. Understand requirements
   - What stages are needed?
   - What verification is required?
   - Any branching/looping needed?
2. Create factory YAML
   ```
   ralph factory init my-factory
   ```
   - Edit .ralph/factory/my-factory.yaml
   - Define stages with proper dependencies
   - ADD VERIFICATION GATES (critical!)
   - Set up conditions for branching
   - Configure loops with max_loops
3. Run factory
   ```
   ralph factory run my-factory
   ```
4. Monitor progress
   ```
   ralph factory status my-factory
   ```
5. Handle failures
   - Check logs: .ralph/factory/runs/run-TIMESTAMP/execution.log
   - Check stage results: .ralph/factory/runs/run-TIMESTAMP/stages/STAGE_ID/result.json
   - Resume if needed: ralph factory resume my-factory
6. Report results
   - Show which stages completed
   - Report verification results
   - Show final artifacts (files created, tests passed, commits made)

Critical Rules for Agents

✅ DO:

Always add verification gates to critical stages
Use specific verifiers (test_suite, not just custom)
Set max_loops on recursive stages
Use depends_on to enforce ordering
Check actual artifacts (git commits, files, test results)
Set reasonable timeouts for long-running stages
Use variables for configuration values

❌ DON'T:

Trust agent output text - always verify with gates
Skip verification on build/test stages
Create infinite loops - always set max_loops
Use exit code alone - use test_suite verifier
Forget dependencies - stages run in parallel without depends_on
Hardcode values - use variables for flexibility
Ignore verification failures - they mean real problems

Debugging Tips

Check Execution Status

```bash
# Current status
ralph factory status my-factory

# View full log
cat .ralph/factory/runs/run-TIMESTAMP/execution.log

# Check specific stage result
cat .ralph/factory/runs/run-TIMESTAMP/stages/stage_id/result.json
```
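
If you need to inspect results programmatically rather than with `cat`, a small helper along these lines may be useful. It relies only on the directory layout shown above. The assumption that `run-TIMESTAMP` names sort chronologically as plain strings is unverified, the contents of `result.json` are not specified in this guide, and the names `latest_run_dir` and `stage_result` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Optional


def latest_run_dir(runs_root: str = ".ralph/factory/runs") -> Optional[Path]:
    """Newest run directory, ASSUMING run-TIMESTAMP names sort chronologically."""
    runs = sorted(p for p in Path(runs_root).glob("run-*") if p.is_dir())
    return runs[-1] if runs else None


def stage_result(run_dir: Path, stage_id: str) -> Any:
    """Load stages/<stage_id>/result.json from a run directory."""
    path = run_dir / "stages" / stage_id / "result.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

The helper returns whatever JSON the file holds without interpreting it, since the result schema is not documented here.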

Common Issues

| Issue | Cause | Fix |
|-------|-------|-----|
| Stage skipped | Condition evaluated to false | Check condition expression syntax |
| Verification failed | Actual results don't match requirements | Check verify config, run command manually |
| Infinite loop | No exit condition | Add max_loops or proper condition |
| Stage never runs | Missing dependency | Check depends_on chain |
| Wrong execution order | Missing depends_on | Add proper dependencies |

Examples from Codebase

Example 1: Math Challenge (Verification Demo)

File: .ralph/factory/math-challenge.yaml

Purpose: Demonstrates that agents cannot fake test results.

```yaml
stages:
  - id: implement_math_lib
    type: custom
    command: |
      # Agent claims success but uses buggy code
      echo "✓ All tests pass! 21 tests, 0 failures!"
      exit 0

    # Verification ACTUALLY RUNS TESTS
    verify:
      - type: test_suite
        command: "npm test"
        min_passing: 21    # All math tests must pass
        max_failing: 0
```

Result: Stage fails because verification actually runs npm test and finds failures.

Example 2: Wedding Planner (Full Pipeline)

File: .ralph/factory/wedding-planner-simple.yaml

Purpose: Build a complete feature with PRD → Plan → Build → Test flow.

```yaml
stages:
  - id: setup_directory
    type: custom
    command: "mkdir -p project && ..."
    verify:
      - type: file_exists
        files: ["project/package.json"]

  - id: generate_guest_prd
    type: prd
    depends_on: [setup_directory]
    input:
      request: "Create guest management module with CRUD..."

  - id: plan_guest_module
    type: plan
    depends_on: [generate_guest_prd]

  - id: build_guest_module
    type: build
    depends_on: [plan_guest_module]
    config:
      iterations: 10
    verify:
      - type: file_exists
        files: ["project/src/guests.js"]

  - id: run_tests
    type: custom
    depends_on: [build_guest_module]
    command: "cd project && npm test"
    verify:
      - type: test_suite
        command: "cd project && npm test"
        min_passing: 1
```

Example 3: Wedding Planner Website (Complex Multi-Stage)

File: .ralph/factory/wedding-planner-website.yaml

Purpose: Full-stack app with parallel builds, integration tests, and self-correction.

Key features:

20 stages total
Parallel frontend + backend builds
Integration tests with verification
Self-correcting loop on test failures
Conditional deployment (only if tests pass)
Documentation generation stage

Structure:

Setup → Database → Backend ↘
                            → Integration Tests → Fix Loop / Deploy
Setup → Database → Frontend ↗

Response Templates

When Creating Factory

I'll create a factory pipeline for [task]. This requires:

1. **Stages**:
   - [Stage 1]: [Purpose]
   - [Stage 2]: [Purpose]
   - ...

2. **Verification**:
   - [What will be verified]

3. **Branching/Loops** (if applicable):
   - [Conditions and flow]

Creating factory configuration...

After Factory Completes

Factory "[name]" completed successfully!

**Results**:
- ✅ Stage 1 ([id]): [Result]
- ✅ Stage 2 ([id]): [Result]
- ...

**Verification Results**:
- ✅ Tests: [X] passed, 0 failed
- ✅ Build: Success
- ✅ Commits: [X] commits created

**Artifacts**:
- Files created: [list]
- PRDs generated: [list]
- Test results: [summary]

When Factory Fails

Factory "[name]" failed at stage [stage_id].

**Error**: [Error message]

**Verification Failed**:
- Expected: [requirements]
- Actual: [results]

**Logs**: `.ralph/factory/runs/run-TIMESTAMP/execution.log`

**Next steps**:
1. Review stage output in logs
2. Fix the issue
3. Resume: `ralph factory resume [name]`

Checklist Before Running Factory

Each stage has unique id
Dependencies specified with depends_on
Critical stages have verify blocks
Verification thresholds match actual test counts
Conditions use correct syntax: {{ expression }}
Loops have max_loops to prevent infinite execution
Variables defined before use
Commands properly escaped in YAML
Stage types are correct (prd|plan|build|custom|factory)
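
Several checklist items can be checked mechanically before a run. The sketch below lints a list of stage dictionaries (for example, the `stages` list after the YAML has been parsed by a loader of your choice) for duplicate ids, unknown types, dangling `depends_on` entries, and `loop_to` without `max_loops`. It is a hypothetical pre-flight helper, not a Ralph command, and it covers only the checks named here.

```python
VALID_TYPES = {"prd", "plan", "build", "custom", "factory"}


def lint_factory(stages: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    ids = [s.get("id") for s in stages]
    # Report each duplicated id once, in a stable order.
    for dup in sorted({i for i in ids if i is not None and ids.count(i) > 1}):
        problems.append(f"duplicate stage id: {dup}")
    known = set(ids)
    for s in stages:
        sid = s.get("id", "<missing id>")
        if s.get("type") not in VALID_TYPES:
            problems.append(f"{sid}: unknown type {s.get('type')!r}")
        for dep in s.get("depends_on", []):
            if dep not in known:
                problems.append(f"{sid}: depends_on unknown stage {dep!r}")
        if "loop_to" in s and "max_loops" not in s:
            problems.append(f"{sid}: loop_to without max_loops")
    return problems
```

An empty result does not prove the factory is correct; items such as verification thresholds and YAML escaping still need a manual look.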

Summary

Factory Mode = Declarative Pipelines + Tamper-Resistant Verification

Key Takeaways:

Always use verification gates - Don't trust agent output
Use test_suite verifier for test validation
Set max_loops on recursive stages
Use depends_on for proper ordering
Check git commits, test results, files - not text output
Follow the examples in .ralph/factory/ directory

Remember: Verification gates are what make Factory Mode powerful. Without them, agents can claim success without doing work. With them, you have proof-of-work that cannot be faked.
