Comprehensive guide to testing Ralph CLI components
- Test Organization
- Running Tests
- UI Testing with agent-browser
- Common Test Scenarios
- Writing New Tests
- Best Practices
All test files MUST be in the /tests directory. This is a strict requirement.
tests/
├── *.mjs # Integration and E2E tests
│ ├── cli-smoke.mjs # CLI smoke tests
│ ├── agent-loops.mjs # Agent loop behavior
│ ├── agent-ping.mjs # Agent health checks
│ ├── integration.mjs # Main integration suite
│ ├── integration-actions.mjs # Actions integration
│ ├── integration-checkpoint.mjs # Checkpoint system
│ ├── integration-doctor.mjs # Doctor command
│ ├── integration-metrics.mjs # Metrics collection
│ ├── integration-notify.mjs # Notification system
│ ├── integration-risk.mjs # Risk analysis
│ ├── integration-switcher.mjs # Agent switcher
│ ├── integration-ui-api.mjs # UI API integration
│ ├── integration-watch.mjs # File watching
│ ├── e2e-workflow.mjs # End-to-end workflows
│ ├── real-agents.mjs # Real agent execution
│ └── lib-python.mjs # Python library tests
│
├── test-*.js # Unit tests
│ ├── test-analyzer.js # Code analyzer
│ ├── test-committer.js # Git committer
│ ├── test-complexity.js # Complexity analysis
│ ├── test-context-budget.js # Context budget
│ ├── test-context-directives.js # Context directives
│ ├── test-context-scorer.js # Context scoring
│ ├── test-context-selector.js # Context selection
│ ├── test-context-visualization.js # Context visualization
│ ├── test-error-handling.js # Error handling
│ ├── test-executor.js # Story executor
│ ├── test-executor-us003.js # Specific user stories
│ ├── test-git-fallback.js # Git fallback
│ ├── test-merger.js # Branch merger
│ ├── test-parallel-index.js # Parallel execution
│ ├── test-realistic-scenarios.js # Realistic workflows
│ ├── test-risk-analyzer.js # Risk analyzer
│ ├── test-token-usage.js # Token usage tracking
│ └── test-with-anthropic-api.js # Anthropic API integration
│
├── fixtures/ # Test fixtures and sample data
├── helpers/ # Test utility functions
└── mocks/ # Mock implementations
✅ DO:
- Place ALL test files in the /tests directory
- Use the .mjs extension for integration and E2E tests
- Use the test-*.js naming pattern for unit tests
- Use subdirectories (fixtures/, helpers/, mocks/) for supporting files
- Keep test file names descriptive and consistent
❌ DON'T:
- Place test files in /lib, /bin, or any source directory
- Mix test files with production code
- Use inconsistent naming conventions
- Create test files in the project root
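The placement rules above can be checked mechanically. The following is a minimal sketch, not part of Ralph CLI: the function name find_misplaced_tests is hypothetical, and it assumes the only test naming patterns are the ones this guide lists (test-*.js, integration-*.mjs, e2e-*.mjs). It skips node_modules, .git, the top-level .ralph directory, and tests itself.

```shell
#!/bin/bash
# Hypothetical helper (not part of Ralph CLI): list files that look like
# tests but live outside tests/. Prints nothing when the tree is clean.
find_misplaced_tests() {
  local root="${1:-.}"
  find "$root" \
    \( -path "$root/node_modules" -o -path "$root/.git" \
       -o -path "$root/.ralph" -o -path "$root/tests" \) -prune \
    -o -type f \
    \( -name 'test-*.js' -o -name 'integration-*.mjs' -o -name 'e2e-*.mjs' \) \
    -print
}

# Usage (from the project root): find_misplaced_tests .
```

Any path it prints is a candidate for moving into /tests.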
# Smoke tests - fast validation
npm test
# Agent health check
npm run test:ping
# All integration tests
npm run test:all
# Specific integration tests
npm run test:checkpoint # Checkpoint system
npm run test:switcher # Agent switching
npm run test:risk # Risk analysis
npm run test:actions # Actions workflow
npm run test:notify # Notifications
npm run test:metrics # Metrics collection
npm run test:doctor # Doctor diagnostics
npm run test:watch # File watching
npm run test:ui-api # UI API
# End-to-end workflow
npm run test:e2e
# Real agent execution (requires configured agents)
npm run test:real
# With coverage reporting
npm run test:coverage
# Integration tests with environment flag
RALPH_INTEGRATION=1 npm test
- Smoke Tests (*.mjs) - Quick validation, no real agent needed
- Integration Tests (integration-*.mjs) - Multiple components, may require mock/real agents
- Unit Tests (test-*.js) - Isolated module tests
- E2E Tests (e2e-*.mjs) - Full workflow simulations
- Real Agent Tests - Execute against actual Claude/Codex/Droid agents (requires API keys)
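Because each category maps to a file-name pattern, a whole category can also be run by globbing. The sketch below is illustrative only: run_test_files and the RUNNER variable are hypothetical names, and it assumes each test file can be executed directly with node and signals failure through a non-zero exit code.

```shell
#!/bin/bash
# Hypothetical runner: execute each test file with $RUNNER (default: node)
# and report how many failed. Returns non-zero if any file fails.
run_test_files() {
  local runner="${RUNNER:-node}" failed=0 file
  for file in "$@"; do
    if "$runner" "$file"; then
      echo "PASS $file"
    else
      echo "FAIL $file"
      failed=$((failed + 1))
    fi
  done
  echo "$failed of $# file(s) failed"
  [ "$failed" -eq 0 ]
}

# Usage (from the project root): run_test_files tests/test-*.js
```

Running every file even after a failure gives a full picture in one pass, which is usually more useful than stopping at the first broken test.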
Vercel's agent-browser - Fast Rust-based CLI for browser automation, optimized for AI agents.
Why agent-browser?
- ✅ Fast & reliable: Rust CLI with Node.js fallback
- ✅ AI-optimized: Snapshot + ref workflow (@e1, @e2) for deterministic selection
- ✅ No buggy MCP: Standalone CLI tool
- ✅ Persistent sessions: Isolated browser instances with cookies/storage
- ✅ JSON output: Machine-readable results
# Install agent-browser
npm install -g agent-browser
agent-browser install # Downloads Chromium
# Start UI server
cd ui && npm run dev
# Initialize browser session
agent-browser open http://localhost:3000
# Open a URL
agent-browser open http://localhost:3000
# Go back/forward
agent-browser back
agent-browser forward
# Reload page
agent-browser reload
# Take snapshot (see all interactive elements)
agent-browser snapshot -i # Interactive elements only
agent-browser snapshot -c # Compact format
agent-browser snapshot # Full snapshot
# Take screenshot
agent-browser screenshot page.png
agent-browser screenshot --full page.png # Full page scroll
# Get page title
agent-browser eval "document.title"
# Get current URL
agent-browser eval "window.location.href"
# Click elements
agent-browser click @e1 # Use @eN ref from snapshot
agent-browser click "button:has-text('Start')" # CSS selector
agent-browser click "[role=button]" # Attribute selector
# Type text
agent-browser type @e17 "PRD-67"
agent-browser fill @e17 "PRD-67" # Clears the field first, then enters the text
# Press keys
agent-browser press Enter
agent-browser press Escape
agent-browser press "Control+a"
# Select dropdown
agent-browser select @e12 "Codex" # Select by value/text
# Get text content
agent-browser get text @e1
agent-browser get text "h1"
# Get attribute value
agent-browser get attribute @e1 "href"
agent-browser get attribute "button" "disabled"
# Check visibility
agent-browser is visible @e1
agent-browser is visible "button:has-text('Start Build')"
# Check if element exists
agent-browser find "button:has-text('Start Build')"
# Get all matching elements
agent-browser find-all ".stream-card"
# Run JavaScript
agent-browser eval "document.querySelectorAll('.stream-card').length"
agent-browser eval "localStorage.getItem('theme')"
# Wait for element
agent-browser wait-for "text=Build completed"
agent-browser wait-for "[data-status='running']"
# Check console messages
agent-browser console
agent-browser errors
# Network activity
agent-browser network requests
agent-browser network responses
# Navigate to dashboard
agent-browser open http://localhost:3000
agent-browser click @e1 # Click "Press Enter"
agent-browser click @e1 # Click "Back to Dashboard"
# Verify elements are visible
agent-browser snapshot -i
agent-browser is visible "button:has-text('Start Build')"
agent-browser is visible "[data-testid='stream-select']"
# Check for errors
agent-browser console
agent-browser errors
# Get available streams
agent-browser get text @e18 # Stream listbox
# Select a specific stream
agent-browser click @e18 # Open dropdown
agent-browser type "PRD-67" # Type to search
agent-browser press Enter # Select
# Verify selection
agent-browser get text "[data-testid='selected-stream']"
# Set iterations
agent-browser fill @e11 "5" # Iterations spinbutton
# Select agent
agent-browser click @e12 # Open agent dropdown
agent-browser click @e14 # Select "Codex"
# Toggle dry run
agent-browser click @e20 # Dry run checkbox
# Verify form state
agent-browser get attribute @e20 "checked"
agent-browser get text @e12 # Selected agent
# Navigate to Streams page
agent-browser click @e3 # Streams link
agent-browser snapshot -i # See stream cards
# Navigate to Logs page
agent-browser click @e5 # Logs link
agent-browser snapshot -i # See log viewer
# Navigate to Documentation
agent-browser click @e4 # Documentation link
agent-browser snapshot -i # See docs
# Navigate to Streams page
agent-browser click @e3
# Take snapshot to find buttons
agent-browser snapshot -i
# Click "Monitor" for first stream
agent-browser click @e13
# Verify modal/page opened
agent-browser snapshot -i
# Close modal (if applicable)
agent-browser press Escape
# Go to Streams page
agent-browser click @e3
# Find search input
agent-browser snapshot -i
# Search for specific PRD
agent-browser type @e17 "PRD-67"
# Verify filtered results
agent-browser eval "document.querySelectorAll('.stream-card').length"
# Open dashboard
agent-browser open http://localhost:3000
# Start a build in another terminal:
# ralph build 1 --prd=67
# Watch for status updates
agent-browser wait-for "text=running"
agent-browser wait-for "[data-status='running']"
# Check real-time progress
agent-browser get text "[data-testid='build-status']"
# Take screenshot
agent-browser screenshot build-running.png
# Try to build without selecting stream
agent-browser click "button:has-text('Start Build')"
# Check for error message
agent-browser wait-for "text=Please select"
agent-browser snapshot -i
# Check console for errors
agent-browser errors
# Navigate to Logs
agent-browser click @e5
# Wait for logs to load
agent-browser wait-for "[data-testid='log-entries']"
# Get log count
agent-browser eval "document.querySelectorAll('[data-log-entry]').length"
# Filter logs
agent-browser fill "[data-testid='log-filter']" "ERROR"
# Verify filtered results
agent-browser snapshot -i
# Navigate to Tokens page
agent-browser click @e6
# Verify cost data loads
agent-browser wait-for "text=$" # Wait for cost to appear
agent-browser snapshot -i
# Get total cost
agent-browser get text "[data-testid='total-cost']"
# Check chart renders
agent-browser is visible "canvas" # Chart.js renders to canvas
# Open headed browser for visual debugging
BROWSER_HEADLESS=false agent-browser open http://localhost:3000
# Slow down actions for observation
BROWSER_SLOW_MO=500 agent-browser click @e1
# Keep browser open after script
BROWSER_KEEP_ALIVE=true agent-browser open http://localhost:3000
# Verbose output
DEBUG=* agent-browser open http://localhost:3000
# Save HTML for inspection
agent-browser eval "document.documentElement.outerHTML" > page.html
# Quick snapshot of homepage
.agents/ralph/test-ui.sh snapshot
# Test PRD list page (automated)
.agents/ralph/test-ui.sh test-list
# Test logs page (automated)
.agents/ralph/test-ui.sh test-logs
# Interactive mode (opens headed browser)
.agents/ralph/test-ui.sh interactive
# Clean up browser session
.agents/ralph/test-ui.sh cleanup
# Custom UI URL
UI_URL=http://localhost:8080 .agents/ralph/test-ui.sh snapshot
The Ralph UI server uses the RALPH_ROOT environment variable:
Production mode (default):
# Uses parent directory's .ralph/ (ralph-cli/.ralph)
cd ui && npm run dev
Test mode:
# Uses ui/.ralph/ for isolated testing
cd ui && npm run dev:test
Custom RALPH_ROOT:
# Point to any .ralph directory
RALPH_ROOT=/path/to/.ralph npm run dev
Always use the /tests directory.
- .mjs for integration/E2E tests
- .js for unit tests
- Integration: integration-feature-name.mjs
- Unit: test-component-name.js
- E2E: e2e-workflow-name.mjs
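The naming rules for new tests can be expressed as a small classifier. This is a sketch under stated assumptions, not project code: classify_test_file is a hypothetical name, and it only recognises the three patterns listed above, so older files such as cli-smoke.mjs are reported as unknown.

```shell
#!/bin/bash
# Hypothetical helper: map a test file name to its category using the
# naming patterns from this guide. Returns non-zero for unrecognised names.
classify_test_file() {
  local name
  name=$(basename "$1")
  case "$name" in
    integration-*.mjs) echo "integration" ;;
    e2e-*.mjs)         echo "e2e" ;;
    test-*.js)         echo "unit" ;;
    *)                 echo "unknown"; return 1 ;;
  esac
}

# Usage: classify_test_file tests/integration-my-feature.mjs
```

A check like this could run before committing a new test file to catch names that drift from the convention.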
If adding new npm scripts, update package.json:
{
  "scripts": {
    "test:my-feature": "node tests/integration-my-feature.mjs"
  }
}
Add comments for complex test logic:
// Test PRD status detection with direct-to-main workflow
// This verifies git commits are used as source of truth, not checkboxes
test('detects completed PRDs via git log', async () => {
  // ... test logic
});
- Isolation - Tests should not depend on each other
- Cleanup - Clean up any created files/state after tests
- Fast - Keep unit tests fast; use mocks when possible
- Descriptive - Use clear test names and assertions
- Maintainable - Keep tests simple and focused
- Documented - Add comments for complex test logic
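The isolation and cleanup points can be combined in one small wrapper. A minimal sketch, assuming tests are launched from a shell script: with_temp_workspace is a hypothetical helper that runs a command inside a throwaway directory and removes that directory afterwards, whether the command passed or failed.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command in a fresh temporary directory and
# always delete that directory, preserving the command's exit status.
with_temp_workspace() {
  local workdir status=0
  workdir=$(mktemp -d) || return 1
  # Subshell keeps the cd from leaking into the caller; "|| status=$?"
  # records a failure without aborting under "set -e".
  ( cd "$workdir" && "$@" ) || status=$?
  rm -rf "$workdir"
  return "$status"
}

# Usage: with_temp_workspace sh -c 'git init -q . && ls -a'
```

Because every invocation gets its own directory, files left behind by one test cannot influence the next.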
- Always snapshot first - Use agent-browser snapshot -i to see page state
- Use semantic selectors - Prefer button:has-text('Start') over brittle @eN refs
- Add delays - Give dynamic content time to load (sleep 2 or wait-for)
- Check console errors - Run agent-browser errors after interactions
- Take screenshots - Visual evidence: agent-browser screenshot test.png
- Test unhappy paths - Try invalid inputs, missing data, error states
- Verify state changes - Check text/attributes after actions
- Clean up - Close browser sessions when done
- Use scripts - Automate repetitive tests
- Document findings - Save screenshots and error logs
❌ DON'T:
- Rely on element refs (@eN) - they change on every snapshot
- Assume instant loads - elements may not be ready
- Ignore console errors - they indicate real issues
- Test only with mouse - users also use keyboard
- Skip visual verification - some bugs are visual only
✅ DO:
- Use semantic selectors like button:has-text('Start Build')
- Use wait-for or add sleep delays for async content
- Check agent-browser errors after each interaction
- Test keyboard navigation with press Tab, press Enter
- Take screenshots and compare against expected states
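Where a fixed sleep is too slow or too fragile, a bounded retry loop is a common alternative. The sketch below is an illustration, not an agent-browser feature: wait_until is a hypothetical helper, and using it with agent-browser assumes the wrapped command (for example is visible) exits non-zero when the condition is not yet met.

```shell
#!/bin/bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command...>
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (assumes a non-zero exit code while the button is not visible):
#   wait_until 10 1 agent-browser is visible "button:has-text('Start Build')"
```

The loop returns as soon as the condition holds, so fast pages are not penalised, while slow pages still get the full retry budget.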
Create test-ui.sh:
#!/bin/bash
set -e
echo "🧪 Testing Ralph UI..."
# 1. Navigate to homepage
echo "1. Loading homepage..."
agent-browser open http://localhost:3000
agent-browser click @e1 # Press Enter
# 2. Go to dashboard
echo "2. Navigating to dashboard..."
agent-browser click @e1 # Back to Dashboard
# 3. Verify elements
echo "3. Verifying dashboard elements..."
FAILED=0
agent-browser snapshot -i > /tmp/dashboard-snapshot.txt
grep -q "Start Build" /tmp/dashboard-snapshot.txt || { echo "❌ Start Build button missing"; FAILED=1; }
grep -q "Stream" /tmp/dashboard-snapshot.txt || { echo "❌ Stream selector missing"; FAILED=1; }
# 4. Test navigation
echo "4. Testing navigation..."
agent-browser click @e3 # Streams
sleep 1
agent-browser click @e5 # Logs
sleep 1
agent-browser click @e2 # Back to Dashboard
# 5. Check for errors
echo "5. Checking console errors..."
ERRORS=$(agent-browser errors)
if [ -n "$ERRORS" ]; then
  echo "❌ Console errors found:"
  echo "$ERRORS"
  FAILED=1
else
  echo "✅ No console errors"
fi
# 6. Take final screenshot
echo "6. Taking screenshot..."
agent-browser screenshot test-result.png
# Report the overall result instead of always claiming success
if [ "$FAILED" -ne 0 ]; then
  echo "❌ Some checks failed"
  exit 1
fi
echo "✅ All tests passed!"
Run it:
chmod +x test-ui.sh
./test-ui.sh
Watch UI in real-time during builds:
#!/bin/bash
# watch-ui.sh
while true; do
  clear
  echo "=== Ralph UI Status ==="
  agent-browser snapshot -c | head -20
  agent-browser errors | tail -5
  sleep 5
done
Run in split terminal:
# Terminal 1
ralph build 10 --prd=67
# Terminal 2
./watch-ui.sh
When adding new UI features, verify:
- Homepage loads without errors
- Navigation between all pages works
- Dashboard shows all controls
- Stream list loads with data
- Logs page displays entries
- Tokens page shows cost data
- Search/filter functionality works
- Build controls are interactive
- No console errors anywhere
- Screenshots captured for reference
- Keyboard navigation works
- Error states display correctly
- Real-time updates reflect correctly
- Data persists across page reloads
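Several checklist items reduce to "the page snapshot mentions these labels". The following sketch shows one way to script that check. It is illustrative only: check_snapshot_labels is a hypothetical helper, and it assumes the snapshot has been saved as plain text containing the visible element labels, as the test-ui.sh example in this guide does with grep.

```shell
#!/bin/bash
# Hypothetical helper: verify that a saved snapshot file mentions every
# expected label. Prints each missing label; returns non-zero if any is absent.
check_snapshot_labels() {
  local file="$1" label missing=0
  shift
  for label in "$@"; do
    # -F matches the label literally, so characters like "$" or "." are safe.
    if ! grep -qF -- "$label" "$file"; then
      echo "missing: $label"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}

# Usage:
#   agent-browser snapshot -i > /tmp/dashboard-snapshot.txt
#   check_snapshot_labels /tmp/dashboard-snapshot.txt "Start Build" "Stream"
```

Listing every missing label in one run makes it easier to see whether a single control disappeared or the whole page failed to render.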
For agent-browser debugging:
BROWSER_HEADLESS=false # Show browser window
BROWSER_SLOW_MO=500 # Slow down actions (ms)
BROWSER_KEEP_ALIVE=true # Keep browser open
DEBUG=* # Verbose logging
UI_URL=http://... # Custom URL
All test files have been migrated to the /tests directory:
- Moved from /lib/metrics/test-git-fallback.js → /tests/test-git-fallback.js
- Moved from root test-*.js files → /tests/test-*.js
- Updated import paths to reflect new locations
- Worktree test copies remain in .ralph/worktrees/ (not part of main codebase)
- Testing Cheatsheet - Quick reference
- Voice Guide - Testing voice features
- CLAUDE.md - UI testing section
Last Updated: January 19, 2026