AgentForge is designed for continuous integration workflows. This guide covers exit codes, CI mode, and artifact handling.
AgentForge uses standard exit codes to communicate results:
| Code | Meaning | Description |
|---|---|---|
| 0 | Success | All assertions passed |
| 1 | Assertion Failure | One or more assertions failed |
| 2 | Infrastructure Error | Setup failure, file not found, etc. |
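
A wrapper script can branch on these codes to report what kind of failure occurred. The following is a minimal shell sketch: the function name `describe_exit` and the label wording are hypothetical, and only the codes 0, 1, and 2 come from the table above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a forge-sim exit code into a short label.
# Only codes 0, 1, and 2 are documented; anything else is reported as unknown.
describe_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "assertion failure" ;;
    2) echo "infrastructure error" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# Example usage (assumes forge-sim is installed in the project):
#   status=0
#   npx forge-sim run sim/scenarios/stress.ts --ci --seed 42 || status=$?
#   echo "Result: $(describe_exit "$status")"
#   exit "$status"
```

Capturing the status with `|| status=$?` keeps the script alive under `set -e`, so the label can be printed before the original code is passed on to the CI system.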
Enable CI mode with the --ci flag or CI=true environment variable:

    forge-sim run sim/scenarios/stress.ts --ci --seed 42

CI mode changes behavior:

- No colors: Plain text output for log parsing
- Stable run IDs: Uses <scenario>-ci instead of timestamps
- Strict exit codes: Failures exit immediately with code 1

GitHub Actions:

    name: Simulations
    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]
    jobs:
      simulate:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Setup Node.js
            uses: actions/setup-node@v4
            with:
              node-version: '20'
          - name: Install dependencies
            run: npm ci
          - name: Run simulations
            run: npx forge-sim run sim/scenarios/stress.ts --ci --seed 42
          - name: Upload artifacts
            uses: actions/upload-artifact@v4
            if: always()
            with:
              name: simulation-results
              path: sim/results/

GitLab CI:

    simulate:
      image: node:20
      script:
        - npm ci
        - npx forge-sim run sim/scenarios/stress.ts --ci --seed 42
      artifacts:
        when: always
        paths:
          - sim/results/

CircleCI:

    version: 2.1
    jobs:
      simulate:
        docker:
          - image: cimg/node:20.0
        steps:
          - checkout
          - run: npm ci
          - run: npx forge-sim run sim/scenarios/stress.ts --ci --seed 42
          - store_artifacts:
              path: sim/results

Run multiple scenarios in CI:

    - name: Run stress tests
      run: |
        npx forge-sim run sim/scenarios/stress.ts --ci --seed 42
        npx forge-sim run sim/scenarios/edge-cases.ts --ci --seed 42

Or use the sweep command for seed variation:

    - name: Seed sweep
      run: npx forge-sim sweep sim/scenarios/stress.ts --seeds 1..50 --ci

Or use matrix runs for multi-variant checks:

    - name: Variant matrix
      run: npx forge-sim matrix sim/scenarios/stress.ts --variants sim/variants.ts --seeds 1..5 --ci

Use the compare command to diff runs:

    - name: Compare with baseline
      run: |
        npx forge-sim run sim/scenarios/stress.ts --ci --seed 42 --out sim/results/current
        npx forge-sim compare sim/baseline/stress-ci sim/results/current/stress-ci

Generate reports for CI artifacts:

    - name: Generate reports
      run: |
        npx forge-sim run sim/scenarios/stress.ts --ci --seed 42
        npx forge-sim report sim/results/stress-ci

Generate static dashboards when you need artifact UIs:

    - name: Build dashboard
      run: npx forge-sim dashboard sim/results/stress-ci

After a CI run, artifacts are organized as:

    sim/results/
    └── stress-ci/
        ├── summary.json
        ├── metrics.csv
        ├── actions.ndjson
        ├── config_resolved.json
        └── report.md

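
Before uploading or comparing results, a job can confirm that the run directory actually contains the files listed above. The following is a minimal shell sketch: the function name `check_artifacts` is hypothetical, and it assumes every listed file is present after each run, which may not hold if report.md is only written by the report command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm a run directory contains the files listed above.
# Prints each missing file and returns non-zero if any are absent.
check_artifacts() {
  local run_dir="$1"
  local missing=0
  local name
  for name in summary.json metrics.csv actions.ndjson config_resolved.json report.md; do
    if [ ! -f "$run_dir/$name" ]; then
      echo "missing: $run_dir/$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage:
#   check_artifacts sim/results/stress-ci || exit 1
```

Listing every missing file before returning, rather than stopping at the first, makes a single CI log enough to see everything that went wrong.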
Failed assertions produce exit code 1. To capture failures:

    - name: Run simulation
      id: simulate
      continue-on-error: true
      run: npx forge-sim run sim/scenarios/stress.ts --ci --seed 42
    - name: Check results
      if: steps.simulate.outcome == 'failure'
      run: |
        echo "Simulation failed - checking assertions"
        jq '.failedAssertions' sim/results/stress-ci/summary.json

For programmatic processing, use --json:

    - name: Run simulation
      run: |
        npx forge-sim run sim/scenarios/stress.ts --ci --seed 42 --json > result.json
        echo "Success: $(jq '.success' result.json)"

Run doctor before simulations to verify the environment:

    - name: Check environment
      run: |
        npx forge-sim doctor --json > doctor.json
        if [ "$(jq '.allPassed' doctor.json)" != "true" ]; then
          echo "Environment check failed"
          exit 1
        fi

Best practices:

- Pin seeds: Always specify seeds for reproducibility
- Upload artifacts: Capture results even on failure
- Use CI mode: Stable naming makes artifact comparison easier
- Generate reports: Include report.md for human review
- Compare runs: Use compare command to detect regressions
- Sweep critical scenarios: Run multiple seeds for confidence
| Variable | Description |
|---|---|
| CI | Set to true to enable CI mode automatically |
| FORGE_SIM_OUT | Default output directory |
| AGENTFORGE_AUTONOMOUS_RPC_POLICY | Exploration RPC policy override (strict or aggressive) |
| AGENTFORGE_DISABLE_AUTONOMOUS_RPC | Disable autonomous exploration RPC when set to 1 |
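
A typo in an override variable is easy to miss in CI configuration, so a job may want to validate it before running. The following is a minimal shell sketch: the function name `validate_rpc_policy` is hypothetical, the two accepted values come from the table above, and treating an unset variable as "no override" is an assumption. How AgentForge itself handles an unrecognized value is not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: accept only the two documented policy values.
# An unset or empty variable is treated as "no override" and passes (assumption).
validate_rpc_policy() {
  case "${AGENTFORGE_AUTONOMOUS_RPC_POLICY:-}" in
    ""|strict|aggressive)
      return 0
      ;;
    *)
      echo "invalid AGENTFORGE_AUTONOMOUS_RPC_POLICY: $AGENTFORGE_AUTONOMOUS_RPC_POLICY" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   validate_rpc_policy || exit 1
```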
If a simulation hangs, add a timeout to the CI job and check for infinite loops in agents.
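
CI platforms offer job-level limits (for example, `timeout-minutes` in GitHub Actions), and a limit can also be applied to a single command. The following is a minimal shell sketch that assumes GNU coreutils `timeout` is available on the runner; the function name `run_with_limit` and the 10-minute value in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command under a time limit with GNU coreutils `timeout`.
# `timeout` exits with status 124 when the limit is reached; any other status
# is passed through from the wrapped command.
run_with_limit() {
  local limit="$1"
  shift
  local status=0
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after $limit" >&2
  fi
  return "$status"
}

# Example usage (the 10-minute limit is an arbitrary choice):
#   run_with_limit 10m npx forge-sim run sim/scenarios/stress.ts --ci --seed 42
```

Because status 124 is distinct from the documented codes 0, 1, and 2, a hang can be told apart from an assertion failure in the job log.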
If results differ between runs, ensure both use the same Node.js version and seed, then check for non-deterministic code in the pack or agents.
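
One way to test for non-determinism is to run the same scenario twice with the same seed and compare an artifact from each run. The following is a minimal shell sketch: the function name `same_artifact` and the directory names `run-a` and `run-b` are hypothetical, and it assumes metrics.csv contains no timestamps or other run-specific values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the same artifact from two runs is byte-identical.
# `cmp -s` compares silently and returns 0 only when the files match.
same_artifact() {
  local first="$1"
  local second="$2"
  if cmp -s "$first" "$second"; then
    echo "identical"
  else
    echo "different"
    return 1
  fi
}

# Example usage, writing two runs of the same seed to separate directories:
#   npx forge-sim run sim/scenarios/stress.ts --ci --seed 42 --out sim/results/run-a
#   npx forge-sim run sim/scenarios/stress.ts --ci --seed 42 --out sim/results/run-b
#   same_artifact sim/results/run-a/stress-ci/metrics.csv sim/results/run-b/stress-ci/metrics.csv
```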
Large simulations may need more memory:

    env:
      NODE_OPTIONS: --max-old-space-size=4096