Commit b7475f0

feat(scouts): isolated scout fleet with hooks, intra-scout subagents, knowledge index, and fleet ops (#46)
* feat: scouts - fleet of focused discovery watchers

Adds a `ralph scout` command that runs a fleet of `watch` discovery loops, one per focused interest area, with namespaced results and per-scout manifests. Each scout is an isolated sensor for one topic cluster: no cross-contamination between, say, ai-security and ai-eval.

New surface:
- `ralph scout run [--name <n>]`: deploy one or all scouts
- `ralph scout ls`: list configured scouts
- `ralph scout results [name]`: recent runs, per scout or fleet-wide
- `ralph scout init <name>`: scaffold a new scout with an empty manifest

Layout:
- `scouts/<name>/manifest.json` per scout (same shape as `watch-manifest.json`, with its own topics/languages/thresholds)
- `results/<scout>/pw-YYYYMMDD-HHmm/`: namespaced run output
- Five scouts seeded: agent-sdks, ai-eval, ai-security, computer-use, inference-infra

Under the hood:
- `SCOUTS_DIR` constant in `src/utils/paths.ts`
- `runWatch` accepts `manifestPath` and `scoutName` options; results namespacing flows from `scoutName`
- `watch` gains `--manifest` and `--scout` flags so it can be pointed at a per-scout manifest directly
- `readManifest`, `listRuns` exported for the scout command
- Steering gets a manifest-lifecycle section documenting auto-add, freshness tracking, pruning suggestions, and the discoveryLog cap

Gitignore the 105MB compiled `/ralph-for-kiro` binary so it doesn't sneak into commits; `results/` and `watch-manifest.json` were already ignored.

Verified: typecheck, lint (biome 2.4.12), 76/76 tests.

* feat(hooks): agentSpawn + stop lifecycle hooks with per-turn sidecars

Wires Kiro CLI's `agentSpawn` and `stop` hooks (added in Kiro 1.24) into every Ralph agent config so each turn drops a structured sidecar under the run's `iterations/` directory alongside the session-derived markdown.

Why: the current runner relies on `<promise>COMPLETE</promise>` in Kiro's SQLite session history to detect turn end. That works, but leaves no per-turn audit trail when a run stalls: the `pw-20260331-0300` run we dug up earlier had `status: running` and iteration 0 with no way to tell whether kiro-cli ever spawned an agent at all. The hook sidecars close that gap without replacing the existing completion check.

Contract for hook scripts (passed via child-process env vars, not stdin, because the hook's stdin-JSON payload shape varies across Kiro versions):

  RALPH_RUN_DIR     absolute path to `results/<scout>/pw-*` or `results/pw-*` (for non-scout watches)
  RALPH_ITERATION   1-based iteration this kiro-cli invocation runs
  RALPH_SCOUT_NAME  scout name, empty string for non-scout watches

Scripts:
- `.kiro/hooks/on-agent-spawn.sh`: writes `iterations/NN-spawn.json` at turn start
- `.kiro/hooks/on-stop.sh`: writes `iterations/NN-turn.json` at turn end

Implementation:
- `KiroClient.runChat(prompt, hookEnv?)` layers RALPH_* onto `process.env` and spawns kiro-cli with the merged env
- `LoopConfig` gains nullable `runDir` + `scoutName` fields
- `runWatch` threads `resultsPath` + `scoutName` into the loop config
- `installHookScripts()` helper stamps the two hook scripts chmod 0o755 under `.kiro/hooks/`; both `ralph init` and `ralph watch init` call it so new installs ship with hooks
- Hook scripts bundled as text via Bun's loader; `*.sh` module decl added to types

Tests: 8 new in tests/hooks.test.ts covering env build, script installation + idempotency + executable bits, and LoopConfig round-trip for runDir/scoutName. 84/84 pass, typecheck clean, biome clean.

* feat(scouts): per-scout .kiro/ tree with OS-process isolation

Each scout now gets its own `.kiro/` tree under `scouts/<name>/.kiro/` (agents, steering, hooks, settings, session history), and kiro-cli is spawned with cwd = `scouts/<name>/` so Kiro picks up the per-scout config instead of the shared repo-root one. Scouts on divergent topics (ai-security, ai-eval, ...) cannot cross-contaminate because they never share a session, a steering doc, or a SQLite DB.

Changes:
- `ensureScoutKiroTree(scoutDir)` helper (src/core/scout-init.ts) stamps the per-scout tree idempotently on every run, reusing the existing agent JSON + hook installer. Agent config is re-written on each run so bumps propagate; steering is preserved once customized.
- `LoopConfig` gains `scoutCwd` (nullable). When set, the loop runner passes it through to `KiroClient.runChat` as the spawn cwd.
- `KiroClient.runChat` signature widened from (prompt, hookEnv?) to (prompt, { hookEnv?, cwd? }); the subprocess cwd override goes through this path.
- `runWatch` detects scout runs, resolves an absolute results path (hooks must write there regardless of subprocess cwd), calls `ensureScoutKiroTree`, and passes `scoutCwd` into the loop config.

Biome note: turned off `complexity/useLiteralKeys` in biome.json because it conflicts with TS strict's `noPropertyAccessFromIndexSignature` on `NodeJS.ProcessEnv`. TS strict is the authority.

Tests: 2 new in tests/scout-init.test.ts covering tree shape, executable bits, agent/steering content, and idempotency under user customization. 86/86 pass.

* feat(scouts): intra-scout probe-topic subagent for parallel per-topic dives

Inside a scout, the project-watcher agent now delegates per-topic deep dives to a new probe-topic subagent via the `use_subagent` tool's `InvokeSubagents` command: one subagent invocation per manifest topic, all in parallel. Each probe runs in isolated context and writes a single Markdown report to `<run-dir>/probes/<topic>.md`. The parent watcher reads probe outputs and owns the synthesis step (summary.md, manifest updates).

Cross-scout contamination stays impossible because the probe-topic subagent only gets spawned from a scout's kiro-cli subprocess, which is already isolated by cwd under `scouts/<name>/`. Subagents share the parent's session, but the parent session only sees one scout's manifest because of M2a's process boundary.

Config changes:
- src/data/probe-topic.json: new subagent config, narrow allowedTools (read, @brave-search, @Tavily, @exa), points at probe-topic.md steering
- src/data/probe-topic.md: subagent steering (one-topic, read-only-manifest, structured report contract)
- src/data/project-watcher.json and .kiro/agents/project-watcher.json gain availableAgents=["probe-*"] and trustedAgents=["probe-topic"] (Kiro 1.25 glob whitelist; nothing else can be spawned)
- src/data/watcher-context.md documents the fan-out pattern with a copy-pasteable use_subagent InvokeSubagents example; skip when topic count ≤ 1

Implementation:
- ensureScoutKiroTree() now stamps probe-topic.{json,md} into each scout's .kiro/{agents,steering}/ alongside project-watcher. Steering is preserved on user customization; agent config is re-written to propagate defaults.

Grounded the schema via `kiro-cli chat --no-interactive` tool introspection: the real tool is `use_subagent` with {command:"InvokeSubagents", content:{subagents:[{query, agent_name?}]}}; agent_name matches the `name` field of the agent JSON.

Tests: scout-init.test.ts extended to assert probe-topic config + steering land in the scout's .kiro/, and that the parent's availableAgents/trustedAgents glob whitelist is present. 86/86 pass.

* feat(scouts): per-scout knowledge-base index + scout-cwd-relative resources

Each scout's project-watcher agent now gets a `scout-history` knowledge-base `resources` entry scoped to `results/<this-scout>/**/summary.md`, so prior findings feed the retrieval layer but an `ai-security` scout can't see `ai-eval`'s history, and vice versa. Bright line stays at the scout boundary.

Also fixes the `resources` paths. The default agent config ships with paths relative to the repo-root cwd (`file://../../watch-manifest.json`, `file://../ralph-loop.local.json`). When scout isolation puts kiro-cli in `scouts/<name>/`, those are wrong. scout-init rewrites them to `file://manifest.json` and `file://.kiro/ralph-loop.local.json` before stamping the per-scout config, and the repo-root fallback keeps its old paths intact.

Steering: adds a short section documenting Kiro 1.26 `@path` inline refs as the cheaper alternative to `fs_read` for known files (`@manifest.json`, `@.kiro/ralph-loop.local.json`), and explains the `scout-history` knowledge resource as the natural-language retrieval path for past iterations.

Decision deferred: global compaction settings (`compaction.excludeContextWindowPercent` etc.) live in `~/.kiro/settings/cli.json` per Kiro's 1.24 docs. They're a user-home setting, not an agent-config field, so they can't be isolated per scout via cwd. Out of scope for the wrapper; users who want them run `kiro-cli settings compaction.excludeContextWindowPercent 20` once at their home level.

Tests: scout-init.test.ts asserts the knowledge-base entry shape, type="knowledgeBase", and that `include` is scoped to the scout's own results/ subtree. 86/86 pass.

* feat(scouts): fleet ops - parallel run, status, tail

Three fleet-level commands for operating multiple isolated scouts:
- `ralph scout run --concurrency N`: drain pattern with N parallel workers, each pulling the next scout off a shared queue and running it to completion before pulling the next. One scout's failure is caught per-item so the fleet doesn't abort. Default concurrency=1 preserves the old sequential behavior.
- `ralph scout status`: one line per scout showing the latest run's task ID, status, iteration count, wall-clock duration, and repos-discovered count. Fleet-wide glance.
- `ralph scout tail <name>`: follow the in-flight run of a named scout by polling the `iterations/` directory for new hook sidecars (NN-spawn.json / NN-turn.json from M1). Prints a timestamped line per new sidecar, and stops automatically when `status.json` flips out of `running`. Configurable poll interval via `--interval <ms>` (default 2000).

Each parallel worker still runs its scout as a separate kiro-cli child process with its own cwd under `scouts/<name>/` (M2a), so concurrency does not compromise isolation: each worker is a fully independent OS process.

Tests: 4 new in tests/scout-concurrency.test.ts covering the drain pattern: sequential timing at concurrency=1, parallel speedup at concurrency=N, one-failure-doesn't-abort invariant, and input-order preservation in results regardless of finish order. 90/90 pass.

* fix(tests): drop TOCTOU access+stat pattern flagged by CodeQL

CodeQL's js/file-system-race rule fires when a test calls access(p) then opens/stats the same path, because the file may be deleted between the two calls. Replaced with a single stat() that throws ENOENT if the file doesn't exist, which fails the test cleanly without the race.

* fix(tests): use Bun.file() for single-handle reads to clear CodeQL TOCTOU

CodeQL's js/file-system-race fires whenever the same path is passed to two separate fs calls (stat + readFile, access + stat, ...). The tests legitimately want to check mode AND content of a file the test itself just wrote. Switching to Bun.file(path) gives a single handle from which we can derive stat() and text()/json(), which satisfies the rule without weakening the assertions.
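The drain pattern described for `ralph scout run --concurrency N` could look roughly like the sketch below. It is not the code from this commit: `drain` and `Settled` are hypothetical names, and the only properties taken from the message are the shared queue, the per-item failure capture, and results kept in input order.

```typescript
// Outcome of one item: either its value or the error it threw.
type Settled<R> =
  | { ok: true; value: R }
  | { ok: false; error: unknown };

// Hypothetical helper: run `worker` over `items` with at most `concurrency`
// tasks in flight. Each lane pulls the next index off a shared cursor.
async function drain<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<Settled<R>[]> {
  const results: Settled<R>[] = new Array(items.length);
  let next = 0; // shared queue cursor; safe because JS callbacks never interleave mid-statement

  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        const value = await worker(items[index] as T, index);
        results[index] = { ok: true, value };
      } catch (error) {
        // Caught per item, so one failing scout does not abort the fleet.
        results[index] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results; // indexed by input position, not finish order
}
```

With `concurrency` set to 1 this degenerates to a plain sequential loop, which matches the default described above.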
1 parent 86d93c3 commit b7475f0

34 files changed

Lines changed: 1742 additions & 47 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -642,3 +642,6 @@ watch-manifest.json
 .kiro/settings/mcp.json
 .kiro/ralph-loop.local.json
 .kiro/ralph-session.json
+
+# Compiled single-file binary (105MB) from `bun build --compile`
+/ralph-for-kiro

.kiro/agents/project-watcher.json

Lines changed: 20 additions & 18 deletions
@@ -1,20 +1,22 @@
 {
-  "name": "project-watcher",
-  "description": "Iterative deep research agent for discovering trending GitHub repos matching watched topics and languages",
-  "tools": [
-    "read",
-    "write",
-    "@brave-search",
-    "@tavily",
-    "@exa"
-  ],
-  "allowedTools": [
-    "*"
-  ],
-  "includeMcpJson": true,
-  "resources": [
-    "file://../../watch-manifest.json",
-    "file://../ralph-loop.local.json"
-  ],
-  "prompt": "file://../steering/watcher-context.md"
+  "name": "project-watcher",
+  "description": "Iterative deep research agent for discovering trending GitHub repos matching watched topics and languages. Fans out per-topic deep dives to the probe-topic subagent.",
+  "tools": ["read", "write", "@brave-search", "@tavily", "@exa"],
+  "allowedTools": ["*"],
+  "includeMcpJson": true,
+  "resources": [
+    "file://../../watch-manifest.json",
+    "file://../ralph-loop.local.json"
+  ],
+  "prompt": "file://../steering/watcher-context.md",
+  "availableAgents": ["probe-*"],
+  "trustedAgents": ["probe-topic"],
+  "hooks": {
+    "agentSpawn": [
+      { "command": ".kiro/hooks/on-agent-spawn.sh" }
+    ],
+    "stop": [
+      { "command": ".kiro/hooks/on-stop.sh" }
+    ]
+  }
 }
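The `availableAgents` / `trustedAgents` entries above whitelist the probe-topic subagent. As a rough illustration of the fan-out the commit message describes, the sketch below builds one `InvokeSubagents` payload with a subagent call per topic. The payload shape is the one quoted in the commit message; `buildProbeFanOut` and the query wording are assumptions made for this example, not code from the repository.

```typescript
// Payload shape as quoted in the commit message for the `use_subagent` tool.
interface SubagentCall {
  query: string;
  agent_name?: string;
}

interface InvokeSubagents {
  command: "InvokeSubagents";
  content: { subagents: SubagentCall[] };
}

// Hypothetical helper: one probe-topic invocation per manifest topic.
// Returns null when there is nothing to fan out (topic count <= 1).
function buildProbeFanOut(
  topics: readonly string[],
  runDir: string,
): InvokeSubagents | null {
  if (topics.length <= 1) return null;
  return {
    command: "InvokeSubagents",
    content: {
      subagents: topics.map((topic) => ({
        // Illustrative prompt text only; the real steering defines the contract.
        query: `Probe topic "${topic}" and write the report to ${runDir}/probes/${topic}.md`,
        agent_name: "probe-topic", // must match the `name` field of the agent JSON
      })),
    },
  };
}
```

Returning `null` for a single topic mirrors the "skip when topic count ≤ 1" rule, leaving the parent watcher to handle that topic itself.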

.kiro/hooks/on-agent-spawn.sh

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+#!/usr/bin/env bash
+# Kiro `agentSpawn` hook — fired by kiro-cli when the agent starts a new turn.
+#
+# ralph-for-kiro sets these env vars on the subprocess before invoking kiro-cli
+# chat, so hooks always know where to write artifacts without parsing the
+# hook's JSON stdin payload:
+#
+#   RALPH_RUN_DIR      Absolute path to this run's `results/<scout>/pw-*/`
+#                      directory. Always exists.
+#   RALPH_ITERATION    1-based iteration number for this kiro-cli invocation.
+#   RALPH_SCOUT_NAME   Scout name (empty for non-scout watch runs).
+#
+# The hook writes an iteration-marker JSON so the runner never has to grep
+# stdout or re-read Kiro's SQLite DB to know a turn started.
+
+set -euo pipefail
+
+RUN_DIR="${RALPH_RUN_DIR:-}"
+ITERATION="${RALPH_ITERATION:-0}"
+
+if [[ -z "${RUN_DIR}" ]]; then
+  exit 0
+fi
+
+mkdir -p "${RUN_DIR}/iterations"
+
+# Pad to 2 digits so `ls iterations/` sorts chronologically.
+ITER_PADDED="$(printf '%02d' "${ITERATION}")"
+MARKER="${RUN_DIR}/iterations/${ITER_PADDED}-spawn.json"
+
+cat > "${MARKER}" <<JSON
+{
+  "iteration": ${ITERATION},
+  "spawnedAt": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
+  "scout": "${RALPH_SCOUT_NAME:-}",
+  "cwd": "$(pwd)"
+}
+JSON
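The script above reads three `RALPH_*` variables that the runner sets on the kiro-cli subprocess. A minimal sketch of that runner side follows; `buildHookEnv` and `HookContext` are hypothetical names rather than the actual `KiroClient` code, and the base environment is passed in explicitly so the function stays pure.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical context object carrying what the hooks need to know.
interface HookContext {
  runDir: string; // absolute path to the run directory
  iteration: number; // 1-based iteration number
  scoutName: string | null; // null for non-scout watch runs
}

// Layer the RALPH_* contract on top of a base environment without mutating it.
function buildHookEnv(base: Env, ctx: HookContext): Env {
  return {
    ...base,
    RALPH_RUN_DIR: ctx.runDir,
    RALPH_ITERATION: String(ctx.iteration),
    // The hook contract uses an empty string, not an unset variable.
    RALPH_SCOUT_NAME: ctx.scoutName ?? "",
  };
}
```

In a real runner the base would presumably be `process.env`, with the merged result handed to whatever spawns kiro-cli.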

.kiro/hooks/on-stop.sh

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+#!/usr/bin/env bash
+# Kiro `stop` hook — fired when the agent signals the turn is complete.
+#
+# See on-agent-spawn.sh for the RALPH_* env-var contract. This hook writes a
+# per-turn `NN-turn.json` sidecar with a simple "turn ended" marker the loop
+# runner can watch for. Kiro versions vary in what they pipe on stdin for this
+# hook; we intentionally do not parse it — the env vars are enough.
+
+set -euo pipefail
+
+RUN_DIR="${RALPH_RUN_DIR:-}"
+ITERATION="${RALPH_ITERATION:-0}"
+
+if [[ -z "${RUN_DIR}" ]]; then
+  exit 0
+fi
+
+mkdir -p "${RUN_DIR}/iterations"
+
+ITER_PADDED="$(printf '%02d' "${ITERATION}")"
+MARKER="${RUN_DIR}/iterations/${ITER_PADDED}-turn.json"
+
+cat > "${MARKER}" <<JSON
+{
+  "iteration": ${ITERATION},
+  "completedAt": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
+  "scout": "${RALPH_SCOUT_NAME:-}"
+}
+JSON
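Together the two hooks leave `NN-spawn.json` and `NN-turn.json` files under `iterations/`. A consumer such as the `ralph scout tail` command could pick out new sidecars from a directory listing along the lines of the sketch below. All three function names are hypothetical, and the ordering (by iteration, spawn before turn) is an assumption about what a reader would want to see.

```typescript
type SidecarKind = "spawn" | "turn";

interface Sidecar {
  iteration: number;
  kind: SidecarKind;
  file: string;
}

// Mirror the shell scripts' naming: iteration zero-padded to two digits.
function sidecarName(iteration: number, kind: SidecarKind): string {
  return `${String(iteration).padStart(2, "0")}-${kind}.json`;
}

// Parse a file name; anything that is not a sidecar yields null.
function parseSidecar(file: string): Sidecar | null {
  const match = /^(\d{2,})-(spawn|turn)\.json$/.exec(file);
  if (!match) return null;
  return { iteration: Number(match[1]), kind: match[2] as SidecarKind, file };
}

// Return sidecars not reported by an earlier poll, and remember them in `seen`.
function newSidecars(listing: readonly string[], seen: Set<string>): Sidecar[] {
  const fresh: Sidecar[] = [];
  for (const file of listing) {
    if (seen.has(file)) continue;
    const parsed = parseSidecar(file);
    if (parsed) {
      fresh.push(parsed);
      seen.add(file);
    }
  }
  // Order by iteration; within an iteration, "spawn" sorts before "turn".
  return fresh.sort(
    (a, b) => a.iteration - b.iteration || a.kind.localeCompare(b.kind),
  );
}
```

A polling loop would call `newSidecars` with the current directory listing on each tick and print one line per returned entry.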

.kiro/steering/watcher-context.md

Lines changed: 45 additions & 0 deletions
@@ -150,10 +150,55 @@ Trends and patterns observed across discoveries.
 ## Recommended for Watch List
 Repos that meet auto-add thresholds.
 
+## Pruning Suggestions
+Repos with `source: "discovered"` where `lastSeen` > 90 days ago.
+
 ## Open Questions
 Leads for future discovery runs.
 ```
 
+## Manifest Lifecycle — IMPORTANT
+
+You are responsible for keeping `watch-manifest.json` up to date. On your FINAL iteration:
+
+### Auto-Add (Growth)
+When you discover a repo that exceeds the `autoAddStars` threshold AND is not already in the `watch` array:
+- Add it to the `watch` array with these fields:
+  - `repo`: owner/name
+  - `added`: today's date (YYYY-MM-DD)
+  - `source`: "discovered"
+  - `tags`: relevant topic tags from the manifest
+  - `lastSeen`: today's date
+  - `discoveryCount`: 1
+  - `runId`: the task ID from your prompt
+
+### Freshness Tracking
+For every repo already in the `watch` array that you encounter in search results:
+- Update `lastSeen` to today's date
+- Increment `discoveryCount` by 1
+
+### Pruning Suggestions
+For repos where `source` is `"discovered"` and `lastSeen` is more than 90 days ago:
+- Do NOT remove them
+- List them in the "Pruning Suggestions" section of `summary.md` with reasoning
+- Manual entries (`source: "manual"`) are NEVER flagged
+
+### Discovery Log
+Append an entry to the `discoveryLog` array in the manifest:
+```json
+{
+  "runId": "TASK_ID",
+  "date": "YYYY-MM-DD",
+  "reposAdded": N,
+  "reposUpdated": N,
+  "pruningSuggested": N
+}
+```
+Cap the `discoveryLog` at 50 entries (remove oldest if needed).
+
+### Writing the Manifest
+After making changes, write the updated `watch-manifest.json` back to the project root. Preserve existing manual entries exactly as they are.
+
 ## Completion Rules
 
 - **Min iterations (3):** Do NOT signal completion before iteration 3
biome.json

Lines changed: 4 additions & 1 deletion
@@ -8,7 +8,10 @@
   "linter": {
     "enabled": true,
     "rules": {
-      "recommended": true
+      "recommended": true,
+      "complexity": {
+        "useLiteralKeys": "off"
+      }
     }
   },
   "assist": { "actions": { "source": { "organizeImports": "on" } } }

scouts/agent-sdks/manifest.json

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+{
+  "version": "1.0",
+  "topics": ["ai-agents", "mcp", "llm-tooling", "developer-tools", "strands-agents"],
+  "languages": ["python", "typescript", "rust"],
+  "watch": [],
+  "thresholds": {
+    "minStars": 50,
+    "autoAddStars": 500,
+    "maxAgeDays": 30
+  },
+  "discoveryLog": []
+}

scouts/ai-eval/manifest.json

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+{
+  "version": "1.0",
+  "topics": ["ai-evaluation", "llm-benchmarks", "agent-testing", "prompt-testing", "red-teaming", "rag-evaluation"],
+  "languages": ["python", "typescript"],
+  "watch": [
+    {
+      "repo": "confident-ai/deepeval",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["ai-evaluation", "llm-benchmarks", "pytest"]
+    },
+    {
+      "repo": "explodinggradients/ragas",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["rag-evaluation", "ai-evaluation"]
+    },
+    {
+      "repo": "promptfoo/promptfoo",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["prompt-testing", "red-teaming", "ai-evaluation"]
+    },
+    {
+      "repo": "braintrustdata/braintrust-sdk",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["ai-evaluation", "llm-benchmarks"]
+    }
+  ],
+  "thresholds": {
+    "minStars": 50,
+    "autoAddStars": 300,
+    "maxAgeDays": 60
+  },
+  "discoveryLog": []
+}

scouts/ai-security/manifest.json

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+{
+  "version": "1.0",
+  "topics": ["ai-safety", "prompt-injection", "llm-guardrails", "ai-red-teaming", "agent-security", "ai-alignment"],
+  "languages": ["python", "typescript", "rust"],
+  "watch": [
+    {
+      "repo": "NVIDIA/NeMo-Guardrails",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["llm-guardrails", "ai-safety"]
+    },
+    {
+      "repo": "guardrails-ai/guardrails",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["llm-guardrails", "ai-safety"]
+    },
+    {
+      "repo": "rebuff-ai/rebuff",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["prompt-injection", "ai-safety"]
+    }
+  ],
+  "thresholds": {
+    "minStars": 30,
+    "autoAddStars": 200,
+    "maxAgeDays": 60
+  },
+  "discoveryLog": []
+}

scouts/computer-use/manifest.json

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+{
+  "version": "1.0",
+  "topics": ["browser-automation", "computer-use", "desktop-agents", "gui-grounding", "web-agents", "screen-understanding"],
+  "languages": ["python", "typescript", "rust"],
+  "watch": [
+    {
+      "repo": "anthropics/anthropic-quickstarts",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["computer-use", "desktop-agents"]
+    },
+    {
+      "repo": "browser-use/browser-use",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["browser-automation", "web-agents"]
+    },
+    {
+      "repo": "anthropics/claude-computer-use-demo",
+      "added": "2026-03-23",
+      "source": "manual",
+      "tags": ["computer-use", "desktop-agents"]
+    }
+  ],
+  "thresholds": {
+    "minStars": 50,
+    "autoAddStars": 500,
+    "maxAgeDays": 45
+  },
+  "discoveryLog": []
+}
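The seeded manifests above all share one shape. A light structural check for that shape might look like the sketch below. It is an illustration, not the repository's `readManifest`: `validateManifest` is a hypothetical name, and the `minStars <= autoAddStars` rule is an assumption that happens to hold for every manifest in this commit.

```typescript
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((x) => typeof x === "string");
}

// Returns a list of problems; an empty list means the manifest looks well-formed.
function validateManifest(raw: string): string[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return ["not valid JSON"];
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return ["top level must be an object"];
  }
  const manifest = data as Record<string, unknown>;
  const problems: string[] = [];

  if (typeof manifest["version"] !== "string") problems.push("version must be a string");
  const topics = manifest["topics"];
  if (!isStringArray(topics) || topics.length === 0) {
    problems.push("topics must be a non-empty string array");
  }
  if (!isStringArray(manifest["languages"])) problems.push("languages must be a string array");
  if (!Array.isArray(manifest["watch"])) problems.push("watch must be an array");
  if (!Array.isArray(manifest["discoveryLog"])) problems.push("discoveryLog must be an array");

  const thresholds = manifest["thresholds"];
  if (typeof thresholds !== "object" || thresholds === null) {
    problems.push("thresholds must be an object");
  } else {
    const t = thresholds as Record<string, unknown>;
    for (const key of ["minStars", "autoAddStars", "maxAgeDays"]) {
      const value = t[key];
      if (typeof value !== "number" || value < 0) {
        problems.push(`thresholds.${key} must be a non-negative number`);
      }
    }
    const min = t["minStars"];
    const auto = t["autoAddStars"];
    // Assumed invariant: the auto-add bar is never below the discovery floor.
    if (typeof min === "number" && typeof auto === "number" && min > auto) {
      problems.push("thresholds.minStars exceeds thresholds.autoAddStars");
    }
  }
  return problems;
}
```

Returning a list of messages instead of throwing lets a fleet-level command report every malformed scout in one pass.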
