Releases: Q00/ouroboros
v0.42.0
v0.42.0 — A new kernel, frugal-or-frontier, and Claude without the SDK
v0.40 closed the loop, v0.41 let it run anywhere and trust what it ships. v0.42
adds another runtime kernel, lets you choose how hard it thinks — frugal or
frontier, per stage — and runs Claude with no SDK at all. Much of this release's
frugality and usability work grew out of @deepakdgupta1's direction-check
discussions; the credit, and the mapping, are at the bottom.
The headline
- A new runtime kernel: GJC. gajae-code (gjc) joins Pi, Claude, Codex,
Gemini, OpenCode, Goose, and Copilot as a swappable runtime. Ouroboros stays the
workflow engine; GJC is just another kernel you can drop under it. Select it with
orchestrator.runtime_backend: gjc (or ouroboros config), and setup --runtime gjc
installs the ooo bridge so ooo … commands route back through Ouroboros.
- Frugal or frontier: your call. A new reasoning-effort dial
(low/medium/high) makes effort, not model family, the primary cost
lever: dial it up where it counts, down where it doesn't. The dead
complexity→tier router is gone, and the model floor for structured-extraction
tasks dropped from Opus to Sonnet. Pick the agent and model per stage in the new
guided ouroboros config GUI, with an effective-config view that shows what
actually runs.
- Claude without the SDK. A new ourocode ACP backend drives
ourocode as a subprocess (no Anthropic SDK, no API key, just your Claude
Pro/Max sign-in) for the in-process completion
path (interview / seed / qa / evaluate). If you've wanted Ouroboros on your
Claude subscription instead of an API key, this is the path.
ouroboros config — set the runtime and model per stage (interview → execute →
evaluate → reflect), or one-click a Frugal / Balanced / Frontier preset for
every stage. The effective-config view shows what actually runs, with inheritance
and env-override sources. (Note: GJC is selectable as a stage agent right here —
Interview is running on gjc above.)
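The layering that an effective-config view reports (a preset for every stage, a per-stage choice on top, an environment override on top of that) can be sketched roughly as below. This is an illustration only: the preset-to-effort mapping, the EXAMPLE_EFFORT_<STAGE> variable name, and the function and class names are assumptions, not the project's actual configuration schema.

```python
from dataclasses import dataclass

# Stage names as listed in the release notes.
STAGES = ("interview", "execute", "evaluate", "reflect")

# Assumed mapping from preset to reasoning effort (not confirmed by the notes).
PRESET_EFFORT = {"frugal": "low", "balanced": "medium", "frontier": "high"}


@dataclass(frozen=True)
class Resolved:
    stage: str
    effort: str
    source: str  # "preset", "stage", or "env"


def effective_config(preset, stage_overrides, env):
    """Resolve the effort per stage; later layers win: preset < stage < env."""
    resolved = []
    for stage in STAGES:
        effort, source = PRESET_EFFORT[preset], "preset"
        if stage in stage_overrides:
            effort, source = stage_overrides[stage], "stage"
        env_key = f"EXAMPLE_EFFORT_{stage.upper()}"  # hypothetical variable name
        if env_key in env:
            effort, source = env[env_key], "env"
        resolved.append(Resolved(stage, effort, source))
    return resolved


if __name__ == "__main__":
    rows = effective_config(
        "frugal",
        {"execute": "high"},
        {"EXAMPLE_EFFORT_EVALUATE": "medium"},
    )
    for row in rows:
        print(f"{row.stage:<10} {row.effort:<7} (from {row.source})")
```

Recording the source next to each value is what lets a view show what actually runs and why, rather than only the value that was last written to the config file.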
What's Changed
Runtimes & Agent OS
- feat(runtime): GJC (gajae-code) runtime: foundation + RPC envelope protocol,
GjcRuntime, GjcLLMAdapter, setup --runtime gjc + ooo bridge, docs (#1379–#1383); allow GJC backend validation (#1439)
- feat(providers): SDK-free Claude via the ourocode ACP backend (#1438)
- feat(config): guided settings GUI + effective-config view — pick agent & model per stage (#1416)
- fix(cli): pi backend switching and quiet dispatch logs (#1362)
Frugality & the effort dial
- feat(providers): reasoning-effort dial — effort-first investment lever (#1435)
- chore(frugality): remove dead tier-router organs + effort-first model defaults (#1434)
- fix(frugality): stop Fable-5-style seed AC over-atomization (#1432)
Reliability — jobs, processes & WAL
- feat(mcp): reconcile zombie jobs whose owning process has died (#1373)
- fix(#1419): terminal jobs leak codex worker + companion shell processes (#1425)
- fix(mcp): stop orphaned servers and reclaim WAL; speed up session list (#1359)
- fix(mcp): detect client death behind uvx, bound shutdown, drain jobs before store close (#1407)
- fix(mcp): keep terminal job results after handle ttl (#1433); mark run results as unevaluated (#1437)
- fix(#1422): honor worktrees for dirty delegated execution (#1428); honor Codex stream timeout overrides (#1387)
Cleanup & refactoring
- chore(cleanup): remove dead execution/secondary/routing packages — −14k LOC (#1410)
- refactor(providers): consolidate cli-stream/child-env/kiro duplication, fix hermes unbounded buffer (#1409)
- refactor(mcp): consolidate tool-layer duplication, fix durable cancel + lost audit rows (#1408)
Config, CLI & orchestrator
- feat(cli): read-only job event polling (#1436); meaningful job_wait long-poll mode (#1426)
- fix(config): make web port zero choose a free port (#1440); accept prose exit-condition lists (#1431)
- feat(orchestrator): pace non-Claude delivery within a rate budget (#1372); plan delivery fan-out within concurrency limits (#1361); capability contracts — map skills to MCP tools, subagent orchestration, code-investigation + lateral persona metadata (#1365–#1371)
- fix(orchestrator): tighten usage-limit pause classification (#1364); preserve typed error metadata on hermes failures (#1360)
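The port-zero behaviour in #1440 matches a standard socket convention: binding to port 0 asks the operating system to assign an unused port. A minimal sketch of that convention follows; the helper name is hypothetical and this is not the project's implementation.

```python
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Bind to port 0 so the OS assigns a free port, then report which one."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (host, port) for an AF_INET socket.
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(pick_free_port())
```

The socket is closed before the port is returned, so another process could claim the port in between. A server that wants to avoid that race typically keeps the socket it bound and serves on it directly.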
Docs & maintenance
- docs(rfc): frugality control loop, spend estimator, spend actuator (effort dial), atomicity & decomposition, configuration coherence, journey transparency (#1392, #1402–#1406)
- chore(security): refresh & exact-pin deps, add Dependabot, harden release.yml (#1347); dependency & GitHub Actions bumps (#1348–#1358)
Shaped in the open
The frugality and usability arc of this release started as direction-check
discussions from @deepakdgupta1 — failure analysis and acceptance properties
posted before any code — and landed as the RFC series and the work above. Thank
you.
- Theme: token frugality — waste-only, goal-subordinate, learned guardrails → RFC #1403
- Investment case: the complexity→tier mapping behind frugality → RFC #1404 / #1405 — which concluded the answer was not a tier map but an effort dial (#1435)
- Investment case: atomicity & AC decomposition reliability → RFC #1406, and the over-atomization fix (#1432)
- Theme: making Ouroboros more usable — transparency → RFC #1402, journey transparency (#1392), and the
ouroboros config GUI (#1416)
Welcome also to first-time contributors @Yeachan-Heo (the entire GJC runtime
stack), @sergiobuilds, and @deepakdgupta1.
What's Changed
- chore(security): refresh & exact-pin deps, add Dependabot, harden release.yml by @Q00 in #1347
- chore(deps): bump actions/download-artifact from 4 to 8 by @dependabot[bot] in #1348
- chore(deps): bump actions/setup-python from 5 to 6 by @dependabot[bot] in #1350
- chore(deps): bump astral-sh/setup-uv from 4 to 8.1.0 by @dependabot[bot] in #1351
- chore(deps): bump actions/checkout from 4 to 6.0.2 by @dependabot[bot] in #1352
- chore(deps): bump softprops/action-gh-release from 2 to 3 by @dependabot[bot] in #1353
- chore(deps): bump actions/upload-artifact from 4 to 7 by @dependabot[bot] in #1354
- chore(deps-dev): bump types-pyyaml from 6.0.12.20250915 to 6.0.12.20260518 by @dependabot[bot] in #1356
- chore(deps): bump rich from 14.3.3 to 15.0.0 by @dependabot[bot] in #1357
- chore(deps-dev): bump mypy from 1.19.1 to 2.1.0 by @dependabot[bot] in #1358
- fix(orchestrator): preserve typed error metadata on hermes failures (usage-limit pause regression) by @deepakdgupta1 in #1360
- feat(orchestrator): plan delivery fan-out within backend concurrency limits by @deepakdgupta1 in #1361
- fix(mcp): stop orphaned servers and reclaim WAL; speed up session list by @Q00 in #1359
- fix(cli): add pi backend switching and quiet dispatch logs by @Q00 in #1362
- fix(orchestrator): tighten usage-limit pause classification by @Q00 in #1364
- feat(orchestrator): [stack 1/7] map skills to MCP tools by @Q00 in #1365
- feat(backends): [stack 2/7] describe subagent orchestration support by @Q00 in #1366
- fix(evaluation): [stack 3/7] preserve mechanical diagnostic changes by @Q00 in #1367
- feat(orchestrator): [stack 4/7] add owned tool capability contracts by @Q00 in #1368
- feat(mcp): [stack 5/7] emit code-investigation metadata by @Q00 in #1369
- feat(mcp): [stack 6/7] consume lateral persona metadata by @Q00 in #1370
- test(orchestrator): [stack 7/7] cover capability review follow-ups by @Q00 in #1371
- chore(deps): bump codecov/codecov-action from 3 to 6.0.1 by @dependabot[bot] in #1349
- chore(deps): bump the python-minor-patch group across 1 directory with 10 updates by @dependabot[bot] in #1355
- feat(orchestrator): pace non-Claude delivery within a configurable rate budget by @deepakdgupta1 in #1372
- feat(mcp): reconcile zombie jobs whose owning process has died by @deepakdgupta1 in #1373
- feat(orchestrator): surface non-native execution-parameter handling (#1374 rebased) by @Q00 in #1393
- docs(rfc): journey transparency — state breadcrumb + TUI surfacing by @Q00 in #1392
- docs(rfc): configuration coherence — surface config, make reload honest (#1376) by @deepakdgupta1 in #1402
- docs(rfc): token frugality as one control loop — attribution + advisory guardrails (#1377) by @deepakdgupta1 in #1403
- docs(rfc): the spend estimator — difficulty + stakes from measured inputs (#1384) by @DeepakDG...
v0.41.0
v0.41.0 — Run it anywhere, and trust what it ships
A week ago, ooo auto learned to finish the job on its own. This release makes
that autonomy something you can actually rely on: it runs on one more runtime,
it refuses to start building until the goal is unambiguous, and the verdict that
decides "is this actually done?" can no longer be gamed.
The headline
Autonomy is only worth as much as the trust behind it. v0.40.0 closed the loop —
goal in, product out. v0.41.0 spends its week hardening the two ends of that
loop and widening the floor it runs on.
- Run it anywhere. Pi joins Claude, Codex, Gemini, OpenCode, Goose, and
Copilot as a first-class runtime. Ouroboros stays the workflow engine; the
runtime is a swappable kernel. Installing it got more reliable, and every
default model pin now lives in one place.
- Trust what it ships, at the input. The Socratic interview no longer thinks
alone. At every ambiguity milestone it convenes a panel (a researcher, a
contrarian, a simplifier) to surface hidden assumptions before the question
reaches you. And ooo auto will not start building until the Seed is genuinely
low-ambiguity and passes QA.
- Trust what it ships, at the output. The verifier's verdict is now typed,
audited, and routed by an explicit admission policy. A test that really ran but
reported the wrong evidence form is no longer smeared as "fabrication," and a
faked clean run still doesn't pass.
🖥️ Run it anywhere — the Agent OS gets a new kernel
Pi is now a first-class Ouroboros runtime. Ouroboros owns the workflow engine,
Seed decomposition, checkpointing, evaluation handoff, and ooo skill dispatch;
for each runtime task it shells out to pi --mode json and normalizes Pi's JSONL
events into Ouroboros AgentMessage values. As the new runtime guide puts it:
"Pi is an Ouroboros runtime" means the runtime is selectable — not that Pi is
imported into Ouroboros. That is the whole Agent OS thesis in one sentence.
- PiLLMAdapter for --llm-backend pi; pi/pi_cli registered as LLM- and
interview-driver-capable in the backend registry and provider factory (#1326)
- Pi backend-aware default-model normalization: default --llm-backend pi uses
Pi's own backend default instead of forwarding an Anthropic model name (#1326)
- Align the Pi runtime with documented JSON mode (#1321)
- Report malformed Pi runtime events as a typed ProviderError instead of
failing opaquely (#1325)
- Wire the Pi runtime setup surface: ouroboros setup --runtime pi installs the
managed Pi bridge (5c674c1)
- Opt-in native Pi CLI smoke test for end-to-end confidence (#1329)
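The normalization step described here (JSONL events in, typed messages out, malformed lines reported as a typed error instead of an opaque failure) might look roughly like the sketch below. The event fields (type, text), the AgentMessage shape, and the ProviderError constructor are assumptions for illustration; the real Pi event schema and the Ouroboros types are not shown in these notes.

```python
import json
from dataclasses import dataclass


class ProviderError(Exception):
    """Stand-in for a typed provider error (hypothetical shape)."""

    def __init__(self, message: str, line_no: int):
        super().__init__(f"line {line_no}: {message}")
        self.line_no = line_no


@dataclass(frozen=True)
class AgentMessage:
    kind: str
    content: str


def normalize_jsonl(lines):
    """Convert JSONL event lines into AgentMessage values.

    Blank lines are skipped; malformed lines raise ProviderError.
    """
    messages = []
    for line_no, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError as exc:
            raise ProviderError(f"invalid JSON: {exc.msg}", line_no) from exc
        if not isinstance(event, dict) or "type" not in event:
            raise ProviderError("event is missing a 'type' field", line_no)
        messages.append(
            AgentMessage(kind=str(event["type"]), content=str(event.get("text", "")))
        )
    return messages


if __name__ == "__main__":
    sample = ['{"type": "assistant", "text": "hello"}', "", '{"type": "done"}']
    for message in normalize_jsonl(sample):
        print(message)
```

In a real runtime the lines would come from the standard output of the pi --mode json subprocess; the sketch takes any iterable of strings so it can run without Pi installed.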
Installing and updating got more trustworthy. The week's two same-day releases
surfaced real install-path risk; this closes it.
- Run setup with the freshly installed ouroboros binary, not a stale one
left on PATH (#1345)
- Installer UX improvements; pipx/pip install paths now preserve existing PATH
precedence (#1343)
One source of truth for model pins. The same default model strings were
hand-copied across three layers, so Opus had silently been frozen at 4.6 since
February. They now live in a single _model_defaults.py.
- Centralize every default Claude model pin into one source of truth
(_model_defaults.py) and pin exact snapshots rather than the "default"
sentinel, so evaluation/consensus grading stays reproducible. Net move: the
Opus reasoning tier → 4.8 (interview, seed, ontology, evaluation, consensus
advocate); the Sonnet judgment tier (qa_model) stays pinned at 4.6,
retiring the dated claude-sonnet-4-20250514 (#1324, #1323)
Roadmap, in the open. A point-in-time AgentOS issue-sequencing graph
(Track A / B / C) is now published so you can see which merged PRs resolved which
roadmap tracks. #961 remains the canonical roadmap SSOT (#1293).
Housekeeping. Prune unused optional packages (#1301); pin typer before the
vendored click to stabilize resolution (#1300).
🧠 Trust what it ships — at the input: the interview stops thinking alone
Ouroboros has always opened with a single questioner. Now that questioner has a
panel. Milestone lateral review is promoted from a non-blocking advisory to a
required lightweight subagent pass at exactly the moments hidden assumptions
start to bite.
- When an interview crosses an ambiguity milestone (initial → progress,
progress → refined, refined → ready), the main session dispatches
ouroboros_lateral_think with researcher, contrarian, and simplifier
personas (adding architect when the answer changes system shape or ownership)
before answering or asking the returned question (9d229c4)
- This is the supported "deep research style" interview experience: multiple
perspectives visibly help, while the final prompt stays easy to answer. Results
are folded into 2–3 concrete options or one recommended draft, not dumped
as a report
- Lateral review also fires whenever the main session would otherwise compress a
user's free-text into a decision, or when the question is about tradeoffs,
priorities, non-goals, risk, success criteria, or rollout
- run_lateral_review is now a declared interview capability, with per-runtime
capability/instruction artifacts wired in (9d229c4)
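The milestone rule in the list above can be sketched as a small lookup. The function name and its changes_system_shape flag are hypothetical; the actual dispatch goes through ouroboros_lateral_think, whose call signature is not shown in these notes. The sketch covers only the milestone trigger, not the free-text or tradeoff-question triggers.

```python
# Milestone transitions and personas as named in the release notes.
MILESTONES = {
    ("initial", "progress"),
    ("progress", "refined"),
    ("refined", "ready"),
}
BASE_PERSONAS = ("researcher", "contrarian", "simplifier")


def personas_for_transition(previous, current, changes_system_shape=False):
    """Return the personas to consult for a transition, or () if none is due."""
    if (previous, current) not in MILESTONES:
        return ()
    if changes_system_shape:
        # The architect joins when the answer changes system shape or ownership.
        return BASE_PERSONAS + ("architect",)
    return BASE_PERSONAS


if __name__ == "__main__":
    print(personas_for_transition("initial", "progress"))
    print(personas_for_transition("refined", "ready", changes_system_shape=True))
```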
ooo auto won't build something underspecified. The interview no longer
closes on ledger completeness alone.
- Gate auto runs on backend-confirmed low ambiguity (≤ 0.20) plus a pre-run
Seed QA pass for both the MCP and CLI entrypoints; QA findings feed back into
bounded Seed-repair attempts before blocking, so failures are actionable and
resumable (#1302)
- Normalize natural worktree-policy names (e.g. create_isolated_worktree → always)
and fail fast when complete_product=true is paired with a too-short timeout,
instead of burning the budget in the interview and blocking late (#1305)
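The gate described here has two conditions and one bounded loop: ambiguity at or below 0.20, a Seed QA pass, and a limited number of repair attempts before blocking. A minimal sketch under stated assumptions follows. The function signature, the reason strings, and the default of two repair attempts are hypothetical; only the 0.20 threshold comes from the notes.

```python
from dataclasses import dataclass, field

AMBIGUITY_THRESHOLD = 0.20  # from the release notes: low ambiguity means <= 0.20


@dataclass
class GateResult:
    allowed: bool
    reason: str
    findings: list = field(default_factory=list)


def gate_auto_run(seed, ambiguity, run_qa, repair, max_repairs=2):
    """Allow a run only if ambiguity is low and Seed QA passes.

    run_qa(seed) returns a list of findings (an empty list means pass).
    repair(seed, findings) returns a revised seed.
    """
    if ambiguity > AMBIGUITY_THRESHOLD:
        return GateResult(False, "ambiguity_too_high")
    findings = run_qa(seed)
    attempts = 0
    while findings and attempts < max_repairs:
        seed = repair(seed, findings)
        findings = run_qa(seed)
        attempts += 1
    if findings:
        # Returning the findings keeps the failure actionable and resumable.
        return GateResult(False, "seed_qa_failed", findings)
    return GateResult(True, "ok")


if __name__ == "__main__":
    def qa(seed):
        return [] if "criteria" in seed else ["missing acceptance criteria"]

    def fix(seed, findings):
        return seed + " + criteria"

    print(gate_auto_run("build a CLI", 0.15, qa, fix))
    print(gate_auto_run("build a CLI", 0.40, qa, fix))
```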
🛡️ Trust what it ships — at the output: a verdict you can't game
The more autonomous the loop, the more its "done" has to mean done. This release
makes the verifier's decision typed, auditable, and policy-routed (RFC #814,
Verdict Envelope v1).
- Promote TraceGuard verdict admission into VerifierVerdict: H1 verifier
output now carries a typed status, evidence refs, and a retry_admission, and
ACCEPT / RETRY / REDISPATCH / ESCALATE_MODEL / ESCALATE_HUMAN / BLOCK decisions
are persisted on atomic typed-evidence events (#1330)
  - Benchmark fixtures: accepted → ACCEPT, missing evidence → EVIDENCE_MISSING / RETRY,
    semantic miss → SCOPE_CREEP / REDISPATCH, repeated fabrication →
    FABRICATION_SUSPECTED / ESCALATE_MODEL
- Prefer the verifier's retry-admission policy (H7): re-run the same leaf only
when retry_admission=RETRY; honor intentional divergence between
failure_class and retry_admission (e.g. FABRICATION_SUSPECTED +
REDISPATCH) instead of inferring policy from the failure class alone (#1331)
- Classify masked test evidence fairly (#1292): a transcript that clearly ran
the test command but masked its status behind an output filter (… | tail) is
now EVIDENCE_FORM_MISMATCH (retryable, with actionable feedback such as adding
set -o pipefail) rather than FABRICATION_SUSPECTED. The #1208 guard holds:
unprotected output-filter pipelines still don't prove a clean commands_run
claim. The verifier's evidence boundary is now codified in docs so core stays
language- and runner-agnostic
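The H7 rule (route on retry_admission, not on failure_class) can be illustrated with a small sketch. The admission values and the fixture pairs come from the notes above; the dataclass fields, the action names, and the use of None as the failure class of an accepted verdict are assumptions, not the project's actual VerifierVerdict definition.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Admission(Enum):
    ACCEPT = "ACCEPT"
    RETRY = "RETRY"
    REDISPATCH = "REDISPATCH"
    ESCALATE_MODEL = "ESCALATE_MODEL"
    ESCALATE_HUMAN = "ESCALATE_HUMAN"
    BLOCK = "BLOCK"


@dataclass(frozen=True)
class VerifierVerdict:
    failure_class: Optional[str]  # assumed to be None when accepted
    retry_admission: Admission
    evidence_refs: Tuple[str, ...] = ()


# Hypothetical action names, one per admission decision.
ACTIONS = {
    Admission.ACCEPT: "accept",
    Admission.RETRY: "rerun_same_leaf",
    Admission.REDISPATCH: "redispatch_to_new_leaf",
    Admission.ESCALATE_MODEL: "escalate_model",
    Admission.ESCALATE_HUMAN: "escalate_human",
    Admission.BLOCK: "block",
}


def next_action(verdict: VerifierVerdict) -> str:
    """Route on retry_admission only; failure_class stays diagnostic context."""
    return ACTIONS[verdict.retry_admission]


# Benchmark fixture pairs as listed in the release notes.
FIXTURES = {
    "accepted": VerifierVerdict(None, Admission.ACCEPT),
    "missing evidence": VerifierVerdict("EVIDENCE_MISSING", Admission.RETRY),
    "semantic miss": VerifierVerdict("SCOPE_CREEP", Admission.REDISPATCH),
    "repeated fabrication": VerifierVerdict(
        "FABRICATION_SUSPECTED", Admission.ESCALATE_MODEL
    ),
}

if __name__ == "__main__":
    for name, verdict in FIXTURES.items():
        print(f"{name}: {next_action(verdict)}")
    # Intentional divergence: the same failure class with a different admission.
    diverged = VerifierVerdict("FABRICATION_SUSPECTED", Admission.REDISPATCH)
    print(f"diverged: {next_action(diverged)}")
```

Because the router never inspects failure_class, a verdict such as FABRICATION_SUSPECTED + REDISPATCH is honored as written rather than being overridden by a policy inferred from the class name.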
What's Changed
Runtimes & Agent OS
- feat(providers): add Pi LLM adapter (#1326)
- fix(pi): align runtime with documented JSON mode (#1321)
- fix(pi): report malformed runtime events (#1325)
- fix(setup): wire Pi runtime setup surface (5c674c1)
- test(orchestrator): add opt-in Pi CLI smoke test (#1329)
- fix(installer): prefer freshly installed ouroboros for setup (#1345)
- feat(installer): improve install script UX (#1343)
- refactor(config): centralize Claude model pins into a single source of truth (align to 4.8) (#1324)
- fix(config): replace retiring qa_model default with claude-sonnet-4-6 (#1323)
- chore(deps): prune unused optional packages (#1301)
- fix(deps): pin typer before vendored click (#1300)
- fix(opencode): cover Windows cleanup review blockers (#1320)
- fix(goose): keep LLM completion calls profile-free (#1303)
- fix(run): guard home dir in _detect_project_root_from_seed_path (#1313)
Interview (the philosophy layer)
- feat(interview): dispatch lateral review at milestones (9d229c4)
- fix(auto): gate runs on low-ambiguity seed QA (#1302)
- Harden ooo auto policy aliases and timeout preflight (#1305)
Verifier & harness integrity
- feat(harness): promote TraceGuard verdict admission (#1330, refs #814)
- fix(h7): prefer verifier retry admission policy (#1331)
- fix(orchestrator): classify masked test evidence forms (#1292, refs #1234)
Docs
- docs(providers): document Pi provider surfaces (#1327)
- docs(runtime): fix shipped backend wording (#1332)
- docs(agentos): add issue sequencing graph snapshot (#1293)
- Verdict Envelope v1 RFC, verifier-evidence-policy, runtime-capability-matrix,
Pi runtime guide, and contributing/key-patterns updates
What's Changed
- fix(orchestrator): classify masked test evidence forms by @Q00 in #1292
- docs(agentos): add issue sequencing graph snapshot by @Q00 in #1293
- fix(deps): pin typer before vendored click by @Q00 in #1300
- chore(deps): prune unused optional packages by @Q00 in #1301
- fix(goose): keep LLM completion calls profile-free by @mdc2122 in #1303
- fix(run): guard home dir in _detect_project_root_from_seed_path by @kenlin8827 in #1313
- fix(opencode): cover Windo...
v0.40.1
What's Changed
Bug Fixes
- Include click as an installer runtime dependency (#1299)
Full Changelog: v0.40.0...v0.40.1
v0.40.0
v0.40.0 — ooo auto crosses the line
This is the release where ooo auto stops being a demo and becomes a machine that finishes your work.
You drop in a vague intention. The Socratic interview pins it down into a precise,
machine-checkable goal — and then the engine refuses to stop until that goal is
actually built, verified, and shipped. No babysitting. No "it drew up a plan
and gave up halfway." The loop owns the outcome, end to end.
This is not "generate a plan." This is goal in, product out — autonomously.
The headline
The interview no longer stops at understanding. It stops at done.
ooo auto specifies your goal and then drives the full
Interview → Seed → Execute → Evaluate loop on its own — and keeps going until
the goal is real. Then it goes a little further. This is a feature that runs
beyond the goal.
- A seamless run, end to end. ooo auto is no longer a convenience wrapper
around the steps; it's a single closed loop that takes your intent and carries
it all the way to a shipped product. Interview lifecycle events stream into the
EventStore, detached job tracking got real UX, and auto.product.emitted fires
the moment Ralph hits a successful terminal, so you know it actually delivered.
- An interview that specifies your goal AND finishes the job. The interview
no longer just clarifies and quits. Closure ladders, auto_fill_remaining and
partial_seed_from_evidence substrates, and safe-default synthesis mean a
non-converging conversation still becomes a real, executable product instead of
a dead end. Deadlines route through a closure ladder, not a terminal BLOCKED.
- A loop that will not quit until your goal is done. Ralph persistence,
wall-clock RuntimeControls + Watchdog, checkpoint-committed coding sessions,
and oscillation detection routed through lateral UNSTUCK escalation keep the
engine grinding through stalls, recovery, and dead patches until verification
actually passes, not until it gets tired.
- Beyond the goal. This is the headline: the system is now built to run past
the point where you stop watching. You set the goal; it carries the goal to
completion on its own. That capability did not exist before this release:
a feature beyond the goal.
What's Changed
ooo auto — autonomous completion
- Emit auto.product.emitted on Ralph success terminal (#1297)
- auto_fill_remaining substrate for non-converging interviews (#1296)
- Interview deadline → closure ladder, not terminal BLOCKED (#1270)
- partial_seed_from_evidence substrate (#1269)
- Degraded seed → partial product terminal (#1271)
- Isolate coding sessions with checkpoint commits (#1281)
- Wire interview lifecycle events to EventStore (#1260)
- Make ledger_done the primary interview closure check (#1252)
- Safe-default lateral escalation substrate + matcher-fire routing (#1250, #1251)
- Route Ralph oscillation_detected through UNSTUCK_LATERAL (#1175)
- Propagate closure_mode to seed grading (#1265)
- Safe-default closure mode + partial-unsafe blocker code (#1167)
- Ledger-derived task-class inference + task-class catalog (#1173, #1177)
- Additive assumption_sources provenance surface (#1169)
- Surface defaulted_sections in AutoPipelineResult (#1146)
- Canonical stop_reason_code for interview-layer blockers (#1151)
- Relay auto interview questions in progress output (#1284)
- Improve detached auto job tracking UX (#1286)
Runtime, orchestrator & acceptance
- Runtime acceptance evidence (L3-1) (#1181)
- RuntimeControls + wall-clock Watchdog runner (L2-1) (#1178)
- Canonical acceptance harness skeleton (L0-a) (#1174)
- AgentOS health-readiness table + release-readiness triage (#1282)
- Plugin v0.4 schema + tool-call hook type promotion (#1277)
- Plumb force flag through ouroboros_generate_seed (#1158)
Bug Fixes
- Prevent timezone comparison TypeError when merging events (#1298)
- Stop blocking language/runtime/greenfield interview questions (#1295)
- Size Ralph per-iteration timeout to pipeline budget, not 1800s default (#1294)
- Require terminal run evidence before Ralph (#1279)
- Clean up synchronous complete-product runs (#1280)
- Close backend-ready fallback resumes (#1278)
- Synthesize seed when authoring backend is unavailable (#1261)
- Make L1 CLI predicate match on goal signal alone (#1264)
- Make watchdog controls replay safe (#1207)
- Clear stale stop reason codes / expose active task class in MCP meta (#1194, #1196)
- Honor prompt-declared non_goals in unsafe-context matcher (#1221)
- Use .ouroboros marker and existence gate for project_dir resolution (#1246)
- Resolve project dirs for central seeds (#1161)
- Migrate Codex setup profiles to profile-v2; accept official Rust CLI binaries (#1268, #1162)
- Correlate verifier diagnostics; credit transcript test commands for tests_passed (#1198, #1166)
- Match command claims wrapped in output redirection/pager pipes (#1168)
- Tolerate gradle verifier evidence; extend AC stall watchdog budget (#1238, #1233)
- Register and wire the qa command (#1230)
- Preserve MCP tool metadata across transport (#1210)
- Bound workflow lifecycle reason_code/refs and nesting depth (#1144)
- Guard workflow_ir aggregate type against raw append (#1147)
- Sanitize Windows-reserved chars in checkpoint seed_id (#1156)
- Enforce runtime artifact env; keep plugin artifact hooks self-contained (#1197, #1206)
- Project blocked plugin invocations as workflow failures (#1215)
- Prompt required grants during plugin install (#1209)
- Keep fallback/installer installs stable-only; align plugin version fallback with hatch-vcs (#1217, #1216)
- Reject status-masking test pipelines (#1208)
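A status-masking pipeline is one where a later pipe stage hides the test command's exit code. In bash, a pipeline's exit status is that of its last command unless pipefail is set, so pytest | tail exits 0 even when tests fail. The sketch below shows one way such a command could be flagged. It is a rough heuristic with a hypothetical function name, not the verifier's actual rule, and it only recognizes a set ... pipefail that appears earlier in the same command string.

```python
import shlex


def shell_tokens(command: str):
    """Tokenize a shell command, keeping operators such as | and ; separate."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


def masks_exit_status(command: str) -> bool:
    """Return True if a pipe hides an earlier stage's status (no pipefail)."""
    tokens = shell_tokens(command)
    pipefail = False
    for index, token in enumerate(tokens):
        # Covers "set -o pipefail" and combined forms like "set -eo pipefail".
        if token == "set" and "pipefail" in tokens[index + 1 : index + 3]:
            pipefail = True
        # "||" is a logical OR and arrives as its own token, so it is ignored.
        if token in ("|", "|&") and not pipefail:
            return True
    return False


if __name__ == "__main__":
    for cmd in (
        "pytest -q | tail -n 5",
        "set -o pipefail; pytest -q | tail -n 5",
        "pytest -q || echo failed",
    ):
        print(f"{cmd!r}: masked={masks_exit_status(cmd)}")
```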
Docs, Tests & Maintenance
- AgentOS profile taxonomy, ControlJournal, IR ↔ projection mapping contracts locked (#1275, #1274, #1150)
- Plugin artifact/state and before/after tool-call hook contracts defined (#1276, #1145)
- Canonical regression coverage for closure ladder, runtime probes, live-run evidence (#1272, #1222, #1195)
- CI: enforce ooo auto R-run section per RFC #1256 §I5 (#1259)
- Make Hermes runtime cwd assertions cross-platform (#1288)
Full Changelog: v0.39.1...v0.40.0
What's Changed
- fix(persistence): sanitize Windows-reserved chars in checkpoint seed_id (fixes #1155) by @Jun-0913 in #1156
- feat(auto): canonical stop_reason_code for interview-layer blockers by @shaun0927 in #1151
- fix(auto): close interview on ledger-only consensus at max_rounds by @shaun0927 in #1148
- feat(auto): surface defaulted_sections in AutoPipelineResult by @shaun0927 in #1146
- feat(auto): safe-default closure mode + partial-unsafe blocker code (PR-B2) by @shaun0927 in #1167
- feat(auto): additive assumption_sources provenance surface (PR-C2) by @shaun0927 in #1169
- fix(hook): treat configured installs without prefs.json as returning users by @lifrary in #1152
- feat(auto): task-class catalog data (L1-a) by @shaun0927 in #1173
- feat(mcp): plumb force flag through ouroboros_generate_seed by @hooni0918 in #1158
- feat(auto): route Ralph oscillation_detected through UNSTUCK_LATERAL (L5-a) by @shaun0927 in #1175
- fix(orchestrator): credit transcript test commands for tests_passed claims by @nkjunbc in #1166
- fix(orchestrator): match command claims wrapped in output redirection/pager pipes by @nkjunbc in #1168
- feat(tests): canonical acceptance harness skeleton (L0-a) by @shaun0927 in #1174
- feat(auto): ledger-derived task-class inference (L1-b) by @shaun0927 in #1177
- feat(runtime): RuntimeControls + wall-clock Watchdog runner (L2-1) by @shaun0927 in #1178
- feat(orchestrator): runtime acceptance evidence (L3-1) by @shaun0927 in #1181
- fix(plugin): accept command-level AgentOS metadata by @shaun0927 in #1180
- test(canonical): emit L0 summary and lock fixture failures by @shaun0927 in #1182
- docs(auto): align L1 task-class follow-up labels by @shaun0927 in #1183
- docs(auto): align convergence contract with safe-default closure by @shaun0927 in #1184
- fix(auto): surface assumption_sources through envelope clients by @shaun0927 in #1185
- fix(orchestrator): preserve shell-preamble command proof after output plumbing by @shaun0927 in #1186
- fix(auto): route Ralph oscillation replay through UNSTUCK_LATERAL by @shaun0927 in #1187
- feat(auto): Seed AC injection + active_task_class envelope (L1-d, L1-e) by @shaun0927 in #1188
- feat(auto): wall-clock watchdog integration in AutoPipeline (L2-2) by @shaun0927 in #1189
- feat(auto): runtime-probe envelope + advisory probe_runner (L3-2) by @shaun0927 in #1190
- feat(tests): L0 live-wire + L1 catalog cross-validate (P1) by @shaun0927 in #1191
- Keep plugin runtime writes outside trusted homes by @shaun0927 in #1193
- fix(tests): align canonical live-run evidence by @Q00 in #1195
- fix(auto): expose active ...
v0.39.1
What's Changed
Features
- Add ouroboros status run --json projection surface (#1133)
- Record durable workflow lifecycle events in orchestrator (#1134)
- Add on_error/on_cancel plugin observability hooks (PR E) (#1137)
- Expose MCP interview reasoning metadata (#1140)
- Prompt for required trust grants on plugin install (#1141)
- Expose Ralph-start alias while preserving runtime ownership
- Dispatch lifecycle hooks within plugin trust boundaries
- Make plugin permission waits share the typed HITL contract
- Expose projection checkpoint anchors safely
- Expose plugin manifests as harness descriptors
- Let safe-default synthesis close persisted interviews
- Surface malformed Claude tool-use turns at the provider boundary
Bug Fixes
- Defer lateral advisory side effects in interview (#1130)
- Make plugin workflow ids collision-proof
- Advise first live milestone crossing in interview
- Make auto ledger conflicts deterministic
- Preserve bounded recovery redispatch semantics
- Validate HITL timeout decisions through replayed state
- Keep safe defaults tied to persisted interviews
Testing & Hardening
- Expand workflow IR conformance harness (#1135)
- Add mechanical-evaluation projection fixture (#1132)
- Lock plugin lifecycle conformance baseline
- Lock the short-goal interview convergence matrix against regression
- Lock projection fixture evidence flow
Instrumentation & Docs
- Emit structured-log events at safe-default decision points in auto (#1138)
- Mark completed projection follow-up slots in agentos docs (#1136)
- Persist init interview HITL telemetry without coupling the renderer
- Record interview lateral-review design before implementation
- Update README
Full Changelog: v0.39.0...v0.39.1
What's Changed
- docs: define interview milestone lateral contract by @honor2030 in #1108
- feat(plugin): add hook runtime audit schema names by @shaun0927 in #1109
- fix(runtime): surface malformed tool-use turns by @shaun0927 in #1111
- feat(hitl): record init interview responses by @shaun0927 in #1112
- feat(mcp): add start ralph tool alias by @shaun0927 in #1113
- feat(hitl): validate timeout events from replay by @shaun0927 in #1114
- Document runtime delegation ownership contract by @shaun0927 in #1115
- Specify plugin permission HITL contract by @shaun0927 in #1116
- test(auto): cover #821 short-goal interview convergence matrix by @shaun0927 in #1117
- feat(plugin): dispatch v1 lifecycle hooks by @shaun0927 in #1110
- feat(plugin): expose manifest descriptor projection by @shaun0927 in #1118
- feat(auto): consume lateral recovery plans for Ralph redispatch by @shaun0927 in #1120
- feat(auto): centralize deterministic ledger conflict policy by @shaun0927 in #1121
- test(plugin): lock v0.3 lifecycle conformance by @shaun0927 in #1119
- fix(auto): close safe-defaultable interview gaps at max rounds by @shaun0927 in #1122
- test(projection): lock mechanical evaluation fixture by @shaun0927 in #1123
- feat(projection): surface context checkpoint anchors by @shaun0927 in #1124
- test(workflow): lock projection boundary fixture by @shaun0927 in #1125
- feat(plugin): classify terminal hook contract by @shaun0927 in #1127
- feat(interview): surface milestone lateral review advisories by @shaun0927 in #1128
- feat(workflow): represent plugin actions as planned nodes by @shaun0927 in #1126
- fix(auto): let safe-default synthesis close interviews by @shaun0927 in #1129
- fix(interview): defer lateral advisory side effects by @shaun0927 in #1130
- feat(plugin): prompt for required trust grants on install by @Q00 in #1141
- docs(agentos): mark completed projection follow-up slots by @shaun0927 in #1136
- test(harness): add mechanical-evaluation projection fixture by @shaun0927 in #1132
- instrument(auto): emit structured-log events at safe-default decision points by @shaun0927 in #1138
- Expose MCP interview reasoning metadata by @Q00 in #1140
- feat(cli): add ouroboros status run --json projection surface by @shaun0927 in #1133
- feat(orchestrator): record durable workflow lifecycle events by @shaun0927 in #1134
- feat(plugin): add on_error/on_cancel observability hooks (PR E) by @shaun0927 in #1137
- test(orchestrator): expand workflow IR conformance harness by @shaun0927 in #1135
v0.39.0
Ouroboros v0.39.0
This release lands a high-severity security fix, flips ooo run to the
fat-harness execution path by default, and completes the AgentOS roadmap
wiring/baseline milestone tracked in #961.
🔒 Security
RCE via untrusted project-directory .env (high severity)
Ouroboros is run inside cloned repositories. config/loader.py loaded
./.env from the working directory into os.environ at import time with the
same trust as the home-directory ~/.ouroboros/.env. Because
OUROBOROS_*_CLI_PATH and the runtime/backend selector env vars decide which
binary the Claude Agent SDK / runtime adapters spawn, a malicious repository
could ship a .env plus an executable script and achieve arbitrary code
execution on the victim's machine as soon as they ran any command that builds
a runtime adapter (e.g. ooo, ouroboros init).
- Classification: CWE-426 (Untrusted Search Path) + CWE-15 (External
Control of System or Configuration Setting)
- Root cause: the project-directory .env travels with whatever
repository the user cloned and is therefore untrusted input;
it was conflated with the trusted home config.
Fixes:
- Denylist for untrusted .env (#1078):
blocks the 8 OUROBOROS_*_CLI_PATH keys plus the runtime/backend selectors
(OUROBOROS_AGENT_RUNTIME, OUROBOROS_RUNTIME, OUROBOROS_LLM_BACKEND)
when loading an untrusted .env.
- Fail-closed default: _load_env_file now defaults to trusted=False;
only ~/.ouroboros/.env opts into trust explicitly, so any future caller is
safe by default.
- Defense in depth: ClaudeCodeAdapter._resolve_cli_path rejects any
resolved CLI path inside the current working directory and falls back to the
SDK bundled CLI. A legitimate Claude CLI is always a global install, never
shipped inside a repo.
- Additional hardening: block PATH from untrusted project env
(#1098) and refuse symlinked
managed install roots (#1097).
Trusted sources — shell export, ~/.ouroboros/.env,
~/.ouroboros/config.yaml — keep full custom-CLI support, so no legitimate
workflow regresses. The fix was adversarially reviewed by a security-focused
agent over two rounds (round 2 returned APPROVED with no remaining bypasses).
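The shape of the fix can be illustrated with a minimal sketch. This is not the code in config/loader.py: the function names (load_env_text, cli_path_allowed), the KEY=VALUE parser, and the "existing variables win" behaviour are assumptions made for illustration. Only the denied key names and the trusted=False default come from the notes above, and the glob stands in for the eight *_CLI_PATH keys.

```python
import fnmatch
from pathlib import Path
from typing import Dict, MutableMapping

# Keys listed in the release notes; the glob is a stand-in for the
# eight OUROBOROS_*_CLI_PATH keys (their exact names are not listed).
DENIED_PATTERNS = (
    "OUROBOROS_*_CLI_PATH",
    "OUROBOROS_AGENT_RUNTIME",
    "OUROBOROS_RUNTIME",
    "OUROBOROS_LLM_BACKEND",
    "PATH",
)


def is_denied(key: str) -> bool:
    """True if an untrusted .env must not be allowed to set this key."""
    return any(fnmatch.fnmatchcase(key, pat) for pat in DENIED_PATTERNS)


def parse_env(text: str) -> Dict[str, str]:
    """Very small KEY=VALUE parser (hypothetical; real loaders handle more)."""
    pairs: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip().strip("'\"")
    return pairs


def load_env_text(
    text: str,
    environ: MutableMapping[str, str],
    trusted: bool = False,  # fail-closed: callers must opt in to trust
) -> Dict[str, str]:
    """Apply parsed pairs to environ, skipping denied keys unless trusted."""
    applied: Dict[str, str] = {}
    for key, value in parse_env(text).items():
        if not trusted and is_denied(key):
            continue
        if key not in environ:  # assumption: existing variables win
            environ[key] = value
            applied[key] = value
    return applied


def cli_path_allowed(candidate: Path, cwd: Path) -> bool:
    """Defense in depth: refuse a CLI binary that lives under the working dir."""
    resolved, root = candidate.resolve(), cwd.resolve()
    return resolved != root and root not in resolved.parents
```

Because trust is an explicit argument that defaults to False, a new call site that forgets to pass it gets the restrictive behaviour rather than the permissive one.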
🙏 Reported by @qerogram — thank you for the responsible disclosure.
🚀 AgentOS Roadmap Progress (#961)
The AgentOS substrate wiring + baseline milestone is now complete.
Track A — ooo run fat-harness
- ooo run CLI now defaults to the fat-harness execution path
- Verifier-capability, typed blocked evidence, profile-aware decomposition, and
profile-schema wiring landed; fat-harness AC acceptance now requires verifier
PASS with typed evidence verification
- Baseline gate evidence captured and recorded; #961 carries
baseline-metrics-captured and the agentos-substrate-wiring milestone is
closed
- Readable baseline-metrics rendering + semantic-miss baseline metric reporting
Track B — ooo auto self-healing
- Phase 2 typed recovery plan and Phase 3 DomainProfile merged
- Hardened auto: Seed goal-drift repair from the ledger, strict grading with
concrete coding evidence, observation/execution acceptance-criteria
separation, and complete-product Ralph-loop wiring
Track C — AgentOS substrate dump (#920–#960)
- Workflow IR v1 lifecycle replay, conformance fixtures, and projection
hardening against ambiguous run identity
- Plugin lifecycle hook permission scope, v1 hook vocabulary, and bounded
Tier 1 hook contract surface
- HITL state projection, run-snapshot projection, typed HITL resume
validation, and cancel-confirmation routing through typed events
- Runtime transition contract validation (fail-closed on incomplete revision
checks, malformed input rejection, secret-alias detection)
- Skill runtime guides installable for Hermes/Claude/Codex from backend metadata
✨ Features
- ooo run CLI flipped to fat-harness by default (with temporary opt-in path
during rollout)
- CLI: read-only Workflow IR inspection and status run projection JSON
(#1063, #1064)
- CLI: status health checks (#1101)
- Harness: strict projection records, project artifact/verdict records (#1061)
- Codex: live MCP doctor check (#1047), missing-MCP-extra detection (#1046),
JSONL stdio for live MCP doctor (#1052)
- Orchestrator: workflow lifecycle conformance report (#1038), HITL state
projection (#1036), run snapshot projection (#1037)
- Experimental Goose runtime can be enabled safely
🐛 Bug Fixes
- Orchestrator: prevent execution workers from recursively invoking auto
(#1075), recover from invalid dependency stages (#1070), reconcile sibling
ACs from execution evidence (#1096)
- Auto: surface execution terminal failures instead of reporting complete
(#1076), canonicalize observation execution criteria (#1095), keep repaired
Seed identifiers synchronized (#1071)
- Jobs: preserve runner failure over terminal evidence (#1094), fail stalled
progress-accounting executions (#1085), wait for runner cleanup after
progress-stall failure (#1089)
- Interview: scope completion-signal heuristic to user-prefix answers (#1077)
- Goose: preserve approval for default permission modes (#1106)
- Evidence scope hardening for observation/docs-only ACs (#1072, #1073, #1093)
- Bigbang: add force flag to SeedGenerator.generate, replacing the
FORCED_SCORE_VALUE hack (#1107)
📚 Docs & Maintenance
- Clarify Windows WSL installation path
- Align contributing documentation guidance (#1102)
- AgentOS: sequence projection follow-up slots, clarify Workflow IR v1 boundary
- Remove legacy self-report acceptance fallback (#1086) and unreachable verifier
branch
Full Changelog: v0.38.2...v0.39.0
What's Changed
- feat(orchestrator): add fat-harness baseline metrics report by @honor2030 in #977
- feat(plugin): define hook audit event vocabulary by @shaun0927 in #973
- feat(runtime): classify malformed tool-use turns by @shaun0927 in #972
- feat(plugin): accept optional hook declarations by @shaun0927 in #970
- docs(plugin): define lifecycle hook contract by @shaun0927 in #969
- feat(orchestrator): support typed blocked leaf evidence by @shaun0927 in #927
- feat(profiles): introduce profile YAML schema + loader by @honor2030 in #976
- feat(hitl): add typed WAIT/RESUME contract by @shaun0927 in #971
- feat(plugin): add v1 lifecycle hook contract types (#939) by @shaun0927 in #984
- feat(plugin): enforce v1 hook contract in manifest validator (#939 PR-2) by @shaun0927 in #985
- feat(plugin): add schema v0.3 with v1-only hook enum (#939 PR-3) by @shaun0927 in #986
- feat(plugin): add v1 hook lifecycle permission scope (#939 PR-4) by @shaun0927 in #987
- feat(orchestrator): add human-readable baseline metrics formatter by @shaun0927 in #988
- feat(orchestrator): record fat-harness baseline metrics evidence by @shaun0927 in #989
- feat(harness): add Run/Step/Artifact/Verdict projection records (#946 PR-1a) by @shaun0927 in #980
- feat(harness): add ProjectionBuilder over the EventStore (#946 PR-1b) by @shaun0927 in #983
- feat(harness): add journal → evidence-manifest normalizer (#978 P1) by @shaun0927 in #982
- feat(orchestrator): add typed Workflow IR schema and validator (#956 PR-1) by @shaun0927 in #981
- feat(harness): expose projection records through MCP query (#946 PR-2) by @shaun0927 in #990
- feat(orchestrator): add read-only Seed to Workflow IR adapter (#956 PR-2) by @shaun0927 in #991
- feat(orchestrator): audit profile-aware AC decomposition (#920 PR-1) by @shaun0927 in #992
- feat(harness): load AC manifests for TraceGuard deliver ga...
v0.38.2
What's Changed
Bug Fixes
- Close residual allowed_tools=[] leak in sub-CLI envelope for interview
Testing
- Lock empty allowedTools passthrough
- Cover strict empty allowed-tools envelope (#975)
Full Changelog: v0.38.1...v0.38.2
v0.38.1
What's Changed
Features
- Persist typed recovery plans after QA failure (#928)
- Let decomposition consume execution profiles (#929)
- Route verifiers by profile capability (#926)
Bug Fixes
- Mutual-agreement closure gate for interview driver (#962)
Full Changelog: v0.38.0...v0.38.1
What's Changed
- fix(auto): mutual-agreement closure gate for interview driver by @Q00 in #962
- feat(orchestrator): route verifiers by profile capability by @shaun0927 in #926
- feat(orchestrator): let decomposition consume execution profiles by @shaun0927 in #929
- feat(auto): persist typed recovery plans after QA failure by @shaun0927 in #928
Full Changelog: v0.38.0...v0.38.1
v0.38.0
What's Changed
This release wraps up the #830 Orchestrator stack (9 PRs), the #809 P3 DomainProfile rollout (coding + research profiles wired through ooo auto), and the #518 AgentProcess durability work. It also brings a major round of security/safety hardening across plugin trust, secret redaction, and subprocess bounding.
Features
Orchestrator (#830 stack, PRs 1/9 → 9/9)
- Profile YAML schema + loader (#881)
- Typed evidence schema validator (#883)
- External verifier loop (#884)
- Profile-aware decomposition params (#885)
- PRE/POST phase wrappers (#886)
- Failure taxonomy + recovery policy (#887)
- Adaptive model/tool routing (#889)
- Per-dispatch context budget (#890)
- ProfileBackedStrategy + deprecate code-executor.md (PR 9/9)
DomainProfile (#809 P3 stack)
- First built-in coding DomainProfile + parity tests (#851)
- Second built-in research DomainProfile + plurality acceptance (#850)
- 3-step DomainProfile activation in ooo auto CLI (#852)
- Route AutoAnswerer through DomainProfile (#854)
- Route safe_defaults through DomainProfile (#853)
- Recovery-loop guards (#888)
AgentProcess & Evolution (#518, #578)
- Durable pause/resume for AgentProcess via CheckpointStore (#844)
- Wrap evolve_step in AgentProcess (#846)
- Map watchdog timeouts onto Directive vocabulary (#836)
- Emit control.directive.emitted from watchdog timeouts (#838)
MCP & Auto
- ouroboros_start_evaluate fire-and-forget handler (#882)
- Unified status surface for auto + ralph (#792)
Bug Fixes
Orchestrator (#891 stack)
- Wire H3 wrappers into ProfileBackedStrategy
- Per-profile Bash activity semantics
- Direct executor through every AC, not just first
- Replace build_post_block reuse with multi-AC directive
- Preserve legacy domain guidance in system prompt
- Derive guidance tool list from profile
- Single consolidated evidence record + blocker in JSON
- Drop blocker-marker contract until H2 schema lands
- Strip deprecation banner from live code-executor prompt
MCP & Auto
- Harden start_auto session exclusivity
- Bump interview/seed phase timeouts; exempt user_preferences from
shell-metachar scan (#894)
- Restore coding DomainProfile lightweight loading (#879)
- Restore coding profile lazy import boundary (#875)
- Bound encoded Seed filenames (#878)
AgentProcess durability
- Keep AgentProcess cancel durable until restart observes it (#845)
- Prevent false terminal cancellation for live AgentProcess work (#880)
- Preserve AgentProcess replay across lifecycle slices (#847)
Security & Hardening
- Redact secret-shaped event resource payloads (#866)
- Avoid persisting full Codex auth paths in failure events (#864)
- Make trust and disable transitions atomic in CLI/plugin (#868)
- Bound firewall subprocess invocation time (#858)
Other
- Preserve raw JSON success fallback in copilot (#877)
- Ignore telemetry JSON in copilot success fallback (#870)
Refactoring
- Replace hardcoded model strings with config-aware getters in PM (#893)
Testing
- Register integration pytest marker (#896)
- Full Interview→Seed→Run→Ralph→QA E2E integration test (#793)
- Isolate codex_cli profile tests from user config
Documentation
- RFC: unified runtime timeout contract (#578) (#841)
- Clarify stable Python source checkout setup (#876, #874)
Full Changelog: v0.37.0...v0.38.0
What's Changed
- test(orchestrator): isolate codex_cli profile tests from user config by @Q00 in #872
- docs(rfc): unified runtime timeout contract (#578) by @shaun0927 in #841
- feat(auto): 3-step DomainProfile activation in ooo auto CLI (#809 P3, PR 3/6) by @shaun0927 in #852
- fix(plugin): bound firewall subprocess invocation time by @shaun0927 in #858
- fix(cli/plugin): make trust and disable transitions atomic by @shaun0927 in #868
- fix(security): redact secret-shaped event resource payloads by @shaun0927 in #866
- fix(interview): avoid persisting full Codex auth paths in failure events by @shaun0927 in #864
- feat(evolution): map watchdog timeouts onto Directive vocabulary (#578) by @shaun0927 in #836
- feat(orchestrator): durable pause/resume for AgentProcess via CheckpointStore (#518) by @shaun0927 in #844
- feat(evolution): wrap evolve_step in AgentProcess (#518) by @shaun0927 in #846
- feat(orchestrator): implement AgentProcess.replay() from control directive events (#518) by @shaun0927 in #847
- feat(auto): route safe_defaults through DomainProfile (#809 P3, PR 5/6) by @shaun0927 in #853
- feat(auto+jobs): unified status surface for auto + ralph by @shaun0927 in #792
- docs: clarify source checkout Python defaults by @shaun0927 in #874
- feat(auto): second built-in research DomainProfile + plurality acceptance (#809 P3, PR 6/6) by @shaun0927 in #850
- feat(auto): route AutoAnswerer through DomainProfile (#809 P3, PR 4/6) by @shaun0927 in #854
- fix(copilot): ignore telemetry JSON in success fallback by @shaun0927 in #870
- feat(evolution): emit control.directive.emitted from watchdog timeouts (#578) by @shaun0927 in #838
- test(integration): full Interview→Seed→Run→Ralph→QA E2E by @shaun0927 in #793
- feat(auto): first built-in coding DomainProfile + parity tests (#809 P3, PR 2/6) by @shaun0927 in #851
- fix(auto): restore coding DomainProfile lightweight loading by @shaun0927 in #879
- feat(auto): recovery-loop guards (#809 P2.2b, Stack 1/2) by @Q00 in #888
- docs: clarify stable Python source checkout setup by @honor2030 in #876
- fix(copilot): preserve raw JSON success fallback by @shaun0927 in #877
- fix(auto): bound encoded Seed filenames by @shaun0927 in #878
- fix(auto): restore coding profile lazy import boundary by @shaun0927 in #875
- fix(orchestrator): keep AgentProcess cancellation owned until work exits by @shaun0927 in #880
- feat(orchestrator): profile YAML schema + loader (#830 PR 1/9) by @Q00 in #881
- feat(mcp): add ouroboros_start_evaluate fire-and-forget handler by @Q00 in #882
- feat(orchestrator): typed evidence schema validator (#830 PR 2/9) by @Q00 in #883
- feat(orchestrator): durable cancel signal for AgentProcess (#518) by @shaun0927 in #845
- feat(orchestrator): external verifier loop (#830 PR 3/9) by @Q00 in #884
- feat(orchestrator): profile-aware decomposition params (#830 PR 4/9) by @Q00 in #885
- feat(orchestrator): PRE/POST phase wrappers (#830 PR 5/9) by @Q00 in #886
- feat(orchestrator): failure taxonomy + recovery policy (#830 PR 6/9) by @Q00 in #887
- feat(orchestrator): adaptive model/tool routing (#830 PR 7/9) by @Q00 in #889
- feat(orchestrator): per-dispatch context budget (#830 PR 8/9) by @Q00 in #890
- test: register integration pytest marker by @Q00 in #896
- fix(auto): bump interview/seed phase timeouts and exempt user_preferences from shell-metachar scan by @Q00 in #894
- refactor(pm): replace hardcoded model strings with config-aware getters by @cohemm in #893
- feat(orchestrator): ProfileBackedStrategy + deprecate code-executor.md (#830 PR 9/9) by @Q00 in #891
- feat(auto): fire-and-forget ouroboros_start_auto + relax user_preferences value types by @Q00 in #895
New Contributors
- @honor2030 made their first contribution in #876
Full Changelog: v0.37.0...v0.38.0
v0.37.0
What's Changed
Features
ooo auto Pipeline
- DomainProfile and VerifiablePredicate contracts (#849, #809 P3 PR 1/6)
- UNSTUCK_LATERAL persona advisor on EVALUATE fail (#829)
- EVALUATE phase verifies run output against seed AC (#825)
- Formalize run-handoff idempotency contract (#843)
- Chain RUN→RALPH automatically with --complete-product (#791)
- user_preference source + deterministic ambiguity floor (#811)
- Top-level pipeline_timeout_seconds deadline (#790)
- Steer interviews toward open ledger gaps (#761)
- Finalize safe-default interview gaps (#763)
- Classify interview questions by intent (#762)
- Expose ledger provenance as ledger_provenance in pipeline result meta (#740)
- CI lint guard for ooo auto product boundary (#753)
Interview & Unstuck
- Debate mode for ooo lateral (#812)
- Raise prompt budget caps for richer answers
- Isolate adapter from plugin MCP servers + hardening RFC
Ralph & Evolution
- Total wall-clock budget max_total_seconds for Ralph (#789)
- Oscillation / no-progress detection in Ralph (#788)
- Pin v0 watchdog cancellation contract (#842)
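As a rough illustration of how a total wall-clock budget and a no-progress check can bound an iterative loop, here is a minimal sketch. It is not Ralph's implementation: run_bounded_loop, stall_limit, and the idea of a step returning a progress fingerprint are hypothetical; only the name max_total_seconds is taken from the notes above. The check here only catches consecutive identical states; detecting A/B/A/B oscillation would need a longer history window.

```python
import time
from typing import Callable, Hashable, Optional, Tuple


def run_bounded_loop(
    step: Callable[[], Optional[Hashable]],
    max_total_seconds: float,
    stall_limit: int = 3,  # hypothetical knob: repeats tolerated before giving up
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, int]:
    """Run step() until it returns None, the budget expires, or progress stalls.

    step() returns a fingerprint of the current state (any hashable value),
    or None when the work is finished. Returns (reason, iterations_run).
    """
    deadline = clock() + max_total_seconds
    last_state: Optional[Hashable] = None
    unchanged = 0
    iterations = 0
    while True:
        # The budget is checked before each iteration, so a step that is
        # already running is not interrupted by this sketch.
        if clock() >= deadline:
            return ("budget_exhausted", iterations)
        state = step()
        iterations += 1
        if state is None:
            return ("done", iterations)
        if state == last_state:
            unchanged += 1
        else:
            unchanged = 0
            last_state = state
        if unchanged >= stall_limit:
            return ("no_progress", iterations)
```

Injecting the clock keeps the loop testable without sleeping; a per-iteration timeout would be a separate mechanism layered on top of step().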
Plugin & CLI
- TrustStore concurrency primitives + LockEntry subject helper + manifest tuple ordering (#807)
- UserLevel program registry: cross-axis collisions + command-name index (#747)
- argv_summary in firewall audit events (observation-only) (#805)
- ooo plugin {discover,inspect,list} read-only commands (#750)
- Warn on stderr when ooo plugin list row has unreadable trust.json (#833)
- Surface trust_read_error in ooo plugin list --json (#832)
- Route ooo publish / ooo resume-session keywords via hook (#742)
MCP
- Diagnostic event for interview response shape (#837)
- Structured envelope for interview length-guard branch (#834)
Bug Fixes
Interview
- Close parent-context leaks in sub-CLI envelope (#869)
- Close Restate gate bypass for short PATH 2 answers (#827)
- Scope strict MCP isolation
- Reserve CLI adapter prompt headroom
- Keep interview prompt budget below CLI failure ceiling
- Budget interview prompts with serialized CLI framing
Security & Plugin Firewall
- Contain auto Seed persistence paths (#865)
- Prevent argv secret leaks across firewall outputs (#857)
- Fail-closed on tampered plugin home + refuse legacy trust under subject contract (#808)
- Escape all C0/DEL chars in lockfile TOML basic strings (#795)
- Deep-copy audit event in unwrap_plugin_event (#796)
- Defensive name validation + tighten source schema (#746)
- Degrade row on corrupt trust.json instead of aborting list (#798)
- Tighten _word_boundary_match to reject hyphen as token edge (#800)
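For context on the lockfile escaping fix (#795): TOML basic strings may not contain raw control characters other than tab, so a writer has to escape them. The sketch below shows one way to do that; toml_basic_string is a hypothetical helper, not the project's lockfile writer, and it chooses to escape tab as well for simplicity.

```python
# Short escapes defined by TOML for basic strings.
_SHORT_ESCAPES = {
    "\b": "\\b",
    "\t": "\\t",
    "\n": "\\n",
    "\f": "\\f",
    "\r": "\\r",
    '"': '\\"',
    "\\": "\\\\",
}


def toml_basic_string(value: str) -> str:
    """Return value as a quoted TOML basic string with C0/DEL chars escaped."""
    out = []
    for ch in value:
        if ch in _SHORT_ESCAPES:
            out.append(_SHORT_ESCAPES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Remaining C0 controls and DEL use the \uXXXX form.
            out.append(f"\\u{ord(ch):04X}")
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'
```

Escaping every control character, rather than only the common ones like newline, is what keeps a hostile name or path from producing a lockfile that parsers reject or read differently.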
Auto / Ralph
- Bound retry on run_handoff_status="unknown" with idempotency-key (#787)
- SeedRepairer.converge() adds max_iterations + outer wait_for (#785)
- NFKC-normalize unsafe-context input before regex bank (#794)
- Exact-match the canonical key in safe-default rollback (#804)
- Per-iteration wall-clock timeout for Ralph (#784)
- Close tool envelope on max_turns=1 to stop turn starvation (#770)
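The NFKC fix in the list above (#794) addresses a general problem: compatibility characters such as fullwidth letters look like ASCII but do not match ASCII regexes. A minimal sketch of the technique follows; the pattern and the is_flagged helper are hypothetical stand-ins, not the project's regex bank.

```python
import re
import unicodedata

# Hypothetical single-entry "regex bank" for illustration only.
UNSAFE_PATTERN = re.compile(r"\brm\s+-rf\b")


def is_flagged(text: str) -> bool:
    """Normalize to NFKC first so lookalike characters fold to ASCII."""
    normalized = unicodedata.normalize("NFKC", text)
    return UNSAFE_PATTERN.search(normalized) is not None


# Fullwidth "rm" and "rf" (U+FF52, U+FF4D, U+FF46) slip past the raw regex
# but are caught once the input has been normalized.
disguised = "\uff52\uff4d -\uff52\uff46 /"
```

Normalizing once at the entry point, before any pattern runs, means individual patterns do not each need to anticipate Unicode variants.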
Providers & Misc
- Isolate subprocess from host plugin env (#754)
- Skip symlinks in check-auto-boundary scan (#797)
- Keep Copilot completions from leaking tool events (#860)
Refactoring
- Extract material-progress taxonomy module (#839)
- max_turns=1 envelope sweep across remaining MCP sites (#786)
Testing
- Pin three-surface AgentProcess acceptance contract (#848)
- Pin watchdog resume/replay contract (#840)
- Widen test_ralph_handler_returns_job_id_and_completes_loop deadline to 60s
- End-to-end contract proof with github-pr-ops fixture (#752)
- Define interview convergence contract (#760)
- Guard interview prompt cap against CLI ceiling
Documentation
- Forward complete_product/pipeline_timeout in skills/auto SKILL.md (#820)
- Unify interview Step 9 payload schema + define Add-context retry (#828)
- Add Refine and Restate gates to interview SKILL.md (+ multiple follow-up refinements)
- Mark interview-hardening RFC as Accepted
- Broaden uv install guidance for policy-restricted environments (#768)
- Update version numbers in welcome skill (#810)
Full Changelog: v0.36.0...v0.37.0
What's Changed
- fix(providers,interview): isolate subprocess from host plugin env by @ASak1104 in #754
- feat(auto): CI lint guard for ooo auto product boundary by @shaun0927 in #753
- test(auto): define interview convergence contract by @shaun0927 in #760
- feat(auto): steer interviews toward open ledger gaps by @shaun0927 in #761
- feat(cli): ooo plugin {discover,inspect,list} (read-only) by @shaun0927 in #750
- feat(hook): route 'ooo publish' and 'ooo resume-session' keywords by @shaun0927 in #742
- feat(auto): expose ledger provenance in pipeline result meta as ledger_provenance (#640) by @shaun0927 in #740
- feat(plugin): lockfile + per-user trust store by @shaun0927 in #746
- feat(auto): classify interview questions by intent by @shaun0927 in #762
- feat(auto): finalize safe default interview gaps by @shaun0927 in #763
- feat(plugin): UserLevel program registry by @shaun0927 in #747
- docs(install): broaden uv install guidance for policy-restricted environments by @shaun0927 in #768
- fix(mcp,interview): close tool envelope on max_turns=1 to stop turn starvation by @shaun0927 in #770
- feat(plugin): add argv_summary to firewall audit events (observation-only) by @Q00 in #805
- fix(auto): exact-match the canonical key in safe-default rollback by @Q00 in #804
- fix(hook): tighten _word_boundary_match to reject hyphen as token edge by @Q00 in #800
- fix(cli/plugin): degrade row on corrupt trust.json instead of aborting list by @Q00 in #798
- fix(plugin): deep-copy audit event in unwrap_plugin_event by @Q00 in #796
- fix(auto): NFKC-normalize unsafe-context input before regex bank by @Q00 in #794
- fix(ralph): per-iteration wall-clock timeout by @shaun0927 in #784
- fix(auto): SeedRepairer.converge() add max_iterations + outer wait_for by @shaun0927 in #785
- refactor(mcp): max_turns=1 envelope sweep across remaining sites by @shaun0927 in #786
- fix(auto): bound retry on run_handoff_status="unknown" with idempotency-key by @shaun0927 in #787
- feat(ralph): oscillation / no-progress detection by @shaun0927 in #788
- feat(ralph): total wall-clock budget max_total_seconds by @shaun0927 in #789
- fix(plugin): escape all C0/DEL chars in lockfile TOML basic strings by @Q00 in #795
- fix(scripts): skip symlinks in check-auto-boundary scan by @Q00 in #797
- feat(auto): top-level pipeline_timeout_seconds deadline by @shaun0927 in #790
- test(plugin): end-to-end contract proof with github-pr-ops fixture by @shaun0927 in #752
- Fix stale welcomeVersion hardcoded in welcome skill by @adam0white in #810
- feat(plugin): TrustStore concurrency primitives + LockEntry subject helper + manifest tuple ordering by @shaun0927 in #807
- fix(plugin/firewall): fail-closed on tampered plugin home + refuse legacy trust under subject contract by @shaun0927 in #808
- feat(auto): user_preference source + deterministic ambiguity floor (#809 P1) by @Q00 in #811
- feat(auto): chain RUN→RALPH automatically with --complete-product by @shaun0927 in #791
- feat(interview): isolate adapter from plugin MCP servers + RFC by @Q00 in #822
- feat(interview): raise prompt budget caps for richer answers by @Q00 in #823
- docs(interview): add Refine and Restate gates to SKILL.md by @Q00 in #824
- docs(rfc): mark interview-hardening RFC as Accepted by @Q00 in #826
- feat(auto): EVALUATE phase verifies run output against seed AC (#809 P2.1) by @Q00 in #825
- feat(auto): UNSTUCK_LATERAL persona advisor on EVALUATE fail (#809 P2.2) by @Q00 in #829
- docs(interview): unify Step 9 payload schema and define Add context retry by @shaun0927 in #828
- feat(cli/plugin): surface trust_read_error in ooo plugin list --json (#806) by @shaun0927 in #832
- feat(mcp): structured envelope for interview length-guard branch (#831) by @shaun0927 in #834
- docs(skills/auto): forward complete_product/pipeline_timeout in SKILL.md by @shaun0927 in #820
- refactor(evolution): extract material-progress taxonomy module (#578) by @shaun0927 in #839
- test(evolution): pin watchdog resume/replay contract (#578) by @shaun0927 in https://github.com/Q...

