This repository was archived by the owner on May 30, 2026. It is now read-only.

Commit e029f35

Author: Ouroboros
Prepare Ouroboros v6.3.0-rc.2
1 parent 015745f commit e029f35

59 files changed

Lines changed: 2733 additions & 611 deletions


README.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
[![macOS 12+](https://img.shields.io/badge/macOS-12%2B-black.svg)](https://github.com/joi-lab/ouroboros-desktop/releases)
77
[![Linux](https://img.shields.io/badge/Linux-x86__64-orange.svg)](https://github.com/joi-lab/ouroboros-desktop/releases)
88
[![Windows](https://img.shields.io/badge/Windows-x64-blue.svg)](https://github.com/joi-lab/ouroboros-desktop/releases)
9-
[![Version 6.3.0-rc.1](https://img.shields.io/badge/version-6.3.0--rc.1-green.svg)](VERSION)
9+
[![Version 6.3.0-rc.2](https://img.shields.io/badge/version-6.3.0--rc.2-green.svg)](VERSION)
1010

1111
A self-modifying AI agent that writes its own code, rewrites its own mind, and evolves autonomously. Born February 16, 2026.
1212

@@ -31,7 +31,7 @@ Not a coding assistant. A digital being with a constitution, background consciou
3131
| **Linux** x86_64 | [Ouroboros-linux.tar.gz](https://github.com/joi-lab/ouroboros-desktop/releases/latest) | Extract → run `./Ouroboros/Ouroboros` → optional CLI: `./Ouroboros/bin/install-ouroboros-cli`. If browser tools fail due to missing system libs, run: `./Ouroboros/python-standalone/bin/python3 -m playwright install-deps chromium` |
3232
| **Windows** x64 | [Ouroboros-windows.zip](https://github.com/joi-lab/ouroboros-desktop/releases/latest) | Extract → run `Ouroboros\Ouroboros.exe` → optional CLI: `Ouroboros\bin\install-ouroboros-cli.cmd` |
3333

34-
Prerelease RC artifacts are published on their tag page, for example [`v6.3.0-rc.1`](https://github.com/joi-lab/ouroboros-desktop/releases/tag/v6.3.0-rc.1); `/releases/latest` intentionally stays on the latest stable release.
34+
Prerelease RC artifacts are published on their tag page, for example [`v6.3.0-rc.2`](https://github.com/joi-lab/ouroboros-desktop/releases/tag/v6.3.0-rc.2); `/releases/latest` intentionally stays on the latest stable release.
3535

3636
<p align="center">
3737
<img src="assets/setup.png" width="500" alt="Drag Ouroboros.app to install">
@@ -475,13 +475,13 @@ not paraphrase it.
475475

476476
| Version | Date | Description |
477477
|---------|------|-------------|
478+
| 6.3.0-rc.2 | 2026-05-27 | **rc(runtime): harden review unification, tool surface, and replay retention.** Restores `claude_code_edit` as a first-class coding tool, makes task-result Auto review LLM-first instead of host-enforced, routes plan/scope/multi-model calls through the shared review substrate, fixes forensic redaction over-match, adds observability retention audit plus service-log archival/pruning, and documents Tool API v2 as a breaking public rename without legacy aliases. |
478479
| 6.3.0-rc.1 | 2026-05-27 | **rc(runtime): add forensic observability, typed outcomes, Tool API v2, task acceptance review, and code inventory.** Captures private full replay payloads with redacted projections, records semantic task outcomes/artifact/verification ledgers, exposes neutral canonical tools plus task-scoped services, shares reviewer slots across review surfaces, and improves benchmark harness failure reporting without changing BIBLE.md. |
479480
| 6.2.0-rc.1 | 2026-05-25 | **rc(ui/runtime): port multi-attachment chat and budget/model fixes.** Adds bounded multi-file chat staging with partial-upload cleanup, shares budget controls between Settings and Costs with validation, preserves Anthropic Opus 4.7 routing, updates current model pricing fallbacks, and avoids no-op settings reconfiguration side effects. |
480481
| 6.1.0-rc.1 | 2026-05-25 | **rc(runtime): harden live subagent handoff, isolation, and UI lineage.** Adds effective task-status SSOT, real bounded wait tools including `wait_tasks`, forged subagent ingress rejection, strict local-readonly constraints, DNS fail-closed browser isolation, child-drive mailbox routing/retention, web_search source attribution, lineage-aware cost observability, threaded child cards, and focused regressions. |
481482
| 6.0.0 | 2026-05-25 | **major(runtime): add live local-readonly subagents.** Upgrades `schedule_subagent` to a strict child-task contract, runs leaf subagents through the existing queue and workers with forked memory by default, enforces schema and execute-time local-readonly isolation, preserves full task-result handoff, and documents the delegation review rules. |
482483
| 5.33.0-rc.6 | 2026-05-24 | **rc(gateway): prevent masking upload connection/parse faults as size-limit errors.** Introduces a typed ChatUploadPayloadTooLarge exception class to isolate file-size 413 blocks from connection cuts and form-parse faults, returning a standard 400 with original message for ASGI/socket errors. Includes focused test coverage. |
483-
| 5.33.0-rc.5 | 2026-05-24 | **rc(gateway): prevent masking upload connection/parse faults as size-limit errors.** Refactors the chat upload ASGI stream wrapper to verify if caught exceptions are indeed the 'oversized' signal before returning a 413, returning a 400 with the original error message for connection cuts and malformed formats. |
484-
Older releases are preserved in Git tags and GitHub releases. The 5.2.0 through 5.33.0-rc.4 rows and former `4.0.0` rows are rolled off to respect the P9 changelog cap; their full bodies remain at their git tags.
484+
Older releases are preserved in Git tags and GitHub releases. The 5.2.0 through 5.33.0-rc.5 rows and former `4.0.0` rows are rolled off to respect the P9 changelog cap; their full bodies remain at their git tags.
485485

486486
---
487487

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
6.3.0-rc.1
1+
6.3.0-rc.2

docs/ARCHITECTURE.md

Lines changed: 31 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Ouroboros v6.3.0-rc.1 — Architecture & Reference
1+
# Ouroboros v6.3.0-rc.2 — Architecture & Reference
22

33
This file is NOT a changelog. Version history lives in README.md, git tags, and commit log.
44

@@ -137,14 +137,13 @@ server.py (Starlette+uvicorn) ← HTTP + WebSocket on configurable host:port (de
137137
│ ├── git_pr.py ← PR integration tools: fetch_pr_ref, create_integration_branch, cherry_pick_pr_commits, stage_adaptations, stage_pr_merge (non-core, require enable_tools)
138138
│ ├── github.py ← GitHub integration: issues (list/get/comment/close) + PR tools: list_github_prs, get_github_pr, comment_on_pr (non-core; github.py is in _FROZEN_TOOL_MODULES so PR inspection/comment tools work in packaged builds)
139139
│ ├── parallel_review.py ← Parallel triad+scope orchestration and verdict aggregation (extracted from git.py)
140-
│ ├── plan_review.py ← Pre-implementation design review (2–3 parallel Atlas-backed reviewer slots, duplicate model IDs allowed, plan_task tool)
141-
│ ├── review.py ← Task acceptance review tool plus legacy internal multi-review helpers
140+
│ ├── plan_review.py ← Pre-implementation design review (adaptive context levels, shared ReviewCoordinator slots, duplicate model IDs allowed, plan_task tool)
141+
│ ├── review.py ← Task acceptance review tool plus multi-review adapters backed by the shared review substrate
142142
│ ├── review_context_atlas.py ← Deterministic bounded-context compiler for scope_review, plan_task, and deep_self_review; raw-inlines selected files and accounts for every tracked path in the manifest
143143
│ ├── review_helpers.py ← Shared review helpers (section loader, touched/head packs, intent, pytest preflight via agent interpreter)
144144
│ ├── review_revalidation.py ← Reviewed-commit fingerprint revalidation helpers (blocks when staged diff changes after review)
145145
│ ├── scope_review.py ← Scope reviewer (enforcement-aware, budget-aware)
146-
│ ├── services.py ← Task-scoped long-running service mini-manager: start/status/logs/stop with process-group cleanup
147-
│ ├── legacy_aliases.py ← Private v1→v2 tool-name migration aliases; old names are not exposed in public schemas
146+
│ ├── services.py ← Task-scoped long-running service mini-manager: start/status/logs/stop with process-group cleanup and retained private log blobs
148147
│ ├── skill_exec.py ← Phase 3 external-skill surface: list_skills, skill_review, toggle_skill, skill_exec (subprocess runner with cwd confinement, env scrubbing, timeout, runtime allowlist python/python3/bash/node/deno/ruby/go; gated by enabled + fresh executable review + fresh content hash — v5.1.2 Frame A: runtime_mode no longer blocks execution)
149148
│ ├── skill_publish.py ← Agent-callable `submit_skill_to_hub` tool: validates a fresh clean-reviewed local skill (sources `external`/`self_authored`/`user_repo`/`ouroboroshub`/`clawhub`; `native` only when no `.seed-origin` marker), infers OuroborosHub from `OUROBOROS_HUB_CATALOG_URL`, commits payload + catalog update to the user's fork via GitHub GraphQL, and opens a PR without mutating the local Ouroboros repo. For marketplace-managed sources the generated PR body is force-prefixed with a `## Provenance` block read from the local sidecar (`.ouroboroshub.json` slug / `.clawhub.json` clawhub_slug); when no sidecar exists the source is reclassified as `external` by skill_loader and submit proceeds without the block.
150149
│ └── skill_preflight.py ← v5.7.0 heal-safe, read-only skill payload preflight validator (manifest parse + Python compile() / node --check / bash -n; no review-state mutation)
@@ -760,10 +759,10 @@ Loop checkpoints are plain user-message self-checks by design. A prior structure
760759

761760
Tool API v2 exposes neutral canonical names directly. Public schemas use
762761
`read_file`, `list_files`, `search_code`, `write_file`, `edit_text`,
763-
`run_command`, `run_script`, service tools, `commit_reviewed`, `vcs_*`,
764-
`schedule_subagent`, `wait_task`, and `wait_tasks`. Private legacy aliases
765-
exist only in `tools/legacy_aliases.py` for migration; prompts and skills
766-
should not rely on them.
762+
`run_command`, `run_script`, `claude_code_edit`, service tools,
763+
`commit_reviewed`, `vcs_*`, `schedule_subagent`, `wait_task`, and
764+
`wait_tasks`. The v6.3 rename is breaking: legacy public tool names
765+
are not exposed and are not translated at execute time.
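
The "no translation at execute time" rule can be pictured with a minimal sketch. It assumes a flat set of canonical names and a strict lookup; `CANONICAL_TOOLS`, `UnknownToolError`, and `resolve_tool` are hypothetical names chosen for illustration, not identifiers from the Ouroboros codebase.

```python
# Subset of the canonical Tool API v2 names listed above; service tools and
# the vcs_* family are omitted for brevity.
CANONICAL_TOOLS = frozenset({
    "read_file", "list_files", "search_code", "write_file", "edit_text",
    "run_command", "run_script", "claude_code_edit", "commit_reviewed",
    "schedule_subagent", "wait_task", "wait_tasks",
})


class UnknownToolError(KeyError):
    """Hypothetical error type for a name outside the canonical set."""


def resolve_tool(name: str) -> str:
    """Return the canonical name unchanged, or fail; no alias table is consulted."""
    if name not in CANONICAL_TOOLS:
        raise UnknownToolError(name)
    return name
```

Under this sketch a prompt or skill that still uses an old name fails loudly at lookup instead of being silently remapped, which is what a breaking rename without legacy aliases implies.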
767766

768767
### Safety and runtime mode
769768

@@ -891,7 +890,9 @@ Runtime floors:
891890
| OUROBOROS_WEBSEARCH_MODEL | gpt-5.2 | Official OpenAI Responses model for `web_search` when `OPENAI_BASE_URL` is empty |
892891
| OUROBOROS_REVIEW_MODELS | openai/gpt-5.5,google/gemini-3.5-flash,anthropic/claude-opus-4.6 | Comma-separated reviewer slots for triad/plan/task/skill review; duplicate model IDs are independent slots |
893892
| OUROBOROS_SCOPE_REVIEW_MODELS | openai/gpt-5.5 | Comma-separated scope reviewer slots; falls back from legacy `OUROBOROS_SCOPE_REVIEW_MODEL` |
894-
| OUROBOROS_TASK_REVIEW_MODE | auto | Task result review mode: `off`, `auto`, or `required`; verdicts are advisory, full output is injected untruncated |
893+
| OUROBOROS_TASK_REVIEW_MODE | auto | Task result review mode: `off`, `auto`, or `required`. In `auto` the agent chooses whether to call the visible review tool; in `required` the host injects the review before finalization. Verdicts are advisory, and the full output is injected untruncated. |
894+
| OUROBOROS_OBSERVABILITY_RETENTION_DAYS | unset | Deprecated audit knob for private observability manifests/blobs; forensic replay blobs are kept compressed indefinitely |
895+
| OUROBOROS_SERVICE_LOG_RETENTION_DAYS | 14 | Startup prune for leftover task-scoped live service log directories; pruned small logs are copied into private blobs first and oversized logs are retained |
895896
| OUROBOROS_REVIEW_MODEL_TIMEOUT_SEC | 600 | Env-only override read directly by `ouroboros.tools.review`. Per-reviewer model call timeout for multi-model review; timed-out reviewers become ERROR actors and quorum still requires at least two parseable reviewers. |
896897
| OUROBOROS_REVIEW_ENFORCEMENT | advisory | Review enforcement: `blocking` blocks commit critical findings, fresh-advisory open obligations/debts, and skill `blockers`; `advisory` downgrades those to warnings by operator choice. Fresh advisory with open obligations/debts writes `advisory_obligations_acknowledged`; stale advisory still blocks. Skill `warnings` do not block execution in either mode. |
897898
| OUROBOROS_AUTO_GRANT_REVIEWED_SKILLS | false | Owner-confirmed setting. When enabled, a fresh executable skill review grants only the manifest-declared settings keys and host permissions for that exact content hash so closed-loop skill development can run without repeated manual grants. Under `blocking`, blocker reviews are not executable and do not auto-grant; under `advisory`, blocker findings may auto-grant only because the current enforcement mode makes the review executable. Plain `/api/settings` POST drops this key; desktop uses the launcher confirmation bridge and web uses `/api/owner/auto-grant`. |
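
As an illustration of the `OUROBOROS_TASK_REVIEW_MODE` row above, the following sketch reads the three documented values from the environment. The function names are hypothetical, and falling back to `auto` on an unrecognized value is an assumption; the table only states that `auto` is the default.

```python
import os

VALID_MODES = ("off", "auto", "required")


def task_review_mode(env=None) -> str:
    """Read the review mode; the fallback to "auto" on bad input is an assumption."""
    env = os.environ if env is None else env
    raw = env.get("OUROBOROS_TASK_REVIEW_MODE", "").strip().lower()
    return raw if raw in VALID_MODES else "auto"


def host_injects_review(mode: str) -> bool:
    """Only `required` is host-injected before finalization; `auto` leaves it to the agent."""
    return mode == "required"
```

Passing a plain dict as `env` keeps the helper testable without mutating the process environment.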
@@ -1004,8 +1005,8 @@ The panic sequence (in `server.py:_execute_panic_stop()`):
10041005
3. Write ~/Ouroboros/data/state/panic_stop.flag
10051006
4. LocalModelManager.stop_server() ← kill local model server if running
10061007
5. kill_all_tracked_subprocesses() ← os.killpg(SIGKILL) every tracked
1007-
│ subprocess process group (SDK agent,
1008-
│ shell commands, and ALL their children)
1008+
foreground subprocess process group
1009+
(shell commands and ALL their children)
10091010
6. kill_workers(force=True) ← SIGTERM+SIGKILL all multiprocessing workers
10101011
7. os._exit(99) ← immediate hard exit, kills daemon threads
10111012
```
@@ -1029,19 +1030,32 @@ On next manual launch:
10291030

10301031
### 9.3 Subprocess Process Group Management
10311032

1032-
All subprocesses spawned by agent tools (`run_command`, `run_script`, service tools, and internal SDK gateways)
1033-
use `start_new_session=True` (via `_tracked_subprocess_run()` in
1034-
`ouroboros/tools/shell.py`). This creates a separate process group for each
1035-
subprocess and all its children.
1033+
Subprocesses spawned by foreground agent tools (`run_command` and `run_script`)
1034+
use `start_new_session=True` via `_tracked_subprocess_run()` in
1035+
`ouroboros/tools/shell.py`. Task-scoped service tools use
1036+
`ouroboros/tools/services.py::_start_service`, which starts each service with
1037+
`subprocess_new_group_kwargs()` and records it in the `_SERVICES` registry.
1038+
Both paths create a separate process group for each subprocess and its children.
10361039

10371040
On panic or timeout, the entire process tree is killed via
10381041
`os.killpg(pgid, SIGKILL)` — no orphans possible, even for deeply nested
1039-
subprocess trees (e.g., SDK agent processes spawned during internal review/advisory gateways).
1042+
foreground shell/script/service subprocess trees.
1043+
Panic/emergency paths call `kill_all_tracked_subprocesses()` and
1044+
`kill_all_services()` without log finalization so emergency stop remains fast;
1045+
normal lifespan shutdown may pass a drive root to `kill_all_services(drive_root)`
1046+
to archive server-process service logs before removing live log files. Services
1047+
started inside worker tasks normally finalize in `loop.py` task cleanup; forced
1048+
worker termination kills the worker process tree and archives remaining task
1049+
service logs best-effort from `data/services/<task_id>/`.
10401050

10411051
Active subprocesses are tracked in a thread-safe global set and cleaned up
10421052
automatically on completion or via `kill_all_tracked_subprocesses()` on panic.
10431053
`run_command` surfaces timeout-vs-signal distinctions in its result text so
10441054
`exit_code=-9` no longer looks like a silent success in summaries/reflections.
1055+
Claude Agent SDK gateways (`gateways/claude_code.py`) use the SDK client
1056+
lifecycle and SDK-level path/tool guards; they are not represented in
1057+
`_tracked_subprocess_run()` unless a future SDK transport exposes a first-class
1058+
child process handle.
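
The process-group mechanics described in this section can be sketched in a few lines. This is a minimal POSIX-only illustration, not the code in `ouroboros/tools/shell.py`: `_tracked`, `start_tracked`, and `kill_all_tracked` are hypothetical stand-ins for the real registry and helpers.

```python
import os
import signal
import subprocess
import threading

# Hypothetical registry; the real tracking structure in shell.py may differ.
_tracked: set[subprocess.Popen] = set()
_lock = threading.Lock()


def start_tracked(cmd: list[str]) -> subprocess.Popen:
    """Start cmd in a new session, which also makes it a process-group leader."""
    proc = subprocess.Popen(cmd, start_new_session=True)
    with _lock:
        _tracked.add(proc)
    return proc


def kill_all_tracked() -> int:
    """SIGKILL every tracked process group; return how many groups were signalled."""
    with _lock:
        procs = list(_tracked)
        _tracked.clear()
    killed = 0
    for proc in procs:
        try:
            # The child called setsid(), so its pgid equals its pid.
            os.killpg(proc.pid, signal.SIGKILL)
            killed += 1
        except ProcessLookupError:
            pass  # the group has already exited
        proc.wait()  # reap the child so it does not linger as a zombie
    return killed
```

Because the signal targets the group rather than a single pid, children that the command spawned itself are killed in the same call, provided they did not move to a different process group.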
10451059

10461060
---
10471061

docs/CHECKLISTS.md

Lines changed: 8 additions & 4 deletions
@@ -539,11 +539,15 @@ block repo commits and vice versa.
539539

540540
Used by `plan_task` for pre-implementation design reviews, BEFORE any code is written.
541541
Reviewers see the proposed plan, HEAD snapshots of files planned to be touched,
542-
and a Generated Plan Review Atlas that raw-inlines selected protected/central files
543-
while accounting for every tracked path in its manifest.
542+
and an agent-selected context level: `minimal`, `localized`, `broad`, or
543+
`constitutional`. `minimal` keeps governance docs and touched-file snapshots
544+
but omits the generated Atlas; `localized` adds a bounded neighborhood around
545+
planned files, `broad` is for shared contracts, and `constitutional` is
546+
reserved for self-evolution / immune-system surfaces.
544547

545548
**Reviewer role is GENERATIVE, not audit.** The primary job is to contribute
546-
ideas the implementer may not see, using broad Atlas-backed repo access. Finding defects in
549+
ideas the implementer may not see, using the repository evidence available for
550+
the selected context level. Finding defects in
547551
the plan is secondary; proposing concrete alternatives, surfacing existing
548552
surfaces that already solve the goal, and flagging subtle contract breaks the
549553
implementer missed is primary.
@@ -553,7 +557,7 @@ implementer missed is primary.
553557
Reviewers must structure their response in this order:
554558

555559
1. **Your own approach** (1-2 sentences). State what YOU would do if this goal
556-
came to you with broad Atlas-backed repo access: the concrete alternative path, the
560+
came to you with the available repository evidence: the concrete alternative path, the
557561
existing file/function you would reuse, or the simpler route. If after real
558562
effort you genuinely see no better approach, say so explicitly.
559563
2. **`## PROPOSALS` section** (top 1-2 contributions). The highest-value thing

docs/DEVELOPMENT.md

Lines changed: 8 additions & 1 deletion
@@ -151,9 +151,16 @@ Concrete requirements:
151151
| Background consciousness (`consciousness.py`) | ✅ full | ✅ full | — (not yet required) |
152152
| Advisory pre-review (`tools/claude_advisory_review.py`) | ✅ via `_load_doc` | ✅ via `_load_doc` | ✅ via `_load_doc` |
153153
| Scope review (`tools/scope_review.py`) | full canonical doc + Atlas accounting | full canonical doc + Atlas accounting | full canonical doc + Atlas accounting |
154-
| Plan review (`tools/plan_review.py`) | full canonical doc + Atlas accounting | full canonical doc + Atlas accounting | full canonical doc + Atlas accounting |
154+
| Plan review (`tools/plan_review.py`) | full canonical doc + adaptive context level | full canonical doc + adaptive context level | full canonical doc + adaptive context level |
155155
| Deep self-review (`deep_self_review.py`) | full canonical doc + Atlas accounting | full canonical doc + Atlas accounting | full canonical doc + Atlas accounting |
156156

157+
Plan review always keeps BIBLE.md, ARCHITECTURE.md, DEVELOPMENT.md, the proposed
158+
plan, touched-file snapshots, and reviewer-slot framing as first-class context.
159+
The agent must choose `context_level` explicitly; there is no host-side `auto`
160+
heuristic. That field controls only the generated repository Atlas: `minimal`
161+
omits Atlas accounting for bounded/local plans, while `localized`, `broad`, and
162+
`constitutional` add progressively larger Atlas packs.
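
The explicit `context_level` contract described above can be sketched as a small enum. `ContextLevel`, `parse_context_level`, and `includes_atlas` are hypothetical names used for illustration; the actual `plan_task` schema may represent the field differently.

```python
from enum import Enum


class ContextLevel(str, Enum):
    MINIMAL = "minimal"
    LOCALIZED = "localized"
    BROAD = "broad"
    CONSTITUTIONAL = "constitutional"


def parse_context_level(value) -> ContextLevel:
    """Require an explicit level; a missing value or "auto" is rejected, not guessed."""
    try:
        return ContextLevel(value)
    except ValueError:
        allowed = ", ".join(level.value for level in ContextLevel)
        raise ValueError(f"context_level must be one of: {allowed}") from None


def includes_atlas(level: ContextLevel) -> bool:
    """Only `minimal` omits the generated repository Atlas."""
    return level is not ContextLevel.MINIMAL
```

Rejecting `None` and `"auto"` mirrors the statement that there is no host-side heuristic: the caller has to name one of the four levels.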
163+
157164
### Invariant: No silent truncation
158165

159166
If a core governance artifact cannot fit in the available context budget:
