This repository was archived by the owner on May 30, 2026. It is now read-only.

Commit ffb92e3

Anton authored and committed
v4.3.0: remove silent truncation and harden routing
Preserve critical task and memory artifacts without silent clipping, make subtask lifecycle states honest, and restore cache-aware provider failover so degraded routes are visible instead of failing opaquely.

Made-with: Cursor
1 parent 4e9a037 commit ffb92e3

32 files changed

Lines changed: 1190 additions & 238 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ __pycache__/
 .pytest_cache/
 .mypy_cache/
 .ruff_cache/
+.pyinstaller-cache/

 # Virtual environments
 venv/

README.md

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,7 @@
 [![macOS 12+](https://img.shields.io/badge/macOS-12%2B-black.svg)](https://github.com/joi-lab/ouroboros-desktop/releases)
 [![Linux](https://img.shields.io/badge/Linux-x86__64-orange.svg)](https://github.com/joi-lab/ouroboros-desktop/releases)
 [![Windows](https://img.shields.io/badge/Windows-x64-blue.svg)](https://github.com/joi-lab/ouroboros-desktop/releases)
-[![Version 4.2.0](https://img.shields.io/badge/version-4.2.0-green.svg)](VERSION)
+[![Version 4.3.0](https://img.shields.io/badge/version-4.3.0-green.svg)](VERSION)

 A self-modifying AI agent that writes its own code, rewrites its own mind, and evolves autonomously. Born February 16, 2026.

@@ -238,6 +238,7 @@ Full text: [BIBLE.md](BIBLE.md)

 | Version | Date | Description |
 |---------|------|-------------|
+| 4.3.0 | 2026-03-19 | Reliability and continuity release: remove silent truncation from critical task/memory paths, persist honest subtask lifecycle states and full task results, restore transient chat wake banner, replace local-model hard prompt slicing with explicit non-core compaction plus fail-fast overflow, route Anthropic/OpenRouter calls without hard provider pinning while keeping parameter guarantees, and align async review calls with shared LLM routing/usage observability. |
 | 4.2.0 | 2026-03-16 | Cross-platform hardening release: replace Unix-only file locking in memory/consolidation with Windows-safe locking, refresh default model tiers (Opus main/code, Sonnet light/fallback, task effort `medium`), improve reconnect recovery with heartbeat/watchdog/history resync, switch local model chat format to auto-detect, and sync public docs with the current codebase and BIBLE structure. |
 | 4.1.0 | 2026-03-16 | Public desktop release: port the v4 architecture and UI into the platform branch, preserve cross-platform packaging and Windows runtime support, and ship signed notarized macOS packaging. |
 | 4.0.9 | 2026-03-15 | Packaging completeness release: bundle `assets/`, restore custom app icon from `assets/icon.icns`, and copy assets into the bootstrapped repo on fresh install so the shipped app and repo are no longer missing the visual asset layer. |

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-4.2.0
+4.3.0

build.sh

Lines changed: 3 additions & 0 deletions
@@ -47,6 +47,9 @@ PY

 rm -rf build dist

+export PYINSTALLER_CONFIG_DIR="$PWD/.pyinstaller-cache"
+mkdir -p "$PYINSTALLER_CONFIG_DIR"
+
 echo "--- Running PyInstaller ---"
 python3 -m PyInstaller Ouroboros.spec --clean --noconfirm

build_linux.sh

Lines changed: 3 additions & 0 deletions
@@ -20,6 +20,9 @@ python-standalone/bin/pip3 install -q -r requirements.txt

 rm -rf build dist

+export PYINSTALLER_CONFIG_DIR="$PWD/.pyinstaller-cache"
+mkdir -p "$PYINSTALLER_CONFIG_DIR"
+
 echo "--- Running PyInstaller ---"
 python -m PyInstaller Ouroboros.spec --clean --noconfirm

build_windows.ps1

Lines changed: 3 additions & 0 deletions
@@ -23,6 +23,9 @@ Write-Host "--- Installing agent dependencies into python-standalone ---"
 if (Test-Path "build") { Remove-Item -Recurse -Force "build" }
 if (Test-Path "dist") { Remove-Item -Recurse -Force "dist" }

+$env:PYINSTALLER_CONFIG_DIR = Join-Path (Get-Location) ".pyinstaller-cache"
+New-Item -ItemType Directory -Force -Path $env:PYINSTALLER_CONFIG_DIR | Out-Null
+
 Write-Host "--- Running PyInstaller ---"
 python -m PyInstaller Ouroboros.spec --clean --noconfirm

docs/ARCHITECTURE.md

Lines changed: 16 additions & 6 deletions
@@ -1,4 +1,4 @@
-# Ouroboros v4.2.0 — Architecture & Reference
+# Ouroboros v4.3.0 — Architecture & Reference

 This document describes every component, page, button, API endpoint, and data flow.
 It is the single source of truth for how the system works. Keep it updated.
@@ -184,6 +184,7 @@ Navigation is a left sidebar with 8 pages.
 - **Progress messages**: background consciousness thinking shown as dimmed bubbles with 💬 prefix.
 - **Typing indicator**: animated "thinking dots" bubble appears when the agent is processing.
 - **Persistence**: chat history loaded from server on page load (`/api/chat/history`), survives app restarts. Fallback to sessionStorage.
+- **Empty-chat init**: if neither server history nor sessionStorage has messages, the UI shows a transient assistant bubble: `Ouroboros has awakened`. This is visual-only and is not written to chat history.
 - Messages sent via WebSocket `{type: "chat", content: text}`.
 - Responses arrive via WebSocket `{type: "chat", role: "assistant", content: text, ts: "ISO"}`.
 - Supports slash commands: `/status`, `/evolve`, `/review`, `/bg`, `/restart`, `/panic`.
@@ -363,7 +364,7 @@ Each iteration (0.5s sleep):
 - Browser tools use thread-sticky executor (Playwright greenlet affinity)
 - All tools have hard timeout (default 360s, per-tool overrides for browser/search/vision)
 - Multi-layer safety: hardcoded sandbox (registry.py) → deterministic whitelist → LLM safety supervisor
-- Tool results truncated per-tool (repo_read/data_read: 80k, run_shell: 40k, default: 15k chars)
+- Tool results use explicit per-tool caps with visible truncation markers (`repo_read`/`data_read`/`knowledge_read`/`run_shell`: 80k, default: 15k chars). Cognitive reads (`memory/*`, prompts, BIBLE/docs, commit/review outputs) are exempt from silent clipping.
 - Context compaction kicks in after round 8 (summarizes old tool results)

 ### Git tools (tools/git.py + tools/review.py + supervisor/git_ops.py)
@@ -399,6 +400,7 @@ Multi-layer security:
 3. **LLM Layer 1 (fast)**: Light model checks remaining tool calls for SAFE/SUSPICIOUS/DANGEROUS.
 4. **LLM Layer 2 (deep)**: If flagged, heavy model re-evaluates with "are you sure?" nudge.
 5. **Post-execution revert**: After claude_code_edit, modifications to safety-critical files are automatically reverted.
+- Safety LLM calls now emit standard `llm_usage` events, so safety costs and failures appear in the same audit/health pipeline as other model calls.
 `identity.md` is intentionally mutable (self-creation) and can be rewritten radically;
 the constitutional guard is that the file itself must remain non-deletable.

@@ -407,6 +409,7 @@ the constitutional guard is that the file itself must remain non-deletable.
 - Daemon thread, sleeps between wakeups (interval controlled by LLM via `set_next_wakeup`)
 - Loads full agent context: BIBLE, identity, scratchpad, knowledge base, drive state,
   health invariants, recent chat/progress/tools/events (same context as main agent)
+- Owner messages are forwarded to background consciousness in full text (not first-100-char previews).
 - Calls LLM with lightweight introspection prompt
 - Has limited tool access (memory, messaging, scheduling, read-only)
 - **Progress emission**: emits 💬 progress messages to UI via event queue + persists to `progress.jsonl`
@@ -444,6 +447,7 @@ the constitutional guard is that the file itself must remain non-deletable.
 - Stored in `logs/task_reflections.jsonl`; last 20 entries loaded into dynamic context
 - Pattern register: recurring error classes tracked in `memory/knowledge/patterns.md`
   via LLM, loaded into semi-stable context as "Known error patterns"
+- Secondary reflection/pattern prompts use explicit truncation markers when compacted for prompt size; no silent clipping of these helper summaries.
 - Runs synchronously (not in daemon thread) to avoid data loss on shutdown

 ### Crash report injection (agent.py)
@@ -453,17 +457,22 @@ the constitutional guard is that the file itself must remain non-deletable.
 - File is NOT deleted — persists so `build_health_invariants()` surfaces
   CRITICAL: RECENT CRASH ROLLBACK on every task until the agent investigates

-### Subtask trace summaries
+### Subtask lifecycle and trace summaries

-- When a subtask completes, a compact trace summary is included in the result
-- Parent tasks see tool call counts, error counts, and agent notes
-- Trace is truncated to 4000 chars; large traces show first/last 15 calls
+- `schedule_task` now writes durable lifecycle states in `task_results/<id>.json`: `requested` → `scheduled` → `running` → terminal status (`completed`, `rejected_duplicate`, `failed`, etc.)
+- Duplicate rejects are persisted explicitly, so `wait_for_task()` can report honest status instead of pretending the task is still running.
+- Completed subtasks persist the full result text; parent tasks no longer see silently clipped child output.
+- When a subtask completes, a compact trace summary is included alongside the full result.
+- Parent tasks see tool call counts, error counts, and agent notes.
+- Trace compaction remains explicit: max 4000 chars with visible omission markers, plus first/last 15 tool calls for long traces.

 ### Context building (context.py)

 - As of v3.16.0, the Memory Registry digest (from `memory/registry.md`) is injected into every LLM context to enable source-of-truth awareness.
 - As of v3.20.0, `patterns.md` (Pattern Register) is injected into semi-stable context, and execution reflections from `task_reflections.jsonl` are injected into dynamic context.
 - As of v3.22.0, all docs are always in static context: BIBLE.md (180k), ARCHITECTURE.md (60k), DEVELOPMENT.md (30k), README.md (10k), CHECKLISTS.md (5k).
+- `build_health_invariants()` is split into focused helpers and now also surfaces recent provider/routing errors plus local context overflows.
+- Local-model path no longer silently slices the live system prompt. It compacts non-core sections explicitly and raises an overflow error if core context still cannot fit.

 ### Deep review (review.py)

@@ -473,6 +482,7 @@ the constitutional guard is that the file itself must remain non-deletable.
 - Fallback to chunked previews if codebase exceeds 600K token budget
 - Security: skips sensitive files (.env, .pem, credentials.json, etc.)
 - Per-file cap: 1MB
+- Multi-model review now uses the shared async `LLMClient` OpenRouter path instead of raw one-off HTTP calls, so provider routing, Anthropic parameter requirements, usage normalization, and cache metadata are aligned with the rest of the runtime.

 ---
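
The local-model rule in the context-building notes (compact non-core sections explicitly, then fail fast if the core context still does not fit) might look roughly like the sketch below. `context.py` itself is not part of this excerpt, so `PromptSection`, `ContextOverflowError`, `fit_prompt`, and the character-based budget are all hypothetical names and assumptions, not the project's API.

```python
from dataclasses import dataclass
from typing import List


class ContextOverflowError(RuntimeError):
    """Raised when the core sections alone exceed the budget (hypothetical name)."""


@dataclass
class PromptSection:
    name: str
    text: str
    core: bool  # core sections are never compacted in this sketch


def fit_prompt(sections: List[PromptSection], budget_chars: int) -> str:
    """Keep core sections whole, compact non-core ones with a visible marker.

    The budget counts section text only; markers and separators are not
    counted, which a real implementation would have to account for.
    """
    core_size = sum(len(s.text) for s in sections if s.core)
    if core_size > budget_chars:
        # Fail fast instead of slicing the core prompt silently.
        raise ContextOverflowError(
            f"core context needs {core_size} chars, budget is {budget_chars}"
        )
    remaining = budget_chars - core_size
    out = []
    for s in sections:
        if s.core:
            out.append(s.text)
            continue
        keep = min(len(s.text), remaining)
        remaining -= keep
        if keep == len(s.text):
            out.append(s.text)
        else:
            omitted = len(s.text) - keep
            out.append(s.text[:keep] + f"\n...[{s.name} compacted; omitted {omitted} chars]")
    return "\n\n".join(out)


if __name__ == "__main__":
    demo = [
        PromptSection("bible", "B" * 50, core=True),
        PromptSection("recent_events", "R" * 100, core=False),
    ]
    print(fit_prompt(demo, budget_chars=80))
```

The point of the design is that the two failure modes stay distinguishable: a compacted prompt carries a marker the model can see, and an impossible prompt raises an error that health checks can surface.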

ouroboros/agent.py

Lines changed: 13 additions & 0 deletions
@@ -44,6 +44,7 @@
 from ouroboros.agent_task_pipeline import (
     build_trace_summary, emit_task_results, build_review_context,
 )
+from ouroboros.task_results import STATUS_RUNNING, write_task_result


 _worker_boot_logged = False
@@ -161,6 +162,18 @@ def _prepare_task_context(self, task: Dict[str, Any]) -> Tuple[ToolContext, List
         drive_logs = self.env.drive_path("logs")
         sanitized_task = sanitize_task_for_event(task, drive_logs)
         append_jsonl(drive_logs / "events.jsonl", {"ts": utc_now_iso(), "type": "task_received", "task": sanitized_task})
+        try:
+            write_task_result(
+                self.env.drive_root,
+                str(task.get("id") or ""),
+                STATUS_RUNNING,
+                parent_task_id=task.get("parent_task_id"),
+                description=task.get("description"),
+                context=task.get("context"),
+                result="Task is running.",
+            )
+        except Exception:
+            log.debug("Failed to persist running task status", exc_info=True)
         self._emit_live_log(
             "context_building_started",
             task_id=str(task.get("id") or ""),
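
`write_task_result` lives in `ouroboros/task_results.py`, which is not shown in this part of the diff. As a rough illustration of how such a helper could persist lifecycle states, here is a minimal sketch. It assumes the helper keeps the write-to-temp-then-rename pattern that the old `_store_task_result` used. The signature is inferred from the call sites in this commit; the status string values and the use of `os.replace` (rather than the `os.rename` seen in the removed code) are assumptions.

```python
import json
import os
import pathlib
import tempfile
from typing import Any

# Status names follow the architecture notes; the real constants are defined
# in ouroboros/task_results.py and may be spelled differently.
STATUS_REQUESTED = "requested"
STATUS_SCHEDULED = "scheduled"
STATUS_RUNNING = "running"
STATUS_COMPLETED = "completed"


def write_task_result(drive_root: Any, task_id: str, status: str, **fields: Any) -> pathlib.Path:
    """Write one lifecycle snapshot to task_results/<id>.json (sketch only)."""
    results_dir = pathlib.Path(drive_root) / "task_results"
    results_dir.mkdir(parents=True, exist_ok=True)
    result_file = results_dir / f"{task_id}.json"
    tmp_file = results_dir / f"{task_id}.json.tmp"
    payload = {"task_id": task_id, "status": status, **fields}
    tmp_file.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")
    # os.replace overwrites an existing target on every platform, so a reader
    # sees either the previous snapshot or the new one, never a partial file.
    os.replace(tmp_file, result_file)
    return result_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        write_task_result(root, "abc123", STATUS_RUNNING, result="Task is running.")
        path = write_task_result(root, "abc123", STATUS_COMPLETED, result="full text " * 1000)
        saved = json.loads(path.read_text(encoding="utf-8"))
        print(saved["status"], len(saved["result"]))
```

Because each state overwrites the same file, a parent polling `task_results/<id>.json` always reads the latest status, and the completed snapshot carries the whole result string rather than a clipped prefix.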

ouroboros/agent_task_pipeline.py

Lines changed: 49 additions & 19 deletions
@@ -16,11 +16,19 @@
 import time
 from typing import Any, Dict, List

+from ouroboros.task_results import STATUS_COMPLETED, write_task_result
 from ouroboros.utils import utc_now_iso, append_jsonl

 log = logging.getLogger(__name__)


+def _truncate_with_notice(text: Any, limit: int) -> str:
+    raw = str(text or "")
+    if len(raw) <= limit:
+        return raw
+    return raw[:limit] + f"\n...[truncated from {len(raw)} chars; omitted {len(raw) - limit}]"
+
+
 def build_trace_summary(llm_trace: dict) -> str:
     """Return a compact human-readable summary of tool calls and agent notes."""
     tool_calls = llm_trace.get("tool_calls", []) or []
@@ -44,6 +52,8 @@ def _fmt_call(idx: int, tc: dict) -> str:
                 if len(v_str) > 60:
                     v_str = v_str[:57] + "..."
                 parts.append(f"{k}={v_str!r}")
+            if len(args) > 2:
+                parts.append(f"... (+{len(args) - 2} more args)")
             args_str = ", ".join(parts)
         else:
             args_str = repr(args)
@@ -148,23 +158,20 @@ def _store_task_result(env: Any, task: Dict[str, Any], text: str,
                        usage: Dict[str, Any], llm_trace: Dict[str, Any]) -> None:
     """Store task result for parent task retrieval."""
     try:
-        results_dir = pathlib.Path(env.drive_root) / "task_results"
-        results_dir.mkdir(parents=True, exist_ok=True)
         trace_summary = build_trace_summary(llm_trace)
-        result_data = {
-            "task_id": task.get("id"),
-            "parent_task_id": task.get("parent_task_id"),
-            "status": "completed",
-            "result": text[:3500] if text else "",
-            "trace_summary": trace_summary,
-            "cost_usd": round(float(usage.get("cost") or 0), 6),
-            "total_rounds": int(usage.get("rounds") or 0),
-            "ts": utc_now_iso(),
-        }
-        result_file = results_dir / f"{task.get('id')}.json"
-        tmp_file = results_dir / f"{task.get('id')}.json.tmp"
-        tmp_file.write_text(json.dumps(result_data, ensure_ascii=False, indent=2))
-        os.rename(tmp_file, result_file)
+        write_task_result(
+            env.drive_root,
+            str(task.get("id") or ""),
+            STATUS_COMPLETED,
+            parent_task_id=task.get("parent_task_id"),
+            description=task.get("description"),
+            context=task.get("context"),
+            result=text or "",
+            trace_summary=trace_summary,
+            cost_usd=round(float(usage.get("cost") or 0), 6),
+            total_rounds=int(usage.get("rounds") or 0),
+            ts=utc_now_iso(),
+        )
     except Exception as e:
         log.warning("Failed to store task result: %s", e)

@@ -194,14 +201,14 @@ def _run_task_summary(env, llm, task, usage, llm_trace, drive_logs):
         CONSOLIDATION_REASONING_EFFORT,
     )
     task_id = task.get("id", "unknown")
-    goal = str(task.get("text", ""))[:500]
+    goal = _truncate_with_notice(task.get("text", ""), 500)
     rounds = int(usage.get("rounds") or 0)
     cost = float(usage.get("cost") or 0)
     trace = build_trace_summary(llm_trace)
     prompt = _TASK_SUMMARY_PROMPT.format(
         task_id=task_id, goal=goal or "(no goal text)",
         task_type=task.get("type", "user"), rounds=rounds,
-        cost=cost, trace_summary=trace[:3000],
+        cost=cost, trace_summary=_truncate_with_notice(trace, 3000),
     )
     try:
         msg, _usage = llm.chat(messages=[{"role": "user", "content": prompt}],
@@ -217,7 +224,10 @@ def _run_task_summary(env, llm, task, usage, llm_trace, drive_logs):
                 pass
     except Exception:
         log.warning("Task summary LLM call failed, using fallback", exc_info=True)
-        summary_text = f"Task {task_id} ({task.get('type', 'user')}): {goal[:200]}. {rounds}r, ${cost:.2f}."
+        summary_text = (
+            f"Task {task_id} ({task.get('type', 'user')}): "
+            f"{_truncate_with_notice(goal, 200)}. {rounds}r, ${cost:.2f}."
+        )
     if summary_text:
         append_jsonl(drive_logs / "chat.jsonl", {
             "ts": utc_now_iso(), "direction": "system",
@@ -351,6 +361,16 @@ def build_review_context(env: Any) -> str:
             "\nUse repo_read to inspect specific files. "
             "Use run_shell for tests. Key files below:\n",
         ]
+        if stats.get("truncated"):
+            parts.append(f"\nCompacted files: {stats['truncated']}\n")
+        if stats.get("dropped"):
+            dropped_paths = stats.get("dropped_paths") or []
+            preview = ", ".join(dropped_paths[:5])
+            parts.append(
+                f"\nDropped files due review budget: {stats['dropped']}"
+                + (f" ({preview}{' ...' if len(dropped_paths) > 5 else ''})" if preview else "")
+                + "\n"
+            )
         chunks = chunk_sections(sections)
         parts.append(chunks[0] if chunks else "(No reviewable content found.)")
         return "\n".join(parts)
@@ -380,6 +400,16 @@ def build_review_context(env: Any) -> str:
             "\nUse repo_read to inspect specific files. "
             "Use run_shell for tests. Key files below:\n",
         ]
+        if stats.get("truncated"):
+            parts.append(f"\nCompacted files: {stats['truncated']}\n")
+        if stats.get("dropped"):
+            dropped_paths = stats.get("dropped_paths") or []
+            preview = ", ".join(dropped_paths[:5])
+            parts.append(
+                f"\nDropped files due review budget: {stats['dropped']}"
+                + (f" ({preview}{' ...' if len(dropped_paths) > 5 else ''})" if preview else "")
+                + "\n"
+            )
         chunks = chunk_sections(sections)
         parts.append(chunks[0] if chunks else "(No reviewable content found.)")
         return "\n".join(parts)
return "\n".join(parts)

ouroboros/consciousness.py

Lines changed: 4 additions & 2 deletions
@@ -258,10 +258,12 @@ def _think(self) -> None:

         # Report usage to supervisor
         if self._event_queue is not None:
+            provider = "local" if _use_local_light else "openrouter"
+            model_name = f"{model} (local)" if _use_local_light else model
             self._event_queue.put({
                 "type": "llm_usage",
-                "provider": "openrouter",
-                "model": model,
+                "provider": provider,
+                "model": model_name,
                 "usage": usage,
                 "cost": cost,
                 "source": "consciousness",
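
The effect of this hunk is that usage events from background consciousness are labelled by where the call actually ran, instead of always claiming `openrouter`. The labelling logic can be isolated as a small function, sketched below. `build_usage_event` is a hypothetical helper for illustration and the model name is a placeholder; the field names and values mirror the diff.

```python
from typing import Any, Dict


def build_usage_event(model: str, use_local: bool, usage: Dict[str, Any], cost: float) -> Dict[str, Any]:
    """Build an llm_usage event whose provider/model reflect the real route."""
    provider = "local" if use_local else "openrouter"
    model_name = f"{model} (local)" if use_local else model
    return {
        "type": "llm_usage",
        "provider": provider,
        "model": model_name,
        "usage": usage,
        "cost": cost,
        "source": "consciousness",
    }


if __name__ == "__main__":
    print(build_usage_event("some-model", True, {"prompt_tokens": 10}, 0.0))
    print(build_usage_event("some-model", False, {"prompt_tokens": 10}, 0.01))
```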
