### Fixes
- Codex file_write actually lands on disk — threaded chatId through all 6
built-in tool executors (fs_read/fs_write/fs_list/fs_search/shell_execute/
execute_code) via getActiveChatId(). The documented per-chat workspace
isolation (~/agent-workspace/<chatId>/) silently fell through to a shared
default/ fallback whenever the model emitted a relative path.
- agents.ts:executeTool file_write now returns the real data.path from Rust
instead of a fake "File written successfully" string that hid write
failures behind a green ✓.
- Codex chat bubble no longer floods with raw JSON for models that emit
tool calls as content (qwen2.5-coder:3b). New stripRanges() helper uses
the balanced-brace positions the extractor already computes to remove the
exact tool-call substrings, and an extractedFromContent flag drops the
residual narrative entirely so qwen's Codex UI looks identical to
gemma4's.
- Balanced-brace JSON extractor replaces the greedy \{[^}]*\} regex that
failed on nested braces / f-strings. Fixes extractToolCallsFromContent
for any code using f-strings or dict-literal string values.
- Arg-validator error-hint now concrete: lists required fields with types
+ the keys the model actually sent, so small models self-correct on
retry instead of repeating the same malformed call.
### Added
- Context compaction in Codex (mirrors Agent Mode — compactMessages before
each sampling call).
- Memory injection + extraction in Codex (parity with Chat/Agent — reads
getMemoriesForPrompt into the system prompt + extractMemoriesFromPair
after the turn).
- CODEX_CATEGORIES tool-scope filter (filesystem/terminal/system/web only;
hides image_generate/screenshot/process_list/run_workflow from Codex).
- Codex iter cap 20 → 50 (large refactors need more tool calls; budget
still caps via agentMaxToolCalls/agentMaxIterations).
- Family grouping in ModelSelector dropdown (QWEN / GEMMA / LLAMA / HERMES /
PHI / DOLPHIN / MISTRAL / DEEPSEEK / …).
E2E verified on 5 tool-capable Ollama models (gemma4:e4b + full CLI task
with 4/4 unittest pass, qwen2.5-coder:3b, hermes3:8b, llama3.1:8b,
llama3.2:1b). Tests 2202 → 2202 green. cargo + tsc clean.
Drop-in upgrade from v2.3.7. No breaking changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---

**CHANGELOG.md** — 29 additions
All notable changes to Locally Uncensored are documented here.
## [2.3.8] - 2026-04-22
### Fixed
- **Codex `file_write` now actually lands on disk in the expected folder** — the built-in tool executors (`fs_read`, `fs_write`, `fs_list`, `fs_search`, `shell_execute`, `execute_code`) in `src/api/mcp/builtin-tools.ts` never threaded the active chat id through to Rust, even though `agent-context.ts` was designed for exactly that. The documented per-chat workspace isolation (`~/agent-workspace/<chatId>/`) silently fell through to a shared `default/` fallback whenever the model emitted a relative path. Now every executor reads `getActiveChatId()` and spreads it into the `backendCall` payload so Rust's `resolve_path()` / `resolve_agent_path()` can route relative paths into the right per-chat folder. `src/api/agents.ts:executeTool` also now returns the real `data.path` from Rust's `{status:"saved", path:…}` response instead of a hard-coded `"File written successfully"` string that masked write failures behind a green ✓ in the UI.
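A minimal sketch of that threading, under assumed shapes: `backendCall` here is a stand-in for the real bridge into Rust's `resolve_agent_path()`, and the payload type is illustrative; only the `chatId` spread mirrors the actual fix.

```typescript
// Hypothetical shapes — the real backendCall invokes the Rust backend.
type BackendPayload = { path: string; content?: string; chatId?: string };

let activeChatId: string | null = null;
const getActiveChatId = (): string | null => activeChatId;

async function backendCall(
  cmd: string,
  payload: BackendPayload
): Promise<{ status: string; path: string }> {
  // Stand-in for Rust's resolve_agent_path(): relative paths route into the
  // per-chat workspace when chatId is present, else the shared default/.
  const root = payload.chatId
    ? `~/agent-workspace/${payload.chatId}`
    : "~/agent-workspace/default";
  return { status: "saved", path: `${root}/${payload.path}` };
}

async function fsWrite(path: string, content: string) {
  // The fix: spread the active chat id into every executor's payload.
  const chatId = getActiveChatId() ?? undefined;
  return backendCall("fs_write", { path, content, chatId });
}
```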
- **Codex chat bubble no longer floods with raw `{"name":"file_write", "arguments":{…}}` JSON for models that emit tool calls as content** — qwen2.5-coder:3b and similar small coder models put the tool call in the `content` field instead of the native `tool_calls` array. The pre-2.3.8 extractor caught the call but left the raw JSON visible in the chat, and the narrative around it ("I'm about to verify…" plus a `python` code fence echoing the file content) was concatenated onto `fullContent` every iteration, so a 4-iteration task rendered as four stacked JSON blobs with four duplicated paragraphs. Fix: a new `stripRanges()` helper uses the `[startIdx, endIdx]` positions the balanced-brace extractor already computes to remove the exact tool-call substrings (not a greedy regex that fails on nested braces), and an `extractedFromContent` flag drops the residual narrative entirely, so qwen's Codex UI now looks identical to gemma4's.
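A `stripRanges()`-style helper can be sketched in a few lines (assumed shape; the real implementation in `tool-call-repair` may differ): remove the exact `[start, end)` substrings the extractor located.

```typescript
// Remove the given half-open [start, end) ranges from text.
function stripRanges(text: string, ranges: Array<[number, number]>): string {
  // Apply from the end of the string first so earlier indices stay valid.
  const sorted = [...ranges].sort((a, b) => b[0] - a[0]);
  let out = text;
  for (const [start, end] of sorted) {
    out = out.slice(0, start) + out.slice(end);
  }
  return out;
}
```

Cutting exact ranges avoids the failure mode of a greedy regex, which can over- or under-match when the tool-call JSON contains nested braces.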
- **Balanced-brace JSON extractor replaces the greedy `\{[^}]*\}` regex** — the old regex failed on any JSON with nested braces or string values containing `{` (e.g. Python f-strings such as `f'Hello, {name}!'` emitted by qwen2.5-coder). Replaced with a locate-header-then-balance scanner that respects string escapes. Fixes `extractToolCallsFromContent` for any code that uses f-strings or dict literals in string values.
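The balance-scanning half of such an extractor can be sketched as follows. This is an illustrative version, not the project's `findBalancedBraceEnd`: given the index of an opening `{`, it returns the index just past the matching `}`, skipping braces inside JSON strings and honouring backslash escapes.

```typescript
// Return the index just past the brace matching text[openIdx], or -1 if
// the braces never balance. Braces inside double-quoted JSON strings
// (e.g. an f-string like f'Hello, {name}!' in a "content" value) are ignored.
function findBalancedBraceEnd(text: string, openIdx: number): number {
  let depth = 0;
  let inString = false;
  for (let i = openIdx; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++;            // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) return i + 1;   // index just past the matching brace
    }
  }
  return -1;                           // unbalanced input
}
```

The "locate header" half would first find the `{` that starts a plausible tool-call object, then hand its index to this scanner.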
- **Arg-validator error hint now lists the exact missing fields with types and what the model actually sent** — pre-2.3.8, the generic "Re-issue the tool call with valid arguments matching the tool schema" hint meant small models (hermes3:8b, qwen2.5-coder:3b) kept retrying the same malformed call. Now the hint reads like `file_write requires {path: string, content: string}. You sent {command}. Retry with all required fields present.` — concrete enough that small models actually self-correct.
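A hint of that shape takes only a few lines to build. This is an illustrative sketch, not the project's actual validator; the flat `field name → type` schema representation is an assumption.

```typescript
// Hypothetical schema shape: required field name -> type name.
type FieldSchema = Record<string, string>;

function buildRetryHint(
  tool: string,
  required: FieldSchema,
  sent: Record<string, unknown>
): string {
  const wanted = Object.entries(required)
    .map(([field, type]) => `${field}: ${type}`)
    .join(", ");
  const got = Object.keys(sent).join(", ");
  return `${tool} requires {${wanted}}. You sent {${got}}. Retry with all required fields present.`;
}
```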
### Added
- **Context compaction in Codex** — long multi-tool turns used to blow past the context window of 8K-context local models; Codex now mirrors Agent Mode's `compactMessages(…, Math.floor(maxCtx * 0.8))` call before each sampling pass, summarising older turns while keeping recent messages intact.
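The shape of such a compaction pass can be sketched as below. This is a deliberately simplified stand-in: the real `compactMessages` presumably summarises older turns with the model itself, whereas this sketch folds them into a stub summary message; the token estimate and `keepRecent` cutoff are assumptions.

```typescript
type Msg = { role: string; content: string };

// Crude token estimate: ~4 characters per token.
const approxTokens = (msgs: Msg[]) =>
  msgs.reduce((n, m) => n + Math.ceil(m.content.length / 4), 0);

function compactMessages(messages: Msg[], budget: number, keepRecent = 4): Msg[] {
  if (approxTokens(messages) <= budget || messages.length <= keepRecent) {
    return messages; // already within budget, nothing to compact
  }
  const older = messages.slice(0, -keepRecent);
  const recent = messages.slice(-keepRecent);
  // Stand-in for a model-generated summary of the older turns.
  const summary: Msg = {
    role: "system",
    content: `[Summary of ${older.length} earlier messages omitted for context budget]`,
  };
  return [summary, ...recent];
}
```

Per the entry above, Codex would call this with a budget of `Math.floor(maxCtx * 0.8)` before each sampling pass.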
- **Memory injection + extraction in Codex** — Codex was the only chat surface that ignored the memory system. It now reads `useMemoryStore.getState().getMemoriesForPrompt(instruction, contextTokens)` into the system prompt at dispatch time and runs `extractMemoriesFromPair()` after the turn lands. Parity with Chat + Agent Mode.
- **`CODEX_CATEGORIES` tool-scope filter** — Codex now filters `toolRegistry.getAll()` to the `filesystem | terminal | system | web` categories before passing tools to the model. The pre-2.3.8 code had the constant defined but never used, so small models were getting confused by `image_generate`, `screenshot`, `run_workflow`, and `process_list` showing up next to `file_write` and emitting tool calls with the wrong argument shape (confirmed repro: hermes3:8b calling `file_write({command: "python -m unittest …"})` when both `shell_execute` and `file_write` were in scope). The filter narrows the blade.
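The filter itself is a one-liner over the registry. The `Tool` shape below is assumed; `CODEX_CATEGORIES` and the four category names come from the entry above.

```typescript
type Tool = { name: string; category: string };

const CODEX_CATEGORIES = new Set(["filesystem", "terminal", "system", "web"]);

// Scope the full registry down to the categories Codex actually needs.
function codexTools(all: Tool[]): Tool[] {
  return all.filter((t) => CODEX_CATEGORIES.has(t.category));
}
```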
- **Codex iter cap raised 20 → 50** — large refactors across 10+ files legitimately need more than 20 tool calls. Budget still caps via `agentMaxToolCalls` / `agentMaxIterations` (defaults 50 / 25 from settings).
- **Family grouping in ModelSelector dropdown** — models are now grouped by family header (QWEN / GEMMA / LLAMA / HERMES / PHI / DOLPHIN / MISTRAL / DEEPSEEK / …) in the Codex/Chat/Code dropdown, with a subscribe effect that re-fetches the list when any provider's `enabled`/`baseUrl` changes, so users don't have to open Model Manager to see newly enabled providers.
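One way such grouping can work is to derive a family header from the model id's leading letters. This is a hypothetical sketch; the project's actual family-detection heuristics may differ.

```typescript
// "qwen2.5-coder:3b" -> "QWEN", "llama3.1:8b" -> "LLAMA"
function modelFamily(id: string): string {
  const base = id.split(/[:\/]/)[0];      // drop the tag ("...:3b") or registry path
  const match = base.match(/^[a-z]+/i);   // leading letters name the family
  return (match ? match[0] : base).toUpperCase();
}

function groupByFamily(ids: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const id of ids) {
    const fam = modelFamily(id);
    groups.set(fam, [...(groups.get(fam) ?? []), id]);
  }
  return groups;
}
```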
### E2E verified
5 tool-capable Ollama models, each in a fresh Codex chat, writing to `C:\Users\<user>\Desktop\<test-folder>\`:
- **gemma4:e4b** — both the simple case (`file_write hello.py`) and a real Codex-style task ("build cli.py with argparse add/list/clear + test_cli.py with 4 unittest tests + run `python -m unittest test_cli.py` and report") succeeded end-to-end. Full trace: `file_write cli.py (2556B)` → `file_write test_cli.py (3759B)` → `shell_execute python -m unittest test_cli.py` → real output `....\nRan 4 tests in 1.612s\nOK` → final summary. Three clean tool blocks in the UI, a single final answer, and the Memory badge fired on extraction.
- **qwen2.5-coder:3b** — after the `stripRanges` + `extractedFromContent` fix, the chat UI is visually identical to gemma4's (tool blocks plus a single summary, zero raw JSON).
- **hermes3:8b** — clean native tool-call flow.
- **llama3.1:8b** — clean native tool-call flow (freshly pulled for this verification).
- **llama3.2:1b** — plumbing correct; the 1B model hallucinated a Unix-style `/Users/ddrob/Desktop/tiny.py` path that landed at `C:/Users/ddrob/Desktop/tiny.py` on Windows instead of in the workdir. A model-quality artefact, not a Codex bug; documented for users on the smallest class of models.
### Changed
- Test suite 2202 → 2202 (full regression) after `tool-call-repair` gained `extractToolCallsWithRanges` + `stripRanges` + `findBalancedBraceEnd` + `findPrecedingOpenBrace`.
### Notes
- Drop-in upgrade from v2.3.7. No breaking changes. No localStorage migration. Existing Codex chats continue to work; new chats benefit from the per-chat workspace isolation now that `chatId` threads through.
**README.md** — 23 additions, 6 deletions

---
## v2.3.8 — Current Release
**Codex end-to-end overhaul — tool calls that actually land on disk, clean UI across all models, memory + context compaction**
### Critical Fixes (why you want this update)
-**Codex file_write actually writes to disk now** — pre-2.3.8 Codex's built-in tool executors (`fs_read`, `fs_write`, `fs_list`, `fs_search`, `shell_execute`, `execute_code`) never threaded the active chat-id through to the Rust backend even though the whole `agent-context.ts` plumbing was designed for it. The documented per-chat workspace isolation (`~/agent-workspace/<chatId>/`) silently fell through to a shared `default/` fallback, and relative paths the model emitted landed nowhere useful. Fixed at the frontend layer — every builtin executor now calls `backendCall('fs_write', { …, chatId: getActiveChatId() })`. `agents.ts:executeTool` also returns the real `data.path` from Rust instead of a fake `"File written successfully"` string that hid write failures behind a green ✓.
44
+
- **Clean Codex UI for every model, not just the ones that emit native `tool_calls`** — qwen2.5-coder:3b (and other small coder models) emit tool calls as raw JSON in the `content` field instead of the native `tool_calls` array. The pre-2.3.8 extractor caught the JSON but left the raw `{"name":"file_write", "arguments": {...}}` object visible in the chat bubble, and every iteration's narrative ("I'm about to verify…" plus a `python` code fence with the file content) was concatenated onto `fullContent`, so a 4-tool-call turn looked like four stacked JSON blobs with four duplicated paragraphs. Fix: a new `stripRanges()` helper uses the balanced-brace positions the extractor already computes to remove the exact tool-call JSON substrings (not a greedy regex that fails on f-strings with `{name}`), and an `extractedFromContent` flag drops the residual narrative entirely, so qwen's Codex chat now looks identical to gemma4's.
- **Context compaction in Codex** — long multi-tool turns used to blow past the context window on 8K-context local models; Codex now summarises older turns via `compactMessages` before every sampling call. Parity with Agent Mode.
- **Memory injection + extraction in Codex** — Codex was the only chat surface that ignored the memory system. It now reads `getMemoriesForPrompt()` at dispatch time and runs `extractMemoriesFromPair()` after the turn lands, so long-running coding sessions accumulate context like everywhere else.
- **Tool-scope filter for Codex** — Codex now filters the registry to the `filesystem | terminal | system | web` categories before passing tools to the model. Small models were getting confused by `image_generate`, `screenshot`, `run_workflow`, and `process_list` showing up next to `file_write` and emitting tool calls with the wrong argument shape (e.g. `file_write({command: …})`). The filter narrows the blade.
- **Balanced-brace JSON extractor** — the naive `\{[^}]*\}` regex in `tool-call-repair.ts` failed on any nested brace or string value containing `{` (e.g. Python f-strings such as `f'Hello, {name}!'`). Replaced with a locate-header-then-balance scanner that respects string escapes. Fixes qwen2.5-coder:3b tool-call extraction for any code that uses f-strings.
- **Concrete arg-validator error hints** — when a tool call fails schema validation, the retry message sent back to the model now lists the exact required fields with their types plus the keys the model actually sent (`file_write requires {path: string, content: string}. You sent {command}. Retry with all required fields present.`) instead of the old generic "matching the tool schema" hint. Small models self-correct much better with a concrete example.
- **Codex iter cap raised 20 → 50** — large refactors across 10+ files legitimately need more than 20 tool calls. Budget still caps via `agentMaxToolCalls` / `agentMaxIterations` (defaults 50 / 25 from settings).

Drop-in upgrade. v2.3.7's remote Ollama + `OLLAMA_HOST` env var support, v2.3.6's configurable ComfyUI host, LM Studio / OpenAI-compat CORS fix, and ComfyUI port persistence all remain in place.
### Remote Access + Mobile Web App
- **Access your AI from your phone** — Dispatch via LAN or Cloudflare Tunnel (Internet)