Releases: Enderfga/claw-orchestrator

v4.1.2 — Opus 4.8 registry + Codex error surfacing

03 Jun 14:20

Added

  • Opus 4.8 (claude-opus-4-8, now the opus alias) and 4.7 in the model registry — model resolution and cost reporting are correct for opus and pinned Opus 4.x ids. claude-opus-4-6 remains available by id.

Fixed

  • Codex turn.failed / error stream events now reject the send with the reported message instead of resolving to an empty string (even on exit 0).
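
A minimal sketch of the behaviour described above, assuming the Codex stream arrives as JSONL. Only the turn.failed and error type values come from this note; the CodexEvent shape, its message and text fields, and the collectCodexReply name are hypothetical.

```typescript
// Hypothetical event shape: only the `type` values are taken from the release note.
type CodexEvent = { type: string; message?: string; text?: string };

// Collect reply text from a JSONL event stream, throwing on failure events.
function collectCodexReply(jsonl: string): string {
  let reply = "";
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const event = JSON.parse(line) as CodexEvent;
    if (event.type === "turn.failed" || event.type === "error") {
      // Surface the reported message instead of returning an empty reply.
      throw new Error(event.message ?? "Codex turn failed");
    }
    if (typeof event.text === "string") reply += event.text;
  }
  return reply;
}
```

Called from inside an async send function, the throw becomes a promise rejection, so the caller sees the failure regardless of the process exit code.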

Changed

  • Tested versions synced to Claude Code CLI 2.1.161; Codex stays 0.133.0.

v4.1.1 — Codex structured output + Gemini 0.43 compat

24 May 09:18

Added

  • Codex structured output via jsonSchema — the engine-agnostic jsonSchema session config now wires into Codex (codex exec --output-schema <FILE>, on first turn and resume). Requires Codex 0.132+.
  • Antigravity CLI (agy) custom-engine recipe — documented a ready-to-use CustomEngineConfig in multi-engine.md so Google's agy can be driven today via engine: 'custom'. Note: agy 1.0.2 has no structured output mode, so token counts are estimated.
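
For the Codex structured-output item above, one way the schema could be handed to codex exec --output-schema <FILE> is sketched here. The outputSchemaArgs helper and the temp-file location are assumptions, not the wrapper's actual code; only the flag name comes from this note.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Write the schema to a temp file and return the extra `codex exec` arguments.
// Returns no arguments when the session has no jsonSchema configured.
function outputSchemaArgs(jsonSchema: object | undefined): string[] {
  if (jsonSchema === undefined) return [];
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "codex-schema-"));
  const file = path.join(dir, "schema.json");
  fs.writeFileSync(file, JSON.stringify(jsonSchema), "utf8");
  return ["--output-schema", file];
}
```

Because the note says the flag applies on the first turn and on resume, a caller would presumably append these arguments in both cases.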

Fixed

  • Gemini engine: pass --skip-trust — Gemini CLI 0.43 added a trusted-folders gate that aborted headless -p runs in untrusted directories (worktrees, arbitrary cwds). The wrapper now always passes --skip-trust.

Changed

  • Bumped tested engine CLI versions: Claude Code 2.1.150, Codex 0.133.0, Gemini 0.43.0.

v4.1.0 — Claude Code 2.1.140 sync

13 May 10:58

Catches the wrapper up on the programmatic surface added between Claude Code CLI 2.1.126 and 2.1.140.

Added

  • claude_goal_set / claude_goal_clear / claude_goal_status tools wrap the CLI 2.1.139 /goal slash command. Claude Code keeps working across turns until the stated condition is met, evaluating after each turn via Haiku. The wrappers send the slash text via the existing session channel and enforce engine: "claude"; unlike Codex's /goal, there is no separate goal-state notification — the only surface is the assistant's reply text.
  • plugin_details tool wraps claude plugin details <name> (CLI 2.1.139+). Returns the plugin's component inventory plus per-session token cost.
  • pluginUrl session config maps to --plugin-url (CLI 2.1.129+). Accepts a single URL or array; each value is fetched as a plugin .zip archive for the session.
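
A sketch of the single-URL-or-array handling for pluginUrl. The pluginUrlArgs name is hypothetical, and repeating --plugin-url once per URL is an assumption about how multiple values are passed.

```typescript
// Normalise pluginUrl (string or string[]) into CLI arguments.
// Assumption: the flag is repeated once per URL.
function pluginUrlArgs(pluginUrl?: string | string[]): string[] {
  if (pluginUrl === undefined) return [];
  const urls = Array.isArray(pluginUrl) ? pluginUrl : [pluginUrl];
  return urls.flatMap((url) => ["--plugin-url", url]);
}
```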

Not exposed (and why)

Settings.json fields (worktree.baseRef, autoMode.hard_deny, skillOverrides, sandbox.bwrapPath / socatPath, parentSettingsBehavior) are already user-controlled via the existing --settings flag. TTY-only env vars (CLAUDE_CODE_DISABLE_ALTERNATE_SCREEN, CLAUDE_CODE_FORCE_SYNC_OUTPUT, CLAUDE_CODE_SESSION_ID) do not apply to a non-interactive subprocess. Hook config (args: string[] exec form, continueOnBlock, hook input effort.level) and the subagent x-claude-code-agent-id HTTP header are internal to the CLI.

v4.0.7 — Test: mock council in setModeForDelta auto-fire (fixes CI flake)

13 May 09:50

Fixed — CI flake in manager.test.ts (`ENOTEMPTY` during afterEach)

The "setModeForDelta + interview-complete auto-fires startBuild" case let `startBuild` actually spawn a real council subprocess + git worktree, then dropped back to the polling loop as soon as mode hit `queued`. By the time `afterEach` ran `fs.rmSync(tmp, recursive)`, the council git workers were still writing into `/council-project/.git`, racing the recursive removal and surfacing as `ENOTEMPTY: directory not empty, rmdir '.git'`.

The test's contract is the interview-complete → startBuild handoff — nothing about the build pipeline's downstream behaviour. Now mocks `runCouncilSynth` and `runFixOnFailure` so the build short-circuits without spawning external workers.

Local stress (20× consecutive runs) passes cleanly.

v4.0.6 — Atomic JSON writes in UltraappStore (fixes flaky CI)

13 May 09:44

Fixed — UltraappStore JSON files now written atomically

The 4.0.5 CI run failed in manager.test.ts with Expected ',' or '}' after property value in JSON at position 148 from UltraappStore.readState — the manager test's polling loop caught state.json mid-write. The race exists in production too: any reader polling run state while another path mutates it can land in the partial-truncate window of fsp.writeFile.

Added atomicWriteJson(file, body) that writes to <file>.tmp.<pid>.<rand> and rename(2)s onto the target. POSIX rename is atomic, so concurrent readers see either the old file or the new — never a half-written file. All seven writer sites in store.ts route through the helper:

  • createRun (state + spec)
  • setMode
  • writeSpec (spec + state)
  • recordBuildArtifact
  • recordDeploy
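
The write-then-rename technique can be sketched as follows. This version is synchronous for brevity and is not the code in store.ts; the atomicWriteJsonSync name and the cleanup-on-error step are assumptions.

```typescript
import * as fs from "node:fs";
import { randomBytes } from "node:crypto";

// Write JSON to <file>.tmp.<pid>.<rand>, then rename onto the target.
// The temp file sits next to the target, so both are on the same filesystem
// and the rename replaces the target in one step.
function atomicWriteJsonSync(file: string, body: unknown): void {
  const tmp = `${file}.tmp.${process.pid}.${randomBytes(4).toString("hex")}`;
  try {
    fs.writeFileSync(tmp, JSON.stringify(body, null, 2), "utf8");
    fs.renameSync(tmp, file);
  } catch (err) {
    fs.rmSync(tmp, { force: true }); // do not leave a stray temp file behind
    throw err;
  }
}
```

A reader that opens the target at any moment gets either the previous complete document or the new one, which is the property the polling loop needed.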

Local stress (20× consecutive runs of the manager test) passes cleanly with the fix.

v4.0.5 — Coder & Reviewer panes restore after refresh

13 May 09:37

Fixed — Coder / Reviewer panes were blank after refresh

4.0.4 added chat.jsonl persistence for the Planner conversation but the Coder and Reviewer replies stayed SSE-only. The result: opening a run after a refresh / cross-process / Resume showed the Planner thread populated but the Coder and Reviewer panes empty until the next SSE event arrived — and for terminated runs, no SSE events ever come.

Changes

  • dispatcher.deliverToCoder and dispatcher.deliverToReviewer now append every reply to <ledger>/chat.jsonl alongside the existing emit('coder_reply' | 'reviewer_reply', ...) calls.
  • Each phase also writes a heartbeat entry the moment delivery starts: 🔨 Coder iter N working… / 🔍 Reviewer iter N auditing…. Useful for liveness checks on long turns (the dashboard sees activity even before the agent produces output) and survives refresh because it's on disk.
  • appendChatEntry's who union widened to include 'coder' | 'reviewer'.
  • The dashboard's chat_history hydration routes entries by who into the corresponding pane (coder → Coder pane, reviewer → Reviewer pane, others → Planner pane), so refreshing a mid-iter run shows the complete three-way conversation.
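
The routing rule in the last item can be sketched like this. Only the 'coder' and 'reviewer' values of who come from this note; the other who values, the text field, and the function names are hypothetical.

```typescript
// Hypothetical entry shape; 'coder' and 'reviewer' are the values named above.
type Who = "user" | "planner" | "coder" | "reviewer";
type ChatEntry = { who: Who; text: string };
type Pane = "planner" | "coder" | "reviewer";

// coder -> Coder pane, reviewer -> Reviewer pane, everything else -> Planner pane.
function paneFor(who: Who): Pane {
  if (who === "coder") return "coder";
  if (who === "reviewer") return "reviewer";
  return "planner";
}

// Replay a chat.jsonl payload into per-pane message lists, preserving order.
function hydratePanes(jsonl: string): Record<Pane, string[]> {
  const panes: Record<Pane, string[]> = { planner: [], coder: [], reviewer: [] };
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const entry = JSON.parse(line) as ChatEntry;
    panes[paneFor(entry.who)].push(entry.text);
  }
  return panes;
}
```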

v4.0.4 — Terminated-run reopen + per-request token + chat history

13 May 09:11

Fixed

Auth token re-read per request

EmbeddedServer previously cached the auth token in memory at startup. If a second clawo process (test runner, nohup launch, second launchd service) briefly held the bind and wrote a different token, the live server kept its stale in-memory value while the reverse proxy (which reads ~/.openclaw/server-token per request) injected the new one, producing a permanent 401 loop with no observable cause. The auth check now re-reads the token file on every request (64-byte read, kernel page cache, microsecond cost). The OPENCLAW_SERVER_TOKEN env override and the disabled opt-out are unchanged.
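
A sketch of a per-request check, assuming the token file holds a single line. The isAuthorized name, the constant-time comparison, and denying when the file is missing are assumptions; the env override and the disabled opt-out are left out.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import { timingSafeEqual } from "node:crypto";

// Token location named in the release note.
const DEFAULT_TOKEN_FILE = path.join(os.homedir(), ".openclaw", "server-token");

// Re-read the token on every call so a rewritten file takes effect immediately.
function isAuthorized(presented: string, tokenFile: string = DEFAULT_TOKEN_FILE): boolean {
  let expected: string;
  try {
    expected = fs.readFileSync(tokenFile, "utf8").trim();
  } catch {
    return false; // assumption: no readable token file means deny
  }
  const a = Buffer.from(expected);
  const b = Buffer.from(presented);
  return a.length > 0 && a.length === b.length && timingSafeEqual(a, b);
}
```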

Reopening a terminated autoloop run no longer hangs on "Waiting…"

autoloopStatus(runId) previously returned undefined for any run that wasn't in this process's in-memory map, so the dashboard's /autoloop/<id>/state fetch 404'd on every terminated run and the UI stayed on its "Waiting…" placeholder forever. autoloopStatus now falls back to listAutoloopsFromRegistry and reconstructs a terminated-state shape from the on-disk ledger when there's no live runner. /push_log was refactored to go through autoloopStatus so it benefits from the same fallback. /events returns a single-shot SSE (snapshot + terminated event + close) for disk-only runs so the dashboard's existing handlers cleanly render history without hanging on a 404 EventSource.
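
The lookup order described above (live runner first, on-disk registry second) might look roughly like this. The types, the source field, and the autoloopStatusSketch name are hypothetical; the real function also reconstructs state from the ledger, which is omitted here.

```typescript
// Hypothetical shapes for illustration only.
type RunStatus = { runId: string; status: "running" | "terminated"; source: "memory" | "disk" };
type RegistryEntry = { run_id: string; workspace: string };

function autoloopStatusSketch(
  runId: string,
  live: Map<string, RunStatus>,
  registry: RegistryEntry[],
): RunStatus | undefined {
  // 1. Prefer the in-memory runner owned by this process.
  const inMemory = live.get(runId);
  if (inMemory) return inMemory;
  // 2. Fall back to the registry and report a terminated-state shape.
  const onDisk = registry.find((entry) => entry.run_id === runId);
  if (!onDisk) return undefined; // truly unknown run: the route can still 404
  return { runId, status: "terminated", source: "disk" };
}
```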

Added

Chat history persistence + GET /autoloop/<id>/chat_history

Planner user-messages and Planner replies are now appended to <ledger>/chat.jsonl on every turn. The dashboard fetches this file on open and replays the conversation into the planner pane, so refreshing the page / re-opening a terminated run / coming back from a clawo serve restart no longer wipes the visible history. Returns [] for runs that predate this change.
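
A sketch of the append-and-replay pattern for chat.jsonl, including the empty result for runs that have no file. The ChatLine fields and both function names are hypothetical.

```typescript
import * as fs from "node:fs";

// Hypothetical line shape; one JSON object per line.
type ChatLine = { who: string; text: string; ts: number };

// Append one entry per turn.
function appendChat(file: string, entry: ChatLine): void {
  fs.appendFileSync(file, JSON.stringify(entry) + "\n", "utf8");
}

// Read the whole history; runs without a chat.jsonl yield [].
function readChatHistory(file: string): ChatLine[] {
  if (!fs.existsSync(file)) return [];
  return fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as ChatLine);
}
```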

POST /autoloop/<id>/resume + Resume button

Terminated runs can now be brought back in-process:

  1. Look up the run in ~/.claw-orchestrator/autoloop-registry.jsonl.
  2. Re-create the runner + dispatcher with the same run_id / workspace.
  3. ensurePlanner picks up the Planner's claudeSessionId from persistedSessions (now kept on disk because dispatcher.shutdown passes keepPersisted: true to stopSession) and Claude resumes the original conversation. Runs that predate this change have no persisted session — they get a fresh Planner with the same system prompt, while the dashboard replays chat.jsonl (when present) visually.

The dashboard surfaces a green Resume run button in the top bar whenever a run's status is terminated. Click → POST /resume → reconnect SSE.

Changed

SessionManager.stopSession(name, { keepPersisted? })

stopSession now accepts an opts bag. keepPersisted: true keeps the persistedSessions entry on disk so a later resume can re-attach the Claude session. Defaults to the old behaviour (entry deleted) so callers that haven't opted in are unaffected. Autoloop dispatcher.shutdown(...) passes keepPersisted: true automatically; autoloopDelete passes purge: true to ensure a real delete still scrubs everything.
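
The opt-in semantics can be illustrated with a toy store. SessionStoreSketch, its fields, and its start and resumeId methods are hypothetical; only the stopSession(name, { keepPersisted? }) signature and its default come from this note.

```typescript
type StopOptions = { keepPersisted?: boolean };

class SessionStoreSketch {
  private live = new Set<string>();
  private persisted = new Map<string, string>(); // name -> claudeSessionId

  start(name: string, claudeSessionId: string): void {
    this.live.add(name);
    this.persisted.set(name, claudeSessionId);
  }

  // Default (no opts) keeps the old behaviour: the persisted entry is deleted.
  stopSession(name: string, opts: StopOptions = {}): void {
    this.live.delete(name);
    if (!opts.keepPersisted) this.persisted.delete(name);
  }

  // A later resume can only re-attach if the entry was kept.
  resumeId(name: string): string | undefined {
    return this.persisted.get(name);
  }
}
```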

Tests

  • embedded-server-launcher.test.ts snapshots and restores the host's ~/.openclaw/server-token so npm test no longer rotates the user's live dashboard token.
  • New cases for POST /autoloop/<id>/resume (200 + 404) and GET /autoloop/<id>/chat_history (200 + 404).

v4.0.3 — autoloop chat + planner write-gating + auth/pid hardening

13 May 08:39

This release bundles three patches cut from a single coherent pass after v4.0.0:

Fixed — Dashboard auth token survives clawo serve restarts

EmbeddedServer regenerated the auth token on every construction, ignoring the on-disk ~/.openclaw/server-token. Every server restart invalidated browser cookies / open dashboard tabs / running CLI sessions. The server now reuses the persisted token (validated as ≥32 hex chars), only generating a fresh one when the file is missing or malformed. OPENCLAW_SERVER_TOKEN env override and disabled opt-out still take precedence; file remains mode 0600.
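
A sketch of the reuse-or-regenerate rule, assuming a 32-byte random token. The loadOrCreateToken name and the token length are assumptions; the env override and the disabled opt-out are not modelled.

```typescript
import * as fs from "node:fs";
import { randomBytes } from "node:crypto";

// At least 32 hex characters, per the validation rule in the note.
const TOKEN_PATTERN = /^[0-9a-f]{32,}$/i;

function loadOrCreateToken(tokenFile: string): string {
  try {
    const existing = fs.readFileSync(tokenFile, "utf8").trim();
    if (TOKEN_PATTERN.test(existing)) return existing; // reuse the persisted token
  } catch {
    // Missing or unreadable file: fall through and generate a fresh token.
  }
  const fresh = randomBytes(32).toString("hex"); // 64 hex characters
  fs.writeFileSync(tokenFile, fresh, { mode: 0o600 });
  fs.chmodSync(tokenFile, 0o600); // the mode option only applies when the file is created
  return fresh;
}
```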

Fixed — session-pids.json no longer accumulates stale entries

SessionManager._savePids() unconditionally preserved entries from owners other than the current process, even after the owning SessionManager had exited. _savePids() now probes process.kill(ownerPid, 0) before keeping an other-owner entry; dead-owner rows are dropped.
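
The liveness probe relies on signal 0, which checks for a process without delivering a signal. The PidEntry shape and both function names below are hypothetical; treating EPERM as "alive" is an assumption about how the edge case should be handled.

```typescript
// Probe a pid with signal 0: no signal is delivered, only existence is checked.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Hypothetical row shape for session-pids.json.
type PidEntry = { sessionName: string; pid: number; ownerPid: number };

// Keep rows owned by other processes only while their owner is still running.
function keepLiveOtherOwners(entries: PidEntry[], selfPid: number): PidEntry[] {
  return entries.filter((e) => e.ownerPid !== selfPid && isProcessAlive(e.ownerPid));
}
```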

Changed — Planner is physically prevented from authoring deliverables (breaking)

The Planner is meant to design plans and delegate; the Coder is meant to produce deliverables. The fix moves the role boundary from soft (prompt rule) to hard (tool gating):

  • Planner session now passes disallowedTools: ['Write', 'Edit', 'MultiEdit', 'NotebookEdit'] to Claude Code. Read / Glob / Grep / Bash stay enabled so the Planner can still discover and audit the workspace.
  • New autoloop tools write_plan and write_goal replace write_plan_committed / write_goal_committed. They take the full file content as a string + an optional commit_message. The orchestrator writes the file server-side, then commits. This is the Planner's only legitimate path to author plan.md / goal.json.
  • All three system prompts (Planner / Coder / Reviewer) rewritten with hard rules at the top under an # ABSOLUTE RULES heading.
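
The gating and the server-side write path might be sketched as follows. The tool list and the content / commit_message arguments come from this note; plannerMayUse, writePlan, the commit callback, and the default commit message are hypothetical.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Tools the Planner session is denied, as listed above.
const PLANNER_DISALLOWED_TOOLS: readonly string[] = ["Write", "Edit", "MultiEdit", "NotebookEdit"];

function plannerMayUse(tool: string): boolean {
  return !PLANNER_DISALLOWED_TOOLS.includes(tool);
}

type WritePlanArgs = { content: string; commit_message?: string };

// The orchestrator, not the Planner, writes the file and then commits it.
function writePlan(
  workspace: string,
  args: WritePlanArgs,
  commit: (file: string, message: string) => void, // stand-in for the real git step
): string {
  const file = path.join(workspace, "plan.md");
  fs.writeFileSync(file, args.content, "utf8");
  commit(file, args.commit_message ?? "Update plan.md"); // default message is an assumption
  return file;
}
```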

Behavioural breaking change for callers that reference the old write_plan_committed / write_goal_committed tool names — the orchestrator surfaces "unknown tool" warnings if it sees them.

Fixed (4.0.2) — POST /autoloop/<id>/chat 524 timeout behind a reverse proxy

Chat route is now fire-and-forget: validates the run is alive, dispatches the message, returns 202 { ok, queued: true } immediately. The Planner's reply streams back via /events as a planner_reply event (dashboard already subscribes). New planner_error SSE event surfaces runtime failures. Dashboard clears textarea on send and shows a pending "Planner is thinking…" placeholder.
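
The fire-and-forget shape can be sketched as a pure handler. The handleChatPost name, the callback parameters, and the 404 for a dead run are assumptions; the 202 { ok, queued: true } response comes from this note.

```typescript
type ChatResponse = {
  status: number;
  body: { ok: boolean; queued?: boolean; error?: string };
};

function handleChatPost(
  runAlive: boolean,
  message: string,
  dispatch: (message: string) => Promise<void>, // the long-running Planner turn
  onError: (err: unknown) => void, // e.g. emit a planner_error SSE event
): ChatResponse {
  if (!runAlive) return { status: 404, body: { ok: false, error: "run not found" } };
  // Fire and forget: start the turn but do not await it, so the HTTP
  // response is not held open behind the reverse proxy's timeout.
  dispatch(message).catch(onError);
  return { status: 202, body: { ok: true, queued: true } };
}
```

The reply itself travels over the existing /events stream, so the HTTP response only has to acknowledge that the message was queued.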

Fixed (4.0.1) — Autoloop chat in the dashboard

Dashboard's Planner compose box was posting to /v1/openclaw/tools/autoloop_chat, which only exists as an MCP tool — not as an embedded-server HTTP route. Added POST /autoloop/<id>/chat and POST /autoloop/<id>/delete to embedded-server, plus a hover-revealed Delete button on autoloop rows in the sidebar.

Install

npm install -g @enderfga/claw-orchestrator
clawo serve   # dashboard at http://127.0.0.1:18796/dashboard

v4.0.0 — ultraapp (Forge tab)

13 May 08:02

A 3-agent Opus council turns a 5-question interview into a deployed web app: Tailwind UI, BYOK, file-queue runtime, smoke test, all live at localhost:19000/forge/<slug>/.

What's in 4.0.0

  • Forge tab + 14 MCP tools. + New → walk the interview → Start Build → council writes a complete codebase, fix-on-failure drives npm install && npm run build && npm test (plus docker build . only in opt-in --ultraapp-runtime docker mode) to green, deploy registers the slug. Share-card URL appears in chat.
  • Frontend quality §7 is binding. Council agents must capture Chrome-headless screenshots at 1440×900 AND 375×812 and visually inspect the PNGs before voting YES — source-code review is explicitly insufficient evidence. Every generated app uses a real styling system, four-state coverage on every async surface, drag-and-drop forms with previews + inline validation, and result presentation appropriate to type.
  • Two runtime modes. host (default, spawns generated app as a regular Node process — works anywhere Node works), docker (opt-in for shared-host isolation).
  • Done-mode iteration. Cosmetic chat ("make button green") runs an Opus patcher (diff + apply + validate + auto-revert + version snapshot); spec-delta chat ("also output a thumbnail") flips back to a focused interview + auto-rerun. Versions tagged v1, v2, … and switchable via Promote.
  • Cross-process visibility. Dashboard now sees council and autoloop runs across plugin-side SessionManager and standalone clawo serve (council transcript enumerator + autoloop registry).

Engine compatibility

Engine      CLI         Tested Version
Claude      claude      2.1.126
Codex       codex       0.128.0
Gemini      gemini      0.36.0
Cursor      agent       2026.03.30
OpenCode    opencode    1.1.40

Install

npm install -g @enderfga/claw-orchestrator
clawo serve   # dashboard at http://127.0.0.1:18796/dashboard

Full operator reference for ultraapp: skills/references/ultraapp.md. MCP host setup: skills/references/mcp.md.

v3.7.1 — RE2 for /session/grep

11 May 15:05

Fixed

  • /session/grep and the session-grep tool now use RE2 to compile user-supplied regex patterns. RE2 runs in linear time and never backtracks, so patterns like (a+)+$ that could previously stall the Node event loop now complete in microseconds. Closes #64.

Thanks to @ybdesire for the report.

Note: RE2 does not support a handful of PCRE-only features (lookbehind, backreferences). Patterns using those features will now be rejected at compile time with an Invalid regex pattern error.

Full changelog: https://github.com/Enderfga/claw-orchestrator/blob/main/CHANGELOG.md