Releases: Enderfga/claw-orchestrator
v4.1.2 — Opus 4.8 registry + Codex error surfacing
Added
- Opus 4.8 (`claude-opus-4-8`, now the `opus` alias) and 4.7 in the model registry: model resolution and cost reporting are correct for `opus` and pinned Opus 4.x ids. `claude-opus-4-6` remains available by id.
Fixed
- Codex `turn.failed` / `error` stream events now reject the send with the reported message instead of resolving an empty string (even on exit 0).
Changed
- Tested versions synced to Claude Code CLI 2.1.161; Codex stays 0.133.0.
v4.1.1 — Codex structured output + Gemini 0.43 compat
Added
- Codex structured output via `jsonSchema`: the engine-agnostic `jsonSchema` session config now wires into Codex (`codex exec --output-schema <FILE>`, on first turn and resume). Requires Codex 0.132+.
- Antigravity CLI (`agy`) custom-engine recipe: documented a ready-to-use `CustomEngineConfig` in multi-engine.md so Google's `agy` can be driven today via `engine: 'custom'`. Note: `agy` 1.0.2 has no structured output mode, so token counts are estimated.
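As a rough illustration of how a `jsonSchema` session config could become a `--output-schema <FILE>` argument, here is a minimal sketch. The helper name `outputSchemaArgs` and the temp-file layout are assumptions for illustration; only the flag itself comes from the notes above, and the wrapper's real implementation may differ.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper (not the wrapper's actual code): persist the schema to a
// temp file and return the extra CLI arguments for `codex exec`.
function outputSchemaArgs(jsonSchema: object | undefined): string[] {
  if (!jsonSchema) return []; // no schema configured: add nothing
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "codex-schema-"));
  const file = path.join(dir, "schema.json");
  fs.writeFileSync(file, JSON.stringify(jsonSchema));
  return ["--output-schema", file];
}
```

Because the flag takes a file path rather than inline JSON, the same arguments can be reused on both the first turn and on resume, as the entry above describes.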
Fixed
- Gemini engine: pass `--skip-trust`. Gemini CLI 0.43 added a trusted-folders gate that aborted headless `-p` runs in untrusted directories (worktrees, arbitrary cwds). The wrapper now always passes `--skip-trust`.
Changed
- Bumped tested engine CLI versions: Claude Code 2.1.150, Codex 0.133.0, Gemini 0.43.0.
v4.1.0 — Claude Code 2.1.140 sync
Catches up the wrapper on programmatic surface added between Claude Code CLI 2.1.126 and 2.1.140.
Added
- `claude_goal_set` / `claude_goal_clear` / `claude_goal_status` tools wrap the CLI 2.1.139 `/goal` slash command. Claude Code keeps working across turns until the stated condition is met, evaluating after each turn via Haiku. The wrappers send the slash text via the existing session channel and enforce `engine: "claude"`. Unlike Codex's `/goal`, there is no separate goal-state notification: the only surface is the assistant's reply text.
- `plugin_details` tool wraps `claude plugin details <name>` (CLI 2.1.139+). Returns the plugin's component inventory plus per-session token cost.
- `pluginUrl` session config maps to `--plugin-url` (CLI 2.1.129+). Accepts a single URL or array; each value is fetched as a plugin `.zip` archive for the session.
Not exposed (and why)
Settings.json fields (worktree.baseRef, autoMode.hard_deny, skillOverrides, sandbox.bwrapPath / socatPath, parentSettingsBehavior) are already user-controlled via the existing --settings flag. TTY-only env vars (CLAUDE_CODE_DISABLE_ALTERNATE_SCREEN, CLAUDE_CODE_FORCE_SYNC_OUTPUT, CLAUDE_CODE_SESSION_ID) do not apply to a non-interactive subprocess. Hook config (args: string[] exec form, continueOnBlock, hook input effort.level) and the subagent x-claude-code-agent-id HTTP header are internal to the CLI.
v4.0.7 — Test: mock council in setModeForDelta auto-fire (fixes CI flake)
Fixed — CI flake in manager.test.ts (`ENOTEMPTY` during afterEach)
The "setModeForDelta + interview-complete auto-fires startBuild" case let `startBuild` actually spawn a real council subprocess + git worktree, then dropped back to the polling loop as soon as mode hit `queued`. By the time `afterEach` ran `fs.rmSync(tmp, recursive)`, the council git workers were still writing into `/council-project/.git`, racing the recursive removal and surfacing as `ENOTEMPTY: directory not empty, rmdir '.git'`.
The test's contract is the interview-complete → startBuild handoff — nothing about the build pipeline's downstream behaviour. Now mocks `runCouncilSynth` and `runFixOnFailure` so the build short-circuits without spawning external workers.
Local stress (20× consecutive runs) passes cleanly.
v4.0.6 — Atomic JSON writes in UltraappStore (fixes flaky CI)
Fixed — UltraappStore JSON files now written atomically
The 4.0.5 CI run failed in manager.test.ts with Expected ',' or '}' after property value in JSON at position 148 from UltraappStore.readState — the manager test's polling loop caught state.json mid-write. The race exists in production too: any reader polling run state while another path mutates it can land in the partial-truncate window of fsp.writeFile.
Added atomicWriteJson(file, body) that writes to <file>.tmp.<pid>.<rand> and rename(2)s onto the target. POSIX rename is atomic, so concurrent readers see either the old file or the new — never a half-written file. All seven writer sites in store.ts route through the helper:
- `createRun` (state + spec)
- `setMode`
- `writeSpec` (spec + state)
- `recordBuildArtifact`
- `recordDeploy`
Local stress (20× consecutive runs of the manager test) passes cleanly with the fix.
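The write-to-temp-then-rename pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in store.ts: the function name and temp-file naming follow the notes, while the random suffix length, the JSON indentation, and the cleanup on failure are choices made here.

```typescript
import { promises as fsp } from "node:fs";
import * as crypto from "node:crypto";

// Sketch of an atomic JSON write. The temp file sits next to the target so
// both are on the same filesystem, which rename needs in order to be atomic.
async function atomicWriteJson(file: string, body: unknown): Promise<void> {
  const rand = crypto.randomBytes(4).toString("hex");
  const tmp = `${file}.tmp.${process.pid}.${rand}`;
  try {
    await fsp.writeFile(tmp, JSON.stringify(body, null, 2));
    await fsp.rename(tmp, file); // readers see the old or the new file, never a partial one
  } catch (err) {
    await fsp.rm(tmp, { force: true }); // assumption: do not leave stray temp files behind
    throw err;
  }
}
```

A polling reader that calls `JSON.parse` on the target can still race with the rename, but it will always parse one complete version of the file.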
v4.0.5 — Coder & Reviewer panes restore after refresh
Fixed — Coder / Reviewer panes were blank after refresh
4.0.4 added chat.jsonl persistence for the Planner conversation but the Coder and Reviewer replies stayed SSE-only. The result: opening a run after a refresh / cross-process / Resume showed the Planner thread populated but the Coder and Reviewer panes empty until the next SSE event arrived — and for terminated runs, no SSE events ever come.
Changes
- `dispatcher.deliverToCoder` and `dispatcher.deliverToReviewer` now append every reply to `<ledger>/chat.jsonl` alongside the existing `emit('coder_reply' | 'reviewer_reply', ...)` calls.
- Each phase also writes a heartbeat entry the moment delivery starts: `🔨 Coder iter N working…` / `🔍 Reviewer iter N auditing…`. Useful for liveness checks on long turns (the dashboard sees activity even before the agent produces output) and survives refresh because it's on disk.
- `appendChatEntry`'s `who` union widened to include `'coder' | 'reviewer'`.
- The dashboard's `chat_history` hydration routes entries by `who` into the corresponding pane (`coder` → Coder pane, `reviewer` → Reviewer pane, others → Planner pane), so refreshing a mid-iter run shows the complete three-way conversation.
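The routing rule in the last item can be sketched as below. The `ChatEntry` shape, the `"user"` and `"planner"` members of the union, and the function names are assumptions for illustration; only the `coder` / `reviewer` / everything-else mapping comes from the notes.

```typescript
// Assumed union: only 'coder' and 'reviewer' are named in the release notes.
type Who = "user" | "planner" | "coder" | "reviewer";
type Pane = "planner" | "coder" | "reviewer";

interface ChatEntry {
  who: Who;
  text: string; // hypothetical field name
}

// coder -> Coder pane, reviewer -> Reviewer pane, everything else -> Planner pane.
function paneFor(who: Who): Pane {
  if (who === "coder") return "coder";
  if (who === "reviewer") return "reviewer";
  return "planner";
}

// Replays persisted entries into per-pane message lists, preserving order.
function hydrate(entries: ChatEntry[]): Record<Pane, string[]> {
  const panes: Record<Pane, string[]> = { planner: [], coder: [], reviewer: [] };
  for (const entry of entries) panes[paneFor(entry.who)].push(entry.text);
  return panes;
}
```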
v4.0.4 — Terminated-run reopen + per-request token + chat history
Fixed
Auth token re-read per request
EmbeddedServer previously cached the auth token in memory at startup; a second clawo process (test runner, nohup launch, second launchd service) that briefly held the bind and wrote a different token would leave the live server with a stale in-memory value, and the reverse proxy (which reads ~/.openclaw/server-token per request) would inject the new value — producing a permanent 401 loop with no observable cause. The auth check now re-reads the token file on every request (64-byte read, kernel page cache, microsecond cost). The OPENCLAW_SERVER_TOKEN env override and the disabled opt-out are unchanged.
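A per-request token check along these lines might look like the sketch below. The function name and signature are hypothetical, the constant-time comparison is an addition made here rather than something the notes state, and the `OPENCLAW_SERVER_TOKEN` override and disabled opt-out are not modelled.

```typescript
import * as fs from "node:fs";
import * as crypto from "node:crypto";

// Hypothetical check: read the token file on every call instead of caching
// the value at startup, so another process rewriting the file cannot leave
// this server with a stale token.
function isAuthorized(presented: string, tokenFile: string): boolean {
  let expected: string;
  try {
    expected = fs.readFileSync(tokenFile, "utf8").trim();
  } catch {
    return false; // missing or unreadable token file: reject
  }
  if (expected.length === 0) return false;
  const a = Buffer.from(presented);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```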
Reopening a terminated autoloop run no longer hangs on "Waiting…"
autoloopStatus(runId) previously returned undefined for any run that wasn't in this process's in-memory map, so the dashboard's /autoloop/<id>/state fetch 404'd on every terminated run and the UI stayed on its "Waiting…" placeholder forever. autoloopStatus now falls back to listAutoloopsFromRegistry and reconstructs a terminated-state shape from the on-disk ledger when there's no live runner. /push_log was refactored to go through autoloopStatus so it benefits from the same fallback. /events returns a single-shot SSE (snapshot + terminated event + close) for disk-only runs so the dashboard's existing handlers cleanly render history without hanging on a 404 EventSource.
Added
Chat history persistence + GET /autoloop/<id>/chat_history
Planner user-messages and Planner replies are now appended to <ledger>/chat.jsonl on every turn. The dashboard fetches this file on open and replays the conversation into the planner pane, so refreshing the page / re-opening a terminated run / coming back from a clawo serve restart no longer wipes the visible history. Returns [] for runs that predate this change.
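The append-and-replay behaviour can be illustrated with a small JSONL sketch. `appendChatEntry` is named elsewhere in these notes, but its signature here, the `readChatHistory` helper, and the entry fields are all assumptions for illustration.

```typescript
import * as fs from "node:fs";

// Hypothetical entry shape: the notes do not list the fields.
interface ChatEntry {
  who: string;
  text: string;
  ts: number;
}

// One JSON object per line, appended on every turn.
function appendChatEntry(file: string, entry: ChatEntry): void {
  fs.appendFileSync(file, JSON.stringify(entry) + "\n");
}

// Returns [] when the file does not exist, matching the behaviour described
// for runs that predate chat persistence.
function readChatHistory(file: string): ChatEntry[] {
  if (!fs.existsSync(file)) return [];
  return fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as ChatEntry);
}
```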
POST /autoloop/<id>/resume + Resume button
Terminated runs can now be brought back in-process:
- Look up the run in `~/.claw-orchestrator/autoloop-registry.jsonl`.
- Re-create the runner + dispatcher with the same `run_id` / workspace.
- `ensurePlanner` picks up the Planner's `claudeSessionId` from `persistedSessions` (now kept on disk because `dispatcher.shutdown` passes `keepPersisted: true` to `stopSession`) and Claude resumes the original conversation. Runs that predate this change have no persisted session: they get a fresh Planner with the same system prompt, while the dashboard replays `chat.jsonl` (when present) visually.
The dashboard surfaces a green Resume run button in the top bar whenever a run's status is terminated. Click → POST /resume → reconnect SSE.
Changed
SessionManager.stopSession(name, { keepPersisted? })
stopSession now accepts an opts bag. keepPersisted: true keeps the persistedSessions entry on disk so a later resume can re-attach the Claude session. Defaults to the old behaviour (entry deleted) so callers that haven't opted in are unaffected. Autoloop dispatcher.shutdown(...) passes keepPersisted: true automatically; autoloopDelete passes purge: true to ensure a real delete still scrubs everything.
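The opt-in semantics can be sketched with an in-memory stand-in. `SessionManagerSketch` and its fields are hypothetical; the real `SessionManager` persists to disk, and the `purge: true` path used by `autoloopDelete` is not modelled here.

```typescript
interface StopOptions {
  keepPersisted?: boolean;
}

// In-memory stand-in for illustration only; not the project's SessionManager.
class SessionManagerSketch {
  live = new Map<string, { pid: number }>();
  persistedSessions = new Map<string, { claudeSessionId: string }>();

  stopSession(name: string, opts: StopOptions = {}): void {
    this.live.delete(name);
    // Default (keepPersisted omitted or false) keeps the old behaviour: drop the entry.
    if (!opts.keepPersisted) this.persistedSessions.delete(name);
  }
}
```

Defaulting to deletion means existing callers keep their behaviour without changes, and only the resume path needs to pass the flag.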
Tests
- `embedded-server-launcher.test.ts` snapshots and restores the host's `~/.openclaw/server-token` so `npm test` no longer rotates the user's live dashboard token.
- New cases for `POST /autoloop/<id>/resume` (200 + 404) and `GET /autoloop/<id>/chat_history` (200 + 404).
v4.0.3 — autoloop chat + planner write-gating + auth/pid hardening
This release bundles three patches cut from a single coherent pass after v4.0.0:
Fixed — Dashboard auth token survives clawo serve restarts
EmbeddedServer regenerated the auth token on every construction, ignoring the on-disk ~/.openclaw/server-token. Every server restart invalidated browser cookies / open dashboard tabs / running CLI sessions. The server now reuses the persisted token (validated as ≥32 hex chars), only generating a fresh one when the file is missing or malformed. OPENCLAW_SERVER_TOKEN env override and disabled opt-out still take precedence; file remains mode 0600.
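The reuse-or-regenerate rule might look like the sketch below. The function name and the 32-byte token length are assumptions; the "at least 32 hex chars" validation and the 0600 mode come from the notes, and the `OPENCLAW_SERVER_TOKEN` override and disabled opt-out are left out.

```typescript
import * as fs from "node:fs";
import * as crypto from "node:crypto";

// At least 32 hexadecimal characters, per the validation rule in the notes.
const TOKEN_RE = /^[0-9a-f]{32,}$/i;

// Hypothetical loader: reuse the on-disk token when it looks valid, otherwise
// generate and persist a fresh one.
function loadOrCreateToken(tokenFile: string): string {
  try {
    const existing = fs.readFileSync(tokenFile, "utf8").trim();
    if (TOKEN_RE.test(existing)) return existing;
  } catch {
    // File missing or unreadable: fall through and create a new token.
  }
  const fresh = crypto.randomBytes(32).toString("hex"); // 64 hex chars (length assumed)
  fs.writeFileSync(tokenFile, fresh, { mode: 0o600 });
  fs.chmodSync(tokenFile, 0o600); // writeFileSync's mode only applies to newly created files
  return fresh;
}
```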
Fixed — session-pids.json no longer accumulates stale entries
SessionManager._savePids() unconditionally preserved entries from owners other than the current process, even after the owning SessionManager had exited. _savePids() now probes process.kill(ownerPid, 0) before keeping an other-owner entry; dead-owner rows are dropped.
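The liveness probe relies on signal 0, which performs the existence and permission checks without delivering a signal. The sketch below shows the idea; the `PidEntry` row shape and the `pruneDeadOwners` name are hypothetical, and treating `EPERM` as "alive" is a choice made here that the notes do not mention.

```typescript
// Returns true when a process with this pid exists.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0: error checking only, nothing is sent
    return true;
  } catch (err) {
    // EPERM: the process exists but we may not signal it (assumed to count as alive).
    // ESRCH: no such process.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Hypothetical row shape for session-pids.json.
interface PidEntry {
  pid: number;
  ownerPid: number;
}

// Keep our own rows unconditionally; keep other owners' rows only while the owner is alive.
function pruneDeadOwners(entries: PidEntry[], selfPid: number): PidEntry[] {
  return entries.filter((e) => e.ownerPid === selfPid || isProcessAlive(e.ownerPid));
}
```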
Changed — Planner is physically prevented from authoring deliverables (breaking)
The Planner is meant to design plans and delegate; the Coder is meant to produce deliverables. The fix moves the role boundary from soft (prompt rule) to hard (tool gating):
- Planner session now passes `disallowedTools: ['Write', 'Edit', 'MultiEdit', 'NotebookEdit']` to Claude Code. Read / Glob / Grep / Bash stay enabled so the Planner can still discover and audit the workspace.
- New autoloop tools `write_plan` and `write_goal` replace `write_plan_committed` / `write_goal_committed`. They take the full file `content` as a string + an optional `commit_message`. The orchestrator writes the file server-side, then commits. This is the Planner's only legitimate path to author plan.md / goal.json.
- All three system prompts (Planner / Coder / Reviewer) rewritten with hard rules at the top under an `# ABSOLUTE RULES` heading.
Behavioural breaking change for callers that reference the old write_plan_committed / write_goal_committed tool names — the orchestrator surfaces "unknown tool" warnings if it sees them.
Fixed (4.0.2) — POST /autoloop/<id>/chat 524 timeout behind a reverse proxy
Chat route is now fire-and-forget: validates the run is alive, dispatches the message, returns 202 { ok, queued: true } immediately. The Planner's reply streams back via /events as a planner_reply event (dashboard already subscribes). New planner_error SSE event surfaces runtime failures. Dashboard clears textarea on send and shows a pending "Planner is thinking…" placeholder.
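The fire-and-forget shape can be sketched independently of any HTTP framework. Everything named here (`handleChat`, its parameters, the 404 for a dead run) is an assumption for illustration; only the `202 { ok, queued: true }` response and the error-event idea come from the notes.

```typescript
interface ChatResult {
  status: number;
  body: { ok: boolean; queued?: boolean; error?: string };
}

// Hypothetical handler core: validate, start the dispatch without awaiting it,
// and answer immediately so a reverse proxy never waits on the Planner's turn.
function handleChat(
  runIsAlive: boolean,
  dispatch: () => Promise<void>,
  onError: (err: unknown) => void, // e.g. emit a planner_error SSE event
): ChatResult {
  if (!runIsAlive) {
    return { status: 404, body: { ok: false, error: "run not found" } }; // status code assumed
  }
  void dispatch().catch(onError); // deliberately not awaited
  return { status: 202, body: { ok: true, queued: true } };
}
```

The reply itself is delivered out of band, which is why the route no longer depends on how long the Planner takes to answer.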
Fixed (4.0.1) — Autoloop chat in the dashboard
Dashboard's Planner compose box was posting to /v1/openclaw/tools/autoloop_chat, which only exists as an MCP tool — not as an embedded-server HTTP route. Added POST /autoloop/<id>/chat and POST /autoloop/<id>/delete to embedded-server, plus a hover-revealed Delete button on autoloop rows in the sidebar.
Install
npm install -g @enderfga/claw-orchestrator
clawo serve # dashboard at http://127.0.0.1:18796/dashboard
v4.0.0 — ultraapp (Forge tab)
A 3-agent Opus council turns a 5-question interview into a deployed web app: Tailwind UI, BYOK, file-queue runtime, smoke test, all live at localhost:19000/forge/<slug>/.
What's in 4.0.0
- Forge tab + 14 MCP tools. `+ New` → walk the interview → `Start Build` → council writes a complete codebase, fix-on-failure drives `npm install && npm run build && npm test` (plus `docker build .` only in opt-in `--ultraapp-runtime docker` mode) to green, deploy registers the slug. Share-card URL appears in chat.
- Frontend quality §7 is binding. Council agents must capture Chrome-headless screenshots at 1440×900 AND 375×812 and visually inspect the PNGs before voting YES; source-code review is explicitly insufficient evidence. Every generated app uses a real styling system, four-state coverage on every async surface, drag-and-drop forms with previews + inline validation, and result presentation appropriate to type.
- Two runtime modes. `host` (default, spawns generated app as a regular Node process, works anywhere Node works), `docker` (opt-in for shared-host isolation).
- Done-mode iteration. Cosmetic chat ("make button green") runs an Opus patcher (diff + apply + validate + auto-revert + version snapshot); spec-delta chat ("also output a thumbnail") flips back to a focused interview + auto-rerun. Versions tagged `v1`, `v2`, … and switchable via Promote.
- Cross-process visibility. Dashboard now sees council and autoloop runs across plugin-side SessionManager and standalone `clawo serve` (council transcript enumerator + autoloop registry).
Engine compatibility
| Engine | CLI | Tested Version |
|---|---|---|
| Claude | `claude` | 2.1.126 |
| Codex | `codex` | 0.128.0 |
| Gemini | `gemini` | 0.36.0 |
| Cursor | `agent` | 2026.03.30 |
| OpenCode | `opencode` | 1.1.40 |
Install
npm install -g @enderfga/claw-orchestrator
clawo serve # dashboard at http://127.0.0.1:18796/dashboard
Full operator reference for ultraapp: skills/references/ultraapp.md. MCP host setup: skills/references/mcp.md.
v3.7.1 — RE2 for /session/grep
Fixed
`/session/grep` and the `session-grep` tool now use RE2 to compile user-supplied regex patterns. RE2 runs in linear time and never backtracks, so patterns like `(a+)+$` that could previously stall the Node event loop now complete in microseconds. Closes #64.
Thanks to @ybdesire for the report.
Note: RE2 does not support a handful of PCRE-only features (lookbehind, backreferences). Patterns using those features will now be rejected at compile time with an Invalid regex pattern error.
Full changelog: https://github.com/Enderfga/claw-orchestrator/blob/main/CHANGELOG.md