Commit 1842a70

Release v2026.4.29: cron sidecar + /api/system cold-path hardening (#27)
Closes #25, closes #26. See CHANGELOG.md for the full v2026.4.29 entry.
1 parent 9f2d484 commit 1842a70

26 files changed

Lines changed: 1438 additions & 128 deletions

CHANGELOG.md

Lines changed: 39 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,44 @@
11
# Changelog
22

3+
## v2026.4.29 — 2026-04-29
4+
5+
### Fixed
6+
7+
- **Cron table empty after OpenClaw v2026.4.20+ ([#25](https://github.com/mudrii/openclaw-dashboard/issues/25))** — OpenClaw v2026.4.20 split runtime cron state into a separate `~/.openclaw/cron/jobs-state.json` sidecar; the dashboard previously read only `jobs.json`, so the cron table rendered with blank `Last run` / `Next run` / `Last status` / `Last duration` columns. The refresh collector now merges the sidecar by `job.id`, with sidecar values winning wholesale and inline state preserved as the legacy fallback when the sidecar is absent. Dashboards now work against both pre- and post-v2026.4.20 OpenClaw installs.
8+
- **`/api/system` cold-start latency and Gateway Runtime stuck on "Loading…" ([#26](https://github.com/mudrii/openclaw-dashboard/issues/26))** — cold collections (no warm cache) could run 10–12s when the gateway was slow because version probes ran serially before the parallel host-metrics group; the frontend Gateway Runtime card had no fetch timeout and never repainted on `r.ok===false` or thrown errors, so it stayed on `Loading…` indefinitely.
9+
- **Backend**: introduced `system.coldPathTimeoutMs` (default 4000, validated [200, 15000]); `SystemService.refresh` now wraps the entire collection in `context.WithTimeout(ColdPathTimeoutMs)` and runs versions in parallel with runtime/host-metrics goroutines. Partial cold-path results return `degraded:true` rather than blocking on the slowest probe; the version cache is only updated on full success so a deadline-cancelled collection can never poison the cached version pair.
10+
- **Frontend**: `Sys.fetch()` now uses `AbortController` with a 6000ms ceiling (4000ms cold-path budget + jitter); on `r.ok===false` or thrown exception the new `renderGatewayDegraded(reason)` helper repaints the card with `State=Unavailable` and an explicit reason instead of leaving the placeholder text in place.
11+
- **Skills empty state**: `web/index.html` now falls back to a `No skills configured` empty-state element when `data.skills` is `null` or `[]`, matching the existing Git Log fallback pattern.
12+
- **`system.gatewayPort` default masked `ai.gatewayPort` inheritance**: `appconfig.Default()` pre-filled `SystemConfig.GatewayPort` with `18789`, which defeated the `Load()` fallback that was supposed to inherit from `ai.gatewayPort` when `system.gatewayPort` was omitted. The default is now zero so the inheritance path activates as documented; user-supplied values (either side) still win.
13+
- **systemd unit missing `Environment=`** — Linux `service install` generated a unit file with no `OPENCLAW_HOME` or `PATH`, so the daemonized binary could not locate the openclaw CLI or OpenClaw runtime on fresh machines. The unit template now emits both `Environment=` directives, computed from the install-time `OPENCLAW_HOME` env override (falling back to `~/.openclaw`) and a deduplicated `PATH` with system bins (`/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin`) appended.
14+
- **systemd `service install` did not pick up changed flags on reinstall** — the install path called `systemctl --user start`, which is a no-op when the unit is already running. Switched to `systemctl --user restart` so reinstalls with changed `--bind` / `--port` / `Environment` actually apply; `restart` also starts a stopped unit so first-installs still work.
15+
- **Latest-version fetcher races with test cleanup** — the `getLatestVersionCached` background goroutine read a package-level `fetchLatestVersion` var that tests overrode during cleanup, occasionally producing data races under `-race`. Replaced with a per-instance `SystemService.fetchLatest` field set in the constructor; tests now isolate fully without touching shared state.
16+
- **Version banner double-`v` prefix** — when `BuildVersion` was injected via `-ldflags` (the `make build` path) and the `VERSION` file already started with `v`, the startup banner printed `vv2026.4.x`. Both `Main()` assignment sites now normalize via `strings.TrimPrefix(version, "v")` so the banner and `--version` flag agree on the rendered value.
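
The sidecar merge described in the #25 entry above can be sketched roughly as follows. This is a minimal illustration, not the dashboard's collector: the `Job` and `JobState` types, their field names, and `mergeSidecar` are hypothetical, and the real structures in `internal/apprefresh` are not shown in this diff.

```go
package main

import "fmt"

// JobState holds the runtime fields rendered in the cron table.
// Hypothetical shape; the real field names are not shown in the diff.
type JobState struct {
	LastRun    string
	NextRun    string
	LastStatus string
}

// Job is a hypothetical cron entry. State is the legacy inline state that
// pre-v2026.4.20 OpenClaw stored directly in jobs.json.
type Job struct {
	ID    string
	Name  string
	State *JobState
}

// mergeSidecar returns a copy of jobs in which every job whose ID appears in
// the sidecar takes the sidecar state wholesale. Jobs with no sidecar entry
// keep their inline state, which acts as the legacy fallback.
func mergeSidecar(jobs []Job, sidecar map[string]JobState) []Job {
	merged := make([]Job, len(jobs))
	copy(merged, jobs)
	for i := range merged {
		if st, ok := sidecar[merged[i].ID]; ok {
			s := st // copy so each job points at its own value
			merged[i].State = &s
		}
	}
	return merged
}

func main() {
	jobs := []Job{
		{ID: "a", Name: "backup", State: &JobState{LastStatus: "inline"}},
		{ID: "b", Name: "report", State: &JobState{LastStatus: "inline"}},
	}
	sidecar := map[string]JobState{
		"a": {LastStatus: "ok", LastRun: "2026-04-29T00:00:00Z"},
	}
	for _, j := range mergeSidecar(jobs, sidecar) {
		fmt.Println(j.ID, j.State.LastStatus)
	}
}
```

Passing a `nil` sidecar map models the "sidecar absent" case: every lookup misses, so all jobs keep their inline state.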
17+
18+
### Added
19+
20+
- **`system.coldPathTimeoutMs` configuration** — overall budget for a cold `/api/system` collection; defaults to 4000ms, validated [200, 15000]ms. Documented in `README.md` and `docs/CONFIGURATION.md`.
21+
- **Frontend `renderGatewayDegraded(reason)` helper** — paints the Gateway Runtime card with an explicit `State=Unavailable` plus reason on fetch timeout, network error, or `r.ok===false`, so the card always reaches a terminal state.
22+
- **Skills empty-state fallback** in `web/index.html`, matching the Git Log pattern.
23+
- **Cron sidecar regression coverage** — new `internal/apprefresh/cron_state_test.go` plus split-store and legacy-inline fixtures exercise sidecar-only, legacy-only, sidecar-missing-job-id, malformed-sidecar, both-present, and `lastRunStatus` fallback paths.
24+
- **Cold-path regression coverage** — new `internal/appsystem/cold_path_test.go` asserts the deadline is honoured, `degraded:true` is set on partial collection, host metrics still ship when gateway probes hang, and a deadline-cancelled collection cannot poison the version cache.
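
As a rough illustration of the `system.coldPathTimeoutMs` bounds listed above (default 4000, valid range [200, 15000]), the sketch below resolves a configured value into a `time.Duration`. The function name, the treatment of `0` as "unset", and the choice to reject out-of-range values with an error are all assumptions; the changelog does not say how the real validator reacts.

```go
package main

import (
	"fmt"
	"time"
)

const (
	defaultColdPathTimeoutMs = 4000  // default from the changelog
	minColdPathTimeoutMs     = 200   // lower validation bound
	maxColdPathTimeoutMs     = 15000 // upper validation bound
)

// coldPathTimeout is a hypothetical helper. It treats 0 as "not configured"
// and returns an error for values outside the documented range. Both
// behaviours are assumptions made for this sketch.
func coldPathTimeout(ms int) (time.Duration, error) {
	if ms == 0 {
		ms = defaultColdPathTimeoutMs
	}
	if ms < minColdPathTimeoutMs || ms > maxColdPathTimeoutMs {
		return 0, fmt.Errorf("system.coldPathTimeoutMs %d out of range [%d, %d]",
			ms, minColdPathTimeoutMs, maxColdPathTimeoutMs)
	}
	return time.Duration(ms) * time.Millisecond, nil
}

func main() {
	for _, ms := range []int{0, 200, 15000, 199, 20000} {
		d, err := coldPathTimeout(ms)
		fmt.Println(ms, d, err)
	}
}
```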
25+
26+
### Changed
27+
28+
- **`/api/system` cold path is now fully parallel** — versions, gateway runtime, and host metrics goroutines all run inside the same bounded `context.WithTimeout`. Previously versions ran serially before the parallel block.
29+
- **CORS loopback-reflection invariants are now documented in code**: `internal/appserver/server_routes.go` carries an inline doc comment above `setCORSHeaders` explaining why arbitrary `localhost:*` / `127.0.0.1:*` / `[::1]:*` origins are reflected (loopback bind by default, no `Allow-Credentials`, server-side gateway token, rate-limited `/api/chat`).
30+
- **Planning doc preserved** — the issue #25/#26 fix plan moved to `docs/plans/2026-04-29-issue-25-26-fix-plan.md` alongside other historical planning docs.
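
The bounded, fully parallel cold path described in the first Changed entry might look roughly like the sketch below. It is not the `SystemService.refresh` implementation: the `probe` type, the `result` struct, `collect`, and the two demo probes are hypothetical names. Only `context.WithTimeout` and the `degraded` flag come from the changelog.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// probe is a hypothetical collector (versions, gateway runtime, host metrics).
type probe func(ctx context.Context) (string, error)

// result is a hypothetical response shape; Degraded mirrors `degraded:true`.
type result struct {
	Values   map[string]string
	Degraded bool
}

// collect runs every probe concurrently under one shared deadline and
// returns whatever finished in time, marking the result degraded if any
// probe failed or the budget ran out.
func collect(parent context.Context, budget time.Duration, probes map[string]probe) result {
	ctx, cancel := context.WithTimeout(parent, budget)
	defer cancel()

	type item struct {
		name, value string
		err         error
	}
	// Buffered so probes that finish after the deadline never block.
	ch := make(chan item, len(probes))
	for name, p := range probes {
		go func(name string, p probe) {
			v, err := p(ctx)
			ch <- item{name, v, err}
		}(name, p)
	}

	res := result{Values: make(map[string]string, len(probes))}
	for i := 0; i < len(probes); i++ {
		select {
		case it := <-ch:
			if it.err != nil {
				res.Degraded = true
				continue
			}
			res.Values[it.name] = it.value
		case <-ctx.Done():
			res.Degraded = true
			return res
		}
	}
	return res
}

// hostMetrics answers immediately; hungGateway blocks until the deadline.
func hostMetrics(ctx context.Context) (string, error) { return "cpu=12%", nil }
func hungGateway(ctx context.Context) (string, error) {
	<-ctx.Done()
	return "", ctx.Err()
}

func demo() result {
	return collect(context.Background(), 50*time.Millisecond, map[string]probe{
		"host":    hostMetrics,
		"gateway": hungGateway,
	})
}

func main() {
	res := demo()
	fmt.Println("degraded:", res.Degraded, "values:", res.Values)
}
```

A caller following the changelog's cache rule would write to the version cache only when `Degraded` is false, so a deadline-cancelled collection never replaces a good cached value.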
31+
32+
### Security
33+
34+
- **GitHub external link hardened**: added `rel="noopener noreferrer"` to the `target="_blank"` link in the dashboard header so the linked tab cannot reach back to `window.opener`. Browsers have treated `target="_blank"` as implying `noopener` since 2021, but the explicit attribute satisfies static auditors and covers older browsers.
35+
36+
### Documentation
37+
38+
- **`README.md`** and **`docs/CONFIGURATION.md`**: added rows for `system.coldPathTimeoutMs` and `system.gatewayPort` (including the inheritance behaviour). `examples/config.full.json` now includes the full `system` block, which was previously missing.
39+
40+
---
41+
342
## v2026.4.13 — 2026-04-13
443

544
### Added

README.md

Lines changed: 2 additions & 0 deletions
@@ -393,6 +393,8 @@ is usually the repo root. For `install.sh` installs it is
393393
| `system.metricsTtlSeconds` | `10` | Server-side metrics cache TTL (seconds) |
394394
| `system.versionsTtlSeconds` | `300` | Version/gateway probe cache TTL (seconds) |
395395
| `system.gatewayTimeoutMs` | `5000` | Timeout for gateway liveness probe (ms) |
396+
| `system.coldPathTimeoutMs` | `4000` | Overall budget for a cold `/api/system` collection (ms) |
397+
| `system.gatewayPort` | `18789` | Gateway port for health probes (defaults to `ai.gatewayPort`) |
396398
| `system.diskPath` | `"/"` | Filesystem path to report disk usage for |
397399
| `system.warnPercent` | `70` | Global warn threshold (% used) — overridden by per-metric values |
398400
| `system.criticalPercent` | `85` | Global critical threshold (% used) — overridden by per-metric values |

TODO.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
22

33
## ✅ Released
44

5+
- **v2026.4.29**: cron sidecar merge (#25), `/api/system` cold-path deadline + degraded fallback (#26), `system.gatewayPort` inheritance fix, systemd `Environment=` + `restart` on reinstall, per-instance latest-version fetcher
6+
- **v2026.4.13**: diagnostics + log visibility (#14), release hardening pass, structured logging cleanup
57
- Built-in service management (`install`/`uninstall`/`start`/`stop`/`restart`/`status`) via launchd (macOS) and systemd (Linux)
68
- Security hardening (XSS, CORS, O(N²), shell safety, file handles)
79
- Performance, dirty-checking & test suite (initial 44 ACs, rAF, scroll preserve, tab fix)

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
v2026.4.8
1+
v2026.4.29
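
The `VERSION` file keeps its leading `v`, which is what triggered the double-`v` banner noted in the changelog. A minimal sketch of the normalization follows, assuming hypothetical `normalizeVersion` and `banner` helpers. The `strings.TrimPrefix(version, "v")` call is the one named in the changelog; the `strings.TrimSpace` step is an added assumption to cope with a trailing newline in the file.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeVersion strips surrounding whitespace and a single leading "v".
// Hypothetical helper; the changelog only states that TrimPrefix is applied.
func normalizeVersion(version string) string {
	return strings.TrimPrefix(strings.TrimSpace(version), "v")
}

// banner renders the version with exactly one "v" prefix.
func banner(version string) string {
	return "v" + normalizeVersion(version)
}

func main() {
	fmt.Println(banner("v2026.4.29\n")) // value read from a VERSION file
	fmt.Println(banner("2026.4.29"))    // value injected without a prefix
}
```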

docs/CONFIGURATION.md

Lines changed: 2 additions & 0 deletions
@@ -49,6 +49,7 @@ the binary reports the installed release version correctly after upgrades.
4949
"metricsTtlSeconds": 10,
5050
"versionsTtlSeconds": 300,
5151
"gatewayTimeoutMs": 5000,
52+
"coldPathTimeoutMs": 4000,
5253
"gatewayPort": 18789,
5354
"diskPath": "/",
5455
"warnPercent": 70,
@@ -174,6 +175,7 @@ To change the OpenClaw data directory, set the `OPENCLAW_HOME` environment varia
174175
| `system.metricsTtlSeconds` | number | `10` | Server-side metrics cache TTL (2-60 seconds) |
175176
| `system.versionsTtlSeconds` | number | `300` | Version/gateway probe cache TTL (30-3600 seconds) |
176177
| `system.gatewayTimeoutMs` | number | `5000` | Timeout for gateway liveness probe (200-15000 ms) |
178+
| `system.coldPathTimeoutMs` | number | `4000` | Overall budget for a cold `/api/system` collection — bounds total wall time when no warm cache is available (200-15000 ms) |
177179
| `system.gatewayPort` | number | `18789` | Gateway port for health probes (defaults to `ai.gatewayPort`) |
178180
| `system.diskPath` | string | `"/"` | Filesystem path to report disk usage for |
179181
| `system.warnPercent` | number | `70` | Global warn threshold (% used) — overridden by per-metric values |
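
The `system.gatewayPort` row in this table says the value defaults to `ai.gatewayPort`. One way to read that inheritance is sketched below, with the zero value meaning "not set". The struct names, `resolveGatewayPorts`, and the assumption that `ai.gatewayPort` itself defaults to `18789` are illustrative guesses, not the `appconfig` package's actual code.

```go
package main

import "fmt"

// Hypothetical config shapes covering only the two fields discussed here.
type AIConfig struct{ GatewayPort int }
type SystemConfig struct{ GatewayPort int }
type Config struct {
	AI     AIConfig
	System SystemConfig
}

// Assumed default for ai.gatewayPort, taken from the documented 18789.
const defaultGatewayPort = 18789

// resolveGatewayPorts fills unset (zero) ports. system.gatewayPort inherits
// from ai.gatewayPort only when the user did not supply it, so an explicit
// value on either side still wins.
func resolveGatewayPorts(c Config) Config {
	if c.AI.GatewayPort == 0 {
		c.AI.GatewayPort = defaultGatewayPort
	}
	if c.System.GatewayPort == 0 {
		c.System.GatewayPort = c.AI.GatewayPort
	}
	return c
}

func main() {
	inherited := resolveGatewayPorts(Config{AI: AIConfig{GatewayPort: 9000}})
	explicit := resolveGatewayPorts(Config{
		AI:     AIConfig{GatewayPort: 9000},
		System: SystemConfig{GatewayPort: 7000},
	})
	fmt.Println(inherited.System.GatewayPort, explicit.System.GatewayPort)
}
```

Pre-filling `System.GatewayPort` with a non-zero default before this step would skip the second branch, which is the masking bug the changelog describes.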
