Skip to content

Latest commit

 

History

History
2452 lines (2148 loc) · 156 KB

File metadata and controls

2452 lines (2148 loc) · 156 KB

Changelog

All notable changes to DockPanel will be documented in this file.

The format is based on Keep a Changelog.

[Unreleased]

[2.10.2] - 2026-06-07

Fixes two fresh-install blockers reported in #70 and #71.

Fixed

  • Install failed on Debian 12 with GLIBC_2.38 / GLIBC_2.39 not found (#70). Release binaries were built on ubuntu-latest (now Ubuntu 24.04, glibc 2.39), so the dynamically-linked agent/API/CLI demanded glibc ≥ 2.38 and the agent refused to start on Debian 12 (glibc 2.36). The same break silently affected the rest of the documented support matrix — Ubuntu 20.04, Debian 11, CentOS 9, Rocky 9, Amazon Linux 2023 all ship glibc ≤ 2.34. The release workflow now builds fully static musl binaries (x86_64-unknown-linux-musl / aarch64-unknown-linux-musl) via cargo-zigbuild, so the binaries carry zero glibc dependency and run on any modern Linux regardless of distro libc version (ldd reports "statically linked"). DockPanel's TLS stack is entirely rustls, so there is no OpenSSL system dependency to block static linking.
  • First login bounced straight back to the login screen on a domain install served over HTTP (#71). When a domain is supplied at setup, setup.sh writes BASE_URL=https://<domain>, but the panel vhost is served over plain HTTP until TLS is added. The cookie helper keyed the Secure flag off BASE_URL, so it stamped Secure on the session cookie even though the response left over HTTP — the browser silently dropped the cookie and the next /api/auth/me 401'd, bouncing the user back to the login screen in every browser. This was the case the #47 fix left open (it only removed the empty-BASE_URL path). routes/auth.rs::cookie_secure_flag and the OAuth callback now derive Secure solely from the actual request scheme (X-Forwarded-Proto, which nginx always sets and is authoritative because the API only listens on 127.0.0.1 behind the proxy). HTTPS installs still get a Secure cookie; HTTP-served installs no longer bounce.

[2.10.1] - 2026-05-20

Hotfix for the v2.8.22 webmail reverse-proxy regression reported in #57.

Fixed

  • Webmail "Open" landed on the panel dashboard instead of Roundcube login. Roundcube emits root-anchored URLs in its HTML (form action="/?_task=login") and inline JS (comm_path: "/?_task=login") — it has no concept that it lives under /webmail/ on the panel vhost. The v2.8.22 nginx fragment proxied with proxy_redirect off and no body rewriting, so the browser navigated to /?_task=login → hit the panel's location / block → rendered the React SPA (dashboard). The CPU spike in the report was Roundcube's container booting on first hit. Fix in panel/agent/src/routes/mail.rs:926: added proxy_redirect / /webmail/; to rewrite 30x Location: headers and sub_filter '"/?_task=' '"/webmail/?_task='; to rewrite embedded URLs in HTML/JSON/JS bodies. Also clears Accept-Encoding to upstream so sub_filter receives uncompressed responses.
  • Auto-heal for existing webmail installs. v2.8.22 → v2.10.0 boxes already have the broken fragment on disk; the agent only writes on Install click, so users would have to Remove + Install to recover. scripts/update.sh:428 detects the old shape (no sub_filter line) and regenerates the fragment from the current template, using the current Roundcube container's host port from docker inspect.

[2.10.0] - 2026-05-16

Phase 4 W4 ships panel self-update from the UI with health-check rollback, persistent snapshots, update channels (stable / candidate / hold), and fleet rolling updates.

The reframing matters: scripts/update.sh:430-499 already ships a production-tested binary-swap + .bak-restore rollback flow. v2.10.0 does NOT reimplement that — the new orchestrator (services/panel_update.rs) shells out to the same script under a controlled environment (DOCKPANEL_NO_SELF_REFRESH=1 + DOCKPANEL_VERSION=<target>) so every bug fix already in update.sh (self-refresh ordering from v2.8.15-16, lock-wait conf from v2.8.17, fragment-include awk migration from v2.8.22, ACME cooldown from v2.8.23) keeps working unchanged. The new code is a presentation + persistence layer over a proven core.

Added

  • Apply Update button in Telemetry → Updates tab. Click → confirm modal showing target version + 4-step preview ("snapshot, download, swap, probe") → status modal that polls /api/update/status every 2s. Replaces the static SSH copy-paste block.
  • Pre-update snapshot service (services/panel_snapshot.rs). Each Apply call first writes a tar.gz triplet to /var/backups/dockpanel/snapshots/ containing binaries/{agent,api,cli} + db/dump.sql.gz (pg_dump --clean --if-exists) + etc/dockpanel/ + metadata.json. Written to .tmp then atomically renamed; DB row only inserted after rename succeeds. Refuses to create when the snapshot partition has less than 2 GiB free.
  • Operator-triggered rollback from the UI. Each snapshot row has a Roll back button (confirms then restores binaries + DB + /etc and bounces services). Reach back to any retained snapshot, not just the .bak that update.sh keeps for ~30 seconds.
  • Update channels: stable (GA only — current default behaviour), candidate (includes prerelease: true builds, takes the first by published_at desc), hold (skips the 6h auto-poll entirely; Manual Check button still works). Channel selector in Updates tab, single settings.update_channel row.
  • Fleet rolling update. Operator-initiated form in Updates tab: target version + halt-on-failure + include-panel toggles. Plan = all user-owned remote servers reachable in the last 5 minutes, sorted oldest agent_version first. POSTs to each agent's new /panel/update endpoint, polls /panel/update/status for terminal state, records per-server progress in fleet_update_runs.progress JSONB. Halts on first failure unless halt_on_failure: false.
  • Agent-side /panel/update receiver (panel/agent/src/routes/panel_update.rs). Distinct from the existing OS-package /system/updates/* endpoints. Bearer-auth, returns 202, spawns update.sh detached so the agent's own systemctl restart doesn't break the subprocess pipeline.
  • 10 new admin endpoints under /api/update/* + /api/snapshots/*: GET /status, POST /apply, POST /manual-check, POST /rollback, GET+PUT /channel, GET+POST /api/snapshots, DELETE /api/snapshots/{id}, GET+POST /update/fleet, GET /update/fleet/{id}. All admin-gated.
  • Snapshot retention sweep wired into the existing 24h run_retention_cleanup ticker in auto_healer.rs. Always-keep last 3 snapshots regardless of age; delete anything older than 7 days beyond that floor. File-delete first; DB row stays for retry if the file delete fails.
  • Startup finalize hook (finalize_pending_on_startup in services/panel_update.rs). Closes out any panel_snapshots rows with to_version IS NULL after a process restart by writing to_version = CARGO_PKG_VERSION. Equal from_version/to_version on a finalized row indicates update.sh's in-flight rollback fired; differing values indicate a successful apply.

Changed

  • Update poller honours update_channel (services/telemetry_collector.rs:302). hold skips the poll; candidate widens to /releases?per_page=20 (first by published_at desc); stable keeps the existing /releases/latest URL bit-for-bit.
  • scripts/update.sh accepts two new env vars. DOCKPANEL_NO_SELF_REFRESH=1 bypasses the v2.7.13 self-refresh block so the orchestrator can stream a single subprocess invocation's stdout into its state machine without a mid-flight re-exec breaking the pipe (SSH-operator flow keeps self-refresh on by default). DOCKPANEL_VERSION=vX.Y.Z pins the release tag instead of fetching /releases/latest, so a candidate-channel pick can't race a GA publish between the panel's poll and the operator's click.

Migration

  • One settings row inserted (update_channel = 'stable' — the implicit pre-W4 default). Two new empty tables (panel_snapshots, fleet_update_runs). No ALTER on existing tables; every install keeps current behaviour until an admin clicks Apply or changes the channel. Migration file: 20260520000000_panel_self_update.sql.

Operator notes

  • Snapshots consume disk: ~150-300 MB each typical, retained 7 days (last 3 always kept). Stored under /var/backups/dockpanel/snapshots/. A free-disk pre-check refuses to create a snapshot if the partition has less than 2 GiB free.
  • update.sh's SSH-only flow continues working unchanged — no operator forced to use the UI.
  • Cosign signature verification at download time is not in W4. HTTPS-to-GitHub is the existing trust boundary; cosign verify is a separate hardening pass (non-trivial key management).
  • The api process will be killed mid-binary-swap by update.sh's systemctl stop dockpanel-api. The orchestrator state lives in the DB rows (panel_snapshots.to_version); the new process boots Idle and the finalize hook closes out the in-flight row.

[2.9.0] - 2026-05-16

Phase 4 W3 ships on-call rotations and escalation policies. A small team can now self-host their on-call schedule in DockPanel — when an alert fires, the panel pages whoever is on-call right now (not every channel on the rule); if the alert isn't acknowledged within a policy-defined threshold, the panel routes the page to the next step in the chain.

Larger teams that already pay for PagerDuty keep using PagerDuty — the escalation policy supports a webhook:<url> route shape that forwards directly into their existing PD service key.

Added

  • on_call_schedules table + admin tab. A rotation = ordered list of user IDs plus a cadence in days (1–90). "Who's on-call at time T" is one-liner cadence math against an anchor; no calendar widget, no per-day overrides, no holiday handling. New endpoints GET/POST/PUT/DELETE /api/on-call/schedules[/{id}] (admin) and GET /api/on-call/whoami (any authenticated user) for "am I on the hook right now?"
  • escalation_policies table + admin tab. Policies are an ordered JSONB array of {after_minutes, route} steps. Routes are discriminated: on_call_schedule:<uuid> resolves to the current rotation holder, user:<uuid> pages a specific user, all_channels preserves the pre-W3 default (alert owner's channels), webhook:<url> is a direct outbound webhook bypass. New endpoints GET/POST/PUT/DELETE /api/escalation-policies[/{id}] (admin).
  • Per-alert-rule policy attachment. alert_rules gains a nullable escalation_policy_id FK. NULL = pre-W3 hardcoded 15-min unack → 30-min re-page (unchanged for every existing rule). Admin-only attach endpoint at PUT /api/alert-rules/{rule_id}/escalation-policy.
  • Ack actor + optional comment. PUT /api/alerts/{id}/acknowledge now accepts an optional { "comment": "..." } body (500-char cap) and stores both acknowledged_by (the actor) and acknowledged_comment. Older clients that PUT with no body keep working — they just don't carry a comment. The UI surfaces actor email + truncated comment inline on each acked alert row.
  • Frontend tabs. Alerts page grows two new tabs alongside Alerts + Runbooks: an On-call editor (rotation CRUD with reorderable member list) and an Escalation policies editor (step chain with route picker
    • live route description).

Changed

  • Escalation pages now carry the runbook payload. Phase 4 W2 added the runbook excerpt + URL to fire payloads via send_notification_with_runbook, but check_escalations was still calling bare send_notification — so re-pages on unacknowledged alerts lost the runbook context that the original fire had carried. The W3 rewrite of check_escalations extracts the shared load_runbook_payload helper so fire and escalation paths produce identical payloads.

Migration

No manual action is required. escalation_policy_id is added to alert_rules as a nullable FK with default NULL — every existing rule keeps its pre-W3 behaviour bit-for-bit. The three new alerts columns (acknowledged_by, acknowledged_comment, escalation_step_index) default to NULL/0 on existing rows.

[2.8.23] - 2026-05-16

Changed

  • SSL renewal cadence is now profile-aware. The auto-healer previously used a hardcoded 6h cooldown for both ARI re-fetch (RFC 9773) and the post-attempt retry. For the new shortlived profile (6-day certs whose renewal window is only ~4 days wide), 6h was 6% of the cert lifetime — a CA-issued early-renew nudge could be missed by a full quarter-day, and a failed attempt near expiry could burn the whole window. The cooldown is now 1h for shortlived and stays at 6h for tlsserver (45d) and classic (90d → 64d → 45d across the LE roadmap). Lets-Encrypt's tlsserver profile transitioned to 45-day issuance on 2026-05-13, which is what unblocked this change.

Added

  • New Prometheus counter: dockpanel_cert_renewals_total{result} with result="success" and result="failure" labels. Tracks auto-healer renewal attempts so operators can graph trend and alert on rate(...{result="failure"}[1h]). The counter is process-local (resets across restart — Prometheus increase() handles that gracefully) and adds zero DB queries per scrape.

    Exposed at /metrics alongside the existing dockpanel_info, dockpanel_site_count, and dockpanel_alerts_firing gauges.

[2.8.22] - 2026-05-16

Fixed

  • Webmail "Open" button on the Mail page was unreachable ([#57] third finding by @WiskeyPapa). The Roundcube container is bound to 127.0.0.1:8888 on the host (loopback only — never exposed to the public IP for security), but the frontend Open button generated a http://<panel-hostname>:8888 URL. That URL has nothing listening on it (and Cloudflare doesn't proxy port 8888 anyway), so the button produced a hang / connection refused.

    Fixed by reverse-proxying Roundcube under /webmail/ on the existing panel nginx vhost via a drop-in fragment file. The frontend Open URL is now ${origin}/webmail/ — same-origin, inherits the panel's TLS, works on both HTTPS-with-domain and HTTP-on-IP installs. The Roundcube container also gets new env vars (ROUNDCUBEMAIL_PROXY_WHITELIST=127.0.0.1, plus ROUNDCUBEMAIL_TRUSTED_HOSTS and ROUNDCUBEMAIL_FORWARDED_PROTO=https when the panel has a configured server_name) so Roundcube accepts the forwarded headers and generates correct URLs behind the proxy.

Changed

  • webmail_install is now idempotent — clicking Install when an existing dockpanel-roundcube container is present tears it down before recreating, so env-var additions across releases (like the v2.8.22 proxy/trusted-hosts envs) apply automatically on next Install click. Users who deployed Roundcube on v2.8.20/v2.8.21 just need to click Install again, which now rebuilds in place. The webmail_remove endpoint also tears down the panel-vhost reverse-proxy fragment for clean uninstall.

  • Panel nginx vhost gains an include /etc/nginx/conf.d/dockpanel-panel.locations/*.conf; directive (baked into scripts/setup.sh's vhost template; injected into existing vhosts by scripts/update.sh via an awk-based one-time migration — same shape as the v2.8.3 IPv6-listen migration). Drop-in directory for path-mounted tool reverse-proxies; webmail is the first user, but other tools (phpMyAdmin, Adminer) can use the same mechanism in the future.

Internal

  • New helper panel_server_name() in panel/agent/src/routes/mail.rs reads the panel vhost's server_name directive to drive Roundcube's TRUSTED_HOSTS computation — same approach update.sh uses to detect the panel domain for BASE_URL auto-population.
  • New helpers write_webmail_nginx(port) / remove_webmail_nginx() write the /webmail/ location fragment, validate via nginx -t, and reload on success. Failed validation unlinks the fragment so nginx is never left in a broken state.

[2.8.21] - 2026-05-16

Fixed

  • Firewall add/remove rule returned "Agent offline" with ufw: ERROR: '/etc/ufw/user.rules' is not writable ([#57] follow-up by @WiskeyPapa). The agent runs under ProtectSystem=strict with an explicit ReadWritePaths= allowlist, and /etc/ufw was never in that list. ufw status (read-only) worked, but writes to user.rules during add/delete were blocked by the sandbox mount. Added /etc/ufw and /var/lib/ufw to the canonical agent unit's ReadWritePaths, plus matching pre-create entries in scripts/setup.sh and scripts/update.sh so the namespace mount succeeds even on systems where ufw isn't installed yet. Same shape of fix as the v2.8.13 expansion that added /etc/modsecurity / /etc/cloudflared / /etc/postfix to the RWP list.

  • Dashboard "Set up backups" onboarding step stayed incomplete after a manual backup ran, and the card linked to Sites instead of the Backups page ([#57] follow-up by @WiskeyPapa). The completion check was sitesList.some(s => !!s.backup_schedule), but /api/sites doesn't return a backup_schedule field — so the check was always false regardless of how many backups had been created. Added a new GET /api/backup-setup-status endpoint (auth-gated, scoped by user) returning { has_schedule, has_backup } derived from real DB counts across backup_schedules, backups, database_backups, and volume_backups. Dashboard now fetches the status once and the card flips to complete as soon as any of those exist. Link retargeted from /sites/<id> to /backup-orchestrator (the global backup view).

[2.8.20] - 2026-05-15

Fixed

  • WAF install button stayed on "Install" after a successful install on Ubuntu 24.04 ([#57] follow-up by @WiskeyPapa). Ubuntu Noble's time_t-64 ABI transition renamed libmodsecurity3libmodsecurity3t64 as a virtual-provides (no transitional shim). The agent's install_status route was checking dpkg -l libmodsecurity3 literally, which never matches "ii" on Noble even though the install succeeded — frontend therefore kept showing "Install". Detect path in routes/service_installer.rs::install_status now accepts either name (OR-clause, matching the existing PHP fallback pattern). Same fix applied to uninstall_waf's apt purge list so uninstall on Noble actually removes the package instead of silently no-op'ing.

[2.8.19] - 2026-05-10

Fixed

  • Mail server install failed under the agent's strict sandbox ([#57] follow-up by @WiskeyPapa). routes/mail.rs::install_mail was running apt-get install via the sandboxed safe_command(...) wrapper, so the agent unit's ProtectSystem=strict made /var/lib/dpkg/lock-frontend read-only inside the namespace. apt printed Not using locking for read only lock file /var/lib/dpkg/lock-frontend warnings and then bailed when it tried to chown files. Switched the four apt-get call sites in mail.rs (install / purge / autoremove / rspamd install) to safe_command_unsandboxed, matching the #54-A pattern v2.8.14 applied to vmail useradd/groupadd (which lived right next to the apt-get call that this commit corrects). routes/system.rs::disk_cleanup's apt-get clean got the same treatment so /var/cache/apt actually gets cleared.
  • Cloudflare Tunnel install wrote a literal $(lsb_release -cs) into /etc/apt/sources.list.d/cloudflared.list ([#57] follow-up by @WiskeyPapa). The shell pipeline used single quotes around the echo argument, which prevents bash command substitution. Once the broken source landed, every subsequent apt-get update on the box failed (The repository '... $(lsb_release Release' does not have a Release file), blocking unrelated installs (Redis, WAF, Mail Server). Pre-resolve VERSION_CODENAME from /etc/os-release in Rust and printf the source line with the actual codename — also drops the lsb-release package dependency on minimal Debian images. Defensive: on install failure, delete a half-written source file so it doesn't break the rest of apt.
  • update.sh now repairs an existing broken cloudflared apt source on upgrade. Operators who already hit the bug get auto-cleanup on INSTALL_FROM_RELEASE=1 bash update.sh — no manual rm needed. Looks for the literal $(lsb_release string in /etc/apt/sources.list.d/cloudflared.list and removes the file if found.

[2.8.18] - 2026-05-06

Added

  • Phase 4 W2: Alert runbooks attached to fired alerts. Markdown text per alert type, indexed by alert_type. Excerpts (280 char, truncated at sentence boundary) ride along in slack/discord/pagerduty/webhook payloads; full markdown is rendered into email HTML and into the new Alerts page row expansion. Operator-edited runbooks survive panel upgrades by construction (apply-defaults uses ON CONFLICT DO NOTHING and never overwrites edits). Resolution is DB-row-then-default — fresh installs produce useful payloads from the compile-time const slice without the operator having to seed first.
  • 15 default runbooks shipped with the panel (panel/backend/runbooks/): 5 critical (offline / service_down / container_crashloop / backup_failure / gpu_temperature), 9 warning (cpu / memory / disk / disk_forecast / ssl_expiry / container_unhealthy / gpu_utilization / gpu_vram / memory_leak), 1 info (container_down). Each follows the same shape: Why this fired → First check → Common causes → Escalation. Authored for paging-grade discipline (info alerts won't wake anyone, critical ones page clearly).
  • Alerts → Runbooks tab with per-type list, edit modal (split textarea + live markdown preview, severity selector, Restore-default button), and "Seed missing default runbooks" action with insert-or-skip confirmation.
  • Inline runbook expansion on each fired alert — click any row in the Alerts list, the runbook for that alert type is fetched and rendered below the alert detail. Targets the W2 acceptance bar: an admin paged at 3am sees the runbook in the page, not as a "go look at our wiki" link.
  • 5 admin-only API endpoints under /api/alerts/runbooks: GET (list with is_default flag), GET {alert_type} (single), PUT {alert_type} (upsert, 50KB cap, severity validated), DELETE (restore default by removing DB row), POST apply-defaults (insert-or-skip from const slice, returns { inserted, skipped }).

Changed

  • services/notifications.rs::try_fire_alert now resolves a runbook by alert_type and threads runbook_excerpt + runbook_url through a new send_notification_with_runbook helper (the existing send_notification is unchanged, so the 14 non-alert callers across auto_healer, uptime, security_hardening, git_deploys, and incidents stay on the original API). Email gets full pulldown-cmark-rendered HTML appended to the body, slack and discord get a link plus excerpt, pagerduty extends custom_details, generic webhook adds runbook_url + runbook_excerpt as top-level keys.
  • New backend dep: pulldown-cmark = "0.10" (no_std-capable, ~50KB binary impact, fuzz-tested upstream; rendered output wrapped in catch_unwind defensively with HTML-escape fallback).
  • New frontend deps: marked@^14 + dompurify@^3 (~51KB gzipped combined). DOMPurify is non-negotiable defense-in-depth — runbook markdown is admin-authored but stored in DB and editable via API.
  • Email template variables now include {{runbook_excerpt}} and {{runbook_url}} alongside the existing {{title}}/{{message}}/ {{severity}}/{{timestamp}}. Backwards-compatible: existing custom templates ignore unknown placeholders.
  • Migration 20260507000000_alert_runbooks.sql adds the table: alert_runbooks(alert_type TEXT PK, runbook_md TEXT, severity_default TEXT CHECK (info|warning|critical), updated_by UUID FK users(id) ON DELETE SET NULL, updated_at TIMESTAMPTZ).

[2.8.17] - 2026-05-06

Fixed

  • Agent installers failed with Could not get lock /var/lib/dpkg/lock-frontend when another apt was running (#57 follow-up). On fresh Debian 13 boots, unattended-upgrades runs in the background and holds the dpkg frontend lock for several minutes. The panel UI's Install PHP 8.4 (and any other agent-driven apt install/purge — services, updates) failed immediately on contention instead of waiting. Both setup.sh (fresh installs) and update.sh (existing operators) now drop /etc/apt/apt.conf.d/99-dockpanel-lock-wait.conf setting DPkg::Lock::Timeout "300"; — every apt invocation on the system (agent and otherwise) now waits up to 5 minutes for the dpkg lock before giving up. No agent code change needed; the config file is read fresh on every apt run. Verified end-to-end on Debian 13 Trixie: python3 fcntl.lockf holding the dpkg lock for 15 s → apt-get install waits 15 s and succeeds (vs. 0 s fail-fast pre-fix).
  • Settings → Services → Install Redis (and Node.js, Composer, WAF, Cloudflare Tunnel) returned 404 (#57 follow-up). Latent backend gap since these services were added: the agent has full install/uninstall implementations in panel/agent/src/routes/service_installer.rs, but the backend's routes/mod.rs only proxied install for php/certbot/ufw/fail2ban/ powerdns. Frontend POST to /api/services/install/redis (and the other four) hit a non-existent route and returned 404 before reaching the agent. Added the 5 missing install handlers + the 2 missing uninstall handlers (waf, cloudflared) in panel/backend/src/routes/system.rs and registered all 7 routes in routes/mod.rs. Each handler is a 5-line proxy mirroring the existing pattern.

[2.8.16] - 2026-05-06

Fixed

  • PHP install failed on Debian 13 (trixie) (#57). setup.sh hardcoded PHP_VER=8.3 and reached for add-apt-repository -y ppa:ondrej/php whenever apt-cache show php8.3 returned nothing — but trixie ships PHP 8.4 in its default repo, and ppa:ondrej/php is an Ubuntu PPA that has no packages built for trixie. Fresh Debian 13 installs hit "PHP 8.3 installation failed" and ended up with no PHP at all. New flow: try the default-repo php-fpm metapackage first (covers Debian 13/12 and Ubuntu 24.04 cleanly with whatever PHP version each distro ships), fall back to deb.sury.org for older Debian or ppa:ondrej/php for Ubuntu when the default repo can't satisfy the install. Same Debian-vs-Ubuntu split applied to the panel-driven PHP installer in panel/agent/src/routes/php.rs so Settings → Services → Install PHP works on Debian too.
  • update.sh self-refresh never fired on the default code path. Mode auto-detection (INSTALL_FROM_RELEASE=1 when no Rust toolchain / no source) ran after the self-refresh check, so a user running plain bash /opt/dockpanel/scripts/update.sh entered with INSTALL_FROM_RELEASE=0, failed the self-refresh gate, and then got bumped to 1 by auto-detect — but with the stale local script still executing. Effect: pre-v2.8.16 panels swapped binaries to the latest release just fine, but never picked up script-side fixes (unit-file deploys, nginx config tweaks, install-agent.sh drop into FE_DIST). That's why issue #56 resurfaced after the v2.8.14 fix shipped — operators on v2.8.13 ran update.sh and stayed on v2.8.13's update.sh logic. Fix: move mode detection ahead of the self-refresh block so INSTALL_FROM_RELEASE is correct by the time the gate evaluates. Operators on v2.8.13/v2.8.14/v2.8.15 should run INSTALL_FROM_RELEASE=1 bash /opt/dockpanel/scripts/update.sh once to trigger self-refresh; from v2.8.16 onward, plain bash update.sh works.
  • PHP 8.4 not detected as installed. panel/agent/src/routes/service_installer.rs enumerated php8.{1,2,3}-fpm to determine if PHP was installed/running, so a Debian 13 install (which lands PHP 8.4 from the default repo) was reported as "PHP not installed" in Settings → Services even when it was running fine. Added php8.4-fpm to both checks.

[2.8.15] - 2026-05-06

Fixed

  • update.sh skipped the repo sync in INSTALL_FROM_RELEASE=1 mode, so v2.8.14's canonical-unit changes never deployed on the standard upgrade path. Found by the v2.8.13 → v2.8.14 VPS upgrade test: binaries upgraded to v2.8.14 successfully, but the systemd unit file on disk was still v2.8.13's content (no RuntimeDirectory=dockpanel, no /var/cache/nginx in ReadWritePaths=). Root cause: line 106 gated git pull behind INSTALL_FROM_RELEASE != 1, but the code at line 215 deploys the canonical unit from $AGENT_SRC regardless of mode. Same family as the v2.8.13 "dev fiction" bug — canonical file in repo, installer reads stale on-disk copy.
    • git pull --ff-only also didn't cover installs cloned with -b v2.8.13 (or any explicit tag — they end up on a detached HEAD with no local main). Replaced the conditional with an unconditional git fetch --depth=1 origin main + git reset --hard FETCH_HEAD so the canonical unit, nginx templates, and install-agent.sh are always at the latest origin/main when update.sh runs. Operators who already upgraded to v2.8.14 via bash update.sh should re-run it on v2.8.15 to pick up the unit changes; the self-refresh logic added in v2.7.13 will fetch the fixed update.sh from this release.

[2.8.14] - 2026-05-06

Fixed

  • WordPress provisioning failures on every fresh install (#54). Three independent regressions surfaced when the v2.8.12 strict sandbox shipped, each only firing on specific paths so they slipped through the v2.8.13 verification:

    • Failed to download wp-cliservices/wordpress.rs::ensure_cli ran safe_command("curl") -o /usr/local/bin/wp, but /usr/local/bin is not in the agent's ReadWritePaths under ProtectSystem=strict so the write was blocked silently and bubbled up as a 422 on the WP install endpoint. Switched to safe_command_unsandboxed("curl", &[]) (the same systemd-run escape used for apt/dpkg in v2.8.12) and now surface the curl stderr in the error message instead of just the static "Failed to download wp-cli" string.
    • mkdir() "/var/cache/nginx/fastcgi/<site>" failed (ENOENT)routes/nginx.rs::put_site called create_dir_all on the per-site FastCGI cache path before rendering the vhost, but /var/cache/nginx was not in the agent's ReadWritePaths so the create silently failed (only a tracing::warn!). The config was written anyway; nginx -t then fired its own mkdir of the cache leaf, found no parent, and rejected the reload. Added /var/cache/nginx to ReadWritePaths= in the canonical unit, pre-created /var/cache/nginx/fastcgi in setup.sh and update.sh, and promoted the agent-side create_dir_all failure from a warn! to a 500 with an actionable message ("Ensure /var/cache/nginx is in the agent's ReadWritePaths") so we never again render a config we know nginx can't validate.
    • tar: unrecognized option '--no-dereference' — three call sites (services/backups.rs, services/wordpress.rs::create_update_snapshot, routes/mail.rs::mailbox_backup) passed --no-dereference to tar -c. GNU tar 1.35 (current Trixie/Noble default and the version on this server) does not accept that option in create mode, so every site backup, every WP update snapshot, and every mail backup since the flag was introduced has been failing silently — including on the panel's own demo. GNU tar's create-mode default is already "do not follow symlinks", so the fix is to drop the flag from all three sites.
  • curl … {panel_url}/install-agent.sh | bash returned the SPA HTML (#56). The multi-server install command surfaced in routes/servers.rs pointed users at {panel_url}/install-agent.sh, but the panel's nginx config has try_files $uri $uri/ /index.html; with no override for that path — so the URI fell through to the SPA's index.html and bash choked on <!DOCTYPE html>. The script also wasn't deployed under any served path. Fixed by having setup.sh and update.sh copy scripts/install-agent.sh into $FE_ROOT/install-agent.sh so the existing try_files $uri rule serves it directly with the right MIME.

  • HTTP-on-IP installs were stuck in a login bounce (#47). The cookie helper in routes/auth.rs::issue_session set Secure whenever BASE_URL was empty (the assumption being that production deployments use HTTPS and an empty default should not regress them). For users running on the bare http://<ip>:<port> URL before adding a domain, the browser silently dropped the Secure cookie on the plain-HTTP response and /api/auth/me then 401'd on the very next request — login appeared to succeed and immediately bounced back to the login screen. Replaced the BASE_URL-only check with a combined BASE_URL=https://… || X-Forwarded-Proto: https check (nginx already sets X-Forwarded-Proto $scheme), and threaded the request HeaderMap through issue_session_pub / logout / OAuth callback / passkey auth_complete so every login path uses the same scheme detection.

  • /run/dockpanel disappeared mid-upgrade and pinned the agent at StartLimitBurst (v2.8.13 followup, surfaced during the demo upgrade-path test). update.sh mkdir's /var/run/dockpanel before the systemctl stop / start cycle, but between stop and start the directory disappeared on Ubuntu — the agent's namespace mount (which now resolves /run/dockpanel as a ReadWritePaths= symlink target) failed five times in 60s and the unit refused to start until manual systemd-tmpfiles --create plus systemctl reset-failed. Added RuntimeDirectory=dockpanel and RuntimeDirectoryPreserve=yes to the canonical unit so systemd creates and persists the directory itself, which fires before the namespace setup and survives every restart.

  • Agent socket occasionally left at 0600 root:root, breaking the panel's "Failed to load system update status" toast. The systemd unit's ExecStartPost was the only thing that chown'd the socket to www-data and chmod'd it to 0660 — and it failed silently in some restart sequences, leaving the panel unable to reach the agent over its UNIX socket. The agent now sets the permissions inline right after UnixListener::bind (via libc getgrnam / chown / set_permissions), so the unit's ExecStartPost is belt-and-suspenders rather than load-bearing.

  • Mail provisioning's groupadd/useradd for the vmail user failed under strict sandbox. Same family as #54-A: the safe_command wrapper runs sandboxed, but the user-management binaries write /etc/passwd / /etc/shadow / /etc/group, which are too sensitive to put in ReadWritePaths=. Switched both calls to safe_command_unsandboxed("groupadd", &[]) / safe_command_unsandboxed("useradd", &[]).

[2.8.13] - 2026-05-02

Changed

  • dockpanel-agent.service is now deployed from a single source of truth (#48 followup). The in-repo unit file at panel/agent/dockpanel-agent.service was historically a hardened reference (ProtectSystem=strict + a curated ReadWritePaths= list) that no installer ever deployed — scripts/setup.sh and scripts/update.sh both wrote a permissive ProtectSystem=no/ProtectHome=no/PrivateTmp=no unit inline via heredoc, so every install.sh-based install ran with no namespace hardening at all. v2.8.13 deletes both heredocs and has the install scripts cp the canonical unit file from the repo. Existing installs upgrading via update.sh get the strict sandbox automatically on the next update; the daemon-reload + agent restart that update.sh already performs at the end of its run picks up the new unit. The remote-agent installer (scripts/install-agent.sh) is intentionally left on its own inline heredoc — it deploys a different unit (after docker.service, no nginx dep, env-file driven) for the multi-host remote-agent path.

Security

  • Hardened the deployed agent sandbox to ProtectSystem=strict plus the full Protect* / Restrict* set (#48 followup). The new ReadWritePaths= covers everything the agent actually writes via std::fs::write / tokio::fs::write / create_dir_all: the original eight (/etc/nginx /etc/dockpanel /var/run/dockpanel /var/backups/dockpanel /var/lib/dockpanel /var/www /var/log /etc/letsencrypt) plus ten new paths grepped from current agent code (/etc/apt /etc/fail2ban /etc/systemd/system /etc/powerdns /etc/modsecurity /etc/cloudflared /etc/postfix /etc/dovecot /var/spool/postfix /opt). v2.8.12's safe_command_unsandboxed systemd-run wrapper continues to handle the apt/dpkg/snap subprocess paths that can't be expressed via ReadWritePaths=. Net effect: the agent now runs with meaningful kernel-namespace isolation — ProtectKernel{Logs,Modules,Tunables}=yes, ProtectControlGroups=yes, ProtectClock=yes, ProtectHostname=yes, RestrictRealtime=yes, RestrictSUIDSGID=yes, LockPersonality=yes, RestrictNamespaces=~user, NoNewPrivileges=yes, ProtectHome=yes, PrivateTmp=yes. None of this hardening was ever active on install.sh-installed users; demo had a hand-deployed strict version which is what surfaced the v2.8.12 EROFS bug.

Known limitations

  • The mail subsystem (panel/agent/src/routes/mail.rs:173-174) still spawns useradd/groupadd via the sandboxed safe_command, which fails under ProtectSystem=strict because /etc/passwd, /etc/shadow, and /etc/group are too sensitive to add to ReadWritePaths=. This was already broken under demo's strict sandbox; mail provisioning has been silently failing on that path. v2.8.14 will wrap the user/group creation calls with the v2.8.12 safe_command_unsandboxed pattern (systemd-run escape) for a clean fix.

[2.8.12] - 2026-05-01

Fixed

  • Service Installers + System → Updates fail silently with Read-only file system errors under the agent's ProtectSystem=strict sandbox (#48 followup). dockpanel-agent.service runs with ProtectSystem=strict and a ReadWritePaths= list that omits /var/cache/apt, /var/lib/apt, /var/lib/dpkg, and /usr — the paths apt and dpkg must write to. Every install / upgrade path that spawned apt-get, snap install, dpkg, or curl | bash from the agent inherited the sandbox and EROFS'd the moment it tried to download a .deb or install a binary into /usr/bin. Surfaced when insxa clicked Install on Redis / Composer / Node.js / Cloudflare Tunnel / WAF in Settings → Services — every one failed. System → Updates' "Update All" button hit the same wall.

    • Added safe_command_unsandboxed() (and a sync sibling) to panel/agent/src/safe_cmd.rs. The helper invokes the binary via systemd-run --quiet --pipe --wait --collect --setenv=... -- <bin>, which routes through PID1 to spawn a transient unit in PID1's own mount namespace. The inner binary sees the full filesystem read-write while the agent itself stays sandboxed for everything else. Every --setenv flag explicitly re-establishes the sanitized env (PATH/HOME/LANG/LC_ALL/DEBIAN_FRONTEND) so the inner binary doesn't inherit PID1's wider environment.
    • Converted ~25 call sites that legitimately need /usr write access to use the new helper: panel/agent/src/routes/updates.rs (apt-get update and apt-get install/upgrade), service_installer.rs (every install_* and uninstall_* shell script + the rm /usr/local/bin/composer in uninstall_composer), php.rs (add-apt-repository ppa:ondrej/php, apt-get install for PHP base + extensions, apt-get purge/autoremove), server_utils.rs (enable_auto_updates's apt-get install unattended-upgrades), and services/smtp.rs (ensure_msmtp's apt-get install msmtp).
    • Read-only callers (apt list --upgradable, apt-cache show, dpkg -l, which <bin>) keep using the sandboxed safe_commandProtectSystem=strict permits reads of /var/lib/apt/lists and /var/lib/dpkg, so wrapping them with systemd-run would just add overhead.
    • Empirically verified: from inside the agent's mount namespace, touch /var/cache/apt/archives/_test returns EROFS; the same touch wrapped in systemd-run --quiet --pipe --wait --collect -- succeeds because the transient unit gets a fresh mount namespace. Smoke-tested on demo: GET /system/updates returned 69 upgradable packages cleanly (was returning empty pre-fix because apt-get update EROFS'd before populating the lists).

    WAF + Cloudflare Tunnel installers will partially succeed in v2.8.12 (apt step now works) but still hit EROFS on follow-up std::fs::write / create_dir_all calls into /etc/modsecurity and /etc/cloudflared. v2.8.13 will close those by either adding the directories to the unit's ReadWritePaths or by routing those writes through the same helper.

[2.8.11] - 2026-05-01

Fixed

  • Settings → Services tab missing from the tab bar — PowerDNS / Image Scan / SBOM / Prometheus config UIs unreachable from the panel for over a month (#48 followup). Commit fd44a31 (2026-03-24, "UX: fix overlaps, decompose Settings, create System page") removed the { id: "services", label: "Services" } entry from Settings.tsx's tab list intending to move the contents to the new System page, but the actual content block ({tab === "services" && (<>...</>)} at lines 2169-2245) was orphaned in place and never relocated. The DNS page's "configure PowerDNS API in Settings" hint pointed users at a tab that didn't exist. Surfaced when an insxa followup on issue #48 asked for a screenshot of where to find the Services tab — there wasn't one. Fix: restored the Services tab button so the existing content block is reachable. (A proper move-to-System-page refactor remains on the list but is a bigger UX restructure than tonight's scope.)

[2.8.10] - 2026-05-01

Fixed

  • Dashboard "Restart nginx" / "Restart PHP-FPM" buttons did nothing on click (#48 followup). The frontend was POSTing the wrong request shape to the agent — { fix: "restart_nginx" } and { fix: "restart_php" }, while the agent's /diagnostics/fix endpoint deserializes { fix_id: "restart-service:<name>" }. Even after deserializing, the value restart_nginx doesn't match any of the supported apply_fix actions. Two changes:

    • Frontend (panel/frontend/src/pages/Dashboard.tsx) now sends { fix_id: "restart-service:nginx" } and { fix_id: "restart-service:php-fpm" }.
    • Agent (panel/agent/src/services/diagnostics.rs) treats php-fpm (no version) as a smart alias: it enumerates loaded php<ver>-fpm.service (Ubuntu/Debian) or plain php-fpm.service units via systemctl list-units and restarts every match, so multi-version installs (PHP 8.1 + 8.2 + 8.3) all reload their OPcache after the click. Returns a clear error if no PHP-FPM unit is installed at all.
  • Disk-full forecast fired during install on otherwise-idle systems. services/alert_engine.rs extrapolated linearly from the most recent 60 metrics_history rows. On a fresh install the first 30-60 minutes show 5-10%/hour disk growth (binary writes, frontend tarball, postgres init, container layers); the extrapolation predicted "disk full in 9 hours" even at 30% usage. Surfaced in the same insxa followup on issue #48 — alerts fired non-stop on a 40 GB box minutes after install. Forecast now requires (a) at least 6 hours of trend data so the install spike bleeds out, AND (b) current disk usage already over 60% so we're on a runway to a real full disk, not extrapolating from noise on an empty box. Existing thresholds (forecast horizon < 48h, severity cutoff at 12h) preserved.

[2.8.9] - 2026-05-01

Fixed

  • Agent's restart-service validator rejected systemd unit names containing a dot. php8.3-fpm, containerd.service, etc. are legitimate unit names but the regex was [a-z0-9_-]+ only. Surfaced in the same insxa followup on issue #48 — every PHP-FPM auto-heal attempt was returning "Invalid service name" silently. Also affected the post-restore PHP-FPM reload in routes/backups.rs:244. Fix: allow . in service names. Dots in systemd unit names cannot be used for path traversal because systemctl restart <name> doesn't treat the argument as a path.

  • Seven Settings toggles silently rejected by the backend whitelist (#48 followup). PUT /api/settings validates incoming keys against a hard-coded allow-list. Several security/registration toggles in the Settings UI wrote keys that were absent from that list — self_registration_enabled, security_approval_required, security_geo_alert_enabled, security_session_recording, security_db_backup_enabled, security_canary_enabled, security_lockdown_threshold. Toggling them returned 400 Unknown setting: <key>, the toast surfaced as "Failed", and the value never persisted. Backend code paths (routes/auth.rs, services/security_hardening.rs) already read these keys, so the runtime behaviour was tied to whatever value was set out-of-band. Surfaced when an insxa followup on issue #48 reported "these two in settings are not opted: Self-Registration, Require Approval for New Users" — same root cause as v2.8.5's ipv6only-strip migration miss: a list that grew implicit coupling to other parts of the codebase that nobody updated when new toggles were added. Frontend-only would have masked the issue with try/catch; the right fix is at the writer-side gate. No agent / cli / frontend code changes; binaries recompiled to carry the v2.8.9 version string.

[2.8.8] - 2026-05-01

Fixed

  • Password reset link bounced to /login instead of rendering the reset form (#48 followup). When an unauthenticated user clicked the reset link from their email, ServerProvider (mounted at the top of the SPA tree) fired api.get("/servers") on mount → 401 because no session → api.ts's 401 handler redirected to /login because its no-redirect allow-list only covered /login and /setup. Net effect: user lands on the login form, never sees the reset password fields, and the one-time token expires unused. Same hole hit /forgot-password, /register, and /verify-email for any unauth visitor — though those were less obviously broken because users typically reach them already knowing they need to log in. Fix extends the allow-list in panel/frontend/src/api.ts to all six top-level public routes (/login, /setup, /register, /forgot-password, /reset-password, /verify-email). Surfaced when an insxa followup on issue #48 reported the bounce; the page rendering was fine on the demo when probed, which made it look like an email-client mangling issue at first — empirically confirmed by insxa pasting the URL bar after click as https://your-panel/login, proving a synchronous redirect from the SPA was firing. No backend changes; binaries recompiled to carry the v2.8.8 version string.

[2.8.7] - 2026-05-01

Added

  • Branding logo upload (#48 follow-up). Settings → Branding now exposes an "Upload image…" button next to the existing Logo URL field. The frontend POSTs the file's raw bytes to a new POST /api/branding/logo endpoint (admin-only, PNG / JPEG / WebP, 2 MB cap, content-type and magic-bytes validated to defend against MIME spoofing). Files are stored content-addressed at /var/lib/dockpanel/branding/logo-<hash>.<ext> and served back over GET /api/branding/logo/{filename} (public — the login page is unauthenticated and needs to render the logo) with Cache-Control: public, max-age=31536000, immutable. The upload handler auto-saves logo_url so the new image takes effect on the next page render. Surfaced when an insxa follow-up on issue #48 reported "branding image could not be saved" — the existing settings field only accepted a URL, with no file upload UI. Admins on air-gapped panels with no public CDN can now self-host their logo.

[2.8.6] - 2026-05-01

Fixed

  • update.sh defaulted to compile-from-source on production VPS installs that don't have Rust — surfaced when an insxa follow-up on #48 hit Rust toolchain not found then OOM'd on proc-macro2 after they installed rustup. The script already auto-switches to release binaries when the source tree is missing, but production installs do have the source tree (install.sh writes it) — the missing signal was whether cargo was on $PATH. update.sh now also auto-switches to the pre-built release binaries when the Rust toolchain isn't available, so a fresh bash /opt/dockpanel/scripts/update.sh on a stock VPS works without the operator having to know about INSTALL_FROM_RELEASE=1 or install ~4 GB of rustup. Developers who do want to compile from source can set BUILD_FROM_SOURCE=1 to override. The "Rust toolchain not found" error message also rewords to recommend dropping BUILD_FROM_SOURCE=1 over installing rustup, with the RAM-cost callout up front.

[2.8.5] - 2026-05-01

Fixed

  • v2.8.4 upgrade path still hit duplicate listen options for [::]:443 on multi-site installs that ran v2.8.3 first (#48). v2.8.4 reverted the agent templates and panel vhost to plain listen [::]:80; / listen [::]:443 ssl;, and v2.8.4's update.sh stripped ipv6only=on from the panel vhost — but it dropped the v2.8.3 site-vhost migration block, so any site provisioned on v2.8.3 kept listen [::]:443 ssl ipv6only=on; on disk. nginx accepts panel-plain + ONE site-with-ipv6only=on on a shared [::]:443 socket, but rejects TWO-or-more site vhosts both setting ipv6only=on with duplicate listen options for [::]:443. The reload triggered by v2.8.4's update.sh therefore failed silently on any install with 2+ sites — the new panel listen never took effect, the IPv6 hijack from the original #48 persisted, and the next systemctl restart nginx would refuse to start. update.sh now strips ipv6only=on from every site vhost in /etc/nginx/sites-enabled/*.conf (skipping the panel vhost, which is already handled), bringing the listener options back in line so nginx reloads cleanly. No code changes — fix is pure upgrade-script. Manual one-liner for v2.8.4-stuck users:
    for f in /etc/nginx/sites-enabled/*.conf; do [ "$(basename "$f")" = dockpanel-panel.conf ] && continue; sed -i -E 's|^([[:space:]]*)listen \[::\]:(80\|443 ssl) ipv6only=on;|\1listen [::]:\2;|' "$f"; done && nginx -t && nginx -s reload
    

[2.8.4] - 2026-05-01

Fixed

  • v2.8.3 nginx duplicate listen options regression on multi-site installs. v2.8.3 added ipv6only=on to listen [::]:80 and listen [::]:443 ssl in agent templates + the panel vhost to fix the IPv6 hijack from #48. Two vhosts on the same shared socket both declaring ipv6only=on caused nginx to emit duplicate listen options for [::]:80 and refuse the config — surfaced when a second site was added on a v2.8.3 install. Reverted: agent templates and the panel vhost now use plain listen [::]:80; and listen [::]:443 ssl; (dual-stack, no ipv6only=on). Linux's default dual-stack behaviour means a single shared [::] socket handles both IPv6 and IPv4-without-specific-binding, and nginx routes by server_name across that shared socket without conflict. The underlying #48 fix still holds — the panel vhost gains a [::]: IPv6 listen so site vhosts can no longer be the only IPv6 listener and hijack panel-domain traffic. update.sh now also strips any ipv6only=on left on a v2.8.3 panel vhost so the upgrade path doesn't inherit the regression.

[2.8.3] - 2026-05-01

Fixed

  • Manual "Let's Encrypt SSL" provisioning failed with Template render error: Invalid PHP socket path on PHP sites (#48). Four backend call sites built the agent's php_socket field as /run/php/phpX-fpm.sock, but the agent's strict validator (is_safe_php_socket, panel/agent/src/services/nginx.rs:149) requires the unix:/... prefix and 500'd the request. The auto-SSL background task at site-creation time was correct (sites.rs:557), which is why Auto-SSL attempt 2 succeeded after the manual click 500'd in between. Fixed in routes/ssl.rs:118,404, services/auto_healer.rs:598, and services/security_scanner.rs:271 — all now emit unix:/run/php/phpX-fpm.sock like the working site-creation path.

  • Visiting the panel URL redirected to a freshly-installed WordPress site after that site's Let's Encrypt SSL was provisioned (#48). Root cause was a dual-stack listen mismatch: agent-rendered site nginx vhosts declared listen [::]:443 ssl; (no ipv6only=on), but scripts/setup.sh bound the panel's vhost to IPv4 only. The first site to provision SSL therefore became the de-facto default for any IPv6 (or non-matched-IPv4) request — WordPress saw a Host that didn't match home_url and 301'd to its canonical domain. Fixed by adding ipv6only=on to all [::]:80 and [::]:443 ssl listens across panel/agent/src/templates/nginx/{http,https,proxy}.conf, pairing every panel IPv4 listen in setup.sh with an ipv6only=on IPv6 listen, and adding a one-shot migration in scripts/update.sh so existing installs gain the IPv6 listen on next upgrade.

[2.8.2] - 2026-04-30

Added

  • Chain-of-trust report extended to database + volume backups. v2.8.1 shipped site-only because only the backups table carried integrity-hash columns. v2.8.2 lands the matching migration (20260430200000_db_volume_backup_hashes.sql) — sha256_hash, previous_hash, and chain_valid on both database_backups and volume_backups, applied in a single transaction so a partial apply can't leave one table chained and the other not. The agent now computes SHA-256 on every database dump (mysql / postgres / mongo) and every volume tarball, and the backend persists the hash + previous-hash link on the same INSERT path that lands the new backup row — both the on-demand routes (POST /api/backup-orchestrator/db-backup / volume-backup) and the policy-executor scheduled path. The All Backups tab now shows the Report | JSON | PDF 3-segment control on every row regardless of kind.

    The chain-report routes were collapsed from kind-specific (/chain-report/site/{id}[/pdf]) into one generic shape:

    • GET /api/backup-orchestrator/chain-report/{kind}/{id} — JSON.
    • GET /api/backup-orchestrator/chain-report/{kind}/{id}/pdf — PDF.

    {kind}{site, database, volume}; bogus kinds 400 cleanly. The JSON backup object now carries kind plus optional kind-specific fields (database_id, container_id, volume_name, db_type); the former site_name field was renamed to resource_name (domain for site, db_name for database, container:volume for volume) so the same consumer can render any kind. build_site_chain_reportbuild_chain_report(kind, id) with table-name dispatch. The typst template is now a single file that branches on data.backup.kind for the resource label and kind-specific extras (db engine, container ID), so the three kinds can't drift apart.

  • typst tarball SHA-256 pinning. v2.8.1 trusted GitHub TLS for the v0.13.0 musl tarball (matching the existing grype installer). v2.8.2 pins the per-arch SHA-256 (x86_64-unknown-linux-musl: cd1148da…feb6, aarch64-unknown-linux-musl: 1a1b3841…46e6), verified at install time before tar ever sees the bytes. Operators can override per arch via DOCKPANEL_TYPST_SHA256_X86_64 / _AARCH64 env vars (e.g. air-gapped mirror, custom typst version). Mismatch surfaces as a distinct error rather than a generic install failure. Install timeout bumped 90 → 120 s to absorb the second pass over the bytes.

Tests

  • tests/chain-report-e2e.sh extended to all three kinds. The site block became a kind-agnostic assert_kind helper; the suite now iterates site → database → volume and runs the same shape of assertions per kind (auth gate, kind validation, 404 on bogus id, JSON 200, JSON backup.kind + backup.id + backup.resource_name round-trip + shape, PDF 200 + Content-Type + Content-Disposition
    • %PDF magic + size > 1 KB). Fixtures are discovered per kind from backups / database_backups / volume_backups; missing fixtures skip rather than fail (so CI hosts that haven't seeded volume backups still green-light). Total suite ~50 assertions.

[2.8.1] - 2026-04-30

Added

  • Chain-of-trust report for site backups (Phase 4 W1.3). Every site backup is now downloadable as a single forensic artifact bundling its full provenance chain — the backup itself (filename, size, SHA-256, previous-hash link, chain-validity flag), every passive verification run against it (status, checks-passed/total, duration, errors), and every end-to-end restore drill (status, HTTP probe result, body excerpt, duration). Two formats from the same data:

    • GET /api/backup-orchestrator/chain-report/site/{id} — JSON.
    • GET /api/backup-orchestrator/chain-report/site/{id}/pdf — typst-rendered PDF with DockPanel branding, status pills, and a full chain-integrity summary. Designed to be handed to an auditor as proof a backup was actually verified and restorable.

    All Backups tab on the Backup Orchestrator page now shows a Report | JSON | PDF 3-segment control on every site row. The first PDF request lazy-installs the typst CLI into /var/lib/dockpanel/typst/ (~30 MB, one-time, ~30 s on a fresh box); subsequent renders are instant. Compile timeout 30 s; install timeout 90 s; concurrent installs serialised via a process-wide async mutex so a burst of first requests doesn't stampede.

    Site-only for v2.8.1 because only backups.sha256_hash / previous_hash / chain_valid are populated today (added in audit migration 20260324000000). The db + volume backup tables don't carry hashes yet — extending chain reports across all three kinds is a v2.8.2 follow-up that needs a hash-columns migration plus agent changes to compute SHA-256 during db/volume backup.

Fixed

  • /api/backup-orchestrator/health 500 once any backup exists. SUM(size_bytes) returns NUMERIC in PostgreSQL (since aggregating BIGINT can overflow int8); the existing query bound it to Option<i64> without an explicit cast. Empty backup tables returned NULL and decoded fine, but the moment a real backup row landed the endpoint started 500ing with INT8 not compatible with NUMERIC. Cast to ::bigint in three sites in routes/backup_orchestrator.rs::health
    • the rolled-up SUM in services/backup_policy_executor.rs. Caught by the v2.8.1 fresh-VPS test once a synthetic backup row was seeded for the chain-report PDF round-trip.

Tests

  • New tests/chain-report-e2e.sh sub-suite: unauthenticated request blocked, bogus uuid → 404, JSON shape, PDF magic bytes / Content-Type / Content-Disposition, file-size sanity. Wired into full-e2e.sh alongside the tier2-pin sub-suite. Self-provisions auth (mints admin JWT from api.env if DOCKPANEL_TEST_PASSWORD is unset). Skips PDF assertion when CHAIN_REPORT_SKIP_PDF=1 (CI without outbound HTTPS) and reports 503 cleanly when typst install fails so the suite still green-lights on networks that block GitHub releases.

[2.8.0] - 2026-05-01

Added

  • Restore Confidence SLA card on Backup Orchestrator overview (Phase 4 W1.1). The Overview tab now leads with a single trust signal — "of last 30 backups, X% verified" — sized as a headline number, color-coded by threshold (rust ≥95%, warn ≥80%, danger below). Adjacent cells show p50 and p95 verify-lag (time from backup creation to verification completion), oldest unverified backup age, and a per-server breakdown table when more than one server is registered. Empty state when no recent backups exist. Backend extends GET /api/backup-orchestrator/health with sla_window, sla_verified, sla_failed, sla_pending, verify_lag_p50_hours, verify_lag_p95_hours, oldest_unverified_days (previously declared but never populated), and per_server_sla[]. Latest verification per (backup type, backup id) wins, so re-runs supersede stale entries. No schema migration; same endpoint URL.
  • End-to-end backup drills for site backups (Phase 4 W1.2 part A). Click Drill on any site row in the All Backups tab — the agent extracts the tar to a scratch directory, spins a hardened nginx:alpine container (--network none, --read-only, 128MB / 0.5 CPU caps), HTTP-probes localhost/ via docker exec wget, and tears everything down. Persisted in a new backup_drills table; visible in the new Drills tab with status, HTTP code, duration, and error message. SLA card on Overview gains a "End-to-end drills (30d): N passed · M failed" line when drills exist. New endpoints: POST /api/backup-orchestrator/drill (admin, async — returns 202 immediately, drill row updates as the agent finishes) and GET /api/backup-orchestrator/drills (paginated history). Agent endpoint POST /backups/drill/site.
  • End-to-end DB drills for postgres + mysql/mariadb (Phase 4 W1.2 part B). Click Drill on any database row in the All Backups tab — the agent boots a scratch engine container (postgres:16-alpine or mariadb:11, --network none, 256MB / 1 CPU caps), pipes zcat of the dump into a direct-fd psql/mariadb restore, runs ANALYZE (postgres) to populate planner stats, then sums table count and row totals from pg_class.reltuples / information_schema.tables.table_rows. Drill body records "N tables, ~M rows restored" — strictly stronger than verify, which only confirms the dump applies. Pass requires tables > 0; row total is reported but doesn't gate (legitimate schema-only dumps pass). Backend POST /api/backup-orchestrator/drill now accepts backup_type = "database" and dispatches to new agent route POST /backups/drill/db. Drills tab "HTTP" column renamed to "Result" and renders the row/table summary for DB drills. Volume drill is W1.2.c.
  • End-to-end volume drills (Phase 4 W1.2 part C). Click Drill on any volume row in the All Backups tab — the agent creates a scratch Docker volume, runs a hardened alpine:3.19 restore container (--network none, 128MB / 0.5 CPU caps) that extracts the tar into the scratch volume (parity with restore_volume's actual restore path), then runs a second read-only probe container that mounts the volume RO and read-tests up to 20 sample files (head -c 1 through each — enough to fault filesystem-level corruption without scanning multi-GB volumes). Drill body records "N files, M bytes restored". Pass requires files > 0 AND read-test exit 0 — strictly stronger than verify, which only extracts to a host /tmp dir. Best-effort cleanup of both containers and the scratch volume on every exit path. Backend POST /api/backup-orchestrator/drill now accepts backup_type = "volume" and dispatches to new agent route POST /backups/drill/volume. The placeholder on volume rows is replaced with a working Drill button. W1.2 engine work complete; W1.2.d (per-policy weekly drill scheduler) is the remaining slice.
  • Per-policy drill scheduler (Phase 4 W1.2 part D). Backup policies gain a Drill on schedule toggle and a separate cron drill_schedule (default 0 4 * * 0 — 04:00 UTC Sunday) so drills run on a different cadence from the backups themselves. New backend service drill_scheduler ticks every 60s, finds policies due now, looks up the latest database_backups and volume_backups row tied to each policy by policy_id, and dispatches a real drill against each via the same agent endpoints used by on-demand drills. Records land in the existing backup_drills table — Drills tab can't tell the difference between scheduled and on-demand drills (same audit trail). Per-server concurrency cap = 1 (skips dispatch if a pending/running drill exists for the same server). Schema migration adds drill_enabled BOOLEAN, drill_schedule TEXT, and last_drill_at TIMESTAMPTZ to backup_policies. Site backups don't carry policy_id and are not covered by this scheduler — they stay on the existing 6h backup_verifier cadence. UI: new section in the Policy create form with an enabled checkbox + a curated schedule selector (weekly / monthly / every 3 days), and a small drill <cron> badge under the Schedule column on policy rows when enabled. Cron validation rejects strings that aren't 5-field whitespace-separated on both schedule and drill_schedule writes (was previously unchecked on schedule too — small hardening win). W1.2 (engines + scheduler) is now complete; W1.3 (chain-of-trust PDF/JSON export) ships separately as v2.8.1.

Polish

  • Backup Orchestrator UX pass. Drills tab now paginates with Prev/Next (50 per page) instead of silently truncating to the first 100; backend GET /api/backup-orchestrator/drills returns {items, total} to drive it. Result column is tone-coded for site drills — HTTP 2xx rust, 3xx neutral, 4xx amber, 5xx danger — so failures jump out at a glance. Running drills get a pulsing dot in the status pill and a N running counter + manual Refresh button above the table. Created column shows relative time with the absolute timestamp on hover. Drill button on DB and volume rows now asks once before spending — a confirm/cancel pair appears with the cost hint (boots a 256 MB scratch DB engine, ~60s / boots a 128 MB scratch container + temp volume, ~60s); site drills fire directly since they're cheap.
  • Image scan + SBOM Settings cards (a25c716). Apps CVE drawer + Settings ImageScan/SBOM cards picked up the same dialog/a11y polish as the rest of the panel: role="dialog" + aria-modal + Esc to close on the scan drawer, type="button" + aria-label on every trigger, design-system tokens (no raw Tailwind colors), explicit load-error + Retry on the Settings cards (no more stuck "Loading…"), Last scan Xh ago · N images on file derived from /image-scan/recent when the scanner is installed, and an explicit On-demand only — no schedule, no deploy gate line on the SBOM card so the configuration model is unambiguous.

[2.7.20] - 2026-04-28

Security

  • rustls-webpki 0.103.12 → 0.103.13 in both dockpanel-api and dockpanel-agent Cargo locks — fixes RUSTSEC-2026-0104 (reachable panic in CRL parsing). DockPanel calls into rustls-webpki for ACME cert verification and pinned-fingerprint TLS (Phase 3 #3 Tier 2), so a malformed CRL from a malicious or buggy CA could have crashed the process. Patch release, no API changes.
  • postcss 8.5.8 → 8.5.12 in panel/frontend and website/client package locks — fixes GHSA-7fh5-64p2-3v2j (XSS via unescaped </style> in the CSS stringify output). Build-time only; no runtime exposure on shipped panels — but worth keeping current.

Added

  • Servers page: last-seen-at + 24h uptime sparkline. Each server card now shows a small Last seen 14s ago line under the IP/status subtitle (driven by the existing last_seen_at column, refreshed on every agent checkin) and a 144-cell horizontal uptime strip — one cell per 10-minute bucket over the last 24 hours, derived from metrics_history row presence. Hover any cell for its time window and online/no-data label. New endpoint GET /api/servers/{id}/uptime returns { buckets: bool[], window_hours, bucket_minutes }. Owner- scoped (404 on a server that belongs to a different user); same auth shape as the rest of the /api/servers/* surface.
  • Pre-built Grafana dashboard (dashboards/dockpanel-grafana.json). Drop-in companion to the v2.7.16 Prometheus exporter. Covers fleet stats (version / servers reporting / sites / alerts firing by severity / GPUs reporting), per-server CPU / memory / disk timeseries with sensible thresholds, top-servers bar gauges, sites-by-status donut, a collapsible GPUs row (utilization, VRAM%, temperature, power draw), and an alerts-firing stacked-bars timeseries. Uses a Datasource template input so it imports cleanly onto any Prometheus that's already scraping /api/metrics. UID dockpanel-fleet is stable so runbook deep-links survive re-imports. A Server template variable lets operators focus on a single host or any subset. See docs/guides/prometheus.md "Pre-built Grafana dashboard" for import instructions. Closes the Phase 3 #1 follow-up that paired with the Prometheus endpoint.
  • Tier 2 cert-pin E2E test suite (tests/tier2-pin-e2e.sh). Covers every step of the Phase 3 #3 Tier 2 flow end-to-end against the live API: TOFU fingerprint capture on /api/agent/checkin, match no-op, MITM 403, malformed-fingerprint 400, admin rotate-cert-pin with and without the X-Requested-With CSRF header, activity_logs capture of the rotate action, and re-TOFU after rotate. Also includes a dedicated regression guard for the v2.7.18 rustls CryptoProvider panic — it inserts a synthetic online server row with cert_fingerprint set and a loopback URL with no listener, then POST /api/servers/{id}/test and asserts status exactly 502 (graceful connect failure) — a panic would surface as 500 and be caught. The suite is self-provisioning: it mints an admin JWT locally from /etc/dockpanel/api.env when DOCKPANEL_TEST_PASSWORD is unset, and cleans up all DB rows it creates via an EXIT trap. Wired into tests/full-e2e.sh as a sub-suite at the end of the run.

[2.7.19] - 2026-04-17

Fixed

  • Remote-agent TLS pinning no longer panics the API process. v2.7.18 shipped the PinnedFingerprintVerifier for outbound backend→agent TLS but the backend's main.rs never installed a process-level rustls CryptoProvider. On the first request that actually exercised the pinned path (i.e. a second server enrolled in the fleet with a captured fingerprint), rustls::ClientConfig::builder() panicked on CryptoProvider::get_default(). Pure single-host installs were not affected; any multi-server deployment using the pinned verifier was. Fix: call rustls::crypto::aws_lc_rs::default_provider().install_default() at dockpanel-api startup (the agent already did this at main.rs:24). Caught by the v2.7.18 fresh-VPS test before v2.7.18 was declared public-ready. No API changes; the Tier 2 part 2 verification flows (TOFU capture, MITM 403, rotate-pin, re-TOFU, PinnedFingerprintVerifier accept/reject) now all succeed end-to-end.

[2.7.18] - 2026-04-17

Added

  • RemoteAgentClient cert-pinning enforcement (Phase 3 #3 — Tier 2, part 2). Closes the loop: once an agent's fingerprint has been captured by the backend (Tier 2 part 1), every outbound TLS handshake to that agent goes through a custom rustls::client::danger::ServerCertVerifier that only accepts a cert whose DER SHA-256 matches the pinned value. Comparison is constant-time via subtle; signature verification delegates to rustls::crypto::aws_lc_rs. When cert_fingerprint is still NULL for a server (e.g. old agent that doesn't report it), the client falls back to the legacy AGENT_TLS_VERIFY=insecure env flag for backwards compatibility.
    • AgentRegistry::for_server now reads cert_fingerprint from the servers row and passes it to RemoteAgentClient::new_with_pin. Rotating the pin via POST /api/servers/{id}/rotate-cert-pin already invalidates the cached client (shipped in Tier 2 pt1) so the next request rebuilds with the new pin.
  • Agent TLS + cert fingerprint pinning (Phase 3 #3 — Tier 2, part 1). The agent's multi-server listener now terminates TLS instead of shipping auth tokens in plaintext, and the central panel captures each agent's cert fingerprint on first checkin for later pinning.
    • Agent loads /etc/dockpanel/ssl/agent.{crt,key} at startup (generated at install time by install-agent.sh, or generated on first boot via rcgen when missing). AGENT_LISTEN_TCP=0.0.0.0:9443 now binds a TLS listener via axum-server + rustls — the old plaintext bind and the AGENT_ALLOW_INSECURE_BIND escape hatch are removed, since TLS makes the 0.0.0.0 case safe by construction.
    • Agent computes the SHA-256 (hex) fingerprint of its cert at startup, logs it on first boot, and includes it in every phone-home checkin.
    • Migration 20260417000000_agent_cert_fingerprint.sql adds servers.cert_fingerprint (nullable varchar(64) + partial index).
    • Backend POST /api/agent/checkin captures the fingerprint on first checkin (Trust On First Use); on subsequent checkins a mismatch is rejected with 403 and logged at ERROR level. Format-validated (64-char lowercase hex) before storage.
    • New admin endpoint POST /api/servers/{id}/rotate-cert-pin clears the stored fingerprint so the next checkin re-captures. Use after a legitimate agent cert rotation or reinstall. Invalidates the cached RemoteAgentClient and writes an audit log entry.
    • Servers page gains a per-server TLS pin row showing the shortened fingerprint (first 16 / last 16 chars, full hash on hover) and a "Rotate pin" button with an inline confirmation bar.
    • Pt2 (pin-enforcement in RemoteAgentClient) ships in the same release — see the first bullet above.
  • Unified fleet-wide backup view (Phase 3 #3 — Tier 1). The Backup Orchestrator page gains an All Backups tab that lists site, database, and volume backups from every server in a single paginated table, with optional filters by server and by kind.
    • New admin endpoint GET /api/backup-orchestrator/all joins backups, database_backups, and volume_backups via a UNION CTE and resolves server_id to a server name (site backups derive their server from sites.server_id; database and volume backups carry the column directly). Query params: limit, offset, kind (site|database|volume), server_id. Returns { items, total }.
    • Per-row badges surface encrypted (at-rest encryption enabled) and remote (pushed to a backup destination) so fleet admins can spot inconsistencies at a glance.
    • Closes the last missing north-star bullet for "Operate at Scale": agent enrollment and cross-host placement were already shipped (ServerScope + servers table + install-agent.sh); the unified backup view was the remaining gap.

[2.7.17] - 2026-04-16

Added

  • 2026-ready ACME (Phase 3 #2 — Tier 1). DockPanel is now ready for Let's Encrypt's May 13 2026 tlsserver → 45-day flip, the existing 6-day shortlived profile, and the Feb 2027 / Feb 2028 classic reductions.
    • RFC 9773 ARI-driven renewal. The auto-healer now queries the CA's ACME Renewal Information for each cert and honours the suggested renewal window instead of a hard-coded 30-day threshold. Falls back to a profile-aware margin (2d / 15d / 30d) when a CA doesn't advertise ARI. New columns sites.ssl_renewal_at, sites.ssl_renewal_checked_at.
    • ACME profile selection UI. Settings → ACME Profile lets admins pick the default profile (classic / tlsserver / shortlived) for all new certificates. List auto-populates from the CA's server directory; card hides itself if the CA doesn't advertise the profiles extension. New column sites.ssl_profile stores which profile issued each cert.
    • Force-renew migrated off certbot CLI. /api/ssl/{id}/renew now issues via instant_acme and passes the previous cert as the ARI replaces hint, so the CA sees a continuous issuance chain. Legacy certbot-issued certs no longer trigger spurious failures on renew.
    • /api/ssl/profiles (admin) lists CA-advertised profiles with descriptions. /api/ssl/default-profile (admin) sets or clears the panel-wide default. /ssl/{domain}/renewal-info (agent) exposes the raw ARI suggestion per cert.

Changed

  • Auto-heal SSL copy in Settings replaced stale "3 days" threshold language with accurate ARI + profile-aware explanation.
  • DNS-PERSIST-01 (Q2 2026) intentionally deferred — no Let's Encrypt production date yet; will land once instant-acme exposes the draft API.

[2.7.16] - 2026-04-16

Added

  • Prometheus /api/metrics scrape endpoint (Phase 3 #1). Hand-formatted exposition text — no extra crate, respects the lightness axis. Gated by a SHA-256-hashed scrape token (constant-time compare via subtle); returns 404 when disabled so an off panel doesn't advertise a scrape surface. Exposes dockpanel_info, per-server cpu/memory/disk percents, per-GPU utilization / VRAM / temperature / power, per-status site counts, and alerts firing by severity. New PrometheusSettings card in Settings with auto-generated token, reveal-once banner, rotate button, and a copy-ready prometheus.yml scrape_configs block.

[2.7.15] - 2026-04-16

Added

  • GPU history + alerts (Phase 2 #2). Historical GPU charts in System (utilization, VRAM, temperature, power). Alert engine gains GPU-aware rules: VRAM > 90%, temp > 85°C, utilization pinned at 100% for 15 min.
  • Ollama model management + vLLM picker + idle-unload (Phase 2 #3).

Changed

  • CI on Actions Node 24. Upgraded action pins to their Node-24-ready versions, including sigstore/cosign-installer@v4.1.1 (no floating v4 tag exists). cargo install cargo-sbom is now called with --force so restoring a cached ~/.cargo/bin/ doesn't break the release workflow.

[2.7.14] - 2026-04-15

Fixed

  • scripts/update.sh now self-refreshes from the latest release tag. The v2.7.13 fix to the rollback bug only helped operators who manually refreshed their on-disk copy of update.sh, because update.sh wasn't in the binary release tarball and never overwrote itself during an upgrade. v2.7.14 closes the chicken-and-egg: when run with INSTALL_FROM_RELEASE=1, update.sh fetches the latest tag's scripts/update.sh from raw.githubusercontent.com, replaces its own on-disk copy if it differs, and re-execs. A SELF_REFRESHED=1 env guard prevents infinite loops.

    Operators currently stuck on v2.7.11 or v2.7.12 (where the broken health check rolls every upgrade back) need to bootstrap once:

    sudo curl -fsSL https://raw.githubusercontent.com/ovexro/dockpanel/main/scripts/update.sh \
         -o /opt/dockpanel/scripts/update.sh
    sudo INSTALL_FROM_RELEASE=1 bash /opt/dockpanel/scripts/update.sh
    

    After the first successful upgrade, future runs self-refresh automatically.

[2.7.13] - 2026-04-15

Fixed

  • scripts/update.sh rolled back every upgrade — the post-deploy health check POSTed to /api/auth/setup-status, but that endpoint is GET-only and returned 405 Method Not Allowed on every run, triggering the rollback path even when the new binaries were healthy. Caught by the v2.7.12 fresh-VPS test (the first end-to-end update.sh exercise in several releases). Operators on v2.7.11 or v2.7.12 who pulled via update.sh would have been silently held back; manual re-pull or reinstall via install.sh was unaffected. Fix: switch the check to GET.

[2.7.12] - 2026-04-15

Added

  • Per-container GPU assignment. Multi-GPU hosts can now pin specific NVIDIA devices to specific containers — pin Ollama to GPU 0, vLLM to GPU 1, Stable Diffusion to GPU 2. The deploy form auto-detects available GPUs (via the existing /apps/gpu-info) and shows a multi-select picker on hosts with two or more devices. Single-GPU hosts keep the original simple toggle. Backed by Docker's DeviceRequest.device_ids; assignment persists across update_app() recreations because Docker preserves the host_config when pulling a new image.
  • vLLM template (AI / Machine Learning). High-throughput, memory- efficient LLM inference server with an OpenAI-compatible API. Defaults to meta-llama/Llama-3.2-1B-Instruct and accepts an optional HUGGING_FACE_HUB_TOKEN for gated models. Fills the most-glaring AI template gap (the inference-engine peer to Ollama).
  • gpu_recommended flag on app templates. Templates that materially benefit from GPU passthrough (Ollama, LocalAI, vLLM, Stable Diffusion WebUI, Text Generation WebUI, Whisper) now ship a flag that surfaces a small "GPU" badge on the template card and pre-ticks the GPU passthrough toggle on the deploy form. Frontends/orchestrators (Open WebUI, LiteLLM, Flowise, Langflow, Dify) intentionally remain unflagged.

Changed

  • LocalAI default image switched to GPU variant. localai/localai:latest-cpulocalai/localai:latest-gpu-nvidia-cuda-12. The previous default silently ignored the GPU passthrough toggle on every deploy. Operators on CPU-only hosts can switch back via the Image field on the deploy form.
  • Text Generation WebUI pinned from :default-nightly to :default so shipped deploys don't drift on rebuild.

Public

  • dockpanel.dev/security launched. Public security posture page — audit count, signed-releases / SBOM story, response SLA, all 7 audit rounds with headline fixes, recent advisories, defense-in-depth grid, vulnerability-report CTA. Counter-positions DockPanel against the Coolify/CyberPanel narratives. Linked from main nav (between Compare and Pricing) and footer Product column. SECURITY.md cross-references the page at the top.

[2.7.11] - 2026-04-15

Added

  • Per-image SBOM generation (syft). Second half of the Phase 1 supply-chain story (after v2.7.10's signed releases). Generate an SPDX 2.3 JSON SBOM for any deployed Docker app's image — the composition companion to image vulnerability scanning. Defaults to off; admins opt in from Settings → Services → SBOM Generation.
    • Install button pulls Anchore's signed syft installer into /var/lib/dockpanel/scanners/syft (same self-contained, sandbox-safe pattern as grype — works under ProtectSystem=strict).
    • Download SBOM button in each app's scan drawer. Click runs syft against the app's image (10 – 60 s on first generation), persists the SPDX document, and triggers a browser download of <app>.spdx.json.
    • Persistenceimage_sbom table holds one row per image, overwritten on regeneration. Stored as JSONB so the API serves the SPDX document directly without re-parsing on the agent.
    • API surface mirrors /api/image-scan/... shape: /api/sbom/{settings,install,uninstall,generate,image/{ref}} plus /api/apps/{name}/sbom for both POST (generate) and GET (download).
    • Agent image-ref validator rejects shell metacharacters before invoking syft — defence-in-depth against shell-injection via user-supplied refs.

This is the operator-facing half: every container running on the panel now has a one-click supply-chain artifact to satisfy compliance asks (EU CRA Sep 2026) and to feed external tooling like Dependency-Track or Grype-on-SBOM.

[2.7.10] - 2026-04-15

Added

  • Signed releases via cosign keyless (Sigstore). Every binary and SBOM in the GitHub release is now signed in CI using the release workflow's OIDC identity — no long-lived signing key exists, and every signature is recorded in the public Rekor transparency log. Verification snippet in SECURITY.md.
  • Per-binary SPDX 2.3 SBOMs. cargo-sbom runs in CI for the agent, API, and CLI crates, emitting dockpanel-{agent,api,cli}.spdx.json alongside the binaries (also signed). Local builds via scripts/release.sh now generate SBOMs too; signing remains CI-only so the OIDC-bound certificate identity is always traceable to this repository's release workflow.

This is the first half of the Phase 1 supply-chain story — the next release exposes per-deployed-container SBOMs in-panel.

[2.7.9] - 2026-04-15

Added

  • Per-image vulnerability scanning (grype). First feature in the Phase 1 "Trust by Default" cycle. Scans every Docker app's image for known CVEs and surfaces a severity badge per app row on the Apps page, next to the existing update badge. Click a row to see the full CVE table (CVE ID, severity, package, installed version, fixed version). Defaults to off so existing installs see no behaviour change on upgrade — admins opt in from Settings → Services → Image Vulnerability Scanning.
    • Install button pulls Anchore's signed grype installer into /var/lib/dockpanel/scanners/ (self-contained — doesn't pollute /usr/local/bin and works under the hardened agent sandbox). The vulnerability database primes during install.
    • Scheduled scans rescan every running app's image in the background at a configurable interval (default 24h, range 1–720h).
    • Soft deploy gate refuses new deploys if the template's image has a recent scan exceeding a threshold (critical / high / medium). First encounter of an image triggers a best-effort background scan so the next deploy enforces the gate without blocking the first one.
    • Scan-on-demand from the per-app drawer. Ad-hoc scan of any image via POST /api/image-scan/scan.
    • Agent image-ref validator rejects shell metacharacters before invoking grype — defence-in-depth against shell-injection via user-supplied image references.

Fixed

  • /var/lib/dockpanel was missing from the hardened agent sandbox's ReadWritePaths. Audit 7 introduced ProtectSystem=strict on the agent unit file (panel/agent/dockpanel-agent.service) but only listed /etc/nginx, /etc/dockpanel, /var/run/dockpanel, /var/backups/dockpanel, /var/www, /var/log, /etc/letsencrypt — which meant git builds, terminal recordings, mail backups, Docker app volumes, and the new image scanner would all have silently failed if anyone deployed the hardened unit verbatim. Added /var/lib/dockpanel to the path list. (Installer scripts still emit ProtectSystem=no units, so fresh installs from install.sh / update.sh were not affected.)

[2.7.8] - 2026-04-15

Security (Audit Round 7)

  • tar backups now use --no-dereference — full-site backups, WordPress pre-update snapshots, and mailbox archives no longer follow symlinks inside the site root. A symlink pointed at /etc would previously have been archived as the target's content.
  • Cron command filter explicitly rejects \n and \r — was implicit before; defense-in-depth against scheduled-job newline injection.
  • Web-terminal command blocklist extendedchroot, pivot_root, capsh, mknod, debugfs, kexec added to the pattern list.
  • Agent systemd unit hardenedProtectKernelTunables, ProtectControlGroups, ProtectClock, ProtectHostname, RestrictRealtime, RestrictSUIDSGID, LockPersonality, RestrictNamespaces=~CLONE_NEWUSER.
  • Frontend URL guards — Telemetry's update-release link and the public status page's operator-supplied logo URL now require http(s):// schemes, blocking javascript: / data: URLs routed through backend-controlled config fields.

Fixed

  • Security-scan alert pileup eliminated. The weekly security scanner fired a new alert on every run without resolving prior firing alerts, so unacknowledged alerts compounded and the escalation loop re-notified every 2–5 minutes. New scans now auto-resolve prior firing/acknowledged security alerts before firing their own result.

Improved

  • README / COMPARISON / docs RAM claim updated — previous "~57MB" figure was stale. Fresh Vultr VPS measurement: panel services alone idle at ~19 MB (agent 12 MB + API 7 MB), or ~85 MB including the bundled PostgreSQL. Landing-page RAM bar now shows 19 MB.

[2.7.7] - 2026-04-15

Fixed

  • File Manager uploads were silently broken. The wired agent upload handler expected {path, content_base64} while the backend (and frontend) sent {path, filename, content}. A second handler in agent/routes/files.rs had the right shape but was never wired to a router. Fixed the wired handler to accept the real payload (with content_base64 alias for backwards compatibility) and removed the orphan duplicate.
  • Per-site PHP-FPM pool config changes never took effect. Agent called write_php_pool_config(...) but never reloaded PHP-FPM afterwards, so custom php_memory_mb / php_max_workers per site were ignored until a manual restart. Wired reload_php_fpm right after the pool write.
  • Installer silently fell back to IP-only mode over non-interactive SSH. Piping install.sh through an SSH session with no controlling tty made read < /dev/tty fail silently and cleared PANEL_DOMAIN. Now prints a clear "no tty — set PANEL_DOMAIN to configure" notice and points at the env var.
  • /var/lib/dockpanel/recordings was never created on fresh install. The terminal-recording API and auto-healer retention sweep both reference it. Added to the installer's mkdir -p list.

Removed

  • Agent dead code: restart_app_service, app_service_status, build_labels, connect_to_network (Docker-label routing superseded by file-provider write_route_config), volume_backup::get_backup_path (duplicate), and BackupInfo::new.

[2.7.6] - 2026-04-14

Improved

  • Complete UX polish pass — all remaining 12 pages reviewed and polished
  • Mail: success feedback for alias/backup delete, queue error handling, logs loading skeleton
  • Security: all raw Tailwind colors replaced with design system tokens (lockdown, audit log, approvals)
  • Settings: success feedback for destination delete, API key revoke, lockdown threshold save; SSH key error handling; empty states for SSH keys and IP whitelist
  • Monitors: success feedback for create/toggle/delete operations
  • IncidentManagement: inline delete confirmations (was direct delete), success feedback, settings tab empty state
  • WordPressToolkit: success banner for bulk update and hardening actions
  • Telemetry: fix unsafe error casts, fix version display bug (vundefined), color consistency
  • Login: loading spinner instead of blank page during auth check
  • Integrations: loading skeletons for WHMCS and Migrations tabs
  • NexusLayout: add missing incident count badge (consistent with other 3 layouts)
  • Color consistency: emerald/green/redrust/danger design tokens across 5 files

Removed

  • Zero any remaining in entire frontend (37 new TypeScript interfaces, completed in v2.7.5 cycle)

Security

  • Updated rand 0.9.2 → 0.9.4 (fixes 2 low-severity Dependabot alerts — soundness with custom loggers)

[2.7.5] - 2026-04-14

Improved

  • Systematic UX polish across 20+ frontend pages
  • All confirm() dialogs (25) replaced with inline confirmation bars across 5 files
  • All prompt() calls (6) replaced with inline input forms across 5 files
  • All console.error/warn/log removed from frontend page components
  • All bg-rust-50 light-mode colors replaced with dark-mode-compatible bg-rust-500/10 (8 files)
  • SiteDetail: loading skeletons for traffic stats, PHP extensions, access logs; WAF empty state
  • Databases: success feedback for create/delete/PITR toggle; typed SchemaBrowser generics
  • File Manager: save success indicator, Ctrl+S keyboard shortcut
  • DNS: 16 any type casts replaced with 5 proper TypeScript interfaces

Security

  • Upgraded rand 0.8 → 0.9.3 (fixes 2 Dependabot security alerts)
  • Upgraded vite 6.4.1 → 6.4.2 (fixes 2 high + 2 medium Dependabot alerts)

Added

  • Git hooks: pre-commit (infrastructure leak scan), pre-push (secrets + frontend staleness + version consistency)
  • Scripts: docs-audit.sh, release.sh (x86_64 + ARM64 cross-compile), deploy-check.sh

[2.7.4] - 2026-04-03

Security

  • JWT role staleness: sessions now invalidated immediately on role change (was stale up to 2h)
  • Webhook gateway DNS rebinding SSRF: destination URL re-validated at forward time, not just registration
  • Agent checkin replay prevention: timestamp validation rejects requests >120s old
  • Per-user ACME rate limiting: max 10 SSL certificates per hour per user (HTTP-01 and DNS-01)
  • DNS pre-flight check: verify domain resolves to this server's IP before HTTP-01 provisioning
  • Request timeout: 300s TimeoutLayer added as defense-in-depth against slow requests
  • Agent response streaming limit: uses http_body_util::Limited instead of buffering entire response before size check

Fixed

  • Docker container logs now strip ANSI escape sequences instead of returning raw escape codes

[2.7.3] - 2026-04-03

Added

  • GPU monitoring dashboard — VRAM used/free, temperature, power draw, fan speed, per-process usage with automatic Docker container name resolution. Shown in System Health tab. Gracefully hidden when no GPU detected.
  • GPU process table maps PIDs to Docker container names via /proc cgroup inspection

Changed

  • Certbot installer upgraded from apt (2.9.0) to snap (4.x with ARI support for upcoming 45-day LE certificates). Falls back to pip if snap unavailable.
  • OWASP CRS updated from v4.4.0 to v4.25.0 LTS

Security

  • Fixed CVE-2026-21876 (CVSS 9.3): OWASP CRS multipart charset validation bypass
  • Fixed CVE-2026-33691: OWASP CRS file upload whitespace bypass

[2.7.2] - 2026-04-02

Changed

  • System updates now stream apt output in real-time via NDJSON instead of buffering entire output
  • Agent apply_updates returns streaming response (newline-delimited JSON) for live terminal experience
  • Backend consumes streamed agent response via new post_long_ndjson() method, forwarding lines as SSE events
  • Added stream feature to reqwest for chunked response handling on remote agents

[2.7.1] - 2026-03-31

Changed

  • Version numbers synced across all packages: 2.0.6 → 2.7.0 in agent, API, CLI, and frontend
  • API endpoint count updated to 733 (465 backend + 268 agent) across all docs and marketing
  • E2E test count updated to 476 (8 test suites) across all docs and marketing
  • Docker template count corrected to 151 across 14 categories in docs site (was stale at 54)
  • Security audit rounds updated to 6 (was showing 5) in README and SECURITY.md
  • SECURITY.md now documents Audit Round 6 (zero-assumptions, 30 fixes, 260+ total)
  • FEATURES.md verified metrics updated with precise counts from code
  • CONTRIBUTING.md migration count updated (69 → 81)
  • COMPARISON.md corrected: RAM 60→57MB, templates 54→151, themes/layouts names fixed
  • Docs site getting-started.md RAM corrected (60→57MB)
  • Marketing site Landing.tsx updated with all corrected numbers

Fixed

  • Removed 3 orphaned lazy imports in frontend main.tsx (IncidentManagement, SecurityHardening, WebhookGateway — absorbed into consolidated pages)

[2.7.0] - 2026-03-30

Security — Fresh Zero-Assumptions Audit (Audit 6)

  • 6 parallel agents audited 222 Rust + 506 TypeScript files from scratch
  • 33 findings fixed across 24 files (11 HIGH, 22 MEDIUM)
  • MySQL password reset: fixed SQL injection via wrong quote escaping
  • Deploy script: added is_safe_shell_command() validation before agent forwarding
  • Laravel migration: replaced shell interpolation with dedicated safe agent endpoint
  • Terminal: sanitized uploaded filename before shell echo
  • CSRF: added X-Requested-With header enforcement on all mutating cookie-auth requests
  • Compose YAML: rewrote validator from string matching to parsed AST (serde_yaml_ng)
  • Shell command blocklist: added encoding tools, interpreters, network tools
  • Cron filter: blocked xxd, openssl enc, python3 -c, process substitution
  • Remote agent TLS: default inverted from insecure to strict
  • Agent TCP: refuses 0.0.0.0 bind without explicit AGENT_ALLOW_INSECURE_BIND=true
  • Stripe webhook: constant-time HMAC comparison
  • KDF: upgraded from SHA-256 to HKDF with backwards-compatible legacy fallback
  • Symlink attack on security remove_file/quarantine_file: canonicalize before prefix check
  • Mail forward_to/catch_all: email format + CRLF + pipe injection validation
  • SMTP test email: CRLF header injection prevention
  • WordPress plugin/theme: slug validation (alphanumeric + hyphens only)
  • Dashboard intelligence: scoped queries to authenticated user (cross-user leak)
  • Backup paths: traversal validation on agent URL construction
  • Migration: container name validation (DockPanel-managed only)
  • Stack templates: random passwords generated at selection time
  • Unix socket: permissions tightened from 0o660 to 0o600
  • Raw Command::new(): replaced 3 instances with safe_command (env sanitization)
  • is_safe_relative_path: now rejects backslashes and enforces length limit
  • Compose volumes: long-form object syntax now validated (prevents docker.sock bypass)

[2.6.9] - 2026-03-29

Fixed

  • 7 browser alert() calls replaced with in-page toast/message UI (SiteDetail, Logs, ResellerUsers, Extensions)
  • panic!() on invalid TCP bind (agent) and JWT_SECRET validation (API) replaced with clean exit
  • .unwrap() on server await replaced with error logging in agent and API main
  • Terminal WebSocket resize handler now wrapped in try-catch
  • Dashboard WebSocket cleanup race condition (handlers nulled before close)
  • Metrics WebSocket sends explicit Close frame before disconnect
  • 3 silent .ok() error discards replaced with tracing::warn logging
  • Grafana Docker template default password changed from "admin" to required field
  • Cleanup background task now supervised (auto-restarts on panic)
  • BackupOrchestrator form typed with PolicyForm interface (replaces any)

Added

  • Alert type muting UI in Settings notification channels (suppress per-type from Slack/Discord/PagerDuty)
  • Database password reset endpoint and UI (agent ALTER USER for PostgreSQL/MySQL/MariaDB)
  • Secrets vault rename and description update with inline edit UI

[2.6.8] - 2026-03-29

Fixed

  • Mail queue endpoint returns empty result when Postfix not installed (was causing 502 errors every 15s on dashboard)
  • Onboarding widget template count updated from 34 to 151
  • Real Vultr IP in test script examples replaced with RFC 5737 documentation IP
  • Monitoring screenshot scrubbed of test.dockpanel.dev URL

Added

  • 17 fresh screenshots from live VPS for all major pages (dashboard, sites, Docker apps, terminal, security, etc.)

Security

  • 6 CRITICAL/HIGH findings fixed (command injection ×3, auth bypass, timing attack, systemd injection)
  • 6 additional HIGH findings fixed (CDN SSRF, WebAuthn RP ID, IaC scope, SSH key injection, DB backup pattern)
  • 15 MEDIUM/LOW findings fixed (CORS, rate limiting, input validation, error handling)
  • CodeQL: bookmark URL validation hardened, DNS regex escaping fixed

[2.6.7] - 2026-03-28

Added — Tier 1 (High Impact)

  • Nginx FastCGI cache per site with smart bypass (logged-in users, POST, admin)
  • Cloudflare integration: zone settings, cache purge, security controls, SSL mode
  • Wildcard SSL via DNS-01 challenge (Cloudflare TXT automation, multi-part TLD support)
  • Container auto-update detection (registry digest comparison, update badges, one-click update)
  • 50 new Docker app templates (101→151 across 14 categories: AI, Media, Productivity, Communication, etc.)
  • Redis object cache per site (isolated DB numbers, WP auto-config via wp-cli)
  • WAF: ModSecurity3 + OWASP CRS v4 (per-site detection/prevention mode, event viewer)

Added — Tier 2 (Strong Differentiators)

  • Zero-downtime PHP deploys (Capistrano-style atomic symlink swap, instant rollback)
  • WordPress safe updates (pre-update snapshot, post-update health check, auto-rollback)
  • Image optimization (server-side WebP/AVIF conversion per site)
  • CDN integration (BunnyCDN + Cloudflare CDN, cache purge, bandwidth stats)
  • Restic incremental backups (encrypted, deduplicated, snapshot management)
  • Docker Compose editor validation (structured errors/warnings/info)
  • Auto-optimization recommendations (PHP-FPM workers, nginx workers, disk usage)
  • Cloudflare Tunnel (install cloudflared, token-based config, systemd service)

Added — Tier 3

  • CSP header management per site (policy editor + common presets)
  • Bot protection per site (off/basic/strict modes)
  • Passkey/WebAuthn passwordless login (manual p256+ciborium implementation, max 10 per user)
  • Per-user container isolation policies (max containers, memory, CPU, network isolation, allowed images)
  • Container auto-sleep / scale to zero (configurable idle threshold, auto-healer integration)
  • Visual DB schema browser (tables, columns, indexes, foreign key relationships)
  • Point-in-time DB recovery (WAL archiving for PostgreSQL, binlog retention for MySQL)
  • GPU passthrough for Docker (NVIDIA Container Toolkit detection, --gpus flag)
  • WHMCS billing integration (API config, webhook provisioning/suspension/termination)
  • App migration between servers (migration records, progress tracking)
  • Terraform/Pulumi IaC provider API (scoped tokens, resource listing)
  • Horizontal auto-scaling (rule-based CPU thresholds, min/max replicas, cooldown)

Added — Infrastructure

  • Telemetry & diagnostics: local event collection, opt-in remote sending, PII stripping (19 patterns)
  • Update checker: GitHub Releases API polling every 6h, dashboard banner, release notes display

Fixed

  • Agent token desync on fresh install — agent now prefers AGENT_TOKEN env var over file
  • WebAuthn RP ID defaulted to "localhost" when BASE_URL unset — now derived from request Origin header
  • Sidebar NavLink prefix matching: exact route matching on all layouts
  • 5 unbounded SQL queries now have LIMIT 500 (webhook_endpoints, pending_users, servers, backup_policies, git_previews)
  • Dependabot: picomatch 4.0.3→4.0.4, path-to-regexp 8.3.0→8.4.0 (website dependencies)

[2.6.6] - 2026-03-27

Fixed

  • Dashboard fleet overview crash on fresh install (SQL column mismatch)
  • Backup creation failure on GNU tar (--no-dereference flag)
  • Installer: silent package install failures now warn instead of lying
  • Installer: Docker volume cleanup prevents DB password mismatch on retry
  • 59 silent .ok() failures in agent replaced with proper error handling
  • 51 .ok().flatten() anti-patterns in backend replaced with error propagation
  • System updates (apt upgrade) broken by API's ProtectSystem=strict — proxied through agent

Added

  • Uninstall routes for all 10 services (PHP, Certbot, UFW, Fail2Ban, PowerDNS, Redis, Node.js, Composer, mail server, PHP versions)
  • SSL certificate renewal (certbot force-renewal) and deletion endpoints
  • User suspend/unsuspend toggle with session invalidation
  • Admin password reset for managed users
  • System Health tab shows real data (API status, uptime, CPU/mem/disk)
  • Certificates page: renew and delete buttons with confirmation
  • Monitor list pagination (limit/offset)
  • Backup retention auto-enforcement
  • Terminal share token revocation
  • 45+ command timeouts in agent (Docker, systemctl, apt, system commands)
  • Notifications page link to alert channel configuration

[2.6.5] - 2026-03-25

Security

  • Research-driven security audit: Studied CVEs from CyberPanel, HestiaCP, CloudPanel, VestaCP, Webmin, cPanel — then audited DockPanel against those attack patterns. 55 findings (12 HIGH, 28 MEDIUM, 15 LOW).
  • Command execution safety: Added safe_command() module — env_clear() on all 341 Command::new() calls across 44 files. Prevents LD_PRELOAD/PATH hijacking.
  • Credential encryption at rest: All stored credentials (DB passwords, SMTP, S3/SFTP, OAuth, TOTP, DKIM) encrypted with AES-256-GCM using dedicated key derivation.
  • Shell injection fix: Rewrote database_backup.rs — piped docker exec + gzip instead of bash -c with interpolated strings.
  • Tar symlink attacks: --no-dereference on backup creation, --no-same-owner on restore.
  • Session revocation: revoke_all_sessions now actually works — auth middleware checks cached timestamp.
  • Deploy log IDOR: Ownership verification on both git_deploys and docker_apps SSE streams.
  • Content Security Policy: Added CSP header to frontend nginx config.
  • Docker exec denylist: Added 7 escape-relevant commands (unshare, pivot_root, setns, capsh, mknod, debugfs, kexec).
  • Compose volume symlinks: canonicalize() resolves symlinks before path validation.
  • nginx header inheritance: Security headers re-declared in static asset location blocks.
  • WebSocket security: Conditional upgrade (prevents h2c smuggling), access_log off on token-bearing WS locations.
  • S3 temp files: RAII TempFileGuard with random names + 0600 permissions.
  • 2FA validation: Explicit HS256 + leeway=0 (was Validation::default()).
  • Account enumeration: Registration returns generic response.
  • Git history scrubbed: Removed all passwords, IPs, hostnames, sensitive screenshots from history via git-filter-repo.

[2.6.1] - 2026-03-22

Added (LOW Priority Gap Fixes)

  • Domain rename — New PUT /api/sites/{id}/domain endpoint to rename a site's domain. Agent handler renames nginx config, site directory, SSL certs, log files, PHP-FPM pools, Fail2Ban jails, redirects, and htpasswd configs. Backend updates monitors, status page components, and logs activity
  • Auto-firewall for proxy ports — Sites created with proxy/node/python runtime automatically get a UFW deny rule blocking external access to the allocated proxy port (traffic only allowed through nginx). Rule is auto-removed on site deletion
  • Laravel auto-migrations — Site deploys for Laravel sites (php_preset = "laravel") now auto-run php artisan migrate --force after successful deploy
  • One-time scheduled deploy — New POST /api/git-deploys/{id}/schedule endpoint to schedule a deploy at a specific time. New scheduled_deploy_at column on git_deploys. Deploy scheduler checks for due one-time schedules every 60s and auto-clears after triggering. Cancel with DELETE /api/git-deploys/{id}/schedule
  • Change Docker app image — New PUT /api/apps/{container_id}/image endpoint to change a running container's image tag. Pulls new image, stops old container, creates new one preserving volumes, rolls back on failure
  • Update Docker app resource limits — New PUT /api/apps/{container_id}/limits endpoint to update CPU/memory limits on running containers via docker update. Accepts memory_mb and cpu_percent

[2.6.0] - 2026-03-22

Fixed (Automation Gap Audit — Priority 1)

  • Auto-SSL DB update — Background SSL provisioning now updates ssl_enabled, ssl_cert_path, ssl_key_path, ssl_expiry in the database and activates paused monitors (was silently succeeding without DB update)
  • Auto-SSL config preservation — SSL provisioning now passes php_preset and root_path to the agent, preventing custom nginx config from being wiped
  • Pre-deploy backup — All deploy paths (site deploy, git deploy manual, git deploy webhook/scheduled) now create a site backup before deploying
  • Pre-delete backup — Site deletion creates a final backup before CASCADE-deleting the site record
  • Site deletion cleanup — Now removes orphaned status_page_components matching the deleted domain
  • Database restore — New POST /db-backups/{db_name}/restore/{filename} agent endpoint + POST /api/backup-orchestrator/db-backups/{id}/restore API endpoint. Supports MySQL/MariaDB, PostgreSQL, and MongoDB restore from backup files
  • Dashboard health score — Now factors in backup freshness (-5 per stale site), security scan findings (-10 critical, -3 warning), and open incidents (-10 each)
  • Smart recommendations — Dashboard intelligence endpoint returns actionable recommendations: stale backups, security findings, open incidents, expiring SSL, firing alerts, diagnostic issues. Rendered as a new Recommendations panel on the dashboard
  • Alert escalation — Unacknowledged firing alerts re-notify with [ESCALATED] prefix after 15 minutes, then every 30 minutes. New escalated_at column + migration
  • Alert-to-incident correlation — Before creating a new incident from an alert, checks for existing active incidents within 5 minutes. Appends as incident update instead of creating duplicates
  • Auto-healer restart limit — Tracks restart count per service over 30-minute window. After 3 failed restarts, stops healing, creates critical incident, sends notification, and marks state as exhausted
  • Disk-full forecast alerting — Computes disk fill rate from metrics history; alerts when disk projected full within 48h (critical if <12h)
  • Memory leak trend detection — Compares recent vs older memory averages; warns when sustained >10% increase with usage above 60%
  • Docker container crash detection — New check_container_health in alert engine detects exited, crash-looping, and unhealthy containers
  • Docker container auto-restart — Auto-healer restarts exited/dead Docker containers with same 3-attempt limit as system services
  • Incidents pause deploys — All 5 deploy paths (manual site, webhook site, manual git, webhook git, scheduled git) check for active critical/major incidents before proceeding
  • Security scanner auto-fix — Auto-renews expiring SSL certificates detected by security scans (safe findings only, never auto-deletes)
  • Fail2Ban auto-configuration — New sites auto-get a Fail2Ban jail monitoring their access log; removed on site deletion
  • Session management — New user_sessions table, GET /api/auth/sessions (list with is_current flag), DELETE /api/auth/sessions/{id} (revoke), auto-cleanup of expired sessions
  • Notification center — Bell icon with unread badge in all 4 layouts. New panel_notifications table, 4 API endpoints (list, unread-count, mark-read, mark-all-read), /notifications page with severity colors. Alerts auto-insert into notification center. 30-day retention cleanup. SSE real-time delivery. Wired into 18 event sources (deploys, incidents, backups, security, SSL, auto-healer, sites, auth)

Fixed (Automation Gap Audit — MEDIUM Priority, 25 gaps)

  • Clone site auto-provisioning — Clone now triggers auto-backup schedule, secrets vault, status page component, and site.created event
  • Composite site health — New GET /api/sites/{id}/health-summary combining SSL, backup freshness, uptime, and composite score
  • "Backup Everything" preset — New POST /api/backup-orchestrator/policies/protect-all one-click policy
  • Backup creation retry — Policy executor retries failed backups once with 5s delay
  • Backup freshness alerting — Proactive notification when sites have no backup in 48+ hours (throttled to once/hour)
  • Volume restore endpoint — New POST /api/backup-orchestrator/volume-backups/{id}/restore
  • Deploy lock — Concurrent deploys to same site blocked (checks for active building/deploying status)
  • Response time alerting — Monitors warn when response time exceeds 5000ms threshold
  • Failed cron detection — Manual cron execution fires alert on non-zero exit code
  • Postmortem auto-populate — Transitioning to postmortem status auto-generates timeline template
  • /tmp cleanup + Docker prune — Auto-healer now cleans /tmp (7d) and runs Docker system prune on disk pressure
  • Oversized log rotation — Truncates individual log files larger than 500MB during cleanup
  • Welcome email — New users receive welcome email with panel URL and credentials prompt
  • Audit log IPs — Security-sensitive actions (site create/delete, user create/delete, security fix) now log client IP
  • Auto-rollback on deploy failure — Failed site deploys auto-restore from pre-deploy backup
  • Generic webhook notifications — New notify_webhook_url in alert rules for custom integrations (Telegram, Teams, etc.)
  • Weekly digest email — Monday morning summary with 7-day alert/backup/incident/deploy counts to all admins
  • Post-deploy cache invalidation — Nginx cache purge after successful deploy (fastcgi + proxy cache)
  • Reseller brandingGET /api/branding now returns per-reseller logo/colors/name when applicable
  • Unified event timeline — New GET /api/dashboard/timeline merging deploys, backups, incidents, alerts, scans

[2.5.2] - 2026-03-22

Fixed (Theme & Layout Consistency Audit)

  • Clean-Dark rounding parity — Added ~120 lines of structural overrides (cards, modals, tables, buttons, scrollbar, selection, focus rings, progress bars, code blocks) so Clean-Dark has round corners everywhere, matching Clean
  • Ember radius normalized--radius-xl and --radius-2xl were 2px smaller than all other themes; fixed to 16px/20px
  • Clean hardcoded border-radius → CSS variables — All 11 instances of hardcoded 12px/8px/6px/4px converted to var(--radius-lg/md/sm/xs) for theme consistency
  • Status dot glow per-theme — Green glow was hardcoded for all themes; now uses theme-appropriate accent color (blue for Midnight/Clean-Dark, orange for Ember, teal for Arctic, blue for Clean)
  • Progress bar glow for Arctic & Clean — Missing glow rules added for both light themes
  • Settings theme picker missing data-color-scheme — Switching to light themes now correctly sets color scheme attribute
  • Default theme mismatch — Settings.tsx fallback aligned to midnight (was terminal)
  • FOUC prevention — Added inline script in index.html to apply theme before CSS loads
  • LayoutSwitcher light variant — Replaced hardcoded zinc/blue/white colors with theme variables
  • 2FA banner in all layouts — Replaced amber-* (stock Tailwind) with warn-* (theme tokens)
  • NexusLayout logout hoverrose-400 replaced with danger-400 theme token
  • PublicStatusPage full theme adoption — 40+ hardcoded color references replaced with theme variables
  • Terminal.tsxbg-gray-300 and bg-red-500 replaced with theme tokens
  • Login.tsx — Google OAuth button uses theme-mapped text/hover colors
  • Settings.tsx hardcoded colors — 13 instances of blue-500/red-500 replaced with accent/danger tokens
  • Dashboard stat grid square corners — Added rounded-lg overflow-hidden to stat bar and system info grids; added explicit rounded-lg to metric cards, sparkline cards, onboarding section, and issues panels
  • Compact layout flat nav — GlassLayout now respects dp-flat-nav setting (was only implemented in Sidebar layout)
  • Compact layout footer spacing — Removed nested padding wrapper, aligned px-3 to match Sidebar layout spacing
  • Layout switcher dropdown redesign — Added p-1 padding and rounded-md items to match panel dropdown style; compact mode hides label text to save space; removed bordered button style for cleaner ghost-button look

[2.5.1] - 2026-03-22

Fixed (Remaining 7 Gaps — Phase D)

  • GAP 7+21: Internal events bridge to webhook gatewayfire_event() now also forwards events to webhook gateway routes with filter_path=/event and filter_value={event_type}. Users can subscribe gateway routes to any internal event.
  • GAP 12: Docker apps auto-get monitor + status component — Docker apps deployed with a domain now auto-create an HTTP monitor and a status page component under "Docker Apps" group.
  • GAP 13: Git deploy auto-creates gateway endpoint — New git deploys auto-create a webhook gateway endpoint for webhook inspection/replay capabilities.
  • GAP 16: Incident resolve cleans up alerts + components — Resolving a managed incident auto-resolves linked alerts and clears status_override on affected status page components.
  • GAP 17: Vault export/import — New GET /api/secrets/vaults/{id}/export and POST /api/secrets/vaults/{id}/import endpoints for encrypted vault backup and transfer between DockPanel instances.

Automation Audit: Complete

All 21 identified gaps now addressed. Zero manual steps required for: backup scheduling, uptime monitoring, secret injection, incident creation, status page updates, or webhook delivery.

[2.5.0] - 2026-03-22

Fixed (21-Gap Automation Audit)

  • GAP 1: Backup policies now execute — New backup_policy_executor background service runs every 60s, evaluates cron schedules, executes backup policies across sites, databases, and volumes. Policies are no longer dead config.
  • GAP 2: Verifier respects policy_id — Backup verifier checks verify_after_backup flag. Policy executor triggers verification after successful backups.
  • GAP 3: Auto-incidents from monitoring — When a monitor goes down, the system auto-creates a managed incident with timeline, links affected status page components, and auto-resolves when the monitor recovers.
  • GAP 4: Auto status page components — New sites automatically get a status page component (if status page is enabled).
  • GAP 5: Auto-inject secrets on deploy — After a successful deploy, the system checks for a linked vault with auto_inject secrets and injects them into the site's .env file automatically.
  • GAP 6: Auto-vault for new sites — Every new site gets an auto-created secrets vault linked via site_id.
  • GAP 8: fire_event in all new features — Backup orchestrator, incident management, and secrets manager now emit extension webhook events (db_backup.created, incident.created, secrets.injected, etc.).
  • GAP 9: Critical alerts create incidents — Critical alerts and server offline/service down alerts auto-create managed incidents visible on the status page.
  • GAP 10: Backup failure creates incident — When a backup policy has failures, a managed incident is auto-created.
  • GAP 14: Backup for ALL sites — Removed the site_count <= 1 gate. Every new site now gets a daily backup schedule automatically.
  • GAP 15: Auto-monitor with deferred activation — New sites get a paused HTTP monitor that auto-activates after successful SSL provisioning (when DNS is confirmed working).
  • GAP 18: Webhook delivery cleanup — Added 7-day retention cleanup for webhook_deliveries and 90-day for backup_verifications in the auto-healer retention cycle.
  • GAP 19: Subscribers notified of auto-downtime — Status page subscribers now receive email notifications when monitors detect downtime, not just for manually-created incidents.
  • GAP 20: Policy encrypt flag works — The backup policy executor passes the encrypt flag through to agent backup endpoints when encrypt = TRUE.

Infrastructure

  • New background service: backup_policy_executor (supervised, 60s interval) — 11th background service
  • Modified: uptime.rs (auto-incidents + subscriber notifications), alert_engine.rs (critical→incident), sites.rs (auto-vault, auto-monitor, auto-component, backup for all), ssl.rs (activate monitors), deploy.rs (auto-inject secrets), auto_healer.rs (retention cleanup), backup_orchestrator.rs + incidents.rs + secrets.rs (fire_event calls)

[2.4.0] - 2026-03-22

Added

  • Webhook Gateway: Receive, inspect, route, and replay incoming webhooks.
    • Inbound endpoints: Each gets a unique URL (/api/webhooks/gateway/{token}). Unlimited endpoints per user.
    • Signature verification: HMAC-SHA256 and HMAC-SHA1 modes for GitHub, Stripe, and other providers. Configurable header name and secret.
    • Request inspector: Full request logging — headers, body, source IP, signature validation status. Click any delivery to view complete details.
    • Route builder: Forward incoming webhooks to any destination URL. JSON path filtering (e.g., only forward action=push). Custom header injection. Configurable retry (0-10 attempts with exponential backoff).
    • Replay: Re-send any past delivery to all configured routes. Useful for debugging or recovery.
    • Delivery tracking: Per-route forwarding status, response body, duration. Endpoint-level counters.
    • E2E test suite: tests/webhook-gateway-e2e.sh — endpoint CRUD, webhook receive, delivery inspection, routes, replay, filtering.

Infrastructure

  • New crate dependency: sha1 0.10 for HMAC-SHA1 signature verification.
  • New migration: webhook_endpoints, webhook_deliveries, webhook_routes tables.
  • 8 new API endpoints (7 admin, 1 public inbound).
  • Frontend: WebhookGateway.tsx with 3 tabs (Endpoints, Request Inspector, Routes).

[2.3.0] - 2026-03-22

Added

  • Secrets Manager: AES-256-GCM encrypted secret storage with version history.
    • Secret vaults: Project-scoped vaults for organizing secrets (global or per-site).
    • Encrypted storage: All secret values encrypted with AES-256-GCM (random nonce per secret, key derived from JWT_SECRET via SHA-256).
    • Secret types: Environment variables, API keys, passwords, certificates, custom — with type-specific UI badges.
    • Version history: Every update creates a versioned snapshot. Full audit trail with who changed what and when.
    • Auto-inject: Mark secrets for automatic injection into site .env files on deploy. One-click inject from vault to site.
    • Masked by default: API returns masked values (xxxx••••••••) unless ?reveal=true is explicitly requested.
    • Pull endpoint: GET /api/secrets/vaults/{id}/pull returns all secrets as decrypted key-value pairs (for CLI integration).
    • Vault sidebar UI: Split-pane layout with vault list on left, secrets table on right. Create/edit/delete with inline forms.
    • E2E test suite: tests/secrets-manager-e2e.sh — vault CRUD, secret CRUD, encryption roundtrip, version history, pull.

Infrastructure

  • New crate dependencies: aes-gcm 0.10, base64 0.22 for AES-256-GCM encryption.
  • New service: secrets_crypto.rs — encrypt/decrypt with nonce+ciphertext format, unit tests included.
  • New migration: secret_vaults, secrets, secret_versions tables.
  • 8 new API endpoints under /api/secrets/.
  • Frontend: SecretsManager.tsx with vault browser, reveal toggle, version history panel.

[2.2.0] - 2026-03-22

Added

  • Incident Management: Full incident lifecycle with real-time status updates.
    • Managed incidents: Create, track, and resolve incidents with status lifecycle (investigating → identified → monitoring → resolved → postmortem).
    • Incident severity: Minor, major, critical, and maintenance classifications.
    • Incident timeline: Post updates with status changes and messages. Full audit trail with author emails and timestamps.
    • Postmortem support: Attach post-incident analysis with publish control.
    • Affected components: Link incidents to status page components for targeted impact reporting.
  • Enhanced Status Page: Production-grade public status page replacing the basic monitor list.
    • Status page configuration: Customizable title, description, logo URL, accent color, history display settings.
    • Component groups: Organize monitors into logical service components (e.g., "API Server", "Website") with grouping.
    • Overall status indicator: Automatically computed from component health (operational/degraded/major outage).
    • Incident history: Shows active incidents with full timeline, plus resolved incidents within configurable history window.
    • Auto-detected downtime: Legacy monitor-based incidents also displayed for complete visibility.
    • Email subscribers: Public subscribe/unsubscribe for incident notifications. Verified subscribers receive updates on status changes.
    • Standalone public page: Dark-themed, no-auth status page at /status with responsive layout.
  • Admin UI: New "Incidents" page in Operations nav with 3 tabs (Incidents, Components, Settings).
  • 11 new API endpoints: Incidents CRUD + updates, status page config, components CRUD, subscribers, enhanced public endpoint.
  • E2E test suite: tests/incident-management-e2e.sh covering full incident lifecycle, components, public page, subscribers.

Infrastructure

  • New migration: status_page_config, status_page_components, status_page_component_monitors, managed_incidents, managed_incident_components, incident_updates, status_page_subscribers tables.
  • Frontend: IncidentManagement.tsx (admin), PublicStatusPage.tsx (public standalone).

[2.1.0] - 2026-03-22

Added

  • Backup Orchestrator: New centralized backup management system for databases, Docker volumes, and sites.
    • Database backups: MySQL/MariaDB (mysqldump), PostgreSQL (pg_dump), and MongoDB (mongodump) dump + restore via Docker exec. Compressed with gzip.
    • Docker volume backups: Back up any Docker volume to .tar.gz using a temporary Alpine container. Restore volumes with one click.
    • Encryption at rest: Optional AES-256-CBC encryption (PBKDF2, 100k iterations) for all backup types via OpenSSL. Encrypted files get .enc suffix, originals are auto-deleted.
    • Automatic restore verification: Verify backups by spinning up temporary database containers and restoring dumps, or extracting archives to temp directories. Checks file integrity, table counts, and entry points.
    • Backup policies: Cross-resource policies with cron scheduling, destination selection, retention count, encryption toggle, and auto-verification.
    • Backup health dashboard: Global overview with total counts, storage usage, 24h success/failure rates, active policies, verification stats, and stale backup warnings.
    • Background verifier: Supervised service running every 6 hours that automatically verifies unverified backups and fires alerts on failures.
    • B2 and GCS destinations: Backblaze B2 and Google Cloud Storage now supported as backup destinations (S3-compatible API).
    • CLI commands: dockpanel backup db-create, db-list, vol-create, vol-list, verify, health — full backup management from the command line.
    • E2E test suite: Dedicated backup orchestrator test script (tests/backup-orchestrator-e2e.sh) covering health, policies CRUD, database backup lifecycle with verification.
  • Nav item: "Backups" in Operations section links to the new Backup Orchestrator page.

Infrastructure

  • New migration: backup_policies, database_backups, volume_backups, backup_verifications tables.
  • Extended backup_destinations with encryption_enabled, encryption_key columns, and B2/GCS dtype support.
  • Agent: 4 new services (database_backup, volume_backup, encryption, backup_verify) + 3 new route modules.
  • Backend: backup_orchestrator routes (11 endpoints), backup_verifier supervised background service.
  • Frontend: BackupOrchestrator.tsx page with 5 tabs (Overview, Policies, DB Backups, Volume Backups, Verifications).

[2.0.6] - 2026-03-21

Fixed

  • Nexus themes decoupled from layout: Nexus and Nexus Dark themes were previously locked to the Nexus layout only. They are now independent color themes that work with any layout (Terminal, Glass, Atlas, Nexus). Theme cycling (Ctrl+K) and Settings picker now include all 6 themes.

Improved

  • Premium card depth: Dark theme cards (Terminal, Midnight, Ember, Nexus Dark) now have subtle box shadows creating layered depth instead of flat rectangles.
  • Progress bar polish: All progress bars now have rounded ends and a subtle accent-colored glow per theme (green/blue/orange).
  • Bolder status indicators: Status dots (online/offline/warning) are larger (10px) with colored glow halos for better visibility on dense pages.
  • Theme picker expanded: Settings appearance panel now shows all 6 themes (was 4) with accurate mini-previews including Nexus Dark and Nexus Light.
  • Layout switcher description: Nexus layout description updated to "Modern SaaS, flat nav" (was "Light, clean SaaS" which was misleading since dark themes now work with it).

[2.0.5] - 2026-03-21

Added

  • Nexus Dark theme: Premium dark mode for the Nexus layout with sun/moon toggle. GitHub Dark-inspired three-layer depth palette, Inter font, rounded corners, blue accent. Persists across sessions.
  • Sidebar group labels: Navigation groups (Reseller, Operations, Admin) now display small uppercase labels in the Command layout sidebar.
  • Glass sidebar tooltips: Native browser tooltips show nav item names when the Glass layout sidebar is collapsed.
  • Card elevation system: Three elevation levels (.elevation-1/2/3), .card-interactive hover effects, .hover-lift card animations. Applied to dashboard cards, sites table, mail service cards, app templates, server/monitor items.
  • Page header system: Sticky page-header bar with title, subtitle, and action buttons. Applied to 13 pages (Dashboard, Sites, Databases, Apps, Security, Settings, Servers, Mail, Monitoring, DNS, Users, Git Deploy, Alerts).
  • Login background gradient: Subtle radial gradient that adapts per theme (green/blue/teal/orange).
  • Modal portal system: dp-modal / dp-modal-overlay CSS classes for Nexus-compatible modal styling across 15 modals in 6 pages.

Improved

  • Button color hierarchy: Only primary CTAs (Create Site, Run Scan, Add Record) stay green. All secondary/utility buttons (Customize, Restart Nginx, Export, Refresh, etc.) use neutral gray — breaks the green monotone across 6 pages, ~25 buttons.
  • Dynamic progress bar colors: CPU/Memory/Disk bars change from green (<70%) → amber (70-90%) → red (>90%). Disk uses 80/90 thresholds. Rounded ends with smooth 500ms transitions.
  • Dashboard visual hierarchy: Metric cards with elevation, 24h chart fade-in animation, staggered stat grid, collapsible onboarding wizard (auto-collapses after 3+ steps, persists to localStorage).
  • Sidebar footer redesign: User avatar circle with initial, hover-reveal logout button, descriptive health status ("Connected"/"Disconnected" replaces "OK"/"!"). Applied to both Command and Glass layouts.
  • Typography for non-terminal themes: Midnight and Ember now remove uppercase/tracking like Nexus. All 5 sans-serif themes get 15px body text for better Inter readability.
  • Security card grid: Changed from 5-column with orphan card to balanced 3-column grid with equal min-h-[140px] heights.
  • Table hover states: table-row-hover class added to Security, DNS, and Users table rows with theme-aware hover colors.
  • Onboarding wizard: Completed steps show a solid green circle with white checkmark. Collapsible with compact "Setup: X/5 complete" view.
  • Ember theme contrast: Lightened surfaces and brightened orange accent for better text readability.
  • Atlas layout nav: Added shrink-0 to nav items so they scroll horizontally instead of compressing.
  • Richer empty states: Sites, Databases, Git Deploys, Monitors, and Crons pages show contextual feature descriptions instead of bare "No X yet" text.
  • Login page: Removed bulky "Made with Rust" gear icon, replaced with minimal "Powered by Rust" text. Card shadows added.

Fixed

  • Theme switching: Nexus→Terminal white screen: Switching from Nexus layout to any other layout left dp-theme=nexus (white) active, rendering a white Terminal layout. Fixed with dp-pre-nexus-theme save/restore in LayoutSwitcher, NexusLayout, useLayoutState, and main.tsx IIFE.
  • Nexus modal clipping: Modals in Nexus layout were clipped by overflow-hidden on the main wrapper, hiding the top fields. Fixed with createPortal to render at document.body.
  • Nexus modal contrast: Modal cards in Nexus light had the same #f9fafb background as the page (invisible). Fixed with dp-modal class providing white background, strong shadow, and proper text colors.
  • Page header spacing: Added margin-bottom: 1.25rem to .page-header for consistent spacing between header and content.
  • Nexus light theme: tinted selection buttons: Migration source cards, Settings proxy selector, and all bg-rust-500/10-style toggle buttons were rendering as solid blue blobs. Fixed with properly unescaped selectors.
  • Nexus light theme: accent toggle visibility: bg-accent-500/15 toggles now render with readable blue tint and text.

[2.0.4] - 2026-03-20

Security

  • CORS lockdown: Deny all cross-origin requests by default. Same-origin panel UI is unaffected. Previously defaulted to AllowOrigin::any() which allowed CSRF from any website.
  • Constant-time token comparison: Agent auth middleware now uses subtle::ConstantTimeEq to prevent timing attacks on token validation.
  • Token hashing in database: Agent tokens stored as SHA-256 hashes in agent_token_hash column. DB dump no longer exposes plaintext tokens for inbound auth.
  • Token rotation: New POST /auth/rotate-token on agent + POST /api/servers/{id}/rotate-token on API. 60-second grace period for old token during rotation. Updates api.env on disk for persistence.
  • Secure cookie fix: BASE_URL defaulted to https://panel.example.com, causing Secure flag on cookies over HTTP. Fixed — defaults to empty, setup script sets from domain.
  • jsonwebtoken upgraded 9 → 10.3.0: Fixes type confusion vulnerability that could lead to authorization bypass.
  • serde_yml replaced with serde_yaml_ng: serde_yml and libyml are unsound/unmaintained. Replaced with serde_yaml_ng v0.10.0.

Fixed

  • Cascade cron cleanup: Deleting a site now removes cron entries from the system crontab. Previously, DB records were cleaned via CASCADE but crontab entries were orphaned.
  • UFW port gap: Setup script now adds panel ports (80, 443, 8443) to UFW even when the firewall is pre-existing. Previously skipped port rules if UFW was already installed.
  • Token rotation API→agent desync: Rotating the agent token now updates the API's in-memory AgentClient token AND writes to api.env on disk. Previously left the API with the old token, breaking all agent communication.

Added

  • CI pipeline (.github/workflows/ci.yml): Rust clippy, frontend type check, build verification, unit tests, cargo-audit + npm audit security scanning. Runs on every push to main and PRs.
  • E2E test suite (tests/e2e.sh): 62 tests across 27 categories — full CRUD lifecycle, security edge cases, zero-leftover cleanup. Run: bash tests/e2e.sh <host> [port].
  • Deep E2E test suite (tests/deep-e2e.sh): 51 tests for advanced features — WordPress install, backup restore, git deploy, reseller system, file operations, compose stacks, concurrent operations, extensions API.
  • 29 unit tests: Config parsing (BASE_URL defaults, Secure flag logic), token hashing, input validation (domains, names, container IDs, path traversal, pagination).
  • API reference (docs/api-reference.md): 648 lines documenting all 371 endpoints with request bodies and examples.
  • Competitor comparison (COMPARISON.md): Honest comparison vs HestiaCP, CloudPanel, RunCloud, CyberPanel, Ploi.
  • README overhaul: Dashboard screenshot, comparison table, collapsible screenshot gallery, cleaner structure.
  • FUNDING.yml: PayPal sponsor link (paypal.me/ovexro).

Verified

  • Reboot recovery: All services start automatically after server reboot. 62/62 E2E tests pass post-reboot.
  • Fresh install E2E: Full install via INSTALL_FROM_RELEASE=1 on clean Ubuntu 24.04 VPS — all features operational.

[2.0.3] - 2026-03-20

Added

  • Documentation site at docs.dockpanel.dev: mdBook-generated, 8 pages (getting-started, troubleshooting, CLI reference, WordPress, Git deploy, email, multi-server, backups). 1855 lines.

Changed

  • Docker app templates pinned: 33 of 39 :latest tags replaced with specific major versions (e.g., redis:7, ghost:5, grafana/grafana:11). 6 kept at :latest due to non-standard versioning (minio, nocodb, etc.).
  • Auto-monitors removed: Sites no longer auto-create uptime monitors on creation. Users create monitors manually when DNS is configured.

Added — Documentation

  • 8 documentation pages at docs/: getting-started, troubleshooting, CLI reference, and 5 guides (WordPress, Git deploy, email, multi-server, backups). 1855 lines of practical, copy-paste-friendly docs.

Fixed — Fresh Install E2E (real clean VPS test)

  • Local server not registered after setup: API returned 503 on all requests after admin creation. Added ensure_local_server() call in the setup endpoint.
  • Site docroot missing /public/ subdirectory: Agent created /var/www/{domain}/ but nginx expected /var/www/{domain}/public/. Fixed to create the correct subdirectory.
  • Backup tar flag incompatibility: Replaced --no-dereference with -h (POSIX-compatible).

Fixed — Comprehensive Audit (57 findings across 7 audit types)

Critical

  • Migration ordering: whitelabel_oauth migration was running before reseller_system (ALTERing a table before it existed). Renumbered to 20260320050000.
  • OAuth bypasses 2FA: OAuth login issued full session without checking totp_enabled. Now redirects to 2FA challenge when enabled.
  • Setup script missing build tools: Fresh VPS source builds failed — added build-essential cmake pkg-config installation.
  • No swap on x86_64 low-RAM VPS: Swap creation only triggered on ARM. Now applies to all architectures when building from source.
  • install-agent.sh wrong env vars: Remote agents never entered phone-home mode (AGENT_TOKEN vs DOCKPANEL_SERVER_TOKEN). Fixed to write both sets.
  • Systemd services never updated during upgrade: update.sh now rewrites service files with current ReadWritePaths and hardening.
  • Required directories not created during upgrade: update.sh now creates /etc/postfix, /var/vmail, and other directories needed by new features.

High

  • UFW blocks panel port 8443: IP-based installs now open the configured panel port in UFW.
  • ExecStartPost hardcodes www-data: Agent socket chgrp now auto-detects nginx group (www-data or nginx).
  • read prompt broken in curl-pipe-bash: Domain prompt now reads from /dev/tty when stdin is piped.
  • Frontend path mismatch after upgrade: update.sh now fixes nginx root path when switching between source and release modes.
  • config.rs default LISTEN_ADDR was 0.0.0.0:3000: Changed to 127.0.0.1:3080 to match all scripts and nginx config.
  • uninstall.sh incomplete cleanup: Now removes CLI binary, tmpfiles.d, crontab entries, /var/www/acme, /var/lib/dockpanel.
  • Stacks INSERT missing server_id: Docker Compose stacks now include server_id in INSERT.
  • Staging site INSERT missing server_id: Staging environments now inherit parent site's server_id.
  • No domain uniqueness across sites + git_deploys: Cross-table domain conflict check prevents silent hijacking.
  • Blue-green deploy dropped resource limits: New container now inherits memory/cpu_period/cpu_quota from config.
  • Git preview port has no unique constraint: Added UNIQUE INDEX on git_previews(host_port).
  • Site proxy_port has no unique constraint: Added partial UNIQUE INDEX on sites(proxy_port).
  • No terminal session limit: Added AtomicU32 counter with max 20 concurrent PTY sessions.

Added

  • CONTRIBUTING.md: Development setup, architecture overview, code style, PR process.
  • GitHub issue templates: Bug report and feature request forms with structured fields.
  • GitHub PR template: Checklist for builds, tests, and changelog.

Changed

  • README.md: Added badges (license, release, build), doc links, contributing section, phone-home disclosure.
  • .gitignore: Added SSL material, database file patterns.

Fixed — Adversarial Security Pentest

  • Rate limit bypass via X-Forwarded-For: Login rate limiter now uses X-Real-IP (set by nginx, not forgeable) instead of X-Forwarded-For.
  • SSRF filter bypass in extensions: Webhook URL validation replaced string-matching with DNS resolution + is_loopback()/is_private()/is_link_local() checks. Blocks hex IPs, decimal IPs, IPv6 loopback, DNS-to-localhost, cloud metadata.
  • Nginx version disclosure: Added server_tokens off to nginx config.

Fixed — Disaster Recovery

  • Agent fails after every reboot: Removed ReadWritePaths and PrivateTmp=yes from agent systemd service (redundant with ProtectSystem=no, and caused NAMESPACE errors for missing dirs). Added ExecStartPre to create /run/dockpanel.
  • Health endpoint false "ok": /api/health now checks DB connectivity, returns "degraded" when database is unreachable.
  • StartLimitIntervalSec in wrong section: Moved from [Service] to [Unit] in all 3 scripts.

Fixed — UX Walkthrough (fresh VPS testing)

  • Secure cookie over HTTP: Login cookie conditionally sets Secure flag based on BASE_URL scheme. SameSite changed from Strict to Lax (Strict blocked OAuth redirects).
  • Site document root not created: Agent now creates /var/www/{domain}/public/ with a default index.html during site provisioning.
  • PHP site without PHP check: Agent validates PHP-FPM socket exists before writing PHP nginx config. Returns clear error with install instructions.

Fixed — Supply Chain

  • serde_yaml archived: Replaced with serde_yml in agent and CLI (serde_yaml maintainer archived the crate in 2024).
  • MailHog abandoned: Replaced mailhog/mailhog template with axllent/mailpit (MailHog last updated 2020).
  • Stale build templates: Updated rust:1.82-slimrust:1.94-slim, golang:1.23-alpinegolang:1.24-alpine.

Fixed — Code Quality

  • Cloudflare auth header deduplication: 5 inline blocks → shared helpers::cf_headers().
  • Server IP detection deduplication: 6 inline blocks → shared helpers::detect_public_ip().
  • Agent semaphore split: Long-running ops (Docker builds) use separate 5-permit semaphore, quick requests keep 20.
  • Extension webhook rate limiting: Max 20 concurrent deliveries with atomic counter.
  • DB pool acquire timeout: 5-second timeout prevents indefinite blocking.
  • Uptime monitor N+1 query: Maintenance window check batched into single query.

[2.0.2] - 2026-03-20

Changed

  • Version alignment: All Cargo.toml and package.json versions bumped to 2.0.2 (were 0.1.0/1.0.0). API health endpoint and CLI --version now report correct version.
  • Binary size claims: Marketing site, README, and FAQ updated from "~20MB" (agent-only) to "~35MB" (total of agent + API + CLI) for honest comparison.
  • Template count: FAQ corrected from 53 to 54 app templates.
  • OS support: Hero section now includes Rocky Linux 9+ alongside other supported distros.

Fixed

  • install-agent.sh binary naming: Was downloading dockpanel-agent-x86_64 / dockpanel-agent-aarch64 but GitHub Releases publishes dockpanel-agent-linux-amd64 / dockpanel-agent-linux-arm64. Fixed to match release naming.
  • install-agent.sh apt-get hardcoding: Now detects package manager (apt/dnf/yum) instead of hardcoding apt-get. CentOS, Rocky, Fedora, and Amazon Linux now supported for remote agent installs.
  • install-agent.sh server-id persistence: --server-id was accepted but never written to config. Now persisted to /etc/dockpanel/api.env as SERVER_ID.
  • install-agent.sh tmpfiles.d: Added /run/dockpanel tmpfiles.d entry so socket directory survives reboots.
  • install-agent.sh systemd hardening: Remote agent service now matches local agent hardening (MemoryMax, LimitNOFILE, PrivateTmp, ProtectKernelLogs/Modules).
  • update.sh pre-built binary path: Added INSTALL_FROM_RELEASE=1 support so ARM users who installed via release binaries can update without Rust toolchain.
  • update.sh redundant health check: Removed duplicate wait-for-health loop after rollback-capable check.

[2.0.0] - 2026-03-19

Added — High-Impact Features

  • Multi-Server Management: Manage unlimited remote servers from one panel. AgentRegistry dispatches to local (Unix socket) or remote (HTTPS) agents. Server selector in sidebar, test connection, install script for remote agents. ServerScope extractor with user ownership verification on every request.
  • Reseller / Multi-Tenant Accounts: Admin → Reseller → User hierarchy. Reseller quotas (max users/sites/databases), server allocation, per-reseller branding (logo, colors, hide DockPanel name). Quota enforcement on site/database creation with counter sync.
  • Nixpacks Auto-Detection: Build any app without a Dockerfile using Nixpacks (30+ languages). Dynamic version resolution from GitHub releases. Deploy pipeline: try Nixpacks → fall back to auto-detect (6 langs) → docker build. Build method tracked per deploy.
  • Preview Environments: TTL-based auto-cleanup of preview deployments. Branch deletion webhook auto-removes previews. Configurable preview_ttl_hours per deploy. Background cleanup service (5-minute interval).
  • Migration Wizard: Import sites, databases, and email from cPanel, Plesk, or HestiaCP. 4-step wizard: select source → analyze backup (auto-detect domains, DBs, mail) → select items → SSE-streamed import. cPanel full parser, Plesk/HestiaCP beta stubs.
  • WordPress Toolkit: Multi-site WP dashboard with parallel detection. Vulnerability scanning against 14 known exploited plugins. Security hardening (7 checks, 6 auto-fixable via wp-cli). Bulk update plugins/themes/core across selected sites.
  • White-Label Branding: Public /api/branding endpoint. Per-reseller logo_url, accent_color, panel_name, hide_branding. BrandingContext provider applies to sidebar + login page. Dynamic accent color via CSS variable.
  • OAuth / SSO Login: Google, GitHub, GitLab via OAuth 2.0 authorization code flow. CSRF state tokens (10-minute expiry). GitHub private email fallback. Auto-create users on first OAuth login (configurable). Provider-colored login buttons.
  • Traefik Reverse Proxy: Alternative to nginx for Docker app routing. Traefik v3.3 as Docker container with auto-SSL (Let's Encrypt ACME). File-based dynamic route configs with auto-watch. Install/uninstall/status management. Settings toggle in admin panel.
  • Plugin / Extension API: Webhook-based integrations with HMAC-SHA256 signed event delivery. Extension CRUD with dpx_ API keys and whsec_ webhook secrets. Event types: site/backup/deploy/app/auth/ssl. Delivery log with status tracking. Secret rotation. SSRF protection on webhook URLs.

Added — Feature Gap Analysis Enhancements

  • SQL Browser: Built-in query editor for PostgreSQL and MariaDB with schema viewer
  • Node.js + Python Site Runtimes: Managed systemd services with auto-port allocation
  • Docker Compose Stacks: Full stack lifecycle (deploy, start, stop, restart, update, remove)
  • Blue-Green Zero-Downtime Deploy: Docker app updates with traffic swap and rollback
  • Git Push-to-Deploy Pipeline: Clone → build → deploy with webhook triggers and rollback
  • Container Health Checks: Docker health status (healthy/unhealthy/starting) in Apps view
  • Container Logs Viewer: Search, filter, auto-refresh, color-coded log levels
  • Command Palette (Ctrl+K): Global search across all panel pages
  • One-Click App Updates: Pull latest image, preserve config, recreate container
  • 34 App Templates: Database, CMS, monitoring, analytics, tools, dev, storage, media, networking, security
  • Getting Started Wizard: 5-step onboarding checklist

Changed

  • Architecture: Single-agent → multi-agent (AgentRegistry, AgentHandle enum, RemoteAgentClient)
  • Auth: Added ResellerUser extractor, ServerScope with ownership verification
  • Database: 8 new tables, server_id FK on all resource tables, reseller profiles, extensions, migrations
  • Frontend: BrandingContext, ServerContext providers. 8 new pages (Servers, ResellerDashboard, ResellerUsers, Migration, WordPressToolkit, Extensions, plus per-site WP and Git Deploy enhancements)
  • Rust Edition: 2024 (Rust 1.94)

Security

  • ServerScope verifies server.user_id == claims.sub on every request (prevents cross-user server access)
  • OAuth: SameSite=Strict cookies, error callback handling, empty oauth_id validation, no auto-link to password accounts
  • Extension API: SSRF protection (blocks private IPs, metadata endpoints), HMAC bypass fix, webhook secret rotation
  • Migration wizard: command injection fix (direct docker args), path traversal validation, TAR --no-same-owner
  • WordPress: domain path validation, targeted chown (not recursive), site path fallback
  • Nixpacks: build_context path traversal validation, dynamic version resolution
  • Traefik: ACME directory permissions (0700), network cleanup on uninstall
  • Branding: logo_url validated (HTTP(S) only), accent_color validated (hex/rgb/hsl only)
  • Reseller: quota enforcement wired up, server isolation for reseller users, counter sync on create/delete
  • Preview: TTL reset on redeploy, MAKE_INTERVAL for PostgreSQL safety, cleanup error logging

Fixed

  • 100+ findings from 9 comprehensive audits across all features
  • server_id filtering added to git_deploys, stacks, databases, dashboard, alerts list endpoints
  • Compose deployments now correctly set build_method='compose'
  • Preview cleanup query uses MAKE_INTERVAL instead of string concat
  • fire_event() wired into site/backup/app handlers (was dead code)
  • Traefik Docker app integration (was install-only with no functional routing)
  • Frontend SecurityItem type mismatch in WordPress Toolkit fixed
  • OAuth parameter mismatch (doc_root vs source_dir) in migration wizard fixed

[1.1.0] - 2026-03-15

Added

  • Email Management: Full mail server with one-click install (Postfix + Dovecot + OpenDKIM). Domains, mailboxes, aliases, catch-all, quotas, autoresponders, DKIM signing, DNS helper (MX/SPF/DKIM/DMARC), mail queue viewer
  • PowerDNS: Self-hosted DNS alongside Cloudflare. Provider selector, zone creation, record CRUD, setup guide
  • One-Click CMS Install: WordPress, Drupal, Joomla — create site + database + install + SSL in one click from Sites page
  • Historical Charts: SVG sparkline charts (CPU/Memory/Disk 24h) with background metrics collector (60s interval, 7-day retention)
  • Light Theme: CSS variable overrides, sun/moon toggle in sidebar footer, localStorage persistence
  • One-Click Service Installers: PHP-FPM, Certbot, UFW, Fail2Ban — install from Settings page
  • Smart Port Opener: Port recognition (28+ ports), safety categories (safe/caution/blocked), quick presets (Web/Mail/Database)
  • SSH Key Management: List/add/remove authorized keys with SHA256 fingerprints
  • Auto-Updates: Toggle for unattended-upgrades security patches
  • Panel IP Whitelist: Restrict panel access to specific IPs
  • Auto-SSL: Automatic Let's Encrypt provisioning on site creation
  • Webhook Testing: Test Slack/Discord webhooks from Settings
  • File Upload: Base64 binary upload with path traversal protection
  • Webmail Template: Roundcube one-click deploy from Docker Apps
  • Spam Filter Template: Rspamd one-click deploy from Docker Apps
  • BUILD STABLE Badge: Build status indicator in sidebar footer

Changed

  • Harmonized Color Palette: Green/amber/red at identical saturation/lightness (anchored at #22c55e). Custom warn-* and danger-* CSS scales. Zero stale emerald/amber/yellow references
  • Dashboard Redesign: Bar metrics with centered text-5xl numbers (replaced ring gauges), neutral white numbers + gray progress bars (color only for warnings/critical), system info grid (replaced neofetch style)
  • Sidebar Overhaul: Flat nav (no progressive disclosure), white active state with blinking _ cursor, 19px icons, spacing-only groups
  • Terminal Frame: Unified bordered container (header + canvas in single frame)
  • Mobile Responsive: Card layouts for Activity, Users, DNS records. Logs toolbar wrapping. Monitors polish
  • Contrast: All text-dark-400 bumped to text-dark-300 globally (36 instances, 14 files) for WCAG compliance
  • Animations: Page fade-up, stagger children, counting numbers, typewriter welcome, hover-lift. Respects prefers-reduced-motion
  • Login Page: Logo updated to match sidebar brand
  • Apps/Sites Separation: WordPress/Drupal/Joomla moved from Docker Apps to native PHP in Sites. 32 Docker templates remain for services and tools
  • 502 Error UX: "Agent offline" message with systemctl restart command instead of cryptic "Request failed (502)"
  • Security Score: Prominence increase, singular/plural grammar fix
  • Apps Empty State: Error message with icon when templates fail to load

Fixed

  • Diagnostics: Agent nginx -t check distinguishes [warn] from [emerg]/[error] — no false critical on cosmetic warnings
  • Document Root False Positives: Changed ProtectHome=yes → read-only so agent can see /home/* directories
  • Agent Socket Persistence: Added tmpfiles.d config + /run/nginx.pid to ReadWritePaths
  • Agent Permissions: NoNewPrivileges=no, ReadWritePaths for mail/apt/etc paths — enables package installation
  • CUPS Disabled: Removed unnecessary print service

Security

  • Setup script auto-installs UFW + Fail2Ban with default rules
  • Smart firewall blocks dangerous ports (Telnet, NetBIOS, SMB, MSSQL)
  • All cookie flags verified: HttpOnly, Secure, SameSite=Strict, Max-Age=7200

Infrastructure

  • Metrics collector background service (60s interval, 7-day retention)
  • Mail config sync to Postfix/Dovecot via atomic file writes
  • DKIM key generation via openssl RSA 2048-bit
  • Setup script installs PHP, Certbot, UFW, Fail2Ban out of the box

[1.0.0] - 2026-03-14

Added

  • Core Panel: Site management (static, PHP, proxy), database management (PostgreSQL, MariaDB), SSL (Let's Encrypt), file manager, web terminal, backups
  • Docker Apps: 50+ one-click templates across 10 categories + Docker Compose import
  • CLI: Full command-line interface — status, sites, db, apps, ssl, backup, logs, security, diagnose, export, apply
  • Infrastructure as Code: YAML export/import of server configuration
  • Smart Diagnostics: Pattern-based issue detection across 6 categories with one-click fixes
  • Auto-Healing: Automatic restart of crashed services, log cleanup on full disk, SSL renewal
  • Alerting System: 5 alert types (CPU/memory/disk thresholds, server offline, SSL expiry, service health, backup failure) with email, Slack, Discord notifications
  • 2FA/TOTP: Full two-factor authentication with QR setup and recovery codes
  • Dashboard Intelligence: Health score (0-100), top active issues, SSL expiry countdowns
  • Docker Resource Limits: Memory and CPU limits on container deploy
  • Container Management: Health checks, logs viewer, environment viewer, one-click updates
  • Security: Firewall management, Fail2Ban, SSH hardening, security scanning with scoring
  • DNS Management: Cloudflare DNS zone management with full record CRUD
  • Git Deploy: Webhook-triggered deployments from Git repos
  • Staging Environments: Create staging copies, sync from production, push to live
  • Uptime Monitoring: HTTP checks with configurable intervals and incident tracking
  • Teams: Multi-user access with roles and team-based permissions
  • Activity Log: Full audit trail of all admin actions
  • Multi-Server: Manage unlimited servers from a single dashboard
  • ARM64 Support: Pre-built binaries for Raspberry Pi and ARM64 servers
  • Auto Reverse Proxy: Domain + SSL auto-configured when deploying Docker apps
  • Command Palette: Ctrl+K global search across all panel pages
  • Notification Channels: Email toggle, Slack/Discord webhook configuration
  • Custom Nginx Directives: Per-site textarea for advanced nginx config
  • Onboarding Wizard: 5-step getting started checklist for new users

Security

  • JWT auth with HttpOnly cookies + Bearer header support
  • Token blacklist for logout with periodic cleanup
  • Argon2 password hashing
  • Rate limiting on login, 2FA, webhooks, and agent endpoints
  • Systemd hardening (NoNewPrivileges, ProtectSystem, MemoryMax)
  • Nginx rate limiting (30r/s on API)
  • 12 CHECK constraints on database status/type fields
  • Atomic nginx config writes (tmp+rename)

Infrastructure

  • Supervised background tasks with auto-restart on panic
  • Statement timeout on all database pool connections (30s)
  • Agent request timeout (60s)
  • DB backup cron (daily, 7-day retention)
  • Docker prune cron (weekly)