All notable changes to DockPanel will be documented in this file.
The format is based on Keep a Changelog.
Fixes two fresh-install blockers reported in #70 and #71.
- Install failed on Debian 12 with `GLIBC_2.38 / GLIBC_2.39 not found` (#70). Release binaries were built on `ubuntu-latest` (now Ubuntu 24.04, glibc 2.39), so the dynamically linked agent/API/CLI demanded glibc ≥ 2.38 and the agent refused to start on Debian 12 (glibc 2.36). The same break silently affected the rest of the documented support matrix: Ubuntu 20.04, Debian 11, CentOS 9, Rocky 9, and Amazon Linux 2023 all ship glibc ≤ 2.34. The release workflow now builds fully static musl binaries (`x86_64-unknown-linux-musl` / `aarch64-unknown-linux-musl`) via `cargo-zigbuild`, so the binaries carry zero glibc dependency and run on any modern Linux regardless of distro libc version (`ldd` reports "statically linked"). DockPanel's TLS stack is entirely rustls, so there is no OpenSSL system dependency to block static linking.
- First login bounced straight back to the login screen on a domain install served over HTTP (#71). When a domain is supplied at setup, `setup.sh` writes `BASE_URL=https://<domain>`, but the panel vhost is served over plain HTTP until TLS is added. The cookie helper keyed the `Secure` flag off `BASE_URL`, so it stamped `Secure` on the session cookie even though the response left over HTTP. The browser silently dropped the cookie and the next `/api/auth/me` returned 401, bouncing the user back to the login screen in every browser. This was the case the #47 fix left open (it only removed the empty-`BASE_URL` path). `routes/auth.rs::cookie_secure_flag` and the OAuth callback now derive `Secure` solely from the actual request scheme (`X-Forwarded-Proto`, which nginx always sets and is authoritative because the API only listens on `127.0.0.1` behind the proxy). HTTPS installs still get a `Secure` cookie; HTTP-served installs no longer bounce.
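
The scheme-only decision described above can be sketched as follows. This is a minimal illustration, not the code in `routes/auth.rs`: the signature of `cookie_secure_flag`, the `session_cookie` helper, and the cookie name and attributes are all assumptions made for the example.

```rust
/// Decide whether the session cookie should carry `Secure`, looking only at
/// the scheme the reverse proxy reports for this request.
/// `forwarded_proto` is the raw X-Forwarded-Proto header value, if any.
/// (Hypothetical signature; the real helper may take a header map.)
fn cookie_secure_flag(forwarded_proto: Option<&str>) -> bool {
    match forwarded_proto {
        // If several proxies appended values, the first entry is the
        // client-facing hop.
        Some(value) => value
            .split(',')
            .next()
            .map(|first| first.trim().eq_ignore_ascii_case("https"))
            .unwrap_or(false),
        // No header at all: treat the request as plain HTTP.
        None => false,
    }
}

/// Build a Set-Cookie value. Cookie name and attributes are illustrative.
fn session_cookie(token: &str, forwarded_proto: Option<&str>) -> String {
    let mut cookie = format!("session={}; Path=/; HttpOnly; SameSite=Lax", token);
    if cookie_secure_flag(forwarded_proto) {
        cookie.push_str("; Secure");
    }
    cookie
}

fn main() {
    // HTTP-served install: no Secure attribute, so the browser keeps the cookie.
    println!("{}", session_cookie("abc", Some("http")));
    // HTTPS install: Secure is still set.
    println!("{}", session_cookie("abc", Some("https")));
}
```

Note that `BASE_URL` does not appear anywhere in the decision, which is the point of the fix: a configured `https://` base URL cannot force `Secure` onto a response that actually leaves over HTTP.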
Hotfix for the v2.8.22 webmail reverse-proxy regression reported in #57.
- Webmail "Open" landed on the panel dashboard instead of Roundcube login. Roundcube emits root-anchored URLs in its HTML (`form action="/?_task=login"`) and inline JS (`comm_path: "/?_task=login"`); it has no concept that it lives under `/webmail/` on the panel vhost. The v2.8.22 nginx fragment proxied with `proxy_redirect off` and no body rewriting, so the browser navigated to `/?_task=login`, hit the panel's `location /` block, and rendered the React SPA (dashboard). The CPU spike in the report was Roundcube's container booting on first hit. Fix in `panel/agent/src/routes/mail.rs:926`: added `proxy_redirect / /webmail/;` to rewrite 30x `Location:` headers and `sub_filter '"/?_task=' '"/webmail/?_task=';` to rewrite embedded URLs in HTML/JSON/JS bodies. Also clears `Accept-Encoding` to upstream so `sub_filter` receives uncompressed responses.
- Auto-heal for existing webmail installs. v2.8.22 → v2.10.0 boxes already have the broken fragment on disk; the agent only writes on Install click, so users would have to Remove + Install to recover. `scripts/update.sh:428` detects the old shape (no `sub_filter` line) and regenerates the fragment from the current template, using the current Roundcube container's host port from `docker inspect`.
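
The two ideas in this hotfix, detecting a pre-fix fragment and rewriting root-anchored URLs, can be illustrated in a few lines. The actual detection lives in `scripts/update.sh` (shell) and the rewriting is done by nginx itself; the Rust functions below (`fragment_needs_regeneration`, `rewrite_body`) are hypothetical names used only to show the logic.

```rust
/// True when a webmail nginx fragment predates the fix, i.e. no line in it
/// starts with a `sub_filter` directive. Comment lines do not count.
fn fragment_needs_regeneration(fragment: &str) -> bool {
    !fragment
        .lines()
        .map(str::trim_start)
        .any(|line| line.starts_with("sub_filter "))
}

/// The effect of the sub_filter rule on a response body, modelled as a plain
/// string replacement: `"/?_task=` becomes `"/webmail/?_task=`.
/// This sketch replaces every occurrence.
fn rewrite_body(body: &str) -> String {
    body.replace("\"/?_task=", "\"/webmail/?_task=")
}

fn main() {
    let old_fragment = "location /webmail/ {\n    proxy_redirect off;\n}\n";
    println!("needs regeneration: {}", fragment_needs_regeneration(old_fragment));
    println!("{}", rewrite_body("<form action=\"/?_task=login\">"));
}
```

Matching on the leading quote character keeps the replacement from touching URLs that are already prefixed with `/webmail/`.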
Phase 4 W4 ships panel self-update from the UI with health-check rollback, persistent snapshots, update channels (stable / candidate / hold), and fleet rolling updates.
The reframing matters: scripts/update.sh:430-499 already ships a
production-tested binary-swap + .bak-restore rollback flow. v2.10.0 does
NOT reimplement that — the new orchestrator (services/panel_update.rs)
shells out to the same script under a controlled environment
(DOCKPANEL_NO_SELF_REFRESH=1 + DOCKPANEL_VERSION=<target>) so every
bug fix already in update.sh (self-refresh ordering from v2.8.15-16,
lock-wait conf from v2.8.17, fragment-include awk migration from v2.8.22,
ACME cooldown from v2.8.23) keeps working unchanged. The new code is a
presentation + persistence layer over a proven core.
- Apply Update button in Telemetry → Updates tab. Click → confirm modal showing target version + 4-step preview ("snapshot, download, swap, probe") → status modal that polls `/api/update/status` every 2s. Replaces the static SSH copy-paste block.
- Pre-update snapshot service (`services/panel_snapshot.rs`). Each Apply call first writes a tar.gz triplet to `/var/backups/dockpanel/snapshots/` containing `binaries/{agent,api,cli}` + `db/dump.sql.gz` (`pg_dump --clean --if-exists`) + `etc/dockpanel/` + `metadata.json`. Written to `.tmp` then atomically renamed; DB row only inserted after rename succeeds. Refuses to create when the snapshot partition has less than 2 GiB free.
- Operator-triggered rollback from the UI. Each snapshot row has a Roll back button (confirms then restores binaries + DB + /etc and bounces services). Reach back to any retained snapshot, not just the `.bak` that `update.sh` keeps for ~30 seconds.
- Update channels: stable (GA only, the current default behaviour), candidate (includes `prerelease: true` builds, takes the first by `published_at` desc), hold (skips the 6h auto-poll entirely; Manual Check button still works). Channel selector in Updates tab, single `settings.update_channel` row.
- Fleet rolling update. Operator-initiated form in Updates tab: target version + halt-on-failure + include-panel toggles. Plan = all user-owned remote servers reachable in the last 5 minutes, sorted oldest agent_version first. POSTs to each agent's new `/panel/update` endpoint, polls `/panel/update/status` for terminal state, records per-server progress in `fleet_update_runs.progress` JSONB. Halts on first failure unless `halt_on_failure: false`.
- Agent-side `/panel/update` receiver (`panel/agent/src/routes/panel_update.rs`). Distinct from the existing OS-package `/system/updates/*` endpoints. Bearer-auth, returns 202, spawns `update.sh` detached so the agent's own systemctl restart doesn't break the subprocess pipeline.
- 10 new admin endpoints under `/api/update/*` + `/api/snapshots/*`: GET `/status`, POST `/apply`, POST `/manual-check`, POST `/rollback`, GET+PUT `/channel`, GET+POST `/api/snapshots`, DELETE `/api/snapshots/{id}`, GET+POST `/update/fleet`, GET `/update/fleet/{id}`. All admin-gated.
- Snapshot retention sweep wired into the existing 24h `run_retention_cleanup` ticker in `auto_healer.rs`. Always keeps the last 3 snapshots regardless of age; deletes anything older than 7 days beyond that floor. File-delete first; DB row stays for retry if the file delete fails.
- Startup finalize hook (`finalize_pending_on_startup` in `services/panel_update.rs`). Closes out any `panel_snapshots` rows with `to_version IS NULL` after a process restart by writing `to_version = CARGO_PKG_VERSION`. Equal `from_version` / `to_version` on a finalized row indicates `update.sh`'s in-flight rollback fired; differing values indicate a successful apply.
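
The retention rule above (keep the newest 3 unconditionally, then delete what is older than 7 days) can be sketched as a pure function over snapshot ages. The names `snapshots_to_delete`, `KEEP_FLOOR`, and `MAX_AGE` are hypothetical; the real sweep in `auto_healer.rs` works on DB rows and files, and this sketch only shows the selection logic.

```rust
use std::time::Duration;

/// Number of newest snapshots that are never deleted.
const KEEP_FLOOR: usize = 3;
/// Age limit applied to everything beyond the floor.
const MAX_AGE: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Given the age of each snapshot, return the indices eligible for deletion,
/// ordered from newest to oldest among the candidates.
fn snapshots_to_delete(ages: &[Duration]) -> Vec<usize> {
    // Sort indices newest-first (smallest age first).
    let mut order: Vec<usize> = (0..ages.len()).collect();
    order.sort_by_key(|&i| ages[i]);
    order
        .into_iter()
        .skip(KEEP_FLOOR) // the floor: newest three always survive
        .filter(|&i| ages[i] > MAX_AGE) // beyond the floor, age decides
        .collect()
}

/// Convenience constructor for the example.
fn days(n: u64) -> Duration {
    Duration::from_secs(n * 24 * 60 * 60)
}

fn main() {
    let ages = [days(1), days(10), days(8), days(9), days(2)];
    println!("delete indices: {:?}", snapshots_to_delete(&ages));
}
```

Because the floor is applied before the age filter, a box that has not been updated for months still keeps its three most recent snapshots.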
- Update poller honours `update_channel` (`services/telemetry_collector.rs:302`). `hold` skips the poll; `candidate` widens to `/releases?per_page=20` (first by `published_at` desc); `stable` keeps the existing `/releases/latest` URL bit-for-bit.
- `scripts/update.sh` accepts two new env vars. `DOCKPANEL_NO_SELF_REFRESH=1` bypasses the v2.7.13 self-refresh block so the orchestrator can stream a single subprocess invocation's stdout into its state machine without a mid-flight re-exec breaking the pipe (SSH-operator flow keeps self-refresh on by default). `DOCKPANEL_VERSION=vX.Y.Z` pins the release tag instead of fetching `/releases/latest`, so a candidate-channel pick can't race a GA publish between the panel's poll and the operator's click.
- One settings row inserted (`update_channel = 'stable'`, the implicit pre-W4 default). Two new empty tables (`panel_snapshots`, `fleet_update_runs`). No ALTER on existing tables; every install keeps current behaviour until an admin clicks Apply or changes the channel. Migration file: `20260520000000_panel_self_update.sql`.
- Snapshots consume disk: typically ~150-300 MB each, retained 7 days (last 3 always kept). Stored under `/var/backups/dockpanel/snapshots/`. A free-disk pre-check refuses to create a snapshot if the partition has less than 2 GiB free.
- `update.sh`'s SSH-only flow continues working unchanged; no operator is forced to use the UI.
- Cosign signature verification at download time is not in W4. HTTPS-to-GitHub is the existing trust boundary; cosign verify is a separate hardening pass (non-trivial key management).
- The api process will be killed mid-binary-swap by `update.sh`'s `systemctl stop dockpanel-api`. The orchestrator state lives in the DB rows (`panel_snapshots.to_version`); the new process boots Idle and the finalize hook closes out the in-flight row.
Phase 4 W3 ships on-call rotations and escalation policies. A small team can now self-host their on-call schedule in DockPanel — when an alert fires, the panel pages whoever is on-call right now (not every channel on the rule); if the alert isn't acknowledged within a policy-defined threshold, the panel routes the page to the next step in the chain.
Larger teams that already pay for PagerDuty keep using PagerDuty — the
escalation policy supports a webhook:<url> route shape that forwards
directly into their existing PD service key.
- `on_call_schedules` table + admin tab. A rotation = ordered list of user IDs plus a cadence in days (1–90). "Who's on-call at time T" is one-liner cadence math against an anchor; no calendar widget, no per-day overrides, no holiday handling. New endpoints `GET/POST/PUT/DELETE /api/on-call/schedules[/{id}]` (admin) and `GET /api/on-call/whoami` (any authenticated user) for "am I on the hook right now?"
- `escalation_policies` table + admin tab. Policies are an ordered JSONB array of `{after_minutes, route}` steps. Routes are discriminated: `on_call_schedule:<uuid>` resolves to the current rotation holder, `user:<uuid>` pages a specific user, `all_channels` preserves the pre-W3 default (alert owner's channels), `webhook:<url>` is a direct outbound webhook bypass. New endpoints `GET/POST/PUT/DELETE /api/escalation-policies[/{id}]` (admin).
- Per-alert-rule policy attachment. `alert_rules` gains a nullable `escalation_policy_id` FK. NULL = pre-W3 hardcoded 15-min unack → 30-min re-page (unchanged for every existing rule). Admin-only attach endpoint at `PUT /api/alert-rules/{rule_id}/escalation-policy`.
- Ack actor + optional comment. `PUT /api/alerts/{id}/acknowledge` now accepts an optional `{ "comment": "..." }` body (500-char cap) and stores both `acknowledged_by` (the actor) and `acknowledged_comment`. Older clients that PUT with no body keep working; they just don't carry a comment. The UI surfaces actor email + truncated comment inline on each acked alert row.
- Frontend tabs. Alerts page grows two new tabs alongside Alerts + Runbooks: an On-call editor (rotation CRUD with reorderable member list) and an Escalation policies editor (step chain with route picker + live route description).
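
The "one-liner cadence math against an anchor" mentioned for `on_call_schedules` can be sketched as below. This is an assumption about the shape of the calculation, not the panel's code: `on_call_index` and its parameters are hypothetical, timestamps are Unix seconds, and the behaviour for times before the anchor (wrapping backwards through the rotation) is a choice made for this example.

```rust
/// Index into the rotation's member list of whoever is on call at `now`.
/// `anchor` is when member 0's first shift started; each shift lasts
/// `cadence_days` days. Returns None for an empty rotation or a cadence
/// outside the documented 1..=90 range.
fn on_call_index(anchor: i64, now: i64, cadence_days: i64, members: usize) -> Option<usize> {
    if members == 0 || !(1..=90).contains(&cadence_days) {
        return None;
    }
    let period = cadence_days * 86_400;
    // Floor division so times before the anchor land in negative slots.
    let slot = (now - anchor).div_euclid(period);
    // Euclidean remainder keeps the index non-negative.
    Some(slot.rem_euclid(members as i64) as usize)
}

fn main() {
    let week = 7 * 86_400;
    // Three members, weekly cadence, anchor at t = 0.
    for t in [0, week - 1, week, 3 * week] {
        println!("t={} -> member {:?}", t, on_call_index(0, t, 7, 3));
    }
}
```

With no calendar widget or per-day overrides, the whole schedule reduces to this division and remainder, which is why the lookup needs no stored shift rows.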
- Escalation pages now carry the runbook payload. Phase 4 W2 added the runbook excerpt + URL to fire payloads via `send_notification_with_runbook`, but `check_escalations` was still calling bare `send_notification`, so re-pages on unacknowledged alerts lost the runbook context that the original fire had carried. The W3 rewrite of `check_escalations` extracts the shared `load_runbook_payload` helper so fire and escalation paths produce identical payloads.
No manual action is required. escalation_policy_id is added to
alert_rules as a nullable FK with default NULL — every existing rule
keeps its pre-W3 behaviour bit-for-bit. The three new alerts columns
(acknowledged_by, acknowledged_comment, escalation_step_index)
default to NULL/0 on existing rows.
- SSL renewal cadence is now profile-aware. The auto-healer previously used a hardcoded 6h cooldown for both ARI re-fetch (RFC 9773) and the post-attempt retry. For the new `shortlived` profile (6-day certs whose renewal window is only ~4 days wide), 6h was about 6% of that window: a CA-issued early-renew nudge could be missed by a full quarter-day, and a failed attempt near expiry could burn the whole window. The cooldown is now 1h for `shortlived` and stays at 6h for `tlsserver` (45d) and `classic` (90d → 64d → 45d across the LE roadmap). Let's Encrypt's tlsserver profile transitioned to 45-day issuance on 2026-05-13, which is what unblocked this change.
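
A minimal sketch of the profile-aware cooldown, with the arithmetic behind the change. The `CertProfile` enum and both function names are hypothetical; only the 1h / 6h values and the ~4-day window come from the entry above.

```rust
use std::time::Duration;

/// ACME certificate profiles named in this entry (enum is illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CertProfile {
    Shortlived,
    Tlsserver,
    Classic,
}

/// Cooldown between ARI re-fetches and between retry attempts.
fn renewal_cooldown(profile: CertProfile) -> Duration {
    match profile {
        CertProfile::Shortlived => Duration::from_secs(60 * 60),
        CertProfile::Tlsserver | CertProfile::Classic => Duration::from_secs(6 * 60 * 60),
    }
}

/// Fraction of a renewal window that one cooldown period occupies.
fn cooldown_share_of_window(cooldown: Duration, window: Duration) -> f64 {
    cooldown.as_secs_f64() / window.as_secs_f64()
}

fn main() {
    let window = Duration::from_secs(4 * 24 * 60 * 60); // ~4-day renewal window
    for profile in [CertProfile::Shortlived, CertProfile::Tlsserver, CertProfile::Classic] {
        println!("{:?}: cooldown {:?}", profile, renewal_cooldown(profile));
    }
    let old = Duration::from_secs(6 * 60 * 60);
    println!("old share: {:.4}", cooldown_share_of_window(old, window));
    println!(
        "new share: {:.4}",
        cooldown_share_of_window(renewal_cooldown(CertProfile::Shortlived), window)
    );
}
```

Six hours out of a 96-hour window is 6.25%; one hour is roughly 1%, which leaves far more retry attempts before a short-lived certificate expires.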
- New Prometheus counter: `dockpanel_cert_renewals_total{result}` with `result="success"` and `result="failure"` labels. Tracks auto-healer renewal attempts so operators can graph trend and alert on `rate(...{result="failure"}[1h])`. The counter is process-local (it resets across restarts, which Prometheus `increase()` handles gracefully) and adds zero DB queries per scrape. Exposed at `/metrics` alongside the existing `dockpanel_info`, `dockpanel_site_count`, and `dockpanel_alerts_firing` gauges.
- Webmail "Open" button on the Mail page was unreachable ([#57] third finding by @WiskeyPapa). The Roundcube container is bound to `127.0.0.1:8888` on the host (loopback only, never exposed to the public IP for security), but the frontend Open button generated a `http://<panel-hostname>:8888` URL. That URL has nothing listening on it (and Cloudflare doesn't proxy port 8888 anyway), so the button produced a hang / connection refused.

  Fixed by reverse-proxying Roundcube under `/webmail/` on the existing panel nginx vhost via a drop-in fragment file. The frontend Open URL is now `${origin}/webmail/`: same-origin, inherits the panel's TLS, works on both HTTPS-with-domain and HTTP-on-IP installs. The Roundcube container also gets new env vars (`ROUNDCUBEMAIL_PROXY_WHITELIST=127.0.0.1`, plus `ROUNDCUBEMAIL_TRUSTED_HOSTS` and `ROUNDCUBEMAIL_FORWARDED_PROTO=https` when the panel has a configured `server_name`) so Roundcube accepts the forwarded headers and generates correct URLs behind the proxy.
- `webmail_install` is now idempotent: clicking Install when an existing `dockpanel-roundcube` container is present tears it down before recreating, so env-var additions across releases (like the v2.8.22 proxy/trusted-hosts envs) apply automatically on next Install click. Users who deployed Roundcube on v2.8.20/v2.8.21 just need to click Install again, which now rebuilds in place. The `webmail_remove` endpoint also tears down the panel-vhost reverse-proxy fragment for clean uninstall.
- Panel nginx vhost gains an `include /etc/nginx/conf.d/dockpanel-panel.locations/*.conf;` directive (baked into `scripts/setup.sh`'s vhost template; injected into existing vhosts by `scripts/update.sh` via an awk-based one-time migration, the same shape as the v2.8.3 IPv6-listen migration). Drop-in directory for path-mounted tool reverse-proxies; webmail is the first user, but other tools (phpMyAdmin, Adminer) can use the same mechanism in the future.
- New helper `panel_server_name()` in `panel/agent/src/routes/mail.rs` reads the panel vhost's `server_name` directive to drive Roundcube's `TRUSTED_HOSTS` computation, the same approach `update.sh` uses to detect the panel domain for `BASE_URL` auto-population.
- New helpers `write_webmail_nginx(port)` / `remove_webmail_nginx()` write the `/webmail/` location fragment, validate via `nginx -t`, and reload on success. Failed validation unlinks the fragment so nginx is never left in a broken state.
- Firewall add/remove rule returned "Agent offline" with `ufw: ERROR: '/etc/ufw/user.rules' is not writable` ([#57] follow-up by @WiskeyPapa). The agent runs under `ProtectSystem=strict` with an explicit `ReadWritePaths=` allowlist, and `/etc/ufw` was never in that list. `ufw status` (read-only) worked, but writes to `user.rules` during add/delete were blocked by the sandbox mount. Added `/etc/ufw` and `/var/lib/ufw` to the canonical agent unit's `ReadWritePaths`, plus matching pre-create entries in `scripts/setup.sh` and `scripts/update.sh` so the namespace mount succeeds even on systems where ufw isn't installed yet. Same shape of fix as the v2.8.13 expansion that added `/etc/modsecurity` / `/etc/cloudflared` / `/etc/postfix` to the RWP list.
- Dashboard "Set up backups" onboarding step stayed incomplete after a manual backup ran, and the card linked to Sites instead of the Backups page ([#57] follow-up by @WiskeyPapa). The completion check was `sitesList.some(s => !!s.backup_schedule)`, but `/api/sites` doesn't return a `backup_schedule` field, so the check was always false regardless of how many backups had been created. Added a new `GET /api/backup-setup-status` endpoint (auth-gated, scoped by user) returning `{ has_schedule, has_backup }` derived from real DB counts across `backup_schedules`, `backups`, `database_backups`, and `volume_backups`. Dashboard now fetches the status once and the card flips to complete as soon as any of those exist. Link retargeted from `/sites/<id>` to `/backup-orchestrator` (the global backup view).
- WAF install button stayed on "Install" after a successful install on Ubuntu 24.04 ([#57] follow-up by @WiskeyPapa). Ubuntu Noble's `time_t`-64 ABI transition renamed `libmodsecurity3` → `libmodsecurity3t64` as a virtual-provides (no transitional shim). The agent's `install_status` route was checking `dpkg -l libmodsecurity3` literally, which never matches "ii" on Noble even though the install succeeded, so the frontend kept showing "Install". Detect path in `routes/service_installer.rs::install_status` now accepts either name (OR-clause, matching the existing PHP fallback pattern). Same fix applied to `uninstall_waf`'s apt purge list so uninstall on Noble actually removes the package instead of silently no-op'ing.
- Mail server install failed under the agent's strict sandbox ([#57] follow-up by @WiskeyPapa). `routes/mail.rs::install_mail` was running `apt-get install` via the sandboxed `safe_command(...)` wrapper, so the agent unit's `ProtectSystem=strict` made `/var/lib/dpkg/lock-frontend` read-only inside the namespace. apt printed `Not using locking for read only lock file /var/lib/dpkg/lock-frontend` warnings and then bailed when it tried to `chown` files. Switched the four apt-get call sites in `mail.rs` (install / purge / autoremove / rspamd install) to `safe_command_unsandboxed`, matching the #54-A pattern v2.8.14 applied to vmail `useradd` / `groupadd` (which lived right next to the apt-get call that this commit corrects). `routes/system.rs::disk_cleanup`'s `apt-get clean` got the same treatment so `/var/cache/apt` actually gets cleared.
- Cloudflare Tunnel install wrote a literal `$(lsb_release -cs)` into `/etc/apt/sources.list.d/cloudflared.list` ([#57] follow-up by @WiskeyPapa). The shell pipeline used single quotes around the echo argument, which prevents bash command substitution. Once the broken source landed, every subsequent `apt-get update` on the box failed (`The repository '... $(lsb_release Release' does not have a Release file`), blocking unrelated installs (Redis, WAF, Mail Server). Pre-resolve `VERSION_CODENAME` from `/etc/os-release` in Rust and `printf` the source line with the actual codename, which also drops the `lsb-release` package dependency on minimal Debian images. Defensive: on install failure, delete a half-written source file so it doesn't break the rest of apt.
- `update.sh` now repairs an existing broken cloudflared apt source on upgrade. Operators who already hit the bug get auto-cleanup on `INSTALL_FROM_RELEASE=1 bash update.sh`; no manual `rm` needed. Looks for the literal `$(lsb_release` string in `/etc/apt/sources.list.d/cloudflared.list` and removes the file if found.
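
The codename pre-resolution and the broken-source check from the two Cloudflare Tunnel entries above could look roughly like this. Both function names are hypothetical, and the functions take file contents as strings so the sketch stays free of I/O; the repository URL in `main` is a placeholder, not the real cloudflared source line.

```rust
/// Extract VERSION_CODENAME from the contents of /etc/os-release.
/// Values may be bare (`trixie`) or quoted (`"noble"`).
fn version_codename(os_release: &str) -> Option<String> {
    os_release.lines().find_map(|line| {
        let value = line.trim().strip_prefix("VERSION_CODENAME=")?;
        let value = value.trim_matches('"').trim_matches('\'');
        if value.is_empty() {
            None
        } else {
            Some(value.to_string())
        }
    })
}

/// True when an apt source file still contains the unexpanded command
/// substitution left behind by the single-quoting bug.
fn source_is_broken(source_list: &str) -> bool {
    source_list.contains("$(lsb_release")
}

fn main() {
    let os_release = "PRETTY_NAME=\"Debian GNU/Linux 13 (trixie)\"\nVERSION_CODENAME=trixie\n";
    match version_codename(os_release) {
        // Placeholder URL: the codename is substituted before the line is written.
        Some(codename) => println!("deb https://example.invalid/repo {} main", codename),
        None => eprintln!("VERSION_CODENAME not found; refusing to write a source line"),
    }
    println!("{}", source_is_broken("deb https://example.invalid/repo $(lsb_release -cs) main"));
}
```

Resolving the codename before any file is written means a missing value fails the install early instead of leaving a source file that breaks every later `apt-get update`.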
- Phase 4 W2: Alert runbooks attached to fired alerts. Markdown text per alert type, indexed by `alert_type`. Excerpts (280 char, truncated at sentence boundary) ride along in slack/discord/pagerduty/webhook payloads; full markdown is rendered into email HTML and into the new Alerts page row expansion. Operator-edited runbooks survive panel upgrades by construction (`apply-defaults` uses `ON CONFLICT DO NOTHING` and never overwrites edits). Resolution is DB-row-then-default: fresh installs produce useful payloads from the compile-time const slice without the operator having to seed first.
- 15 default runbooks shipped with the panel (`panel/backend/runbooks/`): 5 critical (offline / service_down / container_crashloop / backup_failure / gpu_temperature), 9 warning (cpu / memory / disk / disk_forecast / ssl_expiry / container_unhealthy / gpu_utilization / gpu_vram / memory_leak), 1 info (container_down). Each follows the same shape: Why this fired → First check → Common causes → Escalation. Authored for paging-grade discipline (info alerts won't wake anyone, critical ones page clearly).
- `Alerts → Runbooks` tab with per-type list, edit modal (split textarea + live markdown preview, severity selector, Restore-default button), and "Seed missing default runbooks" action with insert-or-skip confirmation.
- Inline runbook expansion on each fired alert: click any row in the Alerts list, and the runbook for that alert type is fetched and rendered below the alert detail. Targets the W2 acceptance bar: an admin paged at 3am sees the runbook in the page, not as a "go look at our wiki" link.
- 5 admin-only API endpoints under `/api/alerts/runbooks`: `GET` (list with `is_default` flag), `GET {alert_type}` (single), `PUT {alert_type}` (upsert, 50KB cap, severity validated), `DELETE` (restore default by removing DB row), `POST apply-defaults` (insert-or-skip from const slice, returns `{ inserted, skipped }`).
- `services/notifications.rs::try_fire_alert` now resolves a runbook by `alert_type` and threads `runbook_excerpt` + `runbook_url` through a new `send_notification_with_runbook` helper (the existing `send_notification` is unchanged, so the 14 non-alert callers across auto_healer, uptime, security_hardening, git_deploys, and incidents stay on the original API). Email gets full pulldown-cmark-rendered HTML appended to the body, slack and discord get a link plus excerpt, pagerduty extends `custom_details`, generic webhook adds `runbook_url` + `runbook_excerpt` as top-level keys.
- New backend dep: `pulldown-cmark = "0.10"` (no_std-capable, ~50KB binary impact, fuzz-tested upstream; rendered output wrapped in `catch_unwind` defensively with HTML-escape fallback).
- New frontend deps: `marked@^14` + `dompurify@^3` (~51KB gzipped combined). DOMPurify is non-negotiable defense-in-depth: runbook markdown is admin-authored but stored in DB and editable via API.
- Email template variables now include `{{runbook_excerpt}}` and `{{runbook_url}}` alongside the existing `{{title}}` / `{{message}}` / `{{severity}}` / `{{timestamp}}`. Backwards-compatible: existing custom templates ignore unknown placeholders.
- Migration `20260507000000_alert_runbooks.sql` adds the table: `alert_runbooks(alert_type TEXT PK, runbook_md TEXT, severity_default TEXT CHECK (info|warning|critical), updated_by UUID FK users(id) ON DELETE SET NULL, updated_at TIMESTAMPTZ)`.
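
The "280 char, truncated at sentence boundary" excerpt rule from the W2 entry can be approximated as below. This is a sketch under stated assumptions: `runbook_excerpt` and `EXCERPT_MAX_CHARS` are hypothetical names, a "sentence boundary" is taken to be the last `.`, `!`, or `?` inside the window, and the fallback when no terminator exists (a hard cut at 280 characters) is a guess, not documented behaviour.

```rust
/// Maximum excerpt length in characters (not bytes).
const EXCERPT_MAX_CHARS: usize = 280;

/// Shorten runbook markdown to at most 280 characters, preferring to end on
/// the last sentence terminator inside that window.
fn runbook_excerpt(markdown: &str) -> String {
    let text = markdown.trim();
    if text.chars().count() <= EXCERPT_MAX_CHARS {
        return text.to_string();
    }
    // Byte offset just past the 280th character, so slicing never splits
    // a multi-byte character.
    let cut = text
        .char_indices()
        .nth(EXCERPT_MAX_CHARS)
        .map(|(i, _)| i)
        .unwrap_or(text.len());
    let head = &text[..cut];
    // Terminators are single-byte ASCII, so `..=end` is a valid boundary.
    match head.rfind(|c: char| matches!(c, '.' | '!' | '?')) {
        Some(end) => head[..=end].to_string(),
        None => head.to_string(),
    }
}

fn main() {
    let long = format!("Disk usage crossed the threshold. {}", "x".repeat(300));
    println!("{}", runbook_excerpt(&long));
}
```

A naive terminator search also stops at abbreviations such as "e.g."; a production version would likely need a smarter boundary rule, which is outside what the changelog describes.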
- Agent installers failed with `Could not get lock /var/lib/dpkg/lock-frontend` when another apt was running (#57 follow-up). On fresh Debian 13 boots, `unattended-upgrades` runs in the background and holds the dpkg frontend lock for several minutes. The panel UI's `Install PHP 8.4` (and any other agent-driven apt install/purge: services, updates) failed immediately on contention instead of waiting. Both `setup.sh` (fresh installs) and `update.sh` (existing operators) now drop `/etc/apt/apt.conf.d/99-dockpanel-lock-wait.conf` setting `DPkg::Lock::Timeout "300";`, so every apt invocation on the system (agent and otherwise) now waits up to 5 minutes for the dpkg lock before giving up. No agent code change needed; the config file is read fresh on every apt run. Verified end-to-end on Debian 13 Trixie: `python3 fcntl.lockf` holding the dpkg lock for 15 s → `apt-get install` waits 15 s and succeeds (vs. 0 s fail-fast pre-fix).
- Settings → Services → `Install Redis` (and Node.js, Composer, WAF, Cloudflare Tunnel) returned 404 (#57 follow-up). Latent backend gap since these services were added: the agent has full install/uninstall implementations in `panel/agent/src/routes/service_installer.rs`, but the backend's `routes/mod.rs` only proxied install for php/certbot/ufw/fail2ban/powerdns. Frontend POST to `/api/services/install/redis` (and the other four) hit a non-existent route and returned 404 before reaching the agent. Added the 5 missing install handlers + the 2 missing uninstall handlers (waf, cloudflared) in `panel/backend/src/routes/system.rs` and registered all 7 routes in `routes/mod.rs`. Each handler is a 5-line proxy mirroring the existing pattern.
- PHP install failed on Debian 13 (trixie) (#57). `setup.sh` hardcoded `PHP_VER=8.3` and reached for `add-apt-repository -y ppa:ondrej/php` whenever `apt-cache show php8.3` returned nothing, but trixie ships PHP 8.4 in its default repo, and `ppa:ondrej/php` is an Ubuntu PPA that has no packages built for trixie. Fresh Debian 13 installs hit "PHP 8.3 installation failed" and ended up with no PHP at all. New flow: try the default-repo `php-fpm` metapackage first (covers Debian 13/12 and Ubuntu 24.04 cleanly with whatever PHP version each distro ships), fall back to `deb.sury.org` for older Debian or `ppa:ondrej/php` for Ubuntu when the default repo can't satisfy the install. Same Debian-vs-Ubuntu split applied to the panel-driven PHP installer in `panel/agent/src/routes/php.rs` so Settings → Services → Install PHP works on Debian too.
- `update.sh` self-refresh never fired on the default code path. Mode auto-detection (`INSTALL_FROM_RELEASE=1` when no Rust toolchain / no source) ran after the self-refresh check, so a user running plain `bash /opt/dockpanel/scripts/update.sh` entered with `INSTALL_FROM_RELEASE=0`, failed the self-refresh gate, and then got bumped to `1` by auto-detect, but with the stale local script still executing. Effect: pre-v2.8.16 panels swapped binaries to the latest release just fine, but never picked up script-side fixes (unit-file deploys, nginx config tweaks, install-agent.sh drop into FE_DIST). That's why issue #56 resurfaced after the v2.8.14 fix shipped: operators on v2.8.13 ran update.sh and stayed on v2.8.13's update.sh logic. Fix: move mode detection ahead of the self-refresh block so `INSTALL_FROM_RELEASE` is correct by the time the gate evaluates. Operators on v2.8.13/v2.8.14/v2.8.15 should run `INSTALL_FROM_RELEASE=1 bash /opt/dockpanel/scripts/update.sh` once to trigger self-refresh; from v2.8.16 onward, plain `bash update.sh` works.
- PHP 8.4 not detected as installed. `panel/agent/src/routes/service_installer.rs` enumerated `php8.{1,2,3}-fpm` to determine if PHP was installed/running, so a Debian 13 install (which lands PHP 8.4 from the default repo) was reported as "PHP not installed" in Settings → Services even when it was running fine. Added `php8.4-fpm` to both checks.
- `update.sh` skipped the repo sync in `INSTALL_FROM_RELEASE=1` mode, so v2.8.14's canonical-unit changes never deployed on the standard upgrade path. Found by the v2.8.13 → v2.8.14 VPS upgrade test: binaries upgraded to v2.8.14 successfully, but the systemd unit file on disk was still v2.8.13's content (no `RuntimeDirectory=dockpanel`, no `/var/cache/nginx` in `ReadWritePaths=`). Root cause: line 106 gated `git pull` behind `INSTALL_FROM_RELEASE != 1`, but the code at line 215 deploys the canonical unit from `$AGENT_SRC` regardless of mode. Same family as the v2.8.13 "dev fiction" bug: canonical file in repo, installer reads stale on-disk copy. `git pull --ff-only` also didn't cover installs cloned with `-b v2.8.13` (or any explicit tag; they end up on a detached HEAD with no local `main`). Replaced the conditional with an unconditional `git fetch --depth=1 origin main` + `git reset --hard FETCH_HEAD` so the canonical unit, nginx templates, and install-agent.sh are always at the latest origin/main when update.sh runs. Operators who already upgraded to v2.8.14 via `bash update.sh` should re-run it on v2.8.15 to pick up the unit changes; the self-refresh logic added in v2.7.13 will fetch the fixed update.sh from this release.
- WordPress provisioning failures on every fresh install (#54). Three independent regressions surfaced when the v2.8.12 strict sandbox shipped, each only firing on specific paths so they slipped through the v2.8.13 verification:
  - `Failed to download wp-cli`: `services/wordpress.rs::ensure_cli` ran `safe_command("curl") -o /usr/local/bin/wp`, but `/usr/local/bin` is not in the agent's `ReadWritePaths` under `ProtectSystem=strict`, so the write was blocked silently and bubbled up as a 422 on the WP install endpoint. Switched to `safe_command_unsandboxed("curl", &[])` (the same `systemd-run` escape used for apt/dpkg in v2.8.12) and now surface the curl stderr in the error message instead of just the static "Failed to download wp-cli" string.
  - `mkdir() "/var/cache/nginx/fastcgi/<site>" failed (ENOENT)`: `routes/nginx.rs::put_site` called `create_dir_all` on the per-site FastCGI cache path before rendering the vhost, but `/var/cache/nginx` was not in the agent's `ReadWritePaths`, so the create silently failed (only a `tracing::warn!`). The config was written anyway; `nginx -t` then fired its own mkdir of the cache leaf, found no parent, and rejected the reload. Added `/var/cache/nginx` to `ReadWritePaths=` in the canonical unit, pre-created `/var/cache/nginx/fastcgi` in `setup.sh` and `update.sh`, and promoted the agent-side `create_dir_all` failure from a `warn!` to a 500 with an actionable message ("Ensure /var/cache/nginx is in the agent's ReadWritePaths") so we never again render a config we know nginx can't validate.
  - `tar: unrecognized option '--no-dereference'`: three call sites (`services/backups.rs`, `services/wordpress.rs::create_update_snapshot`, `routes/mail.rs::mailbox_backup`) passed `--no-dereference` to `tar -c`. GNU tar 1.35 (current Trixie/Noble default and the version on this server) does not accept that option in create mode, so every site backup, every WP update snapshot, and every mail backup since the flag was introduced has been failing silently, including on the panel's own demo. GNU tar's create-mode default is already "do not follow symlinks", so the fix is to drop the flag from all three sites.
- `curl … {panel_url}/install-agent.sh | bash` returned the SPA HTML (#56). The multi-server install command surfaced in `routes/servers.rs` pointed users at `{panel_url}/install-agent.sh`, but the panel's nginx config has `try_files $uri $uri/ /index.html;` with no override for that path, so the URI fell through to the SPA's `index.html` and `bash` choked on `<!DOCTYPE html>`. The script also wasn't deployed under any served path. Fixed by having `setup.sh` and `update.sh` copy `scripts/install-agent.sh` into `$FE_ROOT/install-agent.sh` so the existing `try_files $uri` rule serves it directly with the right MIME.
- HTTP-on-IP installs were stuck in a login bounce (#47). The cookie helper in `routes/auth.rs::issue_session` set `Secure` whenever `BASE_URL` was empty (the assumption being that production deployments use HTTPS and an empty default should not regress them). For users running on the bare `http://<ip>:<port>` URL before adding a domain, the browser silently dropped the `Secure` cookie on the plain-HTTP response and `/api/auth/me` then returned 401 on the very next request: login appeared to succeed and immediately bounced back to the login screen. Replaced the BASE_URL-only check with a combined `BASE_URL=https://… || X-Forwarded-Proto: https` check (nginx already sets `X-Forwarded-Proto $scheme`), and threaded the request `HeaderMap` through `issue_session_pub` / `logout` / OAuth `callback` / passkey `auth_complete` so every login path uses the same scheme detection.
- `/run/dockpanel` disappeared mid-upgrade and pinned the agent at StartLimitBurst (v2.8.13 followup, surfaced during the demo upgrade-path test). `update.sh` mkdir's `/var/run/dockpanel` before the `systemctl stop / start` cycle, but between stop and start the directory disappeared on Ubuntu. The agent's namespace mount (which now resolves `/run/dockpanel` as a `ReadWritePaths=` symlink target) failed five times in 60s and the unit refused to start until manual `systemd-tmpfiles --create` plus `systemctl reset-failed`. Added `RuntimeDirectory=dockpanel` and `RuntimeDirectoryPreserve=yes` to the canonical unit so systemd creates and persists the directory itself, which fires before the namespace setup and survives every restart.
- Agent socket occasionally left at 0600 root:root, breaking the panel's "Failed to load system update status" toast. The systemd unit's `ExecStartPost` was the only thing that chown'd the socket to `www-data` and chmod'd it to 0660, and it failed silently in some restart sequences, leaving the panel unable to reach the agent over its UNIX socket. The agent now sets the permissions inline right after `UnixListener::bind` (via libc `getgrnam` / `chown` / `set_permissions`), so the unit's `ExecStartPost` is belt-and-suspenders rather than load-bearing.
- Mail provisioning's `groupadd` / `useradd` for the vmail user failed under strict sandbox. Same family as #54-A: the `safe_command` wrapper runs sandboxed, but the user-management binaries write `/etc/passwd` / `/etc/shadow` / `/etc/group`, which are too sensitive to put in `ReadWritePaths=`. Switched both calls to `safe_command_unsandboxed("groupadd", &[])` / `safe_command_unsandboxed("useradd", &[])`.
- `dockpanel-agent.service` is now deployed from a single source of truth (#48 followup). The in-repo unit file at `panel/agent/dockpanel-agent.service` was historically a hardened reference (`ProtectSystem=strict` + a curated `ReadWritePaths=` list) that no installer ever deployed; `scripts/setup.sh` and `scripts/update.sh` both wrote a permissive `ProtectSystem=no` / `ProtectHome=no` / `PrivateTmp=no` unit inline via heredoc, so every install.sh-based install ran with no namespace hardening at all. v2.8.13 deletes both heredocs and has the install scripts `cp` the canonical unit file from the repo. Existing installs upgrading via `update.sh` get the strict sandbox automatically on the next update; the daemon-reload + agent restart that update.sh already performs at the end of its run picks up the new unit. The remote-agent installer (`scripts/install-agent.sh`) is intentionally left on its own inline heredoc: it deploys a different unit (after `docker.service`, no nginx dep, env-file driven) for the multi-host remote-agent path.
- Hardened the deployed agent sandbox to `ProtectSystem=strict` plus the full `Protect*` / `Restrict*` set (#48 followup). The new `ReadWritePaths=` covers everything the agent actually writes via `std::fs::write` / `tokio::fs::write` / `create_dir_all`: the original eight (`/etc/nginx /etc/dockpanel /var/run/dockpanel /var/backups/dockpanel /var/lib/dockpanel /var/www /var/log /etc/letsencrypt`) plus ten new paths grepped from current agent code (`/etc/apt /etc/fail2ban /etc/systemd/system /etc/powerdns /etc/modsecurity /etc/cloudflared /etc/postfix /etc/dovecot /var/spool/postfix /opt`). v2.8.12's `safe_command_unsandboxed` systemd-run wrapper continues to handle the apt/dpkg/snap subprocess paths that can't be expressed via `ReadWritePaths=`. Net effect: the agent now runs with meaningful kernel-namespace isolation: `ProtectKernel{Logs,Modules,Tunables}=yes`, `ProtectControlGroups=yes`, `ProtectClock=yes`, `ProtectHostname=yes`, `RestrictRealtime=yes`, `RestrictSUIDSGID=yes`, `LockPersonality=yes`, `RestrictNamespaces=~user`, `NoNewPrivileges=yes`, `ProtectHome=yes`, `PrivateTmp=yes`. None of this hardening was ever active for install.sh-installed users; demo had a hand-deployed strict version, which is what surfaced the v2.8.12 EROFS bug.
- The mail subsystem (`panel/agent/src/routes/mail.rs:173-174`) still spawns `useradd` / `groupadd` via the sandboxed `safe_command`, which fails under `ProtectSystem=strict` because `/etc/passwd`, `/etc/shadow`, and `/etc/group` are too sensitive to add to `ReadWritePaths=`. This was already broken under demo's strict sandbox; mail provisioning has been silently failing on that path. v2.8.14 will wrap the user/group creation calls with the v2.8.12 `safe_command_unsandboxed` pattern (systemd-run escape) for a clean fix.
- Service Installers + System → Updates fail silently with `Read-only file system` errors under the agent's `ProtectSystem=strict` sandbox (#48 followup). `dockpanel-agent.service` runs with `ProtectSystem=strict` and a `ReadWritePaths=` list that omits `/var/cache/apt`, `/var/lib/apt`, `/var/lib/dpkg`, and `/usr`, the paths apt and dpkg must write to. Every install / upgrade path that spawned `apt-get`, `snap install`, `dpkg`, or `curl | bash` from the agent inherited the sandbox and EROFS'd the moment it tried to download a `.deb` or install a binary into `/usr/bin`. Surfaced when `insxa` clicked `Install` on Redis / Composer / Node.js / Cloudflare Tunnel / WAF in Settings → Services; every one failed. System → Updates' "Update All" button hit the same wall.
  - Added `safe_command_unsandboxed()` (and a sync sibling) to `panel/agent/src/safe_cmd.rs`. The helper invokes the binary via `systemd-run --quiet --pipe --wait --collect --setenv=... -- <bin>`, which routes through PID1 to spawn a transient unit in PID1's own mount namespace. The inner binary sees the full filesystem read-write while the agent itself stays sandboxed for everything else. Every `--setenv` flag explicitly re-establishes the sanitized env (`PATH` / `HOME` / `LANG` / `LC_ALL` / `DEBIAN_FRONTEND`) so the inner binary doesn't inherit PID1's wider environment.
  - Converted ~25 call sites that legitimately need `/usr` write access to use the new helper: `panel/agent/src/routes/updates.rs` (`apt-get update` and `apt-get install/upgrade`), `service_installer.rs` (every `install_*` and `uninstall_*` shell script + the `rm /usr/local/bin/composer` in `uninstall_composer`), `php.rs` (`add-apt-repository ppa:ondrej/php`, `apt-get install` for PHP base + extensions, `apt-get purge/autoremove`), `server_utils.rs` (`enable_auto_updates`'s `apt-get install unattended-upgrades`), and `services/smtp.rs` (`ensure_msmtp`'s `apt-get install msmtp`).
  - Read-only callers (`apt list --upgradable`, `apt-cache show`, `dpkg -l`, `which <bin>`) keep using the sandboxed `safe_command`; `ProtectSystem=strict` permits reads of `/var/lib/apt/lists` and `/var/lib/dpkg`, so wrapping them with `systemd-run` would just add overhead.
  - Empirically verified: from inside the agent's mount namespace, `touch /var/cache/apt/archives/_test` returns `EROFS`; the same `touch` wrapped in `systemd-run --quiet --pipe --wait --collect --` succeeds because the transient unit gets a fresh mount namespace. Smoke-tested on demo: `GET /system/updates` returned 69 upgradable packages cleanly (was returning empty pre-fix because `apt-get update` EROFS'd before populating the lists).

  WAF + Cloudflare Tunnel installers will partially succeed in v2.8.12 (apt step now works) but still hit `EROFS` on follow-up `std::fs::write` / `create_dir_all` calls into `/etc/modsecurity` and `/etc/cloudflared`. v2.8.13 will close those by either adding the directories to the unit's `ReadWritePaths` or by routing those writes through the same helper.
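The `systemd-run` invocation described above can be pictured as a small argv builder. The sketch below is illustrative only and is not the code in `panel/agent/src/safe_cmd.rs`: the function name and parameter shapes are assumptions, and it only assembles the argument list rather than spawning anything.

```rust
/// Build the argv for running `bin` outside the agent's sandbox via a
/// transient systemd unit.
///
/// Hypothetical helper: the name and signature are made up for this example.
/// The flags themselves (`--quiet --pipe --wait --collect --setenv=`) are the
/// ones named in the changelog entry.
fn unsandboxed_argv(bin: &str, args: &[&str], env: &[(&str, &str)]) -> Vec<String> {
    let mut argv: Vec<String> = ["systemd-run", "--quiet", "--pipe", "--wait", "--collect"]
        .iter()
        .map(|flag| flag.to_string())
        .collect();

    // Re-establish only the sanitized environment, one --setenv per variable,
    // so the inner binary does not inherit a wider environment.
    for (key, value) in env {
        argv.push(format!("--setenv={key}={value}"));
    }

    // `--` ends systemd-run's own options; everything after is the command.
    argv.push("--".to_string());
    argv.push(bin.to_string());
    argv.extend(args.iter().map(|arg| arg.to_string()));
    argv
}
```

The first element would be passed as the program and the rest as its arguments, for example through `std::process::Command`. Read-only callers would skip this wrapper entirely, as the entry notes.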
- Settings → Services tab missing from the tab bar: PowerDNS / Image Scan / SBOM / Prometheus config UIs unreachable from the panel for over a month (#48 followup). Commit `fd44a31` (2026-03-24, "UX: fix overlaps, decompose Settings, create System page") removed the `{ id: "services", label: "Services" }` entry from `Settings.tsx`'s tab list intending to move the contents to the new System page, but the actual content block (`{tab === "services" && (<>...</>)}` at lines 2169-2245) was orphaned in place and never relocated. The DNS page's "configure PowerDNS API in Settings" hint pointed users at a tab that didn't exist. Surfaced when an `insxa` followup on issue #48 asked for a screenshot of where to find the Services tab; there wasn't one. Fix: restored the Services tab button so the existing content block is reachable. (A proper move-to-System-page refactor remains on the list but is a bigger UX restructure than tonight's scope.)
- Dashboard "Restart nginx" / "Restart PHP-FPM" buttons did nothing on click (#48 followup). The frontend was POSTing the wrong request shape to the agent (`{ fix: "restart_nginx" }` and `{ fix: "restart_php" }`), while the agent's `/diagnostics/fix` endpoint deserializes `{ fix_id: "restart-service:<name>" }`. Even after deserializing, the value `restart_nginx` doesn't match any of the supported `apply_fix` actions. Two changes:
  - Frontend (`panel/frontend/src/pages/Dashboard.tsx`) now sends `{ fix_id: "restart-service:nginx" }` and `{ fix_id: "restart-service:php-fpm" }`.
  - Agent (`panel/agent/src/services/diagnostics.rs`) treats `php-fpm` (no version) as a smart alias: it enumerates loaded `php<ver>-fpm.service` (Ubuntu/Debian) or plain `php-fpm.service` units via `systemctl list-units` and restarts every match, so multi-version installs (PHP 8.1 + 8.2 + 8.3) all reload their OPcache after the click. Returns a clear error if no PHP-FPM unit is installed at all.
- Disk-full forecast fired during install on otherwise-idle systems. `services/alert_engine.rs` extrapolated linearly from the most recent 60 metrics_history rows. On a fresh install the first 30-60 minutes show 5-10%/hour disk growth (binary writes, frontend tarball, postgres init, container layers); the extrapolation predicted "disk full in 9 hours" even at 30% usage. Surfaced in the same `insxa` followup on issue #48: alerts fired non-stop on a 40 GB box minutes after install. Forecast now requires (a) at least 6 hours of trend data so the install spike bleeds out, AND (b) current disk usage already over 60% so we're on a runway to a real full disk, not extrapolating from noise on an empty box. Existing thresholds (forecast horizon < 48h, severity cutoff at 12h) preserved.
- Agent's `restart-service` validator rejected systemd unit names containing a dot. `php8.3-fpm`, `containerd.service`, etc. are legitimate unit names but the regex was `[a-z0-9_-]+` only. Surfaced in the same `insxa` followup on issue #48: every PHP-FPM auto-heal attempt was returning "Invalid service name" silently. Also affected the post-restore PHP-FPM reload in `routes/backups.rs:244`. Fix: allow `.` in service names. Dots in systemd unit names cannot be used for path traversal because `systemctl restart <name>` doesn't treat the argument as a path.
- Seven Settings toggles silently rejected by the backend whitelist (#48 followup). `PUT /api/settings` validates incoming keys against a hard-coded allow-list. Several security/registration toggles in the Settings UI wrote keys that were absent from that list: `self_registration_enabled`, `security_approval_required`, `security_geo_alert_enabled`, `security_session_recording`, `security_db_backup_enabled`, `security_canary_enabled`, `security_lockdown_threshold`. Toggling them returned `400 Unknown setting: <key>`, the toast surfaced as "Failed", and the value never persisted. Backend code paths (`routes/auth.rs`, `services/security_hardening.rs`) already read these keys, so the runtime behaviour was tied to whatever value was set out-of-band. Surfaced when an `insxa` followup on issue #48 reported "these two in settings are not opted: Self-Registration, Require Approval for New Users". Same root cause as v2.8.5's ipv6only-strip migration miss: a list that grew implicit coupling to other parts of the codebase that nobody updated when new toggles were added. Frontend-only would have masked the issue with try/catch; the right fix is at the writer-side gate. No agent / cli / frontend code changes; binaries recompiled to carry the v2.8.9 version string.
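The relaxed `restart-service` name check described above (the old `[a-z0-9_-]+` character class, now also admitting `.`) might look like the following when written without a regex. The function name is hypothetical and this is a sketch of the rule, not the agent's actual validator.

```rust
/// Accept only names made of lowercase ASCII letters, digits, `_`, `-`, and `.`.
///
/// Hypothetical stand-in for the agent's validator. Slashes and whitespace
/// stay rejected, so the value cannot smuggle a path or extra arguments.
fn is_valid_service_name(name: &str) -> bool {
    !name.is_empty()
        && name.bytes().all(|b| {
            b.is_ascii_lowercase() || b.is_ascii_digit() || matches!(b, b'_' | b'-' | b'.')
        })
}
```

Under this rule `php8.3-fpm` and `containerd.service` pass, while anything containing `/` or a space is still refused.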
- Password reset link bounced to `/login` instead of rendering the reset form (#48 followup). When an unauthenticated user clicked the reset link from their email, `ServerProvider` (mounted at the top of the SPA tree) fired `api.get("/servers")` on mount → 401 because no session → `api.ts`'s 401 handler redirected to `/login` because its no-redirect allow-list only covered `/login` and `/setup`. Net effect: user lands on the login form, never sees the reset password fields, and the one-time token expires unused. Same hole hit `/forgot-password`, `/register`, and `/verify-email` for any unauth visitor, though those were less obviously broken because users typically reach them already knowing they need to log in. Fix extends the allow-list in `panel/frontend/src/api.ts` to all six top-level public routes (`/login`, `/setup`, `/register`, `/forgot-password`, `/reset-password`, `/verify-email`). Surfaced when an `insxa` followup on issue #48 reported the bounce; the page rendering was fine on the demo when probed, which made it look like an email-client mangling issue at first. It was empirically confirmed by insxa pasting the URL bar after click as `https://your-panel/login`, proving a synchronous redirect from the SPA was firing. No backend changes; binaries recompiled to carry the v2.8.8 version string.
- Branding logo upload (#48
follow-up). Settings → Branding now exposes an "Upload image…" button
next to the existing Logo URL field. The frontend POSTs the file's
raw bytes to a new
`POST /api/branding/logo` endpoint (admin-only, PNG / JPEG / WebP, 2 MB cap, content-type and magic-bytes validated to defend against MIME spoofing). Files are stored content-addressed at `/var/lib/dockpanel/branding/logo-<hash>.<ext>` and served back over `GET /api/branding/logo/{filename}` (public, because the login page is unauthenticated and needs to render the logo) with `Cache-Control: public, max-age=31536000, immutable`. The upload handler auto-saves `logo_url` so the new image takes effect on the next page render. Surfaced when an `insxa` follow-up on issue #48 reported "branding image could not be saved": the existing settings field only accepted a URL, with no file upload UI. Admins on air-gapped panels with no public CDN can now self-host their logo.
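To illustrate the "content-type and magic-bytes validated" part of the upload entry above, here is a minimal sketch of sniffing the three accepted formats and cross-checking the declared type. The type and function names are hypothetical; only the 2 MB cap and the PNG / JPEG / WebP set come from the entry, and the real handler may check more than this.

```rust
/// Image formats accepted for the branding logo.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum LogoKind {
    Png,
    Jpeg,
    Webp,
}

/// Identify the format from the file's leading bytes (hypothetical helper).
fn sniff_logo(bytes: &[u8]) -> Option<LogoKind> {
    // PNG signature: 0x89 "PNG" CR LF 0x1A LF.
    const PNG: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    if bytes.starts_with(PNG) {
        Some(LogoKind::Png)
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        // JPEG files begin with the SOI marker followed by another marker.
        Some(LogoKind::Jpeg)
    } else if bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        // WebP is a RIFF container whose form type is "WEBP".
        Some(LogoKind::Webp)
    } else {
        None
    }
}

/// Accept the upload only if it fits the size cap and the declared
/// content type agrees with what the bytes actually are.
fn accept_upload(content_type: &str, bytes: &[u8]) -> bool {
    const MAX_BYTES: usize = 2 * 1024 * 1024; // 2 MB cap from the changelog entry
    if bytes.len() > MAX_BYTES {
        return false;
    }
    matches!(
        (content_type, sniff_logo(bytes)),
        ("image/png", Some(LogoKind::Png))
            | ("image/jpeg", Some(LogoKind::Jpeg))
            | ("image/webp", Some(LogoKind::Webp))
    )
}
```

Requiring both signals to agree is what blocks a file that merely claims `image/png` in its header while carrying some other payload.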
- `update.sh` defaulted to compile-from-source on production VPS installs that don't have Rust. Surfaced when an `insxa` follow-up on #48 hit `Rust toolchain not found` then OOM'd on `proc-macro2` after they installed rustup. The script already auto-switches to release binaries when the source tree is missing, but production installs do have the source tree (install.sh writes it); the missing signal was whether `cargo` was on `$PATH`. update.sh now also auto-switches to the pre-built release binaries when the Rust toolchain isn't available, so a fresh `bash /opt/dockpanel/scripts/update.sh` on a stock VPS works without the operator having to know about `INSTALL_FROM_RELEASE=1` or install ~4 GB of rustup. Developers who do want to compile from source can set `BUILD_FROM_SOURCE=1` to override. The "Rust toolchain not found" error message is also reworded to recommend dropping `BUILD_FROM_SOURCE=1` over installing rustup, with the RAM-cost callout up front.
- v2.8.4 upgrade path still hit `duplicate listen options for [::]:443` on multi-site installs that ran v2.8.3 first (#48). v2.8.4 reverted the agent templates and panel vhost to plain `listen [::]:80;` / `listen [::]:443 ssl;`, and v2.8.4's update.sh stripped `ipv6only=on` from the panel vhost, but it dropped the v2.8.3 site-vhost migration block, so any site provisioned on v2.8.3 kept `listen [::]:443 ssl ipv6only=on;` on disk. nginx accepts panel-plain + ONE site-with-`ipv6only=on` on a shared `[::]:443` socket, but rejects TWO-or-more site vhosts both setting `ipv6only=on` with `duplicate listen options for [::]:443`. The reload triggered by v2.8.4's update.sh therefore failed silently on any install with 2+ sites: the new panel listen never took effect, the IPv6 hijack from the original #48 persisted, and the next `systemctl restart nginx` would refuse to start. update.sh now strips `ipv6only=on` from every site vhost in `/etc/nginx/sites-enabled/*.conf` (skipping the panel vhost, which is already handled), bringing the listener options back in line so nginx reloads cleanly. No code changes; the fix is pure upgrade-script. Manual one-liner for v2.8.4-stuck users: `for f in /etc/nginx/sites-enabled/*.conf; do [ "$(basename "$f")" = dockpanel-panel.conf ] && continue; sed -i -E 's|^([[:space:]]*)listen \[::\]:(80\|443 ssl) ipv6only=on;|\1listen [::]:\2;|' "$f"; done && nginx -t && nginx -s reload`
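For readers who find the `sed` expression above hard to parse, the per-line rewrite it performs can be sketched as a plain string transformation. This is an illustration of the rewrite rule only; the actual migration lives in the shell script, and the function name here is made up.

```rust
/// Rewrite one vhost line, dropping `ipv6only=on` from an IPv6 listen
/// directive on port 80 or 443 and leaving every other line untouched.
///
/// Hypothetical helper mirroring the one-liner's substitution.
fn strip_ipv6only(line: &str) -> String {
    // Preserve the original indentation, as the sed capture group does.
    let indent_len = line.len() - line.trim_start().len();
    let (indent, rest) = line.split_at(indent_len);

    for listen in ["listen [::]:80", "listen [::]:443 ssl"] {
        let with_flag = format!("{listen} ipv6only=on;");
        if let Some(tail) = rest.strip_prefix(with_flag.as_str()) {
            return format!("{indent}{listen};{tail}");
        }
    }
    line.to_string()
}
```

Applied to every line of each site vhost (skipping the panel vhost), this leaves all IPv6 listens in the plain dual-stack form that v2.8.4 settled on.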
- v2.8.3 nginx `duplicate listen options` regression on multi-site installs. v2.8.3 added `ipv6only=on` to `listen [::]:80` and `listen [::]:443 ssl` in agent templates + the panel vhost to fix the IPv6 hijack from #48. Two vhosts on the same shared socket both declaring `ipv6only=on` caused nginx to emit `duplicate listen options for [::]:80` and refuse the config; this surfaced when a second site was added on a v2.8.3 install. Reverted: agent templates and the panel vhost now use plain `listen [::]:80;` and `listen [::]:443 ssl;` (dual-stack, no `ipv6only=on`). Linux's default dual-stack behaviour means a single shared `[::]` socket handles both IPv6 and IPv4-without-specific-binding, and nginx routes by `server_name` across that shared socket without conflict. The underlying #48 fix still holds: the panel vhost gains a `[::]:` IPv6 listen so site vhosts can no longer be the only IPv6 listener and hijack panel-domain traffic. update.sh now also strips any `ipv6only=on` left on a v2.8.3 panel vhost so the upgrade path doesn't inherit the regression.
- Manual "Let's Encrypt SSL" provisioning failed with `Template render error: Invalid PHP socket path` on PHP sites (#48). Four backend call sites built the agent's `php_socket` field as `/run/php/phpX-fpm.sock`, but the agent's strict validator (`is_safe_php_socket`, `panel/agent/src/services/nginx.rs:149`) requires the `unix:/...` prefix and 500'd the request. The auto-SSL background task at site-creation time was correct (`sites.rs:557`), which is why `Auto-SSL attempt 2` succeeded after the manual click 500'd in between. Fixed in `routes/ssl.rs:118,404`, `services/auto_healer.rs:598`, and `services/security_scanner.rs:271`; all now emit `unix:/run/php/phpX-fpm.sock` like the working site-creation path.
- Visiting the panel URL redirected to a freshly-installed WordPress site after that site's Let's Encrypt SSL was provisioned (#48). Root cause was a dual-stack listen mismatch: agent-rendered site nginx vhosts declared `listen [::]:443 ssl;` (no `ipv6only=on`), but `scripts/setup.sh` bound the panel's vhost to IPv4 only. The first site to provision SSL therefore became the de-facto default for any IPv6 (or non-matched-IPv4) request: WordPress saw a Host that didn't match `home_url` and 301'd to its canonical domain. Fixed by adding `ipv6only=on` to all `[::]:80` and `[::]:443 ssl` listens across `panel/agent/src/templates/nginx/{http,https,proxy}.conf`, pairing every panel IPv4 listen in `setup.sh` with an `ipv6only=on` IPv6 listen, and adding a one-shot migration in `scripts/update.sh` so existing installs gain the IPv6 listen on next upgrade.
- Chain-of-trust report extended to database + volume backups. v2.8.1 shipped site-only because only the `backups` table carried integrity-hash columns. v2.8.2 lands the matching migration (`20260430200000_db_volume_backup_hashes.sql`): `sha256_hash`, `previous_hash`, and `chain_valid` on both `database_backups` and `volume_backups`, applied in a single transaction so a partial apply can't leave one table chained and the other not. The agent now computes SHA-256 on every database dump (mysql / postgres / mongo) and every volume tarball, and the backend persists the hash + previous-hash link on the same INSERT path that lands the new backup row, on both the on-demand routes (`POST /api/backup-orchestrator/db-backup` / `volume-backup`) and the policy-executor scheduled path. The All Backups tab now shows the `Report | JSON | PDF` 3-segment control on every row regardless of kind.

  The chain-report routes were collapsed from kind-specific (`/chain-report/site/{id}[/pdf]`) into one generic shape:
  - `GET /api/backup-orchestrator/chain-report/{kind}/{id}`: JSON.
  - `GET /api/backup-orchestrator/chain-report/{kind}/{id}/pdf`: PDF.

  `{kind}` ∈ `{site, database, volume}`; bogus kinds 400 cleanly. The JSON `backup` object now carries `kind` plus optional kind-specific fields (`database_id`, `container_id`, `volume_name`, `db_type`); the former `site_name` field was renamed to `resource_name` (domain for site, db_name for database, `container:volume` for volume) so the same consumer can render any kind. `build_site_chain_report` → `build_chain_report(kind, id)` with table-name dispatch. The typst template is now a single file that branches on `data.backup.kind` for the resource label and kind-specific extras (db engine, container ID), so the three kinds can't drift apart.
- typst tarball SHA-256 pinning. v2.8.1 trusted GitHub TLS for the v0.13.0 musl tarball (matching the existing grype installer). v2.8.2 pins the per-arch SHA-256 (`x86_64-unknown-linux-musl`: `cd1148da…feb6`, `aarch64-unknown-linux-musl`: `1a1b3841…46e6`), verified at install time before `tar` ever sees the bytes. Operators can override per arch via `DOCKPANEL_TYPST_SHA256_X86_64` / `_AARCH64` env vars (e.g. air-gapped mirror, custom typst version). Mismatch surfaces as a distinct error rather than a generic install failure. Install timeout bumped 90 → 120 s to absorb the second pass over the bytes.
- `tests/chain-report-e2e.sh` extended to all three kinds. The site block became a kind-agnostic `assert_kind` helper; the suite now iterates `site → database → volume` and runs the same shape of assertions per kind (auth gate, kind validation, 404 on bogus id, JSON 200, JSON `backup.kind` + `backup.id` + `backup.resource_name` round-trip + shape, PDF 200 + Content-Type + Content-Disposition + %PDF magic + size > 1 KB). Fixtures are discovered per kind from `backups` / `database_backups` / `volume_backups`; missing fixtures skip rather than fail (so CI hosts that haven't seeded volume backups still green-light). Total suite ~50 assertions.
- Chain-of-trust report for site backups (Phase 4 W1.3). Every site backup is now downloadable as a single forensic artifact bundling its full provenance chain: the backup itself (filename, size, SHA-256, previous-hash link, chain-validity flag), every passive verification run against it (status, checks-passed/total, duration, errors), and every end-to-end restore drill (status, HTTP probe result, body excerpt, duration). Two formats from the same data:
  - `GET /api/backup-orchestrator/chain-report/site/{id}`: JSON.
  - `GET /api/backup-orchestrator/chain-report/site/{id}/pdf`: typst-rendered PDF with DockPanel branding, status pills, and a full chain-integrity summary. Designed to be handed to an auditor as proof a backup was actually verified and restorable.

  All Backups tab on the Backup Orchestrator page now shows a `Report | JSON | PDF` 3-segment control on every site row. The first PDF request lazy-installs the `typst` CLI into `/var/lib/dockpanel/typst/` (~30 MB, one-time, ~30 s on a fresh box); subsequent renders are instant. Compile timeout 30 s; install timeout 90 s; concurrent installs serialised via a process-wide async mutex so a burst of first requests doesn't stampede.

  Site-only for v2.8.1 because only `backups.sha256_hash` / `previous_hash` / `chain_valid` are populated today (added in audit migration `20260324000000`). The db + volume backup tables don't carry hashes yet; extending chain reports across all three kinds is a v2.8.2 follow-up that needs a hash-columns migration plus agent changes to compute SHA-256 during db/volume backup.
- `/api/backup-orchestrator/health` 500 once any backup exists. `SUM(size_bytes)` returns `NUMERIC` in PostgreSQL (since aggregating `BIGINT` can overflow `int8`); the existing query bound it to `Option<i64>` without an explicit cast. Empty backup tables returned `NULL` and decoded fine, but the moment a real backup row landed the endpoint started 500ing with `INT8 not compatible with NUMERIC`. Cast to `::bigint` in three sites in `routes/backup_orchestrator.rs::health` + the rolled-up SUM in `services/backup_policy_executor.rs`. Caught by the v2.8.1 fresh-VPS test once a synthetic backup row was seeded for the chain-report PDF round-trip.
- New `tests/chain-report-e2e.sh` sub-suite: unauthenticated request blocked, bogus uuid → 404, JSON shape, PDF magic bytes / Content-Type / Content-Disposition, file-size sanity. Wired into `full-e2e.sh` alongside the tier2-pin sub-suite. Self-provisions auth (mints admin JWT from `api.env` if `DOCKPANEL_TEST_PASSWORD` is unset). Skips PDF assertion when `CHAIN_REPORT_SKIP_PDF=1` (CI without outbound HTTPS) and reports 503 cleanly when typst install fails so the suite still green-lights on networks that block GitHub releases.
- Restore Confidence SLA card on Backup Orchestrator overview (Phase 4
W1.1). The Overview tab now leads with a single trust signal — "of last
30 backups, X% verified" — sized as a headline number, color-coded by
threshold (rust ≥95%, warn ≥80%, danger below). Adjacent cells show p50
and p95 verify-lag (time from backup creation to verification
completion), oldest unverified backup age, and a per-server breakdown
table when more than one server is registered. Empty state when no
recent backups exist. Backend extends
`GET /api/backup-orchestrator/health` with `sla_window`, `sla_verified`, `sla_failed`, `sla_pending`, `verify_lag_p50_hours`, `verify_lag_p95_hours`, `oldest_unverified_days` (previously declared but never populated), and `per_server_sla[]`. Latest verification per (backup type, backup id) wins, so re-runs supersede stale entries. No schema migration; same endpoint URL.
- End-to-end backup drills for site backups (Phase 4 W1.2 part A). Click `Drill` on any site row in the All Backups tab: the agent extracts the tar to a scratch directory, spins a hardened `nginx:alpine` container (`--network none`, `--read-only`, 128MB / 0.5 CPU caps), HTTP-probes `localhost/` via `docker exec wget`, and tears everything down. Persisted in a new `backup_drills` table; visible in the new Drills tab with status, HTTP code, duration, and error message. SLA card on Overview gains an "End-to-end drills (30d): N passed · M failed" line when drills exist. New endpoints: `POST /api/backup-orchestrator/drill` (admin, async; returns 202 immediately, drill row updates as the agent finishes) and `GET /api/backup-orchestrator/drills` (paginated history). Agent endpoint `POST /backups/drill/site`.
- End-to-end DB drills for postgres + mysql/mariadb (Phase 4 W1.2 part B). Click `Drill` on any database row in the All Backups tab: the agent boots a scratch engine container (`postgres:16-alpine` or `mariadb:11`, `--network none`, 256MB / 1 CPU caps), pipes `zcat` of the dump into a direct-fd `psql` / `mariadb` restore, runs `ANALYZE` (postgres) to populate planner stats, then sums table count and row totals from `pg_class.reltuples` / `information_schema.tables.table_rows`. Drill body records `"N tables, ~M rows restored"`, which is strictly stronger than verify, which only confirms the dump applies. Pass requires tables > 0; row total is reported but doesn't gate (legitimate schema-only dumps pass). Backend `POST /api/backup-orchestrator/drill` now accepts `backup_type = "database"` and dispatches to new agent route `POST /backups/drill/db`. Drills tab "HTTP" column renamed to "Result" and renders the row/table summary for DB drills. Volume drill is W1.2.c.
- End-to-end volume drills (Phase 4 W1.2 part C). Click `Drill` on any volume row in the All Backups tab: the agent creates a scratch Docker volume, runs a hardened `alpine:3.19` restore container (`--network none`, 128MB / 0.5 CPU caps) that extracts the tar into the scratch volume (parity with `restore_volume`'s actual restore path), then runs a second read-only probe container that mounts the volume RO and read-tests up to 20 sample files (`head -c 1` through each, enough to fault filesystem-level corruption without scanning multi-GB volumes). Drill body records `"N files, M bytes restored"`. Pass requires files > 0 AND read-test exit 0, which is strictly stronger than verify, which only extracts to a host /tmp dir. Best-effort cleanup of both containers and the scratch volume on every exit path. Backend `POST /api/backup-orchestrator/drill` now accepts `backup_type = "volume"` and dispatches to new agent route `POST /backups/drill/volume`. The `—` placeholder on volume rows is replaced with a working `Drill` button. W1.2 engine work complete; W1.2.d (per-policy weekly drill scheduler) is the remaining slice.
- Per-policy drill scheduler (Phase 4 W1.2 part D). Backup policies gain a `Drill on schedule` toggle and a separate cron `drill_schedule` (default `0 4 * * 0`, i.e. 04:00 UTC Sunday) so drills run on a different cadence from the backups themselves. New backend service `drill_scheduler` ticks every 60s, finds policies due now, looks up the latest `database_backups` and `volume_backups` row tied to each policy by `policy_id`, and dispatches a real drill against each via the same agent endpoints used by on-demand drills. Records land in the existing `backup_drills` table, so the Drills tab can't tell the difference between scheduled and on-demand drills (same audit trail). Per-server concurrency cap = 1 (skips dispatch if a `pending` / `running` drill exists for the same server). Schema migration adds `drill_enabled BOOLEAN`, `drill_schedule TEXT`, and `last_drill_at TIMESTAMPTZ` to `backup_policies`. Site backups don't carry `policy_id` and are not covered by this scheduler; they stay on the existing 6h `backup_verifier` cadence. UI: new section in the Policy create form with an enabled checkbox + a curated schedule selector (weekly / monthly / every 3 days), and a small `drill <cron>` badge under the Schedule column on policy rows when enabled. Cron validation rejects strings that aren't 5-field whitespace-separated on both `schedule` and `drill_schedule` writes (was previously unchecked on `schedule` too, so a small hardening win). W1.2 (engines + scheduler) is now complete; W1.3 (chain-of-trust PDF/JSON export) ships separately as v2.8.1.
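The cron shape check mentioned at the end of the scheduler entry (reject anything that is not five whitespace-separated fields) is small enough to sketch directly. The function name is hypothetical, and this covers only the field-count rule the entry describes; it does not validate the contents of each field.

```rust
/// Return true when `expr` consists of exactly five whitespace-separated fields.
///
/// Hypothetical helper: checks shape only, not whether each field is a
/// meaningful minute / hour / day / month / weekday value.
fn is_five_field_cron(expr: &str) -> bool {
    expr.split_whitespace().count() == 5
}
```

The default `0 4 * * 0` passes, while shorthand such as `@weekly` or a six-field expression with seconds is rejected under this rule.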
- Backup Orchestrator UX pass. Drills tab now paginates with `Prev` / `Next` (50 per page) instead of silently truncating to the first 100; backend `GET /api/backup-orchestrator/drills` returns `{items, total}` to drive it. Result column is tone-coded for site drills (HTTP 2xx rust, 3xx neutral, 4xx amber, 5xx danger) so failures jump out at a glance. Running drills get a pulsing dot in the status pill and a `N running` counter + manual `Refresh` button above the table. Created column shows relative time with the absolute timestamp on hover. Drill button on DB and volume rows now asks once before spending: a confirm/cancel pair appears with the cost hint (`boots a 256 MB scratch DB engine, ~60s` / `boots a 128 MB scratch container + temp volume, ~60s`); site drills fire directly since they're cheap.
- Image scan + SBOM Settings cards (a25c716). Apps CVE drawer + Settings ImageScan/SBOM cards picked up the same dialog/a11y polish as the rest of the panel: `role="dialog"` + `aria-modal` + Esc to close on the scan drawer, `type="button"` + `aria-label` on every trigger, design-system tokens (no raw Tailwind colors), explicit load-error + Retry on the Settings cards (no more stuck "Loading…"), `Last scan Xh ago · N images on file` derived from `/image-scan/recent` when the scanner is installed, and an explicit `On-demand only — no schedule, no deploy gate` line on the SBOM card so the configuration model is unambiguous.
- rustls-webpki 0.103.12 → 0.103.13 in both `dockpanel-api` and `dockpanel-agent` Cargo locks; fixes `RUSTSEC-2026-0104` (reachable panic in CRL parsing). DockPanel calls into rustls-webpki for ACME cert verification and pinned-fingerprint TLS (Phase 3 #3 Tier 2), so a malformed CRL from a malicious or buggy CA could have crashed the process. Patch release, no API changes.
- postcss 8.5.8 → 8.5.12 in `panel/frontend` and `website/client` package locks; fixes `GHSA-7fh5-64p2-3v2j` (XSS via unescaped `</style>` in the CSS stringify output). Build-time only, with no runtime exposure on shipped panels, but worth keeping current.
- Servers page: last-seen-at + 24h uptime sparkline. Each server card now shows a small `Last seen 14s ago` line under the IP/status subtitle (driven by the existing `last_seen_at` column, refreshed on every agent checkin) and a 144-cell horizontal uptime strip: one cell per 10-minute bucket over the last 24 hours, derived from `metrics_history` row presence. Hover any cell for its time window and online/no-data label. New endpoint `GET /api/servers/{id}/uptime` returns `{ buckets: bool[], window_hours, bucket_minutes }`. Owner-scoped (404 on a server that belongs to a different user); same auth shape as the rest of the `/api/servers/*` surface.
- Pre-built Grafana dashboard (`dashboards/dockpanel-grafana.json`). Drop-in companion to the v2.7.16 Prometheus exporter. Covers fleet stats (version / servers reporting / sites / alerts firing by severity / GPUs reporting), per-server CPU / memory / disk timeseries with sensible thresholds, top-servers bar gauges, sites-by-status donut, a collapsible GPUs row (utilization, VRAM%, temperature, power draw), and an alerts-firing stacked-bars timeseries. Uses a `Datasource` template input so it imports cleanly onto any Prometheus that's already scraping `/api/metrics`. UID `dockpanel-fleet` is stable so runbook deep-links survive re-imports. A `Server` template variable lets operators focus on a single host or any subset. See `docs/guides/prometheus.md` "Pre-built Grafana dashboard" for import instructions. Closes the Phase 3 #1 follow-up that paired with the Prometheus endpoint.
- Tier 2 cert-pin E2E test suite (`tests/tier2-pin-e2e.sh`). Covers every step of the Phase 3 #3 Tier 2 flow end-to-end against the live API: TOFU fingerprint capture on `/api/agent/checkin`, match no-op, MITM 403, malformed-fingerprint 400, admin rotate-cert-pin with and without the `X-Requested-With` CSRF header, `activity_logs` capture of the rotate action, and re-TOFU after rotate. Also includes a dedicated regression guard for the v2.7.18 rustls `CryptoProvider` panic: it inserts a synthetic online server row with `cert_fingerprint` set and a loopback URL with no listener, then calls `POST /api/servers/{id}/test` and asserts status exactly 502 (graceful connect failure); a panic would surface as 500 and be caught. The suite is self-provisioning: it mints an admin JWT locally from `/etc/dockpanel/api.env` when `DOCKPANEL_TEST_PASSWORD` is unset, and cleans up all DB rows it creates via an `EXIT` trap. Wired into `tests/full-e2e.sh` as a sub-suite at the end of the run.
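The 144-cell uptime strip from the Servers page entry above (one cell per 10-minute bucket over 24 hours, lit when a `metrics_history` row exists in that bucket) can be sketched as a simple bucketing pass. The function name and the use of Unix-second timestamps are assumptions for the example; the real endpoint presumably derives the same thing in SQL or from typed timestamps.

```rust
/// Mark which 10-minute buckets in the last 24 hours contain at least one sample.
///
/// Hypothetical helper: `now_secs` and `sample_secs` are Unix timestamps in
/// seconds. Index 0 is the oldest bucket, index 143 the most recent.
fn uptime_buckets(now_secs: u64, sample_secs: &[u64]) -> Vec<bool> {
    const WINDOW_HOURS: u64 = 24;
    const BUCKET_MINUTES: u64 = 10;
    const BUCKET_SECS: u64 = BUCKET_MINUTES * 60;
    const BUCKETS: usize = (WINDOW_HOURS * 60 / BUCKET_MINUTES) as usize; // 144 cells

    let window_start = now_secs.saturating_sub(WINDOW_HOURS * 3600);
    let mut buckets = vec![false; BUCKETS];

    for &ts in sample_secs {
        // Ignore samples outside the half-open window [window_start, now).
        if ts < window_start || ts >= now_secs {
            continue;
        }
        let idx = ((ts - window_start) / BUCKET_SECS) as usize;
        buckets[idx] = true;
    }
    buckets
}
```

A cell only records presence, so several samples in the same 10 minutes light the same cell once, matching the online/no-data labels described in the entry.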
- Remote-agent TLS pinning no longer panics the API process. v2.7.18 shipped the `PinnedFingerprintVerifier` for outbound backend→agent TLS but the backend's `main.rs` never installed a process-level rustls `CryptoProvider`. On the first request that actually exercised the pinned path (i.e. a second server enrolled in the fleet with a captured fingerprint), `rustls::ClientConfig::builder()` panicked on `CryptoProvider::get_default()`. Pure single-host installs were not affected; any multi-server deployment using the pinned verifier was. Fix: call `rustls::crypto::aws_lc_rs::default_provider().install_default()` at `dockpanel-api` startup (the agent already did this at `main.rs:24`). Caught by the v2.7.18 fresh-VPS test before v2.7.18 was declared public-ready. No API changes; the Tier 2 part 2 verification flows (TOFU capture, MITM 403, rotate-pin, re-TOFU, `PinnedFingerprintVerifier` accept/reject) now all succeed end-to-end.
- `RemoteAgentClient` cert-pinning enforcement (Phase 3 #3, Tier 2, part 2). Closes the loop: once an agent's fingerprint has been captured by the backend (Tier 2 part 1), every outbound TLS handshake to that agent goes through a custom `rustls::client::danger::ServerCertVerifier` that only accepts a cert whose DER SHA-256 matches the pinned value. Comparison is constant-time via `subtle`; signature verification delegates to `rustls::crypto::aws_lc_rs`. When `cert_fingerprint` is still NULL for a server (e.g. old agent that doesn't report it), the client falls back to the legacy `AGENT_TLS_VERIFY=insecure` env flag for backwards compatibility. `AgentRegistry::for_server` now reads `cert_fingerprint` from the `servers` row and passes it to `RemoteAgentClient::new_with_pin`. Rotating the pin via `POST /api/servers/{id}/rotate-cert-pin` already invalidates the cached client (shipped in Tier 2 pt1) so the next request rebuilds with the new pin.
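The pin check at the heart of the verifier can be sketched with the standard library alone. This is not the shipped code: the real comparison goes through the `subtle` crate, which also guards against compiler optimizations that a hand-written loop cannot rule out, and the names `pin_matches`, `decode_fingerprint`, and `constant_time_eq` are hypothetical.

```rust
/// Compares two byte slices without returning early on the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // lengths are public: SHA-256 digests are always 32 bytes
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences instead of branching
    }
    diff == 0
}

/// Maps one lowercase hex character to its 4-bit value.
fn nibble(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        _ => None,
    }
}

/// Decodes a 64-char lowercase hex fingerprint into 32 raw bytes.
fn decode_fingerprint(hex: &str) -> Option<[u8; 32]> {
    let bytes = hex.as_bytes();
    if bytes.len() != 64 {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, pair) in bytes.chunks(2).enumerate() {
        out[i] = (nibble(pair[0])? << 4) | nibble(pair[1])?;
    }
    Some(out)
}

/// Returns true only when the presented cert digest equals the pinned value.
fn pin_matches(pinned_hex: &str, presented_digest: &[u8; 32]) -> bool {
    match decode_fingerprint(pinned_hex) {
        Some(pinned) => constant_time_eq(&pinned, presented_digest),
        None => false, // a malformed pin never matches
    }
}
```

Here `presented_digest` stands for the SHA-256 of the peer certificate's DER bytes, which the caller is assumed to have computed already.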
- Agent TLS + cert fingerprint pinning (Phase 3 #3, Tier 2, part 1). The agent's multi-server listener now terminates TLS instead of shipping auth tokens in plaintext, and the central panel captures each agent's cert fingerprint on first checkin for later pinning.
  - Agent loads `/etc/dockpanel/ssl/agent.{crt,key}` at startup (generated at install time by `install-agent.sh`, or generated on first boot via `rcgen` when missing). `AGENT_LISTEN_TCP=0.0.0.0:9443` now binds a TLS listener via `axum-server` + `rustls`; the old plaintext bind and the `AGENT_ALLOW_INSECURE_BIND` escape hatch are removed, since TLS makes the 0.0.0.0 case safe by construction.
  - Agent computes the SHA-256 (hex) fingerprint of its cert at startup, logs it on first boot, and includes it in every phone-home checkin.
  - Migration `20260417000000_agent_cert_fingerprint.sql` adds `servers.cert_fingerprint` (nullable varchar(64) + partial index).
  - Backend `POST /api/agent/checkin` captures the fingerprint on first checkin (Trust On First Use); on subsequent checkins a mismatch is rejected with 403 and logged at ERROR level. Format-validated (64-char lowercase hex) before storage.
  - New admin endpoint `POST /api/servers/{id}/rotate-cert-pin` clears the stored fingerprint so the next checkin re-captures. Use after a legitimate agent cert rotation or reinstall. Invalidates the cached `RemoteAgentClient` and writes an audit log entry.
  - Servers page gains a per-server TLS pin row showing the shortened fingerprint (first 16 / last 16 chars, full hash on hover) and a "Rotate pin" button with an inline confirmation bar.
  - Pt2 (pin-enforcement in `RemoteAgentClient`) ships in the same release; see the first bullet above.
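The Trust On First Use decision made at checkin can be summarized as a small state function. The sketch below is an illustration under assumptions: `CheckinOutcome` and `evaluate_checkin` are hypothetical names, and the status codes in the comments are the ones this changelog mentions (403 on mismatch, 400 on a malformed fingerprint).

```rust
#[derive(Debug, PartialEq)]
enum CheckinOutcome {
    Captured,  // first checkin: store the reported fingerprint
    Match,     // same fingerprint as stored: no-op
    Mismatch,  // different fingerprint: reject (403)
    Malformed, // not 64-char lowercase hex: reject (400)
}

/// Format check: exactly 64 characters, each a lowercase hex digit.
fn is_valid_fingerprint(fp: &str) -> bool {
    fp.len() == 64 && fp.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// `stored` is the value of `servers.cert_fingerprint` (None when NULL).
fn evaluate_checkin(stored: Option<&str>, reported: &str) -> CheckinOutcome {
    if !is_valid_fingerprint(reported) {
        return CheckinOutcome::Malformed;
    }
    match stored {
        None => CheckinOutcome::Captured,
        Some(pinned) if pinned == reported => CheckinOutcome::Match,
        Some(_) => CheckinOutcome::Mismatch,
    }
}
```

Rotating the pin corresponds to resetting `stored` to `None`, so the next valid checkin lands in the `Captured` branch again.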
- Unified fleet-wide backup view (Phase 3 #3, Tier 1). The Backup Orchestrator page gains an All Backups tab that lists site, database, and volume backups from every server in a single paginated table, with optional filters by server and by kind.
  - New admin endpoint `GET /api/backup-orchestrator/all` joins `backups`, `database_backups`, and `volume_backups` via a UNION CTE and resolves `server_id` to a server name (site backups derive their server from `sites.server_id`; database and volume backups carry the column directly). Query params: `limit`, `offset`, `kind` (`site|database|volume`), `server_id`. Returns `{ items, total }`.
  - Per-row badges surface `encrypted` (at-rest encryption enabled) and `remote` (pushed to a backup destination) so fleet admins can spot inconsistencies at a glance.
  - Closes the last missing north-star bullet for "Operate at Scale": agent enrollment and cross-host placement were already shipped (`ServerScope` + `servers` table + `install-agent.sh`); the unified backup view was the remaining gap.
- 2026-ready ACME (Phase 3 #2, Tier 1). DockPanel is now ready for Let's Encrypt's May 13 2026 `tlsserver` → 45-day flip, the existing 6-day `shortlived` profile, and the Feb 2027 / Feb 2028 `classic` reductions.
  - RFC 9773 ARI-driven renewal. The auto-healer now queries the CA's ACME Renewal Information for each cert and honours the suggested renewal window instead of a hard-coded 30-day threshold. Falls back to a profile-aware margin (2d / 15d / 30d) when a CA doesn't advertise ARI. New columns `sites.ssl_renewal_at`, `sites.ssl_renewal_checked_at`.
  - ACME profile selection UI. Settings → ACME Profile lets admins pick the default profile (`classic` / `tlsserver` / `shortlived`) for all new certificates. List auto-populates from the CA's server directory; card hides itself if the CA doesn't advertise the profiles extension. New column `sites.ssl_profile` stores which profile issued each cert.
  - Force-renew migrated off certbot CLI. `/api/ssl/{id}/renew` now issues via `instant_acme` and passes the previous cert as the ARI `replaces` hint, so the CA sees a continuous issuance chain. Legacy certbot-issued certs no longer trigger spurious failures on renew.
  - `/api/ssl/profiles` (admin) lists CA-advertised profiles with descriptions.
  - `/api/ssl/default-profile` (admin) sets or clears the panel-wide default.
  - `/ssl/{domain}/renewal-info` (agent) exposes the raw ARI suggestion per cert.
- Auto-heal SSL copy in Settings replaced stale "3 days" threshold language with accurate ARI + profile-aware explanation.
- DNS-PERSIST-01 (Q2 2026) intentionally deferred — no Let's Encrypt production date yet; will land once instant-acme exposes the draft API.
- Prometheus `/api/metrics` scrape endpoint (Phase 3 #1). Hand-formatted exposition text: no extra crate, respects the lightness axis. Gated by a SHA-256-hashed scrape token (constant-time compare via `subtle`); returns 404 when disabled so an off panel doesn't advertise a scrape surface. Exposes `dockpanel_info`, per-server cpu/memory/disk percents, per-GPU utilization / VRAM / temperature / power, per-status site counts, and alerts firing by severity. New `PrometheusSettings` card in Settings with auto-generated token, reveal-once banner, rotate button, and a copy-ready `prometheus.yml` `scrape_configs` block.
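Hand-formatting the Prometheus text exposition needs little more than string building, as the sketch below suggests. It is a minimal illustration, not the endpoint's code: `render_gauge`, `escape_label`, the `server` label, and the metric name used in the usage note are assumptions.

```rust
/// Escapes a label value per the Prometheus text format (backslash, quote, newline).
fn escape_label(value: &str) -> String {
    value
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
}

/// Renders one gauge family: HELP line, TYPE line, then one sample per server.
fn render_gauge(name: &str, help: &str, samples: &[(&str, f64)]) -> String {
    let mut out = String::new();
    out.push_str(&format!("# HELP {name} {help}\n"));
    out.push_str(&format!("# TYPE {name} gauge\n"));
    for (server, value) in samples {
        // Doubled braces emit literal `{` and `}` around the label set.
        out.push_str(&format!(
            "{name}{{server=\"{}\"}} {value}\n",
            escape_label(server)
        ));
    }
    out
}
```

Calling `render_gauge("dockpanel_server_cpu_percent", "CPU usage percent", &[("web-1", 42.5)])` yields a HELP line, a TYPE line, and the sample line `dockpanel_server_cpu_percent{server="web-1"} 42.5`.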
- GPU history + alerts (Phase 2 #2). Historical GPU charts in System (utilization, VRAM, temperature, power). Alert engine gains GPU-aware rules: VRAM > 90%, temp > 85°C, utilization pinned at 100% for 15 min.
- Ollama model management + vLLM picker + idle-unload (Phase 2 #3).
- CI on Actions Node 24. Upgraded action pins to their Node-24-ready versions, including `sigstore/cosign-installer@v4.1.1` (no floating v4 tag exists). `cargo install cargo-sbom` is now called with `--force` so restoring a cached `~/.cargo/bin/` doesn't break the release workflow.
- `scripts/update.sh` now self-refreshes from the latest release tag. The v2.7.13 fix to the rollback bug only helped operators who manually refreshed their on-disk copy of update.sh, because update.sh wasn't in the binary release tarball and never overwrote itself during an upgrade. v2.7.14 closes the chicken-and-egg: when run with `INSTALL_FROM_RELEASE=1`, update.sh fetches the latest tag's `scripts/update.sh` from raw.githubusercontent.com, replaces its own on-disk copy if it differs, and re-execs. A `SELF_REFRESHED=1` env guard prevents infinite loops.

  Operators currently stuck on v2.7.11 or v2.7.12 (where the broken health check rolls every upgrade back) need to bootstrap once:

  ```bash
  sudo curl -fsSL https://raw.githubusercontent.com/ovexro/dockpanel/main/scripts/update.sh \
    -o /opt/dockpanel/scripts/update.sh
  sudo INSTALL_FROM_RELEASE=1 bash /opt/dockpanel/scripts/update.sh
  ```

  After the first successful upgrade, future runs self-refresh automatically.
- `scripts/update.sh` rolled back every upgrade: the post-deploy health check POSTed to `/api/auth/setup-status`, but that endpoint is GET-only and returned 405 Method Not Allowed on every run, triggering the rollback path even when the new binaries were healthy. Caught by the v2.7.12 fresh-VPS test (the first end-to-end `update.sh` exercise in several releases). Operators on v2.7.11 or v2.7.12 who pulled via `update.sh` would have been silently held back; manual re-pull or reinstall via `install.sh` was unaffected. Fix: switch the check to GET.
- Per-container GPU assignment. Multi-GPU hosts can now pin specific NVIDIA devices to specific containers: pin Ollama to GPU 0, vLLM to GPU 1, Stable Diffusion to GPU 2. The deploy form auto-detects available GPUs (via the existing `/apps/gpu-info`) and shows a multi-select picker on hosts with two or more devices. Single-GPU hosts keep the original simple toggle. Backed by Docker's `DeviceRequest.device_ids`; assignment persists across `update_app()` recreations because Docker preserves the host_config when pulling a new image.
- vLLM template (AI / Machine Learning). High-throughput, memory-efficient LLM inference server with an OpenAI-compatible API. Defaults to `meta-llama/Llama-3.2-1B-Instruct` and accepts an optional `HUGGING_FACE_HUB_TOKEN` for gated models. Fills the most-glaring AI template gap (the inference-engine peer to Ollama).
- `gpu_recommended` flag on app templates. Templates that materially benefit from GPU passthrough (Ollama, LocalAI, vLLM, Stable Diffusion WebUI, Text Generation WebUI, Whisper) now ship a flag that surfaces a small "GPU" badge on the template card and pre-ticks the GPU passthrough toggle on the deploy form. Frontends/orchestrators (Open WebUI, LiteLLM, Flowise, Langflow, Dify) intentionally remain unflagged.
- LocalAI default image switched to GPU variant. `localai/localai:latest-cpu` → `localai/localai:latest-gpu-nvidia-cuda-12`. The previous default silently ignored the GPU passthrough toggle on every deploy. Operators on CPU-only hosts can switch back via the Image field on the deploy form.
- Text Generation WebUI pinned from `:default-nightly` to `:default` so shipped deploys don't drift on rebuild.
- dockpanel.dev/security launched. Public security posture page — audit count, signed-releases / SBOM story, response SLA, all 7 audit rounds with headline fixes, recent advisories, defense-in-depth grid, vulnerability-report CTA. Counter-positions DockPanel against the Coolify/CyberPanel narratives. Linked from main nav (between Compare and Pricing) and footer Product column. SECURITY.md cross-references the page at the top.
- Per-image SBOM generation (syft). Second half of the Phase 1 supply-chain story (after v2.7.10's signed releases). Generate an SPDX 2.3 JSON SBOM for any deployed Docker app's image: the composition companion to image vulnerability scanning. Defaults to off; admins opt in from Settings → Services → SBOM Generation.
  - Install button pulls Anchore's signed syft installer into `/var/lib/dockpanel/scanners/syft` (same self-contained, sandbox-safe pattern as grype; works under `ProtectSystem=strict`).
  - Download SBOM button in each app's scan drawer. Click runs syft against the app's image (10–60 s on first generation), persists the SPDX document, and triggers a browser download of `<app>.spdx.json`.
  - Persistence: the `image_sbom` table holds one row per image, overwritten on regeneration. Stored as JSONB so the API serves the SPDX document directly without re-parsing on the agent.
  - API surface mirrors the `/api/image-scan/...` shape: `/api/sbom/{settings,install,uninstall,generate,image/{ref}}` plus `/api/apps/{name}/sbom` for both POST (generate) and GET (download).
  - Agent image-ref validator rejects shell metacharacters before invoking syft: defence-in-depth against shell-injection via user-supplied refs.
This is the operator-facing half: every container running on the panel now has a one-click supply-chain artifact to satisfy compliance asks (EU CRA Sep 2026) and to feed external tooling like Dependency-Track or Grype-on-SBOM.
- Signed releases via cosign keyless (Sigstore). Every binary and SBOM in the GitHub release is now signed in CI using the release workflow's OIDC identity — no long-lived signing key exists, and every signature is recorded in the public Rekor transparency log. Verification snippet in SECURITY.md.
- Per-binary SPDX 2.3 SBOMs. `cargo-sbom` runs in CI for the agent, API, and CLI crates, emitting `dockpanel-{agent,api,cli}.spdx.json` alongside the binaries (also signed). Local builds via `scripts/release.sh` now generate SBOMs too; signing remains CI-only so the OIDC-bound certificate identity is always traceable to this repository's release workflow.
This is the first half of the Phase 1 supply-chain story — the next release exposes per-deployed-container SBOMs in-panel.
- Per-image vulnerability scanning (grype). First feature in the Phase 1 "Trust by Default" cycle. Scans every Docker app's image for known CVEs and surfaces a severity badge per app row on the Apps page, next to the existing update badge. Click a row to see the full CVE table (CVE ID, severity, package, installed version, fixed version). Defaults to off so existing installs see no behaviour change on upgrade; admins opt in from Settings → Services → Image Vulnerability Scanning.
  - Install button pulls Anchore's signed grype installer into `/var/lib/dockpanel/scanners/` (self-contained: doesn't pollute `/usr/local/bin` and works under the hardened agent sandbox). The vulnerability database primes during install.
  - Scheduled scans rescan every running app's image in the background at a configurable interval (default 24h, range 1–720h).
  - Soft deploy gate refuses new deploys if the template's image has a recent scan exceeding a threshold (`critical` / `high` / `medium`). First encounter of an image triggers a best-effort background scan so the next deploy enforces the gate without blocking the first one.
  - Scan-on-demand from the per-app drawer. Ad-hoc scan of any image via `POST /api/image-scan/scan`.
  - Agent image-ref validator rejects shell metacharacters before invoking grype: defence-in-depth against shell-injection via user-supplied image references.
- `/var/lib/dockpanel` was missing from the hardened agent sandbox's `ReadWritePaths`. Audit 7 introduced `ProtectSystem=strict` on the agent unit file (`panel/agent/dockpanel-agent.service`) but only listed `/etc/nginx`, `/etc/dockpanel`, `/var/run/dockpanel`, `/var/backups/dockpanel`, `/var/www`, `/var/log`, `/etc/letsencrypt`, which meant git builds, terminal recordings, mail backups, Docker app volumes, and the new image scanner would all have silently failed if anyone deployed the hardened unit verbatim. Added `/var/lib/dockpanel` to the path list. (Installer scripts still emit `ProtectSystem=no` units, so fresh installs from `install.sh` / `update.sh` were not affected.)
- tar backups now use `--no-dereference`: full-site backups, WordPress pre-update snapshots, and mailbox archives no longer follow symlinks inside the site root. A symlink pointed at `/etc` would previously have been archived as the target's content.
- Cron command filter explicitly rejects `\n` and `\r`: was implicit before; defense-in-depth against scheduled-job newline injection.
- Web-terminal command blocklist extended: `chroot`, `pivot_root`, `capsh`, `mknod`, `debugfs`, `kexec` added to the pattern list.
- Agent systemd unit hardened: `ProtectKernelTunables`, `ProtectControlGroups`, `ProtectClock`, `ProtectHostname`, `RestrictRealtime`, `RestrictSUIDSGID`, `LockPersonality`, `RestrictNamespaces=~CLONE_NEWUSER`.
- Frontend URL guards: Telemetry's update-release link and the public status page's operator-supplied logo URL now require `http(s)://` schemes, blocking `javascript:` / `data:` URLs routed through backend-controlled config fields.
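Two of the guards in this list are simple enough to sketch. The version below is an illustration in Rust for consistency with the rest of the project; the shipped URL guard lives in the frontend, and both function names are hypothetical.

```rust
/// Accepts only absolute http:// or https:// URLs (scheme match is case-insensitive).
fn is_http_url(url: &str) -> bool {
    let lower = url.trim().to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}

/// Returns true when a cron command contains a line feed or carriage return,
/// which would let one scheduled job smuggle in a second line.
fn cron_command_has_newline(cmd: &str) -> bool {
    cmd.contains('\n') || cmd.contains('\r')
}
```

An allow-list on the scheme is used here rather than a deny-list of `javascript:` and `data:`, so any other unexpected scheme is rejected as well.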
- Security-scan alert pileup eliminated. The weekly security scanner fired a new alert on every run without resolving prior firing alerts, so unacknowledged alerts compounded and the escalation loop re-notified every 2–5 minutes. New scans now auto-resolve prior firing/acknowledged security alerts before firing their own result.
- README / COMPARISON / docs RAM claim updated — previous "~57MB" figure was stale. Fresh Vultr VPS measurement: panel services alone idle at ~19 MB (agent 12 MB + API 7 MB), or ~85 MB including the bundled PostgreSQL. Landing-page RAM bar now shows 19 MB.
- File Manager uploads were silently broken. The wired agent upload handler expected `{path, content_base64}` while the backend (and frontend) sent `{path, filename, content}`. A second handler in `agent/routes/files.rs` had the right shape but was never wired to a router. Fixed the wired handler to accept the real payload (with `content_base64` alias for backwards compatibility) and removed the orphan duplicate.
- Per-site PHP-FPM pool config changes never took effect. Agent called `write_php_pool_config(...)` but never reloaded PHP-FPM afterwards, so custom `php_memory_mb` / `php_max_workers` per site were ignored until a manual restart. Wired `reload_php_fpm` right after the pool write.
- Installer silently fell back to IP-only mode over non-interactive SSH. Piping `install.sh` through an SSH session with no controlling tty made `read < /dev/tty` fail silently and cleared `PANEL_DOMAIN`. Now prints a clear "no tty — set PANEL_DOMAIN to configure" notice and points at the env var.
- `/var/lib/dockpanel/recordings` was never created on fresh install. The terminal-recording API and auto-healer retention sweep both reference it. Added to the installer's `mkdir -p` list.
- Agent dead code: `restart_app_service`, `app_service_status`, `build_labels`, `connect_to_network` (Docker-label routing superseded by file-provider `write_route_config`), `volume_backup::get_backup_path` (duplicate), and `BackupInfo::new`.
- Complete UX polish pass — all remaining 12 pages reviewed and polished
- Mail: success feedback for alias/backup delete, queue error handling, logs loading skeleton
- Security: all raw Tailwind colors replaced with design system tokens (lockdown, audit log, approvals)
- Settings: success feedback for destination delete, API key revoke, lockdown threshold save; SSH key error handling; empty states for SSH keys and IP whitelist
- Monitors: success feedback for create/toggle/delete operations
- IncidentManagement: inline delete confirmations (was direct delete), success feedback, settings tab empty state
- WordPressToolkit: success banner for bulk update and hardening actions
- Telemetry: fix unsafe error casts, fix version display bug (`vundefined`), color consistency
- Login: loading spinner instead of blank page during auth check
- Integrations: loading skeletons for WHMCS and Migrations tabs
- NexusLayout: add missing incident count badge (consistent with other 3 layouts)
- Color consistency: `emerald`/`green`/`red` → `rust`/`danger` design tokens across 5 files
- Zero `any` remaining in entire frontend (37 new TypeScript interfaces, completed in v2.7.5 cycle)
- Updated `rand` 0.9.2 → 0.9.4 (fixes 2 low-severity Dependabot alerts: soundness with custom loggers)
- Systematic UX polish across 20+ frontend pages
- All `confirm()` dialogs (25) replaced with inline confirmation bars across 5 files
- All `prompt()` calls (6) replaced with inline input forms across 5 files
- All `console.error/warn/log` removed from frontend page components
- All `bg-rust-50` light-mode colors replaced with dark-mode-compatible `bg-rust-500/10` (8 files)
- SiteDetail: loading skeletons for traffic stats, PHP extensions, access logs; WAF empty state
- Databases: success feedback for create/delete/PITR toggle; typed SchemaBrowser generics
- File Manager: save success indicator, Ctrl+S keyboard shortcut
- DNS: 16 `any` type casts replaced with 5 proper TypeScript interfaces
- Upgraded `rand` 0.8 → 0.9.3 (fixes 2 Dependabot security alerts)
- Upgraded `vite` 6.4.1 → 6.4.2 (fixes 2 high + 2 medium Dependabot alerts)
- Git hooks: pre-commit (infrastructure leak scan), pre-push (secrets + frontend staleness + version consistency)
- Scripts: `docs-audit.sh`, `release.sh` (x86_64 + ARM64 cross-compile), `deploy-check.sh`
- JWT role staleness: sessions now invalidated immediately on role change (was stale up to 2h)
- Webhook gateway DNS rebinding SSRF: destination URL re-validated at forward time, not just registration
- Agent checkin replay prevention: timestamp validation rejects requests >120s old
- Per-user ACME rate limiting: max 10 SSL certificates per hour per user (HTTP-01 and DNS-01)
- DNS pre-flight check: verify domain resolves to this server's IP before HTTP-01 provisioning
- Request timeout: 300s TimeoutLayer added as defense-in-depth against slow requests
- Agent response streaming limit: uses `http_body_util::Limited` instead of buffering entire response before size check
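The checkin replay guard from this list amounts to a freshness check on the request timestamp. The sketch below is an assumption-laden illustration: the changelog only states that requests more than 120 s old are rejected, so the symmetric treatment of future timestamps and the names `is_fresh` and `unix_now` are hypothetical.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Maximum tolerated distance between the request timestamp and server time.
const MAX_SKEW_SECS: u64 = 120;

/// Current Unix time in seconds (0 if the clock reads before the epoch).
fn unix_now() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}

/// Accepts a request only when its timestamp is within the allowed window.
/// Assumption: timestamps too far in the future are rejected as well.
fn is_fresh(request_ts: u64, now: u64) -> bool {
    now.abs_diff(request_ts) <= MAX_SKEW_SECS
}
```

In a handler this would be called as `is_fresh(payload_timestamp, unix_now())`, with a rejection returned when it is false.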
- Docker container logs now strip ANSI escape sequences instead of returning raw escape codes
- GPU monitoring dashboard — VRAM used/free, temperature, power draw, fan speed, per-process usage with automatic Docker container name resolution. Shown in System Health tab. Gracefully hidden when no GPU detected.
- GPU process table maps PIDs to Docker container names via /proc cgroup inspection
- Certbot installer upgraded from apt (2.9.0) to snap (4.x with ARI support for upcoming 45-day LE certificates). Falls back to pip if snap unavailable.
- OWASP CRS updated from v4.4.0 to v4.25.0 LTS
- Fixed CVE-2026-21876 (CVSS 9.3): OWASP CRS multipart charset validation bypass
- Fixed CVE-2026-33691: OWASP CRS file upload whitespace bypass
- System updates now stream apt output in real-time via NDJSON instead of buffering entire output
- Agent `apply_updates` returns streaming response (newline-delimited JSON) for live terminal experience
- Backend consumes streamed agent response via new `post_long_ndjson()` method, forwarding lines as SSE events
- Added `stream` feature to reqwest for chunked response handling on remote agents
- Version numbers synced across all packages: 2.0.6 → 2.7.0 in agent, API, CLI, and frontend
- API endpoint count updated to 733 (465 backend + 268 agent) across all docs and marketing
- E2E test count updated to 476 (8 test suites) across all docs and marketing
- Docker template count corrected to 151 across 14 categories in docs site (was stale at 54)
- Security audit rounds updated to 6 (was showing 5) in README and SECURITY.md
- SECURITY.md now documents Audit Round 6 (zero-assumptions, 30 fixes, 260+ total)
- FEATURES.md verified metrics updated with precise counts from code
- CONTRIBUTING.md migration count updated (69 → 81)
- COMPARISON.md corrected: RAM 60→57MB, templates 54→151, themes/layouts names fixed
- Docs site getting-started.md RAM corrected (60→57MB)
- Marketing site Landing.tsx updated with all corrected numbers
- Removed 3 orphaned lazy imports in frontend main.tsx (IncidentManagement, SecurityHardening, WebhookGateway — absorbed into consolidated pages)
- 6 parallel agents audited 222 Rust + 506 TypeScript files from scratch
- 33 findings fixed across 24 files (11 HIGH, 22 MEDIUM)
- MySQL password reset: fixed SQL injection via wrong quote escaping
- Deploy script: added `is_safe_shell_command()` validation before agent forwarding
- Laravel migration: replaced shell interpolation with dedicated safe agent endpoint
- Terminal: sanitized uploaded filename before shell echo
- CSRF: added `X-Requested-With` header enforcement on all mutating cookie-auth requests
- Compose YAML: rewrote validator from string matching to parsed AST (serde_yaml_ng)
- Shell command blocklist: added encoding tools, interpreters, network tools
- Cron filter: blocked `xxd`, `openssl enc`, `python3 -c`, process substitution
- Remote agent TLS: default inverted from insecure to strict
- Agent TCP: refuses `0.0.0.0` bind without explicit `AGENT_ALLOW_INSECURE_BIND=true`
- Stripe webhook: constant-time HMAC comparison
- KDF: upgraded from SHA-256 to HKDF with backwards-compatible legacy fallback
- Symlink attack on security remove_file/quarantine_file: canonicalize before prefix check
- Mail forward_to/catch_all: email format + CRLF + pipe injection validation
- SMTP test email: CRLF header injection prevention
- WordPress plugin/theme: slug validation (alphanumeric + hyphens only)
- Dashboard intelligence: scoped queries to authenticated user (cross-user leak)
- Backup paths: traversal validation on agent URL construction
- Migration: container name validation (DockPanel-managed only)
- Stack templates: random passwords generated at selection time
- Unix socket: permissions tightened from 0o660 to 0o600
- Raw `Command::new()`: replaced 3 instances with `safe_command` (env sanitization)
- `is_safe_relative_path`: now rejects backslashes and enforces length limit
- Compose volumes: long-form object syntax now validated (prevents docker.sock bypass)
- 7 browser alert() calls replaced with in-page toast/message UI (SiteDetail, Logs, ResellerUsers, Extensions)
- panic!() on invalid TCP bind (agent) and JWT_SECRET validation (API) replaced with clean exit
- .unwrap() on server await replaced with error logging in agent and API main
- Terminal WebSocket resize handler now wrapped in try-catch
- Dashboard WebSocket cleanup race condition (handlers nulled before close)
- Metrics WebSocket sends explicit Close frame before disconnect
- 3 silent .ok() error discards replaced with tracing::warn logging
- Grafana Docker template default password changed from "admin" to required field
- Cleanup background task now supervised (auto-restarts on panic)
- BackupOrchestrator form typed with PolicyForm interface (replaces `any`)
- Alert type muting UI in Settings notification channels (suppress per-type from Slack/Discord/PagerDuty)
- Database password reset endpoint and UI (agent ALTER USER for PostgreSQL/MySQL/MariaDB)
- Secrets vault rename and description update with inline edit UI
- Mail queue endpoint returns empty result when Postfix not installed (was causing 502 errors every 15s on dashboard)
- Onboarding widget template count updated from 34 to 151
- Real Vultr IP in test script examples replaced with RFC 5737 documentation IP
- Monitoring screenshot scrubbed of test.dockpanel.dev URL
- 17 fresh screenshots from live VPS for all major pages (dashboard, sites, Docker apps, terminal, security, etc.)
- 6 CRITICAL/HIGH findings fixed (command injection ×3, auth bypass, timing attack, systemd injection)
- 6 additional HIGH findings fixed (CDN SSRF, WebAuthn RP ID, IaC scope, SSH key injection, DB backup pattern)
- 15 MEDIUM/LOW findings fixed (CORS, rate limiting, input validation, error handling)
- CodeQL: bookmark URL validation hardened, DNS regex escaping fixed
- Nginx FastCGI cache per site with smart bypass (logged-in users, POST, admin)
- Cloudflare integration: zone settings, cache purge, security controls, SSL mode
- Wildcard SSL via DNS-01 challenge (Cloudflare TXT automation, multi-part TLD support)
- Container auto-update detection (registry digest comparison, update badges, one-click update)
- 50 new Docker app templates (101→151 across 14 categories: AI, Media, Productivity, Communication, etc.)
- Redis object cache per site (isolated DB numbers, WP auto-config via wp-cli)
- WAF: ModSecurity3 + OWASP CRS v4 (per-site detection/prevention mode, event viewer)
- Zero-downtime PHP deploys (Capistrano-style atomic symlink swap, instant rollback)
- WordPress safe updates (pre-update snapshot, post-update health check, auto-rollback)
- Image optimization (server-side WebP/AVIF conversion per site)
- CDN integration (BunnyCDN + Cloudflare CDN, cache purge, bandwidth stats)
- Restic incremental backups (encrypted, deduplicated, snapshot management)
- Docker Compose editor validation (structured errors/warnings/info)
- Auto-optimization recommendations (PHP-FPM workers, nginx workers, disk usage)
- Cloudflare Tunnel (install cloudflared, token-based config, systemd service)
- CSP header management per site (policy editor + common presets)
- Bot protection per site (off/basic/strict modes)
- Passkey/WebAuthn passwordless login (manual p256+ciborium implementation, max 10 per user)
- Per-user container isolation policies (max containers, memory, CPU, network isolation, allowed images)
- Container auto-sleep / scale to zero (configurable idle threshold, auto-healer integration)
- Visual DB schema browser (tables, columns, indexes, foreign key relationships)
- Point-in-time DB recovery (WAL archiving for PostgreSQL, binlog retention for MySQL)
- GPU passthrough for Docker (NVIDIA Container Toolkit detection, --gpus flag)
- WHMCS billing integration (API config, webhook provisioning/suspension/termination)
- App migration between servers (migration records, progress tracking)
- Terraform/Pulumi IaC provider API (scoped tokens, resource listing)
- Horizontal auto-scaling (rule-based CPU thresholds, min/max replicas, cooldown)
- Telemetry & diagnostics: local event collection, opt-in remote sending, PII stripping (19 patterns)
- Update checker: GitHub Releases API polling every 6h, dashboard banner, release notes display
- Agent token desync on fresh install — agent now prefers AGENT_TOKEN env var over file
- WebAuthn RP ID defaulted to "localhost" when BASE_URL unset — now derived from request Origin header
- Sidebar NavLink prefix matching: exact route matching on all layouts
- 5 unbounded SQL queries now have LIMIT 500 (webhook_endpoints, pending_users, servers, backup_policies, git_previews)
- Dependabot: picomatch 4.0.3→4.0.4, path-to-regexp 8.3.0→8.4.0 (website dependencies)
- Dashboard fleet overview crash on fresh install (SQL column mismatch)
- Backup creation failure on GNU tar (`--no-dereference` flag)
- Installer: silent package install failures now warn instead of lying
- Installer: Docker volume cleanup prevents DB password mismatch on retry
- 59 silent .ok() failures in agent replaced with proper error handling
- 51 .ok().flatten() anti-patterns in backend replaced with error propagation
- System updates (apt upgrade) broken by API's ProtectSystem=strict — proxied through agent
- Uninstall routes for all 10 services (PHP, Certbot, UFW, Fail2Ban, PowerDNS, Redis, Node.js, Composer, mail server, PHP versions)
- SSL certificate renewal (certbot force-renewal) and deletion endpoints
- User suspend/unsuspend toggle with session invalidation
- Admin password reset for managed users
- System Health tab shows real data (API status, uptime, CPU/mem/disk)
- Certificates page: renew and delete buttons with confirmation
- Monitor list pagination (limit/offset)
- Backup retention auto-enforcement
- Terminal share token revocation
- 45+ command timeouts in agent (Docker, systemctl, apt, system commands)
- Notifications page link to alert channel configuration
- Research-driven security audit: Studied CVEs from CyberPanel, HestiaCP, CloudPanel, VestaCP, Webmin, cPanel — then audited DockPanel against those attack patterns. 55 findings (12 HIGH, 28 MEDIUM, 15 LOW).
- Command execution safety: Added
safe_command()module —env_clear()on all 341Command::new()calls across 44 files. Prevents LD_PRELOAD/PATH hijacking. - Credential encryption at rest: All stored credentials (DB passwords, SMTP, S3/SFTP, OAuth, TOTP, DKIM) encrypted with AES-256-GCM using dedicated key derivation.
- Shell injection fix: Rewrote database_backup.rs — piped
docker exec+gzipinstead ofbash -cwith interpolated strings. - Tar symlink attacks:
--no-dereferenceon backup creation,--no-same-owneron restore. - Session revocation:
revoke_all_sessionsnow actually works — auth middleware checks cached timestamp. - Deploy log IDOR: Ownership verification on both git_deploys and docker_apps SSE streams.
- Content Security Policy: Added CSP header to frontend nginx config.
- Docker exec denylist: Added 7 escape-relevant commands (unshare, pivot_root, setns, capsh, mknod, debugfs, kexec).
- Compose volume symlinks: `canonicalize()` resolves symlinks before path validation.
- nginx header inheritance: Security headers re-declared in static asset location blocks.
- WebSocket security: Conditional upgrade (prevents h2c smuggling), `access_log off` on token-bearing WS locations.
- S3 temp files: RAII TempFileGuard with random names + 0600 permissions.
- 2FA validation: Explicit HS256 + leeway=0 (was Validation::default()).
- Account enumeration: Registration returns generic response.
- Git history scrubbed: Removed all passwords, IPs, hostnames, sensitive screenshots from history via git-filter-repo.
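
The command-execution hardening above (a cleared environment plus per-command timeouts) can be illustrated with a minimal sketch. It is written in Python for brevity rather than the project's Rust, and `SAFE_ENV` and `safe_run` are hypothetical names, not DockPanel's actual `safe_command()` API.

```python
import subprocess

# Hypothetical allow-list: the child sees only these variables, so an
# inherited LD_PRELOAD or a poisoned PATH never reaches it.
SAFE_ENV = {"PATH": "/usr/sbin:/usr/bin:/sbin:/bin", "LANG": "C"}


def safe_run(argv, timeout=30):
    """Run argv (a list, never a shell string) with a scrubbed environment."""
    return subprocess.run(
        argv,
        env=SAFE_ENV,         # replaces the parent environment instead of extending it
        capture_output=True,
        text=True,
        timeout=timeout,      # raises subprocess.TimeoutExpired if the command hangs
        check=False,
    )
```

Passing an argument list instead of an interpolated shell string is the same idea as the `database_backup.rs` rewrite: no shell ever parses user-controlled text.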
- Domain rename — New `PUT /api/sites/{id}/domain` endpoint to rename a site's domain. Agent handler renames nginx config, site directory, SSL certs, log files, PHP-FPM pools, Fail2Ban jails, redirects, and htpasswd configs. Backend updates monitors, status page components, and logs activity
- Auto-firewall for proxy ports — Sites created with proxy/node/python runtime automatically get a UFW deny rule blocking external access to the allocated proxy port (traffic only allowed through nginx). Rule is auto-removed on site deletion
- Laravel auto-migrations — Site deploys for Laravel sites (`php_preset = "laravel"`) now auto-run `php artisan migrate --force` after successful deploy
- One-time scheduled deploy — New `POST /api/git-deploys/{id}/schedule` endpoint to schedule a deploy at a specific time. New `scheduled_deploy_at` column on `git_deploys`. Deploy scheduler checks for due one-time schedules every 60s and auto-clears after triggering. Cancel with `DELETE /api/git-deploys/{id}/schedule`
- Change Docker app image — New `PUT /api/apps/{container_id}/image` endpoint to change a running container's image tag. Pulls new image, stops old container, creates new one preserving volumes, rolls back on failure
- Update Docker app resource limits — New `PUT /api/apps/{container_id}/limits` endpoint to update CPU/memory limits on running containers via `docker update`. Accepts `memory_mb` and `cpu_percent`
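
As a rough illustration of the resource-limits endpoint, the sketch below turns the two request fields into a `docker update` argument list. It is Python rather than the project's Rust, the function name is hypothetical, and the mapping of `cpu_percent` to `--cpus` (100 percent meaning one full core) is an assumption, not something the changelog states.

```python
def docker_update_args(container_id, memory_mb=None, cpu_percent=None):
    """Build an argv list for `docker update` from the request fields."""
    args = ["docker", "update"]
    if memory_mb is not None:
        if memory_mb <= 0:
            raise ValueError("memory_mb must be positive")
        args += ["--memory", f"{memory_mb}m"]
    if cpu_percent is not None:
        if cpu_percent <= 0:
            raise ValueError("cpu_percent must be positive")
        # Assumption: 100 percent corresponds to one full CPU core.
        args += ["--cpus", f"{cpu_percent / 100:.2f}"]
    if len(args) == 2:
        raise ValueError("nothing to update")
    args.append(container_id)
    return args
```

For example, `docker_update_args("abc123", memory_mb=512, cpu_percent=50)` yields the equivalent of `docker update --memory 512m --cpus 0.50 abc123`.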
- Auto-SSL DB update — Background SSL provisioning now updates `ssl_enabled`, `ssl_cert_path`, `ssl_key_path`, `ssl_expiry` in the database and activates paused monitors (was silently succeeding without DB update)
- Auto-SSL config preservation — SSL provisioning now passes `php_preset` and `root_path` to the agent, preventing custom nginx config from being wiped
- Pre-deploy backup — All deploy paths (site deploy, git deploy manual, git deploy webhook/scheduled) now create a site backup before deploying
- Pre-delete backup — Site deletion creates a final backup before CASCADE-deleting the site record
- Site deletion cleanup — Now removes orphaned `status_page_components` matching the deleted domain
- Database restore — New `POST /db-backups/{db_name}/restore/{filename}` agent endpoint + `POST /api/backup-orchestrator/db-backups/{id}/restore` API endpoint. Supports MySQL/MariaDB, PostgreSQL, and MongoDB restore from backup files
- Dashboard health score — Now factors in backup freshness (-5 per stale site), security scan findings (-10 critical, -3 warning), and open incidents (-10 each)
- Smart recommendations — Dashboard intelligence endpoint returns actionable recommendations: stale backups, security findings, open incidents, expiring SSL, firing alerts, diagnostic issues. Rendered as a new Recommendations panel on the dashboard
- Alert escalation — Unacknowledged firing alerts re-notify with `[ESCALATED]` prefix after 15 minutes, then every 30 minutes. New `escalated_at` column + migration
- Alert-to-incident correlation — Before creating a new incident from an alert, checks for existing active incidents within 5 minutes. Appends as incident update instead of creating duplicates
- Auto-healer restart limit — Tracks restart count per service over 30-minute window. After 3 failed restarts, stops healing, creates critical incident, sends notification, and marks state as `exhausted`
- Disk-full forecast alerting — Computes disk fill rate from metrics history; alerts when disk projected full within 48h (critical if <12h)
- Memory leak trend detection — Compares recent vs older memory averages; warns when sustained >10% increase with usage above 60%
- Docker container crash detection — New `check_container_health` in alert engine detects exited, crash-looping, and unhealthy containers
- Docker container auto-restart — Auto-healer restarts exited/dead Docker containers with same 3-attempt limit as system services
- Incidents pause deploys — All 5 deploy paths (manual site, webhook site, manual git, webhook git, scheduled git) check for active critical/major incidents before proceeding
- Security scanner auto-fix — Auto-renews expiring SSL certificates detected by security scans (safe findings only, never auto-deletes)
- Fail2Ban auto-configuration — New sites auto-get a Fail2Ban jail monitoring their access log; removed on site deletion
- Session management — New `user_sessions` table, `GET /api/auth/sessions` (list with is_current flag), `DELETE /api/auth/sessions/{id}` (revoke), auto-cleanup of expired sessions
- Notification center — Bell icon with unread badge in all 4 layouts. New `panel_notifications` table, 4 API endpoints (list, unread-count, mark-read, mark-all-read), `/notifications` page with severity colors. Alerts auto-insert into notification center. 30-day retention cleanup. SSE real-time delivery. Wired into 18 event sources (deploys, incidents, backups, security, SSL, auto-healer, sites, auth)
- Clone site auto-provisioning — Clone now triggers auto-backup schedule, secrets vault, status page component, and site.created event
- Composite site health — New `GET /api/sites/{id}/health-summary` combining SSL, backup freshness, uptime, and composite score
- "Backup Everything" preset — New `POST /api/backup-orchestrator/policies/protect-all` one-click policy
- Backup creation retry — Policy executor retries failed backups once with 5s delay
- Backup freshness alerting — Proactive notification when sites have no backup in 48+ hours (throttled to once/hour)
- Volume restore endpoint — New `POST /api/backup-orchestrator/volume-backups/{id}/restore`
- Deploy lock — Concurrent deploys to same site blocked (checks for active building/deploying status)
- Response time alerting — Monitors warn when response time exceeds 5000ms threshold
- Failed cron detection — Manual cron execution fires alert on non-zero exit code
- Postmortem auto-populate — Transitioning to postmortem status auto-generates timeline template
- /tmp cleanup + Docker prune — Auto-healer now cleans /tmp (7d) and runs Docker system prune on disk pressure
- Oversized log rotation — Truncates individual log files larger than 500MB during cleanup
- Welcome email — New users receive welcome email with panel URL and credentials prompt
- Audit log IPs — Security-sensitive actions (site create/delete, user create/delete, security fix) now log client IP
- Auto-rollback on deploy failure — Failed site deploys auto-restore from pre-deploy backup
- Generic webhook notifications — New `notify_webhook_url` in alert rules for custom integrations (Telegram, Teams, etc.)
- Weekly digest email — Monday morning summary with 7-day alert/backup/incident/deploy counts to all admins
- Post-deploy cache invalidation — Nginx cache purge after successful deploy (fastcgi + proxy cache)
- Reseller branding — `GET /api/branding` now returns per-reseller logo/colors/name when applicable
- Unified event timeline — New `GET /api/dashboard/timeline` merging deploys, backups, incidents, alerts, scans
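
The dashboard health score deductions listed above can be written out as simple arithmetic. This is a sketch in Python, not the project's Rust code. The function name, the starting value of 100, and the floor at 0 are assumptions, and the real score also includes other factors that this entry does not enumerate.

```python
def health_score(stale_sites, critical_findings, warning_findings,
                 open_incidents, base=100):
    """Apply the documented deductions to a starting score (assumed to be 100)."""
    penalty = (
        5 * stale_sites            # -5 per site with a stale backup
        + 10 * critical_findings   # -10 per critical security finding
        + 3 * warning_findings     # -3 per warning-level finding
        + 10 * open_incidents      # -10 per open incident
    )
    return max(0, base - penalty)  # assumed floor: the score never goes negative
```

With two stale sites, one critical finding, three warnings, and one open incident, the deduction is 10 + 10 + 9 + 10 = 39, leaving a score of 61 under these assumptions.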
- Clean-Dark rounding parity — Added ~120 lines of structural overrides (cards, modals, tables, buttons, scrollbar, selection, focus rings, progress bars, code blocks) so Clean-Dark has round corners everywhere, matching Clean
- Ember radius normalized — `--radius-xl` and `--radius-2xl` were 2px smaller than all other themes; fixed to 16px/20px
- Clean hardcoded border-radius → CSS variables — All 11 instances of hardcoded `12px`/`8px`/`6px`/`4px` converted to `var(--radius-lg/md/sm/xs)` for theme consistency
- Status dot glow per-theme — Green glow was hardcoded for all themes; now uses theme-appropriate accent color (blue for Midnight/Clean-Dark, orange for Ember, teal for Arctic, blue for Clean)
- Progress bar glow for Arctic & Clean — Missing glow rules added for both light themes
- Settings theme picker missing `data-color-scheme` — Switching to light themes now correctly sets color scheme attribute
- Default theme mismatch — Settings.tsx fallback aligned to `midnight` (was `terminal`)
- FOUC prevention — Added inline script in index.html to apply theme before CSS loads
- LayoutSwitcher light variant — Replaced hardcoded `zinc`/`blue`/`white` colors with theme variables
- 2FA banner in all layouts — Replaced `amber-*` (stock Tailwind) with `warn-*` (theme tokens)
- NexusLayout logout hover — `rose-400` replaced with `danger-400` theme token
- PublicStatusPage full theme adoption — 40+ hardcoded color references replaced with theme variables
- Terminal.tsx — `bg-gray-300` and `bg-red-500` replaced with theme tokens
- Login.tsx — Google OAuth button uses theme-mapped text/hover colors
- Settings.tsx hardcoded colors — 13 instances of `blue-500`/`red-500` replaced with `accent`/`danger` tokens
- Dashboard stat grid square corners — Added `rounded-lg overflow-hidden` to stat bar and system info grids; added explicit `rounded-lg` to metric cards, sparkline cards, onboarding section, and issues panels
- Compact layout flat nav — GlassLayout now respects `dp-flat-nav` setting (was only implemented in Sidebar layout)
- Compact layout footer spacing — Removed nested padding wrapper, aligned `px-3` to match Sidebar layout spacing
- Layout switcher dropdown redesign — Added `p-1` padding and `rounded-md` items to match panel dropdown style; compact mode hides label text to save space; removed bordered button style for cleaner ghost-button look
- GAP 7+21: Internal events bridge to webhook gateway — `fire_event()` now also forwards events to webhook gateway routes with `filter_path=/event` and `filter_value={event_type}`. Users can subscribe gateway routes to any internal event.
- GAP 12: Docker apps auto-get monitor + status component — Docker apps deployed with a domain now auto-create an HTTP monitor and a status page component under "Docker Apps" group.
- GAP 13: Git deploy auto-creates gateway endpoint — New git deploys auto-create a webhook gateway endpoint for webhook inspection/replay capabilities.
- GAP 16: Incident resolve cleans up alerts + components — Resolving a managed incident auto-resolves linked alerts and clears status_override on affected status page components.
- GAP 17: Vault export/import — New `GET /api/secrets/vaults/{id}/export` and `POST /api/secrets/vaults/{id}/import` endpoints for encrypted vault backup and transfer between DockPanel instances.
All 21 identified gaps now addressed. Zero manual steps required for: backup scheduling, uptime monitoring, secret injection, incident creation, status page updates, or webhook delivery.
- GAP 1: Backup policies now execute — New `backup_policy_executor` background service runs every 60s, evaluates cron schedules, executes backup policies across sites, databases, and volumes. Policies are no longer dead config.
- GAP 2: Verifier respects policy_id — Backup verifier checks `verify_after_backup` flag. Policy executor triggers verification after successful backups.
- GAP 3: Auto-incidents from monitoring — When a monitor goes down, the system auto-creates a managed incident with timeline, links affected status page components, and auto-resolves when the monitor recovers.
- GAP 4: Auto status page components — New sites automatically get a status page component (if status page is enabled).
- GAP 5: Auto-inject secrets on deploy — After a successful deploy, the system checks for a linked vault with `auto_inject` secrets and injects them into the site's `.env` file automatically.
- GAP 6: Auto-vault for new sites — Every new site gets an auto-created secrets vault linked via `site_id`.
- GAP 8: fire_event in all new features — Backup orchestrator, incident management, and secrets manager now emit extension webhook events (`db_backup.created`, `incident.created`, `secrets.injected`, etc.).
- GAP 9: Critical alerts create incidents — Critical alerts and server offline/service down alerts auto-create managed incidents visible on the status page.
- GAP 10: Backup failure creates incident — When a backup policy has failures, a managed incident is auto-created.
- GAP 14: Backup for ALL sites — Removed the `site_count <= 1` gate. Every new site now gets a daily backup schedule automatically.
- GAP 15: Auto-monitor with deferred activation — New sites get a paused HTTP monitor that auto-activates after successful SSL provisioning (when DNS is confirmed working).
- GAP 18: Webhook delivery cleanup — Added 7-day retention cleanup for `webhook_deliveries` and 90-day for `backup_verifications` in the auto-healer retention cycle.
- GAP 19: Subscribers notified of auto-downtime — Status page subscribers now receive email notifications when monitors detect downtime, not just for manually-created incidents.
- GAP 20: Policy encrypt flag works — The backup policy executor passes the encrypt flag through to agent backup endpoints when `encrypt = TRUE`.
- New background service: `backup_policy_executor` (supervised, 60s interval) — 11th background service
- Modified: `uptime.rs` (auto-incidents + subscriber notifications), `alert_engine.rs` (critical→incident), `sites.rs` (auto-vault, auto-monitor, auto-component, backup for all), `ssl.rs` (activate monitors), `deploy.rs` (auto-inject secrets), `auto_healer.rs` (retention cleanup), `backup_orchestrator.rs` + `incidents.rs` + `secrets.rs` (fire_event calls)
- Webhook Gateway: Receive, inspect, route, and replay incoming webhooks.
- Inbound endpoints: Each gets a unique URL (`/api/webhooks/gateway/{token}`). Unlimited endpoints per user.
- Signature verification: HMAC-SHA256 and HMAC-SHA1 modes for GitHub, Stripe, and other providers. Configurable header name and secret.
- Request inspector: Full request logging — headers, body, source IP, signature validation status. Click any delivery to view complete details.
- Route builder: Forward incoming webhooks to any destination URL. JSON path filtering (e.g., only forward `action=push`). Custom header injection. Configurable retry (0-10 attempts with exponential backoff).
- Replay: Re-send any past delivery to all configured routes. Useful for debugging or recovery.
- Delivery tracking: Per-route forwarding status, response body, duration. Endpoint-level counters.
- E2E test suite: `tests/webhook-gateway-e2e.sh` — endpoint CRUD, webhook receive, delivery inspection, routes, replay, filtering.
- New crate dependency: `sha1 0.10` for HMAC-SHA1 signature verification.
- New migration: `webhook_endpoints`, `webhook_deliveries`, `webhook_routes` tables.
- 8 new API endpoints (7 admin, 1 public inbound).
- Frontend: `WebhookGateway.tsx` with 3 tabs (Endpoints, Request Inspector, Routes).
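
The signature verification modes above follow the usual HMAC pattern: recompute the digest over the raw request body and compare it to the header value. The sketch below uses Python's standard library rather than the project's Rust crates; `verify_signature` is a hypothetical name, and the `algo=hex` prefix handling reflects GitHub-style headers only. Providers with a different header layout, such as Stripe's timestamped format, would need their own parsing.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, header_value: str,
                     algo: str = "sha256") -> bool:
    """Check an HMAC signature header against the raw request body."""
    if algo not in ("sha256", "sha1"):
        raise ValueError("unsupported algorithm")
    expected = hmac.new(secret, body, getattr(hashlib, algo)).hexdigest()
    # GitHub-style headers look like "sha256=<hex>"; keep only the hex part.
    received = header_value.split("=", 1)[-1].strip().lower()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The comparison must run over the exact bytes received, before any JSON parsing or re-serialization, or a valid signature will fail to match.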
- Secrets Manager: AES-256-GCM encrypted secret storage with version history.
- Secret vaults: Project-scoped vaults for organizing secrets (global or per-site).
- Encrypted storage: All secret values encrypted with AES-256-GCM (random nonce per secret, key derived from JWT_SECRET via SHA-256).
- Secret types: Environment variables, API keys, passwords, certificates, custom — with type-specific UI badges.
- Version history: Every update creates a versioned snapshot. Full audit trail with who changed what and when.
- Auto-inject: Mark secrets for automatic injection into site `.env` files on deploy. One-click inject from vault to site.
- Masked by default: API returns masked values (`xxxx••••••••`) unless `?reveal=true` is explicitly requested.
- Pull endpoint: `GET /api/secrets/vaults/{id}/pull` returns all secrets as decrypted key-value pairs (for CLI integration).
- Vault sidebar UI: Split-pane layout with vault list on left, secrets table on right. Create/edit/delete with inline forms.
- E2E test suite: `tests/secrets-manager-e2e.sh` — vault CRUD, secret CRUD, encryption roundtrip, version history, pull.
- New crate dependencies: `aes-gcm 0.10`, `base64 0.22` for AES-256-GCM encryption.
- New service: `secrets_crypto.rs` — encrypt/decrypt with nonce+ciphertext format, unit tests included.
- New migration: `secret_vaults`, `secrets`, `secret_versions` tables.
- 8 new API endpoints under `/api/secrets/`.
- Frontend: `SecretsManager.tsx` with vault browser, reveal toggle, version history panel.
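
Two details of the Secrets Manager are easy to show in isolation: deriving a 32-byte key from `JWT_SECRET` via SHA-256, and masking values in API responses. The Python sketch below covers only those two steps, with no AES-GCM encryption. The function names are hypothetical, and the reading of `xxxx••••••••` as "first four characters followed by eight bullets" is an assumption.

```python
import hashlib

BULLET = "\u2022"


def derive_key(jwt_secret: str) -> bytes:
    """SHA-256 of the secret gives 32 bytes, the key size AES-256 requires."""
    return hashlib.sha256(jwt_secret.encode("utf-8")).digest()


def mask(value: str, visible: int = 4, bullets: int = 8) -> str:
    """Assumed format: a short visible prefix, then a fixed run of bullets."""
    if len(value) <= visible:
        # Never reveal a short secret in full.
        return BULLET * bullets
    # A fixed bullet count avoids leaking the secret's length.
    return value[:visible] + BULLET * bullets
```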
- Incident Management: Full incident lifecycle with real-time status updates.
- Managed incidents: Create, track, and resolve incidents with status lifecycle (investigating → identified → monitoring → resolved → postmortem).
- Incident severity: Minor, major, critical, and maintenance classifications.
- Incident timeline: Post updates with status changes and messages. Full audit trail with author emails and timestamps.
- Postmortem support: Attach post-incident analysis with publish control.
- Affected components: Link incidents to status page components for targeted impact reporting.
- Enhanced Status Page: Production-grade public status page replacing the basic monitor list.
- Status page configuration: Customizable title, description, logo URL, accent color, history display settings.
- Component groups: Organize monitors into logical service components (e.g., "API Server", "Website") with grouping.
- Overall status indicator: Automatically computed from component health (operational/degraded/major outage).
- Incident history: Shows active incidents with full timeline, plus resolved incidents within configurable history window.
- Auto-detected downtime: Legacy monitor-based incidents also displayed for complete visibility.
- Email subscribers: Public subscribe/unsubscribe for incident notifications. Verified subscribers receive updates on status changes.
- Standalone public page: Dark-themed, no-auth status page at `/status` with responsive layout.
- Admin UI: New "Incidents" page in Operations nav with 3 tabs (Incidents, Components, Settings).
- 11 new API endpoints: Incidents CRUD + updates, status page config, components CRUD, subscribers, enhanced public endpoint.
- E2E test suite: `tests/incident-management-e2e.sh` covering full incident lifecycle, components, public page, subscribers.
- New migration: `status_page_config`, `status_page_components`, `status_page_component_monitors`, `managed_incidents`, `managed_incident_components`, `incident_updates`, `status_page_subscribers` tables.
- Frontend: `IncidentManagement.tsx` (admin), `PublicStatusPage.tsx` (public standalone).
- Backup Orchestrator: New centralized backup management system for databases, Docker volumes, and sites.
- Database backups: MySQL/MariaDB (`mysqldump`), PostgreSQL (`pg_dump`), and MongoDB (`mongodump`) dump + restore via Docker exec. Compressed with gzip.
- Docker volume backups: Back up any Docker volume to `.tar.gz` using a temporary Alpine container. Restore volumes with one click.
- Encryption at rest: Optional AES-256-CBC encryption (PBKDF2, 100k iterations) for all backup types via OpenSSL. Encrypted files get `.enc` suffix, originals are auto-deleted.
- Automatic restore verification: Verify backups by spinning up temporary database containers and restoring dumps, or extracting archives to temp directories. Checks file integrity, table counts, and entry points.
- Backup policies: Cross-resource policies with cron scheduling, destination selection, retention count, encryption toggle, and auto-verification.
- Backup health dashboard: Global overview with total counts, storage usage, 24h success/failure rates, active policies, verification stats, and stale backup warnings.
- Background verifier: Supervised service running every 6 hours that automatically verifies unverified backups and fires alerts on failures.
- B2 and GCS destinations: Backblaze B2 and Google Cloud Storage now supported as backup destinations (S3-compatible API).
- CLI commands: `dockpanel backup db-create`, `db-list`, `vol-create`, `vol-list`, `verify`, `health` — full backup management from the command line.
- E2E test suite: Dedicated backup orchestrator test script (`tests/backup-orchestrator-e2e.sh`) covering health, policies CRUD, database backup lifecycle with verification.
- Nav item: "Backups" in Operations section links to the new Backup Orchestrator page.
- New migration: `backup_policies`, `database_backups`, `volume_backups`, `backup_verifications` tables.
- Extended `backup_destinations` with `encryption_enabled`, `encryption_key` columns, and B2/GCS dtype support.
- Agent: 4 new services (`database_backup`, `volume_backup`, `encryption`, `backup_verify`) + 3 new route modules.
- Backend: `backup_orchestrator` routes (11 endpoints), `backup_verifier` supervised background service.
- Frontend: `BackupOrchestrator.tsx` page with 5 tabs (Overview, Policies, DB Backups, Volume Backups, Verifications).
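
For readers curious what "AES-256-CBC (PBKDF2, 100k iterations)" implies for key material, the sketch below derives a key and IV from a passphrase with Python's standard library. It assumes HMAC-SHA256 as the PBKDF2 digest and that the key and IV come from one 48-byte derivation, which is how recent OpenSSL `enc -pbkdf2` behaves by default. The changelog does not state the exact OpenSSL invocation, so treat this as illustrative only.

```python
import hashlib

ITERATIONS = 100_000  # the "100k iterations" stated in the entry above


def derive_key_iv(passphrase: str, salt: bytes):
    """Derive a 32-byte AES-256 key and a 16-byte CBC IV from a passphrase."""
    material = hashlib.pbkdf2_hmac(
        "sha256",                # assumed digest
        passphrase.encode("utf-8"),
        salt,                    # random per file; OpenSSL stores it in the header
        ITERATIONS,
        dklen=48,                # 32 bytes of key + 16 bytes of IV
    )
    return material[:32], material[32:]
```

Because the salt differs per file, two backups encrypted with the same passphrase end up with unrelated keys and IVs.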
- Nexus themes decoupled from layout: Nexus and Nexus Dark themes were previously locked to the Nexus layout only. They are now independent color themes that work with any layout (Terminal, Glass, Atlas, Nexus). Theme cycling (Ctrl+K) and Settings picker now include all 6 themes.
- Premium card depth: Dark theme cards (Terminal, Midnight, Ember, Nexus Dark) now have subtle box shadows creating layered depth instead of flat rectangles.
- Progress bar polish: All progress bars now have rounded ends and a subtle accent-colored glow per theme (green/blue/orange).
- Bolder status indicators: Status dots (online/offline/warning) are larger (10px) with colored glow halos for better visibility on dense pages.
- Theme picker expanded: Settings appearance panel now shows all 6 themes (was 4) with accurate mini-previews including Nexus Dark and Nexus Light.
- Layout switcher description: Nexus layout description updated to "Modern SaaS, flat nav" (was "Light, clean SaaS" which was misleading since dark themes now work with it).
- Nexus Dark theme: Premium dark mode for the Nexus layout with sun/moon toggle. GitHub Dark-inspired three-layer depth palette, Inter font, rounded corners, blue accent. Persists across sessions.
- Sidebar group labels: Navigation groups (Reseller, Operations, Admin) now display small uppercase labels in the Command layout sidebar.
- Glass sidebar tooltips: Native browser tooltips show nav item names when the Glass layout sidebar is collapsed.
- Card elevation system: Three elevation levels (`.elevation-1/2/3`), `.card-interactive` hover effects, `.hover-lift` card animations. Applied to dashboard cards, sites table, mail service cards, app templates, server/monitor items.
- Page header system: Sticky `page-header` bar with title, subtitle, and action buttons. Applied to 13 pages (Dashboard, Sites, Databases, Apps, Security, Settings, Servers, Mail, Monitoring, DNS, Users, Git Deploy, Alerts).
- Login background gradient: Subtle radial gradient that adapts per theme (green/blue/teal/orange).
- Modal portal system: `dp-modal`/`dp-modal-overlay` CSS classes for Nexus-compatible modal styling across 15 modals in 6 pages.
- Button color hierarchy: Only primary CTAs (Create Site, Run Scan, Add Record) stay green. All secondary/utility buttons (Customize, Restart Nginx, Export, Refresh, etc.) use neutral gray — breaks the green monotone across 6 pages, ~25 buttons.
- Dynamic progress bar colors: CPU/Memory/Disk bars change from green (<70%) → amber (70-90%) → red (>90%). Disk uses 80/90 thresholds. Rounded ends with smooth 500ms transitions.
- Dashboard visual hierarchy: Metric cards with elevation, 24h chart fade-in animation, staggered stat grid, collapsible onboarding wizard (auto-collapses after 3+ steps, persists to localStorage).
- Sidebar footer redesign: User avatar circle with initial, hover-reveal logout button, descriptive health status ("Connected"/"Disconnected" replaces "OK"/"!"). Applied to both Command and Glass layouts.
- Typography for non-terminal themes: Midnight and Ember now remove uppercase/tracking like Nexus. All 5 sans-serif themes get 15px body text for better Inter readability.
- Security card grid: Changed from 5-column with orphan card to balanced 3-column grid with equal `min-h-[140px]` heights.
- Table hover states: `table-row-hover` class added to Security, DNS, and Users table rows with theme-aware hover colors.
- Onboarding wizard: Completed steps show a solid green circle with white checkmark. Collapsible with compact "Setup: X/5 complete" view.
- Ember theme contrast: Lightened surfaces and brightened orange accent for better text readability.
- Atlas layout nav: Added `shrink-0` to nav items so they scroll horizontally instead of compressing.
- Richer empty states: Sites, Databases, Git Deploys, Monitors, and Crons pages show contextual feature descriptions instead of bare "No X yet" text.
- Login page: Removed bulky "Made with Rust" gear icon, replaced with minimal "Powered by Rust" text. Card shadows added.
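
The dynamic progress bar thresholds described above reduce to a small mapping from usage percentage to color. The sketch is in Python for illustration (the real logic lives in the TypeScript frontend). The function names are hypothetical, and the boundary handling (70 and 90 both count as amber) follows the ranges as written, with the disk boundaries assumed to work the same way.

```python
def bar_color(percent: float, warn: float = 70, crit: float = 90) -> str:
    """Green below warn, amber from warn through crit, red above crit."""
    if percent > crit:
        return "red"
    if percent >= warn:
        return "amber"
    return "green"


def disk_bar_color(percent: float) -> str:
    """Disk uses the 80/90 thresholds instead of 70/90."""
    return bar_color(percent, warn=80, crit=90)
```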
- Theme switching: Nexus→Terminal white screen: Switching from Nexus layout to any other layout left `dp-theme=nexus` (white) active, rendering a white Terminal layout. Fixed with `dp-pre-nexus-theme` save/restore in LayoutSwitcher, NexusLayout, useLayoutState, and main.tsx IIFE.
- Nexus modal clipping: Modals in Nexus layout were clipped by `overflow-hidden` on the main wrapper, hiding the top fields. Fixed with `createPortal` to render at `document.body`.
- Nexus modal contrast: Modal cards in Nexus light had the same `#f9fafb` background as the page (invisible). Fixed with `dp-modal` class providing white background, strong shadow, and proper text colors.
- Page header spacing: Added `margin-bottom: 1.25rem` to `.page-header` for consistent spacing between header and content.
- Nexus light theme: tinted selection buttons: Migration source cards, Settings proxy selector, and all `bg-rust-500/10`-style toggle buttons were rendering as solid blue blobs. Fixed with properly unescaped selectors.
- Nexus light theme: accent toggle visibility: `bg-accent-500/15` toggles now render with readable blue tint and text.
- CORS lockdown: Deny all cross-origin requests by default. Same-origin panel UI is unaffected. Previously defaulted to `AllowOrigin::any()` which allowed CSRF from any website.
- Constant-time token comparison: Agent auth middleware now uses `subtle::ConstantTimeEq` to prevent timing attacks on token validation.
- Token hashing in database: Agent tokens stored as SHA-256 hashes in `agent_token_hash` column. DB dump no longer exposes plaintext tokens for inbound auth.
- Token rotation: New `POST /auth/rotate-token` on agent + `POST /api/servers/{id}/rotate-token` on API. 60-second grace period for old token during rotation. Updates `api.env` on disk for persistence.
- Secure cookie fix: `BASE_URL` defaulted to `https://panel.example.com`, causing `Secure` flag on cookies over HTTP. Fixed — defaults to empty, setup script sets from domain.
- jsonwebtoken upgraded 9 → 10.3.0: Fixes type confusion vulnerability that could lead to authorization bypass.
- serde_yml replaced with serde_yaml_ng: `serde_yml` and `libyml` are unsound/unmaintained. Replaced with `serde_yaml_ng` v0.10.0.
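
The token changes above (hashed storage, constant-time comparison, and a 60-second grace period on rotation) fit together as in the following sketch. It is Python rather than the agent's Rust, `TokenStore` is a hypothetical class, and keeping the state in memory is an assumption made for brevity. The real implementation persists the hash in the `agent_token_hash` column.

```python
import hashlib
import hmac
import time


def hash_token(token: str) -> str:
    """Only this SHA-256 hex digest is stored, never the plaintext token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


class TokenStore:
    GRACE_SECONDS = 60  # old token stays valid this long after rotation

    def __init__(self, token: str):
        self.current_hash = hash_token(token)
        self.previous_hash = None
        self.previous_expires = 0.0

    def rotate(self, new_token: str, now=None):
        now = time.monotonic() if now is None else now
        self.previous_hash = self.current_hash
        self.previous_expires = now + self.GRACE_SECONDS
        self.current_hash = hash_token(new_token)

    def accepts(self, presented: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        digest = hash_token(presented)
        # compare_digest runs in time independent of where the strings differ.
        ok_current = hmac.compare_digest(digest, self.current_hash)
        ok_previous = (
            self.previous_hash is not None
            and now < self.previous_expires
            and hmac.compare_digest(digest, self.previous_hash)
        )
        return ok_current or ok_previous
```

The grace window lets requests already in flight with the old token succeed while the caller picks up the new one.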
- Cascade cron cleanup: Deleting a site now removes cron entries from the system crontab. Previously, DB records were cleaned via CASCADE but crontab entries were orphaned.
- UFW port gap: Setup script now adds panel ports (80, 443, 8443) to UFW even when the firewall is pre-existing. Previously skipped port rules if UFW was already installed.
- Token rotation API→agent desync: Rotating the agent token now updates the API's in-memory `AgentClient` token AND writes to `api.env` on disk. Previously left the API with the old token, breaking all agent communication.
- CI pipeline (`.github/workflows/ci.yml`): Rust clippy, frontend type check, build verification, unit tests, `cargo-audit` + `npm audit` security scanning. Runs on every push to main and PRs.
- E2E test suite (`tests/e2e.sh`): 62 tests across 27 categories — full CRUD lifecycle, security edge cases, zero-leftover cleanup. Run: `bash tests/e2e.sh <host> [port]`.
- Deep E2E test suite (`tests/deep-e2e.sh`): 51 tests for advanced features — WordPress install, backup restore, git deploy, reseller system, file operations, compose stacks, concurrent operations, extensions API.
- 29 unit tests: Config parsing (BASE_URL defaults, Secure flag logic), token hashing, input validation (domains, names, container IDs, path traversal, pagination).
- API reference (`docs/api-reference.md`): 648 lines documenting all 371 endpoints with request bodies and examples.
- Competitor comparison (`COMPARISON.md`): Honest comparison vs HestiaCP, CloudPanel, RunCloud, CyberPanel, Ploi.
- README overhaul: Dashboard screenshot, comparison table, collapsible screenshot gallery, cleaner structure.
- FUNDING.yml: PayPal sponsor link (paypal.me/ovexro).
- Reboot recovery: All services start automatically after server reboot. 62/62 E2E tests pass post-reboot.
- Fresh install E2E: Full install via `INSTALL_FROM_RELEASE=1` on clean Ubuntu 24.04 VPS — all features operational.
- Documentation site at `docs.dockpanel.dev`: mdBook-generated, 8 pages (getting-started, troubleshooting, CLI reference, WordPress, Git deploy, email, multi-server, backups). 1855 lines.
- Docker app templates pinned: 33 of 39 `:latest` tags replaced with specific major versions (e.g., `redis:7`, `ghost:5`, `grafana/grafana:11`). 6 kept at `:latest` due to non-standard versioning (minio, nocodb, etc.).
- Auto-monitors removed: Sites no longer auto-create uptime monitors on creation. Users create monitors manually when DNS is configured.
- 8 documentation pages at `docs/`: getting-started, troubleshooting, CLI reference, and 5 guides (WordPress, Git deploy, email, multi-server, backups). 1855 lines of practical, copy-paste-friendly docs.
- Local server not registered after setup: API returned 503 on all requests after admin creation. Added `ensure_local_server()` call in the setup endpoint.
- Site docroot missing /public/ subdirectory: Agent created `/var/www/{domain}/` but nginx expected `/var/www/{domain}/public/`. Fixed to create the correct subdirectory.
- Backup tar flag incompatibility: Replaced `--no-dereference` with `-h` (POSIX-compatible).
- Migration ordering: `whitelabel_oauth` migration was running before `reseller_system` (ALTERing a table before it existed). Renumbered to `20260320050000`.
- OAuth bypasses 2FA: OAuth login issued full session without checking `totp_enabled`. Now redirects to 2FA challenge when enabled.
- Setup script missing build tools: Fresh VPS source builds failed — added `build-essential cmake pkg-config` installation.
- No swap on x86_64 low-RAM VPS: Swap creation only triggered on ARM. Now applies to all architectures when building from source.
- install-agent.sh wrong env vars: Remote agents never entered phone-home mode (`AGENT_TOKEN` vs `DOCKPANEL_SERVER_TOKEN`). Fixed to write both sets.
- Systemd services never updated during upgrade: `update.sh` now rewrites service files with current `ReadWritePaths` and hardening.
- Required directories not created during upgrade: `update.sh` now creates `/etc/postfix`, `/var/vmail`, and other directories needed by new features.
- UFW blocks panel port 8443: IP-based installs now open the configured panel port in UFW.
- ExecStartPost hardcodes www-data: Agent socket chgrp now auto-detects nginx group (www-data or nginx).
- read prompt broken in curl-pipe-bash: Domain prompt now reads from /dev/tty when stdin is piped.
- Frontend path mismatch after upgrade: update.sh now fixes nginx root path when switching between source and release modes.
- config.rs default LISTEN_ADDR was 0.0.0.0:3000: Changed to 127.0.0.1:3080 to match all scripts and nginx config.
- uninstall.sh incomplete cleanup: Now removes CLI binary, tmpfiles.d, crontab entries, /var/www/acme, /var/lib/dockpanel.
- Stacks INSERT missing server_id: Docker Compose stacks now include server_id in INSERT.
- Staging site INSERT missing server_id: Staging environments now inherit parent site's server_id.
- No domain uniqueness across sites + git_deploys: Cross-table domain conflict check prevents silent hijacking.
- Blue-green deploy dropped resource limits: New container now inherits memory/cpu_period/cpu_quota from config.
- Git preview port has no unique constraint: Added UNIQUE INDEX on git_previews(host_port).
- Site proxy_port has no unique constraint: Added partial UNIQUE INDEX on sites(proxy_port).
- No terminal session limit: Added AtomicU32 counter with max 20 concurrent PTY sessions.
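For readers curious how an AtomicU32-based session cap can work, here is a minimal sketch. The names SessionLimiter, SessionGuard, and try_acquire are hypothetical and not taken from the DockPanel source; only the AtomicU32 counter and the limit of 20 come from the entry above.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

/// Limit quoted in the changelog entry (20 concurrent PTY sessions).
const MAX_SESSIONS: u32 = 20;

/// Hypothetical limiter type; the real struct is not shown in the changelog.
#[derive(Clone, Default)]
pub struct SessionLimiter {
    active: Arc<AtomicU32>,
}

/// Held for the lifetime of one terminal session; frees its slot on drop.
pub struct SessionGuard {
    active: Arc<AtomicU32>,
}

impl SessionLimiter {
    /// Reserves a slot, or returns `None` when the limit is already reached.
    pub fn try_acquire(&self) -> Option<SessionGuard> {
        self.active
            // Increment only while below the cap, as one atomic step.
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < MAX_SESSIONS { Some(n + 1) } else { None }
            })
            .ok()
            .map(|_| SessionGuard { active: Arc::clone(&self.active) })
    }

    /// Number of sessions currently holding a slot.
    pub fn active(&self) -> u32 {
        self.active.load(Ordering::Acquire)
    }
}

impl Drop for SessionGuard {
    fn drop(&mut self) {
        // Release the slot when the session ends, including on error paths.
        self.active.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Tying the decrement to Drop means a session that ends through an early return or a panic still gives its slot back, which a manual increment/decrement pair can easily miss.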
- CONTRIBUTING.md: Development setup, architecture overview, code style, PR process.
- GitHub issue templates: Bug report and feature request forms with structured fields.
- GitHub PR template: Checklist for builds, tests, and changelog.
- README.md: Added badges (license, release, build), doc links, contributing section, phone-home disclosure.
- .gitignore: Added SSL material, database file patterns.
- Rate limit bypass via X-Forwarded-For: Login rate limiter now uses X-Real-IP (set by nginx, not forgeable) instead of X-Forwarded-For.
- SSRF filter bypass in extensions: Webhook URL validation replaced string-matching with DNS resolution + is_loopback()/is_private()/is_link_local() checks. Blocks hex IPs, decimal IPs, IPv6 loopback, DNS-to-localhost, cloud metadata.
- Nginx version disclosure: Added server_tokens off to nginx config.
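The SSRF entry above describes checking resolved addresses instead of matching URL strings. A rough standard-library sketch of that idea follows. The function names are hypothetical, and the IPv6 range checks and the "reject if any resolved address is internal" policy are assumptions, not confirmed DockPanel behavior.

```rust
use std::io;
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, ToSocketAddrs};

/// True when an IPv4 address must not be used as a webhook target.
fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback()            // 127.0.0.0/8
        || ip.is_private()      // 10/8, 172.16/12, 192.168/16
        || ip.is_link_local()   // 169.254/16, which covers 169.254.169.254
        || ip.is_unspecified()  // 0.0.0.0
}

/// True when an IPv6 address must not be used as a webhook target.
fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // IPv4-mapped addresses (::ffff:a.b.c.d) are judged by their IPv4 part.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7 (assumed in scope)
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}

/// Hypothetical helper: classifies one already-resolved address.
pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}

/// Hypothetical helper: resolves `host` and allows it only if every
/// resolved address is public. An empty result counts as not allowed.
pub fn webhook_host_allowed(host: &str, port: u16) -> io::Result<bool> {
    let mut addrs = (host, port).to_socket_addrs()?.peekable();
    if addrs.peek().is_none() {
        return Ok(false);
    }
    Ok(addrs.all(|addr| !is_blocked_ip(addr.ip())))
}
```

Because the decision is made on the resolved address, a hostname that merely points at 127.0.0.1 is treated the same as the literal IP. A production filter would also need to connect to the address it validated, so that a second DNS lookup cannot return something different.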
- Agent fails after every reboot: Removed ReadWritePaths and PrivateTmp=yes from agent systemd service (redundant with ProtectSystem=no, and caused NAMESPACE errors for missing dirs). Added ExecStartPre to create /run/dockpanel.
- Health endpoint false "ok": /api/health now checks DB connectivity, returns "degraded" when database is unreachable.
- StartLimitIntervalSec in wrong section: Moved from [Service] to [Unit] in all 3 scripts.
- Secure cookie over HTTP: Login cookie conditionally sets Secure flag based on BASE_URL scheme. SameSite changed from Strict to Lax (Strict blocked OAuth redirects).
- Site document root not created: Agent now creates /var/www/{domain}/public/ with a default index.html during site provisioning.
- PHP site without PHP check: Agent validates PHP-FPM socket exists before writing PHP nginx config. Returns clear error with install instructions.
- serde_yaml archived: Replaced with serde_yml in agent and CLI (serde_yaml maintainer archived the crate in 2024).
- MailHog abandoned: Replaced mailhog/mailhog template with axllent/mailpit (MailHog last updated 2020).
- Stale build templates: Updated rust:1.82-slim → rust:1.94-slim, golang:1.23-alpine → golang:1.24-alpine.
- Cloudflare auth header deduplication: 5 inline blocks → shared helpers::cf_headers().
- Server IP detection deduplication: 6 inline blocks → shared helpers::detect_public_ip().
- Agent semaphore split: Long-running ops (Docker builds) use separate 5-permit semaphore, quick requests keep 20.
- Extension webhook rate limiting: Max 20 concurrent deliveries with atomic counter.
- DB pool acquire timeout: 5-second timeout prevents indefinite blocking.
- Uptime monitor N+1 query: Maintenance window check batched into single query.
- Version alignment: All Cargo.toml and package.json versions bumped to 2.0.2 (were 0.1.0/1.0.0). API health endpoint and CLI --version now report correct version.
- Binary size claims: Marketing site, README, and FAQ updated from "~20MB" (agent-only) to "~35MB" (total of agent + API + CLI) for honest comparison.
- Template count: FAQ corrected from 53 to 54 app templates.
- OS support: Hero section now includes Rocky Linux 9+ alongside other supported distros.
- install-agent.sh binary naming: Was downloading dockpanel-agent-x86_64/dockpanel-agent-aarch64 but GitHub Releases publishes dockpanel-agent-linux-amd64/dockpanel-agent-linux-arm64. Fixed to match release naming.
- install-agent.sh apt-get hardcoding: Now detects package manager (apt/dnf/yum) instead of hardcoding apt-get. CentOS, Rocky, Fedora, and Amazon Linux now supported for remote agent installs.
- install-agent.sh server-id persistence: --server-id was accepted but never written to config. Now persisted to /etc/dockpanel/api.env as SERVER_ID.
- install-agent.sh tmpfiles.d: Added /run/dockpanel tmpfiles.d entry so socket directory survives reboots.
- install-agent.sh systemd hardening: Remote agent service now matches local agent hardening (MemoryMax, LimitNOFILE, PrivateTmp, ProtectKernelLogs/Modules).
- update.sh pre-built binary path: Added INSTALL_FROM_RELEASE=1 support so ARM users who installed via release binaries can update without Rust toolchain.
- update.sh redundant health check: Removed duplicate wait-for-health loop after rollback-capable check.
- Multi-Server Management: Manage unlimited remote servers from one panel. AgentRegistry dispatches to local (Unix socket) or remote (HTTPS) agents. Server selector in sidebar, test connection, install script for remote agents. ServerScope extractor with user ownership verification on every request.
- Reseller / Multi-Tenant Accounts: Admin → Reseller → User hierarchy. Reseller quotas (max users/sites/databases), server allocation, per-reseller branding (logo, colors, hide DockPanel name). Quota enforcement on site/database creation with counter sync.
- Nixpacks Auto-Detection: Build any app without a Dockerfile using Nixpacks (30+ languages). Dynamic version resolution from GitHub releases. Deploy pipeline: try Nixpacks → fall back to auto-detect (6 langs) → docker build. Build method tracked per deploy.
- Preview Environments: TTL-based auto-cleanup of preview deployments. Branch deletion webhook auto-removes previews. Configurable preview_ttl_hours per deploy. Background cleanup service (5-minute interval).
- Migration Wizard: Import sites, databases, and email from cPanel, Plesk, or HestiaCP. 4-step wizard: select source → analyze backup (auto-detect domains, DBs, mail) → select items → SSE-streamed import. cPanel full parser, Plesk/HestiaCP beta stubs.
- WordPress Toolkit: Multi-site WP dashboard with parallel detection. Vulnerability scanning against 14 known exploited plugins. Security hardening (7 checks, 6 auto-fixable via wp-cli). Bulk update plugins/themes/core across selected sites.
- White-Label Branding: Public /api/branding endpoint. Per-reseller logo_url, accent_color, panel_name, hide_branding. BrandingContext provider applies to sidebar + login page. Dynamic accent color via CSS variable.
- OAuth / SSO Login: Google, GitHub, GitLab via OAuth 2.0 authorization code flow. CSRF state tokens (10-minute expiry). GitHub private email fallback. Auto-create users on first OAuth login (configurable). Provider-colored login buttons.
- Traefik Reverse Proxy: Alternative to nginx for Docker app routing. Traefik v3.3 as Docker container with auto-SSL (Let's Encrypt ACME). File-based dynamic route configs with auto-watch. Install/uninstall/status management. Settings toggle in admin panel.
- Plugin / Extension API: Webhook-based integrations with HMAC-SHA256 signed event delivery. Extension CRUD with dpx_ API keys and whsec_ webhook secrets. Event types: site/backup/deploy/app/auth/ssl. Delivery log with status tracking. Secret rotation. SSRF protection on webhook URLs.
- SQL Browser: Built-in query editor for PostgreSQL and MariaDB with schema viewer
- Node.js + Python Site Runtimes: Managed systemd services with auto-port allocation
- Docker Compose Stacks: Full stack lifecycle (deploy, start, stop, restart, update, remove)
- Blue-Green Zero-Downtime Deploy: Docker app updates with traffic swap and rollback
- Git Push-to-Deploy Pipeline: Clone → build → deploy with webhook triggers and rollback
- Container Health Checks: Docker health status (healthy/unhealthy/starting) in Apps view
- Container Logs Viewer: Search, filter, auto-refresh, color-coded log levels
- Command Palette (Ctrl+K): Global search across all panel pages
- One-Click App Updates: Pull latest image, preserve config, recreate container
- 34 App Templates: Database, CMS, monitoring, analytics, tools, dev, storage, media, networking, security
- Getting Started Wizard: 5-step onboarding checklist
- Architecture: Single-agent → multi-agent (AgentRegistry, AgentHandle enum, RemoteAgentClient)
- Auth: Added ResellerUser extractor, ServerScope with ownership verification
- Database: 8 new tables, server_id FK on all resource tables, reseller profiles, extensions, migrations
- Frontend: BrandingContext, ServerContext providers. 8 new pages (Servers, ResellerDashboard, ResellerUsers, Migration, WordPressToolkit, Extensions, plus per-site WP and Git Deploy enhancements)
- Rust Edition: 2024 (Rust 1.94)
- ServerScope verifies server.user_id == claims.sub on every request (prevents cross-user server access)
- OAuth: SameSite=Strict cookies, error callback handling, empty oauth_id validation, no auto-link to password accounts
- Extension API: SSRF protection (blocks private IPs, metadata endpoints), HMAC bypass fix, webhook secret rotation
- Migration wizard: command injection fix (direct docker args), path traversal validation, TAR --no-same-owner
- WordPress: domain path validation, targeted chown (not recursive), site path fallback
- Nixpacks: build_context path traversal validation, dynamic version resolution
- Traefik: ACME directory permissions (0700), network cleanup on uninstall
- Branding: logo_url validated (HTTP(S) only), accent_color validated (hex/rgb/hsl only)
- Reseller: quota enforcement wired up, server isolation for reseller users, counter sync on create/delete
- Preview: TTL reset on redeploy, MAKE_INTERVAL for PostgreSQL safety, cleanup error logging
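The branding entry above mentions validating logo_url (HTTP(S) only) and accent_color (hex/rgb/hsl only). The sketch below shows one conservative way such checks could look. Both function names are hypothetical, and the accepted syntax (lowercase scheme, #rgb or #rrggbb, digits and basic separators inside rgb()/hsl()) is an assumption rather than the panel's actual rule set.

```rust
/// Accepts only http:// or https:// URLs with a non-empty, whitespace-free remainder.
/// Assumption: the scheme is matched case-sensitively in lowercase.
pub fn is_valid_logo_url(url: &str) -> bool {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"));
    matches!(rest, Some(r) if !r.is_empty() && !r.contains(char::is_whitespace))
}

/// Accepts `#rgb`, `#rrggbb`, `rgb(...)`, and `hsl(...)` values only.
pub fn is_valid_accent_color(value: &str) -> bool {
    let v = value.trim();

    if let Some(hex) = v.strip_prefix('#') {
        // Assumption: 3- or 6-digit hex only (no alpha channel forms).
        return matches!(hex.len(), 3 | 6) && hex.chars().all(|c| c.is_ascii_hexdigit());
    }

    for prefix in ["rgb(", "hsl("] {
        if let Some(inner) = v.strip_prefix(prefix).and_then(|s| s.strip_suffix(')')) {
            // Only digits and simple separators may appear, so `;`, `url(`,
            // or `var(` cannot be smuggled into the CSS variable.
            return !inner.is_empty()
                && inner
                    .chars()
                    .all(|c| c.is_ascii_digit() || matches!(c, ' ' | ',' | '.' | '%'));
        }
    }

    false
}
```

An allow-list keeps the check small: because the accent color is applied through a CSS variable, anything outside these three shapes is rejected rather than sanitized.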
- 100+ findings from 9 comprehensive audits across all features
- server_id filtering added to git_deploys, stacks, databases, dashboard, alerts list endpoints
- Compose deployments now correctly set build_method='compose'
- Preview cleanup query uses MAKE_INTERVAL instead of string concat
- fire_event() wired into site/backup/app handlers (was dead code)
- Traefik Docker app integration (was install-only with no functional routing)
- Frontend SecurityItem type mismatch in WordPress Toolkit fixed
- OAuth parameter mismatch (doc_root vs source_dir) in migration wizard fixed
- Email Management: Full mail server with one-click install (Postfix + Dovecot + OpenDKIM). Domains, mailboxes, aliases, catch-all, quotas, autoresponders, DKIM signing, DNS helper (MX/SPF/DKIM/DMARC), mail queue viewer
- PowerDNS: Self-hosted DNS alongside Cloudflare. Provider selector, zone creation, record CRUD, setup guide
- One-Click CMS Install: WordPress, Drupal, Joomla — create site + database + install + SSL in one click from Sites page
- Historical Charts: SVG sparkline charts (CPU/Memory/Disk 24h) with background metrics collector (60s interval, 7-day retention)
- Light Theme: CSS variable overrides, sun/moon toggle in sidebar footer, localStorage persistence
- One-Click Service Installers: PHP-FPM, Certbot, UFW, Fail2Ban — install from Settings page
- Smart Port Opener: Port recognition (28+ ports), safety categories (safe/caution/blocked), quick presets (Web/Mail/Database)
- SSH Key Management: List/add/remove authorized keys with SHA256 fingerprints
- Auto-Updates: Toggle for unattended-upgrades security patches
- Panel IP Whitelist: Restrict panel access to specific IPs
- Auto-SSL: Automatic Let's Encrypt provisioning on site creation
- Webhook Testing: Test Slack/Discord webhooks from Settings
- File Upload: Base64 binary upload with path traversal protection
- Webmail Template: Roundcube one-click deploy from Docker Apps
- Spam Filter Template: Rspamd one-click deploy from Docker Apps
- BUILD STABLE Badge: Build status indicator in sidebar footer
- Harmonized Color Palette: Green/amber/red at identical saturation/lightness (anchored at #22c55e). Custom warn-* and danger-* CSS scales. Zero stale emerald/amber/yellow references
- Dashboard Redesign: Bar metrics with centered text-5xl numbers (replaced ring gauges), neutral white numbers + gray progress bars (color only for warnings/critical), system info grid (replaced neofetch style)
- Sidebar Overhaul: Flat nav (no progressive disclosure), white active state with blinking _ cursor, 19px icons, spacing-only groups
- Terminal Frame: Unified bordered container (header + canvas in single frame)
- Mobile Responsive: Card layouts for Activity, Users, DNS records. Logs toolbar wrapping. Monitors polish
- Contrast: All text-dark-400 bumped to text-dark-300 globally (36 instances, 14 files) for WCAG compliance
- Animations: Page fade-up, stagger children, counting numbers, typewriter welcome, hover-lift. Respects prefers-reduced-motion
- Login Page: Logo updated to match sidebar brand
- Apps/Sites Separation: WordPress/Drupal/Joomla moved from Docker Apps to native PHP in Sites. 32 Docker templates remain for services and tools
- 502 Error UX: "Agent offline" message with systemctl restart command instead of cryptic "Request failed (502)"
- Security Score: Prominence increase, singular/plural grammar fix
- Apps Empty State: Error message with icon when templates fail to load
- Diagnostics: Agent nginx -t check distinguishes [warn] from [emerg]/[error] — no false critical on cosmetic warnings
- Document Root False Positives: Changed ProtectHome=yes → read-only so agent can see /home/* directories
- Agent Socket Persistence: Added tmpfiles.d config + /run/nginx.pid to ReadWritePaths
- Agent Permissions: NoNewPrivileges=no, ReadWritePaths for mail/apt/etc paths — enables package installation
- CUPS Disabled: Removed unnecessary print service
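The Diagnostics entry above separates cosmetic [warn] lines in nginx -t output from real [emerg]/[error] failures. A small sketch of that classification, assuming the agent already has the command's combined output as a string (the enum and function names are hypothetical):

```rust
/// Hypothetical result type for an `nginx -t` run.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ConfigHealth {
    Ok,
    Warning,
    Critical,
}

/// Classifies the combined stdout/stderr text of `nginx -t`.
/// Any `[emerg]` or `[error]` line is critical; `[warn]` alone is a warning.
pub fn classify_nginx_test(output: &str) -> ConfigHealth {
    let mut health = ConfigHealth::Ok;
    for line in output.lines() {
        if line.contains("[emerg]") || line.contains("[error]") {
            // A hard failure outranks any warnings seen earlier.
            return ConfigHealth::Critical;
        }
        if line.contains("[warn]") {
            health = ConfigHealth::Warning;
        }
    }
    health
}
```

With this split, a config that only triggers something like a conflicting server name warning is reported as a warning and no longer raises a critical issue.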
- Setup script auto-installs UFW + Fail2Ban with default rules
- Smart firewall blocks dangerous ports (Telnet, NetBIOS, SMB, MSSQL)
- All cookie flags verified: HttpOnly, Secure, SameSite=Strict, Max-Age=7200
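As an illustration of the smart firewall categories (safe/caution/blocked), the sketch below maps a port to a category. Only the blocked services (Telnet, NetBIOS, SMB, MSSQL) come from the entries above. The enum, the function name, the list of "safe" ports, and the default of "caution" for everything else are assumptions.

```rust
/// Safety categories named in the Smart Port Opener entry.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum PortSafety {
    Safe,
    Caution,
    Blocked,
}

/// Hypothetical classifier for a TCP port the user asks to open.
pub fn classify_port(port: u16) -> PortSafety {
    match port {
        // Blocked per the changelog: Telnet (23), NetBIOS (137-139),
        // SMB (445), MSSQL (1433).
        23 | 137..=139 | 445 | 1433 => PortSafety::Blocked,
        // Assumption: common web and mail ports count as safe.
        80 | 443 | 25 | 465 | 587 | 993 | 995 => PortSafety::Safe,
        // Assumption: anything else, such as database ports, needs a warning.
        _ => PortSafety::Caution,
    }
}
```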
- Metrics collector background service (60s interval, 7-day retention)
- Mail config sync to Postfix/Dovecot via atomic file writes
- DKIM key generation via openssl RSA 2048-bit
- Setup script installs PHP, Certbot, UFW, Fail2Ban out of the box
- Core Panel: Site management (static, PHP, proxy), database management (PostgreSQL, MariaDB), SSL (Let's Encrypt), file manager, web terminal, backups
- Docker Apps: 50+ one-click templates across 10 categories + Docker Compose import
- CLI: Full command-line interface — status, sites, db, apps, ssl, backup, logs, security, diagnose, export, apply
- Infrastructure as Code: YAML export/import of server configuration
- Smart Diagnostics: Pattern-based issue detection across 6 categories with one-click fixes
- Auto-Healing: Automatic restart of crashed services, log cleanup on full disk, SSL renewal
- Alerting System: 5 alert types (CPU/memory/disk thresholds, server offline, SSL expiry, service health, backup failure) with email, Slack, Discord notifications
- 2FA/TOTP: Full two-factor authentication with QR setup and recovery codes
- Dashboard Intelligence: Health score (0-100), top active issues, SSL expiry countdowns
- Docker Resource Limits: Memory and CPU limits on container deploy
- Container Management: Health checks, logs viewer, environment viewer, one-click updates
- Security: Firewall management, Fail2Ban, SSH hardening, security scanning with scoring
- DNS Management: Cloudflare DNS zone management with full record CRUD
- Git Deploy: Webhook-triggered deployments from Git repos
- Staging Environments: Create staging copies, sync from production, push to live
- Uptime Monitoring: HTTP checks with configurable intervals and incident tracking
- Teams: Multi-user access with roles and team-based permissions
- Activity Log: Full audit trail of all admin actions
- Multi-Server: Manage unlimited servers from a single dashboard
- ARM64 Support: Pre-built binaries for Raspberry Pi and ARM64 servers
- Auto Reverse Proxy: Domain + SSL auto-configured when deploying Docker apps
- Command Palette: Ctrl+K global search across all panel pages
- Notification Channels: Email toggle, Slack/Discord webhook configuration
- Custom Nginx Directives: Per-site textarea for advanced nginx config
- Onboarding Wizard: 5-step getting started checklist for new users
- JWT auth with HttpOnly cookies + Bearer header support
- Token blacklist for logout with periodic cleanup
- Argon2 password hashing
- Rate limiting on login, 2FA, webhooks, and agent endpoints
- Systemd hardening (NoNewPrivileges, ProtectSystem, MemoryMax)
- Nginx rate limiting (30r/s on API)
- 12 CHECK constraints on database status/type fields
- Atomic nginx config writes (tmp+rename)
- Supervised background tasks with auto-restart on panic
- Statement timeout on all database pool connections (30s)
- Agent request timeout (60s)
- DB backup cron (daily, 7-day retention)
- Docker prune cron (weekly)
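The "atomic nginx config writes (tmp+rename)" item above refers to a common pattern: write the new file next to the target, flush it, then rename it over the old one. A minimal standard-library sketch follows. The function name and the .tmp suffix are assumptions, not the agent's actual code.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: replaces `path` with `contents` so that readers
/// (for example an nginx reload) never see a half-written file.
pub fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    // The temp file lives in the same directory because rename is only
    // atomic within a single filesystem.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = path.with_file_name(tmp_name);

    let mut file = File::create(&tmp_path)?;
    file.write_all(contents)?;
    // Flush data to disk before the swap so a crash cannot leave an empty file.
    file.sync_all()?;
    drop(file);

    // On POSIX systems rename replaces the target in one step.
    fs::rename(&tmp_path, path)
}
```

If the process dies before the rename, the old config is still intact and only a stray .tmp file remains, which is the property that makes this pattern attractive for live web server configs.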