feat: make dashboard hardware- and host-agnostic #7

Merged
niklasfrick merged 5 commits into main from feat/hardware-host-agnostic
Apr 22, 2026
@niklasfrick

Owner

The project was built targeting the NVIDIA DGX Spark (Grace+Blackwell GB10) and baked a few Spark-specific assumptions into the metrics layer, which produced wrong data on any other Linux + NVIDIA GPU host. This PR fixes the load-bearing cases, adds multi-GPU awareness, and generalizes the framing without renaming the crate.

Load-bearing fixes

  • metrics/memory: drop hardcoded is_unified=true. Read NVML memory_info for real GPU VRAM total/used and auto-detect unified memory via a pure detect_unified_memory helper (GPU VRAM within 10% of system RAM). Added gpu_memory_total_bytes and gpu_memory_used_bytes to MemoryMetrics.
  • metrics/mod: add --gpu-index (SPARK_DASHBOARD_GPU_INDEX, default 0) with graceful out-of-range handling; log NVML device count at startup.
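The two fixes above reduce to small pure functions. The sketch below shows one way they could look; it is not the code from this PR. The signatures, the `UNIFIED_TOLERANCE` constant, the `resolve_gpu_index` name, and the choice to fall back to device 0 for an out-of-range index are all assumptions made for illustration. Only the `detect_unified_memory` name and the 10% rule come from the description.

```rust
/// Assumed tolerance: GPU memory within 10% of system RAM counts as unified.
const UNIFIED_TOLERANCE: f64 = 0.10;

/// Pure helper (no NVML or /proc access), so it can be unit-tested directly.
/// `gpu_total_bytes` is `None` when NVML could not report memory info.
fn detect_unified_memory(system_total_bytes: u64, gpu_total_bytes: Option<u64>) -> bool {
    match gpu_total_bytes {
        Some(gpu) if gpu > 0 && system_total_bytes > 0 => {
            // Compare the absolute difference against 10% of system RAM.
            let diff = system_total_bytes.abs_diff(gpu) as f64;
            diff <= UNIFIED_TOLERANCE * system_total_bytes as f64
        }
        // Missing or zero readings are treated as "not unified".
        _ => false,
    }
}

/// Hypothetical out-of-range handling: keep a valid index, fall back to
/// device 0 when the requested index exceeds the NVML device count, and
/// return `None` when no device is present at all.
fn resolve_gpu_index(requested: u32, device_count: u32) -> Option<u32> {
    if device_count == 0 {
        None
    } else if requested < device_count {
        Some(requested)
    } else {
        Some(0)
    }
}
```

Under these assumptions, a host reporting 128 GB of system RAM and 120 GB of GPU memory is classified as unified, while a 64 GB host with a 24 GB discrete card is not.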

Frontend

  • MemoryCard branches on is_unified: unified hosts keep the existing stacked bar; discrete-GPU hosts render separate system RAM and GPU VRAM sections.
  • Added tests covering the discrete path and the missing-VRAM fallback.

Dev / docs

  • Rename dev env vars to DEPLOY_USER / DEPLOY_HOST / DEPLOY_DIR. SPARK_* are still accepted as a fallback with a one-line deprecation note so existing .env files keep working.
  • Generalize README, CONTRIBUTING, dev/README, install.sh, systemd unit, Cargo description, and internal comments away from "DGX Spark only" framing. DGX Spark kept as the original reference point.
  • Drop the aarch64 warning from packaging/install.sh.
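The env-var rename with a deprecated fallback follows a common pattern, sketched here in Rust to match the crate's language. The dev tooling itself may well be shell, and the description only names the legacy variables as `SPARK_*`, so the function name, the note's wording, and the `SPARK_HOST` name in the usage line are hypothetical.

```rust
/// Resolve a setting from its new name, falling back to a legacy name.
/// Returns the value plus an optional one-line deprecation note that the
/// caller can print. `lookup` abstracts the environment so the logic stays
/// testable without mutating process state.
fn resolve_with_fallback<F>(
    new_name: &str,
    legacy_name: &str,
    lookup: F,
) -> (Option<String>, Option<String>)
where
    F: Fn(&str) -> Option<String>,
{
    // The new name always wins when it is set.
    if let Some(value) = lookup(new_name) {
        return (Some(value), None);
    }
    // Otherwise accept the legacy name, but attach a deprecation note.
    match lookup(legacy_name) {
        Some(value) => (
            Some(value),
            Some(format!("note: {legacy_name} is deprecated; use {new_name} instead")),
        ),
        None => (None, None),
    }
}
```

A caller could wire this to the real environment with `resolve_with_fallback("DEPLOY_HOST", "SPARK_HOST", |k| std::env::var(k).ok())`, where `SPARK_HOST` stands in for whichever legacy name applies.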

@niklasfrick niklasfrick self-assigned this Apr 22, 2026
@niklasfrick niklasfrick added the enhancement New feature or request label Apr 22, 2026
@niklasfrick

Owner Author

#6

Increase contrast on engine and hardware section titles, metric
labels, gauge labels, and stacked bar legends. Inline the HwCard
subtitle next to the title with a middle-dot separator, and expand
"GPU Util" to "GPU Utilization".

Thread DeploymentMode through the detector, engine state, and
snapshot pipeline so the UI can distinguish a native vLLM process
from a containerized one. The engine tab now shows the vLLM logo
(plus a Docker logo when applicable) alongside a primary/secondary
title pair styled to match the hardware cards.
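As a rough illustration of what threading a deployment mode through to the UI can mean, the sketch below models the mode as an enum and derives the logos to display from it. Only the `DeploymentMode` name comes from the commit message; the variant names, the `engine_logos` function, and the string identifiers are assumptions, not the project's actual types.

```rust
/// Hypothetical variants: the commit message only says the UI distinguishes
/// a native vLLM process from a containerized one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeploymentMode {
    Native,
    Container,
}

/// Pick the logos for the engine tab: always the vLLM logo, plus a Docker
/// logo when the engine runs in a container.
fn engine_logos(mode: DeploymentMode) -> Vec<&'static str> {
    match mode {
        DeploymentMode::Native => vec!["vllm"],
        DeploymentMode::Container => vec!["vllm", "docker"],
    }
}
```

Keeping the mode as an enum rather than a boolean leaves room for further variants without changing every call site's meaning.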
@niklasfrick niklasfrick merged commit 3b77d5a into main Apr 22, 2026
5 checks passed
@niklasfrick niklasfrick deleted the feat/hardware-host-agnostic branch April 22, 2026 18:29