This is the minimal setup for DS4 ROCm inference on a
Strix Halo machine with 128 GB RAM and Radeon 8060S (gfx1151).
On Ubuntu 26.04 LTS, install the ROCm compiler/runtime and libraries used by the Strix Halo backend:
sudo apt-get update
sudo apt-get install -y \
hipcc rocminfo rocm-smi \
libamdhip64-dev \
libhipblas-dev libhipblaslt-dev \
librocblas-dev \
librocwmma-dev \
libhipcub-devThe backend uses rocWMMA. On this Ubuntu 26.04 setup, librocwmma-dev
installs the top-level rocWMMA headers but misses rocwmma/internal/.
No Ubuntu package currently provides those internal headers. Install a complete
matching rocWMMA header tree:
git clone --depth 1 --branch rocm-7.1.0 https://github.com/ROCm/rocWMMA.git /tmp/rocWMMA-rocm-7.1.0
sudo mkdir -p /usr/local/include
sudo cp -a /tmp/rocWMMA-rocm-7.1.0/library/include/rocwmma /usr/local/include/If ROCm is installed under /usr but tooling expects /opt/rocm, add these
compatibility links:
sudo mkdir -p /opt/rocm/bin
sudo ln -sf /usr/bin/hipcc /opt/rocm/bin/hipcc
sudo ln -sfn /usr/lib/x86_64-linux-gnu /opt/rocm/lib
sudo ln -sfn /usr/include /opt/rocm/includeThe user running DS4 must be able to open /dev/kfd and the DRM render node:
sudo usermod -aG render,video "$USER"Log out and back in, or reboot. Verify:
rocminfo | grep -A80 'Name: gfx1151'If DS4 says no ROCm-capable device is detected, check that rocminfo can open
/dev/kfd and that groups includes render.
A 128 GB Strix Halo system may initially expose only about 62 GB of GPU-visible memory. DS4 needs the larger GTT aperture for the 80.76 GiB model plus runtime buffers.
Use these kernel parameters:
amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856 ttm.page_pool_size=32505856
On Ubuntu with GRUB:
sudo cp /etc/default/grub /etc/default/grub.bak
sudoedit /etc/default/grubSet:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856 ttm.page_pool_size=32505856"
Then:
sudo update-grub
sudo rebootAfter reboot, verify:
cat /proc/cmdline
sudo dmesg | grep -Ei 'GTT|gttsize|TTM|VRAM'
rocminfo | grep -A80 'Name: gfx1151'Expected signs:
amdgpu: 126976M of GTT memory ready
rocminfo gfx1151 pool: 130023424 KB
Use the normal Strix Halo target. It builds the standard binary names:
make strix-halo -j"$(nproc)"make rocm is an alias for make strix-halo.
Use the standard IQ2XXS/Q2K/Q8 imatrix GGUF:
DeepSeek-V4-Flash-IQ2XXS-w2Q2K-AProjQ8-SExpQ8-OutQ8-chat-v2-imatrix.gguf
Avoid the mixed IQ2/IQ4 or IQ2/Q4 GGUFs on this machine for now. They put much more memory pressure on the ROCm path and can trigger system OOM instead of a clean DS4 failure.
Run it normally:
./ds4 -m gguf/DeepSeek-V4-Flash-IQ2XXS-w2Q2K-AProjQ8-SExpQ8-OutQ8-chat-v2-imatrix.ggufThe ROCm build uses the Strix Halo backend automatically.