This integration kit does not patch upstream QMD. Instead, it makes backend selection more predictable for OpenClaw deployments.
- If you know the correct backend, set `QMD_LLAMA_GPU` explicitly.
- Otherwise use the provided wrapper script, which selects:
  - `cuda` when NVIDIA userland indicators and CUDA toolkit availability are present
  - `vulkan` when `vulkaninfo` and `glslc` are present
  - `false` (CPU mode) otherwise
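
The selection order above can be sketched in a few lines of shell. This is a minimal illustration, not the kit's actual wrapper: the function names `have_all` and `select_backend` are hypothetical, and it assumes that `nvidia-smi` stands in for the NVIDIA userland indicators and `nvcc` for CUDA toolkit availability.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of wrapper-style backend selection; not the kit's real script.

# Succeeds only when every named command is found on PATH.
have_all() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || return 1
  done
}

# Prints cuda, vulkan, or false, following the order described above.
select_backend() {
  # Assumption: nvidia-smi ~ "NVIDIA userland indicators", nvcc ~ "CUDA toolkit".
  if have_all nvidia-smi nvcc; then
    echo cuda
  elif have_all vulkaninfo glslc; then
    echo vulkan
  else
    echo false
  fi
}

# Probe only when QMD_LLAMA_GPU is unset or empty, so an explicit value wins.
export QMD_LLAMA_GPU="${QMD_LLAMA_GPU:-$(select_backend)}"
echo "QMD_LLAMA_GPU=$QMD_LLAMA_GPU"
```

Because the probe runs only when the variable is unset or empty, an explicit `export QMD_LLAMA_GPU=...` made beforehand always takes precedence.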
On WSL systems with NVIDIA passthrough, it is common to have:
- `nvidia-smi` available
- `/dev/dxg` present
- but missing Vulkan build dependencies such as:
  - `libvulkan-dev`
  - `vulkan-tools`
  - `glslc`
  - `glslang-tools`
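
To see which of these indicators apply on a given host, a small diagnostic along the following lines may help. It is only a sketch: the `report` helper is hypothetical, and the package check assumes a Debian/Ubuntu-based distribution where `dpkg -s` is available.

```bash
#!/usr/bin/env bash
# Hypothetical diagnostic; reports the indicators listed above.

# Runs the given command quietly and prints "<label>: yes" or "<label>: no".
report() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then echo "$label: yes"; else echo "$label: no"; fi
}

report "nvidia-smi on PATH" command -v nvidia-smi
report "/dev/dxg present"   test -e /dev/dxg
report "vulkaninfo on PATH" command -v vulkaninfo
report "glslc on PATH"      command -v glslc

# Assumption: Debian/Ubuntu. Note that dpkg -s can also succeed for a package
# that was removed but left its configuration files behind.
for pkg in libvulkan-dev vulkan-tools glslc glslang-tools; do
  report "package $pkg known to dpkg" dpkg -s "$pkg"
done
```

A host that prints `yes` for the first two lines and `no` for the Vulkan lines matches the situation described above.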
In that situation, automatic backend probing may repeatedly try a broken Vulkan path and then fall back to CPU.
A stable CPU path is better than repeatedly failing GPU auto-detection. If your system is not ready for CUDA/Vulkan, force:

```bash
export QMD_LLAMA_GPU=false
```

To use the Vulkan path instead, install the missing build dependencies first:

```bash
sudo apt-get update
sudo apt-get install -y libvulkan-dev vulkan-tools glslc glslang-tools
```

The wrapper only checks for minimal userland indicators. Real CUDA support may still require:

- a compatible NVIDIA Windows driver with WSL GPU support
- `libcuda.so` visibility inside WSL
- CUDA Toolkit / `nvcc` availability when node-llama-cpp needs to build a local CUDA backend
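
The last two requirements can be checked from inside WSL before forcing `cuda`. The following is a rough sketch with hypothetical helper names (`listing_has_lib`, `linker_cache`); it assumes `libcuda.so` is registered in the dynamic linker cache, which may not hold if the library is only reachable through `LD_LIBRARY_PATH`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check for libcuda.so and nvcc.

# Reads a linker-cache listing on stdin; succeeds if it mentions library name $1.
listing_has_lib() {
  grep -qF -- "$1"
}

# Prints the dynamic linker cache; ldconfig often lives in /sbin, which may not be on PATH.
linker_cache() {
  ldconfig -p 2>/dev/null || /sbin/ldconfig -p 2>/dev/null
}

if linker_cache | listing_has_lib libcuda.so; then
  echo "libcuda.so: listed in linker cache"
else
  echo "libcuda.so: not listed in linker cache"
fi

if command -v nvcc >/dev/null 2>&1; then
  echo "nvcc: $(command -v nvcc)"
else
  echo "nvcc: not on PATH"
fi
```

A negative result here does not prove CUDA is unusable, but it is a reasonable signal to stay on `QMD_LLAMA_GPU=false` until the toolkit and driver are sorted out.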
This integration flow was successfully verified on a WSL host with an NVIDIA GeForce RTX 3050 Ti Laptop GPU after installing CUDA Toolkit:
- `nvcc` available
- CUDA libraries visible (`libcudart`, `libcublas`, `libcuda`)
- `QMD_LLAMA_GPU=cuda qmd status` reported:
  - `GPU: cuda (offloading: yes)`
  - NVIDIA device name present
  - VRAM visibility working
- QMD MCP wrapper restarted successfully on the CUDA path
- GPU-mode embeddings completed successfully with:

```bash
QMD_LLAMA_GPU=cuda qmd embed --max-docs-per-batch 12 --max-batch-mb 8
```

Result: `Embedded 291 chunks from 125 documents in 1m 0s`

To compare backends on your host, check the status under each mode:

```bash
QMD_LLAMA_GPU=cuda qmd status
QMD_LLAMA_GPU=vulkan qmd status
QMD_LLAMA_GPU=false qmd status
```

Use whichever mode is both fast and repeatably stable on your host.
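
To get a rough feel for repeatability, the same status check can be run several times per mode. This is a sketch only: `count_ok` is a hypothetical helper, it treats a zero exit status from `qmd status` as success (an assumption about how failures are reported), and it does not measure speed.

```bash
#!/usr/bin/env bash
# Hypothetical repeatability check built on the `qmd status` commands above.

# Runs `qmd status` $2 times with QMD_LLAMA_GPU=$1 and prints how many runs succeeded.
count_ok() {
  m=$1; runs=$2; ok=0; i=0
  while [ "$i" -lt "$runs" ]; do
    # Assumption: a non-zero exit status means the backend failed to initialize.
    if QMD_LLAMA_GPU=$m qmd status >/dev/null 2>&1; then ok=$((ok + 1)); fi
    i=$((i + 1))
  done
  echo "$m: $ok/$runs runs succeeded"
}

for mode in cuda vulkan false; do
  count_ok "$mode" 3
done
```

Prefixing an individual command with the shell's `time` keyword gives a simple speed comparison between the modes that pass.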