Pre-built llama-server binaries from llama.cpp b9305.
Downloads — Linux
| Variant | File |
| --- | --- |
| x64 CPU | llama-server-*-linux-x64-cpu.tar.gz |
| x64 CUDA 12.8 | llama-server-*-linux-x64-cuda-12.tar.gz |
| x64 CUDA 13.1 | llama-server-*-linux-x64-cuda-13.tar.gz |
| x64 Vulkan | llama-server-*-linux-x64-vulkan.tar.gz |
| arm64 CPU | llama-server-*-linux-arm64-cpu.tar.gz |
| arm64 CUDA 12.8 | llama-server-*-linux-arm64-cuda-12.tar.gz |
| arm64 CUDA 13.1 | llama-server-*-linux-arm64-cuda-13.tar.gz |
Downloads — Windows
| Variant | File |
| --- | --- |
| x64 CPU | llama-server-*-windows-x64-cpu.zip |
| x64 CUDA 12.4 | llama-server-*-windows-x64-cuda-12.zip |
| x64 Vulkan | llama-server-*-windows-x64-vulkan.zip |
Downloads — macOS
| Variant | File |
| --- | --- |
| arm64 Metal | llama-server-*-macos-arm64-metal.tar.gz |
| x64 CPU | llama-server-*-macos-x64-cpu.tar.gz |
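
As a rough aid for scripted downloads, the download listings can be turned into a lookup. This is only a sketch: the `ASSETS` dictionary and the `asset_pattern` helper are hypothetical names introduced here, and the `*` in each pattern is kept as-is because the release-specific part of the file name is not spelled out in these notes.

```python
# File-name patterns copied from the download listings.
# Keys are (os, arch, backend); these key names are illustrative assumptions.
ASSETS = {
    ("linux", "x64", "cpu"): "llama-server-*-linux-x64-cpu.tar.gz",
    ("linux", "x64", "cuda-12"): "llama-server-*-linux-x64-cuda-12.tar.gz",
    ("linux", "x64", "cuda-13"): "llama-server-*-linux-x64-cuda-13.tar.gz",
    ("linux", "x64", "vulkan"): "llama-server-*-linux-x64-vulkan.tar.gz",
    ("linux", "arm64", "cpu"): "llama-server-*-linux-arm64-cpu.tar.gz",
    ("linux", "arm64", "cuda-12"): "llama-server-*-linux-arm64-cuda-12.tar.gz",
    ("linux", "arm64", "cuda-13"): "llama-server-*-linux-arm64-cuda-13.tar.gz",
    ("windows", "x64", "cpu"): "llama-server-*-windows-x64-cpu.zip",
    ("windows", "x64", "cuda-12"): "llama-server-*-windows-x64-cuda-12.zip",
    ("windows", "x64", "vulkan"): "llama-server-*-windows-x64-vulkan.zip",
    ("macos", "arm64", "metal"): "llama-server-*-macos-arm64-metal.tar.gz",
    ("macos", "x64", "cpu"): "llama-server-*-macos-x64-cpu.tar.gz",
}


def asset_pattern(os_name, arch, backend):
    """Return the listed file-name pattern, or raise LookupError if none exists."""
    key = (os_name, arch, backend)
    if key not in ASSETS:
        raise LookupError(f"no pre-built binary listed for {key}")
    return ASSETS[key]


if __name__ == "__main__":
    print(asset_pattern("macos", "arm64", "metal"))
```

Combinations that are not listed (for example a Windows arm64 build) raise `LookupError` rather than guessing a file name.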
CUDA SM targets: 75;80;86;89;90;100;120
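
The SM numbers above correspond to CUDA compute capabilities written without the dot (for example, 86 is compute capability 8.6). A minimal sketch for checking whether a GPU's compute capability appears in that list follows; the helper names are hypothetical, and the check says nothing about whether an unlisted GPU could still run the binaries, since that depends on build details these notes do not state.

```python
# SM targets exactly as listed in the release notes.
SM_TARGETS = [int(s) for s in "75;80;86;89;90;100;120".split(";")]


def sm_from_capability(major, minor):
    """Convert a compute capability such as (8, 6) to its SM number, 86."""
    return major * 10 + minor


def is_listed_target(major, minor):
    """True if the compute capability matches one of the listed SM targets."""
    return sm_from_capability(major, minor) in SM_TARGETS


if __name__ == "__main__":
    # Compute capability 8.6 is in the list; 6.1 is not.
    print(is_listed_target(8, 6), is_listed_target(6, 1))
```

The compute capability of an installed GPU can be looked up in NVIDIA's published tables and passed in as `major` and `minor`.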