Wrangle GPU-accelerated llamas on Android
_ _
| | | __ _ _ __ ___ _ ___ __
| | |/ _` | '_ ` _ \| | | \ \/ /
| | | (_| | | | | | | |_| |> <
|_|_|\__,_|_| |_| |_|\__,_/_/\_\
llamux is a CLI tool that automates building Ollama from source with Vulkan GPU acceleration on Android devices running Termux. It handles dependency installation, source patching, compilation, installation, and rollback — all in one command.
Ollama's official releases don't support Android. Unofficial binaries don't support GPU acceleration. Building from source on Termux requires several patches:
- CMakeLists.txt uses RUNTIME_DEPENDENCIES, which Android's CMake doesn't support
- Vulkan shader compilation deadlocks on Android due to high concurrency defaults (ASYNCIO_CONCURRENCY=64)
llamux applies these patches automatically and builds a fully functional Ollama with Vulkan GPU support.
Run this single command on a fresh Termux installation to get everything set up:
curl -fsSL https://raw.githubusercontent.com/JediRhymeTrix/llamux/main/bootstrap.sh | bash
This will:
- Install git if missing
- Clone llamux to ~/llamux
- Add it to your PATH
- Verify the installation
Then just run:
source ~/.zshrc # or ~/.bashrc
llamux install

# Clone llamux
git clone https://github.com/JediRhymeTrix/llamux.git
cd llamux
# Build and install the latest Ollama with Vulkan GPU support
./llamux install
That's it. llamux will:
- Install all build dependencies (cmake, go, vulkan-headers, shaderc, etc.)
- Clone the latest Ollama release
- Apply Android/Termux patches
- Build native libraries with Vulkan support
- Build the Go binary
- Back up your current installation
- Install the new build
- Configure OLLAMA_VULKAN=true in your shell
- Run a smoke test to verify everything works
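The backup and install steps are what make a later rollback possible. The following is a minimal sketch of that idea, not llamux's actual install.sh: the function names and the `.bak` suffix are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: install_with_backup, rollback_install and the ".bak" suffix
# are hypothetical and not taken from llamux's install.sh.

# Copy the existing binary aside (if there is one), then install the new one.
install_with_backup() {
    local new_bin="$1" dest="$2"
    if [ -f "$dest" ]; then
        cp -p "$dest" "$dest.bak" || return 1
    fi
    cp "$new_bin" "$dest" && chmod 755 "$dest"
}

# Restore the copy saved by install_with_backup.
rollback_install() {
    local dest="$1"
    if [ ! -f "$dest.bak" ]; then
        echo "no backup found for $dest" >&2
        return 1
    fi
    mv "$dest.bak" "$dest"
}
```

Taking the backup before the new binary is copied means a failed or broken install can be undone with a single move.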
Requirements:
- Android device with aarch64/arm64 processor
- Termux installed (F-Droid version recommended)
- Internet connection for downloading dependencies and Ollama source
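The first two requirements can be checked by hand before starting a build. This is a rough sketch with hypothetical helper names; it assumes Termux exports PREFIX as a path inside the com.termux data directory, which is the usual setup.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of llamux.

# Accept the two spellings of the 64-bit ARM architecture.
is_supported_arch() {
    case "$1" in
        aarch64|arm64) return 0 ;;
        *) return 1 ;;
    esac
}

# Termux normally sets PREFIX to a path under .../com.termux/...
looks_like_termux() {
    case "${1:-}" in
        */com.termux/*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_supported_arch "$(uname -m)" && looks_like_termux "${PREFIX:-}"; then
    echo "platform looks supported"
else
    echo "platform check failed: need aarch64/arm64 and Termux"
fi
```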
git clone https://github.com/JediRhymeTrix/llamux.git
cd llamux
# Option A: Run directly from the repo
./llamux install
# Option B: Install llamux system-wide
make install
llamux install

# Latest version with Vulkan GPU acceleration
llamux install
# Specific version
llamux install --version 0.20.2
# CPU-only (no Vulkan)
llamux install --no-vulkan
# Skip the post-install smoke test
llamux install --no-smoke
# Preview what would happen
llamux install --dry-run
# Set up auto-start on boot (requires Termux:Boot)
llamux install --boot-service
# Use more parallel build jobs (may cause OOM on low-RAM devices)
llamux install --jobs 2

# Check installation status
llamux status
# Roll back to previous version
llamux rollback
# Install build dependencies only
llamux deps
# Clean up build artifacts
llamux clean
# Show version
llamux version
# Show help
llamux help

Android's CMake doesn't support the RUNTIME_DEPENDENCIES parameter in install(TARGETS). llamux removes this block so the build succeeds.
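To see whether a given CMakeLists.txt still contains the unsupported keyword, a plain text search is enough. The helper below is a hypothetical sketch and not part of llamux. It only reports matches and does not attempt the removal itself.

```shell
#!/usr/bin/env bash
# Sketch only: has_runtime_deps is a hypothetical helper.
# Prints matching lines with line numbers and returns 0 if the file
# still mentions RUNTIME_DEPENDENCIES, non-zero otherwise.
has_runtime_deps() {
    grep -n 'RUNTIME_DEPENDENCIES' "$1"
}

# Usage (the path is a placeholder):
#   has_runtime_deps path/to/CMakeLists.txt && echo "patch not applied"
```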
The Vulkan shader generator (vulkan-shaders-gen.cpp) defaults to ASYNCIO_CONCURRENCY=64, which causes deadlocks on Android due to limited process resources. llamux reduces this to 1 to prevent the deadlock while still compiling all 2000+ shader variants.
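As an illustration of the kind of edit involved, the sketch below rewrites the concurrency value with sed. It assumes the value is declared on a line of the form `#define ASYNCIO_CONCURRENCY 64`. The function name is hypothetical, and llamux's own patch.sh may do this differently.

```shell
#!/usr/bin/env bash
# Sketch only: patch_concurrency is a hypothetical helper.
# Assumes a line of the form "#define ASYNCIO_CONCURRENCY <number>".
patch_concurrency() {
    local file="$1"
    # Keep the "#define ASYNCIO_CONCURRENCY" prefix and replace the number with 1.
    sed -i -E 's/^(#define[[:space:]]+ASYNCIO_CONCURRENCY)[[:space:]]+[0-9]+/\1 1/' "$file"
}

# Usage (the path is a placeholder):
#   patch_concurrency path/to/vulkan-shaders-gen.cpp
```

With the value at 1, shader variants are compiled one at a time, trading build speed for a build that finishes.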
llamux/
├── llamux # Main CLI entry point
├── bootstrap.sh # One-click installer for fresh Termux
├── lib/
│ ├── utils.sh # Logging, colors, error handling, platform detection
│ ├── deps.sh # Dependency detection & installation via pkg
│ ├── source.sh # Git clone, version resolution, checkout
│ ├── patch.sh # Android-specific source patches
│ ├── build.sh # CMake configure + native build + Go build
│ ├── install.sh # Backup, install, rollback, env setup
│ └── verify.sh # Post-install verification & smoke test
├── tests/
│ ├── test_utils.bats # Utility function tests (12 tests)
│ ├── test_source.bats # Version resolution tests (6 tests)
│ ├── test_patch.bats # Patch application tests (7 tests)
│ └── test_install.bats # Install/rollback tests (8 tests)
├── .github/workflows/
│ ├── ci.yml # ShellCheck + bats tests on push/PR
│ └── release.yml # Tag-based GitHub releases
├── Makefile # install/uninstall/test/lint/clean targets
├── CONTRIBUTING.md # Contribution guidelines
├── CHANGELOG.md # Version history
└── README.md
# Install bats-core (not available as a Termux package — install from source)
git clone --depth 1 https://github.com/bats-core/bats-core.git
cd bats-core && ./install.sh $PREFIX && cd .. && rm -rf bats-core
# Run all tests
make test
# Run specific test file
bats tests/test_patch.bats

# Install shellcheck
pkg install shellcheck
# Lint all scripts
make lint

git tag v0.1.0
git push origin v0.1.0
# GitHub Actions will create a release automatically

| Variable | Default | Description |
|---|---|---|
| OLLAMA_VULKAN | true (set by llamux) | Enable Vulkan GPU acceleration |
| LLAMUX_DEBUG | 0 | Enable debug logging |
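If OLLAMA_VULKAN ever needs to be set by hand, the export line can be added to a shell rc file without creating duplicates. This is a sketch with a hypothetical helper name and not necessarily how llamux writes the setting.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_export is a hypothetical helper.
# Appends "export NAME=VALUE" to an rc file unless that exact line exists.
ensure_export() {
    local rc_file="$1" line="export $2=$3"
    touch "$rc_file"
    grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}

# Usage:
#   ensure_export ~/.bashrc OLLAMA_VULKAN true
```

Matching the whole line as a fixed string makes the helper safe to run repeatedly.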
If the build fails with a CMake error about RUNTIME_DEPENDENCIES, the patch didn't apply correctly. Run llamux clean and try again.
A build that hangs during Vulkan shader compilation is the concurrency bug that llamux fixes. If you see this, ensure the ASYNCIO_CONCURRENCY patch was applied: check that the value is 1 in the shader generator source.
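One way to run that check is to locate vulkan-shaders-gen.cpp under the source tree and inspect the value. The sketch below uses a hypothetical helper name and assumes the value appears as `#define ASYNCIO_CONCURRENCY <n>`. The location of the source tree is left as an argument.

```shell
#!/usr/bin/env bash
# Sketch only: check_concurrency_patch is a hypothetical helper.
# Returns 0 if the value is 1, 2 if the file is missing, 1 otherwise.
check_concurrency_patch() {
    local src_root="$1" file
    file=$(find "$src_root" -name 'vulkan-shaders-gen.cpp' -print -quit)
    if [ -z "$file" ]; then
        echo "vulkan-shaders-gen.cpp not found under $src_root" >&2
        return 2
    fi
    # Show the current value for the reader, then test it.
    grep -E 'ASYNCIO_CONCURRENCY[[:space:]]+[0-9]+' "$file" || true
    grep -Eq '^#define[[:space:]]+ASYNCIO_CONCURRENCY[[:space:]]+1[[:space:]]*$' "$file"
}

# Usage (the directory is a placeholder):
#   check_concurrency_patch path/to/ollama-source || echo "patch missing"
```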
If the build runs out of memory, try llamux install --jobs 1 (the default) and close other apps to free RAM. Vulkan shader compilation is memory-intensive.
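Before raising the job count, it can help to see how much memory is actually free. The sketch below reads the MemAvailable field, which Linux reports in kB in /proc/meminfo. The helper name is hypothetical, and no particular threshold is implied.

```shell
#!/usr/bin/env bash
# Sketch only: mem_available_mib is a hypothetical helper.
# Prints available memory in MiB, read from a meminfo-style file
# (defaults to /proc/meminfo).
mem_available_mib() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Usage:
#   echo "available: $(mem_available_mib) MiB"
```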
If Ollama isn't using the GPU:
- Ensure OLLAMA_VULKAN=true is set: echo $OLLAMA_VULKAN
- Check if your device has Vulkan support: vulkaninfo (install with pkg install vulkan-tools)
- Some devices don't expose Vulkan drivers to Termux; in this case, Ollama will fall back to CPU
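These checks can be strung together in a small script. The following is a sketch with a hypothetical function name. It assumes vulkaninfo exits with a non-zero status when no usable driver is found, which is worth confirming on your device.

```shell
#!/usr/bin/env bash
# Sketch only: vulkan_diagnose is a hypothetical helper, not a llamux command.
# Pass the current value of OLLAMA_VULKAN as the first argument.
vulkan_diagnose() {
    local flag="${1:-}"
    if [ "$flag" != "true" ]; then
        echo "OLLAMA_VULKAN is not set to true"
        return 1
    fi
    if ! command -v vulkaninfo >/dev/null 2>&1; then
        echo "vulkaninfo not found; install it with: pkg install vulkan-tools"
        return 2
    fi
    # Assumption: vulkaninfo fails when Termux cannot see a Vulkan driver.
    if vulkaninfo >/dev/null 2>&1; then
        echo "Vulkan driver is visible"
    else
        echo "vulkaninfo failed; expect Ollama to fall back to CPU"
        return 3
    fi
}

# Usage:
#   vulkan_diagnose "${OLLAMA_VULKAN:-}"
```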
llamux rollback

See CONTRIBUTING.md for guidelines.