This workspace spins up an Ollama container and provides a tiny TypeScript client for sending prompts to the API.

Prerequisites:

- Docker + Docker Compose v2
- Node.js 18+ (for native `fetch` support)
- NVIDIA GPU with proprietary drivers (optional, but recommended for speed)
- Install the latest NVIDIA driver:

  ```bash
  sudo apt update
  sudo ubuntu-drivers autoinstall
  sudo reboot
  ```
- Install Docker Engine (skip if you already have it):

  ```bash
  sudo apt update
  sudo apt install -y ca-certificates curl gnupg
  sudo install -m 0755 -d /etc/apt/keyrings
  curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
  echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list >/dev/null
  sudo apt update
  sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  ```
- Install the NVIDIA Container Toolkit (lets Docker expose GPUs to containers):

  ```bash
  distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
  curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit.gpg
  curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit.gpg] https://#' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
  sudo apt update
  sudo apt install -y nvidia-container-toolkit
  sudo nvidia-ctk runtime configure --runtime=docker
  sudo systemctl restart docker
  ```
- Verify GPU access from Docker:

  ```bash
  docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
  ```
Once these steps succeed, `docker compose up -d` will start Ollama with GPU acceleration automatically (see `docker-compose.yml`).
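For reference, a minimal sketch of what the GPU portion of `docker-compose.yml` typically looks like with Compose v2 (the service name `local-ollama` matches the commands below; the actual file in this repo may differ in details such as image tag or volume name):

```yaml
services:
  local-ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"          # Ollama's default API port
    volumes:
      - ollama:/root/.ollama   # persist pulled models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all       # expose every GPU to the container
              capabilities: [gpu]

volumes:
  ollama:
```

Without the NVIDIA Container Toolkit installed, the `deploy.resources.reservations.devices` section will cause startup to fail, which is why the host setup steps above come first.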
```bash
docker compose up -d
```

Pull at least one model before sending prompts (example: `llama3`):

```bash
docker compose exec local-ollama ollama pull llama3
```

Install dependencies and start the server:

```bash
npm install
npm run serve
```

For hot reload during development:

```bash
npm run dev
```

Both commands respect `PORT` (default `3000`).
```bash
curl -s \
  -X POST http://localhost:3000/api/prompt \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Write a haiku about local models."}'
```

Environment variables:

- `OLLAMA_HOST` (default `http://localhost:11434`)
- `OLLAMA_MODEL` (default `gemma3:latest`)
- `PORT` (default `3000`)
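To illustrate how these variables feed into a request, here is a hedged sketch of a client talking to Ollama's `/api/generate` endpoint with Node 18+'s built-in `fetch`. The helper names (`buildGenerateRequest`, `generate`) are illustrative, not part of this repo's client:

```typescript
// Defaults mirror the environment variables listed above.
const OLLAMA_HOST = process.env.OLLAMA_HOST ?? "http://localhost:11434";
const OLLAMA_MODEL = process.env.OLLAMA_MODEL ?? "gemma3:latest";

interface GenerateRequest {
  model: string;
  prompt: string;
  stream: boolean; // false = return one JSON object instead of a stream
}

// Hypothetical helper: assemble the URL and body for /api/generate.
function buildGenerateRequest(prompt: string): { url: string; body: GenerateRequest } {
  return {
    url: `${OLLAMA_HOST}/api/generate`,
    body: { model: OLLAMA_MODEL, prompt, stream: false },
  };
}

// Hypothetical helper: send the prompt and return the model's text.
async function generate(prompt: string): Promise<string> {
  const { url, body } = buildGenerateRequest(prompt);
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const data = (await res.json()) as { response: string };
  return data.response;
}
```

Because the defaults are read once at startup, overrides like `OLLAMA_MODEL=mistral` must be set before the process launches, as in the override example below.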
Override example:
```bash
PORT=4000 OLLAMA_MODEL=mistral npm run serve
```