Local LLM Playground

This workspace spins up an Ollama container and provides a tiny TypeScript API server for sending prompts to the Ollama API.

Prerequisites

  • Docker + Docker Compose v2
  • Node.js 18+ (for native fetch support)
  • NVIDIA GPU with proprietary drivers (optional, but recommended for speed)

Enable GPU on Ubuntu

  1. Install the latest NVIDIA driver

    sudo apt update
    sudo ubuntu-drivers autoinstall
    sudo reboot
  2. Install Docker Engine (skip if you already have it)

    sudo apt update
    sudo apt install -y ca-certificates curl gnupg
    sudo install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list >/dev/null
    sudo apt update
    sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  3. Install the NVIDIA Container Toolkit (lets Docker expose GPUs to containers)

    distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit.gpg
    curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
      sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit.gpg] https://#' | \
      sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    sudo apt update
    sudo apt install -y nvidia-container-toolkit
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker
  4. Verify GPU access from Docker

    docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi

Once these steps succeed, docker compose up -d will start Ollama with GPU acceleration automatically (see docker-compose.yml).
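
For reference, GPU access in Compose v2 is declared with a device reservation. A minimal sketch of what this repo's docker-compose.yml likely contains (the service name matches the exec command used below; the image, port, and everything else here are assumptions):

services:
  local-ollama:
    image: ollama/ollama          # official Ollama image (assumed)
    ports:
      - "11434:11434"             # matches the default OLLAMA_HOST below
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia      # requires the NVIDIA Container Toolkit from step 3
              count: all
              capabilities: [gpu]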

Start Ollama

docker compose up -d

Pull at least one model before sending prompts. Note that the API server defaults to gemma3:latest (see Environment overrides below), so either pull gemma3 or set OLLAMA_MODEL to whichever model you pull (example below: llama3).

docker compose exec local-ollama ollama pull llama3
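
To confirm the download finished, list the models Ollama has available:

docker compose exec local-ollama ollama list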

Install dependencies

npm install

Run the API server

npm run serve

For hot reload during development:

npm run dev

Both commands respect PORT (default 3000).

Call the endpoint

curl -s \
	-X POST http://localhost:3000/api/prompt \
	-H "Content-Type: application/json" \
	-d '{"prompt":"Write a haiku about local models."}'

Environment overrides

  • OLLAMA_HOST (default http://localhost:11434)
  • OLLAMA_MODEL (default gemma3:latest)
  • PORT (default 3000)

Override example:

PORT=4000 OLLAMA_MODEL=mistral npm run serve
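
For reference, the defaults above map to a config block like this inside the server (a sketch only; the variable names and defaults are taken from this README, the surrounding code is assumed):

// resolve configuration from the environment, falling back to documented defaults
const OLLAMA_HOST = process.env.OLLAMA_HOST ?? "http://localhost:11434"; // Ollama API base URL
const OLLAMA_MODEL = process.env.OLLAMA_MODEL ?? "gemma3:latest";        // model used for prompts
const PORT = Number(process.env.PORT ?? 3000);                           // HTTP port for the API server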
