Tired of complex AI setups? llama.ui is an open-source web application that provides a beautiful, user-friendly interface for interacting with large language models (LLMs) powered by llama.cpp. Designed for simplicity and privacy, this project lets you chat with powerful quantized models on your local machine - no cloud required!
This repository is a fork of llama.cpp WebUI with:
- Fresh new styles
- Extra functionality
- Smoother experience
- Multi-Provider Support: Works with llama.cpp, LM Studio, Ollama, vLLM, OpenAI, and many more!
- Conversation Management:
  - IndexedDB storage for conversations
  - Branching conversation support (edit messages while preserving history)
  - Import/export functionality
- Rich UI Components:
  - Markdown rendering with syntax highlighting
  - LaTeX math support
  - File attachments (text, images, PDFs)
  - Theme customization with DaisyUI themes
  - Responsive design for mobile and desktop
- Advanced Features:
  - PWA support with offline capabilities
  - Streaming responses with Server-Sent Events (see the sketch below)
  - Customizable generation parameters
  - Performance metrics display
- Privacy Focused: All data is stored locally in your browser - no cloud required!
- Localized Interface: The most popular language packs are bundled with the app, and you can switch languages at any time.
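To give a feel for what the streaming feature involves, here is a minimal TypeScript sketch that consumes Server-Sent Events from an OpenAI-compatible chat endpoint (such as llama.cpp's /v1/chat/completions). It is an illustration of the technique, not the app's actual client code:

```ts
// Minimal SSE streaming sketch for an OpenAI-compatible endpoint.
// Assumes the server at `baseUrl` accepts POST /v1/chat/completions with `stream: true`.
export async function streamChat(baseUrl: string, prompt: string): Promise<string> {
  const response = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      messages: [{ role: 'user', content: prompt }],
      stream: true, // server replies with Server-Sent Events
    }),
  });
  if (!response.ok || !response.body) throw new Error(`HTTP ${response.status}`);

  const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
  let buffer = '';
  let answer = '';
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += value;
    // SSE events are separated by blank lines; each `data:` line carries a JSON chunk.
    const events = buffer.split('\n\n');
    buffer = events.pop() ?? '';
    for (const event of events) {
      const data = event.replace(/^data:\s*/, '').trim();
      if (!data || data === '[DONE]') continue;
      const delta = JSON.parse(data).choices?.[0]?.delta?.content;
      if (delta) answer += delta;
    }
  }
  return answer;
}
```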
- Open our hosted UI instance
- Click the gear icon → General settings
- Set "Base URL" to your local llama.cpp server (e.g. http://localhost:8080) - if it won't connect, see the connectivity check sketched below
- Start chatting with your AI!
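If the UI cannot reach your server, a quick connectivity check helps narrow things down. The TypeScript sketch below assumes your llama.cpp build exposes the /health endpoint (recent llama-server builds do); adjust the URL if yours differs:

```ts
// Hypothetical helper: confirm that the Base URL points at a reachable llama.cpp server.
async function checkServer(baseUrl: string): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/health`);
    return res.ok;
  } catch {
    return false; // network error, CORS block, or server not running
  }
}

checkServer('http://localhost:8080').then((ok) =>
  console.log(ok ? 'llama.cpp server reachable' : 'llama.cpp server not reachable'),
);
```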
Need HTTPS magic for your local instance? Try this mitmproxy hack!
Uh-oh! Browsers block plain-HTTP requests from HTTPS sites (mixed content). Since llama.cpp serves HTTP, we need a bridge. Enter mitmproxy - our traffic wizard!
Local setup:

```sh
mitmdump -p 8443 --mode reverse:http://localhost:8080/
```
Docker quickstart:

```sh
docker run -it -p 8443:8443 mitmproxy/mitmproxy mitmdump -p 8443 --mode reverse:http://localhost:8080/
```
Pro-tip with Docker Compose:

```yaml
services:
  mitmproxy:
    container_name: mitmproxy
    image: mitmproxy/mitmproxy:latest
    ports:
      - '8443:8443' # port magic happening here!
    command: mitmdump -p 8443 --mode reverse:http://localhost:8080/
    # ... (other config)
```
Certificate Tango Time!
- Visit http://localhost:8443
- Click "Trust this certificate"
- Reload the llama.ui page
- Profit!

Voilà! You've hacked the HTTPS barrier!
- Grab the latest release from our releases page
- Unpack the archive (feel that excitement!)
- Fire up your llama.cpp server:
Linux/macOS:

```sh
./llama-server --host 0.0.0.0 \
  --port 8080 \
  --path "/path/to/llama.ui" \
  -m models/llama-2-7b.Q4_0.gguf \
  --ctx-size 4096
```
Windows:

```bat
llama-server ^
  --host 0.0.0.0 ^
  --port 8080 ^
  --path "C:\path\to\llama.ui" ^
  -m models\mistral-7b.Q4_K_M.gguf ^
  --ctx-size 4096
```
- Visit http://localhost:8080 and meet your new AI buddy!
We're building something special together!
- PRs are welcome! (Seriously, we high-five every contribution!)
- Bug squashing? Yes please!
- Documentation heroes needed!
- Make magic with your commits! (Follow Conventional Commits)
Prerequisites:
- macOS/Windows/Linux
- Node.js >= 22
- A local llama.cpp server humming along
Build the future:

```sh
npm ci         # grab dependencies
npm run build  # craft the magic
npm start      # launch the dev server (http://localhost:5173) for live-coding bliss!
```
Planning to redistribute the app with opinionated settings out of the box? Any JSON under `src/config` is baked into immutable defaults at build time (see `src/config/index.ts`). If those baked defaults include a non-empty `baseUrl`, the inference server will auto-sync on first load, so model metadata is fetched without requiring manual input.
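For example, a JSON file placed under `src/config` - say, `src/config/defaults.json`, where the filename is purely illustrative - could pre-wire a distribution to a specific endpoint; `baseUrl` is the only key taken from the description above:

```json
{
  "baseUrl": "http://localhost:8080"
}
```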
- Frontend: React with TypeScript
- Styling: Tailwind CSS + DaisyUI
- State Management: React Context API
- Routing: React Router
- Storage: IndexedDB via Dexie.js
- Build Tool: Vite
- App Context: Manages global configuration and settings
- Inference Context: Handles API communication with inference providers
- Message Context: Manages conversation state and message generation
- Storage Utils: IndexedDB operations and localStorage management (see the Dexie.js sketch after this list)
- Inference API: HTTP client for communicating with inference servers
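As a rough illustration of the storage layer, here is a minimal Dexie.js sketch for persisting conversations in IndexedDB. The database name, record shape, and helpers are hypothetical and only hint at the approach - they are not the app's actual schema:

```ts
import Dexie, { type Table } from 'dexie';

// Hypothetical record shape - the real app stores richer conversation data.
interface Conversation {
  id?: number;
  title: string;
  createdAt: number;
}

class ChatDatabase extends Dexie {
  conversations!: Table<Conversation, number>;

  constructor() {
    super('llama-ui-example'); // example database name, not the app's
    this.version(1).stores({
      conversations: '++id, createdAt', // auto-increment primary key, index on createdAt
    });
  }
}

const db = new ChatDatabase();

// Persist a conversation and list the most recent ones.
export async function saveConversation(title: string): Promise<number> {
  return db.conversations.add({ title, createdAt: Date.now() });
}

export async function recentConversations(limit = 20): Promise<Conversation[]> {
  return db.conversations.orderBy('createdAt').reverse().limit(limit).toArray();
}
```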
llama.ui is proudly MIT licensed - go build amazing things! See LICENSE for details.
Made with love (and plenty of coffee) by humans who believe in private AI