Run local models from a terminal UI with a local OpenAI-compatible gateway.
This repository is a binary distribution bundle centered on `./bin/llmost`.
## Quickstart

```sh
./bin/llmost
```

On Home:

- Press `Enter` to install the runtime (if needed), pull the starter model, and start the gateway.
- Press `Enter` again to jump to `Chat`.
## Features

- Terminal UI tabs: `Home`, `Models`, `Serve`, `Setup`, `Tuning`, `Advisor`, `Chat`, `Use`, `Logs`
- Local OpenAI-compatible API gateway
- Runtime install/start/stop from the TUI and CLI
- Model scan/import/pull/register lifecycle
- Scoped tuning via `llmost tune ...`
- Host-first Python resolution with managed fallback support
- Diagnostics: doctor/status/ports/runtime checks/logs
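Because the gateway is OpenAI-compatible, standard chat-completions clients can point at it once it is running. A minimal stdlib sketch of the request shape (the port `8787`, the `/v1/chat/completions` path, and the model id are illustrative assumptions; check `./bin/llmost status` for the actual bind address):

```python
import json
import urllib.request

# Assumption: gateway on the default loopback bind; the port is illustrative.
BASE_URL = "http://127.0.0.1:8787"

def chat_request(model_id: str, prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-style chat-completions request."""
    body = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("<model-id>", "Say hello.")
# With the gateway running, send it with:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)["choices"][0]["message"]["content"]
```

An OpenAI SDK pointed at the same base URL should work too, assuming the gateway implements the same routes.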
## Documentation

- User guide: `docs/user_guide.md`
- Troubleshooting: `docs/troubleshooting.md`
- Security: `docs/security.md`
## CLI

```sh
./bin/llmost
./bin/llmost --help
./bin/llmost doctor
./bin/llmost status
./bin/llmost ports
./bin/llmost stop
./bin/llmost cleanup-ghosts
./bin/llmost logs
```

## Python resolution

llmost prefers host Python and only falls back to managed setup when needed.
```sh
./bin/llmost python status
./bin/llmost python check
./bin/llmost python use /absolute/path/to/python3
./bin/llmost python clear-preference
```

## Tuning

```sh
./bin/llmost tune show
./bin/llmost tune set serve.context_length 8192
./bin/llmost tune set chat.temperature 0.2
./bin/llmost tune set chat.thinking_mode off
./bin/llmost tune reset chat.temperature
```

## Security

- The default bind host is loopback (`127.0.0.1`).
- The local-first workflow requires no cloud account.
- When exposing the gateway outside loopback, use bearer auth.
Example:

```sh
./bin/llmost serve --model-id <id> --host 0.0.0.0 --port 8787 --bearer-token '<strong-token>'
```
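When the gateway is exposed with a bearer token, clients must send it back in the standard `Authorization: Bearer ...` header. A sketch under that assumption (host, port, and token are placeholders matching the serve command above):

```python
import json
import urllib.request

# Placeholders: match whatever you passed to `./bin/llmost serve`.
GATEWAY = "http://gateway-host:8787"   # hypothetical exposed host
TOKEN = "<strong-token>"               # same value as --bearer-token

def authed_request(path: str, payload: dict) -> urllib.request.Request:
    """Attach the bearer token via the standard Authorization header."""
    return urllib.request.Request(
        GATEWAY + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {TOKEN}",
        },
        method="POST",
    )

req = authed_request(
    "/v1/chat/completions",
    {"model": "<id>", "messages": [{"role": "user", "content": "ping"}]},
)
```

Requests without the header (or with a wrong token) should be rejected, which is why a strong, random token matters once the bind host is no longer loopback.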
## Layout

- Binary path: `./bin/llmost`
- Runtime/model state lives under `./config` and `./var`
- Bundled vendor components are under `./vendor`
## License

Apache-2.0. See LICENSE.