Quick Start

60-second tour

git clone https://github.com/mtecnic/model-chat-cli.git
cd model-chat-cli
pip install -r requirements.txt
python main.py

That's everything. Five things then happen, in this order:

1. Subnet scan. Auto-detects your `/24` and probes ports 11434, 1234, 5000, 8000, and 8080. See [[Discovery]].
2. Model table. Lists every model on every server, with latency. See [[Discovery#discovered-models-table]].
3. Picker. Arrow keys to move, Enter to chat. Type a /command at the chat prompt for everything else.
4. Cache write. The server list goes to `~/.model_chat_cache.json` so the next launch is instant.
5. You're chatting. Responses stream with decode TPS (tokens per second) and TTFT (time to first token) in the status line. See [[Chat]].
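
The subnet scan described above can be approximated in a few lines. This is a minimal sketch, not the tool's actual implementation: the function names `candidate_endpoints` and `is_open`, the 0.3-second timeout, and the example address are all assumptions made for illustration. Only the port list and the `/24` scope come from this page.

```python
import ipaddress
import socket

# Ports named in the tour above.
PROBE_PORTS = (11434, 1234, 5000, 8000, 8080)


def candidate_endpoints(local_ip, ports=PROBE_PORTS):
    """Return every (host, port) pair on the /24 that contains local_ip."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{local_ip}/24", strict=False)
    # hosts() skips the network and broadcast addresses, leaving 254 hosts.
    return [(str(host), port) for host in network.hosts() for port in ports]


def is_open(host, port, timeout=0.3):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 192.168.1.37 is a placeholder; substitute your machine's address.
    for host, port in candidate_endpoints("192.168.1.37"):
        if is_open(host, port):
            print(f"open: {host}:{port}")
```

A sequential probe like this is slow across 254 hosts and five ports; a real scanner would likely probe in parallel. An open port only shows that something is listening, not that it is a model server.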

The five commands you'll use

  /stress       → 6 modes of load testing                 [[Stress-Testing]]
  /arena        → Multi-model head-to-head                 [[Arena]]
  /promptarena  → System-prompt tournament                 [[Prompt-Arena]]
  /think        → Toggle reasoning-token rendering         [[Chat#thinking-mode]]
  /export       → Dump conversation to markdown

Full command list: [[Chat#commands]].

Common first-run gotchas

| Symptom | Fix |
| --- | --- |
| "No servers found" | Your model host isn't on the same `/24`. Check firewall / VLAN. See [[Troubleshooting#no-servers-found]]. |
| Latency column shows huge numbers | Often a cold-cache LM Studio model: the first request loads weights. Re-run the scan. |
| `/think` shows nothing | The model doesn't emit `<think>` tags. Try a Qwen 3 / 3.5 reasoning variant. |
| Stress test exits immediately | The selected server's model name has changed. Re-pick the model with `/switch`. |
| Arena tournament hangs at "Judging…" | The judge model failed mid-response; check `logs/stress_test_*.log` for the trace. |
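
To check whether a model emits `<think>` tags at all, it can help to look at a raw response yourself. The following is a minimal sketch, assuming the reasoning arrives inline as `<think>...</think>` in a complete (non-streamed) response; `split_reasoning` is a hypothetical helper, not a function from this project.

```python
import re

# Non-greedy match so multiple <think> blocks are captured separately.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from a complete response string."""
    reasoning = "\n".join(part.strip() for part in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


if __name__ == "__main__":
    reasoning, answer = split_reasoning("<think>2+2 is 4</think>The answer is 4.")
    print("reasoning:", reasoning)
    print("answer:", answer)
```

If `reasoning` comes back empty for every prompt, the model is probably not producing reasoning tokens in this form, which matches the `/think` symptom in the table.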

What to try next

Run a Throughput stress test at concurrency 20 to see how your server scales: [[Stress-Testing#throughput]].
Try Tool Bench → Quick against any model that advertises tool-calling: [[Tool-Calling-Benchmark#quick-suite]].
Run an Arena → Quick Compare between two models you're choosing between: [[Arena#quick-compare]].
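
For intuition about what "concurrency 20" means in a throughput test, here is a minimal sketch of a bounded-concurrency load loop. It is an illustration under stated assumptions, not the `/stress` implementation: `run_load` and `fake_send` are hypothetical names, and `fake_send` stands in for a real request with a 10 ms sleep.

```python
import asyncio
import time


async def run_load(send, total_requests, concurrency):
    """Call send() total_requests times with at most `concurrency` in flight."""
    gate = asyncio.Semaphore(concurrency)
    latencies = []

    async def one_call():
        async with gate:  # blocks while `concurrency` calls are already running
            start = time.perf_counter()
            await send()
            latencies.append(time.perf_counter() - start)

    started = time.perf_counter()
    await asyncio.gather(*(one_call() for _ in range(total_requests)))
    elapsed = time.perf_counter() - started
    return {
        "requests": len(latencies),
        "elapsed_s": elapsed,
        "req_per_s": len(latencies) / elapsed,
    }


async def fake_send():
    """Placeholder for a real model request (assumption: ~10 ms per call)."""
    await asyncio.sleep(0.01)


if __name__ == "__main__":
    print(asyncio.run(run_load(fake_send, total_requests=100, concurrency=20)))
```

Raising `concurrency` until `req_per_s` stops improving gives a rough picture of where a server saturates, which is the question the Throughput mode is meant to answer.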

Model Chat CLI · MIT · repo · issues · No telemetry · No cloud calls · No surprises
