
Commit 7d702ea

feat: improvement of LLM client and agent view
1 parent df7f229 commit 7d702ea

File tree

7 files changed (+389, -234 lines)


Cargo.lock

Lines changed: 128 additions & 3 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 2 additions & 0 deletions
```diff
@@ -40,6 +40,7 @@ prettytable = "0.10.0"
 rand = "0.9"
 reqwest = { version = "0.13", default-features = false, features = ["rustls","json","stream"], optional = true }
 rust-mcp-sdk = { version = "0.8.2", optional = true, default-features = false, features = ["server","macros","streamable-http","hyper-server"] }
+async-openai = { version = "0.33", optional = true, default-features = false, features = ["rustls", "chat-completion", "model"] }
 async-trait = { version = "0.1.89", optional = true }
 semver = { version = "1.0", optional = true }
 serde = "1.0"
@@ -66,6 +67,7 @@ hydrate = ["leptos/hydrate"]
 ssr = [
     "dep:ant-releases",
     "dep:alloy",
+    "dep:async-openai",
     "dep:async-stream",
     "dep:async-trait",
     "dep:axum",
```
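The dependency and the feature list work as a pair: `optional = true` keeps `async-openai` out of default (client-side/WASM) builds, and the `dep:async-openai` entry under `ssr` compiles it in only for server builds. A minimal standalone sketch of that wiring (crate name, version, and features copied from the diff; the surrounding manifest is illustrative, not from this repository):

```toml
[package]
name = "example-server"   # illustrative crate name
version = "0.1.0"
edition = "2021"

[dependencies]
# Optional dependency: compiled only when a feature explicitly requests it
async-openai = { version = "0.33", optional = true, default-features = false, features = ["rustls", "chat-completion", "model"] }

[features]
# Server-side builds opt in to the LLM client via the `dep:` syntax
ssr = ["dep:async-openai"]
```

Building without `--features ssr` skips the crate entirely, which keeps the browser-side bundle free of the server-only LLM client.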

README.md

Lines changed: 22 additions & 9 deletions
````diff
@@ -327,19 +327,32 @@ curl -fsSL https://ollama.com/install.sh | sh
 
 #### 2. Pull a model
 
-The agent needs a model with **tool calling / function calling** support. The following are recommended, ordered from best balance to most lightweight:
+The agent needs a model with **tool calling / function calling** support. Model quality matters here: the agent issues multi-step tool calls (fetching node lists, then acting on individual node IDs) and smaller or less capable models sometimes fail to follow the tool-use protocol reliably, pass placeholder values instead of real node IDs, or stop mid-sequence.
+
+> **Resource trade-off:** Larger, more capable models (like the Qwen3 series) produce more reliable results but consume significantly more RAM and CPU. On a machine that is already running several nodes, a heavier model will compete for resources — responses will take longer to generate and your host's CPU and memory usage will be noticeably higher while the agent is active. If your machine is resource-constrained, start with a lighter model and only move up if you find the results unreliable.
 
 ```bash
-# Recommended — good reasoning, ~2 GB RAM
-ollama pull llama3.2:3b
+# Recommended — strong tool use and reasoning
+# ~5 GB RAM; higher CPU usage during inference
+ollama pull qwen3:8b
 
-# Excellent tool use, similar size
-ollama pull qwen2.5:3b
+# Good alternative — solid tool calling, lower resource usage
+# ~4 GB RAM; moderate CPU usage
+ollama pull qwen2.5:7b
 
-# Lightest option — ~1 GB RAM, suitable for Raspberry Pi
+# Lightweight option — ~2 GB RAM, low CPU overhead
+# works well for simple queries; may occasionally struggle
+# with complex multi-step node management actions
+ollama pull llama3.2:3b
+
+# Lightest option — ~1 GB RAM, minimal CPU usage
+# suitable for Raspberry Pi or very constrained machines;
+# basic queries work, but multi-step actions may be unreliable
 ollama pull llama3.2:1b
 ```
 
+> **Why `qwen3:8b`?** In practice, models with stronger reasoning and tool-use training (such as the Qwen3 series) handle the kind of multi-step actions Formicaio requires — e.g. "restart all stopped nodes", "show me the node with the highest record count" — much more reliably than smaller 1–3 B parameter models. The cost is higher RAM and CPU consumption: on a busy node-running machine this can be noticeable, especially during autonomous mode checks. Smaller models like `llama3.2:3b` are a valid choice if resources are constrained, but you may see occasional errors or incomplete actions that require a follow-up prompt.
+
 On some systems Ollama starts automatically after installation. If it is not running, start it manually:
 
 ```bash
@@ -357,8 +370,8 @@ sudo systemctl enable --now ollama
 1. Open Formicaio in your browser (`http://localhost:52100`)
 2. Go to **Settings → AI Agent**
 3. Set **LLM Base URL** to `http://localhost:11434` (already the default)
-4. Set **Model Name** to the model you pulled (e.g. `llama3.2:3b`)
-5. Click **Test Connection** — you should see `Connected — model: llama3.2:3b`
+4. Set **Model Name** to the model you pulled (e.g. `qwen3:8b`)
+5. Click **Test Connection** — you should see `Connected — model: qwen3:8b`
 6. Click **Save Changes**
 
 ### Using the Agent
@@ -441,7 +454,7 @@ All agent settings are available under **Settings → AI Agent**:
 | Setting | Default | Description |
 |---------|---------|-------------|
 | LLM Base URL | `http://localhost:11434` | Base URL of your OpenAI-compatible LLM API |
-| Model Name | `llama3.2:3b` | **Required**. Model to use for chat and autonomous monitoring |
+| Model Name | `qwen3:8b` | **Required**. Model to use for chat and autonomous monitoring |
 | API Key | *(empty)* | Optional — leave empty for Ollama and other keyless backends |
 | Custom System Prompt | *(empty)* | Additional instructions appended to the built-in Formicaio prompt |
 | Max Context Messages | `20` | How many prior messages to include in each LLM request |
````
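Once a model is pulled and the settings above are saved, the same endpoint the agent talks to can be exercised by hand. A sketch, assuming Ollama's default port and the `qwen3:8b` model recommended in the README (the temp-file path is arbitrary):

```shell
# Build the OpenAI-compatible chat request the agent would send.
cat > /tmp/agent-check.json <<'EOF'
{
  "model": "qwen3:8b",
  "messages": [
    {"role": "user", "content": "Reply with the single word: ready"}
  ]
}
EOF

# Confirm the payload is well-formed JSON before sending it.
python3 -m json.tool /tmp/agent-check.json

# With Ollama running, POST it to the OpenAI-compatible endpoint
# (uncomment to execute against a live server):
# curl -s http://localhost:11434/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d @/tmp/agent-check.json
```

A successful response follows the standard chat-completions shape (a `choices` array whose first entry carries the model's reply); an error mentioning the model name usually means the `ollama pull` step was skipped.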
