#### 2. Pull a model

The agent needs a model with **tool calling / function calling** support. Model quality matters here: the agent issues multi-step tool calls (fetching node lists, then acting on individual node IDs) and smaller or less capable models sometimes fail to follow the tool-use protocol reliably, pass placeholder values instead of real node IDs, or stop mid-sequence.

> **Resource trade-off:** Larger, more capable models (like the Qwen3 series) produce more reliable results but consume significantly more RAM and CPU. On a machine that is already running several nodes, a heavier model will compete for resources — responses will take longer to generate and your host's CPU and memory usage will be noticeably higher while the agent is active. If your machine is resource-constrained, start with a lighter model and only move up if you find the results unreliable.

```bash
# Recommended — strong tool use and reasoning
# ~5 GB RAM; higher CPU usage during inference
ollama pull qwen3:8b

# Good alternative — solid tool calling, lower resource usage
# ~4 GB RAM; moderate CPU usage
ollama pull qwen2.5:7b

# Lightweight option — ~2 GB RAM, low CPU overhead
# works well for simple queries; may occasionally struggle
# with complex multi-step node management actions
ollama pull llama3.2:3b

# Lightest option — ~1 GB RAM, minimal CPU usage
# suitable for Raspberry Pi or very constrained machines;
# basic queries work, but multi-step actions may be unreliable
ollama pull llama3.2:1b
```

> **Why `qwen3:8b`?** In practice, models with stronger reasoning and tool-use training (such as the Qwen3 series) handle the kind of multi-step actions Formicaio requires — e.g. "restart all stopped nodes", "show me the node with the highest record count" — much more reliably than smaller 1–3 B parameter models. The cost is higher RAM and CPU consumption: on a busy node-running machine this can be noticeable, especially during autonomous mode checks. Smaller models like `llama3.2:3b` are a valid choice if resources are constrained, but you may see occasional errors or incomplete actions that require a follow-up prompt.

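To make the "tool calling" requirement concrete, the sketch below builds the kind of request body a client sends to Ollama's `/api/chat` endpoint when exposing a tool to the model, using the OpenAI-style function schema that Ollama's tool-calling API accepts. The `restart_node` tool and its parameters are purely illustrative — they are not Formicaio's actual tool definitions — and the request is only constructed here, not sent:

```python
import json

# Hypothetical tool definition (illustrative only, not Formicaio's real
# tool set) in the OpenAI-style function-calling schema that Ollama's
# /api/chat endpoint accepts for models with tool support.
restart_tool = {
    "type": "function",
    "function": {
        "name": "restart_node",
        "description": "Restart a node by its ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "node_id": {
                    "type": "string",
                    "description": "A real node ID, not a placeholder.",
                },
            },
            "required": ["node_id"],
        },
    },
}

# Body of a POST to http://localhost:11434/api/chat (constructed only;
# sending it requires a running Ollama server with the model pulled).
request_body = {
    "model": "qwen3:8b",
    "messages": [
        {"role": "user", "content": "Restart all stopped nodes"},
    ],
    "tools": [restart_tool],
}

print(json.dumps(request_body, indent=2))
```

A capable model replies with a `tool_calls` entry naming the tool and supplying real arguments; the failure modes mentioned above show up here as placeholder argument values or plain-text answers instead of a structured tool call.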
On some systems Ollama starts automatically after installation. If it is not running, start it manually: