Single-user Telegram bot for server monitoring plus AI chat. Supports local Ollama models and optional cloud providers (OpenAI, Anthropic, DeepSeek).
Supported locales: en, es, it, de, fr.
- You send a command or free-text message from Telegram.
- The bot fetches live server metrics from Glances.
- Metrics are injected into the prompt.
- The prompt is sent to the currently active model option (`ollama`, `openai`, `anthropic`, or `deepseek`).
- The reply is sent back in Telegram.
Free-text chat now uses a bounded multi-turn context window (turns + char budget) so follow-up questions keep continuity without unbounded prompt growth.
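The turns + char budget trimming could look roughly like this (a sketch; `build_context` and the `(role, text)` tuple shape are illustrative, not the bot's actual API):

```python
from collections import deque

def build_context(messages, max_turns=8, max_chars=10_000):
    """Keep only the most recent turns within both budgets.

    `messages` is a list of (role, text) tuples, oldest first. Walk it
    newest-first and stop once either the turn count (one turn = one
    user + one assistant message) or the char budget would be exceeded.
    """
    window: deque = deque()
    total = 0
    for role, text in reversed(messages):
        if len(window) // 2 >= max_turns or total + len(text) > max_chars:
            break
        window.appendleft((role, text))  # restore chronological order
        total += len(text)
    return list(window)
```

The defaults mirror `CHAT_CONTEXT_MAX_TURNS=8` and `CHAT_CONTEXT_MAX_CHARS=10000` from the configuration table below.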
The bot uses the Glances REST API v4 (`GLANCES_BASE_URL`) and fetches all server
metrics in a single `GET /all` request, plus `/all/limits` for threshold data
and three lightweight history endpoints for trend analysis.
Recommended setup: run Glances directly on the host (glances -w) and point
GLANCES_BASE_URL to its address. This gives full access to hardware data
(GPU, sensors, all disks) without Docker isolation issues. If you prefer
running Glances inside Docker, see the commented service block in
docker-compose.yml.
The metrics pipeline includes three layers:
- **Snapshot layer:** single `/all` fetch with a short cache (10 s) and fallback host resolution.
- **Operational layer:** health score (0-100), severity (`good|warning|critical`), key findings, and next-action hints built from thresholds + live values.
- **AI context layer:** rich structured JSON payload with all containers (name/status/CPU/RAM/IO), all mounts, all active network interfaces (RX/TX), sensors, disk I/O rates, system info, GPU state, and the top 10 processes.
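The operational layer's scoring could be sketched like this (illustrative only: `score_health`, the per-metric penalties, and the severity cutoffs are assumptions, not the bot's actual numbers):

```python
def score_health(values, limits):
    """Map live percentages against (warning, critical) limits to a 0-100 score.

    `values` is like {"cpu": 92.0, "mem": 40.0}; `limits` is like
    {"cpu": (70, 90), "mem": (70, 90)}. Each breach subtracts a fixed
    penalty and adds a human-readable finding.
    """
    score, findings = 100, []
    for name, value in values.items():
        warn, crit = limits[name]
        if value >= crit:
            score -= 30
            findings.append(f"{name} critical ({value:.0f}%)")
        elif value >= warn:
            score -= 15
            findings.append(f"{name} warning ({value:.0f}%)")
    score = max(score, 0)
    severity = "critical" if score < 50 else "warning" if score < 80 else "good"
    return score, severity, findings
```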
Endpoints used per snapshot:
| Endpoint | Purpose |
|---|---|
| `/all` | Full server state (single request) |
| `/all/limits` | Warning/critical thresholds for health scoring |
| `/cpu/total/history/3` | CPU trend (last 3 samples) |
| `/mem/percent/history/3` | RAM trend (last 3 samples) |
| `/load/min1/history/3` | Load trend (last 3 samples) |
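A minimal sketch of the snapshot fetch with its short cache (the base URL, cache shape, and function names here are assumptions for illustration; the real service lives in `app/services/glances.py`):

```python
import json
import time
import urllib.request

BASE = "http://192.168.1.10:61208/api/4"  # assumed; use GLANCES_BASE_URL in practice

def http_get(path: str):
    """GET one Glances endpoint and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE}{path}", timeout=8.0) as resp:
        return json.loads(resp.read())

_cache = {"ts": 0.0, "data": None}

def fetch_snapshot(get=http_get, ttl: float = 10.0):
    """One full snapshot per TTL window: /all, thresholds, three trend series."""
    now = time.monotonic()
    if _cache["data"] is not None and now - _cache["ts"] < ttl:
        return _cache["data"]  # served from the short cache
    snapshot = {
        "all": get("/all"),
        "limits": get("/all/limits"),
        "cpu_trend": get("/cpu/total/history/3"),
        "mem_trend": get("/mem/percent/history/3"),
        "load_trend": get("/load/min1/history/3"),
    }
    _cache["ts"], _cache["data"] = now, snapshot
    return snapshot
```

Making `get` injectable keeps the caching logic testable without a live Glances instance.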
The `/glances` interactive menu still fetches individual detail endpoints on
demand (`/cpu`, `/mem`, `/fs`, `/gpu`, `/sensors`, etc.).
Notes:
- Base URL example: `http://192.168.1.10:61208/api/4`
- Glances auth is optional (the bot works without auth if Glances is not started with `--password`).
- Set `GLANCES_LOG_FULL_PAYLOAD=true` to log the full `/all` JSON at `INFO` level for diagnostics.
- For GPU telemetry, Glances needs runtime device access. When running on the host this works automatically.
- Scheduler alerts include classic metric threshold alarms plus a global health alert when the operational score degrades to warning/critical.
| Command | Description |
|---|---|
| `/start` | Start the bot and show the persistent keyboard |
| `/status` | Current server metrics snapshot |
| `/alerts` | View and configure alert thresholds |
| `/glances` | Open live Glances per-endpoint detail menu |
| `/models` | Open model manager: switch active model, install local Ollama models, and delete local models |
| `/help` | Show help message |
[ 📊 Status ] [ 🔔 Alerts ]
[ 🤖 Models ] [ ❓ Help ]
- **Model manager** (`/models`): lists all local Ollama models and, when an API key and model are configured, cloud provider options (OpenAI, Anthropic, DeepSeek). The active option is marked with ✅.
- **Model install** (inside `/models`): tap ⬇️ Install model, reply with an Ollama model name (for example `llama3.2:3b`), and the bot starts `POST /api/pull` with a live progress bar (█░, percentage, MB transferred).
- **Install cancel** (during model download): while a local model is downloading, an inline ⛔ Cancel download button is shown; pressing it requests cancellation and stops the active pull safely.
- **Model delete** (inside `/models`): tap 🗑️ Delete local model, select one local Ollama model from the inline list, then confirm with ✅ Confirm before deletion (`DELETE /api/delete`).
- **Alert thresholds** (`/alerts`): edit CPU/RAM/Disk thresholds with confirmation.
- **Glances details** (`/glances` or inline button from `/status`): open a live menu and fetch one Glances endpoint on demand (CPU, RAM, disk, GPU, network, containers, top processes, etc.).
- **Chat context controls** (free-text replies): ℹ️ Context shows current context usage; the panel includes 🧹 Clear and ❌ Close.
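The install flow above can be sketched as follows (a sketch, not the bot's actual code: the function names and bar width are assumptions, and `OLLAMA` stands in for `OLLAMA_BASE_URL`; Ollama's `POST /api/pull` streams one JSON object per line with `status`, `total`, and `completed` fields):

```python
import json
import urllib.request

OLLAMA = "http://host.docker.internal:11434"  # default OLLAMA_BASE_URL

def pull_with_progress(name: str):
    """Stream POST /api/pull and yield (status, percent) for each progress event."""
    req = urllib.request.Request(
        f"{OLLAMA}/api/pull",
        data=json.dumps({"name": name}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # newline-delimited JSON events
            event = json.loads(line)
            total, done = event.get("total"), event.get("completed")
            pct = round(100 * done / total, 1) if total and done else None
            yield event.get("status", ""), pct

def progress_bar(pct: float, width: int = 10) -> str:
    """Render a █░ bar like the one shown in Telegram."""
    filled = int(width * pct / 100)
    return "█" * filled + "░" * (width - filled)
```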
.
├── app/
│ ├── main.py
│ ├── core/
│ │ ├── auth.py
│ │ ├── config.py
│ │ └── store.py
│ ├── handlers/
│ │ ├── alerts.py
│ │ ├── chat.py
│ │ ├── glances_menu.py
│ │ ├── help.py
│ │ ├── models.py
│ │ ├── start.py
│ │ └── status.py
│ ├── services/
│ │ ├── glances.py
│ │ ├── llm_router.py
│ │ └── ollama.py
│ └── utils/
│ ├── formatting.py
│ └── i18n.py
├── locale/
│ ├── en.json
│ ├── es.json
│ ├── it.json
│ ├── de.json
│ └── fr.json
├── docker-compose.yml
├── Dockerfile
└── pyproject.toml
| Variable | Required | Default | Description |
|---|---|---|---|
| `TELEGRAM_BOT_TOKEN` | ✅ | — | Token from BotFather |
| `TELEGRAM_CHAT_ID` | ✅ | — | Authorized chat ID |
| `GLANCES_BASE_URL` | | `http://glances:61208/api/4` | Glances API base URL |
| `GLANCES_REQUEST_TIMEOUT_SECONDS` | | `8.0` | Glances HTTP request timeout |
| `GLANCES_LOG_FULL_PAYLOAD` | | `false` | Log full `/all` Glances payload at `INFO` |
| `OLLAMA_BASE_URL` | | `http://host.docker.internal:11434` | Ollama API base URL |
| `OLLAMA_MODEL` | | `llama3.2:3b` | Default Ollama model |
| `OPENAI_API_KEY` | | — | Optional OpenAI API key |
| `OPENAI_MODEL` | | — | Optional fixed OpenAI model |
| `ANTHROPIC_API_KEY` | | — | Optional Anthropic API key |
| `ANTHROPIC_MODEL` | | — | Optional fixed Anthropic model |
| `DEEPSEEK_API_KEY` | | — | Optional DeepSeek API key |
| `DEEPSEEK_MODEL` | | — | Optional fixed DeepSeek model |
| `BOT_LOG_LEVEL` | | `INFO` | Logging level |
| `BOT_LOCALE` | | `en` | Fallback locale |
| `TZ` | | `UTC` | Timezone |
| `SQLITE_PATH` | | `/app/data/serverwatch.db` | SQLite path |
| `DATA_PATH` | | `/opt/docker/serverwatch-ai-bot/data` | Host path mounted into `/app/data` |
| `ALERT_CHECK_INTERVAL_SECONDS` | | `60` | Alert scan interval (0 disables scheduler polling) |
| `ALERT_COOLDOWN_SECONDS` | | `300` | Alert cooldown |
| `ALERT_DEFAULT_CPU_THRESHOLD` | | `85` | Default CPU threshold |
| `ALERT_DEFAULT_RAM_THRESHOLD` | | `85` | Default RAM threshold |
| `ALERT_DEFAULT_DISK_THRESHOLD` | | `90` | Default disk threshold |
| `ALERT_CONSECUTIVE_BREACHES` | | `2` | Consecutive checks required before firing metric alerts |
| `ALERT_RECOVERY_MARGIN_PERCENT` | | `5` | Hysteresis margin (threshold - margin) required to reset alert state |
| `ALERT_CONTEXT_WINDOW_SAMPLES` | | `3` | Number of recent samples used for alert context average |
| `CHAT_CONTEXT_MAX_TURNS` | | `8` | Max past user+assistant turns included in AI context |
| `CHAT_CONTEXT_MAX_CHARS` | | `10000` | Max total chars for the stored context window sent to the AI |
| `CHAT_CONTEXT_RETENTION_MESSAGES` | | `200` | Max persisted chat messages kept per chat |
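The interplay of `ALERT_CONSECUTIVE_BREACHES` and `ALERT_RECOVERY_MARGIN_PERCENT` could be sketched like this (`AlertState` is illustrative, not the bot's actual class): an alert fires only after N checks in a row above the threshold, and resets only once the value drops below threshold minus the margin, so values hovering near the threshold do not flap.

```python
class AlertState:
    """Consecutive-breach trigger with a recovery hysteresis margin."""

    def __init__(self, threshold: float, breaches_needed: int = 2, margin: float = 5.0):
        self.threshold = threshold
        self.breaches_needed = breaches_needed
        self.margin = margin
        self.streak = 0       # consecutive checks above threshold
        self.active = False   # alert currently firing

    def update(self, value: float) -> bool:
        """Feed one sample; return True exactly when a new alert should fire."""
        if value >= self.threshold:
            self.streak += 1
            if not self.active and self.streak >= self.breaches_needed:
                self.active = True
                return True
        else:
            self.streak = 0
            if self.active and value < self.threshold - self.margin:
                self.active = False  # recovered past the hysteresis margin
        return False
```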
```bash
cp .env.example .env
# edit .env
docker compose up -d --build
```

Recommended command (all-in-one):

```bash
./.venv/bin/python -m ruff check . && ./.venv/bin/python -m ruff format --check . && ./.venv/bin/python -m mypy app
```

You can also run them separately:

```bash
./.venv/bin/python -m ruff check .
./.venv/bin/python -m ruff format --check .
./.venv/bin/python -m mypy app
./.venv/bin/python -m pytest -q
```

See CONTRIBUTING.md.
This project is licensed under the Apache 2.0 License. See LICENSE.
Built to keep server operations simple, clear, and always one message away.
