Skip to content

Chance-Konstruktion/ha-gaming-assistant

Repository files navigation

Gaming Assistant for Home Assistant

🎮 Gaming Assistant for Home Assistant

An AI coach that watches your screen and whispers tips into your smart home.

Local Vision LLM (Ollama) or cloud AI (GPT-4o, Gemini, DeepSeek, Groq) analyzes your gameplay — frame by frame — and pushes context-aware tips, voice lines, and triggers straight into Home Assistant. RGB lights react. TTS reads tips aloud. Spoilers? You control them.

HACS Custom Release CI License: MIT Python

🚀 Quick Start · 🧠 How it works · 🕹 Supported games · 🗺 Roadmap · 🤝 Contribute


💥 Why this exists

You game. You have a Home Assistant box sitting in the corner. Cloud "AI coaches" want your screen, your account, and a subscription. Why not run the whole thing yourself?

Your GPU, your tips, your house. No SaaS. No telemetry. No spoilers you didn't ask for.

🧿 Sees your game

A Vision LLM looks at JPEG frames over MQTT and produces real tips — not generic walkthrough text.

🏠 Lives in your house

Native HACS integration. Tips, status, sensors, services — all exposed to Lovelace, automations, and HA Assist.

🔒 Stays local

Ollama on your gaming rig or NAS. Cloud backends optional, never required. Zero data leaves your LAN unless you say so.


✨ Highlights

🎯 Four assistant modes Coach · Co-Player · Opponent · Analyst — switch live from the dashboard or by voice.

🃏 Tabletop + console too Chess, Poker, Catan, UNO via webcam. Consoles via HDMI capture or IP webcam. No PC needed.

🎙 HA Assist conversation agent "Wechsel modus auf gegner." "How do I beat this boss?" Both work.

📦 26 prompt packs included Elden Ring, BG3, CS2, Zelda, Hearthstone, MTG Arena, FIFA, Civ VI, Rocket League… plus community packs hot-reloaded from a sibling repo.

🙈 7-category spoiler control Story, items, enemies, bosses, locations, lore, mechanics — each at none / low / medium / high, persisted per game.

🧠 Game State Engine Tracks structured state across frames (health declining, phase changes, momentum) — your tips know what just happened, not only what's on screen.

⚡ Pluggable backends Ollama · LM Studio · GPT-4o · Gemini · DeepSeek · Groq. Vision or text-only. Raspberry Pi friendly.

🟢 Optional YOLO worker Real-time object detection on CUDA / NCNN / Hailo-8L / TFLite, feeding the Game State Engine.


🧠 Architecture

v0.11 is a Thin Client design. The gaming device only captures and ships frames. All intelligence lives in Home Assistant.

┌─────────────────────────────────────────┐        ┌───────────────────────────────────────────────┐
│  CAPTURE AGENT  (you on the couch)      │  MQTT  │  HOME ASSISTANT  (the brain)                  │
│  ─────────────────────────────          │ ─────► │  ─────────────────────────────                │
│  • Windows / Linux / macOS              │  jpeg  │  Image listener  →  hash dedupe               │
│  • Android / Android TV / Google TV     │        │  Game detection  →  Game State Engine         │
│  • IP webcam · HDMI bridge · camera     │        │  Trend detection (health, phase, momentum)    │
│  • JPEG compress · backpressure         │        │  Spoiler policy  →  Prompt builder            │
└─────────────────────────────────────────┘        │  LLM backend  →  Tip                          │
                                                   │  Sensors · Services · Conversation agent      │
                                                   └─────────────┬─────────────────────────────────┘
                                                                 │
                                          ┌──────────────────────┼──────────────────────┐
                                          ▼                      ▼                      ▼
                                   🔊  TTS / Piper        💡  RGB scenes         📱  Notifications
                                   "Watch the left."      red on low HP          mobile_app push
Optional: YOLO Worker (external, GPU/NPU)
   Game State Engine  ◄──  Detections (MQTT)  ◄──  YOLO Worker
                                                   ├─ CUDA  (PC GPU)
                                                   ├─ NCNN  (Raspberry Pi)
                                                   ├─ Hailo-8L  (NPU)
                                                   └─ TFLite  (mobile)

Adds structured object detection (enemies, items, UI elements) on top of the Vision LLM. Pure upgrade; the rest works without it.

Legacy mode (v0.2 / v0.3 workers)

Old workers that publish finished tips to gaming_assistant/tip still work in passthrough. Migrate when you're ready — see Migration.


🚀 Quick Start

1 — Install via HACS

HACS → Integrations → ⋯ → Custom repositories
  URL:   https://github.com/Chance-Konstruktion/ha-gaming-assistant
  Type:  Integration

Restart Home Assistant.

2 — Add the integration

Settings → Devices & Services → Add Integration → Gaming Assistant

A 6-step config flow asks for: LLM provider → connection → model + interval → spoiler default → camera → voice. Cloud providers need an API key. Connection is validated before the flow closes.

3 — Fire up a capture agent

# Pick your platform — they all speak MQTT
pip install -r worker/requirements-capture.txt

python worker/capture_agent.py \
  --broker 192.168.1.10 \
  --client-id gaming-pc \
  --interval 5 \
  --quality 75

Tips start landing on sensor.gaming_assistant_tip within seconds.

💡 Non-technical? Build the Windows GUI: cd worker && build_exe.bat → run GamingAssistant.exe. Enter broker IP, hit Start. Done.


🤖 AI backends

Backend Type Vision Notes
Ollama 🏠 Local Default. No API key. Recommended.
LM Studio 🏠 Local OpenAI-compatible endpoint
OpenAI GPT-4o ☁ Cloud Best quality, paid
Google Gemini ☁ Cloud Free tier available
DeepSeek ☁ Cloud Cheap, text-only
Groq ☁ Cloud Ultra-fast inference, text-only

⚪ Text-only backends never receive images — they get structured game state + context descriptions instead. Great for Raspberry Pi setups without a GPU.

Recommended local vision models (Ollama)

Model VRAM Notes
qwen2.5vl ~8 GB Best quality, recommended
llava ~8 GB Good general purpose
bakllava ~6 GB Lighter option
llama3.2-vision ~10 GB Excellent, needs more VRAM
ministral:3b ~2 GB Lightweight, low-VRAM rigs

🥧 Raspberry Pi / no GPU? Use Gemini free tier or DeepSeek. Integration runs on anything that runs Home Assistant.


🕹 Supported games

🎮 Video (18)

Elden Ring · Dark Souls III · Baldur's Gate 3 · Minecraft · Zelda: TotK · Zelda: BotW · Stardew Valley · Hades · Mario Kart · CS2 · League of Legends · Valorant · Fortnite · Rocket League · FIFA / EA FC · Civ VI · Cyberpunk 2077 · The Witcher 3 · Diablo IV

🃏 Card / Strategy (4)

Hearthstone · MTG Arena · Among Us

♟ Tabletop (4)

Chess · Poker · Catan · UNO

Community packs are pulled from ha-gaming-assistant-prompts and hot-reloaded via the gaming_assistant.refresh_prompt_packs service. Roll your own with docs/pack_authoring.md — schema, validation, local testing, the full workflow. A _template.json ships in the repo.


🎚 Feature deep-dive

🎯 Assistant Modes — Coach, Co-Player, Opponent, Analyst

Switch live from the dashboard via select.gaming_assistant_assistant_mode, or:

action: select.select_option
target:
  entity_id: select.gaming_assistant_assistant_mode
data:
  option: opponent
Mode What it does
Coach Tips and strategy to help you win (default)
Co-Player Collaborative teammate, suggests joint moves
Opponent Plays against you, announces its moves out loud
Analyst Neutral commentator, doesn't take sides
🙈 Spoiler System — 7 categories, 4 levels, per-game profiles

Control what the AI is allowed to reveal across story · items · enemies · bosses · locations · lore · mechanics — each at none / low / medium / high.

service: gaming_assistant.set_spoiler_level
data:
  category: bosses
  level: none
  game: "Elden Ring"   # optional

Set everything at once with a profile:

service: gaming_assistant.set_spoiler_profile
data:
  game: "Elden Ring"
  level: none

Profiles persist across HA restarts. Add clear: true to wipe.

❓ Ask Mode — talk to it like ChatGPT, with or without a screenshot
service: gaming_assistant.ask
data:
  question: "How do I beat this boss?"
  game_hint: "Elden Ring"

With a screenshot:

service: gaming_assistant.ask
data:
  question: "What item is on the ground here?"
  image_path: /config/www/screenshot.jpg
♟ Tabletop Support — point a camera at your board
service: gaming_assistant.watch_camera
data:
  entity_id: camera.board_game_cam
  game_hint: "Chess"
  client_type: tabletop
  interval: 10

Stop with gaming_assistant.stop_watch_camera. Built-in packs for Chess, Poker, Catan, UNO.

🔊 Voice (TTS) + 🗣 Voice Control (HA Assist)

Auto-announce: Flip switch.gaming_assistant_auto_announce on — every new tip is spoken via your configured TTS engine and media player.

Manual:

service: gaming_assistant.announce
data:
  tts_entity: tts.piper
  media_player_entity_id: media_player.living_room

Conversation agent: Pick Gaming Assistant as the conversation agent in Settings → Voice assistants. Built-in commands (EN + DE):

English Deutsch Action
"switch mode to opponent" "wechsel modus auf gegner" Change mode
"set spoiler to low" "ändere spoiler auf niedrig" Change spoiler level
"start" / "stop" "starte" / "stoppe" Pause/resume capture
"current tip" "aktueller tipp" Read latest tip
"session summary" "zusammenfassung" Read session summary
"analyze" "analysiere" Force immediate analysis

Anything else is forwarded to the LLM as a free-form question. "Was ist mein nächster Zug?" works.

📝 Session Summaries — wrap-up after every session

After 5 minutes of inactivity, the integration can auto-summarize the session's key insights (3+ tips required). The result lands in sensor.gaming_assistant_session_summary. A gaming_assistant_session_ended event fires for automations.

Manual:

service: gaming_assistant.summarize_session
data:
  game: "Elden Ring"
📡 Event-based automations — every tip fires gaming_assistant_new_tip
trigger:
  - platform: event
    event_type: gaming_assistant_new_tip
action:
  - service: notify.mobile_app_your_phone
    data:
      title: "Gaming Tip ({{ trigger.event.data.game }})"
      message: "{{ trigger.event.data.tip }}"

Event payload includes tip, game, client_id, assistant_mode. Pair with light scenes, scripts, or whatever HA can do.


📦 Capture agents

Agent Script Good for
PC worker/capture_agent.py Windows / Linux / macOS gaming rigs
Android worker/capture_agent_android.py Phones via ADB (USB or Wi-Fi)
Android TV / Google TV worker/capture_agent_android_tv.py Steam Link, GeForce NOW, Xbox Game Pass on TV
IP Webcam worker/capture_agent_ipcam.py Phone aimed at a TV/monitor, console gaming
HDMI Bridge worker/capture_agent_bridge.py USB HDMI dongle on a Raspberry Pi → any console
Windows GUI GamingAssistant.exe (built via worker/build_exe.bat) Non-technical users — one click to start
CLI cheat sheet

PC agent (capture_agent.py):

Arg Default Notes
--broker (required) MQTT broker IP
--port 1883 MQTT port
--user / --password MQTT auth
--client-id hostname Unique client ID
--interval 5 Seconds between captures
--quality 75 JPEG quality (1–100)
--resize 960x540 Image dimensions
--monitor 1 Monitor index
--game-hint Manual game name (Wayland fallback)
--detect-change off Skip unchanged frames

Linux note: PC agent uses xprop for window-title detection on X11. On Wayland, auto-detection is unavailable — use --game-hint.

Android agent — same as PC, plus:

Arg Notes
--device ADB device serial or IP:port

Android TV agent — same as Android, plus --game-hint (for streaming apps where window title is meaningless).

IP Webcam agent (capture_agent_ipcam.py):

Arg Default Notes
--url (required) e.g. http://192.168.1.42:8080/shot.jpg
--interval 5 Seconds between captures
--quality 75 JPEG quality
--resize 960x540 Image dimensions
--auth-user / --auth-password HTTP basic auth
--timeout 8 HTTP timeout
--game-hint Manual game name
--detect-change off Skip unchanged frames

HTTP errors trigger exponential backoff (2s → 4s → 8s → …, capped at 60s). Exits only after 20 consecutive failures.

HDMI Bridge (capture_agent_bridge.py):

pip install opencv-python
python capture_agent_bridge.py --broker 192.168.1.10 --device /dev/video0
Arg Default Notes
--device /dev/video0 V4L2 device path or index
--capture-resolution 1280x720 Requested input resolution
--resize 960x540 Output size for MQTT
--quality 70 JPEG quality
--interval 2 Seconds between frames
--client-type console Reported source type
--game-hint Manual game name
--detect-change off Skip unchanged frames

A systemd unit ships at worker/systemd/gaming-assistant-bridge.service — adjust broker IP, device path, and user before enabling.

PC Overlay HUD (tools/overlay_pc.py): Subscribes to gaming_assistant/tip and shows the latest tip in an always-on-top transparent window. F8 to toggle, Esc to quit. See tools/README.md.

Android over Wi-Fi (no cable)
adb tcpip 5555
adb connect 192.168.1.42:5555

python worker/capture_agent_android.py \
  --broker 192.168.1.10 \
  --device 192.168.1.42:5555

🕹️ Agent Mode / Player 2 (experimental)

Beyond watching your game, the assistant can play it. worker/agent_executor.py turns the AI's structured action output into real inputs on a virtual Xbox controller (vgamepad).

Why a virtual gamepad? A virtual controller can only send game-controller inputs — it can never move your mouse, alt-tab, or type system commands. The AI is sandboxed to "press buttons", which is the entire safety premise of Agent Mode.

Safety model — opt-in and conservative by default:

  • Whitelist — only buttons you list in --allow-buttons are ever forwarded; anything else is rejected and logged.
  • Dry-run--dry-run (and the automatic fallback when vgamepad isn't installed) validates and logs actions without sending input. Always start here.
  • Audit log — every action (accepted, rejected, or skipped) is appended as one JSON line to --audit-log.
  • Emergency stop — publish stop to gaming_assistant/command to instantly pause and release all inputs; start resumes. Inputs are also released on disconnect and shutdown, so nothing ever stays stuck.
pip install -r worker/requirements-player2.txt   # vgamepad + paho-mqtt
# (Windows also needs the free ViGEmBus driver for vgamepad.)

# 1) Safe first run — validates + logs, sends nothing:
python worker/agent_executor.py --broker 192.168.1.10 --client-id gaming-pc --dry-run

# 2) Go live, restricted to face buttons + D-pad:
python worker/agent_executor.py --broker 192.168.1.10 --client-id gaming-pc \
  --allow-buttons A,B,X,Y,DPAD_UP,DPAD_DOWN,DPAD_LEFT,DPAD_RIGHT

Test it end-to-end by publishing an action yourself (HA → Developer Tools → Actions → mqtt.publish, or mosquitto_pub):

mosquitto_pub -h 192.168.1.10 -t gaming_assistant/gaming-pc/action \
  -m '{"action":"tap_button","button":"A","duration_ms":80,"reason":"confirm"}'
CLI cheat sheet (`agent_executor.py`)
Arg Default Notes
--broker localhost MQTT broker IP
--port 1883 MQTT port
--username / --password MQTT auth
--client-id hostname Must match the capture client
--allow-buttons all Comma-separated whitelist, e.g. A,B,X,Y
--dry-run off Validate + log, never send input
--tap-ms 80 Default tap_button duration
--audit-log agent_executor_audit.log JSON-lines audit trail ('' to disable)

Actions follow PromptBuilder.ACTION_SCHEMA: press_button, release_button, tap_button (buttons A/B/X/Y/LB/RB/LT/RT/DPAD_*/START/BACK), move_stick (left/right, x/y in [-1.0, 1.0]), wait, and no_op.

Let Home Assistant drive it. Enable the Agent Mode switch (or call gaming_assistant.set_agent_mode) and each analyzed frame additionally asks the LLM for one controller action, validates it, and publishes it to gaming_assistant/{client_id}/action for the executor:

action: gaming_assistant.set_agent_mode
data:
  enabled: true
  allowed_buttons: "A, B, X, Y, DPAD_UP, DPAD_DOWN, DPAD_LEFT, DPAD_RIGHT"

Safety: Agent Mode is strictly opt-in and resets to OFF on every Home Assistant restart — the AI never controls inputs unless you deliberately turn it on. It runs a second inference per frame (in addition to the normal tip), so expect higher load, especially on local models. The executor still enforces its own whitelist and --dry-run, and stop on gaming_assistant/command is the emergency brake. Start with the executor in --dry-run to watch the action stream safely before going live.


🧩 Entities & services

Controls (adjustable from the dashboard)
Entity Type Description
select.gaming_assistant_assistant_mode Select Coach / Co-Player / Opponent / Analyst
select.gaming_assistant_spoiler_level Select Default spoiler level
number.gaming_assistant_interval Number Capture interval (5–120 s)
number.gaming_assistant_timeout Number Analysis timeout (10–300 s)
switch.gaming_assistant_auto_announce Switch Auto-announce tips via TTS
switch.gaming_assistant_auto_summary Switch Auto-summarize on session end
Sensors
Entity Description
sensor.gaming_assistant_tip Latest AI tip (attrs: game, worker_status)
sensor.gaming_assistant_status idle / analyzing / error
sensor.gaming_assistant_history Tip count + recent tips
sensor.gaming_assistant_latency Duration of last analysis (s)
sensor.gaming_assistant_error_count Errors since startup
sensor.gaming_assistant_frames_processed Total frames analyzed
sensor.gaming_assistant_last_analysis Timestamp of last success
sensor.gaming_assistant_active_watchers Active camera watchers
sensor.gaming_assistant_registered_workers Auto-discovered workers
sensor.gaming_assistant_session_summary Last session summary
binary_sensor.gaming_mode ON when a game is detected
image.gaming_assistant_last_frame Last received JPEG (debug)
conversation.gaming_assistant Voice control via HA Assist
Services
Service Description
gaming_assistant.analyze Trigger an immediate screenshot analysis
gaming_assistant.start / .stop Pause / resume capture
gaming_assistant.process_image Manually analyze an image (path or base64)
gaming_assistant.ask Ask a direct question (optional image)
gaming_assistant.set_spoiler_level Change spoiler settings per category/game
gaming_assistant.set_spoiler_profile Set/clear a per-game spoiler profile
gaming_assistant.clear_history Clear tip history
gaming_assistant.capture_from_camera One-shot capture from a HA camera entity
gaming_assistant.watch_camera Continuous camera monitoring at interval
gaming_assistant.stop_watch_camera Stop watcher(s)
gaming_assistant.announce Speak current tip (or custom message) via TTS
gaming_assistant.summarize_session Generate a session summary
gaming_assistant.refresh_prompt_packs Hot-reload community packs

Mode, spoiler level, interval, and timeout are now controlled via entities — services are for one-shot actions.


🪟 Lovelace & Automations

A ready-made dashboard ships at lovelace/dashboard.yaml. Drop it into a Manual card — you get current tip, history, spoiler controls, status, and action buttons out of the box.

Sample automations live in lovelace/automations_example.yaml: speak tips via TTS, change RGB color when gaming starts, send tips as mobile notifications, change spoiler level based on detected game.


🧰 Requirements

Component Minimum
Home Assistant 2024.1+ with MQTT integration
MQTT Broker Mosquitto (built-in HA add-on)
Capture device Windows · Linux · macOS · Android · Android TV · IP cam · HDMI bridge
AI backend Ollama (local) or cloud API (GPT-4o, Gemini, DeepSeek, Groq)

🩹 Troubleshooting

Config flow: 500 Internal Server Error

Make sure the MQTT integration (Mosquitto) is fully set up before adding Gaming Assistant. If it persists, delete __pycache__ inside custom_components/gaming_assistant/ and restart HA.

Sensor stuck on "Waiting for tips…"
  • Confirm the capture agent is running and reachable.
  • Verify MQTT (Mosquitto add-on) is up.
  • Check that Ollama is running and reachable from HA.
Ollama timeout

The model may be loading for the first time — wait 60s and retry. Or reduce --quality and --resize in the capture agent.

No game detection on desktop

Install pywin32 on Windows; make sure the game is in the foreground. Or add your title to KNOWN_GAMES in the capture agent.

ADB screencap fails

adb devices should show your phone as device (not unauthorized). Accept the USB-debugging prompt on the device.


🪄 Migration from v0.2 / v0.3

  • Workers: old workers moved to worker/legacy/. Still functional but deprecated — switch to the new capture agents.
  • Config: existing entries remain valid. New fields get defaults automatically.
  • Topics: old MQTT topics (gaming_assistant/tip, …/status, …/gaming_mode) are still supported in legacy passthrough mode.

🤝 Contributing

Issues, PRs, and prompt-pack submissions are welcome.


📜 License

MIT — do whatever you want with it. A ⭐ on the repo is the polite tip-jar.


Built by gamers, for the HA homelab crowd.

v0.11 · Thin Client · Local-first · Open source · Made with 🟦 and 🟪

About

AI-powered gaming coach for Home Assistant – uses local Vision LLM (Ollama) to analyze your screen and push gameplay tips via MQTT. HACS-ready.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors