Transparency notice: This integration was entirely vibe coded with GitHub Copilot. Use it accordingly.
Run fully local voice assistants in Home Assistant using Lemonade as the AI backend.
| Capability | Backend | Models |
|---|---|---|
| Speech-to-Text | Whisper via whisper.cpp or FastFlowLM | Whisper Tiny → Large-v3-Turbo |
| Conversation / LLM | LLama.cpp (Vulkan/ROCm/CPU), FastFlowLM (NPU), RyzenAI (NPU) | Qwen 3, Llama 3.2, Phi-4, Mistral 7B, and 100+ more |
| Text-to-Speech | Kokoro TTS | kokoro-v1 (multiple voices) |
- Lemonade server installed and running on your machine. Download from https://github.com/lemonade-sdk/lemonade/releases.
- If running Home Assistant OS or Supervised, Lemonade should run on the
host (not inside a container). Set the host to
homeassistant.localor the host's LAN IP.
The integration can be installed and updated directly through HACS, with no add-on required.
- In HACS, click ⋮ → Custom repositories, paste
https://github.com/rwfsmith/Lemonade-HA, choose category Integration, and click Add. - Search for Lemonade HA and click Download.
- Restart Home Assistant so HA loads the custom component.
- Proceed to Add the integration below.
To update: HACS will notify you when a new version is available — click Update and restart HA.
The add-on bundles the integration and copies it into your HA config directory on startup. Use this method if you prefer not to use HACS.
- Settings → Add-ons → Add-on Store → ⋮ → Repositories, add:
https://github.com/rwfsmith/Lemonade-HA - Find Lemonade HA, click Install, then Start.
- Restart Home Assistant once so HA loads the custom component.
Tip: If you later switch to HACS-managed updates, set
copy_integration: falsein the add-on Configuration tab. The add-on will no longer overwrite the component on restart, so HACS controls the integration version independently.
Settings → Devices & Services → Add Integration → search "Lemonade HA"
- Enter the host and port of your Lemonade server (default
localhost:8000). HA will verify the connection before proceeding. - Choose your STT, LLM, and TTS models.
- Click Submit. The integration registers:
- Lemonade Whisper — Speech-to-Text engine
- Lemonade Kokoro — Text-to-Speech engine
- Lemonade LLM — Conversation agent
Settings → Voice Assistants → Add Assistant
| Field | Value |
|---|---|
| Conversation agent | Lemonade LLM |
| Speech-to-text | Lemonade Whisper |
| Text-to-speech | Lemonade Kokoro |
| Option | Default | Description |
|---|---|---|
| STT Model | Whisper-Large-v3-Turbo |
Whisper model size. Larger = more accurate, slower. |
| STT Backend | auto |
whispercpp:cpu, whispercpp:npu, whispercpp:vulkan, or flm:npu |
| STT Language | en |
ISO 639-1 code or auto for language detection |
| STT Beam Size | 0 |
Beam search width (0 = auto) |
| Option | Default | Description |
|---|---|---|
| LLM Backend | llamacpp |
llamacpp, ryzenai, or flm |
| LLM Model | Qwen3-4B-GGUF |
Model name for the selected backend |
| LLM Context Size | 4096 |
Max context window in tokens |
| LLM Max Tokens | 256 |
Max tokens per response |
| LLM System Prompt | (built-in) | Customise the assistant personality |
| Option | Default | Description |
|---|---|---|
| TTS Model | kokoro-v1 |
Kokoro ONNX model |
| TTS Voice | af_heart |
Voice ID (af_ = American Female, am_ = American Male, bf_ = British Female, bm_ = British Male) |
| Option | Default | Description |
|---|---|---|
| Lemonade Host | localhost |
Hostname or IP of the Lemonade server |
| Lemonade Port | 8000 |
HTTP port of the Lemonade server |
| Auto-start Lemonade | false |
Try to launch lemond if not already running |
| Hardware | Recommended Backend |
|---|---|
| AMD Ryzen AI (NPU) | ryzenai (LLM), flm:npu (STT) |
| AMD GPU (Radeon) | llamacpp + vulkan sub-backend |
| CPU only | llamacpp + cpu sub-backend |
| Apple Silicon | llamacpp + metal sub-backend |
Add-on starts but integration doesn't appear after HA restart
→ Check that the add-on started successfully and that /config/custom_components/lemonade_wyoming/ exists. Restart HA again if needed.
"Cannot connect" error when adding integration
→ Verify Lemonade is running (curl http://localhost:8000/api/v1/health). If HA OS, use the host LAN IP instead of localhost.
Models not downloading → Check the add-on log (Add-ons → Lemonade HA → Log). Model downloads can take several minutes.