Skip to content

Latest commit

Β 

History

History
157 lines (122 loc) Β· 5.95 KB

File metadata and controls

157 lines (122 loc) Β· 5.95 KB

Software β€” Stack Completo

Camadas do Sistema

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚          Camada 5: Conversation Agent        β”‚
β”‚          OpenClaw (Quasar / Claude)          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚          Camada 4: Voice Pipeline            β”‚
β”‚          Home Assistant Assist               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Camada 3a    β”‚  Camada 3b   β”‚  Camada 3c   β”‚
β”‚  STT: Whisper β”‚  TTS: Piper  β”‚  WW: openWW  β”‚
β”‚  (Wyoming)    β”‚  (Wyoming)   β”‚  (Wyoming)   β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚          Camada 2: ESPHome Native API        β”‚
β”‚          ComunicaΓ§Γ£o ESP32 ↔ HA              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚          Camada 1: Firmware ESPHome          β”‚
β”‚          ESP32-S3 (mic + speaker + LED)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Componentes de Software

No ESP32-S3 (Firmware)

Componente Tecnologia FunΓ§Γ£o
Firmware base ESPHome Framework de configuraΓ§Γ£o YAML, OTA, API nativa
Voice Assistant voice_assistant component Streaming Γ‘udio ↔ HA pipeline
Wake Word micro_wake_word component DetecΓ§Γ£o local no ESP32 (TFLite)
Microfone i2s_audio + microphone Captura Γ‘udio I2S do INMP441
Speaker i2s_audio + speaker ReproduΓ§Γ£o Γ‘udio I2S via MAX98357A
LED light + neopixelbus Feedback visual WS2812B
Wi-Fi wifi component ConexΓ£o Γ  rede local
Logger logger component Debug via USB serial

No Servidor (DeskFelipeDell)

Componente Tecnologia Porta FunΓ§Γ£o
Home Assistant HA Core :8123 Orquestrador central + Voice Pipeline
Whisper whisper.cpp / faster-whisper Wyoming Speech-to-Text local
Piper piper-tts Wyoming Text-to-Speech neural local
openWakeWord openWakeWord Wyoming Wake word backup (treinamento custom)
OpenClaw Clawdbot β€” Conversation Agent (Claude API)
HA Integrations Wyoming + OpenAI Conv. β€” Cola entre componentes

Home Assistant β€” ConfiguraΓ§Γ£o necessΓ‘ria

IntegraΓ§Γ΅es

  1. Wyoming Protocol β€” Conecta Whisper, Piper e openWakeWord
  2. ESPHome β€” Conecta os QuasarBox satellites
  3. OpenAI Conversation (ou custom) β€” Conversation agent apontando pro OpenClaw

Voice Pipeline (Assist)

# ConfiguraΓ§Γ£o via UI do HA: Settings β†’ Voice Assistants
# Pipeline "Quasar":
#   - STT: Whisper (Wyoming)
#   - Conversation Agent: OpenClaw (custom/OpenAI-compatible)
#   - TTS: Piper (Wyoming - pt-BR)
#   - Wake Word: openWakeWord (Wyoming)

ServiΓ§os Wyoming

O Wyoming protocol roda cada componente como um serviΓ§o TCP independente:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ faster-whisper (Wyoming server)  β”‚ :10300
β”‚ Modelo: small / medium           β”‚
β”‚ LΓ­ngua: pt-BR                    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ piper-tts (Wyoming server)       β”‚ :10200
β”‚ Voz: pt_BR-faber-medium          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ openWakeWord (Wyoming server)    β”‚ :10400
β”‚ Modelo: custom "ei_quasar"       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Conversation Agent β€” OpenClaw como "CΓ©rebro"

Forma 1: HA como Ponte (escolhida βœ…)

O HA tem suporte nativo a conversation agents via integraΓ§Γ£o OpenAI Conversation, que aceita qualquer API compatΓ­vel com o formato OpenAI Chat Completions.

O OpenClaw expΓ΅e (ou pode expor) um endpoint compatΓ­vel. O fluxo:

Voice Pipeline β†’ STT β†’ texto
  β†’ Conversation Agent (OpenClaw API)
    β†’ Claude interpreta o comando
    β†’ Chama HA API se necessΓ‘rio (tools/function calling)
    β†’ Retorna texto de resposta
  β†’ TTS β†’ Γ‘udio
  β†’ Volta pro ESP32

Vantagens sobre Assist nativo:

  • Entende linguagem natural complexa ("tΓ‘ um forno aqui")
  • MantΓ©m contexto da conversa
  • Pode executar aΓ§Γ΅es compostas ("modo filme")
  • Integra com serviΓ§os externos (Γ“rbita, TV, etc.)

Alternativa: Extended OpenAI Conversation (HACS)

Se a integraΓ§Γ£o nativa nΓ£o for suficiente, existe o Extended OpenAI Conversation via HACS que suporta:

  • Function calling (chamar serviΓ§os HA)
  • Prompt templates
  • Qualquer API OpenAI-compatible

DependΓͺncias de Software

Servidor (pip / Docker)

# Whisper (jΓ‘ instalado)
whisper-cpp ou faster-whisper

# Piper TTS
piper-tts

# openWakeWord
openwakeword

# Wyoming servers
wyoming-faster-whisper
wyoming-piper
wyoming-openwakeword

ESPHome (pip)

esphome >= 2024.2.0

Versionamento

Componente VersΓ£o mΓ­nima
Home Assistant 2024.2+ (voice pipeline v2)
ESPHome 2024.2+ (voice_assistant v2, micro_wake_word)
Whisper large-v3 / medium (pt-BR)
Piper 1.2+ (pt_BR voices)
Python (servidor) 3.10+