akdeb
diff --git a/‎server-fastapi/README.md‎
Lines changed: 256 additions & 0 deletions b/‎server-fastapi/README.md‎
Lines changed: 256 additions & 0 deletions
diff --git a/‎server-fastapi/__pycache__/classic_route.cpython-313.pyc‎
1.91 KB b/‎server-fastapi/__pycache__/classic_route.cpython-313.pyc‎
1.91 KB
diff --git a/‎server-fastapi/classic_route.py‎
Lines changed: 55 additions & 0 deletions b/‎server-fastapi/classic_route.py‎
Lines changed: 55 additions & 0 deletions
diff --git a/‎server-fastapi/env.example‎
Lines changed: 27 additions & 0 deletions b/‎server-fastapi/env.example‎
Lines changed: 27 additions & 0 deletions
diff --git a/‎server-fastapi/models/llm/__init__.py‎
Lines changed: 20 additions & 0 deletions b/‎server-fastapi/models/llm/__init__.py‎
Lines changed: 20 additions & 0 deletions
diff --git a/‎server-fastapi/models/llm/__pycache__/__init__.cpython-313.pyc‎
797 Bytes b/‎server-fastapi/models/llm/__pycache__/__init__.cpython-313.pyc‎
797 Bytes
diff --git a/‎server-fastapi/models/stt/__init__.py‎
Lines changed: 16 additions & 0 deletions b/‎server-fastapi/models/stt/__init__.py‎
Lines changed: 16 additions & 0 deletions
diff --git a/‎server-fastapi/models/stt/__pycache__/__init__.cpython-313.pyc‎
677 Bytes b/‎server-fastapi/models/stt/__pycache__/__init__.cpython-313.pyc‎
677 Bytes
diff --git a/‎server-fastapi/models/tts/__init__.py‎
Lines changed: 17 additions & 0 deletions b/‎server-fastapi/models/tts/__init__.py‎
Lines changed: 17 additions & 0 deletions
diff --git a/‎server-fastapi/models/tts/__pycache__/__init__.cpython-313.pyc‎
719 Bytes b/‎server-fastapi/models/tts/__pycache__/__init__.cpython-313.pyc‎
719 Bytes
@@ -0,0 +1,256 @@
+## ElatoAI: Realtime Voice AI Models on FastAPI
+
+`server-fastapi` is the simplest self-hosted Elato backend for people who want a normal Python server instead of an edge runtime.
+
+Use this if you want:
+
+- a FastAPI server you can run on your own machine or VM
+- a classic `STT -> LLM -> TTS` voice pipeline
+- a smaller provider surface that is easy to understand
+- the same ESP32 transport shape as the rest of Elato
+
+If you are new to the project, read these first:
+
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/README.md`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/README.md`
+
+## The Simple Provider Set
+
+To keep onboarding straightforward, the classic FastAPI route is centered around a small set of providers.
+
+### LLM
+
+- `openai`
+- `claude`
+- `gemini`
+- `grok`
+
+### STT
+
+- `deepgram`
+- `whisper`
+
+### TTS
+
+- `elevenlabs`
+- `cartesia`
+- `deepgram`
+- `openai`
+
+The code still uses the `models/llm`, `models/stt`, and `models/tts` layout, but the active registry is intentionally trimmed so the default experience stays simple.
+
+## Default Setup
+
+The default classic route is:
+
+- STT: `deepgram`
+- LLM: `openai`
+- TTS: `elevenlabs`
+
+That gives people one obvious path to get running before they start swapping providers.
+
+## Project Layout
+
+```text
+server-fastapi/
+├── bot.py
+├── classic_route.py
+├── esp32_transport.py
+├── server.py
+├── env.example
+└── models/
+    ├── llm/
+    ├── stt/
+    └── tts/
+```
+
+## How The FastAPI Server Fits Into Elato
+
+Elato has three backend options right now:
+
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/deno`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server/cloudflare`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi`
+
+A clean way to think about them is:
+
+- `Deno`: edge-first, mature provider integrations
+- `Cloudflare`: Workers + Durable Objects + Workers AI
+- `FastAPI`: normal Python server, easy to self-host, easy to reason about
+
+## Quick Start
+
+### 1. Create or activate your Python environment
+
+Use whatever you prefer. If you already use `uv`, that is a good default.
+
+### 2. Install dependencies
+
+This repo uses `pyproject.toml`, so install from that environment rather than a `requirements.txt` file.
+
+With `uv`:
+
+```bash
+cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
+uv sync
+```
+
+Or with plain pip in your venv:
+
+```bash
+cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
+pip install -e .
+```
+
+### 3. Create your env file
+
+Copy the example values from:
+
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/env.example`
+
+Minimum example for the default route:
+
+```env
+DEEPGRAM_API_KEY=your_deepgram_api_key
+OPENAI_API_KEY=your_openai_api_key
+ELEVENLABS_API_KEY=your_elevenlabs_api_key
+
+CURRENT_VOICE_ROUTE=classic
+CLASSIC_STT_PROVIDER=deepgram
+CLASSIC_LLM_PROVIDER=openai
+CLASSIC_TTS_PROVIDER=elevenlabs
+
+ESP32_INPUT_SAMPLE_RATE=16000
+BROWSER_INPUT_SAMPLE_RATE=16000
+AUDIO_OUTPUT_SAMPLE_RATE=24000
+PIPELINE_AUDIO_IN_SAMPLE_RATE=16000
+PIPELINE_AUDIO_OUT_SAMPLE_RATE=24000
+
+ALLOWED_ORIGINS=*
+HOST=0.0.0.0
+PORT=7860
+```
+
+### 4. Run the server
+
+If you use `uv`:
+
+```bash
+cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
+uv run server.py
+```
+
+If you use your activated venv directly:
+
+```bash
+cd /Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi
+python server.py
+```
+
+### 5. Point your ESP32 at the FastAPI backend
+
+Update the firmware config so your hardware connects to this server instead of the Deno or Cloudflare backend.
+
+The ESP32 route is:
+
+```text
+/ws/esp32
+```
+
+For browser or Next.js testing, the server also exposes:
+
+- `/ws/browser`
+- `/ws/nextjs`
+
+## How Provider Selection Works
+
+The classic route reads three env vars:
+
+- `CLASSIC_STT_PROVIDER`
+- `CLASSIC_LLM_PROVIDER`
+- `CLASSIC_TTS_PROVIDER`
+
+So changing providers is just an env change.
+
+Examples:
+
+### OpenAI + Deepgram + ElevenLabs
+
+```env
+CLASSIC_STT_PROVIDER=deepgram
+CLASSIC_LLM_PROVIDER=openai
+CLASSIC_TTS_PROVIDER=elevenlabs
+```
+
+### Whisper + Claude + Cartesia
+
+```env
+CLASSIC_STT_PROVIDER=whisper
+CLASSIC_LLM_PROVIDER=claude
+CLASSIC_TTS_PROVIDER=cartesia
+```
+
+### Deepgram + Gemini + OpenAI TTS
+
+```env
+CLASSIC_STT_PROVIDER=deepgram
+CLASSIC_LLM_PROVIDER=gemini
+CLASSIC_TTS_PROVIDER=openai
+```
+
+## Unified Experience Across Elato
+
+A simple way to keep the product understandable is:
+
+- keep the Next.js frontend focused on character creation and device management
+- keep the ESP32 firmware focused on one transport protocol
+- let users choose one backend runtime:
+  - Deno
+  - Cloudflare
+  - FastAPI
+- inside each backend, expose the same conceptual knobs:
+  - `STT`
+  - `LLM`
+  - `TTS`
+
+That means the hardware story stays stable:
+
+- one firmware
+- one websocket-style mental model
+- three server deployment choices
+
+The cleanest unification strategy is not “every backend supports every provider.”
+It is:
+
+- every backend should expose the same categories
+- each backend should have one recommended default stack
+- advanced users can swap providers later
+
+## Recommended Defaults
+
+If you want a simple opinionated experience for users, keep one default combo per backend.
+
+Suggested defaults:
+
+- `Deno`: OpenAI realtime
+- `Cloudflare`: Workers AI STT/TTS + OpenAI LLM
+- `FastAPI`: Deepgram + OpenAI + ElevenLabs
+
+That gives users one obvious starting point without taking away flexibility.
+
+## Important Files
+
+If you want to change the FastAPI backend, start here:
+
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/server.py`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/classic_route.py`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/esp32_transport.py`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/models/llm/__init__.py`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/models/stt/__init__.py`
+- `/Users/akashdeepdeb/Desktop/Projects/ElatoAI/server-fastapi/models/tts/__init__.py`
+
+## Current Notes
+
+- The filesystem still contains many scaffolded provider modules from the earlier broader experiment.
+- The active provider registry is now intentionally much smaller.
+- That means the codebase stays extensible, but the user-facing default path stays simple.
@@ -0,0 +1,55 @@
+"""Classic STT -> LLM -> TTS pipeline builder."""
+
+from __future__ import annotations
+
+import os
+
+from character_prompt import LANGUAGE_LEARNING_PAL_PROMPT
+from loguru import logger
+from models.llm import create_llm_service
+from models.stt import create_stt_service
+from models.tts import create_tts_service
+from pipecat.audio.vad.silero import SileroVADAnalyzer
+from pipecat.audio.vad.vad_analyzer import VADParams
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import (
+    LLMContextAggregatorPair,
+    LLMUserAggregatorParams,
+)
+
+
+def build_classic_route(input_processor, context: LLMContext):
+    stt_provider = os.getenv("CLASSIC_STT_PROVIDER", "deepgram")
+    llm_provider = os.getenv("CLASSIC_LLM_PROVIDER", "openai")
+    tts_provider = os.getenv("CLASSIC_TTS_PROVIDER", "elevenlabs")
+
+    logger.info(
+        "Building classic route with stt={} llm={} tts={}",
+        stt_provider,
+        llm_provider,
+        tts_provider,
+    )
+
+    stt = create_stt_service(stt_provider)
+    llm = create_llm_service(
+        llm_provider,
+        system_instruction=LANGUAGE_LEARNING_PAL_PROMPT,
+    )
+    tts = create_tts_service(tts_provider)
+
+    user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
+        context,
+        user_params=LLMUserAggregatorParams(
+            vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=1))
+        ),
+    )
+
+    processors = [
+        input_processor,
+        stt,
+        user_aggregator,
+        llm,
+        tts,
+    ]
+
+    return processors, assistant_aggregator
@@ -0,0 +1,27 @@
+DEEPGRAM_API_KEY=your_deepgram_api_key
+OPENAI_API_KEY=your_openai_api_key
+ANTHROPIC_API_KEY=your_anthropic_api_key
+GEMINI_API_KEY=your_gemini_api_key
+XAI_API_KEY=your_xai_api_key
+ELEVENLABS_API_KEY=your_elevenlabs_api_key
+CARTESIA_API_KEY=your_cartesia_api_key
+
+# Classic route providers
+CURRENT_VOICE_ROUTE=classic
+CLASSIC_STT_PROVIDER=deepgram
+CLASSIC_LLM_PROVIDER=openai
+CLASSIC_TTS_PROVIDER=elevenlabs
+
+# Transport and pipeline sample rates
+ESP32_INPUT_SAMPLE_RATE=16000
+BROWSER_INPUT_SAMPLE_RATE=16000
+AUDIO_OUTPUT_SAMPLE_RATE=24000
+PIPELINE_AUDIO_IN_SAMPLE_RATE=16000
+PIPELINE_AUDIO_OUT_SAMPLE_RATE=24000
+
+# Browser / Next.js access
+ALLOWED_ORIGINS=*
+
+# WebSocket server settings
+HOST=0.0.0.0
+PORT=7860
@@ -0,0 +1,20 @@
+"""LLM provider registry."""
+
+from __future__ import annotations
+
+from models._provider_loader import load_provider_factory
+
+LLM_REGISTRY = {
+    "claude": "models.llm.anthropic",
+    "anthropic": "models.llm.anthropic",
+    "gemini": "models.llm.google_gemini",
+    "google_gemini": "models.llm.google_gemini",
+    "google_vertex_ai": "models.llm.google_vertex_ai",
+    "grok": "models.llm.grok",
+    "openai": "models.llm.openai",
+}
+
+
+def create_llm_service(provider_name: str, **kwargs):
+    factory = load_provider_factory(LLM_REGISTRY, provider_name, "LLM")
+    return factory(**kwargs)
@@ -0,0 +1,16 @@
+"""STT provider registry."""
+
+from __future__ import annotations
+
+from models._provider_loader import load_provider_factory
+
+STT_REGISTRY = {
+    "deepgram": "models.stt.deepgram",
+    "openai": "models.stt.openai",
+    "whisper": "models.stt.whisper",
+}
+
+
+def create_stt_service(provider_name: str, **kwargs):
+    factory = load_provider_factory(STT_REGISTRY, provider_name, "STT")
+    return factory(**kwargs)
@@ -0,0 +1,17 @@
+"""TTS provider registry."""
+
+from __future__ import annotations
+
+from models._provider_loader import load_provider_factory
+
+TTS_REGISTRY = {
+    "cartesia": "models.tts.cartesia",
+    "deepgram": "models.tts.deepgram",
+    "elevenlabs": "models.tts.elevenlabs",
+    "openai": "models.tts.openai",
+}
+
+
+def create_tts_service(provider_name: str, **kwargs):
+    factory = load_provider_factory(TTS_REGISTRY, provider_name, "TTS")
+    return factory(**kwargs)