[Bug]: Model selection does not work with gaia talk #124

@sgvandijk

Description

Quick Check ✨

  • I've taken a look at existing issues and discussions
  • I've checked the hardware requirements in the docs
  • This issue relates to GAIA UI (Open-WebUI)

Which version of GAIA are you using?

v0.13.0

Details to help us reproduce the issue

  1. Install Lemonade C++ server on Linux (Arch)
  2. Install Gaia by cloning the latest main (commit 8981b65) and running cd gaia && uv tool install -e '.[talk]'
  3. Run uv run gaia talk --model Qwen3-Coder-30B-A3B-Instruct-GGUF
  4. Wait for 'Listening' to show up, and say some words.

What actually happened?

Gaia gave the following output:

warning: No `requires-python` value found in the workspace. Defaulting to `>=3.12`.
[2025-12-04 14:44:29] | INFO | gaia.audio.audio_client.__init__ | audio_client.py:50 | Audio client initialized.
[2025-12-04 14:44:29] | INFO | gaia.talk.sdk.__init__ | sdk.py:133 | TalkSDK initialized with ChatSDK integration
Starting voice chat...
Say 'stop' to quit or press Ctrl+C
WARNING: Defaulting repo_id to hexgrad/Kokoro-82M. Pass repo_id='hexgrad/Kokoro-82M' to suppress this warning.
/home/sander/src/gaia/.venv/lib/python3.12/site-packages/torch/nn/utils/weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
  WeightNorm.apply(module, name, dim)
Starting voice chat.
Say 'stop' to quit application or 'restart' to clear the chat history.
Press Enter key to stop during audio playback.
WARNING: Defaulting repo_id to hexgrad/Kokoro-82M. Pass repo_id='hexgrad/Kokoro-82M' to suppress this warning.
⠏ Hello, good morning, hello, good morning.
[2025-12-04 14:44:46] | INFO | httpx._send_single_request | _client.py:1025 | HTTP Request: POST http://localhost:8000/api/v0/completions "HTTP/1.1 404 Not Found"
[2025-12-04 14:44:46] | ERROR | gaia.llm.llm_client.generate | llm_client.py:257 | Error generating response from local LLM: Error code: 404 - {'error': {'message': 'The requested endpoint does not exist', 'path': '/api/v0/completions', 'type': 'not_found'}}
[2025-12-04 14:44:46] | ERROR | gaia.chat.sdk.send | sdk.py:318 | Error in send: Error code: 404 - {'error': {'message': 'The requested endpoint does not exist', 'path': '/api/v0/completions', 'type': 'not_found'}}
[2025-12-04 14:44:46] | ERROR | gaia.audio.audio_client._process_audio_wrapper | audio_client.py:425 | Error in process_audio_wrapper: Error code: 404 - {'error': {'message': 'The requested endpoint does not exist', 'path': '/api/v0/completions', 'type': 'not_found'}}
[2025-12-04 14:44:46] | INFO | gaia.audio.audio_client.start_voice_chat | audio_client.py:135 | Voice recording stopped
[2025-12-04 14:44:46] | INFO | gaia.talk.sdk.start_voice_session | sdk.py:259 | Voice chat session ended
[2025-12-04 14:44:46] | INFO | gaia.cli.async_main | cli.py:380 | Voice chat session ended.

Lemonade server gave the following output:

[Server PRE-ROUTE] POST /api/v0/completions
[Server] Switching from 'Qwen3-Coder-30B-A3B-Instruct-GGUF' to 'Qwen2.5-0.5B-Instruct-CPU'
[Server ERROR] Failed to load model: Model not found: Qwen2.5-0.5B-Instruct-CPU
[Server] Error 404: POST /api/v0/completions
[Server] POST /api/v0/completions - 404
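
When triaging similar reports, the model switch recorded in the server log can be extracted programmatically. The following is a minimal sketch that assumes the log line format shown above. The function name `parse_model_switch` is hypothetical and is not part of Lemonade or GAIA.

```python
import re
from typing import Optional, Tuple

# Matches lines of the form seen in this report:
#   [Server] Switching from 'A' to 'B'
# The format is assumed from the log excerpt above, not from Lemonade's docs.
_SWITCH_RE = re.compile(r"\[Server\] Switching from '([^']+)' to '([^']+)'")


def parse_model_switch(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (from_model, to_model) for the first switch line, or None."""
    match = _SWITCH_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


SERVER_LOG = """\
[Server PRE-ROUTE] POST /api/v0/completions
[Server] Switching from 'Qwen3-Coder-30B-A3B-Instruct-GGUF' to 'Qwen2.5-0.5B-Instruct-CPU'
[Server ERROR] Failed to load model: Model not found: Qwen2.5-0.5B-Instruct-CPU
"""

if __name__ == "__main__":
    print(parse_model_switch(SERVER_LOG))
```

If the second name in the tuple differs from the value passed with --model, the request that reached the server presumably did not carry the selected model.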

What did you expect to happen?

I expected the model provided with --model to be respected, so that the Lemonade server would not try to switch to gaia talk's default model.

Other Gaia commands do use the correct model, for instance uv run gaia llm --model Qwen3-Coder-30B-A3B-Instruct-GGUF "Hello, good morning" runs successfully and produces output.
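
The expected behavior can be illustrated with a small sketch in which a model given on the command line takes precedence over the default when the request body is built. This is not GAIA's implementation: the names `resolve_model`, `build_completion_payload`, and `parse_args` are hypothetical, and the payload shape (a `model` and a `prompt` field) is an assumption based on the completions endpoint seen in the logs.

```python
import argparse
from typing import List, Optional

# Default name taken from the server log in this report; where gaia talk
# actually defines its default is not shown here (assumption).
DEFAULT_MODEL = "Qwen2.5-0.5B-Instruct-CPU"


def resolve_model(cli_model: Optional[str], default: str = DEFAULT_MODEL) -> str:
    """Prefer the model given on the command line; fall back to the default."""
    return cli_model if cli_model else default


def build_completion_payload(prompt: str, cli_model: Optional[str]) -> dict:
    """Build a request body whose 'model' field reflects the CLI choice.

    The field names are an assumed, OpenAI-style completions body.
    """
    return {"model": resolve_model(cli_model), "prompt": prompt}


def parse_args(argv: List[str]) -> argparse.Namespace:
    """Parse a hypothetical subset of the talk command's arguments."""
    parser = argparse.ArgumentParser(prog="talk-sketch")
    parser.add_argument("--model", default=None)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["--model", "Qwen3-Coder-30B-A3B-Instruct-GGUF"])
    print(build_completion_payload("Hello, good morning", args.model))
```

The point of the sketch is that the default is applied only when no model was passed, so the value from --model should reach every request the voice pipeline sends.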

How did you install GAIA?

Git Clone

Which mode are you running?

Generic

What's your CPU?

AMD Ryzen AI 9 HX 370

What about your GPU setup?

None

AMD GPU Driver Version

No response

NPU Driver Version

No response

Lemonade Version (if applicable)

9.0.3

What's your operating system?

Arch Linux

Labels

bug: Something isn't working
