
feat: add MiniMax provider support (Chat + TTS) #82

Open
octo-patch wants to merge 1 commit into debpalash:main from octo-patch:feature/add-minimax-provider

Conversation

@octo-patch octo-patch commented May 18, 2026

Summary

Add MiniMax as a new provider for both Chat (LLM) and TTS capabilities:

  • Chat Model: MiniMaxBackend — OpenAI-compatible adapter using models MiniMax-M2.7 and MiniMax-M2.7-highspeed. Follows the same registry pattern as the existing OpenAICompatBackend.
  • TTS: MiniMaxTTSBackend — cloud-based text-to-speech via MiniMax's T2A API (speech-2.8-hd, speech-2.8-turbo). Supports 6 English voices, hex-encoded audio decoding, and auto-appears in Settings → Engines.
  • Shared API key: Both adapters use MINIMAX_API_KEY (get one at platform.minimax.io)
  • Tests: 12 new unit tests covering registration, availability, config, and env var overrides — all passing

Environment Variables

| Variable | Required | Purpose |
| --- | --- | --- |
| MINIMAX_API_KEY | Yes | API key for both Chat and TTS |
| MINIMAX_BASE_URL | No | Override default base URL |
| MINIMAX_MODEL | No | Override default Chat model (default: MiniMax-M2.7) |
| MINIMAX_TTS_MODEL | No | Override default TTS model (default: speech-2.8-hd) |
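
As a rough illustration of how these variables might be resolved, here is a minimal sketch. The helper name `load_minimax_config` and the returned keys are hypothetical and do not come from the PR; only the variable names and the two model defaults are taken from the table. The base URL default is left to each backend because the table does not state it.

```python
import os
from typing import Mapping, Optional


def load_minimax_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve MiniMax settings from environment variables (illustrative helper)."""
    env = os.environ if env is None else env
    api_key = env.get("MINIMAX_API_KEY", "")
    return {
        # Both adapters are gated on the shared key being present.
        "available": bool(api_key),
        # None means "not overridden"; each backend would apply its own default.
        "base_url": env.get("MINIMAX_BASE_URL"),
        "chat_model": env.get("MINIMAX_MODEL", "MiniMax-M2.7"),
        "tts_model": env.get("MINIMAX_TTS_MODEL", "speech-2.8-hd"),
    }
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to exercise in tests without touching the real environment.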

API Documentation

Test Plan

  • Unit tests pass (uv run pytest tests/test_engines.py — 26 passed, 2 skipped)
  • Integration test: Chat API returns valid response with MiniMax-M2.7
  • Integration test: TTS API returns valid hex-encoded MP3 audio (27KB for short text)
  • Existing tests unaffected (only 1 assertion updated to include new provider in registry check)

Summary by CodeRabbit

  • New Features
    • Added MiniMax as a supported language model provider. Configure with MINIMAX_API_KEY environment variable.
    • Added MiniMax as a supported text-to-speech provider. Configure with MINIMAX_API_KEY and optional MINIMAX_BASE_URL and MINIMAX_TTS_MODEL variables.
    • Both backends are now discoverable and selectable in application settings.


- Add MiniMaxBackend LLM adapter using OpenAI-compatible API
  (models: MiniMax-M2.7, MiniMax-M2.7-highspeed)
- Add MiniMaxTTSBackend for text-to-speech via MiniMax T2A API
  (models: speech-2.8-hd, speech-2.8-turbo)
- Add MINIMAX_API_KEY environment variable support (shared by both)
- Add unit tests for registration, availability, and config
Contributor

coderabbitai Bot commented May 18, 2026

📝 Walkthrough


Adds MiniMax provider integration for LLM chat completions and TTS audio synthesis. The LLM backend uses OpenAI-compatible chat API with forced temperature=1.0. The TTS backend posts JSON to MiniMax API, decodes hex-encoded MP3 audio, and resamples to 32 kHz mono via torchaudio. Both backends register in module-level maps and include comprehensive test coverage.
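
The hex-decoding step mentioned in the walkthrough can be sketched as follows. This is an illustration only: the function name `decode_t2a_audio` is hypothetical, and the response field names (`base_resp.status_code`, `base_resp.status_msg`, `data.audio`) are assumptions about the T2A response shape that this page does not confirm. Resampling with torchaudio is deliberately left out.

```python
from typing import Any, Mapping


def decode_t2a_audio(response: Mapping[str, Any]) -> bytes:
    """Return raw audio bytes from a parsed T2A JSON response.

    Assumed shape (not confirmed by this PR page):
    {"base_resp": {"status_code": 0, "status_msg": "..."}, "data": {"audio": "<hex>"}}
    """
    base_resp = response.get("base_resp") or {}
    status = base_resp.get("status_code", 0)
    if status != 0:
        # Surface API-level failures before trying to decode anything.
        raise RuntimeError(
            f"T2A request failed: {status} {base_resp.get('status_msg', '')}"
        )

    hex_audio = (response.get("data") or {}).get("audio")
    if not hex_audio:
        raise RuntimeError("T2A response contained no audio data")

    # bytes.fromhex raises ValueError if the string is not valid hex.
    return bytes.fromhex(hex_audio)
```

The returned bytes would be the encoded MP3 payload, which a caller could then hand to an audio loader.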

Changes

MiniMax Backend Support

| Layer / File(s) | Summary |
| --- | --- |
| MiniMax LLM Backend Implementation and Registration (backend/services/llm_backend.py) | MiniMaxBackend class implements the LLMBackend interface, checks MINIMAX_API_KEY availability, lazily constructs an OpenAI client with MINIMAX_BASE_URL, and calls chat.completions.create() with temperature=1.0 forced. Registered in _REGISTRY under "minimax" key for discoverability. |
| MiniMax TTS Backend Implementation, Registration, and Config (backend/services/tts_backend.py) | MiniMaxTTSBackend class implements TTSBackend, gates on MINIMAX_API_KEY, builds JSON POST to /v1/t2a_v2, decodes hex-encoded MP3 audio from the response, loads and resamples to 32 kHz via torchaudio, forces mono output shape (1, n_samples), and validates API status codes and response data. Registered in _REGISTRY and added to _INSTALL_HINTS for Settings UI configuration hints. |
| Backend Registry and MiniMax Test Coverage (tests/test_engines.py) | Updated test_llm_registry_includes_off to assert minimax backend presence. Added test suites for MiniMax LLM (registry, availability gating, model name configuration, env selection) and MiniMax TTS (API-key gating, sample rate assertion, voice/language support, env selection, registry inclusion), with conditional skip markers for missing openai package. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A rabbit's ode to new voices far and wide:
MiniMax joins the chorus, with TTS as guide,
Chat flows at temperature one, so precise,
Audio resamples to 32k—crisp and nice!
Tests verify every quirk and detail inside. 🎵

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding MiniMax provider support for both Chat and TTS. |
| Description check | ✅ Passed | The description is comprehensive and follows the template structure with all key sections completed: Summary, Changes (implicit in the detailed breakdown), Type (New feature), Testing (test plan provided), and important implementation details. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/services/tts_backend.py`:
- Around line 1167-1193: The code builds the TTS endpoint by unconditionally
appending "/v1/t2a_v2" to the MINIMAX_BASE_URL (variable base_url) which causes
double "/v1/v1" if MINIMAX_BASE_URL already contains a path; fix by normalizing
base_url before composing the Request: strip trailing slashes and also remove a
trailing "/v1" segment if present (or use urllib.parse.urljoin to safely join
base_url and "/v1/t2a_v2"), then use the normalized base_url in the Request call
(refer to base_url and the Request f"{base_url}/v1/t2a_v2" in tts_backend.py).

In `@tests/test_engines.py`:
- Around line 162-165: The test test_llm_minimax_default_model relies on ambient
MINIMAX_MODEL env state; before creating MiniMaxBackend() ensure the env var is
isolated by removing or overriding it (e.g., call
os.environ.pop("MINIMAX_MODEL", None) or use pytest's
monkeypatch.delenv("MINIMAX_MODEL", raising=False)) so
MiniMaxBackend().model_name equals the hardcoded default "MiniMax-M2.7"; update
the test to remove/delenv the variable just prior to instantiating
MiniMaxBackend to guarantee deterministic behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: eedf3a6b-c1e9-4dd4-940b-36a45f131a87

📥 Commits

Reviewing files that changed from the base of the PR and between 21c338c and 7472d6d.

📒 Files selected for processing (3)
  • backend/services/llm_backend.py
  • backend/services/tts_backend.py
  • tests/test_engines.py

Comment on lines +1167 to +1193
        base_url = os.environ.get("MINIMAX_BASE_URL", "https://api.minimax.io")
        base_url = base_url.rstrip("/")
        model = os.environ.get("MINIMAX_TTS_MODEL", "speech-2.8-hd")
        voice = kw.get("voice", self.VOICES[0])
        speed = float(kw.get("speed", 1.0))

        payload = json.dumps({
            "model": model,
            "text": text,
            "stream": False,
            "voice_setting": {
                "voice_id": voice,
                "speed": speed,
                "vol": 1,
                "pitch": 0,
            },
            "audio_setting": {
                "sample_rate": self.sample_rate,
                "bitrate": 128000,
                "format": "mp3",
                "channel": 1,
            },
        }).encode()

        req = urllib.request.Request(
            f"{base_url}/v1/t2a_v2",
            data=payload,

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize MINIMAX_BASE_URL before composing the TTS endpoint.

This currently appends /v1/t2a_v2 unconditionally. If MINIMAX_BASE_URL is set to https://api.minimax.io/v1 (the same shape used by the MiniMax LLM backend), requests go to /v1/v1/t2a_v2 and fail.

Suggested fix
-        base_url = os.environ.get("MINIMAX_BASE_URL", "https://api.minimax.io")
-        base_url = base_url.rstrip("/")
+        base_url = os.environ.get("MINIMAX_BASE_URL", "https://api.minimax.io/v1").rstrip("/")
+        endpoint = (
+            f"{base_url}/t2a_v2"
+            if base_url.endswith("/v1")
+            else f"{base_url}/v1/t2a_v2"
+        )
@@
-            f"{base_url}/v1/t2a_v2",
+            endpoint,
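
The normalization this comment asks for can also be isolated in a small pure function, which makes the `/v1/v1` case easy to test. The sketch below is one possible approach, not the code in the PR; the name `t2a_endpoint` is hypothetical.

```python
def t2a_endpoint(base_url: str) -> str:
    """Build the T2A endpoint URL, tolerating a base URL with or without /v1."""
    base = base_url.rstrip("/")
    if base.endswith("/v1"):
        # Drop the trailing "/v1" so it is not duplicated below.
        base = base[: -len("/v1")].rstrip("/")
    return f"{base}/v1/t2a_v2"
```

Both `https://api.minimax.io` and `https://api.minimax.io/v1/` then resolve to the same endpoint. Calling `urllib.parse.urljoin(base_url, "/v1/t2a_v2")` would also avoid the doubled segment, but because the second argument is an absolute path it replaces any existing path on the base URL, which matters if the base URL points at a proxy with its own path prefix.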

Comment thread tests/test_engines.py
Comment on lines +162 to +165
def test_llm_minimax_default_model():
    be = llm_backend.MiniMaxBackend()
    assert be.model_name == "MiniMax-M2.7"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Isolate MINIMAX_MODEL in the default-model test.

This test depends on ambient environment state and can fail when MINIMAX_MODEL is already set in CI/dev shells.

Suggested fix
-def test_llm_minimax_default_model():
+def test_llm_minimax_default_model(monkeypatch):
+    monkeypatch.delenv("MINIMAX_MODEL", raising=False)
     be = llm_backend.MiniMaxBackend()
     assert be.model_name == "MiniMax-M2.7"
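
To show why the isolation matters, here is a self-contained sketch that reproduces the same idea with only the standard library. `ExampleBackend` is a hypothetical stand-in that reads MINIMAX_MODEL at construction time; it is an assumption about how the real MiniMaxBackend behaves, and `unittest.mock.patch.dict` is used here in place of pytest's `monkeypatch` so the example runs without pytest.

```python
import os
from unittest import mock


class ExampleBackend:
    """Hypothetical stand-in that reads its model name from the environment."""

    def __init__(self) -> None:
        self.model_name = os.environ.get("MINIMAX_MODEL", "MiniMax-M2.7")


def default_model_with_clean_env() -> str:
    """Return the default model name regardless of ambient MINIMAX_MODEL."""
    # patch.dict snapshots os.environ and restores it when the block exits.
    with mock.patch.dict(os.environ):
        os.environ.pop("MINIMAX_MODEL", None)
        return ExampleBackend().model_name
```

Inside a pytest test, `monkeypatch.delenv("MINIMAX_MODEL", raising=False)` achieves the same effect and is undone automatically at teardown.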

@debpalash
Owner

Thanks @octo-patch — clean structure (env-gated registration, tests included, follows the existing OpenAICompatBackend pattern). Appreciate the discipline of one feature per PR with proportionate tests.

Project-identity decision (made today after council review of the local-first question your PR raised):

OmniVoice is local-first for TTS, ASR, dubbing, and dictation. Cloud LLM adapters are accepted as opt-in for translation only.

Reasoning: OpenAICompatBackend already established cloud-LLM precedent in this codebase, and translation has no local model matching GPT-4-class quality for the dubbing workflow. Cloud TTS, ASR, and dubbing are different — those are the local-first crown jewels that differentiate this project from cloud SaaS alternatives. We're going to make this explicit in PROJECT.md shortly.

Concretely for this PR:

  • MiniMaxBackend (LLM/Chat) — want to merge. Two small CodeRabbit items first:
    • MINIMAX_BASE_URL join can produce /v1/v1 when users include trailing /v1 — use urllib.parse.urljoin or strip trailing /v1
    • test_llm_minimax_default_model should monkeypatch.delenv("MINIMAX_MODEL") to avoid env leakage between tests
  • MiniMaxTTSBackend — declining. Cloud TTS conflicts with the project's local-first TTS identity. Local engines (Kitten/IndexTTS/Cosy/Supertonic-3-coming) are the value proposition; adding cloud TTS dilutes it.

Path forward: If you can split this PR into LLM-only (drop the TTS backend + its tests), I'll squash-merge the LLM side this week. The MiniMax TTS conversation isn't dead — it's a separate discussion about whether the project ever opens to cloud TTS, which deserves community input via a discussion thread rather than a code review. Happy to help open that thread if you want to start it.

Thanks for your patience while we worked through the policy question — your PR forced a useful conversation.
