feat: add MiniMax provider support (Chat + TTS) by octo-patch · Pull Request #82 · debpalash/OmniVoice-Studio

octo-patch · 2026-05-18T09:22:53Z

Summary

Add MiniMax as a new provider for both Chat (LLM) and TTS capabilities:

Chat Model: MiniMaxBackend — OpenAI-compatible adapter using models MiniMax-M2.7 and MiniMax-M2.7-highspeed. Follows the same registry pattern as the existing OpenAICompatBackend.
TTS: MiniMaxTTSBackend — cloud-based text-to-speech via MiniMax's T2A API (speech-2.8-hd, speech-2.8-turbo). Supports 6 English voices, hex-encoded audio decoding, and auto-appears in Settings → Engines.
Shared API key: Both adapters use MINIMAX_API_KEY (get one at platform.minimax.io)
Tests: 12 new unit tests covering registration, availability, config, and env var overrides — all passing

Environment Variables

Variable	Required	Purpose
`MINIMAX_API_KEY`	Yes	API key for both Chat and TTS
`MINIMAX_BASE_URL`	No	Override default base URL
`MINIMAX_MODEL`	No	Override default Chat model (default: `MiniMax-M2.7`)
`MINIMAX_TTS_MODEL`	No	Override default TTS model (default: `speech-2.8-hd`)
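As a rough illustration of how the four variables in the table might be resolved, here is a minimal sketch. `MiniMaxSettings` and `load_minimax_settings` are hypothetical names that do not appear in the PR. The base-URL default is taken from the TTS diff quoted later in this thread; the LLM backend may use a different default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MiniMaxSettings:
    """Hypothetical container for the variables listed in the table."""
    api_key: Optional[str]
    base_url: str
    chat_model: str
    tts_model: str


def load_minimax_settings(env: Mapping[str, str] = os.environ) -> MiniMaxSettings:
    """Resolve MiniMax settings from an environment mapping.

    Assumption: the base-URL default mirrors the TTS diff in this PR.
    """
    return MiniMaxSettings(
        # An empty string is treated the same as an unset key.
        api_key=env.get("MINIMAX_API_KEY") or None,
        base_url=env.get("MINIMAX_BASE_URL", "https://api.minimax.io").rstrip("/"),
        chat_model=env.get("MINIMAX_MODEL", "MiniMax-M2.7"),
        tts_model=env.get("MINIMAX_TTS_MODEL", "speech-2.8-hd"),
    )
```

Passing the mapping as a parameter keeps the function testable without touching the real process environment.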

API Documentation

Chat (OpenAI Compatible): https://platform.minimax.io/docs/api-reference/text-openai-api
TTS: https://platform.minimax.io/docs/api-reference/speech-t2a-http

Test Plan

Unit tests pass (uv run pytest tests/test_engines.py — 26 passed, 2 skipped)
Integration test: Chat API returns valid response with MiniMax-M2.7
Integration test: TTS API returns valid hex-encoded MP3 audio (27KB for short text)
Existing tests unaffected (only 1 assertion updated to include new provider in registry check)

Summary by CodeRabbit

New Features
- Added MiniMax as a supported language model provider. Configure with MINIMAX_API_KEY environment variable.
- Added MiniMax as a supported text-to-speech provider. Configure with MINIMAX_API_KEY and optional MINIMAX_BASE_URL and MINIMAX_TTS_MODEL variables.
- Both backends are now discoverable and selectable in application settings.

- Add MiniMaxBackend LLM adapter using OpenAI-compatible API (models: MiniMax-M2.7, MiniMax-M2.7-highspeed)
- Add MiniMaxTTSBackend for text-to-speech via MiniMax T2A API (models: speech-2.8-hd, speech-2.8-turbo)
- Add MINIMAX_API_KEY environment variable support (shared by both)
- Add unit tests for registration, availability, and config

coderabbitai · 2026-05-18T09:23:06Z

📝 Walkthrough

Adds MiniMax provider integration for LLM chat completions and TTS audio synthesis. The LLM backend uses OpenAI-compatible chat API with forced temperature=1.0. The TTS backend posts JSON to MiniMax API, decodes hex-encoded MP3 audio, and resamples to 32 kHz mono via torchaudio. Both backends register in module-level maps and include comprehensive test coverage.
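The hex-decoding step mentioned in the walkthrough can be sketched with the standard library alone. This is an illustration, not the PR's code: the response layout (`base_resp.status_code` and `data.audio`) is an assumption about the T2A JSON body, and `extract_audio_bytes` is a hypothetical helper name. Resampling with torchaudio is left out.

```python
import json


def extract_audio_bytes(response_body: bytes) -> bytes:
    """Return raw audio bytes from a T2A-style JSON response.

    Assumed (not confirmed by the PR text) response shape:
      {"base_resp": {"status_code": 0, ...}, "data": {"audio": "<hex string>"}}
    """
    doc = json.loads(response_body)

    # Treat any non-zero status code as an API-level failure.
    status = (doc.get("base_resp") or {}).get("status_code", 0)
    if status != 0:
        raise RuntimeError(f"TTS API returned status_code={status}")

    audio_hex = (doc.get("data") or {}).get("audio")
    if not audio_hex:
        raise RuntimeError("TTS API response contained no audio data")

    # bytes.fromhex raises ValueError if the payload is not valid hex.
    return bytes.fromhex(audio_hex)
```

The returned bytes would then be handed to an MP3 decoder; the walkthrough says the PR uses torchaudio for that and for resampling.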

Changes

MiniMax Backend Support

Layer / File(s)	Summary
MiniMax LLM Backend Implementation and Registration `backend/services/llm_backend.py`	`MiniMaxBackend` class implements the `LLMBackend` interface, checks `MINIMAX_API_KEY` availability, lazily constructs an OpenAI client with `MINIMAX_BASE_URL`, and calls `chat.completions.create()` with `temperature=1.0` forced. Registered in `_REGISTRY` under `"minimax"` key for discoverability.
MiniMax TTS Backend Implementation, Registration, and Config `backend/services/tts_backend.py`	`MiniMaxTTSBackend` class implements `TTSBackend`, gates on `MINIMAX_API_KEY`, builds JSON POST to `/v1/t2a_v2`, decodes hex-encoded MP3 audio from the response, loads and resamples to 32 kHz via `torchaudio`, forces mono output shape `(1, n_samples)`, and validates API status codes and response data. Registered in `_REGISTRY` and added to `_INSTALL_HINTS` for Settings UI configuration hints.
Backend Registry and MiniMax Test Coverage `tests/test_engines.py`	Updated `test_llm_registry_includes_off` to assert `minimax` backend presence. Added test suites for MiniMax LLM (registry, availability gating, model name configuration, env selection) and MiniMax TTS (API-key gating, sample rate assertion, voice/language support, env selection, registry inclusion), with conditional skip markers for missing `openai` package.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A rabbit's ode to new voices far and wide:
MiniMax joins the chorus, with TTS as guide,
Chat flows at temperature one, so precise,
Audio resamples to 32k—crisp and nice!
Tests verify every quirk and detail inside. 🎵

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding MiniMax provider support for both Chat and TTS.
Description check	✅ Passed	The description is comprehensive and follows the template structure with all key sections completed: Summary, Changes (implicit in the detailed breakdown), Type (New feature), Testing (test plan provided), and important implementation details.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/services/tts_backend.py`:
- Around line 1167-1193: The code builds the TTS endpoint by unconditionally
appending "/v1/t2a_v2" to the MINIMAX_BASE_URL (variable base_url) which causes
double "/v1/v1" if MINIMAX_BASE_URL already contains a path; fix by normalizing
base_url before composing the Request: strip trailing slashes and also remove a
trailing "/v1" segment if present (or use urllib.parse.urljoin to safely join
base_url and "/v1/t2a_v2"), then use the normalized base_url in the Request call
(refer to base_url and the Request f"{base_url}/v1/t2a_v2" in tts_backend.py).

In `@tests/test_engines.py`:
- Around line 162-165: The test test_llm_minimax_default_model relies on ambient
MINIMAX_MODEL env state; before creating MiniMaxBackend() ensure the env var is
isolated by removing or overriding it (e.g., call
os.environ.pop("MINIMAX_MODEL", None) or use pytest's
monkeypatch.delenv("MINIMAX_MODEL", raising=False)) so
MiniMaxBackend().model_name equals the hardcoded default "MiniMax-M2.7"; update
the test to remove/delenv the variable just prior to instantiating
MiniMaxBackend to guarantee deterministic behavior.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: eedf3a6b-c1e9-4dd4-940b-36a45f131a87

📥 Commits

Reviewing files that changed from the base of the PR and between 21c338c and 7472d6d.

📒 Files selected for processing (3)

backend/services/llm_backend.py
backend/services/tts_backend.py
tests/test_engines.py

coderabbitai · 2026-05-18T09:26:11Z

+        base_url = os.environ.get("MINIMAX_BASE_URL", "https://api.minimax.io")
+        base_url = base_url.rstrip("/")
+        model = os.environ.get("MINIMAX_TTS_MODEL", "speech-2.8-hd")
+        voice = kw.get("voice", self.VOICES[0])
+        speed = float(kw.get("speed", 1.0))
+
+        payload = json.dumps({
+            "model": model,
+            "text": text,
+            "stream": False,
+            "voice_setting": {
+                "voice_id": voice,
+                "speed": speed,
+                "vol": 1,
+                "pitch": 0,
+            },
+            "audio_setting": {
+                "sample_rate": self.sample_rate,
+                "bitrate": 128000,
+                "format": "mp3",
+                "channel": 1,
+            },
+        }).encode()
+
+        req = urllib.request.Request(
+            f"{base_url}/v1/t2a_v2",
+            data=payload,


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize MINIMAX_BASE_URL before composing the TTS endpoint.

This currently appends /v1/t2a_v2 unconditionally. If MINIMAX_BASE_URL is set to https://api.minimax.io/v1 (the same shape used by the MiniMax LLM backend), requests go to /v1/v1/t2a_v2 and fail.

Suggested fix

-        base_url = os.environ.get("MINIMAX_BASE_URL", "https://api.minimax.io")
-        base_url = base_url.rstrip("/")
+        base_url = os.environ.get("MINIMAX_BASE_URL", "https://api.minimax.io/v1").rstrip("/")
+        endpoint = (
+            f"{base_url}/t2a_v2"
+            if base_url.endswith("/v1")
+            else f"{base_url}/v1/t2a_v2"
+        )
@@
-            f"{base_url}/v1/t2a_v2",
+            endpoint,
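The same normalization can be isolated into a small helper, which makes the edge cases easy to check. This is a sketch of one possible approach, not the fix that was committed; `build_t2a_endpoint` is a hypothetical name.

```python
def build_t2a_endpoint(base_url: str) -> str:
    """Compose the T2A endpoint without producing a doubled /v1 segment.

    Accepts a base URL with or without a trailing slash and with or
    without a trailing /v1 path segment.
    """
    base = base_url.rstrip("/")
    # Drop one trailing "/v1" so the fixed suffix below can always be appended.
    if base.endswith("/v1"):
        base = base[: -len("/v1")]
    return f"{base}/v1/t2a_v2"
```

An alternative is `urllib.parse.urljoin(base_url, "/v1/t2a_v2")`. Because the second argument is an absolute path, it replaces the whole path of the base URL, which also avoids `/v1/v1` but would discard any other path prefix (for example, one added by a proxy).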


coderabbitai · 2026-05-18T09:26:11Z

+def test_llm_minimax_default_model():
+    be = llm_backend.MiniMaxBackend()
+    assert be.model_name == "MiniMax-M2.7"
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Isolate MINIMAX_MODEL in the default-model test.

This test depends on ambient environment state and can fail when MINIMAX_MODEL is already set in CI/dev shells.

Suggested fix

-def test_llm_minimax_default_model():
+def test_llm_minimax_default_model(monkeypatch):
+    monkeypatch.delenv("MINIMAX_MODEL", raising=False)
     be = llm_backend.MiniMaxBackend()
     assert be.model_name == "MiniMax-M2.7"
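Outside pytest, the same isolation can be approximated with `unittest.mock.patch.dict`, which restores the environment when the block exits. The sketch below uses a stand-in class (`FakeMiniMaxBackend`, hypothetical) rather than the real `MiniMaxBackend`, on the assumption that the model name is read from `MINIMAX_MODEL` at construction time.

```python
import os
from unittest import mock


class FakeMiniMaxBackend:
    """Stand-in that mimics reading MINIMAX_MODEL at construction time."""

    def __init__(self) -> None:
        self.model_name = os.environ.get("MINIMAX_MODEL", "MiniMax-M2.7")


def default_model_with_clean_env() -> str:
    """Return the model name seen when MINIMAX_MODEL is not set.

    The environment is copied without MINIMAX_MODEL and swapped in for the
    duration of the with-block; patch.dict restores the original afterwards.
    """
    env_without_override = {
        key: value for key, value in os.environ.items() if key != "MINIMAX_MODEL"
    }
    with mock.patch.dict(os.environ, env_without_override, clear=True):
        return FakeMiniMaxBackend().model_name
```

Inside a pytest suite, `monkeypatch.delenv` as suggested above is the shorter option, since pytest undoes the change automatically at the end of the test.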


debpalash · 2026-05-18T11:40:16Z

Thanks @octo-patch — clean structure (env-gated registration, tests included, follows the existing OpenAICompatBackend pattern). Appreciate the discipline of one feature per PR with proportionate tests.

Project-identity decision (made today after council review of the local-first question your PR raised):

OmniVoice is local-first for TTS, ASR, dubbing, and dictation. Cloud LLM adapters are accepted as opt-in for translation only.

Reasoning: OpenAICompatBackend already established cloud-LLM precedent in this codebase, and translation has no local model matching GPT-4-class quality for the dubbing workflow. Cloud TTS, ASR, and dubbing are different — those are the local-first crown jewels that differentiate this project from cloud SaaS alternatives. We're going to make this explicit in PROJECT.md shortly.

Concretely for this PR:

✅ MiniMaxBackend (LLM/Chat) — want to merge. Two small CodeRabbit items first:
- MINIMAX_BASE_URL join can produce /v1/v1 when users include trailing /v1 — use urllib.parse.urljoin or strip trailing /v1
- test_llm_minimax_default_model should monkeypatch.delenv("MINIMAX_MODEL") to avoid env leakage between tests
❌ MiniMaxTTSBackend — declining. Cloud TTS conflicts with the project's local-first TTS identity. Local engines (Kitten/IndexTTS/Cosy/Supertonic-3-coming) are the value proposition; adding cloud TTS dilutes it.

Path forward: If you can split this PR into LLM-only (drop the TTS backend + its tests), I'll squash-merge the LLM side this week. The MiniMax TTS conversation isn't dead — it's a separate discussion about whether the project ever opens to cloud TTS, which deserves community input via a discussion thread rather than a code review. Happy to help open that thread if you want to start it.

Thanks for your patience while we worked through the policy question — your PR forced a useful conversation.

octo-patch wants to merge 1 commit into debpalash:main from octo-patch:feature/add-minimax-provider
