feat(mimo-tts): support voiceclone model with reference audio by lingyun14beta · Pull Request #9106 · AstrBotDevs/AstrBot

lingyun14beta · 2026-07-01T14:48:52Z

fix #9105

Modifications / 改动点

astrbot/core/config/default.py：新增配置项 mimo-tts-voiceclone-audio（参考音频路径/URL/base64），并更新 mimo-tts-voice 的 hint，说明其在 voiceclone 模型下会被忽略
astrbot/core/provider/sources/mimo_api_common.py：prepare_audio_input() 增加 target_format/preserve_mp3 可选参数（默认值保持原行为，向后兼容）
astrbot/core/provider/sources/mimo_tts_api_source.py：
- 新增 _is_voiceclone_model() 判断当前模型是否为 voiceclone
- 新增 _resolve_voiceclone_voice()，将参考音频转换为 data URL 并按来源缓存，使用 asyncio.Lock 防止并发请求重复转码
- 未配置参考音频时抛出清晰的 MiMoAPIError
- _build_payload() 支持通过 voice_value 覆盖音色字段，写入转换后的 data URL
- 转码时优先保留原始 mp3（preserve_mp3=True），避免转 wav 后体积膨胀超过官方 10MB 限制
- terminate() 中补充清理转码产生的临时文件
dashboard/src/i18n/locales/{en-US,ru-RU,zh-CN}/features/config-metadata.json：同步补充新字段的多语言文案
tests/test_mimo_api_sources.py：新增 7 个测试用例，覆盖模型判定、未配置报错、data URL 写入 payload、缓存生效/换源刷新、mp3 保留、并发调用只转码一次

This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

Checklist / 检查清单

😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能，已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试，并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
/ 我确保没有引入新依赖库，或者引入了新依赖库的同时将其添加到 requirements.txt 和 pyproject.toml 文件相应位置。
😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。

Summary by Sourcery

Add support for MiMo TTS voiceclone model using a configurable reference audio source and integrate it into the existing TTS provider pipeline.

New Features:

Introduce the mimo-tts-voiceclone-audio configuration option to supply a reference audio sample for MiMo voiceclone models.
Support passing a voice value (e.g., data URL from reference audio) into the TTS payload, overriding the standard voice setting when needed.

Enhancements:

Cache converted voiceclone reference audio with concurrency control to avoid redundant conversions and temporary file leaks.
Extend audio input preparation to accept optional target format and mp3-preservation flags for more flexible handling of input audio.
Ensure cleanup of temporary files generated during voiceclone audio preparation when the provider is terminated.

Documentation:

Update configuration metadata and dashboard i18n strings to document the new voiceclone reference audio option and clarify when mimo-tts-voice is ignored.

Tests:

Add tests covering voiceclone model detection, error handling when reference audio is missing, payload voice field behavior, caching semantics, mp3 preservation, and concurrent conversion behavior.

sourcery-ai

Hey - I've found 2 issues, and left some high level feedback:

The _is_voiceclone_model check relies on a substring match in model_name; consider tightening this logic (e.g., explicit allowed model list or regex) to avoid accidentally treating future models with similar names as voiceclone-capable.
The error message in _resolve_voiceclone_voice hardcodes mimo-v2.5-tts-voiceclone even though the check is generic; consider interpolating self.model_name instead so the message stays accurate if other voiceclone models are used.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `_is_voiceclone_model` check relies on a substring match in `model_name`; consider tightening this logic (e.g., explicit allowed model list or regex) to avoid accidentally treating future models with similar names as voiceclone-capable.
- The error message in `_resolve_voiceclone_voice` hardcodes `mimo-v2.5-tts-voiceclone` even though the check is generic; consider interpolating `self.model_name` instead so the message stays accurate if other voiceclone models are used.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/sources/mimo_tts_api_source.py" line_range="102-111" />
<code_context>
+        async with self._voiceclone_lock:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider reusing the lock in terminate() to avoid potential races with voiceclone cleanup.

terminate() calls cleanup_files(self._voiceclone_cleanup_paths) without holding _voiceclone_lock, while _resolve_voiceclone_voice() accesses and mutates _voiceclone_cleanup_paths under that lock. If terminate() runs while voiceclone resolution is in progress, this can cause races or inconsistent cleanup. Please guard terminate()’s access to _voiceclone_cleanup_paths with the same lock (or otherwise prevent concurrent access).

Suggested implementation:

```python
        async with self._voiceclone_lock:
            cleanup_files(self._voiceclone_cleanup_paths)

```

To safely reuse `_voiceclone_lock` in `terminate()`:
1. Ensure `terminate()` is an `async def` so that `async with self._voiceclone_lock:` is valid. If `terminate()` must remain synchronous, instead move the cleanup into an `async` helper (e.g. `_async_terminate_cleanup`) that uses the lock, and have `terminate()` schedule/await that helper where appropriate.
2. Verify that all other accesses and mutations of `self._voiceclone_cleanup_paths` (if any) are also guarded by `_voiceclone_lock` to fully prevent races.
</issue_to_address>

### Comment 2
<location path="tests/test_mimo_api_sources.py" line_range="321-224" />
<code_context>
+async def test_mimo_tts_voiceclone_preserves_mp3_instead_of_forcing_wav(monkeypatch):
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test that verifies temporary files for voiceclone are cleaned up on terminate

The current tests for `_resolve_voiceclone_voice` and `preserve_mp3` cover caching/conversion, but don’t exercise the new `self._voiceclone_cleanup_paths` tracking or the `cleanup_files` calls on refresh/`terminate()`. Please add a test that monkeypatches `cleanup_files` in `mimo_tts_api_source`, performs a voiceclone conversion to populate `_voiceclone_cleanup_paths`, calls `terminate()`, and asserts that `cleanup_files` is called with the expected paths and that the cleanup list is cleared. This will directly validate the new temp-file cleanup behavior for voiceclone.

Suggested implementation:

```python
    captured_kwargs: dict = {}


@pytest.mark.asyncio
async def test_mimo_tts_voiceclone_temp_files_cleaned_on_terminate(monkeypatch):
    """voiceclone 的临时文件应在 terminate() 时被清理"""
    provider = _make_tts_provider(
        {
            "model": "mimo-v2.5-tts-voiceclone",
            "mimo-tts-voiceclone-audio": "/tmp/reference_voice.mp3",
            "mimo-tts-seed-text": "",
        }
    )

    # monkeypatch cleanup_files 以便观察调用情况
    cleanup_calls = []

    # 注意：mimo_tts_api_source 的导入路径可能需要根据项目结构调整
    import mimo_tts_api_source  # type: ignore

    async def fake_cleanup_files(paths):
        # 记录被请求清理的路径
        cleanup_calls.append(list(paths))

    monkeypatch.setattr(mimo_tts_api_source, "cleanup_files", fake_cleanup_files)

    # 触发一次 voiceclone 转换以填充 _voiceclone_cleanup_paths
    provider.voiceclone_audio_source = "/tmp/voice_a.mp3"
    await provider._resolve_voiceclone_voice()

    # 确认有待清理的临时文件被记录
    assert getattr(provider, "_voiceclone_cleanup_paths", []), "_voiceclone_cleanup_paths 应在转换后包含临时文件路径"

    # 记录当前待清理路径，用于后续断言
    paths_to_cleanup = list(provider._voiceclone_cleanup_paths)

    # 调用 terminate，应触发 cleanup_files 并清空 _voiceclone_cleanup_paths
    await provider.terminate()

    # 验证 cleanup_files 被调用且传入的路径与记录的一致
    assert cleanup_calls == [paths_to_cleanup]

    # 验证清理列表已被清空
    assert provider._voiceclone_cleanup_paths == []

```

1. 如果 `mimo_tts_api_source` 在测试文件中已有导入（例如 `from src.mimo_tts_api_source import cleanup_files` 或类似），请删除该测试中的局部 `import mimo_tts_api_source` 并改为使用正确的模块引用路径，例如：
   - `import src.mimo_tts_api_source as mimo_tts_api_source`，或
   - `from src import mimo_tts_api_source`。
2. 确保 `provider` 实例在项目中确实存在私有属性 `self._voiceclone_cleanup_paths`，且 `terminate()` 会调用 `cleanup_files(self._voiceclone_cleanup_paths)` 并在完成后清空该列表。如果实现略有不同（例如属性名或清空逻辑不一致），请相应调整测试中的属性访问和断言。
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-07-01T14:51:30Z

+        async with self._voiceclone_lock:
+            if (
+                self._voiceclone_cache_data_url is not None
+                and self._voiceclone_cache_source == self.voiceclone_audio_source
+            ):
+                return self._voiceclone_cache_data_url
+
+            try:
+                data_url, cleanup_paths = await prepare_audio_input(
+                    self.voiceclone_audio_source,


suggestion (bug_risk): Consider reusing the lock in terminate() to avoid potential races with voiceclone cleanup.

terminate() calls cleanup_files(self._voiceclone_cleanup_paths) without holding _voiceclone_lock, while _resolve_voiceclone_voice() accesses and mutates _voiceclone_cleanup_paths under that lock. If terminate() runs while voiceclone resolution is in progress, this can cause races or inconsistent cleanup. Please guard terminate()’s access to _voiceclone_cleanup_paths with the same lock (or otherwise prevent concurrent access).

Suggested implementation:

async with self._voiceclone_lock: cleanup_files(self._voiceclone_cleanup_paths)

To safely reuse _voiceclone_lock in terminate():

Ensure terminate() is an async def so that async with self._voiceclone_lock: is valid. If terminate() must remain synchronous, instead move the cleanup into an async helper (e.g. _async_terminate_cleanup) that uses the lock, and have terminate() schedule/await that helper where appropriate.

Verify that all other accesses and mutations of self._voiceclone_cleanup_paths (if any) are also guarded by _voiceclone_lock to fully prevent races.

sourcery-ai · 2026-07-01T14:51:30Z

+        with pytest.raises(MiMoAPIError, match="mimo-tts-voiceclone-audio"):
+            await provider.get_audio("hello")
+    finally:
+        await provider.terminate()


suggestion (testing): Consider adding a test that verifies temporary files for voiceclone are cleaned up on terminate

The current tests for _resolve_voiceclone_voice and preserve_mp3 cover caching/conversion, but don’t exercise the new self._voiceclone_cleanup_paths tracking or the cleanup_files calls on refresh/terminate(). Please add a test that monkeypatches cleanup_files in mimo_tts_api_source, performs a voiceclone conversion to populate _voiceclone_cleanup_paths, calls terminate(), and asserts that cleanup_files is called with the expected paths and that the cleanup list is cleared. This will directly validate the new temp-file cleanup behavior for voiceclone.

Suggested implementation:

captured_kwargs: dict = {} @pytest.mark.asyncio async def test_mimo_tts_voiceclone_temp_files_cleaned_on_terminate(monkeypatch): """voiceclone 的临时文件应在 terminate() 时被清理""" provider = _make_tts_provider( { "model": "mimo-v2.5-tts-voiceclone", "mimo-tts-voiceclone-audio": "/tmp/reference_voice.mp3", "mimo-tts-seed-text": "", } ) # monkeypatch cleanup_files 以便观察调用情况 cleanup_calls = [] # 注意：mimo_tts_api_source 的导入路径可能需要根据项目结构调整 import mimo_tts_api_source # type: ignore async def fake_cleanup_files(paths): # 记录被请求清理的路径 cleanup_calls.append(list(paths)) monkeypatch.setattr(mimo_tts_api_source, "cleanup_files", fake_cleanup_files) # 触发一次 voiceclone 转换以填充 _voiceclone_cleanup_paths provider.voiceclone_audio_source = "/tmp/voice_a.mp3" await provider._resolve_voiceclone_voice() # 确认有待清理的临时文件被记录 assert getattr(provider, "_voiceclone_cleanup_paths", []), "_voiceclone_cleanup_paths 应在转换后包含临时文件路径" # 记录当前待清理路径，用于后续断言 paths_to_cleanup = list(provider._voiceclone_cleanup_paths) # 调用 terminate，应触发 cleanup_files 并清空 _voiceclone_cleanup_paths await provider.terminate() # 验证 cleanup_files 被调用且传入的路径与记录的一致 assert cleanup_calls == [paths_to_cleanup] # 验证清理列表已被清空 assert provider._voiceclone_cleanup_paths == []

如果 mimo_tts_api_source 在测试文件中已有导入（例如 from src.mimo_tts_api_source import cleanup_files 或类似），请删除该测试中的局部 import mimo_tts_api_source 并改为使用正确的模块引用路径，例如：

import src.mimo_tts_api_source as mimo_tts_api_source，或

from src import mimo_tts_api_source。

确保 provider 实例在项目中确实存在私有属性 self._voiceclone_cleanup_paths，且 terminate() 会调用 cleanup_files(self._voiceclone_cleanup_paths) 并在完成后清空该列表。如果实现略有不同（例如属性名或清空逻辑不一致），请相应调整测试中的属性访问和断言。

gemini-code-assist · 2026-07-01T15:00:03Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

lingyun14beta · 2026-07-01T15:02:59Z

/gemini review

gemini-code-assist · 2026-07-01T15:04:59Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

feat(mimo-tts): support voiceclone model with reference audio

19d1443

dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels Jul 1, 2026

sourcery-ai Bot reviewed Jul 1, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat(mimo-tts): support voiceclone model with reference audio#9106

feat(mimo-tts): support voiceclone model with reference audio#9106
lingyun14beta wants to merge 1 commit into
AstrBotDevs:masterfrom
lingyun14beta:mimo

lingyun14beta commented Jul 1, 2026 •

edited by sourcery-ai Bot

Loading

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

sourcery-ai Bot Jul 1, 2026

Uh oh!

sourcery-ai Bot Jul 1, 2026

Uh oh!

gemini-code-assist Bot commented Jul 1, 2026

Uh oh!

lingyun14beta commented Jul 1, 2026

Uh oh!

gemini-code-assist Bot commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Uh oh!

Conversation

lingyun14beta commented Jul 1, 2026 • edited by sourcery-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Modifications / 改动点

Screenshots or Test Results / 运行截图或测试结果

Checklist / 检查清单

Summary by Sourcery

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot commented Jul 1, 2026

Uh oh!

lingyun14beta commented Jul 1, 2026

Uh oh!

gemini-code-assist Bot commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lingyun14beta commented Jul 1, 2026 •

edited by sourcery-ai Bot

Loading