Add local Kokoro TTS server #1208
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
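Walkthrough
This PR adds a local lightweight Kokoro TTS: a FastAPI WebSocket service with multiple engine implementations, a Kokoro CLI, startup and packaging scripts, backend routing and voice parsing, frontend settings with voice-list fetching, and i18n and .gitignore updates.
Changes
Local Kokoro TTS service and client integration
Sequence Diagram(s)
sequenceDiagram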
participant UI
participant LocalKokoroServer
participant KokoroEngine
participant NEKO_Backend
UI->>LocalKokoroServer: WS connect /v1/audio/speech/stream
UI->>LocalKokoroServer: send {"voice": "...", "speed": ...}
UI->>LocalKokoroServer: stream {"text": "..."} / {"event":"end"}
LocalKokoroServer->>KokoroEngine: synthesize(text, voice, speed)
KokoroEngine-->>LocalKokoroServer: PCM bytes (s16le, mono, 22.05kHz)
LocalKokoroServer->>UI: stream PCM chunk (s16le, 22.05kHz)
NEKO_Backend->>UI: resample to 48kHz
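The client side of the exchange in the diagram can be sketched in Python. This is a minimal sketch, not the PR's implementation: the helper names `client_messages` and `pcm_duration_seconds` are hypothetical, and the message fields are assumed to be exactly those shown in the diagram (`voice`, `speed`, `text`, `event`).

```python
import json
from typing import Iterable, Iterator

SAMPLE_RATE_HZ = 22_050  # from the diagram: s16le, mono, 22.05 kHz
BYTES_PER_SAMPLE = 2     # s16le is signed 16-bit little-endian


def client_messages(voice: str, speed: float, texts: Iterable[str]) -> Iterator[str]:
    """Yield the JSON text frames in the order the diagram shows (hypothetical helper)."""
    # 1. Configuration frame: voice selector and speed.
    yield json.dumps({"voice": voice, "speed": speed})
    # 2. One frame per text chunk.
    for text in texts:
        yield json.dumps({"text": text}, ensure_ascii=False)
    # 3. End-of-input marker.
    yield json.dumps({"event": "end"})


def pcm_duration_seconds(chunks: Iterable[bytes]) -> float:
    """Duration of the received mono s16le PCM chunks at 22.05 kHz."""
    total_bytes = sum(len(chunk) for chunk in chunks)
    return total_bytes / (BYTES_PER_SAMPLE * SAMPLE_RATE_HZ)
```

Each yielded string would be sent as one WebSocket text frame; the binary frames coming back can be passed to `pcm_duration_seconds` to sanity-check the audio length before resampling.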
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ Passed checks (4 passed)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ce234ad6c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Actionable comments posted: 8
🧹 Nitpick comments (1)
templates/api_key_settings.html (1)
593-612: ⚡ Quick win
The new Local Kokoro TTS block is missing data-i18n and accessibility attributes
None of the user-visible text in this block carries a data-i18n attribute, which is inconsistent with the convention used by the template's other blocks (for example, the GPT-SoVITS block). The Apply button is also missing the data-i18n-title, data-i18n-aria, and aria-label attributes, so it does not meet the repository's accessibility conventions.
✨ Suggested fix: add the missing i18n and accessibility attributes
- <div class="field-row" id="localKokoroTtsConfig"
+ <div class="field-row" id="localKokoroTtsConfig"
       style="background: #f7fbff; border: 1px solid #d7eefc; border-radius: 6px; padding: 12px 16px; margin: 12px 16px;">
-     <div style="font-weight: 600; margin-bottom: 10px;">Local Kokoro TTS</div>
+     <div style="font-weight: 600; margin-bottom: 10px;" data-i18n="api.localKokoroTtsTitle">Local Kokoro TTS</div>
      <div style="display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 10px; align-items: end; max-width: 760px;">
          <div>
-             <label for="localKokoroWsUrl" style="display: block; margin-bottom: 6px;">WebSocket URL</label>
+             <label for="localKokoroWsUrl" style="display: block; margin-bottom: 6px;" data-i18n="api.localKokoroWsUrl">WebSocket URL</label>
              <input type="text" id="localKokoroWsUrl" placeholder="ws://127.0.0.1:50000" value="ws://127.0.0.1:50000" style="width: 100%;">
          </div>
          <div>
-             <label for="localKokoroVoiceSelect" style="display: block; margin-bottom: 6px;">Voice</label>
+             <label for="localKokoroVoiceSelect" style="display: block; margin-bottom: 6px;" data-i18n="api.localKokoroVoice">Voice</label>
              <select id="localKokoroVoiceSelect" style="width: 100%;">
                  <option value="kokoro:zf_001">zf_001</option>
              </select>
          </div>
-         <button type="button" class="gsv-refresh-btn" onclick="applyLocalKokoroTtsConfig()">Apply</button>
+         <button type="button" class="gsv-refresh-btn" onclick="applyLocalKokoroTtsConfig()"
+                 data-i18n="api.localKokoroApply"
+                 data-i18n-title="api.localKokoroApply"
+                 data-i18n-aria="api.localKokoroApply"
+                 aria-label="Apply">Apply</button>
      </div>
  </div>
The new keys (api.localKokoroTtsTitle, api.localKokoroWsUrl, api.localKokoroVoice, api.localKokoroApply) also need to be added to the corresponding i18n language files.
Based on retrieved learnings: the repository uses data-i18n-aria="<key>" to drive accessibility labels (not data-i18n-aria-label), and icon/text buttons should set data-i18n-title, data-i18n-aria, and a hard-coded aria-label as a fallback.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@templates/api_key_settings.html` around lines 593 - 612, The new Local Kokoro TTS block is missing data-i18n and accessibility attributes; add data-i18n on the visible text nodes and add data-i18n-title, data-i18n-aria and an explicit aria-label on the Apply button (referencing the DOM IDs localKokoroTtsConfig, localKokoroWsUrl, localKokoroVoice, localKokoroVoiceSelect and the button that calls applyLocalKokoroTtsConfig), and then add the corresponding i18n keys api.localKokoroTtsTitle, api.localKokoroWsUrl, api.localKokoroVoice, api.localKokoroApply to the project's language files so the UI and ARIA labels are localized.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@local_server/local_tts_server/probe.py`:
- Line 47: The default for the CLI --voice argument is set to "tone:default"
which doesn't match the service protocol prefixes; update the
parser.add_argument("--voice", default="tone:default") call in probe.py to use a
working service prefix (e.g., "kokoro:zf_001") or the server's default so the
probe works out-of-the-box; ensure the change only updates the default string
passed to parser.add_argument("--voice", ...) and keep the flag name unchanged.
In `@local_server/local_tts_server/README.md`:
- Around line 19-20: The example JSON values under the "text" key are garbled;
replace the malformed strings (e.g. currently shown as 浣犲ソ锛? and 鎴戞槸 NEKO銆?)
with properly encoded UTF-8 Chinese examples such as "你好。" and "我是 NEKO。" so the
README’s JSON examples display correctly and can be copied for validation.
In `@local_server/local_tts_server/server.py`:
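- Around line 362-367: The English phoneme literals in en_callable were corrupted
by an encoding error (they show up as k藞Ok蓹晒O / s藞Ol), so the special-case
pronunciation of Kokoro/Sol is wrong in Chinese mode; open en_callable and
replace the returned strings with the correct IPA/phoneme literals (matching the
"kˈOkəɹO" and "sˈOl" used in kokoro_cli.py), keep the rest of the logic (the
fallback uses next(self._en_pipeline(text_part)).phonemes), and leave the
function name en_callable and the call to self._en_pipeline unchanged.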
- Around line 700-724: The logs currently print the raw synthesis text in two
places (the initial logger.info("Synthesis text: %s", text) and the final
logger.info call), which may leak sensitive user content; remove or redact any
logging of the full text and only log non-sensitive metadata—update the code
around logger.info/engine.synthesize (functions/variables: logger.info,
engine.synthesize, SYNTHESIS_MODE, result.device, spec.voice, len(text),
elapsed, audio_duration, rtf) so the first "Synthesis text" log is removed (or
replaced with a length-only message) and the final logger.info call omits the
text parameter, keeping engine.name, SYNTHESIS_MODE, "streaming"/"merged",
result.device, spec.voice, len(text), elapsed, audio_duration, and rtf only.
- Around line 190-204: The code builds cmd from template and passes it to
subprocess.run with shell=True, allowing command injection via voice/speed;
update the call to use an argv list and shell=False instead of formatting
user-controlled values into a shell string (use subprocess.run([sys.executable,
"<script>", "--text-file", str(text_path), "--out-file", str(out_path),
"--voice", voice, "--speed", speed], shell=False, ...)), or if invoking a single
script isn't possible, strictly validate/whitelist voice and speed (e.g., allow
only alphanumeric/known voices and numeric speeds) before inserting into
template; locate and modify the template, cmd construction, and the
subprocess.run invocation to implement the argv-list approach (or add
validation) so shell=True with user input is removed.
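The argv-list approach from the last item above might look like the following. This is a sketch under stated assumptions: `build_tts_argv`, `run_tts`, the voice whitelist pattern, and the speed bounds are all hypothetical; only the flag names (`--text-file`, `--out-file`, `--voice`, `--speed`) come from the review comment.

```python
import re
import subprocess
import sys
from pathlib import Path
from typing import List

# Hypothetical whitelist: letters, digits, and a few separators seen in voice ids.
VOICE_RE = re.compile(r"[A-Za-z0-9_.:-]{1,64}")


def build_tts_argv(script: Path, text_path: Path, out_path: Path,
                   voice: str, speed: float) -> List[str]:
    """Build an argv list so user-controlled values never pass through a shell."""
    if not VOICE_RE.fullmatch(voice):
        raise ValueError(f"unsupported voice: {voice!r}")
    speed = float(speed)
    if not 0.25 <= speed <= 4.0:  # assumed bounds; NaN also fails this check
        raise ValueError(f"speed out of range: {speed!r}")
    return [
        sys.executable, str(script),
        "--text-file", str(text_path),
        "--out-file", str(out_path),
        "--voice", voice,
        "--speed", f"{speed:g}",
    ]


def run_tts(argv: List[str], timeout: float = 120.0) -> subprocess.CompletedProcess:
    """Run the synthesis script without a shell; raises on a non-zero exit code."""
    return subprocess.run(argv, shell=False, check=True,
                          capture_output=True, timeout=timeout)
```

Because each value is a separate argv element, a voice such as `zf_001; rm -rf /` is rejected by the whitelist and, even if it were not, would reach the script as a single literal argument rather than being interpreted by a shell.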
In `@local_server/local_tts_server/smoke_test.py`:
- Around line 81-85: The baseline_cases list currently treats
"tone:default"/"tone:high" as always-present tests which fails when the tone
engine isn't registered; change the smoke_test logic that builds baseline_cases
(variable baseline_cases) to first query the server /health endpoint and only
include tone-* entries if the server reports LOCAL_TTS_ENABLE_TONE (or the tone
capability) as enabled; alternatively default to using "kokoro" entries instead
of tone when tone is not enabled. Update the code that assembles baseline_cases
so it conditionally appends the ("tone:...", ...) tuples based on the /health
response or ENV flag rather than assuming tone is always present.
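One way to assemble the cases conditionally is sketched below. The shape of the `/health` payload is an assumption here (a JSON object whose `engines` field is a collection of engine names), and the function names, the `(voice, text)` tuple layout, and the sample texts are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Case = Tuple[str, str]  # (voice selector, text to synthesize); assumed layout

KOKORO_CASES: List[Case] = [("kokoro:zf_001", "Hello from Kokoro.")]
TONE_CASES: List[Case] = [
    ("tone:default", "Tone engine, default pitch."),
    ("tone:high", "Tone engine, high pitch."),
]


def tone_enabled(health: Dict[str, Any]) -> bool:
    """True when the parsed /health payload lists a 'tone' engine (assumed shape)."""
    engines = health.get("engines") or []
    return "tone" in engines


def build_baseline_cases(health: Dict[str, Any]) -> List[Case]:
    """Always include kokoro cases; add tone cases only when the server reports them."""
    cases = list(KOKORO_CASES)
    if tone_enabled(health):
        cases.extend(TONE_CASES)
    return cases
```

The smoke test would fetch `/health` once, parse the JSON, and pass the resulting dict to `build_baseline_cases` instead of hard-coding the tone entries.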
In `@static/js/api_key_settings.js`:
- Around line 54-65: localKokoroHttpBaseFromWs currently swallows URL parsing
errors and returns the local default, causing refreshLocalKokoroVoiceOptions to
show voices from the wrong source; change localKokoroHttpBaseFromWs (and the
similar functions at the other occurrences) to not fallback to
'http://127.0.0.1:50000' on catch but instead return a clear null/empty value
(e.g., null or '') so callers can detect failure; update callers like
refreshLocalKokoroVoiceOptions to check for null/empty and explicitly surface an
error or abort the refresh rather than silently using the local default.
In `@static/js/character_card_manager.js`:
- Around line 5520-5538: _fetchPanelKokoroVoices currently fetches
`${httpBase}/v1/voices` without a timeout which can hang the whole chain; add an
AbortController with a short timeout (same approach as used for GSV) and pass
its signal into the fetch call to `${httpBase}/v1/voices`, clear the timeout on
success, and catch AbortError to treat it as "not available" (log/ignore) so the
panel doesn't block when the local Kokoro service is unresponsive; update the
fetch call that assigns voicesResp and ensure the controller/signal and timer
are properly scoped and cleaned up.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 12d29e40-fa03-46a0-ac7b-95725c93dd20
📒 Files selected for processing (15)
- .gitignore
- local_server/cosyvoice_server/test_ws_client.py
- local_server/local_tts_server/README.md
- local_server/local_tts_server/kokoro_cli.py
- local_server/local_tts_server/kokoro_models/.gitkeep
- local_server/local_tts_server/probe.py
- local_server/local_tts_server/server.py
- local_server/local_tts_server/smoke_test.py
- local_server/local_tts_server/start_kokoro_server.bat
- local_server/local_tts_server/start_kokoro_server.ps1
- main_logic/tts_client.py
- static/js/api_key_settings.js
- static/js/character_card_manager.js
- templates/api_key_settings.html
- utils/custom_tts_adapter.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f4dda09ee0
Actionable comments posted: 5
🧹 Nitpick comments (5)
local_server/local_tts_server/start_kokoro_server.ps1 (3)
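376-381: 💤 Low value
The spawned server process is never used for lifecycle management
Start-Process -PassThru returns $serverProcess, but the script never touches it again. When the health check times out (the throw at line 397) or the launcher exits abnormally, the child PowerShell window and the uvicorn process inside it keep running in the background, and are only cleaned up by Stop-ExistingLocalTtsIfNeeded on the next launch. If the window is kept on purpose so the user can read the logs (the comment at line 406 suggests this), consider simply dropping -PassThru so static analysis stops flagging an unused variable; for more robust cleanup, at least stop the process via $serverProcess.Id before throw "Kokoro local TTS server failed to become ready".
♻️ Simple change (keep the window, drop the unused variable)
 if (-not $existingLocalTtsKept) {
-    $serverProcess = Start-Process -FilePath "powershell" `
-        -ArgumentList $childArgs `
-        -WorkingDirectory $repoRoot `
-        -PassThru
+    Start-Process -FilePath "powershell" `
+        -ArgumentList $childArgs `
+        -WorkingDirectory $repoRoot | Out-Null
 }
🤖 Prompt for AI Agents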
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@local_server/local_tts_server/start_kokoro_server.ps1` around lines 376 - 381, The spawned server process is assigned to $serverProcess via Start-Process -PassThru but never managed; either remove -PassThru to avoid an unused variable if you intentionally leave the window open, or implement lifecycle cleanup: keep -PassThru, then on failure paths (before the throw "Kokoro local TTS server failed to become ready") call Stop-Process (or use $serverProcess.Id with Stop-Process/Wait-Process) to terminate the child, and ensure any existing cleanup logic (e.g., Stop-ExistingLocalTtsIfNeeded) is invoked as a fallback.
165-172: 💤 Low value
The CUDA index is hard-coded to cu128
--index-url https://download.pytorch.org/whl/cu128 pins the CUDA version to 12.8. With older GPU drivers (CUDA runtime < 12.8) the downloaded wheel cannot run and fails at import, and the script then falls into the cudaInstallFailedMarker path. Consider either exposing an environment variable such as LOCAL_TTS_TORCH_INDEX_URL as an override, or stating explicitly in the README that only CUDA 12.8+ drivers are supported, to head off future issue reports.
♻️ Suggested change
+ $torchIndex = if ($env:LOCAL_TTS_TORCH_INDEX_URL) { $env:LOCAL_TTS_TORCH_INDEX_URL } else { "https://download.pytorch.org/whl/cu128" }
  $installed = Invoke-UvChecked @(
      "pip", "install",
      "--python", $venvPython,
      "--force-reinstall",
-     "--index-url", "https://download.pytorch.org/whl/cu128",
+     "--index-url", $torchIndex,
      "torch"
  )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@local_server/local_tts_server/start_kokoro_server.ps1` around lines 165 - 172, The hard-coded CUDA torch index URL (--index-url "https://download.pytorch.org/whl/cu128") causes installs to require CUDA 12.8; change the install step that calls Invoke-UvChecked (the pip install invocation that sets $installed using $venvPython) to allow override via an environment variable (e.g. LOCAL_TTS_TORCH_INDEX_URL) and fall back to the current cu128 URL if unset; update the pip install arguments to use that variable and ensure any failure still writes to the existing cudaInstallFailedMarker path.
17-17: ⚡ Quick win
The error message when uv is missing could be friendlier
(Get-Command uv -ErrorAction Stop).Source throws the default PowerShell error when the user has not installed uv, which is unfriendly to end users. Since this script is run by local users, consider catching the failure and showing a clear bilingual (Chinese and English) message that tells the user where to install uv.
♻️ Suggested change
-$uvExe = (Get-Command uv -ErrorAction Stop).Source
+$uvCmd = Get-Command uv -ErrorAction SilentlyContinue
+if (-not $uvCmd) {
+    Write-Host "uv is required but not found in PATH." -ForegroundColor Red
+    Write-Host "Install it from https://github.com/astral-sh/uv before running this script." -ForegroundColor Yellow
+    exit 1
+}
+$uvExe = $uvCmd.Source
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@local_server/local_tts_server/start_kokoro_server.ps1` at line 17, Wrap the existing (Get-Command uv -ErrorAction Stop).Source call in a try/catch inside start_kokoro_server.ps1: attempt to resolve uv into $uvExe, and in the catch emit a clear bilingual message (Chinese + English) telling the user that "uv" is not found and where to install it (for example, pointing to https://github.com/astral-sh/uv), then exit with a non-zero status; reference the $uvExe variable and the Get-Command uv call when making the change.
local_server/local_tts_server/server.py (2)
169-178: 💤 Low value
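The docstring still mentions LOCAL_TTS_KOKORO_CMD, but the KokoroEngine path never reaches it
The class docstring says "Configure per model with: LOCAL_TTS_KOKORO_CMD ...", but the kokoro engine registered by build_engines() is KokoroEngine() (in-process), not CommandWavEngine, so LOCAL_TTS_KOKORO_CMD has no effect at all on the kokoro model. start_kokoro_server.ps1 lines 50-53 still set this environment variable, which is misleading. Consider removing the kokoro line from the docstring and no longer setting the unused variable in the PowerShell script, so that future maintainers are not left confused while debugging.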
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@local_server/local_tts_server/server.py` around lines 169 - 178, Remove the misleading reference to LOCAL_TTS_KOKORO_CMD from the module/class docstring and any related comments because build_engines() registers kokoro using KokoroEngine() (an in-process engine) rather than CommandWavEngine, so that env var is unused; also update start_kokoro_server.ps1 to stop setting the LOCAL_TTS_KOKORO_CMD environment variable (remove the set lines around that variable) to avoid confusion for future maintainers.
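502-502: 💤 Low value
Consider migrating to the lifespan context manager
@app.on_event("startup") has been marked deprecated since FastAPI 0.93+. It still works for now (it was reimplemented for compatibility), but it triggers a DeprecationWarning. The official documentation now recommends the lifespan parameter together with an async context manager, which gives a clearer code structure and better error handling.
Because start_kokoro_server.ps1 does not pin a FastAPI version, freshly installed environments will see the deprecation warning. This is not a functional problem and does not block merging the PR, but switching to the new style later will make maintenance easier.
🤖 Prompt for AI Agents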
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@local_server/local_tts_server/server.py` at line 502, The current use of `@app.on_event`("startup") in server.py is deprecated in FastAPI 0.93+; replace it with the recommended lifespan async context manager pattern by defining an async def lifespan(app): ... using "asynccontextmanager" and passing lifespan=lifespan to the FastAPI(...) constructor, move the startup logic currently inside the function registered with `@app.on_event`("startup") (e.g., the Kokoro server startup sequence) into the enter block and any shutdown cleanup into the exit block, and remove the `@app.on_event`("startup") decorator so the app is constructed with lifespan instead to avoid DeprecationWarning and improve error handling.
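The lifespan pattern described above can be sketched with the standard library alone. This is an illustration, not the PR's code: the `state` dict and the body of the startup and shutdown steps are hypothetical stand-ins for whatever the current startup handler does, and the FastAPI wiring is shown only as a comment so the sketch runs without FastAPI installed.

```python
import asyncio
from contextlib import asynccontextmanager

# Hypothetical stand-in for the server's real startup state.
state = {"engines_ready": False}


@asynccontextmanager
async def lifespan(app):
    # Enter block: what used to live in the @app.on_event("startup") handler.
    state["engines_ready"] = True
    try:
        yield
    finally:
        # Exit block: shutdown cleanup runs here, even if the app errors out.
        state["engines_ready"] = False


# With FastAPI installed, the app would be constructed as:
#     from fastapi import FastAPI
#     app = FastAPI(lifespan=lifespan)


async def demo() -> bool:
    """Drive the context manager by hand to show the startup/shutdown ordering."""
    async with lifespan(app=None):
        return state["engines_ready"]
```

Running `asyncio.run(demo())` enters the startup half, observes the ready flag, and then runs the cleanup half on exit, which is the same ordering FastAPI applies around the application's lifetime.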
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@local_server/local_tts_server/README.md`:
- Around line 74-80: The README currently shows only Windows "set KEY=VALUE"
examples for environment variables; add equivalent Linux/macOS export examples
using "export KEY=VALUE" for each variable shown (e.g.,
LOCAL_TTS_KOKORO_MODEL_DIR, LOCAL_TTS_KOKORO_REPO_ID,
LOCAL_TTS_KOKORO_DEFAULT_VOICE, LOCAL_TTS_KOKORO_CMD, LOCAL_TTS_MELOTTS_CMD,
LOCAL_TTS_CHATTTS_CMD) so users on Linux/macOS can copy/paste, and place them
alongside the Windows examples with a short note indicating platform usage.
In `@local_server/local_tts_server/server.py`:
- Around line 67-73: The CORS and auth posture is too permissive: change the
app.add_middleware(CORSMiddleware, ...) config to disallow credentials (set
allow_credentials=False) and restrict allow_origins to a narrow list (e.g.,
localhost origins) instead of ["*"]; additionally add authentication/origin
validation to the WebSocket handler for "/v1/audio/speech/stream" (e.g., require
a secret token or validate the Origin header) so anonymous web pages cannot open
TTS streams, and harden or sanitize any use of CommandWavEngine to avoid shell
injection by treating inputs as data rather than shell commands and
validating/escaping them.
- Around line 668-700: The websocket_endpoint currently appends incoming text
chunks (text_chunk) into text_parts until msg["event"] == "end" with no
size/count/time limits, so add defensive limits: enforce a per-message length
cap (reject/close if len(text_chunk) > PER_MESSAGE_LIMIT), enforce a cumulative
character cap (track total_chars and close if total_chars > TOTAL_LIMIT), and
enforce a max chunk count or max receive iterations (track chunk_count and
break/close when > MAX_CHUNKS); apply these checks inside the receive loop
before appending to text_parts and send a clear close with reason; keep the
existing behavior for parse_voice/ warmup_engine_for_voice and use constants
like PER_MESSAGE_LIMIT, TOTAL_LIMIT, MAX_CHUNKS to make limits easy to adjust.
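The three limits from the last item could be factored into a small accumulator, sketched below. The constant names follow the review comment; the numeric values, the `TextAccumulator` class, and the `TextLimitExceeded` exception are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

PER_MESSAGE_LIMIT = 2_000   # max characters in one text chunk (assumed value)
TOTAL_LIMIT = 20_000        # max cumulative characters per request (assumed value)
MAX_CHUNKS = 256            # max number of text chunks per request (assumed value)


class TextLimitExceeded(ValueError):
    """Raised when an incoming chunk would break one of the limits."""


@dataclass
class TextAccumulator:
    parts: List[str] = field(default_factory=list)
    total_chars: int = 0

    def add(self, chunk: str) -> None:
        """Check all limits before appending, so a rejected chunk changes nothing."""
        if len(chunk) > PER_MESSAGE_LIMIT:
            raise TextLimitExceeded("text chunk too long")
        if len(self.parts) + 1 > MAX_CHUNKS:
            raise TextLimitExceeded("too many text chunks")
        if self.total_chars + len(chunk) > TOTAL_LIMIT:
            raise TextLimitExceeded("total text too long")
        self.parts.append(chunk)
        self.total_chars += len(chunk)

    def text(self) -> str:
        return "".join(self.parts)
```

Inside the receive loop, the handler would call `add()` for each text chunk and, on `TextLimitExceeded`, close the socket with a reason; WebSocket close code 1009 (message too big) is one reasonable choice.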
In `@local_server/local_tts_server/start_kokoro_server.ps1`:
- Line 207: The script builds $launcherScript (launcher.py) but later invokes it
with & python $launcherScript which uses the system Python and fails to find
project modules like config and utils.*; modify the script so that before
invoking launcher.py you activate or source the main project virtualenv (the
repo's venv, not just .venv-local-tts) or adjust the environment (e.g., set
PATH/PYTHONHOME/PYTHONPATH) so the main project's site-packages are on Python's
import path; update the invocation around the & python $launcherScript call to
ensure the main venv is activated (or add a clear comment/instruction that users
must install/activate the main project's venv) so launcher.py can import project
modules successfully.
In `@static/js/api_key_settings.js`:
- Around line 95-126: refreshLocalKokoroVoiceOptions can produce a bogus
"kokoro:undefined" option and the app currently always triggers Local Kokoro
requests/side-effects; fix both by (1) hard-guarding voiceId in
refreshLocalKokoroVoiceOptions: derive voiceId from v.voice_id || v.id || v.name
and explicitly skip if that derived id is falsy before prefixing with "kokoro:",
and keep the existing check that filters "kokoro:"; and (2) stop unconditional
Local Kokoro activation by removing the always-run call to
loadLocalKokoroTtsConfig from loadCurrentApiKey and change the localKokoroWsUrl
input handler that calls applyLocalKokoroTtsConfig to either debounce the input
or split applyLocalKokoroTtsConfig into two actions (an explicit "use Local
Kokoro" action that flips ttsModelProvider/ttsModelApiKey/GPT-SoVITS and a
separate "refresh voices" action that only calls
refreshLocalKokoroVoiceOptions/loadLocalKokoroTtsConfig).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 0e45f044-36d9-492d-85d7-96c5cc7277cb
📒 Files selected for processing (12)
- .gitignore
- local_server/local_tts_server/README.md
- local_server/local_tts_server/kokoro_cli.py
- local_server/local_tts_server/kokoro_models/.gitkeep
- local_server/local_tts_server/server.py
- local_server/local_tts_server/start_kokoro_server.bat
- local_server/local_tts_server/start_kokoro_server.ps1
- main_logic/tts_client.py
- static/js/api_key_settings.js
- static/js/character_card_manager.js
- templates/api_key_settings.html
- utils/custom_tts_adapter.py
✅ Files skipped from review due to trivial changes (3)
- local_server/local_tts_server/kokoro_models/.gitkeep
- local_server/local_tts_server/start_kokoro_server.bat
- .gitignore
🚧 Files skipped from review as they are similar to previous changes (3)
- main_logic/tts_client.py
- local_server/local_tts_server/kokoro_cli.py
- templates/api_key_settings.html
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
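Wide-open CORS combined with allow_credentials is problematic
Per the specification, browsers reject the combination of allow_origins=["*"] and allow_credentials=True outright, so the actual behavior is somewhat odd; moreover, this server binds only to 127.0.0.1 by default, so in principle there is no need to enable credentials for cross-origin access.
More seriously, the WS endpoint "/v1/audio/speech/stream" has no authentication at all. Combined with wide-open CORS, any page in a local browser that can run JavaScript (a malicious ad, a compromised web page) can connect directly to ws://127.0.0.1:50000 on this machine and issue TTS requests. Together with the shell injection in CommandWavEngine, this forms a complete chain that can be exploited from the local web.
Suggestion:
🔒 Tighten CORS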
 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],
-    allow_credentials=True,
+    allow_origins=[],  # a local server needs no cross-origin access; list specific origins if it does
+    allow_credentials=False,
     allow_methods=["*"],
     allow_headers=["*"],
 )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@local_server/local_tts_server/server.py` around lines 67 - 73, The CORS and
auth posture is too permissive: change the app.add_middleware(CORSMiddleware,
...) config to disallow credentials (set allow_credentials=False) and restrict
allow_origins to a narrow list (e.g., localhost origins) instead of ["*"];
additionally add authentication/origin validation to the WebSocket handler for
"/v1/audio/speech/stream" (e.g., require a secret token or validate the Origin
header) so anonymous web pages cannot open TTS streams, and harden or sanitize
any use of CommandWavEngine to avoid shell injection by treating inputs as data
rather than shell commands and validating/escaping them.
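The Origin validation mentioned in the prompt above might be sketched as follows. The function name, the allowed-host set, and the policy of accepting requests without an Origin header are all assumptions for illustration; browsers send Origin on WebSocket handshakes, while non-browser clients often do not.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical allow-list: only pages served from this machine.
ALLOWED_HOSTS = {"127.0.0.1", "localhost", "::1"}


def origin_allowed(origin: Optional[str]) -> bool:
    """Decide whether a WebSocket handshake's Origin header is acceptable."""
    if not origin:
        # Assumed policy: no Origin header means a non-browser client.
        return True
    try:
        parts = urlsplit(origin)
    except ValueError:
        # Malformed origins (for example, a broken IPv6 literal) are rejected.
        return False
    return parts.scheme in {"http", "https"} and parts.hostname in ALLOWED_HOSTS
```

In the handler this would be checked before accepting the connection, for example by reading the `origin` header from the handshake and closing with code 1008 (policy violation) when `origin_allowed` returns False. A shared secret token would be a stronger control than an Origin check alone.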
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7ed3e85a22
Actionable comments posted: 2
♻️ Duplicate comments (1)
local_server/local_tts_server/start_kokoro_server.ps1 (1)
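401-403: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
The main launcher still runs under the system Python and may fail on import
Line 402 still uses python rather than the prepared virtual-environment interpreter; when the system environment lacks the repository's dependencies, launcher.py fails to start. Consider reusing $venvPython directly (or explicitly activating the main project's venv).
Minimal fix that can be applied directly
-& python $launcherScript
+& $venvPython $launcherScript
 exit $LASTEXITCODE
🤖 Prompt for AI Agents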
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@local_server/local_tts_server/start_kokoro_server.ps1` around lines 401 - 403, The launcher is invoked with the system "python" causing failures when dependencies are only in the prepared venv; update the invocation that runs $launcherScript to use the prepared virtualenv interpreter ($venvPython) (or explicitly activate the project venv before launching) so the script runs with the correct environment; replace the call that currently uses "python $launcherScript" with an invocation using $venvPython and preserve the subsequent exit $LASTEXITCODE handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@local_server/local_tts_server/README.md`:
- Around line 53-57: The README's Voice Selector list is missing the "melo:"
alias while the code's LOCAL_LIGHTWEIGHT_TTS_PREFIXES includes "melo:"; update
the Voice Selector examples to include "melo:" alongside kokoro:, melotts:, and
chattts: so the docs match the code (ensure the README Voice Selector section
explicitly lists "melo:" and any description mirrors how
LOCAL_LIGHTWEIGHT_TTS_PREFIXES and the voice selection logic treat that alias).
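- Line 1: The file begins with a UTF-8 BOM (U+FEFF) character, which breaks
Markdown rendering or tool processing; delete the BOM from the first line of
README.md (i.e., remove the leading `\uFEFF`), make sure the file is saved as
UTF-8 without BOM ("UTF-8 without BOM"/"UTF-8"), and recommit after saving to
verify that it renders correctly.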
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 4a2b9b91-c07f-483c-be5b-3f192a5e4822
📒 Files selected for processing (20)
- .gitignore
- local_server/local_tts_server/README.md
- local_server/local_tts_server/kokoro_cli.py
- local_server/local_tts_server/kokoro_models/.gitkeep
- local_server/local_tts_server/server.py
- local_server/local_tts_server/start_kokoro_server.bat
- local_server/local_tts_server/start_kokoro_server.ps1
- main_logic/tts_client.py
- static/js/api_key_settings.js
- static/js/character_card_manager.js
- static/locales/en.json
- static/locales/es.json
- static/locales/ja.json
- static/locales/ko.json
- static/locales/pt.json
- static/locales/ru.json
- static/locales/zh-CN.json
- static/locales/zh-TW.json
- templates/api_key_settings.html
- utils/custom_tts_adapter.py
✅ Files skipped from review due to trivial changes (5)
- local_server/local_tts_server/kokoro_models/.gitkeep
- local_server/local_tts_server/start_kokoro_server.bat
- static/locales/zh-CN.json
- static/locales/es.json
- static/locales/zh-TW.json
🚧 Files skipped from review as they are similar to previous changes (7)
- .gitignore
- templates/api_key_settings.html
- static/js/api_key_settings.js
- local_server/local_tts_server/kokoro_cli.py
- static/js/character_card_manager.js
- main_logic/tts_client.py
- local_server/local_tts_server/server.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8c46d01178
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@local_server/local_tts_server/start_kokoro_server.ps1`:
- Around line 274-283: The current -KeepExisting branch unconditionally keeps
whatever process owns the port; change the branch so it only preserves the
existing process when the Test-LocalTtsHealth result indicates a real Kokoro TTS
(i.e. use the existing $isLocalTts boolean) before setting
$script:existingLocalTtsKept and returning; likewise update the later
readiness/wait logic (the code that currently only checks /health HTTP 200) to
reuse the same Test-LocalTtsHealth check and require $health.status -eq 'ok' and
$health.engines (the same signature used to compute $isLocalTts) instead of a
bare 200 so we only accept a true Kokoro backend.
In `@static/js/character_card_manager.js`:
- Around line 5559-5562: The voice_id normalization is inconsistent: in
voices.map currently you only add the "kokoro:" prefix if voice_id is missing,
which causes mismatches with other logic that checks startsWith('kokoro:');
update the mapping in the voices handling (the voices.map that builds
voice_id/name and the similar mapping at the other occurrence around the
5595–5605 block) to always normalize bare IDs by ensuring any id or name that
does not start with "kokoro:" is prepended with "kokoro:" when producing
voice_id (while leaving explicit voice_id values that already include the prefix
unchanged), so selected items consistently match the startsWith('kokoro:')
branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ed0a904d-a2c6-4edf-817c-805d521671fd
📒 Files selected for processing (4)
- local_server/local_tts_server/README.md
- local_server/local_tts_server/start_kokoro_server.ps1
- static/js/api_key_settings.js
- static/js/character_card_manager.js
🚧 Files skipped from review as they are similar to previous changes (2)
- local_server/local_tts_server/README.md
- static/js/api_key_settings.js
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 99a02eed9b
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0956b20dac
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f51c5888d7
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2002c004d3
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 42d8cb0277
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7b584b0f58
♻️ Duplicate comments (1)
local_server/local_tts_server/start_kokoro_server.ps1 (1)
300-303: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Avoid using torch.cuda.is_available() as the test for "is torch already installed"
When import torch succeeds but CUDA is unavailable, this path still runs --force-reinstall torch, which can trigger repeated reinstalls at startup, even though a CPU fallback path already exists later on.
🐾 Suggested minimal change:
- $torchProbe = Test-VenvPython "import torch; print(torch.__version__); print(torch.version.cuda); print(torch.cuda.is_available())"
- if ($torchProbe.Ok -and ($torchProbe.Text -match "True")) {
+ $torchProbe = Test-VenvPython "import torch; print(torch.__version__)"
+ if ($torchProbe.Ok) {
      return
  }
#!/bin/bash
# Read-only check: confirm the current branch still uses CUDA availability as the "torch already installed" test and still force-reinstalls
rg -n 'Ensure-CudaTorchIfNeeded|torch\.cuda\.is_available\(\)|-match "True"|--force-reinstall' local_server/local_tts_server/start_kokoro_server.ps1
Also applies to: 318-325
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@local_server/local_tts_server/start_kokoro_server.ps1`:
- Around line 300-303: The current guard uses Test-VenvPython output
($torchProbe) and matches torch.cuda.is_available() to decide whether torch is
"installed", causing unnecessary --force-reinstall when CUDA is simply
unavailable; change the check in the torch probe block (the call to
Test-VenvPython that runs "import torch; print(...);
print(torch.cuda.is_available())" and the subsequent if) to only consider import
success (e.g., $torchProbe.Ok or absence of import error) as evidence that torch
is installed, and do not require $torchProbe.Text -match "True" (the
torch.cuda.is_available() result) to skip reinstall; apply the same fix to the
duplicate block referenced by Ensure-CudaTorchIfNeeded (lines around the other
torch probe at 318-325) so CUDA availability no longer gates the "already
installed" decision.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 1105e977-98da-4185-bb21-d8864054b7d8
📒 Files selected for processing (3)
- local_server/local_tts_server/package_kokoro_server.ps1
- local_server/local_tts_server/start_kokoro_server.ps1
- utils/custom_tts_adapter.py
# Conflicts: # static/js/api_key_settings.js
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f33601872c
result = await loop.run_in_executor(
    None,
    lambda: engine.synthesize(text, voice=spec.voice, speed=speed),
)
Honor LOCAL_TTS_SYNTHESIS_MODE when sending WS audio
LOCAL_TTS_SYNTHESIS_MODE=streaming currently has no effect on synthesis behavior: the handler always waits for engine.synthesize(...) to finish before sending any bytes, so output is still fully buffered/merged. This makes the /health streaming_output flag misleading and keeps first-byte latency high even when operators explicitly enable streaming mode. Add a real mode branch (or remove/rename the flag) so runtime behavior matches the advertised setting.
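One way to picture the requested mode branch is a small generator that either forwards audio segments as they arrive or merges them into a single payload. This is only a sketch: `iter_ws_audio` is a hypothetical helper, it assumes the engine can hand back PCM in per-segment pieces, and it treats any mode other than `"streaming"` as buffered, which may not match the server's actual mode names.

```python
from typing import Iterable, Iterator


def iter_ws_audio(segments: Iterable[bytes], mode: str) -> Iterator[bytes]:
    """Yield PCM payloads to send over the WebSocket (hypothetical helper).

    In "streaming" mode each non-empty segment is forwarded on its own, so
    the first bytes can leave before synthesis has finished. Any other mode
    is treated as buffered: all segments are merged into one payload.
    """
    if mode == "streaming":
        for segment in segments:
            if segment:  # skip empty segments rather than sending zero-length frames
                yield segment
    else:
        merged = b"".join(segments)
        if merged:
            yield merged
```

With a branch like this, a `/health` flag such as `streaming_output` can be derived from the same mode value, so the reported capability and the runtime behavior cannot disagree.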
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@local_server/local_tts_server/server.py`:
- Around line 760-771: The chunk_count increment causes an off-by-one when an
"end" message is received because chunk_count is incremented before checking
msg.get("event") == "end"; change the logic in the loop so you only increment or
enforce MAX_WS_TEXT_CHUNKS when actually accepting a text chunk (e.g. move the
chunk_count += 1 and the MAX_WS_TEXT_CHUNKS check to occur after verifying
msg.get("event") != "end" or only when a non-empty text_chunk is appended), and
keep validations for received_chars/MAX_WS_TEXT_CHARS tied to when
text_parts.append(text_chunk) happens; update references to chunk_count,
MAX_WS_TEXT_CHUNKS, msg.get("event"), text_chunk, received_chars, and
text_parts.append accordingly.
- Around line 742-745: The parsed speed value from config.get("speed") needs
validation to reject NaN/Inf and non-positive numbers; after converting to float
in the existing try/except around speed (the variable and config.get("speed")
call), check using math.isfinite(speed) and speed > 0.0 and if the check fails,
set speed = 1.0 (keeping the existing except fallback for TypeError/ValueError);
ensure to import math if not already present and keep behavior deterministic by
always falling back to 1.0 for invalid values.
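The two findings above can be sketched as a pair of small helpers. This is an illustrative outline under assumptions, not the code in server.py: `parse_speed` and `collect_text` are hypothetical names, the limit values are placeholders, and the message shape (`{"text": ...}` / `{"event": "end"}`) follows the sequence diagram in the walkthrough.

```python
import math

# Placeholder limits for illustration; the real values live in server.py.
MAX_WS_TEXT_CHUNKS = 3
MAX_WS_TEXT_CHARS = 100


def parse_speed(raw, default=1.0):
    """Return a finite, positive speed, falling back to `default` otherwise."""
    try:
        speed = float(raw)
    except (TypeError, ValueError):
        return default
    if not math.isfinite(speed) or speed <= 0.0:
        return default  # rejects NaN, +/-Inf, zero and negative values
    return speed


def collect_text(messages):
    """Join text chunks, counting only accepted chunks against the limits."""
    text_parts = []
    chunk_count = 0
    received_chars = 0
    for msg in messages:
        if msg.get("event") == "end":
            break  # the end marker is checked first and never counted
        text_chunk = msg.get("text") or ""
        if not text_chunk:
            continue
        chunk_count += 1
        if chunk_count > MAX_WS_TEXT_CHUNKS:
            raise ValueError("too many text chunks")
        received_chars += len(text_chunk)
        if received_chars > MAX_WS_TEXT_CHARS:
            raise ValueError("text too long")
        text_parts.append(text_chunk)
    return "".join(text_parts)
```

Checking the end event before touching the counter means a client that sends exactly `MAX_WS_TEXT_CHUNKS` chunks followed by `{"event": "end"}` is accepted rather than rejected by one.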
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7382f271-367d-4cf9-a592-ba6e2ed94870
📒 Files selected for processing (17)
- local_server/local_tts_server/README.md
- local_server/local_tts_server/kokoro_cli.py
- local_server/local_tts_server/local_tts_profiles.py
- local_server/local_tts_server/package_kokoro_server.ps1
- local_server/local_tts_server/server.py
- main_logic/tts_client.py
- static/js/api_key_settings.js
- static/js/character_card_manager.js
- static/locales/en.json
- static/locales/es.json
- static/locales/ja.json
- static/locales/ko.json
- static/locales/pt.json
- static/locales/ru.json
- static/locales/zh-CN.json
- static/locales/zh-TW.json
- templates/api_key_settings.html
✅ Files skipped from review due to trivial changes (5)
- static/locales/zh-TW.json
- static/locales/ru.json
- static/locales/zh-CN.json
- local_server/local_tts_server/README.md
- static/locales/en.json
🚧 Files skipped from review as they are similar to previous changes (9)
- static/locales/ja.json
- static/locales/es.json
- static/locales/pt.json
- templates/api_key_settings.html
- static/js/api_key_settings.js
- local_server/local_tts_server/kokoro_cli.py
- static/js/character_card_manager.js
- static/locales/ko.json
- main_logic/tts_client.py
# Conflicts: # static/js/api_key_settings.js
This reverts commit 4fbc899.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8aa77531ae
    speech_speed = float(parts[1])
except ValueError:
    pass
if voice_name and '|' in voice_name:
Preserve legacy voice:speed parsing for WS local TTS
This change only parses speed from the new voice|speed=... suffix, so existing saved voice IDs that use the previous voice:speed form are now sent unchanged as voice. On the new local server path, parse_local_tts_voice splits on the first colon to derive model, so a legacy value like 中文女:1.1 is interpreted as model 中文女 and fails with Unsupported local TTS model. That creates a hard regression for previously working local-WS configurations unless users manually rewrite every stored voice ID.
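A backward-compatible parser could accept both forms, as in the sketch below. It is an assumption-laden illustration: `split_voice_and_speed` is a hypothetical name, and the legacy branch only treats the text after the last colon as a speed when it parses as a number, so prefixed selectors such as `kokoro:af_heart` pass through untouched.

```python
def split_voice_and_speed(voice_id, default_speed=1.0):
    """Split a stored voice ID into (voice, speed) (hypothetical helper).

    Supports the new `voice|speed=1.1` suffix and the legacy `voice:1.1`
    form. Anything that does not carry a parseable speed is returned
    unchanged with `default_speed`.
    """
    if "|" in voice_id:
        voice, _, suffix = voice_id.partition("|")
        key, _, value = suffix.partition("=")
        speed = default_speed
        if key.strip() == "speed":
            try:
                speed = float(value)
            except ValueError:
                pass  # keep the default when the suffix is malformed
        return voice.strip(), speed

    # Legacy form: only the part after the LAST colon may be a speed.
    head, sep, tail = voice_id.rpartition(":")
    if sep and head:
        try:
            return head, float(tail)
        except ValueError:
            pass  # e.g. "kokoro:af_heart" is a model prefix, not a speed
    return voice_id, default_speed
```

Under this scheme `中文女:1.1` yields the bare voice `中文女` with speed 1.1 before the value ever reaches `parse_local_tts_voice`, while a voice whose name after the colon happens to be numeric would be misread as a speed; whether that can occur depends on the voice catalog.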
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b638f967e7
if (providerSelect) {
    providerSelect.value = 'custom';
    onCustomModelProviderChange('tts');
}
Enable custom TTS when applying Local Kokoro
When a user presses the Local Kokoro apply controls while the global “Enable custom API” checkbox is still off, this only selects the custom provider and fills the URL/voice fields; it never sets enableCustomApi, and this same function disables GPTSOVITS_ENABLED. In that state get_model_api_config('tts_custom') reports is_custom=false, so get_tts_worker skips the new ws:// local route and kokoro: voices are rejected by validation unless the user separately knows to enable the global custom toggle. Toggle enableCustomApi here (or treat TTS ws:// configs as custom) so the new Local Kokoro panel actually activates the saved endpoint.
Accepted and fixed in 94d66f3. Local Kokoro Apply now enables the global custom API toggle before applying the TTS custom provider and WS fields, so get_model_api_config('tts_custom') resolves as custom for the local route.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 94d66f3024
@app.websocket("/v1/audio/speech/stream")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
Validate WebSocket origins before accepting synthesis requests
When the local TTS server is running on the default localhost port, any webpage the user visits can open ws://127.0.0.1:50000/v1/audio/speech/stream because this handler accepts the socket without checking Origin or requiring a token. That lets an untrusted site repeatedly submit up to the configured text limits and drive CPU/GPU synthesis work under the user's account; add an origin allowlist or shared local token before accepting synthesis requests.
Accepted and fixed in dfc5dd7. The synthesis WebSocket now validates Origin before accept: non-browser local clients without Origin remain allowed, localhost/127.0.0.1/[::1] browser origins are allowed, other browser origins are rejected with policy violation, and trusted extra origins can be configured via LOCAL_TTS_ALLOWED_ORIGINS. README documents the setting.
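The policy described in that reply can be sketched as a pure function that runs before `websocket.accept()`. This is not the code from dfc5dd7: `is_origin_allowed` and `load_extra_origins` are hypothetical names, and the comma-separated format assumed for `LOCAL_TTS_ALLOWED_ORIGINS` is a guess that should be checked against the README.

```python
import os
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def load_extra_origins(env=os.environ):
    """Read trusted extra origins; comma-separated format is an assumption."""
    raw = env.get("LOCAL_TTS_ALLOWED_ORIGINS", "")
    return {item.strip() for item in raw.split(",") if item.strip()}


def is_origin_allowed(origin, extra_allowed=frozenset()):
    """Decide whether a WebSocket handshake may proceed (hypothetical helper).

    - No Origin header: a non-browser local client, allowed.
    - Origin listed in `extra_allowed`: allowed.
    - Browser origin whose host is localhost/127.0.0.1/::1: allowed.
    - Anything else: rejected.
    """
    if not origin:
        return True
    if origin in extra_allowed:
        return True
    # urlsplit lowercases the host and strips the brackets from IPv6 literals.
    host = urlsplit(origin).hostname
    return host in LOCAL_HOSTS
```

Comparing the parsed hostname exactly, rather than using a prefix or substring match, is what keeps an origin such as `http://localhost.evil.example` from slipping through.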
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@static/js/api_key_settings.js`:
- Around line 183-193: The Local Kokoro panel is hidden when ttsModelProvider is
switched to 'custom' because shouldShowLocalKokoroTtsConfig returns false if
hasAnyTtsValue is empty; change shouldShowLocalKokoroTtsConfig so that when
provider === 'custom' it returns true (i.e., remove or bypass the hasAnyTtsValue
early-return) so the panel becomes visible on first switch and the existing
visibility-updater can perform the voice list fetch; keep the existing checks
for non-'custom' providers and continue to use the
ttsModelUrl/ttsModelId/ttsVoiceId values for non-custom logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 71f3e456-ed05-46f6-b846-3d3d956b357a
📒 Files selected for processing (2)
- static/js/api_key_settings.js
- templates/api_key_settings.html
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 97c9aeb9b9
if (!['ws:', 'wss:', 'http:', 'https:'].includes(url.protocol)) return false;
const hostname = url.hostname.replace(/^\[|\]$/g, '').toLowerCase();
const isLocalHost = hostname === '127.0.0.1' || hostname === 'localhost' || hostname === '::1';
return isLocalHost && url.port === '50000';
Allow non-default local Kokoro ports in the panel
When users start the bundled local TTS server with LOCAL_TTS_PORT set to anything other than 50000, the saved URL is still a valid Kokoro endpoint, but this gate returns false and updateLocalKokoroTtsConfigVisibility() hides the Local Kokoro panel and skips voice refreshes. That prevents configuring or applying voices for supported custom-port deployments; the probe should accept valid local Kokoro URLs rather than hard-coding the default port.
Accepted and fixed in b4dbd4c. The panel gate now accepts any localhost/127.0.0.1/[::1] Kokoro URL instead of hard-coding port 50000. It still derives /health from that same URL and only shows the Local Kokoro panel when the backend reports engines containing kokoro, so non-Kokoro custom TTS configs remain hidden.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
static/js/api_key_settings.js (1)
291-302: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Don't let profile and voice be saved as two inconsistent values
ttsModelId (profile) and ttsVoiceId (voice) are saved separately here, but nothing in the current code forces them to stay consistent: the voice list is not filtered by currentProfile when it is fetched, and selecting a voice does not write back to localKokoroProfileSelect. This makes it easy to save a drifted combination such as ttsModelId='kokoro-zh' with ttsVoiceId='kokoro:af_heart', and the backend parser then receives conflicting configuration.
♻️ Minimal fix for reference:
  voices.forEach(v => {
      const rawVoiceId = String(v.voice_id || v.id || v.name || '').trim();
      if (!rawVoiceId) return;
      const voiceId = normalizeLocalKokoroVoiceId(rawVoiceId, currentProfile);
      if (!voiceId || voiceId === 'kokoro:') return;
+     const voiceProfile = localKokoroProfileFromVoiceId(voiceId);
+     if (voiceProfile && voiceProfile !== currentProfile) return;
      const option = document.createElement('option');
      option.value = voiceId;
      option.textContent = v.name || voiceId.replace(/^kokoro:/, '');
      option.title = voiceId;
      localVoiceSelect.appendChild(option);
  });
  const localKokoroVoiceSelect = document.getElementById('localKokoroVoiceSelect');
  if (localKokoroVoiceSelect) {
-     localKokoroVoiceSelect.addEventListener('change', applyLocalKokoroTtsConfig);
+     localKokoroVoiceSelect.addEventListener('change', () => {
+         const impliedProfile = localKokoroProfileFromVoiceId(localKokoroVoiceSelect.value);
+         const localKokoroProfileSelect = document.getElementById('localKokoroProfileSelect');
+         if (localKokoroProfileSelect && impliedProfile) {
+             localKokoroProfileSelect.value = impliedProfile;
+         }
+         applyLocalKokoroTtsConfig();
+     });
  }
Also applies to: 2140-2142
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@static/js/api_key_settings.js`:
- Around line 291-302: When populating and selecting Kokoro voices ensure
profile and voice stay consistent: in the block that fills localVoiceSelect
(using normalizeLocalKokoroVoiceId and currentProfile) filter the voices list to
only include voices matching currentProfile (or whose normalized voiceId starts
with the model prefix for currentProfile) and set option.value to that
normalized voiceId; additionally add/update the change handler for
localVoiceSelect so when a voice is chosen it parses the selected voiceId's
model prefix (e.g., extract the part before ':' or map via
normalizeLocalKokoroVoiceId) and sets localKokoroProfileSelect.value to that
model id (and optionally triggers its change event) so ttsModelId and ttsVoiceId
cannot drift; also apply the same filtering/fix referenced around the other
occurrence (lines 2140-2142) where voices are listed or selected.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 5dfe596c-142e-4c20-9ddc-6172fd2290d4
📒 Files selected for processing (1)
static/js/api_key_settings.js
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b4dbd4c05a
if LOCAL_LIGHTWEIGHT_BARE_VOICE_RE.match(normalized_voice_id):
    return True if is_local_ws_tts else None
Accept non-ASCII bare voices for local WS TTS
When custom TTS is a ws:///wss:// endpoint, bare local voice IDs are only accepted if they match this ASCII regex, so existing/local CosyVoice selectors such as 中文女 fall through as None and are later rejected by validate_voice_id. The worker itself still treats 中文女 as its default bare voice and sends bare voices unchanged to the local WebSocket path, so this blocks saving or updating valid local WS configurations that use Chinese voice names; consider allowing bare IDs whenever is_local_ws_tts is true, or at least using a Unicode-safe check.
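A Unicode-safe check of the kind suggested could look like the sketch below. The pattern and the helper name `classify_bare_voice` are assumptions for illustration, not the project's `LOCAL_LIGHTWEIGHT_BARE_VOICE_RE`; the sketch relies on the fact that `\w` in a Python `str` pattern matches Unicode word characters, including CJK, by default.

```python
import re

# Hypothetical replacement pattern: word characters (Unicode-aware for str
# patterns), plus dot and hyphen after the first character, up to 64 chars.
UNICODE_BARE_VOICE_RE = re.compile(r"\w[\w.\-]{0,63}")


def classify_bare_voice(voice_id, is_local_ws_tts):
    """Mirror the quoted branch: True for local WS TTS, otherwise undecided.

    Returns True when the bare voice is acceptable for a ws:// or wss://
    endpoint, and None when the caller should continue with other checks.
    """
    normalized_voice_id = voice_id.strip()
    if UNICODE_BARE_VOICE_RE.fullmatch(normalized_voice_id):
        return True if is_local_ws_tts else None
    return None
```

Using `fullmatch` instead of `match` avoids accepting a valid prefix followed by disallowed characters, and prefixed selectors such as `kokoro:af_heart` still fall through to their own branch because the colon is not in the character class.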
Addressed CodeRabbit review 4418321824 in b1a0e16. The Local Kokoro voice list now filters voices to the selected profile, voice changes sync the profile selector, loading prefers a voice-implied profile when present, and apply normalizes model/voice so ttsModelId and ttsVoiceId cannot drift between zh/en profiles. Validation passed: node --check static/js/api_key_settings.js, git diff --check, and uv run python scripts/check_api_trailing_slash.py.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1a0e16a74
if base_url.startswith('ws://') or base_url.startswith('wss://'):
    return local_cosyvoice_worker, None, 'local_cosyvoice'
Support URL-only local WS TTS configs
When custom TTS is enabled with a ws:///wss:// URL but the TTS model field is left blank (a common local Kokoro/CosyVoice setup, since the model is selected by the local server/voice), cm.get_model_api_config('tts_custom') does not return a custom config because it only treats the slot as complete when both custom_model and custom_url are present. In that scenario tts_config.get('is_custom') is false, so this WebSocket branch is skipped and TTS falls back to the wrong worker instead of the local server; route based on the saved/raw WS URL or treat WS TTS URLs as complete without requiring a model id.
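The second option, treating a WebSocket TTS URL as complete without a model ID, might be sketched as follows. Everything here is hypothetical: `is_ws_tts_url`, `pick_tts_route`, and the route labels are illustrative names rather than the project's config manager API.

```python
def is_ws_tts_url(url):
    """True when the saved URL points at a ws:// or wss:// endpoint."""
    return (url or "").strip().lower().startswith(("ws://", "wss://"))


def pick_tts_route(custom_url, custom_model, custom_enabled):
    """Choose a TTS route from raw saved settings (hypothetical helper).

    A WebSocket URL is considered complete on its own, because the local
    server selects the model from the voice. Non-WS custom endpoints still
    require both a URL and a model ID.
    """
    if not custom_enabled:
        return "default"
    if is_ws_tts_url(custom_url):
        return "local_ws"
    if custom_url and custom_model:
        return "custom_http"
    return "default"
```

Routing on the raw URL keeps the completeness rule for HTTP providers intact while letting a URL-only local Kokoro or CosyVoice setup reach the WebSocket worker.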
Summary by CodeRabbit
New features
Improvements
Documentation
Other