Qwen3 tts with websocket support#351
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough新增基于 Qwen3 的本地 WebSocket TTS 子系统:添加 git 子模块、实现异步 WebSocket TTS 服务器与演示客户端,并在 TTS 调度中加入 Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client (客户端)
participant Server as Qwen3_TTS_Server (WebSocket 服务器)
participant Queue as TaskQueue (任务队列)
participant Worker as Worker (后台推理线程)
participant Model as Qwen3_Model (模型)
Client->>Server: 建立 WebSocket 并发送文本喵
Server->>Server: 文本缓冲与分句喵
Server->>Queue: 入队句子任务喵
Queue->>Worker: 取出推理任务喵
Worker->>Model: 执行语音克隆推理喵
Model-->>Worker: 返回音频块喵
Worker->>Server: 推送音频块到发送队列喵
Server-->>Client: 流式发送 PCM 音频块喵
Client->>Server: 可选发送取消请求喵
Server->>Queue: 处理取消并停止对应任务喵
Estimated code review effort🎯 4 (复杂) | ⏱️ ~45 分钟 Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 14
🧹 Nitpick comments (4)
main_logic/tts_client.py (1)
1820-1826: 裸except: pass会吞掉所有异常喵这里的
session.update发送失败时,用裸except把所有异常都吃掉了,连日志都不打一条。如果连接有问题,这里会静默失败,后续排查起来很痛苦喵。💊 至少记录一下嘛
try: await ws.send(json.dumps({ "type": "session.update", "language": "Chinese" })) - except: - pass + except Exception as e: + logger.debug(f"session.update 发送失败(可忽略): {e}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1820 - 1826, The try/except around the await ws.send(...) which sends the "session.update" message currently swallows all exceptions; change it to catch Exception as e and log the failure instead of passing silently (e.g., use logger.exception or logger.error with the exception) so send failures on ws.send("session.update", language="Chinese") are recorded for debugging; keep the send call and only suppress expected recoverable errors after logging.local_server/qwen3_tts_server/server_commit.py (2)
97-108:consumer_task取消后没有await等待,可能产生警告喵Line 108 调用了
consumer_task.cancel()但之后没有await consumer_task来等待它真正结束。这在 asyncio 里会产生 "Task was destroyed but it is pending" 的警告喵。🐾 建议等待取消完成
await client.close() consumer_task.cancel() + try: + await consumer_task + except asyncio.CancelledError: + pass🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/server_commit.py` around lines 97 - 108, consumer_task is cancelled but not awaited, which can trigger "Task was destroyed but it is pending" warnings; after calling consumer_task.cancel() await consumer_task (inside a try/except asyncio.CancelledError) to ensure the task finishes and any cleanup runs. Locate the block that creates consumer_task from client.handle_messages(), producer_task from _produce_text, and the shutdown sequence that calls client.wait_for_response_done() and client.close(); modify it so you cancel consumer_task, then await consumer_task (catching and swallowing asyncio.CancelledError or handling exceptions) before finishing shutdown.
15-23: 模块级副作用:import 就会爆炸喵喵…这个文件在模块顶层做了两件危险的事:
- Line 15-16: 如果环境变量没设就直接
raise ValueError,别的模块想import一下就炸了喵- Line 22:
pyaudio.PyAudio()在模块加载时就初始化音频设备,没装 PortAudio 就会崩喵作为 demo 脚本倒还好,但如果有人想复用里面的函数就很头疼了。建议把这些初始化挪到
main()或_run_demo()里喵~🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/server_commit.py` around lines 15 - 23, The module has dangerous top-level side effects: checking API_KEY and calling pyaudio.PyAudio() at import; move these into runtime initialization (e.g., main() or _run_demo()). Specifically, remove the top-level raise for API_KEY and instead validate DASHSCOPE_API_KEY inside main() or _run_demo(), raising the ValueError there; likewise, defer creating the PyAudio instance and _audio_stream by moving the pyaudio.PyAudio() call and any audio device setup into a lazy initializer used by main()/_run_demo() (or create a helper like init_audio() that sets _audio_pyaudio and _audio_stream), keep _AUDIO_SAMPLE_RATE and _audio_chunks as simple constants/state but avoid device initialization at import. Ensure function names referenced: API_KEY, _audio_pyaudio, _audio_stream, _AUDIO_SAMPLE_RATE, main(), and _run_demo().local_server/qwen3_tts_server/local_server.py (1)
62-77: 硬编码的测试用路径和文本散落在构造函数里喵Line 62 的
nyaning_voice.pt、Line 72 的日文参考文本、Line 73 的uttid_f1.wav都是测试/开发用的默认值,直接硬编码在类的构造函数里了喵。虽然注释说了"测试用",但这些值会在生产环境中作为 fallback 使用,可能导致不可预期的行为喵。建议至少把这些提取为类常量或模块常量,方便后续修改喵~🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 62 - 77, The constructor currently hardcodes test/dev defaults into attributes (self.pt_path uses "nyaning_voice.pt", self.ref_text uses a Japanese sample, self.ref_wav uses "uttid_f1.wav"), which should be extracted to constants or config; define module- or class-level constants (e.g. DEFAULT_VOICE_PT, DEFAULT_REF_TEXT, DEFAULT_REF_WAV or pull from env/config) and change assignments in __init__ (references: self.pt_path, self.ref_text, self.ref_wav) to fall back to those constants only, document them and ensure callers can override via parameters so production never unintentionally uses test artifacts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 215-217: The except block inside _load_engine currently only logs
the exception so self.model may remain None and the server continues; change it
to abort startup like the earlier startup logic by either re-raising the caught
exception or calling sys.exit(1) after logging so the process stops when engine
loading fails (ensure you import sys if you use sys.exit); this ensures
_do_inference sees a valid self.model or the process exits instead of silently
returning no audio.
- Around line 280-320: _sender_loop reads the shared variable current_job_id
after an await which can lead to a race; fix by snapshotting the job id
immediately after dequeuing a chunk and before any awaits so the response.done
uses the captured value. Specifically, in the async function _sender_loop (which
consumes audio_queue), after chunk = await audio_queue.get() capture
local_job_id = current_job_id (or include job_id when enqueuing) and then use
local_job_id when building/sending {"type":"response.done","job_id":
local_job_id}; this avoids relying on the mutable outer current_job_id that
_stop_current_job may change.
- Around line 179-186: The model loading call to
Qwen3TTSForConditionalGeneration.from_pretrained hard-codes device_map="cuda"
and attn_implementation="flash_attention_2", which conflicts with the earlier
CPU fallback; update the device_map to use the same conditional (e.g. the
existing "cuda" if torch.cuda.is_available() else "cpu" or use self.device) and
only set attn_implementation="flash_attention_2" when running on CUDA (omit or
choose a CPU-safe attention implementation when on CPU); adjust any dtype/device
parameters accordingly so model loading works on both GPU and CPU environments.
- Line 409: The TTS server currently defaults host to "0.0.0.0" (in the
local_server.py assignment using tts_custom.get("host")), which exposes the
service to the LAN; change the default to "127.0.0.1" so the line using host =
tts_custom.get("host") or "0.0.0.0" becomes host = tts_custom.get("host") or
"127.0.0.1" (or equivalent) and ensure any related startup code that reads this
host (e.g., the TTS server bootstrap/serve function) respects that value and
documents that remote access must be explicitly configured by the user.
- Around line 254-257: The conversion from float to int16 in the block that
computes audio_data = wavs[0].flatten() and audio_int16 = (audio_data *
32767).astype(np.int16) lacks clipping and can overflow; modify the conversion
to first scale then clip to int16 range (e.g., scaled = (audio_data *
32767).round() or just (audio_data * 32767) then use np.clip(scaled, -32768,
32767)) and finally astype(np.int16); follow the same approach used in
_resample_audio for reference to avoid overflow and distortion.
- Line 87: 当前代码在调用 torch.load(pt_path, map_location=self.device,
weights_only=False)(出现在本文件的这处和另一处加载逻辑)开启了 pickle 反序列化风险;将 weights_only 参数改为 True
并确保通过 torch.serialization.add_safe_globals(...) 注册所有在序列化对象中可能出现的自定义类型(例如
VoiceClonePromptItem 及其他自定义类)以便安全加载;具体操作:在加载点(torch.load 调用处)把
weights_only=False 改为 weights_only=True,并在初始化/模块加载阶段使用
torch.serialization.add_safe_globals([...]) 列出并注册所有需要的类型,确保 pt
文件只会还原权重而不会执行任意代码。
In `@local_server/qwen3_tts_server/Qwen3-TTS`:
- Line 1: The submodule in .gitmodules points to a third‑party fork
(https://github.com/rophec/Qwen3-TTS) and must be changed to the official
upstream; update the url entry in .gitmodules to
https://github.com/QwenLM/Qwen3-TTS (replace the current url value), then in the
submodule directory check out the exact trusted commit (the specific commit hash
you want) and commit the superproject so the submodule gitlink points to that
SHA rather than tracking a branch; also remove or stop relying on the branch =
qwen3_tts setting so the repo no longer follows a floating branch on a fork.
In `@main_logic/tts_client.py`:
- Around line 1888-1894: The handling for sid is missing waiting for the server
to finish after send_append/send_commit, which can drop the last audio chunk;
update the sid is None branch (where text_buf, send_append, send_commit and
current_speech_id are used) to, after calling send_commit, wait for the server's
response.done (with the same timeout logic used elsewhere in other workers)
before clearing text_buf and continuing—ensure you reference the same response
waiting helper or await pattern used by other workers so the final audio segment
is fully received.
- Around line 1734-1735: local_qwen3_tts_worker currently ignores the voice_id
parameter causing callers to be unable to choose voices; update
local_qwen3_tts_worker to use voice_id to select the appropriate voice model or
forward it to the local TTS service: locate the function local_qwen3_tts_worker
and either map voice_id to a voice_pt_path (or other local model path) and
load/use that model, or include voice_id in the payload sent to the local TTS
server so the server can select the voice; ensure the chosen symbol names
(voice_id, voice_pt_path, and any payload or request variable) are used
consistently when selecting or sending the voice selection.
- Around line 1511-1513: The current if-block routes all ws/wss base_url values
to local_qwen3_tts_worker, making local_cosyvoice_worker unreachable; revert the
URL routing logic to match the original ws/wss validation (as used around Line
1562-1563) so that ws/wss connections can still select local_cosyvoice_worker
when appropriate (adjust the conditional in the function that checks base_url
and returns local_qwen3_tts_worker/local_cosyvoice_worker), and if
local_cosyvoice_worker is truly deprecated add an explicit deprecation log
message before returning or remove the dead return path entirely; reference the
base_url check and the symbols local_qwen3_tts_worker and local_cosyvoice_worker
when applying the change.
- Around line 1833-1846: 函数 close_connection 名称与行为不符(它实际上创建新连接且未被调用),导致
voice_pt_path 及重连逻辑未被执行;请将其改为真正的重连/关闭逻辑并在 async_worker 中在 speech_id
变化时调用,或者移除无用函数以消除混淆。具体修复:在 async_worker 中查找对 speech_id 变化的处理点并在变化时调用一个正确实现的
reconnect 或 close_then_reopen
函数(定位符:async_worker、speech_id、receive_loop、voice_pt_path、close_connection);实现应先取消/await
现有 receive_task、关闭旧 ws(await ws.close())、然后根据需要创建新 ws、发送 session.update 包含
voice_pt_path,并重建 receive_task(使用现有 receive_loop(ws))。如果决定删除功能,移除
close_connection 并确保 voice_pt_path 的配置被另行发送或文档化为不可用。
- Around line 1756-1758: CURRENT_DIR is assigned but never used (F841) while
voice_pt_path is built using os.path.dirname(__file__) directly; either remove
the unused CURRENT_DIR assignment or use it to construct voice_pt_path for
consistency. Update the code around CURRENT_DIR and voice_pt_path so you either
drop the CURRENT_DIR line entirely, or replace os.path.dirname(__file__) with
CURRENT_DIR when building voice_pt_path (ensuring path remains absolute via
os.path.abspath).
- Around line 1775-1806: The async_worker lacks a finally cleanup so WebSocket
and receive_task can leak; add a try/finally around the worker's loop (the
async_worker function that spawns receive_loop) and in the finally ensure that
if receive_task exists you cancel it and await its termination, and if ws exists
you close the websocket connection (and await close) and set receive_task/ws to
None; reference the async_worker function, the receive_loop inner coroutine, and
the ws and receive_task variables when making these changes.
- Around line 1896-1918: When speech_id changes the code only sends
send_cancel() and appends to text_buf but does not reset the audio pipeline or
recreate the connection; update the logic where current_speech_id is compared to
sid (around current_speech_id, send_cancel, send_append, send_commit) to fully
teardown and reinitialize state: after detecting a different sid call
send_cancel(), clear/reset the resampler by invoking resampler.clear() (or
equivalent reset), close/recreate any per-speech connection or stream object
used by this worker, reset text_buf and other per-speech buffers, then set
current_speech_id = sid so the new speech starts with a clean pipeline before
send_append/send_commit.
---
Nitpick comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 62-77: The constructor currently hardcodes test/dev defaults into
attributes (self.pt_path uses "nyaning_voice.pt", self.ref_text uses a Japanese
sample, self.ref_wav uses "uttid_f1.wav"), which should be extracted to
constants or config; define module- or class-level constants (e.g.
DEFAULT_VOICE_PT, DEFAULT_REF_TEXT, DEFAULT_REF_WAV or pull from env/config) and
change assignments in __init__ (references: self.pt_path, self.ref_text,
self.ref_wav) to fall back to those constants only, document them and ensure
callers can override via parameters so production never unintentionally uses
test artifacts.
In `@local_server/qwen3_tts_server/server_commit.py`:
- Around line 97-108: consumer_task is cancelled but not awaited, which can
trigger "Task was destroyed but it is pending" warnings; after calling
consumer_task.cancel() await consumer_task (inside a try/except
asyncio.CancelledError) to ensure the task finishes and any cleanup runs. Locate
the block that creates consumer_task from client.handle_messages(),
producer_task from _produce_text, and the shutdown sequence that calls
client.wait_for_response_done() and client.close(); modify it so you cancel
consumer_task, then await consumer_task (catching and swallowing
asyncio.CancelledError or handling exceptions) before finishing shutdown.
- Around line 15-23: The module has dangerous top-level side effects: checking
API_KEY and calling pyaudio.PyAudio() at import; move these into runtime
initialization (e.g., main() or _run_demo()). Specifically, remove the top-level
raise for API_KEY and instead validate DASHSCOPE_API_KEY inside main() or
_run_demo(), raising the ValueError there; likewise, defer creating the PyAudio
instance and _audio_stream by moving the pyaudio.PyAudio() call and any audio
device setup into a lazy initializer used by main()/_run_demo() (or create a
helper like init_audio() that sets _audio_pyaudio and _audio_stream), keep
_AUDIO_SAMPLE_RATE and _audio_chunks as simple constants/state but avoid device
initialization at import. Ensure function names referenced: API_KEY,
_audio_pyaudio, _audio_stream, _AUDIO_SAMPLE_RATE, main(), and _run_demo().
In `@main_logic/tts_client.py`:
- Around line 1820-1826: The try/except around the await ws.send(...) which
sends the "session.update" message currently swallows all exceptions; change it
to catch Exception as e and log the failure instead of passing silently (e.g.,
use logger.exception or logger.error with the exception) so send failures on
ws.send("session.update", language="Chinese") are recorded for debugging; keep
the send call and only suppress expected recoverable errors after logging.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.gitmoduleslocal_server/qwen3_tts_server/Qwen3-TTSlocal_server/qwen3_tts_server/local_server.pylocal_server/qwen3_tts_server/server_commit.pylocal_server/qwen3_tts_server/tts_realtime_client.pymain_logic/tts_client.py
| threading.Thread(target=self._worker_loop, daemon=True).start() | ||
|
|
||
| def _load_prompt_from_pt(self, pt_path: str): | ||
| payload = torch.load(pt_path, map_location=self.device, weights_only=False) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
torch.load weights_only parameter PyTorch 2.6 security RCE
💡 Result:
- Why
weights_onlymatters (security/RCE):torch.load()historically defaulted toweights_only=False, which uses Python pickle and is inherently unsafe (a malicious checkpoint can execute arbitrary code during deserialization). PyTorch announced (and warned since 2.4) that it would flip the default to reduce this risk. [1][2] - What changed in PyTorch 2.6: In PyTorch 2.6, the default for
torch.load(..., weights_only=...)was changed fromFalsetoTrueas a security improvement (backward-compatibility breaking). [1][2] - The key RCE issue you’re referring to: CVE-2025-32434 / GHSA-53q9-r3pm-6pq6 reported that even
torch.load(..., weights_only=True)could still lead to RCE in PyTorch versions < 2.6.0, and that the issue is fixed in 2.6.0. [3][4] - Practical takeaway: If you rely on
weights_only=Trueas a mitigation, you should be on PyTorch ≥ 2.6.0 (or newer), and still treat untrusted checkpoints as dangerous. [3][4]
Sources
[1] PyTorch 2.6 release blog (notes default weights_only change): https://pytorch.org/blog/pytorch2-6/
[2] PyTorch dev-discuss breaking change note (default flipped to weights_only=True): https://dev-discuss.pytorch.org/t/bc-breaking-change-torch-load-is-being-flipped-to-use-weights-only-true-by-default-in-the-nightlies-after-137602/2573
[3] GitHub Advisory GHSA-53q9-r3pm-6pq6 (RCE with weights_only=True, fixed in 2.6.0): GHSA-53q9-r3pm-6pq6
[4] OSV entry for GHSA-53q9-r3pm-6pq6 / CVE-2025-32434 (affected <2.6.0, fixed 2.6.0): https://osv.dev/vulnerability/GHSA-53q9-r3pm-6pq6
🏁 Script executed:
find . -type f -name "local_server.py" | head -5Repository: Project-N-E-K-O/N.E.K.O
Length of output: 114
🏁 Script executed:
wc -l ./local_server/qwen3_tts_server/local_server.pyRepository: Project-N-E-K-O/N.E.K.O
Length of output: 118
🏁 Script executed:
head -50 ./local_server/qwen3_tts_server/local_server.py | cat -nRepository: Project-N-E-K-O/N.E.K.O
Length of output: 1878
🏁 Script executed:
sed -n '35,45p' ./local_server/qwen3_tts_server/local_server.py | cat -nRepository: Project-N-E-K-O/N.E.K.O
Length of output: 584
🏁 Script executed:
sed -n '80,95p' ./local_server/qwen3_tts_server/local_server.py | cat -nRepository: Project-N-E-K-O/N.E.K.O
Length of output: 798
🏁 Script executed:
sed -n '195,205p' ./local_server/qwen3_tts_server/local_server.py | cat -nRepository: Project-N-E-K-O/N.E.K.O
Length of output: 663
torch.load(..., weights_only=False) 确实存在安全风险喵
Line 87 和 Line 199 都使用了 weights_only=False 来加载 .pt 文件。这允许 pickle 反序列化在解包任意恶意 .pt 文件时执行代码,存在 RCE 漏洞喵!PyTorch 官方已在 2.6 版本将默认值改为 weights_only=True 以应对这个安全隐患呢。
Line 40 的 torch.serialization.add_safe_globals([VoiceClonePromptItem]) 只能作为部分防御,但无法完全消除 weights_only=False 带来的风险喵。建议改用 weights_only=True 并通过 add_safe_globals 注册所有需要的类型,这样才是真正安全的做法喵~
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@local_server/qwen3_tts_server/local_server.py` at line 87, 当前代码在调用
torch.load(pt_path, map_location=self.device,
weights_only=False)(出现在本文件的这处和另一处加载逻辑)开启了 pickle 反序列化风险;将 weights_only 参数改为 True
并确保通过 torch.serialization.add_safe_globals(...) 注册所有在序列化对象中可能出现的自定义类型(例如
VoiceClonePromptItem 及其他自定义类)以便安全加载;具体操作:在加载点(torch.load 调用处)把
weights_only=False 改为 weights_only=True,并在初始化/模块加载阶段使用
torch.serialization.add_safe_globals([...]) 列出并注册所有需要的类型,确保 pt
文件只会还原权重而不会执行任意代码。
| raw_model = Qwen3TTSForConditionalGeneration.from_pretrained( | ||
| model_path, | ||
| torch_dtype=torch.bfloat16, # 给 transformers 看 | ||
| dtype=torch.bfloat16, # 给 qwen 看 | ||
| attn_implementation="flash_attention_2", | ||
| device_map="cuda", | ||
| low_cpu_mem_usage=True | ||
| ) |
There was a problem hiding this comment.
device_map="cuda" 硬编码,与 Line 61 的 CPU fallback 矛盾喵
Line 61 明明写了 "cuda" if torch.cuda.is_available() else "cpu",但 Line 184 又把 device_map 硬编码成 "cuda"。在没有 GPU 的机器上,self.device 是 "cpu" 但模型加载会因为 device_map="cuda" 而失败喵!
🐾 保持一致喵
raw_model = Qwen3TTSForConditionalGeneration.from_pretrained(
model_path,
torch_dtype=torch.bfloat16,
dtype=torch.bfloat16,
attn_implementation="flash_attention_2",
- device_map="cuda",
+ device_map=self.device,
low_cpu_mem_usage=True
)注意:flash_attention_2 也只在 GPU 上可用,CPU 环境下也需要条件判断喵。
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| raw_model = Qwen3TTSForConditionalGeneration.from_pretrained( | |
| model_path, | |
| torch_dtype=torch.bfloat16, # 给 transformers 看 | |
| dtype=torch.bfloat16, # 给 qwen 看 | |
| attn_implementation="flash_attention_2", | |
| device_map="cuda", | |
| low_cpu_mem_usage=True | |
| ) | |
| raw_model = Qwen3TTSForConditionalGeneration.from_pretrained( | |
| model_path, | |
| torch_dtype=torch.bfloat16, # 给 transformers 看 | |
| dtype=torch.bfloat16, # 给 qwen 看 | |
| attn_implementation="flash_attention_2", | |
| device_map=self.device, | |
| low_cpu_mem_usage=True | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@local_server/qwen3_tts_server/local_server.py` around lines 179 - 186, The
model loading call to Qwen3TTSForConditionalGeneration.from_pretrained
hard-codes device_map="cuda" and attn_implementation="flash_attention_2", which
conflicts with the earlier CPU fallback; update the device_map to use the same
conditional (e.g. the existing "cuda" if torch.cuda.is_available() else "cpu" or
use self.device) and only set attn_implementation="flash_attention_2" when
running on CUDA (omit or choose a CPU-safe attention implementation when on
CPU); adjust any dtype/device parameters accordingly so model loading works on
both GPU and CPU environments.
| async def handle_tts(self, websocket): | ||
| logger.info(f"客户端连接: {websocket.remote_address}") | ||
| loop = asyncio.get_running_loop() | ||
|
|
||
| # 智能拼句缓冲区 | ||
| sentence_buffer = "" | ||
|
|
||
| current_job_id = None | ||
| cancel_event = threading.Event() | ||
| audio_queue = asyncio.Queue() | ||
| last_voice_version = self.voice_version | ||
|
|
||
| async def _stop_current_job(keep_buffer: bool = False): | ||
| nonlocal current_job_id, cancel_event, sentence_buffer | ||
| cancel_event.set() | ||
| current_job_id = None | ||
| cancel_event = threading.Event() | ||
| while not audio_queue.empty(): | ||
| try: | ||
| audio_queue.get_nowait() | ||
| except: | ||
| break | ||
| if not keep_buffer: | ||
| sentence_buffer = "" | ||
|
|
||
| # 发送循环 | ||
| async def _sender_loop(): | ||
| while True: | ||
| try: | ||
| chunk = await audio_queue.get() | ||
| if chunk == b"__END__": | ||
| if current_job_id: | ||
| # 适配 N.E.K.O 客户端协议 | ||
| response_done = {"type": "response.done", "job_id": current_job_id} | ||
| await websocket.send(json.dumps(response_done)) | ||
| continue | ||
| await websocket.send(chunk) | ||
| except Exception: | ||
| break | ||
|
|
||
| sender_task = asyncio.create_task(_sender_loop()) |
There was a problem hiding this comment.
_sender_loop 中收到 __END__ 时用 current_job_id 存在竞态喵
_sender_loop(Line 306-318)是一个独立的 async task,它在 Line 311 读取外部作用域的 current_job_id。但 current_job_id 可能在 handle_tts 的消息处理循环中被修改(比如 _stop_current_job 会设成 None)。这两个协程虽然在同一个事件循环里不会真正并发,但在 await audio_queue.get() 挂起和恢复之间,current_job_id 的值可能已经变了,导致 response.done 发送时用了错误的 job_id 或者根本不发送喵。
🧰 Tools
🪛 Ruff (0.15.2)
[error] 300-300: Do not use bare except
(E722)
[warning] 317-317: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@local_server/qwen3_tts_server/local_server.py` around lines 280 - 320,
_sender_loop reads the shared variable current_job_id after an await which can
lead to a race; fix by snapshotting the job id immediately after dequeuing a
chunk and before any awaits so the response.done uses the captured value.
Specifically, in the async function _sender_loop (which consumes audio_queue),
after chunk = await audio_queue.get() capture local_job_id = current_job_id (or
include job_id when enqueuing) and then use local_job_id when building/sending
{"type":"response.done","job_id": local_job_id}; this avoids relying on the
mutable outer current_job_id that _stop_current_job may change.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
main_logic/tts_client.py (1)
208-244:⚠️ Potential issue | 🟡 Minor等待完成的超时日志写成 30 秒但实际是 20 秒,会误导排障喵
Line 220 的timeout=20.0,但 Line 223 的日志写了“30秒”,建议统一成同一个常量或修正文案喵。建议修改(修正文案)
- except asyncio.TimeoutError: - logger.warning("等待响应完成超时(30秒),强制关闭连接") + except asyncio.TimeoutError: + logger.warning("等待响应完成超时(20秒),强制关闭连接")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 208 - 244, The timeout log is inconsistent: asyncio.wait_for(response_done.wait(), timeout=20.0) uses 20 seconds but the warning message says “30秒”; update the code so the timeout value and log message match by either extracting the timeout into a single constant (e.g., TIMEOUT_SECONDS) used by the wait_for call and in the logger.warning, or change the warning text to “20秒”; modify the symbols response_done.wait(), timeout=20.0 and the logger.warning(...) to use the same value.
♻️ Duplicate comments (7)
main_logic/tts_client.py (4)
1775-1847:⚠️ Potential issue | 🟠 Major
async_worker缺少 finally 清理且close_connection行为/命名不符,容易泄漏与误导喵
Line 1775-1921 的主循环退出时不会关闭ws、也不会取消receive_task喵;而 Line 1833-1847 的close_connection名为“关闭”实为“新建连接”,且当前代码里没被调用喵。
建议喵:
- 给
async_worker外层加try/finally,统一 cancel/close 喵。close_connection要么删掉减少迷惑,要么实现成真正的 “close_then_reconnect”(先 cancel/close,再 connect,再发送更新)并在需要的地方调用喵。Also applies to: 1922-1925
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1775 - 1847, The async_worker lacks a finally cleanup and close_connection is misnamed/misimplemented, so add a try/finally around async_worker's main logic to ensure receive_task is cancelled/awaited and ws is closed (use receive_task.cancel() then await it with suppressions, and call await ws.close() if ws is not None), and either remove close_connection or refactor it into a true close_then_reconnect that first cancels/awaits receive_task and closes the existing ws before creating a new connection via websockets.connect; update callers to use connect() for initial open and the new close_then_reconnect (or delete unused close_connection) so there is no resource leak.
1734-1759:⚠️ Potential issue | 🟠 Major
local_qwen3_tts_worker仍然忽略voice_id,调用方没法选音色喵
Line 1734 的voice_id在函数体内没被用到(静态分析也提示 ARG001)喵。
至少需要二选一喵:
- 把
voice_id映射为voice_pt_path并在握手/session.update里发给服务端喵;或喵- 明确声明“本地 Qwen3 音色由服务端
active_voice.json控制,voice_id仅占位”并用_ = voice_id消掉告警喵。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1734 - 1759, The function local_qwen3_tts_worker currently ignores the voice_id parameter; either use it or explicitly mark it unused—preferably map voice_id to the local voice file and send it to the TTS server during handshake/session update: add a small mapping or resolution from voice_id -> voice_pt_path (e.g., switch/dict) and include that resolved path (or a voice name) in the initial handshake or session.update call to the server (referencing local_qwen3_tts_worker, voice_id, voice_pt_path, and session.update/handshake) so callers can select voices; alternatively, if you choose not to support selection, replace the unused param with a deliberate noop (_ = voice_id) and update the function docstring to state that voice selection is server-controlled.
1887-1905:⚠️ Potential issue | 🟠 Major
sid is None只 commit 不等待完成也不隔离连接,可能尾音丢失或下一轮串音喵
Line 1888-1895:flush 之后立刻把current_speech_id=None然后继续循环喵。
如果下一条 speech 很快进来,因为current_speech_id is None,Line 1898 的 cancel 分支不会触发,服务端可能还在吐上一轮音频,结果就把两轮混在一起了喵。
建议至少喵:在sid is None的 flush 后等待服务端一个“完成”信号(或协议加job_id对齐 done),再允许下一轮开始;做不到的话就把该分支改成“flush + 关闭连接 + 下轮重连”,并在重连时重置 resampler 喵。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1887 - 1905, 在处理 sid is None 分支中,当前只是调用 send_append/send_commit 后立即把 current_speech_id 置空并继续,导致下一轮语音可能与上一轮未完全完成的音频混音;请修改 main_logic/tts_client.py 中该分支(涉及 sid、current_speech_id、send_append、send_commit、send_cancel、resampler、soxr.ResampleStream、src_rate、DST_RATE)以确保在 flush 后要么等待服务端明确的“完成/done”信号再将 current_speech_id 置空并继续,要么在不支持等待的情况下改为 flush 后立即关闭并重建连接并重置 resampler(用 new soxr.ResampleStream(src_rate, DST_RATE, 1, dtype="float32"))以避免尾音丢失或串音。
1501-1514: ws/wss 自定义 TTS 现在固定走local_qwen3_tts_worker,需要明确迁移/退化策略喵
Line 1509-1513:http(s)→ GPT-SoVITS v3、ws(s)→ Qwen3,本地 CosyVoice WS 分支被注释掉了喵。
如果这是明确弃用,建议加一条显式 warning(提示“ws/wss 将按 Qwen3 协议解释”)并更新文档喵;如果仍需兼容旧 CosyVoice WS,则需要一个显式 selector(例如tts_custom.api_type)避免用户配置“看起来对但实际连不上”喵。#!/bin/bash # 核对是否还有地方依赖/文档提到 ws/wss=CosyVoice rg -n "local_cosyvoice_worker|get_tts_worker\\(|tts_custom" -S🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1501 - 1514, The current logic in get_config_manager()/tts_config inspects tts_custom.base_url and silently maps http(s)→gptsovits_tts_worker and ws(s)→local_qwen3_tts_worker while the local_cosyvoice_worker branch is commented out; update this by either (A) explicitly logging a warning when base_url startswith "ws://" or "wss://" that "ws/wss will be interpreted as Qwen3 protocol" and keep returning local_qwen3_tts_worker, or (B) add a clear selector field tts_config.get('api_type') (e.g., "qwen3" | "gptsovits" | "cosyvoice") and use that to choose between gptsovits_tts_worker, local_qwen3_tts_worker, or local_cosyvoice_worker respectively (falling back to a safe default and logging the chosen mapping); ensure you reference tts_config, base_url, gptsovits_tts_worker, local_qwen3_tts_worker and local_cosyvoice_worker and emit a log.warn when falling back so users aren’t surprised.local_server/qwen3_tts_server/local_server.py (3)
414-446:⚠️ Potential issue | 🟠 Major
_sender_loop用外部current_job_id发送response.done有竞态,建议把 job_id 跟 chunk 一起入队喵
chunk = await audio_queue.get()这个 await 期间current_job_id可能被_stop_current_job()改掉喵。
最稳的方式是:audio_queue里直接放(job_id, chunk),__END__也带 job_id,这样 sender 不需要读共享可变变量喵。建议修改(chunk 入队携带 job_id)
- loop.call_soon_threadsafe(audio_queue.put_nowait, audio_int16.tobytes()) + loop.call_soon_threadsafe(audio_queue.put_nowait, (job_id, audio_int16.tobytes())) @@ - loop.call_soon_threadsafe(audio_queue.put_nowait, chunk) + loop.call_soon_threadsafe(audio_queue.put_nowait, (job_id, chunk)) @@ - loop.call_soon_threadsafe(audio_queue.put_nowait, b"__END__") + loop.call_soon_threadsafe(audio_queue.put_nowait, (job_id, b"__END__"))async def _sender_loop(): while True: try: - chunk = await audio_queue.get() - if chunk == b"__END__": - if current_job_id: + job_id, chunk = await audio_queue.get() + if chunk == b"__END__": + if job_id: # 适配 N.E.K.O 客户端协议 - response_done = {"type": "response.done", "job_id": current_job_id} + response_done = {"type": "response.done", "job_id": job_id} await websocket.send(json.dumps(response_done)) continue await websocket.send(chunk) except Exception: break🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 414 - 446, _sender_loop reads shared current_job_id while waiting on audio_queue which creates a race; change the queue protocol so audio_queue carries tuples (job_id, chunk) and treat the end marker as (job_id, "__END__") so the sender uses the job_id from the dequeued tuple instead of the external current_job_id. Update places that put data into audio_queue (where chunks are enqueued and in _stop_current_job when signaling end) to enqueue (current_job_id, chunk) or (current_job_id, "__END__") and modify _sender_loop to unpack job_id, chunk = await audio_queue.get(), send chunk and, when chunk == "__END__", build response_done using the dequeued job_id rather than reading current_job_id.
208-234:⚠️ Potential issue | 🟠 MajorCPU fallback 还不完整:
flash_attention_2/device_map需要条件分支否则会直接启动失败喵
你前面把self.device = "cuda" if ... else "cpu"做好了喵,但 Line 220 仍然无条件attn_implementation="flash_attention_2"喵。
这个在 CPU 环境大概率直接炸(以及device_map=self.device是否被 transformers/accelerate 接受也需要确认)喵。建议修改(按 device 组装 kwargs)
- raw_model = Qwen3TTSForConditionalGeneration.from_pretrained( - model_path, - torch_dtype=torch.bfloat16, # 给 transformers 看 - dtype=torch.bfloat16, # 给 qwen 看 - attn_implementation="flash_attention_2", - device_map=self.device, # 旧的是cuda 硬编码 - low_cpu_mem_usage=True - ) + model_kwargs = { + "model_path": model_path, + "low_cpu_mem_usage": True, + } + if self.device == "cuda": + model_kwargs.update({ + "torch_dtype": torch.bfloat16, + "dtype": torch.bfloat16, + "attn_implementation": "flash_attention_2", + "device_map": "cuda", + }) + else: + # CPU: 不要强行 flash_attention_2 + model_kwargs.update({ + "torch_dtype": torch.float32, + "dtype": torch.float32, + "device_map": "cpu", + }) + + raw_model = Qwen3TTSForConditionalGeneration.from_pretrained(**model_kwargs)Transformers `from_pretrained` 的 `device_map` 参数支持直接传入字符串 "cuda"/"cpu" 吗?以及 `attn_implementation="flash_attention_2"` 在 CPU 环境会怎样?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 208 - 234, The current _load_engine unconditionally passes attn_implementation="flash_attention_2" and device_map=self.device into Qwen3TTSForConditionalGeneration.from_pretrained which will fail on CPU; change _load_engine to assemble a kwargs dict for from_pretrained: include torch_dtype/dtype=torch.bfloat16 and attn_implementation="flash_attention_2" and device_map=self.device only when self.device indicates CUDA (e.g. startswith "cuda"), otherwise set torch_dtype/dtype to torch.float32 (or omit), set device_map to "cpu" or None as appropriate, and skip the bfloat16 per-parameter/buffer conversion loops unless running on CUDA so the code paths in _load_engine, Qwen3TTSForConditionalGeneration.from_pretrained, attn_implementation, and device_map are conditional on self.device.
102-104:⚠️ Potential issue | 🟠 Major
torch.load(..., weights_only=False)仍然是 RCE 面,尤其 voice 路径可热切换时更危险喵
Line 103 与 Line 310 都显式weights_only=False,这会把.pt当 pickle 反序列化喵。
如果active_voice.json/配置允许指向任意.pt,那等于把“加载任意代码执行”这个开关交给了文件内容喵。
建议优先改成weights_only=True,并把确实需要反序列化的自定义类型全部注册到torch.serialization.add_safe_globals(...)喵。建议修改(方向性示例,需要按当前 torch 版本验证)
- payload = torch.load(pt_path, map_location=self.device, weights_only=False) + payload = torch.load(pt_path, map_location=self.device, weights_only=True)- prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=False) + prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=True)#!/bin/bash # 1) 盘点所有 torch.load 使用点与 weights_only 参数 rg -n "torch\\.load\\(" -S rg -n "weights_only\\s*=" -S # 2) 尝试定位仓库里对 torch/pytorch 版本的约束(不保证存在) fd -HI "requirements.*|pyproject\\.toml|poetry\\.lock|pdm\\.lock|environment\\.yml|dockerfile|Dockerfile" . \ -x rg -n "torch==|pytorch|cuda|flash_attn|transformers" {} -SAlso applies to: 308-313
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 102 - 104, The torch.load call in _load_prompt_from_pt (and the other torch.load usage nearby) currently uses weights_only=False which allows full pickle deserialization and enables RCE when voice paths are user-switchable; change these torch.load calls to use weights_only=True, and for any true custom objects that must be deserialized register them explicitly with torch.serialization.add_safe_globals(...) before loading; locate all torch.load usages (including the one in _load_prompt_from_pt and the similar load around the 308-313 area) and switch to weights_only=True, then add safe_globals registrations for each custom class/type referenced by those .pt files and validate active_voice.json/config only points to trusted files.
🧹 Nitpick comments (2)
local_server/qwen3_tts_server/local_server.py (2)
16-46: 模块 import 阶段就basicConfig+ 改sys.path,副作用有点大喵
现在只要有人 import 这个模块,就会全局改 logging 配置、并把MODEL_SOURCE_DIR插进sys.path喵。
建议把logging.basicConfig(...)和sys.path注入尽量挪到main()/if __name__ == "__main__"里,并在注入前显式检查目录存在与否,错误信息更可控喵。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 16 - 46, Move side-effectful initialization out of module import: remove logging.basicConfig(...) and the sys.path insertion from top-level module scope and place them inside a main entry (e.g., under if __name__ == "__main__" or a main() function). Before inserting MODEL_SOURCE_DIR into sys.path, check os.path.isdir(MODEL_SOURCE_DIR) and only modify sys.path if the directory exists; log an explicit error via logger and exit gracefully if it does not. Keep the try/except import block (Qwen3TTSForConditionalGeneration, Qwen3TTSProcessor, Qwen3TTSModel, VoiceClonePromptItem) but perform those imports after configuring logging and sys.path in main to avoid global side effects during import; ensure torch.serialization.add_safe_globals([VoiceClonePromptItem]) still runs after successful import.
536-573: 默认host=127.0.0.1很安全,但建议对非本地监听显式 warning/鉴权选项喵
Line 538 把默认 host 设为127.0.0.1这点很棒喵。
建议额外喵:当用户配置成0.0.0.0/ 非本地地址时,启动时打一条显式 warning(提示“无鉴权,可能暴露到局域网/公网”),并考虑加一个可选的 token(header/query)校验喵。
Based on learnings: Review server binding and port exposure across the repository. Ensure Python servers bind to localhost/127.0.0.1 by default and that container/docker-compose port mappings do not expose sensitive services to the LAN unless explicitly intended. 喵🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 536 - 573, The server currently defaults host to "127.0.0.1" but doesn't warn or enforce safety when a non-local host is configured; before the async with websockets.serve(...) block (after reading host/port and before creating QwenLocalServer/use of server.handle_tts) add a check on the host value and log a clear warning if it's not a loopback address (e.g., "0.0.0.0" or any non-local IP) indicating the service may be exposed and unauthenticated, and add optional token-based auth support by reading a config like tts_custom.get("auth_token") and wiring a simple header/query token validation into the request handling path (QwenLocalServer.handle_tts) so that when auth_token is set the server rejects requests without the correct token; keep defaults unchanged (localhost no warning, no auth).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 267-281: The reviewer flagged Ruff errors E701 (multiple
statements on one line), E722/S112 (bare except), etc.; update the _worker_loop
and related functions (_do_inference usage, task_queue handling, cancel_event
checks, and any other blocks between lines ~419-509) to eliminate one-line
multiple statements by splitting them onto separate lines and replace any bare
"except:" with "except Exception as e:" and log or handle the exception (e.g.,
use logger.exception or logger.error with e). Run ruff --fix (or apply the same
manual fixes) to ensure all occurrences (including task_queue.task_done() calls
and other multi-statement lines) conform to style and CI requirements.
- Around line 178-207: The prompt_cache currently keys only by path so
_ensure_active_prompt_loaded may return a stale prompt after the file at
desired_path is overwritten; change the cache to account for mtime (either use a
composite key like (desired_path, mtime) or store a tuple value (prompt, mtime)
in prompt_cache) and update the lookup logic in _ensure_active_prompt_loaded to
compare the stored mtime with the newly read mtime; if mtimes differ, call
_load_prompt_from_pt(desired_path), replace the cached entry, and then update
cached_prompt, active_voice_mtime, active_voice_path, pt_path and increment
voice_version as before.
In `@main_logic/tts_client.py`:
- Around line 1808-1868: The code uses bare except and many try/except: pass
patterns causing E722/S110/TRY201; update connect(), close_connection(),
send_cancel(), send_append(), and send_commit() to catch explicit Exception as e
(not bare except), replace logger.error(... + raise e) with logger.error(...);
raise (bare) to preserve traceback where reconnect fails, and inside the
lightweight handlers (previously except: pass) call logger.debug with the
exception (e.g., logger.debug("ws send failed", exc_info=e)) instead of
swallowing errors so lint rules are satisfied and failures are still visible;
ensure JSON sends that used ensure_ascii=False remain unchanged.
---
Outside diff comments:
In `@main_logic/tts_client.py`:
- Around line 208-244: The timeout log is inconsistent:
asyncio.wait_for(response_done.wait(), timeout=20.0) uses 20 seconds but the
warning message says “30秒”; update the code so the timeout value and log message
match by either extracting the timeout into a single constant (e.g.,
TIMEOUT_SECONDS) used by the wait_for call and in the logger.warning, or change
the warning text to “20秒”; modify the symbols response_done.wait(), timeout=20.0
and the logger.warning(...) to use the same value.
---
Duplicate comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 414-446: _sender_loop reads shared current_job_id while waiting on
audio_queue which creates a race; change the queue protocol so audio_queue
carries tuples (job_id, chunk) and treat the end marker as (job_id, "__END__")
so the sender uses the job_id from the dequeued tuple instead of the external
current_job_id. Update places that put data into audio_queue (where chunks are
enqueued and in _stop_current_job when signaling end) to enqueue
(current_job_id, chunk) or (current_job_id, "__END__") and modify _sender_loop
to unpack job_id, chunk = await audio_queue.get(), send chunk and, when chunk ==
"__END__", build response_done using the dequeued job_id rather than reading
current_job_id.
- Around line 208-234: The current _load_engine unconditionally passes
attn_implementation="flash_attention_2" and device_map=self.device into
Qwen3TTSForConditionalGeneration.from_pretrained which will fail on CPU; change
_load_engine to assemble a kwargs dict for from_pretrained: include
torch_dtype/dtype=torch.bfloat16 and attn_implementation="flash_attention_2" and
device_map=self.device only when self.device indicates CUDA (e.g. startswith
"cuda"), otherwise set torch_dtype/dtype to torch.float32 (or omit), set
device_map to "cpu" or None as appropriate, and skip the bfloat16
per-parameter/buffer conversion loops unless running on CUDA so the code paths
in _load_engine, Qwen3TTSForConditionalGeneration.from_pretrained,
attn_implementation, and device_map are conditional on self.device.
- Around line 102-104: The torch.load call in _load_prompt_from_pt (and the
other torch.load usage nearby) currently uses weights_only=False which allows
full pickle deserialization and enables RCE when voice paths are
user-switchable; change these torch.load calls to use weights_only=True, and for
any true custom objects that must be deserialized register them explicitly with
torch.serialization.add_safe_globals(...) before loading; locate all torch.load
usages (including the one in _load_prompt_from_pt and the similar load around
the 308-313 area) and switch to weights_only=True, then add safe_globals
registrations for each custom class/type referenced by those .pt files and
validate active_voice.json/config only points to trusted files.
In `@main_logic/tts_client.py`:
- Around line 1775-1847: The async_worker lacks a finally cleanup and
close_connection is misnamed/misimplemented, so add a try/finally around
async_worker's main logic to ensure receive_task is cancelled/awaited and ws is
closed (use receive_task.cancel() then await it with suppressions, and call
await ws.close() if ws is not None), and either remove close_connection or
refactor it into a true close_then_reconnect that first cancels/awaits
receive_task and closes the existing ws before creating a new connection via
websockets.connect; update callers to use connect() for initial open and the new
close_then_reconnect (or delete unused close_connection) so there is no resource
leak.
- Around line 1734-1759: The function local_qwen3_tts_worker currently ignores
the voice_id parameter; either use it or explicitly mark it unused—preferably
map voice_id to the local voice file and send it to the TTS server during
handshake/session update: add a small mapping or resolution from voice_id ->
voice_pt_path (e.g., switch/dict) and include that resolved path (or a voice
name) in the initial handshake or session.update call to the server (referencing
local_qwen3_tts_worker, voice_id, voice_pt_path, and session.update/handshake)
so callers can select voices; alternatively, if you choose not to support
selection, replace the unused param with a deliberate noop (_ = voice_id) and
update the function docstring to state that voice selection is
server-controlled.
- Around line 1887-1905: 在处理 sid is None 分支中,当前只是调用 send_append/send_commit 后立即把
current_speech_id 置空并继续,导致下一轮语音可能与上一轮未完全完成的音频混音;请修改 main_logic/tts_client.py
中该分支(涉及
sid、current_speech_id、send_append、send_commit、send_cancel、resampler、soxr.ResampleStream、src_rate、DST_RATE)以确保在
flush 后要么等待服务端明确的“完成/done”信号再将 current_speech_id 置空并继续,要么在不支持等待的情况下改为 flush
后立即关闭并重建连接并重置 resampler(用 new soxr.ResampleStream(src_rate, DST_RATE, 1,
dtype="float32"))以避免尾音丢失或串音。
- Around line 1501-1514: The current logic in get_config_manager()/tts_config
inspects tts_custom.base_url and silently maps http(s)→gptsovits_tts_worker and
ws(s)→local_qwen3_tts_worker while the local_cosyvoice_worker branch is
commented out; update this by either (A) explicitly logging a warning when
base_url startswith "ws://" or "wss://" that "ws/wss will be interpreted as
Qwen3 protocol" and keep returning local_qwen3_tts_worker, or (B) add a clear
selector field tts_config.get('api_type') (e.g., "qwen3" | "gptsovits" |
"cosyvoice") and use that to choose between gptsovits_tts_worker,
local_qwen3_tts_worker, or local_cosyvoice_worker respectively (falling back to
a safe default and logging the chosen mapping); ensure you reference tts_config,
base_url, gptsovits_tts_worker, local_qwen3_tts_worker and
local_cosyvoice_worker and emit a log.warn when falling back so users aren’t
surprised.
---
Nitpick comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 16-46: Move side-effectful initialization out of module import:
remove logging.basicConfig(...) and the sys.path insertion from top-level module
scope and place them inside a main entry (e.g., under if __name__ == "__main__"
or a main() function). Before inserting MODEL_SOURCE_DIR into sys.path, check
os.path.isdir(MODEL_SOURCE_DIR) and only modify sys.path if the directory
exists; log an explicit error via logger and exit gracefully if it does not.
Keep the try/except import block (Qwen3TTSForConditionalGeneration,
Qwen3TTSProcessor, Qwen3TTSModel, VoiceClonePromptItem) but perform those
imports after configuring logging and sys.path in main to avoid global side
effects during import; ensure
torch.serialization.add_safe_globals([VoiceClonePromptItem]) still runs after
successful import.
- Around line 536-573: The server currently defaults host to "127.0.0.1" but
doesn't warn or enforce safety when a non-local host is configured; before the
async with websockets.serve(...) block (after reading host/port and before
creating QwenLocalServer/use of server.handle_tts) add a check on the host value
and log a clear warning if it's not a loopback address (e.g., "0.0.0.0" or any
non-local IP) indicating the service may be exposed and unauthenticated, and add
optional token-based auth support by reading a config like
tts_custom.get("auth_token") and wiring a simple header/query token validation
into the request handling path (QwenLocalServer.handle_tts) so that when
auth_token is set the server rejects requests without the correct token; keep
defaults unchanged (localhost no warning, no auth).
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (5)
local_server/qwen3_tts_server/local_server.py (4)
444-456:_sender_loop读取current_job_id存在竞态,response.done可能丢失喵这个问题之前也提过但没修!
__END__被推入队列后、_sender_loop处理之前,_stop_current_job()可能已经把current_job_id置为None,导致response.done无法发出喵。建议在
await audio_queue.get()拿到__END__的瞬间立刻快照job_id:chunk = await audio_queue.get() snapshot_job_id = current_job_id # 立刻快照 if chunk == b"__END__": if snapshot_job_id: await websocket.send(json.dumps({"type": "response.done", "job_id": snapshot_job_id})) continue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 444 - 456, _sender_loop can miss sending the response.done when __END__ is dequeued because current_job_id may be cleared by _stop_current_job between await audio_queue.get() and the check; fix by snapshotting current_job_id immediately after receiving chunk from audio_queue (in _sender_loop) into a local variable (e.g., snapshot_job_id) and use that snapshot when forming/sending the {"type":"response.done","job_id":...} over websocket so the job_id remains stable even if _stop_current_job clears the shared current_job_id concurrently.
222-229:⚠️ Potential issue | 🟠 Major
attn_implementation="flash_attention_2"在 CPU 环境下会直接报错喵上次评审已经提到这个问题了,
device_map修好了,但attn_implementation="flash_attention_2"还是硬编码,CPU 上根本不支持 Flash Attention 2,模型加载会直接崩喵!🐾 修复建议
raw_model = Qwen3TTSForConditionalGeneration.from_pretrained( model_path, torch_dtype=torch.bfloat16, dtype=torch.bfloat16, - attn_implementation="flash_attention_2", + attn_implementation="flash_attention_2" if torch.cuda.is_available() else "eager", device_map=self.device, low_cpu_mem_usage=True )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 222 - 229, The call to Qwen3TTSForConditionalGeneration.from_pretrained hardcodes attn_implementation="flash_attention_2", which crashes on CPU; change it to set attn_implementation only when running on a GPU that supports flash attention (e.g., check self.device or torch.cuda.is_available()), otherwise omit the parameter or pass None/leave default so CPU loads normally; update the call site around model_path and device_map to compute attn_impl based on the environment and pass that variable into from_pretrained instead of the hardcoded "flash_attention_2".
102-103:⚠️ Potential issue | 🟠 Major
torch.load(..., weights_only=False)的安全风险还没修喵(老问题又来了)
_load_prompt_from_pt(Line 103)和_do_inference(Line 318)两处都还在用weights_only=False,允许 pickle 反序列化执行任意代码喵。上次评审里的修复建议还没落实呢!Line 41 的
add_safe_globals([VoiceClonePromptItem])只注册了一个类,不足以完全消除风险喵。建议将两处都改为weights_only=True,同时通过add_safe_globals把所有需要的自定义类型注册进去喵。Also applies to: 316-321
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 102 - 103, The code is insecure because torch.load is called with weights_only=False in _load_prompt_from_pt and _do_inference; change both calls to torch.load(..., weights_only=True) and register every custom type used in the saved payload with add_safe_globals (not just VoiceClonePromptItem) so the safe deserializer can reconstruct objects; locate _load_prompt_from_pt and _do_inference and update the torch.load call signatures and extend the add_safe_globals invocation to include all custom classes/structs used by the prompts (e.g., VoiceClonePromptItem plus any other prompt item classes, enums or dataclasses referenced).
473-473: E701:一行多语句,Ruff error 级别,可能卡 CI 喵Line 473 和 Line 497 的单行
if ... : statement是 Ruff error(E701),之前已经提过还没修喵!🧹 修复示意
- if "text" in data and not msg_type: msg_type = "legacy.text" + if "text" in data and not msg_type: + msg_type = "legacy.text"- if not sentence: continue + if not sentence: + continueAlso applies to: 497-497
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` at line 473, The single-line if statement "if 'text' in data and not msg_type: msg_type = 'legacy.text'" (and the similar occurrence around the same block) triggers Ruff E701; replace the one-line form with a normal multi-line if block so it's two statements: an if header and a separate indented assignment inside, e.g., use "if 'text' in data and not msg_type:" followed on the next line by "msg_type = 'legacy.text'"; update both occurrences that reference msg_type and data accordingly.main_logic/tts_client.py (1)
1735-1736:voice_id传进来就被忽略了喵(老问题还没修呢)这个问题之前就提过,
voice_id参数至今仍完全没用上。即使session.update消息里加了voice_pt_path,服务端的handle_tts也不解析该字段,所以音色选择路径对客户端来说完全是死路喵。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1735 - 1736, The worker function local_qwen3_tts_worker currently ignores the voice_id parameter and the TTS pipeline also doesn't parse voice_pt_path from session.update, so voice selection is never applied; fix by wiring voice selection end-to-end: make local_qwen3_tts_worker accept and use voice_id (or mapped voice file path) when constructing/initializing the local TTS engine, update the server-side handle_tts to parse voice_pt_path from incoming session.update messages and pass that value (or resolve it to a voice_id) into the call that enqueues work for local_qwen3_tts_worker, and ensure the queue payload keys match (e.g. include "voice_pt_path" or "voice_id") so the worker uses that value when generating audio. Ensure function names to change include local_qwen3_tts_worker and handle_tts and that session.update handling reads voice_pt_path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@main_logic/tts_client.py`:
- Around line 1776-1783: The worker is missing the response_done_event used by
receive_loop and the shutdown logic; inside async_worker() initialize
response_done_event = asyncio.Event() alongside ws, receive_task,
current_speech_id, text_buf, etc., so receive_loop and the termination branches
can await/set/clear it without raising NameError; locate async_worker, add the
Event initialization (response_done_event) at the top of the function and ensure
existing uses in receive_loop and the termination/cleanup code reference this
variable.
---
Duplicate comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 444-456: _sender_loop can miss sending the response.done when
__END__ is dequeued because current_job_id may be cleared by _stop_current_job
between await audio_queue.get() and the check; fix by snapshotting
current_job_id immediately after receiving chunk from audio_queue (in
_sender_loop) into a local variable (e.g., snapshot_job_id) and use that
snapshot when forming/sending the {"type":"response.done","job_id":...} over
websocket so the job_id remains stable even if _stop_current_job clears the
shared current_job_id concurrently.
- Around line 222-229: The call to
Qwen3TTSForConditionalGeneration.from_pretrained hardcodes
attn_implementation="flash_attention_2", which crashes on CPU; change it to set
attn_implementation only when running on a GPU that supports flash attention
(e.g., check self.device or torch.cuda.is_available()), otherwise omit the
parameter or pass None/leave default so CPU loads normally; update the call site
around model_path and device_map to compute attn_impl based on the environment
and pass that variable into from_pretrained instead of the hardcoded
"flash_attention_2".
- Around line 102-103: The code is insecure because torch.load is called with
weights_only=False in _load_prompt_from_pt and _do_inference; change both calls
to torch.load(..., weights_only=True) and register every custom type used in the
saved payload with add_safe_globals (not just VoiceClonePromptItem) so the safe
deserializer can reconstruct objects; locate _load_prompt_from_pt and
_do_inference and update the torch.load call signatures and extend the
add_safe_globals invocation to include all custom classes/structs used by the
prompts (e.g., VoiceClonePromptItem plus any other prompt item classes, enums or
dataclasses referenced).
- Line 473: The single-line if statement "if 'text' in data and not msg_type:
msg_type = 'legacy.text'" (and the similar occurrence around the same block)
triggers Ruff E701; replace the one-line form with a normal multi-line if block
so it's two statements: an if header and a separate indented assignment inside,
e.g., use "if 'text' in data and not msg_type:" followed on the next line by
"msg_type = 'legacy.text'"; update both occurrences that reference msg_type and
data accordingly.
In `@main_logic/tts_client.py`:
- Around line 1735-1736: The worker function local_qwen3_tts_worker currently
ignores the voice_id parameter and the TTS pipeline also doesn't parse
voice_pt_path from session.update, so voice selection is never applied; fix by
wiring voice selection end-to-end: make local_qwen3_tts_worker accept and use
voice_id (or mapped voice file path) when constructing/initializing the local
TTS engine, update the server-side handle_tts to parse voice_pt_path from
incoming session.update messages and pass that value (or resolve it to a
voice_id) into the call that enqueues work for local_qwen3_tts_worker, and
ensure the queue payload keys match (e.g. include "voice_pt_path" or "voice_id")
so the worker uses that value when generating audio. Ensure function names to
change include local_qwen3_tts_worker and handle_tts and that session.update
handling reads voice_pt_path.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
main_logic/tts_client.py (1)
1735-1736:⚠️ Potential issue | 🟠 Major
voice_id至今仍未使用(ARG001),audio_api_key有占位但voice_id没有喵Line 1736 给
audio_api_key做了_ = audio_api_key的占位抑制,但voice_id既没有占位也没有实际使用。整个函数里voice_pt_path始终硬编码为nyaning_voice.pt,与传入的voice_id完全无关喵。至少应该加一行
_ = voice_id占位,或者按注释里说的把voice_id映射到voice_pt_path的选择逻辑实现了喵。💡 最小修复(抑制 ARG001 并与 audio_api_key 保持一致)
def local_qwen3_tts_worker(request_queue, response_queue, audio_api_key, voice_id): _ = audio_api_key # 本地不需要 key + _ = voice_id # TODO: 映射到不同的 voice_pt_path🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1735 - 1736, In local_qwen3_tts_worker add handling for the unused voice_id: either suppress the linter by adding a placeholder reference (e.g., _ = voice_id) like audio_api_key is suppressed, or implement mapping logic that uses voice_id to set voice_pt_path instead of the hardcoded "nyaning_voice.pt" (update the assignment of voice_pt_path within local_qwen3_tts_worker to select the correct .pt filename based on voice_id).
🧹 Nitpick comments (1)
main_logic/tts_client.py (1)
1786-1812:receive_loop缺少对ConnectionClosed和CancelledError的独立处理,正常重连也会刷错误日志喵当
speech_id变化触发reconnect()时,_cleanup_ws()会 cancelreceive_task,这时receive_loop会收到asyncio.CancelledError,被现在的except Exception吞掉并打出一条error日志。同样地,WebSocket 正常关闭会抛websockets.exceptions.ConnectionClosed,也会被当作错误记录喵。对比同文件里
local_cosyvoice_worker(Line 1613-1618)、gptsovits_tts_worker(Line 1326-1331)的receive_loop,都会把这两种情况静默处理喵。websockets 官方文档也明确建议捕获ConnectionClosed来防止它被日志记录喵。♻️ 建议修复(对齐其他 worker 的模式)
async def receive_loop(ws_conn): nonlocal src_rate, resampler try: async for message in ws_conn: if isinstance(message, str): try: evt = json.loads(message) except Exception as e: logger.debug(f"local_qwen3 parse message error:{e}") continue if evt.get("type") == "response.done": response_done_event.set() continue if evt.get("type") == "audio.meta": sr = int(evt.get("sample_rate", src_rate)) if sr != src_rate: src_rate = sr resampler = soxr.ResampleStream(src_rate, DST_RATE, 1, dtype="float32") continue if isinstance(message, bytes): audio_i16 = np.frombuffer(message, dtype=np.int16) resampled = _resample_audio(audio_i16, src_rate, DST_RATE, resampler) response_queue.put(resampled) + except websockets.exceptions.ConnectionClosed: + logger.debug("local_qwen3 receive_loop: connection closed") + except asyncio.CancelledError: + pass except Exception as e: logger.error(f"local_qwen3 receive_loop error: {e}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1786 - 1812, The receive_loop currently catches all exceptions and logs them as errors, which incorrectly treats normal websocket closure and task cancellation as errors; update receive_loop to explicitly catch and silently handle asyncio.CancelledError and websockets.exceptions.ConnectionClosed (or ConnectionClosedError)—matching the pattern used in local_cosyvoice_worker and gptsovits_tts_worker—so that when _cleanup_ws cancels receive_task during reconnect() or the socket closes normally no error is logged, and only unexpected exceptions fall through to logger.error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@main_logic/tts_client.py`:
- Line 1600: The new variable response_done_event declared inside async_worker
of local_cosyvoice_worker is never used (no .set()/.wait()/.clear() calls) and
appears to be leftover from copying local_qwen3_tts_worker; remove the unused
response_done_event declaration to clean up the code and avoid the F841 lint
warning.
- Around line 1757-1759: The path assigned to voice_pt_path is left as a
relative/non-normalized path (it contains "..") because the os.path.abspath call
was commented out; restore normalization by passing voice_pt_path through
os.path.abspath before sending it to the server (e.g., re-enable the
os.path.abspath(voice_pt_path) call) so voice_pt_path becomes an absolute,
canonical path; update any code that uses voice_pt_path (the variable named
voice_pt_path in this module) to use the normalized value.
---
Duplicate comments:
In `@main_logic/tts_client.py`:
- Around line 1735-1736: In local_qwen3_tts_worker add handling for the unused
voice_id: either suppress the linter by adding a placeholder reference (e.g., _
= voice_id) like audio_api_key is suppressed, or implement mapping logic that
uses voice_id to set voice_pt_path instead of the hardcoded "nyaning_voice.pt"
(update the assignment of voice_pt_path within local_qwen3_tts_worker to select
the correct .pt filename based on voice_id).
---
Nitpick comments:
In `@main_logic/tts_client.py`:
- Around line 1786-1812: The receive_loop currently catches all exceptions and
logs them as errors, which incorrectly treats normal websocket closure and task
cancellation as errors; update receive_loop to explicitly catch and silently
handle asyncio.CancelledError and websockets.exceptions.ConnectionClosed (or
ConnectionClosedError)—matching the pattern used in local_cosyvoice_worker and
gptsovits_tts_worker—so that when _cleanup_ws cancels receive_task during
reconnect() or the socket closes normally no error is logged, and only
unexpected exceptions fall through to logger.error.
| receive_task = None | ||
| current_speech_id = None | ||
|
|
||
| response_done_event = asyncio.Event() |
There was a problem hiding this comment.
response_done_event 赋值后从未使用喵(F841)
这行是新加进来的,但在整个 local_cosyvoice_worker 的 async_worker 里找不到任何 .set()、.wait() 或 .clear() 的调用喵。是不是复制 local_qwen3_tts_worker 时顺手带进来的残留代码?直接删掉喵~
🧹 建议清理
async def async_worker():
ws = None
receive_task = None
current_speech_id = None
- response_done_event = asyncio.Event()
resampler = soxr.ResampleStream(SRC_RATE, 48000, 1, dtype='float32')📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| response_done_event = asyncio.Event() | |
| async def async_worker(): | |
| ws = None | |
| receive_task = None | |
| current_speech_id = None | |
| resampler = soxr.ResampleStream(SRC_RATE, 48000, 1, dtype='float32') |
🧰 Tools
🪛 Ruff (0.15.2)
[error] 1600-1600: Local variable response_done_event is assigned to but never used
Remove assignment to unused variable response_done_event
(F841)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@main_logic/tts_client.py` at line 1600, The new variable response_done_event
declared inside async_worker of local_cosyvoice_worker is never used (no
.set()/.wait()/.clear() calls) and appears to be leftover from copying
local_qwen3_tts_worker; remove the unused response_done_event declaration to
clean up the code and avoid the F841 lint warning.
There was a problem hiding this comment.
♻️ Duplicate comments (6)
local_server/qwen3_tts_server/local_server.py (4)
102-104:⚠️ Potential issue | 🔴 Criticaltorch.load 使用 weights_only=False 仍有反序列化风险
这两处加载 .pt 仍走 pickle,若文件非完全可信会有 RCE 风险。建议改为weights_only=True并继续使用add_safe_globals白名单喵。🔐 建议修改
- payload = torch.load(pt_path, map_location=self.device, weights_only=False) + payload = torch.load(pt_path, map_location=self.device, weights_only=True)- prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=False) + prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=True)PyTorch torch.load weights_only 安全性与默认行为(近版本)Also applies to: 316-319
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 102 - 104, The torch.load call in _load_prompt_from_pt currently uses weights_only=False which still performs pickle deserialization; change it to weights_only=True to avoid executing untrusted code and apply the existing safe global whitelist via add_safe_globals when reconstructing any needed objects; update the corresponding loader at the other location (the load block referenced around lines 316-319) to mirror this change, handle the case where weights-only load omits non-tensor metadata by explicitly reconstructing or validating expected fields after load, and wrap the load operations in a try/except that logs and raises a clear error if the file cannot be safely loaded.
444-453:⚠️ Potential issue | 🟠 Majorresponse.done 的 job_id 可能串线
_sender_loop在 await 后读取外层current_job_id,在 cancel/切换 job 时可能发送错误的 job_id。建议把 job_id 随 chunk 一起入队,发送时用入队时的快照喵。🧷 建议修改
- chunk = await audio_queue.get() - if chunk == b"__END__": - if current_job_id: + job_id, chunk = await audio_queue.get() + if chunk == b"__END__": + if job_id: # 适配 N.E.K.O 客户端协议 - response_done = {"type": "response.done", "job_id": current_job_id} + response_done = {"type": "response.done", "job_id": job_id} await websocket.send(json.dumps(response_done)) continue await websocket.send(chunk)- loop.call_soon_threadsafe(audio_queue.put_nowait, audio_int16.tobytes()) + loop.call_soon_threadsafe(audio_queue.put_nowait, (job_id, audio_int16.tobytes())) ... - loop.call_soon_threadsafe(audio_queue.put_nowait, b"__END__") + loop.call_soon_threadsafe(audio_queue.put_nowait, (job_id, b"__END__"))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 444 - 453, _sender_loop reads the outer current_job_id after an await which can race when jobs are cancelled/switched; fix by enqueuing a snapshot of the job id with each chunk (i.e. change audio_queue items to (job_id, chunk) where producers push (current_job_id, chunk) including the "__END__" marker) and update _sender_loop to unpack (job_id, chunk) and use that job_id for sending response_done and all sends so the job_id is consistent per-chunk.
464-473:⚠️ Potential issue | 🟡 MinorRuff E701/S112 可能卡 CI:拆行并记录解析失败
if "text" in data and not msg_type: ...是 E701,同时 JSON 解析异常被静默吞掉触发 S112。建议拆行并做 debug 记录喵。🧹 建议修改
- try: - data = json.loads(message) - except Exception: - continue + try: + data = json.loads(message) + except json.JSONDecodeError as e: + logger.debug(f"invalid json from client: {e}") + continue ... - if "text" in data and not msg_type: msg_type = "legacy.text" + if "text" in data and not msg_type: + msg_type = "legacy.text"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 464 - 473, The single-line conditional `if "text" in data and not msg_type: msg_type = "legacy.text"` causes E701 and the bare except around json.loads(message) silently swallows errors (S112); update the websocket message loop (the `async for message in websocket` block) to split the conditional into two lines (check `"text" in data` then `if not msg_type` set `msg_type = "legacy.text"`) and change the `except Exception:` that wraps `data = json.loads(message)` to log the exception (e.g., using the module logger or websocket logger with exception/debug level and include the exception info) before continuing so parse failures are recorded.
221-228:⚠️ Potential issue | 🟠 Majorflash_attention_2 在 CPU 上可能直接失败
当前无条件设置attn_implementation="flash_attention_2",CPU 环境下通常不可用,可能导致启动失败。建议仅在 CUDA 上启用喵。🧯 建议修改
- raw_model = Qwen3TTSForConditionalGeneration.from_pretrained( - model_path, - torch_dtype=torch.bfloat16, # 给 transformers 看 - dtype=torch.bfloat16, # 给 qwen 看 - attn_implementation="flash_attention_2", - device_map=self.device, # 旧的是cuda 硬编码 - low_cpu_mem_usage=True - ) + model_kwargs = dict( + torch_dtype=torch.bfloat16, # 给 transformers 看 + dtype=torch.bfloat16, # 给 qwen 看 + device_map=self.device, + low_cpu_mem_usage=True, + ) + if self.device == "cuda": + model_kwargs["attn_implementation"] = "flash_attention_2" + raw_model = Qwen3TTSForConditionalGeneration.from_pretrained(model_path, **model_kwargs)Transformers attn_implementation="flash_attention_2" 是否支持 CPU🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 221 - 228, The code unconditionally passes attn_implementation="flash_attention_2" into Qwen3TTSForConditionalGeneration.from_pretrained which can fail on CPU; change the call so attn_implementation is set only when running on CUDA (e.g., check torch.cuda.is_available() or that self.device contains "cuda") and omit or use the default attn_implementation for CPU. Locate the from_pretrained call around Qwen3TTSForConditionalGeneration.from_pretrained(model_path, torch_dtype=..., dtype=..., attn_implementation=..., device_map=self.device, ...) and make attn_implementation conditional based on the device check.main_logic/tts_client.py (2)
1735-1758:⚠️ Potential issue | 🟠 Majorlocal_qwen3_tts_worker 忽略 voice_id,调用方无法选音色
voice_id完全未使用,导致调用方传入的音色无效且触发 ARG001。建议至少将voice_id映射到voice_pt_path(或显式标注不支持)喵。🐾 一个最小可用的处理示例
- voice_pt_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "local_server", "qwen3_tts_server", "nyaning_voice.pt")) + voice_pt_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "local_server", "qwen3_tts_server", "nyaning_voice.pt")) + if voice_id and os.path.exists(voice_id): + voice_pt_path = os.path.abspath(voice_id)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1735 - 1758, local_qwen3_tts_worker currently ignores the voice_id parameter which causes caller-supplied voice selection to be lost and triggers ARG001; update the function to either map voice_id to the correct local model file (e.g. select different paths for voice_pt_path based on voice_id) or explicitly validate and raise/log a clear unsupported-voice error; modify the code paths that set voice_pt_path (and any downstream usage) to use this mapping/validation so callers can choose voice, and ensure you reference local_qwen3_tts_worker, voice_id, and voice_pt_path when implementing the change.
1596-1602:⚠️ Potential issue | 🟡 Minor删除未使用的 response_done_event 变量
Line 1600 声明后未被 set/wait/clear,属于死代码且会触发 F841。直接移除更干净喵。🧹 建议修改
- response_done_event = asyncio.Event()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1596 - 1602, The variable response_done_event declared inside async_worker is never used (no set/wait/clear) and triggers an unused-variable warning; remove the response_done_event = asyncio.Event() line from the async_worker function (and any related unused references if they exist) so the function no longer defines an unused asyncio.Event object.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 102-104: The torch.load call in _load_prompt_from_pt currently
uses weights_only=False which still performs pickle deserialization; change it
to weights_only=True to avoid executing untrusted code and apply the existing
safe global whitelist via add_safe_globals when reconstructing any needed
objects; update the corresponding loader at the other location (the load block
referenced around lines 316-319) to mirror this change, handle the case where
weights-only load omits non-tensor metadata by explicitly reconstructing or
validating expected fields after load, and wrap the load operations in a
try/except that logs and raises a clear error if the file cannot be safely
loaded.
- Around line 444-453: _sender_loop reads the outer current_job_id after an
await which can race when jobs are cancelled/switched; fix by enqueuing a
snapshot of the job id with each chunk (i.e. change audio_queue items to
(job_id, chunk) where producers push (current_job_id, chunk) including the
"__END__" marker) and update _sender_loop to unpack (job_id, chunk) and use that
job_id for sending response_done and all sends so the job_id is consistent
per-chunk.
- Around line 464-473: The single-line conditional `if "text" in data and not
msg_type: msg_type = "legacy.text"` causes E701 and the bare except around
json.loads(message) silently swallows errors (S112); update the websocket
message loop (the `async for message in websocket` block) to split the
conditional into two lines (check `"text" in data` then `if not msg_type` set
`msg_type = "legacy.text"`) and change the `except Exception:` that wraps `data
= json.loads(message)` to log the exception (e.g., using the module logger or
websocket logger with exception/debug level and include the exception info)
before continuing so parse failures are recorded.
- Around line 221-228: The code unconditionally passes
attn_implementation="flash_attention_2" into
Qwen3TTSForConditionalGeneration.from_pretrained which can fail on CPU; change
the call so attn_implementation is set only when running on CUDA (e.g., check
torch.cuda.is_available() or that self.device contains "cuda") and omit or use
the default attn_implementation for CPU. Locate the from_pretrained call around
Qwen3TTSForConditionalGeneration.from_pretrained(model_path, torch_dtype=...,
dtype=..., attn_implementation=..., device_map=self.device, ...) and make
attn_implementation conditional based on the device check.
In `@main_logic/tts_client.py`:
- Around line 1735-1758: local_qwen3_tts_worker currently ignores the voice_id
parameter which causes caller-supplied voice selection to be lost and triggers
ARG001; update the function to either map voice_id to the correct local model
file (e.g. select different paths for voice_pt_path based on voice_id) or
explicitly validate and raise/log a clear unsupported-voice error; modify the
code paths that set voice_pt_path (and any downstream usage) to use this
mapping/validation so callers can choose voice, and ensure you reference
local_qwen3_tts_worker, voice_id, and voice_pt_path when implementing the
change.
- Around line 1596-1602: The variable response_done_event declared inside
async_worker is never used (no set/wait/clear) and triggers an unused-variable
warning; remove the response_done_event = asyncio.Event() line from the
async_worker function (and any related unused references if they exist) so the
function no longer defines an unused asyncio.Event object.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (6)
main_logic/tts_client.py (2)
1667-1667:⚠️ Potential issue | 🟡 Minor
response_done_event在这里是未使用变量,会触发 F841 喵Line [1667] 声明后没有任何
.set/.wait/.clear调用,建议直接删除,避免 lint 报错喵。🧹 建议修复
- response_done_event = asyncio.Event()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` at line 1667, The variable response_done_event declared at the top-level of the tts client (response_done_event = asyncio.Event()) is unused and triggers lint F841; remove that unused declaration from main_logic/tts_client.py (delete the response_done_event = asyncio.Event() line) and any redundant imports if they become unused as a result so no .set/.wait/.clear calls remain missing.
1802-1803:⚠️ Potential issue | 🟠 Major
voice_id仍被忽略,调用方无法按请求切换音色喵Line [1802] 接收了
voice_id,但 Line [1825] 固定写死voice_pt_path,这会让上层传入音色参数失效喵。建议至少把voice_id映射到voice_pt_path(或直接透传给服务端)喵。🎛️ 建议修复
- voice_pt_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "local_server", "qwen3_tts_server", "nyaning_voice.pt")) + default_pt = os.path.abspath(os.path.join( + os.path.dirname(__file__), "..", "local_server", "qwen3_tts_server", "nyaning_voice.pt" + )) + voice_pt_path = voice_id.strip() if voice_id and voice_id.strip() else default_ptAlso applies to: 1822-1826
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@main_logic/tts_client.py` around lines 1802 - 1803, The function local_qwen3_tts_worker currently ignores the voice_id parameter and always uses a hardcoded voice_pt_path; update local_qwen3_tts_worker to honor voice_id by either mapping voice_id to the correct voice_pt_path (e.g., via a local dictionary or lookup function) or by including/forwarding voice_id in the request payload sent to the local TTS service instead of the hardcoded path; make the change where voice_pt_path is set and where the request payload/body is assembled so callers can switch voices dynamically (referencing local_qwen3_tts_worker and the voice_pt_path usage).local_server/qwen3_tts_server/local_server.py (4)
109-109:⚠️ Potential issue | 🟠 Major
torch.load(..., weights_only=False)仍有反序列化执行风险喵Line [109]、Line [329]、Line [498] 显式使用了
weights_only=False,只要.pt文件来源不完全可信,就有反序列化执行风险喵。建议统一改为weights_only=True,并继续通过add_safe_globals维护必要白名单喵。🔒 建议修复
- payload = torch.load(pt_path, map_location=self.device, weights_only=False) + payload = torch.load(pt_path, map_location=self.device, weights_only=True) ... - prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=False) + prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=True) ... - prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=False) + prompt_snapshot = torch.load(prompt_snapshot, map_location=self.device, weights_only=True)Also applies to: 329-330, 498-499
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` at line 109, Replace unsafe torch.load calls that pass weights_only=False with weights_only=True to avoid deserialization execution risks: update the torch.load invocations (e.g., the assignment to payload at the torch.load(pt_path, map_location=self.device, ...) locations and the other similar calls around the same file) to use weights_only=True while preserving map_location=self.device; if the code relies on executing pickled globals, continue to expose only vetted symbols via add_safe_globals (maintain or extend the whitelist there) so loading still works without enabling full deserialization.
475-477:⚠️ Potential issue | 🟠 Major0.6B 引擎加载失败后仍继续启动会导致“连上但无声”喵
Line [475]-Line [477] 仅记录错误但不退出,后续服务会继续接连接但推理路径直接空返回,排障会很痛苦喵。这里建议与
_load_engine对齐:失败即中止启动喵。🚨 建议修复
except Exception as e: logger.error(f"❌ 0.6B 加载引擎异常: {e}") + sys.exit(1)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 475 - 477, The except block that logs "❌ 0.6B 加载引擎异常" must abort startup like _load_engine does; update the exception handler around the 0.6B engine load so it not only logs the error (logger.error(...)) but also stops the process (e.g., call sys.exit(1) or re-raise the exception) to prevent the server from coming up in a non-functional "connected but silent" state; mirror the failure behavior used in _load_engine to ensure a hard abort on load failure.
228-235:⚠️ Potential issue | 🟠 MajorCPU fallback 与 FlashAttention2 配置仍然冲突喵
Line [62] 明确支持 CPU fallback,但 Line [232] 和 Line [445] 仍无条件使用
attn_implementation="flash_attention_2",CPU 环境下容易直接加载失败喵。建议按设备条件设置 attention 实现与 dtype 喵。🛠️ 建议修复
+ attn_impl = "flash_attention_2" if self.device == "cuda" else "eager" + load_dtype = torch.bfloat16 if self.device == "cuda" else torch.float32 raw_model = Qwen3TTSForConditionalGeneration.from_pretrained( model_path, - torch_dtype=torch.bfloat16, # 给 transformers 看 - dtype=torch.bfloat16, # 给 qwen 看 - attn_implementation="flash_attention_2", + torch_dtype=load_dtype, + dtype=load_dtype, + attn_implementation=attn_impl, device_map=self.device, # 旧的是cuda 硬编码 low_cpu_mem_usage=True )Also applies to: 442-448
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` around lines 228 - 235, The code unconditionally sets attn_implementation="flash_attention_2" and bfloat16 dtypes when calling Qwen3TTSForConditionalGeneration.from_pretrained, which will fail on CPU fallback; change the call so it branches on the runtime device (self.device / device_map) and only sets attn_implementation="flash_attention_2" and torch.bfloat16/dtype=torch.bfloat16 when the target device is CUDA (GPU); for CPU use attn_implementation=None (or omit the kwarg) and torch_dtype/dtype=torch.float32 (or remove dtype args) to ensure CPU loads succeed—update both occurrences around Qwen3TTSForConditionalGeneration.from_pretrained and the similar call later so the attn_implementation and dtype choices are conditional on the device.
428-428:⚠️ Potential issue | 🟠 Major
__END__使用外部current_job_id会回传错误 job_id 喵Line [611]-Line [614] 在 sender 协程里读取可变的
current_job_id,而__END__是异步到达的,可能对应的是旧任务,导致response.done绑定错会话喵。建议把job_id一起入队,避免跨协程共享状态歧义喵。Also applies to: 578-578, 606-615
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@local_server/qwen3_tts_server/local_server.py` at line 428, The end-marker currently enqueues only b"__END__" and the sender coroutine (which reads current_job_id) may bind response.done to the wrong job; change the protocol to enqueue tuples including the job id (e.g., (job_id, b"__END__") and for all audio chunks) so the consumer uses the job_id from the queue instead of the shared current_job_id; update places that call loop.call_soon_threadsafe(audio_queue.put_nowait, ...) and the sender coroutine that reads from audio_queue to unpack (job_id, chunk) and only mark response.done for that job_id (avoid relying on mutable current_job_id and adjust handling in the code paths around audio_queue usage and response.done accordingly).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Around line 634-684: Add a branch for msg_type == "session.update" inside
handle_tts that reads voice_pt_path and language from data and writes them to
the running session (e.g., self.voice_pt_path = data.get("voice_pt_path") and
self.language = data.get("language")), then force a prompt/version refresh by
calling self._ensure_active_prompt_loaded() and, if the returned current_version
!= last_voice_version, set last_voice_version = current_version and await
_stop_current_job(keep_buffer=True) to trigger the hot voice switch; also
consider sending a small response back over websocket (optional) to acknowledge
the update.
- Around line 515-517: You set original_dtype = torch.get_default_dtype() then
call torch.set_default_dtype(torch.bfloat16) but never restore it (and
original_dtype is unused), causing a linter F841 and a global dtype leak; update
the try/finally that currently ends with a bare pass to restore the original
dtype by calling torch.set_default_dtype(original_dtype) in the finally block
(and keep the original_dtype variable in scope), ensuring the default dtype is
reset even on exceptions.
---
Duplicate comments:
In `@local_server/qwen3_tts_server/local_server.py`:
- Line 109: Replace unsafe torch.load calls that pass weights_only=False with
weights_only=True to avoid deserialization execution risks: update the
torch.load invocations (e.g., the assignment to payload at the
torch.load(pt_path, map_location=self.device, ...) locations and the other
similar calls around the same file) to use weights_only=True while preserving
map_location=self.device; if the code relies on executing pickled globals,
continue to expose only vetted symbols via add_safe_globals (maintain or extend
the whitelist there) so loading still works without enabling full
deserialization.
- Around line 475-477: The except block that logs "❌ 0.6B 加载引擎异常" must abort
startup like _load_engine does; update the exception handler around the 0.6B
engine load so it not only logs the error (logger.error(...)) but also stops the
process (e.g., call sys.exit(1) or re-raise the exception) to prevent the server
from coming up in a non-functional "connected but silent" state; mirror the
failure behavior used in _load_engine to ensure a hard abort on load failure.
- Around line 228-235: The code unconditionally sets
attn_implementation="flash_attention_2" and bfloat16 dtypes when calling
Qwen3TTSForConditionalGeneration.from_pretrained, which will fail on CPU
fallback; change the call so it branches on the runtime device (self.device /
device_map) and only sets attn_implementation="flash_attention_2" and
torch.bfloat16/dtype=torch.bfloat16 when the target device is CUDA (GPU); for
CPU use attn_implementation=None (or omit the kwarg) and
torch_dtype/dtype=torch.float32 (or remove dtype args) to ensure CPU loads
succeed—update both occurrences around
Qwen3TTSForConditionalGeneration.from_pretrained and the similar call later so
the attn_implementation and dtype choices are conditional on the device.
- Line 428: The end-marker currently enqueues only b"__END__" and the sender
coroutine (which reads current_job_id) may bind response.done to the wrong job;
change the protocol to enqueue tuples including the job id (e.g., (job_id,
b"__END__") and for all audio chunks) so the consumer uses the job_id from the
queue instead of the shared current_job_id; update places that call
loop.call_soon_threadsafe(audio_queue.put_nowait, ...) and the sender coroutine
that reads from audio_queue to unpack (job_id, chunk) and only mark
response.done for that job_id (avoid relying on mutable current_job_id and
adjust handling in the code paths around audio_queue usage and response.done
accordingly).
In `@main_logic/tts_client.py`:
- Line 1667: The variable response_done_event declared at the top-level of the
tts client (response_done_event = asyncio.Event()) is unused and triggers lint
F841; remove that unused declaration from main_logic/tts_client.py (delete the
response_done_event = asyncio.Event() line) and any redundant imports if they
become unused as a result so no .set/.wait/.clear calls remain missing.
- Around line 1802-1803: The function local_qwen3_tts_worker currently ignores
the voice_id parameter and always uses a hardcoded voice_pt_path; update
local_qwen3_tts_worker to honor voice_id by either mapping voice_id to the
correct voice_pt_path (e.g., via a local dictionary or lookup function) or by
including/forwarding voice_id in the request payload sent to the local TTS
service instead of the hardcoded path; make the change where voice_pt_path is
set and where the request payload/body is assembled so callers can switch voices
dynamically (referencing local_qwen3_tts_worker and the voice_pt_path usage).
| msg_type = data.get("type") | ||
| if "text" in data and not msg_type: msg_type = "legacy.text" | ||
|
|
||
| if msg_type == "input_text_buffer.append": | ||
| text_fragment = data.get("text", "") | ||
| sentence_buffer += text_fragment | ||
|
|
||
| elif msg_type in ("input_text_buffer.commit", "legacy.text"): | ||
| if msg_type == "legacy.text": | ||
| sentence_buffer += data.get("text", "") | ||
|
|
||
| _, current_version = self._ensure_active_prompt_loaded() | ||
| if current_version != last_voice_version: | ||
| last_voice_version = current_version | ||
| # Hot voice switch: cancel current audio job but keep the buffered text. | ||
| # Otherwise the first committed sentence right after switching may be dropped. | ||
| await _stop_current_job(keep_buffer=True) | ||
|
|
||
| # 正则智能断句 | ||
| parts = re.split(r'([。!?.!?\n]+)', sentence_buffer) | ||
|
|
||
| if len(parts) > 1: | ||
| for i in range(0, len(parts) - 1, 2): | ||
| sentence = parts[i] + parts[i + 1] | ||
| sentence = sentence.strip() | ||
| if not sentence: continue | ||
|
|
||
| if not current_job_id: | ||
| current_job_id = str(uuid.uuid4()) | ||
| await websocket.send(json.dumps({"type": "response.start", "job_id": current_job_id})) | ||
|
|
||
| logger.info(f"📥 句子: {sentence[:20]}...") | ||
| prompt, _ = self._ensure_active_prompt_loaded() | ||
| self.task_queue.put((sentence, current_job_id, loop, audio_queue, cancel_event, prompt, self.language)) | ||
|
|
||
| sentence_buffer = parts[-1] | ||
|
|
||
| # 缓冲区兜底 (防止一直不说话) | ||
| if len(sentence_buffer) > self.buffer_fallback_chars: | ||
| sentence = sentence_buffer | ||
| sentence_buffer = "" | ||
| if not current_job_id: | ||
| current_job_id = str(uuid.uuid4()) | ||
| await websocket.send(json.dumps({"type": "response.start", "job_id": current_job_id})) | ||
| prompt, _ = self._ensure_active_prompt_loaded() | ||
| self.task_queue.put((sentence, current_job_id, loop, audio_queue, cancel_event, prompt, self.language)) | ||
|
|
||
| elif msg_type == "cancel": | ||
| await _stop_current_job() | ||
|
|
||
| finally: |
There was a problem hiding this comment.
服务端未处理 session.update,客户端传的 voice_pt_path/language 目前不生效喵
当前 handle_tts 只处理 append/commit/cancel/legacy;而客户端会在连接后发送 session.update。这会让音色路径与语言参数“看起来传了但实际上没用”喵。建议补一个 msg_type == "session.update" 分支,更新运行参数并触发 prompt 热切换喵。
🧰 Tools
🪛 Ruff (0.15.2)
[error] 635-635: Multiple statements on one line (colon)
(E701)
[warning] 653-653: String contains ambiguous ! (FULLWIDTH EXCLAMATION MARK). Did you mean ! (EXCLAMATION MARK)?
(RUF001)
[warning] 653-653: String contains ambiguous ? (FULLWIDTH QUESTION MARK). Did you mean ? (QUESTION MARK)?
(RUF001)
[error] 659-659: Multiple statements on one line (colon)
(E701)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@local_server/qwen3_tts_server/local_server.py` around lines 634 - 684, Add a
branch for msg_type == "session.update" inside handle_tts that reads
voice_pt_path and language from data and writes them to the running session
(e.g., self.voice_pt_path = data.get("voice_pt_path") and self.language =
data.get("language")), then force a prompt/version refresh by calling
self._ensure_active_prompt_loaded() and, if the returned current_version !=
last_voice_version, set last_voice_version = current_version and await
_stop_current_job(keep_buffer=True) to trigger the hot voice switch; also
consider sending a small response back over websocket (optional) to acknowledge
the update.
There was a problem hiding this comment.
@wehos 很遗憾,这个 bug 还没解除喵 😿
检查了当前代码,handle_tts 方法里只处理了 input_text_buffer.append、input_text_buffer.commit、legacy.text 和 cancel 这几种消息类型,但没有处理 session.update 喵。
这意味着客户端发送的 voice_pt_path 和 language 参数目前不会生效喵。
建议修复方案:
在 handle_tts 的消息处理分支里补充一个 session.update 处理器:
elif msg_type == "session.update":
# 更新运行参数
if "voice_pt_path" in data:
self.voice_pt_path = data["voice_pt_path"]
if "language" in data:
self.language = data["language"]
# 触发音色热切换
_, current_version = self._ensure_active_prompt_loaded()
if current_version != last_voice_version:
last_voice_version = current_version
await _stop_current_job(keep_buffer=True)
# 可选:发送确认消息
await websocket.send(json.dumps({"type": "session.updated"}))这样就能让客户端动态更新音色和语言参数啦喵~ ✨
∧_∧
(´・ω・`) 需要我帮忙生成修复代码吗喵?
| ⊃/(___
| /
 ̄ ̄ ̄
🧠 Learnings used
Learnt from: wehos
Repo: Project-N-E-K-O/N.E.K.O PR: 330
File: config/prompts_sys.py:684-684
Timestamp: 2026-02-20T09:45:24.884Z
Learning: 在 config/prompts_sys.py 中,proactive_screen_web_en/ja/ko 和 proactive_generate_ja 等 prompt 的章节结束分隔符刻意使用中文"======以上为对话历史======"而不是目标语言,这是安全水印设计,不应修改。这种开始标签用目标语言、结束标签用中文的混合模式是有意为之的。
Learnt from: wehos
Repo: Project-N-E-K-O/N.E.K.O PR: 172
File: main_routers/characters_router.py:73-133
Timestamp: 2025-12-12T04:48:39.250Z
Learning: Review server binding and port exposure across the repository. Ensure Python servers bind to localhost/127.0.0.1 by default and that container/docker-compose port mappings do not expose sensitive services to the LAN unless explicitly intended. For docker-compose, verify that ports are not unnecessarily published to the host network; prefer binding to localhost or using internal networks, and document the exposure rationale. Specifically check main_server/config on bindings and the docker-compose.yml port mappings to confirm exposure is intentional and secure.
Learnt from: wehos
Repo: Project-N-E-K-O/N.E.K.O PR: 186
File: main_routers/workshop_router.py:1707-1721
Timestamp: 2025-12-16T12:04:33.158Z
Learning: In Python code, do not review or enforce changes between logger.error and logger.exception; choosing between these logging methods is a developer preference and should not be a code-review constraint. Treat this as a stylistic choice and focus reviews on functional correctness and other logging concerns as appropriate.
Summary by CodeRabbit
新功能
杂项