P0 wave-1: security + correctness + Phase 2 foundation by debpalash · Pull Request #88 · debpalash/OmniVoice-Studio

debpalash · 2026-05-19T03:21:49Z

P0 security + correctness fixes plus Phase 2 foundation work, bundled for atomic CI validation.

Depends on PR #87

This PR includes the docs commits from PR #87 (research + plans). Merge #87 first; this PR will rebase down to 7 code commits automatically once #87 lands.

Code commits in this PR (the P0 wave-1)

#	SHA	Type	What
1	`f15f6d9`	P0 security	Loopback guard on `/ws/transcribe` before `accept()` — closes LAN-mic-streaming surface
2	`41c71f0`	P0 dub	Atomic WAV writes — closes #48
3	`b7758cd`	P0 supply-chain	Pin BtbN ffmpeg URL via single `FFMPEG_BTBN_VERSION` constant
4	`772a4bc`	P0 security	Remove `torch.load` monkey-patch in asr_backend (pickle safety)
5	`9667456`	docs	Phase 4 plan `<action>` blocks on checkpoint tasks (structure validation)
6	`282c10a`	Phase 2 prep	`TTSBackend.unload()` foundation for Phase 2 isolation
7	`470519d`	test	Fix `tests/test_capture_ws.py` TestClient host for loopback guard

Test status

✅ pytest tests/ → 243 passed, 6 skipped, 12 xfailed, 1 xpassed (full suite green)
✅ All 18 plans pass gsd-sdk verify.plan-structure (Phase 4 fix in commit 9667456 resolves the 3 missing-<action> errors)

Security context

Three of the seven commits are P0 security/supply-chain. They harden:

WebSocket loopback guard — matches the /system/* HTTP guard pattern from PR P0: release.yml typecheck + bind audit + loopback middleware #84 (PR security: add loopback origin check to /system/set-env #81 → P0: release.yml typecheck + bind audit + loopback middleware #84 → this PR is the security continuation)
Pickle safety — removes the torch.load monkey-patch that bypassed the recent PyTorch security advisory
Supply-chain pin — ffmpeg binary URL constant prevents the "transitively bumped, silently new URL" failure mode

Phase 2 readiness

Commit 282c10a adds TTSBackend.unload() as a default no-op on the ABC. This is the Phase 2 foundation contract — Phase 2's SubprocessBackend primitive (per Plan 02-01) requires unload() to release sidecar processes cleanly. Test file backend/tests/test_tts_backend_lifecycle.py pins the contract so Phase 2 has something to migrate against.

#48 status

The atomic WAV write fix (41c71f0) addresses #48 at one of the 11 write sites identified by Phase 2 research. Plan 02-02 will migrate the remaining sites — this PR is the first step.

Test plan

pytest tests/ → 243 passed, 0 failures
gsd-sdk verify.plan-structure on all 18 plans → all valid
Phase 0 cross-platform CI matrix (Tauri shell + Smoke + Tests on macOS/Windows/Linux)
Manual: bun desktop smoke after merge to confirm no widget regression

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Improved WAV file write reliability to prevent truncated or corrupted files during export.
Security
- Added loopback-only access restriction for the WebSocket transcription endpoint.
Chores
- Pinned FFmpeg build version for more consistent releases.
- Enhanced TTS backend resource management capabilities.

coderabbitai · 2026-05-19T03:22:29Z

📝 Walkthrough

Walkthrough

Fixes silent WAV export corruption and enforces stability contracts via atomic persistence, WebSocket origin guard, ASR safe deserialization, and TTS lifecycle method. Introduces FFmpeg version-pinning infrastructure and expands Phase 4 planning with hardware-verification checkpoints for GGUF and singing-mode integration decisions.

Changes

Code Stability & Lifecycle Hardening

Layer / File(s)	Summary
Atomic WAV writes to fix silent corruption (`#48`) `backend/services/audio_io.py`, `backend/api/routers/dub_generate.py`, `backend/tests/test_atomic_wav.py`	New `atomic_save_wav` helper writes to uniquely-named temp file within target directory, then atomically replaces destination via `os.replace()`, with `BaseException`-level cleanup on failure. Three dub-pipeline sites (RVC per-segment, deferred batch post-watermark, final assembled track) migrated from `torchaudio.save` to atomic persistence. Tests validate success atomicity, failure cleanup, and temp-file contiguity.
WebSocket loopback-only origin enforcement `backend/api/routers/capture_ws.py`, `backend/tests/test_capture_ws.py`, `tests/test_capture_ws.py`	`/ws/transcribe` endpoint checks `websocket.client.host` against `_LOOPBACK_HOSTS` before accept; closes non-loopback connections with code 1008 before handshake. Source-level regression tests prevent removal. Test fixture updated to pass explicit loopback client to `TestClient` constructor.
ASR pickle safety simplification & regression guard `backend/services/asr_backend.py`, `backend/tests/test_asr_pickle_safety.py`	Unsafe `torch.load` monkey-patching removed from `WhisperXBackend._ensure_asr()` during `whisperx.load_model` call; relies solely on `_allow_vad_pickle_globals()` safe-globals allowlist. Regression tests statically inspect source to prevent reintroduction of unsafe patterns: no `torch.load=`/`_ts.load=` reassignments, no `weights_only=False` in executable code, and verification of `add_safe_globals` API usage.
TTS backend lifecycle unload() contract `backend/services/tts_backend.py`, `backend/tests/test_tts_backend_lifecycle.py`	New `unload()` method added to `TTSBackend` ABC with idempotent no-op default implementation. Docstring contract specifies synchronous, safe-before-first-generate behavior. Contract tests verify method presence on ABC (not abstract), self-only signature, default behavior, and callable status on all subclasses for coordinated engine-switch cleanup.
FFmpeg version pinning infrastructure `.github/workflows/release.yml`, `frontend/src-tauri/src/tools.rs`	Introduced `FFMPEG_BTBN_VERSION` constant (defaults to "latest") across CI workflow and Tauri tools. Release workflow and Rust download functions now interpolate version into BtbN FFmpeg-Builds release URLs and log messages. Provides foundation for future pinning to specific builds while maintaining current behavior.

Phase 4 Execution Planning

Layer / File(s)	Summary
Phase 04-01 GGUF cross-hardware smoke testing & macOS Metal GO/NO-GO `.planning/phases/04-adaptive-specialty-engines-spike-first/04-01-PLAN.md`	Task 3 checkpoint adds human-verification workflow: run GGUF-06 smoke tests across 3 hardware classes + macOS Apple Silicon, code-review pinned omnivoice.cpp commit in browser, determine macOS Metal viability from arm64 binary presence, update SPIKE-01 ADR `Status:` and append `Verified on:` entry before approval.
Phase 04-02 singing-mode integration & acceptance testing `.planning/phases/04-adaptive-specialty-engines-spike-first/04-02-PLAN.md`	Task 1 checkpoint: reviewer reads `dub_pipeline.py` for Demucs/TTS routing, selects option-a (per-segment override) or option-b (inline detection), records Decision in SPIKE-02, awaits resume-signal before Task 2 sizing. Task 3 checkpoint: run 30-second smoke on `mixed-30s.wav`, validate four acceptance criteria (intelligible speech, melodic singing, voice identity consistency, bit-identical instrumental), verify license-gate, exercise option-dependent paths (per-segment override or detector CLI), update ADR `Status:` and `Verified on:` before approval.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

debpalash/OmniVoice-Studio#32: Foundational WebSocket endpoint introduced in /ws/transcribe; this PR adds the loopback-only origin guard and contracts for the same handler.
debpalash/OmniVoice-Studio#75: Both PRs touch backend/api/routers/dub_generate.py WAV-write paths—this PR uses atomic writes while that PR addresses OOM exception handling and retry logic in the same dub segment flow.

Poem

🐰 Corrupt WAV files no more!
Atomic writes lock the door—
Loopback guards keep secrets safe,
While GGUF tests find their place.
Phase by phase, the roadmap grows 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 21.74% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title references security, correctness, and Phase 2 foundation work which align with the seven code commits, but is somewhat generic and could be more specific about the key changes.
Description check	✅ Passed	The PR description fully complies with the template, including summary, detailed change list with commit table, type checkboxes, testing results, and security context. All required sections are present and comprehensive.
Linked Issues check	✅ Passed	Issue `#48` requires fixing corrupted exported WAV files. Commit `41c71f0` implements atomic_save_wav() to prevent truncated/corrupt files through atomic writes, directly addressing the root cause and closing the issue.
Out of Scope Changes check	✅ Passed	All changes are scoped to the stated PR objectives: three P0 security/correctness commits, one supply-chain pin, two Phase 2 preparation commits, one test fix, and one documentation update for Phase 4 validation. No extraneous changes detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch p0-wave1-security-correctness

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 19

🧹 Nitpick comments (2)

backend/services/audio_io.py (1)

40-41: ⚡ Quick win

Enforce WAV-only contract in atomic_save_wav.

This helper is explicitly WAV-scoped, but unrestricted **kwargs allows non-WAV formats to slip in accidentally. A simple guard (or kwargs.setdefault("format", "wav")) will keep callsites from silently violating the function contract.

Suggested patch

 def atomic_save_wav(
     target_path: str,
     audio: torch.Tensor,
     sample_rate: int,
     **kwargs: Any,
 ) -> None:
@@
+    fmt = kwargs.get("format", "wav")
+    if str(fmt).lower() != "wav":
+        raise ValueError(f"atomic_save_wav only supports WAV output, got format={fmt!r}")
+    kwargs["format"] = "wav"
+
     fd, tmp_path = tempfile.mkstemp(
         prefix=f".{target_base}.",
         suffix=".wav",
         dir=target_dir,
     )

Also applies to: 75-76

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/services/audio_io.py` around lines 40 - 41, atomic_save_wav currently
accepts unconstrained **kwargs which lets callers pass non‑WAV formats; enforce
the WAV-only contract by either setting kwargs.setdefault("format", "wav") at
the start of atomic_save_wav (and the other similar helper at the 75-76 region)
or explicitly validate kwargs.get("format") == "wav" and raise a ValueError if
not; update the function(s) named atomic_save_wav (and the other helper at the
second occurrence) to perform this guard so callsites cannot silently write
non‑WAV formats.

backend/tests/test_tts_backend_lifecycle.py (1)

123-131: ⚡ Quick win

Avoid broad Exception catch in the idempotency test.

Catching Exception here is unnecessary; direct calls let pytest fail naturally and keep lint clean.

Suggested fix

     def test_default_unload_is_idempotent(self):
         backend = self._make_minimal_subclass()
         # Two back-to-back calls must not raise. Real overriders need this
         # property to handle "user spam-clicks the engine switch" gracefully.
-        try:
-            backend.unload()
-            backend.unload()
-        except Exception as exc:  # pragma: no cover — failure surfaces here
-            pytest.fail(
-                f"Default TTSBackend.unload() not idempotent: {exc!r}. "
-                "Overriders must preserve this — make sure your override "
-                "is safe to call twice."
-            )
+        backend.unload()
+        backend.unload()

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/tests/test_tts_backend_lifecycle.py` around lines 123 - 131, The test
currently wraps two backend.unload() calls in a broad try/except that catches
Exception and calls pytest.fail; remove that try/except so the two calls to
backend.unload() (which exercise TTSBackend.unload()) are invoked directly and
any exception will naturally fail the test and satisfy linters — i.e., delete
the try/except block and keep the two backend.unload() calls in place (preserve
the existing comment if needed).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/release.yml:
- Line 270: FFMPEG_BTBN_VERSION is currently set to the mutable value "latest";
change both places that set this constant/variable (the CI workflow env var
FFMPEG_BTBN_VERSION and the Rust constant/assignment named FFMPEG_BTBN_VERSION
in tools.rs) to a pinned immutable autobuild tag like
"autobuild-YYYY-MM-DD-HH-MM" (use the same exact tag in both places) and add a
short comment that this must be intentionally bumped for updates; ensure no code
or workflow uses "latest" for FFMPEG_BTBN_VERSION going forward.

In
@.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-01-PLAN.md:
- Line 390: The frontend/backend contract mismatch: Pydantic model
LicenseAcceptBody currently defines engine_id but the frontend sends engineId;
pick one canonical name and make both sides consistent—either rename the
frontend payload to engine_id or update LicenseAcceptBody to accept engineId via
Pydantic alias (e.g., Field(..., alias="engineId")) and enable
allow_population_by_field_name where needed; ensure the validator that checks
engine_id against {"supertonic3"} still runs (adjust validator to reference the
model field name chosen) so the allow-list logic in LicenseAcceptBody continues
to function.
- Line 425: The verification command hard-codes a machine-specific absolute path
in the automated block; update the command string (the automated command that
currently starts with "cd /Users/user4/Desktop/voice-design/OmniVoice/frontend
&& bun run lint...") to use repo-relative paths (for example "cd frontend && bun
run lint 2>&1 | grep -i \"supertonic\" ; grep -c \"SupertonicLicenseDialog\"
src/components/SettingsEngines.jsx") so it runs correctly in other dev/CI
environments and avoid any absolute paths.

In
@.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-02-PLAN.md:
- Line 253: The automated command block currently hard-codes an absolute path
(/Users/user4/Desktop/voice-design/OmniVoice/frontend/src-tauri) making the plan
non-portable; change the automated entry to use repository-relative paths (e.g.,
start with cd frontend/src-tauri && ...) and keep the rest of the pipeline
intact (cargo build --release 2>&1 | grep -E 'error\[|warning:' | head -20 ; cd
frontend/src-tauri && cargo test bootstrap config 2>&1 | tail -30 ; test "$(grep
-c UV_INDEX_URL src/bootstrap.rs)" = "0") so references to UV_INDEX_URL and
src/bootstrap.rs remain correct while removing local absolute paths.

In
@.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-RESEARCH.md:
- Line 195: The markdown has fenced code blocks using plain triple backticks
(```) that trigger markdownlint MD040; update each offending fence by adding a
language identifier (e.g., change ``` to ```text or a specific language) so the
code blocks are fenced like ```text to satisfy the linter wherever you see those
bare ``` fences in the file.
- Around line 35-36: The document contains a contradiction about mirror
rotation: clarify and pick one policy and make both statements consistent;
update the references to `frontend/src-tauri/resources/mirrors.json` and the
sentence about rotation at “via app updates only” so they match (for example:
"mirrors.json is rotatable without a release but changes are distributed only
via signed app updates" or "mirrors.json rotates only via app updates"); ensure
the `uv sync --frozen` / `uv.lock` paragraph remains unchanged in meaning and
cross-references the chosen mirror-rotation policy so the two sections no longer
conflict.

In @.planning/phases/04-adaptive-specialty-engines-spike-first/04-02-PLAN.md:
- Around line 7-20: Update the files_modified list in
.planning/phases/04-adaptive-specialty-engines-spike-first/04-02-PLAN.md to
include the backend settings store and the UI override control so the plan
accurately tracks required changes: add backend/services/settings_store.py (to
cover the "settings_store" changes referenced around lines 225–228) and the
frontend UI file that implements per-segment override controls (add the relevant
component/file name you will modify, e.g., a segment override control component
under your frontend/ui or webapp/components path referenced around line 35);
ensure both filenames are listed in the files_modified array so reviewers and
automation see the required implementation surfaces.

In @.planning/phases/04-adaptive-specialty-engines-spike-first/04-RESEARCH.md:
- Line 117: Three fenced code blocks are missing language identifiers (MD040);
update the opening fences for the blocks referenced at lines 117, 165, and 207
by adding the appropriate language tags: change the fence at 117 to ```text, the
fence at 165 to ```bash, and the fence at 207 to ```python so the markdown
linter passes.

In @.planning/phases/05-opt-in-bug-reporting/05-01-PLAN.md:
- Around line 126-129: The plan currently states spill files are created with
mode 0644 which allows other local users to read bug reports; change the
creation semantics so the spill file is created with owner-only permissions
(0600) and its parent directory is created with restricted permissions (e.g.,
0700) or created after temporarily setting umask; update the text that now reads
"File mode is 0644" to "File mode is 0600" and add a note to ensure code uses
explicit file creation flags (e.g., open/os.open with mode 0o600 or equivalent)
rather than relying on default umask so spill files are never world-readable.

In @.planning/phases/05-opt-in-bug-reporting/05-03-PLAN.md:
- Around line 31-32: Update the plan to clarify that
`@tauri-apps/plugin-opener.openUrl` cannot be invoked from headless CI; change the
proposed tests (referenced as tests/cross_platform/test_opener_url.py and the
ci.yml entries) to only validate URL format/encoding/https in unit/CI smoke runs
and remove any claim that the Tauri plugin "returns without error" in headless
runners, and add a note that actual plugin invocation must be covered by
e2e/integration tests against a running Tauri app (also apply same clarification
for the repeated text referenced on lines 46-47).

In @.planning/phases/05-opt-in-bug-reporting/05-RESEARCH.md:
- Line 11: The phase summary references "v0.3.x" and an explicit "v0.3.1" patch
which conflicts with the locked-release policy; edit the Phase 5 paragraph to
remove or replace the "v0.3.1" and "v0.3.x" phrasing (keep the strategic
meaning) so it only references a single v0.3.0 release window (e.g., "v0.3.0" or
"the v0.3 release") and ensure the sentence about compounding downstream value
remains intact; update mentions of "support-scaling lever" and the quoted
maintainer phrase only if they implicitly imply incremental patching so the text
aligns with the rule “Release tag v0.3.0 ships only once.”
- Around line 181-233: The unlabeled fenced code blocks containing the ASCII
diagrams (the triple-backtick blocks that start with "┌─────────────────────┐"
and the similar block later around the second diagram) are triggering MD040; add
a language identifier (e.g., ```text) immediately after each opening ``` to mark
them as plain text (do this for both the big diagram block shown and the other
fenced block noted in the comment) so the markdown linter stops flagging them.

In @.planning/phases/06-release-verification-retro/06-02-PLAN.md:
- Around line 87-90: Replace the permissive per-job guard pattern
(`github.event_name != 'pull_request' || startsWith(github.head_ref,
'release/')`) with a single strict guard on all release-mutating steps (every
publish/sign/softprops/action-gh-release step, e.g., the checksum-publish step
at release.yml:498) by using `if: github.event_name == 'push' &&
startsWith(github.ref, 'refs/tags/v')`; update the guard in the relevant steps
so tag pushes only trigger writes and ensure any steps referencing secrets
(TAURI_SIGNING_PRIVATE_KEY*, signing functions) are protected by this same
condition.

In @.planning/phases/06-release-verification-retro/06-03-PLAN.md:
- Around line 164-165: Update the tar extraction logic in the lib.rs
implementation to require tar >= 0.4.45 and replace any direct use of
Archive::unpack with explicit per-entry handling: iterate Archive::entries(),
for each Entry validate the entry path (reject components containing ".." or
absolute paths), resolve the destination path against the intended extraction
root and ensure it stays inside that root, and explicitly reject or safely
handle symlink/hardlink entries whose targets would escape the root; log and
return an error on any violation rather than relying on the crate default
behavior.

In @.planning/phases/06-release-verification-retro/06-04-PLAN.md:
- Around line 215-216: The gh issue filtering uses a GNU-only date invocation
(`gh issue list --state open --created ">=$(date -d '24 hours ago'
--iso-8601)"`) which fails on macOS/Windows; replace the `date -d ...` portion
with a cross-platform timestamp generator (e.g. a Python or Node one-liner that
emits an ISO-8601 timestamp for 24 hours ago) and update the command in the text
so the `gh issue list --state open --created ">=<TIMESTAMP>"` pattern works on
macOS, Linux and Windows; ensure you update the literal command line in the
document (the line containing `gh issue list --state open --created ...`) and
keep the rest of the instructions unchanged.
- Line 160: Update the plan's closure counts so the summary criteria and any
references (e.g., REL-06) reflect 13 closures instead of 11: find the text
instances that state "11" in the summary criteria and at the noted occurrences
around lines 329-330 and change them to "13" (also update any explanatory
parenthetical that mentions the original count or discrepancy to reference the
expansion to 13 issues).

In @.planning/phases/06-release-verification-retro/06-RESEARCH.md:
- Around line 768-776: The table row with A6/A7 is breaking the 4-column
Markdown table due to unescaped vertical-bar-like characters inside the "Claim"
or "Risk if Wrong" cells (notably the inline snippet draft: ${{ inputs.draft ||
'true' }} in the A6 row and the quoted text in A7); fix by removing or escaping
any literal '|' characters inside cells (wrap code/clauses in backticks or
replace '|' with &`#124`; or rephrase the sentence) so the row for A6 (referencing
draft: ${{ inputs.draft || 'true' }} and tauri-action releaseDraft) and A7
(referencing REL-04/Discord delta and PR `#84`) renders as a 4-column row.
- Around line 685-687: The comment notes a mismatch between the declared "11
originally-open issues" and the for loop that lists 13 issue numbers; update
either the descriptive text or the loop so they match. Locate the for loop that
iterates "for n in 35 42 48 54 55 56 57 58 60 65 72 76 80; do" and either remove
or add issue numbers so the list contains 11 entries, or change the adjacent
text that reads "11 originally-open issues" to the correct count (13) to keep
REL-06 consistent; ensure the chosen fix updates only the loop or the count text
and that the numbers accurately reflect the intended issue set.

In `@backend/tests/test_tts_backend_lifecycle.py`:
- Line 37: Remove the unnecessary inline noqa suppression from the late import
line; the project’s linter configuration does not enable WPS433/PLC0415 so the
comment is ineffective. Edit the import statement "from services import
tts_backend  # noqa: WPS433 — late import is the point" to drop the " # noqa:
WPS433 — late import is the point" suffix, leaving the plain import; retain the
existing docstring that documents the intentional late import.

---

Nitpick comments:
In `@backend/services/audio_io.py`:
- Around line 40-41: atomic_save_wav currently accepts unconstrained **kwargs
which lets callers pass non‑WAV formats; enforce the WAV-only contract by either
setting kwargs.setdefault("format", "wav") at the start of atomic_save_wav (and
the other similar helper at the 75-76 region) or explicitly validate
kwargs.get("format") == "wav" and raise a ValueError if not; update the
function(s) named atomic_save_wav (and the other helper at the second
occurrence) to perform this guard so callsites cannot silently write non‑WAV
formats.

In `@backend/tests/test_tts_backend_lifecycle.py`:
- Around line 123-131: The test currently wraps two backend.unload() calls in a
broad try/except that catches Exception and calls pytest.fail; remove that
try/except so the two calls to backend.unload() (which exercise
TTSBackend.unload()) are invoked directly and any exception will naturally fail
the test and satisfy linters — i.e., delete the try/except block and keep the
two backend.unload() calls in place (preserve the existing comment if needed).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 39369101-a4d8-4cdc-aaf3-ad74d346decb

📥 Commits

Reviewing files that changed from the base of the PR and between e4dbf4c and 470519d.

📒 Files selected for processing (35)

.github/workflows/release.yml
.planning/decisions/SPIKE-01-gguf.md
.planning/decisions/SPIKE-02-singing.md
.planning/phases/02-engine-isolation-subprocessbackend-indextts-wav-export-dubbi/02-01-PLAN.md
.planning/phases/02-engine-isolation-subprocessbackend-indextts-wav-export-dubbi/02-02-PLAN.md
.planning/phases/02-engine-isolation-subprocessbackend-indextts-wav-export-dubbi/02-03-PLAN.md
.planning/phases/02-engine-isolation-subprocessbackend-indextts-wav-export-dubbi/02-04-PLAN.md
.planning/phases/02-engine-isolation-subprocessbackend-indextts-wav-export-dubbi/02-RESEARCH.md
.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-01-PLAN.md
.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-02-PLAN.md
.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-RESEARCH.md
.planning/phases/04-adaptive-specialty-engines-spike-first/04-01-PLAN.md
.planning/phases/04-adaptive-specialty-engines-spike-first/04-02-PLAN.md
.planning/phases/04-adaptive-specialty-engines-spike-first/04-RESEARCH.md
.planning/phases/05-opt-in-bug-reporting/05-01-PLAN.md
.planning/phases/05-opt-in-bug-reporting/05-02-PLAN.md
.planning/phases/05-opt-in-bug-reporting/05-03-PLAN.md
.planning/phases/05-opt-in-bug-reporting/05-RESEARCH.md
.planning/phases/06-release-verification-retro/06-01-PLAN.md
.planning/phases/06-release-verification-retro/06-02-PLAN.md
.planning/phases/06-release-verification-retro/06-03-PLAN.md
.planning/phases/06-release-verification-retro/06-04-PLAN.md
.planning/phases/06-release-verification-retro/06-RESEARCH.md
CLAUDE.md
backend/api/routers/capture_ws.py
backend/api/routers/dub_generate.py
backend/services/asr_backend.py
backend/services/audio_io.py
backend/services/tts_backend.py
backend/tests/test_asr_pickle_safety.py
backend/tests/test_atomic_wav.py
backend/tests/test_capture_ws.py
backend/tests/test_tts_backend_lifecycle.py
frontend/src-tauri/src/tools.rs
tests/test_capture_ws.py

coderabbitai · 2026-05-19T03:29:11Z

        shell: bash
        env:
          TRIPLE: ${{ matrix.rust_target }}
+          FFMPEG_BTBN_VERSION: "latest"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify whether ffmpeg source pin is immutable or still floating. set -euo pipefail echo "release.yml value:" rg -n 'FFMPEG_BTBN_VERSION:\s*".*"' .github/workflows/release.yml echo echo "tools.rs value:" rg -n 'pub const FFMPEG_BTBN_VERSION:\s*&str\s*=\s*".*";' frontend/src-tauri/src/tools.rs echo echo "Any floating 'latest' pins:" rg -n 'FFMPEG_BTBN_VERSION.*latest|ffmpeg-master-latest' .github/workflows/release.yml frontend/src-tauri/src/tools.rs || true

Repository: debpalash/OmniVoice-Studio

Length of output: 495

FFMPEG_BTBN_VERSION must be pinned to an immutable version to ensure reproducible release builds.

Line 270 and frontend/src-tauri/src/tools.rs line 32 both set this to "latest", which is mutable and reintroduces supply-chain drift risk. Pin an immutable autobuild tag (e.g., autobuild-YYYY-MM-DD-HH-MM) and bump intentionally when updates are needed.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/release.yml at line 270, FFMPEG_BTBN_VERSION is currently set to the mutable value "latest"; change both places that set this constant/variable (the CI workflow env var FFMPEG_BTBN_VERSION and the Rust constant/assignment named FFMPEG_BTBN_VERSION in tools.rs) to a pinned immutable autobuild tag like "autobuild-YYYY-MM-DD-HH-MM" (use the same exact tag in both places) and add a short comment that this must be intentionally bumped for updates; ensure no code or workflow uses "latest" for FFMPEG_BTBN_VERSION going forward.

coderabbitai · 2026-05-19T03:29:11Z

+           settings_store.set_license_accepted(body.engine_id, body.accepted)
+           return {"ok": True}
+       ```
+       With a Pydantic model `LicenseAcceptBody(engine_id: str, accepted: bool)` and a validator on `engine_id` against `{"supertonic3"}` (allow-list — do not let arbitrary engine IDs through). Per Pattern 4 in 03-RESEARCH.md and Phase 1's loopback-origin pattern (T-03-04 in threat model).


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align the license payload field name across frontend and backend.

Line 390 defines LicenseAcceptBody(engine_id: str, accepted: bool), but Line 412 sends engineId. This contract mismatch will fail unless explicit aliasing is added.

Use one canonical field (engine_id) end-to-end, or document and implement a Pydantic alias intentionally.

Also applies to: 412-413

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-01-PLAN.md at line 390, The frontend/backend contract mismatch: Pydantic model LicenseAcceptBody currently defines engine_id but the frontend sends engineId; pick one canonical name and make both sides consistent—either rename the frontend payload to engine_id or update LicenseAcceptBody to accept engineId via Pydantic alias (e.g., Field(..., alias="engineId")) and enable allow_population_by_field_name where needed; ensure the validator that checks engine_id against {"supertonic3"} still runs (adjust validator to reference the model field name chosen) so the allow-list logic in LicenseAcceptBody continues to function.

coderabbitai · 2026-05-19T03:29:11Z

+    4. Verify the Tauri command exists. If `frontend/src-tauri/src/lib.rs` does not yet expose a `set_license_accepted` invoke target, add a thin forwarder there that POSTs to `http://127.0.0.1:{backend_port}/settings/license` with the JSON body — mirror the pattern in `frontend/src-tauri/src/backend.rs::call_backend_json` (already in tree per Phase 1 Plan 01-01 frontmatter). Do NOT add a freeform engine_id allow — hard-code `engineId: "supertonic3"` server-side.
+  </action>
+  <verify>
+    <automated>cd /Users/user4/Desktop/voice-design/OmniVoice/frontend &amp;&amp; bun run lint 2>&amp;1 | grep -i "supertonic" ; grep -c "SupertonicLicenseDialog" src/components/SettingsEngines.jsx</automated>


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace machine-specific absolute paths in verification commands.

Line 425 hard-codes /Users/user4/Desktop/..., which is non-portable and brittle for all other contributors/CI environments.

Switch to repo-relative commands (e.g., cd frontend && ...) so the plan is executable everywhere.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-01-PLAN.md at line 425, The verification command hard-codes a machine-specific absolute path in the automated block; update the command string (the automated command that currently starts with "cd /Users/user4/Desktop/voice-design/OmniVoice/frontend && bun run lint...") to use repo-relative paths (for example "cd frontend && bun run lint 2>&1 | grep -i \"supertonic\" ; grep -c \"SupertonicLicenseDialog\" src/components/SettingsEngines.jsx") so it runs correctly in other dev/CI environments and avoid any absolute paths.

coderabbitai · 2026-05-19T03:29:11Z

+    Do NOT touch `backend.rs::spawn_backend`'s `HF_ENDPOINT=https://hf-mirror.com` block (already in tree at lines 168-173) — that path remains unchanged. The `hf_endpoint` map in `mirrors.json` is informational/documentation for Phase 3; future plans may wire it into `spawn_backend` to make HF endpoint data-driven, but that is OUT OF SCOPE for Phase 3 per the Deferred Ideas list in 03-RESEARCH.md.
+  </action>
+  <verify>
+    <automated>cd /Users/user4/Desktop/voice-design/OmniVoice/frontend/src-tauri &amp;&amp; cargo build --release 2&gt;&amp;1 | grep -E 'error\[|warning:' | head -20 ; cd /Users/user4/Desktop/voice-design/OmniVoice/frontend/src-tauri &amp;&amp; cargo test bootstrap config 2&gt;&amp;1 | tail -30 ; test "$(grep -c UV_INDEX_URL /Users/user4/Desktop/voice-design/OmniVoice/frontend/src-tauri/src/bootstrap.rs)" = "0"</automated>


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use repository-relative verification commands instead of absolute local paths.

Line 253 hard-codes /Users/user4/Desktop/..., which makes the plan non-reproducible for other machines and CI.

Use relative paths from repo root (cd frontend/src-tauri && ...) to keep execution portable.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.planning/phases/03-supertonic-3-engine-installer-mirror-reliability/03-02-PLAN.md at line 253, The automated command block currently hard-codes an absolute path (/Users/user4/Desktop/voice-design/OmniVoice/frontend/src-tauri) making the plan non-portable; change the automated entry to use repository-relative paths (e.g., start with cd frontend/src-tauri && ...) and keep the rest of the pipeline intact (cargo build --release 2>&1 | grep -E 'error\[|warning:' | head -20 ; cd frontend/src-tauri && cargo test bootstrap config 2>&1 | tail -30 ; test "$(grep -c UV_INDEX_URL src/bootstrap.rs)" = "0") so references to UV_INDEX_URL and src/bootstrap.rs remain correct while removing local absolute paths.

coderabbitai · 2026-05-19T03:29:12Z

+```
+
+11 originally-open issues for end-state confirmation:
+35, 42, 48, 54, 55, 56, 57, 58, 60, 65, 72, 76, 80 (13 total; the "11" in REL-06 is the original CLAUDE.md count, expanded to 13 after #76 / #80 / #72 were added per the inbox grow during planning — record the discrepancy in the retro).


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Normalize closure counts to 13 everywhere in this plan.

This file still references “11” in summary criteria while enumerating 13 tracked issues, which weakens release signoff clarity.

Also applies to: 329-330

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.planning/phases/06-release-verification-retro/06-04-PLAN.md at line 160, Update the plan's closure counts so the summary criteria and any references (e.g., REL-06) reflect 13 closures instead of 11: find the text instances that state "11" in the summary criteria and at the noted occurrences around lines 329-330 and change them to "13" (also update any explanatory parenthetical that mentions the original count or discrepancy to reference the expansion to 13 issues).

coderabbitai · 2026-05-19T03:29:12Z

+    2. At each check-in, scan all 3 surfaces: (a) Discord #help and #bugs channels — anything tagged or mentioned about v0.3.0-rc1 or the rc draft release; (b) `gh issue list --state open --created ">=$(date -d '24 hours ago' --iso-8601)"` (recent issues); (c) the opt-in auto-bug-report inbox (Phase 5 REPORT-* — if any reports landed via the prefilled URL pattern they show up as labeled `auto-report` in GH Issues).
+    3. If ANY regression report lands during the window: STOP. Cut rc2: fix the issue on a PR to main; merge; re-tag `v0.3.0-rc2`; restart Tasks 1+2+3 against rc2.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace GNU-only date -d examples with cross-platform-safe commands.

These instructions are likely to be run on macOS during release ops, where date -d is unavailable by default.

Based on learnings: “All fixes must work across macOS … Windows … and Linux with no platform-only regressions.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.planning/phases/06-release-verification-retro/06-04-PLAN.md around lines 215 - 216, The gh issue filtering uses a GNU-only date invocation (`gh issue list --state open --created ">=$(date -d '24 hours ago' --iso-8601)"`) which fails on macOS/Windows; replace the `date -d ...` portion with a cross-platform timestamp generator (e.g. a Python or Node one-liner that emits an ISO-8601 timestamp for 24 hours ago) and update the command in the text so the `gh issue list --state open --created ">=<TIMESTAMP>"` pattern works on macOS, Linux and Windows; ensure you update the literal command line in the document (the line containing `gh issue list --state open --created ...`) and keep the rest of the instructions unchanged.

coderabbitai · 2026-05-19T03:29:12Z

+for n in 35 42 48 54 55 56 57 58 60 65 72 76 80; do
+  gh issue edit $n --milestone v0.3.0
+done


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Issue baseline count is internally inconsistent with REL-06.

Line 685 says “11 originally-open issues,” but the loop currently assigns 13 issue numbers. This will skew milestone/retro metrics and closure validation.

🔧 Suggested fix

-# 2. Assign all 11 originally-open issues to the v0.3.0 milestone -for n in 35 42 48 54 55 56 57 58 60 65 72 76 80; do +# 2. Assign all originally-open issues to the v0.3.0 milestone +# Keep this list synchronized with REL-06 baseline and the verification grid. +for n in 35 42 48 54 55 56 57 58 60 65 72 76 80; do

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.planning/phases/06-release-verification-retro/06-RESEARCH.md around lines 685 - 687, The comment notes a mismatch between the declared "11 originally-open issues" and the for loop that lists 13 issue numbers; update either the descriptive text or the loop so they match. Locate the for loop that iterates "for n in 35 42 48 54 55 56 57 58 60 65 72 76 80; do" and either remove or add issue numbers so the list contains 11 entries, or change the adjacent text that reads "11 originally-open issues" to the correct count (13) to keep REL-06 consistent; ensure the chosen fix updates only the loop or the count text and that the numbers accurately reflect the intended issue set.

coderabbitai · 2026-05-19T03:29:12Z

+| # | Claim | Section | Risk if Wrong |
+|---|-------|---------|---------------|
+| A1 | User declines to authorize Apple Developer enrollment ($99/yr) for Phase 6 | "Locked Decisions" + Pitfall #4 | If user actually wants notarization in v0.3.0, Phase 6 scope expands; ~2-5 days added (enrollment delay + tauri-action APPLE_* secrets setup + notarize iteration). |
+| A2 | PR #73 should be reimplemented over rebased | "Locked Decisions" + Pitfall #3 | If rebase strategy is preferred, `lib.rs` conflict resolution is the load-bearing step — review carefully, don't accept auto-merge. |
+| A3 | `v0.3.0` milestone target is "after Phase 5 completes," not a calendar date | "Release flow" diagram | If a calendar deadline (e.g., "ship by end of June") is in play, soak/rc strategy may compress; rc2 becomes likely. |
+| A4 | Project-wide `gh issue list --milestone v0.3.0` baseline can be established at Phase 6 start by adding the milestone retroactively | "Issue closure query" code example | If milestone is created mid-phase, issues closed before that may need manual milestone-attribution. |
+| A5 | Discord support-volume baseline is feasibly measurable retroactively from channel history | Pitfall #6 | If no Discord history access or the channel is too noisy to count, REL-04's "Discord delta" becomes qualitative; PLAN.md should acknowledge this. |
+| A6 | release.yml's `draft: ${{ inputs.draft || 'true' }}` default applies to tag pushes too (publishes as draft) | Anti-Patterns: "Forgetting `gh release edit --draft=false`" | Need to verify by reading tauri-action's handling of `releaseDraft`; if tag-push actually publishes non-draft by default, the "promote step" in the runbook isn't needed. |
+| A7 | The Critic's Phase 0 retrospective items (migration-roundtrip CI test, bind-address audit, sibling-endpoint loopback) are already closed by Phase 1 plans + PR #84 | Brief mention in Critic context | If migration-roundtrip is NOT in Phase 1 plans, Phase 6 PLAN.md needs to add it as a pre-tag verification step. **Need to grep Phase 1 plans during planning.** |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assumptions table row is malformed and may render incorrectly.

The row around Line 775 is triggering a column-count mismatch (extra pipes in the “Risk if Wrong” cell). Escape or rephrase inline pipes so the table stays 4-column stable.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 775-775: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing

(MD056, table-column-count)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.planning/phases/06-release-verification-retro/06-RESEARCH.md around lines 768 - 776, The table row with A6/A7 is breaking the 4-column Markdown table due to unescaped vertical-bar-like characters inside the "Claim" or "Risk if Wrong" cells (notably the inline snippet draft: ${{ inputs.draft || 'true' }} in the A6 row and the quoted text in A7); fix by removing or escaping any literal '|' characters inside cells (wrap code/clauses in backticks or replace '|' with &`#124`; or rephrase the sentence) so the row for A6 (referencing draft: ${{ inputs.draft || 'true' }} and tauri-action releaseDraft) and A7 (referencing REL-04/Discord delta and PR `#84`) renders as a 4-column row.

coderabbitai · 2026-05-19T03:29:12Z

+    that some subclasses pull in at module import. We don't need a live
+    engine to validate the ABC contract.
+    """
+    from services import tts_backend  # noqa: WPS433 — late import is the point


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify the invalid noqa and whether external lint codes are configured. rg -n 'noqa:\s*WPS433|lint\.external|external\s*='

Repository: debpalash/OmniVoice-Studio

Length of output: 197

🏁 Script executed:

# Find Ruff configuration files fd -e toml -e ini -e cfg | grep -E 'ruff|pyproject'

Repository: debpalash/OmniVoice-Studio

Length of output: 95

🏁 Script executed:

# Check pyproject.toml for Ruff config if it exists cat pyproject.toml 2>/dev/null | head -100

Repository: debpalash/OmniVoice-Studio

Length of output: 4149

🏁 Script executed:

# Check for ruff.toml cat ruff.toml 2>/dev/null

Repository: debpalash/OmniVoice-Studio

Length of output: 1185

🏁 Script executed:

# Look at the context around line 37 in the test file sed -n '30,45p' backend/tests/test_tts_backend_lifecycle.py

Repository: debpalash/OmniVoice-Studio

Length of output: 645

Remove the ineffective # noqa comment; no enabled linting rule flags this late import.

Neither WPS433 (flake8-wemake, not enabled) nor the suggested PLC0415 (pylint convention, not enabled) are valid in this project's Ruff configuration. Since no import-checking rule is currently active, the noqa suppression is unnecessary:

- from services import tts_backend # noqa: WPS433 — late import is the point + from services import tts_backend

The docstring already explains why the late import is intentional.

🧰 Tools

🪛 Ruff (0.15.13)

[warning] 37-37: Invalid rule code in # noqa: WPS433

Add non-Ruff rule codes to the lint.external configuration option

(RUF102)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/tests/test_tts_backend_lifecycle.py` at line 37, Remove the unnecessary inline noqa suppression from the late import line; the project’s linter configuration does not enable WPS433/PLC0415 so the comment is ineffective. Edit the import statement "from services import tts_backend # noqa: WPS433 — late import is the point" to drop the " # noqa: WPS433 — late import is the point" suffix, leaving the plain import; retain the existing docstring that documents the intentional late import.

The streaming-ASR WebSocket accepted any local connection without an origin check. Any process running as the same user (rogue extension, malware, sibling Electron app) could open ws://127.0.0.1:3900/ws/transcribe and exfiltrate the user's live microphone audio in real time. HTTP routers gate sensitive endpoints with Depends(require_loopback) at the router level (see backend/api/dependencies.py). FastAPI's WebSocket dependency injection differs across versions, so the guard is inlined in ws_transcribe before websocket.accept() — non-loopback origins receive a 1008 (Policy Violation) close and never see the open socket. Three source-level tests in backend/tests/test_capture_ws.py guard against regression — same shape as tests/test_bind_host.py: - references _LOOPBACK_HOSTS in the handler - closes non-loopback with code 1008 - close() appears before accept() in the source Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Direct torchaudio.save(path, ...) writes bytes as the encoder produces them. SIGKILL, OOM kill, or Tauri sidecar reap mid-write leaves the file at `path` truncated. Downstream tools (ffmpeg in the dub mux, NLEs the user imports the WAV into) happily read truncated RIFF — the header appears first, then the data chunk gets cut short — and surface as silently corrupt audio later in the pipeline. That's the shape of #48. New helper services/audio_io.py:atomic_save_wav() writes to a sibling temp file in the same directory then os.replace() into place. POSIX rename(2) is atomic; os.replace() ports the same guarantee to Windows. Either the target ends up with a complete WAV or it keeps its previous contents (or never exists) — no third state. Migrated three call sites in api/routers/dub_generate.py: - L289: RVC per-segment write - L328: deferred batch write of all segments - L390: final mixed-track export Left L508 alone — it writes to BytesIO (in-memory response body), no atomicity needed. Implementation note (recorded as a docstring in audio_io.py): the temp file must end in `.wav`, not `.tmp`. torchaudio.save infers the output format from the path suffix and ignores the `format=` kwarg with the soundfile backend. A `.tmp` suffix raises "Unsupported format: tmp". The leading dot + target-name prefix still marks the file as transient. Tests in backend/tests/test_atomic_wav.py: - success path: writes valid WAV, no temp leaks, overwrites cleanly - atomicity: target unchanged when save raises (pre-existing target) - atomicity: target absent when save raises (new target path) - no temp leaks on failure - temp file lives in target_dir (cross-fs renames are not atomic) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…constant The BtbN/FFmpeg-Builds download URL in release.yml + tools.rs hard-coded `latest` in two places per URL (release tag + filename datestamp), four total URL strings. That made an actual pin — flipping from `latest` to e.g. `autobuild-2026-04-15-12-50` — a fragile 4-place find-and-replace. More importantly, `latest` is a moving target: BtbN can retag, the build can regress, the binary can fail Windows SmartScreen, and the user- facing failure mode is a 2am page when fresh installs start hanging on ffmpeg download. This commit doesn't change the pinned version — it stays at `latest` to preserve current behaviour — but introduces the pin discipline as infrastructure: • release.yml: FFMPEG_BTBN_VERSION env var on the ffmpeg-bundling step. The single string drives both the {tag} and {datestamp} substitutions in the URL. • tools.rs: pub const FFMPEG_BTBN_VERSION (next to UV_VERSION). Same string drives both Linux and Windows first-run downloads via format!(). • Comments in both files: link to BtbN releases page so the maintainer can pick a real autobuild tag, and a note that the two constants must match. Pinning is now a two-file, one-string edit instead of a four-file, eight-string edit. The next maintainer cycle (or this one) should flip `latest` to a specific autobuild tag — but that decision needs a human choosing a known-good build, not an autonomous guess. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous code in _ensure_asr reassigned both torch.load and torch.serialization.load to a wrapper that forced weights_only=False for the duration of whisperx.load_model. The surrounding comment called this "belt-and-braces" — it was the opposite. The patch defeated PyTorch's secure unpickler *globally for the process* during the window, so any other code running on a different thread (or invoked transitively by whisperx itself) could have deserialised an attacker- controlled pickle without a warning. The correct mitigation is already present: add_safe_globals(...) in _allow_vad_pickle_globals registers exactly the trusted pickle classes the whisperx-shipped pyannote VAD checkpoint needs. The secure load path (weights_only=True) then succeeds without us disabling it. If pyannote ever updates the checkpoint with a new pickle class, loading fails loudly and we extend the allowlist — far better than silently leaving the secure unpickler off forever. Source-level guards in backend/tests/test_asr_pickle_safety.py block the patch from coming back. The guards strip Python comments and string literals via tokenize before matching, so the rationale explaining what was removed doesn't trip them. Same shape as tests/test_bind_host.py: - torch.load / torch.serialization.load not reassigned in code - no executable weights_only=False - _allow_vad_pickle_globals() still defined and still called in _ensure_asr - add_safe_globals API still referenced Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…validation

@AbstractMethod

Every backend today lazily loads its weights on first generate() and keeps them in VRAM for the lifetime of the process. Switching engines in Settings leaks the old engine's allocations until the next process restart — measurable on multi-engine sessions on 8 GB MPS Macs. This commit adds the contract the registry will call into in Phase 2: TTSBackend.unload() — a default no-op on the ABC, idempotent, returns None. Wave 1 deliberately does *not* make it @AbstractMethod, which would break every one of the 9 existing subclasses that haven't migrated. Phase 2 flips it to abstract alongside per-engine overrides and a CI gate that fails when a new engine forgets to implement it. Tests in backend/tests/test_tts_backend_lifecycle.py pin the contract so Phase 2 has something to migrate against: - unload() exists on the ABC - unload() is NOT abstract today (Wave 1 invariant) - signature is `(self) -> None` - default returns None - default is idempotent (two back-to-back calls don't raise) - every existing subclass has a callable unload() (via inheritance) If any of these break, the registry can no longer safely call unload() during a fast engine-switch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Follow-up to f15f6d9 (WS loopback guard on /ws/transcribe). The functional tests in tests/test_capture_ws.py constructed TestClient with the Starlette default client=("testclient", 50000), which the guard's _LOOPBACK_HOSTS allow-list rejects. PR #84 established the fix pattern for HTTP tests: explicitly pass client=("127.0.0.1", 50000) to TestClient — same security property, same semantics. Applied here. 3 tests previously failing now pass: - test_eof_text_frame_triggers_final_without_disconnect - test_legacy_disconnect_still_finalizes - test_empty_binary_frame_acts_as_eof

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

backend/tests/test_tts_backend_lifecycle.py (1)
62-77: 💤 Low value

Minor: Remove None from return annotation check.

The assertion at line 74 checks if sig.return_annotation in (None, type(None), "None", inspect.Signature.empty). However, None (the singleton value) will never appear as a return annotation. Return annotations are either:

A string "None" (with from __future__ import annotations)

The type type(None) (without future annotations)

inspect.Signature.empty (if unannotated)

The check for None is logically incorrect, though harmless since "None" is also in the tuple and will match.
♻️ Proposed fix
-        assert sig.return_annotation in (None, type(None), "None", inspect.Signature.empty), (
+        assert sig.return_annotation in (type(None), "None", inspect.Signature.empty), (
             f"TTSBackend.unload should return None (or be unannotated); "
             f"got annotation {sig.return_annotation!r}."
         )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/tests/test_tts_backend_lifecycle.py` around lines 62 - 77, Remove the
incorrect literal None from the return-annotation acceptance check in
test_unload_signature_takes_self_only: update the assertion that inspects
sig.return_annotation (for TTSBackend.unload) to only accept type(None), the
string "None", or inspect.Signature.empty (since the actual return annotation
will never be the singleton None); leave the rest of the test unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/tests/test_tts_backend_lifecycle.py`:
- Around line 140-158: The comment in test_all_subclasses_have_callable_unload
says "at least 9 engines" but the assertion checks for >=1; update the test to
make the expectation consistent by changing the assertion to assert
len(subclasses) >= 9 (or alternatively change the comment to reflect a defensive
>=1 check); modify the assertion in the test_all_subclasses_have_callable_unload
function (referencing the subclasses variable and the assertion block) so the
numeric check matches the comment (or update the comment to match the current
>=1 behavior).

---

Nitpick comments:
In `@backend/tests/test_tts_backend_lifecycle.py`:
- Around line 62-77: Remove the incorrect literal None from the
return-annotation acceptance check in test_unload_signature_takes_self_only:
update the assertion that inspects sig.return_annotation (for TTSBackend.unload)
to only accept type(None), the string "None", or inspect.Signature.empty (since
the actual return annotation will never be the singleton None); leave the rest
of the test unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 426d3aa7-4c8d-4663-9a44-a826015921a1

📥 Commits

Reviewing files that changed from the base of the PR and between 470519d and 71c10dc.

📒 Files selected for processing (14)

.github/workflows/release.yml
.planning/phases/04-adaptive-specialty-engines-spike-first/04-01-PLAN.md
.planning/phases/04-adaptive-specialty-engines-spike-first/04-02-PLAN.md
backend/api/routers/capture_ws.py
backend/api/routers/dub_generate.py
backend/services/asr_backend.py
backend/services/audio_io.py
backend/services/tts_backend.py
backend/tests/test_asr_pickle_safety.py
backend/tests/test_atomic_wav.py
backend/tests/test_capture_ws.py
backend/tests/test_tts_backend_lifecycle.py
frontend/src-tauri/src/tools.rs
tests/test_capture_ws.py

✅ Files skipped from review due to trivial changes (3)

backend/tests/test_capture_ws.py
.planning/phases/04-adaptive-specialty-engines-spike-first/04-01-PLAN.md
.planning/phases/04-adaptive-specialty-engines-spike-first/04-02-PLAN.md

🚧 Files skipped from review as they are similar to previous changes (9)

.github/workflows/release.yml
tests/test_capture_ws.py
backend/api/routers/capture_ws.py
frontend/src-tauri/src/tools.rs
backend/services/asr_backend.py
backend/tests/test_atomic_wav.py
backend/services/audio_io.py
backend/api/routers/dub_generate.py
backend/tests/test_asr_pickle_safety.py

coderabbitai · 2026-05-19T04:01:33Z

+    def test_all_subclasses_have_callable_unload(self):
+        tts = _load_tts_backend_module()
+        subclasses = [
+            cls for cls in vars(tts).values()
+            if isinstance(cls, type)
+            and issubclass(cls, tts.TTSBackend)
+            and cls is not tts.TTSBackend
+        ]
+        # Sanity: the file is supposed to ship at least 9 engines today.
+        assert len(subclasses) >= 1, (
+            "No TTSBackend subclasses found in services.tts_backend. "
+            "Did the registry split into another module without updating "
+            "this test?"
+        )
+        for cls in subclasses:
+            assert callable(getattr(cls, "unload", None)), (
+                f"{cls.__name__} has no callable unload() — even via the "
+                "ABC inheritance. Did someone shadow it?"
+            )


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inconsistency: Comment claims "at least 9 engines" but assertion checks for ≥1.

Line 149 has a comment stating "the file is supposed to ship at least 9 engines today", but the assertion only checks assert len(subclasses) >= 1. This weak assertion won't catch regressions where engines are accidentally removed from the registry. Consider either:

Updating the assertion to match the comment: assert len(subclasses) >= 9

Updating the comment to match the assertion's defensive intent

The current mismatch could mislead future maintainers about the expected registry size.

🔧 Proposed fix (option 1: strengthen assertion)

- # Sanity: the file is supposed to ship at least 9 engines today. - assert len(subclasses) >= 1, ( + # Sanity: the file ships 9 engines as of Wave 1 (OmniVoice, VoxCPM2, + # MOSS-TTS-Nano, KittenTTS, MLX-Audio, CosyVoice, IndexTTS2, GPT-SoVITS, Sherpa-ONNX). + assert len(subclasses) >= 9, ( "No TTSBackend subclasses found in services.tts_backend. " "Did the registry split into another module without updating " "this test?"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/tests/test_tts_backend_lifecycle.py` around lines 140 - 158, The comment in test_all_subclasses_have_callable_unload says "at least 9 engines" but the assertion checks for >=1; update the test to make the expectation consistent by changing the assertion to assert len(subclasses) >= 9 (or alternatively change the comment to reflect a defensive >=1 check); modify the assertion in the test_all_subclasses_have_callable_unload function (referencing the subclasses variable and the assertion block) so the numeric check matches the comment (or update the comment to match the current >=1 behavior).

coderabbitai Bot reviewed May 19, 2026

View reviewed changes

debpalash and others added 7 commits May 19, 2026 09:24

docs(phase-4): add <action> blocks to checkpoint tasks for structure …

6b49290

…validation

debpalash force-pushed the p0-wave1-security-correctness branch from 470519d to 71c10dc Compare May 19, 2026 03:55

coderabbitai Bot reviewed May 19, 2026

View reviewed changes

debpalash merged commit 651e63b into main May 19, 2026
8 checks passed

		2. At each check-in, scan all 3 surfaces: (a) Discord #help and #bugs channels — anything tagged or mentioned about v0.3.0-rc1 or the rc draft release; (b) `gh issue list --state open --created ">=$(date -d '24 hours ago' --iso-8601)"` (recent issues); (c) the opt-in auto-bug-report inbox (Phase 5 REPORT-* — if any reports landed via the prefilled URL pattern they show up as labeled `auto-report` in GH Issues).
		3. If ANY regression report lands during the window: STOP. Cut rc2: fix the issue on a PR to main; merge; re-tag `v0.3.0-rc2`; restart Tasks 1+2+3 against rc2.

Conversation

debpalash commented May 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Depends on PR #87

Code commits in this PR (the P0 wave-1)

Test status

Security context

Phase 2 readiness

#48 status

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

debpalash commented May 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 19, 2026 •

edited

Loading