test(probe): whole-app coverage — dubbing, i18n, engines, security, migration, dictation, design by debpalash · Pull Request #247 · debpalash/OmniVoice-Studio

debpalash · 2026-06-02T07:26:46Z

What

Expands the probe harness from one happy-path spec per layer to whole-app feature coverage — web, backend, dictation, clone, design — keeping the Actor/Judge split (no LLM on the verdict path) and offline-by-default + enable-on-demand for heavy paths. A single subprocess boot is shared across all backend-touching specs.

New feature specs

Spec	Verifies	Runs for real here?
dubbing (`dub_export`)	per-segment duration ratio (core dubbing constraint), SRT/VTT well-formed, export-archive contents; output language-ID (advisory)	✅ offline (synthetic)
i18n (`i18n_parity`)	locale files valid JSON (gate); orphan-keys + coverage (advisory)	✅ against real locales
engine matrix (`engines`)	active TTS/ASR engine available + every unavailable engine explains how to install it (11 TTS / 7 ASR backends)	✅ real backend
security	system routes reject non-loopback origins (403)	✅ real backend
migration	alembic UPGRADE on the seeded `omnivoice_data` fixture (backward-compat)	✅ real backend
dictation	streaming-ASR WebSocket `/ws/transcribe` registered + accepts loopback handshake	✅ real backend
voice design	design output runs the audio-correctness ladder	✅ offline (synthetic)
coverage-critic	every declared layer still has a spec (drift gate) + API-surface inventory (advisory)	✅ real (156 routes)
real ASR round-trip	TTS→Whisper→WER	enable-on-demand (`PROBE_E2E=1`)

🐞 Real bug surfaced

The i18n orphan-key check found that all 20 non-en locales carry gallery.cat_* (and some bootstrap.lines) keys that are absent from the en reference — so those gallery-category labels don't render for English users. It's reported in the advisory lane (non-blocking), since the fix is a product change, not a harness change. Worth a follow-up to add the missing keys to en.json.

Design

One shared subprocess boot (boot_capture session fixture) captures first-run endpoints, engine/ASR matrices, loopback-reject, OpenAPI inventory, and the dictation WS handshake — so the suite pays for ~one boot, not one per spec. Subprocess isolation keeps it from contaminating the rest of the test session.
Heavy/live paths skip cleanly: real ASR (PROBE_E2E=1), live browser, Docker.

Verification

uv run pytest tests/probe → 74 passed, 5 skipped.
Full repo → 687 passed, 13 skipped, 13 xfailed, 0 failures; no contamination.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Comprehensive probe framework for dubbing, voice-design, streaming ASR/dictation, engine matrices, i18n parity, DB migration checks, and loopback-security; new judges for engines, dubbing/export, subtitles, dictation handshakes, and locale coverage.
Documentation
- README: added "Feature coverage (specs)" section and noted an i18n missing-key surfaced as an advisory.
Tests
- Numerous probe tests added (E2E and synthetic) exercising the new probes and judges.

…ation, dictation, design, coverage-critic Broadens the probe harness from one happy-path spec per layer to whole-app feature coverage (web, backend, dictation, clone, design), keeping the Actor/Judge split and offline-by-default + enable-on-demand for heavy paths. New specs + judges (one subprocess boot shared across backend-touching specs): - dubbing (L4): segment duration-ratio, SRT/VTT well-formed, export-archive contents, output language-ID (advisory) - i18n: locale files valid JSON (gate); orphan-keys + coverage (advisory). NOTE: surfaced a real bug — all 20 non-en locales carry gallery.cat_*/ bootstrap.lines keys absent from the en reference (reported, not gated). - engine matrix: active engine available + every unavailable engine explains why (11 TTS / 7 ASR backends via /engines/*) - loopback security: system routes reject non-loopback origins (403) - DB migration: alembic UPGRADE on the seeded omnivoice_data fixture - Coverage Critic: every declared layer still has a spec (gate) + API inventory - dictation: streaming-ASR WebSocket /ws/transcribe registered + handshake - voice design: reuses the audio-correctness ladder - real ASR round-trip: enable-on-demand (PROBE_E2E=1) Enriched _boot_runner.py to capture engines/asr/loopback/openapi/ws in ONE isolated boot (conftest boot_capture session fixture); added env.seeded_data_dir. 13 specs total. probe suite 74 passed / 5 skipped; full repo 687 passed, 0 failures. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-02T07:26:57Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: eb258f15-3767-4ed6-baff-142deb5d0533

📥 Commits

Reviewing files that changed from the base of the PR and between 7a8993a and 0b6d823.

📒 Files selected for processing (2)

tests/probe/judges/dubbing.py
tests/probe/test_probe_migration.py

🚧 Files skipped from review as they are similar to previous changes (2)

tests/probe/test_probe_migration.py
tests/probe/judges/dubbing.py

📝 Walkthrough

Walkthrough

This PR expands the probe testing framework with comprehensive backend validation. It enhances the boot runner to capture engine matrices, loopback rejection, websocket routes, and OpenAPI paths; adds judge modules for dubbing, i18n, engine, and coverage validation; introduces eight new probe specifications; and wires test implementations for each probe domain.

Changes

Probe Test Suite Expansion

Layer / File(s)	Summary
Boot infrastructure and session fixtures `tests/probe/_boot_runner.py`, `tests/probe/conftest.py`, `tests/probe/env.py`, `tests/probe/README.md`	Expands out-of-process boot runner to capture engine matrices, loopback rejection status, websocket routes, and OpenAPI paths. Adds session-scoped `boot_capture` fixture and `seeded_data_dir` context manager for test reuse. Documents probe spec coverage in README.
Dubbing pipeline validation `tests/probe/judges/dubbing.py`, `tests/probe/specs/dub_export.probe.yaml`, `tests/probe/test_probe_dubbing.py`	Judges validate segment duration ratios, SRT/VTT subtitle structure, archive entry patterns, and output language via pluggable detector protocol. Tests cover ratio bounds, subtitle well-formedness, archive matching, and language detection with/without detector.
Internationalization localization checks `tests/probe/judges/i18n.py`, `tests/probe/specs/i18n_parity.probe.yaml`, `tests/probe/test_probe_i18n.py`	Judges enforce locale JSON validity, detect orphan keys across non-reference locales, and measure translation coverage. Tests validate JSON parsing, orphan-key detection, and coverage measurement against frontend locale files.
Engine matrix and coverage analysis judges `tests/probe/judges/engine.py`, `tests/probe/judges/coverage.py`, `tests/probe/spec.py`	Engine judges validate backend presence, active engine availability, and unavailability explanations. Coverage judges scan probe specs, derive API route prefixes, and enforce required layer coverage. Registry wiring extends `JUDGE_REGISTRY` with new judge keys.
Engine matrix and coverage probe tests `tests/probe/specs/engines.probe.yaml`, `tests/probe/specs/coverage_critic.probe.yaml`, `tests/probe/test_probe_engines.py`, `tests/probe/test_probe_coverage.py`	Engine probe tests validate TTS/ASR registries and default engine availability. Coverage probe asserts required spec layers are present and validates OpenAPI surface meets thresholds.
Security and loopback rejection validation `tests/probe/specs/security.probe.yaml`, `tests/probe/test_probe_security.py`	Security probe validates non-loopback origins receive HTTP 403 from `/system/info`. Test verifies boot-captured loopback rejection status is 403.
Dictation WebSocket validation `tests/probe/judges/dictation.py`, `tests/probe/specs/dictation.probe.yaml`, `tests/probe/test_probe_dictation.py`	Dictation judges validate WebSocket endpoint registration and handshake success. Spec and tests verify `/ws/transcribe` contract (endpoint present and connection succeeds).
Voice design and audio output validation `tests/probe/specs/voice_design.probe.yaml`, `tests/probe/test_probe_design.py`	Voice design probe exercises `POST /generate` with fixed parameters, captures audio, and validates decoding, numerical validity, duration/silence, and ASR accuracy. Test includes fixture for deterministic synthetic audio.
Database migration and data integrity `tests/probe/specs/migration.probe.yaml`, `tests/probe/test_probe_migration.py`	Migration probe boots from seeded omnivoice_data fixture and validates health endpoint. Test confirms database integrity after upgrade on pre-existing data.
ASR end-to-end and transcriber wiring `tests/probe/test_probe_asr_e2e.py`	ASR tests validate FasterWhisper transcriber API conformance and lazy model loading, plus optional real round-trip validation with word error rate thresholds.

Sequence Diagram(s)

sequenceDiagram
  participant Tester as Probe Test Runner
  participant Boot as _boot_runner.py
  participant Env as seeded/fresh Env
  participant Client as TestClient
  participant App as FastAPI_app
  Tester->>Boot: invoke capture_first_run with data-dir
  Boot->>Env: prepare data-dir (fresh or seeded)
  Boot->>Client: start app under TestClient
  Client->>App: GET /engines/tts, GET /engines/asr
  Client->>App: GET /system/info (non-lifespan loopback check)
  Client->>App: open /ws/transcribe websocket
  Boot->>Boot: snapshot DB files pre/post and collect app.openapi()/ws routes

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

debpalash/OmniVoice-Studio#245: Prior probe-harness work that this change extends (boot runner and conftest fixture evolution).
debpalash/OmniVoice-Studio#84: Related loopback-origin middleware and /system/info loopback enforcement exercised by the security probes.
debpalash/OmniVoice-Studio#88: Related WebSocket loopback guard and /ws/transcribe behavior exercised by dictation probes.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 35.09% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The pull request title accurately and concisely summarizes the main change: expanding test probe coverage to include multiple new feature areas (dubbing, i18n, engines, security, migration, dictation, design).
Description check	✅ Passed	The pull request description is comprehensive and well-structured, covering what was changed, which new specs were added, the design rationale, verification results, and even a real bug discovered. It exceeds template requirements.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch debpalash/probe-coverage

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-06-02T07:31:28Z

Greptile Summary

This PR expands the probe harness from a per-layer skeleton into whole-app feature coverage: dubbing timing/export, i18n parity, engine matrices (11 TTS + 7 ASR backends), loopback security, DB migration, streaming-ASR dictation, and voice design. All new specs share a single subprocess boot via the session-scoped boot_capture fixture, and heavy paths (real ASR, browser, Docker) skip cleanly unless opted in.

Three bugs flagged by the previous review round were all resolved in commit 7a8993a: the open() file-handle leak in scan_specs, the weak advisory-only assertion in test_probe_i18n, and the misleading "(blocking)" docstring on locale_no_orphan_keys.
The dictation layer introduced by this PR is not included in the required drift-gate list inside coverage_critic.probe.yaml, so future deletion of dictation.probe.yaml would go undetected by the coverage critic.
The i18n orphan-key check correctly surfaces the real gallery.cat_* / bootstrap.lines gap in en.json as an advisory finding without blocking CI.

Confidence Score: 5/5

Safe to merge — all new judges are deterministic and offline-capable; the shared boot fixture is correctly scoped and doesn't leak temp directories or lifespan state to other tests.

The three bugs identified in the prior review round are confirmed fixed. The only remaining finding is that the new dictation layer was added to the spec set but not registered in the coverage-critic's required list, meaning its spec isn't guarded by the drift gate going forward. That gap doesn't affect the current test run and can be addressed with a one-line YAML change.

tests/probe/specs/coverage_critic.probe.yaml — the required layer list should include dictation to protect the new spec from future drift.

Important Files Changed

Filename	Overview
tests/probe/specs/coverage_critic.probe.yaml	Drift-gate spec for the probe suite. The newly added `dictation` layer is not included in the `required` list, so the gate won't catch future deletion of `dictation.probe.yaml`.
tests/probe/test_probe_migration.py	Migration probe — `run_judges` is correctly placed inside the `with seeded_data_dir()` block so the `path_exists` check sees the DB before teardown.
tests/probe/judges/dubbing.py	New dubbing judges: duration-ratio, SRT/VTT well-formedness, archive membership, and pluggable language-ID — all deterministic and well-guarded against zero/missing inputs.
tests/probe/judges/i18n.py	i18n judges covering valid-JSON gate, orphan-key detection (advisory), and coverage reporting. Docstring now correctly labels `locale_no_orphan_keys` as advisory.
tests/probe/_boot_runner.py	Shared-boot subprocess extended to capture engine matrices, loopback-reject status, dictation WS handshake, OpenAPI inventory, and WS route table in a single app boot.
tests/probe/conftest.py	Session-scoped `boot_capture` fixture added; shares a single subprocess boot across engine, security, coverage, and dictation specs.

Sequence Diagram

sequenceDiagram
    participant pytest
    participant boot_capture (session fixture)
    participant _boot_runner subprocess
    participant FastAPI app

    pytest->>boot_capture (session fixture): request (once per session)
    boot_capture (session fixture)->>_boot_runner subprocess: spawn (fresh data dir)
    _boot_runner subprocess->>FastAPI app: TestClient enter (lifespan start)
    _boot_runner subprocess->>FastAPI app: GET /health, /system/info, /model/status
    _boot_runner subprocess->>FastAPI app: GET /engines/tts, /engines/asr
    _boot_runner subprocess->>FastAPI app: GET /system/info (non-loopback client → 403)
    _boot_runner subprocess->>FastAPI app: WS /ws/transcribe (handshake only)
    _boot_runner subprocess->>FastAPI app: app.openapi() + app.routes
    _boot_runner subprocess->>boot_capture (session fixture): write JSON capture
    boot_capture (session fixture)->>pytest: return capture dict (shared)

    pytest->>test_probe_engines: run
    pytest->>test_probe_security: run
    pytest->>test_probe_dictation: run
    pytest->>test_probe_coverage: run
    Note over test_probe_engines,test_probe_coverage: all read from shared boot_capture dict

    pytest->>test_probe_migration: run (own seeded_data_dir + subprocess)
    pytest->>test_probe_i18n: run (reads real locale files directly)
    pytest->>test_probe_dubbing: run (offline synthetic data)
    pytest->>test_probe_design: run (offline synthetic audio)

_{Reviews (3): Last reviewed commit: "fix(probe): ASCII x in dubbing detail (r..." | Re-trigger Greptile}

coderabbitai

Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/probe/_boot_runner.py`:
- Around line 78-80: Replace storing the raw exception string in
ctx["ws_transcribe_error"] (inside the except Exception as exc block) with a
sanitized value: set ctx["ws_transcribe_connected"]=False as now but assign
ctx["ws_transcribe_error"] to a non-sensitive identifier such as the exception
type (type(exc).__name__) or a short normalized reason (e.g.,
"WebSocketHandshakeError" or exc.__class__.__name__ + optional safe suffix),
truncated to 200 chars if needed; ensure no user paths or secret-like substrings
from str(exc) are persisted.
- Around line 92-99: The current logic sets ctx["db_created"] =
bool(ctx["db_path"]) which flags seeded runs as "created"; instead, detect
pre-boot DB state and post-boot DB state separately: search data_dir before the
boot and store that result (e.g., ctx["pre_db_path"] or ctx["pre_existing_db"]),
then run the boot and re-scan into ctx["post_db_path"] (or ctx["db_path"]) and
set ctx["db_created"] = True only when pre-boot had no DB and post-boot does;
update any code that reads ctx["db_created"] to use this fresh-dir-only
semantics and keep the same keys (ctx["db_path"] can represent the post-boot
path while adding the pre-boot key).

In `@tests/probe/judges/coverage.py`:
- Around line 20-23: The loop that loads probe specs currently calls
yaml.safe_load(open(path, encoding="utf-8")) which leaves file descriptors open;
change it to open each spec with a context manager (with open(path,
encoding="utf-8") as fh:) and pass fh to yaml.safe_load so the file is closed
immediately after parsing, then keep the existing logic that sets doc =
yaml.safe_load(...) or {} and appends {"feature": doc.get("feature"), "layer":
doc.get("layer"), "file": os.path.basename(path")} to out.

In `@tests/probe/judges/dubbing.py`:
- Around line 33-46: The loop currently skips segments with non-positive source
duration so an empty or fully-skipped input returns passed=True; change the
logic to track how many segments were actually checked (e.g., add a checked
counter incremented when src > 0), and if checked == 0 set passed=False and
return a JudgeResult indicating zero valid segments (use JudgeResult name
"segments_duration_ratio", passed=False, measured=0 and a clear detail message
like "no valid source durations" or similar); keep the existing outlier
detection using _seg_durations and outliers but set measured to len(outliers)
when checked>0 and compute ok as (checked>0 and not outliers) so malformed
captures won't produce false greens.

In `@tests/probe/judges/i18n.py`:
- Around line 31-49: The current judge passes when no locale files exist; update
locale_valid_json to fail when len(locales) == 0 or when there are parse errors
by computing passed = bool(locales) and passed = passed and not bad (i.e.,
passed only if at least one locale and no bad files), set measured =
len(locales), and update the detail string to reflect three cases: success ("N
locale files parse"), parse errors ("invalid JSON: [...]"), and empty directory
("no locale files found"); modify functions _load_locales and locale_valid_json
references accordingly and add a small regression test that creates an empty
locales directory and asserts that locale_valid_json returns passed=False and
measured=0.

In `@tests/probe/specs/coverage_critic.probe.yaml`:
- Around line 11-15: The gate "layers_have_specs" currently hard-codes the
required layers in the required array (missing "meta"), so update it to derive
required layers from a single source of truth instead of a static list: compute
the required set from $.specs (e.g. collect each spec's layer value) and use
that derived list for layers_have_specs.required (or at minimum add "meta" to
the static array to match the suite), ensuring the coverage_report advisory
continues to reference $.openapi_paths and $.specs unchanged.

In `@tests/probe/specs/dub_export.probe.yaml`:
- Around line 13-17: The probe is missing capture of the dub_audio context key
before the advisory reads it; add a capture entry mapping dub_audio: $.dub_audio
in the capture block (the same place that currently lists segments, srt, vtt,
zip_names) so the advisory and the language-ID actor step have a real source
value instead of relying on synthetic injection; apply the same change to the
other capture block referenced around lines 27-29 to ensure both advisory usages
get dub_audio captured.

In `@tests/probe/specs/migration.probe.yaml`:
- Around line 9-13: The probe currently only checks boot health via judge.checks
entries status_eq ($.health_status) and json_field_eq on $.health_body.status,
which doesn't verify data preservation; add at least one seeded-data invariant
check (e.g., assert a known seeded row or count and/or schema-version) by adding
a new judge.checks entry such as json_field_eq or numeric comparison against a
seeded endpoint field (for example assert $.seeded_user.id == <expected> or
$.users_count == <expected> or schema_version == <expected>) so the probe
verifies existing user data or schema version in addition to health.

In `@tests/probe/test_probe_asr_e2e.py`:
- Around line 29-33: The test currently gates on PROBE_ASR_SAMPLE using
os.path.exists, which allows directories and causes downstream failures; change
the check to require a regular file by using os.path.isfile on the sample
variable and call pytest.skip if not a file so the ceil of
T.FasterWhisperTranscriber(transcribe...) only runs with a valid file path.

In `@tests/probe/test_probe_design.py`:
- Around line 28-36: The test test_voice_design_verdict currently only calls
probe_spec.run_judges with a prebuilt context so it never exercises the
actor/capture path for the new POST /generate contract; update the test to
either (A) run the spec through the actor step instead of only run_judges
(invoke the probe runner function that executes actors — e.g.,
probe_spec.run_actors or the equivalent API in the test harness) using the
loaded spec from probe_spec.load_spec and the designed_audio so the actual
actor/capture flow is executed, or (B) add explicit assertions after
probe_spec.load_spec that the spec's actions include a request targeting POST
/generate and that the capture path includes $.output_path (inspect the loaded
spec object fields to assert the method/path and capture selector); modify
test_voice_design_verdict to choose one of these fixes so the new /generate
contract and $.output_path capture are validated.

In `@tests/probe/test_probe_engines.py`:
- Around line 20-22: The test is pinned to the "whisperx" engine; change it to
assert the captured ASR active engine instead. Replace the second assertion that
calls E.engine_available(boot_capture["engines_asr"], "whisperx") with
E.engine_available(boot_capture["engines_asr"],
boot_capture["engines_asr"]["active"]) (or otherwise assert that
boot_capture["engines_asr"]["active"] is available) so the probe checks the
recorded active engine rather than a hard-coded id.

In `@tests/probe/test_probe_migration.py`:
- Around line 19-24: The test currently uses db_ok = context["db_created"] (from
tests/probe/_boot_runner.py) which only checks for a DB file and can pass even
if migrations dropped or mutated seeded data; instead, open the database
referenced by context["db_path"] in test_probe_migration.py and assert a
concrete seeded-data invariant (for example: a known table exists, a specific
migrated column is present, or a specific row/value from
tests/fixtures/omnivoice_data is still present). Replace the final assert db_ok
with a targeted assertion that queries the DB for the expected table/row/value
or schema change so the test fails if migrations altered the seeded data
unexpectedly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 1ed63786-68f9-44e6-ad42-2df302409fd3

📥 Commits

Reviewing files that changed from the base of the PR and between 24a00be and 88e75e7.

📒 Files selected for processing (27)

tests/probe/README.md
tests/probe/_boot_runner.py
tests/probe/conftest.py
tests/probe/env.py
tests/probe/judges/coverage.py
tests/probe/judges/dictation.py
tests/probe/judges/dubbing.py
tests/probe/judges/engine.py
tests/probe/judges/i18n.py
tests/probe/spec.py
tests/probe/specs/coverage_critic.probe.yaml
tests/probe/specs/dictation.probe.yaml
tests/probe/specs/dub_export.probe.yaml
tests/probe/specs/engines.probe.yaml
tests/probe/specs/i18n_parity.probe.yaml
tests/probe/specs/migration.probe.yaml
tests/probe/specs/security.probe.yaml
tests/probe/specs/voice_design.probe.yaml
tests/probe/test_probe_asr_e2e.py
tests/probe/test_probe_coverage.py
tests/probe/test_probe_design.py
tests/probe/test_probe_dictation.py
tests/probe/test_probe_dubbing.py
tests/probe/test_probe_engines.py
tests/probe/test_probe_i18n.py
tests/probe/test_probe_migration.py
tests/probe/test_probe_security.py

- coverage.py:22 — use `with open(...)` context to close spec files after yaml.safe_load (file handle leak) - _boot_runner.py:80 — store only `type(exc).__name__` for WS errors; drop raw str(exc) that could leak home paths / secrets into capture JSON - _boot_runner.py:99 — snapshot DB files before boot; set db_created=True only when boot creates NEW files (not when fixture already had one) - dubbing.py:46 — FAIL segments_duration_ratio when validated==0 (guards against empty/corrupt segment list passing vacuously) - i18n.py:49 — FAIL locale_valid_json when locales_dir is empty/missing - i18n.py:7 — fix docstring: locale_no_orphan_keys is advisory, not blocking - test_probe_i18n.py:59 — assert r.passed is False, not just r.advisory - coverage_critic.probe.yaml:15 — add "meta" to required layers list - dub_export.probe.yaml:17 — capture dub_audio in steps before advisory reads it - migration.probe.yaml:13 — add path_exists(db_path) data-integrity check - test_probe_asr_e2e.py:33 — os.path.exists → os.path.isfile for PROBE_ASR_SAMPLE - test_probe_migration.py:24 — assert context["db_path"] (presence) not db_created (new creation), aligning with the boot_runner fix Two findings intentionally skipped with reasons (see review thread replies): test_probe_design.py:36 — offline pattern is intentional; actor step is bypassed by design throughout the probe suite for CI compatibility test_probe_engines.py:22 — whisperx pin is intentional; it verifies the shipped default ASR engine is available out-of-the-box Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tests/probe/test_probe_migration.py (1)
22-25: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The “data integrity” check still never inspects migrated data.

Line 25 only asserts that db_path is a non-empty string. The spec already runs path_exists on the same value, so this adds no protection, and the test still won't catch a migration that recreates or mutates the seeded DB while leaving a file behind. Please assert a concrete invariant from the copied omnivoice_data database instead of path presence alone.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/probe/test_probe_migration.py` around lines 22 - 25, The test currently
only asserts context["db_path"] exists but doesn't validate migrated data; open
the SQLite DB at context["db_path"] (the same variable used earlier) and run a
concrete query against the seeded omnivoice_data (e.g., verify a known table
exists and contains an expected row or count) and assert the result, so the test
fails if the migration recreated or mutated the seeded DB; use the test's
sqlite3 connection or existing DB helper to SELECT a known record/rowcount from
the omnivoice_data table and assert it matches the expected value.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/probe/judges/dubbing.py`:
- Line 59: Replace the Unicode multiplication sign in the f-string that builds
the detail message so it uses ASCII "x"; specifically update the f-string that
currently reads detail=f"all {validated} segment(s) within [{min_ratio},
{max_ratio}]×" in tests/probe/judges/dubbing.py to use "x" instead of "×" (the
variables validated, min_ratio, and max_ratio remain unchanged).

---

Duplicate comments:
In `@tests/probe/test_probe_migration.py`:
- Around line 22-25: The test currently only asserts context["db_path"] exists
but doesn't validate migrated data; open the SQLite DB at context["db_path"]
(the same variable used earlier) and run a concrete query against the seeded
omnivoice_data (e.g., verify a known table exists and contains an expected row
or count) and assert the result, so the test fails if the migration recreated or
mutated the seeded DB; use the test's sqlite3 connection or existing DB helper
to SELECT a known record/rowcount from the omnivoice_data table and assert it
matches the expected value.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4f039a49-2902-4b64-a134-b5d039fd1316

📥 Commits

Reviewing files that changed from the base of the PR and between 88e75e7 and 7a8993a.

📒 Files selected for processing (10)

tests/probe/_boot_runner.py
tests/probe/judges/coverage.py
tests/probe/judges/dubbing.py
tests/probe/judges/i18n.py
tests/probe/specs/coverage_critic.probe.yaml
tests/probe/specs/dub_export.probe.yaml
tests/probe/specs/migration.probe.yaml
tests/probe/test_probe_asr_e2e.py
tests/probe/test_probe_i18n.py
tests/probe/test_probe_migration.py

✅ Files skipped from review due to trivial changes (2)

tests/probe/specs/coverage_critic.probe.yaml
tests/probe/specs/migration.probe.yaml

🚧 Files skipped from review as they are similar to previous changes (4)

tests/probe/judges/coverage.py
tests/probe/specs/dub_export.probe.yaml
tests/probe/test_probe_i18n.py
tests/probe/test_probe_asr_e2e.py

…nside seeded dir Two regressions from the hardening pass: - dubbing.py: replace non-ASCII '×' with 'x' (Ruff ambiguous-unicode → Tests lint fail) - test_probe_migration: move run_judges inside the seeded_data_dir with-block so the new path_exists check sees the DB before the temp dir is torn down (was always failing)

greptile-apps Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread tests/probe/judges/coverage.py

Comment thread tests/probe/test_probe_i18n.py Outdated

Comment thread tests/probe/judges/i18n.py Outdated

coderabbitai Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread tests/probe/judges/dubbing.py Outdated

greptile-apps Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread tests/probe/test_probe_migration.py

debpalash merged commit ef98aae into main Jun 2, 2026
15 checks passed

debpalash deleted the debpalash/probe-coverage branch June 12, 2026 10:08

Conversation

debpalash commented Jun 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

New feature specs

🐞 Real bug surfaced

Design

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

greptile-apps Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

debpalash commented Jun 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 2, 2026 •

edited

Loading

greptile-apps Bot commented Jun 2, 2026 •

edited

Loading