Skip to content

feat(mcp): MCP server v1 — mount on /mcp, per-agent voice binding, stdio shim (Wave 2.2)#368

Merged
debpalash merged 3 commits into
mainfrom
feat/mcp-server-v2
Jun 12, 2026
Merged

feat(mcp): MCP server v1 — mount on /mcp, per-agent voice binding, stdio shim (Wave 2.2)#368
debpalash merged 3 commits into
mainfrom
feat/mcp-server-v2

Conversation

@debpalash

@debpalash debpalash commented Jun 11, 2026

Copy link
Copy Markdown
Owner

What

Wave 2.2 of the parity program (Spec 2) — the headline "agents speak in your voice" feature. The FastMCP server existed in-tree but was never mounted (dead code); this wires it up and adds per-agent voice binding.

Mount. FastMCP's Streamable-HTTP app is sub-mounted at /mcp on the main FastAPI app; its session manager is composed into the app lifespan via AsyncExitStack (wrapping, not replacing, the existing startup). streamable_http_path="/" so the sub-mount lands at /mcp, not /mcp/mcp. Best-effort throughout — a missing mcp package or OMNIVOICE_MCP_DISABLE=1 leaves the rest of the backend untouched. Adds the mcp dependency (1.27.x).

Per-agent voice binding. Each MCP client sends X-OmniVoice-Client-Id; generate_speech resolves the voice as explicit arg → client binding → global default → app default. New mcp_client_bindings table (alembic 0004 + _BASE_SCHEMA, additive + idempotent), services/mcp_bindings.py (CRUD + resolve_voice + best-effort last-seen), loopback-gated REST (/api/mcp/bindings), and a Settings → Sharing panel.

More. New transcribe tool (base64 in, 200 MB cap). Stdio shim (backend/mcp_shim, httpx-only, ported from voicebox MIT) for clients that only speak stdio — forwards OMNIVOICE_CLIENT_ID as the binding header. docs/mcp.md (both connection modes + binding REST) and docs/mcp.json updated.

Verification

  • Pure tests (ran locally): tests/test_mcp_bindings.py — service CRUD, resolution precedence (explicit/binding/global/none), migration up+down; tests/test_mcp_mount.py build + streamable-app-serves-at-root.
  • Out-of-band (no torch): built the server, mounted it on a bare FastAPI with the same lifespan composition, and drove an initialize handshake — /mcp routes and the session manager runs.
  • CI-validated (local torch/Triton segfault on main-importing tests): REST CRUD round-trip, /mcp mounted (not 404), OMNIVOICE_MCP_DISABLE=1 skips the mount.
  • bun run typecheck:ci clean; bunx vitest run 312 passed; CJK + docs-drift gates green.

🤖 Generated with Claude Code

Overview

This PR implements Wave 2.2 ("agents speak in your voice") by:

  • Mounting the FastMCP Streamable-HTTP app as a sub-app at /mcp on the main FastAPI app (best-effort; failures don't break startup; OMNIVOICE_MCP_DISABLE=1 skips mount)
  • Adding per-agent voice binding with DB, services, loopback REST, and a Settings UI panel
  • Adding an MCP transcribe tool and enhancing generate_speech to respect per-agent bindings
  • Adding a stdio MCP shim proxy for stdio-only clients
  • Adding tests, docs, and a new migration
  • Adds runtime dependency on mcp (pinned in PR to 1.27.x)

Backend Changes

MCP Server Mounting

  • Mounts FastMCP Streamable-HTTP app at /mcp and sets streamable_http_path = "/" to avoid double-prefixing when sub-mounted.
  • Composes MCP session manager into app lifespan via AsyncExitStack and stores it at app.state.mcp_session_manager; startup logs warnings but does not fail if MCP creation/startup errors.
  • Respects OMNIVOICE_MCP_DISABLE to skip mounting.
  • Creates create_mcp_server() helper and exposes _current_client_id() to read X-OmniVoice-Client-Id from Streamable-HTTP request context.

Per-Agent Voice Binding

  • New persistent table: mcp_client_bindings (client_id PK, label, profile_id, default_engine, last_seen_at, created_at) via Alembic migration 0004 (idempotent). Base schema also includes CREATE TABLE IF NOT EXISTS for the table.
  • Service layer: backend/services/mcp_bindings.py
    • list_bindings(), get_binding(client_id), upsert_binding(...), delete_binding(client_id), touch_last_seen(client_id) (best-effort), resolve_voice(client_id, explicit_profile_id).
    • resolve_voice precedence: explicit tool arg → client binding → global preference (mcp_default_profile_id) → none.
  • REST API: backend/api/routers/mcp_bindings.py (loopback-gated, prefix /api/mcp)
    • GET /bindings — list bindings ordered by most-recently-seen
    • PUT /bindings — create/update binding (validates input, returns 400 on ValueError)
    • DELETE /bindings/{client_id} — remove binding (404 when missing)
  • generate_speech now consults resolve_voice(client_id, profile_id) (explicit arg wins), touches last_seen for the client (errors ignored), and falls back to previous global/default behavior if bindings unavailable.

Mermaid flowchart (voice resolution in generate_speech):

flowchart TD
  A[generate_speech call] --> B{explicit profile_id?}
  B -- Yes --> C[Use explicit profile_id]
  B -- No --> D{X-OmniVoice-Client-Id present?}
  D -- Yes --> E[Lookup client binding → profile_id/default_engine?]
  E -- Yes --> C
  E -- No --> F{global pref mcp_default_profile_id?}
  D -- No --> F
  F -- Yes --> C
  F -- No --> G[No voice resolved → return None]
Loading

New MCP Tools

  • transcribe(audio_base64, language=None): decodes base64 audio (rejects >200 MB), posts audio as a file to backend /transcribe, returns backend JSON as a string; invalid base64 or oversize returns an error JSON string.
  • generate_speech(...): unchanged signature; resolves and honors per-agent bindings per the precedence above.

MCP Stdio Shim

  • New executable backend/mcp_shim/main.py:
    • JSON-RPC stdio ↔ HTTP proxy sending requests to http://:/mcp/, forwards OMNIVOICE_CLIENT_ID as X-OmniVoice-Client-Id.
    • Maintains mcp-session-id across requests, distinguishes notifications vs requests (treats HTTP 202 as accepted for notifications), converts non-2xx into JSON-RPC errors for non-notifications.
    • Supports SSE (text/event-stream) streaming by decoding data: payloads and writing JSON to stdout.
    • Healthcheck loop on startup; exit codes: 0 (clean EOF), 1 (transport/exception), 2 (backend unhealthy).

Frontend Changes

Settings UI — Sharing Tab

BEFORE:
┌─ Settings (Sharing Tab) ─────────┐
│ ┌─ SharingPanel ─────────────┐ │
│ └────────────────────────────┘ │
│ ┌─ RemoteBackendPanel ───────┐ │
│ └────────────────────────────┘ │
└──────────────────────────────────┘

AFTER:
┌─ Settings (Sharing Tab) ──────────────┐
│ ┌─ SharingPanel ─────────────────┐ │
│ └────────────────────────────────┘ │
│ ┌─ RemoteBackendPanel ───────────┐ │
│ └────────────────────────────────┘ │
│ ┌─ MCPBindingsPanel ─────────────┐ │
│ │ • List existing bindings │ │
│ │ • Bind client ID → profile │ │
│ │ • Delete binding │ │
│ └────────────────────────────────┘ │
└───────────────────────────────────────┘

  • New React component: frontend/src/components/settings/MCPBindingsPanel.jsx
    • Fetches /api/mcp/bindings and voice profiles, displays bindings, supports PUT to create/update and DELETE to remove, shows errors/loading and refreshes after operations.
  • Settings page imports and renders MCPBindingsPanel in the Sharing tab.

Migrations & Schema

  • Alembic migration backend/migrations/versions/0004_mcp_client_bindings.py — creates mcp_client_bindings table idempotently; downgrade drops it if exists.
  • backend/core/db.py base schema updated with CREATE TABLE IF NOT EXISTS for mcp_client_bindings to support non-Alembic init paths.

Tests & CI

  • tests/test_mcp_bindings.py — service-layer CRUD and resolution precedence tests, migration upgrade/downgrade verification, and REST CRUD roundtrips (including validation and empty client_id rejection).
  • tests/test_mcp_mount.py — optional-dependency-gated tests that validate create_mcp_server tool surface, streamable_http_app root route, session_manager presence, and that importing/reloading main mounts /mcp; verifies OMNIVOICE_MCP_DISABLE handling.
  • Test fixture and mount tests were adjusted to avoid reload-main state leakage (OMNIVOICE_MCP_DISABLE toggle handling and teardown/reload to restore main/core state).

Documentation

  • docs/mcp.md — new guide covering connection modes (Streamable HTTP and stdio shim), available tools, per-agent binding semantics, loopback REST endpoints, and disabling via OMNIVOICE_MCP_DISABLE.
  • docs/mcp.json — example MCP client config updated to recommend python -m backend.mcp_shim for stdio clients and example env variables (OMNIVOICE_PORT / OMNIVOICE_CLIENT_ID).

Notes / Verification

  • Mounting is best-effort: missing mcp package or OMNIVOICE_MCP_DISABLE=1 leaves startup unaffected.
  • CI validates REST CRUD, mount behavior including OMNIVOICE_MCP_DISABLE handling, typecheck, and JS tests.

…dio shim (Wave 2.2)

The FastMCP server (previously dead code, never mounted) is now mounted on
the main FastAPI app at /mcp via Streamable HTTP, with its session manager
composed into the app lifespan through an AsyncExitStack (best-effort: a
missing mcp package or OMNIVOICE_MCP_DISABLE=1 never breaks startup).
streamable_http_path set to '/' so the sub-mount lands at /mcp, not
/mcp/mcp. Adds the 'mcp' dependency (1.27.x).

Per-agent voice binding (Spec 2 headline): each MCP client sends an
X-OmniVoice-Client-Id header; generate_speech resolves the voice as
explicit arg > the client's binding > global default > app default. New
mcp_client_bindings table (alembic 0004 + _BASE_SCHEMA, additive/idempotent),
services/mcp_bindings.py (CRUD + resolve_voice + best-effort last_seen),
and a loopback-gated REST router (/api/mcp/bindings) the Settings panel
drives.

New transcribe tool (base64 audio in, 200 MB cap). Stdio shim
(backend/mcp_shim, httpx-only, ported from voicebox MIT) proxies stdio
clients to the mounted endpoint and forwards OMNIVOICE_CLIENT_ID as the
binding header. Settings → Sharing gains an MCP bindings panel. Docs:
docs/mcp.md (both connection modes + binding REST) and docs/mcp.json
updated to the shim form.

Tests: bindings service + resolution precedence + migration up/down (pure,
run locally); REST CRUD + mount-not-404 + disable-flag (main-importing,
validated in CI). MCP build + mount + initialize handshake verified
out-of-band (no torch).

Spec: docs/competitive-analysis.md Spec 2 / parity program Wave 2.2.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0ea58681-fd19-4a88-9c6d-fd4f2f1f66f2

📥 Commits

Reviewing files that changed from the base of the PR and between 6be4f7b and c3c1296.

📒 Files selected for processing (2)
  • tests/test_mcp_bindings.py
  • tests/test_mcp_mount.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test_mcp_mount.py
  • tests/test_mcp_bindings.py

📝 Walkthrough

Walkthrough

This PR integrates Model Context Protocol (MCP) server capabilities into OmniVoice Studio with per-agent voice binding support. The changes add a database persistence layer for MCP client voice bindings, REST CRUD endpoints, FastMCP tools (speech generation with agent-aware voice resolution and audio transcription), app-level mounting with session management, a stdio-compatible proxy, frontend settings UI, and comprehensive test coverage.

Changes

MCP Integration with Per-Agent Voice Bindings

Layer / File(s) Summary
Data persistence for MCP client voice bindings
backend/core/db.py, backend/migrations/versions/0004_mcp_client_bindings.py, backend/services/mcp_bindings.py, tests/test_mcp_bindings.py
Database schema adds mcp_client_bindings table with idempotent Alembic migration. Service layer provides CRUD operations and resolve_voice() precedence: explicit profile → client binding → global default → none, plus best-effort touch_last_seen() for tracking. Service and migration tests validate behaviors.
Voice binding management REST API
backend/api/routers/mcp_bindings.py, tests/test_mcp_bindings.py (REST endpoints)
Loopback-protected APIRouter at /api/mcp with GET /bindings (list), PUT /bindings (upsert with Pydantic validation), DELETE /bindings/{client_id} (remove). Converts service ValueError to HTTP 400. REST tests verify roundtrip CRUD, empty client_id rejection (HTTP 422), and 404 on missing deletion.
MCP server tools and voice binding integration
backend/mcp_server.py, tests/test_mcp_mount.py
FastMCP Streamable-HTTP configured at root path to avoid double-prefixing under /mcp mount. Client ID extracted from request context via _current_client_id(). generate_speech resolves profile_id per-agent using resolve_voice() and touches last-seen; transcribe tool base64-decodes audio (200MB limit), validates, and uploads to backend. Tool exposure and route/session tests verify tool names and HTTP app structure.
MCP application startup and mounting
backend/main.py, tests/test_mcp_mount.py
lifespan() wraps startup/shutdown with AsyncExitStack for FastMCP session manager. Startup conditionally mounts /mcp Streamable-HTTP app (respects OMNIVOICE_MCP_DISABLE flag), stores session manager on app.state, logs success/failure. Registers mcp_bindings router. Mount and disable-flag tests verify route presence and toggle behavior.
MCP stdio proxy transport and docs
backend/mcp_shim/__init__.py, backend/mcp_shim/__main__.py, docs/mcp.md, docs/mcp.json
Python executable forwards line-delimited JSON-RPC from stdin to backend /mcp Streamable-HTTP endpoint, handles SSE streaming, propagates X-OmniVoice-Client-Id and mcp-session-id, and performs health checks. Docs and example config updated to show shim-based launch and client-id binding usage.
Frontend voice binding management UI
frontend/src/components/settings/MCPBindingsPanel.jsx, frontend/src/pages/Settings.jsx
MCPBindingsPanel component fetches bindings and profiles on mount, renders list with delete actions, allows binding client ID to optional profile via PUT, displays errors, refreshes after mutations. Integrated into Settings.jsx sharing tab.
Dependency update
pyproject.toml
Adds mcp>=1.2 to runtime dependencies.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • debpalash/OmniVoice-Studio#93: Both PRs change backend/main.py’s lifespan() startup flow—main PR mounts/starts MCP (and per-agent bindings), while retrieved PR adds a non-fatal Gatekeeper quarantine probe/event emission.
  • debpalash/OmniVoice-Studio#84: Main PR’s new /api/mcp bindings router is protected by Depends(require_loopback), directly coupling it to the loopback-origin middleware (backend/api/dependencies.py) introduced/validated in retrieved PR #84.
🚥 Pre-merge checks | ✅ 7 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.82% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
I18n Completeness (21 Locales) ⚠️ Warning FAIL: MCPBindingsPanel.jsx hardcodes UI strings (e.g., lines 33/54/63/70/73/94/97) and has no t('...') keys, so locale completeness can't be satisfied. Fix: use i18n (useTranslation + t/Trans) for MCPBindingsPanel UI/error/loading text and add new keys to all 21 frontend/src/i18n/locales/*.json.
✅ Passed checks (7 passed)
Check name Status Explanation
Title check ✅ Passed Title follows conventional-commit style with scope 'mcp' and includes context (Wave 2.2), clearly summarizing the core changes: mounting MCP, per-agent voice binding, and stdio shim.
Description check ✅ Passed Description covers all template sections: What/Changes (mount, binding, transcribe, shim, docs), Type (New feature), Testing (unit tests, out-of-band, CI-validated), and Checklist items. Comprehensive and specific.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Cross-Platform Default Parity ✅ Passed Default MCP mounting is controlled only by OMNIVOICE_MCP_DISABLE and best-effort exception handling in backend/main.py; no added macOS/Windows/Linux branching detected, so behavior should match acr...
Local-First Guarantee ✅ Passed No telemetry/keys added; MCP mount is best-effort (main.py 776-788) and MCP tools only call OMNIVOICE_API_URL (default localhost) via httpx (mcp_server.py 68-82, 219-223).
Backward Compatibility ✅ Passed PASS: Only additive mcp_client_bindings table is added (core/db.py _BASE_SCHEMA CREATE TABLE IF NOT EXISTS, l150-171) with idempotent alembic 0004 upgrade (l44-57); no engine/model or other schemas...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/mcp-server-v2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread backend/mcp_server.py
# sub-mounted. Harmless for the standalone CLI run() path.
try:
mcp.settings.streamable_http_path = "/"
except Exception:
Comment thread backend/mcp_server.py
req = mcp.get_context().request_context.request
if req is not None:
return req.headers.get("x-omnivoice-client-id")
except Exception:
r = await client.get(health_url, timeout=2.0)
if r.status_code == 200:
return True
except Exception:
"UPDATE mcp_client_bindings SET last_seen_at=? WHERE client_id=?",
(time.time(), client_id),
)
except Exception:
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

Wave 2.2 of the parity program: mounts the previously-dead FastMCP server at /mcp, adds per-agent voice binding (DB, service, loopback REST, Settings UI), a new transcribe MCP tool, and a stdio shim for clients that can't speak Streamable HTTP.

  • MCP mountstreamable_http_app() sub-mounted at /mcp via AsyncExitStack-composed lifespan; opt-out via OMNIVOICE_MCP_DISABLE=1; best-effort so a missing mcp package never breaks startup.
  • Per-agent voice binding — new mcp_client_bindings table (Alembic 0004 + _BASE_SCHEMA), service layer with resolve_voice precedence chain (explicit → client binding → global → none), loopback REST CRUD, and MCPBindingsPanel in Settings → Sharing.
  • Stdio shimbackend/mcp_shim/__main__.py proxies JSON-RPC stdio to the mounted HTTP endpoint, forwarding OMNIVOICE_CLIENT_ID as the binding header.

Confidence Score: 4/5

Safe to merge with one fix: the engine half of per-agent binding is wired up in the service and DB layers but never forwarded to the synthesis request.

The default_engine field is stored in mcp_client_bindings, returned by resolve_voice, and exposed in the REST API and Settings UI — but generate_speech only reads resolved.get("profile_id") and discards resolved.get("default_engine"). Any client that has an engine binding set will silently get the backend's default engine instead of their bound one. Everything else — mounting, session-manager lifespan, CRUD, migration, shim, frontend panel — looks correct.

backend/mcp_server.py — the generate_speech binding block needs to read and forward resolved.get("default_engine") to the form sent to /generate.

Important Files Changed

Filename Overview
backend/mcp_server.py Mounts FastMCP tools; adds _current_client_id, per-agent binding lookup in generate_speech, and new transcribe tool. default_engine from the binding resolution is resolved but never forwarded to the /generate form, silently breaking the engine-binding half of the feature.
backend/services/mcp_bindings.py New service layer: CRUD over mcp_client_bindings, resolve_voice with documented precedence chain, best-effort touch_last_seen. Logic is sound; known TOCTOU upsert race was flagged in a prior review thread.
backend/main.py Wires MCP mount and session-manager lifespan via AsyncExitStack; router included. Best-effort exception handling keeps startup clean if MCP is absent.
backend/mcp_shim/main.py stdio↔HTTP shim; handles SSE streaming, session-id threading, error responses, and clean EOF. Sequential request dispatch (noted in prior thread) is the main constraint.
backend/api/routers/mcp_bindings.py Loopback-gated REST CRUD for bindings; clean Pydantic validation, correct 404 on missing delete.
backend/migrations/versions/0004_mcp_client_bindings.py Idempotent Alembic migration matching the _BASE_SCHEMA DDL; upgrade/downgrade both guarded by sqlite_master check.
frontend/src/components/settings/MCPBindingsPanel.jsx New panel for managing per-agent voice bindings; fetches bindings and profiles on mount, PUT/DELETE with error display. No destructive-action confirmation and unawaited refresh() are minor UX nits.
backend/core/db.py Adds CREATE TABLE IF NOT EXISTS mcp_client_bindings to _BASE_SCHEMA for fresh installs; additive, no schema drift risk.
pyproject.toml Adds mcp>=1.2 to hard dependencies; locked to 1.27.2 in uv.lock. Lower bound and optional-vs-required concerns were flagged in prior review threads.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[generate_speech call] --> B{explicit profile_id?}
    B -- Yes --> C[Use explicit profile_id]
    B -- No --> D{X-OmniVoice-Client-Id present?}
    D -- Yes --> E[get_binding lookup]
    E -- binding.profile_id set --> C
    E -- not set --> F{mcp_default_profile_id pref?}
    D -- No --> F
    F -- Yes --> C
    F -- No --> G[profile_id = None — backend default]
    C --> H[POST /generate\nprofile_id in form\n⚠ default_engine NOT applied]
    G --> H
Loading

Comments Outside Diff (1)

  1. backend/mcp_server.py, line 130-145 (link)

    P1 default_engine from binding resolved but never applied

    resolve_voice returns {profile_id, default_engine, source} and the service layer, migration, and UI all store and expose default_engine per binding — but only resolved.get("profile_id") is pulled out and applied to the form. resolved.get("default_engine") is never read. Any client with a default_engine binding (e.g. via PUT /api/mcp/bindings with "default_engine":"zonos") will see that setting silently dropped on every generate_speech call; the form sent to /generate uses whatever engine the backend defaults to instead of the bound one.

    Fix in Claude Code

Fix All in Claude Code

Reviews (3): Last reviewed commit: "test(mcp): stop reload-main poisoning ac..." | Re-trigger Greptile

Comment thread backend/mcp_server.py
Comment on lines +60 to +63
try:
mcp.settings.streamable_http_path = "/"
except Exception:
pass

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Silent path misconfiguration leaves MCP dead with no diagnostic

If mcp.settings.streamable_http_path = "/" raises for any reason — validation error, attribute renamed in a FastMCP patch, settings becoming read-only — the assignment is silently swallowed. The FastMCP default path remains /mcp. After mounting the sub-app at /mcp on the main FastAPI, Starlette strips the /mcp prefix before forwarding, so the only reachable endpoint is /mcp/mcp. Every client connection fails with 404/405 while the startup log still says "MCP app mounted at /mcp". Replace the bare pass with at least a logger.warning so operators can diagnose the double-prefix scenario.

Fix in Claude Code

Comment on lines +49 to +69
existing = get_binding(cid)
now = time.time()
if existing:
merged = {
"label": existing["label"] if label is None else label,
"profile_id": existing["profile_id"] if profile_id is None else (profile_id or None),
"default_engine": existing["default_engine"] if default_engine is None else (default_engine or None),
}
with db_conn() as conn:
conn.execute(
"UPDATE mcp_client_bindings SET label=?, profile_id=?, default_engine=? WHERE client_id=?",
(merged["label"], merged["profile_id"], merged["default_engine"], cid),
)
else:
with db_conn() as conn:
conn.execute(
"INSERT INTO mcp_client_bindings "
"(client_id, label, profile_id, default_engine, last_seen_at, created_at) "
"VALUES (?, ?, ?, ?, NULL, ?)",
(cid, label or "", profile_id or None, default_engine or None, now),
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Non-atomic read-modify-write in upsert_binding

The current get_bindingUPDATE/INSERT pattern has a TOCTOU window: two concurrent callers with the same client_id can both see existing = None, then both attempt INSERT, hitting a UNIQUE constraint violation. SQLite's INSERT … ON CONFLICT DO UPDATE (available since SQLite 3.24 / 2018) handles this atomically and removes the extra SELECT round-trip.

Suggested change
existing = get_binding(cid)
now = time.time()
if existing:
merged = {
"label": existing["label"] if label is None else label,
"profile_id": existing["profile_id"] if profile_id is None else (profile_id or None),
"default_engine": existing["default_engine"] if default_engine is None else (default_engine or None),
}
with db_conn() as conn:
conn.execute(
"UPDATE mcp_client_bindings SET label=?, profile_id=?, default_engine=? WHERE client_id=?",
(merged["label"], merged["profile_id"], merged["default_engine"], cid),
)
else:
with db_conn() as conn:
conn.execute(
"INSERT INTO mcp_client_bindings "
"(client_id, label, profile_id, default_engine, last_seen_at, created_at) "
"VALUES (?, ?, ?, ?, NULL, ?)",
(cid, label or "", profile_id or None, default_engine or None, now),
)
now = time.time()
with db_conn() as conn:
conn.execute(
"""
INSERT INTO mcp_client_bindings
(client_id, label, profile_id, default_engine, last_seen_at, created_at)
VALUES (?, ?, ?, ?, NULL, ?)
ON CONFLICT(client_id) DO UPDATE SET
label = CASE WHEN excluded.label IS NOT NULL THEN excluded.label ELSE label END,
profile_id = CASE WHEN excluded.profile_id IS NOT NULL THEN excluded.profile_id ELSE profile_id END,
default_engine = CASE WHEN excluded.default_engine IS NOT NULL THEN excluded.default_engine ELSE default_engine END
""",
(cid, label or "", profile_id or None, default_engine or None, now),
)

Fix in Claude Code

Comment thread pyproject.toml
# (Assumption A1 in RESEARCH.md was checked at execute-time and proved
# false — `cryptography` is not on the install path today).
"cryptography>=41",
"mcp>=1.2",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Lower bound >=1.2 predates the Streamable-HTTP transport

streamable_http_app() and .session_manager were introduced with the Streamable HTTP transport in FastMCP ~1.9. With the current >=1.2 floor, a resolver could pick an older release that lacks these attributes; the best-effort except Exception in main.py would silently swallow the AttributeError, leaving /mcp unmounted with no explanation.

Suggested change
"mcp>=1.2",
"mcp>=1.9",

Fix in Claude Code

Comment thread pyproject.toml
# (Assumption A1 in RESEARCH.md was checked at execute-time and proved
# false — `cryptography` is not on the install path today).
"cryptography>=41",
"mcp>=1.2",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 mcp is a hard dependency but all consuming code treats it as optional

mcp is in the main dependencies list (always installed), yet main.py wraps the import in a best-effort try/except and test_mcp_mount.py uses pytest.importorskip("mcp"). If the intent is truly optional, move it to [project.optional-dependencies]. If required, remove the defensive guards. As-is, every OmniVoice install unconditionally pulls in pyjwt, pydantic-settings, sse-starlette, and pywin32 (Windows) while the runtime code pretends the package might be absent.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Claude Code

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/api/routers/mcp_bindings.py`:
- Around line 45-46: Replace the bare re-raise of HTTPException with explicit
exception chaining so the original ValueError traceback is preserved: in the
except ValueError as e block where HTTPException is raised, change the raise to
use "from e" (i.e., raise HTTPException(status_code=400, detail=str(e)) from e)
to maintain the original exception context for debugging; this references the
except ValueError as e handler and the HTTPException construction in
mcp_bindings.py.

In `@backend/main.py`:
- Around line 438-447: The current context manager uses AsyncExitStack() so if
_sm.run().__aexit__() raises during the implicit exit it can bypass the rest of
shutdown; change the structure so the MCP context is entered and exited
explicitly and guarded rather than relying on the async-with to span the yield:
obtain _mcp_stack = AsyncExitStack() and enter _sm.run() with await
_mcp_stack.enter_async_context(_sm.run()) before the yield, then after the yield
close the MCP stack in its own try/except/finally block (call await
_mcp_stack.aclose() inside a protected block and log any exceptions) so teardown
of _mcp_stack and subsequent backend shutdown steps always run even if the MCP
teardown raises; reference symbols: AsyncExitStack, _mcp_stack, _sm, _sm.run(),
and the yield boundary.

In `@backend/mcp_server.py`:
- Around line 208-215: Reject oversized input before decoding: compute a safe
maximum encoded length from the 200 MB raw cap (e.g. max_encoded = int(200 *
1024 * 1024 * 4 / 3) + some padding for base64 padding/newlines) and check
len(audio_base64) against that and return the error if exceeded before calling
base64.b64decode; then proceed to decode into raw and keep the existing len(raw)
check as a secondary guard. Use the existing variable names audio_base64 and raw
(and the same error message) so you only add the preflight length check and
early return to avoid allocating huge decoded blobs.
- Around line 219-223: The code currently returns str(r.json()) which produces a
Python repr instead of valid JSON; import json at the top of the module and
replace that return with json.dumps(r.json()) so the transcribe endpoint (the
block calling _api_post_form and returning r.json()) emits proper JSON text that
clients can parse.

In `@backend/mcp_shim/__main__.py`:
- Around line 45-49: The _base_url() function currently calls
int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT))) which can raise
ValueError on bad env input; wrap the port parsing in a try/except inside
_base_url() (or a small helper) to validate the env value and fall back to
DEFAULT_PORT if parsing fails, and in main() catch any parsing/ValueError from
_base_url(), call _err(...) with a clear message including the bad value and
exit with a nonzero status (no traceback). Ensure you reference the
OMNIVOICE_PORT env var, keep DEFAULT_PORT as the fallback, and use the existing
_err(...) helper and main() exit path so the process logs a friendly error
instead of crashing.

In `@backend/services/mcp_bindings.py`:
- Around line 82-89: Both resolve_voice and touch_last_seen are using raw
client_id/explicit_profile inputs; normalize by stripping whitespace and
converting empty/whitespace-only strings to None before any lookup or selection.
Update the functions (resolve_voice, touch_last_seen) to call .strip() on
client_id and explicit_profile, then treat "" as None, and use those
canonicalized values in database queries and explicit profile logic so padded or
whitespace-only inputs cannot bypass bindings or be treated as an explicit
profile; apply the same normalization to the other similar block around the code
handling lines 113-117.
- Around line 49-69: The upsert_binding function currently does a non-atomic
SELECT then INSERT/UPDATE using get_binding and db_conn which can cause
unique-constraint errors under concurrent writes; change it to perform an atomic
upsert by using a single SQL statement like INSERT ... ON CONFLICT(client_id) DO
UPDATE SET ... (updating label, profile_id, default_engine, last_seen_at)
against the mcp_client_bindings table, or alternatively wrap the
SELECT/INSERT/UPDATE in a single transaction with proper conflict
handling/locking; update the code paths that call get_binding/upsert_binding to
use the new upsert SQL or transactional helper so concurrent requests for the
same client_id no longer race.

In `@docs/mcp.md`:
- Around line 24-26: The fenced code block containing the URL should be labeled
as plain text to avoid MD040; update the snippet in docs/mcp.md so the fence
uses a language tag like `text` (or replace with inline code) for the URL
`http://localhost:3900/mcp`, e.g. change the block from a bare triple-backtick
fence to a ```text fenced block so the linter no longer flags it.

In `@frontend/src/components/settings/MCPBindingsPanel.jsx`:
- Around line 41-63: Add a loading/disabled state to prevent duplicate requests:
introduce a boolean state (e.g., isSaving or isBusy, and optionally isDeleting
keyed by client id) and use it in onAdd and onDelete to short-circuit repeated
clicks, set the flag true before the apiFetch call and false in a finally block,
and pass the flag down to the UI controls (the add button/input and delete
buttons rendered in the MCPBindingsPanel) so they are disabled while the request
is in-flight; ensure onAdd clears inputs only after success and onDelete uses
the per-item deleting flag to disable just that delete control to avoid blocking
other actions.
- Around line 26-37: The refresh() async can be overwritten by out-of-order
responses; add a stale-response guard using a refresh token/counter stored in a
ref (e.g., refreshCounterRef) that you increment at the start of refresh(),
capture into a local const (current = refreshCounterRef.current) before awaiting
Promise.all, and only call setBindings, setProfiles and setError when the
captured token still equals refreshCounterRef.current; apply the same pattern to
other refresh callers (the invocations near lines noted) so older responses
cannot overwrite newer state; keep existing function names (refresh,
setBindings, setProfiles, setError, apiJson, listProfiles, useEffect) when
implementing the guard.
- Around line 33-97: MCPBindingsPanel has multiple hardcoded user-facing
strings; replace each with i18n keys using t('...') and add those keys to all
locale files: update the heading text in the component JSX (currently "MCP voice
bindings"), the help paragraph text (including "/mcp" and "docs/mcp.md"
fragments), the error messages in setError calls inside the fetch catch blocks
("Failed to load MCP bindings", "Failed to save binding", "Failed to delete
binding"), the placeholder in the clientId input ("client id (e.g.
claude-code)"), the select default option ("default voice"), the button label
("Bind"), the aria-label prefix used in the delete button ("Remove …"), and the
profileName fallback (currently '—') so they call t('mcp.heading'),
t('mcp.help'), t('mcp.error.load'), t('mcp.error.save'), t('mcp.error.delete'),
t('mcp.placeholder.clientId'), t('mcp.option.defaultVoice'),
t('mcp.button.bind'), t('mcp.aria.remove'), t('mcp.fallback.empty')
respectively; touch the MCPBindingsPanel component and its helpers (profileName,
onAdd, onDelete) to import and use the t function, and add the corresponding
keys to all 21 locale files.

In `@pyproject.toml`:
- Line 111: Update the pyproject.toml dependency for mcp to pin it to the tested
1.27.x range (e.g. change "mcp>=1.2" to "mcp>=1.27,<1.28") so the runtime
assumptions in backend/mcp_server.py (settings.streamable_http_path = "/" and
use of mcp.get_context().request_context.request to read X-OmniVoice-Client-Id)
remain stable and match the tests that expect the 1.27 sub-mount behavior.

In `@tests/test_mcp_bindings.py`:
- Around line 117-121: The root-walking loop in the _run_alembic routine uses
`while root and root != "/"` which can never terminate on Windows (e.g.,
"C:\\"); update the loop condition to stop when the parent equals the current
directory (e.g., `while True:` break when `os.path.dirname(root) == root`) or
switch to pathlib.parents to iterate up parents, and keep the existing check for
os.path.isfile(os.path.join(root, "alembic.ini")) so the search still asserts
presence of alembic.ini; apply the same fix to the analogous loops in
tests/test_profile_consent.py and tests/backend/services/test_settings_store.py.
- Around line 13-14: Remove the module-level
os.environ.setdefault("OMNIVOICE_MODEL", "test") and
os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1") calls in
tests/test_mcp_bindings.py and instead add a pytest autouse fixture (e.g., def
env_autouse(monkeypatch):) that uses monkeypatch.setenv("OMNIVOICE_MODEL",
"test") and monkeypatch.setenv("OMNIVOICE_DISABLE_FILE_LOG", "1") before
yielding; this ensures each test gets the env vars and the monkeypatch restores
state after each test.

In `@tests/test_mcp_mount.py`:
- Around line 53-63: The test_mcp_disable_env_skips_mount function mutates
global module/env state and may skip cleanup if the assertion fails; wrap the
TestClient usage and assertion in a try/finally so that
monkeypatch.delenv("OMNIVOICE_MCP_DISABLE", raising=False) and
importlib.reload(_main) always run. Concretely, inside
test_mcp_disable_env_skips_mount move the importlib.reload(_main) and the
TestClient/assert into a try block and perform the monkeypatch.delenv and
importlib.reload(_main) in the finally block to guarantee environment and module
reset even on test failures.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4f01a0b1-5dbb-4546-afdf-b8d20abd40a3

📥 Commits

Reviewing files that changed from the base of the PR and between c8fdcb6 and 957a03c.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock, !**/*.lock, !**/uv.lock
📒 Files selected for processing (15)
  • backend/api/routers/mcp_bindings.py
  • backend/core/db.py
  • backend/main.py
  • backend/mcp_server.py
  • backend/mcp_shim/__init__.py
  • backend/mcp_shim/__main__.py
  • backend/migrations/versions/0004_mcp_client_bindings.py
  • backend/services/mcp_bindings.py
  • docs/mcp.json
  • docs/mcp.md
  • frontend/src/components/settings/MCPBindingsPanel.jsx
  • frontend/src/pages/Settings.jsx
  • pyproject.toml
  • tests/test_mcp_bindings.py
  • tests/test_mcp_mount.py

Comment on lines +45 to +46
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve exception cause when mapping validation errors to HTTP 400.

On Line 46, use explicit exception chaining so traceback origin stays intact during debugging (raise ... from e).

Suggested patch
-    except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e)) from e
🧰 Tools
🪛 Ruff (0.15.15)

[warning] 46-46: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/api/routers/mcp_bindings.py` around lines 45 - 46, Replace the bare
re-raise of HTTPException with explicit exception chaining so the original
ValueError traceback is preserved: in the except ValueError as e block where
HTTPException is raised, change the raise to use "from e" (i.e., raise
HTTPException(status_code=400, detail=str(e)) from e) to maintain the original
exception context for debugging; this references the except ValueError as e
handler and the HTTPException construction in mcp_bindings.py.

Source: Linters/SAST tools

Comment thread backend/main.py
Comment on lines +438 to +447
from contextlib import AsyncExitStack
async with AsyncExitStack() as _mcp_stack:
_sm = getattr(app.state, "mcp_session_manager", None)
if _sm is not None:
try:
await _mcp_stack.enter_async_context(_sm.run())
logger.info("MCP server mounted at /mcp")
except Exception as e:
logger.warning("MCP session manager failed to start: %s", e)
yield

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard MCP teardown so it cannot bypass the backend’s own shutdown path.

Because yield sits inside async with AsyncExitStack(), any exception from _sm.run().__aexit__() aborts the function before Lines 448-479 run. That leaves worker tasks alive, skips model/VRAM release, and can strand the next launch behind orphaned state. Close the MCP stack in its own guarded step instead of letting AsyncExitStack unwind unhandled around the yield boundary.

Suggested structure
-    from contextlib import AsyncExitStack
-    async with AsyncExitStack() as _mcp_stack:
-        _sm = getattr(app.state, "mcp_session_manager", None)
-        if _sm is not None:
-            try:
-                await _mcp_stack.enter_async_context(_sm.run())
-                logger.info("MCP server mounted at /mcp")
-            except Exception as e:
-                logger.warning("MCP session manager failed to start: %s", e)
-        yield
+    from contextlib import AsyncExitStack
+    _mcp_stack = AsyncExitStack()
+    try:
+        _sm = getattr(app.state, "mcp_session_manager", None)
+        if _sm is not None:
+            try:
+                await _mcp_stack.enter_async_context(_sm.run())
+                logger.info("MCP server mounted at /mcp")
+            except Exception as e:
+                logger.warning("MCP session manager failed to start: %s", e)
+        yield
+    finally:
+        try:
+            await _mcp_stack.aclose()
+        except Exception as e:
+            logger.warning("MCP session manager failed to stop: %s", e)
🧰 Tools
🪛 Ruff (0.15.15)

[warning] 445-445: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/main.py` around lines 438 - 447, The current context manager uses
AsyncExitStack() so if _sm.run().__aexit__() raises during the implicit exit it
can bypass the rest of shutdown; change the structure so the MCP context is
entered and exited explicitly and guarded rather than relying on the async-with
to span the yield: obtain _mcp_stack = AsyncExitStack() and enter _sm.run() with
await _mcp_stack.enter_async_context(_sm.run()) before the yield, then after the
yield close the MCP stack in its own try/except/finally block (call await
_mcp_stack.aclose() inside a protected block and log any exceptions) so teardown
of _mcp_stack and subsequent backend shutdown steps always run even if the MCP
teardown raises; reference symbols: AsyncExitStack, _mcp_stack, _sm, _sm.run(),
and the yield boundary.

Comment thread backend/mcp_server.py
Comment on lines +208 to +215
try:
raw = base64.b64decode(audio_base64, validate=True)
except Exception:
return '{"error":"audio_base64 is not valid base64"}'
# 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
# buggy/hostile agent from posting an unbounded blob.
if len(raw) > 200 * 1024 * 1024:
return '{"error":"audio exceeds 200 MB limit"}'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject oversized audio before b64decode().

Line 209 allocates the full decoded blob before the 200 MB guard on Line 214 runs, so a buggy or hostile client can still force a very large allocation and take down the mounted backend. Preflight the encoded length first, or stream-decode with a hard cap.

⚙️ Suggested fix
+        max_raw = 200 * 1024 * 1024
+        max_b64 = ((max_raw + 2) // 3) * 4
+        if len(audio_base64) > max_b64:
+            return '{"error":"audio exceeds 200 MB limit"}'
         try:
             raw = base64.b64decode(audio_base64, validate=True)
         except Exception:
             return '{"error":"audio_base64 is not valid base64"}'
-        # 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
-        # buggy/hostile agent from posting an unbounded blob.
-        if len(raw) > 200 * 1024 * 1024:
+        if len(raw) > max_raw:
             return '{"error":"audio exceeds 200 MB limit"}'

As per coding guidelines, "the backend serves loopback HTTP: treat every query/path/form param as hostile."

🧰 Tools
🪛 Ruff (0.15.15)

[warning] 210-210: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/mcp_server.py` around lines 208 - 215, Reject oversized input before
decoding: compute a safe maximum encoded length from the 200 MB raw cap (e.g.
max_encoded = int(200 * 1024 * 1024 * 4 / 3) + some padding for base64
padding/newlines) and check len(audio_base64) against that and return the error
if exceeded before calling base64.b64decode; then proceed to decode into raw and
keep the existing len(raw) check as a secondary guard. Use the existing variable
names audio_base64 and raw (and the same error message) so you only add the
preflight length check and early return to avoid allocating huge decoded blobs.

Source: Coding guidelines

Comment thread backend/mcp_server.py
Comment on lines +219 to +223
r = await _api_post_form(
"/transcribe", data=data,
files={"audio": ("audio.wav", raw, "application/octet-stream")},
)
return str(r.json())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return actual JSON from the new tool.

Line 223 uses str(r.json()), which yields Python repr with single quotes rather than parseable JSON. That breaks the documented tool contract for any MCP client expecting structured JSON back from transcribe().

🛠️ Suggested fix
-        return str(r.json())
+        return json.dumps(r.json(), ensure_ascii=False)

Add import json at the top of the module.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/mcp_server.py` around lines 219 - 223, The code currently returns
str(r.json()) which produces a Python repr instead of valid JSON; import json at
the top of the module and replace that return with json.dumps(r.json()) so the
transcribe endpoint (the block calling _api_post_form and returning r.json())
emits proper JSON text that clients can parse.

Comment on lines +45 to +49
def _base_url() -> tuple[str, str]:
host = os.environ.get("OMNIVOICE_HOST", "127.0.0.1")
port = int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT)))
return f"http://{host}:{port}/mcp/", f"http://{host}:{port}/health"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard OMNIVOICE_PORT parsing to avoid startup crash on bad env values.

At Line 47, int(os.environ.get("OMNIVOICE_PORT", ...)) can raise ValueError and kill the shim before it emits a usable diagnostic. This turns a typoed env into a hard crash instead of an actionable startup error.

Suggested fix
 def _base_url() -> tuple[str, str]:
     host = os.environ.get("OMNIVOICE_HOST", "127.0.0.1")
-    port = int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT)))
+    raw_port = os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT))
+    try:
+        port = int(raw_port)
+    except ValueError:
+        raise ValueError(f"OMNIVOICE_PORT must be an integer, got: {raw_port!r}")
     return f"http://{host}:{port}/mcp/", f"http://{host}:{port}/health"

And in main() convert that exception to _err(...) + nonzero exit without traceback.

🧰 Tools
🪛 ast-grep (0.43.0)

[warning] 47-47: Do not make http calls without encryption
Context: f"http://{host}:{port}/mcp/"
Note: [CWE-319].

(requests-http)


[warning] 47-47: Do not make http calls without encryption
Context: f"http://{host}:{port}/health"
Note: [CWE-319].

(requests-http)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/mcp_shim/__main__.py` around lines 45 - 49, The _base_url() function
currently calls int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT))) which
can raise ValueError on bad env input; wrap the port parsing in a try/except
inside _base_url() (or a small helper) to validate the env value and fall back
to DEFAULT_PORT if parsing fails, and in main() catch any parsing/ValueError
from _base_url(), call _err(...) with a clear message including the bad value
and exit with a nonzero status (no traceback). Ensure you reference the
OMNIVOICE_PORT env var, keep DEFAULT_PORT as the fallback, and use the existing
_err(...) helper and main() exit path so the process logs a friendly error
instead of crashing.

Comment on lines +41 to +63
const onAdd = async () => {
if (!clientId.trim()) return;
setError(null);
try {
await apiFetch('/api/mcp/bindings', {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ client_id: clientId.trim(), profile_id: profileId || null }),
});
setClientId('');
setProfileId('');
refresh();
} catch (e) {
setError(e?.message || 'Failed to save binding');
}
};

const onDelete = async (cid) => {
try {
await apiFetch(`/api/mcp/bindings/${encodeURIComponent(cid)}`, { method: 'DELETE' });
refresh();
} catch (e) {
setError(e?.message || 'Failed to delete binding');

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add pending/disabled state to prevent duplicate add/delete requests.

Line [41]-Line [63] fires async PUT/DELETE with controls still enabled at Line [83]-Line [97]. Repeated clicks can enqueue conflicting requests and surface misleading errors.

Proposed fix (shape)
   const [error, setError] = useState(null);
+  const [isSaving, setIsSaving] = useState(false);
+  const [deletingId, setDeletingId] = useState(null);

   const onAdd = async () => {
     if (!clientId.trim()) return;
+    if (isSaving) return;
     setError(null);
+    setIsSaving(true);
     try {
       await apiFetch('/api/mcp/bindings', {
...
       setClientId('');
       setProfileId('');
-      refresh();
+      await refresh();
     } catch (e) {
       setError(e?.message || 'Failed to save binding');
+    } finally {
+      setIsSaving(false);
     }
   };

   const onDelete = async (cid) => {
+    if (deletingId) return;
+    setDeletingId(cid);
     try {
       await apiFetch(`/api/mcp/bindings/${encodeURIComponent(cid)}`, { method: 'DELETE' });
-      refresh();
+      await refresh();
     } catch (e) {
       setError(e?.message || 'Failed to delete binding');
+    } finally {
+      setDeletingId(null);
     }
   };
...
-        <button type="button" onClick={() => onDelete(b.client_id)}
+        <button type="button" onClick={() => onDelete(b.client_id)} disabled={!!deletingId || isSaving}
...
-        <input type="text" value={clientId} ...
+        <input type="text" value={clientId} disabled={isSaving || !!deletingId} ...
...
-        <select value={profileId} ...
+        <select value={profileId} disabled={isSaving || !!deletingId} ...
...
-        <button type="button" onClick={onAdd} data-testid="mcp-add">Bind</button>
+        <button type="button" onClick={onAdd} disabled={isSaving || !!deletingId} data-testid="mcp-add">Bind</button>

As per coding guidelines, long-running frontend operations should expose loading/disabled states.

Also applies to: 83-97

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/settings/MCPBindingsPanel.jsx` around lines 41 - 63,
Add a loading/disabled state to prevent duplicate requests: introduce a boolean
state (e.g., isSaving or isBusy, and optionally isDeleting keyed by client id)
and use it in onAdd and onDelete to short-circuit repeated clicks, set the flag
true before the apiFetch call and false in a finally block, and pass the flag
down to the UI controls (the add button/input and delete buttons rendered in the
MCPBindingsPanel) so they are disabled while the request is in-flight; ensure
onAdd clears inputs only after success and onDelete uses the per-item deleting
flag to disable just that delete control to avoid blocking other actions.

Source: Coding guidelines

Comment thread pyproject.toml
# (Assumption A1 in RESEARCH.md was checked at execute-time and proved
# false — `cryptography` is not on the install path today).
"cryptography>=41",
"mcp>=1.2",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What is the earliest mcpPython package version that supports FastMCPsettings.streamable_http_path and request context header access used in mounted Streamable-HTTP apps?

💡 Result:

The earliest FastMCP version that supports settings.streamable_http_path and request context header access in Streamable-HTTP mounted apps is v2.3.0 [1][2]. Support for the Streamable HTTP transport, including the associated streamable_http_path configuration setting and the necessary session management to handle requests and context headers in Streamable-HTTP apps, was introduced in FastMCP v2.3.0, released on May 8, 2025 [3][1][2]. This release (titled "Stream Me Up, Scotty") implemented the Streamable HTTP server and client support, which was tracked and finalized under pull request #361 [3][1][2]. While later versions—such as v2.5.0 and the v3.x series—have further refined HTTP routing, dependency management, and OpenAPI integration [4][5], the fundamental infrastructure for Streamable-HTTP configurations and context access originated in v2.3.0 [3][1][2]. Sources: [3] PrefectHQ/fastmcp#361 [4] https://github.com/PrefectHQ/fastmcp/releases/tag/v3.0.0 [1] PrefectHQ/fastmcp#130 [2] https://github.com/jlowin/fastmcp/releases/tag/v2.3.0 [5] https://github.com/jlowin/fastmcp/releases/tag/v2.5.0

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== pyproject.toml line around 111 =="
python3 - <<'PY'
import itertools
path="pyproject.toml"
start=95; end=130
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if start<=i<=end:
            print(f"{i}: {line.rstrip()}")
PY

echo
echo "== Find mcp/fastmcp imports and mcp_server usage of streamable_http_path =="
rg -n --hidden --no-ignore-vcs "(FastMCP|streamable_http_path|/mcp|request context|request_context|header|get_header|X-|\bmcp_server\.py\b)" -S .
echo
echo "== Locate backend/mcp_server.py definition and the referenced line window (46-134) if present =="
if [ -f "backend/mcp_server.py" ]; then
  python3 - <<'PY'
path="backend/mcp_server.py"
start=35; end=160
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if start<=i<=end:
            print(f"{i}: {line.rstrip()}")
PY
else
  echo "backend/mcp_server.py not found at repo root; searching by filename..."
  fd -t f "mcp_server.py" .
fi

Repository: debpalash/OmniVoice-Studio

Length of output: 50382


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== backend/mcp_server.py: lines 1-220 (only) =="
python3 - <<'PY'
import itertools
path="backend/mcp_server.py"
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if i<=220:
            print(f"{i}: {line.rstrip()}")
PY

echo
echo "== uv.lock: mcp package entries (context around line numbers) =="
rg -n '^name = "mcp"$|mcp-1\.' uv.lock
echo
# show the specific resolved stanza for the locked mcp
python3 - <<'PY'
import re
path="uv.lock"
target="mcp"
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,l in enumerate(lines):
    if l.strip().startswith('name = "mcp"'):
        start=max(0,i-15); end=min(len(lines), i+60)
        for j in range(start,end):
            print(f"{j+1}: {lines[j].rstrip()}")
        break
else:
    # fallback: print around the known artifact lines we already saw
    for i,l in enumerate(lines):
        if "mcp-1." in l:
            start=max(0,i-20); end=min(len(lines), i+50)
            for j in range(start,end):
                print(f"{j+1}: {lines[j].rstrip()}")
            break
PY

echo
echo "== docs/mcp.md mentions MCP/FastMCP versioning? =="
rg -n "mcp|FastMCP|streamable_http_path" docs/mcp.md

Repository: debpalash/OmniVoice-Studio

Length of output: 18563


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== backend/main.py MCP mount usage (around 420-480) =="
python3 - <<'PY'
path="backend/main.py"
start=400; end=500
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if start<=i<=end:
            print(f"{i}: {line.rstrip()}")
PY

echo
echo "== tests/test_mcp_mount.py (full file, small) =="
wc -l tests/test_mcp_mount.py
python3 - <<'PY'
path="tests/test_mcp_mount.py"
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        print(f"{i}: {line.rstrip()}")
PY

echo
echo "== Search for explicit mcp version expectations in repo =="
rg -n "mcp[^\n]*(>=|==|<=|<)\s*|FastMCP|streamable_http_path" pyproject.toml backend docs .github . -S --glob='!uv.lock' --glob='!**/*.png' --glob='!**/*.jpg'

Repository: debpalash/OmniVoice-Studio

Length of output: 14371


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== backend/main.py MCP mount usage (around 420-480) =="
python3 - <<'PY'
path="backend/main.py"
start=420; end=480
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if start<=i<=end:
            print(f"{i}: {line.rstrip()}")
PY

echo
echo "== tests/test_mcp_mount.py =="
wc -l tests/test_mcp_mount.py
python3 - <<'PY'
path="tests/test_mcp_mount.py"
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        print(f"{i}: {line.rstrip()}")
PY

echo
echo "== Search for explicit mcp version expectations in repo =="
rg -n "mcp[^\n]*(>=|==|<=|<)\s*|FastMCP|streamable_http_path" pyproject.toml backend docs .github . -S --glob='!uv.lock'

Repository: debpalash/OmniVoice-Studio

Length of output: 12378


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== backend/mcp_server.py: find streamable_http_app + session manager usage =="
rg -n "streamable_http_app|session_manager|AsyncExitStack|streamable_http_path|get_context\(\)|request_context" backend/mcp_server.py

echo
echo "== backend/mcp_server.py: tail (last ~120 lines) =="
python3 - <<'PY'
import pathlib
p=pathlib.Path("backend/mcp_server.py")
lines=p.read_text(encoding="utf-8").splitlines()
start=max(1,len(lines)-120+1)
for i in range(start, len(lines)+1):
    print(f"{i}: {lines[i-1]}")
PY

echo
echo "== Search for mcp 1.27.x mention in repo =="
rg -n "mcp[- ]?1\.27|FastMCP.*1\.27|streamable_http_path.*1\.27" . -S

Repository: debpalash/OmniVoice-Studio

Length of output: 5640


🌐 Web query:

Find where streamable_http_pathis introduced formcp.server.fastmcp.FastMCP(Pythonmcppackage), and whichmcpversions includemcp.get_context().request_context.request / request context header access.

💡 Result:

The streamable_http_path parameter was introduced in the mcp Python SDK to configure the base endpoint path for StreamableHTTP transports [1][2]. It is primarily configured during FastMCP initialization or passed to the streamable_http_app() method [3][4][5]. This parameter allows developers to specify where the server endpoints are mounted (e.g., setting streamable_http_path="/" allows the MCP server to respond at the root path of the mounted application) [3][1]. Access to raw request context headers—specifically via mcp.get_context().request_context.request—was introduced in the official modelcontextprotocol/python-sdk following Pull Request #380, which added support for raw request injection into the RequestContext [6]. This capability allows developers to access the underlying Starlette Request object and its headers directly within MCP tools and resources when using HTTP-based transports [6][7][8]. For robust, transport-agnostic header access, it is recommended to use the built-in dependency helpers provided by the SDK, such as get_http_headers(), which safely retrieve headers from the current HTTP request if available without raising exceptions when no HTTP context exists [9][10].

Citations:


🌐 Web query:

Search mcp.server.fastmcp streamable_http_pathand return the most relevantmcp release notes or PRs for the setting and request context API.

💡 Result:

The streamable_http_path setting in FastMCP controls the base endpoint path for StreamableHTTP connections [1][2]. It is a critical configuration option for integrating MCP servers into larger ASGI applications (e.g., via Starlette or FastAPI) to avoid nested path issues like /mcp/mcp [3][4][5]. Key Information: - Default Behavior: As of PR #1007, the default value for streamable_http_path was updated to / to improve consistency and prevent automatic, potentially confusing, path nesting when mounting [4]. - Usage: Users can configure this path during FastMCP initialization or when generating the ASGI app [3][6]. - API Integration: When using StreamableHTTP mode, request context (such as HTTP headers) can be accessed within MCP tools using the Context object or by importing helper functions like get_http_headers from fastmcp.server.dependencies [7]. - Migration: Recent versions of the SDK (e.g., v1.x+) have moved transport-specific parameters, including streamable_http_path, out of the MCPServer constructor and into the.run or.streamable_http_app methods to improve separation of concerns [8]. Related PRs and Issues: - PR #1007: Standardized the default streamable_http_path to / [4]. - Issue #750: Discusses accessing HTTP headers in streamable-http mode via the Context object [7]. - Issue #951: Addresses path mounting and redirection challenges when using FastAPI with streamable_http_app [5]. - Issue #1233: Highlights issues regarding stale request context access in some server implementations [9]. For detailed implementation examples, refer to the official FastMCP documentation on StreamableHTTP transport [1][6].

Citations:


🌐 Web query:

Find the mcp(Model Context Protocol Python SDK) version that includes PR#380(raw request injection intoRequestContext) and which mcpversion includes PR#1007(defaultstreamable_http_pathto/). Return the release tag/version numbers.

💡 Result:

The MCP (Model Context Protocol) Python SDK incorporated the following changes: - PR #380, which added support for raw request injection into RequestContext, was merged into the main branch and is available in releases starting from v1.3.0. - PR #1007, which updated the default streamable_http_path to /, was merged and is included in releases starting from v1.11.0. These versions represent the stable progression within the v1.x series of the SDK. As of June 11, 2026, the project is maintaining v1.x while actively developing v2.0.0 (currently in alpha).{top_results:[1][2][3][4][5]}

Citations:


🌐 Web query:

In the modelcontextprotocol/python-sdkrepo, search forstreamable_http_pathand identify the earliest taggedmcprelease version that containsFastMCP.settings.streamable_http_path (or equivalent) for StreamableHTTP servers.

💡 Result:

The streamable_http_path configuration for StreamableHTTP servers was introduced in the modelcontextprotocol/python-sdk as part of the initial support for the Streamable HTTP transport [1][2]. This functionality was officially released in version v1.8.0 [1][2]. In the original implementation and subsequent versions, streamable_http_path is a configuration setting used to define the path for StreamableHTTP connections [3][4]. Users can configure it to control where the MCP server is mounted, for example by setting mcp.settings.streamable_http_path = "/" [3]. It is important to note that the modelcontextprotocol/python-sdk is distinct from the fastmcp library (maintained by PrefectHQ) [5][6][7]. While both support similar functionality and terminology, release versions are tracked independently [8][9]. The modelcontextprotocol/python-sdk v1.8.0 release is the authoritative source for the initial introduction of Streamable HTTP support in the official SDK [1].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import re
path="backend/main.py"
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,l in enumerate(lines, start=1):
    if "mcp_session_manager" in l or "streamable_http_app" in l or "mount_mcp" in l:
        print(f"{i}: {l.rstrip()}")
PY

echo "== backend/main.py: show around mcp_session_manager block (if any) =="
python3 - <<'PY'
path="backend/main.py"
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()
hits=[]
for i,l in enumerate(lines, start=1):
    if "mcp_session_manager" in l or "streamable_http_app" in l or "mount_mcp" in l:
        hits.append(i)
if not hits:
    print("No hits found.")
    raise SystemExit(0)
for h in sorted(set(hits)):
    start=max(1,h-25); end=min(len(lines),h+25)
    print(f"\n--- context around line {h} ({start}-{end}) ---")
    for j in range(start,end+1):
        print(f"{j}: {lines[j-1].rstrip()}")
PY

Repository: debpalash/OmniVoice-Studio

Length of output: 15064


Pin mcp to the tested 1.27.x range to keep /mcp mount + per-agent header binding stablepyproject.toml currently allows mcp>=1.2 (line 111), but the runtime relies on mcp.settings.streamable_http_path = "/" (backend/mcp_server.py:61) and on HTTP request header access via mcp.get_context().request_context.request (backend/mcp_server.py:94-98); with an older/newer mcp, the mount can land on the wrong paths and/or fail to read X-OmniVoice-Client-Id, breaking agent voice binding.

Suggested fix
-    "mcp>=1.2",
+    "mcp>=1.27,<1.28",

This matches the repo’s resolved version (uv.lock pins mcp==1.27.2, and the mount tests assert the sub-mount shape: tests/test_mcp_mount.py:28-50).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pyproject.toml` at line 111, Update the pyproject.toml dependency for mcp to
pin it to the tested 1.27.x range (e.g. change "mcp>=1.2" to "mcp>=1.27,<1.28")
so the runtime assumptions in backend/mcp_server.py
(settings.streamable_http_path = "/" and use of
mcp.get_context().request_context.request to read X-OmniVoice-Client-Id) remain
stable and match the tests that expect the 1.27 sub-mount behavior.

Comment on lines +13 to +14
os.environ.setdefault("OMNIVOICE_MODEL", "test")
os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Module-level env mutation leaks test state across the suite.

Line 13 and Line 14 mutate os.environ at import time, so unrelated tests can inherit OMNIVOICE_MODEL=test / OMNIVOICE_DISABLE_FILE_LOG=1 based on import order. Move these into an autouse fixture with monkeypatch so state is restored per test.

🔧 Proposed fix
-os.environ.setdefault("OMNIVOICE_MODEL", "test")
-os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1")
+@pytest.fixture(autouse=True)
+def _isolated_test_env(monkeypatch):
+    monkeypatch.setenv("OMNIVOICE_MODEL", "test")
+    monkeypatch.setenv("OMNIVOICE_DISABLE_FILE_LOG", "1")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_mcp_bindings.py` around lines 13 - 14, Remove the module-level
os.environ.setdefault("OMNIVOICE_MODEL", "test") and
os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1") calls in
tests/test_mcp_bindings.py and instead add a pytest autouse fixture (e.g., def
env_autouse(monkeypatch):) that uses monkeypatch.setenv("OMNIVOICE_MODEL",
"test") and monkeypatch.setenv("OMNIVOICE_DISABLE_FILE_LOG", "1") before
yielding; this ensures each test gets the env vars and the monkeypatch restores
state after each test.

Comment on lines +117 to +121
here = os.path.abspath(os.path.dirname(__file__))
root = here
while root and root != "/" and not os.path.isfile(os.path.join(root, "alembic.ini")):
root = os.path.dirname(root)
cfg = Config(os.path.join(root, "alembic.ini"))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import ntpath

root = r"C:\repo\tests"
for i in range(8):
    parent = ntpath.dirname(root)
    print(f"{i}: root={root!r}, parent={parent!r}")
    if parent == root:
        print("stuck_at_root=True")
        break
    root = parent
PY

Repository: debpalash/OmniVoice-Studio

Length of output: 198


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect suspected snippet
if [ -f "tests/test_mcp_bindings.py" ]; then
  echo "== tests/test_mcp_bindings.py (around 100-160) =="
  nl -ba tests/test_mcp_bindings.py | sed -n '90,170p'
else
  echo "Missing: tests/test_mcp_bindings.py"
fi

# Find any other occurrences of similar root-walk logic referencing alembic.ini
echo
echo "== Search: alembic.ini usage =="
rg -n "alembic\.ini" .

echo
echo "== Search: root != '/' patterns (possible non-terminating loops) =="
rg -n "root\s*!=\s*['\"]/['\"]" .

Repository: debpalash/OmniVoice-Studio

Length of output: 181


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check file existence =="
ls -la tests | head
ls -la tests/test_mcp_bindings.py || true

echo
echo "== tests/test_mcp_bindings.py (around 100-160) =="
if [ -f "tests/test_mcp_bindings.py" ]; then
  awk 'NR>=90 && NR<=170 {printf "%d\t%s\n", NR, $0}' tests/test_mcp_bindings.py
else
  echo "Missing: tests/test_mcp_bindings.py"
fi

echo
echo "== Search: alembic.ini usage =="
rg -n --hidden --glob '!.git/*' "alembic\.ini" .

echo
echo "== Search: root != '/' patterns =="
rg -n --hidden --glob '!.git/*' "root\s*!=\s*['\"]/['\"]" .

Repository: debpalash/OmniVoice-Studio

Length of output: 8445


Fix potential non-terminating root-walk loops on Windows (alembic.ini discovery)

  • tests/test_mcp_bindings.py (_run_alembic, loop at ~line 119) uses while root and root != "/" ... root = os.path.dirname(root), which can hang on Windows because os.path.dirname("C:\\") returns C:\\ (so root never progresses to the termination condition).
  • Same pattern exists in tests/test_profile_consent.py (~line 220) and tests/backend/services/test_settings_store.py (~line 245).
  • Change the loop termination to stop when os.path.dirname(root) == root (or use pathlib.Path(...).parents) and keep the alembic.ini existence assertion.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_mcp_bindings.py` around lines 117 - 121, The root-walking loop in
the _run_alembic routine uses `while root and root != "/"` which can never
terminate on Windows (e.g., "C:\\"); update the loop condition to stop when the
parent equals the current directory (e.g., `while True:` break when
`os.path.dirname(root) == root`) or switch to pathlib.parents to iterate up
parents, and keep the existing check for os.path.isfile(os.path.join(root,
"alembic.ini")) so the search still asserts presence of alembic.ini; apply the
same fix to the analogous loops in tests/test_profile_consent.py and
tests/backend/services/test_settings_store.py.

Source: Coding guidelines

Comment thread tests/test_mcp_mount.py Outdated
The two main-importing mount tests ran the app lifespan, which now starts
the FastMCP session manager and binds asyncio queues to the test loop —
contaminating later lifespan-running tests ('bound to a different event
loop'). The mount happens at import time, so inspecting app.routes for the
/mcp Mount is the correct loop-free assertion. Same fix shape as the
Wave 0.2 consent tests.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Comment thread backend/mcp_server.py
Comment on lines +208 to +215
try:
raw = base64.b64decode(audio_base64, validate=True)
except Exception:
return '{"error":"audio_base64 is not valid base64"}'
# 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
# buggy/hostile agent from posting an unbounded blob.
if len(raw) > 200 * 1024 * 1024:
return '{"error":"audio exceeds 200 MB limit"}'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Size guard fires after the 200 MB allocation, not before

The comment explicitly says this cap "Keeps a buggy/hostile agent from posting an unbounded blob," but base64.b64decode runs before the check. A 200 MB audio file encodes to ~267 MB of base64; the server fully allocates the ~200 MB decoded bytes object and then rejects it. A buggy agent looping the call, or one injecting large payloads, can trigger repeated 200 MB heap allocations before any guard fires. Add an encoded-length pre-check (base64 expands by ~4/3) so the blob is never decoded when it would exceed the cap.

Suggested change
try:
raw = base64.b64decode(audio_base64, validate=True)
except Exception:
return '{"error":"audio_base64 is not valid base64"}'
# 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
# buggy/hostile agent from posting an unbounded blob.
if len(raw) > 200 * 1024 * 1024:
return '{"error":"audio exceeds 200 MB limit"}'
# 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
# buggy/hostile agent from posting an unbounded blob.
# Check the encoded length first (~4/3 overhead) so we never allocate
# the decoded blob when we'd immediately reject it anyway.
if len(audio_base64) > 200 * 1024 * 1024 * 4 // 3 + 64:
return '{"error":"audio exceeds 200 MB limit"}'
try:
raw = base64.b64decode(audio_base64, validate=True)
except Exception:
return '{"error":"audio_base64 is not valid base64"}'
if len(raw) > 200 * 1024 * 1024:
return '{"error":"audio exceeds 200 MB limit"}'

Fix in Claude Code

Root cause of the CI failure: the bindings REST fixture set
OMNIVOICE_MCP_DISABLE=1 and reloaded main but never restored it, so a
later 'from main import app' in test_mcp_mount saw /mcp un-mounted
({'/audio','/voice_audio'}). Reloading main mutates the shared module for
every subsequent test.

- REST fixture: drop the disable flag (the mount is harmless without a
  lifespan), yield the client, and restore main (+ core.config/db) to the
  default data dir in teardown so the global module is clean again.
- test_main_mounts_mcp_route: reload main with the disable flag cleared so
  the assertion is independent of any earlier reload.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@debpalash debpalash merged commit 99357e8 into main Jun 12, 2026
15 checks passed
@debpalash debpalash deleted the feat/mcp-server-v2 branch June 12, 2026 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants