feat(mcp): MCP server v1 — mount on /mcp, per-agent voice binding, stdio shim (Wave 2.2)#368
…dio shim (Wave 2.2)

The FastMCP server (previously dead code, never mounted) is now mounted on the main FastAPI app at /mcp via Streamable HTTP, with its session manager composed into the app lifespan through an AsyncExitStack (best-effort: a missing mcp package or OMNIVOICE_MCP_DISABLE=1 never breaks startup). streamable_http_path set to '/' so the sub-mount lands at /mcp, not /mcp/mcp. Adds the 'mcp' dependency (1.27.x).

Per-agent voice binding (Spec 2 headline): each MCP client sends an X-OmniVoice-Client-Id header; generate_speech resolves the voice as explicit arg > the client's binding > global default > app default. New mcp_client_bindings table (alembic 0004 + _BASE_SCHEMA, additive/idempotent), services/mcp_bindings.py (CRUD + resolve_voice + best-effort last_seen), and a loopback-gated REST router (/api/mcp/bindings) the Settings panel drives. New transcribe tool (base64 audio in, 200 MB cap).

Stdio shim (backend/mcp_shim, httpx-only, ported from voicebox MIT) proxies stdio clients to the mounted endpoint and forwards OMNIVOICE_CLIENT_ID as the binding header.

Settings → Sharing gains an MCP bindings panel. Docs: docs/mcp.md (both connection modes + binding REST) and docs/mcp.json updated to the shim form.

Tests: bindings service + resolution precedence + migration up/down (pure, run locally); REST CRUD + mount-not-404 + disable-flag (main-importing, validated in CI). MCP build + mount + initialize handshake verified out-of-band (no torch).

Spec: docs/competitive-analysis.md Spec 2 / parity program Wave 2.2.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
📝 Walkthrough
This PR integrates Model Context Protocol (MCP) server capabilities into OmniVoice Studio with per-agent voice binding support. The changes add a database persistence layer for MCP client voice bindings, REST CRUD endpoints, FastMCP tools (speech generation with agent-aware voice resolution and audio transcription), app-level mounting with session management, a stdio-compatible proxy, frontend settings UI, and comprehensive test coverage.
Changes
MCP Integration with Per-Agent Voice Bindings
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 7 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (7 passed)
| # sub-mounted. Harmless for the standalone CLI run() path. | ||
| try: | ||
| mcp.settings.streamable_http_path = "/" | ||
| except Exception: |
| req = mcp.get_context().request_context.request | ||
| if req is not None: | ||
| return req.headers.get("x-omnivoice-client-id") | ||
| except Exception: |
| r = await client.get(health_url, timeout=2.0) | ||
| if r.status_code == 200: | ||
| return True | ||
| except Exception: |
| "UPDATE mcp_client_bindings SET last_seen_at=? WHERE client_id=?", | ||
| (time.time(), client_id), | ||
| ) | ||
| except Exception: |
|
| Filename | Overview |
|---|---|
| backend/mcp_server.py | Mounts FastMCP tools; adds _current_client_id, per-agent binding lookup in generate_speech, and new transcribe tool. default_engine from the binding resolution is resolved but never forwarded to the /generate form, silently breaking the engine-binding half of the feature. |
| backend/services/mcp_bindings.py | New service layer: CRUD over mcp_client_bindings, resolve_voice with documented precedence chain, best-effort touch_last_seen. Logic is sound; known TOCTOU upsert race was flagged in a prior review thread. |
| backend/main.py | Wires MCP mount and session-manager lifespan via AsyncExitStack; router included. Best-effort exception handling keeps startup clean if MCP is absent. |
| backend/mcp_shim/__main__.py | stdio↔HTTP shim; handles SSE streaming, session-id threading, error responses, and clean EOF. Sequential request dispatch (noted in prior thread) is the main constraint. |
| backend/api/routers/mcp_bindings.py | Loopback-gated REST CRUD for bindings; clean Pydantic validation, correct 404 on missing delete. |
| backend/migrations/versions/0004_mcp_client_bindings.py | Idempotent Alembic migration matching the _BASE_SCHEMA DDL; upgrade/downgrade both guarded by sqlite_master check. |
| frontend/src/components/settings/MCPBindingsPanel.jsx | New panel for managing per-agent voice bindings; fetches bindings and profiles on mount, PUT/DELETE with error display. No destructive-action confirmation and unawaited refresh() are minor UX nits. |
| backend/core/db.py | Adds CREATE TABLE IF NOT EXISTS mcp_client_bindings to _BASE_SCHEMA for fresh installs; additive, no schema drift risk. |
| pyproject.toml | Adds mcp>=1.2 to hard dependencies; locked to 1.27.2 in uv.lock. Lower bound and optional-vs-required concerns were flagged in prior review threads. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[generate_speech call] --> B{explicit profile_id?}
B -- Yes --> C[Use explicit profile_id]
B -- No --> D{X-OmniVoice-Client-Id present?}
D -- Yes --> E[get_binding lookup]
E -- binding.profile_id set --> C
E -- not set --> F{mcp_default_profile_id pref?}
D -- No --> F
F -- Yes --> C
F -- No --> G[profile_id = None — backend default]
C --> H[POST /generate\nprofile_id in form\n⚠ default_engine NOT applied]
G --> H
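The precedence drawn in the flowchart above can be sketched in a few lines of Python. This is an illustrative stand-in rather than the PR's `resolve_voice`: the function name `resolve_profile`, the dict-based `bindings` store, and the `source` labels are assumptions made for the example.

```python
from typing import Optional


def resolve_profile(
    explicit_profile: Optional[str],
    client_id: Optional[str],
    bindings: dict,
    global_default: Optional[str],
) -> dict:
    """Pick a profile: explicit arg > client binding > global default > backend default."""
    if explicit_profile:
        return {"profile_id": explicit_profile, "source": "explicit"}
    # Only consult the binding store when the client identified itself.
    binding = bindings.get(client_id) if client_id else None
    if binding and binding.get("profile_id"):
        return {"profile_id": binding["profile_id"], "source": "binding"}
    if global_default:
        return {"profile_id": global_default, "source": "global_default"}
    # None means "let the backend choose its own default voice".
    return {"profile_id": None, "source": "app_default"}
```

A binding that exists but has no `profile_id` falls through to the global default, matching the "not set" edge in the flowchart.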
Comments Outside Diff (1)
- backend/mcp_server.py, lines 130-145: `default_engine` from binding resolved but never applied
`resolve_voice` returns `{profile_id, default_engine, source}` and the service layer, migration, and UI all store and expose `default_engine` per binding, but only `resolved.get("profile_id")` is pulled out and applied to the form. `resolved.get("default_engine")` is never read. Any client with a `default_engine` binding (e.g. via `PUT /api/mcp/bindings` with `"default_engine":"zonos"`) will see that setting silently dropped on every `generate_speech` call; the form sent to `/generate` uses whatever engine the backend defaults to instead of the bound one.
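One possible shape for the fix, shown as a hedged sketch: build the form from the whole resolution result rather than from `profile_id` alone. The helper name `build_generate_form` and the form field name `engine` are assumptions; the field that `/generate` actually expects is not shown in this review.

```python
def build_generate_form(text: str, resolved: dict) -> dict:
    """Build form fields for /generate from a resolve_voice-style result."""
    form = {"text": text}
    if resolved.get("profile_id"):
        form["profile_id"] = resolved["profile_id"]
    # Hypothetical field name: forward the bound engine instead of dropping it.
    if resolved.get("default_engine"):
        form["engine"] = resolved["default_engine"]
    return form
```

Omitting the key when no engine is bound keeps the existing behavior (backend default) for clients without an engine binding.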
Reviews (3): Last reviewed commit: "test(mcp): stop reload-main poisoning ac..."
| try: | ||
| mcp.settings.streamable_http_path = "/" | ||
| except Exception: | ||
| pass |
Silent path misconfiguration leaves MCP dead with no diagnostic
If mcp.settings.streamable_http_path = "/" raises for any reason — validation error, attribute renamed in a FastMCP patch, settings becoming read-only — the assignment is silently swallowed. The FastMCP default path remains /mcp. After mounting the sub-app at /mcp on the main FastAPI, Starlette strips the /mcp prefix before forwarding, so the only reachable endpoint is /mcp/mcp. Every client connection fails with 404/405 while the startup log still says "MCP app mounted at /mcp". Replace the bare pass with at least a logger.warning so operators can diagnose the double-prefix scenario.
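A minimal sketch of the suggested change, assuming a module-level logger. The helper name `set_streamable_path` and the logger name are hypothetical; the point is only that the failure path logs the likely double-prefix consequence instead of passing silently.

```python
import logging

logger = logging.getLogger("omnivoice.mcp")  # assumed logger name


def set_streamable_path(settings, path: str = "/") -> bool:
    """Try to set the sub-mount path; warn instead of silently passing on failure."""
    try:
        settings.streamable_http_path = path
        return True
    except Exception as exc:  # broad on purpose, mirroring the best-effort guard
        logger.warning(
            "Could not set streamable_http_path=%r (%s); "
            "the endpoint may be reachable only at /mcp/mcp",
            path,
            exc,
        )
        return False
```

Returning a boolean also lets the caller skip the "mounted at /mcp" log line when the path could not be set.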
| existing = get_binding(cid) | ||
| now = time.time() | ||
| if existing: | ||
| merged = { | ||
| "label": existing["label"] if label is None else label, | ||
| "profile_id": existing["profile_id"] if profile_id is None else (profile_id or None), | ||
| "default_engine": existing["default_engine"] if default_engine is None else (default_engine or None), | ||
| } | ||
| with db_conn() as conn: | ||
| conn.execute( | ||
| "UPDATE mcp_client_bindings SET label=?, profile_id=?, default_engine=? WHERE client_id=?", | ||
| (merged["label"], merged["profile_id"], merged["default_engine"], cid), | ||
| ) | ||
| else: | ||
| with db_conn() as conn: | ||
| conn.execute( | ||
| "INSERT INTO mcp_client_bindings " | ||
| "(client_id, label, profile_id, default_engine, last_seen_at, created_at) " | ||
| "VALUES (?, ?, ?, ?, NULL, ?)", | ||
| (cid, label or "", profile_id or None, default_engine or None, now), | ||
| ) |
Non-atomic read-modify-write in upsert_binding
The current get_binding → UPDATE/INSERT pattern has a TOCTOU window: two concurrent callers with the same client_id can both see existing = None, then both attempt INSERT, hitting a UNIQUE constraint violation. SQLite's INSERT … ON CONFLICT DO UPDATE (available since SQLite 3.24 / 2018) handles this atomically and removes the extra SELECT round-trip.
| existing = get_binding(cid) | |
| now = time.time() | |
| if existing: | |
| merged = { | |
| "label": existing["label"] if label is None else label, | |
| "profile_id": existing["profile_id"] if profile_id is None else (profile_id or None), | |
| "default_engine": existing["default_engine"] if default_engine is None else (default_engine or None), | |
| } | |
| with db_conn() as conn: | |
| conn.execute( | |
| "UPDATE mcp_client_bindings SET label=?, profile_id=?, default_engine=? WHERE client_id=?", | |
| (merged["label"], merged["profile_id"], merged["default_engine"], cid), | |
| ) | |
| else: | |
| with db_conn() as conn: | |
| conn.execute( | |
| "INSERT INTO mcp_client_bindings " | |
| "(client_id, label, profile_id, default_engine, last_seen_at, created_at) " | |
| "VALUES (?, ?, ?, ?, NULL, ?)", | |
| (cid, label or "", profile_id or None, default_engine or None, now), | |
| ) | |
| now = time.time() | |
| with db_conn() as conn: | |
| conn.execute( | |
| """ | |
| INSERT INTO mcp_client_bindings | |
| (client_id, label, profile_id, default_engine, last_seen_at, created_at) | |
| VALUES (?, ?, ?, ?, NULL, ?) | |
| ON CONFLICT(client_id) DO UPDATE SET | |
| label = CASE WHEN excluded.label IS NOT NULL THEN excluded.label ELSE label END, | |
| profile_id = CASE WHEN excluded.profile_id IS NOT NULL THEN excluded.profile_id ELSE profile_id END, | |
| default_engine = CASE WHEN excluded.default_engine IS NOT NULL THEN excluded.default_engine ELSE default_engine END | |
| """, | |
| (cid, label or "", profile_id or None, default_engine or None, now), | |
| ) |
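One caveat with the suggestion as written: it passes `label or ""`, so `excluded.label` is never NULL and an omitted label would overwrite the stored one, and `profile_id or None` can no longer distinguish "leave unchanged" (`None`) from "clear" (`""`). The sketch below keeps the single-statement upsert but preserves those semantics by testing the bound parameters directly. The table definition is an assumption inferred from the column list above, and the function takes an explicit connection rather than the project's `db_conn()` helper.

```python
import sqlite3
import time

# Assumed table shape, inferred from the INSERT column list in the review.
SCHEMA = """
CREATE TABLE IF NOT EXISTS mcp_client_bindings (
    client_id TEXT PRIMARY KEY,
    label TEXT NOT NULL DEFAULT '',
    profile_id TEXT,
    default_engine TEXT,
    last_seen_at REAL,
    created_at REAL NOT NULL
)
"""

UPSERT = """
INSERT INTO mcp_client_bindings
    (client_id, label, profile_id, default_engine, last_seen_at, created_at)
VALUES
    (:cid, COALESCE(:label, ''), NULLIF(:profile_id, ''), NULLIF(:engine, ''), NULL, :now)
ON CONFLICT(client_id) DO UPDATE SET
    label = CASE WHEN :label IS NULL THEN label ELSE :label END,
    profile_id = CASE WHEN :profile_id IS NULL THEN profile_id
                      ELSE NULLIF(:profile_id, '') END,
    default_engine = CASE WHEN :engine IS NULL THEN default_engine
                          ELSE NULLIF(:engine, '') END
"""


def upsert_binding(conn, client_id, label=None, profile_id=None, default_engine=None):
    """Single-statement upsert: None keeps the stored value, '' clears it."""
    params = {
        "cid": client_id,
        "label": label,
        "profile_id": profile_id,
        "engine": default_engine,
        "now": time.time(),
    }
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, params)
```

Because the insert and the conflict handling happen in one statement, two concurrent callers for the same `client_id` can no longer both take the INSERT branch.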
| # (Assumption A1 in RESEARCH.md was checked at execute-time and proved | ||
| # false — `cryptography` is not on the install path today). | ||
| "cryptography>=41", | ||
| "mcp>=1.2", |
Lower bound >=1.2 predates the Streamable-HTTP transport
streamable_http_app() and .session_manager were introduced with the Streamable HTTP transport in FastMCP ~1.9. With the current >=1.2 floor, a resolver could pick an older release that lacks these attributes; the best-effort except Exception in main.py would silently swallow the AttributeError, leaving /mcp unmounted with no explanation.
| "mcp>=1.2", | |
| "mcp>=1.9", |
| # (Assumption A1 in RESEARCH.md was checked at execute-time and proved | ||
| # false — `cryptography` is not on the install path today). | ||
| "cryptography>=41", | ||
| "mcp>=1.2", |
mcp is a hard dependency but all consuming code treats it as optional
mcp is in the main dependencies list (always installed), yet main.py wraps the import in a best-effort try/except and test_mcp_mount.py uses pytest.importorskip("mcp"). If the intent is truly optional, move it to [project.optional-dependencies]. If required, remove the defensive guards. As-is, every OmniVoice install unconditionally pulls in pyjwt, pydantic-settings, sse-starlette, and pywin32 (Windows) while the runtime code pretends the package might be absent.
Actionable comments posted: 15
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/api/routers/mcp_bindings.py`:
- Around line 45-46: Replace the bare re-raise of HTTPException with explicit
exception chaining so the original ValueError traceback is preserved: in the
except ValueError as e block where HTTPException is raised, change the raise to
use "from e" (i.e., raise HTTPException(status_code=400, detail=str(e)) from e)
to maintain the original exception context for debugging; this references the
except ValueError as e handler and the HTTPException construction in
mcp_bindings.py.
In `@backend/main.py`:
- Around line 438-447: The current context manager uses AsyncExitStack() so if
_sm.run().__aexit__() raises during the implicit exit it can bypass the rest of
shutdown; change the structure so the MCP context is entered and exited
explicitly and guarded rather than relying on the async-with to span the yield:
obtain _mcp_stack = AsyncExitStack() and enter _sm.run() with await
_mcp_stack.enter_async_context(_sm.run()) before the yield, then after the yield
close the MCP stack in its own try/except/finally block (call await
_mcp_stack.aclose() inside a protected block and log any exceptions) so teardown
of _mcp_stack and subsequent backend shutdown steps always run even if the MCP
teardown raises; reference symbols: AsyncExitStack, _mcp_stack, _sm, _sm.run(),
and the yield boundary.
In `@backend/mcp_server.py`:
- Around line 208-215: Reject oversized input before decoding: compute a safe
maximum encoded length from the 200 MB raw cap (e.g. max_encoded = int(200 *
1024 * 1024 * 4 / 3) + some padding for base64 padding/newlines) and check
len(audio_base64) against that and return the error if exceeded before calling
base64.b64decode; then proceed to decode into raw and keep the existing len(raw)
check as a secondary guard. Use the existing variable names audio_base64 and raw
(and the same error message) so you only add the preflight length check and
early return to avoid allocating huge decoded blobs.
- Around line 219-223: The code currently returns str(r.json()) which produces a
Python repr instead of valid JSON; import json at the top of the module and
replace that return with json.dumps(r.json()) so the transcribe endpoint (the
block calling _api_post_form and returning r.json()) emits proper JSON text that
clients can parse.
In `@backend/mcp_shim/__main__.py`:
- Around line 45-49: The _base_url() function currently calls
int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT))) which can raise
ValueError on bad env input; wrap the port parsing in a try/except inside
_base_url() (or a small helper) to validate the env value and fall back to
DEFAULT_PORT if parsing fails, and in main() catch any parsing/ValueError from
_base_url(), call _err(...) with a clear message including the bad value and
exit with a nonzero status (no traceback). Ensure you reference the
OMNIVOICE_PORT env var, keep DEFAULT_PORT as the fallback, and use the existing
_err(...) helper and main() exit path so the process logs a friendly error
instead of crashing.
In `@backend/services/mcp_bindings.py`:
- Around line 82-89: Both resolve_voice and touch_last_seen are using raw
client_id/explicit_profile inputs; normalize by stripping whitespace and
converting empty/whitespace-only strings to None before any lookup or selection.
Update the functions (resolve_voice, touch_last_seen) to call .strip() on
client_id and explicit_profile, then treat "" as None, and use those
canonicalized values in database queries and explicit profile logic so padded or
whitespace-only inputs cannot bypass bindings or be treated as an explicit
profile; apply the same normalization to the other similar block around the code
handling lines 113-117.
- Around line 49-69: The upsert_binding function currently does a non-atomic
SELECT then INSERT/UPDATE using get_binding and db_conn which can cause
unique-constraint errors under concurrent writes; change it to perform an atomic
upsert by using a single SQL statement like INSERT ... ON CONFLICT(client_id) DO
UPDATE SET ... (updating label, profile_id, default_engine, last_seen_at)
against the mcp_client_bindings table, or alternatively wrap the
SELECT/INSERT/UPDATE in a single transaction with proper conflict
handling/locking; update the code paths that call get_binding/upsert_binding to
use the new upsert SQL or transactional helper so concurrent requests for the
same client_id no longer race.
In `@docs/mcp.md`:
- Around line 24-26: The fenced code block containing the URL should be labeled
as plain text to avoid MD040; update the snippet in docs/mcp.md so the fence
uses a language tag like `text` (or replace with inline code) for the URL
`http://localhost:3900/mcp`, e.g. change the block from a bare triple-backtick
fence to a ```text fenced block so the linter no longer flags it.
In `@frontend/src/components/settings/MCPBindingsPanel.jsx`:
- Around line 41-63: Add a loading/disabled state to prevent duplicate requests:
introduce a boolean state (e.g., isSaving or isBusy, and optionally isDeleting
keyed by client id) and use it in onAdd and onDelete to short-circuit repeated
clicks, set the flag true before the apiFetch call and false in a finally block,
and pass the flag down to the UI controls (the add button/input and delete
buttons rendered in the MCPBindingsPanel) so they are disabled while the request
is in-flight; ensure onAdd clears inputs only after success and onDelete uses
the per-item deleting flag to disable just that delete control to avoid blocking
other actions.
- Around line 26-37: The refresh() async can be overwritten by out-of-order
responses; add a stale-response guard using a refresh token/counter stored in a
ref (e.g., refreshCounterRef) that you increment at the start of refresh(),
capture into a local const (current = refreshCounterRef.current) before awaiting
Promise.all, and only call setBindings, setProfiles and setError when the
captured token still equals refreshCounterRef.current; apply the same pattern to
other refresh callers (the invocations near lines noted) so older responses
cannot overwrite newer state; keep existing function names (refresh,
setBindings, setProfiles, setError, apiJson, listProfiles, useEffect) when
implementing the guard.
- Around line 33-97: MCPBindingsPanel has multiple hardcoded user-facing
strings; replace each with i18n keys using t('...') and add those keys to all
locale files: update the heading text in the component JSX (currently "MCP voice
bindings"), the help paragraph text (including "/mcp" and "docs/mcp.md"
fragments), the error messages in setError calls inside the fetch catch blocks
("Failed to load MCP bindings", "Failed to save binding", "Failed to delete
binding"), the placeholder in the clientId input ("client id (e.g.
claude-code)"), the select default option ("default voice"), the button label
("Bind"), the aria-label prefix used in the delete button ("Remove …"), and the
profileName fallback (currently '—') so they call t('mcp.heading'),
t('mcp.help'), t('mcp.error.load'), t('mcp.error.save'), t('mcp.error.delete'),
t('mcp.placeholder.clientId'), t('mcp.option.defaultVoice'),
t('mcp.button.bind'), t('mcp.aria.remove'), t('mcp.fallback.empty')
respectively; touch the MCPBindingsPanel component and its helpers (profileName,
onAdd, onDelete) to import and use the t function, and add the corresponding
keys to all 21 locale files.
In `@pyproject.toml`:
- Line 111: Update the pyproject.toml dependency for mcp to pin it to the tested
1.27.x range (e.g. change "mcp>=1.2" to "mcp>=1.27,<1.28") so the runtime
assumptions in backend/mcp_server.py (settings.streamable_http_path = "/" and
use of mcp.get_context().request_context.request to read X-OmniVoice-Client-Id)
remain stable and match the tests that expect the 1.27 sub-mount behavior.
In `@tests/test_mcp_bindings.py`:
- Around line 117-121: The root-walking loop in the _run_alembic routine uses
`while root and root != "/"` which can never terminate on Windows (e.g.,
"C:\\"); update the loop condition to stop when the parent equals the current
directory (e.g., `while True:` break when `os.path.dirname(root) == root`) or
switch to pathlib.parents to iterate up parents, and keep the existing check for
os.path.isfile(os.path.join(root, "alembic.ini")) so the search still asserts
presence of alembic.ini; apply the same fix to the analogous loops in
tests/test_profile_consent.py and tests/backend/services/test_settings_store.py.
- Around line 13-14: Remove the module-level
os.environ.setdefault("OMNIVOICE_MODEL", "test") and
os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1") calls in
tests/test_mcp_bindings.py and instead add a pytest autouse fixture (e.g., def
env_autouse(monkeypatch):) that uses monkeypatch.setenv("OMNIVOICE_MODEL",
"test") and monkeypatch.setenv("OMNIVOICE_DISABLE_FILE_LOG", "1") before
yielding; this ensures each test gets the env vars and the monkeypatch restores
state after each test.
In `@tests/test_mcp_mount.py`:
- Around line 53-63: The test_mcp_disable_env_skips_mount function mutates
global module/env state and may skip cleanup if the assertion fails; wrap the
TestClient usage and assertion in a try/finally so that
monkeypatch.delenv("OMNIVOICE_MCP_DISABLE", raising=False) and
importlib.reload(_main) always run. Concretely, inside
test_mcp_disable_env_skips_mount move the importlib.reload(_main) and the
TestClient/assert into a try block and perform the monkeypatch.delenv and
importlib.reload(_main) in the finally block to guarantee environment and module
reset even on test failures.
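The root-walking fix described for `_run_alembic` above can be sketched with `pathlib`, whose `parents` sequence is finite on every platform. The function name `find_repo_root` and its signature are assumptions for illustration, not the test helper's actual shape.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: str, marker: str = "alembic.ini") -> Optional[Path]:
    """Walk upward from start until a directory containing marker is found."""
    here = Path(start).resolve()
    # Path.parents ends at the filesystem root ("/" or a drive root like "C:\\"),
    # so the loop cannot spin forever the way a `root != "/"` check can.
    for candidate in (here, *here.parents):
        if (candidate / marker).is_file():
            return candidate
    return None
```

A test can then assert that the result is not `None`, which keeps the existing "alembic.ini must exist" check.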
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 4f01a0b1-5dbb-4546-afdf-b8d20abd40a3
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock, !**/*.lock, !**/uv.lock
📒 Files selected for processing (15)
backend/api/routers/mcp_bindings.py
backend/core/db.py
backend/main.py
backend/mcp_server.py
backend/mcp_shim/__init__.py
backend/mcp_shim/__main__.py
backend/migrations/versions/0004_mcp_client_bindings.py
backend/services/mcp_bindings.py
docs/mcp.json
docs/mcp.md
frontend/src/components/settings/MCPBindingsPanel.jsx
frontend/src/pages/Settings.jsx
pyproject.toml
tests/test_mcp_bindings.py
tests/test_mcp_mount.py
| except ValueError as e: | ||
| raise HTTPException(status_code=400, detail=str(e)) |
Preserve exception cause when mapping validation errors to HTTP 400.
On Line 46, use explicit exception chaining so traceback origin stays intact during debugging (raise ... from e).
Suggested patch
- except ValueError as e:
- raise HTTPException(status_code=400, detail=str(e))
+ except ValueError as e:
+ raise HTTPException(status_code=400, detail=str(e)) from e
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 46-46: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
Source: Linters/SAST tools
| from contextlib import AsyncExitStack | ||
| async with AsyncExitStack() as _mcp_stack: | ||
| _sm = getattr(app.state, "mcp_session_manager", None) | ||
| if _sm is not None: | ||
| try: | ||
| await _mcp_stack.enter_async_context(_sm.run()) | ||
| logger.info("MCP server mounted at /mcp") | ||
| except Exception as e: | ||
| logger.warning("MCP session manager failed to start: %s", e) | ||
| yield |
Guard MCP teardown so it cannot bypass the backend’s own shutdown path.
Because yield sits inside async with AsyncExitStack(), any exception from _sm.run().__aexit__() aborts the function before Lines 448-479 run. That leaves worker tasks alive, skips model/VRAM release, and can strand the next launch behind orphaned state. Close the MCP stack in its own guarded step instead of letting AsyncExitStack unwind unhandled around the yield boundary.
Suggested structure
- from contextlib import AsyncExitStack
- async with AsyncExitStack() as _mcp_stack:
- _sm = getattr(app.state, "mcp_session_manager", None)
- if _sm is not None:
- try:
- await _mcp_stack.enter_async_context(_sm.run())
- logger.info("MCP server mounted at /mcp")
- except Exception as e:
- logger.warning("MCP session manager failed to start: %s", e)
- yield
+ from contextlib import AsyncExitStack
+ _mcp_stack = AsyncExitStack()
+ try:
+ _sm = getattr(app.state, "mcp_session_manager", None)
+ if _sm is not None:
+ try:
+ await _mcp_stack.enter_async_context(_sm.run())
+ logger.info("MCP server mounted at /mcp")
+ except Exception as e:
+ logger.warning("MCP session manager failed to start: %s", e)
+ yield
+ finally:
+ try:
+ await _mcp_stack.aclose()
+ except Exception as e:
+ logger.warning("MCP session manager failed to stop: %s", e)
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 445-445: Do not catch blind exception: Exception
(BLE001)
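The effect of guarding the teardown can be demonstrated with a small self-contained script. It uses a stand-in context manager that fails on exit (an assumption, not the real session manager) and shows that the remaining shutdown steps still run when `aclose()` is wrapped in its own `try`/`except`.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events = []


@asynccontextmanager
async def flaky_session_manager():
    """Stand-in for the session manager's run() context; raises during exit."""
    events.append("mcp started")
    yield
    raise RuntimeError("teardown failed")


@asynccontextmanager
async def lifespan():
    stack = AsyncExitStack()
    try:
        await stack.enter_async_context(flaky_session_manager())
        yield
    finally:
        try:
            await stack.aclose()
        except Exception as exc:
            events.append(f"mcp stop failed: {exc}")
        # The rest of shutdown still runs even though MCP teardown raised.
        events.append("workers stopped")


async def main():
    async with lifespan():
        events.append("serving")


asyncio.run(main())
```

With the unguarded `async with AsyncExitStack()` form, the `RuntimeError` would propagate out of the lifespan and the "workers stopped" step would never be reached.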
| try: | ||
| raw = base64.b64decode(audio_base64, validate=True) | ||
| except Exception: | ||
| return '{"error":"audio_base64 is not valid base64"}' | ||
| # 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a | ||
| # buggy/hostile agent from posting an unbounded blob. | ||
| if len(raw) > 200 * 1024 * 1024: | ||
| return '{"error":"audio exceeds 200 MB limit"}' |
Reject oversized audio before b64decode().
Line 209 allocates the full decoded blob before the 200 MB guard on Line 214 runs, so a buggy or hostile client can still force a very large allocation and take down the mounted backend. Preflight the encoded length first, or stream-decode with a hard cap.
⚙️ Suggested fix
+ max_raw = 200 * 1024 * 1024
+ max_b64 = ((max_raw + 2) // 3) * 4
+ if len(audio_base64) > max_b64:
+ return '{"error":"audio exceeds 200 MB limit"}'
try:
raw = base64.b64decode(audio_base64, validate=True)
except Exception:
return '{"error":"audio_base64 is not valid base64"}'
- # 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
- # buggy/hostile agent from posting an unbounded blob.
- if len(raw) > 200 * 1024 * 1024:
+ if len(raw) > max_raw:
return '{"error":"audio exceeds 200 MB limit"}'
As per coding guidelines, "the backend serves loopback HTTP: treat every query/path/form param as hostile."
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 210-210: Do not catch blind exception: Exception
(BLE001)
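The arithmetic behind the preflight check: padded base64 encodes every 3 raw bytes as 4 characters, so `n` raw bytes need `4 * ceil(n / 3)` characters. A minimal sketch with a configurable cap follows; the helper names `max_b64_len` and `decode_capped` and the tuple return shape are assumptions for the example.

```python
import base64
import json

MAX_RAW = 200 * 1024 * 1024  # 200 MB cap, as stated for the transcribe tool


def max_b64_len(max_raw: int) -> int:
    """Length of the padded base64 string that encodes max_raw bytes."""
    return ((max_raw + 2) // 3) * 4


def decode_capped(audio_base64: str, max_raw: int = MAX_RAW):
    """Return (raw_bytes, None) on success or (None, error_json) on failure."""
    # Preflight: reject before allocating the decoded blob.
    if len(audio_base64) > max_b64_len(max_raw):
        return None, json.dumps({"error": "audio exceeds limit"})
    try:
        raw = base64.b64decode(audio_base64, validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        return None, json.dumps({"error": "audio_base64 is not valid base64"})
    # Secondary guard: the encoded bound rounds up to a multiple of 3 bytes.
    if len(raw) > max_raw:
        return None, json.dumps({"error": "audio exceeds limit"})
    return raw, None
```

Because `validate=True` rejects newlines and other non-alphabet characters, no extra slack for line breaks is needed in the encoded bound.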
Source: Coding guidelines
| r = await _api_post_form( | ||
| "/transcribe", data=data, | ||
| files={"audio": ("audio.wav", raw, "application/octet-stream")}, | ||
| ) | ||
| return str(r.json()) |
Return actual JSON from the new tool.
Line 223 uses str(r.json()), which yields Python repr with single quotes rather than parseable JSON. That breaks the documented tool contract for any MCP client expecting structured JSON back from transcribe().
🛠️ Suggested fix
- return str(r.json())
+ return json.dumps(r.json(), ensure_ascii=False)
Add import json at the top of the module.
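A short demonstration of why the distinction matters. The payload is made up for illustration; the real `/transcribe` response shape is not shown in this review.

```python
import json

payload = {"text": "it's done", "language": None}  # made-up response body

as_repr = str(payload)  # Python repr: single-quoted keys and None
as_json = json.dumps(payload, ensure_ascii=False)  # JSON text with null


def is_valid_json(s: str) -> bool:
    """True when s parses as JSON."""
    try:
        json.loads(s)
        return True
    except json.JSONDecodeError:
        return False
```

Only the `json.dumps` output round-trips through `json.loads`; the repr form fails on the single-quoted keys before a client ever reaches the data.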
| def _base_url() -> tuple[str, str]: | ||
| host = os.environ.get("OMNIVOICE_HOST", "127.0.0.1") | ||
| port = int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT))) | ||
| return f"http://{host}:{port}/mcp/", f"http://{host}:{port}/health" | ||
|
|
Guard OMNIVOICE_PORT parsing to avoid startup crash on bad env values.
At Line 47, int(os.environ.get("OMNIVOICE_PORT", ...)) can raise ValueError and kill the shim before it emits a usable diagnostic. This turns a typoed env into a hard crash instead of an actionable startup error.
Suggested fix
def _base_url() -> tuple[str, str]:
host = os.environ.get("OMNIVOICE_HOST", "127.0.0.1")
- port = int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT)))
+ raw_port = os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT))
+ try:
+ port = int(raw_port)
+ except ValueError:
+ raise ValueError(f"OMNIVOICE_PORT must be an integer, got: {raw_port!r}")
return f"http://{host}:{port}/mcp/", f"http://{host}:{port}/health"
And in main() convert that exception to _err(...) + nonzero exit without traceback.
🧰 Tools
🪛 ast-grep (0.43.0)
[warning] 47-47: Do not make http calls without encryption
Context: f"http://{host}:{port}/mcp/"
Note: [CWE-319].
(requests-http)
[warning] 47-47: Do not make http calls without encryption
Context: f"http://{host}:{port}/health"
Note: [CWE-319].
(requests-http)
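A sketch of both halves of the suggestion (validate in a helper, convert to a friendly exit in `main()`), under stated assumptions: `DEFAULT_PORT = 3900` is inferred from the `localhost:3900` URL mentioned for docs/mcp.md, `print` to stderr stands in for the shim's `_err(...)` helper, the message prefix is hypothetical, and the range check is an addition not requested by the review.

```python
import os
import sys

DEFAULT_PORT = 3900  # assumed from the localhost:3900 URL in docs/mcp.md


def parse_port(env=None) -> int:
    """Read OMNIVOICE_PORT, raising ValueError with a readable message."""
    env = os.environ if env is None else env
    raw = env.get("OMNIVOICE_PORT", str(DEFAULT_PORT))
    try:
        port = int(raw)
    except ValueError:
        raise ValueError(f"OMNIVOICE_PORT must be an integer, got: {raw!r}") from None
    if not 1 <= port <= 65535:  # extra check, not part of the review suggestion
        raise ValueError(f"OMNIVOICE_PORT out of range: {port}")
    return port


def main(env=None) -> int:
    """Return a process exit code instead of crashing with a traceback."""
    try:
        port = parse_port(env)
    except ValueError as exc:
        # stdout carries protocol traffic in a stdio shim, so log to stderr.
        print(f"omnivoice-mcp-shim: {exc}", file=sys.stderr)
        return 2
    print(f"proxying to http://127.0.0.1:{port}/mcp/", file=sys.stderr)
    return 0
```

Raising on a bad value rather than falling back to the default makes a typo visible immediately, which is the behavior the comment title asks for.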
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/mcp_shim/__main__.py` around lines 45 - 49, The _base_url() function
currently calls int(os.environ.get("OMNIVOICE_PORT", str(DEFAULT_PORT))) which
can raise ValueError on bad env input; wrap the port parsing in a try/except
inside _base_url() (or a small helper) to validate the env value and fall back
to DEFAULT_PORT if parsing fails, and in main() catch any parsing/ValueError
from _base_url(), call _err(...) with a clear message including the bad value
and exit with a nonzero status (no traceback). Ensure you reference the
OMNIVOICE_PORT env var, keep DEFAULT_PORT as the fallback, and use the existing
_err(...) helper and main() exit path so the process logs a friendly error
instead of crashing.
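A minimal sketch of the guarded port parsing described above, under stated assumptions: `DEFAULT_PORT = 8000`, the `PortConfigError` class, the range check, and the injectable `env` argument are all hypothetical and are not taken from the shim. The real code would route the message through its existing `_err(...)` helper rather than `print`.

```python
import os
import sys

DEFAULT_PORT = 8000  # hypothetical value; the shim's real default is not shown in this thread


class PortConfigError(ValueError):
    """Raised when OMNIVOICE_PORT is not a usable TCP port (hypothetical helper type)."""


def parse_port(raw: str) -> int:
    """Convert the raw env value to an int, with a message that names the bad value."""
    try:
        port = int(raw)
    except ValueError:
        raise PortConfigError(f"OMNIVOICE_PORT must be an integer, got: {raw!r}") from None
    # Range check is an extra assumption, not part of the review suggestion.
    if not 1 <= port <= 65535:
        raise PortConfigError(f"OMNIVOICE_PORT out of range 1-65535: {port}")
    return port


def base_url(env=os.environ) -> tuple[str, str]:
    """Build the MCP and health URLs from the environment mapping."""
    host = env.get("OMNIVOICE_HOST", "127.0.0.1")
    port = parse_port(env.get("OMNIVOICE_PORT", str(DEFAULT_PORT)))
    return f"http://{host}:{port}/mcp/", f"http://{host}:{port}/health"


def main(env=os.environ) -> int:
    """Return a process exit code; a bad port yields a one-line error, not a traceback."""
    try:
        base_url(env)
    except PortConfigError as exc:
        print(f"omnivoice-mcp-shim: {exc}", file=sys.stderr)
        return 2
    # The stdio proxy loop would start here.
    return 0
```

Raising a dedicated exception type keeps `main()` from swallowing unrelated `ValueError`s, which is one possible way to satisfy the "friendly error, no traceback" request.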
const onAdd = async () => {
  if (!clientId.trim()) return;
  setError(null);
  try {
    await apiFetch('/api/mcp/bindings', {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ client_id: clientId.trim(), profile_id: profileId || null }),
    });
    setClientId('');
    setProfileId('');
    refresh();
  } catch (e) {
    setError(e?.message || 'Failed to save binding');
  }
};

const onDelete = async (cid) => {
  try {
    await apiFetch(`/api/mcp/bindings/${encodeURIComponent(cid)}`, { method: 'DELETE' });
    refresh();
  } catch (e) {
    setError(e?.message || 'Failed to delete binding');
There was a problem hiding this comment.
Add pending/disabled state to prevent duplicate add/delete requests.
Line [41]-Line [63] fires async PUT/DELETE with controls still enabled at Line [83]-Line [97]. Repeated clicks can enqueue conflicting requests and surface misleading errors.
Proposed fix (shape)
const [error, setError] = useState(null);
+ const [isSaving, setIsSaving] = useState(false);
+ const [deletingId, setDeletingId] = useState(null);
const onAdd = async () => {
if (!clientId.trim()) return;
+ if (isSaving) return;
setError(null);
+ setIsSaving(true);
try {
await apiFetch('/api/mcp/bindings', {
...
setClientId('');
setProfileId('');
- refresh();
+ await refresh();
} catch (e) {
setError(e?.message || 'Failed to save binding');
+ } finally {
+ setIsSaving(false);
}
};
const onDelete = async (cid) => {
+ if (deletingId) return;
+ setDeletingId(cid);
try {
await apiFetch(`/api/mcp/bindings/${encodeURIComponent(cid)}`, { method: 'DELETE' });
- refresh();
+ await refresh();
} catch (e) {
setError(e?.message || 'Failed to delete binding');
+ } finally {
+ setDeletingId(null);
}
};
...
- <button type="button" onClick={() => onDelete(b.client_id)}
+ <button type="button" onClick={() => onDelete(b.client_id)} disabled={!!deletingId || isSaving}
...
- <input type="text" value={clientId} ...
+ <input type="text" value={clientId} disabled={isSaving || !!deletingId} ...
...
- <select value={profileId} ...
+ <select value={profileId} disabled={isSaving || !!deletingId} ...
...
- <button type="button" onClick={onAdd} data-testid="mcp-add">Bind</button>
+ <button type="button" onClick={onAdd} disabled={isSaving || !!deletingId} data-testid="mcp-add">Bind</button>
As per coding guidelines, long-running frontend operations should expose loading/disabled states.
Also applies to: 83-97
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@frontend/src/components/settings/MCPBindingsPanel.jsx` around lines 41 - 63,
Add a loading/disabled state to prevent duplicate requests: introduce a boolean
state (e.g., isSaving or isBusy, and optionally isDeleting keyed by client id)
and use it in onAdd and onDelete to short-circuit repeated clicks, set the flag
true before the apiFetch call and false in a finally block, and pass the flag
down to the UI controls (the add button/input and delete buttons rendered in the
MCPBindingsPanel) so they are disabled while the request is in-flight; ensure
onAdd clears inputs only after success and onDelete uses the per-item deleting
flag to disable just that delete control to avoid blocking other actions.
Source: Coding guidelines
| # (Assumption A1 in RESEARCH.md was checked at execute-time and proved | ||
| # false — `cryptography` is not on the install path today). | ||
| "cryptography>=41", | ||
| "mcp>=1.2", |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What is the earliest mcp Python package version that supports FastMCP settings.streamable_http_path and request context header access used in mounted Streamable-HTTP apps?
💡 Result:
The earliest FastMCP version that supports settings.streamable_http_path and request context header access in Streamable-HTTP mounted apps is v2.3.0 [1][2]. Support for the Streamable HTTP transport, including the associated streamable_http_path configuration setting and the necessary session management to handle requests and context headers in Streamable-HTTP apps, was introduced in FastMCP v2.3.0, released on May 8, 2025 [3][1][2]. This release (titled "Stream Me Up, Scotty") implemented the Streamable HTTP server and client support, which was tracked and finalized under pull request #361 [3][1][2]. While later versions—such as v2.5.0 and the v3.x series—have further refined HTTP routing, dependency management, and OpenAPI integration [4][5], the fundamental infrastructure for Streamable-HTTP configurations and context access originated in v2.3.0 [3][1][2]. Sources: [3] PrefectHQ/fastmcp#361 [4] https://github.com/PrefectHQ/fastmcp/releases/tag/v3.0.0 [1] PrefectHQ/fastmcp#130 [2] https://github.com/jlowin/fastmcp/releases/tag/v2.3.0 [5] https://github.com/jlowin/fastmcp/releases/tag/v2.5.0
Citations:
- 1: Streamable HTTP support PrefectHQ/fastmcp#130
- 2: https://github.com/jlowin/fastmcp/releases/tag/v2.3.0
- 3: Streamable HTTP support PrefectHQ/fastmcp#361
- 4: https://github.com/PrefectHQ/fastmcp/releases/tag/v3.0.0
- 5: https://github.com/jlowin/fastmcp/releases/tag/v2.5.0
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== pyproject.toml line around 111 =="
python3 - <<'PY'
import itertools
path="pyproject.toml"
start=95; end=130
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if start<=i<=end:
print(f"{i}: {line.rstrip()}")
PY
echo
echo "== Find mcp/fastmcp imports and mcp_server usage of streamable_http_path =="
rg -n --hidden --no-ignore-vcs "(FastMCP|streamable_http_path|/mcp|request context|request_context|header|get_header|X-|\bmcp_server\.py\b)" -S .
echo
echo "== Locate backend/mcp_server.py definition and the referenced line window (46-134) if present =="
if [ -f "backend/mcp_server.py" ]; then
python3 - <<'PY'
path="backend/mcp_server.py"
start=35; end=160
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if start<=i<=end:
print(f"{i}: {line.rstrip()}")
PY
else
echo "backend/mcp_server.py not found at repo root; searching by filename..."
fd -t f "mcp_server.py" .
fi
Repository: debpalash/OmniVoice-Studio
Length of output: 50382
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== backend/mcp_server.py: lines 1-220 (only) =="
python3 - <<'PY'
import itertools
path="backend/mcp_server.py"
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if i<=220:
print(f"{i}: {line.rstrip()}")
PY
echo
echo "== uv.lock: mcp package entries (context around line numbers) =="
rg -n '^name = "mcp"$|mcp-1\.' uv.lock
echo
# show the specific resolved stanza for the locked mcp
python3 - <<'PY'
import re
path="uv.lock"
target="mcp"
with open(path,'r',encoding='utf-8') as f:
lines=f.readlines()
for i,l in enumerate(lines):
if l.strip().startswith('name = "mcp"'):
start=max(0,i-15); end=min(len(lines), i+60)
for j in range(start,end):
print(f"{j+1}: {lines[j].rstrip()}")
break
else:
# fallback: print around the known artifact lines we already saw
for i,l in enumerate(lines):
if "mcp-1." in l:
start=max(0,i-20); end=min(len(lines), i+50)
for j in range(start,end):
print(f"{j+1}: {lines[j].rstrip()}")
break
PY
echo
echo "== docs/mcp.md mentions MCP/FastMCP versioning? =="
rg -n "mcp|FastMCP|streamable_http_path" docs/mcp.md
Repository: debpalash/OmniVoice-Studio
Length of output: 18563
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== backend/main.py MCP mount usage (around 420-480) =="
python3 - <<'PY'
path="backend/main.py"
start=400; end=500
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if start<=i<=end:
print(f"{i}: {line.rstrip()}")
PY
echo
echo "== tests/test_mcp_mount.py (full file, small) =="
wc -l tests/test_mcp_mount.py
python3 - <<'PY'
path="tests/test_mcp_mount.py"
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
print(f"{i}: {line.rstrip()}")
PY
echo
echo "== Search for explicit mcp version expectations in repo =="
rg -n "mcp[^\n]*(>=|==|<=|<)\s*|FastMCP|streamable_http_path" pyproject.toml backend docs .github . -S --glob='!uv.lock' --glob='!**/*.png' --glob='!**/*.jpg'
Repository: debpalash/OmniVoice-Studio
Length of output: 14371
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== backend/main.py MCP mount usage (around 420-480) =="
python3 - <<'PY'
path="backend/main.py"
start=420; end=480
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if start<=i<=end:
print(f"{i}: {line.rstrip()}")
PY
echo
echo "== tests/test_mcp_mount.py =="
wc -l tests/test_mcp_mount.py
python3 - <<'PY'
path="tests/test_mcp_mount.py"
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
print(f"{i}: {line.rstrip()}")
PY
echo
echo "== Search for explicit mcp version expectations in repo =="
rg -n "mcp[^\n]*(>=|==|<=|<)\s*|FastMCP|streamable_http_path" pyproject.toml backend docs .github . -S --glob='!uv.lock'
Repository: debpalash/OmniVoice-Studio
Length of output: 12378
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== backend/mcp_server.py: find streamable_http_app + session manager usage =="
rg -n "streamable_http_app|session_manager|AsyncExitStack|streamable_http_path|get_context\(\)|request_context" backend/mcp_server.py
echo
echo "== backend/mcp_server.py: tail (last ~120 lines) =="
python3 - <<'PY'
import pathlib
p=pathlib.Path("backend/mcp_server.py")
lines=p.read_text(encoding="utf-8").splitlines()
start=max(1,len(lines)-120+1)
for i in range(start, len(lines)+1):
print(f"{i}: {lines[i-1]}")
PY
echo
echo "== Search for mcp 1.27.x mention in repo =="
rg -n "mcp[- ]?1\.27|FastMCP.*1\.27|streamable_http_path.*1\.27" . -S
Repository: debpalash/OmniVoice-Studio
Length of output: 5640
🌐 Web query:
Find where streamable_http_path is introduced for mcp.server.fastmcp.FastMCP (Python mcp package), and which mcp versions include mcp.get_context().request_context.request / request context header access.
💡 Result:
The streamable_http_path parameter was introduced in the mcp Python SDK to configure the base endpoint path for StreamableHTTP transports [1][2]. It is primarily configured during FastMCP initialization or passed to the streamable_http_app() method [3][4][5]. This parameter allows developers to specify where the server endpoints are mounted (e.g., setting streamable_http_path="/" allows the MCP server to respond at the root path of the mounted application) [3][1]. Access to raw request context headers—specifically via mcp.get_context().request_context.request—was introduced in the official modelcontextprotocol/python-sdk following Pull Request #380, which added support for raw request injection into the RequestContext [6]. This capability allows developers to access the underlying Starlette Request object and its headers directly within MCP tools and resources when using HTTP-based transports [6][7][8]. For robust, transport-agnostic header access, it is recommended to use the built-in dependency helpers provided by the SDK, such as get_http_headers(), which safely retrieve headers from the current HTTP request if available without raising exceptions when no HTTP context exists [9][10].
Citations:
- 1: https://deepwiki.com/modelcontextprotocol/python-sdk/8.2-streamablehttp-server-examples
- 2: https://github.com/modelcontextprotocol/python-sdk/blob/e8e64842/docs/migration.md
- 3: https://github.com/modelcontextprotocol/python-sdk/blob/v1.x/examples/snippets/servers/streamable_http_path_config.py
- 4: https://github.com/modelcontextprotocol/python-sdk/blob/e8e64842/README.v2.md
- 5: https://github.com/modelcontextprotocol/python-sdk/blob/main/src/mcp/server/mcpserver/server.py
- 6: Added support for raw request injection in RequestContext. modelcontextprotocol/python-sdk#380
- 7: how to get http request headers in mcp tools logic when i use streamable http mode to run mcp server ? modelcontextprotocol/python-sdk#750
- 8: Pass Token Bearer from Client to MCP Server modelcontextprotocol/python-sdk#998
- 9: https://gofastmcp.com/v2/servers/context
- 10: https://gofastmcp.com/python-sdk/fastmcp-server-dependencies
🌐 Web query:
Search mcp.server.fastmcp streamable_http_path and return the most relevant mcp release notes or PRs for the setting and request context API.
💡 Result:
The streamable_http_path setting in FastMCP controls the base endpoint path for StreamableHTTP connections [1][2]. It is a critical configuration option for integrating MCP servers into larger ASGI applications (e.g., via Starlette or FastAPI) to avoid nested path issues like /mcp/mcp [3][4][5]. Key Information: - Default Behavior: As of PR #1007, the default value for streamable_http_path was updated to / to improve consistency and prevent automatic, potentially confusing, path nesting when mounting [4]. - Usage: Users can configure this path during FastMCP initialization or when generating the ASGI app [3][6]. - API Integration: When using StreamableHTTP mode, request context (such as HTTP headers) can be accessed within MCP tools using the Context object or by importing helper functions like get_http_headers from fastmcp.server.dependencies [7]. - Migration: Recent versions of the SDK (e.g., v1.x+) have moved transport-specific parameters, including streamable_http_path, out of the MCPServer constructor and into the.run or.streamable_http_app methods to improve separation of concerns [8]. Related PRs and Issues: - PR #1007: Standardized the default streamable_http_path to / [4]. - Issue #750: Discusses accessing HTTP headers in streamable-http mode via the Context object [7]. - Issue #951: Addresses path mounting and redirection challenges when using FastAPI with streamable_http_app [5]. - Issue #1233: Highlights issues regarding stale request context access in some server implementations [9]. For detailed implementation examples, refer to the official FastMCP documentation on StreamableHTTP transport [1][6].
Citations:
- 1: https://gofastmcp.com/python-sdk/fastmcp-server-http
- 2: https://github.com/jlowin/fastmcp/blob/main/src/fastmcp/server/http.py
- 3: https://github.com/modelcontextprotocol/python-sdk/blob/v1.x/examples/snippets/servers/streamable_http_path_config.py
- 4: feat: default streamable_http_path to "/" for consistency modelcontextprotocol/python-sdk#1007
- 5: Request redirection for the MCP server mounted with FastAPI modelcontextprotocol/python-sdk#951
- 6: https://pypi.org/project/mcp/1.25.0/
- 7: how to get http request headers in mcp tools logic when i use streamable http mode to run mcp server ? modelcontextprotocol/python-sdk#750
- 8: https://github.com/modelcontextprotocol/python-sdk/blob/e8e64842/docs/migration.md
- 9: Issue with HTTP Request Context Access in MCP Tool Execution PrefectHQ/fastmcp#1233
🌐 Web query:
Find the mcp (Model Context Protocol Python SDK) version that includes PR #380 (raw request injection into RequestContext) and which mcp version includes PR #1007 (default streamable_http_path to /). Return the release tag/version numbers.
💡 Result:
The MCP (Model Context Protocol) Python SDK incorporated the following changes:
- PR #380, which added support for raw request injection into RequestContext, was merged into the main branch and is available in releases starting from v1.3.0.
- PR #1007, which updated the default streamable_http_path to /, was merged and is included in releases starting from v1.11.0.
These versions represent the stable progression within the v1.x series of the SDK. As of June 11, 2026, the project is maintaining v1.x while actively developing v2.0.0 (currently in alpha).
Citations:
- 1: Added support for raw request injection in RequestContext. modelcontextprotocol/python-sdk#380
- 2: feat: default streamable_http_path to "/" for consistency modelcontextprotocol/python-sdk#1007
- 3: https://github.com/modelcontextprotocol/python-sdk/tree/v1.x
- 4: https://github.com/modelcontextprotocol/python-sdk/releases/tag/v1.9.4
- 5: https://github.com/modelcontextprotocol/python-sdk/releases/tag/v1.10.0
🌐 Web query:
In the modelcontextprotocol/python-sdk repo, search for streamable_http_path and identify the earliest tagged mcp release version that contains FastMCP.settings.streamable_http_path (or equivalent) for StreamableHTTP servers.
💡 Result:
The streamable_http_path configuration for StreamableHTTP servers was introduced in the modelcontextprotocol/python-sdk as part of the initial support for the Streamable HTTP transport [1][2]. This functionality was officially released in version v1.8.0 [1][2]. In the original implementation and subsequent versions, streamable_http_path is a configuration setting used to define the path for StreamableHTTP connections [3][4]. Users can configure it to control where the MCP server is mounted, for example by setting mcp.settings.streamable_http_path = "/" [3]. It is important to note that the modelcontextprotocol/python-sdk is distinct from the fastmcp library (maintained by PrefectHQ) [5][6][7]. While both support similar functionality and terminology, release versions are tracked independently [8][9]. The modelcontextprotocol/python-sdk v1.8.0 release is the authoritative source for the initial introduction of Streamable HTTP support in the official SDK [1].
Citations:
- 1: https://github.com/modelcontextprotocol/python-sdk/releases/tag/v1.8.0
- 2: streamable_http.py not in Python SDK 1.7.1 modelcontextprotocol/python-sdk#650
- 3: http://github.com/modelcontextprotocol/python-sdk?tab=readme-ov-file
- 4: modelcontextprotocol/python-sdk@8adb5bd
- 5: https://github.com/jlowin/fastmcp/blob/e87ede07/src/fastmcp/server/http.py
- 6: https://github.com/jlowin/fastmcp/blob/main/docs/deployment/http.mdx
- 7: https://github.com/jlowin/fastmcp/blob/e87ede07/docs/updates.mdx
- 8: https://github.com/modelcontextprotocol/python-sdk/releases/tag/v2.0.0a1
- 9: https://github.com/PrefectHQ/fastmcp/releases
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import re
path="backend/main.py"
with open(path,'r',encoding='utf-8') as f:
lines=f.readlines()
for i,l in enumerate(lines, start=1):
if "mcp_session_manager" in l or "streamable_http_app" in l or "mount_mcp" in l:
print(f"{i}: {l.rstrip()}")
PY
echo "== backend/main.py: show around mcp_session_manager block (if any) =="
python3 - <<'PY'
path="backend/main.py"
with open(path,'r',encoding='utf-8') as f:
lines=f.readlines()
hits=[]
for i,l in enumerate(lines, start=1):
if "mcp_session_manager" in l or "streamable_http_app" in l or "mount_mcp" in l:
hits.append(i)
if not hits:
print("No hits found.")
raise SystemExit(0)
for h in sorted(set(hits)):
start=max(1,h-25); end=min(len(lines),h+25)
print(f"\n--- context around line {h} ({start}-{end}) ---")
for j in range(start,end+1):
print(f"{j}: {lines[j-1].rstrip()}")
PY
Repository: debpalash/OmniVoice-Studio
Length of output: 15064
Pin mcp to the tested 1.27.x range to keep /mcp mount + per-agent header binding stable — pyproject.toml currently allows mcp>=1.2 (line 111), but the runtime relies on mcp.settings.streamable_http_path = "/" (backend/mcp_server.py:61) and on HTTP request header access via mcp.get_context().request_context.request (backend/mcp_server.py:94-98); with an older/newer mcp, the mount can land on the wrong paths and/or fail to read X-OmniVoice-Client-Id, breaking agent voice binding.
Suggested fix
- "mcp>=1.2",
+ "mcp>=1.27,<1.28",
This matches the repo’s resolved version (uv.lock pins mcp==1.27.2, and the mount tests assert the sub-mount shape: tests/test_mcp_mount.py:28-50).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pyproject.toml` at line 111, Update the pyproject.toml dependency for mcp to
pin it to the tested 1.27.x range (e.g. change "mcp>=1.2" to "mcp>=1.27,<1.28")
so the runtime assumptions in backend/mcp_server.py
(settings.streamable_http_path = "/" and use of
mcp.get_context().request_context.request to read X-OmniVoice-Client-Id) remain
stable and match the tests that expect the 1.27 sub-mount behavior.
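As a complement to the pin, a startup-time check could log a warning when the installed `mcp` falls outside the tested range. The sketch below is an illustration only: the helper names are hypothetical, the comparison looks only at the leading numeric release segment (so a pre-release such as `1.27.0rc1` counts as in range), and it is not something this PR adds.

```python
import re
from importlib import metadata

TESTED_MIN = (1, 27)            # inclusive lower bound, mirrors "mcp>=1.27"
TESTED_MAX_EXCLUSIVE = (1, 28)  # exclusive upper bound, mirrors "<1.28"


def release_tuple(version: str) -> tuple[int, ...]:
    """Leading numeric release segment, e.g. '1.27.2' gives (1, 27, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def in_tested_range(version: str) -> bool:
    """True when the major.minor pair lies in [TESTED_MIN, TESTED_MAX_EXCLUSIVE)."""
    major_minor = release_tuple(version)[:2]
    return TESTED_MIN <= major_minor < TESTED_MAX_EXCLUSIVE


def installed_mcp_in_range() -> bool:
    """False when mcp is missing or outside the tested range; never raises."""
    try:
        return in_tested_range(metadata.version("mcp"))
    except (metadata.PackageNotFoundError, ValueError):
        return False
```

Because the mount is already best-effort, a `False` result here would most naturally map to a logged warning rather than a startup failure.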
os.environ.setdefault("OMNIVOICE_MODEL", "test")
os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1")
There was a problem hiding this comment.
Module-level env mutation leaks test state across the suite.
Line 13 and Line 14 mutate os.environ at import time, so unrelated tests can inherit OMNIVOICE_MODEL=test / OMNIVOICE_DISABLE_FILE_LOG=1 based on import order. Move these into an autouse fixture with monkeypatch so state is restored per test.
🔧 Proposed fix
-os.environ.setdefault("OMNIVOICE_MODEL", "test")
-os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1")
+@pytest.fixture(autouse=True)
+def _isolated_test_env(monkeypatch):
+    monkeypatch.setenv("OMNIVOICE_MODEL", "test")
+    monkeypatch.setenv("OMNIVOICE_DISABLE_FILE_LOG", "1")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/test_mcp_bindings.py` around lines 13 - 14, Remove the module-level
os.environ.setdefault("OMNIVOICE_MODEL", "test") and
os.environ.setdefault("OMNIVOICE_DISABLE_FILE_LOG", "1") calls in
tests/test_mcp_bindings.py and instead add a pytest autouse fixture (e.g., def
env_autouse(monkeypatch):) that uses monkeypatch.setenv("OMNIVOICE_MODEL",
"test") and monkeypatch.setenv("OMNIVOICE_DISABLE_FILE_LOG", "1") before
yielding; this ensures each test gets the env vars and the monkeypatch restores
state after each test.
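The restore-after-use behaviour that `monkeypatch.setenv` provides can be illustrated without pytest. The sketch below swaps in the standard library's `unittest.mock.patch.dict` purely for demonstration; `isolated_test_env` and `TEST_ENV` are hypothetical names, and the suggested fix above (an autouse pytest fixture) remains the form the review asks for. Note that both this and `setenv` overwrite an existing value, whereas the original `setdefault` calls did not.

```python
import os
from contextlib import contextmanager
from unittest import mock

# Values the test module currently sets at import time.
TEST_ENV = {"OMNIVOICE_MODEL": "test", "OMNIVOICE_DISABLE_FILE_LOG": "1"}


@contextmanager
def isolated_test_env():
    """Apply TEST_ENV for the duration of the block, then restore os.environ."""
    # patch.dict snapshots the mapping on entry and puts it back on exit,
    # so nothing set here leaks into code that runs afterwards.
    with mock.patch.dict(os.environ, TEST_ENV):
        yield


if __name__ == "__main__":
    before = os.environ.get("OMNIVOICE_MODEL")
    with isolated_test_env():
        print("inside:", os.environ["OMNIVOICE_MODEL"])
    print("restored:", os.environ.get("OMNIVOICE_MODEL") == before)
```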
here = os.path.abspath(os.path.dirname(__file__))
root = here
while root and root != "/" and not os.path.isfile(os.path.join(root, "alembic.ini")):
    root = os.path.dirname(root)
cfg = Config(os.path.join(root, "alembic.ini"))
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import ntpath
root = r"C:\repo\tests"
for i in range(8):
    parent = ntpath.dirname(root)
    print(f"{i}: root={root!r}, parent={parent!r}")
    if parent == root:
        print("stuck_at_root=True")
        break
    root = parent
PY
Repository: debpalash/OmniVoice-Studio
Length of output: 198
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect suspected snippet
if [ -f "tests/test_mcp_bindings.py" ]; then
echo "== tests/test_mcp_bindings.py (around 100-160) =="
nl -ba tests/test_mcp_bindings.py | sed -n '90,170p'
else
echo "Missing: tests/test_mcp_bindings.py"
fi
# Find any other occurrences of similar root-walk logic referencing alembic.ini
echo
echo "== Search: alembic.ini usage =="
rg -n "alembic\.ini" .
echo
echo "== Search: root != '/' patterns (possible non-terminating loops) =="
rg -n "root\s*!=\s*['\"]/['\"]" .
Repository: debpalash/OmniVoice-Studio
Length of output: 181
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check file existence =="
ls -la tests | head
ls -la tests/test_mcp_bindings.py || true
echo
echo "== tests/test_mcp_bindings.py (around 100-160) =="
if [ -f "tests/test_mcp_bindings.py" ]; then
awk 'NR>=90 && NR<=170 {printf "%d\t%s\n", NR, $0}' tests/test_mcp_bindings.py
else
echo "Missing: tests/test_mcp_bindings.py"
fi
echo
echo "== Search: alembic.ini usage =="
rg -n --hidden --glob '!.git/*' "alembic\.ini" .
echo
echo "== Search: root != '/' patterns =="
rg -n --hidden --glob '!.git/*' "root\s*!=\s*['\"]/['\"]" .
Repository: debpalash/OmniVoice-Studio
Length of output: 8445
Fix potential non-terminating root-walk loops on Windows (alembic.ini discovery)
- tests/test_mcp_bindings.py (_run_alembic, loop at ~line 119) uses while root and root != "/" ... root = os.path.dirname(root), which can hang on Windows because os.path.dirname("C:\\") returns C:\\ (so root never progresses to the termination condition).
- Same pattern exists in tests/test_profile_consent.py (~line 220) and tests/backend/services/test_settings_store.py (~line 245).
- Change the loop termination to stop when os.path.dirname(root) == root (or use pathlib.Path(...).parents) and keep the alembic.ini existence assertion.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/test_mcp_bindings.py` around lines 117 - 121, The root-walking loop in
the _run_alembic routine uses `while root and root != "/"` which can never
terminate on Windows (e.g., "C:\\"); update the loop condition to stop when the
parent equals the current directory (e.g., `while True:` break when
`os.path.dirname(root) == root`) or switch to pathlib.parents to iterate up
parents, and keep the existing check for os.path.isfile(os.path.join(root,
"alembic.ini")) so the search still asserts presence of alembic.ini; apply the
same fix to the analogous loops in tests/test_profile_consent.py and
tests/backend/services/test_settings_store.py.
Source: Coding guidelines
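One way to write the terminating search with `pathlib`, as the comment above suggests. `find_project_root` is a hypothetical helper name, not a function from the repository; the sketch relies on `Path.parents` being a finite sequence on every platform, so the loop ends at the filesystem or drive root without comparing against `"/"`.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "alembic.ini") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`.

    Raises FileNotFoundError instead of looping forever when the marker is
    missing, which also keeps the "alembic.ini must exist" assertion.
    """
    start = start.resolve()
    # `parents` stops at the root on POSIX and at the drive root on Windows.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"{marker} not found in {start} or any parent directory")
```

In a test module this would be called as `find_project_root(Path(__file__).parent)`, and the result joined with `"alembic.ini"` before building the alembic `Config`.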
The two main-importing mount tests ran the app lifespan, which now starts
the FastMCP session manager and binds asyncio queues to the test loop —
contaminating later lifespan-running tests ('bound to a different event
loop'). The mount happens at import time, so inspecting app.routes for the
/mcp Mount is the correct loop-free assertion. Same fix shape as the
Wave 0.2 consent tests.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
try:
    raw = base64.b64decode(audio_base64, validate=True)
except Exception:
    return '{"error":"audio_base64 is not valid base64"}'
# 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
# buggy/hostile agent from posting an unbounded blob.
if len(raw) > 200 * 1024 * 1024:
    return '{"error":"audio exceeds 200 MB limit"}'
There was a problem hiding this comment.
Size guard fires after the 200 MB allocation, not before
The comment explicitly says this cap "Keeps a buggy/hostile agent from posting an unbounded blob," but base64.b64decode runs before the check. A 200 MB audio file encodes to ~267 MB of base64; the server fully allocates the ~200 MB decoded bytes object and then rejects it. A buggy agent looping the call, or one injecting large payloads, can trigger repeated 200 MB heap allocations before any guard fires. Add an encoded-length pre-check (base64 expands by ~4/3) so the blob is never decoded when it would exceed the cap.
-try:
-    raw = base64.b64decode(audio_base64, validate=True)
-except Exception:
-    return '{"error":"audio_base64 is not valid base64"}'
-# 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
-# buggy/hostile agent from posting an unbounded blob.
-if len(raw) > 200 * 1024 * 1024:
-    return '{"error":"audio exceeds 200 MB limit"}'
+# 200 MB cap — same spirit as voicebox's transcribe gate. Keeps a
+# buggy/hostile agent from posting an unbounded blob.
+# Check the encoded length first (~4/3 overhead) so we never allocate
+# the decoded blob when we'd immediately reject it anyway.
+if len(audio_base64) > 200 * 1024 * 1024 * 4 // 3 + 64:
+    return '{"error":"audio exceeds 200 MB limit"}'
+try:
+    raw = base64.b64decode(audio_base64, validate=True)
+except Exception:
+    return '{"error":"audio_base64 is not valid base64"}'
+if len(raw) > 200 * 1024 * 1024:
+    return '{"error":"audio exceeds 200 MB limit"}'
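The pre-check can also be written with an exact bound rather than the approximate `4 // 3 + 64` slack. The sketch below is a standalone illustration, not the tool's code: `decode_capped` and `max_encoded_len` are hypothetical names, and errors are raised instead of returned as JSON strings. It relies on `validate=True` rejecting whitespace and other non-alphabet characters, so a valid padded input of `n` bytes is exactly `4 * ceil(n / 3)` characters long.

```python
import base64
import binascii

MAX_AUDIO_BYTES = 200 * 1024 * 1024  # the 200 MB cap from the transcribe tool


def max_encoded_len(max_decoded: int) -> int:
    """Length of the padded base64 encoding of `max_decoded` bytes."""
    return 4 * ((max_decoded + 2) // 3)


def decode_capped(audio_base64: str, max_bytes: int = MAX_AUDIO_BYTES) -> bytes:
    """Decode base64 audio, rejecting oversized input before allocating it."""
    # Cheap length test first: nothing is decoded for clearly oversized input.
    if len(audio_base64) > max_encoded_len(max_bytes):
        raise ValueError("audio exceeds size limit")
    try:
        raw = base64.b64decode(audio_base64, validate=True)
    except (binascii.Error, ValueError):
        raise ValueError("audio_base64 is not valid base64") from None
    # Still needed: the encoded bound rounds up to a multiple of 3 bytes,
    # so an input at the bound can decode to up to 2 bytes over the cap.
    if len(raw) > max_bytes:
        raise ValueError("audio exceeds size limit")
    return raw
```

With a small cap for illustration, `decode_capped(base64.b64encode(b"x" * 10).decode(), max_bytes=10)` succeeds, while a 13-byte payload is rejected by the length test alone.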
Root cause of the CI failure: the bindings REST fixture set
OMNIVOICE_MCP_DISABLE=1 and reloaded main but never restored it, so a
later 'from main import app' in test_mcp_mount saw /mcp un-mounted
({'/audio','/voice_audio'}). Reloading main mutates the shared module for
every subsequent test.
- REST fixture: drop the disable flag (the mount is harmless without a
lifespan), yield the client, and restore main (+ core.config/db) to the
default data dir in teardown so the global module is clean again.
- test_main_mounts_mcp_route: reload main with the disable flag cleared so
the assertion is independent of any earlier reload.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
What
Wave 2.2 of the parity program (Spec 2) — the headline "agents speak in your voice" feature. The FastMCP server existed in-tree but was never mounted (dead code); this wires it up and adds per-agent voice binding.
Mount. FastMCP's Streamable-HTTP app is sub-mounted at /mcp on the main FastAPI app; its session manager is composed into the app lifespan via AsyncExitStack (wrapping, not replacing, the existing startup). streamable_http_path="/" so the sub-mount lands at /mcp, not /mcp/mcp. Best-effort throughout: a missing mcp package or OMNIVOICE_MCP_DISABLE=1 leaves the rest of the backend untouched. Adds the mcp dependency (1.27.x).
Per-agent voice binding. Each MCP client sends X-OmniVoice-Client-Id; generate_speech resolves the voice as explicit arg → client binding → global default → app default. New mcp_client_bindings table (alembic 0004 + _BASE_SCHEMA, additive + idempotent), services/mcp_bindings.py (CRUD + resolve_voice + best-effort last-seen), loopback-gated REST (/api/mcp/bindings), and a Settings → Sharing panel.
More. New transcribe tool (base64 in, 200 MB cap). Stdio shim (backend/mcp_shim, httpx-only, ported from voicebox MIT) for clients that only speak stdio; it forwards OMNIVOICE_CLIENT_ID as the binding header. docs/mcp.md (both connection modes + binding REST) and docs/mcp.json updated.
Verification
- tests/test_mcp_bindings.py: service CRUD, resolution precedence (explicit/binding/global/none), migration up+down; tests/test_mcp_mount.py build + streamable-app-serves-at-root.
- initialize handshake: /mcp routes and the session manager runs.
- /mcp mounted (not 404), OMNIVOICE_MCP_DISABLE=1 skips the mount.
- bun run typecheck:ci clean; bunx vitest run 312 passed; CJK + docs-drift gates green.
🤖 Generated with Claude Code
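The voice-resolution order in the description above (explicit arg, then client binding, then global default, then app default) can be sketched as a pure function. This is an illustration under assumptions: the signature is hypothetical, and the real `resolve_voice` in services/mcp_bindings.py reads the `mcp_client_bindings` table rather than an in-memory mapping.

```python
from typing import Mapping, Optional


def resolve_voice(
    explicit_profile_id: Optional[str],
    client_id: Optional[str],
    bindings: Mapping[str, Optional[str]],
    global_default: Optional[str],
) -> Optional[str]:
    """Pick a voice profile by precedence; None means "use the app default"."""
    # 1. An explicit argument on the tool call always wins.
    if explicit_profile_id:
        return explicit_profile_id
    # 2. Otherwise use the binding for the X-OmniVoice-Client-Id value, if any.
    if client_id:
        bound = bindings.get(client_id)
        if bound:
            return bound
    # 3. Otherwise fall back to the global MCP default.
    if global_default:
        return global_default
    # 4. Nothing resolved: the caller applies the app default.
    return None
```

A client with a binding but no explicit argument gets its bound profile; an unknown or unbound client falls through to the global default.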
Overview
This PR implements Wave 2.2 ("agents speak in your voice") by:
Backend Changes
MCP Server Mounting
Per-Agent Voice Binding
Mermaid flowchart (voice resolution in generate_speech):
flowchart TD
    A[generate_speech call] --> B{explicit profile_id?}
    B -- Yes --> C[Use explicit profile_id]
    B -- No --> D{X-OmniVoice-Client-Id present?}
    D -- Yes --> E[Lookup client binding → profile_id/default_engine?]
    E -- Yes --> C
    E -- No --> F{global pref mcp_default_profile_id?}
    D -- No --> F
    F -- Yes --> C
    F -- No --> G[No voice resolved → return None]
New MCP Tools
MCP Stdio Shim
Frontend Changes
Settings UI — Sharing Tab
BEFORE:
┌─ Settings (Sharing Tab) ─────────┐
│ ┌─ SharingPanel ─────────────┐ │
│ └────────────────────────────┘ │
│ ┌─ RemoteBackendPanel ───────┐ │
│ └────────────────────────────┘ │
└──────────────────────────────────┘
AFTER:
┌─ Settings (Sharing Tab) ──────────────┐
│ ┌─ SharingPanel ─────────────────┐ │
│ └────────────────────────────────┘ │
│ ┌─ RemoteBackendPanel ───────────┐ │
│ └────────────────────────────────┘ │
│ ┌─ MCPBindingsPanel ─────────────┐ │
│ │ • List existing bindings │ │
│ │ • Bind client ID → profile │ │
│ │ • Delete binding │ │
│ └────────────────────────────────┘ │
└───────────────────────────────────────┘
Migrations & Schema
Tests & CI
Documentation
Notes / Verification