Skip to content

fix(connectors): MCP single-writer enforcement + security hardening (#976)#998

Merged
itomek merged 1 commit intomainfrom
feat/976-mcp-single-writer-security
May 8, 2026
Merged

fix(connectors): MCP single-writer enforcement + security hardening (#976)#998
itomek merged 1 commit intomainfrom
feat/976-mcp-single-writer-security

Conversation

@itomek
Copy link
Copy Markdown
Collaborator

@itomek itomek commented May 8, 2026

Closes #976. Part of #927.

Three vulnerabilities shipped on main that this PR fixes:

  1. Dual-writer / plaintext secret leakMCPConfig.add_server, POST /api/mcp/servers, and gaia mcp add all wrote mcp_servers.json directly, bypassing the keyring. API keys ended up plaintext on disk. Now McpServerHandler.configure() is the sole writer; everything else is read-only or deleted.

  2. Keyring exfiltration at MCP spawn time_resolve_keyring_refs accepted any service string in a $keyring reference. A corrupted mcp_servers.json could carry "Chrome Safe Storage:Chrome:..." and inject another app's secrets into a spawned MCP subprocess env. Now any service other than gaia.connections is rejected before keyring.get_password is called.

  3. Silent grant inheritance on reconnectdisconnect() cleared keyring entries but left grants.json intact. Re-adding a connector with the same id silently re-attached prior agent consents — a real bypass. Both McpServerHandler and OAuthPkceHandler now call revoke_all_grants_for(connector_id) on disconnect.

Plus: rehype-raw in MessageBubble.tsx allowed LLM-generated <script> tags and javascript: URLs to execute in the Electron renderer. Replaced with SAFE_DISALLOWED_ELEMENTS + safeUrlTransform (new src/utils/markdown.ts).

Catalog reduction

The 22-entry MCP catalog is reduced to 5 tested entries: mcp-github, mcp-filesystem, mcp-fetch, mcp-memory, mcp-git. The other 17 had zero test or doc coverage and are deleted. Users who want the deleted servers use the custom MCP path (tracked in #977).

Test plan

# New regression guards (22 new tests across 3 new files)
pytest tests/unit/connectors/test_catalog_ledger.py -v         # catalog ledger: 4 pass, 5 skip
pytest tests/unit/connectors/test_no_dual_writer.py -v         # dual-writer guard: 9 pass
pytest tests/unit/connectors/test_disconnect_clears_grants.py -v  # grant cleanup: 4 pass

# Keyring security (service pinning + fail-closed)
pytest tests/unit/connectors/test_mcp_server.py::TestResolveKeyringRefs -v  # 6 pass

# Full connectors suite
pytest tests/unit/connectors/ -q  # 305 pass, 5 skip, 1 pre-existing failure (test_oauth_pkce unrelated)

# Verify deletions
grep "rehype-raw" src/gaia/apps/webui/package.json            # nothing
grep -c "ConnectorSpec(" src/gaia/connectors/catalog/mcp_servers.py  # 5
python -m gaia.cli mcp --help                                  # no add/remove subcommands

…976)

Closes #976. Part of #927.

Before this change, three code paths independently wrote `~/.gaia/mcp_servers.json`,
two of them bypassing the keyring and storing API keys as plaintext. A compromised
MCP server entry could also exfiltrate arbitrary keyring secrets at spawn time,
disconnecting a connector left prior agent grants intact for re-use, and LLM output
in the Electron renderer could inject `<script>` tags via `rehype-raw`.

Changes:
- Delete `MCPConfig.add_server`, `remove_server`, `_save` — config.py is now read-only;
  `McpServerHandler.configure()` is the sole writer of mcp_servers.json
- Delete `POST /api/mcp/servers`, `DELETE /api/mcp/servers/{name}`, enable/disable, and
  `GET /api/mcp/catalog` from mcp.py; delete corresponding dead stubs from api.ts
- Remove `gaia mcp add` and `gaia mcp remove` CLI subcommands (runtime commands kept)
- Pin `_resolve_keyring_refs` to service `gaia.connections`; raise `ConnectorsError`
  (not `RuntimeError`) on foreign service or missing entry, with an actionable message
- Wipe all agent grants (`revoke_all_grants_for`) in both `McpServerHandler.disconnect`
  and `OAuthPkceHandler.disconnect` — prevents silent grant inheritance on reconnect
- Harden file permissions: `chmod 0600` on `mcp_servers.json` and `grants.json` after
  every atomic write
- Reduce MCP catalog from 22 to 5 tested entries (mcp-github, mcp-filesystem, mcp-fetch,
  mcp-memory, mcp-git); delete 17 untested entries that had zero test/doc coverage
- Replace `rehype-raw` with hardened `SAFE_DISALLOWED_ELEMENTS` + `safeUrlTransform` in
  MessageBubble; new `src/utils/markdown.ts` centralises the safe rendering policy

Tests: 3 new test files (22 tests) — catalog ledger, dual-writer regression guard,
disconnect-clears-grants; keyring tests extended with foreign-service refusal case.
305 connectors tests pass (1 pre-existing failure in test_oauth_pkce unrelated to this PR).
@github-actions github-actions Bot added mcp MCP integration changes cli CLI changes tests Test changes labels May 8, 2026
@itomek itomek self-assigned this May 8, 2026
itomek added a commit that referenced this pull request May 8, 2026
Closes #1004. Part of #927. Builds on #976 (PR #998).

Users can now temporarily disable a configured MCP connector without
losing its credentials or per-agent grants. Toggle off the GitHub MCP
during a sensitive code review, toggle back on without re-pasting the
PAT — credentials and agent consent persist across the round-trip.

Persistence already worked: ``mcp_servers.json`` carries a per-entry
``disabled`` flag and ``MCPClientManager.load_from_config()`` already
skips disabled entries. This PR adds the API/UI/wiring to drive it.

Backend:
- ``McpServerHandler.set_enabled(connector_id, enabled)`` flips the
  ``disabled`` flag without touching keyring or grants. Reuses #976's
  atomic-write + chmod 0600 helpers; raises ``ConnectorsError`` on
  unknown / unconfigured ids
- ``McpServerHandler.set_reload_callback`` setter so the FastAPI
  lifespan can attach a callback after the singleton handler imports
- New endpoints ``POST /api/connectors/{id}/enable`` and ``/disable``
  (CSRF-guarded, 400 on non-MCP types, 404 on unconfigured)
- New SSE events ``connector.enabled`` / ``connector.disabled``
- ``_connector_summary`` exposes ``enabled: bool`` (defaults true for
  OAuth / not-yet-configured MCP so no spurious "Disabled" pills)
- ``reload_all_session_agents_mcp`` broadcasts to every cached
  ChatAgent's per-instance MCPClientManager — wired as the handler's
  reload_callback at lifespan startup. Toggling takes effect live; no
  GAIA restart needed

Frontend:
- ``ConnectorRow.enabled: boolean`` on the wire
- ``enableConnector`` / ``disableConnector`` API helpers
- SSE consumer recognises ``'enabled' | 'disabled'`` reasons
- Toggle row in MCPServerConfigureBody (only when configured), with
  helper text when disabled. Reuses existing ``.toggle-switch`` CSS
- New ConnectorTileMenu (⋯) on each tile header for one-click
  enable/disable/disconnect without drilling in. Self-positioning
  popup, click-outside + Escape to close
- New ``.connector-status.disabled`` pill in the tile header

Tests: 30 new (12 handler + 14 router + 4 reload-wiring). Full
connectors suite: 335 pass, 5 skipped, 1 pre-existing failure
(test_oauth_pkce, unrelated).
@itomek itomek marked this pull request as ready for review May 8, 2026 20:43
@itomek itomek requested a review from kovtcharov-amd as a code owner May 8, 2026 20:43
@itomek itomek enabled auto-merge May 8, 2026 20:45
@itomek itomek added this pull request to the merge queue May 8, 2026
Merged via the queue into main with commit b8e0cfa May 8, 2026
60 of 61 checks passed
@itomek itomek deleted the feat/976-mcp-single-writer-security branch May 8, 2026 21:18
itomek added a commit that referenced this pull request May 8, 2026
Closes #1004. Part of #927. Builds on #976 (PR #998).

Users can now temporarily disable a configured MCP connector without
losing its credentials or per-agent grants. Toggle off the GitHub MCP
during a sensitive code review, toggle back on without re-pasting the
PAT — credentials and agent consent persist across the round-trip.

Persistence already worked: ``mcp_servers.json`` carries a per-entry
``disabled`` flag and ``MCPClientManager.load_from_config()`` already
skips disabled entries. This PR adds the API/UI/wiring to drive it.

Backend:
- ``McpServerHandler.set_enabled(connector_id, enabled)`` flips the
  ``disabled`` flag without touching keyring or grants. Reuses #976's
  atomic-write + chmod 0600 helpers; raises ``ConnectorsError`` on
  unknown / unconfigured ids
- ``McpServerHandler.set_reload_callback`` setter so the FastAPI
  lifespan can attach a callback after the singleton handler imports
- New endpoints ``POST /api/connectors/{id}/enable`` and ``/disable``
  (CSRF-guarded, 400 on non-MCP types, 404 on unconfigured)
- New SSE events ``connector.enabled`` / ``connector.disabled``
- ``_connector_summary`` exposes ``enabled: bool`` (defaults true for
  OAuth / not-yet-configured MCP so no spurious "Disabled" pills)
- ``reload_all_session_agents_mcp`` broadcasts to every cached
  ChatAgent's per-instance MCPClientManager — wired as the handler's
  reload_callback at lifespan startup. Toggling takes effect live; no
  GAIA restart needed

Frontend:
- ``ConnectorRow.enabled: boolean`` on the wire
- ``enableConnector`` / ``disableConnector`` API helpers
- SSE consumer recognises ``'enabled' | 'disabled'`` reasons
- Toggle row in MCPServerConfigureBody (only when configured), with
  helper text when disabled. Reuses existing ``.toggle-switch`` CSS
- New ConnectorTileMenu (⋯) on each tile header for one-click
  enable/disable/disconnect without drilling in. Self-positioning
  popup, click-outside + Escape to close
- New ``.connector-status.disabled`` pill in the tile header

Tests: 30 new (12 handler + 14 router + 4 reload-wiring). Full
connectors suite: 335 pass, 5 skipped, 1 pre-existing failure
(test_oauth_pkce, unrelated).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli CLI changes mcp MCP integration changes tests Test changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(connectors): MCP single-writer + security hardening

2 participants