feat(agent): add Playwright browser MCP integration #1255

Open
srtab wants to merge 9 commits into main from worktree-feat+browser-mcp

@srtab srtab commented May 25, 2026

Owner

Summary

Adds the Microsoft Playwright MCP server as a first-class built-in MCP integration. When the mcp compose profile is enabled, the DAIV agent gains a full headless-Chromium browser surface (~23 playwright_browser_* tools — navigate, screenshot, console messages, network requests, click, type, evaluate, etc.) via streamable HTTP. Each agent session gets a fresh isolated browser context.

  • feat(agent): new PlaywrightMCPServer(MCPServer) registered via the existing @mcp_server decorator, plus a PLAYWRIGHT_URL setting (default http://mcp_playwright:8931/mcp).
  • feat(compose): new mcp_playwright service running mcr.microsoft.com/playwright/mcp:latest under the mcp / full profiles, headless Chromium with --isolated, healthchecked via node -e HTTP probe (the image ships neither wget nor curl).
  • docs: documented the new env var and added a Playwright section to the MCP tools customization guide.

Deliberate choice: no tool_filter, so the full Playwright surface is exposed (including browser_run_code_unsafe). The trust boundary is documented in the class docstring and asserted by a dedicated test so any future narrowing shows up in the diff.

Design doc and implementation plan live at docs/superpowers/specs/2026-05-25-browser-mcp-design.md and docs/superpowers/plans/2026-05-25-browser-mcp.md respectively (gitignored).

Test plan

  • 5 new unit tests for PlaywrightMCPServer (name, enabled/disabled, no tool_filter, connection URL) — all pass.
  • Full MCP suite (72 tests) passes with no regressions.
  • `make lint-fix` / `make lint-typing` clean on changed files.
  • `docker compose --profile mcp up -d mcp_playwright` starts cleanly; healthcheck reaches `healthy` within ~5s.
  • MCP `initialize` + `tools/list` round-trip against the running container returns 23 `browser_*` tools.
  • Manual: drive a real `playwright_browser_navigate` + `playwright_browser_take_screenshot` from a DAIV agent run against `https://app:8000/` (deferred; requires the full app stack + LLM key).
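The `initialize` + `tools/list` round-trip in the plan above can be reproduced by hand. The sketch below only builds the JSON-RPC messages and HTTP headers that a streamable HTTP client would POST; it does not send anything. The helper names (`rpc`, `build_post`), the `clientInfo` values, and the protocol version string are assumptions made for illustration, and the version the container actually negotiates may differ.

```python
import json
from typing import Any, Optional


def rpc(method: str, request_id: Optional[int] = None,
        params: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 message; omitting request_id makes it a notification."""
    message: dict[str, Any] = {"jsonrpc": "2.0", "method": method}
    if request_id is not None:
        message["id"] = request_id
    if params is not None:
        message["params"] = params
    return message


def build_post(message: dict[str, Any],
               session_id: Optional[str] = None) -> tuple[dict[str, str], bytes]:
    """Return (headers, body) for a POST to the MCP endpoint."""
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP servers may answer with plain JSON or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        # Echo back the id the server issued in its initialize response.
        headers["Mcp-Session-Id"] = session_id
    return headers, json.dumps(message).encode("utf-8")


INITIALIZE = rpc("initialize", 1, {
    "protocolVersion": "2025-03-26",  # assumption: adjust to the server's version
    "capabilities": {},
    "clientInfo": {"name": "manual-probe", "version": "0.0.0"},  # illustrative
})
INITIALIZED = rpc("notifications/initialized")  # notification, so no id
TOOLS_LIST = rpc("tools/list", 2)
```

The messages are sent in that order: `INITIALIZE` without a session header, then `INITIALIZED` and `TOOLS_LIST` carrying the `Mcp-Session-Id` value returned by the first response.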

@srtab srtab self-assigned this May 25, 2026
srtab added 5 commits May 25, 2026 11:46
Replace MCPToolkit.get_tools() with an aopen() async context manager so
callers own session lifetime, and persist the server-issued Mcp-Session-Id
across turns. Stateful servers (Playwright above all) tie browser context
to the session id, so without resume each chat turn opened a fresh
browser and snapshot/navigate landed in different contexts.

- aopen() captures the session id per server via streamablehttp_client's
  get_session_id callback (discarded by MultiServerMCPClient) and reuses
  it on subsequent opens with initialize=False to satisfy Playwright's
  "already initialized" check.
- MCPSessionStateMiddleware writes the ids dict into graph state so the
  checkpointer carries it forward; chat streaming reads it back on the
  next turn and passes it to aopen() to resume.
- Stale-id recovery falls back to a fresh session and overwrites the
  dead id so persistence doesn't preserve it.
- Job/webhook callers pass session_ids=None (one-shot semantics
  unchanged); only chat persists.
- mcp_playwright --allowed-hosts updated to include the docker service
  name so the daiv app can reach it past the DNS-rebinding guard.
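The resume and stale-id behaviour described in this commit can be sketched in isolation. Everything below is a simplified illustration, not the DAIV implementation: `FakeServer`, `StaleSession`, and this version of `aopen()` are hypothetical, the fake keeps sessions in memory instead of talking to `streamablehttp_client`, and the context manager yields only the session ids (the real one presumably also yields tools).

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Optional


class StaleSession(Exception):
    """Raised by the fake server when a session id is no longer known."""


class FakeServer:
    """In-memory stand-in for a stateful MCP server such as Playwright."""

    def __init__(self) -> None:
        self.live: set[str] = set()
        self._counter = 0

    def connect(self, session_id: Optional[str] = None) -> str:
        if session_id is None:
            # Fresh session: the server issues a new id.
            self._counter += 1
            session_id = f"s{self._counter}"
            self.live.add(session_id)
        elif session_id not in self.live:
            raise StaleSession(session_id)
        return session_id


@asynccontextmanager
async def aopen(servers: dict[str, FakeServer],
                session_ids: Optional[dict[str, str]] = None):
    """Open every server, resuming a known session id where possible."""
    resolved: dict[str, str] = {}
    for name, server in servers.items():
        known = (session_ids or {}).get(name)
        try:
            resolved[name] = server.connect(known)
        except StaleSession:
            # Fall back to a fresh session and overwrite the dead id.
            resolved[name] = server.connect(None)
    yield resolved


async def demo() -> tuple[dict[str, str], dict[str, str], dict[str, str]]:
    server = FakeServer()
    servers = {"playwright": server}
    async with aopen(servers) as ids:            # turn 1: fresh session
        turn1 = dict(ids)
    async with aopen(servers, turn1) as ids:     # turn 2: resumed
        turn2 = dict(ids)
    server.live.clear()                          # simulate a server restart
    async with aopen(servers, turn1) as ids:     # turn 3: stale id replaced
        turn3 = dict(ids)
    return turn1, turn2, turn3
```

In this sketch the caller is responsible for storing the yielded dict between turns, which mirrors the role the commit assigns to `MCPSessionStateMiddleware` and the checkpointer; passing `session_ids=None` gives the one-shot behaviour used by job and webhook callers.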