
Proposal: Envoy Support for Model Context Protocol (MCP) #39174

Open
@botengyao

Objective

We’d like to initiate an issue and discussion around supporting the Model Context Protocol (MCP) in Envoy as a gateway.

What is MCP?

MCP is an open protocol, usable in both stateless and stateful modes, that allows GenAI applications to retrieve and exchange context (e.g. source code, files, documents) with LLMs using JSON-RPC semantics. A significant MCP update last month introduced OAuth 2.1-based authorization, the Streamable HTTP transport, JSON-RPC batching, and tool annotations. This is a major update that makes it practical to run MCP servers remotely.

Details

MCP can use transports such as stdio or Streamable HTTP. With the Streamable HTTP update, bidirectional JSON-RPC messages can be exchanged over HTTP POST and GET, and SSE is folded into the Streamable HTTP transport as an optional way for the server to stream messages. The following diagrams show the HTTP transport and capability negotiation in MCP:


[Figure: Transport]

[Figure: Capability Negotiation]
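
To make the transport concrete, here is a minimal sketch of what a client's first Streamable HTTP request could look like from the gateway's point of view. It assumes the `2025-03-26` revision of the MCP specification; the `/mcp` endpoint path, the client name, and the helper names `build_initialize_request` and `with_session` are hypothetical and only serve the illustration.

```python
import json

# Assumed values for illustration; the endpoint path is hypothetical.
MCP_ENDPOINT = "/mcp"
PROTOCOL_VERSION = "2025-03-26"


def build_initialize_request(request_id: int) -> dict:
    """Return the HTTP pieces of an MCP `initialize` POST (not sent anywhere)."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},  # client capabilities offered for negotiation
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return {
        "method": "POST",
        "path": MCP_ENDPOINT,
        "headers": {
            "Content-Type": "application/json",
            # The client accepts either a single JSON response or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        "body": json.dumps(body),
    }


def with_session(request: dict, session_id: str) -> dict:
    """Copy a request and attach the session ID returned at initialization."""
    headers = {**request["headers"], "Mcp-Session-Id": session_id}
    return {**request, "headers": headers}
```

A gateway that sees the `Mcp-Session-Id` header on later requests can use it as a stable key for the session, which is what makes the session-aware functions proposed below possible.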

This proposal explores how Envoy can serve as a gateway between MCP clients and servers, helping route, process, and secure MCP messages in a scalable and extensible way.

Design Proposal

With MCP gaining traction as a standard way for AI tools to interact with contextual data, we believe Envoy can play an important role in enabling infrastructure-level routing, load balancing, and observability for these interactions.

This issue proposes a set of functions that enable Envoy to act as a gateway between MCP clients and servers. They cover the following use cases, listed in order of complexity and intended implementation:

Proposed Functionality

  1. MCP session-aware load balancing based on the MCP endpoint (HTTP request URI).

  2. Parsing of the MCP protocol to make Envoy aware of MCP request properties such as method/id, call arguments, or return values.

  3. Authentication of MCP requests using OAuth 2, JWT, or API keys, based on MCP request properties.

  4. Authorization of MCP requests and messages using RBAC (for example, authorizing specific MCP methods based on the caller identity). This authorization will apply to both client-initiated and server-initiated requests.

  5. Transcoding JSON-RPC messages to existing API surfaces, for example gRPC and OpenAPI.

  6. Rate limiting of MCP requests.

  7. Customizable business logic for MCP messages, similar to HTTP filters, including remote callouts for MCP messages.

  8. Load balancing and fan-out of individual MCP messages (e.g. based on method) from a single HTTP stream.

  9. Gateway-initialized SSE stream support with session resumption and JSON-RPC batch support.
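
As a rough illustration of how items 1, 2, and 4 fit together, the sketch below parses a request body into JSON-RPC messages (including batches), picks an upstream with a stable hash of an affinity key, and applies a method-level RBAC check. This is not Envoy code or a proposed API: the upstream list, the role names, the policy table, and all function names are hypothetical. The affinity key could be the request URI mentioned in item 1 or the `Mcp-Session-Id` header; that choice is left open here.

```python
import hashlib
import json
from typing import Optional

# Hypothetical upstream hosts and RBAC policy, for illustration only.
UPSTREAMS = ["mcp-a:8080", "mcp-b:8080", "mcp-c:8080"]
ALLOWED_METHODS = {
    "reader": {"initialize", "tools/list", "resources/read"},
    "operator": {"initialize", "tools/list", "tools/call", "resources/read"},
}


def parse_messages(body: str) -> list[dict]:
    """Parse an HTTP body into JSON-RPC messages; a batch is a JSON array."""
    payload = json.loads(body)
    messages = payload if isinstance(payload, list) else [payload]
    for msg in messages:
        if not isinstance(msg, dict) or msg.get("jsonrpc") != "2.0":
            raise ValueError("not a JSON-RPC 2.0 message")
    return messages


def pick_upstream(affinity_key: Optional[str]) -> str:
    """Map the same key to the same upstream via a stable hash."""
    if affinity_key is None:
        # Simplification: no key yet (e.g. before initialization), use the first host.
        return UPSTREAMS[0]
    digest = hashlib.sha256(affinity_key.encode()).digest()
    return UPSTREAMS[int.from_bytes(digest[:8], "big") % len(UPSTREAMS)]


def authorize(role: str, messages: list[dict]) -> bool:
    """Allow the body only if every request or notification method is permitted."""
    allowed = ALLOWED_METHODS.get(role, set())
    # Responses carry no "method" field, so only requests/notifications are checked.
    return all(m["method"] in allowed for m in messages if "method" in m)
```

For example, a batch containing `tools/list` and `tools/call` would pass for the hypothetical `operator` role and be rejected for `reader`. Plain modulo hashing reshuffles sessions when the host set changes; a real implementation would more likely rely on a consistent-hashing load balancer.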

Note

MCP is closely related to the A2A protocol, which has been proposed for agent-to-agent communication. Both protocols use JSON-RPC, streaming semantics, and stateful sessions. While this proposal covers MCP, it does not preclude extending the same functions to A2A. Some of the functions are agnostic of the underlying protocol; business logic specific to A2A can be implemented in its own extension that shares common pieces, such as the JSON-RPC parser and framing, with MCP.
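
To show what such a shared layer might contain, here is a minimal, protocol-agnostic sketch of SSE framing plus JSON-RPC message classification. It knows nothing about MCP or A2A methods; the function names are hypothetical, and the SSE handling is deliberately reduced to `data:` fields.

```python
import json
import re


def parse_sse(stream_text: str) -> list[str]:
    """Split an SSE stream into event payloads by joining each event's data lines."""
    events: list[str] = []
    data_lines: list[str] = []
    for line in re.split(r"\r\n|\n|\r", stream_text):
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append("\n".join(data_lines))
                data_lines = []
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and comments are ignored in this sketch.
    return events  # an unterminated trailing event is dropped


def classify(message: dict) -> str:
    """Classify a JSON-RPC 2.0 message by its shape alone."""
    if "method" in message:
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    raise ValueError("not a JSON-RPC message")


def decode_stream(stream_text: str) -> list[tuple[str, dict]]:
    """Decode every SSE event as one JSON-RPC message and tag it with its kind."""
    return [(classify(m), m) for m in map(json.loads, parse_sse(stream_text))]
```

Protocol-specific extensions would then only need to interpret the `method` and `params` of the classified messages.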

Acknowledgments

This proposal was framed collaboratively with @htuch, @yanavlasov, and @botengyao. This issue is intended to surface the proposal for public discussion, gather feedback, and coordinate OSS collaboration.

We welcome thoughts, feedback, and ideas from the community as we continue to iterate on this direction.
