AGENTS.md

This file provides guidance to AI Agents when working with this repository.

Project Overview

MCP Gateway is an Envoy-based gateway for Model Context Protocol (MCP) servers. It ships as a single binary (mcp-broker-router) with three components:

  • MCP Router: Envoy external processor that routes MCP requests (gRPC on :50051)
  • MCP Broker: HTTP service that aggregates tools from multiple MCP servers (HTTP on :8080/mcp)
  • MCP Gateway Controller: Kubernetes controller that discovers MCP servers via MCPServerRegistration CRDs (optional, --controller flag)

Architecture

Client → Gateway (Envoy) → Router (ext_proc) → Broker → Upstream MCP Servers
                ↑                                 ↑
           Controller → ConfigMap ────────────────┘

  • Controller watches MCPServerRegistration CRDs, discovers backends via HTTPRoutes, writes ConfigMap
  • Broker reads ConfigMap, connects to upstream servers, federates tools with prefixes
  • Router parses MCP requests, adds auth headers, tells Envoy where to route
  • All MCP traffic flows through Envoy for consistent policies
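
The prefix-based federation described above can be sketched as follows. This is an illustrative Python sketch, not the broker's Go implementation; the function names and the shape of the `servers` mapping are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# hypothetical shape: server name -> (prefix, tool names exposed by that server)
Servers = Dict[str, Tuple[str, List[str]]]


def federate_tools(servers: Servers) -> Dict[str, Tuple[str, str]]:
    """Map each prefixed tool name to (server name, original tool name)."""
    federated: Dict[str, Tuple[str, str]] = {}
    for server, (prefix, tools) in servers.items():
        for tool in tools:
            # the prefix keeps same-named tools from different servers apart
            federated[prefix + tool] = (server, tool)
    return federated


def resolve_tool(
    federated: Dict[str, Tuple[str, str]], prefixed_name: str
) -> Optional[Tuple[str, str]]:
    """Return (server, unprefixed tool name) for a call, or None if unknown."""
    return federated.get(prefixed_name)


servers = {
    "weather-service": ("weather_", ["time", "get_weather"]),
    "custom-path-server": ("custom_", ["time", "echo_custom"]),
}
federated = federate_tools(servers)
```

Two servers can both expose a `time` tool here because clients see `weather_time` and `custom_time`; resolving a prefixed name yields the upstream server and the stripped tool name to forward.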

Important: We use Istio ONLY as a Gateway API provider, NOT as a service mesh:

  • No sidecars on any workload pods
  • No ambient mode (no ztunnels or waypoint proxies)
  • Just istiod programming the Gateway's Envoy proxy
  • ServiceEntry/DestinationRule only used for external service routing

Key Files

Core Components

  • cmd/mcp-broker-router/main.go: Binary entry point
  • internal/broker/broker.go: MCP broker implementation
  • internal/mcp-router/server.go: Envoy external processor
  • internal/controller/mcpserverregistration_controller.go: MCPServerRegistration reconciliation

Configuration

  • config/crd/mcp.kuadrant.io_*.yaml: CRD definitions (generated by controller-gen)
  • config/mcp-system/: Kubernetes deployment manifests
  • config/test-servers/: Test MCP server deployments
  • docs/guides/external-mcp-server.md: Guide for connecting to external MCP servers
  • docs/examples/github-mcp-external.yaml: Example manifest for GitHub MCP integration

Development

Quick Start

make local-env-setup     # Create Kind cluster with everything
make reload              # Build, load to Kind, and restart controller and broker

Testing

make lint               # Run all lint and style checks
make test-unit          # Unit tests
make test-e2e-ci        # E2E tests for CI environment
make test-e2e           # E2E tests with local Kind cluster

# Running specific E2E tests locally
cd tests/e2e && go test -v -tags=e2e -run TestE2E -ginkgo.focus="test description" -timeout 5m

# Alternative with ginkgo CLI
ginkgo run -v --tags=e2e --focus="test description" tests/e2e/

E2E Test Reliability

  • Tests use broker /status endpoint for reliable server registration checks (not log parsing)
  • Port-forwards target deployments directly: deployment/mcp-gateway
  • Tests clean up existing resources before creating to avoid conflicts
  • Structured JSON responses provide better debugging when tests fail
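
A status-polling check of this kind might look like the sketch below. It is illustrative Python, not the project's Ginkgo test code. The shape of the /status response is not documented here, so `fetch_status` is a hypothetical callable that a real test would implement with an HTTP GET against the broker and reduce to a name-to-ready mapping. The 180s default timeout is an assumption chosen to exceed the 60-120s sync delay noted under Known Issues.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_servers(
    fetch_status: Callable[[], Dict[str, bool]],  # hypothetical: server name -> ready
    expected: Iterable[str],
    timeout_s: float = 180.0,   # assumption: longer than the volume sync delay
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until every expected server reports ready, or the timeout elapses."""
    wanted = set(expected)
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if all(status.get(name, False) for name in wanted):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the helper testable without real delays.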

CI/CD Optimizations

All GitHub workflows have concurrency control to cancel stale runs:

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

This prevents resource waste during rapid development/force-pushing.

Important Ports

  • 8080: Broker HTTP (/mcp endpoint)
  • 50051: Router gRPC (ext_proc)
  • 8081: Controller health probes
  • 8001: Gateway port mapping
  • 8002: Keycloak port mapping

Operations & Rules

Git Commit Sign-off

CRITICAL: All commits MUST be signed off.

  • Use git commit --signoff (or -s) for every commit.
  • You can configure git to do this automatically: git config --local commit.signoff true.

Known Issues & Solutions

Flaky E2E Tests

Problem: Tests time out waiting for the broker to register servers because of:

  • ConfigMap volume mount sync delays (60-120s in Kubernetes)
  • Log-based checks becoming unreliable

Solution: Use broker /status API endpoint instead of log parsing for all server state checks.

Current Status

Working ✅

  • MCP broker federates tools from multiple servers with prefixes
  • Controller discovers servers via HTTPRoutes and generates ConfigMap
  • Authentication via Kubernetes secrets and env vars
  • Dynamic config updates via HTTP push API
  • Tool call forwarding to upstream servers
  • E2E tests validate full flow
  • prefix field immutability enforced via CEL validation
  • External service detection via ExternalName Services
  • Tool discovery from external MCP servers (GitHub MCP: 94 tools discovered)
  • Controller correctly generates HTTPS URLs for external services
  • Router (ext_proc) properly sets routing headers:
    • :authority header changed to external hostname
    • :path header set to custom path (e.g., /v1/special/mcp) when specified
    • Authorization header added with Bearer token only when no existing Authorization header is present
    • Tool name prefix stripping working correctly
  • Custom MCP paths:
    • MCPServerRegistration CRD supports path field for non-standard endpoints
    • Controller includes custom paths in ConfigMap
    • Broker connects to custom path endpoints successfully
    • Router sets :path header for custom paths

Not Implemented 📝

  • Notification brokering (tools/list_changed events)
  • EnvoyFilter creation by controller
  • Resource/prompt federation (only tools currently)

MCPServerRegistration Resource

apiVersion: mcp.kuadrant.io/v1alpha1
kind: MCPServerRegistration
metadata:
  name: weather-service
  namespace: mcp-test
spec:
  prefix: weather_          # Prefix for federated tools (immutable once set)
  path: /v1/custom/mcp      # Optional custom path (default: /mcp)
  targetRef:                # HTTPRoute reference
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: weather-route
  credentialRef:            # Optional auth
    name: weather-secret
    key: token

Custom Paths

The MCPServerRegistration CRD has an optional path field (defaults to /mcp):

  • Controller includes full URL with custom path in ConfigMap
  • Broker successfully connects to custom endpoints and discovers tools
  • Router sets :path header when path != /mcp
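
The path handling above can be sketched in a few lines. This is an illustrative Python sketch with hypothetical function names, not the controller or router code. The URL layout (scheme, host, port, path) follows the log line shown in the external server testing notes.

```python
from typing import Dict

DEFAULT_MCP_PATH = "/mcp"


def upstream_url(scheme: str, host: str, port: int, path: str = DEFAULT_MCP_PATH) -> str:
    """Full upstream URL as the controller would record it, including the custom path."""
    return f"{scheme}://{host}:{port}{path}"


def path_header_override(path: str) -> Dict[str, str]:
    """Header mutation for a custom path; empty when the default /mcp applies."""
    if path and path != DEFAULT_MCP_PATH:
        return {":path": path}
    return {}
```

For the default path the router has nothing to rewrite, so the override is empty; only a registration such as `path: /v1/special/mcp` produces a `:path` mutation.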

HTTPRoute Requirements:

  • HTTPRoute must have a hostname that matches a Gateway listener
  • For internal services, use *.mcp.local pattern (matches wildcard listener)
  • HTTPRoute should include path match for the custom path

Example:

apiVersion: mcp.kuadrant.io/v1alpha1
kind: MCPServerRegistration
metadata:
  name: custom-path-server
  namespace: mcp-test
spec:
  path: /v1/special/mcp    # Custom endpoint
  prefix: custom_
  targetRef:
    kind: HTTPRoute
    name: custom-path-route
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: custom-path-route
  namespace: mcp-test
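spec:
  parentRefs:              # attaches the route to the Gateway (name/namespace assumed from the external example)
  - name: mcp-gateway
    namespace: gateway-system
  hostnames:
  - custom.mcp.local       # Must match Gateway listener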
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1/special/mcp
    backendRefs:
    - name: custom-mcp-service
      port: 8080

Custom paths are useful for servers that expose MCP on non-standard endpoints.

External Services

The controller automatically detects external services. When the HTTPRoute backend name looks like an external hostname (e.g., api.githubcopilot.com), the controller uses it directly instead of constructing internal Kubernetes DNS names. Detection criteria:

  • Contains dots (.)
  • Doesn't end with .local, .svc, or .cluster.local
  • Has at least 2 parts when split by dots
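
The three criteria translate directly into a small predicate. This is an illustrative Python sketch of the rules as listed, with a hypothetical function name; the controller's Go code may apply them differently.

```python
# suffixes that mark a name as cluster-internal, per the criteria above
INTERNAL_SUFFIXES = (".local", ".svc", ".cluster.local")


def is_external_host(name: str) -> bool:
    """Apply the detection criteria literally to an HTTPRoute backend name."""
    if "." not in name:
        return False
    if name.endswith(INTERNAL_SUFFIXES):
        return False
    return len(name.split(".")) >= 2
```

A plain Service name such as `custom-mcp-service` has no dots and is treated as internal, while `api.githubcopilot.com` passes all three checks.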

For external services, create appropriate Istio ServiceEntry and HTTPRoute resources. See docs/guides/external-mcp-server.md for detailed instructions.

Authentication

MCP servers can require authentication:

  1. MCPServerRegistration spec includes credentialRef pointing to a Kubernetes secret
    • Important: Secret must have label mcp.kuadrant.io/secret=true
    • Without this label, the MCPServerRegistration will fail validation
  2. Controller aggregates credentials into mcp-aggregated-credentials secret
  3. Broker receives via environment variables: KAGENTI_{NAME}_CRED
  4. Router adds Authorization header to Envoy routing instructions

Example credential secret:

apiVersion: v1
kind: Secret
metadata:
  name: weather-secret
  namespace: mcp-test
  labels:
    mcp.kuadrant.io/secret: "true"  # required label
type: Opaque
stringData:
  token: "Bearer your-api-token"
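
The label requirement can be expressed as a simple check over the Secret manifest. This is an illustrative Python sketch operating on a plain dict; the function name is hypothetical, and the additional check that the referenced key exists is an assumption rather than documented controller behavior.

```python
REQUIRED_LABEL = "mcp.kuadrant.io/secret"


def secret_is_usable(secret: dict, key: str) -> bool:
    """True when the Secret carries the required label and contains `key`."""
    labels = (secret.get("metadata") or {}).get("labels") or {}
    if labels.get(REQUIRED_LABEL) != "true":
        return False
    # assumption: the key named in credentialRef must be present in the Secret
    data = secret.get("stringData") or secret.get("data") or {}
    return key in data


weather_secret = {
    "metadata": {
        "name": "weather-secret",
        "namespace": "mcp-test",
        "labels": {"mcp.kuadrant.io/secret": "true"},
    },
    "stringData": {"token": "Bearer your-api-token"},
}
```

Note that the label value is the string "true", not a YAML boolean, which is why the manifest quotes it.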

Credential Value Change Detection

The system handles credential updates automatically:

  1. Controller uses APIReader to bypass cache when reading credential secrets
  2. Broker detects credential value changes and re-registers servers automatically
  3. Exponential backoff retry for servers with credentials (5s → 10s → 20s → 40s → 60s)
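
The retry schedule in step 3 is a doubling delay capped at 60s. A minimal Python sketch follows; the function names are hypothetical, and continuing at 60s after the cap is reached is an assumption.

```python
from itertools import islice
from typing import Iterator, List


def backoff_delays(initial_s: int = 5, cap_s: int = 60) -> Iterator[int]:
    """Yield retry delays in seconds: double each time, never exceeding the cap."""
    delay = initial_s
    while True:
        yield min(delay, cap_s)
        delay = min(delay * 2, cap_s)


def first_delays(n: int) -> List[int]:
    """Convenience helper: the first n delays of the default schedule."""
    return list(islice(backoff_delays(), n))
```

With the defaults, the first five delays are 5, 10, 20, 40 and 60 seconds: the doubled value of 80 is clamped to the cap.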

Timing:

  • Controller → Aggregated Secret: Fast (~5 seconds)
  • Aggregated Secret → Volume Mount: 60-120 seconds (Kubernetes kubelet sync limitation)
  • Total sync time: ~60-120 seconds

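This is a Kubernetes limitation: the kubelet refreshes mounted Secret and ConfigMap volumes on its periodic sync (every 60s by default), and that interval is a node-level kubelet setting that cannot be lowered per workload.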

OAuth + API Key Conflict (Issue #201)

Problem: When using AuthPolicy (e.g., Kuadrant/Authorino), filter ordering matters: ext_proc runs first and AuthPolicy runs second. If ext_proc replaces the OAuth token with an API key, AuthPolicy then fails to validate the request.

Solution: The router sets the Authorization header only when the request does not already carry one. A client's OAuth token therefore reaches AuthPolicy unchanged, and backend credentials are added only to requests that arrive without auth.
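
The rule can be sketched as a pure function over request headers. This is illustrative Python with a hypothetical function name, not the router's Go code. Header names are compared case-insensitively, since HTTP header names are case-insensitive.

```python
from typing import Dict, Optional


def with_backend_auth(headers: Dict[str, str], credential: Optional[str]) -> Dict[str, str]:
    """Return a copy of headers, adding the backend credential only if no auth exists."""
    result = dict(headers)
    has_auth = any(name.lower() == "authorization" for name in result)
    if credential and not has_auth:
        result["authorization"] = credential
    return result
```

A request that already carries an OAuth token passes through untouched, so AuthPolicy sees the original token; a request without auth gets the backend credential.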

Test Servers

Six test servers in config/test-servers/:

  • Server1: Go SDK (tools: greet, time, slow, headers)
  • Server2: Go SDK (tools: hello_world, time, headers, auth1234, slow)
  • Server3: Python FastMCP (tools: time, add, dozen, pi, get_weather, slow)
  • API Key Server: Validates Bearer token authentication (tool: hello_world)
  • Broken Server: Intentionally broken server for testing error handling
  • Custom Path Server: Go SDK at /v1/special/mcp (tools: echo_custom, path_info, timestamp)

Code Style

  • Minimal, terse comments (lowercase, only when necessary)
  • No emojis or AI-style formatting
  • Files must end with newline
  • Regularly run make lint to check for lint errors.

MCP Inspector

The MCP Inspector web UI supports URL parameters for configuration:

  • transport (required): Transport method for MCP connection
    • stdio: For stdio-based servers (requires serverPath)
    • sse: For SSE/HTTP servers (requires serverUrl)
    • streamable-http: For streamable HTTP servers (requires serverUrl)
  • serverUrl: URL of the MCP server (for SSE/streamable-http transports)
  • serverPath: Path to stdio.js file (for stdio transport)
  • MCP_PROXY_AUTH_TOKEN: Authentication token (auto-generated by mcp-inspector)

Environment variables:

  • MCP_AUTO_OPEN_ENABLED=false: Prevents automatic browser opening (default: true)
  • CLIENT_PORT: Custom port for the inspector UI (default: 6274)
  • SERVER_PORT: Custom port for the proxy server (default: 6277)
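
A URL using these parameters could be assembled as in the sketch below. This is illustrative Python, not what the make targets run; the function name, the `localhost` host, and the example gateway URL are assumptions. The default port is the CLIENT_PORT default listed above.

```python
from typing import Optional
from urllib.parse import urlencode


def inspector_url(
    transport: str,
    server_url: Optional[str] = None,
    server_path: Optional[str] = None,
    auth_token: Optional[str] = None,
    client_port: int = 6274,      # CLIENT_PORT default
    host: str = "localhost",      # assumption: inspector runs locally
) -> str:
    """Build an inspector URL, enforcing the per-transport required parameter."""
    if transport in ("sse", "streamable-http"):
        if not server_url:
            raise ValueError(f"{transport} requires serverUrl")
        params = {"transport": transport, "serverUrl": server_url}
    elif transport == "stdio":
        if not server_path:
            raise ValueError("stdio requires serverPath")
        params = {"transport": transport, "serverPath": server_path}
    else:
        raise ValueError(f"unknown transport: {transport}")
    if auth_token:
        params["MCP_PROXY_AUTH_TOKEN"] = auth_token
    return f"http://{host}:{client_port}/?{urlencode(params)}"
```

For example, `inspector_url("streamable-http", server_url="http://localhost:8001/mcp")` points the inspector at a gateway assumed to be reachable through the 8001 port mapping.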

The following make targets set these URL parameters automatically:

  • make inspect-gateway - Opens inspector for the gateway
  • make inspect-server1 - Opens inspector for test server 1
  • make inspect-server2 - Opens inspector for test server 2
  • make inspect-server3 - Opens inspector for test server 3
  • make inspect-api-key-server - Opens inspector for API key test server (requires auth)
  • make inspect-custom-path-server - Opens inspector for custom path test server

External MCP Server Support (Issue #166)

External MCP servers (e.g., GitHub Copilot MCP) are supported, following the Kuadrant pattern without sidecars.

Key Fixes That Made It Work

  1. Gateway Listener Configuration (Critical)
    • Added external hostname as a listener in the Gateway spec
    • Example: Added api.githubcopilot.com as a listener on port 8080
    • This allows the gateway to accept traffic for external hostnames
  2. HTTPRoute with Matching Hostname
    • Created HTTPRoute with hostnames: ["api.githubcopilot.com"]
    • Routes to ExternalName Service pointing to the external host
    • Must match the Gateway listener hostname exactly
  3. Correct Token Format
    • GitHub MCP requires a PAT (Personal Access Token) starting with ghp_
    • Token must be prefixed with "Bearer " in the secret
    • Format: Bearer ghp_YOUR_TOKEN_HERE
    • App tokens (ghu_ prefix) don't work with GitHub Copilot MCP
  4. Credential Environment Variable
    • Controller generates env var name: KAGENTI_{MCP_NAME}_CRED
    • Router reads this env var to add Authorization header
    • Broker uses same env var for tool discovery
  5. ServiceEntry and DestinationRule
    • ServiceEntry tells Istio about the external service
    • DestinationRule configures TLS mode: SIMPLE
    • Both should be in same namespace as the Service
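
The token format rules from item 3 are easy to get wrong, so a pre-flight check can help before creating the secret. This is an illustrative Python sketch with a hypothetical function name; it only encodes the rules stated above and does not contact GitHub.

```python
from typing import List

BEARER_PREFIX = "Bearer "


def check_github_credential(value: str) -> List[str]:
    """Return a list of problems with a secret value; empty means it looks right."""
    problems: List[str] = []
    if value.startswith(BEARER_PREFIX):
        token = value[len(BEARER_PREFIX):]
    else:
        problems.append('missing "Bearer " prefix')
        token = value
    if token.startswith("ghu_"):
        problems.append("app token (ghu_) does not work; use a PAT (ghp_)")
    elif not token.startswith("ghp_"):
        problems.append("token does not start with ghp_")
    return problems
```

A value of the form `Bearer ghp_YOUR_TOKEN_HERE` yields no problems; a bare `ghp_...` token or a `ghu_` app token is reported.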

Architecture Flow

Client → Gateway (w/ external listener) → Router (adds auth header) → External MCP Server
                                            ↑
                                     Config from Controller

Working Example Configuration

# 1. Gateway with external listener (in gateway-system namespace)
listeners:
- name: github-external
  hostname: api.githubcopilot.com
  port: 8080
  protocol: HTTP

# 2. HTTPRoute (in mcp-test namespace)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: github-mcp-external
  namespace: mcp-test
spec:
  parentRefs:
  - name: mcp-gateway
    namespace: gateway-system
  hostnames:
  - api.githubcopilot.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /mcp
    backendRefs:
    - name: api-githubcopilot-com
      port: 443

# 3. ExternalName Service
apiVersion: v1
kind: Service
metadata:
  name: api-githubcopilot-com
  namespace: mcp-test
spec:
  type: ExternalName
  externalName: api.githubcopilot.com
  ports:
  - name: https
    port: 443
    protocol: TCP

Common Issues and Solutions

  • 404 Route Not Found: Gateway needs a listener for the external hostname
  • 401 Unauthorized: Wrong token format (use PAT with ghp_ prefix)
  • Session creation fails: Router's session cache needs auth; fixed by router adding auth header
  • No tools discovered: Token invalid or missing "Bearer " prefix

Testing

# Verify tools are discovered
kubectl logs -n mcp-system deploy/mcp-gateway | grep "Discovered.*tools"
# Should show: "Discovered tools mcpURL=https://api.githubcopilot.com:443/mcp #tools=94"

# Check auth header is added
kubectl logs -n mcp-system deploy/mcp-gateway | grep "Adding Authorization header"
# Should show: "Adding Authorization header for routing server=mcp-test/github-mcp"