| 1 | +# PRD: MCP OAuth Session Unification |
| 2 | + |
| 3 | +## Problem Statement |
| 4 | + |
| 5 | +**Supersedes PRD-044 MCP auth approach.** PRD-044 proposed tenant-scoped Dex redirects |
| 6 | +(Option A). This PRD bypasses Dex entirely for MCP - a cleaner solution. PRD-044's BFF SSO |
| 7 | +tenant context fix and `HandlerOptionalTenant` middleware for Dex endpoints remain needed |
| 8 | +independently and are out of scope here. |
| 9 | + |
| 10 | +The MCP (Model Context Protocol) OAuth flow bypasses the Meridian UI entirely and redirects |
| 11 | +users to the embedded Dex OIDC login page directly. This creates three problems: |
| 12 | + |
| 13 | +1. **Double login**: Users must enter credentials twice - once for the UI, once for MCP - |
| 14 | + even though both systems share the same JWT signing key and identity backend. |
| 15 | +2. **No session reuse**: Dex has no session mechanism (no cookies, no server-side sessions). |
| 16 | + There is no "already logged in" state to leverage. |
| 17 | +3. **Leaky abstraction**: Dex is an internal identity backend. Users should never see the |
| 18 | + Dex login page. The Meridian UI should be the single authentication surface. |
| 19 | + |
| 20 | +### Current Architecture |
| 21 | + |
| 22 | +Three auth flows exist: |
| 23 | + |
| 24 | +| Flow | Path | User Experience | |
| 25 | +|------|------|-----------------| |
| 26 | +| BFF Password | `POST /api/auth/login` | Meridian UI login form, no Dex involvement | |
| 27 | +| BFF SSO | `GET /api/auth/sso/{connector}` -> Dex -> `GET /api/auth/callback` | Dex is invisible - BFF controls redirects | |
| 28 | +| MCP OAuth | `GET /oauth/authorize` -> Dex login page directly | User sees raw Dex UI, must re-authenticate | |
| 29 | + |
| 30 | +The BFF SSO flow is the correct pattern - Dex stays behind the BFF and the user never |
| 31 | +interacts with it. MCP should follow the same pattern. |
| 32 | + |
| 33 | +### Shared Infrastructure (Already Exists) |
| 34 | + |
| 35 | +- **Same JWT signing key**: BFF and MCP use the same RSA key (`JWT_SIGNING_KEY`). Tokens |
| 36 | + are already interchangeable. |
| 37 | +- **Same identity backend**: Both validate against the same identity database via the |
| 38 | + embedded Dex connector. |
| 39 | +- **Same Dex client**: Both use `meridian-service` client ID. |
| 40 | +- **Same tenant resolution**: Both extract tenant from subdomain. |
| 41 | + |
| 42 | +## Proposed Solution |
| 43 | + |
| 44 | +Replace MCP's direct Dex redirect with a redirect to the Meridian UI. The SPA consent page |
| 45 | +checks the user's existing session (JWT in sessionStorage). If logged in, it shows a consent |
| 46 | +screen. If not, it shows the normal login flow first, then consent. After implementation, |
| 47 | +remove the old Dex-direct flow entirely - no feature flags, no fallback. |
| 48 | + |
| 49 | +### End-to-End Flow |
| 50 | + |
| 51 | +```text |
| 52 | +1. Claude Code POST /mcp -> 401 with auth metadata (unchanged) |
| 53 | + |
| 54 | +2. Claude Code opens browser -> GET /oauth/authorize |
| 55 | + -> MCP validates client_id, PKCE, redirect_uri (unchanged) |
| 56 | + -> MCP stores OIDCFlowState {PKCE challenge, client_id, redirect_uri, |
| 57 | + state, tenant, scopes} |
| 58 | + -> MCP 302 -> https://{tenant}.{baseDomain}/auth/mcp-consent?mcp_state= |
| 59 | + {key}&client_id={id} |
| 60 | + [CHANGED: was Dex redirect, now UI redirect] |
| 61 | + |
| 62 | +3. Browser loads SPA consent page |
| 63 | + -> SPA checks sessionStorage for JWT |
| 64 | + -> If no JWT: redirect to /login with return_url |
| 65 | + -> After login (or if already logged in): |
| 66 | + SPA fetches GET /mcp/consent-info?client_id=...&mcp_state=... |
| 67 | + -> MCP server validates state exists, returns trusted client metadata |
| 68 | + -> SPA renders consent card with client name, redirect URI, scopes, |
| 69 | + tenant, approve/deny |
| 70 | + [NEW: entire step] |
| 71 | + |
| 72 | +4. User clicks "Authorize" |
| 73 | + -> SPA POST /api/auth/mcp-consent {mcp_state, client_id, action: "approve"} |
| 74 | + with Authorization: Bearer {jwt} |
| 75 | + -> BFF auth middleware validates JWT, tenant resolver sets tenant context |
| 76 | + -> BFF extracts email + tenant from JWT claims |
| 77 | + -> BFF generates one-time consent code, stores {email, tenant, |
| 78 | + mcp_state, client_id, scopes} |
| 79 | + -> BFF returns JSON {redirect_url: "/oauth/callback?code= |
| 80 | + {consent_code}&state={mcp_state}"} |
| 81 | + [NEW: entire step] |
| 82 | + |
| 83 | +5. SPA navigates to /oauth/callback?code={consent_code}&state={mcp_state} |
| 84 | + -> MCP HandleCallback consumes state (same as today) |
| 85 | + -> MCP consumes consent code from shared store |
| 86 | + [REPLACES: Dex code exchange] |
| 87 | + -> MCP cross-validates: consent code's mcp_state, client_id, and tenant |
| 88 | + match flow state |
| 89 | + -> MCP extracts email + tenant from consent code entry |
| 90 | + -> MCP signs fresh scoped JWT {sub, email, x-tenant-id, scopes} |
| 91 | + (same pattern as today) |
| 92 | + -> MCP generates MCP auth code, stores in CodeStore (same as today) |
| 93 | + -> MCP 302 -> Claude Code's redirect_uri?code={mcp_code}&state= |
| 94 | + {client's original state} |
| 95 | + [CHANGED: identity source is consent code, not Dex ID token] |
| 96 | + |
| 97 | +6. Claude Code exchanges auth code for JWT at POST /oauth/token (unchanged) |
| 98 | +``` |
| 99 | + |
| 100 | +### What Gets Removed |
| 101 | + |
| 102 | +- `buildDexRedirect` - replaced by `buildConsentRedirect` |
| 103 | +- `exchangeDexCode` - replaced by consent code consumption |
| 104 | +- `BuildTenantScopedDexURL` (for MCP) - consent page is on same origin |
| 105 | +- Inner PKCE leg (MCP-to-Dex) - no longer needed |
| 106 | +- `DexCodeVerifier` field in `OIDCFlowState` - removed |
| 107 | +- `MCP_DEX_ISSUER_URL`, `MCP_DEX_CLIENT_ID`, `MCP_DEX_CALLBACK_URL` env vars from |
| 108 | + MCP server - no longer needed for MCP flow |
| 109 | +- All Dex-related imports and helpers in MCP OAuth handler |
| 110 | + |
| 111 | +### What Stays |
| 112 | + |
| 113 | +- All Dex infrastructure (BFF SSO still uses it) |
| 114 | +- Outer PKCE chain (client-to-MCP) - unchanged |
| 115 | +- MCP state store and code store - unchanged patterns |
| 116 | +- JWT signing and JWKS endpoint - unchanged |
| 117 | + |
| 118 | +## Component Changes |
| 119 | + |
| 120 | +### 1. MCP Server (`services/mcp-server/internal/auth/oidc.go`) |
| 121 | + |
| 122 | +**Modify `HandleAuthorize`**: Replace `buildDexRedirect` with `buildConsentRedirect` |
| 123 | +that redirects to the UI consent page URL with `mcp_state` and `client_id` query params. |
| 124 | +Store `requested_scopes` from the authorize request in `OIDCFlowState`. |
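The redirect described above could be built roughly as follows. This is a minimal sketch, not the actual handler code: the signature of `buildConsentRedirect` and the idea that the tenant slug and base domain arrive as already-validated strings are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildConsentRedirect returns the UI consent page URL for a pending MCP flow.
// The parameter list is hypothetical; tenantSlug and baseDomain are assumed
// to have been validated by the caller.
func buildConsentRedirect(tenantSlug, baseDomain, stateKey, clientID string) string {
	q := url.Values{}
	q.Set("mcp_state", stateKey)
	q.Set("client_id", clientID)
	u := url.URL{
		Scheme:   "https",
		Host:     tenantSlug + "." + baseDomain,
		Path:     "/auth/mcp-consent",
		RawQuery: q.Encode(), // Encode escapes values and sorts keys
	}
	return u.String()
}

func main() {
	fmt.Println(buildConsentRedirect("volterra", "example.com", "abc123", "claude-code"))
}
```

Only the opaque state key and the client ID travel in the URL, which keeps the redirect consistent with the "no bearer tokens in URLs" requirement.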
| 125 | + |
| 126 | +**New endpoint `GET /mcp/consent-info`**: Returns trusted client metadata (client_name, |
| 127 | +redirect_uri, scopes) after validating the `mcp_state` exists in the state store. |
| 128 | +The endpoint is unauthenticated and returns display data only. It cross-checks that the |
| 129 | +`client_id` in the URL matches the `client_id` in state. For dynamically registered clients, |
| 130 | +it includes `is_dynamic: true` so the consent page can flag them as unverified. |
| 131 | + |
| 132 | +**Modify `HandleCallback`**: Accept consent codes from the BFF instead of Dex authorization |
| 133 | +codes. Consume the consent code from the shared `ConsentCodeStore`, cross-validate |
| 134 | +`mcp_state`, `client_id`, and tenant against the flow state, extract identity (email, tenant), |
| 135 | +then proceed to `issueCodeAndRedirect` (unchanged). Include the `scopes` claim in the signed JWT. |
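The cross-validation step in `HandleCallback` might look like the sketch below. The struct names, field names, and error values are hypothetical stand-ins for the real `OIDCFlowState` and consent code entry; the tenant comparison is included because the Tenant binding chain requirement and the backend acceptance criteria both call for it.

```go
package main

import (
	"errors"
	"fmt"
)

// flowState is a hypothetical subset of OIDCFlowState.
type flowState struct {
	StateKey   string
	ClientID   string
	TenantSlug string
}

// consentEntry is a hypothetical subset of the consent code entry.
type consentEntry struct {
	MCPState   string
	ClientID   string
	TenantSlug string
	Email      string
}

var (
	errStateMismatch  = errors.New("consent code bound to a different mcp_state")
	errClientMismatch = errors.New("consent code bound to a different client_id")
	errTenantMismatch = errors.New("consent code bound to a different tenant")
)

// crossValidate rejects a consent code whose bindings differ from the flow state.
func crossValidate(flow flowState, entry consentEntry) error {
	switch {
	case entry.MCPState != flow.StateKey:
		return errStateMismatch
	case entry.ClientID != flow.ClientID:
		return errClientMismatch
	case entry.TenantSlug != flow.TenantSlug:
		return errTenantMismatch
	}
	return nil
}

func main() {
	flow := flowState{StateKey: "s1", ClientID: "claude-code", TenantSlug: "volterra"}
	entry := consentEntry{MCPState: "s1", ClientID: "claude-code", TenantSlug: "volterra", Email: "op@example.com"}
	fmt.Println(crossValidate(flow, entry) == nil)
}
```

Any mismatch aborts the callback before a JWT is signed, so a consent code issued for one flow cannot be replayed against another.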
| 136 | + |
| 137 | +**New `OIDCStateStore.Peek` method**: Non-consuming read that returns selected fields |
| 138 | +(client_id, redirect_uri, scopes) for the consent-info endpoint. |
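A non-consuming `Peek` could be sketched as below. The store layout, the `ConsentView` type, the `Put` helper, and all field names are assumptions for illustration; only `OIDCStateStore`, `OIDCFlowState`, `Peek`, and the 10-minute state TTL come from this PRD.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// OIDCFlowState fields here are a hypothetical subset.
type OIDCFlowState struct {
	ClientID      string
	RedirectURI   string
	Scopes        []string
	CodeChallenge string // never exposed through Peek
	ExpiresAt     time.Time
}

// ConsentView is the display-only projection returned by Peek (hypothetical name).
type ConsentView struct {
	ClientID    string
	RedirectURI string
	Scopes      []string
}

type OIDCStateStore struct {
	mu      sync.Mutex
	entries map[string]OIDCFlowState
}

func NewOIDCStateStore() *OIDCStateStore {
	return &OIDCStateStore{entries: make(map[string]OIDCFlowState)}
}

// Put stores state under key with a 10-minute TTL.
func (s *OIDCStateStore) Put(key string, st OIDCFlowState) {
	s.mu.Lock()
	defer s.mu.Unlock()
	st.ExpiresAt = time.Now().Add(10 * time.Minute)
	s.entries[key] = st
}

// Peek returns selected fields without removing the entry from the store.
func (s *OIDCStateStore) Peek(key string) (ConsentView, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	st, ok := s.entries[key]
	if !ok || time.Now().After(st.ExpiresAt) {
		return ConsentView{}, false
	}
	return ConsentView{
		ClientID:    st.ClientID,
		RedirectURI: st.RedirectURI,
		Scopes:      append([]string(nil), st.Scopes...), // copy so callers cannot mutate stored state
	}, true
}

func main() {
	store := NewOIDCStateStore()
	store.Put("state-1", OIDCFlowState{
		ClientID:    "claude-code",
		RedirectURI: "http://localhost:12345/callback",
		Scopes:      []string{"mcp:default"},
	})
	view, ok := store.Peek("state-1")
	fmt.Println(view.ClientID, ok)
}
```

Returning a narrow projection rather than the full flow state keeps the PKCE challenge and other internals out of the unauthenticated consent-info response.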
| 139 | + |
| 140 | +**Remove Dex-direct code**: Delete `buildDexRedirect`, `exchangeDexCode`, |
| 141 | +`BuildTenantScopedDexURL`, inner PKCE generation, `DexCodeVerifier` from `OIDCFlowState`, |
| 142 | +and all Dex-specific env var handling (`MCP_DEX_ISSUER_URL`, `MCP_DEX_CLIENT_ID`, |
| 143 | +`MCP_DEX_CALLBACK_URL`). Remove the OIDC discovery client, Dex token exchange HTTP client, |
| 144 | +and related helpers. The MCP server no longer talks to Dex. |
| 145 | + |
| 146 | +### 2. BFF / API Gateway (`services/api-gateway/`) |
| 147 | + |
| 148 | +**New endpoint `POST /api/auth/mcp-consent`**: Behind full auth middleware chain (JWT |
| 149 | +validated, tenant resolved). Accepts `{mcp_state, client_id, action}` where `action` |
| 150 | +is an explicit enum: `"approve"` or `"deny"` (no boolean default - a missing or |
| 151 | +unrecognized action is rejected). On approve: generates one-time consent code, stores |
| 152 | +identity in `ConsentCodeStore`, returns `{redirect_url}` pointing to MCP callback. |
| 153 | +On deny: consumes MCP state, returns `{redirect_url}` pointing to client's |
| 154 | +redirect_uri with `error=access_denied` and the client's original state. |
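The explicit-enum rule for `action` can be illustrated with a small request parser. This is a sketch under assumptions: the type names, the `parseConsentRequest` helper, and the use of `encoding/json` are illustrative choices, not the gateway's actual code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ConsentAction is the explicit enum; there is no default value.
type ConsentAction string

const (
	ActionApprove ConsentAction = "approve"
	ActionDeny    ConsentAction = "deny"
)

// ConsentRequest mirrors the request body {mcp_state, client_id, action}.
type ConsentRequest struct {
	MCPState string        `json:"mcp_state"`
	ClientID string        `json:"client_id"`
	Action   ConsentAction `json:"action"`
}

// ErrInvalidAction corresponds to the "invalid_action" 400 error code.
var ErrInvalidAction = errors.New("invalid_action")

// parseConsentRequest decodes the body and rejects a missing or unknown action.
func parseConsentRequest(body []byte) (ConsentRequest, error) {
	var req ConsentRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return ConsentRequest{}, err
	}
	switch req.Action {
	case ActionApprove, ActionDeny:
		return req, nil
	}
	return ConsentRequest{}, ErrInvalidAction
}

func main() {
	_, err := parseConsentRequest([]byte(`{"mcp_state":"s","client_id":"c"}`))
	fmt.Println(err)
}
```

Because the zero value of `ConsentAction` is the empty string, an omitted field falls through to the rejection branch rather than being treated as approval.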
| 155 | + |
| 156 | +**New `ConsentCodeStore`**: In-memory store, same pattern as MCP's `CodeStore`. One-time |
| 157 | +consumption, 2-minute TTL, capped at 10,000 entries, background eviction. |
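A minimal sketch of such a store follows. It assumes a mutex-guarded map and, for brevity, evicts expired entries when the cap is reached instead of running the background eviction goroutine described above; field names and the `ErrStoreFull` error are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ConsentCodeEntry holds the identity bound to a one-time consent code.
type ConsentCodeEntry struct {
	Email, TenantID, TenantSlug, MCPState, ClientID string
	ApprovedScopes                                  []string
	CreatedAt                                       time.Time
}

type ConsentCodeStore struct {
	mu      sync.Mutex
	entries map[string]ConsentCodeEntry
	ttl     time.Duration
	maxSize int
}

var ErrStoreFull = errors.New("consent code store full")

func NewConsentCodeStore() *ConsentCodeStore {
	return &ConsentCodeStore{
		entries: make(map[string]ConsentCodeEntry),
		ttl:     2 * time.Minute,
		maxSize: 10000,
	}
}

// Put stores the entry, evicting expired codes first if the cap is reached.
func (s *ConsentCodeStore) Put(code string, e ConsentCodeEntry) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.entries) >= s.maxSize {
		s.evictExpiredLocked()
	}
	if len(s.entries) >= s.maxSize {
		return ErrStoreFull
	}
	e.CreatedAt = time.Now()
	s.entries[code] = e
	return nil
}

// Consume removes and returns the entry. Lookup and delete happen under one
// lock, so concurrent calls for the same code succeed for exactly one caller.
func (s *ConsentCodeStore) Consume(code string) (ConsentCodeEntry, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.entries[code]
	if !ok {
		return ConsentCodeEntry{}, false
	}
	delete(s.entries, code)
	if time.Since(e.CreatedAt) > s.ttl {
		return ConsentCodeEntry{}, false
	}
	return e, true
}

// evictExpiredLocked drops expired entries; the caller must hold s.mu.
func (s *ConsentCodeStore) evictExpiredLocked() {
	for code, e := range s.entries {
		if time.Since(e.CreatedAt) > s.ttl {
			delete(s.entries, code)
		}
	}
}

func main() {
	store := NewConsentCodeStore()
	_ = store.Put("code-1", ConsentCodeEntry{Email: "op@example.com"})
	_, first := store.Consume("code-1")
	_, second := store.Consume("code-1")
	fmt.Println(first, second)
}
```

Failing closed when the store is full (rather than evicting live codes) is one possible design choice; it trades availability under load for the guarantee that an unexpired code is never silently dropped.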
| 158 | + |
| 159 | +### 3. Frontend (`frontend/src/`) |
| 160 | + |
| 161 | +**New route**: `/auth/mcp-consent` in `App.tsx`. This path avoids the Caddy |
| 162 | +`@mcp_transport` matcher which intercepts all `/oauth/*` paths and routes them to |
| 163 | +the MCP server. Using `/auth/mcp-consent` ensures the request falls through to the |
| 164 | +SPA catch-all. |
| 165 | + |
| 166 | +**New page component**: `OAuthConsentPage` - checks auth state, fetches client metadata |
| 167 | +from `/mcp/consent-info`, renders consent card, handles approve/deny. |
| 168 | + |
| 169 | +**New display component**: `ConsentCard` - shows application name, tenant context, scope |
| 170 | +description, redirect URI, approve/deny buttons. For dynamically registered clients (where |
| 171 | +`is_dynamic: true`), show "Unverified application" badge. Styled consistently with existing |
| 172 | +login page. |
| 173 | + |
| 174 | +**No changes to**: login page, callback page, auth context, auth interceptor, SSO flow. |
| 175 | + |
| 176 | +### 4. Wiring (`cmd/meridian/`) |
| 177 | + |
| 178 | +The unified binary (`cmd/meridian`) runs both the BFF (api-gateway) and MCP server in |
| 179 | +the same Go process. This enables shared in-memory stores between them. |
| 180 | + |
| 181 | +A shared `ConsentCodeStore` is created once and passed to both the BFF's `MCPConsentHandler` |
| 182 | +and MCP's `OIDCHandler`. The shared `OIDCStateStore` is also passed to the BFF handler (needed |
| 183 | +for the deny flow and redirect_uri lookup). |
| 184 | + |
| 185 | +**Deployment note**: The demo docker-compose runs a separate `mcp-server` container |
| 186 | +alongside the unified `meridian` container. The consent flow requires both BFF and MCP |
| 187 | +to share in-memory stores, so the consent flow runs within the unified binary only. |
| 188 | +The separate `mcp-server` container's OAuth endpoints are not used for the consent |
| 189 | +flow - Caddy routes `/mcp/*` to the MCP handler within the unified binary. If the |
| 190 | +MCP server is ever deployed as a fully separate service, the in-memory stores must |
| 191 | +be replaced with HTTP-based inter-service calls (BFF calls MCP to exchange codes). |
| 192 | + |
| 193 | +### 5. Cleanup |
| 194 | + |
| 195 | +**Remove from docker-compose env vars**: `MCP_DEX_ISSUER_URL`, `MCP_DEX_CLIENT_ID`, |
| 196 | +`MCP_DEX_CALLBACK_URL` from both demo and develop compose files and `.env` templates. |
| 197 | +These are no longer consumed by the MCP server. |
| 198 | + |
| 199 | +**Remove from Dex client registration**: The `/oauth/callback` redirect URI in |
| 200 | +`DefaultDemoClient` is no longer needed for Dex (the MCP callback now receives consent |
| 201 | +codes from the BFF, not Dex codes). Keep it only if the BFF SSO flow still uses it; |
| 202 | +otherwise remove it. |
| 203 | + |
| 204 | +**Update deploy docs**: Remove any references to MCP-specific Dex configuration from |
| 205 | +`deploy/demo/README.md` and related documentation. |
| 206 | + |
| 207 | +## Security Requirements |
| 208 | + |
| 209 | +All mandatory - no "recommended" tier. Everything listed here ships or the PRD isn't done. |
| 210 | + |
| 211 | +1. **No bearer tokens in URLs**: JWTs must never appear as URL query parameters. Only |
| 212 | + opaque one-time codes travel in redirects. |
| 213 | +2. **One-time code consumption**: `Consume()` must be atomic - concurrent calls for the |
| 214 | + same code return success for exactly one caller. |
| 215 | +3. **Tenant binding chain**: The tenant must be consistent across the entire flow. |
| 216 | + MCP state stores `tenantSlug` (from subdomain). BFF JWT contains `x-tenant-id` |
| 217 | + (UUID). The consent code stores both `tenantSlug` and `tenantID`. At each |
| 218 | + boundary, resolve slug to UUID (or vice versa) and verify they refer to the |
| 219 | + same tenant. A mismatch at any link means rejection. |
| 220 | +4. **PKCE integrity**: The outer PKCE chain (client-to-MCP) must be preserved unmodified |
| 221 | + by the consent flow. |
| 222 | +5. **Explicit consent**: The BFF consent endpoint is callable only via POST with Bearer auth. |
| 223 | + No auto-approve, no GET-based approval. |
| 224 | +6. **Fresh scoped JWT**: MCP callback signs a new JWT with minimal claims (`sub`, `email`, |
| 225 | + `x-tenant-id`, `scopes`). BFF JWT roles and other claims must not propagate. |
| 226 | +7. **Client identity binding**: Consent code `client_id` must match flow state `client_id`. |
| 227 | +8. **Server-side client metadata**: Consent page fetches client_name from server by |
| 228 | + client_id. Never trusts URL params for display. |
| 229 | +9. **Display redirect_uri**: Consent screen shows where credentials will be sent |
| 230 | + (unforgeable client identifier). |
| 231 | +10. **Scope model**: `requested_scopes` in `OIDCFlowState`, `approved_scopes` in |
| 232 | + `ConsentCodeEntry`, `scopes` claim in MCP JWT. v1 value: `["mcp:default"]`. |
| 233 | +11. **Dynamic client flagging**: Consent screen shows "Unverified application" badge for |
| 234 | + dynamically registered clients. |
| 235 | +12. **return_url validation**: The `/login?return_url=...` redirect used when the user is |
| 236 | + not authenticated must validate that `return_url` is a relative path starting with |
| 237 | + `/` (existing BFF pattern). This prevents open-redirect attacks where a crafted |
| 238 | + consent URL chains through login to redirect to an attacker-controlled site. |
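The `return_url` check in requirement 12 could be sketched as follows. The PRD only states that the value must be a relative path starting with `/`; the extra rejections of `//` prefixes and backslashes are assumptions added here because browsers can treat those forms as pointing at another host. The function name is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isSafeReturnURL reports whether raw is a same-origin relative path.
func isSafeReturnURL(raw string) bool {
	if !strings.HasPrefix(raw, "/") {
		return false // must be a relative path starting with "/"
	}
	if strings.HasPrefix(raw, "//") || strings.Contains(raw, `\`) {
		return false // assumption: reject scheme-relative and backslash forms
	}
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return u.Scheme == "" && u.Host == ""
}

func main() {
	fmt.Println(isSafeReturnURL("/auth/mcp-consent?mcp_state=abc&client_id=x"))
	fmt.Println(isSafeReturnURL("https://evil.example/phish"))
}
```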
| 239 | + |
| 240 | +### Consent Code Specification |
| 241 | + |
| 242 | +- **Entropy**: 32 bytes crypto/rand, base64url-encoded without padding (43 chars) |
| 243 | +- **TTL**: 2 minutes (shorter than MCP state's 10-min TTL) |
| 244 | +- **Store cap**: 10,000 entries with background eviction |
| 245 | +- **Binding**: email, tenant_id, tenant_slug, mcp_state, client_id, approved_scopes, |
| 246 | + created_at |
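Generating a code that meets the entropy line of this specification takes only a few lines. The function name `newConsentCode` is hypothetical; the sketch uses the standard library's `crypto/rand` and unpadded base64url encoding, which yields 43 characters for 32 bytes.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newConsentCode returns 32 random bytes encoded as unpadded base64url.
func newConsentCode() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

func main() {
	code, err := newConsentCode()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(code))
}
```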
| 247 | + |
| 248 | +## UX Specification |
| 249 | + |
| 250 | +### Consent Card |
| 251 | + |
| 252 | +```text |
| 253 | ++-------------------------------------------+ |
| 254 | +| Authorize Application | |
| 255 | +| | |
| 256 | +| "Claude Code" wants to access your | |
| 257 | +| Volterra Energy account. | |
| 258 | +| | |
| 259 | +| [! Unverified application] | |
| 260 | +| | |
| 261 | +| This will allow: | |
| 262 | +| - Full access to your account | |
| 263 | +| | |
| 264 | +| Credentials will be sent to: | |
| 265 | +| http://localhost:12345/callback | |
| 266 | +| | |
| 267 | +| [Deny] [Authorize] | |
| 268 | ++-------------------------------------------+ |
| 269 | +``` |
| 270 | + |
| 271 | +The "Unverified application" badge only appears for dynamically registered clients |
| 272 | +(`is_dynamic: true`). |
| 273 | + |
| 274 | +### States |
| 275 | + |
| 276 | +1. **Loading**: SPA bundle loading, auth check, client metadata fetch - show spinner |
| 277 | +2. **Unauthenticated**: No JWT in sessionStorage - redirect to `/login?return_url=...` |
| 278 | +3. **Authenticated**: JWT found - render consent card with server-fetched metadata |
| 279 | +4. **Error - invalid state**: "This authorization request has expired. Please try again |
| 280 | + from Claude Code." |
| 281 | +5. **Error - invalid client**: "This authorization request is invalid. The application |
| 282 | + could not be found." |
| 283 | +6. **Submitting**: Approve button disabled, spinner overlay |
| 284 | +7. **Denied**: Redirect to client with `error=access_denied` |
| 285 | + |
| 286 | +### API Contracts (Frontend View) |
| 287 | + |
| 288 | +**GET `/mcp/consent-info?client_id=...&mcp_state=...`** (MCP server, unauthenticated) |
| 289 | + |
| 290 | +- 200: `{ client_id, client_name, redirect_uri, scopes, is_dynamic }` |
| 291 | +- 400: invalid/expired state or client_id mismatch |
| 292 | + |
| 293 | +**POST `/api/auth/mcp-consent`** (BFF, requires Bearer JWT) |
| 294 | + |
| 295 | +- Request: `{ mcp_state, client_id, action: "approve" | "deny" }` |
| 296 | +- 200 (approved): `{ redirect_url: "/oauth/callback?code=...&state=..." }` |
| 297 | +- 200 (denied): `{ redirect_url: "https://client/callback?error=access_denied&state=..." }` |
| 298 | +- 400: `{ error: "invalid_state" | "state_expired" | "client_mismatch" | "invalid_action" }` |
| 299 | +- 401: invalid/expired JWT |
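The `redirect_url` returned on deny could be assembled as in the sketch below. The helper name and signature are hypothetical; it assumes the client's registered `redirect_uri` and original `state` have already been read from the flow state.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDenyRedirect appends error=access_denied and the client's original
// state to the client's redirect URI, preserving any existing query params.
func buildDenyRedirect(redirectURI, clientState string) (string, error) {
	u, err := url.Parse(redirectURI)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("error", "access_denied")
	if clientState != "" {
		q.Set("state", clientState)
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	target, err := buildDenyRedirect("http://localhost:12345/callback", "xyz")
	if err != nil {
		panic(err)
	}
	fmt.Println(target)
}
```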
| 300 | + |
| 301 | +## Acceptance Criteria |
| 302 | + |
| 303 | +### Backend |
| 304 | + |
| 305 | +1. MCP `/oauth/authorize` redirects to UI consent page (not Dex) |
| 306 | +2. MCP `/mcp/consent-info` returns trusted client metadata including `redirect_uri`, |
| 307 | + `scopes`, and `is_dynamic` after validating state |
| 308 | +3. BFF `POST /api/auth/mcp-consent` requires valid JWT, issues one-time consent code |
| 309 | +4. MCP `/oauth/callback` accepts consent codes and cross-validates against flow state |
| 310 | +5. MCP `/oauth/token` returns scoped JWT with `sub`, `email`, `x-tenant-id`, `scopes` |
| 311 | +6. BFF SSO flow via Dex continues to work unchanged |
| 312 | +7. Consent codes have 2-min TTL, one-time-use, capped store with eviction |
| 313 | +8. Tenant cross-check: consent code tenant must match flow state tenant |
| 314 | +9. Deny flow redirects client with `error=access_denied` and original client state |
| 315 | +10. All Dex-direct code removed from MCP OAuth handler (no `buildDexRedirect`, |
| 316 | + no `exchangeDexCode`, no inner PKCE) |
| 317 | +11. `MCP_DEX_ISSUER_URL`, `MCP_DEX_CLIENT_ID`, `MCP_DEX_CALLBACK_URL` env vars removed |
| 318 | + from MCP server and docker-compose configs |
| 319 | + |
| 320 | +### Frontend |
| 321 | + |
| 322 | +1. User with existing session sees consent card within 2 seconds |
| 323 | +2. User without session is redirected to login, then back to consent |
| 324 | +3. Client name on consent card matches server-registered name |
| 325 | +4. Redirect URI is visible on consent card |
| 326 | +5. "Authorize" button disabled until client metadata loads |
| 327 | +6. "Deny" redirects MCP client with `error=access_denied` |
| 328 | +7. Expired/invalid `mcp_state` shows clear error message |
| 329 | +8. Invalid `client_id` shows clear error message |
| 330 | +9. Consent page uses same styling as login page |
| 331 | +10. Dynamically registered clients show "Unverified application" badge |
| 332 | + |
| 333 | +### Security |
| 334 | + |
| 335 | +1. No JWTs appear in any redirect URL at any point in the flow |
| 336 | +2. Consent codes are consumed exactly once (atomic) |
| 337 | +3. PKCE chain works end-to-end (client code_verifier verified at `/oauth/token`) |
| 338 | +4. Cross-tenant state replay is rejected |
| 339 | +5. `scopes` claim present in MCP-issued JWT |
| 340 | + |
| 341 | +### End-to-End |
| 342 | + |
| 343 | +1. Full flow works: Claude Code -> authorize -> consent -> approve -> callback -> |
| 344 | + token exchange -> authenticated MCP session |
| 345 | +2. Full flow works when user is NOT logged in: Claude Code -> authorize -> consent -> |
| 346 | + login -> consent -> approve -> callback -> token exchange |
| 347 | +3. Deny flow works: Claude Code -> authorize -> consent -> deny -> Claude Code receives |
| 348 | + `error=access_denied` |
| 349 | +4. Demo environment: Volterra Energy operator can authenticate MCP via the new consent flow |