Cannot run a Snowflake MCP server behind ToolHive's MCPRemoteProxy
User Story
As a ToolHive operator,
I want to put MCPRemoteProxy (or VirtualMCPServer) in front of
a Snowflake MCP server (Snowflake-Labs/mcp self-hosted, or the
Snowflake-managed <account>/api/v2/.../mcp-servers/... variant) using
OAuth against Snowflake as the upstream IdP,
so that end users can authenticate with their Snowflake account
and the proxy issues session JWTs that carry the Snowflake login_name
in audit logs.
Context
MCPExternalAuthConfig's OAuth2 upstream type marks userInfo as
required. Snowflake does not expose a userinfo endpoint that ToolHive
can call — its REST API has only /api/v2/users/{name} (which needs
the username up front) and /api/v2/users (list, not "current user").
There is no OIDC discovery for custom OAuth clients either, so the
existing OIDC upstream type doesn't fit.
kubectl apply of a sensible-looking config is rejected at admission:
spec.embeddedAuthServer.upstreamProviders[0].oauth2Config.userInfo: Required value
Forcing the issue with a fake userInfo URL produces a runtime failure
during the OAuth callback. PR #5094 added a synthesis fallback
(tk-<hash> derived from the access token) for OAuth2 upstreams without
userInfo — that unblocks the flow but loses real identity entirely:
audit logs become useless for human correlation, rate-limit buckets
keyed on subject reset on every re-auth, and there's no way back to a
Snowflake login from a tk-… hash.
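To make the rate-limit and audit problem concrete, here is a minimal sketch of a token-derived subject. The function name, the choice of SHA-256, and the 16-character truncation are all assumptions for illustration; the actual derivation in PR #5094 may differ. The point is only that any subject derived from the access token changes whenever the token does.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// synthesizeSubject is a hypothetical stand-in for the synthesis fallback.
// Assumption: SHA-256 truncated to 8 bytes; the real scheme may differ.
func synthesizeSubject(accessToken string) string {
	sum := sha256.Sum256([]byte(accessToken))
	return "tk-" + hex.EncodeToString(sum[:8])
}

func main() {
	first := synthesizeSubject("token-issued-at-first-login")
	second := synthesizeSubject("token-issued-after-reauth")
	// Same human, two logins, two unrelated subjects.
	fmt.Println(first)
	fmt.Println(second)
	fmt.Println("same subject:", first == second)
}
```

Anything keyed on the subject (rate-limit buckets, audit correlation) therefore starts from scratch after each re-authentication.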
What Snowflake DOES return on its token endpoint (verified against a
real trial account, documented at
https://docs.snowflake.com/en/user-guide/oauth-custom):
{
  "access_token": "<opaque ~700-byte Snowflake-proprietary blob>",
  "token_type": "Bearer",
  "expires_in": 599,
  "refresh_token": "<opaque>",
  "refresh_token_expires_in": 86399,
  "scope": "refresh_token session:role:<ROLE>",
  "username": "JHROZEK",
  "user_first_name": "Jakub",
  "user_last_name": "Hrozek",
  "idpInitiated": false
}
The access token is opaque (not a JWT) so we can't decode claims out of
it. But the response envelope itself carries the user identity.
Snowflake's docs note username is omitted on refresh-token grant
responses, so identity has to be captured at auth-code time.
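As a rough illustration of reading identity from the response envelope rather than from the access token, the sketch below decodes the fields shown in the sample response. The struct, function, and error names are hypothetical and are not ToolHive's; only the JSON field names come from the response above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// snowflakeTokenResponse mirrors the identity-bearing fields of the sample
// response. The type name is hypothetical.
type snowflakeTokenResponse struct {
	AccessToken   string `json:"access_token"`
	Username      string `json:"username"`
	UserFirstName string `json:"user_first_name"`
	UserLastName  string `json:"user_last_name"`
}

var errNoUsername = errors.New("token response carries no username")

// identityAtAuthCode decodes an authorization-code grant response and
// returns the login name and a display name built from first and last name.
func identityAtAuthCode(body []byte) (string, string, error) {
	var resp snowflakeTokenResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", "", fmt.Errorf("decode token response: %w", err)
	}
	if resp.Username == "" {
		// Expected on refresh-token grants, per the Snowflake docs.
		return "", "", errNoUsername
	}
	display := strings.TrimSpace(resp.UserFirstName + " " + resp.UserLastName)
	return resp.Username, display, nil
}

func main() {
	body := []byte(`{"access_token":"<opaque>","username":"JHROZEK",
		"user_first_name":"Jakub","user_last_name":"Hrozek"}`)
	login, display, err := identityAtAuthCode(body)
	fmt.Println(login, display, err)
}
```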
This story closes the gap by adding an IdentityFromToken block on
OAuth2UpstreamConfig. The block extracts identity (subject / name / email)
from the OAuth2 token-endpoint response body using gjson dot-notation
paths, and sits alongside the existing userInfo (HTTP) and PR #5094
(synthesis) paths. Slack v2 (whose oauth.v2.access response nests user identity
under authed_user.id) benefits from the same mechanism.
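The path-based extraction can be sketched with the standard library alone. The helper below is a deliberately small stand-in for gjson (plain object keys only, no arrays, wildcards, or escapes) and its name is hypothetical; the Slack user ID in the example is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookupPath walks a JSON object along a dot-separated path such as
// "authed_user.id" and returns the non-empty string found there.
// Assumption: this only approximates gjson's plain dot-notation behaviour.
func lookupPath(body []byte, path string) (string, bool) {
	var doc any
	if err := json.Unmarshal(body, &doc); err != nil {
		return "", false
	}
	cur := doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return "", false // path descends into a non-object
		}
		cur, ok = obj[key]
		if !ok {
			return "", false // key missing at this level
		}
	}
	s, ok := cur.(string)
	return s, ok && s != ""
}

func main() {
	snowflake := []byte(`{"username":"JHROZEK","user_first_name":"Jakub"}`)
	slack := []byte(`{"ok":true,"authed_user":{"id":"U0EXAMPLE"}}`)

	name, found := lookupPath(snowflake, "username")
	fmt.Println(name, found)
	id, found := lookupPath(slack, "authed_user.id")
	fmt.Println(id, found)
}
```

The same lookup covers a flat envelope (Snowflake) and a nested one (Slack v2), which is why a single path-driven block can serve both.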
The fix is identity-resolution work and is independent of which
Snowflake MCP downstream is in use: the same identityFromToken
configuration drives the OAuth flow against Snowflake regardless of
whether the proxy forwards to a self-hosted Snowflake-Labs/mcp or
to the Snowflake-managed <account>/api/v2/.../mcp-servers/...
variant. Choice of downstream is just MCPRemoteProxy.spec.remoteUrl
and transport; this story does not constrain that.
Dependencies: PR #5094 (already merged) — provides the synthesis
fallback that this story integrates with as the lowest-priority path
in the identity-resolution chain.
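One way to picture the identity-resolution chain is as an ordered list of strategies where the first success wins. The sketch below is an assumption-laden illustration: the resolver type and function names are hypothetical, and the relative order of identityFromToken and userInfo is a guess. The story only states that synthesis is the lowest-priority path.

```go
package main

import (
	"errors"
	"fmt"
)

// resolver is a hypothetical wrapper around one identity-resolution strategy.
type resolver struct {
	name    string
	resolve func() (string, bool)
}

var errUnresolved = errors.New("no resolver produced an identity")

// resolveIdentity tries each resolver in order and returns the first
// subject found, together with the name of the strategy that produced it.
func resolveIdentity(chain []resolver) (string, string, error) {
	for _, r := range chain {
		if subject, ok := r.resolve(); ok {
			return subject, r.name, nil
		}
	}
	return "", "", errUnresolved
}

func main() {
	chain := []resolver{
		// Assumed order; only "synthesis last" is stated in the story.
		{"identityFromToken", func() (string, bool) { return "JHROZEK", true }},
		{"userInfo", func() (string, bool) { return "", false }},
		{"synthesis", func() (string, bool) { return "tk-example", true }},
	}
	subject, source, err := resolveIdentity(chain)
	fmt.Println(subject, source, err)
}
```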
Acceptance Criteria
Capability-level outcomes a ToolHive operator (or e2e check) can
observe once every sub-issue under this story has merged:
- An operator applies an MCPExternalAuthConfig with an identityFromToken
block (and no userInfo block) and the manifest is admitted.
- Identity is resolved from the token-endpoint response body using
configured gjson paths, and the proxy does NOT call any userinfo endpoint.
- The IdentitySynthesized advisory condition stays False (with reason
AllUpstreamsHaveIdentityResolution) for configs that use
identityFromToken: the operator status surface accurately reflects
that the upstream is NOT in synthesis mode.
- Audit logs carry the human identifier in subjects.user (e.g. the
Snowflake login_name JHROZEK) for configurations that point namePath at
the response field carrying the human identifier.
- The name claim is unchanged across a refresh boundary even though
Snowflake omits username on refresh responses.
- The flow works end to end through an ngrok-exposed MCPRemoteProxy
against a real Snowflake trial account (see reproducer runbook attached
as a child issue).
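The refresh-boundary criterion above can be sketched as a merge rule: a token response that carries username sets the stored name, and one that omits it leaves the stored name alone. The type and function names below are hypothetical and say nothing about how ToolHive actually stores session state.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sessionIdentity is a hypothetical record of the identity kept per session.
type sessionIdentity struct {
	Name string
}

// applyTokenResponse folds a token-endpoint response into the stored
// identity. A response without "username" (as on refresh-token grants)
// keeps whatever was captured at auth-code time.
func applyTokenResponse(stored sessionIdentity, body []byte) (sessionIdentity, error) {
	var resp struct {
		Username string `json:"username"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return stored, fmt.Errorf("decode token response: %w", err)
	}
	if resp.Username != "" {
		stored.Name = resp.Username
	}
	return stored, nil
}

func main() {
	authCode := []byte(`{"access_token":"a","username":"JHROZEK"}`)
	refresh := []byte(`{"access_token":"b"}`)

	id, _ := applyTokenResponse(sessionIdentity{}, authCode)
	id, _ = applyTokenResponse(id, refresh)
	fmt.Println(id.Name) // prints JHROZEK
}
```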
Deferred (not handled by this story)
- A separate accessTokenClaims extraction mode for providers that
issue JWT access tokens with identity claims (Glean is the motivating
case). Different mechanism (JWT decode + JWKS-based signature
verification), separate trust model, and the audience-confusion
question (a JWT issued for the upstream resource server isn't, in
general, intended as identity for ToolHive) needs its own design
pass. Tracked separately.