Forward MCPServerEntry headerForward to vMCP outbound requests #5239

Open
ChrisJBurns wants to merge 8 commits into main from
cburns/headerforward-envvar

Conversation

@ChrisJBurns
Collaborator

Summary

Closes #4996.

MCPServerEntry.spec.headerForward.{addPlaintextHeaders,addHeadersFromSecret} was accepted by the CRD but never sent on outbound requests when the entry was consumed as a static backend of a VirtualMCPServer. Only MCPRemoteProxy forwarded the field — vMCP requests to remoteUrl arrived without the configured headers, breaking use cases like GitHub Copilot's X-MCP-Toolsets multi-toolset selection.

This PR fixes the bug without touching pkg/vmcp/config by mirroring MCPRemoteProxy's existing pattern: the operator emits per-(entry, header) env vars on the vMCP pod (literal values for plaintext, valueFrom.secretKeyRef for secrets), and the vMCP runtime walks the well-known prefixes at startup to reconstruct Backend.HeaderForward in static mode. TOOLHIVE_OTEL_HEADER_* already establishes plaintext-header-via-env in this codebase, so the convention isn't new.

Zero CRD/docs diff. Zero non-test changes under pkg/vmcp/config/.

Medium level
  • Operator: buildHeaderForwardEnvVarsForEntries now emits literal-value env vars for addPlaintextHeaders alongside the existing valueFrom.secretKeyRef env vars for addHeadersFromSecret. Header normalization is extracted into a private normalizeHeaderForEnvVar helper that both GenerateHeaderForwardSecretEnvVarName and the new GenerateHeaderForwardPlaintextEnvVarName share, so the secret and plaintext branches can never diverge on a header that round-trips through one and not the other.
  • Runtime: a new readHeaderForwardFromEnv (pkg/vmcp/cli/header_forward_env.go) walks os.Environ() for the TOOLHIVE_HEADER_PLAINTEXT_* and TOOLHIVE_SECRET_HEADER_FORWARD_* prefixes at startup, parses each (entry, header) pair via the inverse of the operator's normalization, and builds map[backendName]*vmcp.HeaderForwardConfig. Stray env vars whose decoded entry segment doesn't match a known static backend are dropped.
  • Discoverer: NewUnifiedBackendDiscovererWithStaticBackends gains a headerForwardByBackend parameter. discoverFromStaticConfig attaches the matching map entry to Backend.HeaderForward by backend name.
  • Round tripper (pkg/vmcp/client/header_forward.go): inserted between identityPropagatingRoundTripper (outer) and authRoundTripper (inner). Resolves plaintext + secret headers once at client-factory time via secrets.EnvironmentProvider. Rejects restricted headers via the shared pkg/transport/middleware.RestrictedHeaders set. Auth always wins over user-supplied headers because it runs after this tripper.
  • Health monitor: BackendTarget construction in pkg/vmcp/health/monitor.go::performHealthCheck now carries HeaderForward, CABundlePath, and CABundleData so health probes hit backends with the same TLS trust and header injection as list/call traffic.
  • MCPServerEntry reconciler: a new HeaderSecretRefsValidated condition (reusing MCPRemoteProxy's HeaderSecretNotFound reason) flips the entry to Failed when a referenced Secret is missing. GenerateHeaderForwardSecretEnvVarName now takes ownerName rather than proxyName so both MCPRemoteProxy and MCPServerEntry share one helper.
  • Dynamic mode (K8s API discovery): pkg/vmcp/workloads/k8s.go::mcpServerEntryToBackend is unchanged — it reads headerForward directly from the MCPServerEntry CRD at backend-construction time, so no env-var path is needed there.
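
The round-tripper ordering described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `newHeaderForwardTripper`, `forwardOnce`, `roundTripperFunc`, the stub auth tripper, and the contents of `restrictedHeaders` are all hypothetical stand-ins, and the real `RestrictedHeaders` set may contain different entries.

```go
package main

import (
	"fmt"
	"net/http"
)

// restrictedHeaders stands in for the shared RestrictedHeaders set.
// Its real contents are not shown here, so these entries are assumptions.
var restrictedHeaders = map[string]struct{}{
	"Host":              {},
	"Content-Length":    {},
	"Transfer-Encoding": {},
}

// roundTripperFunc adapts a plain function to http.RoundTripper.
type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// newHeaderForwardTripper validates the headers once, up front, and returns
// a tripper that injects them before delegating to next.
func newHeaderForwardTripper(next http.RoundTripper, headers map[string]string) (http.RoundTripper, error) {
	resolved := make(http.Header, len(headers))
	for name, value := range headers {
		if _, bad := restrictedHeaders[http.CanonicalHeaderKey(name)]; bad {
			return nil, fmt.Errorf("header %q is restricted and cannot be forwarded", name)
		}
		resolved.Set(name, value)
	}
	return roundTripperFunc(func(req *http.Request) (*http.Response, error) {
		// A RoundTripper must not mutate the caller's request, so clone it.
		out := req.Clone(req.Context())
		for name, values := range resolved {
			out.Header[name] = append([]string(nil), values...)
		}
		return next.RoundTrip(out)
	}), nil
}

// forwardOnce wires headerForward -> auth stub -> capturing terminal and
// returns the headers the terminal transport observed.
func forwardOnce(headers map[string]string) (http.Header, error) {
	var seen http.Header
	terminal := roundTripperFunc(func(req *http.Request) (*http.Response, error) {
		seen = req.Header.Clone()
		return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody}, nil
	})
	// The auth stub runs after header forwarding, so its Set would overwrite
	// any same-named header injected earlier in the chain.
	auth := roundTripperFunc(func(req *http.Request) (*http.Response, error) {
		out := req.Clone(req.Context())
		out.Header.Set("Authorization", "Bearer stub-token")
		return terminal.RoundTrip(out)
	})
	chain, err := newHeaderForwardTripper(auth, headers)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodGet, "http://backend.example/mcp", nil)
	if err != nil {
		return nil, err
	}
	resp, err := chain.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	resp.Body.Close()
	return seen, nil
}

func main() {
	seen, err := forwardOnce(map[string]string{"X-MCP-Toolsets": "repos,issues"})
	if err != nil {
		panic(err)
	}
	fmt.Println(seen.Get("X-MCP-Toolsets"), seen.Get("Authorization"))
}
```

Validating at construction time means a restricted header fails the client build rather than each request, which matches the "resolve once at client-factory time" behaviour described in the round-tripper bullet.
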
Low level
| File | Change |
| --- | --- |
| pkg/vmcp/types.go | Add HeaderForwardConfig (no kubebuilder markers; runtime-only); add HeaderForward field on Backend and BackendTarget. |
| pkg/vmcp/registry.go | BackendToTarget copies HeaderForward. |
| pkg/vmcp/health/monitor.go | Carry HeaderForward, CABundlePath, CABundleData into the health-check BackendTarget. |
| pkg/vmcp/client/header_forward.go | New: headerForwardRoundTripper, buildHeaderForwardTripper, resolveHeaderForward. |
| pkg/vmcp/client/header_forward_test.go | Round-tripper, resolver, restricted-header rejection, end-to-end httptest.Server test. |
| pkg/vmcp/client/client.go | Insert tripper in chain (identity → headerForward → auth → http); add secretsProvider field. |
| pkg/vmcp/cli/header_forward_env.go | New: readHeaderForwardFromEnv plus header-name suffix splitter. |
| pkg/vmcp/cli/header_forward_env_test.go | Plaintext/secret/mixed/stray/multi-underscore-header table-driven tests. |
| pkg/vmcp/cli/serve.go | Build per-backend HeaderForward map from env in static-mode bootstrap; pass to discoverer. |
| pkg/vmcp/aggregator/discoverer.go | New headerForwardByBackend field on backendDiscoverer; constructor parameter; lookup in discoverFromStaticConfig. |
| pkg/vmcp/aggregator/discoverer_test.go | Pass nil for new parameter at four existing call sites. |
| cmd/thv-operator/api/v1beta1/mcpserverentry_types.go | HeaderSecretRefsValidated condition + reasons. |
| cmd/thv-operator/controllers/mcpserverentry_controller.go | validateHeaderForwardSecretRefs wired into reconcile. |
| cmd/thv-operator/controllers/virtualmcpserver_deployment.go | Extend buildHeaderForwardEnvVarsForEntries to emit plaintext env vars in sorted order alongside secret refs. |
| cmd/thv-operator/pkg/controllerutil/externalauth.go | Extract normalizeHeaderForEnvVar; add GenerateHeaderForwardPlaintextEnvVarName; rename proxyName → ownerName so MCPRemoteProxy and MCPServerEntry share one helper. |

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change
  • Refactoring
  • Documentation
  • Other

Test plan

  • task build passes
  • task test passes for all touched packages
  • New unit tests:
  • task operator-manifests, task operator-generate, task crdref-gen produce zero diff under deploy/charts/operator-crds/ and docs/operator/crd-api.md
  • Zero non-test changes under pkg/vmcp/config/

Special notes for reviewers

  • Why not the wrapper from #5238 (Introduce RuntimeConfig wrapper for vMCP ConfigMap surface)? That PR set up runtime.Config so future operator-resolved sidecar fields could ship via the ConfigMap without leaking into the CRD. After investigation, env vars are a better fit for headerForward specifically: MCPRemoteProxy already uses env vars for the single-backend version of the same problem, and TOOLHIVE_OTEL_HEADER_* already establishes plaintext-header-via-env in this codebase. The wrapper stays empty for a future field that genuinely benefits from YAML co-location with user-authored vMCP config.
  • Plaintext-via-env exposure: literal header values land in kubectl describe pod output instead of kubectl get configmap. Both are RBAC-gated under similar verbs. Truly sensitive values still ride valueFrom.secretKeyRef and never enter the operator's view of the world. Plaintext header values are by definition non-secret — that's why the user chose plaintext.
  • Header name normalization is one-way: the runtime sees normalized header names (uppercased, hyphens to underscores). HTTP header matching is canonical-case-insensitive so this is safe for the round tripper's http.Header.Set (which canonicalizes regardless).
  • Implementation plan: an approved design doc was produced before this PR (committed as DESIGN.md during development, removed before final commit since .claude/rules/pr-creation.md keeps planning artifacts out of the PR diff).

Generated with Claude Code

MCPServerEntry.spec.headerForward.{addPlaintextHeaders,addHeadersFromSecret}
was accepted by the CRD but never sent on outbound requests when the entry
was consumed as a static backend of a VirtualMCPServer. Only MCPRemoteProxy
forwarded the field; vMCP requests to remoteUrl arrived without the
configured headers — breaking use cases like GitHub Copilot's X-MCP-Toolsets
multi-toolset selection.

Closes #4996.

Mirror the MCPRemoteProxy pattern using pod env vars as the operator-to-vMCP
delivery channel. Zero changes to pkg/vmcp/config and zero CRD/docs diff —
the per-backend HeaderForward data never enters the ConfigMap or any
CRD-reachable type:

- Operator: extend buildHeaderForwardEnvVarsForEntries to emit one env var
  per (entry, header) for both plaintext and secret-backed headers. Plaintext
  values land as literal env vars named TOOLHIVE_HEADER_PLAINTEXT_<H>_<E>;
  secret-backed values keep the existing TOOLHIVE_SECRET_HEADER_FORWARD_<H>_<E>
  via valueFrom.secretKeyRef. Header normalization is shared via the new
  normalizeHeaderForEnvVar helper so the secret and plaintext branches
  cannot diverge.

- Runtime: walk os.Environ at vMCP startup in static mode (readHeaderForwardFromEnv)
  to reconstruct map[backendName]*HeaderForwardConfig. Plaintext entries
  land in AddPlaintextHeaders; secret entries land in AddHeadersFromSecret
  carrying only the identifier (resolved later via secrets.EnvironmentProvider
  inside resolveHeaderForward at client-factory time). Stray env vars whose decoded
  entry name doesn't match a known static backend are dropped.

- Discoverer: NewUnifiedBackendDiscovererWithStaticBackends gains a
  headerForwardByBackend parameter; discoverFromStaticConfig attaches the
  matching map entry to Backend.HeaderForward.

- Round tripper (pkg/vmcp/client/header_forward.go): inserted between
  identity (outer) and auth (inner). Resolves headers once at client-factory
  time via secrets.EnvironmentProvider; rejects restricted headers via the
  shared pkg/transport/middleware.RestrictedHeaders set; auth always wins
  over user-supplied headers because it runs after this tripper.

- Health monitor: BackendTarget construction now carries HeaderForward,
  CABundlePath, and CABundleData so health probes hit backends with the
  same TLS trust and header injection as list/call traffic.

- MCPServerEntry reconciler: HeaderSecretRefsValidated condition flips the
  entry to Failed when a referenced Secret is missing, matching
  MCPRemoteProxy's HeaderSecretNotFound reason. GenerateHeaderForwardSecretEnvVarName
  now takes ownerName rather than proxyName so both MCPRemoteProxy and
  MCPServerEntry share one source of truth.

- Dynamic mode (K8s API discovery): pkg/vmcp/workloads/k8s.go::mcpServerEntryToBackend
  is unchanged — it reads headerForward directly from the MCPServerEntry CRD
  at backend-construction time, so no env-var path is needed there.

Acceptance gates verified: zero diff under deploy/charts/operator-crds/ and
docs/operator/crd-api.md after task operator-manifests + task operator-generate
+ task crdref-gen. Zero non-test changes under pkg/vmcp/config/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

@github-actions github-actions Bot left a comment

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label May 9, 2026
@codecov

codecov Bot commented May 9, 2026

Codecov Report

❌ Patch coverage is 70.06803% with 88 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.92%. Comparing base (9211a36) to head (0fff334).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...-operator/controllers/mcpserverentry_controller.go | 43.61% | 51 Missing and 2 partials ⚠️ |
| ...perator/controllers/virtualmcpserver_deployment.go | 73.91% | 9 Missing and 9 partials ⚠️ |
| pkg/vmcp/client/header_forward.go | 76.92% | 9 Missing and 6 partials ⚠️ |
| pkg/vmcp/client/client.go | 60.00% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5239      +/-   ##
==========================================
+ Coverage   67.91%   67.92%   +0.01%     
==========================================
  Files         610      615       +5     
  Lines       62522    63001     +479     
==========================================
+ Hits        42464    42796     +332     
- Misses      16879    16997     +118     
- Partials     3179     3208      +29     


Manual verification on a Kind cluster surfaced a real bug in the env-var
encoding: the operator was emitting one TOOLHIVE_HEADER_PLAINTEXT_<H>_<E>
env var per (entry, header) pair. The runtime parsed the env-var name
back into (header, entry) segments, but the env-var name encoding
upper-snakes the header name (X-MCP-Toolsets becomes X_MCP_TOOLSETS),
and there's no way to recover the original casing/punctuation from the
env-var name alone. The fix shipped headers — but with the WRONG names,
which would silently break GitHub Copilot's X-MCP-Toolsets selector
(the actual reproducer in #4996).

Replace the per-(entry, header) plaintext env vars with one JSON-encoded
manifest env var per backend named TOOLHIVE_HEADER_FORWARD_<entry>. The
JSON value carries every configured header with original user-authored
names preserved. Plaintext values appear inline; secret-backed entries
carry only the secret identifier — the actual Secret value still rides
the existing TOOLHIVE_SECRET_HEADER_FORWARD_<H>_<E> env var via
valueFrom.secretKeyRef and never enters the operator's view of the
world.
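
As a rough sketch of what such a manifest could look like: the struct below is a hypothetical stand-in for vmcp.HeaderForwardConfig, and its field names, JSON tags, the `GITHUB` entry segment, and the `api-key-ref` identifier are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// headerForwardConfig is a hypothetical stand-in for vmcp.HeaderForwardConfig.
// Field names and JSON tags are assumptions, not the PR's wire format.
type headerForwardConfig struct {
	// Header name -> literal value, with the user-authored name preserved.
	AddPlaintextHeaders map[string]string `json:"addPlaintextHeaders,omitempty"`
	// Header name -> secret identifier; the secret value itself is not here.
	AddHeadersFromSecret map[string]string `json:"addHeadersFromSecret,omitempty"`
}

// encodeManifest renders the config as the env-var value. encoding/json
// writes map keys in sorted order, so the output is stable across calls.
func encodeManifest(cfg headerForwardConfig) (string, error) {
	raw, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

// decodeManifest is the runtime-side inverse.
func decodeManifest(value string) (headerForwardConfig, error) {
	var cfg headerForwardConfig
	err := json.Unmarshal([]byte(value), &cfg)
	return cfg, err
}

func main() {
	value, err := encodeManifest(headerForwardConfig{
		AddPlaintextHeaders:  map[string]string{"X-Trace-Id": "abc", "X-MCP-Toolsets": "repos,issues"},
		AddHeadersFromSecret: map[string]string{"X-Api-Key": "api-key-ref"},
	})
	if err != nil {
		panic(err)
	}
	// "GITHUB" is a hypothetical normalized entry segment.
	fmt.Printf("TOOLHIVE_HEADER_FORWARD_%s=%s\n", "GITHUB", value)
}
```

Because the header names live in the JSON value rather than in the env-var name, their casing and punctuation survive the round trip unchanged.
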

Operator-side:
- Drop GenerateHeaderForwardPlaintextEnvVarName.
- Add GenerateHeaderForwardManifestEnvVarName +
  HeaderForwardManifestEnvVarPrefix.
- Rewrite buildHeaderForwardEnvVarsForEntries to emit one JSON manifest
  per backend (json.Marshal sorts map keys alphabetically, giving
  deterministic Deployment-spec rendering).
- New private headerForwardManifest type mirrors vmcp.HeaderForwardConfig
  at the wire level so the runtime can json.Unmarshal directly.
- Update the test fixtures to assert the new env-var shape with JSONEq.

Runtime-side:
- Replace prefix-walk + suffix-split logic in readHeaderForwardFromEnv
  with a focused walk over TOOLHIVE_HEADER_FORWARD_* env vars +
  json.Unmarshal into vmcp.HeaderForwardConfig directly.
- Drop splitHeaderEntrySuffix (no longer relevant — the env-var name
  carries only the entry segment).
- Update test fixtures with the new JSON shape; new case for malformed
  manifest skipping; new case for secret-env-var-only-no-manifest
  defensive path.
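
The runtime-side walk might look roughly like the sketch below. It is an illustration under assumptions: `readManifests` and `headerForwardConfig` are hypothetical names, the JSON tags are guessed, and logging the malformed case with slog is a choice made for this example. The function takes the environment as a slice in os.Environ() format so it can be exercised without touching the process environment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log/slog"
	"strings"
)

const manifestPrefix = "TOOLHIVE_HEADER_FORWARD_"

// headerForwardConfig is a hypothetical stand-in for vmcp.HeaderForwardConfig.
type headerForwardConfig struct {
	AddPlaintextHeaders  map[string]string `json:"addPlaintextHeaders,omitempty"`
	AddHeadersFromSecret map[string]string `json:"addHeadersFromSecret,omitempty"`
}

// readManifests scans "KEY=value" pairs (the os.Environ() format) and decodes
// every manifest env var, keyed by the entry segment after the prefix.
func readManifests(environ []string) map[string]*headerForwardConfig {
	out := make(map[string]*headerForwardConfig)
	for _, kv := range environ {
		name, value, ok := strings.Cut(kv, "=")
		if !ok || !strings.HasPrefix(name, manifestPrefix) {
			continue
		}
		entry := strings.TrimPrefix(name, manifestPrefix)
		if entry == "" {
			continue // prefix with no entry segment: nothing to key on
		}
		var cfg headerForwardConfig
		if err := json.Unmarshal([]byte(value), &cfg); err != nil {
			// Skip malformed manifests instead of failing startup.
			slog.Warn("skipping malformed header-forward manifest", "env", name, "error", err)
			continue
		}
		out[entry] = &cfg
	}
	return out
}

func main() {
	manifests := readManifests([]string{
		`TOOLHIVE_HEADER_FORWARD_GITHUB={"addPlaintextHeaders":{"X-MCP-Toolsets":"repos,issues"}}`,
		"TOOLHIVE_SECRET_HEADER_FORWARD_X_API_KEY_GITHUB=s3cret", // not a manifest
		"TOOLHIVE_HEADER_FORWARD_BROKEN={not json",               // skipped with a warning
		"PATH=/usr/bin",
	})
	for entry, cfg := range manifests {
		fmt.Println(entry, cfg.AddPlaintextHeaders)
	}
}
```

In real use the caller would pass os.Environ(); injecting the slice keeps the parsing logic testable in a table-driven style.
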

Manual verification on Kind:
- Echo backend now records X-MCP-Toolsets, X-Trace-Id, X-Api-Key with
  ORIGINAL casing/punctuation on every outbound request.
- Acceptance gates still green: zero diff under deploy/charts/operator-crds/
  and docs/operator/crd-api.md after task operator-manifests +
  task operator-generate + task crdref-gen.

Add docs/manual-verification/headerforward-kind.md and a co-located
manifest.yaml so anyone can reproduce the integration test locally
end-to-end on Kind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels May 9, 2026
Two parallel architectural reviews (code-reviewer + go-architect) flagged
five concrete cuts that don't change behavior. Apply them in one commit:

(1) Export ctrlutil.NormalizeHeaderForEnvVar and delete the duplicate
    normalizeForEnvSegment in pkg/vmcp/cli/header_forward_env.go. The
    "avoiding ctrlutil's regexp surface" justification was wrong: ctrlutil
    is already imported. Saves ~19 lines plus the duplication risk that
    .claude/rules/go-style.md "Avoid parallel types that drift" warns
    about.

(2) Delete headerForwardManifest in virtualmcpserver_deployment.go.
    It mirrored vmcp.HeaderForwardConfig at the wire level but was
    structurally identical — exactly the parallel-types-drift trap. The
    operator now imports vmcp directly and marshals
    *vmcp.HeaderForwardConfig, removing the manual field-copy loop too.

(3) Delete HeaderForwardSecretEnvVarPrefix and the !HasPrefix
    disambiguation in the runtime walker. Phantom check: the secret
    env-var prefix is "TOOLHIVE_SECRET_HEADER_FORWARD_" and the manifest
    prefix is "TOOLHIVE_HEADER_FORWARD_"; HasPrefix(manifest) is mutually
    exclusive with the secret prefix, so the negative check was
    reasoning about an unreachable state.

(4) Drop the staticBackendNames filter from readHeaderForwardFromEnv.
    Defensive coding without a threat model — the operator is the sole
    writer of these env vars; if a stray manifest env var exists, that's
    an operator bug and the discoverer's existing backend-resolution
    path handles it. The walker now returns a map keyed by the
    normalized entry segment from the env-var suffix; the discoverer
    normalizes Backend.Name through ctrlutil.NormalizeHeaderForEnvVar
    before indexing. Removes a parameter, simplifies the serve.go
    plumbing, drops one defensive subtest.

(5) Test cleanups:
    - TestReadHeaderForwardFromEnv "secret env var alone" subtest
      replaced with a richer "secret env var must not be parsed as a
      manifest" subtest that's now a positive assertion on the manifest
      walker (not testing an impossible code path).
    - Drop TestBuildHeaderForwardEnvVarsForEntries deterministic-across-
      reconciles subtest (~33 lines): asserts that json.Marshal sorts
      map keys, which is testing the stdlib, not our code. Existing
      plaintext-headers case already pins the JSON output exactly.
    - Drop TestHeaderForwardRoundTripper_EndToEndHTTPTestServer (~40
      lines): captureTripper IS the receiver in the existing tests, so
      a real httptest.Server adds no coverage.
    - Replace `_ = fmt.Errorf(...)` no-op in readHeaderForwardFromEnv
      with slog.Warn so malformed manifests are actually surfaced.

Net: ~185 lines removed, ~55 lines added (the directly-marshalled
HeaderForwardConfig plus the slog.Warn). Behavior identical end-to-end:
verified on Kind that all three headers (X-MCP-Toolsets, X-Trace-Id,
X-Api-Key) still arrive at the echo backend with original casing.

Acceptance gates still green: zero diff under deploy/charts/operator-crds/
and docs/operator/crd-api.md after task operator-manifests +
task operator-generate + task crdref-gen.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels May 9, 2026
The reproduction guide is local-only — it's a personal scratch artefact,
not something the project wants checked in. Kept on disk at
~/Documents/toolhive-headerforward-kind-test/ for follow-on iterations.
The PR itself stays focused on the code change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels May 9, 2026
Trailing newline left over after removing the EndToEndHTTPTestServer
test in the previous commit. No semantic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels May 9, 2026
Single follow-up commit closing nine review findings on the
MCPServerEntry headerForward → vMCP wiring. F7 (parallel
HeaderForwardConfig types) is intentionally deferred.

- F1: Move env-var wire-format helpers from cmd/thv-operator/pkg/
  controllerutil to a neutral pkg/vmcp/headerforward/wirefmt package
  so the producer (operator) and consumer (vMCP runtime) share one
  contract without inverting the cmd/ → pkg/ layering rule.
- F2: Wire HeaderForward through dynamic-mode mcpServerEntryToBackend;
  parity test asserts the JSON marshal of the discovered config equals
  the static-mode manifest emitted by the operator.
- F3: Add a Secret get/list/watch RBAC marker on the MCPServerEntry
  reconciler so the in-cluster role permits the new validator's reads.
- F4 + F12: Sort env vars deterministically before applying to the
  Deployment to avoid the informer-cache update loop hazard, with a
  shuffled-input determinism test.
- F6: Thread context.Context through buildHeaderForwardTripper and
  resolveHeaderForward so secret-provider lookups participate in
  cancellation and tracing.
- F8 + F9 + F15: Watch referenced Secrets via a field index, validate
  that the named key exists inside the Secret (not just that the
  Secret itself exists), and aggregate all per-ref failures into one
  condition message instead of returning on the first failure.
- F10: Iterate the full middleware.RestrictedHeaders set in the
  rejection test (was a hardcoded subset) and fix a misleading subtest
  name.
- F11: Extend the test mockProvider with per-key error injection and
  verify non-NotFound provider errors propagate with %w.
- F16: Warn loudly when readHeaderForwardFromEnv encounters two
  manifest env vars sharing the same normalized owner segment, and
  document the collision domain on wirefmt.NormalizeForEnvVar.
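
The determinism fix in F4 + F12 can be illustrated with a minimal sketch. `envVar` is a hypothetical stand-in for the Kubernetes EnvVar type and `sortedNames` is an illustrative helper; the assumption is that env-var names are unique, so sorting by name alone yields one canonical order regardless of how the input was assembled.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// envVar is a minimal stand-in for the Kubernetes EnvVar type.
// (Assumption: names are unique, so ordering by name is total.)
type envVar struct{ Name, Value string }

// sortedNames returns the env-var names in canonical order, joined for easy
// comparison. It sorts a copy so the caller's slice is left untouched.
func sortedNames(vars []envVar) string {
	sorted := append([]envVar(nil), vars...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	names := make([]string, len(sorted))
	for i, v := range sorted {
		names[i] = v.Name
	}
	return strings.Join(names, ",")
}

func main() {
	first := []envVar{
		{Name: "TOOLHIVE_SECRET_HEADER_FORWARD_X_API_KEY_GITHUB"},
		{Name: "TOOLHIVE_HEADER_FORWARD_GITHUB"},
		{Name: "TOOLHIVE_HEADER_FORWARD_ATLASSIAN"},
	}
	// Same variables, different arrival order (as map iteration might produce).
	second := []envVar{first[2], first[0], first[1]}

	fmt.Println(sortedNames(first) == sortedNames(second)) // true
}
```

If the rendered order varied between reconciles, each pass would produce a Deployment spec that differs from the cached one and trigger another update, which is presumably the loop hazard the commit refers to.
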

Also adds an end-to-end test through NewHTTPBackendClient →
defaultClientFactory with httptest.Server that resolves a secret-backed
header via secrets.EnvironmentProvider and asserts the test server
received both the plaintext and resolved-secret headers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels May 9, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels May 10, 2026
Development

Successfully merging this pull request may close these issues.

MCPServerEntry.spec.headerForward is accepted by CRD but never sent with requests by VirtualMCPServer
