feat: Placeholder-based credential isolation — agents never see real tokens #336

@huang195

Summary

Agents should never see real bearer tokens. AuthBridge should replace the inbound Authorization header with an opaque placeholder after validation, and resolve that placeholder back to an exchanged token on the outbound path. The agent only ever sees a meaningless reference — credential handling is fully transparent.

Inspired by NVIDIA OpenShell's credential isolation model, where the agent process never has access to any real credential.

Current Flow

User -> [Authorization: Bearer <user-token>] -> AuthBridge sidecar (validates JWT)
    -> [Authorization: Bearer <user-token>] -> Agent container  <-- agent sees the token
    -> Agent makes outbound call [Authorization: Bearer <user-token>]
    -> AuthBridge sidecar (token exchange) -> [Authorization: Bearer <exchanged-token>] -> Tool

The agent receives and can read the user's bearer token. A compromised agent (prompt injection, malicious tool, dependency vulnerability) can exfiltrate or misuse it.

Proposed Flow

Inbound:
  User -> [Authorization: Bearer <user-token>]
       -> AuthBridge sidecar validates JWT, extracts claims
       -> generates opaque placeholder: kagenti:ref:<uuid>
       -> stores uuid -> {claims, act chain, expiry} in shared memory
       -> replaces header: [Authorization: Bearer kagenti:ref:<uuid>]
       -> forwards to Agent container  <-- agent only sees placeholder

Outbound:
  Agent makes call -> [Authorization: Bearer kagenti:ref:<uuid>]
       -> AuthBridge sidecar intercepts
       -> looks up uuid in shared memory -> retrieves cached claims
       -> performs token exchange using claims + agent SPIFFE identity
       -> replaces header: [Authorization: Bearer <exchanged-token>]
       -> forwards to Tool

The agent never touches a real token. It receives and propagates an opaque placeholder naturally (most HTTP frameworks propagate Authorization headers), and the sidecar resolves it on the way out.
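The placeholder format above can be sketched in Go as follows. This is a minimal illustration, not authlib code: `newPlaceholder` and `parsePlaceholder` are hypothetical helper names, and generating a version 4 UUID from `crypto/rand` is an assumption, since the proposal does not say how the UUID is produced.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"strings"
)

// placeholderPrefix follows the format in the proposal: kagenti:ref:<uuid>.
const placeholderPrefix = "kagenti:ref:"

// newPlaceholder returns a placeholder built from a random version 4 UUID.
// Hypothetical helper; the UUID generation method is an assumption.
func newPlaceholder() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version field to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%s%x-%x-%x-%x-%x",
		placeholderPrefix, b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// parsePlaceholder extracts the UUID from an Authorization header value of
// the form "Bearer kagenti:ref:<uuid>". It reports false for anything else,
// including real bearer tokens.
func parsePlaceholder(authHeader string) (string, bool) {
	const scheme = "Bearer "
	if !strings.HasPrefix(authHeader, scheme) {
		return "", false
	}
	token := strings.TrimPrefix(authHeader, scheme)
	if !strings.HasPrefix(token, placeholderPrefix) {
		return "", false
	}
	id := strings.TrimPrefix(token, placeholderPrefix)
	return id, id != ""
}
```

The inbound path would call `newPlaceholder` after validation, and the outbound path would call `parsePlaceholder` to decide whether a header carries a reference or something else.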

Why Placeholders (Not Just Strip-and-Cache)

With concurrent requests from multiple users (or multiple requests from the same user), there is no reliable way to correlate an outbound request back to the inbound request that triggered it. The placeholder solves this:

  • Each inbound request gets a unique placeholder UUID
  • The agent framework naturally propagates it in the Authorization header to outbound calls
  • The outbound path uses the placeholder to look up the exact claims for that specific request
  • No ambiguity, no race conditions, no single-user assumption
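To illustrate the correlation argument, here is a small Go sketch in which many simulated requests run concurrently, each keyed by its own placeholder. The function name, the `req-N` identifiers, and the use of a plain user string in place of cached claims are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// resolveConcurrently simulates n inbound requests arriving at once. Each
// request stores its own user under its own placeholder key, then looks the
// key up again as the outbound path would. It returns the number of lookups
// that came back missing or with the wrong user.
func resolveConcurrently(n int) int {
	var store sync.Map // placeholder -> user (stand-in for cached claims)
	var wg sync.WaitGroup
	var mu sync.Mutex
	mismatches := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Hypothetical placeholder; a real one would carry a random UUID.
			placeholder := fmt.Sprintf("kagenti:ref:req-%d", i)
			user := fmt.Sprintf("user-%d", i)
			store.Store(placeholder, user)

			got, ok := store.Load(placeholder)
			if !ok || got.(string) != user {
				mu.Lock()
				mismatches++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return mismatches
}
```

Because every lookup is keyed by the placeholder the request itself carries, no request can observe another request's entry, regardless of scheduling order.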

This also resolves the agent framework propagation concern from #174: frameworks already propagate Authorization headers, and with this change those headers carry a placeholder instead of a real token.

What Changes in authlib

Inbound (HandleInbound)

After validation succeeds:

  1. Generate a placeholder: kagenti:ref:<uuid>
  2. Store uuid -> {original claims, act chain, expiry} in a shared in-memory store
  3. Return a new action (e.g., ActionReplaceAndAllow) signaling the listener to replace the Authorization header value with the placeholder

Outbound (HandleOutbound)

When the Authorization header contains a kagenti:ref:* value:

  1. Extract the UUID, look up claims in the shared store
  2. Perform token exchange using cached claims + agent SPIFFE identity (preserving act chain)
  3. Replace the placeholder with the exchanged token via existing ActionReplaceToken
  4. If the UUID is not found or expired, deny the request (fail-closed)

When no Authorization header is present, fall back to existing noTokenPolicy behavior (unchanged).

Shared Memory Store

The store is a sync.Map (or an equivalent concurrent map) in the authlib Go process, keyed by placeholder UUID and holding the validated claims, act chain, and expiry. Both the inbound and outbound code paths in auth.Auth already share the same struct instance, so the store is naturally shared.

Open Design Questions

1. Where does the shared store live in envoy-sidecar mode?

In proxy-sidecar mode, a single Go process handles both directions — a sync.Map works directly. In envoy-sidecar mode, inbound and outbound are separate ext_proc filter invocations. If the same go-processor binary handles both, the in-memory store still works. If they are separate processes, a shared Unix socket or pod-local gRPC service would be needed.

2. Placeholder TTL

The placeholder must live long enough for the agent to process the request and make outbound calls. Options:

  • Fixed TTL (e.g., 60s) — simple but may be too short for complex chains
  • Tied to inbound request lifetime — evict when the inbound response completes
  • Configurable per-agent — longer for agents that do multi-step reasoning

3. Streaming and long-running agent calls

If the agent takes minutes to process (complex LLM chain with multiple tool calls), the placeholder must live at least that long. A request-lifetime-scoped eviction may be more robust than a fixed TTL.

4. Multiple outbound calls per inbound request

An agent may make several tool calls for a single inbound request, all carrying the same placeholder. The store must support multiple lookups per UUID (read-many, not pop-on-read). Eviction should happen after the inbound request completes, not after the first outbound resolution.

5. Placeholder leakage

The UUID is opaque — no claims, scopes, or original token can be derived from it. Even if exfiltrated, it is only resolvable within the sidecar's in-memory store inside that specific pod. However, if the agent logs or persists the placeholder, it could theoretically be replayed within the TTL window. Consider whether the store should also bind to source IP or other request attributes.

How This Differs from #174

Issue #174 investigates getting agent frameworks to explicitly propagate the inbound token to outbound calls. This proposal makes that unnecessary — frameworks already propagate the Authorization header naturally, so the placeholder flows through without any framework-specific integration. These are complementary: #174 is for frameworks that want explicit token awareness; this solves it transparently at the infrastructure layer.

Acceptance Criteria

  • Inbound: real Authorization header is replaced with an opaque placeholder after validation
  • Inbound: validated claims are stored in shared memory keyed by placeholder UUID
  • Outbound: placeholder is resolved to cached claims, token exchange is performed
  • Outbound: exchanged token is injected into the Authorization header
  • Unresolvable or expired placeholders are denied (fail-closed)
  • Concurrent multi-user requests are correctly isolated
  • Works with both envoy-sidecar and proxy-sidecar modes
  • Act claim chains are preserved across the placeholder boundary
  • Feature-flagged and disabled by default
