feat: Placeholder-based credential isolation — agents never see real tokens

## Summary

Agents should never see real bearer tokens. After validating an inbound request, AuthBridge should replace the `Authorization` header value with an opaque placeholder, then resolve that placeholder to an exchanged token on the outbound path. The agent only ever sees a meaningless reference, so credential handling stays fully transparent to it.

Inspired by [NVIDIA OpenShell's credential isolation model](https://github.com/NVIDIA/OpenShell), where the agent process never has access to any real credential.

## Current Flow

```
User -> [Authorization: Bearer <user-token>] -> AuthBridge sidecar (validates JWT)
    -> [Authorization: Bearer <user-token>] -> Agent container  <-- agent sees the token
    -> Agent makes outbound call [Authorization: Bearer <user-token>]
    -> AuthBridge sidecar (token exchange) -> [Authorization: Bearer <exchanged-token>] -> Tool
```

The agent receives and can read the user's bearer token. A compromised agent (prompt injection, malicious tool, dependency vulnerability) can exfiltrate or misuse it.

## Proposed Flow

```
Inbound:
  User -> [Authorization: Bearer <user-token>]
       -> AuthBridge sidecar validates JWT, extracts claims
       -> generates opaque placeholder: kagenti:ref:<uuid>
       -> stores uuid -> {claims, act chain, expiry} in shared memory
       -> replaces header: [Authorization: Bearer kagenti:ref:<uuid>]
       -> forwards to Agent container  <-- agent only sees placeholder

Outbound:
  Agent makes call -> [Authorization: Bearer kagenti:ref:<uuid>]
       -> AuthBridge sidecar intercepts
       -> looks up uuid in shared memory -> retrieves cached claims
       -> performs token exchange using claims + agent SPIFFE identity
       -> replaces header: [Authorization: Bearer <exchanged-token>]
       -> forwards to Tool
```

The agent never touches a real token. It receives and propagates an opaque placeholder naturally (most HTTP frameworks propagate `Authorization` headers), and the sidecar resolves it on the way out.

## Why Placeholders (Not Just Strip-and-Cache)

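With strip-and-cache alone, the sidecar would have to guess which cached token belongs to a given outbound request. With concurrent requests from multiple users (or multiple requests from the same user), there is no reliable way to correlate an outbound request back to the inbound request that triggered it. The placeholder solves this: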

- Each inbound request gets a **unique placeholder UUID**
- The agent framework naturally propagates it in the `Authorization` header to outbound calls
- The outbound path uses the placeholder to look up the exact claims for that specific request
- No ambiguity, no race conditions, no single-user assumption

This also resolves the agent framework propagation concern from #174: frameworks already propagate `Authorization` headers, and with this change those headers carry a placeholder instead of a real token.
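
The correlation argument above can be illustrated with a small simulation. This is a minimal sketch, not authlib code: `inbound`, `outbound`, and `runConcurrent` are hypothetical names, a plain subject string stands in for the full claims set, and deterministic IDs replace random UUIDs so the run is reproducible.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

const prefix = "kagenti:ref:"

// store maps placeholder IDs to the subject of the validated inbound token.
// A plain string stands in for the full claims set in this sketch.
var store sync.Map

// inbound records the subject under the request's own ID and returns the
// placeholder the agent would see.
func inbound(id, subject string) string {
	store.Store(id, subject)
	return prefix + id
}

// outbound resolves a placeholder back to the subject recorded for that
// specific request. ok is false for malformed or unknown placeholders.
func outbound(placeholder string) (subject string, ok bool) {
	id, isRef := strings.CutPrefix(placeholder, prefix)
	if !isRef || id == "" {
		return "", false
	}
	v, found := store.Load(id)
	if !found {
		return "", false
	}
	return v.(string), true
}

// runConcurrent simulates n users whose requests interleave and returns how
// many outbound calls resolved to the wrong subject.
func runConcurrent(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	mismatches := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			subject := fmt.Sprintf("user-%d", i)
			ph := inbound(fmt.Sprintf("id-%d", i), subject)
			got, ok := outbound(ph)
			if !ok || got != subject {
				mu.Lock()
				mismatches++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return mismatches
}

func main() {
	fmt.Println("mismatches:", runConcurrent(100))
}
```

Because each outbound lookup is keyed by the placeholder it carries, the interleaving of goroutines does not matter. A strip-and-cache design would need some other key for that lookup.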

## What Changes in authlib

### Inbound (`HandleInbound`)

After validation succeeds:

1. Generate a placeholder: `kagenti:ref:<uuid>`
2. Store `uuid -> {original claims, act chain, expiry}` in a shared in-memory store
3. Return a new action (e.g., `ActionReplaceAndAllow`) signaling the listener to replace the `Authorization` header value with the placeholder

### Outbound (`HandleOutbound`)

When the `Authorization` header contains a `kagenti:ref:*` value:

1. Extract the UUID, look up claims in the shared store
2. Perform token exchange using cached claims + agent SPIFFE identity (preserving `act` chain)
3. Replace the placeholder with the exchanged token via existing `ActionReplaceToken`
4. If the UUID is not found or expired, **deny the request** (fail-closed)

When no `Authorization` header is present, fall back to existing `noTokenPolicy` behavior (unchanged).

### Shared Memory Store

A `sync.Map` (or equivalent concurrent map) in the authlib Go process, keyed by placeholder UUID, storing validated claims and expiry. Both the inbound and outbound code paths in `auth.Auth` already share the same struct instance, so the store is naturally shared.

## Open Design Questions

### 1. Where does the shared store live in envoy-sidecar mode?

In proxy-sidecar mode, a single Go process handles both directions — a `sync.Map` works directly. In envoy-sidecar mode, inbound and outbound are separate ext_proc filter invocations. If the same `go-processor` binary handles both, the in-memory store still works. If they are separate processes, a shared Unix socket or pod-local gRPC service would be needed.

### 2. Placeholder TTL

The placeholder must live long enough for the agent to process the request and make outbound calls. Options:
- Fixed TTL (e.g., 60s) — simple but may be too short for complex chains
- Tied to inbound request lifetime — evict when the inbound response completes
- Configurable per-agent — longer for agents that do multi-step reasoning

### 3. Streaming and long-running agent calls

If the agent takes minutes to process (complex LLM chain with multiple tool calls), the placeholder must live at least that long. A request-lifetime-scoped eviction may be more robust than a fixed TTL.

### 4. Multiple outbound calls per inbound request

An agent may make several tool calls for a single inbound request, all carrying the same placeholder. The store must support multiple lookups per UUID (read-many, not pop-on-read). Eviction should happen after the inbound request completes, not after the first outbound resolution.
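
A sketch of request-lifetime eviction with read-many lookups follows. It assumes the sidecar can observe when the inbound response completes, which may not hold in every deployment mode; `Register`, `Lookup`, and `handleInbound` are hypothetical names, and the `agent` callback stands in for forwarding the request and waiting for the response.

```go
package main

import (
	"fmt"
	"sync"
)

// Claims is a hypothetical stand-in for the validated JWT claims.
type Claims struct{ Subject string }

type Store struct{ m sync.Map }

// Register stores claims for one inbound request and returns a release
// function to call once the inbound response has completed.
func (s *Store) Register(id string, c Claims) (release func()) {
	s.m.Store(id, c)
	return func() { s.m.Delete(id) }
}

// Lookup is read-many: it never removes the entry.
func (s *Store) Lookup(id string) (Claims, bool) {
	v, ok := s.m.Load(id)
	if !ok {
		return Claims{}, false
	}
	return v.(Claims), true
}

// handleInbound keeps the placeholder alive exactly as long as the agent is
// working on the request, however many outbound calls it makes.
func handleInbound(s *Store, id string, c Claims, agent func(placeholderID string)) {
	release := s.Register(id, c)
	defer release() // evict when the inbound request completes
	agent(id)
}

func main() {
	s := &Store{}
	handleInbound(s, "abc", Claims{Subject: "alice"}, func(id string) {
		for i := 0; i < 3; i++ { // three tool calls for one inbound request
			c, ok := s.Lookup(id)
			fmt.Println(c.Subject, ok)
		}
	})
	_, ok := s.Lookup("abc")
	fmt.Println("after completion:", ok)
}
```

A maximum TTL may still be worth keeping as a backstop in case the completion signal is missed, for example when a connection is dropped without a clean close.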

### 5. Placeholder leakage

The UUID is opaque — no claims, scopes, or original token can be derived from it. Even if exfiltrated, it is only resolvable within the sidecar's in-memory store inside that specific pod. However, if the agent logs or persists the placeholder, it could theoretically be replayed within the TTL window. Consider whether the store should also bind to source IP or other request attributes.
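
If binding turns out to be worthwhile, the check itself is small. The sketch below is hypothetical throughout: `binding` is an unspecified request attribute recorded at inbound time, and which attribute is actually meaningful in a sidecar deployment (where outbound traffic from the agent may always originate from the same local address) is left open.

```go
package main

import (
	"fmt"
	"sync"
)

// boundEntry pairs the stored identity with the attribute it is bound to.
type boundEntry struct {
	subject string
	binding string // hypothetical request attribute recorded at inbound time
}

type Store struct{ m sync.Map }

func (s *Store) Put(id, subject, binding string) {
	s.m.Store(id, boundEntry{subject: subject, binding: binding})
}

// Resolve succeeds only when the caller presents the binding recorded at
// inbound time. A mismatch is indistinguishable from an unknown placeholder.
func (s *Store) Resolve(id, binding string) (subject string, ok bool) {
	v, found := s.m.Load(id)
	if !found {
		return "", false
	}
	e := v.(boundEntry)
	if e.binding != binding {
		return "", false
	}
	return e.subject, true
}

func main() {
	s := &Store{}
	s.Put("abc", "alice", "conn-1")
	fmt.Println(s.Resolve("abc", "conn-1"))
	fmt.Println(s.Resolve("abc", "conn-2"))
}
```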

## How This Differs from #174

Issue #174 investigates getting agent frameworks to explicitly propagate the inbound token to outbound calls. This proposal makes that unnecessary — frameworks already propagate the `Authorization` header naturally, so the placeholder flows through without any framework-specific integration. These are complementary: #174 is for frameworks that want explicit token awareness; this solves it transparently at the infrastructure layer.

## Acceptance Criteria

- [ ] Inbound: real `Authorization` header is replaced with an opaque placeholder after validation
- [ ] Inbound: validated claims are stored in shared memory keyed by placeholder UUID
- [ ] Outbound: placeholder is resolved to cached claims, token exchange is performed
- [ ] Outbound: exchanged token is injected into the `Authorization` header
- [ ] Unresolvable or expired placeholders are denied (fail-closed)
- [ ] Concurrent multi-user requests are correctly isolated
- [ ] Works with both envoy-sidecar and proxy-sidecar modes
- [ ] Act claim chains are preserved across the placeholder boundary
- [ ] Feature-flagged and disabled by default

