---
title: "Custom LLM Endpoint Configuration"
---

- Author(s): [@anna239](https://github.com/anna239), [@xtmq](https://github.com/xtmq)

> **Note:** This RFD is very preliminary and intended to start a dialog about this feature. The proposed design may change significantly based on feedback and further discussion.

## Elevator pitch

> What are you proposing to change?

Add the ability for clients to pass custom LLM endpoint URLs and authentication credentials to agents via a dedicated `setLlmEndpoints` method, with support for multiple providers. This allows clients to route LLM requests through their own infrastructure (proxies, gateways, or self-hosted models) without agents needing to know about this configuration in advance.

## Status quo

> How do things work today and what problems does this cause? Why would we change things?

Currently, agents are configured with their own LLM endpoints and credentials, typically through environment variables or configuration files. This creates problems for:

- **Client proxies**: Clients want to route agent traffic through their own proxies, e.g. to inject additional headers or capture logs
- **Enterprise deployments**: Organizations want to route LLM traffic through their own proxies for compliance, logging, or cost management
- **Self-hosted models**: Users running local LLM servers (vLLM, Ollama, etc.) cannot easily redirect agent traffic
- **API gateways**: Organizations using LLM gateways for rate limiting, caching, or multi-provider routing

## Shiny future

> How will things play out once this feature exists?

Clients will be able to:
1. Discover whether an agent supports custom LLM endpoints via capabilities during initialization
2. Decide how to configure and authorize the agent based on that knowledge
3. Pass custom LLM endpoint URLs and headers for different providers via a dedicated method
4. Have agent LLM requests automatically routed through the appropriate endpoint based on provider

## Implementation details and plan

> Tell me more about your implementation. What is your detailed implementation plan?

### Intended flow

The design uses a two-step approach: capability discovery during initialization, followed by endpoint configuration via a dedicated method. This enables the following flow:

```mermaid
sequenceDiagram
    participant Client
    participant Agent

    Client->>Agent: initialize
    Note right of Agent: Agent reports capabilities,<br/>including llmEndpoints support
    Agent-->>Client: initialize response<br/>(agentCapabilities.llmEndpoints = true)

    Note over Client: Client sees llmEndpoints capability.<br/>Performs configuration / authorization<br/>based on this knowledge.

    Client->>Agent: setLlmEndpoints
    Agent-->>Client: setLlmEndpoints response<br/>(accepted providers)

    Note over Client,Agent: Ready for session setup
    Client->>Agent: session/new
```

1. **Initialization**: The client calls `initialize`. The agent responds with its capabilities, including an `llmEndpoints` flag indicating support for custom endpoint configuration.
2. **Client-side decision**: The client inspects the capability. If the agent supports `llmEndpoints`, the client can perform authorization, resolve credentials, or configure endpoints accordingly. If the agent does not support it, the client falls back to a different authorization and configuration strategy.
3. **Endpoint configuration**: The client calls `setLlmEndpoints` with provider-to-endpoint mappings. The agent responds with the subset of providers it accepted.
4. **Session creation**: The client proceeds to create a session.
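The steps above can be sketched from the client's perspective. This is a minimal sketch assuming a hypothetical `rpc` helper that sends a JSON-RPC request and resolves with its `result`; the method names match the protocol, everything else is illustrative.

```typescript
// Hypothetical transport: sends one JSON-RPC request, resolves with `result`.
type Rpc = (method: string, params: unknown) => Promise<any>;

async function configureAgent(rpc: Rpc, endpoints: Record<string, unknown>) {
  // 1. Initialize and inspect the advertised capabilities.
  const init = await rpc("initialize", { protocolVersion: 1 });
  if (!init.agentCapabilities?.llmEndpoints) {
    // 2. Agent does not support custom endpoints: fall back to its defaults.
    return rpc("session/new", {});
  }
  // 3. Configure endpoints; the agent echoes back what it accepted.
  const { accepted } = await rpc("setLlmEndpoints", { endpoints });
  if (Object.keys(accepted).length === 0) {
    throw new Error("agent accepted no custom endpoints");
  }
  // 4. Proceed to session creation.
  return rpc("session/new", {});
}
```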

### Capability advertisement

The agent advertises support for custom LLM endpoints via a new capability flag in `agentCapabilities`:

```typescript
interface AgentCapabilities {
  // ... existing fields ...

  /** Whether the agent supports custom LLM endpoint configuration via setLlmEndpoints */
  llmEndpoints?: boolean;
}
```

**Initialize Response example:**
```json
{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "protocolVersion": 1,
    "agentInfo": {
      "name": "MyAgent",
      "version": "2.0.0"
    },
    "agentCapabilities": {
      "llmEndpoints": true,
      "sessionCapabilities": {}
    }
  }
}
```

### `setLlmEndpoints` method

A dedicated method that can be called after initialization but before session creation.

```typescript
/** Well-known LLM provider identifiers; more may be added later */
type LlmProvider = "anthropic" | "openai" | "google" | "amazon";

interface LlmEndpointConfig {
  /** Base URL for LLM API requests (e.g., "https://llm-proxy.corp.example.com/v1") */
  url: string;

  /**
   * Additional HTTP headers to include in LLM API requests.
   * Each entry is a header name mapped to its value.
   * Common use cases include Authorization, custom routing, or tracing headers.
   */
  headers?: Record<string, string> | null;

  /** Extension metadata */
  _meta?: Record<string, unknown>;
}

interface SetLlmEndpointsRequest {
  /**
   * Custom LLM endpoint configurations per provider.
   * When provided, the agent should route LLM requests to the appropriate endpoint
   * based on the provider being used.
   * This configuration is per-process and should not be persisted to disk.
   */
  endpoints: Partial<Record<LlmProvider, LlmEndpointConfig>>;

  /** Extension metadata */
  _meta?: Record<string, unknown>;
}

interface SetLlmEndpointsResponse {
  /**
   * The subset of the requested endpoint configurations that the agent accepts.
   * Only includes providers that the agent will actually use.
   * If empty, the agent did not accept any endpoints.
   */
  accepted: Partial<Record<LlmProvider, LlmEndpointConfig>>;

  /** Extension metadata */
  _meta?: Record<string, unknown>;
}
```
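On the agent side, computing the `accepted` map can be as simple as filtering the request against the providers the agent knows how to route to. A sketch, not prescribed by this RFD:

```typescript
// Mirrors the LlmEndpointConfig interface above (trimmed for the sketch).
interface LlmEndpointConfig {
  url: string;
  headers?: Record<string, string> | null;
}

// Keep only the providers this agent can actually use; everything else is
// dropped, which the client detects by inspecting the response.
function acceptEndpoints(
  supported: ReadonlySet<string>,
  endpoints: Record<string, LlmEndpointConfig>,
): Record<string, LlmEndpointConfig> {
  const accepted: Record<string, LlmEndpointConfig> = {};
  for (const [provider, config] of Object.entries(endpoints)) {
    if (supported.has(provider)) accepted[provider] = config;
  }
  return accepted;
}
```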

#### JSON Schema Additions

```json
{
  "$defs": {
    "LlmEndpointConfig": {
      "description": "Configuration for a custom LLM endpoint.",
      "properties": {
        "url": {
          "type": "string",
          "description": "Base URL for LLM API requests."
        },
        "headers": {
          "type": ["object", "null"],
          "description": "Additional HTTP headers to include in LLM API requests.",
          "additionalProperties": {
            "type": "string"
          }
        },
        "_meta": {
          "additionalProperties": true,
          "type": ["object", "null"]
        }
      },
      "required": ["url"],
      "type": "object"
    },
    "LlmEndpoints": {
      "description": "Map of provider identifiers to endpoint configurations. This configuration is per-process and should not be persisted to disk.",
      "type": "object",
      "additionalProperties": {
        "$ref": "#/$defs/LlmEndpointConfig"
      }
    }
  }
}
```

#### Example Exchange

**setLlmEndpoints Request:**
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "setLlmEndpoints",
  "params": {
    "endpoints": {
      "anthropic": {
        "url": "https://llm-gateway.corp.example.com/anthropic/v1",
        "headers": {
          "Authorization": "Bearer anthropic-token-abc123",
          "X-Request-Source": "my-ide"
        }
      },
      "openai": {
        "url": "https://llm-gateway.corp.example.com/openai/v1",
        "headers": {
          "Authorization": "Bearer openai-token-xyz789"
        }
      }
    }
  }
}
```

**setLlmEndpoints Response:**
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "accepted": {
      "anthropic": {
        "url": "https://llm-gateway.corp.example.com/anthropic/v1",
        "headers": {
          "Authorization": "Bearer anthropic-token-abc123",
          "X-Request-Source": "my-ide"
        }
      }
    }
  }
}
```

In this example, the agent only uses Anthropic, so it only accepts that provider's configuration.
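A client can diff what it sent against what came back to detect silently ignored providers. A small illustrative helper, not part of the protocol:

```typescript
// Returns the providers the client configured but the agent did not accept,
// so the client can warn the user or fall back for those providers.
function ignoredProviders(
  sent: Record<string, unknown>,
  accepted: Record<string, unknown>,
): string[] {
  return Object.keys(sent).filter((provider) => !(provider in accepted));
}
```

In the exchange above, `ignoredProviders` would report that the `openai` configuration was not accepted.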

#### Behavior

1. **Capability discovery**: The agent MUST advertise `llmEndpoints: true` in `agentCapabilities` if it supports the `setLlmEndpoints` method. Clients SHOULD check this capability before calling the method.

2. **Timing**: The `setLlmEndpoints` method MUST be called after `initialize` and before `session/new`. Calling it while a session is active is undefined behavior; if it is called again between sessions, all subsequent sessions use the newly configured endpoints.

3. **Confirmation via response**: The agent MUST respond with the `accepted` map containing only the providers it will actually use. If the agent accepts none, the `accepted` map SHOULD be empty.

4. **Per-process scope**: The endpoint configuration applies for the lifetime of the agent process. It should not be persisted to disk or outlive the process.

5. **Provider-based routing**: The agent should route LLM requests to the appropriate endpoint based on the provider. If the agent uses a provider not in the provided map, it uses its default endpoint for that provider.

6. **Agent discretion**: If an agent cannot support custom endpoints (e.g., uses a proprietary API), it should not advertise the `llmEndpoints` capability.
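Behaviors 4 and 5 amount to a per-provider lookup with a default fallback. A sketch of how an agent might resolve the endpoint for an outgoing request; the names and shapes here are illustrative, not part of the protocol:

```typescript
// Trimmed endpoint shape, mirroring LlmEndpointConfig above.
interface Endpoint {
  url: string;
  headers?: Record<string, string> | null;
}

// Pick the client-configured endpoint for this provider if one was accepted,
// otherwise fall back to the agent's built-in default URL with no extra headers.
function resolveEndpoint(
  provider: string,
  configured: Record<string, Endpoint>,
  defaults: Record<string, string>,
): { url: string; headers: Record<string, string> } {
  const custom = configured[provider];
  if (custom) {
    return { url: custom.url, headers: { ...(custom.headers ?? {}) } };
  }
  return { url: defaults[provider], headers: {} };
}
```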

## Open questions

### How should provider identifiers be standardized?

We need to define a standard set of provider identifiers (e.g., `"anthropic"`, `"openai"`, `"google"`, `"amazon"`). Should this be:
- A fixed enum in the protocol specification?
- An extensible set with well-known values and support for custom strings?
- Defined in a separate registry/document that can be updated independently?

### How should model availability be handled?

When a custom endpoint is provided, it may only support a subset of models. For example, a self-hosted vLLM server might only have `llama-3-70b` available, while the agent normally advertises `claude-3-opus`, `gpt-4`, etc.

## Frequently asked questions

> What questions have arisen over the course of authoring this document?

### Why not pass endpoints in the `initialize` request?

Passing endpoints directly in `initialize` would require the client to have already resolved credentials and configured endpoints before knowing whether the agent supports this feature. In practice, the client needs to inspect the agent's capabilities first to decide its authorization strategy — for example, whether to route through a corporate proxy or use direct credentials. A dedicated method after initialization solves this chicken-and-egg problem and keeps capability negotiation separate from endpoint configuration.

### Why not pass endpoints when selecting a model?

One option would be to pass the endpoint URL and credentials when the user selects a model (e.g., in `session/new` or a model-selection method).

However, many agents fail with authentication errors before model selection happens, which makes that flow unreliable.

### Why not use environment variables or command-line arguments?

One option would be to pass endpoint configuration via environment variables (like `OPENAI_API_BASE`) or command-line arguments when starting the agent process.

This approach has significant drawbacks:
- With multiple providers, the configuration becomes complex JSON that is awkward to pass via command-line arguments
- Environment variables may be logged or visible to other processes, creating security concerns
- Requires knowledge of agent-specific variable names or argument formats
- No standardized way to confirm the agent accepted the configuration

### What if the agent doesn't support custom endpoints?

If the agent doesn't support custom endpoints, it will not advertise `llmEndpoints: true` in `agentCapabilities` during initialization. The client can detect this and choose an alternative authorization and configuration strategy, or proceed using the agent's default endpoints.

## Revision history

- 2026-03-04: Revised to use dedicated `setLlmEndpoints` method with capability advertisement
- 2026-02-02: Initial draft - preliminary proposal to start discussion