GenericLlmConfig for provider-agnostic LLM configuration #104

@spichen

Description

Problem

The current LlmConfig hierarchy needs a dedicated class per provider (OpenAiConfig, OpenAiCompatibleConfig, VllmConfig, OllamaConfig, OciGenAiConfig). This doesn't scale:

  • Every new provider requires a spec change. Anthropic, Vertex AI, Bedrock, Azure OpenAI, Mistral, Groq, Together AI all need new classes, SDK updates, adapter updates, and a new spec version.
  • "OpenAI Compatible" isn't universal. Anthropic has its own Messages API, Bedrock uses IAM auth, Azure needs deployment names and API versions, Vertex uses GCP service accounts. Forcing these into an OpenAI-compatible shape loses important details.
  • Auth is too narrow. Only api_key is supported. Real deployments need IAM roles, service accounts, managed identity, OIDC tokens, or no auth at all.
  • Generation parameters are incomplete. Only max_tokens, temperature, and top_p are defined. No guidance for top_k, stop_sequences, seed, frequency_penalty, response_format, json_schema.
  • Provider identity is tangled up with wire protocol. OCI GenAI can speak OpenAI Chat Completions. Azure OpenAI is OpenAI's API with different auth/endpoints. The spec can't separate who the provider is from what protocol they use.

Proposal

Introduce a single provider-agnostic config using free-form string discriminators instead of per-provider classes:

  • provider.type string ("openai", "anthropic", "aws_bedrock", etc.) with extra fields for provider-specific options. Optional api_protocol hint for when the wire protocol differs from what the type implies.
  • Separate auth object with its own type discriminator and credential resolution ($env:VAR_NAME for env vars).
  • Richer generation parameters with explicit fields for common options.
  • provider_extensions escape hatch for non-portable options.
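The proposed shape could be sketched with plain dataclasses roughly like this. Everything here is illustrative: field names follow the examples below, and this is not a finalized schema.

```python
# Illustrative sketch of the proposed GenericLlmConfig shape.
# Field names follow the YAML examples in this issue, not a final spec.
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class AuthConfig:
    type: str                              # "api_key", "iam_role", ...
    credential_ref: Optional[str] = None   # e.g. "$env:ANTHROPIC_API_KEY"

@dataclass
class ProviderConfig:
    type: str                              # "openai", "anthropic", "aws_bedrock", ...
    api_protocol: Optional[str] = None     # wire-protocol hint, e.g. "openai_chat_completions"
    extra: dict[str, Any] = field(default_factory=dict)  # endpoint, region, ...

@dataclass
class GenerationParameters:
    max_tokens: Optional[int] = None
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    top_k: Optional[int] = None
    stop_sequences: Optional[list[str]] = None
    seed: Optional[int] = None

@dataclass
class GenericLlmConfig:
    model_id: str
    provider: ProviderConfig
    auth: Optional[AuthConfig] = None      # null means "no auth", e.g. local Ollama
    default_generation_parameters: Optional[GenerationParameters] = None
    provider_extensions: dict[str, Any] = field(default_factory=dict)  # non-portable escape hatch
```

Note how `auth: null` and the `provider_extensions` escape hatch fall out naturally as optional fields rather than requiring new classes.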

Backward Compatibility

GenericLlmConfig is a new component_type that coexists with existing configs. No implicit mapping between them. Runtimes should support both, dispatching based on component_type through the existing PydanticComponentDeserializationPlugin system.
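A minimal sketch of what component_type dispatch could look like, assuming a simple registry. The names here are illustrative; the actual PydanticComponentDeserializationPlugin API may differ.

```python
# Hypothetical registry-based dispatch on component_type, so
# GenericLlmConfig coexists with the existing per-provider classes.
# Names are illustrative, not the real plugin API.
PARSERS = {}

def register(component_type):
    """Decorator registering a parser for one component_type."""
    def wrap(fn):
        PARSERS[component_type] = fn
        return fn
    return wrap

@register("GenericLlmConfig")
def parse_generic(raw):
    # Stand-in for the real Pydantic validation of the new schema.
    return {"kind": "generic", "model_id": raw["model_id"]}

@register("OpenAiConfig")
def parse_openai(raw):
    # Existing per-provider config keeps working unchanged.
    return {"kind": "openai", "model_id": raw["model_id"]}

def deserialize(raw):
    """Route a raw config dict to its parser by component_type."""
    try:
        return PARSERS[raw["component_type"]](raw)
    except KeyError:
        raise ValueError(f"unknown component_type: {raw.get('component_type')}")
```

Because there is no implicit mapping between the two schemas, each parser validates its own shape independently.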

Examples

Anthropic

```yaml
component_type: GenericLlmConfig
model_id: "claude-sonnet-4-20250514"
provider:
  type: "anthropic"
auth:
  type: "api_key"
  credential_ref: "$env:ANTHROPIC_API_KEY"
default_generation_parameters:
  max_tokens: 4096
  temperature: 0.7
```
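The `$env:` credential scheme could be resolved with something like the following sketch. Only the `$env:` form from the proposal is handled; other schemes (secret managers, files) would need their own handlers.

```python
import os

def resolve_credential(credential_ref):
    """Resolve a credential reference like "$env:ANTHROPIC_API_KEY".

    Illustrative only: handles the $env: scheme from the proposal,
    passes literal values through, and treats None as "no credential".
    """
    if credential_ref is None:
        return None
    if credential_ref.startswith("$env:"):
        var = credential_ref[len("$env:"):]
        value = os.environ.get(var)
        if value is None:
            raise KeyError(f"environment variable {var} is not set")
        return value
    return credential_ref
```

Resolving at load time keeps raw secrets out of the config file itself.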

AWS Bedrock (same shape, different provider)

```yaml
component_type: GenericLlmConfig
model_id: "anthropic.claude-3-sonnet-20240229-v1:0"
provider:
  type: "aws_bedrock"
  region: "us-east-1"
auth:
  type: "iam_role"
```

Local Ollama (no auth needed)

```yaml
component_type: GenericLlmConfig
model_id: "llama3"
provider:
  type: "ollama"
  endpoint: "http://localhost:11434"
auth: null
```

Open model served from vLLM

```yaml
component_type: GenericLlmConfig
model_id: "meta-llama/Llama-3.1-70B-Instruct"
provider:
  type: "vllm"
  endpoint: "http://gpu-cluster:8000"
  api_protocol: "openai_chat_completions"
auth: null
```
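One way a runtime might combine `provider.type` with the optional `api_protocol` hint: the hint wins when present, otherwise a per-provider default applies. The default table below is purely illustrative.

```python
# Illustrative per-provider wire-protocol defaults; the hint in
# provider.api_protocol overrides them when set.
DEFAULT_PROTOCOLS = {
    "openai": "openai_chat_completions",
    "vllm": "openai_chat_completions",
    "ollama": "ollama",
    "anthropic": "anthropic_messages",
    "aws_bedrock": "bedrock_invoke",
}

def effective_protocol(provider_type, api_protocol=None):
    """Pick the wire protocol: explicit hint first, then provider default."""
    if api_protocol is not None:
        return api_protocol
    try:
        return DEFAULT_PROTOCOLS[provider_type]
    except KeyError:
        raise ValueError(f"no default protocol for provider {provider_type!r}")
```

This is the separation the proposal asks for: who the provider is (`provider.type`) stays distinct from what protocol they speak (`api_protocol`), so e.g. OCI GenAI speaking OpenAI Chat Completions is just a hint, not a new class.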
