Problem
The current LlmConfig hierarchy needs a dedicated class per provider (`OpenAiConfig`, `OpenAiCompatibleConfig`, `VllmConfig`, `OllamaConfig`, `OciGenAiConfig`). This doesn't scale:
- Every new provider requires a spec change. Anthropic, Vertex AI, Bedrock, Azure OpenAI, Mistral, Groq, Together AI all need new classes, SDK updates, adapter updates, and a new spec version.
- "OpenAI Compatible" isn't universal. Anthropic has its own Messages API, Bedrock uses IAM auth, Azure needs deployment names and API versions, Vertex uses GCP service accounts. Forcing these into an OpenAI-compatible shape loses important details.
- Auth is too narrow. Only `api_key` is supported. Real deployments need IAM roles, service accounts, managed identity, OIDC tokens, or no auth at all.
- Generation parameters are incomplete. Only `max_tokens`, `temperature`, and `top_p` are defined. There is no guidance for `top_k`, `stop_sequences`, `seed`, `frequency_penalty`, `response_format`, or `json_schema`.
- Provider identity is tangled up with wire protocol. OCI GenAI can speak OpenAI Chat Completions. Azure OpenAI is OpenAI's API with different auth/endpoints. The spec can't separate who the provider is from what protocol they use.
Proposal
Introduce a single provider-agnostic config using free-form string discriminators instead of per-provider classes:
- A `provider.type` string (`"openai"`, `"anthropic"`, `"aws_bedrock"`, etc.), with extra fields for provider-specific options. An optional `api_protocol` hint covers cases where the wire protocol differs from what the type implies.
- A separate `auth` object with its own type discriminator and credential resolution (`$env:VAR_NAME` for env vars).
- Richer generation parameters with explicit fields for common options.
- A `provider_extensions` escape hatch for non-portable options.
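The shape above can be sketched in code. This is a minimal, non-normative sketch using stdlib dataclasses rather than Pydantic (to stay self-contained); the class and field names beyond those named in the proposal (`ProviderSpec`, `AuthSpec`, `extra`, `resolve_credential`) are hypothetical illustrations, not spec text:

```python
import os
from dataclasses import dataclass, field
from typing import Any, Optional


def resolve_credential(ref: Optional[str]) -> Optional[str]:
    """Resolve a credential reference; '$env:NAME' reads env var NAME."""
    if ref is None:
        return None
    if ref.startswith("$env:"):
        return os.environ[ref[len("$env:"):]]
    return ref  # literal value (discouraged outside tests)


@dataclass
class AuthSpec:
    type: str                              # "api_key", "iam_role", "oidc", ...
    credential_ref: Optional[str] = None   # e.g. "$env:ANTHROPIC_API_KEY"


@dataclass
class ProviderSpec:
    type: str                              # free-form: "openai", "anthropic", "aws_bedrock", ...
    endpoint: Optional[str] = None
    api_protocol: Optional[str] = None     # hint when wire protocol != what type implies
    extra: dict[str, Any] = field(default_factory=dict)  # region, deployment name, ...


@dataclass
class GenericLlmConfig:
    model_id: str
    provider: ProviderSpec
    auth: Optional[AuthSpec] = None        # None => no auth (e.g. local Ollama)
    default_generation_parameters: dict[str, Any] = field(default_factory=dict)
    provider_extensions: dict[str, Any] = field(default_factory=dict)
    component_type: str = "GenericLlmConfig"
```

Under this sketch, the Anthropic example below becomes `GenericLlmConfig(model_id="claude-sonnet-4-20250514", provider=ProviderSpec(type="anthropic"), auth=AuthSpec(type="api_key", credential_ref="$env:ANTHROPIC_API_KEY"), default_generation_parameters={"max_tokens": 4096, "temperature": 0.7})`, and a new provider is just a new `provider.type` string rather than a new class.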
Backward Compatibility
`GenericLlmConfig` is a new `component_type` that coexists with existing configs. There is no implicit mapping between them. Runtimes should support both, dispatching on `component_type` through the existing `PydanticComponentDeserializationPlugin` system.
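The coexistence story can be illustrated with a toy dispatch registry. This is only a sketch of the idea of dispatching on `component_type`; the registry shape and the `register`/`deserialize` helpers are assumptions for illustration, not the actual `PydanticComponentDeserializationPlugin` API:

```python
from typing import Any, Callable

# Hypothetical registry: component_type -> deserializer. Legacy configs
# (OpenAiConfig, ...) and GenericLlmConfig coexist as separate entries;
# there is no implicit mapping between them.
_DESERIALIZERS: dict[str, Callable[[dict[str, Any]], Any]] = {}


def register(component_type: str):
    """Decorator that registers a deserializer for one component_type."""
    def wrap(fn: Callable[[dict[str, Any]], Any]):
        _DESERIALIZERS[component_type] = fn
        return fn
    return wrap


def deserialize(raw: dict[str, Any]) -> Any:
    """Dispatch purely on the declared component_type."""
    return _DESERIALIZERS[raw["component_type"]](raw)


@register("GenericLlmConfig")
def _generic(raw: dict[str, Any]) -> dict[str, Any]:
    return {"kind": "generic", "provider": raw["provider"]["type"]}


@register("OpenAiConfig")
def _legacy_openai(raw: dict[str, Any]) -> dict[str, Any]:
    return {"kind": "legacy", "model": raw.get("model_id")}
```

A runtime that supports both simply has both entries registered; old documents keep deserializing through their legacy classes untouched.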
Examples
Anthropic

```yaml
component_type: GenericLlmConfig
model_id: "claude-sonnet-4-20250514"
provider:
  type: "anthropic"
auth:
  type: "api_key"
  credential_ref: "$env:ANTHROPIC_API_KEY"
default_generation_parameters:
  max_tokens: 4096
  temperature: 0.7
```

AWS Bedrock (same shape, different provider)

```yaml
component_type: GenericLlmConfig
model_id: "anthropic.claude-3-sonnet-20240229-v1:0"
provider:
  type: "aws_bedrock"
  region: "us-east-1"
auth:
  type: "iam_role"
```

Local Ollama (no auth needed)

```yaml
component_type: GenericLlmConfig
model_id: "llama3"
provider:
  type: "ollama"
  endpoint: "http://localhost:11434"
auth: null
```

Open model served from vLLM

```yaml
component_type: GenericLlmConfig
model_id: "meta-llama/Llama-3.1-70B-Instruct"
provider:
  type: "vllm"
  endpoint: "http://gpu-cluster:8000"
  api_protocol: "openai_chat_completions"
auth: null
```