Add model.capabilities configuration for client detection #734

@mostlygeek

Description

This issue follows up on discussion #713 about automatically adding capabilities metadata to v1/models responses. I'm opening it to gather community feedback before working on it.

LLM clients use metadata from v1/models to determine a model's capabilities. Within llama-swap, a new capabilities configuration under each model entry would describe what the local inference server supports.

models:
  model_id: 
    capabilities:
      text: 
        completions: true # /chat/completions (openai)
        responses: false # /responses (openai)
        messages: false # /messages (anthropic)
        fim: true # (llama.cpp FIM)
      audio: 
        speech: true # text to speech
        transcriptions: true # ASR
      image:
        vision: true
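
To make the shape of this map concrete, here is a minimal Python sketch of how the nested structure could be queried once parsed. The dict literal mirrors the YAML above; the helper name `has_capability` and the dotted-path convention are assumptions made for illustration and are not part of llama-swap.

```python
from typing import Any, Mapping

# The capabilities block from the example above, as it might look after parsing.
CAPABILITIES: dict[str, Any] = {
    "text": {"completions": True, "responses": False, "messages": False, "fim": True},
    "audio": {"speech": True, "transcriptions": True},
    "image": {"vision": True},
}


def has_capability(caps: Mapping[str, Any], path: str) -> bool:
    """Return True only if the dotted path (e.g. "text.fim") resolves to a truthy value."""
    node: Any = caps
    for key in path.split("."):
        # A missing key or a non-map intermediate value means "not supported".
        if not isinstance(node, Mapping) or key not in node:
            return False
        node = node[key]
    return bool(node)
```

With this helper, `has_capability(CAPABILITIES, "text.fim")` is true, while a lookup against an empty map is always false, which matches the "no capabilities by default" behaviour described in the design.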

Design:

  • model.capabilities is a map of keys to values
    • values can be a boolean, string, int, map, etc.
  • valid keys are defined in config-schema.json
  • backwards compatibility must be maintained: the default (an empty map) generates no capabilities metadata
  • the v1/models endpoint will use capabilities to generate metadata for clients
    • there is no published standard for clients, so support the popular projects first (tbd)
  • v1 of capabilities should introduce a programming pattern that makes adding new client support easier in the future
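
One possible reading of the "programming pattern" in the last bullet is a registry of per-client adapters, each translating the capabilities map into that client's metadata format. The Python sketch below is only an illustration of that idea: the `example_client` adapter, its output fields, and the `meta` key on the model entry are all hypothetical, and no real client's format is implied.

```python
from typing import Any, Callable, Dict

Capabilities = Dict[str, Any]
Adapter = Callable[[Capabilities], Dict[str, Any]]

# Hypothetical registry: one adapter per client metadata format.
ADAPTERS: Dict[str, Adapter] = {}


def register(name: str) -> Callable[[Adapter], Adapter]:
    """Decorator that adds an adapter to the registry under `name`."""
    def wrap(fn: Adapter) -> Adapter:
        ADAPTERS[name] = fn
        return fn
    return wrap


@register("example_client")  # hypothetical client, not a real project
def example_client(caps: Capabilities) -> Dict[str, Any]:
    # `or {}` tolerates a key that is present but empty in the config.
    text = caps.get("text") or {}
    image = caps.get("image") or {}
    return {
        "supports_chat": bool(text.get("completions")),
        "supports_vision": bool(image.get("vision")),
    }


def model_entry(model_id: str, caps: Capabilities) -> Dict[str, Any]:
    """Build one v1/models entry; an empty map adds no capabilities metadata."""
    entry: Dict[str, Any] = {"id": model_id, "object": "model"}
    if not caps:
        return entry  # backwards compatible: output is unchanged
    # "meta" is an assumed field name for the generated client metadata.
    entry["meta"] = {name: adapter(caps) for name, adapter in ADAPTERS.items()}
    return entry
```

Under this sketch, supporting another client means registering one more adapter function; `model_entry` and the configuration format stay untouched.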

Labels

enhancement: New feature or request