This issue follows up on discussion #713 about automatically adding capabilities metadata to v1/models responses. I'm opening it to gather community feedback before working on it.
LLM clients use metadata from v1/models to determine a model's capabilities. Within llama-swap, a new models.capabilities configuration would describe what the local inference server supports:
models:
  model_id:
    capabilities:
      text:
        completions: true    # /chat/completions (openai)
        responses: false     # /responses (openai)
        messages: false      # /messages (anthropic)
        fim: true            # (llama.cpp FIM)
      audio:
        speech: true         # text to speech
        transcriptions: true # ASR
      image:
        vision: true
Design:
- model.capabilities is a map of keys to values
- values can be a boolean, string, int, map, etc
- valid keys are defined in config-schema.json
- backwards compatibility must be maintained: the default (an empty map) generates no capabilities metadata
- v1/models endpoint will use capabilities to generate metadata for clients
- there is no published standard for clients, so support the popular projects first (tbd)
- v1 of capabilities should introduce a programming pattern that makes adding new client support easier in the future
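
To make the last few points concrete, here is a rough sketch of one possible pattern: a small registry of per-client adapters, each turning the capabilities map into that client's metadata. It is written in Python purely for illustration and is not a proposed implementation. Everything named here is an assumption: the `example_client` adapter, the `supports_chat` and `supports_vision` fields, and the `meta` key are all hypothetical, since the actual client formats are still tbd.

```python
from typing import Any, Callable, Dict

Capabilities = Dict[str, Any]
# An adapter converts a capabilities map into the fields one client expects.
Adapter = Callable[[Capabilities], Dict[str, Any]]

# Registry of adapters keyed by client name; adding client support later
# means adding one function here.
ADAPTERS: Dict[str, Adapter] = {}


def register(name: str) -> Callable[[Adapter], Adapter]:
    """Decorator that adds an adapter to the registry under `name`."""
    def wrap(fn: Adapter) -> Adapter:
        ADAPTERS[name] = fn
        return fn
    return wrap


@register("example_client")  # hypothetical client, not a real project
def example_client(caps: Capabilities) -> Dict[str, Any]:
    text = caps.get("text", {})
    image = caps.get("image", {})
    return {
        # Hypothetical field names, for illustration only.
        "supports_chat": bool(text.get("completions", False)),
        "supports_vision": bool(image.get("vision", False)),
    }


def model_entry(model_id: str, caps: Capabilities) -> Dict[str, Any]:
    """Build one v1/models list entry for a model."""
    entry: Dict[str, Any] = {"id": model_id, "object": "model"}
    if not caps:
        # Backwards compatible: an empty map adds no capabilities metadata.
        return entry
    # "meta" is a hypothetical key; each adapter contributes its own section.
    entry["meta"] = {name: adapter(caps) for name, adapter in ADAPTERS.items()}
    return entry


if __name__ == "__main__":
    caps = {"text": {"completions": True, "fim": True}, "image": {"vision": True}}
    print(model_entry("model_id", caps))
    print(model_entry("legacy_model", {}))
```

With this shape, a model without a capabilities block produces the same entry as today, and supporting another client only requires registering another adapter.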