Add model.capabilities configuration for client detection

This issue follows up on [discussion #713](https://github.com/mostlygeek/llama-swap/discussions/713#discussioncomment-16784828) about automatically adding capabilities metadata to v1/models responses. I'm opening it to gather community feedback before working on it.

LLM clients use metadata from v1/models to determine the capabilities of a model. Within llama-swap, a new `models.capabilities` configuration would contain metadata about what the local inference server supports.

```yaml
models:
  model_id: 
    capabilities:
      text: 
        completions: true # /chat/completions (openai)
        responses: false # /responses (openai)
        messages: false # /messages (anthropic)
        fim: true # (llama.cpp FIM)
      audio: 
        speech: true # text to speech
        transcriptions: true # ASR
      image:
        vision: true
```
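To make the shape of this map concrete, here is a minimal sketch of walking it, assuming the YAML above has already been parsed into a plain dict. The helper names `flatten` and `enabled` and the dotted-key notation are illustrative assumptions only; they are not part of llama-swap.

```python
from typing import Any, Dict, Iterator, List, Tuple

# The example config above, as the plain dict a YAML parser would produce.
CAPABILITIES: Dict[str, Any] = {
    "text": {"completions": True, "responses": False, "messages": False, "fim": True},
    "audio": {"speech": True, "transcriptions": True},
    "image": {"vision": True},
}


def flatten(caps: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_key, value) for every leaf of a nested capabilities map."""
    for key, value in caps.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested group such as "text" or "audio": recurse into it.
            yield from flatten(value, path)
        else:
            # Leaf value: may be a boolean, string, int, etc.
            yield path, value


def enabled(caps: Dict[str, Any]) -> List[str]:
    """Return dotted keys whose value is exactly True, in config order."""
    return [path for path, value in flatten(caps) if value is True]


if __name__ == "__main__":
    for name in enabled(CAPABILITIES):
        print(name)
```

Flattening to dotted keys such as `text.completions` is just one convenient way to inspect the map; an implementation could equally work on the nested structure directly.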

Design: 

- model.capabilities is a map of keys to values
  - values can be a boolean, string, int, map, etc
- valid keys are defined in config-schema.json
- backwards compatibility must be maintained: the default (an empty map) generates no capabilities metadata.
- v1/models endpoint will use `capabilities` to generate metadata for clients
  - there is no published standard for clients, so support the popular projects first (tbd)
- The v1 of capabilities should introduce a programming pattern that makes adding new client support easier in the future.
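One possible shape for that pattern, sketched in Python for brevity: a registry of per-client adapter functions, each translating the capabilities map into the metadata one client expects. Everything here is a hypothetical illustration. The `example_client` adapter, its `supports_vision` field, and the `model_entry` helper are invented for this sketch and do not describe any real client or llama-swap's actual code.

```python
from typing import Any, Callable, Dict

Capabilities = Dict[str, Any]
Adapter = Callable[[Capabilities], Dict[str, Any]]

# Hypothetical registry: client name -> function producing that client's metadata.
ADAPTERS: Dict[str, Adapter] = {}


def adapter(name: str) -> Callable[[Adapter], Adapter]:
    """Decorator that registers an adapter under a client name."""
    def register(fn: Adapter) -> Adapter:
        ADAPTERS[name] = fn
        return fn
    return register


@adapter("example_client")  # hypothetical client, for illustration only
def example_client(caps: Capabilities) -> Dict[str, Any]:
    # Hypothetical output shape; not taken from any real client.
    image = caps.get("image", {})
    return {"supports_vision": bool(image.get("vision", False))}


def model_entry(model_id: str, caps: Capabilities) -> Dict[str, Any]:
    """Build one entry of a v1/models response."""
    entry: Dict[str, Any] = {"id": model_id, "object": "model"}
    if not caps:
        # Empty map (the default): emit no capabilities metadata at all.
        return entry
    for name, fn in ADAPTERS.items():
        extra = fn(caps)
        if extra:
            entry[name] = extra
    return entry


if __name__ == "__main__":
    print(model_entry("model_id", {}))
    print(model_entry("model_id", {"image": {"vision": True}}))
```

With this layout, supporting another client means adding one decorated function, and a model with no `capabilities` configured produces the same entry as before.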

Add model.capabilities configuration for client detection #734
