v0.11.0 — HF & Ollama Provider Integration

rafiki270 released this 06 Mar 17:15
· 260 commits to main since this release

What's New

Hugging Face Provider

  • HF provider adapter with routing policies (serverless, dedicated, local), SSE error parsing
  • HF Hub client, model registry with curated list ranking, token auth
  • Web UI sidebar pages: Account, Model Library with search, Ready to Use with routing policy selector
  • HF usage tracking metrics module and admin dashboard widgets with cost estimates and rate limit display
  • Model picker popup extended with HF model groups
  • 24h auto-refresh, provider comparison panel, error handling for rate limits and gated models

Ollama Provider

  • Ollama API client, connection manager (local + cloud), model registry with cache
  • Refactored provider adapter: streaming, dual-endpoint strategy, concurrency semaphore, cloud auth
  • Model picker popup extended with Ollama model groups
  • Ollama usage tracking with timing metrics, tokens/sec, VRAM status dashboard widgets
  • Pull progress UI, cold-start timeout handling, VRAM display, error UX polish
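
As a rough illustration of the tokens/sec metric: Ollama's generate and chat responses report `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). The sketch below derives throughput from those two fields. The function name `tokensPerSecond` and the zero-duration fallback are assumptions made for illustration, not the project's actual metrics code.

```typescript
// Subset of the timing fields returned in a final Ollama response.
interface OllamaTiming {
  eval_count: number;    // number of tokens generated
  eval_duration: number; // generation time in nanoseconds
}

// Hypothetical helper: converts Ollama timing fields to tokens per second.
function tokensPerSecond(t: OllamaTiming): number {
  if (t.eval_duration <= 0) {
    return 0; // assumption: treat a missing or zero duration as "no data"
  }
  // Multiply before dividing to keep the nanosecond conversion exact
  // for typical integer inputs.
  return (t.eval_count * 1e9) / t.eval_duration;
}
```

For example, 100 tokens generated over 2 seconds (`eval_duration` of `2e9`) works out to 50 tokens/sec.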

Model Usage

  • Model usage API endpoint with multi-project aggregation
  • Frontend types, API client, format utilities
  • ModelUsage page components with routing and sidebar entry
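
Multi-project aggregation could look roughly like the sketch below, which sums token counts per model across projects. The record shape (`projectId`, `model`, `inputTokens`, `outputTokens`) and the function name `aggregateByModel` are hypothetical; the actual endpoint's schema is not described in these notes.

```typescript
// Hypothetical usage record; field names are assumptions.
interface UsageRecord {
  projectId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface ModelTotals {
  inputTokens: number;
  outputTokens: number;
  projectIds: string[]; // distinct projects that used the model
}

// Sum token usage per model across all projects.
function aggregateByModel(records: UsageRecord[]): Record<string, ModelTotals> {
  const totals: Record<string, ModelTotals> = {};
  for (const r of records) {
    if (!totals[r.model]) {
      totals[r.model] = { inputTokens: 0, outputTokens: 0, projectIds: [] };
    }
    const t = totals[r.model];
    t.inputTokens += r.inputTokens;
    t.outputTokens += r.outputTokens;
    if (t.projectIds.indexOf(r.projectId) === -1) {
      t.projectIds.push(r.projectId);
    }
  }
  return totals;
}
```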

Infrastructure

  • Global DB V20 migration with hf_usage, hf_paired_models, ollama_usage, ollama_paired_models tables
  • model_capability_error added to ProviderErrorType (retryable: false)
  • Fixed applyEnvOverrides colon parsing for multi-colon model names (e.g., ollama:llama3:latest)
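
The colon-parsing fix can be illustrated with a minimal sketch. It assumes an override value has the form `provider:model` and that only the first colon separates the two. The helper name `splitProviderModel` and its return shape are hypothetical, not the project's actual `applyEnvOverrides` code.

```typescript
// Hypothetical helper; not the project's actual applyEnvOverrides implementation.
interface ProviderModel {
  provider: string;
  model: string;
}

// Split on the FIRST colon only, so model names that themselves contain
// colons (such as "llama3:latest") stay intact.
function splitProviderModel(value: string): ProviderModel | null {
  const idx = value.indexOf(":");
  if (idx <= 0 || idx === value.length - 1) {
    return null; // no provider prefix, or empty model name
  }
  return {
    provider: value.slice(0, idx),
    model: value.slice(idx + 1),
  };
}
```

With this approach, `ollama:llama3:latest` yields provider `ollama` and model `llama3:latest`. A naive `value.split(":")` would instead produce three parts, and code that reads only the second part would drop the `latest` tag.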

Full Changelog: v0.10.62...v0.11.0