v0.11.0 — HF & Ollama Provider Integration

rafiki270 released this 06 Mar 17:15

ceacf11

What's New

Hugging Face Provider

HF provider adapter with routing policies (serverless, dedicated, local), SSE error parsing
HF Hub client, model registry with curated list ranking, token auth
Web UI sidebar pages: Account, Model Library with search, Ready to Use with routing policy selector
HF usage tracking metrics module and admin dashboard widgets with cost estimates and rate limit display
Model picker popup extended with HF model groups
24h auto-refresh, provider comparison panel, error handling for rate limits and gated models

Ollama Provider

Ollama API client, connection manager (local + cloud), model registry with cache
Refactored provider adapter: streaming, dual-endpoint strategy, concurrency semaphore, cloud auth
Model picker popup extended with Ollama model groups
Ollama usage tracking with timing metrics, tokens/sec, VRAM status dashboard widgets
Pull progress UI, cold-start timeout handling, VRAM display, error UX polish
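
For readers wondering how a tokens/sec figure can be derived, here is a minimal sketch. It is not taken from this release. It assumes the metric comes from the `eval_count` and `eval_duration` fields that Ollama reports in the final chunk of a generate or chat response, where durations are in nanoseconds. The helper name `tokensPerSecond` and the `OllamaTimings` type are hypothetical.

```typescript
// Subset of the final (done: true) Ollama response used for throughput.
// The field names come from the Ollama API; the type name is hypothetical.
interface OllamaTimings {
  eval_count: number;    // number of tokens generated
  eval_duration: number; // time spent generating, in nanoseconds
}

const NANOS_PER_SECOND = 1e9;

// Hypothetical helper: generated tokens per second, or null when the
// duration is missing or zero (e.g. a response served without generation).
function tokensPerSecond(t: OllamaTimings): number | null {
  if (!(t.eval_duration > 0)) return null;
  return (t.eval_count * NANOS_PER_SECOND) / t.eval_duration;
}
```

Returning `null` rather than `0` or `Infinity` lets a dashboard widget distinguish "no measurement" from "very slow".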

Model Usage

Model usage API endpoint with multi-project aggregation
Frontend types, API client, format utilities
ModelUsage page components with routing and sidebar entry
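
As an illustration of what multi-project aggregation can look like, the sketch below groups per-project usage rows by model and sums their token counts. The record shape (`projectId`, `model`, `inputTokens`, `outputTokens`) and the function name `aggregateByModel` are assumptions made for this example, not the endpoint's actual schema.

```typescript
// Hypothetical per-project usage row; the real schema may differ.
interface UsageRecord {
  projectId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical aggregated row: one entry per model across all projects.
interface ModelTotals {
  model: string;
  projects: number; // count of distinct projects that used the model
  inputTokens: number;
  outputTokens: number;
}

function aggregateByModel(records: UsageRecord[]): ModelTotals[] {
  const acc = new Map<string, { projects: Set<string>; input: number; output: number }>();
  for (const r of records) {
    let entry = acc.get(r.model);
    if (!entry) {
      entry = { projects: new Set<string>(), input: 0, output: 0 };
      acc.set(r.model, entry);
    }
    entry.projects.add(r.projectId);
    entry.input += r.inputTokens;
    entry.output += r.outputTokens;
  }
  const totals: ModelTotals[] = [];
  // Map preserves insertion order, so models appear in first-seen order.
  acc.forEach((e, model) => {
    totals.push({ model, projects: e.projects.size, inputTokens: e.input, outputTokens: e.output });
  });
  return totals;
}
```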

Infrastructure

Global DB V20 migration adding the hf_usage, hf_paired_models, ollama_usage, and ollama_paired_models tables
model_capability_error added to ProviderErrorType (retryable: false)
Fixed applyEnvOverrides colon parsing for multi-colon model names (e.g., ollama:llama3:latest)
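
To illustrate the multi-colon case, here is a minimal sketch of splitting a `provider:model` value on the first colon only, so that a tag such as `:latest` stays part of the model name. This is not the actual `applyEnvOverrides` code. The function name `splitProviderModel` and the `ModelRef` type are hypothetical.

```typescript
// Hypothetical result type for a parsed "provider:model" value.
interface ModelRef {
  provider: string;
  model: string;
}

// Split on the FIRST colon only. A naive `const [p, m] = value.split(":")`
// would turn "ollama:llama3:latest" into model "llama3" and drop ":latest".
function splitProviderModel(value: string): ModelRef | null {
  const idx = value.indexOf(":");
  // Reject values with no colon, an empty provider, or an empty model.
  if (idx <= 0 || idx === value.length - 1) return null;
  return {
    provider: value.slice(0, idx),
    model: value.slice(idx + 1),
  };
}
```

With this approach, `splitProviderModel("ollama:llama3:latest")` yields provider `ollama` and model `llama3:latest`.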

Full Changelog: v0.10.62...v0.11.0
