# Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

## Unreleased
- Switch reqwest TLS crypto provider from `aws-lc-rs` to `ring` by using the `rustls-no-provider` feature and adding an explicit `rustls` dependency with the `ring` backend. This eliminates `__isoc23_strtol` and related glibc 2.38+ symbols emitted by `aws-lc-sys` 0.40.0, restoring the GLIBC_2.28 ABI floor required by downstream users (e.g. Node.js aarch64 bindings).
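A minimal sketch of what the dependency change could look like in the workspace `Cargo.toml`. The `rustls-no-provider` feature name is taken from this entry; exact version numbers and the full feature lists are assumptions, not the project's actual manifest:

```toml
# Sketch: disable reqwest's default crypto provider (aws-lc-rs) and
# supply rustls with the ring backend explicitly.
[dependencies]
reqwest = { version = "0.12", default-features = false, features = [
    "json",
    "stream",
    "rustls-no-provider",  # TLS via rustls, no bundled crypto provider
] }
rustls = { version = "0.23", default-features = false, features = [
    "ring",   # ring avoids the glibc 2.38+ symbols emitted by aws-lc-sys
    "std",
    "tls12",
] }
```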
- CLI binary tarballs (Linux x86_64/aarch64, macOS aarch64, Windows x86_64) attached to GitHub Releases for direct download — closes #64
- `scripts/generate_pricing.py` regenerates `schemas/pricing.json` from models.dev, wired into `task generate:pricing`, `task update`, and `task upgrade`
- `Usage::prompt_tokens_details` (`{ cached_tokens, audio_tokens }`) deserialised from the OpenAI-compatible response body, plus `cost::completion_cost_with_cache` and matching `cache_read_input_token_cost`/`cache_creation_input_token_cost` fields on `ModelPricing`
- `ChatCompletionResponse::estimated_cost` and the `CostTrackingLayer` now bill cached prompt tokens at the provider's discounted cache-read rate when the model has cache pricing in `schemas/pricing.json` — closes #65
- `schemas/pricing.json` carries `cache_read_input_token_cost`/`cache_creation_input_token_cost` for the 1,500+ models on models.dev that publish cache pricing
- `schemas/pricing.json` now covers 4,219 models (up from 35) sourced from models.dev — closes #48
- GitHub Release CLI assets ship a single sorted `SHA256SUMS-<version>.txt` (sha256sum-verifiable) instead of one `.sha256` per archive — closes #67
- WebAssembly build verified `mio`-free. The `liter-llm` crate exposes two mutually exclusive HTTP-stack features — `native-http` (reqwest + tokio + memchr + base64) and `wasm-http` (reqwest + memchr + base64 + gloo-timers, no tokio dependency). The `liter-llm-wasm` crate enables only `wasm-http`; the workspace's `reqwest` is pinned with `default-features = false, features = ["json", "stream", "rustls", "multipart", "form"]`. As a result, `cargo build --target wasm32-unknown-unknown -p liter-llm-wasm` pulls neither `mio` nor `tokio` into the dependency tree — reqwest auto-routes to the browser/Node `fetch` API on wasm32 targets.
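The feature split described above could be declared roughly like this in `liter-llm`'s `Cargo.toml` (a sketch; the real feature definitions and version pins may differ):

```toml
# Sketch: two mutually exclusive HTTP stacks for liter-llm.
[features]
native-http = ["dep:reqwest", "dep:tokio", "dep:memchr", "dep:base64"]
wasm-http   = ["dep:reqwest", "dep:memchr", "dep:base64", "dep:gloo-timers"]

[dependencies]
# Workspace pin with default-features = false, so tokio/mio never enter
# the wasm32 dependency tree; reqwest uses the fetch API on that target.
reqwest = { workspace = true, optional = true }
tokio = { version = "1", optional = true }
gloo-timers = { version = "0.3", optional = true }
memchr = { version = "2", optional = true }
base64 = { version = "0.22", optional = true }
```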
- Alef migration: all language bindings are now auto-generated by alef instead of hand-written
- `BoxFuture`/`BoxStream` type aliases no longer wrap `Result<T>` — all method signatures now explicitly return `Result<T>`
- `provider` module is now public (was `pub(crate)`)
- `ChatCompletionRequest.stream` field is now public (was `pub(crate)`)
- Switched spell checker from codespell to typos
- CI no longer runs code generation — only `alef verify --exit-code` for freshness checks
- Updated alef to v0.5.9
- `alef.toml` configuration for 10 language targets, 23 API method call configs, mock server support
- `bindings.rs` adapter module with `create_client` and `create_client_from_json` binding-friendly constructors
- `Default` derives on all public types for binding compatibility
- `Clone` derive on `DefaultClient`
- E2E test fixtures converted to alef format (167+ fixtures across 23 categories)
- E2E tests regenerated for 13 languages with mock HTTP server support
- Test apps generated with `alef e2e generate --registry`
- API reference documentation auto-generated with `alef docs` for all 10 languages
- Package READMEs generated with `alef readme` using restored Jinja templates
- `alef-verify` and `alef-sync-versions` pre-commit hooks
- `alef verify --exit-code` step in CI validation workflow
- `.lychee.toml` link checker configuration
- `_typos.toml` spell checker configuration
- Auto-load API keys from environment variables
- FFI callback streaming support
- `chat_stream` method across all bindings
- `liter-llm-bindings-core` crate — replaced by alef codegen
- `tools/e2e-generator` crate — replaced by `alef e2e generate`
- `scripts/sync_versions.py` — replaced by `alef sync-versions`
- `scripts/generate_readme.py` — replaced by `alef readme`
- `scripts/readme_config.yaml` and `scripts/readme_templates/` — replaced by `templates/readme/`
- `tests/test_apps/` — replaced by `test_apps/` (alef registry mode)
- Hand-written binding source in `crates/liter-llm-{py,node,ffi,wasm,php}/src/`
- Hand-written package source in `packages/{go,java,csharp,ruby,elixir}/`
## 1.2.2 - 2026-04-18
- GitHub Copilot OAuth Device Flow credential provider (`copilot-auth` feature) — use your Copilot subscription as an LLM backend via the `github_copilot/` model prefix (#12)
- GitHub Copilot provider with OpenAI-compatible routing, required Copilot headers, per-request UUID, and `X-Initiator` header
- E2E test fixtures for GitHub Copilot provider (chat + auth error)
- Provider registry audit: corrected base URLs for 20 providers (aiml, assemblyai, clarifai, dashscope, deepseek, elevenlabs, firecrawl, friendliai, gradient_ai, gmi, helicone, lambda_ai, minimax, moonshot, morph, nlp_cloud, ollama, poe, stability, wandb)
- Provider registry audit: corrected env var names for 5 providers (cometapi, fal_ai, gradient_ai, jina_ai, venice)
- Provider registry audit: corrected endpoint lists for 6 providers (cometapi, deepinfra, elevenlabs, jina_ai, mistral, nvidia_nim)
- Added missing `base_url` and `auth` config for 11 previously non-functional providers (amazon_nova, baseten, compactifai, datarobot, docker_model_runner, duckduckgo, langgraph, lemonade, v0, vercel_ai_gateway, zai)
- Added 18 stub/infrastructure providers to the `complex_providers` list to prevent incorrect config-driven routing
- Added `nanogpt` param mapping (`max_completion_tokens` → `max_tokens`)
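As an illustration, the `nanogpt` entry in `schemas/providers.json` presumably carries a mapping along these lines; the field layout is a guess based on the `param_mappings` mechanism, not the actual schema:

```json
{
  "nanogpt": {
    "param_mappings": {
      "max_completion_tokens": "max_tokens"
    }
  }
}
```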
## 1.2.1 - 2026-04-17
- `LlmClientRaw` trait with `_raw` variants of all `LlmClient` methods, returning `RawExchange<T>` that exposes the final request body and raw provider response before normalization (#13)
- `RawExchange<T>` and `RawStreamExchange<S>` types for wire-level debugging and custom parsing
- MCP & IDE integration documentation with setup guides for VS Code, GitHub Copilot, Claude Desktop, Cursor (#12)
- Docker image now published to `ghcr.io/kreuzberg-dev/liter-llm` (#11)
- Docker publish workflow timeout increased from 60 to 360 minutes (multi-arch Rust builds via QEMU were timing out)
- Bedrock `build_url` tests no longer flake due to a `BEDROCK_CROSS_REGION` env var race condition
## 1.2.0 - 2026-04-07
- Local LLM provider support: Ollama, LM Studio, vLLM, llama.cpp, LocalAI, llamafile — use any local inference engine via an OpenAI-compatible API
- Docker Compose setup for local LLM integration testing with Ollama
- Integration test suite for local LLM providers
- PHP `onError` hook now passes a proper `\Exception` object instead of a plain string (PHP strict types require a `\Throwable`)
- README templates fixed for rumdl compliance (MD040 code fence language, MD031 blank lines, MD032 list spacing, MD020 closed headings)
- Added 404 to all POST endpoint OpenAPI specs (model not found on default model names)
- Homebrew badge added to all READMEs
## 1.1.1 - 2026-03-29
- Java Maven plugins downgraded to 3.x stable (was 4.0.0-beta, incompatible with Maven 3.9.x CI)
- PHP fixes: hook isolation (per-client instead of global), per-model budget enforcement, `onError` hook invocation, and a shutdown segfault
- PHP e2e tests set `max_retries=0` to prevent retry delays on mock 500s
- OpenAPI spec: added 400/415/422/503 status codes to all endpoints for schemathesis compliance
- `first_client()` returns 503 Service Unavailable instead of 500 for "no models configured"
- Schemathesis CI checks aligned (removed `content_type_conformance`, `not_a_server_error`)
- Docker cache: per-platform `TARGETARCH` cache IDs prevent multi-arch build races
- Homebrew formula: `brew tap kreuzberg-dev/tap && brew install liter-llm`
- Homebrew bottle builds (arm64_sequoia) in publish workflow
- `liter-llm-proxy` and `liter-llm-cli` added to the crates.io publish pipeline
- Installation docs: CLI/Docker/Homebrew tabs
- `scripts/publish/upload-homebrew-bottles.sh` and `ensure-github-release-exists.sh`
## 1.1.0 - 2026-03-29
OpenAI-compatible LLM proxy server with CLI, MCP tool server, and Docker support.
- 22 REST endpoints — full OpenAI-compatible API surface: chat completions (streaming + non-streaming), embeddings, models, images, audio (speech + transcription), moderations, rerank, search, OCR, files CRUD, batches CRUD, responses CRUD, health
- Tower middleware stack — reuses core middleware: cache, rate limit, budget, cost tracking, cooldown, health check, tracing
- Virtual API keys — in-memory key store with per-key model restrictions, RPM/TPM limits, budget limits
- Model routing — name-based routing to provider deployments, wildcard aliases, deterministic default client
- OpenDAL file storage — configurable backend (memory, S3, GCS, filesystem) for file operations
- SSE streaming — chat completion chunks proxied as Server-Sent Events with a `[DONE]` sentinel
- OpenAPI 3.1 — utoipa-generated spec served at `/openapi.json` with a bearer auth security scheme
- TOML configuration — `liter-llm-proxy.toml` with env var interpolation (`${VAR}`), auto-discovery, `deny_unknown_fields`
- CORS — configurable origins from config (default: allow all)
- Graceful shutdown — SIGINT/SIGTERM handling via `tokio::signal`
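Putting a few of these features together, a `liter-llm-proxy.toml` might look something like the following. All key names here are illustrative guesses, not the documented schema; consult the TOML configuration reference for the actual field names:

```toml
# liter-llm-proxy.toml — illustrative sketch only
host = "0.0.0.0"
port = 4000

# Env var interpolation: ${VAR} is substituted at load time.
[providers.openai]
api_key = "${OPENAI_API_KEY}"

[cors]
# Default behaviour is allow-all; this restricts it to one origin.
allow_origins = ["https://app.example.com"]
```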
- 22 tools — full parity with REST API: chat, embed, list_models, generate_image, speech, transcribe, moderate, rerank, search, ocr, file CRUD (5), batch CRUD (4), response CRUD (3)
- Transports — stdio (default) and HTTP/SSE via `StreamableHttpService`
- Parameter schemas — `schemars::JsonSchema` derives for MCP tool discovery
- `liter-llm api` — start the proxy server with config, host/port overrides, debug logging
- `liter-llm mcp` — start the MCP server with stdio or HTTP transport
- 3-tier config precedence: CLI flags > env vars > config file > defaults
- Multi-stage build: `rust:1.91-bookworm` builder, `cgr.dev/chainguard/glibc-dynamic` runtime (35MB)
- Non-root execution, OCI labels, port 4000 exposed
- `ENTRYPOINT ["liter-llm"]`, `CMD ["api", "--host", "0.0.0.0", "--port", "4000"]`
- 74 unit tests — config parsing, error mapping, auth key store, service pool, file store, streaming
- 32 integration tests — auth middleware, chat/embedding/models routes, error propagation, CORS, health, OpenAPI
- 12 proxy e2e fixtures — chat (basic + streaming), embeddings, models, auth errors, upstream errors, health, images, moderation, reranking
- Schemathesis — contract testing against the OpenAPI spec via Docker (`task proxy:schemathesis`)
- `.github/workflows/ci-docker.yaml` — build + health test + schemathesis contract tests
- `.github/workflows/publish-docker.yaml` — multi-arch (amd64/arm64) publish to `ghcr.io/kreuzberg-dev/liter-llm`
- Taskfile: `proxy:test`, `proxy:schemathesis`
## 1.0.0 - 2026-03-28
Initial stable release. Universal LLM API client with native bindings for 11 languages and 142+ providers.
- `LlmClient` trait with chat, chat_stream, embed, list_models, image_generate, speech, transcribe, moderate, rerank, search, ocr
- `FileClient`, `BatchClient`, `ResponseClient` traits for file/batch/response operations
- `DefaultClient` with reqwest + tokio, SSE streaming, retry with exponential backoff
- `ManagedClient` with a composable Tower middleware stack
- 142 LLM providers embedded at compile time from `schemas/providers.json`
- Per-request provider routing from the model name prefix (e.g. `anthropic/claude-sonnet-4-20250514`)
- `secrecy::SecretString` for API keys (zeroized on drop, never logged)
- TOML configuration file loading with auto-discovery (`liter-llm.toml`)
- Custom provider registration at runtime
- CacheLayer — in-memory LRU + pluggable backends via the `CacheStore` trait
- OpenDAL cache — 40+ storage backends (Redis, S3, GCS, filesystem, etc.) via Apache OpenDAL
- BudgetLayer — global + per-model spending limits with hard/soft enforcement
- HooksLayer — request/response/error lifecycle callbacks with guardrail pattern
- CooldownLayer — circuit breaker after transient errors
- ModelRateLimitLayer — per-model RPM/TPM rate limiting
- HealthCheckLayer — background health probing
- CostTrackingLayer — per-request cost calculation from embedded pricing registry
- TracingLayer — OpenTelemetry GenAI semantic convention spans
- FallbackLayer — automatic failover to backup provider
- RouterLayer — multi-deployment load balancing (round-robin, latency, cost, weighted)
All bindings expose the full API surface with language-idiomatic conventions:
- Python (PyO3) — async/await, typed kwargs, full .pyi stubs
- TypeScript / Node.js (NAPI-RS) — camelCase, .d.ts types, Promise-based
- Rust — native, zero-cost
- Go (cgo) — FFI wrapper with build tags, `context.Context` support
- Java (Panama FFM) — JDK 25+, `AutoCloseable`, builder pattern
- C# / .NET (P/Invoke) — async/await, `IAsyncEnumerable` streaming, `IDisposable`
- Ruby (Magnus) — RBS type signatures, Enumerator streaming
- Elixir (Rustler NIF) — `{:ok, result}` tuples, OTP-compatible
- PHP (ext-php-rs) — PHP 8.2+, JSON in/out, PIE packages
- WebAssembly (wasm-bindgen) — browser + Node.js, Fetch API
- C / FFI (cbindgen) — `extern "C"` with opaque handles
- Static API keys (Bearer, x-api-key)
- Azure AD OAuth2 client credentials
- Vertex AI service account JWT
- AWS STS Web Identity (EKS/IRSA)
- AWS SigV4 signing for Bedrock
- Anthropic: message format, tool use v1, thinking blocks, max_tokens default
- AWS Bedrock: Converse API, EventStream binary framing, cross-region routing
- Vertex AI: Gemini format, embedding `:predict` endpoint
- Google AI: embedding/list_models response transforms
- Cohere: citation handling
- Mistral: API compatibility
- `param_mappings` for config-driven field renaming (8 providers)
- MkDocs Material site at docs.liter-llm.kreuzberg.dev
- 170+ code snippets across 10 languages
- 11 API reference docs with full method coverage
- Usage pages: Chat & Streaming, Embeddings & Rerank, Media, Search & OCR, Files & Batches, Configuration
- TOML configuration reference
- llms.txt (218 lines) with capabilities, examples, provider list
- Skills directory (4,072 lines) for Claude Code integration
- README generation from Jinja templates via `scripts/generate_readme.py`
- 500+ unit and integration tests
- Middleware stack composition tests (cache + budget + hooks + rate limit + cooldown)
- Per-request provider routing tests
- File/batch/response CRUD operation tests
- Concurrency tests (budget atomicity, cache contention, rate limit fairness)
- Redis cache backend integration tests (Docker Compose)
- Live provider tests for 7 providers (OpenAI, Anthropic, Google AI, Vertex AI, Mistral, Azure, Bedrock)
- Smoke test apps for all 10 languages against real APIs
- E2E test generation from JSON fixtures across all languages
- Contract test fixtures for binding API parity
- Multi-platform publish pipeline: crates.io, PyPI, npm, RubyGems, Hex.pm, Maven Central, NuGet, Packagist, Go FFI, PHP PIE
- Pre-commit hooks: 43 linters across all languages
- Post-generation formatting in e2e-generator
- Version sync script across 27+ manifests with README regeneration
### Release candidate history (rc.1 through rc.9)
- rc.1 (2026-03-27): Initial release — core crate, 11 bindings, e2e generator
- rc.2 (2026-03-27): Packaging fixes for crates.io, RubyGems, Elixir NIF, Node NAPI, publish workflow
- rc.3 (2026-03-27): Cache, budget, hooks middleware; custom providers; TDD e2e fixtures
- rc.4 (2026-03-28): Shared bindings-core crate; camelCase conversion; real streaming across all bindings
- rc.5 (2026-03-28): OpenDAL cache; search/OCR endpoints; full middleware wiring; Go/Java/C# FFI rewrites; serde deny_unknown_fields; documentation overhaul
- rc.6 (2026-03-28): Full API documentation coverage; Rust crate README; version sync improvements
- rc.7 (2026-03-28): Binding parity (5 middleware params + search/ocr in all 10); contract test fixtures; skills directory; PHP PIE packages
- rc.8 (2026-03-28): CI fixes (PHP publish, crate order, Maven GPG, Ruby deps, Bedrock test)
- rc.9 (2026-03-28): Live provider tests; Anthropic/Bedrock/Google streaming fixes; TOML config loading; per-request provider routing; integration test suite