| 1 | +# Changelog |
| 2 | + |
| 3 | +All notable changes to FreeLLM are documented in this file. |
| 4 | + |
| 5 | +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), |
| 6 | +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). |
| 7 | + |
| 8 | +## [1.0.0] - 2026-04-08 |
| 9 | + |
| 10 | +First stable release. Production-ready OpenAI-compatible gateway aggregating |
| 11 | +6 free LLM providers with automatic failover, circuit breakers, and a |
| 12 | +real-time dashboard. |
| 13 | + |
| 14 | +### Added |
| 15 | + |
| 16 | +#### Gateway |
| 17 | +- OpenAI-compatible `/v1/chat/completions` endpoint with streaming and non-streaming support |
| 18 | +- 6 LLM providers: Groq, Gemini, Mistral, Cerebras, NVIDIA NIM, and Ollama |
| 19 | +- 25+ models across providers, including Llama 3.3 70B, Gemini 2.5 Flash/Pro, Llama 4 Scout, Qwen3, Nemotron 70B, DeepSeek R1, and GPT-OSS 120B |
| 20 | +- Three meta-models: `free` (round-robin), `free-fast` (latency-optimized), `free-smart` (capability-optimized) |
| 21 | +- Automatic failover across providers with configurable routing strategies (round-robin, random) |
| 22 | +- Per-provider circuit breakers with three states (closed → open → half-open) and configurable thresholds |
| 23 | +- Per-provider sliding-window rate limiting with conservative free-tier defaults |
| 24 | +- Per-client (per-IP) rate limiting via `express-rate-limit` |
| 25 | +- In-memory request log (last 500 requests) with stats and recent history |
| 26 | +- Routing deadline (`ROUTE_TIMEOUT_MS`) to prevent hung requests during cascading failures |
| 27 | + |
| 28 | +#### Security |
| 29 | +- Optional API key authentication (`FREELLM_API_KEY`) using timing-safe SHA-256 comparison |
| 30 | +- Separate admin key (`FREELLM_ADMIN_KEY`) protecting circuit breaker reset and routing strategy mutations |
| 31 | +- Configurable CORS origins (`ALLOWED_ORIGINS`) |
| 32 | +- Body size limits on JSON and URL-encoded payloads |
| 33 | +- Zod schema validation in strict mode, with `messages` capped at 256 entries (`messages.max(256)`) and `max_tokens` capped at 32768 (`max_tokens.max(32768)`) |
| 34 | +- Upstream error sanitization (only the safe `message` field is forwarded, never the raw upstream JSON) |
| 35 | +- Production warning when running without API key auth |
| 36 | + |
| 37 | +#### Dashboard |
| 38 | +- React 18 + Vite + Tailwind SPA served by the same Express process in production |
| 39 | +- Real-time provider health cards (circuit breaker state, success/failure counts, last error) |
| 40 | +- Live request log with latency, status, model, and selected provider |
| 41 | +- Routing strategy toggle (round-robin / random) |
| 42 | +- Manual circuit breaker reset |
| 43 | +- Models page with search and grouping by provider |
| 44 | +- Mobile-responsive layout with slide-over menu |
| 45 | +- New FreeLLM logo as favicon and Open Graph image |
| 46 | + |
| 47 | +#### Deployment |
| 48 | +- Multi-stage Dockerfile (Node 22 LTS, non-root `appuser`, healthcheck baked in) |
| 49 | +- `docker-compose.yml` for one-command local deployment |
| 50 | +- `railway.json` for Railway auto-detection with healthcheck and restart policy |
| 51 | +- Graceful shutdown on SIGTERM/SIGINT (drains in-flight requests, 8s deadline) |
| 52 | +- `app.set("trust proxy", 1)` for correct client IP behind reverse proxies |
| 53 | +- Static dashboard serving with SPA fallback for client-side routing |
| 54 | +- Production-ready logging via Pino with structured JSON output |
| 55 | + |
| 56 | +#### Developer Experience |
| 57 | +- pnpm workspace monorepo with shared dependency catalog |
| 58 | +- TypeScript 5.9 across all packages with `bundler` module resolution |
| 59 | +- esbuild bundle for the API server with CJS shim for Pino compatibility |
| 60 | +- OpenAPI 3.1 spec as the single source of truth for the API client |
| 61 | +- Auto-generated React Query hooks via Orval (`@workspace/api-client-react`) |
| 62 | +- Knip configuration for unused export detection |
| 63 | +- `scripts/test-gateway.sh` end-to-end test suite with 18 checks (health, models, status, completions, streaming, NIM direct, validation) |
| 64 | + |
| 65 | +### Documentation |
| 66 | +- Comprehensive README with quickstart (Docker + local), provider table, API reference, security guide, and tech stack |
| 67 | +- Mermaid diagrams for request lifecycle, circuit breaker state machine, routing strategies, and high-level architecture |
| 68 | +- MIT license |
| 69 | +- Architecture refactor plan in `docs/superpowers/plans/` |
| 70 | + |
| 71 | +[1.0.0]: https://github.com/devansh-365/freellm/releases/tag/v1.0.0 |
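
The Gateway entry on per-provider circuit breakers names three states (closed → open → half-open) with configurable thresholds. A minimal sketch of such a state machine follows. It is not taken from the FreeLLM source: the class name `CircuitBreaker`, the parameters `failureThreshold` and `cooldownMs`, their defaults, and the injectable clock are all hypothetical, and the sketch assumes a consecutive-failure threshold with a fixed cooldown.

```typescript
// Illustrative sketch only; names and defaults are assumptions, not FreeLLM's code.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0; // consecutive failures since the last success
  private openedAt = 0; // clock reading when the breaker last opened
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number = 3,
    cooldownMs: number = 30_000,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Returns true if a request may be sent to the provider.
  // An open breaker moves to half-open once the cooldown has elapsed.
  // Simplification: half-open admits every request, not a single probe.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  // Any success closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failure while half-open, or reaching the threshold, opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A router could keep one such breaker per provider, skip any provider whose `canRequest()` returns false, and call `recordSuccess()` or `recordFailure()` after each upstream response. Injecting the clock keeps the cooldown logic testable without real delays.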
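
The Security entry describes API key authentication (`FREELLM_API_KEY`) using a timing-safe SHA-256 comparison. One common way to do this in Node.js is sketched below, under the assumption that both the presented and the expected key are hashed before comparison. The helper names `sha256` and `keysMatch` are hypothetical and do not come from the FreeLLM source.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: hash a string to a fixed-length 32-byte digest.
function sha256(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Hypothetical helper: compare two keys without leaking timing information.
// timingSafeEqual throws if the buffers differ in length, so hashing first
// guarantees equal lengths and also hides the length of the expected key.
function keysMatch(presented: string, expected: string): boolean {
  return timingSafeEqual(sha256(presented), sha256(expected));
}
```

An auth middleware might call `keysMatch(tokenFromHeader, process.env.FREELLM_API_KEY ?? "")` and reject the request when it returns false. How the real gateway extracts the token is not stated in this changelog.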