Release v0.3.0
Container Images (GHCR)
Always pin to a version tag in production; version tags are immutable.
docker pull ghcr.io/vllm-project/semantic-router/extproc:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/vllm-sr:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/anthropic-shim:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/dashboard:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/operator:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/operator-bundle:v0.3.0

| Image | Tags pushed |
|---|---|
| extproc | v0.3.0 · latest |
| vllm-sr | v0.3.0 · latest |
| anthropic-shim | v0.3.0 · latest |
| extproc-rocm | v0.3.0 · latest |
| vllm-sr-rocm | v0.3.0 · latest |
| llm-katan | v0.3.0 · latest |
| dashboard | v0.3.0 · latest |
| operator | v0.3.0 · latest |
| operator-bundle | v0.3.0 · latest |
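
The pull commands above can also be generated from a single version variable, which makes it harder to fall back to `latest` by accident. This is a minimal sketch, not part of the release tooling: the function name `pinned_refs` and the variables `REGISTRY` and `VERSION` are hypothetical, and the image list is copied from the six pull commands in this section.

```shell
#!/bin/sh
# Hypothetical helper: print pinned image references for one release.
# REGISTRY, VERSION, and pinned_refs are illustrative names only.
REGISTRY="ghcr.io/vllm-project/semantic-router"
VERSION="v0.3.0"

pinned_refs() {
  # Same six images as the docker pull commands in this section.
  for image in extproc vllm-sr anthropic-shim dashboard operator operator-bundle; do
    printf '%s/%s:%s\n' "$REGISTRY" "$image" "$VERSION"
  done
}

# Print one reference per line.
pinned_refs
```

To fetch the images, the output can be piped to `xargs -n 1 docker pull`.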
Helm Chart
# Install a specific version
helm install semantic-router \
oci://ghcr.io/vllm-project/charts/semantic-router \
--version 0.3.0 \
--namespace vllm-semantic-router-system --create-namespace
# Upgrade to this version
helm upgrade semantic-router \
oci://ghcr.io/vllm-project/charts/semantic-router \
--version 0.3.0 \
--namespace vllm-semantic-router-system
Python Package (PyPI)
pip install vllm-sr==0.3.0
Rust Crate (crates.io)
[dependencies]
candle-semantic-router = "0.3.0"
vllm-sr-sim ships on its own vllm-sr-sim-v* tag and is currently at 0.1.0.
See the upgrade and rollback runbook for step-by-step upgrade and rollback instructions.
What's Changed
- security: sanitize error responses to prevent infrastructure leakage by @yossiovadia in #1451
- [Dashboard]Polish dashboard auth and setup flow by @Xunzhuo in #1494
- [Doc] Update paper: Add author, Kubernetes note by @srampal in #1493
- [Doc] tolerate README admonition list style by @Xunzhuo in #1495
- [Dashboard]: Fix OpenClaw room chat history management by @Xunzhuo in #1503
- fix(dashboard): stop setup wizard validation loop by @Xunzhuo in #1504
- project: add roadmap for v0.3 by @Xunzhuo in #1527
- Fix traditional Candle BERT classifier registration after initialization by @pugafran in #1496
- test(memory): align e2e tests with direct per-turn storage flow by @yehuditkerido in #1433
- [Misc] Fix memory E2E tests with black by @drivebyer in #1543
- [CLI][CI/Build] Support isolated local stacks and harness routing by @Xunzhuo in #1542
- website: refine homepage system-brain narrative by @Xunzhuo in #1545
- [Bugfix][Router] Wrong /v1-suffixed OpenAI base URLs for vector store search by @drivebyer in #1541
- feat(classification): add entropy-based multi-category domain matching and signal confidence scores by @noalimoy in #1497
- observability: add Prometheus metric and tracing for cache write skips by @mkoushni in #1529
- feat(domain-signal): add UseModernBERT factory branch for category classifier by @noalimoy in #1532
- fix: restore perf toolchain usability and compilation by @NJX-njx in #1560
- Revert "feat(domain-signal): add UseModernBERT factory branch for category classifier" by @Xunzhuo in #1561
- [Bugfix][Router] Enforce per-decision semantic-cache opt-out on store paths by @Djanghao in #1558
- [Doc] Fix deprecated instruction for config.yaml validation command in #1554
- fix(memory): regenerate embedding before overwriting Content in InMemoryStore.Update by @rootfs in #1566
- [Doc]: improve skill scores for semantic-router by @popey in #1575
- refactor: stabilize the router api schema by @Xunzhuo in #1553
- Feat: Avoid quadratic slowdown when listing replay records by @drivebyer in #1569
- Chore: Clean up dead code for preference classifier by @ppppqp in #1567
- [Doc] Add homepage research carousel by @Xunzhuo in #1578
- test(memory): add per-decision plugin E2E tests by @yehuditkerido in #1562
- [Bugfix][selection]: make FileEloStorage dirty flag race-safe by @drivebyer in #1580
- feat: add inference fleet simulator by @rootfs in #1582
- [Bugfix][Router] Return consistent extproc request validation errors by @mildred522 in #1573
- feat: support insight management by @Xunzhuo in #1579
- [Feat] Migrate Fleet Simulator to vllm-sr-sim by @Xunzhuo in #1583
- chore(ci): fix production-stack OOM on e2e by @rootfs in #1591
- Feat: optimize modality request body rewriting with sjson by @drivebyer in #1585
- feat: harness upgrade by @Xunzhuo in #1592
- fix: Add missing *=> prefix to Redis KNN vector search query by @yossiovadia in #1587
- chore(precommit): update precommit container by @rootfs in #1589
- feat(dsl): add conflict detection, SIGNAL_GROUP, TEST blocks, and TIER routing by @rootfs in #1588
- feat(e2e): add dashboard E2E profile by @liavweiss in #1570
- feat(domain-signal): add UseModernBERT factory branch for category classifier by @noalimoy in #1572
- fix(candle-binding): pass real softmax probabilities from ModernBERT through FFI layer by @noalimoy in #1574
- fix: use enhancement label instead of non-existent feature request label by @KJyang-0114 in #1604
- [Bugfix] Skip cache for responses with personalized context by @yossiovadia in #1502
- [Refactor][Router][CLI] Retire structural debt in classifier.go and CLI models.py by @Djanghao in #1568
- [Doc] Sync website publications for March 2026 arXiv papers by @Xunzhuo in #1612
- fix(looper): enable auto_store and Response API translation for ImmediateResponse by @rootfs in #1615
- PRISM — 153-key legitimacy layer for model selection (v0.3 Themis) by @Mossaab-s in #1563
- agent: surface loop-mode policy in harness by @Xunzhuo in #1619
- refactor: runner by @ppppqp in #1621
- [Feat][Doc] add modality dataset export and judge-based verification by @Xunzhuo in #1613
- chore: upload arkworks by @Xunzhuo in #1622
- Implement spec-backed conflict-free routing workstream by @Xunzhuo in #1620
- paper: unify and update abs layout by @Xunzhuo in #1623
- fix: graceful model download skip when hf_token is not set by @liavweiss in #1626
- [Feat][CLI][Doc] Add OpenClaw VSR install bridge and OpenClaw config import by @Xunzhuo in #1627
- feat: redis hot cache layer for frequently accessed memories by @liavweiss in #1423
- [Feat]: Add structure signal family by @Xunzhuo in #1631
- feat(dashboard): restore system/signal eval workflows and tests by @mkoushni in #1624
- docs: add projection docs by @Xunzhuo in #1633
- Feat: optimize response body extract with gjson by @drivebyer in #1614
- deploy: add privacy routing recipe by @Xunzhuo in #1635
- fix(tests): correct projection taxonomy and balance warning baseline by @asaadbalum in #1636
- docs: update balance and fix cache bug by @Xunzhuo in #1634
- [CLI][Feat] Unify vllm-sr CLI for Docker and Kubernetes deployments by @abdallahsamabd in #1576
- [Feat][Router] Harden RAG and memory usability for production workflows by @asaadbalum in #1533
- [Router][Doc] Align privacy recipe with balance mock aliases by @Xunzhuo in #1639
- feat(memory): add Chat Completions API memory support by @yehuditkerido in #1363
- paper: add vision paper by @Xunzhuo in #1641
- fix: streaming e2e failures by @Xunzhuo in #1640
- refactor(extproc): decompose response pipeline into usage, cache, and memory phases by @noalimoy in #1642
- chore(precommit): ignore binary files by @rootfs in #1645
- feat: initial implementation of valkey cache backend by @daric93 in #1540
- feat(ci): add AST-based security scanner by @rootfs in #1647
- fix(ci): ensure scanner always produces valid JSON output by @rootfs in #1648
- fix: pin ROCm builder libc baseline and tighten playground/e2e defaults by @Xunzhuo in #1653
- [Doc] Add arXiv 2603.23013 to website research by @Xunzhuo in #1649
- docs: add amd deploy blog by @Xunzhuo in #1652
- feat: expose cache similarity score via x-vsr-cache-similarity header by @yossiovadia in #1617
- feat(e2e): expand production-stack, authz-rbac, and streaming profiles with targeted testcases by @liavweiss in #1651
- [Doc] Refresh homepage architecture and research content by @Xunzhuo in #1659
- feat(config): warn on unknown YAML fields at startup to catch typos by @yossiovadia in #1658
- fix: classify server merge logics by @Xunzhuo in #1660
- Add MiniMax as a first-class LLM provider by @octo-patch in #1662
- [CI/Build][Misc] simplify harness context, skills, and PR gates by @Xunzhuo in #1663
- feat: add kb management support and tool plugin by @Xunzhuo in #1654
- [Doc]zh-Hans documentation about overview part by @FAUST-BENCHOU in #1664
- [Doc]fix zh-Hans doc navbar to vLLM-SR by @FAUST-BENCHOU in #1666
- Fix KB lint regressions and sparse config fixture by @Xunzhuo in #1665
- feat(responsestore): default Response API backend to Redis for restart-safe storage by @yehuditkerido in #1661
- feat: topology deploy support by @haowu1234 in #1655
- Fix explicit regex keyword rules to compile as real regex by @tristaZero in #1672
- fix: dashboard kb validation failures by @Xunzhuo in #1674
- fix: onboard process didn't reload config by @Xunzhuo in #1675
- feat(tools): add DSL tuning framework with analytical trace diagnosis and three scenario plugins by @rootfs in #1669
- [Doc]zh-Hans documentation about tutorials part by @FAUST-BENCHOU in #1668
- [Doc]zh-Hans documentation about fleet-sim part by @FAUST-BENCHOU in #1676
- feat: add per-decision request parameter validation and stripping plugin by @NJX-njx in #1559
- [Doc]zh-Hans documentation about training and proposal part by @FAUST-BENCHOU in #1677
- [Revert]: remove zh legacy doc by @FAUST-BENCHOU in #1682
- [Doc]zh-Hans documentation about api part by @FAUST-BENCHOU in #1679
- [Doc]zh-Hans documentation about intro and installation part by @FAUST-BENCHOU in #1678
- Bugfix: prevent data races in Elo rating updates with deep copies and locking by @drivebyer in #1673
- fix(e2e): add security fix for authz-rbac integration test by @henschwartz in #1565
- ci: add automated skill review for SKILL.md pull requests by @popey in #1684
- Bugfix: bootstrap agent tooling via .venv-agent to avoid PEP 668 on Homebrew Python by @FAUST-BENCHOU in #1680
- [Bugfix] isolate semantic cache entries by user by @NJX-njx in #1538
- feat(dsl): LLM-generation-proof DSL and NL-to-DSL pipeline by @rootfs in #1686
- feat: support reask signal by @Xunzhuo in #1670
- chore: polish log and update selection docs by @Xunzhuo in #1689
- feat(routerreplay): default store_backend to postgres for durable replay by @yehuditkerido in #1683
- [Feat] Valkey vector store implementation by @daric93 in #1671
- [Dashboard][Feat] add natural language mode to DSL builder by @Xunzhuo in #1687
- [Fix][CLI] preserve timeout output in vllm-sr tests by @Xunzhuo in #1696
- fix: onboard activation failures by @Xunzhuo in #1697
- Fix playground open_web proxy routing by @haowu1234 in #1698
- fix: pypi pre release by @Xunzhuo in #1699
- fix: add golangci-lint exclusions for pre-existing violations by @asaadbalum in #1701
- project: update OWNERS with team by @Xunzhuo in #1703
- [Docs] Align PR template with vLLM style by @Xunzhuo in #1704
- feat: add playground builtins support by @haowu1234 in #1705
- feat: support embeddings clustering by @Xunzhuo in #1702
- Bugfix: add missing RLock in ToJSON() for all three selectors by @drivebyer in #1690
- [Feat][Router] Migrate custom Chat Completions structs to official SDK types by @asaadbalum in #1550
- [Dashboard] Fix readonly tool trace access in record detail by @Xunzhuo in #1707
- fix: resolve model config by provider model ID by @fbalicchia in #1708
- feat: support model runtime by @Xunzhuo in #1709
- feat(cache,memory): replace custom ChatCompletion types with official Go SDK by @asaadbalum in #1712
- feat(serve): default semantic cache to Milvus in vllm-sr serve by @asaadbalum in #1713
- refactor(milvus): share lifecycle across stores by @henschwartz in #1692
- refactor(fleet-sim): split fleet-sim optimizer kernels from public export surfaces by @altale in #1667
- fix: update defaults for milvus address by @Xunzhuo in #1716
- docs: refresh README homepage by @Xunzhuo in #1717
- docs: update problem state and use cases by @Xunzhuo in #1719
- docs: update project description by @Xunzhuo in #1722
- fix(ci): replace naive sleep with Milvus health check in memory integration test by @asaadbalum in #1730
- docs: polish layout for website by @Xunzhuo in #1731
- feat(cli): add vllm-sr chat command for one-shot completions by @asaadbalum in #1728
- chore(doc): update vision paper by @rootfs in #1732
- classification: replace whatlanggo with lingua-go for language detection by @tristaZero in #1720
- feat(observability): per-turn session token telemetry and metrics by @rootfs in #1736
- feat(observability): stamp per-turn pricing metadata and cumulative cost onto session log by @rootfs in #1740
- [Router][Bugfix]: parseBestMatch select best match from all KNN candidates by @drivebyer in #1762
- [Router] Fix: wire Valkey config into createSemanticCache by @daric93 in #1737
- [Router] refactor(memory): dedupe MilvusStore type-filter builder by @drivebyer in #1734
- [Router] Add SESSION_STATE declarations to the Routing DSL by @petern48 in #1763
- [CLI][Doc] feat/add vllm-sr eval command for router /api/v1/eval prompt checks by @AmyTao in #1735
- [Router] Add DSL recipe for Session State by @petern48 in #1765
- feat: ship versioned release channels and upgrade/rollback workflows for images, packages, and charts by @liavweiss in #1770
- feat: add supported/experimental tier classification to model-selection algorithms (#1514) by @szedan-rh in #1693
- [Router][CLI][Dashboard] Add Redis-backed startup status with API endpoint by @yehuditkerido in #1772
- [Router] feat(latency): add cache warmth estimator and session transition flow by @noCharger in #1768
- [Docs] Add token-budget-aware pool routing paper to website research by @Xunzhuo in #1777
- [CLI] Fix host parsing and path rewrite in Envoy config generator by @e1ijah1 in #1767
- [Router][CLI][E2E] Add durable MetadataRegistry for vector store and file metadata by @yehuditkerido in #1733
- docs: fix #1284 update gateway integration mode details and architecture in operator documentation by @wilsonwu in #1741
- [Router][Dashboard][CI/Build][Docs] align embedding defaults and gate classifier assets by @Xunzhuo in #1782
- feat: add conversational routing momentum (CRM) config surface by @Deepak8858 in #1771
- feat: add lookuptable support by @BruceLoveDecimal in #1773
- fix: apiserver model startup download by @Xunzhuo in #1783
- feat(replay): track Responses API function_call items by @BruceLoveDecimal in #1785
- [Feat]: add RBAC-to-router security policy integration by @abdallahsamabd in #1714
- [Router] Add Valkey memory backend with TLS support by @daric93 in #1739
- feat(dashboard): SQLite workflowstore for ML jobs and OpenClaw; healt… by @mkoushni in #1656
- [Router][Bugfix] Allow concurrent unified classifier batch execution by @NJX-njx in #1536
- fix: flaky TestValkeyStoreInteg_List ordering in CI by @daric93 in #1788
- [Router] Add bounded candidate iteration to DSL by @BruceLoveDecimal in #1786
- [Docs]: update the architecture and structure of the codebase by @AayushSaini101 in #1792
- [Router] fix HNSW insertion entry-point descent in in-memory cache by @cryo-zd in #1787
- feat(router-replay): add GET /v1/router_replay/trajectory endpoint by @ZhitongGuo in #1789
- [UI] Fix Web Search tooltip placement in composer by @AayushSaini101 in #1797
- feature: Persist SessionID and TurnIndex into replay records for multi-turn trajectory stitching by @FAUST-BENCHOU in #1800
- [Router] feat(replay): store structured tool-call fields aligned to OpenAI API spec by @nickaggarwal in #1790
- [Router] Add cache affinity routing bias by @noCharger in #1798
- feat(signal): add conversation signal family for multi-turn and tool-aware routing by @noalimoy in #1801
- feat(projections): unify projection contract across DSL, CLI, and das… by @ZhitongGuo in #1799
- [Feat]: add value_source: raw for projection score inputs (#1757) by @abdallahsamabd in #1802
- [CLI][Docs] Add Claude Code session-isolated install skill by @iamagenius00 in #1803
- docs: sync latest updates from AMD by @Xunzhuo in #1804
- docs(i18n): Fix missing Chinese translations by @csl458 in #1805
- [E2E]: e2e test for Persist SessionID and TurnIndex into replay records by @FAUST-BENCHOU in #1807
- feat: enable voice support feature in the playground by @AayushSaini101 in #1810
- docs(i18n): update zh-Hans by @csl458 in #1821
- fix(cache): replace timestamp-based fake randomness with math/rand/v2 in HNSW by @Cerdore in #1822
- [Router][Bugfix]: Pre-check search module version before FT.CREATE by @drivebyer in #1806
- chore(deps): bump lodash-es from 4.17.21 to 4.18.1 in /website in the npm_and_yarn group across 1 directory by @dependabot[bot] in #1811
- [Router]feat: support session-backed and lookup-backed signals in the routing DSL by @FAUST-BENCHOU in #1823
- chore(deps): bump the cargo group across 3 directories with 6 updates by @dependabot[bot] in #1816
- config: lazy-validate default knowledge bases (#1829) by @1fanwang in #1836
- chore(deps): bump rollup from 4.55.1 to 4.60.2 in /dashboard/frontend by @dependabot[bot] in #1820
- feat: implement hybrid history-aware retriever strategy by @FAUST-BENCHOU in #1840
- fix: bound tool-trace step count to prevent router OOM (#1835) by @SAY-5 in #1847
- feat: wire dynamic retriever into extproc tool selection flow by @KaveeshKhattar in #1841
- fix(security): add max_evaluation_chars to bound signal evaluation input size by @ramkrishs in #1850
- fix(binding): handle long prompt without oom by @rootfs in #1846
- feat: Bundle KB asset files into Helm chart by @FAUST-BENCHOU in #1854
- Feat/model switch gate by @BruceLoveDecimal in #1842
- feat: support hierarchical projection composition and dependency ordering by @asaadbalum in #1824
- feat(signals): Add EventContextSignal for event-driven request routing by @ramkrishs in #1848
- Feat/dynamic tool history signals by @BruceLoveDecimal in #1856
- feat: introduce pluggable tool retriever interface and registry by @FAUST-BENCHOU in #1858
- fix(extproc): wrap background goroutines with panic recovery (#1843) by @SAY-5 in #1844
- refactor: simplify searchLayerHybridInternal by returning sorted indices by @cryo-zd in #1859
- feat(replay): projection traces for replay + shared Milvus lifecycle (#1601, #1760) by @henschwartz in #1857
- feat(api): expose POST /api/v1/nli — Natural Language Inference as a first-class endpoint by @ramkrishs in #1865
- [Router]: Fix timestamp-based fake randomness with math/rand/v2 in selectLevel by @drivebyer in #1861
- config: add decision-level dynamic tool retrieval contract (#1832) by @1fanwang in #1870
- feat(signals): configurable confidence threshold for language signal (#1723) by @ramkrishs in #1864
- [Router] Implement TestConfigPathLoading test for Redis config file loading by @xiaotian-yu in #1872
- fix(classification): eliminate O(N²) BM25/N-gram classify amplification by @WUKUNTAI-0211 in #1871
- feat: refactor tool selection as a decision plugin with add/filter modes by @FAUST-BENCHOU in #1866
- [Router] Support image-modality queries in embedding signals by @shraderdm in #1867
- chore(cli): support nvidia gpu for vsr cli by @rootfs in #1851
- [Router] Wire image-modality embedding signals through the request path by @shraderdm in #1868
- [CLI] Add unit coverage for python config file loader follow-up to #1872 by @xiaotian-yu in #1875
- [Dashboard] Honor TARGET_ROUTER_API_URL when frontend sends placeholder eval endpoint by @SAY-5 in #1877
- chore(deps): bump the cargo group across 2 directories with 2 updates by @dependabot[bot] in #1882
- [Router] Add EMIT retention directive to the Routing DSL by @e1ijah1 in #1873
- [E2E] Centralize Helm release defaults in deployer by @ErikJiang in #1886
- [Router] Add DSL support for tools dynamic retrieval by @xiaotian-yu in #1884
- [Operator] Expose queryModality on IntelligentRoute EmbeddingSignal CRD by @shraderdm in #1880
- fix(precommit): add python3-venv so agent-pr-gate works locally by @shraderdm in #1894
- [Router] Reduce HybridCache rebuild preallocation by @cryo-zd in #1898
- test: e2e tests to verify tool selection by @FAUST-BENCHOU in #1887
- [Operator] Validate embedding modality contracts on IntelligentRoute reconcile by @shraderdm in #1895
- [Docs] Add coding agent manifests for Claude Code, Open Code, and Cursor by @rpathade in #1899
- feat: Adds Qdrant as a vector search provider by @Anush008 in #1869
- [Router] Add x-vsr-skip-processing header for full extproc passthrough by @siddharth1036 in #1878
- chore(deps): bump the uv group across 1 directory with 4 updates by @dependabot[bot] in #1903
- [Dashboard] Fix Signal Level review endpoint default in Evaluation create flow (#1885) by @xiaotian-yu in #1905
- [Router] add policy version store with shadow/activate/revert lifecycle by @rpathade in #1900
- [Router] Fix hybrid HNSW layer entry-point propagation by @cryo-zd in #1912
- [CLI] add Python validator support for tool_selection plugin by @e1ijah1 in #1916
- chore(deps): bump idna from 3.11 to 3.15 in /src/vllm-sr in the uv group across 1 directory by @dependabot[bot] in #1920
- chore(deps): bump openssl from 0.10.79 to 0.10.80 in /onnx-binding in the cargo group across 1 directory by @dependabot[bot] in #1921
- [Router][Docs] Add opt-in image-modality embedding pack by @shraderdm in #1896
- feat: shrink dashboard frontend route shell, config pages, and interaction containers by @FAUST-BENCHOU in #1913
- [Router][CLI] Custom Anthropic upstream routing and tool calling by @akshayv in #1922
- feat: retire second-wave structural debt in cli, cache, and startup surfaces by @FAUST-BENCHOU in #1914
- fix: FULL_DUPLEX_STREAMED + streamed immediate response by @AayushSaini101 in #1826
- [Docs] Update router API matrix for Anthropic upstream support by @akshayv in #1924
- [Router] add Anthropic streaming with OpenAI SSE translation by @akshayv in #1926
- fix: make training working directory configurable by @AayushSaini101 in #1919
- [E2E] Re-add image-modality embedding-signal routing profile by @shraderdm in #1881
- [CLI] Update config models to Pydantic ConfigDict by @immanuwell in #1923
- refactor: retire high-priority structural debt in dashboard backend hotspots by @FAUST-BENCHOU in #1925
- [Router] Normalize embedding model handling across cache backends by @cryo-zd in #1929
- [Router][Bugfix] Replace timestamp-seeded *rand.Rand with math/rand/v2 in RLDrivenSelector by @drivebyer in #1932
- fix(metrics): stop double-counting decision evaluation metrics by @WUKUNTAI-0211 in #1945
- fix: blocker of scan_malicious_code.py by antivirus by @AayushSaini101 in #1918
- feat(extproc): detect ClientProtocol from :path header by @siloteemu in #1937
- [Router][CLI] restore hybrid_history retrieval wiring and advanced_filtering schema parity by @e1ijah1 in #1935
- [Router][Bugfix] router_replay: fix silent postgres INSERT failure that leaves dashboard Insight empty by @WUKUNTAI-0211 in #1942
- feat(anthropic): native /v1/messages ingress — IR envelope + inbound parser by @siloteemu in #1938
- [CI/Build] vllm-sr: bump huggingface_hub to fix broken CLI install by @siloteemu in #1954
- [Router][Docs] add multi_factor selector composing quality/latency/cost/load (closes #37) by @WUKUNTAI-0211 in #1953
- [Training] fix jailbreak LoRA max_grad_norm trainer crash by @xiaotian-yu in #1948
- [Router][Docs] wire AutoMix entailment verifier into confidence cascade (closes #1173 AutoMix half) by @WUKUNTAI-0211 in #1952
- feat(openvino-binding): Add OpenVINO backend for ModernBERT embedding and classifier inference by @EmonLu in #1907
- [CI/Build] add Anthropic-shape e2e test backend (llama.cpp + shim) by @siloteemu in #1950
- feat: add CLIENT SETNAME for connection identification by @atao2004 in #1955
- fix(anthropic): preserve client passthrough fields through ToAnthropicRequestBody by @siloteemu in #1944
- [Router] Populate Chat Completions previous-model to unblock ModelSwitchGate enforce (#1753) by @theohsiung in #1960
- [Docs] Correct community meeting cadence wording (#1874) by @theohsiung in #1958
- [Router] Add multi_emit projection mapping method (#1759) by @theohsiung in #1959
- refactor: retire high-priority structural debt in operator backend hotspots by @FAUST-BENCHOU in #1930
- [CI/Build] Remove generated OpenVINO benchmark binary by @cryo-zd in #1963
- chore(deps): bump the cargo group across 2 directories with 1 update by @dependabot[bot] in #1964
- chore(deps): bump tar from 0.4.45 to 0.4.46 in /onnx-binding in the cargo group across 1 directory by @dependabot[bot] in #1965
- [Router] Fix OpenAI reasoning effort request mutation by @brelance in #1933
- [Docs] Add Agentgateway installation guide by @keithmattix in #1966
- [Router] Fix path traversal in config rollback version handling by @glitch-ux in #1968
- [Router][CLI][Dashboard][Operator][Bindings] Improve codebase health by @Xunzhuo in #1970
- feat: add file attachments feature on the playground by @AayushSaini101 in #1971
- chore: refresh agent harness and maintainer ops by @Xunzhuo in #1972
- feat: add session-aware agentic routing by @Xunzhuo in #1974
- [Router] Fix KB exemplar preload failure when no backend override is set by @Peterren in #1985
- fix: preserve agentic tool-loop hard lock by @Xunzhuo in #1980
- [Dashboard] refactor: add room collaboration event bus with unified WS/SSE fan-out by @FAUST-BENCHOU in #1973
- fix: request identity-encoded upstream responses by @Xunzhuo in #1977
- fix: record session header usage telemetry by @Xunzhuo in #1982
- feat: gate remaining-turn priors by replay confidence by @Xunzhuo in #1983
- Add remaining-turn-prior ablations for agentic routing by @Xunzhuo in #1984
- [Router] Validate mmbert_model_path rejects classic BERT models at config-load time by @Peterren in #1981
- docs: include anthropic shim in release contract by @Xunzhuo in #1975
- fix: lift observability out of helm resources by @Xunzhuo in #1976
- fix: reject native Windows docker runtime by @Xunzhuo in #1978
- [Bindings] Retire pre-existing clippy debt in src/ffi/embedding.rs by @shraderdm in #1997
- feat: add session-aware agentic routing by @Xunzhuo in #1989
- [Dashboard] Fail closed when the auth service cannot initialize by @glitch-ux in #1993
- fix(classification): scan prior user turns in history-aware PII/jailbreak signals by @WUKUNTAI-0211 in #1998
- fix: reconcile each IntelligentRoute/Pool when multiple coexist by @WUKUNTAI-0211 in #1999
- Fix/Openshift deployment by @Hadar301 in #1994
- [Dashboard] wire ClawRoom clients to room collaboration event bus by @FAUST-BENCHOU in #1990
- feat(cli): add 'vllm-sr model list' to inspect configured models by @WUKUNTAI-0211 in #2000
- refactor: retire high-priority structural debt in router backend hotspots by @FAUST-BENCHOU in #1969
- [Router] Optimize BM25 index construction and tokenization by @cryo-zd in #2002
- [Dashboard] Add built-in routing modes with missing-model completion by @Peterren in #1995
- feat(anthropic): header pass-through, session-id mirroring, and lossiness response headers by @siloteemu in #1939
- [Bindings] Apply SigLIP image normalization in vision encoder by @shraderdm in #1928
- [Docs] Fix agentgateway casing by @keithmattix in #2003
- [Router][E2E] Add MiniMax-M3 to provider model list and tests by @octo-patch in #2005
- [Bindings] Chunk mmBERT embedding attention to bound memory (#1957) by @Peterren in #2007
- v0.3 release readiness preparation by @Xunzhuo in #2004
- fix: align v0.3 routing metadata and docs surfaces by @Xunzhuo in #2011
- feat(cli): support custom container DNS via VLLM_SR_DNS by @WUKUNTAI-0211 in #2017
- [Bindings] Retire pre-existing clippy debt in multimodal_embedding.rs by @shraderdm in #1996
- [Dashboard] Helm chart: correct router URL env var name and add PVC fsGroup default by @shraderdm in #2019
- [Dashboard] complete room collaboration participants for ClawOS by @FAUST-BENCHOU in #2024
- [Docs] fix #1845: replace removed Docker Compose guide with vllm-sr CLI redirect by @theohsiung in #2023
- [Dashboard] Helm chart: add optional HTTPRoute for Gateway API ingress by @lauri-amd in #2025
- docs: add contributor leaderboard page by @Xunzhuo in #2026
- feat(anthropic): outbound emitter for non-streaming responses by @siloteemu in #1940
- [Bindings] Fix SigLIP attentional probe pooling head in candle-binding by @shraderdm in #1927
- [Perf] Add intent classifier accuracy benchmark by @shira-g in #2031
- [Bindings] Preprocess images in Rust with PIL-equivalent resize to close cosine drift by @shraderdm in #1943
- [Docs] fix operator image path in deploy instructions by @lauri-amd in #2036
- [Router] fix: streaming responses never hit the Redis semantic cache by @WUKUNTAI-0211 in #2034
- [Dashboard]feat: surface OpenClaw worker tool calls in ClawRoom from session jsonl by @FAUST-BENCHOU in #2028
- docs: refine contributor leaderboard generation by @Xunzhuo in #2037
- fix: replace O(N) slice LRU with container/list to eliminate write-l… by @gagandhakrey in #2040
- document write/update-recency eviction and add RAG cache unit tests by @gagandhakrey in #2041
- fix/streaming-chunks-memory-accumulation by @gagandhakrey in #2042
- [Dashboard] Helm chart: harden against multi-writer SQLite corruption by @shraderdm in #2047
- [CI/Build] Match Helm v4 schema-rejection wording in helm-safety-validate by @shraderdm in #2046
- [Docs]add leaderboard chinese version by @FAUST-BENCHOU in #2045
- [Docs]add chinese doc about Operator,Native-backends,Rollback,Valkey by @FAUST-BENCHOU in #2043
- [Env] Enable AMD platform GPU defaults by @Xunzhuo in #2044
- [Docs] Fix the hover style for the GitHub Models link by @wilsonwu in #2055
- [Docs] Fix Chinese page switch to English problem by @wilsonwu in #2052
- [Dashboard] Add frontend Prettier formatting config by @wilsonwu in #2050
- [Dashboard] Improve setup wizard model draft validation by @wilsonwu in #2048
- [Dashboard] Helm chart: document, schema-validate, and guard dashboard.podSecurityContext by @shraderdm in #2053
- [Dashboard] Helm chart: support a stable DASHBOARD_JWT_SECRET via existingSecret by @shraderdm in #2056
New Contributors
- @pugafran made their first contribution in #1496
- @drivebyer made their first contribution in #1543
- @NJX-njx made their first contribution in #1560
- @popey made their first contribution in #1575
- @mildred522 made their first contribution in #1573
- @KJyang-0114 made their first contribution in #1604
- @Mossaab-s made their first contribution in #1563
- @daric93 made their first contribution in #1540
- @octo-patch made their first contribution in #1662
- @FAUST-BENCHOU made their first contribution in #1664
- @tristaZero made their first contribution in #1672
- @fbalicchia made their first contribution in #1708
- @altale made their first contribution in #1667
- @petern48 made their first contribution in #1763
- @AmyTao made their first contribution in #1735
- @noCharger made their first contribution in #1768
- @e1ijah1 made their first contribution in #1767
- @Deepak8858 made their first contribution in #1771
- @BruceLoveDecimal made their first contribution in #1773
- @AayushSaini101 made their first contribution in #1792
- @ZhitongGuo made their first contribution in #1789
- @nickaggarwal made their first contribution in #1790
- @iamagenius00 made their first contribution in #1803
- @csl458 made their first contribution in #1805
- @Cerdore made their first contribution in #1822
- @dependabot[bot] made their first contribution in #1811
- @1fanwang made their first contribution in #1836
- @SAY-5 made their first contribution in #1847
- @KaveeshKhattar made their first contribution in #1841
- @ramkrishs made their first contribution in #1850
- @xiaotian-yu made their first contribution in #1872
- @WUKUNTAI-0211 made their first contribution in #1871
- @rpathade made their first contribution in #1899
- @Anush008 made their first contribution in #1869
- @siddharth1036 made their first contribution in #1878
- @akshayv made their first contribution in #1922
- @immanuwell made their first contribution in #1923
- @siloteemu made their first contribution in #1937
- @EmonLu made their first contribution in #1907
- @atao2004 made their first contribution in #1955
- @brelance made their first contribution in #1933
- @keithmattix made their first contribution in #1966
- @glitch-ux made their first contribution in #1968
- @Peterren made their first contribution in #1985
- @Hadar301 made their first contribution in #1994
- @lauri-amd made their first contribution in #2025
- @shira-g made their first contribution in #2031
- @gagandhakrey made their first contribution in #2040
Full Changelog: v0.2.0...v0.3.0