Release v0.3.0

@github-actions released this 05 Jun 12:08 · 104 commits to main since this release

Container Images (GHCR)

Always pin to a version tag in production: unlike latest, a version tag is immutable.

docker pull ghcr.io/vllm-project/semantic-router/extproc:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/vllm-sr:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/anthropic-shim:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/dashboard:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/operator:v0.3.0
docker pull ghcr.io/vllm-project/semantic-router/operator-bundle:v0.3.0

Image             Tags pushed
extproc           v0.3.0 · latest
vllm-sr           v0.3.0 · latest
anthropic-shim    v0.3.0 · latest
extproc-rocm      v0.3.0 · latest
vllm-sr-rocm      v0.3.0 · latest
llm-katan         v0.3.0 · latest
dashboard         v0.3.0 · latest
operator          v0.3.0 · latest
operator-bundle   v0.3.0 · latest
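
A minimal sketch of the pinning advice above, assuming a POSIX shell. It builds the version-pinned references for the six images in the pull commands from a single VERSION variable, so the tag is set in one place. The image_refs function and the variable names are illustrative and not part of the release tooling.

```shell
#!/bin/sh
# Illustrative helper (not part of the release tooling):
# print fully qualified, version-pinned image references.
REGISTRY="ghcr.io/vllm-project/semantic-router"
VERSION="v0.3.0"   # pin an explicit version tag rather than the moving "latest" tag

image_refs() {
  # Image names taken from the docker pull commands in this release.
  for name in extproc vllm-sr anthropic-shim dashboard operator operator-bundle; do
    printf '%s/%s:%s\n' "$REGISTRY" "$name" "$VERSION"
  done
}

image_refs
```

To pull every pinned image, the output can be piped to Docker, for example: image_refs | xargs -n 1 docker pull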

Helm Chart

# Install a specific version
helm install semantic-router \
  oci://ghcr.io/vllm-project/charts/semantic-router \
  --version 0.3.0 \
  --namespace vllm-semantic-router-system --create-namespace

# Upgrade to this version
helm upgrade semantic-router \
  oci://ghcr.io/vllm-project/charts/semantic-router \
  --version 0.3.0 \
  --namespace vllm-semantic-router-system

Python Package (PyPI)

pip install vllm-sr==0.3.0

Rust Crate (crates.io)

[dependencies]
candle-semantic-router = "0.3.0"

vllm-sr-sim ships on its own vllm-sr-sim-v* tag and is currently at
0.1.0.

See the upgrade and rollback runbook
for step-by-step upgrade and rollback instructions.

What's Changed

  • security: sanitize error responses to prevent infrastructure leakage by @yossiovadia in #1451
  • [Dashboard]Polish dashboard auth and setup flow by @Xunzhuo in #1494
  • [Doc] Update paper: Add author, Kubernetes note by @srampal in #1493
  • [Doc] tolerate README admonition list style by @Xunzhuo in #1495
  • [Dashboard]: Fix OpenClaw room chat history management by @Xunzhuo in #1503
  • fix(dashboard): stop setup wizard validation loop by @Xunzhuo in #1504
  • project: add roadmap for v0.3 by @Xunzhuo in #1527
  • Fix traditional Candle BERT classifier registration after initialization by @pugafran in #1496
  • test(memory): align e2e tests with direct per-turn storage flow by @yehuditkerido in #1433
  • [Misc] Fix memory E2E tests with black by @drivebyer in #1543
  • [CLI][CI/Build] Support isolated local stacks and harness routing by @Xunzhuo in #1542
  • website: refine homepage system-brain narrative by @Xunzhuo in #1545
  • [Bugfix][Router] Wrong /v1-suffixed OpenAI base URLs for vector store search by @drivebyer in #1541
  • feat(classification): add entropy-based multi-category domain matching and signal confidence scores by @noalimoy in #1497
  • observability: add Prometheus metric and tracing for cache write skips by @mkoushni in #1529
  • feat(domain-signal): add UseModernBERT factory branch for category classifier by @noalimoy in #1532
  • fix: restore perf toolchain usability and compilation by @NJX-njx in #1560
  • Revert "feat(domain-signal): add UseModernBERT factory branch for category classifier" by @Xunzhuo in #1561
  • [Bugfix][Router] Enforce per-decision semantic-cache opt-out on store paths by @Djanghao in #1558
  • [Doc] Fix deprecated instruction for config.yaml validation command in #1554
  • fix(memory): regenerate embedding before overwriting Content in InMemoryStore.Update by @rootfs in #1566
  • [Doc]: improve skill scores for semantic-router by @popey in #1575
  • refactor: stabilize the router api schema by @Xunzhuo in #1553
  • Feat: Avoid quadratic slowdown when listing replay records by @drivebyer in #1569
  • Chore: Clean up dead code for preference classifier by @ppppqp in #1567
  • [Doc] Add homepage research carousel by @Xunzhuo in #1578
  • test(memory): add per-decision plugin E2E tests by @yehuditkerido in #1562
  • [Bugfix][selection]: make FileEloStorage dirty flag race-safe by @drivebyer in #1580
  • feat: add inference fleet simulator by @rootfs in #1582
  • [Bugfix][Router] Return consistent extproc request validation errors by @mildred522 in #1573
  • feat: support insight management by @Xunzhuo in #1579
  • [Feat] Migrate Fleet Simulator to vllm-sr-sim by @Xunzhuo in #1583
  • chore(ci): fix production-stack OOM on e2e by @rootfs in #1591
  • Feat: optimize modality request body rewriting with sjson by @drivebyer in #1585
  • feat: harness upgrade by @Xunzhuo in #1592
  • fix: Add missing *=> prefix to Redis KNN vector search query by @yossiovadia in #1587
  • chore(precommit): update precommit container by @rootfs in #1589
  • feat(dsl): add conflict detection, SIGNAL_GROUP, TEST blocks, and TIER routing by @rootfs in #1588
  • feat(e2e): add dashboard E2E profile by @liavweiss in #1570
  • feat(domain-signal): add UseModernBERT factory branch for category classifier by @noalimoy in #1572
  • fix(candle-binding): pass real softmax probabilities from ModernBERT through FFI layer by @noalimoy in #1574
  • fix: use enhancement label instead of non-existent feature request label by @KJyang-0114 in #1604
  • [Bugfix] Skip cache for responses with personalized context by @yossiovadia in #1502
  • [Refactor][Router][CLI] Retire structural debt in classifier.go and CLI models.py by @Djanghao in #1568
  • [Doc] Sync website publications for March 2026 arXiv papers by @Xunzhuo in #1612
  • fix(looper): enable auto_store and Response API translation for ImmediateResponse by @rootfs in #1615
  • PRISM — 153-key legitimacy layer for model selection (v0.3 Themis) by @Mossaab-s in #1563
  • agent: surface loop-mode policy in harness by @Xunzhuo in #1619
  • refactor: runner by @ppppqp in #1621
  • [Feat][Doc] add modality dataset export and judge-based verification by @Xunzhuo in #1613
  • chore: upload arkworks by @Xunzhuo in #1622
  • Implement spec-backed conflict-free routing workstream by @Xunzhuo in #1620
  • paper: unify and update abs layout by @Xunzhuo in #1623
  • fix: graceful model download skip when hf_token is not set by @liavweiss in #1626
  • [Feat][CLI][Doc] Add OpenClaw VSR install bridge and OpenClaw config import by @Xunzhuo in #1627
  • feat: redis hot cache layer for frequently accessed memories by @liavweiss in #1423
  • [Feat]: Add structure signal family by @Xunzhuo in #1631
  • feat(dashboard): restore system/signal eval workflows and tests by @mkoushni in #1624
  • docs: add projection docs by @Xunzhuo in #1633
  • Feat: optimize response body extract with gjson by @drivebyer in #1614
  • deploy: add privacy routing recipe by @Xunzhuo in #1635
  • fix(tests): correct projection taxonomy and balance warning baseline by @asaadbalum in #1636
  • docs: update balance and fix cache bug by @Xunzhuo in #1634
  • [CLI][Feat] Unify vllm-sr CLI for Docker and Kubernetes deployments by @abdallahsamabd in #1576
  • [Feat][Router] Harden RAG and memory usability for production workflows by @asaadbalum in #1533
  • [Router][Doc] Align privacy recipe with balance mock aliases by @Xunzhuo in #1639
  • feat(memory): add Chat Completions API memory support by @yehuditkerido in #1363
  • paper: add vision paper by @Xunzhuo in #1641
  • fix: streaming e2e failures by @Xunzhuo in #1640
  • refactor(extproc): decompose response pipeline into usage, cache, and memory phases by @noalimoy in #1642
  • chore(precommit): ignore binary files by @rootfs in #1645
  • feat: initial implementation of valkey cache backend by @daric93 in #1540
  • feat(ci): add AST-based security scanner by @rootfs in #1647
  • fix(ci): ensure scanner always produces valid JSON output by @rootfs in #1648
  • fix: pin ROCm builder libc baseline and tighten playground/e2e defaults by @Xunzhuo in #1653
  • [Doc] Add arXiv 2603.23013 to website research by @Xunzhuo in #1649
  • docs: add amd deploy blog by @Xunzhuo in #1652
  • feat: expose cache similarity score via x-vsr-cache-similarity header by @yossiovadia in #1617
  • feat(e2e): expand production-stack, authz-rbac, and streaming profiles with targeted testcases by @liavweiss in #1651
  • [Doc] Refresh homepage architecture and research content by @Xunzhuo in #1659
  • feat(config): warn on unknown YAML fields at startup to catch typos by @yossiovadia in #1658
  • fix: classify server merge logics by @Xunzhuo in #1660
  • Add MiniMax as a first-class LLM provider by @octo-patch in #1662
  • [CI/Build][Misc] simplify harness context, skills, and PR gates by @Xunzhuo in #1663
  • feat: add kb management support and tool plugin by @Xunzhuo in #1654
  • [Doc]zh-Hans documentation about overview part by @FAUST-BENCHOU in #1664
  • [Doc]fix zh-Hans doc navbar to vLLM-SR by @FAUST-BENCHOU in #1666
  • Fix KB lint regressions and sparse config fixture by @Xunzhuo in #1665
  • feat(responsestore): default Response API backend to Redis for restart-safe storage by @yehuditkerido in #1661
  • feat: topology deploy support by @haowu1234 in #1655
  • Fix explicit regex keyword rules to compile as real regex by @tristaZero in #1672
  • fix: dashboard kb validation failures by @Xunzhuo in #1674
  • fix: onboard process didn't reload config by @Xunzhuo in #1675
  • feat(tools): add DSL tuning framework with analytical trace diagnosis and three scenario plugins by @rootfs in #1669
  • [Doc]zh-Hans documentation about tutorials part by @FAUST-BENCHOU in #1668
  • [Doc]zh-Hans documentation about fleet-sim part by @FAUST-BENCHOU in #1676
  • feat: add per-decision request parameter validation and stripping plugin by @NJX-njx in #1559
  • [Doc]zh-Hans documentation about training and proposal part by @FAUST-BENCHOU in #1677
  • [Revert]: remove zh legacy doc by @FAUST-BENCHOU in #1682
  • [Doc]zh-Hans documentation about api part by @FAUST-BENCHOU in #1679
  • [Doc]zh-Hans documentation about intro and installation part by @FAUST-BENCHOU in #1678
  • Bugfix: prevent data races in Elo rating updates with deep copies and locking by @drivebyer in #1673
  • fix(e2e): add security fix for authz-rbac integration test by @henschwartz in #1565
  • ci: add automated skill review for SKILL.md pull requests by @popey in #1684
  • Bugfix: bootstrap agent tooling via .venv-agent to avoid PEP 668 on Homebrew Python by @FAUST-BENCHOU in #1680
  • [Bugfix] isolate semantic cache entries by user by @NJX-njx in #1538
  • feat(dsl): LLM-generation-proof DSL and NL-to-DSL pipeline by @rootfs in #1686
  • feat: support reask signal by @Xunzhuo in #1670
  • chore: polish log and update selection docs by @Xunzhuo in #1689
  • feat(routerreplay): default store_backend to postgres for durable replay by @yehuditkerido in #1683
  • [Feat] Valkey vector store implementation by @daric93 in #1671
  • [Dashboard][Feat] add natural language mode to DSL builder by @Xunzhuo in #1687
  • [Fix][CLI] preserve timeout output in vllm-sr tests by @Xunzhuo in #1696
  • fix: onboard activation failures by @Xunzhuo in #1697
  • Fix playground open_web proxy routing by @haowu1234 in #1698
  • fix: pypi pre release by @Xunzhuo in #1699
  • fix: add golangci-lint exclusions for pre-existing violations by @asaadbalum in #1701
  • project: update OWNERS with team by @Xunzhuo in #1703
  • [Docs] Align PR template with vLLM style by @Xunzhuo in #1704
  • feat: add playground builtins support by @haowu1234 in #1705
  • feat: support embeddings clustering by @Xunzhuo in #1702
  • Bugfix: add missing RLock in ToJSON() for all three selectors by @drivebyer in #1690
  • [Feat][Router] Migrate custom Chat Completions structs to official SDK types by @asaadbalum in #1550
  • [Dashboard] Fix readonly tool trace access in record detail by @Xunzhuo in #1707
  • fix: resolve model config by provider model ID by @fbalicchia in #1708
  • feat: support model runtime by @Xunzhuo in #1709
  • feat(cache,memory): replace custom ChatCompletion types with official Go SDK by @asaadbalum in #1712
  • feat(serve): default semantic cache to Milvus in vllm-sr serve by @asaadbalum in #1713
  • refactor(milvus): share lifecycle across stores by @henschwartz in #1692
  • refactor(fleet-sim): split fleet-sim optimizer kernels from public export surfaces by @altale in #1667
  • fix: update defaults for milvus address by @Xunzhuo in #1716
  • docs: refresh README homepage by @Xunzhuo in #1717
  • docs: update problem state and use cases by @Xunzhuo in #1719
  • docs: update project description by @Xunzhuo in #1722
  • fix(ci): replace naive sleep with Milvus health check in memory integration test by @asaadbalum in #1730
  • docs: polish layout for website by @Xunzhuo in #1731
  • feat(cli): add vllm-sr chat command for one-shot completions by @asaadbalum in #1728
  • chore(doc): update vision paper by @rootfs in #1732
  • classification: replace whatlanggo with lingua-go for language detection by @tristaZero in #1720
  • feat(observability): per-turn session token telemetry and metrics by @rootfs in #1736
  • feat(observability): stamp per-turn pricing metadata and cumulative cost onto session log by @rootfs in #1740
  • [Router][Bugfix]: parseBestMatch select best match from all KNN candidates by @drivebyer in #1762
  • [Router] Fix: wire Valkey config into createSemanticCache by @daric93 in #1737
  • [Router] refactor(memory): dedupe MilvusStore type-filter builder by @drivebyer in #1734
  • [Router] Add SESSION_STATE declarations to the Routing DSL by @petern48 in #1763
  • [CLI][Doc] feat/add vllm-sr eval command for router /api/v1/eval prompt checks by @AmyTao in #1735
  • [Router] Add DSL recipe for Session State by @petern48 in #1765
  • feat: ship versioned release channels and upgrade/rollback workflows for images, packages, and charts by @liavweiss in #1770
  • feat: add supported/experimental tier classification to model-selection algorithms (#1514) by @szedan-rh in #1693
  • [Router][CLI][Dashboard] Add Redis-backed startup status with API endpoint by @yehuditkerido in #1772
  • [Router] feat(latency): add cache warmth estimator and session transition flow by @noCharger in #1768
  • [Docs] Add token-budget-aware pool routing paper to website research by @Xunzhuo in #1777
  • [CLI] Fix host parsing and path rewrite in Envoy config generator by @e1ijah1 in #1767
  • [Router][CLI][E2E] Add durable MetadataRegistry for vector store and file metadata by @yehuditkerido in #1733
  • docs: fix #1284 update gateway integration mode details and architecture in operator documentation by @wilsonwu in #1741
  • [Router][Dashboard][CI/Build][Docs] align embedding defaults and gate classifier assets by @Xunzhuo in #1782
  • feat: add conversational routing momentum (CRM) config surface by @Deepak8858 in #1771
  • feat: add lookuptable support by @BruceLoveDecimal in #1773
  • fix: apiserver model startup download by @Xunzhuo in #1783
  • feat(replay): track Responses API function_call items by @BruceLoveDecimal in #1785
  • [Feat]: add RBAC-to-router security policy integration by @abdallahsamabd in #1714
  • [Router] Add Valkey memory backend with TLS support by @daric93 in #1739
  • feat(dashboard): SQLite workflowstore for ML jobs and OpenClaw; healt… by @mkoushni in #1656
  • [Router][Bugfix] Allow concurrent unified classifier batch execution by @NJX-njx in #1536
  • fix: flaky TestValkeyStoreInteg_List ordering in CI by @daric93 in #1788
  • [Router] Add bounded candidate iteration to DSL by @BruceLoveDecimal in #1786
  • [Docs]: update the architecture and structure of the codebase by @AayushSaini101 in #1792
  • [Router] fix HNSW insertion entry-point descent in in-memory cache by @cryo-zd in #1787
  • feat(router-replay): add GET /v1/router_replay/trajectory endpoint by @ZhitongGuo in #1789
  • [UI] Fix Web Search tooltip placement in composer by @AayushSaini101 in #1797
  • feature: Persist SessionID and TurnIndex into replay records for multi-turn trajectory stitching by @FAUST-BENCHOU in #1800
  • [Router] feat(replay): store structured tool-call fields aligned to OpenAI API spec by @nickaggarwal in #1790
  • [Router] Add cache affinity routing bias by @noCharger in #1798
  • feat(signal): add conversation signal family for multi-turn and tool-aware routing by @noalimoy in #1801
  • feat(projections): unify projection contract across DSL, CLI, and das… by @ZhitongGuo in #1799
  • [Feat]: add value_source: raw for projection score inputs (#1757) by @abdallahsamabd in #1802
  • [CLI][Docs] Add Claude Code session-isolated install skill by @iamagenius00 in #1803
  • docs: sync latest updates from AMD by @Xunzhuo in #1804
  • docs(i18n): Fix missing Chinese translations by @csl458 in #1805
  • [E2E]: e2e test for Persist SessionID and TurnIndex into replay records by @FAUST-BENCHOU in #1807
  • feat: enable voice support feature in the playground by @AayushSaini101 in #1810
  • docs(i18n): update zh-Hans by @csl458 in #1821
  • fix(cache): replace timestamp-based fake randomness with math/rand/v2 in HNSW by @Cerdore in #1822
  • [Router][Bugfix]: Pre-check search module version before FT.CREATE by @drivebyer in #1806
  • chore(deps): bump lodash-es from 4.17.21 to 4.18.1 in /website in the npm_and_yarn group across 1 directory by @dependabot[bot] in #1811
  • [Router]feat: support session-backed and lookup-backed signals in the routing DSL by @FAUST-BENCHOU in #1823
  • chore(deps): bump the cargo group across 3 directories with 6 updates by @dependabot[bot] in #1816
  • config: lazy-validate default knowledge bases (#1829) by @1fanwang in #1836
  • chore(deps): bump rollup from 4.55.1 to 4.60.2 in /dashboard/frontend by @dependabot[bot] in #1820
  • feat: implement hybrid history-aware retriever strategy by @FAUST-BENCHOU in #1840
  • fix: bound tool-trace step count to prevent router OOM (#1835) by @SAY-5 in #1847
  • feat: wire dynamic retriever into extproc tool selection flow by @KaveeshKhattar in #1841
  • fix(security): add max_evaluation_chars to bound signal evaluation input size by @ramkrishs in #1850
  • fix(binding): handle long prompt without oom by @rootfs in #1846
  • feat: Bundle KB asset files into Helm chart by @FAUST-BENCHOU in #1854
  • Feat/model switch gate by @BruceLoveDecimal in #1842
  • feat: support hierarchical projection composition and dependency ordering by @asaadbalum in #1824
  • feat(signals): Add EventContextSignal for event-driven request routing by @ramkrishs in #1848
  • Feat/dynamic tool history signals by @BruceLoveDecimal in #1856
  • feat: introduce pluggable tool retriever interface and registry by @FAUST-BENCHOU in #1858
  • fix(extproc): wrap background goroutines with panic recovery (#1843) by @SAY-5 in #1844
  • refactor: simplify searchLayerHybridInternal by returning sorted indices by @cryo-zd in #1859
  • feat(replay): projection traces for replay + shared Milvus lifecycle (#1601, #1760) by @henschwartz in #1857
  • feat(api): expose POST /api/v1/nli — Natural Language Inference as a first-class endpoint by @ramkrishs in #1865
  • [Router]: Fix timestamp-based fake randomness with math/rand/v2 in selectLevel by @drivebyer in #1861
  • config: add decision-level dynamic tool retrieval contract (#1832) by @1fanwang in #1870
  • feat(signals): configurable confidence threshold for language signal (#1723) by @ramkrishs in #1864
  • [Router] Implement TestConfigPathLoading test for Redis config file loading by @xiaotian-yu in #1872
  • fix(classification): eliminate O(N²) BM25/N-gram classify amplification by @WUKUNTAI-0211 in #1871
  • feat: refactor tool selection as a decision plugin with add/filter modes by @FAUST-BENCHOU in #1866
  • [Router] Support image-modality queries in embedding signals by @shraderdm in #1867
  • chore(cli): support nvidia gpu for vsr cli by @rootfs in #1851
  • [Router] Wire image-modality embedding signals through the request path by @shraderdm in #1868
  • [CLI] Add unit coverage for python config file loader follow-up to #1872 by @xiaotian-yu in #1875
  • [Dashboard] Honor TARGET_ROUTER_API_URL when frontend sends placeholder eval endpoint by @SAY-5 in #1877
  • chore(deps): bump the cargo group across 2 directories with 2 updates by @dependabot[bot] in #1882
  • [Router] Add EMIT retention directive to the Routing DSL by @e1ijah1 in #1873
  • [E2E] Centralize Helm release defaults in deployer by @ErikJiang in #1886
  • [Router] Add DSL support for tools dynamic retrieval by @xiaotian-yu in #1884
  • [Operator] Expose queryModality on IntelligentRoute EmbeddingSignal CRD by @shraderdm in #1880
  • fix(precommit): add python3-venv so agent-pr-gate works locally by @shraderdm in #1894
  • [Router] Reduce HybridCache rebuild preallocation by @cryo-zd in #1898
  • test: e2e tests to verify tool selection by @FAUST-BENCHOU in #1887
  • [Operator] Validate embedding modality contracts on IntelligentRoute reconcile by @shraderdm in #1895
  • [Docs] Add coding agent manifests for Claude Code, Open Code, and Cursor by @rpathade in #1899
  • feat: Adds Qdrant as a vector search provider by @Anush008 in #1869
  • [Router] Add x-vsr-skip-processing header for full extproc passthrough by @siddharth1036 in #1878
  • chore(deps): bump the uv group across 1 directory with 4 updates by @dependabot[bot] in #1903
  • [Dashboard] Fix Signal Level review endpoint default in Evaluation create flow (#1885) by @xiaotian-yu in #1905
  • [Router] add policy version store with shadow/activate/revert lifecycle by @rpathade in #1900
  • [Router] Fix hybrid HNSW layer entry-point propagation by @cryo-zd in #1912
  • [CLI] add Python validator support for tool_selection plugin by @e1ijah1 in #1916
  • chore(deps): bump idna from 3.11 to 3.15 in /src/vllm-sr in the uv group across 1 directory by @dependabot[bot] in #1920
  • chore(deps): bump openssl from 0.10.79 to 0.10.80 in /onnx-binding in the cargo group across 1 directory by @dependabot[bot] in #1921
  • [Router][Docs] Add opt-in image-modality embedding pack by @shraderdm in #1896
  • feat: shrink dashboard frontend route shell, config pages, and interaction containers by @FAUST-BENCHOU in #1913
  • [Router][CLI] Custom Anthropic upstream routing and tool calling by @akshayv in #1922
  • feat: retire second-wave structural debt in cli, cache, and startup surfaces by @FAUST-BENCHOU in #1914
  • fix: FULL_DUPLEX_STREAMED + streamed immediate response by @AayushSaini101 in #1826
  • [Docs] Update router API matrix for Anthropic upstream support by @akshayv in #1924
  • [Router] add Anthropic streaming with OpenAI SSE translation by @akshayv in #1926
  • fix: make training working directory configurable by @AayushSaini101 in #1919
  • [E2E] Re-add image-modality embedding-signal routing profile by @shraderdm in #1881
  • [CLI] Update config models to Pydantic ConfigDict by @immanuwell in #1923
  • refactor: retire high-priority structural debt in dashboard backend hotspots by @FAUST-BENCHOU in #1925
  • [Router] Normalize embedding model handling across cache backends by @cryo-zd in #1929
  • [Router][Bugfix] Replace timestamp-seeded *rand.Rand with math/rand/v2 in RLDrivenSelector by @drivebyer in #1932
  • fix(metrics): stop double-counting decision evaluation metrics by @WUKUNTAI-0211 in #1945
  • fix: blocker of scan_malicious_code.py by antivirus by @AayushSaini101 in #1918
  • feat(extproc): detect ClientProtocol from :path header by @siloteemu in #1937
  • [Router][CLI] restore hybrid_history retrieval wiring and advanced_filtering schema parity by @e1ijah1 in #1935
  • [Router][Bugfix] router_replay: fix silent postgres INSERT failure that leaves dashboard Insight empty by @WUKUNTAI-0211 in #1942
  • feat(anthropic): native /v1/messages ingress — IR envelope + inbound parser by @siloteemu in #1938
  • [CI/Build] vllm-sr: bump huggingface_hub to fix broken CLI install by @siloteemu in #1954
  • [Router][Docs] add multi_factor selector composing quality/latency/cost/load (closes #37) by @WUKUNTAI-0211 in #1953
  • [Training] fix jailbreak LoRA max_grad_norm trainer crash by @xiaotian-yu in #1948
  • [Router][Docs] wire AutoMix entailment verifier into confidence cascade (closes #1173 AutoMix half) by @WUKUNTAI-0211 in #1952
  • feat(openvino-binding): Add OpenVINO backend for ModernBERT embedding and classifier inference by @EmonLu in #1907
  • [CI/Build] add Anthropic-shape e2e test backend (llama.cpp + shim) by @siloteemu in #1950
  • feat: add CLIENT SETNAME for connection identification by @atao2004 in #1955
  • fix(anthropic): preserve client passthrough fields through ToAnthropicRequestBody by @siloteemu in #1944
  • [Router] Populate Chat Completions previous-model to unblock ModelSwitchGate enforce (#1753) by @theohsiung in #1960
  • [Docs] Correct community meeting cadence wording (#1874) by @theohsiung in #1958
  • [Router] Add multi_emit projection mapping method (#1759) by @theohsiung in #1959
  • refactor: retire high-priority structural debt in operator backend hotspots by @FAUST-BENCHOU in #1930
  • [CI/Build] Remove generated OpenVINO benchmark binary by @cryo-zd in #1963
  • chore(deps): bump the cargo group across 2 directories with 1 update by @dependabot[bot] in #1964
  • chore(deps): bump tar from 0.4.45 to 0.4.46 in /onnx-binding in the cargo group across 1 directory by @dependabot[bot] in #1965
  • [Router] Fix OpenAI reasoning effort request mutation by @brelance in #1933
  • [Docs] Add Agentgateway installation guide by @keithmattix in #1966
  • [Router] Fix path traversal in config rollback version handling by @glitch-ux in #1968
  • [Router][CLI][Dashboard][Operator][Bindings] Improve codebase health by @Xunzhuo in #1970
  • feat: add file attachments feature on the playground by @AayushSaini101 in #1971
  • chore: refresh agent harness and maintainer ops by @Xunzhuo in #1972
  • feat: add session-aware agentic routing by @Xunzhuo in #1974
  • [Router] Fix KB exemplar preload failure when no backend override is set by @Peterren in #1985
  • fix: preserve agentic tool-loop hard lock by @Xunzhuo in #1980
  • [Dashboard] refactor: add room collaboration event bus with unified WS/SSE fan-out by @FAUST-BENCHOU in #1973
  • fix: request identity-encoded upstream responses by @Xunzhuo in #1977
  • fix: record session header usage telemetry by @Xunzhuo in #1982
  • feat: gate remaining-turn priors by replay confidence by @Xunzhuo in #1983
  • Add remaining-turn-prior ablations for agentic routing by @Xunzhuo in #1984
  • [Router] Validate mmbert_model_path rejects classic BERT models at config-load time by @Peterren in #1981
  • docs: include anthropic shim in release contract by @Xunzhuo in #1975
  • fix: lift observability out of helm resources by @Xunzhuo in #1976
  • fix: reject native Windows docker runtime by @Xunzhuo in #1978
  • [Bindings] Retire pre-existing clippy debt in src/ffi/embedding.rs by @shraderdm in #1997
  • feat: add session-aware agentic routing by @Xunzhuo in #1989
  • [Dashboard] Fail closed when the auth service cannot initialize by @glitch-ux in #1993
  • fix(classification): scan prior user turns in history-aware PII/jailbreak signals by @WUKUNTAI-0211 in #1998
  • fix: reconcile each IntelligentRoute/Pool when multiple coexist by @WUKUNTAI-0211 in #1999
  • Fix/Openshift deployment by @Hadar301 in #1994
  • [Dashboard] wire ClawRoom clients to room collaboration event bus by @FAUST-BENCHOU in #1990
  • feat(cli): add 'vllm-sr model list' to inspect configured models by @WUKUNTAI-0211 in #2000
  • refactor: retire high-priority structural debt in router backend hotspots by @FAUST-BENCHOU in #1969
  • [Router] Optimize BM25 index construction and tokenization by @cryo-zd in #2002
  • [Dashboard] Add built-in routing modes with missing-model completion by @Peterren in #1995
  • feat(anthropic): header pass-through, session-id mirroring, and lossiness response headers by @siloteemu in #1939
  • [Bindings] Apply SigLIP image normalization in vision encoder by @shraderdm in #1928
  • [Docs] Fix agentgateway casing by @keithmattix in #2003
  • [Router][E2E] Add MiniMax-M3 to provider model list and tests by @octo-patch in #2005
  • [Bindings] Chunk mmBERT embedding attention to bound memory (#1957) by @Peterren in #2007
  • v0.3 release readiness preparation by @Xunzhuo in #2004
  • fix: align v0.3 routing metadata and docs surfaces by @Xunzhuo in #2011
  • feat(cli): support custom container DNS via VLLM_SR_DNS by @WUKUNTAI-0211 in #2017
  • [Bindings] Retire pre-existing clippy debt in multimodal_embedding.rs by @shraderdm in #1996
  • [Dashboard] Helm chart: correct router URL env var name and add PVC fsGroup default by @shraderdm in #2019
  • [Dashboard] complete room collaboration participants for ClawOS by @FAUST-BENCHOU in #2024
  • [Docs] fix #1845: replace removed Docker Compose guide with vllm-sr CLI redirect by @theohsiung in #2023
  • [Dashboard] Helm chart: add optional HTTPRoute for Gateway API ingress by @lauri-amd in #2025
  • docs: add contributor leaderboard page by @Xunzhuo in #2026
  • feat(anthropic): outbound emitter for non-streaming responses by @siloteemu in #1940
  • [Bindings] Fix SigLIP attentional probe pooling head in candle-binding by @shraderdm in #1927
  • [Perf] Add intent classifier accuracy benchmark by @shira-g in #2031
  • [Bindings] Preprocess images in Rust with PIL-equivalent resize to close cosine drift by @shraderdm in #1943
  • [Docs] fix operator image path in deploy instructions by @lauri-amd in #2036
  • [Router] fix: streaming responses never hit the Redis semantic cache by @WUKUNTAI-0211 in #2034
  • [Dashboard]feat: surface OpenClaw worker tool calls in ClawRoom from session jsonl by @FAUST-BENCHOU in #2028
  • docs: refine contributor leaderboard generation by @Xunzhuo in #2037
  • fix: replace O(N) slice LRU with container/list to eliminate write-l… by @gagandhakrey in #2040
  • document write/update-recency eviction and add RAG cache unit tests by @gagandhakrey in #2041
  • fix/streaming-chunks-memory-accumulation by @gagandhakrey in #2042
  • [Dashboard] Helm chart: harden against multi-writer SQLite corruption by @shraderdm in #2047
  • [CI/Build] Match Helm v4 schema-rejection wording in helm-safety-validate by @shraderdm in #2046
  • [Docs]add leaderboard chinese version by @FAUST-BENCHOU in #2045
  • [Docs]add chinese doc about Operator,Native-backends,Rollback,Valkey by @FAUST-BENCHOU in #2043
  • [Env] Enable AMD platform GPU defaults by @Xunzhuo in #2044
  • [Docs] Fix the hover style for the GitHub Models link by @wilsonwu in #2055
  • [Docs] Fix Chinese page switch to English problem by @wilsonwu in #2052
  • [Dashboard] Add frontend Prettier formatting config by @wilsonwu in #2050
  • [Dashboard] Improve setup wizard model draft validation by @wilsonwu in #2048
  • [Dashboard] Helm chart: document, schema-validate, and guard dashboard.podSecurityContext by @shraderdm in #2053
  • [Dashboard] Helm chart: support a stable DASHBOARD_JWT_SECRET via existingSecret by @shraderdm in #2056

Full Changelog: v0.2.0...v0.3.0