v0.4.3 — Citation Coherence, BM25 Pipeline Refinements & Admin Valves

x-hannibal released this 24 Apr 01:10

Highlights

The v0.4 line rewires what gets read, what gets cited, and how it gets organised — same total context size, far better signal density, and citations that actually work in the UI.

  • 🎯 Pages are ranked and budgeted by relevance. BM25 scores every fetched page against your question. The most relevant pages get up to 3× the default character allowance; marginal pages shrink; off-topic noise (a Python tutorial that snuck into an autonomous-vehicles query) is dropped entirely.
  • 🔗 Citations that always resolve. Inline [N] markers in the model's reply now map 1:1 to the source pills in the OWUI sidebar — no more dangling [3], no more [REF]…[/REF] improvisations from models like Mistral, no more pool-only snippets cited with invented slugs.
  • 🧠 Better-structured answers across small and reasoning models. Tested on Mistral 24B, Gemma3 27B, and Qwen3 thinking. Replies are noticeably more comprehensive after the prompt was restructured around how transformer attention actually behaves on long contexts.
  • 🎛️ New admin valve inject_snippet_pool (default ON) — flip OFF to keep the LLM context tight to fully-fetched pages only.
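To make the ranking step concrete, here is a minimal, self-contained sketch of BM25 scoring over a handful of fetched pages. It is not the code from easysearch.py: the function names (`tokenize`, `bm25_scores`), the tokenizer, the `k1`/`b` defaults and the Lucene-style IDF variant are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, pages, k1=1.5, b=0.75):
    """Return one BM25 score per page; higher means more relevant to the query.

    k1 and b are the usual BM25 free parameters; the defaults here are
    common textbook values, not values taken from the release.
    """
    docs = [tokenize(p) for p in pages]
    n_docs = len(docs)
    if n_docs == 0:
        return []
    # Average document length, guarded against the all-empty case.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0

    # Document frequency: in how many pages does each term appear?
    df = Counter()
    for d in docs:
        df.update(set(d))

    q_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in q_terms:
            if term not in tf:
                continue
            # Lucene-style IDF, which stays non-negative for common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


pages = [
    "Autonomous vehicles use lidar and radar sensors to perceive the road.",
    "Python tutorial: learn lists, dicts and loops.",
    "Lidar sensors for autonomous vehicles are getting cheaper.",
]
scores = bm25_scores("autonomous vehicles lidar sensors", pages)
```

In this toy run the Python tutorial shares no terms with the query and scores exactly zero, which is the kind of page a relevance filter can drop outright.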

In practice

For a ??:10 <question>, the two most relevant pages can each carry 8–12k characters of real body text while marginally-related ones shrink to ~200-char snippets and irrelevant ones disappear — instead of everyone getting a flat 4k slice. The answer you get back is noticeably more focused, and every cited source is one click away in the OWUI sidebar.
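The budgeting described above can be sketched as follows. This is one possible allocation scheme under stated assumptions, not the release's actual logic: `allocate_budgets`, the `drop_below` and `snippet_below` thresholds, and the flat fallback when nothing matches are hypothetical. Only the 4k default slice, the 3× cap and the ~200-character snippets come from the notes.

```python
def allocate_budgets(scores, default_chars=4000, max_multiplier=3.0,
                     snippet_chars=200, drop_below=0.05, snippet_below=0.35):
    """Turn relevance scores into per-page character budgets (0 = dropped).

    The total budget stays at default_chars * number_of_pages, so the overall
    context size matches what a flat per-page slice would have used.
    Thresholds are relative to the best score and are illustrative only.
    """
    total = default_chars * len(scores)
    top = max(scores, default=0.0)
    if top <= 0:
        # Nothing matched the query: fall back to a flat slice (an assumption).
        return [default_chars] * len(scores)

    rel = [s / top for s in scores]
    budgets = [0] * len(scores)
    full = []  # indices of pages that receive real body text
    for i, r in enumerate(rel):
        if r < drop_below:
            budgets[i] = 0              # off-topic: dropped entirely
        elif r < snippet_below:
            budgets[i] = snippet_chars  # marginal: short snippet only
        else:
            full.append(i)

    # Share what is left among the relevant pages in proportion to relevance,
    # capped at max_multiplier times the default allowance.
    remaining = total - sum(budgets)
    weight = sum(rel[i] for i in full)
    cap = int(default_chars * max_multiplier)
    for i in full:
        budgets[i] = min(cap, int(remaining * rel[i] / weight))
    return budgets


budgets = allocate_budgets([8.0, 7.0, 2.0, 0.0, 0.1])
# -> [10560, 9240, 200, 0, 0]
```

With five pages and a 20,000-character total, the two strongest pages land in the 8–12k range, the marginal page keeps a 200-character snippet, and the two weakest are dropped. The sketch does not redistribute any surplus left over when the cap truncates a page's share.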

Other changes / fixes

Budget preservation across drops, surplus reclamation, reasoning-model reply-length accounting, prompt internals, User-Agent rotation pool expansion (20 → 40), debug-gated stats line, and more — see the full CHANGELOG for the complete list.

Upgrade

Drop the new easysearch.py into your OWUI Functions panel. No config migration needed: existing valves keep their values; the new inject_snippet_pool defaults to ON (matches v0.4.2 behaviour but with BM25 filtering).