fix: improve memory search with token AND matching and rank-based scoring #10
Conversation
Replace phrase-match FTS queries with tokenized AND-joined queries so searches like "database concurrency locks" match chunks containing all three words in any order, not just as an exact consecutive phrase. Replace max-normalization scoring in hybrid search with rank-based scoring (1/(1+rank)) for both FTS and vector results, preventing pathological cases where a single weak match dominates strong results.
https://claude.ai/code/session_0193Gh5J9fEu65cyKyzUp5vT
Pull request overview
This PR updates the memory index search behavior to improve recall and make hybrid ranking more stable by switching FTS queries from phrase matching to tokenized AND matching, and by replacing score normalization with rank-based scoring.
Changes:
- Replace phrase-quoted FTS queries with tokenized queries joined by AND (all terms required, any order).
- Update hybrid search merge scoring to use rank-based scoring (1/(1+rank)) for both FTS and vector result sets.
- Replace `escape_fts_query()` with `build_fts_query()` for constructing FTS5 MATCH expressions.
```diff
 // Add FTS results using rank-based scoring (OpenClaw-compatible)
 // BM25 results are already ordered by relevance (best first)
 for (rank, result) in fts_results.into_iter().enumerate() {
     let key = format!("{}:{}:{}", result.file, result.line_start, result.line_end);
-    let normalized_score = (result.score / max_fts_score) as f32;
-    let weighted_score = normalized_score * text_weight;
+    let rank_score = 1.0 / (1.0 + rank as f32); // rank 0 → 1.0, rank 1 → 0.5, rank 9 → 0.1
+    let weighted_score = rank_score * text_weight;
     merged.insert(key, (weighted_score, result));
 }
```
In `search_hybrid`, the `MemoryChunk.score` returned for FTS-only results remains the raw BM25-derived value from `search()`, while vector-only / merged results get the combined weighted score. This makes `score` inconsistent and makes user-facing outputs (CLI/HTTP/UI) misleading. Consider setting `result.score` to the computed `weighted_score` before inserting into `merged` (and using the same convention for all paths) so every returned chunk's `score` reflects the final combined ranking value.
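One way to apply that suggestion, sketched against stand-in types: the `MemoryChunk` field names follow the diff above, but the exact struct shape and the `merge_fts` helper are assumptions for illustration, not the PR's actual code.

```rust
use std::collections::HashMap;

// Stand-in for the PR's chunk type; field names follow the diff above,
// the exact shape is an assumption.
#[derive(Debug, Clone, PartialEq)]
struct MemoryChunk {
    file: String,
    line_start: u32,
    line_end: u32,
    score: f32,
}

// Merge FTS results so every returned chunk's `score` is the final
// weighted ranking value, not the raw BM25 number.
fn merge_fts(
    fts_results: Vec<MemoryChunk>,
    text_weight: f32,
) -> HashMap<String, (f32, MemoryChunk)> {
    let mut merged = HashMap::new();
    for (rank, mut result) in fts_results.into_iter().enumerate() {
        let key = format!("{}:{}:{}", result.file, result.line_start, result.line_end);
        let weighted_score = (1.0 / (1.0 + rank as f32)) * text_weight;
        result.score = weighted_score; // keep the user-facing score consistent
        merged.insert(key, (weighted_score, result));
    }
    merged
}

fn main() {
    let hits = vec![
        MemoryChunk { file: "a.rs".into(), line_start: 1, line_end: 2, score: 12.3 },
        MemoryChunk { file: "b.rs".into(), line_start: 5, line_end: 9, score: 4.5 },
    ];
    let merged = merge_fts(hits, 0.7);
    // The map value and the chunk's own score now agree.
    println!("{:?}", merged["a.rs:1:2"]);
}
```

With this convention, every path (FTS-only, vector-only, merged) can report the same kind of score.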
```rust
/// Build FTS5 query from raw input (OpenClaw-compatible)
/// Tokenizes input and joins with AND so all terms must appear (in any order)
fn build_fts_query(raw: &str) -> Option<String> {
    let tokens: Vec<&str> = raw
        .split(|c: char| !c.is_alphanumeric() && c != '_')
        .map(|t| t.trim())
        .filter(|t| !t.is_empty())
        .collect();

    if tokens.is_empty() {
        return None;
    }

    // Quote each token individually, join with AND
    let quoted: Vec<String> = tokens
        .iter()
        .map(|t| format!("\"{}\"", t.replace('"', "")))
        .collect();

    Some(quoted.join(" AND "))
}
```
`build_fts_query()` introduces new tokenization/AND semantics (and the empty-query early return) but there are no unit tests covering these behaviors (e.g., token order independence, punctuation handling, and the empty-input case). Since this file already has tests for indexing/search, adding targeted tests here would help prevent regressions in query construction and matching behavior.
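Such tests might look like the sketch below. The function body is copied from the diff above so the example is self-contained; the test names and inputs are illustrative suggestions, not part of the PR.

```rust
/// Copy of the PR's build_fts_query (from the diff above), so the
/// example tests below compile on their own.
fn build_fts_query(raw: &str) -> Option<String> {
    let tokens: Vec<&str> = raw
        .split(|c: char| !c.is_alphanumeric() && c != '_')
        .map(|t| t.trim())
        .filter(|t| !t.is_empty())
        .collect();
    if tokens.is_empty() {
        return None;
    }
    let quoted: Vec<String> = tokens
        .iter()
        .map(|t| format!("\"{}\"", t.replace('"', "")))
        .collect();
    Some(quoted.join(" AND "))
}

#[test]
fn tokenizes_and_joins_with_and() {
    assert_eq!(
        build_fts_query("database concurrency locks").as_deref(),
        Some("\"database\" AND \"concurrency\" AND \"locks\"")
    );
}

#[test]
fn strips_punctuation_between_tokens() {
    assert_eq!(
        build_fts_query("locks, database; concurrency!").as_deref(),
        Some("\"locks\" AND \"database\" AND \"concurrency\"")
    );
}

#[test]
fn empty_or_punctuation_only_input_returns_none() {
    assert_eq!(build_fts_query("   ,,, "), None);
    assert_eq!(build_fts_query(""), None);
}
```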