feat: add --no-rerank flag to serve and search commands #93
Aetherall wants to merge 1 commit into Ryandonofrio3:main
Add --no-rerank option to both 'osgrep serve' and 'osgrep search' to disable the neural reranking stage. This uses the existing fusion scoring (BM25 + vector similarity) instead of loading the ~2GB ColBERT reranking model.

The serve command's /search endpoint now also respects a 'rerank' field in the request body, allowing per-request control. The CLI flag sets the default, and the request body can override it.

Performance impact on 'osgrep serve':
  First query with rerank: >120s (loads reranking ONNX model)
  First query --no-rerank: 321ms
  Warm queries --no-rerank: 66ms

The quality tradeoff is minimal: fusion scoring already produces good results; neural reranking only adds marginal improvement for ambiguous queries.

Usage:
  osgrep serve --no-rerank            # fast server, no reranking
  osgrep search 'auth' --no-rerank    # fast CLI search
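The default-and-override rule described in the commit message (the CLI flag sets the default, the request body can override it) might look roughly like the sketch below. This is not the PR's actual code: `resolveRerank` and `SearchRequestBody` are hypothetical names, and treating a non-boolean `rerank` value as "not provided" is an assumption about how malformed input could be handled.

```typescript
// Hypothetical helper; names are illustrative and not taken from osgrep's source.
interface SearchRequestBody {
  rerank?: unknown; // parsed from untrusted JSON, so it is checked at runtime
}

/**
 * Pick the effective rerank setting for a single request.
 * A boolean `rerank` in the body wins; anything else falls back to the CLI default.
 */
function resolveRerank(
  body: SearchRequestBody | null | undefined,
  cliDefault: boolean,
): boolean {
  const requested = body?.rerank;
  return typeof requested === "boolean" ? requested : cliDefault;
}

// Assume the server was started with --no-rerank, so the default is false.
const serverDefault = false;
console.log(resolveRerank({}, serverDefault));                 // false (default applies)
console.log(resolveRerank({ rerank: true }, serverDefault));   // true  (body overrides)
console.log(resolveRerank({ rerank: "yes" }, serverDefault));  // false (non-boolean ignored)
```

Checking `typeof requested === "boolean"` rather than using `??` alone means a stray string or number in the request body cannot silently switch the reranker on.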
This PR was made with opencode. I'm using this patch in my shell quick-navigation tool. In this scenario, I value the tradeoff of lower-quality results for startup speed. It might be out of scope for the MCP usage, but it's quite handy in the shell too!
Summary
- `--no-rerank` flag to `osgrep serve` and `osgrep search`
- `/search` endpoint now respects a `rerank` field in the request body, with the CLI flag as the default

Problem
The `serve` command hardcodes `{ rerank: true }` when calling `searcher.search()`. On the first search query, this triggers loading of the ~2GB ColBERT neural reranking ONNX model, which takes >2 minutes before the first response is returned. This makes `osgrep serve` impractical for interactive use cases (e.g., Neovim pickers, editor integrations) where sub-second latency is expected.

Solution
Both `serve` and `search` commands now accept `--no-rerank`, which passes `{ rerank: false }` to `searcher.search()`. The searcher already handles this case gracefully (line 258 of `searcher.ts`: `const doRerank = _search_options?.rerank ?? true`) by falling back to fusion scoring.

For `serve`, the `/search` endpoint also reads `rerank` from the JSON request body, allowing per-request control. The CLI flag sets the server-wide default.

Performance
Benchmarked on a small project (~35 files):
- `serve` (default, rerank=true)
- `serve --no-rerank`
- `search --no-rerank` (CLI)

The quality difference is minimal: fusion scoring (BM25 + vector similarity) produces good results; neural reranking adds marginal improvement primarily for ambiguous queries.
Usage
Changes
- `src/commands/serve.ts`: Add `--no-rerank` option, read `rerank` from request body with CLI default fallback
- `src/commands/search.ts`: Add `--no-rerank` option, pass to `searcher.search()`, forward to server when using HTTP client path

Summary by CodeRabbit
- `--no-rerank` CLI option to disable neural reranking
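The Changes list above mentions forwarding the flag to the server on the HTTP client path. One possible way to do that is sketched below, under stated assumptions: only the `rerank` field is named in this PR, so the `query` field, the `SearchPayload` type, and the `buildSearchPayload` helper are all hypothetical. The sketch leaves `rerank` out of the body when the user expressed no preference, so the server-wide default still applies.

```typescript
// Hypothetical payload shape: only `rerank` is named in the PR; `query` is assumed.
interface SearchPayload {
  query: string;
  rerank?: boolean;
}

/**
 * Build the JSON body the CLI could send to the server's /search endpoint.
 * `cliRerank` is undefined when the user did not pass a rerank-related flag.
 */
function buildSearchPayload(
  query: string,
  cliRerank: boolean | undefined,
): SearchPayload {
  const payload: SearchPayload = { query };
  // Only include the field when a preference was given, so the server's
  // own default is used otherwise.
  if (cliRerank !== undefined) {
    payload.rerank = cliRerank;
  }
  return payload;
}

console.log(JSON.stringify(buildSearchPayload("auth", false)));     // {"query":"auth","rerank":false}
console.log(JSON.stringify(buildSearchPayload("auth", undefined))); // {"query":"auth"}
```

Whether the real client always sends the field or omits it when unset is not stated in the PR; omitting it is simply the variant that keeps the server's default authoritative.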