Hi! I have been exploring the Sugar-AI RAG pipeline and noticed that the current implementation retrieves only a single document as context for the language model.
When that top hit is not the most relevant chunk, answer accuracy can suffer.
I experimented with a small modification that improves retrieval quality while keeping the architecture lightweight.
Proposed improvement:
Query
→ Vector Search (Top 5)
→ Neural Reranker
→ Top 2 Documents
→ LLM
Implementation details:
- Retrieve the top-5 documents from FAISS
- Rerank them with the bge-reranker-base cross-encoder
- Select the top-2 reranked documents as context
- Upgrade the embedding model to bge-small-en-v1.5
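
To make the proposed flow concrete, here is a minimal sketch of the retrieve → rerank → select steps. It uses pure-NumPy stand-ins so it runs without model downloads: the random vectors stand in for bge-small-en-v1.5 embeddings, the brute-force cosine search for a FAISS index lookup, and the term-overlap score for the bge-reranker-base cross-encoder (the document texts and helper names below are illustrative, not from the Sugar-AI codebase).

```python
import numpy as np

def vector_search(query_vec, doc_vecs, top_k=5):
    """Return indices of the top_k documents by cosine similarity.
    In the real pipeline this would be a FAISS index search."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return np.argsort(d @ q)[::-1][:top_k]

def rerank(query, candidates, top_n=2):
    """Keep the top_n candidates. Term overlap stands in for the
    score a cross-encoder reranker would produce per (query, doc) pair."""
    q_terms = set(query.lower().split())
    scores = [len(q_terms & set(c.lower().split())) for c in candidates]
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:top_n]]

# Toy corpus; in Sugar-AI these would be the indexed documentation chunks.
docs = [
    "Sugar activities are installed from the activity library.",
    "The journal stores a history of everything a learner does.",
    "Install a Sugar activity by downloading its .xo bundle.",
    "Turtle Blocks teaches programming with visual snap-together blocks.",
    "Sugar runs on most Linux distributions and on Raspberry Pi.",
]
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(len(docs), 8))           # stand-in embeddings
query_vec = doc_vecs[2] + 0.05 * rng.normal(size=8)  # query close to doc 2

candidates = [docs[i] for i in vector_search(query_vec, doc_vecs, top_k=5)]
context = rerank("how do I install a sugar activity", candidates, top_n=2)
print(context)  # the two chunks that would be passed to the LLM
```

The real version would swap `rerank` for something like scoring `(query, doc)` pairs with `sentence_transformers.CrossEncoder("BAAI/bge-reranker-base")` and sorting by that score; the surrounding retrieve-then-select shape stays the same.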
Benefits:
- Improved retrieval accuracy (the reranker can promote a relevant chunk that vector search ranked below the top spot)
- Reduced hallucination from off-topic context
- No additional API costs (runs locally)
- Minimal changes to existing architecture
I would love feedback from maintainers before opening a pull request.