Improve RAG retrieval quality with multi-document retrieval and reranking #58

@VishalDhariwal

Description

Hi! I have been exploring the Sugar-AI RAG pipeline and noticed that the current implementation retrieves only a single document before passing context to the language model.

This may reduce answer accuracy when the top result is not the most relevant chunk.

I experimented with a small modification that improves retrieval quality while keeping the architecture lightweight.

Proposed improvement:

Query
→ Vector Search (Top 5)
→ Neural Reranker
→ Top 2 Documents
→ LLM

Implementation details:

  • Retrieve top-5 documents from FAISS
  • Rerank them using the bge-reranker-base cross-encoder
  • Select the top-2 documents as context
  • Upgrade embedding model to bge-small-en-v1.5
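The retrieve → rerank → select step above can be sketched as follows. This is a minimal illustration, not the actual Sugar-AI code: the cross-encoder scoring is injected as a callable so the core logic can be shown (and tested) without loading a model. In practice `score_pairs` would be `CrossEncoder("BAAI/bge-reranker-base").predict` from sentence-transformers, and `candidates` would be the top-5 chunks returned by the FAISS search.

```python
# Sketch of the proposed retrieve -> rerank -> select flow.
# The reranker is passed in as a plain callable; in the real pipeline
# it would be a cross-encoder such as bge-reranker-base.

def rerank_and_select(query, candidates, score_pairs, top_k=2):
    """Rerank retrieved chunks and keep the best top_k.

    query:       the user question
    candidates:  chunks returned by the vector search (e.g. FAISS top-5)
    score_pairs: callable mapping [(query, chunk), ...] to relevance
                 scores, e.g. CrossEncoder(...).predict
    top_k:       number of chunks to pass to the LLM as context
    """
    scores = score_pairs([(query, doc) for doc in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```

Because the reranker scores each (query, chunk) pair jointly, it can demote a chunk that merely shares keywords with the query and promote one that actually answers it, which is where the accuracy gain over raw vector similarity comes from.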

Benefits:

  • Improved retrieval accuracy
  • Reduced hallucination
  • No additional API costs (runs locally)
  • Minimal changes to existing architecture

I would love feedback from maintainers before opening a pull request.
