
[Tests] - Add unit tests for rag/embedding/bm25_indexer.py - currently zero test coverage #290

@jaishree-verma

While exploring the codebase for a 2026 application, I noticed that `rag/embedding/bm25_indexer.py` has no unit tests.

Problem:
`rag/embedding/bm25_indexer.py` contains the `BM25Indexer` class, a critical component of the sparse retrieval pipeline. It currently has zero unit tests.

Untested Behaviors:
- `build()` stores retrievers correctly for valid configs
- `build()` skips configs when SparseRetriever is unavailable
- `_index_config()` returns None when SparseRetriever is None
- `_index_config()` handles indexing errors gracefully
- `get()` returns cached retriever if already in memory
- `get()` loads retriever from disk if not cached
- `get()` returns None when index not found or load fails
- `get()` returns None when SparseRetriever is unavailable

Proposed Solution: 
Add `tests/unit/rag/embedding/test_bm25_indexer.py` covering the above behaviors following existing test conventions.
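
As a rough illustration of how the `get()` cases could be structured, here is a minimal sketch. It does not import the real `BM25Indexer`, because its constructor and internals are not shown in this issue. `FakeBM25Indexer`, its `retriever_cls` and `loader` parameters, and the `_cache` attribute are hypothetical stand-ins that only model the listed behaviors. The real tests would likely patch `SparseRetriever` in `rag/embedding/bm25_indexer.py` instead.

```python
from unittest.mock import MagicMock


class FakeBM25Indexer:
    """Hypothetical stand-in for BM25Indexer; the real class may differ."""

    def __init__(self, retriever_cls, loader):
        # retriever_cls=None models "SparseRetriever is unavailable" (assumption).
        self._retriever_cls = retriever_cls
        # loader is an assumed callable that loads an index from disk by name.
        self._loader = loader
        self._cache = {}

    def get(self, name):
        if self._retriever_cls is None:
            return None
        if name in self._cache:
            return self._cache[name]
        try:
            retriever = self._loader(name)
        except Exception:
            # A failed load is reported as "not found" rather than raised.
            return None
        if retriever is not None:
            self._cache[name] = retriever
        return retriever


def test_get_returns_cached_retriever():
    loader = MagicMock()
    indexer = FakeBM25Indexer(retriever_cls=object, loader=loader)
    cached = object()
    indexer._cache["docs"] = cached
    assert indexer.get("docs") is cached
    loader.assert_not_called()  # cache hit must not touch the disk


def test_get_loads_from_disk_when_not_cached():
    loaded = object()
    loader = MagicMock(return_value=loaded)
    indexer = FakeBM25Indexer(retriever_cls=object, loader=loader)
    assert indexer.get("docs") is loaded
    assert indexer.get("docs") is loaded  # second call is served from cache
    loader.assert_called_once_with("docs")


def test_get_returns_none_when_load_fails():
    loader = MagicMock(side_effect=OSError("index missing"))
    indexer = FakeBM25Indexer(retriever_cls=object, loader=loader)
    assert indexer.get("docs") is None


def test_get_returns_none_when_retriever_unavailable():
    loader = MagicMock()
    indexer = FakeBM25Indexer(retriever_cls=None, loader=loader)
    assert indexer.get("docs") is None
    loader.assert_not_called()
```

The same pattern (a `MagicMock` for the dependency, plus assertions on both the return value and the calls made) should carry over to the `build()` and `_index_config()` cases once the real signatures are confirmed.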

I would like to work on this issue.
