tests: add unit tests for rag/embedding/bm25_indexer.py #291

Open
jaishree-verma wants to merge 2 commits into jenkinsci:main from jaishree-verma:tests/bm25-indexer-unit-tests

@jaishree-verma

Problem:
rag/embedding/bm25_indexer.py contains the BM25Indexer class, a critical component of the sparse retrieval pipeline.
It currently has no unit tests, as noted in issue #290.

Changes:
Added tests/unit/rag/embedding/test_bm25_indexer.py covering all major behaviors:

  • __init__ — stores configs, logger, empty configs
  • build() — stores retriever for valid config, skips None results,
    handles multiple configs, handles empty config list
  • _index_config() — returns None when SparseRetriever unavailable,
    handles indexing errors gracefully, returns retriever on success
  • get() — returns None when SparseRetriever unavailable, returns
    cached retriever, loads from disk when not cached, returns None
    on load failure with warning logged, caches after loading
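
For readers unfamiliar with the pattern, the get() cases above (cache after loading, return None and log a warning on load failure) can be sketched with unittest.mock. This is a minimal illustration only: `FakeIndexer` and its `loader` argument are hypothetical stand-ins invented for this example, not the real BM25Indexer or its constructor signature, and the actual tests in this PR may be structured differently.

```python
from unittest.mock import MagicMock


class FakeIndexer:
    """Hypothetical stand-in for illustration; NOT the real BM25Indexer."""

    def __init__(self, configs, logger, loader):
        self.configs = configs
        self.logger = logger
        self._loader = loader      # assumed callable that loads an index from disk
        self._retrievers = {}      # cache of already-loaded retrievers

    def get(self, name):
        # Serve from the cache when the retriever was loaded before.
        if name in self._retrievers:
            return self._retrievers[name]
        try:
            retriever = self._loader(name)
        except Exception as exc:
            # On load failure, log a warning and return None.
            self.logger.warning("Failed to load index %s: %s", name, exc)
            return None
        self._retrievers[name] = retriever
        return retriever


def test_get_caches_after_loading():
    loader = MagicMock(return_value="retriever")
    indexer = FakeIndexer([], MagicMock(), loader)
    assert indexer.get("docs") == "retriever"
    assert indexer.get("docs") == "retriever"
    # The second call must hit the cache, so the loader runs only once.
    loader.assert_called_once_with("docs")


def test_get_returns_none_and_warns_on_load_failure():
    logger = MagicMock()
    loader = MagicMock(side_effect=OSError("missing index"))
    indexer = FakeIndexer([], logger, loader)
    assert indexer.get("docs") is None
    logger.warning.assert_called_once()
```

Injecting the loader and logger as mocks keeps the sketch free of disk access, which is the same reason the cases above distinguish "cached" from "loads from disk".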

Testing: 13 unit tests covering the methods and edge cases listed above.

Closes #290

- Add tests for _min_max_normalize (empty, equal values, normal range)
- Add tests for get_inverted_scores (negative scores, invalid weight, empty input)
- Add tests for validate_tool_calls (valid, invalid tool, missing param, wrong type)
- Add tests for get_default_tools_call (4 tools, correct names, query passed)
- Add tests for filter_retrieved_data (match, no match, case insensitive)
- Add tests for extract_chunks_content (combined text, fallback, placeholders)
- Add tests for make_placeholder_replacer (returns code, missing code fallback)

Closes jenkinsci#211

- Add tests for BM25Indexer.__init__ (stores configs, logger, empty configs)
- Add tests for build() (stores retriever, skips None, multiple configs, empty)
- Add tests for _index_config() (None when unavailable, error handling, success)
- Add tests for get() (None when unavailable, cached retriever, loads from disk,
  returns None on failure, caches after loading)

Closes jenkinsci#290

jaishree-verma requested a review from a team as a code owner on March 16, 2026 17:20
berviantoleo added the tests label (This PR adds/removes/updates test cases) on Mar 17, 2026
Development

Successfully merging this pull request may close these issues.

[Tests] - Add unit tests for rag/embedding/bm25_indexer.py - currently zero test coverage

2 participants