test(metrics): add overlapping-chunk regression fixtures for ContextualPrecisionMetric (rebased on #2743) #2787

Draft
Ruthwik-Data wants to merge 9 commits into confident-ai:main from Ruthwik-Data:test/contextual-precision-overlapping-chunks-v2

@Ruthwik-Data

Contributor

Summary

Adds tests/test_metrics/test_contextual_precision_overlapping_chunks.py — a focused regression test suite for ContextualPrecisionMetric behaviour under overlapping-chunk retrieval scenarios.

This supersedes the test file from PR #2692 and is rebased on the merged changes from PR #2743 (RetrievedContextData._group_retrieval_contexts + WCP formula fix). The fixtures now use the new RetrievedContextData with a source field, which is the correct API post-merge.

Closes #2594 (test coverage aspect).

Motivation

PR #2743 fixed two root causes:

  1. Source grouping: _group_retrieval_contexts() merges multiple RetrievedContextData chunks sharing the same source string before LLM scoring, preventing redundant-chunk penalisation.
  2. WCP formula: _calculate_score() now correctly implements weighted cumulative precision (binary verdicts, rank-discounted).
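
The two fixes can be illustrated with a minimal, self-contained sketch. It does not reproduce the code merged in #2743: Chunk is a hypothetical stand-in for RetrievedContextData, group_by_source and weighted_cumulative_precision are illustrative names, and the formula is assumed to be the mean of precision@k over the ranks that hold a relevant chunk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    """Hypothetical stand-in for RetrievedContextData (not the deepeval class)."""
    context: str
    source: str


def group_by_source(chunks: List[Chunk]) -> List[Chunk]:
    """Merge chunks that share a source string, keeping first-seen rank order."""
    merged: Dict[str, Chunk] = {}  # dicts preserve insertion order
    for chunk in chunks:
        if chunk.source in merged:
            merged[chunk.source].context += "\n" + chunk.context
        else:
            # Copy so the caller's chunks are not mutated.
            merged[chunk.source] = Chunk(chunk.context, chunk.source)
    return list(merged.values())


def weighted_cumulative_precision(verdicts: List[bool]) -> float:
    """Assumed WCP form: average precision@k over ranks k with a relevant verdict."""
    relevant_so_far = 0
    total = 0.0
    for k, is_relevant in enumerate(verdicts, start=1):
        if is_relevant:
            relevant_so_far += 1
            total += relevant_so_far / k
    return total / relevant_so_far if relevant_so_far else 0.0
```

Under these assumptions, two overlapping chunks from the same filing collapse into one scored unit, so a redundant second chunk cannot receive a negative verdict that drags the score down.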

Without regression tests, these fixes can silently regress. This PR adds four targeted tests that each guard one failure mode from issue #2594.

What this PR adds

Test fixtures

| Fixture | Scenario |
| --- | --- |
| overlapping_narrative_chunks | Two same-source 20%-overlap 10-K narrative chunks |
| non_overlapping_mixed_chunks | Three chunks from distinct sources (one relevant) |
| table_cell_overlap_chunks | Adjacent table-row chunks with shared header (same source/page) |
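
For readers unfamiliar with how overlapping chunks such as those in overlapping_narrative_chunks arise, here is a rough sketch of sliding-window chunking. The helper split_with_overlap is hypothetical and is not part of this PR or of deepeval; the actual fixtures may simply hard-code their chunk text.

```python
from typing import List


def split_with_overlap(words: List[str], chunk_size: int, overlap_words: int) -> List[str]:
    """Split words into chunks of chunk_size; consecutive chunks share overlap_words words."""
    if not 0 <= overlap_words < chunk_size:
        raise ValueError("overlap_words must be in [0, chunk_size)")
    step = chunk_size - overlap_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

With chunk_size=10 and overlap_words=2, adjacent chunks share 20% of their words, which corresponds to the overlap ratio named in the first fixture.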

Test cases

| Test | Guards against |
| --- | --- |
| test_same_source_overlap_does_not_penalise | Redundancy treated as irrelevance (regression for #2594) |
| test_cross_source_chunks_scored_independently | Source grouping incorrectly merging different documents |
| test_table_boundary_overlap_not_penalised | Table-header overlap lowering score for financial doc RAG |
| test_wcp_formula_monotone_with_relevant_rank | WCP score not decreasing as relevant chunk rank drops |
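
The monotonicity property behind the last test can be sketched without any LLM call. This is an illustration only, not the test from this PR: wcp is a local re-implementation under the assumption that the score is the mean of precision@k over relevant ranks, and scores_by_relevant_rank is a hypothetical helper.

```python
from typing import List


def wcp(verdicts: List[bool]) -> float:
    """Assumed WCP form: mean of precision@k over the ranks k holding a relevant chunk."""
    hits = 0
    total = 0.0
    for k, relevant in enumerate(verdicts, start=1):
        if relevant:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0


def scores_by_relevant_rank(n_chunks: int) -> List[float]:
    """Score rankings of n_chunks with exactly one relevant chunk, placed at rank 1..n."""
    return [
        wcp([position == rank for position in range(1, n_chunks + 1)])
        for rank in range(1, n_chunks + 1)
    ]


def test_wcp_monotone_with_relevant_rank() -> None:
    scores = scores_by_relevant_rank(4)
    # Each step down the ranking must strictly lower the score.
    assert all(a > b for a, b in zip(scores, scores[1:]))
```

With a single relevant chunk the assumed formula reduces to 1/k for rank k, so the score falls strictly as the relevant chunk moves further from the top.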

Design notes

Type of change

  • Test (adding missing tests or correcting existing tests)

@vercel

vercel Bot commented Jun 20, 2026


@Ruthwik-Data is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

Development

Successfully merging this pull request may close these issues.

Contextual Precision over-penalizes overlapping chunks in financial-document RAG