Comprehensive test coverage for the RAGImagingPipeline after removing query expansion.
Total Tests: 34
Status: ✅ All Passing
Runtime: ~20 minutes (includes model loading)
# Run all tests
pytest tests/test_retrieval_pipeline.py -v
# Run specific test class
pytest tests/test_retrieval_pipeline.py::TestMedicalRequests -v
# Run with verbose logging
pytest tests/test_retrieval_pipeline.py -v -s
# Run single test
pytest tests/test_retrieval_pipeline.py::TestMedicalRequests::test_lung_segmentation_ct -v

Tests retrieval for medical imaging tasks with domain-specific terminology.
- test_lung_segmentation_ct - Precise medical request with modality
- test_brain_mri_registration - Medical registration task
- test_medical_abbreviation - Medical abbreviation understanding (CT scan)
- test_dicom_format_hint - DICOM format-specific request with file hints
Key Verification: Medical terms, anatomical structures, imaging modalities (CT, MRI) are correctly matched.
Tests retrieval for general computer vision and image processing tasks.
- test_ocr_text_extraction - OCR request (may not be in catalog)
- test_image_classification - General computer vision task
- test_deblurring_restoration - Image restoration task
- test_jpeg_format_hint - JPEG image processing with format hints
Key Verification: Domain-agnostic retrieval works, non-medical terms properly matched.
Tests queries ranging from very vague to highly specific.
- test_vague_analyze_image - Very vague request ("analyze image")
- test_vague_segment - Vague task without context ("segment")
- test_precise_3d_liver_segmentation_dicom - Very precise with multiple constraints
- test_moderate_precision_nifti_viewer - Moderately precise request
Key Verification: System handles both broad and narrow queries appropriately.
Tests queries for tasks likely not in the imaging tool catalog.
- test_video_editing - Video editing (out of scope)
- test_audio_processing - Audio processing (definitely out of scope)
- test_3d_rendering_animation - 3D rendering/animation task
- test_document_layout_analysis - Document analysis task
Key Verification: System returns nearest matches gracefully, doesn't fail on out-of-scope queries.
Tests different retrieval configurations and modes.
- test_retrieve_no_rerank - Retrieval without CrossEncoder reranking
- test_retrieve_with_rerank - Full retrieval with reranking
- test_rerank_improves_precision - Verify reranking improves result quality
- test_exclusion_filter - Exclusion filter works correctly
Key Verification: Reranking improves precision, exclusions work, both modes return valid results.
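The two retrieval modes and the exclusion filter can be illustrated with a toy two-stage retriever. This is only a sketch under stated assumptions: `ToyDoc`, `CATALOG`, `first_stage_score`, `rerank_score`, and the `retrieve` signature are hypothetical stand-ins, with word overlap in place of BGE-M3 vector search and a name-match bonus in place of the CrossEncoder. Only the result shape (`r["doc"].name`) follows the example test in this document.

```python
from dataclasses import dataclass


@dataclass
class ToyDoc:
    name: str
    description: str


# Hypothetical three-tool catalog, for illustration only.
CATALOG = [
    ToyDoc("lung_segmenter", "segment lung regions in ct scans"),
    ToyDoc("brain_registration", "register brain mri volumes"),
    ToyDoc("ct_viewer", "display ct scans slice by slice"),
]


def first_stage_score(query: str, doc: ToyDoc) -> float:
    """Stand-in for vector similarity: fraction of query words in the description."""
    q = set(query.lower().split())
    d = set(doc.description.split())
    return len(q & d) / len(q) if q else 0.0


def rerank_score(query: str, doc: ToyDoc) -> float:
    """Stand-in for a CrossEncoder: overlap plus a bonus if the first query word is in the tool name."""
    words = query.lower().split()
    bonus = 1.0 if words and words[0] in doc.name else 0.0
    return first_stage_score(query, doc) + bonus


def retrieve(query, top_k=5, rerank=True, exclude=()):
    excluded = set(exclude)
    # Exclusions are applied before scoring, so excluded tools can never be returned.
    candidates = [d for d in CATALOG if d.name not in excluded]
    results = [{"doc": d, "similarity": first_stage_score(query, d)} for d in candidates]
    results.sort(key=lambda r: r["similarity"], reverse=True)
    if rerank:
        for r in results:
            r["rerank_score"] = rerank_score(query, r["doc"])
        results.sort(key=lambda r: r["rerank_score"], reverse=True)
    return results[:top_k]


if __name__ == "__main__":
    for mode in (False, True):
        names = [r["doc"].name for r in retrieve("ct scans", rerank=mode)]
        print(f"rerank={mode}: {names}")
```

In this toy setup the first stage ties on "ct scans" and the second stage breaks the tie, which is the kind of behavior test_rerank_improves_precision is meant to check on the real pipeline.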
Tests image metadata hint generation and integration.
- test_format_hint_dicom - DICOM format hint added to query
- test_format_hint_nifti - NIfTI format hint added
- test_format_hint_tiff_stack - TIFF stack hint for microscopy
- test_multiple_formats - Multiple file formats in one request
Key Verification: Format tokens (format:dicom, format:nifti) correctly enhance retrieval.
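As a rough illustration of how format tokens might be attached to a query, the sketch below maps file extensions to `format:<name>` tokens and appends them to the query text. The helper names `format_tokens` and `add_format_hints` and the extension table are assumptions made for this example; the actual logic in the image metadata module may differ.

```python
from pathlib import Path

# Hypothetical extension-to-format table; the real mapping may cover more formats.
EXT_TO_FORMAT = {
    ".dcm": "dicom",
    ".nii": "nifti",
    ".nii.gz": "nifti",
    ".tif": "tiff",
    ".tiff": "tiff",
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
}


def format_tokens(paths):
    """Return de-duplicated format tokens in first-seen order."""
    tokens = []
    for p in paths:
        name = Path(p).name.lower()
        # Try longer suffixes first so ".nii.gz" is matched as a whole.
        for ext in sorted(EXT_TO_FORMAT, key=len, reverse=True):
            if name.endswith(ext):
                token = f"format:{EXT_TO_FORMAT[ext]}"
                if token not in tokens:
                    tokens.append(token)
                break
    return tokens


def add_format_hints(query, paths):
    """Append format tokens to the query; leave it unchanged if none apply."""
    tokens = format_tokens(paths)
    return " ".join([query, *tokens]) if tokens else query


if __name__ == "__main__":
    print(add_format_hints("segment the liver", ["scan.dcm", "mask.nii.gz"]))
```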
Tests error conditions and boundary cases.
- test_empty_query - Empty query string
- test_very_long_query - Extremely long query
- test_special_characters_query - Query with special characters
- test_top_k_zero - Request zero results
- test_top_k_large - Request more results than available
Key Verification: System handles edge cases gracefully without crashes.
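One plausible way to keep these boundary cases from crashing is to normalize inputs before retrieval. The helpers below are hypothetical (`safe_top_k`, `clean_query`, and the 2000-character cap are assumptions for illustration, not the pipeline's documented behavior).

```python
def safe_top_k(top_k: int, catalog_size: int) -> int:
    """Clamp top_k into the range [0, catalog_size]."""
    return max(0, min(top_k, catalog_size))


def clean_query(query: str, max_chars: int = 2000) -> str:
    """Collapse whitespace and cap the length; an all-whitespace query becomes ''."""
    return " ".join(query.split())[:max_chars]


if __name__ == "__main__":
    print(safe_top_k(100, catalog_size=10))
    print(repr(clean_query("   ")))
```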
Tests the retry mechanism for insufficient results.
- test_retry_broadens_query - Very specific query triggers retry
- test_obscure_term_retry - Obscure medical term needs retry
Key Verification: Retry mechanism activates when needed, broadens search appropriately.
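The retry idea can be sketched as a loop that relaxes the search when too few results come back. How the real pipeline broadens a query is not described here, so this example assumes one simple strategy (lowering a score threshold); `retrieve_with_retry`, its parameters, and the `search` callable are all hypothetical.

```python
def retrieve_with_retry(search, query, min_results=3, threshold=0.6, step=0.2, max_retries=2):
    """Filter search hits by score, lowering the threshold until enough remain.

    `search(query)` is assumed to return a list of (name, score) pairs.
    Returns the accepted hits and the number of retries that were needed.
    """
    retries = 0
    while True:
        hits = [(name, score) for name, score in search(query) if score >= threshold]
        if len(hits) >= min_results or retries >= max_retries:
            return hits, retries
        retries += 1
        threshold = max(0.0, threshold - step)  # broaden the search


if __name__ == "__main__":
    def fake_search(query):
        # Fixed toy scores, for illustration only.
        return [("tool_a", 0.9), ("tool_b", 0.5), ("tool_c", 0.3)]

    hits, retries = retrieve_with_retry(fake_search, "obscure term", min_results=2)
    print(hits, retries)
```

A test in the style of test_retry_broadens_query could then assert both that results are returned and that at least one retry happened.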
Tests BGE-M3's semantic understanding capabilities.
- test_synonym_understanding_visualize_display - Synonyms (visualize/display/show)
- test_related_concepts_segmentation - Related concept understanding (partition→segment)
- test_acronym_vs_full_form - Acronym vs full form (CT vs Computed Tomography)
Key Verification: Semantic embeddings handle vocabulary variations naturally.
These tests verify the new, simplified retrieval pipeline, which has the following properties:
- ✅ Removed query expansion - No more hardcoded synonym dictionaries
- ✅ Relies on BGE-M3 - Semantic embeddings handle vocabulary naturally
- ✅ Uses CrossEncoder reranking - Precision layer after vector search
- ✅ Integrates image metadata - Format tokens and metadata hints enhance retrieval
- ✅ Domain-agnostic - Works for medical and non-medical tasks
Each test verifies:
- Results are returned (non-empty list)
- Top results are relevant (name matching, description content)
- Scores are properly set (similarity, rerank scores)
- Edge cases handled gracefully (no crashes)
- Semantic understanding works (synonyms, acronyms, related concepts)
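The per-test checks listed above could be gathered into a shared helper along these lines. This is a sketch, not the suite's actual code: `check_results` is a hypothetical name, and the `similarity` and `rerank_score` keys are assumed; only the `r["doc"].name` access pattern is taken from the example test in this document.

```python
from types import SimpleNamespace


def check_results(results, expected_terms, reranked=True, top_n=3):
    """Assert the common properties of a retrieval result list."""
    assert results, "expected a non-empty result list"
    for r in results:
        assert isinstance(r["similarity"], float), "similarity score missing"
        if reranked:
            assert isinstance(r["rerank_score"], float), "rerank score missing"
    # Relevance: at least one expected term appears in a top result's name or description.
    top_text = " ".join(
        f"{r['doc'].name} {r['doc'].description}".lower() for r in results[:top_n]
    )
    assert any(term.lower() in top_text for term in expected_terms), (
        f"none of {expected_terms} found in the top {top_n} results"
    )


if __name__ == "__main__":
    sample = [
        {
            "doc": SimpleNamespace(name="lung_segmenter", description="Segments lungs in CT"),
            "similarity": 0.82,
            "rerank_score": 0.91,
        }
    ]
    check_results(sample, expected_terms=["lung", "segment"])
    print("ok")
```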
- First test is slowest (~40s) - Loads BGE-M3 model and builds FAISS index
- Subsequent tests are faster - Models stay in memory (module-scoped fixture)
- Full suite takes ~20 minutes - Due to 34 tests × ~35s average per test
- Optimize: Use -k to run a subset, or --lf to run only the last failed tests
# Run with full traceback
pytest tests/test_retrieval_pipeline.py::TestName::test_name -v --tb=long
# Run with print statements visible
pytest tests/test_retrieval_pipeline.py::TestName::test_name -v -s
# Stop at first failure
pytest tests/test_retrieval_pipeline.py -x
# Run last failed tests only
pytest tests/test_retrieval_pipeline.py --lf

When adding tests:
- Choose appropriate test class (or create new one)
- Use descriptive test names: test_<what>_<scenario>
- Log key results for debugging: log.info(f"Result: {result}")
- Assert meaningful conditions (not just "len > 0")
- Document expected behavior in docstring
Example:
def test_new_scenario(self, pipeline):
    """Test: Brief description of what this tests."""
    results = pipeline.retrieve("query here", top_k=5)
    assert len(results) > 0, "Should find results"
    # Check specific behavior
    result_names = [r["doc"].name for r in results]
    log.info(f"Found: {result_names[:3]}")
    assert some_condition, "Explain why this should be true"

To run in CI/CD:
# Fast smoke test (3 tests, ~2 min)
pytest tests/test_retrieval_pipeline.py -k "lung_segmentation or ocr or empty_query"
# Medium coverage (10 tests, ~6 min)
pytest tests/test_retrieval_pipeline.py -k "Medical or NonMedical or edge"
# Full suite (34 tests, ~20 min)
pytest tests/test_retrieval_pipeline.py

- Pipeline: src/ai_agent/api/pipeline.py
- Embedder: src/ai_agent/retriever/text_embedder.py
- Reranker: src/ai_agent/retriever/reranker.py
- Vector Index: src/ai_agent/retriever/vector_index.py
- Image Metadata: src/ai_agent/utils/image_meta.py