Skip to content

Conversation

@rchowell
Copy link
Contributor

Changes Made

Don't check the model directly for embedding dimensions, this is the responsibility of the descriptor and fix the refactor.

Related Issues

https://dist-data.slack.com/archives/C052CA6Q9N1/p1761328915396179

Checklist

  • [N/A] Documented in API Docs (if applicable)
  • [N/A] Documented in User Guide (if applicable)
  • [N/A] If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
  • [N/A] Documentation builds and is formatted properly

@github-actions github-actions bot added the fix label Oct 24, 2025
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Simplifies test implementation by removing runtime model validation checks. The PR removes code that was instantiating embedding models in CI and verifying that AutoConfig.hidden_size matches the actual embedding dimensions from model.get_sentence_embedding_dimension().

Major changes:

  • Relocated test file from tests/ai/test_sentence_transformers.py to tests/ai/transformers/test_transformers_text_embedder.py
  • Removed IS_CI import and conditional logic for running models in CI
  • Removed runtime validation that instantiated models and checked actual embedding dimensions
  • Removed actual text embedding tests (embedder.embed_text(test_texts))
  • Simplified test_sentence_transformers_text_embedder_other to only verify descriptor metadata without model instantiation

Confidence Score: 4/5

  • This PR is safe to merge with minor concerns about test coverage reduction
  • The change removes runtime validation but trusts that AutoConfig.hidden_size accurately reflects embedding dimensions. While this simplifies tests and reduces CI time, it removes validation that the descriptor's dimensions match actual model behavior. The PR description mentions this is the "responsibility of the descriptor" which is architecturally correct, but reduces test coverage for catching mismatches between AutoConfig metadata and actual embedding dimensions
  • No files require special attention - this is a straightforward test simplification

Important Files Changed

File Analysis

Filename Score Overview
tests/ai/transformers/test_transformers_text_embedder.py 5/5 Moved test file from tests/ai/ to tests/ai/transformers/ and removed runtime validation tests that checked model dimensions against actual embeddings

Sequence Diagram

sequenceDiagram
    participant Test as Test Suite
    participant Provider as TransformersProvider
    participant Descriptor as TextEmbedderDescriptor
    participant AutoConfig as AutoConfig (HuggingFace)
    participant Model as SentenceTransformer Model

    Note over Test,Model: Before (Removed Code)
    Test->>Provider: get_text_embedder(model_name)
    Provider->>Descriptor: Create descriptor
    Test->>Descriptor: get_dimensions()
    Descriptor->>AutoConfig: from_pretrained(model).hidden_size
    AutoConfig-->>Descriptor: hidden_size value
    Descriptor-->>Test: EmbeddingDimensions(hidden_size)
    Test->>Descriptor: instantiate()
    Descriptor->>Model: Load SentenceTransformer
    Model-->>Descriptor: model instance
    Descriptor-->>Test: embedder
    Test->>Model: get_sentence_embedding_dimension()
    Model-->>Test: true_dimensions
    Note over Test: Assert descriptor dims == true dims
    Test->>Model: embed_text(["Hello", "Bye"])
    Model-->>Test: embeddings
    Note over Test: Verify embedding lengths

    Note over Test,AutoConfig: After (Current Code)
    Test->>Provider: get_text_embedder(model_name)
    Provider->>Descriptor: Create descriptor
    Test->>Descriptor: get_dimensions()
    Descriptor->>AutoConfig: from_pretrained(model).hidden_size
    AutoConfig-->>Descriptor: hidden_size value
    Descriptor-->>Test: EmbeddingDimensions(hidden_size)
    Note over Test: Assert descriptor dims == expected dims
    Note over Test,Model: No model instantiation or embedding tests
Loading

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

@codecov
Copy link

codecov bot commented Oct 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.47%. Comparing base (a880db9) to head (db295c2).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5445      +/-   ##
==========================================
- Coverage   71.48%   71.47%   -0.02%     
==========================================
  Files         996      996              
  Lines      126405   126405              
==========================================
- Hits        90363    90349      -14     
- Misses      36042    36056      +14     

see 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@everettVT everettVT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants