
test: add new tests for embedding models#1254

Merged
dbasunag merged 4 commits into opendatahub-io:main from dbasunag:other_model
Mar 19, 2026

Conversation

@dbasunag
Collaborator

@dbasunag dbasunag commented Mar 18, 2026

Pull Request

Summary

Related Issues

  • Fixes:
  • JIRA:

How it has been tested

  • Locally
  • Jenkins

Additional Requirements

  • If this PR introduces a new test image, did you create a PR to mirror it in the disconnected environment?
  • If this PR introduces new marker(s) or adds a new component, was a ticket created to update the relevant Jenkins job?

Summary by CodeRabbit

  • Tests
    • Added a fixture to supply embedding-model responses and new tests validating embedding-model search: ensures results are non-empty, verifies models come from the expected source, and confirms each model is labeled with the text-embedding task.

Signed-off-by: Debarati Basu-Nag <dbasunag@redhat.com>
@github-actions

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, comment /wip. To remove, comment /wip cancel.
  • To block merging of a PR, comment /hold. To unblock, comment /hold cancel.
  • To mark a PR as approved, comment /lgtm. To remove, comment /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified. To un-verify, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name>. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR will be created and linked to the current PR.
  • To build and push an image to quay, comment /build-push-pr-image. This creates an image tagged
    pr-<pr_number> in the quay repository. The tag is deleted when the PR is merged or closed.
Supported labels

{'/wip', '/lgtm', '/cherry-pick', '/build-push-pr-image', '/hold', '/verified'}

@coderabbitai
Contributor

coderabbitai bot commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 328d5019-25a0-49c3-863b-50a8bd54a72e

📥 Commits

Reviewing files that changed from the base of the PR and between f0a04a1 and ee2f236.

📒 Files selected for processing (1)
  • tests/model_registry/model_catalog/search/test_model_search.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/model_registry/model_catalog/search/test_model_search.py

📝 Walkthrough


Adds a class-scoped pytest fixture that fetches models filtered by tasks='text-embedding' and a new test class with three tests validating presence, source_id, and task attributes for embedding models.

Changes

  • Fixture — tests/model_registry/model_catalog/search/conftest.py
    Adds an embedding_models_response pytest fixture that calls get_models_from_catalog_api with filterQuery=tasks='text-embedding', using model_catalog_rest_url and model_registry_rest_headers.
  • Tests — tests/model_registry/model_catalog/search/test_model_search.py
    Adds an import of OTHER_MODELS_CATALOG_ID and a new TestEmbeddingModelSearch class (3 tests) asserting: the result is non-empty, source_id equals OTHER_MODELS_CATALOG_ID, and every model's tasks includes "text-embedding".
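A minimal self-contained sketch of what the walkthrough describes. In the real suite this is a class-scoped pytest fixture and a test class; here the project helper is stubbed, and the URL, headers, constant value, and response shape (size/items/source_id/tasks) are all assumptions for illustration:

```python
OTHER_MODELS_CATALOG_ID = "other_models"  # placeholder value for illustration

def get_models_from_catalog_api(url: str, headers: dict, filter_query: str) -> dict:
    # Stub standing in for the project's real helper; returns a
    # catalog-API-shaped payload matching the assumed schema.
    return {
        "size": 1,
        "items": [{"source_id": OTHER_MODELS_CATALOG_ID, "tasks": ["text-embedding"]}],
    }

def embedding_models_response() -> dict:
    # In the real suite this is a class-scoped pytest fixture.
    return get_models_from_catalog_api(
        url="https://catalog.example/api",   # placeholder for model_catalog_rest_url
        headers={},                          # placeholder for model_registry_rest_headers
        filter_query="tasks='text-embedding'",
    )

def run_embedding_model_checks(response: dict) -> None:
    # The three assertions the TestEmbeddingModelSearch class makes.
    assert response.get("size", 0) > 0
    for model in response["items"]:
        assert model["source_id"] == OTHER_MODELS_CATALOG_ID
        assert "text-embedding" in model["tasks"]
```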

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Actionable Issues

  • Fixture dependency robustness: Ensure model_catalog_rest_url is non-empty before indexing; otherwise risk IndexError. Add explicit validation or raise a clear error.
  • Input validation / injection risk: The fixture constructs a filterQuery containing tasks='text-embedding'. Confirm get_models_from_catalog_api properly escapes/validates this parameter to prevent query injection (CWE-89).
  • Response trust boundaries: Tests assume the response schema (size, items, source_id, tasks). Add defensive checks for missing/malformed fields to avoid crashes and false positives.
  • Test isolation: Verify OTHER_MODELS_CATALOG_ID semantics are correct and stable across environments to avoid flaky failures due to differing catalog sources.
  • Security note: Validate headers handling in model_registry_rest_headers to avoid leaking sensitive tokens in logs; ensure they are not printed or stored insecurely (CWE-200).
🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 warning)

Description check ⚠️ Warning — The PR description is a template with all sections present but no actual content: the Summary section is empty, Related Issues are unfilled placeholders, and no specific details about the changes or their rationale are provided. Resolution: fill in the Summary section with a description of what tests were added and why, and include any related issue/JIRA links in the Related Issues section.
✅ Passed checks (1 passed)
Title check ✅ Passed — The title accurately reflects the primary change: adding new tests for embedding models. It is concise, clear, and directly corresponds to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/model_registry/model_catalog/search/test_model_search.py (1)

458-472: Don't bind this task-filter test to a single source unless the query is source-scoped.

Line 468 requires every tasks='text-embedding' result to come from OTHER_MODELS_CATALOG_ID, but the fixture only filters on task. That couples the test to today's catalog inventory, so a legitimate embedding model added to another source will fail CI even when search behaves correctly. Either constrain the fixture/query to the intended source first, or drop the source-specific assertion from this class.

As per coding guidelines: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/model_registry/model_catalog/search/test_model_search.py` around lines
458 - 472, The test test_embedding_models_source_id currently asserts every item
in embedding_models_response (fixture filtered only by task='text-embedding')
has source_id == OTHER_MODELS_CATALOG_ID, coupling the test to current catalog
inventory; either narrow the query/fixture to explicitly request the
OTHER_MODELS_CATALOG_ID source (so embedding_models_response only contains
models from that source) or remove the source-specific assertion entirely —
update the test_embedding_models_source_id to either (A) modify the
fixture/query that builds embedding_models_response to include source filtering
for OTHER_MODELS_CATALOG_ID before checking items, or (B) delete the assertion
that compares model["source_id"] to OTHER_MODELS_CATALOG_ID and keep only
task-based validations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/model_registry/model_catalog/search/conftest.py`:
- Around line 14-18: The fixture in conftest calls
get_models_from_catalog_api(...) without changing the default page_size, so it
only returns page 1 and breaks whole-catalog assertions; update the fixture to
iterate/paginate through all pages returned by get_models_from_catalog_api (use
its page/token parameters or repeatedly call with incremented page/page_size
until no more results) or explicitly pass a sufficiently large page_size to
get_models_from_catalog_api to return the full result set; ensure you use the
same identifiers (model_catalog_rest_url, model_registry_rest_headers,
get_models_from_catalog_api) so downstream tests see all models matching the
"&filterQuery=tasks='text-embedding'" filter.

In `@tests/model_registry/model_catalog/search/test_model_search.py`:
- Around line 447-456: The test test_filter_query_by_text_embedding_task
currently only asserts embedding_models_response.get("size", 0) > 0; update it
to also assert that the returned payload contains a non-empty items list by
checking embedding_models_response.get("items", []) is truthy (or len(...) > 0).
Locate the test function test_filter_query_by_text_embedding_task and add an
explicit assertion on embedding_models_response.get("items", []) to ensure items
are present and not just relying on the reported size.
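A minimal sketch of the stronger assertion suggested above — checking both the reported size and that the items list is actually non-empty, so the test cannot pass on a payload whose size field disagrees with its contents:

```python
def assert_models_present(response: dict) -> None:
    """Fail if the catalog response reports no models or carries no items."""
    assert response.get("size", 0) > 0, "catalog reported zero matching models"
    assert response.get("items"), "size > 0 but items list is empty or missing"
```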

---

Nitpick comments:
In `@tests/model_registry/model_catalog/search/test_model_search.py`:
- Around line 458-472: The test test_embedding_models_source_id currently
asserts every item in embedding_models_response (fixture filtered only by
task='text-embedding') has source_id == OTHER_MODELS_CATALOG_ID, coupling the
test to current catalog inventory; either narrow the query/fixture to explicitly
request the OTHER_MODELS_CATALOG_ID source (so embedding_models_response only
contains models from that source) or remove the source-specific assertion
entirely — update the test_embedding_models_source_id to either (A) modify the
fixture/query that builds embedding_models_response to include source filtering
for OTHER_MODELS_CATALOG_ID before checking items, or (B) delete the assertion
that compares model["source_id"] to OTHER_MODELS_CATALOG_ID and keep only
task-based validations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 5f721509-29f7-43c6-a391-90a9a43705a3

📥 Commits

Reviewing files that changed from the base of the PR and between abd0153 and f0a04a1.

📒 Files selected for processing (2)
  • tests/model_registry/model_catalog/search/conftest.py
  • tests/model_registry/model_catalog/search/test_model_search.py

Signed-off-by: Debarati Basu-Nag <dbasunag@redhat.com>
Contributor

@fege fege left a comment


/lgtm

@dbasunag dbasunag merged commit ea3d10b into opendatahub-io:main Mar 19, 2026
10 checks passed
@dbasunag dbasunag deleted the other_model branch March 19, 2026 17:25
@github-actions

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.

ssaleem-rh pushed a commit to ssaleem-rh/opendatahub-tests that referenced this pull request Mar 23, 2026
* test: add new tests for embedding models

Signed-off-by: Debarati Basu-Nag <dbasunag@redhat.com>

* fix: address review comments

Signed-off-by: Debarati Basu-Nag <dbasunag@redhat.com>

---------

Signed-off-by: Debarati Basu-Nag <dbasunag@redhat.com>
Signed-off-by: Shehan Saleem <ssaleem@redhat.com>

4 participants