Skip to content

test: add validation for default catalog model correspondence#667

Merged
dbasunag merged 10 commits intoopendatahub-io:mainfrom
fege:check_all_models
Oct 7, 2025
Merged

test: add validation for default catalog model correspondence#667
dbasunag merged 10 commits intoopendatahub-io:mainfrom
fege:check_all_models

Conversation

@fege
Copy link
Copy Markdown
Contributor

@fege fege commented Oct 1, 2025

Add comprehensive test suite to validate model data consistency between default
catalog YAML configuration and API responses, including model count verification,
parameter correspondence validation, and random artifact validation.

Description

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Tests
    • Added fixtures to load the default model catalog via the API and to fetch/cache the catalog OpenAPI schema.
    • Added tests validating model count, schema-driven field correspondence, and random artifact comparisons between YAML and API responses.
    • Added a test utility to extract OpenAPI schema fields (excludes server-generated/non-comparable fields) for focused comparisons.

Validate that model parameters in the default catalog YAML file match
those returned by the model catalog API. Adds fixture to load catalog
YAML content and new test to compare model properties between YAML
and API responses, excluding artifacts and null values.
@fege fege requested review from dbasunag and lugi0 as code owners October 1, 2025 15:54
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Oct 1, 2025

📝 Walkthrough

Walkthrough

Adds fixtures to load the default catalog YAML from the catalog pod, fetch the catalog via REST and OpenAPI, introduces tests that compare YAML-defined models and artifacts to API responses using OpenAPI-derived fields, and adds a utility to extract schema fields from the OpenAPI document.

Changes

Cohort / File(s) Summary of changes
Fixtures & constants
tests/model_registry/model_catalog/conftest.py, tests/model_registry/model_catalog/constants.py
Added requests import, re-ordered yaml import, exported DEFAULT_CATALOG_FILE and CATALOG_CONTAINER, added get_model_catalog_pod usage; new fixtures: default_model_catalog_yaml_content(...) (reads/parses default catalog YAML from pod/container), default_catalog_api_response(...) (GET to models REST endpoint, parse JSON), and catalog_openapi_schema() (fetches/caches OpenAPI YAML).
Tests — default catalog comparisons
tests/model_registry/model_catalog/test_default_model_catalog.py
Added TestModelCatalogDefaultData with three tests: test_model_default_catalog_number_of_models (compares YAML model count to API size), test_model_default_catalog_correspondence_of_model_name (field-wise compare YAML vs API using OpenAPI-derived fields), and test_model_default_catalog_random_artifact (validate a random artifact against API). Added random and extract_schema_fields usage; wiring switched from pod CLI to API fixtures.
Utilities — OpenAPI schema field extraction
tests/model_registry/model_catalog/utils.py
Added extract_schema_fields(openapi_schema, schema_name) to recursively collect all and required fields for a given OpenAPI schema, follow allOf/$ref, and exclude server-generated/non-comparable fields (e.g., id, externalId, timestamps, artifacts, source_id).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly summarizes the addition of validation tests for default catalog model correspondence and clearly reflects the primary changeset around enhancing test coverage without including unnecessary detail or vague terminology.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Oct 1, 2025

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove it from the PR comment /wip cancel to the PR.
  • To block merging of a PR, add /hold in a comment. To un-block merging of PR comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove, add /lgtm cancel.
    lgtm label removed on each new commit push.
  • To mark PR as verified comment /verified to the PR, to un-verify comment /verified cancel to the PR.
    verified label removed on each new commit push.
  • To Cherry-pick a merged PR /cherry-pick <target_branch_name> to the PR. If <target_branch_name> is valid,
    and the current PR is merged, a cherry-picked PR would be created and linked to the current PR.
  • To build and push image to quay, add /build-push-pr-image in a comment. This would create an image with tag
    pr-<pr_number> to quay repository. This image tag, however would be deleted on PR merge or close action.
Supported labels

{'/verified', '/build-push-pr-image', '/cherry-pick', '/hold', '/lgtm', '/wip'}

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2af0f0b and ce340ca.

📒 Files selected for processing (2)
  • tests/model_registry/model_catalog/conftest.py (3 hunks)
  • tests/model_registry/model_catalog/test_default_model_catalog.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-21T19:08:14.982Z
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#527
File: tests/model_registry/rbac/conftest.py:358-359
Timestamp: 2025-08-21T19:08:14.982Z
Learning: In tests/model_registry/rbac/conftest.py, the model catalog configmap only exists in one specific namespace (py_config["model_registry_namespace"]) regardless of where model registry instances are deployed, making it a shared resource that serves all model registry instances across namespaces.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
🧬 Code graph analysis (2)
tests/model_registry/model_catalog/conftest.py (3)
tests/model_registry/utils.py (1)
  • get_model_catalog_pod (721-724)
tests/conftest.py (1)
  • admin_client (68-69)
tests/model_registry/conftest.py (1)
  • model_registry_namespace (67-68)
tests/model_registry/model_catalog/test_default_model_catalog.py (3)
tests/model_registry/model_catalog/utils.py (1)
  • execute_get_command (39-45)
tests/model_registry/utils.py (1)
  • get_rest_headers (727-732)
tests/model_registry/model_catalog/conftest.py (2)
  • default_model_catalog_yaml_content (152-154)
  • model_catalog_rest_url (49-58)

Comment thread tests/model_registry/model_catalog/conftest.py
Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/model_registry/model_catalog/test_default_model_catalog.py (1)

202-208: Exclude artifacts from both sides of the comparison.

The docstring (line 181) indicates that artifacts should be ignored, but only the YAML model filters out the artifacts key (line 203). When the API returns non-null artifacts, the diff will flag a difference even though it should be ignored. Additionally, null values should also be filtered from the API model for consistency.

Apply this diff to filter both sides consistently:

             # Exclude artifacts and null-valued properties from YAML model comparison
             model_filtered = {k: v for k, v in model.items() if k != "artifacts" and v is not None}
+            api_model_filtered = {k: v for k, v in api_model.items() if k != "artifacts" and v is not None}
 
-            differences = list(diff(model_filtered, api_model))
+            differences = list(diff(model_filtered, api_model_filtered))
             if differences:
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce340ca and 7621b59.

📒 Files selected for processing (1)
  • tests/model_registry/model_catalog/test_default_model_catalog.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-21T19:08:14.982Z
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#527
File: tests/model_registry/rbac/conftest.py:358-359
Timestamp: 2025-08-21T19:08:14.982Z
Learning: In tests/model_registry/rbac/conftest.py, the model catalog configmap only exists in one specific namespace (py_config["model_registry_namespace"]) regardless of where model registry instances are deployed, making it a shared resource that serves all model registry instances across namespaces.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
🧬 Code graph analysis (1)
tests/model_registry/model_catalog/test_default_model_catalog.py (3)
tests/model_registry/model_catalog/utils.py (1)
  • execute_get_command (39-45)
tests/model_registry/utils.py (1)
  • get_rest_headers (727-732)
tests/model_registry/model_catalog/conftest.py (2)
  • default_model_catalog_yaml_content (152-154)
  • model_catalog_rest_url (49-58)
🔇 Additional comments (3)
tests/model_registry/model_catalog/test_default_model_catalog.py (3)

13-13: LGTM!

The imports are necessary and correctly used throughout the test methods.

Also applies to: 21-21


155-172: LGTM!

The refactoring to use the default_model_catalog_yaml_content fixture simplifies the test and removes direct pod access. The comparison logic is correct and maintainable.


185-187: Ensure pagination or sufficient pageSize in default model catalog test
Automatic verification of the default catalog size failed (model-catalog pod not found). Please confirm whether the default catalog contains more than 100 models; if so, increase the pageSize parameter or implement pagination in the test.

Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/model_registry/model_catalog/conftest.py (1)

152-154: Guard against empty pod list before indexing.

Indexing [0] will raise IndexError if no pod is present yet. Assert non-empty list first to fail with a clear message.

Apply this diff:

-def default_model_catalog_yaml_content(admin_client: DynamicClient, model_registry_namespace: str) -> dict[Any, Any]:
-    model_catalog_pod = get_model_catalog_pod(client=admin_client, model_registry_namespace=model_registry_namespace)[0]
-    return yaml.safe_load(model_catalog_pod.execute(command=["cat", DEFAULT_CATALOG_FILE], container=CATALOG_CONTAINER))
+def default_model_catalog_yaml_content(admin_client: DynamicClient, model_registry_namespace: str) -> dict[Any, Any]:
+    model_catalog_pods = get_model_catalog_pod(client=admin_client, model_registry_namespace=model_registry_namespace)
+    assert model_catalog_pods, f"No model catalog pod found in namespace {model_registry_namespace}"
+    return yaml.safe_load(
+        model_catalog_pods[0].execute(command=["cat", DEFAULT_CATALOG_FILE], container=CATALOG_CONTAINER)
+    )
🧹 Nitpick comments (6)
tests/model_registry/model_catalog/conftest.py (1)

161-165: Avoid truncation: increase pageSize or implement pagination.

If default catalog grows beyond 100 models, results will truncate and tests will miscount.

Minimal fix:

-        url=f"{model_catalog_rest_url[0]}models?source={DEFAULT_CATALOG_ID}&pageSize=100",
+        url=f"{model_catalog_rest_url[0]}models?source={DEFAULT_CATALOG_ID}&pageSize=1000",

Alternatively, iterate pages or compare against len(items) in tests to decouple from API pagination semantics.

tests/model_registry/model_catalog/test_default_model_catalog.py (5)

170-175: Compare against len(items) rather than “size”.

Using len(items) is robust to API pagination semantics and avoids coupling to the meaning of size.

-        count = len(default_model_catalog_yaml_content.get("models", []))
-
-        assert count == default_catalog_api_response["size"], (
-            f"Expected count: {count}, Actual size: {default_catalog_api_response['size']}"
-        )
+        count = len(default_model_catalog_yaml_content.get("models", []))
+        api_count = len(default_catalog_api_response.get("items", []))
+        assert count == api_count, f"Expected count: {count}, API returned: {api_count}"

186-201: Filter None on API side and compare only YAML-defined keys.

Prevents false diffs from API nulls or extra fields. Also handle missing API model explicitly.

-        api_models = {model["name"]: model for model in default_catalog_api_response.get("items", [])}
+        api_models = {model["name"]: model for model in default_catalog_api_response.get("items", [])}
@@
-            api_model = api_models.get(model["name"])
+            api_model = api_models.get(model["name"])
+            if api_model is None:
+                models_with_differences[model["name"]] = ["missing in API"]
+                LOGGER.warning(f"Model {model['name']} not found in API response")
+                continue
@@
-            # Exclude artifacts and null-valued properties from YAML model comparison
-            model_filtered = {k: v for k, v in model.items() if k != "artifacts" and v is not None}
-
-            differences = list(diff(model_filtered, api_model))
+            # Exclude artifacts and null-valued properties from YAML model comparison
+            model_filtered = {k: v for k, v in model.items() if k != "artifacts" and v is not None}
+            # Exclude null-valued properties from API model, then project to YAML keys only
+            api_model_filtered = {k: v for k, v in api_model.items() if v is not None}
+            api_model_projected = {k: api_model_filtered.get(k) for k in model_filtered.keys()}
+
+            differences = list(diff(model_filtered, api_model_projected))

Note: No change to artifacts on the API side; the /models endpoint doesn’t include artifacts. Based on learnings.


217-219: Make selection deterministic to reduce flakiness.

Tests are easier to triage when they don’t depend on randomness. Pick a deterministic model (e.g., first by name).

-        random_model = random.choice(seq=default_model_catalog_yaml_content.get("models", []))
-        LOGGER.info(f"Random model: {random_model['name']}")
+        models = sorted(default_model_catalog_yaml_content.get("models", []), key=lambda m: m["name"])
+        assert models, "No models found in default catalog YAML"
+        random_model = models[0]
+        LOGGER.info(f"Selected model: {random_model['name']}")

231-237: Sort artifacts before diff to avoid order-sensitive failures.

API may return artifacts in different order. Sort by a stable key (uri) before comparing.

-        yaml_artifacts_filtered = [
-            {k: v for k, v in artifact.items() if k not in ["lastUpdateTimeSinceEpoch"]} for artifact in yaml_artifacts
-        ]
-        api_artifacts_filtered = [
-            {k: v for k, v in artifact.items() if k not in ["lastUpdateTimeSinceEpoch"]}
-            for artifact in api_model_artifacts
-        ]
+        yaml_artifacts_filtered = [
+            {k: v for k, v in artifact.items() if k not in ["lastUpdateTimeSinceEpoch"]} for artifact in yaml_artifacts
+        ]
+        api_artifacts_filtered = [
+            {k: v for k, v in artifact.items() if k not in ["lastUpdateTimeSinceEpoch"]} for artifact in api_model_artifacts
+        ]
+        yaml_artifacts_filtered = sorted(yaml_artifacts_filtered, key=lambda a: a.get("uri", ""))
+        api_artifacts_filtered = sorted(api_artifacts_filtered, key=lambda a: a.get("uri", ""))

239-241: Use the sorted lists for comparison.

Complete the ordering fix by diffing the sorted lists.

-        differences = list(diff(yaml_artifacts_filtered, api_artifacts_filtered))
-        assert not differences, f"Artifacts mismatch for {random_model['name']}: {differences}"
-        LOGGER.info("Artifacts match")
+        differences = list(diff(yaml_artifacts_filtered, api_artifacts_filtered))
+        assert not differences, f"Artifacts mismatch for {random_model['name']}: {differences}"
+        LOGGER.info("Artifacts match")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7621b59 and 142257b.

📒 Files selected for processing (2)
  • tests/model_registry/model_catalog/conftest.py (3 hunks)
  • tests/model_registry/model_catalog/test_default_model_catalog.py (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-21T19:08:14.982Z
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#527
File: tests/model_registry/rbac/conftest.py:358-359
Timestamp: 2025-08-21T19:08:14.982Z
Learning: In tests/model_registry/rbac/conftest.py, the model catalog configmap only exists in one specific namespace (py_config["model_registry_namespace"]) regardless of where model registry instances are deployed, making it a shared resource that serves all model registry instances across namespaces.

Applied to files:

  • tests/model_registry/model_catalog/conftest.py
  • tests/model_registry/model_catalog/test_default_model_catalog.py
📚 Learning: 2025-10-02T08:55:57.126Z
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#667
File: tests/model_registry/model_catalog/test_default_model_catalog.py:203-207
Timestamp: 2025-10-02T08:55:57.126Z
Learning: In the model catalog API, the `/models` endpoint does not return `artifacts` in the model response. Artifacts are retrieved via a separate endpoint `/models/{name}/artifacts`. Therefore, when comparing YAML model definitions with API responses, only the YAML side needs to filter out the `artifacts` key.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
🧬 Code graph analysis (2)
tests/model_registry/model_catalog/conftest.py (4)
tests/model_registry/model_catalog/utils.py (3)
  • is_model_catalog_ready (56-66)
  • wait_for_model_catalog_api (35-36)
  • execute_get_command (39-45)
tests/model_registry/utils.py (1)
  • get_model_catalog_pod (721-724)
tests/conftest.py (1)
  • admin_client (68-69)
tests/model_registry/conftest.py (2)
  • model_registry_namespace (67-68)
  • model_registry_rest_headers (310-311)
tests/model_registry/model_catalog/test_default_model_catalog.py (4)
tests/model_registry/model_catalog/utils.py (5)
  • validate_model_catalog_enabled (48-53)
  • execute_get_command (39-45)
  • validate_model_catalog_resource (83-86)
  • validate_default_catalog (89-93)
  • get_validate_default_model_catalog_source (140-150)
tests/model_registry/utils.py (1)
  • get_rest_headers (727-732)
tests/model_registry/model_catalog/conftest.py (3)
  • default_catalog_api_response (158-165)
  • default_model_catalog_yaml_content (152-154)
  • model_catalog_rest_url (49-58)
tests/model_registry/conftest.py (1)
  • model_registry_rest_headers (310-311)
🪛 Ruff (0.13.2)
tests/model_registry/model_catalog/test_default_model_catalog.py

217-217: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/model_registry/model_catalog/test_default_model_catalog.py (3)

196-196: Consider logging filtered fields to improve debugging visibility.

While the code correctly excludes artifacts from the YAML side (as per the API design), there's no log indicating what was filtered. Adding a debug log would help troubleshoot future issues and address the concern about silently discarding fields.

             # Exclude artifacts from YAML model comparison - Artifacts are validated in a separate test
-            model_filtered = {k: v for k, v in model.items() if k != "artifacts"}
+            model_filtered = {k: v for k, v in model.items() if k != "artifacts"}
+            if "artifacts" in model:
+                LOGGER.debug(f"Filtered 'artifacts' key from {model['name']} for comparison")

217-217: Add defensive check for empty models list.

random.choice will raise IndexError if the models list is empty. While the YAML should always contain models, adding a defensive assertion makes the test more robust and provides a clearer error message.

+        models = default_model_catalog_yaml_content.get("models", [])
+        assert models, "No models found in YAML catalog"
+        
-        random_model = random.choice(seq=default_model_catalog_yaml_content.get("models", []))
+        random_model = random.choice(models)
         LOGGER.info(f"Random model: {random_model['name']}")

220-228: Optimize test execution by checking YAML first.

The test currently fetches artifacts from the API (line 220-223) before checking if the YAML model has artifacts (line 228). If the randomly selected model has no artifacts in YAML, the API call was unnecessary. Reordering these checks improves efficiency.

+        yaml_artifacts = random_model.get("artifacts", [])
+        assert yaml_artifacts, f"No artifacts found in YAML for {random_model['name']}"
+
         api_model_artifacts = execute_get_command(
             url=f"{model_catalog_rest_url[0]}sources/{DEFAULT_CATALOG_ID}/models/{random_model['name']}/artifacts",
             headers=model_registry_rest_headers,
         )["items"]
 
-        yaml_artifacts = random_model.get("artifacts", [])
-
         assert api_model_artifacts, f"No artifacts found in API for {random_model['name']}"
-        assert yaml_artifacts, f"No artifacts found in YAML for {random_model['name']}"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 142257b and aefdc09.

📒 Files selected for processing (1)
  • tests/model_registry/model_catalog/test_default_model_catalog.py (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-02T08:55:57.134Z
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#667
File: tests/model_registry/model_catalog/test_default_model_catalog.py:203-207
Timestamp: 2025-10-02T08:55:57.134Z
Learning: In the model catalog API, the `/models` endpoint does not return `artifacts` in the model response. Artifacts are retrieved via a separate endpoint `/models/{name}/artifacts`. Therefore, when comparing YAML model definitions with API responses, only the YAML side needs to filter out the `artifacts` key.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
📚 Learning: 2025-08-21T19:08:14.982Z
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#527
File: tests/model_registry/rbac/conftest.py:358-359
Timestamp: 2025-08-21T19:08:14.982Z
Learning: In tests/model_registry/rbac/conftest.py, the model catalog configmap only exists in one specific namespace (py_config["model_registry_namespace"]) regardless of where model registry instances are deployed, making it a shared resource that serves all model registry instances across namespaces.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
🧬 Code graph analysis (1)
tests/model_registry/model_catalog/test_default_model_catalog.py (4)
tests/model_registry/model_catalog/utils.py (5)
  • validate_model_catalog_enabled (48-53)
  • execute_get_command (39-45)
  • validate_model_catalog_resource (83-86)
  • validate_default_catalog (89-93)
  • get_validate_default_model_catalog_source (140-150)
tests/model_registry/utils.py (1)
  • get_rest_headers (727-732)
tests/model_registry/model_catalog/conftest.py (3)
  • default_catalog_api_response (158-165)
  • default_model_catalog_yaml_content (152-154)
  • model_catalog_rest_url (49-58)
tests/model_registry/conftest.py (1)
  • model_registry_rest_headers (310-311)
🪛 Ruff (0.13.2)
tests/model_registry/model_catalog/test_default_model_catalog.py

217-217: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

🔇 Additional comments (5)
tests/model_registry/model_catalog/test_default_model_catalog.py (5)

3-3: LGTM: random import is appropriate for test data selection.

The static analysis hint (S311) flags the use of standard pseudo-random generators, but this is a false positive. In test code, random.choice is perfectly acceptable for selecting sample test data and does not require cryptographic randomness.


157-159: LGTM: Clean test class organization.

The new test class appropriately separates data validation concerns from user-specific access tests, with clear documentation of its purpose.


161-175: LGTM: Clear and correct model count validation.

The test logic properly extracts and compares counts from both sources, with defensive coding (.get("models", [])) and helpful assertion messages.


231-241: LGTM: Artifact comparison correctly handles timestamp differences.

The test appropriately filters out lastUpdateTimeSinceEpoch from both sides before comparison, as timestamps can legitimately vary. The use of list comprehensions for filtering is clean and the assertion message is helpful.


202-205: Clarify null-valued property handling in test_default_model_catalog

  • The test references ticket RHOAIENG-35322 (no public record). Should null-valued properties be filtered out before computing differences? If so, implement that filtering.
  • Otherwise, mark this test as expected to fail (e.g., xfail) and update the comment to include a valid Jira link or ticket description.

Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py
Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py Outdated
Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/model_registry/model_catalog/utils.py (1)

162-205: Add error handling for invalid schema names.

The function will raise a KeyError on line 193 if schema_name doesn't exist in the OpenAPI schema. Consider adding validation or a more descriptive error message to help diagnose test failures.

Apply this diff to add validation:

+    if schema_name not in openapi_schema.get("components", {}).get("schemas", {}):
+        raise ValueError(f"Schema '{schema_name}' not found in OpenAPI specification")
+    
     target_schema = openapi_schema["components"]["schemas"][schema_name]
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aefdc09 and b9bd1b8.

📒 Files selected for processing (3)
  • tests/model_registry/model_catalog/conftest.py (3 hunks)
  • tests/model_registry/model_catalog/test_default_model_catalog.py (3 hunks)
  • tests/model_registry/model_catalog/utils.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/model_registry/model_catalog/conftest.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-02T08:55:57.134Z
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#667
File: tests/model_registry/model_catalog/test_default_model_catalog.py:203-207
Timestamp: 2025-10-02T08:55:57.134Z
Learning: In the model catalog API, the `/models` endpoint does not return `artifacts` in the model response. Artifacts are retrieved via a separate endpoint `/models/{name}/artifacts`. Therefore, when comparing YAML model definitions with API responses, only the YAML side needs to filter out the `artifacts` key.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
📚 Learning: 2025-08-21T19:08:14.982Z
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#527
File: tests/model_registry/rbac/conftest.py:358-359
Timestamp: 2025-08-21T19:08:14.982Z
Learning: In tests/model_registry/rbac/conftest.py, the model catalog configmap only exists in one specific namespace (py_config["model_registry_namespace"]) regardless of where model registry instances are deployed, making it a shared resource that serves all model registry instances across namespaces.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
🧬 Code graph analysis (1)
tests/model_registry/model_catalog/test_default_model_catalog.py (4)
tests/model_registry/model_catalog/utils.py (2)
  • execute_get_command (43-49)
  • extract_schema_fields (162-206)
tests/model_registry/utils.py (1)
  • get_rest_headers (658-663)
tests/model_registry/model_catalog/conftest.py (4)
  • default_catalog_api_response (169-176)
  • default_model_catalog_yaml_content (163-165)
  • catalog_openapi_schema (180-185)
  • model_catalog_rest_url (52-61)
tests/model_registry/conftest.py (1)
  • model_registry_rest_headers (308-309)
🪛 Ruff (0.13.3)
tests/model_registry/model_catalog/test_default_model_catalog.py

248-248: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

🔇 Additional comments (8)
tests/model_registry/model_catalog/utils.py (2)

174-191: LGTM!

The recursive logic for handling allOf schema composition and $ref resolution is well-implemented. The function correctly accumulates properties and required fields from parent schemas.


196-204: Verify the excluded fields list is complete.

The exclusion list appropriately filters server-generated and non-comparable fields. Based on learnings, the API doesn't return artifacts in the /models endpoint, so excluding it here is correct.

If you discover additional fields during testing that should be excluded (e.g., read-only server timestamps, computed fields), update this list accordingly.

tests/model_registry/model_catalog/test_default_model_catalog.py (6)

158-160: LGTM!

The new test class is well-organized and clearly scoped to validate default catalog data consistency, separate from user-specific access tests.


162-177: LGTM!

The test correctly validates that the model count from the YAML configuration matches the API response size.


178-229: Test logic is sound; known issue documented.

The test properly:

  • Extracts schema fields using the OpenAPI specification
  • Validates required fields are present in both YAML and API responses
  • Filters to schema-defined fields before comparison
  • Documents the known null-value issue (RHOAIENG-35322)

The comment on line 225 correctly notes that the test will fail until RHOAIENG-35322 is resolved. Consider using pytest.mark.xfail if you want to mark this as an expected failure until the bug is fixed:

@pytest.mark.xfail(reason="RHOAIENG-35322: null-valued properties in YAML model")
def test_model_default_catalog_correspondence_of_model_name(...)

248-248: Ignore static analysis warning (false positive).

The Ruff S311 warning about pseudo-random generators being unsuitable for cryptographic purposes is a false positive. This test code uses random.choice() solely to select a model for validation, not for any security-sensitive operation.


268-274: LGTM!

The filtering logic correctly restricts comparison to schema-defined fields, ensuring that any extra or missing fields outside the OpenAPI specification don't cause spurious test failures.


231-278: Remove the # FAILS: artifactType … comment – artifactType isn’t defined in the OpenAPI schema for CatalogModelArtifact (it has zero required fields, even via BaseResource), so the loop is a no-op; the test passes as-is and doesn’t need an xfail or tracking issue.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b9bd1b8 and 806515b.

📒 Files selected for processing (1)
  • tests/model_registry/model_catalog/test_default_model_catalog.py (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-02T08:55:57.134Z
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#667
File: tests/model_registry/model_catalog/test_default_model_catalog.py:203-207
Timestamp: 2025-10-02T08:55:57.134Z
Learning: In the model catalog API, the `/models` endpoint does not return `artifacts` in the model response. Artifacts are retrieved via a separate endpoint `/models/{name}/artifacts`. Therefore, when comparing YAML model definitions with API responses, only the YAML side needs to filter out the `artifacts` key.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
📚 Learning: 2025-08-21T19:08:14.982Z
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#527
File: tests/model_registry/rbac/conftest.py:358-359
Timestamp: 2025-08-21T19:08:14.982Z
Learning: In tests/model_registry/rbac/conftest.py, the model catalog configmap only exists in one specific namespace (py_config["model_registry_namespace"]) regardless of where model registry instances are deployed, making it a shared resource that serves all model registry instances across namespaces.

Applied to files:

  • tests/model_registry/model_catalog/test_default_model_catalog.py
🧬 Code graph analysis (1)
tests/model_registry/model_catalog/test_default_model_catalog.py (4)
tests/model_registry/model_catalog/utils.py (6)
  • validate_model_catalog_enabled (52-57)
  • execute_get_command (43-49)
  • validate_model_catalog_resource (87-90)
  • validate_default_catalog (93-97)
  • get_validate_default_model_catalog_source (145-155)
  • extract_schema_fields (162-206)
tests/model_registry/utils.py (1)
  • get_rest_headers (658-663)
tests/model_registry/model_catalog/conftest.py (4)
  • default_catalog_api_response (169-176)
  • default_model_catalog_yaml_content (163-165)
  • catalog_openapi_schema (180-185)
  • model_catalog_rest_url (52-61)
tests/model_registry/conftest.py (1)
  • model_registry_rest_headers (308-309)
🪛 Ruff (0.13.3)
tests/model_registry/model_catalog/test_default_model_catalog.py

248-248: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

🔇 Additional comments (6)
tests/model_registry/model_catalog/test_default_model_catalog.py (6)

3-3: LGTM!

The new imports are appropriate for the test implementations. The random module is used for test data selection (not cryptographic purposes), and the utility imports align with the fixtures and schema validation approach.

Also applies to: 14-14, 21-21, 23-23


158-160: LGTM!

The test class structure provides clear separation between user-specific access tests and data validation tests. The docstring clarifies the intent effectively.


162-177: LGTM!

The model count validation is straightforward and correctly uses the fixtures to compare YAML-defined model count against the API response size.


188-192: LGTM! OpenAPI schema-based filtering addresses past feedback.

The approach of using extract_schema_fields to determine which fields to compare is robust and aligns with the suggestion to filter by the response schema. The filtering at lines 217-218 correctly limits comparison to schema-defined fields, and the assertion at line 195 ensures the API response contains models.

Based on learnings: The current filtering correctly handles the artifacts field since extract_schema_fields excludes it from all_model_fields, and the API doesn't return artifacts in the model response.

Also applies to: 194-195, 217-218


248-248: Dismiss static analysis false positive.

The Ruff warning about random.choice being unsuitable for cryptographic purposes is a false positive. The code uses random.choice to select a test model for validation, not for cryptographic purposes.


242-246: LGTM! Artifact validation follows the same robust pattern.

The artifact validation correctly uses the OpenAPI schema-based approach to filter fields before comparison. The implementation is consistent with the model correspondence test and properly handles the schema-defined fields.

Also applies to: 269-274, 276-278

Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py
Comment thread tests/model_registry/model_catalog/test_default_model_catalog.py
@dbasunag dbasunag merged commit 38e0904 into opendatahub-io:main Oct 7, 2025
10 checks passed
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Oct 7, 2025

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.

@fege fege deleted the check_all_models branch October 10, 2025 07:00
mwaykole pushed a commit to mwaykole/opendatahub-tests that referenced this pull request Jan 23, 2026
…tahub-io#667)

* test: add validation for default catalog model correspondence

Validate that model parameters in the default catalog YAML file match
those returned by the model catalog API. Adds fixture to load catalog
YAML content and new test to compare model properties between YAML
and API responses, excluding artifacts and null values.

* fix: do not filter out api results

* change: add validation for random artifact default catalog API and YAML correspondence

* change: Do not filter out null values

* test: validate catalog schema compliance and field requirements

* change: add ticket for failing test

---------

Co-authored-by: Debarati Basu-Nag <dbasunag@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants