Tests for hugging face scheduled sync#1035
Conversation
|
The following are automatically added/executed:
Available user actions:
Supported labels{'/build-push-pr-image', '/wip', '/lgtm', '/hold', '/cherry-pick', '/verified'} |
📝 WalkthroughWalkthroughAdds fixtures, a polling helper, and tests to validate HuggingFace model metadata last_synced timestamps via REST: a module-scoped epoch-ms fixture, a per-test fixture capturing initial last_synced, a retrying wait_for_last_sync_update helper that checks for a 120s update, and new tests asserting presence and timing of last_synced. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@tests/model_registry/model_catalog/huggingface/conftest.py`:
- Around line 50-66: The fixture initial_last_synced_values currently returns a
string from result["customProperties"]["last_synced"]["string_value"]; change it
to return a float (update the type hint to -> float) and convert the retrieved
value with float(...) before returning so consumers like
wait_for_last_sync_update get the correct type; also add defensive access around
result (use get or check for "customProperties" and "last_synced") and raise a
clear error or pytest.skip/assert with context if those keys are missing instead
of letting a KeyError bubble up; reference execute_get_command for where the
response comes from and initial_last_synced_values and wait_for_last_sync_update
for the callers.
In
`@tests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py`:
- Around line 80-81: Two consecutive pytest.mark.usefixtures decorators are
duplicated; remove the redundant decorator so the test uses a single
`@pytest.mark.usefixtures` containing both fixtures. Locate the test decorated
with pytest.mark.usefixtures and the fixture names current_epoch_time and
updated_catalog_config_map and ensure only one `@pytest.mark.usefixtures`
decorator remains (e.g., `@pytest.mark.usefixtures`("current_epoch_time",
"updated_catalog_config_map")).
- Around line 133-134: The test currently calls pytest.fail(error) inside the
"if error:" branch which passes a dict to pytest.fail; change that to pass a
clear string message (e.g., convert the dict to a string or pretty-print it) so
pytest receives a readable message—locate the "if error:" block where
pytest.fail is called in the test_huggingface_model_validation test and replace
pytest.fail(error) with pytest.fail(str(error)) or a formatted JSON/string
representation.
- Line 92: The parameter type hint for huggingface_api in the test functions is
incorrect: change it from bool to HfApi in the test signatures (e.g., the
huggingface_api parameter in test_huggingface_model_validation,
test_huggingface_model_metadata, and test_hugging_face_models) and ensure HfApi
is imported from huggingface_hub at the top of the test module if not already
present so the type annotation is valid.
- Line 48: The type hint for the fixture parameter
updated_catalog_config_map_scope_function is wrong — change its annotation from
Generator[ConfigMap, str, str] to Generator[ConfigMap, None, None] to match the
fixture implementation in conftest (it yields a ConfigMap and does not send or
return strings); update the import/type usage wherever
updated_catalog_config_map_scope_function is declared or referenced (e.g., the
test function signature in test_huggingface_model_validation.py) to use
Generator[ConfigMap, None, None].
🧹 Nitpick comments (5)
tests/model_registry/model_catalog/huggingface/utils.py (3)
145-151: Add Google-format docstring with Args/Returns sections.Per coding guidelines, utility functions should have Google-format docstrings. The current one-liner doesn't document parameters or return value.
📝 Suggested docstring
def wait_for_last_sync_update( model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], model_name: str, initial_last_synced_values: float, ) -> bool: - """Wait for the last_synced value to be updated with exact 120-second difference""" + """Wait for the last_synced value to be updated with expected sync interval. + + Args: + model_catalog_rest_url: List of model catalog REST API URLs. + model_registry_rest_headers: Headers for REST API authentication. + model_name: Name of the model to check. + initial_last_synced_values: Initial last_synced timestamp in milliseconds. + + Returns: + True if sync interval matches expected 120 seconds, False otherwise. + """
153-153: UseHF_SOURCE_IDconstant instead of hardcoded"hf_id".Other functions in this file (e.g.,
assert_huggingface_values_matches_model_catalog_api_valuesat line 76) use theHF_SOURCE_IDconstant fromtests.model_registry.model_catalog.constants. This should be consistent.🔧 Suggested fix
- url = f"{model_catalog_rest_url[0]}sources/hf_id/models/{model_name}" + url = f"{model_catalog_rest_url[0]}sources/{HF_SOURCE_ID}/models/{model_name}"
160-178: Exact equality check (== 120) is brittle and may cause flaky tests.Comparing
difference_seconds == expected_diffrequires the sync interval to be precisely 120 seconds. Real-world timing variations (network latency, execution delays, clock drift) could cause this to intermittently fail with values like 119 or 121 seconds.Consider using a tolerance range instead:
🔧 Suggested fix with tolerance
expected_diff = 120 - if difference_seconds == expected_diff: + tolerance = 5 # Allow ±5 seconds tolerance for timing variations + if abs(difference_seconds - expected_diff) <= tolerance: LOGGER.info(f"Model {model_name} successfully synced with correct interval ({difference_seconds}s)") return True else: LOGGER.error( - f"Model {model_name}: sync interval should be {expected_diff}s, " + f"Model {model_name}: sync interval should be {expected_diff}s (±{tolerance}s), " f"but found {difference_seconds}s (difference: {abs(difference_seconds - expected_diff)}s). " f"Initial: {initial_last_synced_values}, Current: {current_last_synced}" ) return Falsetests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py (2)
21-22: Missing pytest markers per coding guidelines.Per guidelines, tests should have tier markers (
smoke,sanity,tier1,tier2) and component markers (model_registry). The docstring also doesn't follow the Given-When-Then format.🔧 Suggested markers and docstring format
+@pytest.mark.tier2 +@pytest.mark.model_registry class TestLastSyncedMetadataValidation: - """Test HuggingFace model last synced timestamp validation""" + """Test HuggingFace model last synced timestamp validation. + + Given: A HuggingFace model catalog with configured sync interval + When: The sync interval elapses + Then: The last_synced timestamp is updated correctly + """As per coding guidelines, every test should have appropriate pytest markers.
46-53: Clarify unused fixture parameter and fix docstring format.The
updated_catalog_config_map_scope_functionparameter (flagged by static analysis as unused) is actually required for test setup—it triggers the ConfigMap update. Consider adding a brief comment explaining this fixture dependency.The docstring should follow Given-When-Then format per coding guidelines.
📝 Suggested docstring
def test_huggingface_last_synced_custom( self: Self, - updated_catalog_config_map_scope_function: Generator[ConfigMap, str, str], + updated_catalog_config_map_scope_function: Generator[ConfigMap, str, str], # Required for ConfigMap setup initial_last_synced_values: str, model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], model_name: str, ): """ - Custom test for HuggingFace model last synced validation + Test that last_synced timestamp updates after sync interval. + + Given: A HuggingFace model with initial last_synced value + When: The configured sync interval (2m) elapses + Then: The last_synced timestamp is updated with ~120s difference """
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@tests/model_registry/model_catalog/huggingface/conftest.py`:
- Around line 41-47: The fixture epoch_time_before_config_map_update uses a
plain docstring; replace it with a Google-style docstring: start with a one-line
summary, an empty line, then Args (none) and Returns sections documenting that
it returns the current epoch time in milliseconds as a float and the fixture
scope/purpose; keep the function name and behavior unchanged and ensure the
docstring follows Google formatting conventions.
In
`@tests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py`:
- Around line 21-23: The test class TestLastSyncedMetadataValidation is missing
the required pytest markers; add the appropriate pytest.mark decorators for the
configured tier, component and infrastructure markers (e.g., decorate the class
with `@pytest.mark`.<tier>, `@pytest.mark`.<component>, and
`@pytest.mark`.<infrastructure>) so test selection works; apply these to the class
definition (TestLastSyncedMetadataValidation) using the exact marker names
defined in pytest.ini.
- Around line 84-95: The test function
test_huggingface_model_metadata_last_synced is missing a return type annotation
and its docstring should follow the Google-style Given/When/Then format; update
the function signature to include "-> None" and replace the existing docstring
with a concise Google-format docstring that lists Given (preconditions), When
(action performed), and Then (expected outcome) to satisfy test guidelines while
keeping the same intent and details.
- Around line 58-63: The call to wait_for_last_sync_update in the test doesn’t
assert its boolean return value, so failures are ignored; capture its return
(e.g., result = wait_for_last_sync_update(...)) and add an assertion like assert
result or assert result is True with a helpful message so the test fails when
polling did not succeed; modify the invocation of wait_for_last_sync_update in
test_huggingface_model_validation.py to store the return and assert it.
- Around line 46-56: Add a return type annotation "-> None" to the
test_huggingface_last_synced_custom function signature and replace its current
freeform docstring with a Google-style Given/When/Then docstring describing the
preconditions (Given), the action (When), and the expected outcome (Then) for
the HuggingFace model last synced validation; update the docstring directly
inside the test_huggingface_last_synced_custom function so tools and linters
pick up the new annotation and formatted docstring.
♻️ Duplicate comments (3)
tests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py (2)
48-48: Fix type hint for the fixture parameter.The fixture yields only
ConfigMapand does not send/return strings, so the type should beGenerator[ConfigMap, None, None]. This is a duplicate of a prior review comment.
91-91: Use the correct HuggingFace API type.
huggingface_apishould be typed asHfApi, notbool. This mirrors a prior review comment.🔧 Suggested change
+from huggingface_hub import HfApi ... - huggingface_api: bool, + huggingface_api: HfApi,tests/model_registry/model_catalog/huggingface/conftest.py (1)
50-66: Return type should be float and guard missing keys.This still returns a string and will raise
KeyErrorifcustomProperties/last_syncedis missing. This matches a previously raised issue.🔧 Suggested adjustment
def initial_last_synced_values( request: pytest.FixtureRequest, model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], -) -> str: - """ - Collect initial last_synced values for a given model. - """ +) -> float: + """Collect initial last_synced values for a given model. + + Args: + request: Pytest fixture request with model_name as param. + model_catalog_rest_url: List of catalog REST URLs. + model_registry_rest_headers: Headers for REST API calls. + + Returns: + float: last_synced timestamp (milliseconds since epoch). + """ model_name = request.param url = f"{model_catalog_rest_url[0]}sources/hf_id/models/{model_name}" result = execute_get_command( url=url, headers=model_registry_rest_headers, ) - return result["customProperties"]["last_synced"]["string_value"] + try: + return float(result["customProperties"]["last_synced"]["string_value"]) + except KeyError as exc: + pytest.fail(f"Missing last_synced in response for {model_name}: {result}") from exc
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
tests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py (3)
133-145: Add return type annotation and convert to Google-format Given-When-Then docstring.This test is missing the
-> Nonereturn type annotation and lacks a Google-format Given-When-Then docstring, violating the coding guidelines for test files.Proposed update
def test_huggingface_model_metadata( self: Self, updated_catalog_config_map: tuple[ConfigMap, str, str], model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], expected_catalog_values: dict[str, str], huggingface_api: bool, - ): - """ - Validate HuggingFace model metadata structure and required fields - Cross-validate with actual HuggingFace Hub API - """ + ) -> None: + """Validate HuggingFace model metadata. + + Given the catalog is configured for HuggingFace models, + When metadata is fetched from the catalog API, + Then values should match the HuggingFace Hub API results. + + Args: + updated_catalog_config_map: ConfigMap update fixture. + model_catalog_rest_url: Catalog API base URL(s). + model_registry_rest_headers: REST headers for API calls. + expected_catalog_values: Expected model metadata values. + huggingface_api: HuggingFace API client. + """
218-223: Mark unused fixture argument to silence ARG002 linter warning.
updated_catalog_config_map_scope_functionis unused in the function body but required for its side effects (resource patching and cleanup via the context manager). The fixture's presence triggers pytest setup/teardown, but the yielded value is never referenced.🔧 Suggested fix
LOGGER.info("Testing HuggingFace model exclusion functionality") + _ = updated_catalog_config_map_scope_function wait_for_hugging_face_model_import(
218-230: Add return type annotation and Google-format Given/When/Then docstring.Per coding guidelines, this test requires full type annotations and proper docstring formatting.
♻️ Proposed update
def test_hugging_face_models( self: Self, admin_client: DynamicClient, model_registry_namespace: str, updated_catalog_config_map_scope_function: Generator[ConfigMap, None, None], model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], huggingface_api: bool, num_models_from_hf_api_with_matching_criteria: int, - ): - """ - Test that excluded models do not appear in the catalog API response - """ + ) -> None: + """Validate excluded models are not returned. + + Given a catalog with include/exclude patterns, + When the catalog API is queried, + Then excluded models should not appear in the results. + + Args: + admin_client: OpenShift client fixture. + model_registry_namespace: Target namespace. + updated_catalog_config_map_scope_function: Catalog config fixture. + model_catalog_rest_url: Catalog API base URL(s). + model_registry_rest_headers: REST headers for API calls. + huggingface_api: HuggingFace API client. + num_models_from_hf_api_with_matching_criteria: Expected model count. + """
🤖 Fix all issues with AI agents
In
`@tests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py`:
- Around line 87-92: The fixture argument updated_catalog_config_map is
currently unused and triggers ARG002; mark it as used by referencing it once in
the test (for example assign it to a throwaway variable or call a no-op like _ =
updated_catalog_config_map) so the linter is satisfied; update the test function
signature that accepts updated_catalog_config_map (the pytest parameter in
test_huggingface_model_validation.py) to include a single usage statement
referencing updated_catalog_config_map.
- Around line 47-53: The test function test_huggingface_last_synced_custom
receives the fixture updated_catalog_config_map_scope_function only for its side
effects but currently triggers ARG002; inside the body of
test_huggingface_last_synced_custom, explicitly assign the fixture parameter to
"_" (e.g. `_ = updated_catalog_config_map_scope_function`) at the top of the
function to mark it as intentionally unused while preserving its setup/teardown
behavior.
♻️ Duplicate comments (3)
tests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py (3)
133-140: Fixhuggingface_apitype hints toHfApi.The fixture returns an
HfApi; typing it asboolis incorrect and breaks strict typing.🔧 Suggested fix
def test_huggingface_model_metadata( self: Self, updated_catalog_config_map: tuple[ConfigMap, str, str], model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], expected_catalog_values: dict[str, str], - huggingface_api: bool, + huggingface_api: HfApi, ):def test_hugging_face_models( self: Self, admin_client: DynamicClient, model_registry_namespace: str, updated_catalog_config_map_scope_function: Generator[ConfigMap, None, None], model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], - huggingface_api: bool, + huggingface_api: HfApi, num_models_from_hf_api_with_matching_criteria: int, ):As per coding guidelines, please update accordingly.
Also applies to: 218-226
47-56: Add return type and Google Given/When/Then docstring.Strict typing and the test docstring format are required here.
♻️ Proposed update
def test_huggingface_last_synced_custom( self: Self, updated_catalog_config_map_scope_function: Generator[ConfigMap, None, None], initial_last_synced_values: str, model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], model_name: str, - ): - """ - Custom test for HuggingFace model last synced validation - """ + ) -> None: + """Validate HuggingFace model last_synced update. + + Given a catalog configured with a custom sync interval, + When the model metadata is polled after the update, + Then last_synced should advance by the expected interval. + + Args: + updated_catalog_config_map_scope_function: Catalog config fixture. + initial_last_synced_values: Baseline last_synced value (ms). + model_catalog_rest_url: Catalog API base URL(s). + model_registry_rest_headers: REST headers for API calls. + model_name: Model under validation. + """As per coding guidelines, please update accordingly.
85-96: Add return type and Google Given/When/Then docstring.This test also needs a strict return type and a Google-format Given/When/Then docstring.
♻️ Proposed update
def test_huggingface_model_metadata_last_synced( self: Self, epoch_time_before_config_map_update: float, updated_catalog_config_map: tuple[ConfigMap, str, str], model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], expected_catalog_values: dict[str, str], huggingface_api: HfApi, - ): - """ - Validate HuggingFace model last synced timestamp is properly updated - """ + ) -> None: + """Validate last_synced timestamps. + + Given the catalog is updated, + When model metadata is fetched from the API, + Then last_synced should be present and after the test start time. + + Args: + epoch_time_before_config_map_update: Epoch ms recorded at class start. + updated_catalog_config_map: ConfigMap update fixture. + model_catalog_rest_url: Catalog API base URL(s). + model_registry_rest_headers: REST headers for API calls. + expected_catalog_values: Expected model metadata values. + huggingface_api: HuggingFace API client. + """As per coding guidelines, please update accordingly.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/model_registry/model_catalog/huggingface/utils.py`:
- Around line 175-191: The strict equality check for sync interval in the block
using current_last_synced, initial_last_synced_values, difference_seconds and
expected_diff causes flakiness; change the computation to use a rounded seconds
value (use round((current_last_synced - initial_last_synced_values) / 1000)
instead of int(...)) and introduce a small tolerance (e.g., tolerance_seconds =
1 or 2) then treat the sync as successful when abs(difference_seconds -
expected_diff) <= tolerance_seconds; update the LOGGER.info/LOGGER.error
messages to reflect the tolerance and still return True when within tolerance.
♻️ Duplicate comments (2)
tests/model_registry/model_catalog/huggingface/test_huggingface_model_validation.py (2)
47-57: Add-> Noneand Given/When/Then docstring.✏️ Suggested update
def test_huggingface_last_synced_custom( self: Self, updated_catalog_config_map_scope_function: Generator[ConfigMap, None, None], initial_last_synced_values: str, model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], model_name: str, - ): - """ - Custom test for HuggingFace model last synced validation - """ + ) -> None: + """Validate last_synced updates for a custom sync interval. + + Given a catalog configured with a 2-minute sync interval, + When the model metadata is polled after the update, + Then last_synced should advance by the expected interval. + """As per coding guidelines, tests require Given/When/Then Google-style docstrings and return type annotations.
86-96: Add-> Noneand Given/When/Then docstring.✏️ Suggested update
def test_huggingface_model_metadata_last_synced( self: Self, epoch_time_before_config_map_update: float, model_catalog_rest_url: list[str], model_registry_rest_headers: dict[str, str], expected_catalog_values: dict[str, str], huggingface_api: HfApi, - ): - """ - Validate HuggingFace model last synced timestamp is properly updated - """ + ) -> None: + """Validate last_synced timestamps are present and recent. + + Given a catalog update has occurred, + When model metadata is fetched from the API, + Then last_synced should exist and be after the test start time. + """As per coding guidelines, tests require Given/When/Then Google-style docstrings and return type annotations.
|
Status of building tag latest: success. |
* Tests for scheduled sync * minor updates * address comments
Description
How Has This Been Tested?
Merge criteria:
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.