Add RAG tests based on llama-stack quickstart and tutorial docs#445
Add RAG tests based on llama-stack quickstart and tutorial docs#445jgarciao merged 11 commits intoopendatahub-io:mainfrom
Conversation
jgarciao
commented
Jul 17, 2025
- Update distribution image used when creating LlamaStackDistribution for RAG tests
- Adds some basic tests based on the llama-stack quickstart and tutorial docs
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis update restructures and expands the RAG test suite by reorganizing fixtures, updating deployment and resource specifications, and introducing new readiness and client fixtures. It replaces a single test with a comprehensive class-based suite covering deployment, inference, agent interaction, and RAG pipeline validation. A new utility for validating RAG agent responses is also added. Changes
Possibly related PRs
Suggested labels
Suggested reviewers
📜 Recent review detailsConfiguration used: .coderabbit.yaml 📒 Files selected for processing (3)
🧰 Additional context used🧠 Learnings (3)📓 Common learningstests/rag/test_rag.py (4)tests/rag/conftest.py (7)🔇 Additional comments (13)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
|
The following are automatically added/executed:
Available user actions:
Supported labels{'/hold', '/wip', '/lgtm', '/verified', '/build-push-pr-image', '/cherry-pick'} |
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (5)
utilities/rag_utils.py (3)
4-8: Consider using more specific type hints for better type safety.The type hints could be more specific to improve code clarity and type checking.
Apply this diff to use more specific type hints:
-from typing import Any, Dict, Generator, List +from typing import Any, Dict, Generator, List, TypedDict from llama_stack_client import Agent, AgentEventLogger from simple_logger.logger import get_logger LOGGER = get_logger(name=__name__) + + +class TurnExpectation(TypedDict): + question: str + expected_keywords: List[str] + description: str + + +class TurnResult(TypedDict): + question: str + description: str + expected_keywords: List[str] + found_keywords: List[str] + missing_keywords: List[str] + response_content: str + response_length: int + event_count: int + success: bool + error: str | None + + +class ValidationSummary(TypedDict): + total_turns: int + successful_turns: int + failed_turns: int + success_rate: float + total_events: int + total_response_length: int + + +class ValidationResult(TypedDict): + success: bool + results: List[TurnResult] + summary: ValidationSummaryThen update the function signature:
def validate_rag_agent_responses( rag_agent: Agent, session_id: str, - turns_with_expectations: List[Dict[str, Any]], + turns_with_expectations: List[TurnExpectation], stream: bool = True, verbose: bool = True, min_keywords_required: int = 1, print_events: bool = False, -) -> Dict[str, Any]: +) -> ValidationResult:
121-127: Simplify content extraction logic.The multiple hasattr checks for content extraction could be simplified and made more maintainable.
Consider extracting this logic into a helper function:
+def extract_event_content(event: Any) -> str: + """Extract content from various event types.""" + for attr in ['content', 'message', 'text']: + if hasattr(event, attr) and getattr(event, attr): + return str(getattr(event, attr)) + return "" + # Process events for event in AgentEventLogger().log(stream_response): if print_events: event.print() event_count += 1 # Extract content from different event types - if hasattr(event, "content") and event.content: - response_content += str(event.content) - elif hasattr(event, "message") and event.message: - response_content += str(event.message) - elif hasattr(event, "text") and event.text: - response_content += str(event.text) + response_content += extract_event_content(event)
57-216: Consider refactoring this large function for better maintainability.The function is quite long (~160 lines) and handles multiple responsibilities. Breaking it down would improve readability and testability.
Consider splitting the function into smaller, focused functions:
_process_single_turn()- Handle a single conversation turn_validate_keywords()- Validate keywords in response_create_turn_result()- Create the result dictionary for a turn_calculate_summary()- Calculate the summary statisticsThis would make the code more modular and easier to test individually.
tests/rag/test_rag.py (1)
150-165: Consider making document URLs configurable.The torchtune documentation URLs are hardcoded, which could break if the upstream repository structure changes.
Consider making these URLs configurable through environment variables or test configuration:
# At the top of the file or in a config TORCHTUNE_BASE_URL = os.getenv( "TORCHTUNE_DOCS_BASE_URL", "https://raw.githubusercontent.com/pytorch/torchtune/main/docs/source/tutorials/" ) # In the test urls = [ "llama3.rst", "chat.rst", "lora_finetune.rst", "qat_finetune.rst", "memory_optimizations.rst", ] documents = [ RAGDocument( document_id=f"num-{i}", content=f"{TORCHTUNE_BASE_URL}{url}", mime_type="text/plain", metadata={}, ) for i, url in enumerate(urls) ]tests/rag/conftest.py (1)
114-123: Improve robustness of readiness check.The readiness check could be more comprehensive and informative.
Apply this diff to improve error handling and logging:
@retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5) def wait_for_llama_stack_ready(client: LlamaStackClient): try: client.inspect.health() version = client.inspect.version() LOGGER.info(f"Llama Stack server (v{version.version}) is available!") return True - except APIConnectionError: + except APIConnectionError as e: + LOGGER.debug(f"Llama Stack server not ready yet: {e}") + return False + except Exception as e: + LOGGER.warning(f"Unexpected error checking Llama Stack readiness: {e}") return False
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
tests/rag/conftest.py(5 hunks)tests/rag/test_rag.py(1 hunks)utilities/rag_utils.py(2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/rag/test_rag.py (2)
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/rag/conftest.py (2)
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
🧬 Code Graph Analysis (1)
tests/rag/test_rag.py (2)
utilities/rag_utils.py (1)
validate_rag_agent_responses(57-216)tests/rag/conftest.py (2)
llama_stack_distribution_deployment(91-111)lls_client(126-155)
🔇 Additional comments (2)
tests/rag/test_rag.py (1)
56-58: Handle potential StopIteration exception.Same issue as above - using
next()without a default value.Apply this diff:
models = lls_client.models.list() - model_id = next(m for m in models if m.api_model_type == "llm").identifier + llm_model = next((m for m in models if m.api_model_type == "llm"), None) + assert llm_model is not None, "No LLM model found for inference test" + model_id = llm_model.identifier⛔ Skipped due to learnings
Learnt from: fege PR: opendatahub-io/opendatahub-tests#320 File: tests/model_registry/rest_api/conftest.py:257-264 Timestamp: 2025-06-18T08:27:21.114Z Learning: In tests/model_registry/rest_api/conftest.py, the user fege prefers using next() without a default value to raise StopIteration when the mysql container is not found, rather than gracefully handling it with None checks. This is a deliberate "fail fast" design choice.tests/rag/conftest.py (1)
32-35: Verify resource requests and limits are appropriate.The resource specifications show a significant gap between requests (250m CPU, 500Mi memory) and limits (2 CPU, 12Gi memory).
This large difference could lead to resource contention or pod eviction issues. Please verify that these values are appropriate for the Llama Stack server's actual resource consumption patterns.
ChristianZaccaria
left a comment
There was a problem hiding this comment.
Great work!! :)
Do we want to test for average response time?
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
for more information, see https://pre-commit.ci
* feat: refactoring of fixtures, first test with refactored code Signed-off-by: lugi0 <lgiorgi@redhat.com> * fix: change wording slightly in guide Signed-off-by: lugi0 <lgiorgi@redhat.com> * fix: move classes to utils Signed-off-by: lugi0 <lgiorgi@redhat.com> * feat: add options in fixture handling Signed-off-by: lugi0 <lgiorgi@redhat.com> * feat: add test for RBAC multitenancy scenario Signed-off-by: lugi0 <lgiorgi@redhat.com> * fix: remove code not used Signed-off-by: lugi0 <lgiorgi@redhat.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: lugi0 <lgiorgi@redhat.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/rag/conftest.py (1)
141-141: Fix type annotation in docstring to match parameter.The docstring shows
Namespacebut the parameter acceptsNamespace | Project.
🧹 Nitpick comments (2)
utilities/rag_utils.py (2)
91-97: Consider documenting expected event types.The function handles various event attributes, but it would be helpful to document what types of events are expected from the llama_stack_client.
def extract_event_content(event: Any) -> str: - """Extract content from various event types.""" + """Extract content from various event types. + + Checks for 'content', 'message', and 'text' attributes in the event object. + Expected event types from llama_stack_client may include completion events, + message events, and text streaming events. + + Args: + event: Event object from llama_stack_client agent response + + Returns: + Extracted text content or empty string if no content found + """ for attr in ["content", "message", "text"]: if hasattr(event, attr) and getattr(event, attr): return str(getattr(event, attr)) return ""
163-164: Consider extracting the success criteria for better readability.The turn success condition combines multiple criteria. Consider extracting it to a variable for clarity.
# Determine if this turn was successful - turn_successful = ( - event_count > 0 and len(response_content) > 0 and len(found_keywords) >= min_keywords_required - ) + has_events = event_count > 0 + has_content = len(response_content) > 0 + has_required_keywords = len(found_keywords) >= min_keywords_required + turn_successful = has_events and has_content and has_required_keywords
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
tests/model_registry/conftest.py(2 hunks)tests/model_registry/python_client/test_model_registry_creation.py(2 hunks)tests/model_registry/rbac/test_mr_rbac.py(2 hunks)tests/model_registry/rbac/utils.py(3 hunks)tests/rag/conftest.py(5 hunks)tests/rag/test_rag.py(1 hunks)utilities/rag_utils.py(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/model_registry/python_client/test_model_registry_creation.py
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/model_registry/rbac/utils.py (9)
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#296
File: tests/model_registry/rbac/test_mr_rbac_sa.py:0-0
Timestamp: 2025-05-07T10:40:15.681Z
Learning: The `build_mr_client_args` function in the Model Registry RBAC tests uses simple string splitting (`rest_endpoint.split(":")`) for endpoint parsing by design. This implementation is intentionally non-robust to serve as a detection mechanism for any unexpected changes to the endpoint format.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:0-0
Timestamp: 2025-07-17T15:42:26.275Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 fixture (and similar duplicated Model Registry instance fixtures) do not require admin_client, db_deployment_1, or db_secret_1 parameters as explicit dependencies, even though these dependencies exist implicitly through the fixture dependency chain.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:355-371
Timestamp: 2025-06-16T11:25:51.461Z
Learning: In the opendatahub-tests repository, the login_as_test_user fixture in tests/model_registry/rbac/conftest.py intentionally logs back in as the original admin user without providing a password using login_with_user_password. This behavior is acceptable in their testing environment.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, Service resources can be created without explicitly passing the admin_client parameter when using the context manager approach with "with Service()". The client parameter is optional for Service resource creation.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#401
File: tests/model_registry/rest_api/mariadb/conftest.py:89-110
Timestamp: 2025-07-04T00:17:47.799Z
Learning: In tests/model_registry/rest_api/mariadb/conftest.py, the model_registry_with_mariadb fixture should always use OAUTH_PROXY_CONFIG_DICT for the oauth_proxy parameter regardless of the is_model_registry_oauth parameter value, based on expected product behavior for MariaDB-backed ModelRegistry instances.
tests/rag/test_rag.py (4)
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:200-216
Timestamp: 2025-06-05T14:32:40.247Z
Learning: In the opendatahub-tests repository, the test fixtures should raise exceptions on cleanup failures rather than just logging warnings. The user fege prefers strict cleanup behavior where tests fail if cleanup doesn't work properly, rather than silently continuing.
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:257-264
Timestamp: 2025-06-18T08:27:21.114Z
Learning: In tests/model_registry/rest_api/conftest.py, the user fege prefers using next() without a default value to raise StopIteration when the mysql container is not found, rather than gracefully handling it with None checks. This is a deliberate "fail fast" design choice.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#338
File: tests/model_registry/rbac/test_mr_rbac.py:24-53
Timestamp: 2025-06-06T12:22:57.057Z
Learning: In the opendatahub-tests repository, prefer keeping test parameterization configurations inline rather than extracting them to separate variables/constants, as it makes triaging easier by avoiding the need to jump between different parts of the file to understand the test setup.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/model_registry/rbac/test_mr_rbac.py (11)
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:355-371
Timestamp: 2025-06-16T11:25:51.461Z
Learning: In the opendatahub-tests repository, the login_as_test_user fixture in tests/model_registry/rbac/conftest.py intentionally logs back in as the original admin user without providing a password using login_with_user_password. This behavior is acceptable in their testing environment.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:0-0
Timestamp: 2025-07-17T15:42:26.275Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 fixture (and similar duplicated Model Registry instance fixtures) do not require admin_client, db_deployment_1, or db_secret_1 parameters as explicit dependencies, even though these dependencies exist implicitly through the fixture dependency chain.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#401
File: tests/model_registry/rest_api/mariadb/conftest.py:89-110
Timestamp: 2025-07-04T00:17:47.799Z
Learning: In tests/model_registry/rest_api/mariadb/conftest.py, the model_registry_with_mariadb fixture should always use OAUTH_PROXY_CONFIG_DICT for the oauth_proxy parameter regardless of the is_model_registry_oauth parameter value, based on expected product behavior for MariaDB-backed ModelRegistry instances.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, Service resources can be created without explicitly passing the admin_client parameter when using the context manager approach with "with Service()". The client parameter is optional for Service resource creation.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#296
File: tests/model_registry/rbac/test_mr_rbac_sa.py:0-0
Timestamp: 2025-05-07T10:40:15.681Z
Learning: The `build_mr_client_args` function in the Model Registry RBAC tests uses simple string splitting (`rest_endpoint.split(":")`) for endpoint parsing by design. This implementation is intentionally non-robust to serve as a detection mechanism for any unexpected changes to the endpoint format.
tests/model_registry/conftest.py (11)
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:0-0
Timestamp: 2025-07-17T15:42:26.275Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 fixture (and similar duplicated Model Registry instance fixtures) do not require admin_client, db_deployment_1, or db_secret_1 parameters as explicit dependencies, even though these dependencies exist implicitly through the fixture dependency chain.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:666-676
Timestamp: 2025-07-17T15:41:54.284Z
Learning: In tests/model_registry/conftest.py, the db_secret_1 and db_secret_2 fixtures do not require the admin_client parameter in their signatures, unlike some other Secret fixtures in the codebase. The user lugi0 confirmed this is the correct pattern for these specific fixtures.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#401
File: tests/model_registry/rest_api/mariadb/conftest.py:89-110
Timestamp: 2025-07-04T00:17:47.799Z
Learning: In tests/model_registry/rest_api/mariadb/conftest.py, the model_registry_with_mariadb fixture should always use OAUTH_PROXY_CONFIG_DICT for the oauth_proxy parameter regardless of the is_model_registry_oauth parameter value, based on expected product behavior for MariaDB-backed ModelRegistry instances.
Learnt from: Snomaan6846
PR: opendatahub-io/opendatahub-tests#444
File: tests/model_serving/model_runtime/mlserver/basic_model_deployment/test_mlserver_basic_model_deployment.py:48-714
Timestamp: 2025-07-16T12:20:29.672Z
Learning: In tests/model_serving/model_runtime/mlserver/basic_model_deployment/test_mlserver_basic_model_deployment.py, the same get_deployment_config_dict() function is called twice in each pytest.param because different fixtures (mlserver_inference_service and mlserver_serving_runtime) need the same deployment configuration data. This duplication is intentional to provide identical configuration to multiple fixtures.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, Service resources can be created without explicitly passing the admin_client parameter when using the context manager approach with "with Service()". The client parameter is optional for Service resource creation.
tests/rag/conftest.py (7)
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:666-676
Timestamp: 2025-07-17T15:41:54.284Z
Learning: In tests/model_registry/conftest.py, the db_secret_1 and db_secret_2 fixtures do not require the admin_client parameter in their signatures, unlike some other Secret fixtures in the codebase. The user lugi0 confirmed this is the correct pattern for these specific fixtures.
Learnt from: edwardquarm
PR: opendatahub-io/opendatahub-tests#340
File: tests/model_serving/model_runtime/utils.py:53-58
Timestamp: 2025-07-17T15:53:15.715Z
Learning: For port forwarding in test utilities, the preference is to rely on retry mechanisms (like the @retry decorator) rather than explicit error handling, allowing port forwarding failures to be handled transparently through retries until success.
🧬 Code Graph Analysis (4)
tests/model_registry/rbac/utils.py (2)
tests/conftest.py (1)
admin_client(65-66)tests/model_registry/conftest.py (1)
model_registry_namespace(54-55)
tests/rag/test_rag.py (2)
utilities/rag_utils.py (2)
TurnExpectation(57-60)validate_rag_agent_responses(99-238)tests/rag/conftest.py (2)
llama_stack_distribution_deployment(91-111)lls_client(130-163)
tests/model_registry/conftest.py (3)
tests/conftest.py (1)
teardown_resources(86-93)tests/model_registry/utils.py (2)
get_model_registry_db_label_dict(162-167)get_model_registry_deployment_template_dict(59-159)tests/model_registry/rest_api/utils.py (1)
ModelRegistryV1Alpha1(253-254)
tests/rag/conftest.py (6)
utilities/constants.py (2)
DscComponents(156-177)Timeout(214-225)utilities/data_science_cluster_utils.py (1)
update_components_in_dsc(14-54)utilities/general.py (1)
generate_random_name(308-333)utilities/infra.py (1)
create_ns(73-165)utilities/rag_utils.py (2)
LlamaStackDistribution(11-31)create_llama_stack_distribution(35-54)tests/conftest.py (3)
dsc_resource(355-356)admin_client(65-66)unprivileged_client(306-346)
🔇 Additional comments (20)
tests/model_registry/conftest.py (6)
542-546: LGTM!The unique name generation using UUID substrings is appropriate for avoiding resource name collisions.
Also applies to: 659-663
548-560: Consistent with established patterns.The DB secret fixtures correctly follow the pattern established in the learnings where admin_client is not required in the fixture signatures.
Also applies to: 666-677
563-576: LGTM!The PVC fixtures properly create storage resources with appropriate size and access mode for database usage.
Also applies to: 680-693
579-592: Follows the established pattern for Service creation.As confirmed in the learnings, Service resources can be created without the admin_client parameter when using the context manager approach.
Also applies to: 697-709
595-613: LGTM!The deployment fixtures correctly rely on Pytest's implicit dependency injection through the db_name parameters, as established in the learnings.
Also applies to: 712-730
616-654: Well-implemented multi-instance Model Registry fixtures.The fixtures correctly:
- Generate unique instance names to prevent collisions
- Configure MySQL connections using the appropriate db_name parameters
- Support both OAuth proxy and Istio configurations
- Follow the established pattern of implicit dependency injection
Also applies to: 733-770
tests/model_registry/rbac/utils.py (3)
112-150: Well-implemented RBAC grant function.The function correctly implements fine-grained access control by:
- Restricting permissions to only "get" verb on the specific service resource
- Using
resourceNamesto limit access to the specific Model Registry instance- Properly creating both Role and RoleBinding with appropriate naming conventions
- Waiting for resources to be created synchronously
152-170: LGTM!The revoke function correctly removes RBAC resources in the proper order (RoleBinding before Role) and waits for deletion to complete.
172-181: Good implementation for eventual consistency testing.The function correctly handles the polling pattern by raising AssertionError when access is still granted, allowing TimeoutSampler to retry until ForbiddenException is properly raised.
tests/model_registry/rbac/test_mr_rbac.py (1)
140-264: Comprehensive multi-tenancy RBAC test implementation.The test effectively validates exclusive access control across multiple Model Registry instances by:
- Properly utilizing all required fixtures for two independent MR instances
- Testing access grant/revoke scenarios with proper eventual consistency handling
- Verifying that access to one instance doesn't grant access to another
- Cleaning up resources after the test
Good use of TimeoutSampler for handling Kubernetes eventual consistency.
utilities/rag_utils.py (1)
73-73: No Python version compatibility issues with union syntaxVerified that
pyproject.tomlspecifiesrequires-python = "==3.12.*", and Python 3.12 fully supports PEP 604 union types. The use ofstr | Noneis valid and consistent across the codebase. No changes needed here.tests/rag/test_rag.py (5)
22-51: Well-structured deployment and model validation test.The test properly validates deployment readiness and model availability with clear error messages. The implementation addresses previous review comments about error handling.
52-78: Clean implementation of basic inference test.The test is focused and validates both response structure and content correctness.
79-113: Effective agent conversation test.The test validates agent creation and multi-turn conversations with appropriate assertions.
114-249: Comprehensive RAG pipeline test with proper cleanup.The test effectively validates the full RAG pipeline including vector database integration, document insertion, and knowledge retrieval. The cleanup in the finally block properly addresses previous review comments about resource management.
167-167: Good use of versioned documentation URLs.Using a specific release tag (v0.6.1) ensures test stability and prevents breakage from upstream changes, as suggested in previous reviews.
tests/rag/conftest.py (4)
32-35: Appropriate resource specifications added.The resource requests and limits are reasonable for a RAG deployment, providing minimum guarantees while allowing scaling up to 2 CPUs and 12Gi memory.
52-62: Correct return type annotation for enabled_llama_stack_operator.The fixture now properly yields DataScienceCluster instead of None, improving type safety.
114-127: Well-implemented readiness check with retry mechanism.The function follows best practices with specific exception handling and retry logic, aligning with the project's patterns for port forwarding and service readiness.
147-163: Proper error handling for port forwarding.The implementation correctly addresses previous review comments by adding try-except error handling with logging.
|
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/rag/conftest.py (1)
129-163: LGTM: Well-implemented client fixture with proper error handling.The fixture correctly sets up port forwarding, waits for service readiness, and provides appropriate error handling. The timeout values are reasonable for RAG/LLM operations.
Minor docstring inconsistency - line 141 should reflect the union type:
- rag_test_namespace (Namespace): Namespace or project containing RAG test resources + rag_test_namespace (Namespace | Project): Namespace or project containing RAG test resources
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/rag/conftest.py(5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:200-216
Timestamp: 2025-06-05T14:32:40.247Z
Learning: In the opendatahub-tests repository, the test fixtures should raise exceptions on cleanup failures rather than just logging warnings. The user fege prefers strict cleanup behavior where tests fail if cleanup doesn't work properly, rather than silently continuing.
tests/rag/conftest.py (7)
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:666-676
Timestamp: 2025-07-17T15:41:54.284Z
Learning: In tests/model_registry/conftest.py, the db_secret_1 and db_secret_2 fixtures do not require the admin_client parameter in their signatures, unlike some other Secret fixtures in the codebase. The user lugi0 confirmed this is the correct pattern for these specific fixtures.
Learnt from: edwardquarm
PR: opendatahub-io/opendatahub-tests#340
File: tests/model_serving/model_runtime/utils.py:53-58
Timestamp: 2025-07-17T15:53:15.715Z
Learning: For port forwarding in test utilities, the preference is to rely on retry mechanisms (like the @retry decorator) rather than explicit error handling, allowing port forwarding failures to be handled transparently through retries until success.
🔇 Additional comments (4)
tests/rag/conftest.py (4)
2-20: LGTM: Import changes support new functionality.The new imports are appropriately added to support the enhanced RAG test infrastructure, including port forwarding, client creation, and retry mechanisms.
32-48: LGTM: Server configuration properly updated for RAG workload.The resource specifications, environment variables, and storage configuration are appropriately configured for the RAG use case with reasonable limits and proper paths.
52-92: LGTM: Fixture scope and type hint improvements.The scope changes to "class" improve test efficiency by reducing expensive setup operations, and the type hints correctly support both Namespace and Project types as needed for cross-platform compatibility.
114-126: LGTM: Well-implemented readiness check with proper error handling.The function correctly uses retry logic, handles expected connection errors during startup, and provides informative logging. The timeout and retry intervals are appropriate for service startup.
5fb7363 to
4956bf1
Compare
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
850c6f7 to
7839973
Compare
for more information, see https://pre-commit.ci
|
Status of building tag latest: success. |