Add RAG tests based on llama-stack quickstart and tutorial docs#445

Merged
jgarciao merged 11 commits into opendatahub-io:main from jgarciao:add-rag-quicktart-tests
Jul 21, 2025
Conversation

@jgarciao
Contributor

  • Update the distribution image used when creating LlamaStackDistribution for RAG tests
  • Add some basic tests based on the llama-stack quickstart and tutorial docs

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
@jgarciao jgarciao requested a review from a team as a code owner July 17, 2025 13:09
@coderabbitai
Contributor

coderabbitai bot commented Jul 17, 2025

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Introduced comprehensive end-to-end tests for LlamaStack RAG functionality, including deployment readiness, inference, agent interactions, and RAG pipeline validation.
    • Added a utility for validating RAG agent responses against expected keywords, providing detailed per-turn results and summaries.
  • Tests

    • Expanded and restructured test suite to cover server readiness, chat completion, agent conversations, and knowledge retrieval scenarios.
    • Enhanced fixtures for improved resource management and readiness checks.
  • Chores

    • Improved test reliability and maintainability with refined type annotations and fixture scopes.

Walkthrough

This update restructures and expands the RAG test suite by reorganizing fixtures, updating deployment and resource specifications, and introducing new readiness and client fixtures. It replaces a single test with a comprehensive class-based suite covering deployment, inference, agent interaction, and RAG pipeline validation. A new utility for validating RAG agent responses is also added.
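The keyword-matching idea behind the new validation utility can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the code in utilities/rag_utils.py; the function name `check_keywords` and its return shape are assumptions for the sketch.

```python
def check_keywords(response: str, expected: list[str], min_required: int = 1) -> dict:
    """Report which expected keywords appear in a response (case-insensitive).

    A turn counts as successful when the response is non-empty and at least
    `min_required` of the expected keywords were found.
    """
    lowered = response.lower()
    found = [kw for kw in expected if kw.lower() in lowered]
    missing = [kw for kw in expected if kw.lower() not in lowered]
    return {
        "found_keywords": found,
        "missing_keywords": missing,
        "success": len(response) > 0 and len(found) >= min_required,
    }
```

The real utility additionally tracks per-turn event counts, response lengths, and an aggregate summary, but the pass/fail decision per turn reduces to this kind of check.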

Changes

  • tests/rag/conftest.py — Reorganized imports; updated and expanded fixtures (scope, type hints, generator types); enhanced deployment resource specs and environment; added readiness polling and a port-forwarded LlamaStack client fixture.
  • tests/rag/test_rag.py — Replaced a single test with a class containing multiple focused tests for deployment, inference, agent interaction, and end-to-end RAG validation using the new client fixture and utility.
  • utilities/rag_utils.py — Added a new utility for validating RAG agent responses, including TypedDicts for expectations/results/summary, a content extraction helper, and a comprehensive validation function with verbose output.

Suggested labels

size/l, commented-by-adolfo-ab, lgtm-by-jgarciao

Suggested reviewers

  • jgarciao
  • lugi0
  • dbasunag

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 850c6f7 and 38bf1a0.

📒 Files selected for processing (3)
  • tests/rag/conftest.py (5 hunks)
  • tests/rag/test_rag.py (1 hunks)
  • utilities/rag_utils.py (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:200-216
Timestamp: 2025-06-05T14:32:40.247Z
Learning: In the opendatahub-tests repository, the test fixtures should raise exceptions on cleanup failures rather than just logging warnings. The user fege prefers strict cleanup behavior where tests fail if cleanup doesn't work properly, rather than silently continuing.
tests/rag/test_rag.py (4)
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:200-216
Timestamp: 2025-06-05T14:32:40.247Z
Learning: In the opendatahub-tests repository, the test fixtures should raise exceptions on cleanup failures rather than just logging warnings. The user fege prefers strict cleanup behavior where tests fail if cleanup doesn't work properly, rather than silently continuing.
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:257-264
Timestamp: 2025-06-18T08:27:21.114Z
Learning: In tests/model_registry/rest_api/conftest.py, the user fege prefers using next() without a default value to raise StopIteration when the mysql container is not found, rather than gracefully handling it with None checks. This is a deliberate "fail fast" design choice.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#338
File: tests/model_registry/rbac/test_mr_rbac.py:24-53
Timestamp: 2025-06-06T12:22:57.057Z
Learning: In the opendatahub-tests repository, prefer keeping test parameterization configurations inline rather than extracting them to separate variables/constants, as it makes triaging easier by avoiding the need to jump between different parts of the file to understand the test setup.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/rag/conftest.py (7)
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:666-676
Timestamp: 2025-07-17T15:41:54.284Z
Learning: In tests/model_registry/conftest.py, the db_secret_1 and db_secret_2 fixtures do not require the admin_client parameter in their signatures, unlike some other Secret fixtures in the codebase. The user lugi0 confirmed this is the correct pattern for these specific fixtures.
Learnt from: edwardquarm
PR: opendatahub-io/opendatahub-tests#340
File: tests/model_serving/model_runtime/utils.py:53-58
Timestamp: 2025-07-17T15:53:15.715Z
Learning: For port forwarding in test utilities, the preference is to rely on retry mechanisms (like the @retry decorator) rather than explicit error handling, allowing port forwarding failures to be handled transparently through retries until success.
🔇 Additional comments (13)
utilities/rag_utils.py (4)

4-8: LGTM: Clean imports and logger setup

The imports are well-organized and the logger setup follows the project conventions.


57-89: LGTM: Well-structured TypedDict definitions

The TypedDict classes provide excellent type safety and clear data structure definitions for the validation framework. The use of union types (e.g., str | None) is appropriate for optional fields.


91-97: LGTM: Robust content extraction helper

The helper function uses a sensible approach to extract content from various event types by checking multiple common attributes. The fallback to empty string ensures consistent behavior.
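The attribute-probing approach described above can be exercised without a live server by using stand-in event objects. This is a minimal sketch of the pattern, assuming events expose one of `content`, `message`, or `text`; the actual helper lives in utilities/rag_utils.py.

```python
from types import SimpleNamespace
from typing import Any


def extract_event_content(event: Any) -> str:
    # Probe the common attributes seen on streamed agent events;
    # fall back to an empty string so callers can concatenate safely.
    for attr in ("content", "message", "text"):
        value = getattr(event, attr, None)
        if value:
            return str(value)
    return ""


# Stand-in events mimicking the shapes the helper must tolerate:
events = [SimpleNamespace(content="Hello "), SimpleNamespace(message="world")]
combined = "".join(extract_event_content(e) for e in events)
```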


99-238: LGTM: Comprehensive validation function with excellent error handling

The validation function is well-structured with:

  • Clear parameter documentation
  • Robust error handling with try-catch blocks
  • Detailed logging for debugging
  • Comprehensive result tracking
  • Proper use of type casting for TypedDict compliance

The function effectively validates RAG agent responses against expected keywords while providing detailed feedback for debugging test failures.

tests/rag/test_rag.py (5)

1-21: LGTM: Well-organized imports and clear class documentation

The imports are properly organized and the class docstring provides excellent context for the RAG test suite's purpose and scope.


22-51: LGTM: Comprehensive server readiness validation

The test effectively validates:

  • Deployment readiness with proper waiting
  • Model availability for both LLM and embedding types
  • Required metadata presence (embedding_dimension)
  • Clear error messages for debugging failures

The improved error handling addresses previous review comments well.
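The model-availability checks described above can be sketched with stub model entries standing in for the output of `lls_client.models.list()`. The `StubModel` dataclass and `check_models` helper are illustrative; only the `api_model_type`, `identifier`, and `embedding_dimension` names come from the PR discussion.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StubModel:
    identifier: str
    api_model_type: str
    metadata: dict[str, Any] = field(default_factory=dict)


def check_models(models: list[StubModel]) -> None:
    # At least one LLM and one embedding model must be registered,
    # and the embedding model must declare its dimension.
    llm = next((m for m in models if m.api_model_type == "llm"), None)
    assert llm is not None, "No LLM model registered with the server"
    embedding = next((m for m in models if m.api_model_type == "embedding"), None)
    assert embedding is not None, "No embedding model registered with the server"
    assert "embedding_dimension" in embedding.metadata, (
        f"Embedding model {embedding.identifier} is missing embedding_dimension metadata"
    )
```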


52-79: LGTM: Effective basic inference validation

The test properly validates the chat completion API with:

  • Model selection for LLM type
  • Structured message format
  • Content validation for expected answers
  • Clear assertions with descriptive error messages

The reference to tutorial documentation is helpful for context.


80-115: LGTM: Solid agent conversation validation

The test effectively validates:

  • Agent creation with proper instructions
  • Session management using UUID-based names
  • Multiple conversation turns with different question types
  • Response content validation for expected keywords

The test structure is clear and follows the tutorial documentation approach.


116-252: LGTM: Comprehensive end-to-end RAG pipeline validation

This test excellently validates the complete RAG workflow:

  • Vector database creation and registration
  • RAG agent setup with knowledge search tool
  • Document insertion from PyTorch torchtune documentation
  • Multi-turn conversation validation using the new utility
  • Proper cleanup with error handling in finally block

The test addresses previous review comments by:

  • Implementing proper vector database cleanup
  • Removing duplicate keywords
  • Using the validation utility for systematic response checking

The use of specific torchtune documentation URLs and technical keyword validation makes this a robust integration test.
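The multi-turn expectations that this test feeds into the validation utility are plain data. The sketch below shows the shape; the questions and keywords are illustrative stand-ins, not the test's actual torchtune-specific values.

```python
# Each turn carries a question, the keywords expected in the answer,
# and a human-readable description used in the validation report.
turns_with_expectations = [
    {
        "question": "What is LoRA fine-tuning?",
        "expected_keywords": ["lora", "rank", "adapter"],
        "description": "Concept question answered from inserted docs",
    },
    {
        "question": "How does QAT affect memory use?",
        "expected_keywords": ["quantization", "memory"],
        "description": "Follow-up question in the same session",
    },
]

REQUIRED_FIELDS = {"question", "expected_keywords", "description"}
```

The validator iterates over this list, sends each question through the agent session, and matches the streamed response against the turn's expected keywords.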

tests/rag/conftest.py (4)

1-50: LGTM: Well-organized imports and enhanced server configuration

The imports are properly organized with new dependencies for port forwarding and client functionality. The server configuration improvements include:

  • Appropriate resource requests and limits
  • Updated environment variables for Milvus and orchestrator
  • Updated distribution image
  • Storage size specification

These changes support the enhanced RAG testing capabilities.


52-112: LGTM: Appropriate fixture scope and type improvements

The fixture updates are well-implemented:

  • Class-level scoping improves test performance by reusing resources
  • Type hints properly accept both Namespace and Project types
  • Generator types are correctly specified
  • Fixture dependencies are properly maintained

The scope change to class level aligns well with the comprehensive test suite structure.


114-127: LGTM: Robust server readiness checking with proper error handling

The readiness check function is well-designed with:

  • Retry decorator for reliable server polling
  • Proper exception handling for different error types
  • Appropriate logging levels for debugging
  • Clean boolean return for retry mechanism

This ensures the client fixture only provides a fully ready server connection.
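The poll-until-ready pattern behind this check can be sketched as a plain retry loop. The actual fixture uses the repository's `@retry` decorator and `client.inspect.health()`; the `wait_until_ready` name and `ConnectionError` stand-in for `APIConnectionError` are assumptions of this sketch.

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool], timeout: float = 60.0, sleep: float = 5.0) -> bool:
    """Poll `check()` until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if check():
                return True
        except ConnectionError:
            pass  # server not accepting connections yet; keep polling
        time.sleep(sleep)
    return False
```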


129-164: LGTM: Excellent client fixture with comprehensive error handling

The rag_lls_client fixture is well-implemented with:

  • Proper port forwarding setup with error handling
  • LlamaStackClient configuration with appropriate timeout
  • Server readiness verification before yielding client
  • Clear documentation with proper parameter descriptions
  • Exception handling with logging and re-raising

This fixture provides a robust foundation for the RAG test suite by ensuring the client is fully operational before tests run.


@github-actions

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove it, comment /wip cancel on the PR.
  • To block merging of a PR, add /hold in a comment. To unblock merging, comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove approval, add /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified on the PR; to un-verify, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name> on the PR. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR will be created and linked to the current PR.
  • To build and push an image to quay, add /build-push-pr-image in a comment. This creates an image tagged
    pr-<pr_number> in the quay repository; that tag is deleted when the PR is merged or closed.
Supported labels

{'/hold', '/wip', '/lgtm', '/verified', '/build-push-pr-image', '/cherry-pick'}

@jgarciao jgarciao requested review from jiridanek and jiripetrlik and removed request for jiridanek July 17, 2025 13:15
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (5)
utilities/rag_utils.py (3)

4-8: Consider using more specific type hints for better type safety.

The type hints could be more specific to improve code clarity and type checking.

Apply this diff to use more specific type hints:

-from typing import Any, Dict, Generator, List
+from typing import Any, Dict, Generator, List, TypedDict
 from llama_stack_client import Agent, AgentEventLogger
 from simple_logger.logger import get_logger

 LOGGER = get_logger(name=__name__)
+
+
+class TurnExpectation(TypedDict):
+    question: str
+    expected_keywords: List[str]
+    description: str
+
+
+class TurnResult(TypedDict):
+    question: str
+    description: str
+    expected_keywords: List[str]
+    found_keywords: List[str]
+    missing_keywords: List[str]
+    response_content: str
+    response_length: int
+    event_count: int
+    success: bool
+    error: str | None
+
+
+class ValidationSummary(TypedDict):
+    total_turns: int
+    successful_turns: int
+    failed_turns: int
+    success_rate: float
+    total_events: int
+    total_response_length: int
+
+
+class ValidationResult(TypedDict):
+    success: bool
+    results: List[TurnResult]
+    summary: ValidationSummary

Then update the function signature:

 def validate_rag_agent_responses(
     rag_agent: Agent,
     session_id: str,
-    turns_with_expectations: List[Dict[str, Any]],
+    turns_with_expectations: List[TurnExpectation],
     stream: bool = True,
     verbose: bool = True,
     min_keywords_required: int = 1,
     print_events: bool = False,
-) -> Dict[str, Any]:
+) -> ValidationResult:

121-127: Simplify content extraction logic.

The multiple hasattr checks for content extraction could be simplified and made more maintainable.

Consider extracting this logic into a helper function:

+def extract_event_content(event: Any) -> str:
+    """Extract content from various event types."""
+    for attr in ['content', 'message', 'text']:
+        if hasattr(event, attr) and getattr(event, attr):
+            return str(getattr(event, attr))
+    return ""
+
             # Process events
             for event in AgentEventLogger().log(stream_response):
                 if print_events:
                     event.print()
                 event_count += 1
 
                 # Extract content from different event types
-                if hasattr(event, "content") and event.content:
-                    response_content += str(event.content)
-                elif hasattr(event, "message") and event.message:
-                    response_content += str(event.message)
-                elif hasattr(event, "text") and event.text:
-                    response_content += str(event.text)
+                response_content += extract_event_content(event)

57-216: Consider refactoring this large function for better maintainability.

The function is quite long (~160 lines) and handles multiple responsibilities. Breaking it down would improve readability and testability.

Consider splitting the function into smaller, focused functions:

  1. _process_single_turn() - Handle a single conversation turn
  2. _validate_keywords() - Validate keywords in response
  3. _create_turn_result() - Create the result dictionary for a turn
  4. _calculate_summary() - Calculate the summary statistics

This would make the code more modular and easier to test individually.

tests/rag/test_rag.py (1)

150-165: Consider making document URLs configurable.

The torchtune documentation URLs are hardcoded, which could break if the upstream repository structure changes.

Consider making these URLs configurable through environment variables or test configuration:

# At the top of the file or in a config
import os

TORCHTUNE_BASE_URL = os.getenv(
    "TORCHTUNE_DOCS_BASE_URL",
    "https://raw.githubusercontent.com/pytorch/torchtune/main/docs/source/tutorials/",
)

# In the test
urls = [
    "llama3.rst",
    "chat.rst",
    "lora_finetune.rst",
    "qat_finetune.rst",
    "memory_optimizations.rst",
]
documents = [
    RAGDocument(
        document_id=f"num-{i}",
        content=f"{TORCHTUNE_BASE_URL}{url}",
        mime_type="text/plain",
        metadata={},
    )
    for i, url in enumerate(urls)
]
tests/rag/conftest.py (1)

114-123: Improve robustness of readiness check.

The readiness check could be more comprehensive and informative.

Apply this diff to improve error handling and logging:

 @retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5)
 def wait_for_llama_stack_ready(client: LlamaStackClient):
     try:
         client.inspect.health()
         version = client.inspect.version()
         LOGGER.info(f"Llama Stack server (v{version.version}) is available!")
         return True
-    except APIConnectionError:
+    except APIConnectionError as e:
+        LOGGER.debug(f"Llama Stack server not ready yet: {e}")
+        return False
+    except Exception as e:
+        LOGGER.warning(f"Unexpected error checking Llama Stack readiness: {e}")
         return False
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c3a558 and eef29c2.

📒 Files selected for processing (3)
  • tests/rag/conftest.py (5 hunks)
  • tests/rag/test_rag.py (1 hunks)
  • utilities/rag_utils.py (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/rag/test_rag.py (2)
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/rag/conftest.py (2)
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
🧬 Code Graph Analysis (1)
tests/rag/test_rag.py (2)
utilities/rag_utils.py (1)
  • validate_rag_agent_responses (57-216)
tests/rag/conftest.py (2)
  • llama_stack_distribution_deployment (91-111)
  • lls_client (126-155)
🔇 Additional comments (2)
tests/rag/test_rag.py (1)

56-58: Handle potential StopIteration exception.

Same issue as above - using next() without a default value.

Apply this diff:

         models = lls_client.models.list()
-        model_id = next(m for m in models if m.api_model_type == "llm").identifier
+        llm_model = next((m for m in models if m.api_model_type == "llm"), None)
+        assert llm_model is not None, "No LLM model found for inference test"
+        model_id = llm_model.identifier
⛔ Skipped due to learnings
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:257-264
Timestamp: 2025-06-18T08:27:21.114Z
Learning: In tests/model_registry/rest_api/conftest.py, the user fege prefers using next() without a default value to raise StopIteration when the mysql container is not found, rather than gracefully handling it with None checks. This is a deliberate "fail fast" design choice.
tests/rag/conftest.py (1)

32-35: Verify resource requests and limits are appropriate.

The resource specifications show a significant gap between requests (250m CPU, 500Mi memory) and limits (2 CPU, 12Gi memory).

This large difference could lead to resource contention or pod eviction issues. Please verify that these values are appropriate for the Llama Stack server's actual resource consumption patterns.
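The request-vs-limit gap flagged here can be checked mechanically. The sketch below parses only the quantity suffixes that appear in this PR (`m`, `Mi`, `Gi`) — it is a simplification, not a full Kubernetes quantity parser — and reports the burst ratio per resource.

```python
def parse_quantity(q: str) -> float:
    """Parse a small subset of Kubernetes resource quantities (cores / bytes)."""
    suffixes = {"m": 1e-3, "Mi": 2**20, "Gi": 2**30}
    for suffix, factor in suffixes.items():
        if q.endswith(suffix):
            return float(q[: -len(suffix)]) * factor
    return float(q)


def burst_ratio(requests: dict, limits: dict) -> dict:
    """Ratio of limit to request per resource; large ratios mean bursty pods."""
    return {k: parse_quantity(limits[k]) / parse_quantity(requests[k]) for k in requests}
```

For the values quoted above (requests of 250m CPU / 500Mi memory against limits of 2 CPU / 12Gi memory), the CPU limit is 8x the request and the memory limit roughly 24.6x, which is the gap the reviewer is asking to justify.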

Contributor

@ChristianZaccaria ChristianZaccaria left a comment


Great work!! :)

Do we want to test for average response time?

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
pre-commit-ci bot and others added 2 commits July 18, 2025 15:09
* feat: refactoring of fixtures, first test with refactored code

Signed-off-by: lugi0 <lgiorgi@redhat.com>

* fix: change wording slightly in guide

Signed-off-by: lugi0 <lgiorgi@redhat.com>

* fix: move classes to utils

Signed-off-by: lugi0 <lgiorgi@redhat.com>

* feat: add options in fixture handling

Signed-off-by: lugi0 <lgiorgi@redhat.com>

* feat: add test for RBAC multitenancy scenario

Signed-off-by: lugi0 <lgiorgi@redhat.com>

* fix: remove code not used

Signed-off-by: lugi0 <lgiorgi@redhat.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: lugi0 <lgiorgi@redhat.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/rag/conftest.py (1)

141-141: Fix type annotation in docstring to match parameter.

The docstring shows Namespace but the parameter accepts Namespace | Project.

🧹 Nitpick comments (2)
utilities/rag_utils.py (2)

91-97: Consider documenting expected event types.

The function handles various event attributes, but it would be helpful to document what types of events are expected from the llama_stack_client.

 def extract_event_content(event: Any) -> str:
-    """Extract content from various event types."""
+    """Extract content from various event types.
+    
+    Checks for 'content', 'message', and 'text' attributes in the event object.
+    Expected event types from llama_stack_client may include completion events,
+    message events, and text streaming events.
+    
+    Args:
+        event: Event object from llama_stack_client agent response
+        
+    Returns:
+        Extracted text content or empty string if no content found
+    """
     for attr in ["content", "message", "text"]:
         if hasattr(event, attr) and getattr(event, attr):
             return str(getattr(event, attr))
     return ""

163-164: Consider extracting the success criteria for better readability.

The turn success condition combines multiple criteria. Consider extracting it to a variable for clarity.

             # Determine if this turn was successful
-            turn_successful = (
-                event_count > 0 and len(response_content) > 0 and len(found_keywords) >= min_keywords_required
-            )
+            has_events = event_count > 0
+            has_content = len(response_content) > 0
+            has_required_keywords = len(found_keywords) >= min_keywords_required
+            turn_successful = has_events and has_content and has_required_keywords
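Extracted into a helper, the refactored criteria are also easy to unit-test in isolation (the threshold default is illustrative):

```python
def is_turn_successful(
    event_count: int,
    response_content: str,
    found_keywords: list[str],
    min_keywords_required: int = 1,
) -> bool:
    """A turn succeeds only if it streamed events, produced content, and hit enough keywords."""
    has_events = event_count > 0
    has_content = len(response_content) > 0
    has_required_keywords = len(found_keywords) >= min_keywords_required
    return has_events and has_content and has_required_keywords


assert is_turn_successful(3, "RAG retrieves documents before generation", ["RAG"])
assert not is_turn_successful(0, "", [])          # no events, no content
assert not is_turn_successful(2, "some text", [])  # content but no keywords
```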
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eef29c2 and 152fd68.

📒 Files selected for processing (7)
  • tests/model_registry/conftest.py (2 hunks)
  • tests/model_registry/python_client/test_model_registry_creation.py (2 hunks)
  • tests/model_registry/rbac/test_mr_rbac.py (2 hunks)
  • tests/model_registry/rbac/utils.py (3 hunks)
  • tests/rag/conftest.py (5 hunks)
  • tests/rag/test_rag.py (1 hunks)
  • utilities/rag_utils.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/model_registry/python_client/test_model_registry_creation.py
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/model_registry/rbac/utils.py (9)
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#296
File: tests/model_registry/rbac/test_mr_rbac_sa.py:0-0
Timestamp: 2025-05-07T10:40:15.681Z
Learning: The `build_mr_client_args` function in the Model Registry RBAC tests uses simple string splitting (`rest_endpoint.split(":")`) for endpoint parsing by design. This implementation is intentionally non-robust to serve as a detection mechanism for any unexpected changes to the endpoint format.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:0-0
Timestamp: 2025-07-17T15:42:26.275Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 fixture (and similar duplicated Model Registry instance fixtures) do not require admin_client, db_deployment_1, or db_secret_1 parameters as explicit dependencies, even though these dependencies exist implicitly through the fixture dependency chain.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:355-371
Timestamp: 2025-06-16T11:25:51.461Z
Learning: In the opendatahub-tests repository, the login_as_test_user fixture in tests/model_registry/rbac/conftest.py intentionally logs back in as the original admin user without providing a password using login_with_user_password. This behavior is acceptable in their testing environment.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, Service resources can be created without explicitly passing the admin_client parameter when using the context manager approach with "with Service()". The client parameter is optional for Service resource creation.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#401
File: tests/model_registry/rest_api/mariadb/conftest.py:89-110
Timestamp: 2025-07-04T00:17:47.799Z
Learning: In tests/model_registry/rest_api/mariadb/conftest.py, the model_registry_with_mariadb fixture should always use OAUTH_PROXY_CONFIG_DICT for the oauth_proxy parameter regardless of the is_model_registry_oauth parameter value, based on expected product behavior for MariaDB-backed ModelRegistry instances.
tests/rag/test_rag.py (4)
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:200-216
Timestamp: 2025-06-05T14:32:40.247Z
Learning: In the opendatahub-tests repository, the test fixtures should raise exceptions on cleanup failures rather than just logging warnings. The user fege prefers strict cleanup behavior where tests fail if cleanup doesn't work properly, rather than silently continuing.
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:257-264
Timestamp: 2025-06-18T08:27:21.114Z
Learning: In tests/model_registry/rest_api/conftest.py, the user fege prefers using next() without a default value to raise StopIteration when the mysql container is not found, rather than gracefully handling it with None checks. This is a deliberate "fail fast" design choice.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#338
File: tests/model_registry/rbac/test_mr_rbac.py:24-53
Timestamp: 2025-06-06T12:22:57.057Z
Learning: In the opendatahub-tests repository, prefer keeping test parameterization configurations inline rather than extracting them to separate variables/constants, as it makes triaging easier by avoiding the need to jump between different parts of the file to understand the test setup.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
tests/model_registry/rbac/test_mr_rbac.py (11)
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:355-371
Timestamp: 2025-06-16T11:25:51.461Z
Learning: In the opendatahub-tests repository, the login_as_test_user fixture in tests/model_registry/rbac/conftest.py intentionally logs back in as the original admin user without providing a password using login_with_user_password. This behavior is acceptable in their testing environment.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:0-0
Timestamp: 2025-07-17T15:42:26.275Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 fixture (and similar duplicated Model Registry instance fixtures) do not require admin_client, db_deployment_1, or db_secret_1 parameters as explicit dependencies, even though these dependencies exist implicitly through the fixture dependency chain.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#401
File: tests/model_registry/rest_api/mariadb/conftest.py:89-110
Timestamp: 2025-07-04T00:17:47.799Z
Learning: In tests/model_registry/rest_api/mariadb/conftest.py, the model_registry_with_mariadb fixture should always use OAUTH_PROXY_CONFIG_DICT for the oauth_proxy parameter regardless of the is_model_registry_oauth parameter value, based on expected product behavior for MariaDB-backed ModelRegistry instances.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, Service resources can be created without explicitly passing the admin_client parameter when using the context manager approach with "with Service()". The client parameter is optional for Service resource creation.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#296
File: tests/model_registry/rbac/test_mr_rbac_sa.py:0-0
Timestamp: 2025-05-07T10:40:15.681Z
Learning: The `build_mr_client_args` function in the Model Registry RBAC tests uses simple string splitting (`rest_endpoint.split(":")`) for endpoint parsing by design. This implementation is intentionally non-robust to serve as a detection mechanism for any unexpected changes to the endpoint format.
tests/model_registry/conftest.py (11)
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:0-0
Timestamp: 2025-07-17T15:42:26.275Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 fixture (and similar duplicated Model Registry instance fixtures) do not require admin_client, db_deployment_1, or db_secret_1 parameters as explicit dependencies, even though these dependencies exist implicitly through the fixture dependency chain.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:666-676
Timestamp: 2025-07-17T15:41:54.284Z
Learning: In tests/model_registry/conftest.py, the db_secret_1 and db_secret_2 fixtures do not require the admin_client parameter in their signatures, unlike some other Secret fixtures in the codebase. The user lugi0 confirmed this is the correct pattern for these specific fixtures.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#401
File: tests/model_registry/rest_api/mariadb/conftest.py:89-110
Timestamp: 2025-07-04T00:17:47.799Z
Learning: In tests/model_registry/rest_api/mariadb/conftest.py, the model_registry_with_mariadb fixture should always use OAUTH_PROXY_CONFIG_DICT for the oauth_proxy parameter regardless of the is_model_registry_oauth parameter value, based on expected product behavior for MariaDB-backed ModelRegistry instances.
Learnt from: Snomaan6846
PR: opendatahub-io/opendatahub-tests#444
File: tests/model_serving/model_runtime/mlserver/basic_model_deployment/test_mlserver_basic_model_deployment.py:48-714
Timestamp: 2025-07-16T12:20:29.672Z
Learning: In tests/model_serving/model_runtime/mlserver/basic_model_deployment/test_mlserver_basic_model_deployment.py, the same get_deployment_config_dict() function is called twice in each pytest.param because different fixtures (mlserver_inference_service and mlserver_serving_runtime) need the same deployment configuration data. This duplication is intentional to provide identical configuration to multiple fixtures.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/test_mr_rbac.py:64-77
Timestamp: 2025-06-16T11:26:53.789Z
Learning: In Model Registry RBAC tests, client instantiation tests are designed to verify the ability to create and use the MR python client, with actual API functionality testing covered by separate existing tests.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, Service resources can be created without explicitly passing the admin_client parameter when using the context manager approach with "with Service()". The client parameter is optional for Service resource creation.
tests/rag/conftest.py (7)
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:666-676
Timestamp: 2025-07-17T15:41:54.284Z
Learning: In tests/model_registry/conftest.py, the db_secret_1 and db_secret_2 fixtures do not require the admin_client parameter in their signatures, unlike some other Secret fixtures in the codebase. The user lugi0 confirmed this is the correct pattern for these specific fixtures.
Learnt from: edwardquarm
PR: opendatahub-io/opendatahub-tests#340
File: tests/model_serving/model_runtime/utils.py:53-58
Timestamp: 2025-07-17T15:53:15.715Z
Learning: For port forwarding in test utilities, the preference is to rely on retry mechanisms (like the @retry decorator) rather than explicit error handling, allowing port forwarding failures to be handled transparently through retries until success.
🧬 Code Graph Analysis (4)
tests/model_registry/rbac/utils.py (2)
tests/conftest.py (1)
  • admin_client (65-66)
tests/model_registry/conftest.py (1)
  • model_registry_namespace (54-55)
tests/rag/test_rag.py (2)
utilities/rag_utils.py (2)
  • TurnExpectation (57-60)
  • validate_rag_agent_responses (99-238)
tests/rag/conftest.py (2)
  • llama_stack_distribution_deployment (91-111)
  • lls_client (130-163)
tests/model_registry/conftest.py (3)
tests/conftest.py (1)
  • teardown_resources (86-93)
tests/model_registry/utils.py (2)
  • get_model_registry_db_label_dict (162-167)
  • get_model_registry_deployment_template_dict (59-159)
tests/model_registry/rest_api/utils.py (1)
  • ModelRegistryV1Alpha1 (253-254)
tests/rag/conftest.py (6)
utilities/constants.py (2)
  • DscComponents (156-177)
  • Timeout (214-225)
utilities/data_science_cluster_utils.py (1)
  • update_components_in_dsc (14-54)
utilities/general.py (1)
  • generate_random_name (308-333)
utilities/infra.py (1)
  • create_ns (73-165)
utilities/rag_utils.py (2)
  • LlamaStackDistribution (11-31)
  • create_llama_stack_distribution (35-54)
tests/conftest.py (3)
  • dsc_resource (355-356)
  • admin_client (65-66)
  • unprivileged_client (306-346)
🔇 Additional comments (20)
tests/model_registry/conftest.py (6)

542-546: LGTM!

The unique name generation using UUID substrings is appropriate for avoiding resource name collisions.

Also applies to: 659-663


548-560: Consistent with established patterns.

The DB secret fixtures correctly follow the pattern established in the learnings where admin_client is not required in the fixture signatures.

Also applies to: 666-677


563-576: LGTM!

The PVC fixtures properly create storage resources with appropriate size and access mode for database usage.

Also applies to: 680-693


579-592: Follows the established pattern for Service creation.

As confirmed in the learnings, Service resources can be created without the admin_client parameter when using the context manager approach.

Also applies to: 697-709


595-613: LGTM!

The deployment fixtures correctly rely on Pytest's implicit dependency injection through the db_name parameters, as established in the learnings.

Also applies to: 712-730


616-654: Well-implemented multi-instance Model Registry fixtures.

The fixtures correctly:

  • Generate unique instance names to prevent collisions
  • Configure MySQL connections using the appropriate db_name parameters
  • Support both OAuth proxy and Istio configurations
  • Follow the established pattern of implicit dependency injection

Also applies to: 733-770

tests/model_registry/rbac/utils.py (3)

112-150: Well-implemented RBAC grant function.

The function correctly implements fine-grained access control by:

  • Restricting permissions to only "get" verb on the specific service resource
  • Using resourceNames to limit access to the specific Model Registry instance
  • Properly creating both Role and RoleBinding with appropriate naming conventions
  • Waiting for resources to be created synchronously
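The rule shape described above can be sketched roughly as follows; the service name and helper are hypothetical, and the real implementation lives in tests/model_registry/rbac/utils.py:

```python
def build_mr_role_rules(service_name: str) -> list[dict]:
    """Build Role rules granting read-only access to one specific Service (hypothetical shape)."""
    return [
        {
            "apiGroups": [""],                # core API group, where Service lives
            "resources": ["services"],
            "resourceNames": [service_name],  # limit access to this one MR instance
            "verbs": ["get"],                 # read-only: no list/watch/update
        }
    ]


rules = build_mr_role_rules("model-registry-1")
assert rules[0]["verbs"] == ["get"]
assert rules[0]["resourceNames"] == ["model-registry-1"]
```

Because `resourceNames` is set, a subject bound to this Role cannot even `get` a second Model Registry instance in the same namespace, which is exactly the exclusivity the multitenancy test exercises.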

152-170: LGTM!

The revoke function correctly removes RBAC resources in the proper order (RoleBinding before Role) and waits for deletion to complete.


172-181: Good implementation for eventual consistency testing.

The function correctly handles the polling pattern by raising AssertionError when access is still granted, allowing TimeoutSampler to retry until ForbiddenException is properly raised.
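The polling contract can be sketched with a simplified stand-in for TimeoutSampler (the real sampler comes from the project's utilities; only the retry-on-AssertionError behavior is modeled here):

```python
import time


class ForbiddenException(Exception):
    """Stand-in for the Kubernetes 403 raised once RBAC revocation propagates."""


def assert_access_forbidden(check_access) -> bool:
    """Raise AssertionError while access still works; return once it is forbidden."""
    try:
        check_access()
    except ForbiddenException:
        return True  # revocation has propagated
    raise AssertionError("access still granted")


def sample_until(fn, timeout: float = 5.0, interval: float = 0.01):
    """Simplified TimeoutSampler: retry fn on AssertionError until it succeeds or times out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return fn()
        except AssertionError:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


calls = {"n": 0}


def check_access() -> None:
    calls["n"] += 1
    if calls["n"] >= 3:  # simulate eventual consistency: revoked only on the third poll
        raise ForbiddenException()


assert sample_until(lambda: assert_access_forbidden(check_access))
assert calls["n"] == 3
```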

tests/model_registry/rbac/test_mr_rbac.py (1)

140-264: Comprehensive multi-tenancy RBAC test implementation.

The test effectively validates exclusive access control across multiple Model Registry instances by:

  • Properly utilizing all required fixtures for two independent MR instances
  • Testing access grant/revoke scenarios with proper eventual consistency handling
  • Verifying that access to one instance doesn't grant access to another
  • Cleaning up resources after the test

Good use of TimeoutSampler for handling Kubernetes eventual consistency.

utilities/rag_utils.py (1)

73-73: No Python version compatibility issues with union syntax

Verified that pyproject.toml specifies requires-python = "==3.12.*", and Python 3.12 fully supports PEP 604 union types. The use of str | None is valid and consistent across the codebase. No changes needed here.

tests/rag/test_rag.py (5)

22-51: Well-structured deployment and model validation test.

The test properly validates deployment readiness and model availability with clear error messages. The implementation addresses previous review comments about error handling.


52-78: Clean implementation of basic inference test.

The test is focused and validates both response structure and content correctness.


79-113: Effective agent conversation test.

The test validates agent creation and multi-turn conversations with appropriate assertions.


114-249: Comprehensive RAG pipeline test with proper cleanup.

The test effectively validates the full RAG pipeline including vector database integration, document insertion, and knowledge retrieval. The cleanup in the finally block properly addresses previous review comments about resource management.
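The cleanup pattern being praised (register resources, then tear them down in a finally block in reverse order) can be sketched as below; the client methods and resource names are hypothetical, not the llama_stack_client API:

```python
def run_rag_pipeline(client) -> str:
    created: list[str] = []
    try:
        client.register("vector-db-quickstart")
        created.append("vector-db-quickstart")
        client.insert_documents("vector-db-quickstart", ["doc-1"])
        return client.query("vector-db-quickstart", "What is RAG?")
    finally:
        # teardown runs even if an assertion above fails mid-pipeline
        for resource in reversed(created):
            client.unregister(resource)


class FakeClient:
    """Minimal in-memory stand-in used to demonstrate the cleanup ordering."""

    def __init__(self) -> None:
        self.registered: set[str] = set()

    def register(self, name: str) -> None:
        self.registered.add(name)

    def insert_documents(self, name: str, docs: list[str]) -> None:
        assert name in self.registered

    def query(self, name: str, question: str) -> str:
        return f"answer from {name}"

    def unregister(self, name: str) -> None:
        self.registered.discard(name)


client = FakeClient()
assert run_rag_pipeline(client) == "answer from vector-db-quickstart"
assert not client.registered  # cleanup ran and left nothing behind
```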


167-167: Good use of versioned documentation URLs.

Using a specific release tag (v0.6.1) ensures test stability and prevents breakage from upstream changes, as suggested in previous reviews.

tests/rag/conftest.py (4)

32-35: Appropriate resource specifications added.

The resource requests and limits are reasonable for a RAG deployment, providing minimum guarantees while allowing scaling up to 2 CPUs and 12Gi memory.


52-62: Correct return type annotation for enabled_llama_stack_operator.

The fixture now properly yields DataScienceCluster instead of None, improving type safety.


114-127: Well-implemented readiness check with retry mechanism.

The function follows best practices with specific exception handling and retry logic, aligning with the project's patterns for port forwarding and service readiness.
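A minimal stand-in for the retry-based readiness check; the real fixture uses the project's retry decorator and an HTTP probe against the port-forwarded service, so the decorator, attempt counts, and exception type here are illustrative:

```python
import functools
import time


def retry(attempts: int = 5, delay: float = 0.1):
    """Retry the wrapped callable on ConnectionError, re-raising after the last attempt."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except ConnectionError as exc:  # only retry the expected failure mode
                    last_exc = exc
                    time.sleep(delay)
            raise last_exc
        return wrapper

    return decorator


state = {"polls": 0}


@retry(attempts=5, delay=0)
def wait_for_llama_stack_ready() -> bool:
    state["polls"] += 1
    if state["polls"] < 3:  # simulate a service that needs a few polls to come up
        raise ConnectionError("not ready yet")
    return True


assert wait_for_llama_stack_ready()
assert state["polls"] == 3
```

Catching only the expected exception keeps genuine bugs (type errors, bad URLs) loud instead of silently retried, which is the practice the comment above highlights.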


147-163: Proper error handling for port forwarding.

The implementation correctly addresses previous review comments by adding try-except error handling with logging.

Comment thread tests/rag/conftest.py
jiripetrlik
jiripetrlik previously approved these changes Jul 18, 2025
@jgarciao

> Do we want to test for average response time?

@ChristianZaccaria this can be done in a follow-up PR

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/rag/conftest.py (1)

129-163: LGTM: Well-implemented client fixture with proper error handling.

The fixture correctly sets up port forwarding, waits for service readiness, and provides appropriate error handling. The timeout values are reasonable for RAG/LLM operations.

Minor docstring inconsistency - line 141 should reflect the union type:

-        rag_test_namespace (Namespace): Namespace or project containing RAG test resources
+        rag_test_namespace (Namespace | Project): Namespace or project containing RAG test resources
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 152fd68 and 7e6b3b6.

📒 Files selected for processing (1)
  • tests/rag/conftest.py (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: fege
PR: opendatahub-io/opendatahub-tests#320
File: tests/model_registry/rest_api/conftest.py:200-216
Timestamp: 2025-06-05T14:32:40.247Z
Learning: In the opendatahub-tests repository, the test fixtures should raise exceptions on cleanup failures rather than just logging warnings. The user fege prefers strict cleanup behavior where tests fail if cleanup doesn't work properly, rather than silently continuing.
tests/rag/conftest.py (7)
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:595-612
Timestamp: 2025-07-17T15:42:04.167Z
Learning: In tests/model_registry/conftest.py, the db_deployment_1 fixture (and similar duplicated resource fixtures) do not require the admin_client parameter or explicit dependencies on related fixtures like db_secret_1, db_pvc_1, and db_service_1, even though the original model_registry_db_deployment fixture includes these parameters.
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Learnt from: dbasunag
PR: opendatahub-io/opendatahub-tests#354
File: tests/model_registry/rbac/conftest.py:166-175
Timestamp: 2025-06-16T11:25:39.599Z
Learning: In tests/model_registry/rbac/conftest.py, predictable names are intentionally used for test resources (like RoleBindings and groups) instead of random names. This design choice prioritizes exposing cleanup failures from previous test runs through name collisions rather than masking such issues with random names. The philosophy is that test failures should be observable and informative to help debug underlying infrastructure or cleanup issues.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:733-770
Timestamp: 2025-07-17T15:42:23.880Z
Learning: In tests/model_registry/conftest.py, the model_registry_instance_1 and model_registry_instance_2 fixtures do not need explicit database dependency fixtures (like db_deployment_1, db_secret_1, etc.) in their function signatures. Pytest's dependency injection automatically handles the fixture dependencies when they reference db_name_1 and db_name_2 parameters. This is the correct pattern for these Model Registry instance fixtures.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:579-591
Timestamp: 2025-07-17T15:43:04.876Z
Learning: In tests/model_registry/conftest.py, the db_service_1 and db_service_2 fixtures do not require the admin_client parameter for Service resource creation, despite the existing model_registry_db_service fixture using client=admin_client. This inconsistency was confirmed as intentional by user lugi0.
Learnt from: lugi0
PR: opendatahub-io/opendatahub-tests#446
File: tests/model_registry/conftest.py:666-676
Timestamp: 2025-07-17T15:41:54.284Z
Learning: In tests/model_registry/conftest.py, the db_secret_1 and db_secret_2 fixtures do not require the admin_client parameter in their signatures, unlike some other Secret fixtures in the codebase. The user lugi0 confirmed this is the correct pattern for these specific fixtures.
Learnt from: edwardquarm
PR: opendatahub-io/opendatahub-tests#340
File: tests/model_serving/model_runtime/utils.py:53-58
Timestamp: 2025-07-17T15:53:15.715Z
Learning: For port forwarding in test utilities, the preference is to rely on retry mechanisms (like the @retry decorator) rather than explicit error handling, allowing port forwarding failures to be handled transparently through retries until success.
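The retry-over-explicit-error-handling preference in the learning above can be sketched as a small decorator. The names here (`retry`, `open_port_forward`) and the attempt counts are illustrative only, not the repository's actual helpers:

```python
import functools
import time


def retry(attempts=5, delay=0.0, exceptions=(ConnectionError,)):
    """Re-invoke the wrapped callable until it succeeds or attempts run out."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(attempts):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    last_exc = exc
                    time.sleep(delay)
            raise last_exc
        return wrapper
    return decorator


attempts_made = {"count": 0}


@retry(attempts=5, delay=0)
def open_port_forward():
    """Stand-in for a real port-forward setup: fails twice, then succeeds."""
    attempts_made["count"] += 1
    if attempts_made["count"] < 3:
        raise ConnectionError("pod not yet accepting connections")
    return "forwarded"


print(open_port_forward())   # succeeds on the third transparent attempt
print(attempts_made["count"])
```

The caller never sees the two transient `ConnectionError`s; the decorator absorbs them, which is the "handled transparently through retries" behavior the learning records.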
🔇 Additional comments (4)
tests/rag/conftest.py (4)

2-20: LGTM: Import changes support new functionality.

The new imports are appropriately added to support the enhanced RAG test infrastructure, including port forwarding, client creation, and retry mechanisms.


32-48: LGTM: Server configuration properly updated for RAG workload.

The resource specifications, environment variables, and storage configuration are appropriately configured for the RAG use case with reasonable limits and proper paths.


52-92: LGTM: Fixture scope and type hint improvements.

The scope changes to "class" improve test efficiency by reducing expensive setup operations, and the type hints correctly support both Namespace and Project types as needed for cross-platform compatibility.
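The class-scope efficiency gain described above can be demonstrated with a self-contained sketch; the fixture and test names below are hypothetical, not the ones in `tests/rag/conftest.py`:

```python
import pathlib
import tempfile
import textwrap

import pytest

# Write a throwaway test module so pytest can collect it independently.
TEST_SRC = textwrap.dedent('''
    import pytest

    created = []

    @pytest.fixture(scope="class")
    def llama_stack_server():
        created.append("server")   # expensive setup runs here
        yield {"url": "http://localhost:8321"}

    class TestRAG:
        def test_inference(self, llama_stack_server):
            assert llama_stack_server["url"].startswith("http")

        def test_agent(self, llama_stack_server):
            # Second test in the class reuses the same fixture instance,
            # so the expensive setup ran exactly once.
            assert len(created) == 1
''')

path = pathlib.Path(tempfile.mkdtemp()) / "test_scope_demo.py"
path.write_text(TEST_SRC)
exit_code = pytest.main(["-q", str(path)])
```

With the default `scope="function"`, the same fixture would be rebuilt for every test and the `len(created) == 1` assertion would fail on the second test.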


114-126: LGTM: Well-implemented readiness check with proper error handling.

The function correctly uses retry logic, handles expected connection errors during startup, and provides informative logging. The timeout and retry intervals are appropriate for service startup.
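The readiness behavior praised above can be approximated as a plain polling loop. The endpoint path, timeout, and interval below are assumptions for illustration; the real fixture relies on the project's own retry utilities:

```python
import time
import urllib.error
import urllib.request


def wait_for_server_ready(url: str, timeout: float = 180, interval: float = 5) -> bool:
    """Poll a health endpoint, treating connection errors as 'still starting'."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, ConnectionError, OSError):
            pass  # expected while the server process is still coming up
        time.sleep(interval)
    return False
```

Swallowing `URLError`/`ConnectionError` inside the loop is what makes startup-phase connection refusals non-fatal, while the deadline still bounds how long a broken deployment can stall the suite.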

@jgarciao jgarciao force-pushed the add-rag-quicktart-tests branch from 5fb7363 to 4956bf1 on July 18, 2025 at 15:57
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
@jgarciao jgarciao force-pushed the add-rag-quicktart-tests branch from 850c6f7 to 7839973 on July 18, 2025 at 16:04
Contributor

@ChristianZaccaria ChristianZaccaria left a comment


/lgtm !!

@jgarciao jgarciao merged commit 7354690 into opendatahub-io:main Jul 21, 2025
9 checks passed
@github-actions

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.
