fix: Fix rag tests after rebase #591
Conversation
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
When using Qwen the tests fail, as it seems tool calling is not enabled.

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
> **Caution**: Review failed. The pull request is closed.

📝 **Walkthrough**

Test updates across core and RAG suites: added explicit type annotations and return types in core tests, plus extra assertions in inference. RAG tests removed MinIO-related fixtures and parameterization, adjusted assertions, and now derive embedding dimensions dynamically. No new public APIs; only test signatures and expectations changed.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
The following are automatically added/executed:
Available user actions:
Supported labels: `/verified`, `/cherry-pick`, `/hold`, `/lgtm`, `/wip`, `/build-push-pr-image`
Actionable comments posted: 0
> **Caution**: Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/llama_stack/rag/test_rag.py (1)
**113-149**: Fix None-check before calling `.lower()` and de-duplicate assertions. Current order can raise on `None`. Also condense repetitive keyword checks.

```diff
- content = response.output_message.content.lower()
- assert content is not None, "LLM response content is None"
- assert "answer" in content, "The LLM didn't provide the expected answer to the prompt"
- assert "translate" in content, "The LLM didn't provide the expected answer to the prompt"
- assert "summarize" in content, "The LLM didn't provide the expected answer to the prompt"
- assert "chat" in content, "The LLM didn't provide the expected answer to the prompt"
+ content_raw = response.output_message.content
+ assert content_raw is not None, "LLM response content is None"
+ content = content_raw.lower()
+ for kw in ("answer", "translate", "summarize", "chat"):
+     assert kw in content, f"The LLM didn't mention expected capability: {kw}"
```
♻️ Duplicate comments (3)
tests/llama_stack/core/test_llamastack_core.py (3)
**44-46**: Same: drop unused fixture params here. Follow the same `@usefixtures` refactor and remove `minio_pod`/`minio_data_connection` from this signature.

```diff
- def test_model_register(
-     self, minio_pod: Pod, minio_data_connection: Secret, llama_stack_client: LlamaStackClient
- ) -> None:
+ def test_model_register(self, llama_stack_client: LlamaStackClient) -> None:
```
**52-54**: Same: drop unused fixture params here.

```diff
- def test_model_list(
-     self, minio_pod: Pod, minio_data_connection: Secret, llama_stack_client: LlamaStackClient
- ) -> None:
+ def test_model_list(self, llama_stack_client: LlamaStackClient) -> None:
```
**64-66**: Same: drop unused fixture params here.

```diff
- def test_inference(
-     self, minio_pod: Pod, minio_data_connection: Secret, llama_stack_client: LlamaStackClient
- ) -> None:
+ def test_inference(self, llama_stack_client: LlamaStackClient) -> None:
```
🧹 Nitpick comments (5)
tests/llama_stack/core/test_llamastack_core.py (2)
**25-27**: Use `@usefixtures` instead of unused fixture params (silences Ruff ARG002 and clarifies intent). The fixtures `minio_pod` and `minio_data_connection` are only needed for side effects; they aren’t referenced. Prefer `@pytest.mark.usefixtures(...)` and drop them from the signatures.

Apply:

```diff
 @pytest.mark.rawdeployment
 @pytest.mark.smoke
-class TestLlamaStackCore:
-    def test_lls_server_initial_state(
-        self, minio_pod: Pod, minio_data_connection: Secret, llama_stack_client: LlamaStackClient
-    ) -> None:
+@pytest.mark.usefixtures("minio_pod", "minio_data_connection")
+class TestLlamaStackCore:
+    def test_lls_server_initial_state(self, llama_stack_client: LlamaStackClient) -> None:
```

If you adopt this, also remove now-unused imports:

```diff
-from ocp_resources.pod import Pod
-from ocp_resources.secret import Secret
```

Alternative minimal change: keep signatures and add per-arg ignores:

```diff
-    def test_lls_server_initial_state(
-        self, minio_pod: Pod, minio_data_connection: Secret, llama_stack_client: LlamaStackClient
-    ) -> None:
+    def test_lls_server_initial_state(
+        self,
+        minio_pod: Pod,  # noqa: ARG002
+        minio_data_connection: Secret,  # noqa: ARG002
+        llama_stack_client: LlamaStackClient,
+    ) -> None:
```
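As a self-contained illustration of the class-level `@usefixtures` pattern suggested above (the fixture names `storage_pod` and `data_connection` are hypothetical stand-ins for `minio_pod`/`minio_data_connection`):

```python
import pytest


# Side-effect-only fixtures: nothing is returned that tests need to reference.
@pytest.fixture
def storage_pod():
    # Setup/teardown side effects would live around this yield.
    yield


@pytest.fixture
def data_connection():
    yield


# Class-level mark: every test in the class triggers both fixtures
# without listing them as unused arguments (so Ruff ARG002 stays quiet).
@pytest.mark.usefixtures("storage_pod", "data_connection")
class TestSketch:
    def test_uses_side_effects(self) -> None:
        assert True
```

The mark lands on `TestSketch.pytestmark`, so pytest resolves both fixtures for every test method while the signatures stay minimal.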
**4-7**: Type-only imports: optional clean-up if you adopt `@usefixtures`. If you remove `Pod`/`Secret` from test signatures, drop these imports to prevent unused-import lint. If you keep them for typing only, consider `from __future__ import annotations` or `if TYPE_CHECKING:` guards.

tests/llama_stack/rag/test_rag.py (3)
**32-47**: Remove the `# type: ignore` by validating/casting `embedding_dimension`. Make the expectation explicit and avoid masking type issues.

```diff
- embeddings_response = llama_stack_client.inference.embeddings(
-     model_id=embedding_model.identifier,
-     contents=["First chunk of text"],
-     output_dimension=embedding_dimension,  # type: ignore
- )
+ ed = embedding_model.metadata.get("embedding_dimension")
+ assert isinstance(ed, int) and ed > 0, "Invalid embedding_dimension in model metadata"
+ embeddings_response = llama_stack_client.inference.embeddings(
+     model_id=embedding_model.identifier,
+     contents=["First chunk of text"],
+     output_dimension=ed,
+ )
```
**54**: Apply the same explicit check/cast for `embedding_dimension` in this test. Prevents silent misconfigurations and removes implicit typing assumptions.

```diff
 embedding_model = next(m for m in models if m.api_model_type == "embedding")
-embedding_dimension = embedding_model.metadata["embedding_dimension"]
+ed = embedding_model.metadata.get("embedding_dimension")
+assert isinstance(ed, int) and ed > 0, "Invalid embedding_dimension in model metadata"
@@
 llama_stack_client.vector_dbs.register(
     vector_db_id=vector_db_id,
     embedding_model=embedding_model.identifier,
-    embedding_dimension=embedding_dimension,  # type: ignore
+    embedding_dimension=ed,
     provider_id="milvus",
 )
@@
 embeddings_response = llama_stack_client.inference.embeddings(
     model_id=embedding_model.identifier,
     contents=["First chunk of text"],
-    output_dimension=embedding_dimension,  # type: ignore
+    output_dimension=ed,
 )
```
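The validation the diffs above inline can also live in a small helper. A standalone sketch — `require_embedding_dimension` is a hypothetical function, not part of the llama-stack API; it only assumes the metadata is a plain mapping like the model objects in these tests expose:

```python
from typing import Any, Mapping


def require_embedding_dimension(metadata: Mapping[str, Any]) -> int:
    """Return a validated embedding_dimension instead of a `# type: ignore`."""
    value = metadata.get("embedding_dimension")
    # Reject bools explicitly: bool is a subclass of int in Python,
    # so `isinstance(True, int)` would otherwise slip through.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"Invalid embedding_dimension in model metadata: {value!r}")
    return value
```

Centralizing the check means both call sites (`vector_dbs.register` and `inference.embeddings`) fail fast with one clear message when a model advertises a missing or malformed dimension.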
**152**: Ensure the chosen LLM supports tool-calling (prefer MaaS). To avoid regressions when a non-tool-enabled model is first in the list, select a tool-capable model (e.g., MaaS) if available.

```diff
-model_id = next(m for m in models if m.api_model_type == "llm").identifier
+llm_models = [m for m in models if m.api_model_type == "llm"]
+# Prefer MaaS (tool-calling) models if present; fallback to the first LLM.
+preferred = next((m for m in llm_models if "maas" in m.identifier.lower()), llm_models[0])
+model_id = preferred.identifier
```

To double-check what models are exposed and their identifiers, you can print them during a failing run or log them at debug level.
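The selection logic can be exercised in isolation. In this sketch `Model` is a minimal stand-in for the client's model objects, and the `"maas"` substring heuristic mirrors the suggestion above — both are assumptions for illustration, not the llama-stack API:

```python
from dataclasses import dataclass


@dataclass
class Model:
    identifier: str
    api_model_type: str


def pick_llm_identifier(models: list[Model]) -> str:
    """Prefer a MaaS (tool-calling) model; fall back to the first LLM."""
    llm_models = [m for m in models if m.api_model_type == "llm"]
    if not llm_models:
        raise ValueError("no LLM models registered")
    preferred = next(
        (m for m in llm_models if "maas" in m.identifier.lower()),
        llm_models[0],
    )
    return preferred.identifier
```

With only non-MaaS LLMs registered the fallback keeps current behavior, so the change is additive: it only reorders the choice when a tool-capable model is actually present.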
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `tests/llama_stack/core/test_llamastack_core.py` (4 hunks)
- `tests/llama_stack/rag/test_rag.py` (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/llama_stack/rag/test_rag.py (1)
- tests/llama_stack/conftest.py (1): `llama_stack_client` (128-159)

tests/llama_stack/core/test_llamastack_core.py (4)
- tests/llama_stack/conftest.py (1): `llama_stack_client` (128-159)
- utilities/constants.py (1): `MinIo` (273-333)
- tests/conftest.py (2): `minio_pod` (480-520), `minio_data_connection` (546-558)
- tests/llama_stack/constants.py (2): `LlamaStackProviders` (4-14), `Inference` (7-8)
🪛 Ruff (0.12.2)
tests/llama_stack/rag/test_rag.py
145-145: Use of assert detected
(S101)
146-146: Use of assert detected
(S101)
147-147: Use of assert detected
(S101)
148-148: Use of assert detected
(S101)
149-149: Use of assert detected
(S101)
tests/llama_stack/core/test_llamastack_core.py
26-26: Unused method argument: minio_pod
(ARG002)
26-26: Unused method argument: minio_data_connection
(ARG002)
45-45: Unused method argument: minio_pod
(ARG002)
45-45: Unused method argument: minio_data_connection
(ARG002)
50-50: Use of assert detected
(S101)
53-53: Unused method argument: minio_pod
(ARG002)
53-53: Unused method argument: minio_data_connection
(ARG002)
65-65: Unused method argument: minio_pod
(ARG002)
65-65: Unused method argument: minio_data_connection
(ARG002)
🔇 Additional comments (1)
tests/llama_stack/rag/test_rag.py (1)
**15-16**: LGTM: simplified parameterization. Switching to only `model_namespace` aligns with the MaaS-based setup and removes MinIO coupling.
Status of building tag latest: success.
* fix: fix typing issues in llama-stack core tests

  Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>

* fix: Modify llama-stack RAG tests to use MaaS

  When using Qwen the tests fail, as it seems tool calling is not enabled.

  Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>

---------

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>