
Commit ff0fea7

jgarciao, dbasunag, kpunwatk, mwaykole, and pre-commit-ci[bot] committed
Fix and deprecate agents test. Increase timeouts. Add workaround for RHAIENG-1819 (#840)
* fix: Fix agents test (Jorge Garcia Oncins)
* fix: Deprecate agents test (Jorge Garcia Oncins)
* Move to pytest 9.0.0 (#823) (Karishma Punwatkar)
* Update label to ensure we pick the right pod (#841)
* Fix ig test failing due to timeout (#844) (Milind Waykole)
* Remove empty dir which is not needed now (#846) (Milind Waykole)
* Some refactoring of model-server code (#845): refactor some tests; add kueue direct (Milind Waykole)
* Migrate tier2 test to OVMS (#847) (Milind Waykole)
* Refactor model_server tests: organize KServe tests and migrate to OVMS (#848)
  - Move KServe-specific tests into the tests/model_serving/model_server/kserve/ directory: authentication/, components/, inference_graph/, inference_service_configuration/, keda/, kueue/, metrics/, model_car/, multi_node/, private_endpoint/, raw_deployment/, routes/, stop_resume/, storage/
  - Keep upgrade/ at the model_server level (general model server tests)
  - Keep llmd/ and maas_billing/ at the model_server level (non-KServe components)
  - Migrate test_custom_resources.py from Caikit-TGIS to the OVMS runtime: use the OVMS template and model format, use the test-dir model from the ods-ci-s3 bucket, and add it to the sanity test suite
  - Fix InferenceGraph tests: add the kserve_raw_headless_service_config fixture to set DSC rawDeploymentServiceConfig to Headed; ensure proper fixture dependency ordering; all 6 InferenceGraph tests now passing
  - Update all imports to reflect the new directory structure
  - Clean up empty directories (model_mesh, ovms, runtime_configuration)
  - All pre-commit checks and tests passing
* Refactor model_server tests: organize KServe tests and migrate to OVMS (#850): fix the sign-off for the previous PR (Milind Waykole)
* Add smoke test markers for Runtime tests (#852)
* Hardcode the Triton Runtime image in the Triton test suite (#853): hardcode the Triton image, fix the namespace name, update the TRITON_IMAGE version to 24.10-py3, add a smoke marker
* Add test checking isvc status for issue RHOAIENG-38674 (#858) (Milind Waykole)
* Add support for byoidc envs in MR tests (#820) (lugi0)
* Lock file maintenance (#856) (renovate[bot])
* Adjust automation for FAISS vector store (#819)
* [pre-commit.ci] pre-commit autoupdate (#859): astral-sh/ruff-pre-commit v0.14.4 → v0.14.5
* llmd tests minor improvements (#831): increase timeout for llmd; update utilities/llmd_constants.py
* Fix the tests which are getting skipped due to scope (#865) (Milind Waykole)
* Update error pattern for negative test (#864)
* fix: Add xfail to failing catalog test (#867) (lugi0)
* Change model catalog label to pick both catalog pods (#863)
* feat: Add wait_for_unique_llama_stack_pod as a workaround for RHAIENG-1819 (Jorge Garcia Oncins)
* fix: Increase timeout when creating the llama-stack deployment (Jorge Garcia Oncins)
* feat: Add tests for ragas remote provider (#866)

Co-authored-by: Debarati Basu-Nag, Karishma Punwatkar, Milind Waykole, pre-commit-ci[bot], RAGHUL M, Luca Giorgi, renovate[bot], Jiri Petrlik, Thomas Recchiuto, Adolfo Aguirrezabal
1 parent ada7ce5 commit ff0fea7

File tree

3 files changed: +81 −18 lines changed

tests/llama_stack/agents/test_agents.py renamed to tests/llama_stack/agents/test_agents_deprecated.py

Lines changed: 33 additions & 12 deletions
@@ -21,13 +21,19 @@
 )
 @pytest.mark.rag
 @pytest.mark.skip_must_gather
-class TestLlamaStackAgents:
-    """Test class for LlamaStack Agents API
+class TestLlamaStackAgentsDeprecated:
+    """Test class for LlamaStack Agents API (Deprecated)
 
-    For more information about this API, see:
-    - https://llamastack.github.io/docs/building_applications/agent
-    - https://llamastack.github.io/docs/references/python_sdk_reference#agents
-    - https://llamastack.github.io/docs/building_applications/responses_vs_agents
+    Deprecation Notice: The LlamaStack Agents API was removed server-side in llama-stack 0.3.0.
+    It is partially implemented in llama-stack-client using the Responses API
+    (https://github.com/llamastack/llama-stack-client-python/pull/281).
+
+    Users are encouraged to use the Responses API directly.
+
+    For more information, see:
+    - https://llamastack.github.io/docs/api-deprecated/agents
+    - "Migrating from Agent objects to Responses in Llama Stack":
+      https://github.com/opendatahub-io/agents/blob/5902bef12c25281eecfcd3d25654de8b02857e33/migration/legacy-agents/responses-api-agent-migration.ipynb
     """
 
     @pytest.mark.smoke
@@ -106,11 +112,18 @@ def test_agents_simple_agent(
         )
 
     @pytest.mark.smoke
+    @pytest.mark.parametrize(
+        "enable_streaming",
+        [
+            pytest.param(False, id="streaming_disabled"),
+        ],
+    )
     def test_agents_rag_agent(
         self,
         unprivileged_llama_stack_client: LlamaStackClient,
         llama_stack_models: ModelInfo,
         vector_store_with_example_docs: VectorStore,
+        enable_streaming: bool,
     ) -> None:
         """
         Test RAG agent that can answer questions about the Torchtune project using the documents
@@ -123,7 +136,8 @@ def test_agents_rag_agent(
         Based on "Build a RAG Agent" example available at
         https://llamastack.github.io/docs/getting_started/detailed_tutorial
 
-        # TODO: update this example to use the vector_store API
+        Note: streaming is not tested (enable_streaming = False), as it seems to be broken in
+        llama-stack 0.3.0 (Agents API is only partially implemented)
         """
 
         # Create the RAG agent connected to the vector database
@@ -147,19 +161,26 @@ def test_agents_rag_agent(
             rag_agent=rag_agent,
             session_id=session_id,
             turns_with_expectations=turns_with_expectations,
-            stream=True,
+            stream=enable_streaming,
             verbose=True,
             min_keywords_required=1,
             print_events=False,
         )
 
         # Assert that validation was successful
-        assert validation_result["success"], f"RAG agent validation failed. Summary: {validation_result['summary']}"
+        assert validation_result["success"], (
+            f"RAG agent validation failed with streaming={enable_streaming}. Summary: {validation_result['summary']}"
+        )
 
         # Additional assertions for specific requirements
         for result in validation_result["results"]:
-            assert result["event_count"] > 0, f"No events generated for question: {result['question']}"
-            assert result["response_length"] > 0, f"No response content for question: {result['question']}"
+            assert result["response_length"] > 0, (
+                f"No response content for question: {result['question']} (streaming={enable_streaming})"
+            )
             assert len(result["found_keywords"]) > 0, (
-                f"No expected keywords found in response for: {result['question']}"
+                f"No expected keywords found in response for: {result['question']} (streaming={enable_streaming})"
             )
+            if enable_streaming:
+                assert result["event_count"] > 0, (
+                    f"No events generated for question: {result['question']} (streaming={enable_streaming})"
+                )
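The streaming-conditional assertions in this diff can be isolated into a small helper. The sketch below is hypothetical (the repo keeps these checks inline in the test); it assumes result dicts with the same keys the diff uses (`question`, `response_length`, `found_keywords`, `event_count`), and only enforces event counts when streaming is enabled.

```python
def validate_results(results: list[dict], enable_streaming: bool) -> None:
    """Mirror the diff's per-result checks: response content and keywords are
    always required; event counts are checked only when streaming is enabled."""
    for result in results:
        assert result["response_length"] > 0, (
            f"No response content for question: {result['question']} (streaming={enable_streaming})"
        )
        assert len(result["found_keywords"]) > 0, (
            f"No expected keywords found in response for: {result['question']} (streaming={enable_streaming})"
        )
        if enable_streaming:
            # Streamed turns should emit at least one event; non-streamed turns
            # legitimately produce zero events, so skip this check otherwise.
            assert result["event_count"] > 0, (
                f"No events generated for question: {result['question']} (streaming={enable_streaming})"
            )
```

Gating the `event_count` assertion this way is what lets the same test body run for both parametrized modes once streaming is re-enabled.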

tests/llama_stack/conftest.py

Lines changed: 11 additions & 5 deletions
@@ -19,6 +19,7 @@
     create_llama_stack_distribution,
     wait_for_llama_stack_client_ready,
     vector_store_create_file_from_url,
+    wait_for_unique_llama_stack_pod,
 )
 from utilities.constants import DscComponents, Annotations
 from utilities.data_science_cluster_utils import update_components_in_dsc
@@ -144,8 +145,8 @@ def test_with_remote_milvus(llama_stack_server_config):
     server_config: Dict[str, Any] = {
         "containerSpec": {
             "resources": {
-                "requests": {"cpu": "250m", "memory": "500Mi"},
-                "limits": {"cpu": "2", "memory": "12Gi"},
+                "requests": {"cpu": "1", "memory": "3Gi"},
+                "limits": {"cpu": "3", "memory": "6Gi"},
             },
             "env": env_vars,
             "name": "llama-stack",
@@ -208,6 +209,7 @@ def _get_llama_stack_distribution_deployment(
     """
     Returns the Deployment resource for a given LlamaStackDistribution.
     Note: The deployment is created by the operator; this function retrieves it.
+    Includes a workaround for RHAIENG-1819 to ensure exactly one pod exists.
 
     Args:
         client (DynamicClient): Kubernetes client
@@ -222,9 +224,12 @@ def _get_llama_stack_distribution_deployment(
         name=llama_stack_distribution.name,
         min_ready_seconds=10,
     )
-
+    deployment.timeout_seconds = 120
     deployment.wait(timeout=120)
     deployment.wait_for_replicas()
+    # Workaround for RHAIENG-1819 (Incorrect number of llama-stack pods deployed after
+    # creating LlamaStackDistribution after setting custom ca bundle in DSCI)
+    wait_for_unique_llama_stack_pod(client=client, namespace=llama_stack_distribution.namespace)
     yield deployment
 
 
@@ -321,6 +326,7 @@ def _create_llama_stack_test_route(
             }
         }
     ):
+        route.wait(timeout=60)
         yield route
 
 
@@ -355,11 +361,11 @@ def _create_llama_stack_client(
 ) -> Generator[LlamaStackClient, Any, Any]:
     # LLS_CLIENT_VERIFY_SSL is false by default to be able to test with Self-Signed certificates
     verifySSL = os.getenv("LLS_CLIENT_VERIFY_SSL", "false").lower() == "true"
-    http_client = httpx.Client(verify=verifySSL)
+    http_client = httpx.Client(verify=verifySSL, timeout=240)
     try:
         client = LlamaStackClient(
             base_url=f"https://{route.host}",
-            timeout=180.0,
+            max_retries=3,
             http_client=http_client,
         )
         wait_for_llama_stack_client_ready(client=client)
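The `LLS_CLIENT_VERIFY_SSL` handling in this fixture reduces to a one-line environment lookup: verification is off by default (so self-signed certificates work) and only the literal value "true", case-insensitive, enables it. A minimal stdlib-only sketch of that parsing logic (the function name here is hypothetical; the repo inlines this expression):

```python
import os


def verify_ssl_from_env(var: str = "LLS_CLIENT_VERIFY_SSL") -> bool:
    """SSL verification is disabled unless the variable is explicitly set
    to "true" (case-insensitive); any other value, or an unset variable,
    leaves verification off."""
    return os.getenv(var, "false").lower() == "true"
```

Note that with this diff the request timeout now lives on the `httpx.Client` transport (`timeout=240`) rather than on the `LlamaStackClient` constructor, which instead gains `max_retries=3`.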

tests/llama_stack/utils.py

Lines changed: 37 additions & 1 deletion
@@ -2,11 +2,14 @@
 from typing import Any, Callable, Dict, Generator, List, cast
 
 from kubernetes.dynamic import DynamicClient
+from kubernetes.dynamic.exceptions import ResourceNotFoundError
 from llama_stack_client import LlamaStackClient, APIConnectionError, InternalServerError
 from llama_stack_client.types.vector_store import VectorStore
 from ocp_resources.llama_stack_distribution import LlamaStackDistribution
+from ocp_resources.pod import Pod
 from simple_logger.logger import get_logger
 from timeout_sampler import retry
+from utilities.exceptions import UnexpectedResourceCountError
 
 
 from tests.llama_stack.constants import (
@@ -15,6 +18,7 @@
     ModelInfo,
     ValidationResult,
     TurnResult,
+    LLS_CORE_POD_FILTER,
 )
 
 from llama_stack_client import Agent, AgentEventLogger
@@ -49,24 +53,55 @@ def create_llama_stack_distribution(
     yield llama_stack_distribution
 
 
+@retry(
+    wait_timeout=60,
+    sleep=5,
+    exceptions_dict={ResourceNotFoundError: [], UnexpectedResourceCountError: []},
+)
+def wait_for_unique_llama_stack_pod(client: DynamicClient, namespace: str) -> Pod:
+    """Wait until exactly one LlamaStackDistribution pod is found in the
+    namespace (multiple pods may indicate known bug RHAIENG-1819)."""
+    pods = list(
+        Pod.get(
+            dyn_client=client,
+            namespace=namespace,
+            label_selector=LLS_CORE_POD_FILTER,
+        )
+    )
+    if not pods:
+        raise ResourceNotFoundError(f"No pods found with label selector {LLS_CORE_POD_FILTER} in namespace {namespace}")
+    if len(pods) != 1:
+        raise UnexpectedResourceCountError(
+            f"Expected exactly 1 pod with label selector {LLS_CORE_POD_FILTER} "
+            f"in namespace {namespace}, found {len(pods)}. "
+            f"(possibly due to known bug RHAIENG-1819)"
+        )
+    return pods[0]
+
+
 @retry(wait_timeout=90, sleep=5)
 def wait_for_llama_stack_client_ready(client: LlamaStackClient) -> bool:
+    """Wait for LlamaStack client to be ready by checking health, version, and database access."""
     try:
         client.inspect.health()
         version = client.inspect.version()
-        # Check access to llama-stack server database
+        models = client.models.list()
         vector_stores = client.vector_stores.list()
         files = client.files.list()
         LOGGER.info(
             f"Llama Stack server is available! "
             f"(version:{version.version} "
+            f"models:{len(models)} "
             f"vector_stores:{len(vector_stores.data)} "
             f"files:{len(files.data)})"
         )
         return True
+
     except (APIConnectionError, InternalServerError) as error:
         LOGGER.debug(f"Llama Stack server not ready yet: {error}")
+        LOGGER.debug(f"Base URL: {client.base_url}, Error type: {type(error)}, Error details: {str(error)}")
         return False
+
     except Exception as e:
         LOGGER.warning(f"Unexpected error checking Llama Stack readiness: {e}")
         return False
@@ -108,6 +143,7 @@ def _response_fn(*, question: str) -> str:
     response = llama_stack_client.responses.create(
         input=question,
         model=llama_stack_models.model_id,
+        stream=False,
         tools=[
             {
                 "type": "file_search",
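The `wait_for_unique_llama_stack_pod` workaround boils down to a generic poll-until-exactly-one pattern: list matching resources, succeed on exactly one, and retry on zero or many until a timeout. The stdlib-only sketch below captures those semantics without the cluster dependencies; `wait_for_unique`, `UnexpectedResourceCount`, and the injected `list_fn` are hypothetical stand-ins for the repo's `timeout_sampler.retry`-decorated helper and `Pod.get` call.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class UnexpectedResourceCount(Exception):
    """Raised when polling ends without exactly one matching resource."""


def wait_for_unique(list_fn: Callable[[], List[T]], wait_timeout: float = 60, sleep: float = 5) -> T:
    """Poll list_fn until it returns exactly one item.

    Zero items (resource not created yet) and multiple items (e.g. a stale
    duplicate pod, as in RHAIENG-1819) are both treated as retryable until
    the timeout elapses.
    """
    deadline = time.monotonic() + wait_timeout
    while True:
        items = list_fn()
        if len(items) == 1:
            return items[0]
        if time.monotonic() >= deadline:
            raise UnexpectedResourceCount(f"Expected exactly 1 item, found {len(items)}")
        time.sleep(sleep)
```

Retrying on "too many" as well as "none" is what makes this a workaround rather than a plain readiness wait: it gives the operator time to garbage-collect the duplicate pod before the test proceeds.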
