Feature: add llmd test and lib as foundation of llmd test #612
mwaykole merged 15 commits into opendatahub-io:main from
Conversation
📝 Walkthrough

Adds a new LLMD (LLM Deployment) test suite and supporting utilities: pytest fixtures, readiness verifiers, end-to-end tests for OCI and S3 storage, deployment/inference helpers, centralized constants, and an OPT125M CPU inference manifest.

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
The following are automatically added/executed:
Available user actions:
Supported labels: /hold, /verified, /lgtm, /cherry-pick, /build-push-pr-image, /wip
for more information, see https://pre-commit.ci
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/model_serving/model_server/llmd/conftest.py (1)
128-176: S3 model URI likely points to the wrong bucket (hard‑coded in create_llmisvc).

create_llmisvc builds s3://ods-ci-wisdom/{storage_path} when storage_key and storage_path are provided, ignoring the actual models_s3_bucket_name set in your secret. This can 404 models in non‑ods-ci buckets. Prefer passing an explicit storage_uri built from models_s3_bucket_name and storage_path, and do not provide storage_key/storage_path, to avoid the hard‑coded path.

```diff
 @pytest.fixture(scope="class")
 def llmd_inference_service_s3(
     request: FixtureRequest,
     admin_client: DynamicClient,
     unprivileged_model_namespace: Namespace,
     llmd_s3_secret: Secret,
-    llmd_s3_service_account: ServiceAccount,
+    llmd_s3_service_account: ServiceAccount,
+    models_s3_bucket_name: str,
 ) -> Generator[LLMInferenceService, None, None]:
@@
-    if "storage_key" not in kwargs:
-        kwargs["storage_key"] = llmd_s3_secret.name
-
-    if "storage_path" not in kwargs:
-        kwargs["storage_path"] = DEFAULT_S3_STORAGE_PATH
+    # Resolve storage_path and construct an explicit S3 URI using the configured bucket
+    storage_path = kwargs.get("storage_path", DEFAULT_S3_STORAGE_PATH)
+    storage_uri = f"s3://{models_s3_bucket_name}/{storage_path}"
@@
     with create_llmisvc(
         client=admin_client,
         name=service_name,
         namespace=unprivileged_model_namespace.name,
-        storage_key=kwargs.get("storage_key"),
-        storage_path=kwargs.get("storage_path"),
+        storage_uri=storage_uri,
         container_image=kwargs.get("container_image", VLLM_CPU_IMAGE),
         container_resources=container_resources,
         service_account=llmd_s3_service_account.name,
         wait=True,
         timeout=Timeout.TIMEOUT_15MIN,
         **{
             k: v
             for k, v in kwargs.items()
             if k not in ["name", "storage_key", "storage_path", "container_image", "container_resources"]
         },
     ) as llm_service:
         yield llm_service
```

If you prefer fixing this at the source, I can propose a patch to create_llmisvc to derive the bucket from the provided secret or accept an explicit bucket parameter.
🧹 Nitpick comments (13)
utilities/manifests/opt125m_cpu.py (2)
9-9: Add leading slash to endpoints for URL-join robustness.

Safer when base URLs are concatenated.

```diff
-    "endpoint": "v1/chat/completions",
+    "endpoint": "/v1/chat/completions",
...
-    "endpoint": "v1/completions",
+    "endpoint": "/v1/completions",
```

Also applies to: 19-19
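For illustration, why the leading slash (or a small normalizer) matters when endpoints are joined to a base URL by plain string concatenation (host names here are hypothetical):

```python
# Hypothetical base URL; shows how a missing leading slash fuses the path
# into the host segment under plain f-string concatenation.
base_no_slash = "http://model-host:8080"
relative_endpoint = "v1/chat/completions"
absolute_endpoint = "/v1/chat/completions"

fused = f"{base_no_slash}{relative_endpoint}"    # "http://model-host:8080v1/chat/completions"
correct = f"{base_no_slash}{absolute_endpoint}"  # "http://model-host:8080/v1/chat/completions"

def join_url(base: str, endpoint: str) -> str:
    """Join tolerant of slashes on either side."""
    return f"{base.rstrip('/')}/{endpoint.lstrip('/')}"
```

A normalizer like `join_url` makes the tests robust to either endpoint spelling.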
2-6: Strengthen default_query_model to avoid tautological match.

r'.*' reduces test signal; consider a light regex that checks for 100°C/212°F or “boil/boiling”.

tests/model_serving/model_server/llmd/utils.py (2)
30-37: Defensive condition parsing to avoid KeyError/None access.

Protect against missing status/keys.

```diff
-    conditions = gateway.instance.status.get("conditions", [])
-    for condition in conditions:
-        if condition["type"] == "Programmed" and condition["status"] == "True":
+    status = getattr(gateway.instance, "status", {}) or {}
+    conditions = status.get("conditions") or []
+    for condition in conditions:
+        if condition.get("type") == "Programmed" and condition.get("status") == "True":
             LOGGER.info(f"Gateway {gateway.name} is programmed and ready")
             return True
```
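The same defensive lookup recurs for Gateway and LLMInferenceService; it could be factored into one helper along these lines (a sketch — `FakeInstance` is a test double standing in for the dynamic-client resource instance):

```python
from typing import Any

class FakeInstance:
    """Minimal stand-in for a dynamic-client resource instance (test double)."""
    def __init__(self, status: Any) -> None:
        self.status = status

def condition_is_true(instance: Any, condition_type: str) -> bool:
    """True if the named status condition reports 'True'.

    Tolerates a missing/None status and condition entries
    lacking 'type'/'status' keys.
    """
    status = getattr(instance, "status", None) or {}
    conditions = status.get("conditions") or []
    return any(
        isinstance(c, dict) and c.get("type") == condition_type and c.get("status") == "True"
        for c in conditions
    )
```

Both verifiers then reduce to `condition_is_true(gateway.instance, "Programmed")` and `condition_is_true(llm_service.instance, "Ready")`.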
54-61: Same defensive parsing for LLMInferenceService conditions.

```diff
-    conditions = llm_service.instance.status.get("conditions", [])
-    for condition in conditions:
-        if condition["type"] == "Ready" and condition["status"] == "True":
+    status = getattr(llm_service.instance, "status", {}) or {}
+    conditions = status.get("conditions") or []
+    for condition in conditions:
+        if condition.get("type") == "Ready" and condition.get("status") == "True":
             LOGGER.info(f"LLMInferenceService {llm_service.name} is ready")
             return True
```

tests/model_serving/model_server/llmd/conftest.py (2)
23-26: Silence unused admin_client argument in gateway_namespace.

Keeps the fixture concise and linter‑clean.

```diff
-@pytest.fixture(scope="class")
-def gateway_namespace(admin_client: DynamicClient) -> str:
-    return DEFAULT_GATEWAY_NAMESPACE
+@pytest.fixture(scope="class")
+def gateway_namespace(_: DynamicClient) -> str:
+    return DEFAULT_GATEWAY_NAMESPACE
```
105-112: Default resources may be high for shared CI clusters.

Consider making memory/CPU defaults overridable via env/pytest options to reduce scheduling flakiness.

Also applies to: 150-156
utilities/llmd_constants.py (2)
17-23: Avoid “latest” and personal registries; pin immutable image digests.
- DEFAULT_CONTAINER_IMAGE points to quay.io/pierdipi/vllm-cpu:latest, and VLLM_STORAGE_OCI references a personal namespace. For reproducibility and supply‑chain hygiene, pin digests or versioned tags and move off personal repos if possible.
Example (illustrative):
```diff
-VLLM_CPU_IMAGE = "quay.io/pierdipi/vllm-cpu:latest"
+VLLM_CPU_IMAGE = "quay.io/org/vllm-cpu@sha256:<digest>"
-VLLM_STORAGE_OCI = "oci://quay.io/mwaykole/test:opt-125m"
+VLLM_STORAGE_OCI = "oci://quay.io/org/opt-125m:<version>"
```
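A lightweight CI-side guard for this could look like the following sketch (the regex is a rough heuristic, not a full OCI reference parser):

```python
import re

# Rough heuristic: a reference is "pinned" when it carries a sha256 digest.
# Tags with unusual '@' segments are not handled by this simple check.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def is_digest_pinned(image_ref: str) -> bool:
    return bool(DIGEST_RE.search(image_ref))
```

Run over the constants module, this would flag `:latest` references before they reach CI.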
9-10: Use these constants in utilities/llmd_utils.py to avoid drift.KSERVE_GATEWAY_LABEL and KSERVE_INGRESS_GATEWAY are defined here but llmd_utils hard‑codes the same strings. Replace the hard‑coded values to keep things in sync.
I’ve provided a targeted diff in the llmd_utils review.
Also applies to: 22-23
utilities/llmd_utils.py (5)
21-27: De‑duplicate KSERVE labels; use shared constants.

Hard‑coded label keys/values will drift from utilities/llmd_constants.py. Import and use KSERVE_GATEWAY_LABEL and KSERVE_INGRESS_GATEWAY.
Apply:
```diff
 from utilities.llmd_constants import (
     DEFAULT_GATEWAY_NAME,
     DEFAULT_GATEWAY_NAMESPACE,
     OPENSHIFT_DEFAULT_GATEWAY_CLASS,
     DEFAULT_LLM_ENDPOINT,
     DEFAULT_TIMEOUT,
+    KSERVE_GATEWAY_LABEL,
+    KSERVE_INGRESS_GATEWAY,
 )
@@
-    if infrastructure is None:
-        infrastructure = {"labels": {"serving.kserve.io/gateway": "kserve-ingress-gateway"}}
+    if infrastructure is None:
+        infrastructure = {"labels": {KSERVE_GATEWAY_LABEL: KSERVE_INGRESS_GATEWAY}}
```

Also applies to: 71-73
83-85: Over‑broad exception catch.

Catching Exception hides real errors. Prefer specific exceptions (e.g., ApiException/ResourceNotFoundError) or re‑raise after logging.
355-357: Remove unused parameter from _validate_unauthorized_response().

Drop the inference arg and adjust the callsite.
Apply:
```diff
-    if authorized_user is False:
-        _validate_unauthorized_response(res, token, inference)
+    if authorized_user is False:
+        _validate_unauthorized_response(res, token)

-def _validate_unauthorized_response(res: Dict[str, Any], token: Optional[str], inference: LLMUserInference) -> None:
+def _validate_unauthorized_response(res: Dict[str, Any], token: Optional[str]) -> None:
```

Also applies to: 536-536
298-300: Over‑broad exception catches degrade diagnostics.

These wide catches hide TLS/CA resolution issues. Narrow them (ValueError, ImportError) or log and re‑raise.
Also applies to: 488-490
1-16: Minor: consolidate imports to what’s used.

Optional tidy‑up only if your linter flags it. No behavior change.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- tests/model_serving/model_server/llmd/__init__.py (1 hunks)
- tests/model_serving/model_server/llmd/conftest.py (1 hunks)
- tests/model_serving/model_server/llmd/test_llmd_oci_cpu.py (1 hunks)
- tests/model_serving/model_server/llmd/test_llmd_s3.py (1 hunks)
- tests/model_serving/model_server/llmd/utils.py (1 hunks)
- utilities/llmd_constants.py (1 hunks)
- utilities/llmd_utils.py (1 hunks)
- utilities/manifests/opt125m_cpu.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-05T10:05:17.642Z
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Applied to files:
tests/model_serving/model_server/llmd/conftest.py
🧬 Code graph analysis (5)
tests/model_serving/model_server/llmd/conftest.py (4)
utilities/constants.py (1)
- Timeout (213-224)

utilities/infra.py (1)
- s3_endpoint_secret (282-348)

utilities/llmd_utils.py (2)
- create_gateway (33-114)
- create_llmisvc (118-262)

tests/conftest.py (7)
- admin_client (68-69)
- unprivileged_model_namespace (459-466)
- aws_access_key_id (124-131)
- aws_secret_access_key (135-142)
- models_s3_bucket_name (243-250)
- models_s3_bucket_region (254-261)
- models_s3_bucket_endpoint (265-272)
utilities/llmd_constants.py (1)
utilities/constants.py (1)
- Timeout (213-224)
tests/model_serving/model_server/llmd/test_llmd_oci_cpu.py (5)
tests/model_serving/model_server/llmd/utils.py (2)
- verify_llm_service_status (40-61)
- verify_gateway_status (16-37)

utilities/constants.py (1)
- Protocols (92-99)

utilities/llmd_utils.py (1)
- verify_inference_response_llmd (305-366)

tests/model_serving/model_server/llmd/conftest.py (2)
- llmd_gateway (65-84)
- llmd_inference_service (88-124)

tests/model_serving/conftest.py (1)
- protocol (26-36)
tests/model_serving/model_server/llmd/test_llmd_s3.py (4)
tests/model_serving/model_server/llmd/utils.py (2)
- verify_llm_service_status (40-61)
- verify_gateway_status (16-37)

utilities/constants.py (1)
- Protocols (92-99)

utilities/llmd_utils.py (1)
- verify_inference_response_llmd (305-366)

tests/model_serving/model_server/llmd/conftest.py (2)
- llmd_gateway (65-84)
- llmd_inference_service_s3 (128-175)
utilities/llmd_utils.py (4)
utilities/certificates_utils.py (1)
- get_ca_bundle (69-104)

utilities/constants.py (2)
- HTTPRequest (112-115)
- Timeout (213-224)

utilities/exceptions.py (1)
- InferenceResponseError (24-25)

utilities/infra.py (1)
- get_services_by_isvc_label (504-533)
🪛 Ruff (0.12.2)
tests/model_serving/model_server/llmd/conftest.py
24-24: Unused function argument: admin_client
(ARG001)
utilities/llmd_utils.py
83-83: Do not catch blind exception: Exception
(BLE001)
189-189: Avoid specifying long messages outside the exception class
(TRY003)
195-195: Avoid specifying long messages outside the exception class
(TRY003)
206-206: Avoid specifying long messages outside the exception class
(TRY003)
298-298: Do not catch blind exception: Exception
(BLE001)
397-397: Avoid specifying long messages outside the exception class
(TRY003)
399-399: Avoid specifying long messages outside the exception class
(TRY003)
419-419: Avoid specifying long messages outside the exception class
(TRY003)
424-424: Avoid specifying long messages outside the exception class
(TRY003)
429-429: Avoid specifying long messages outside the exception class
(TRY003)
488-488: Do not catch blind exception: Exception
(BLE001)
515-515: Avoid specifying long messages outside the exception class
(TRY003)
536-536: Unused function argument: inference
(ARG001)
554-554: Avoid specifying long messages outside the exception class
(TRY003)
575-575: Avoid specifying long messages outside the exception class
(TRY003)
580-580: Avoid specifying long messages outside the exception class
(TRY003)
587-587: Avoid specifying long messages outside the exception class
(TRY003)
628-631: Avoid specifying long messages outside the exception class
(TRY003)
633-633: Avoid specifying long messages outside the exception class
(TRY003)
639-639: Avoid specifying long messages outside the exception class
(TRY003)
642-642: Avoid specifying long messages outside the exception class
(TRY003)
645-645: Avoid specifying long messages outside the exception class
(TRY003)
649-649: Avoid specifying long messages outside the exception class
(TRY003)
653-653: Avoid specifying long messages outside the exception class
(TRY003)
656-656: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (7)
tests/model_serving/model_server/llmd/__init__.py (1)
1-8: LGTM – clear package docstring.
No issues.

tests/model_serving/model_server/llmd/test_llmd_s3.py (2)
23-34: Test flow looks good.
Readiness checks before inference and HTTP path selection via Protocols are sound.
14-18: create_ns supports an explicit name — resolved.
utilities/infra.py defines create_ns(..., name: str | None = None), so the unprivileged_model_namespace name parameter is effective and not a no-op.tests/model_serving/model_server/llmd/test_llmd_oci_cpu.py (1)
15-34: LGTM – OCI CPU E2E structure is solid.
Good gating on readiness and reuse of common inference config.utilities/llmd_constants.py (1)
25-28: Double‑check default CPU/memory footprints.

Requests of 100m CPU with 8Gi memory and a 10Gi limit may be unschedulable on small CI clusters. Confirm these are intentional defaults for “CPU tests.”
utilities/llmd_utils.py (2)
636-659: Good, reusable response shape validator.

The checks align with OpenAI chat completions format and provide actionable errors.
180-189: Fix storage_key usage and parameterize S3 bucket
- storage_key is accepted but unused; URI is hard-coded to s3://ods-ci-wisdom/{storage_path} — makes the helper non‑portable. (utilities/llmd_utils.py:180-189)
- Either wire storage_key into the model spec (per the LLMInferenceService CRD) and make the bucket configurable, or remove storage_key from the signature/docs.
Will patch after you confirm the expected CRD fields (bucket/uri field and whether credentials come from a secret name or a service account).
Signed-off-by: Milind Waykole <mwaykole@redhat.com>
Actionable comments posted: 2
♻️ Duplicate comments (5)
utilities/llmd_utils.py (5)
469-475: Curl header quoting bug splits args; keep header as‑is.

```diff
-    header = HTTPRequest.CONTENT_JSON.replace("-H ", "").replace("'", "")
+    header = HTTPRequest.CONTENT_JSON
     cmd_exec = "curl -i -s"
-    cmd = f"{cmd_exec} -X POST -d '{body}' -H {header} -H 'Accept: application/json'"
+    cmd = f"{cmd_exec} -X POST -d '{body}' {header} -H 'Accept: application/json'"
```
482-489: Wrong deployment_mode forces --insecure; use “raw-deployment”.

```diff
-    ca_bundle = get_ca_bundle(client=client, deployment_mode="raw")
+    ca_bundle = get_ca_bundle(client=client, deployment_mode="raw-deployment")
```

Also replace the blind `except Exception` with a narrower exception if possible.
512-516: Do not leak Bearer tokens in errors; redact before raising.

```diff
-    if res:
-        return out
-    raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {cmd}")
+    if res:
+        return out
+    redacted_cmd = re.sub(r'(-H\s+["\']Authorization:\s*Bearer\s+)\S+(["\'])', r'\1<redacted>\2', cmd)
+    raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {redacted_cmd}")
```
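The redaction regex above can be exercised in isolation; a minimal sketch (the command string is fabricated for illustration):

```python
import re

def redact_bearer(cmd: str) -> str:
    """Replace a Bearer token inside a curl Authorization header with <redacted>."""
    return re.sub(
        r'(-H\s+["\']Authorization:\s*Bearer\s+)\S+(["\'])',
        r"\1<redacted>\2",
        cmd,
    )

# Fabricated example command containing a secret token.
cmd = "curl -s -H 'Authorization: Bearer sha256~secrettoken' -H 'Accept: application/json' http://svc/v1/models"
```

The token never reaches the raised ValueError, while the rest of the command stays intact for debugging.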
526-534: Parse HTTP body and JSON; validators expect structured payload.

```diff
-    return {"output": out}
+    # Extract last HTTP body and parse JSON if possible
+    body = out.split("\r\n\r\n")[-1]
+    try:
+        payload = json.loads(body)
+    except json.JSONDecodeError:
+        payload = None
+    return {"output": out, "payload": payload}
```
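The header/body split can be verified without a cluster; a sketch with a fabricated `curl -i` response:

```python
import json
import re

def parse_curl_response(raw: str):
    """Return (payload_or_None, body) from `curl -i` output.

    The last blank line separates the final header block from the body
    (curl may emit several header blocks, e.g. after redirects or 100-continue).
    """
    parts = re.split(r"\r?\n\r?\n", raw)
    body = parts[-1] if parts else raw
    try:
        return json.loads(body), body
    except json.JSONDecodeError:
        return None, body

# Fabricated response for illustration.
raw = (
    "HTTP/1.1 200 OK\r\n"
    "content-type: application/json\r\n"
    "\r\n"
    '{"choices": [{"message": {"content": "ok"}}]}'
)
```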
566-633: Validate against parsed payload (OpenAI‑style), not raw text.

```diff
-def _validate_authorized_response(
+def _validate_authorized_response(
     res: Dict[str, Any],
     inference: LLMUserInference,
     inference_config: Dict[str, Any],
     inference_type: str,
     expected_response_text: Optional[str],
     use_default_query: bool,
     model_name: str,
 ) -> None:
-    """Validate response for authorized users."""
+    """Validate response for authorized users (uses parsed JSON payload when available)."""
@@
-    if inference.inference_response_text_key_name:
-        if inference_type == inference.STREAMING:
-            if output := re.findall(
-                rf"{inference.inference_response_text_key_name}\": \"(.*)\"",
-                res[inference.inference_response_key_name],
-                re.MULTILINE,
-            ):
-                assert "".join(output) == expected_response_text, (
-                    f"Expected: {expected_response_text} does not match response: {output}"
-                )
-        elif inference_type == inference.INFER or use_regex:
-            formatted_res = json.dumps(res[inference.inference_response_text_key_name]).replace(" ", "")
-            if use_regex and expected_response_text is not None:
-                assert re.search(expected_response_text, formatted_res), (
-                    f"Expected: {expected_response_text} not found in: {formatted_res}"
-                )
-            else:
-                formatted_res = json.dumps(res[inference.inference_response_key_name]).replace(" ", "")
-                assert formatted_res == expected_response_text, (
-                    f"Expected: {expected_response_text} does not match output: {formatted_res}"
-                )
-        else:
-            response = res[inference.inference_response_key_name]
-            if isinstance(response, list):
-                response = response[0]
-
-            if isinstance(response, dict):
-                response_text = response[inference.inference_response_text_key_name]
-                assert response_text == expected_response_text, (
-                    f"Expected: {expected_response_text} does not match response: {response_text}"
-                )
-            else:
-                raise InferenceResponseError(
-                    "Inference response output does not match expected output format."
-                    f"Expected: {expected_response_text}.\nResponse: {res}"
-                )
-    else:
-        raise InferenceResponseError(f"Inference response output not found in response. Response: {res}")
+    payload = res.get("payload")
+    if not isinstance(payload, dict):
+        raise InferenceResponseError("Failed to parse JSON payload from response.")
+    validate_llm_response_format(payload)
+    response_text = payload["choices"][0]["message"]["content"]
+    if use_regex:
+        assert re.search(str(expected_response_text), response_text), (
+            f"Expected pattern not found. Pattern: {expected_response_text}"
+        )
+    else:
+        assert response_text == expected_response_text, (
+            f"Expected: {expected_response_text} does not match: {response_text}"
+        )
```
🧹 Nitpick comments (6)
tests/model_serving/model_server/llmd/conftest.py (4)
24-25: Unused fixture arg: rename to underscore to silence Ruff.

admin_client isn’t used here. Rename to _admin_client (or drop it) to avoid ARG001.

```diff
-@pytest.fixture(scope="class")
-def gateway_namespace(admin_client: DynamicClient) -> str:
+@pytest.fixture(scope="class")
+def gateway_namespace(_admin_client: DynamicClient) -> str:
     return DEFAULT_GATEWAY_NAMESPACE
```
70-76: Guard access to request.param (fixture may be unparametrized).

Accessing request.param without a default raises when the fixture isn’t parametrized.

```diff
-    if isinstance(request.param, str):
-        gateway_class_name = request.param
+    param = getattr(request, "param", None)
+    if isinstance(param, str):
+        gateway_class_name = param
         kwargs = {}
     else:
-        gateway_class_name = request.param.get("gateway_class_name", "openshift-default")
-        kwargs = {k: v for k, v in request.param.items() if k != "gateway_class_name"}
+        param = param or {}
+        gateway_class_name = param.get("gateway_class_name", "openshift-default")
+        kwargs = {k: v for k, v in param.items() if k != "gateway_class_name"}
```
93-101: Same: don’t assume parametrization in llmd_inference_service.

```diff
-    if isinstance(request.param, str):
-        name_suffix = request.param
+    param = getattr(request, "param", None)
+    if isinstance(param, str):
+        name_suffix = param
         kwargs = {}
     else:
-        name_suffix = request.param.get("name_suffix", "basic")
-        kwargs = {k: v for k, v in request.param.items() if k != "name_suffix"}
+        param = param or {}
+        name_suffix = param.get("name_suffix", "basic")
+        kwargs = {k: v for k, v in param.items() if k != "name_suffix"}
```
126-140: Same: guard request.param in the S3 fixture.

```diff
-    if isinstance(request.param, str):
-        name_suffix = request.param
+    param = getattr(request, "param", None)
+    if isinstance(param, str):
+        name_suffix = param
         kwargs = {"storage_path": DEFAULT_S3_STORAGE_PATH}
     else:
-        name_suffix = request.param.get("name_suffix", "s3")
-        kwargs = {k: v for k, v in request.param.items() if k != "name_suffix"}
+        param = param or {}
+        name_suffix = param.get("name_suffix", "s3")
+        kwargs = {k: v for k, v in param.items() if k != "name_suffix"}
```

utilities/llmd_utils.py (2)
536-547: Unused arg in unauthorized validator; rename to underscore.

```diff
-def _validate_unauthorized_response(res: Dict[str, Any], token: Optional[str], inference: LLMUserInference) -> None:
+def _validate_unauthorized_response(res: Dict[str, Any], token: Optional[str], _inference: LLMUserInference) -> None:
```

Also consider short‑circuiting on HTTP 401/403 status in headers if present.
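Short-circuiting on the status line could be as simple as this sketch (status-line parsing only; sample headers fabricated):

```python
import re

def http_status(raw: str):
    """Status code from the first HTTP status line of `curl -i` output, or None."""
    # Matches both HTTP/1.1 and HTTP/2 status lines.
    m = re.match(r"HTTP/[\d.]+\s+(\d{3})", raw)
    return int(m.group(1)) if m else None
```

The unauthorized validator could then assert `http_status(res["output"]) in (401, 403)` before inspecting the body.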
73-85: Broad exception catches; narrow where feasible.

Catching Exception hides actionable failures. Prefer specific exceptions from ocp_resources/requests or re‑raise after logging.

Also applies to: 298-300, 488-489
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- pytest.ini (1 hunks)
- tests/model_serving/model_server/llmd/conftest.py (1 hunks)
- utilities/llmd_utils.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- pytest.ini
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-05T10:05:17.642Z
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Applied to files:
tests/model_serving/model_server/llmd/conftest.py
🧬 Code graph analysis (2)
tests/model_serving/model_server/llmd/conftest.py (4)
utilities/constants.py (1)
- Timeout (213-224)

utilities/infra.py (1)
- s3_endpoint_secret (282-348)

utilities/llmd_utils.py (2)
- create_gateway (33-114)
- create_llmisvc (118-262)

tests/conftest.py (7)
- admin_client (68-69)
- unprivileged_model_namespace (459-466)
- aws_access_key_id (124-131)
- aws_secret_access_key (135-142)
- models_s3_bucket_name (243-250)
- models_s3_bucket_region (254-261)
- models_s3_bucket_endpoint (265-272)
utilities/llmd_utils.py (4)
utilities/certificates_utils.py (1)
- get_ca_bundle (69-104)

utilities/constants.py (2)
- HTTPRequest (112-115)
- Timeout (213-224)

utilities/exceptions.py (1)
- InferenceResponseError (24-25)

utilities/infra.py (1)
- get_services_by_isvc_label (504-533)
🪛 Ruff (0.12.2)
tests/model_serving/model_server/llmd/conftest.py
24-24: Unused function argument: admin_client
(ARG001)
utilities/llmd_utils.py
83-83: Do not catch blind exception: Exception
(BLE001)
189-189: Avoid specifying long messages outside the exception class
(TRY003)
195-195: Avoid specifying long messages outside the exception class
(TRY003)
206-206: Avoid specifying long messages outside the exception class
(TRY003)
298-298: Do not catch blind exception: Exception
(BLE001)
397-397: Avoid specifying long messages outside the exception class
(TRY003)
399-399: Avoid specifying long messages outside the exception class
(TRY003)
419-419: Avoid specifying long messages outside the exception class
(TRY003)
424-424: Avoid specifying long messages outside the exception class
(TRY003)
429-429: Avoid specifying long messages outside the exception class
(TRY003)
488-488: Do not catch blind exception: Exception
(BLE001)
515-515: Avoid specifying long messages outside the exception class
(TRY003)
536-536: Unused function argument: inference
(ARG001)
554-554: Avoid specifying long messages outside the exception class
(TRY003)
575-575: Avoid specifying long messages outside the exception class
(TRY003)
580-580: Avoid specifying long messages outside the exception class
(TRY003)
587-587: Avoid specifying long messages outside the exception class
(TRY003)
627-630: Avoid specifying long messages outside the exception class
(TRY003)
632-632: Avoid specifying long messages outside the exception class
(TRY003)
638-638: Avoid specifying long messages outside the exception class
(TRY003)
641-641: Avoid specifying long messages outside the exception class
(TRY003)
644-644: Avoid specifying long messages outside the exception class
(TRY003)
648-648: Avoid specifying long messages outside the exception class
(TRY003)
652-652: Avoid specifying long messages outside the exception class
(TRY003)
655-655: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (2)
tests/model_serving/model_server/llmd/conftest.py (2)
30-47: Incorrect — no variable mismatch in models_s3_bucket_endpoint fixture.
The fixture in tests/conftest.py (def models_s3_bucket_endpoint, lines ~265–272) assigns models_bucket_endpoint = pytestconfig.option.models_s3_bucket_endpoint and returns it; no undefined name is referenced and the S3 fixture should resolve correctly.

Likely an incorrect or invalid review comment.
55-61: Confirm ServiceAccount "secrets" usage is intentional.

Attaching the S3 Secret to the SA here (tests/model_serving/model_server/llmd/conftest.py:55–61) is uncommon—controllers normally read annotated S3 secrets. The repo already passes secrets to SAs in several fixtures (e.g. tests/model_serving/model_server/upgrade/conftest.py, tests/model_serving/model_server/storage/minio/conftest.py). Confirm LLMD pods actually require the secret mounted on the ServiceAccount; if not, remove the secrets=[...] to avoid unnecessary access.
Actionable comments posted: 3
🧹 Nitpick comments (3)
tests/model_serving/model_server/llmd/conftest.py (3)
24-26: Fix unused fixture arg (Ruff ARG001).

admin_client isn’t used. Rename to _admin_client (keeps dependency ordering) or drop it.

```diff
 @pytest.fixture(scope="class")
-def gateway_namespace(admin_client: DynamicClient) -> str:
+def gateway_namespace(_admin_client: DynamicClient) -> str:
     return DEFAULT_GATEWAY_NAMESPACE
```
55-61: ServiceAccount linkage: verify necessity of `secrets` field.

Mounting S3 creds is already driven by `storage_key` on the service. The SA `secrets` entry may be redundant. Keep it if your controller requires it; otherwise, consider dropping it to reduce coupling.
149-155: Configurable resource defaults.

10Gi/8Gi for CPU tests is hefty. Consider sourcing from a constant or CLI option to ease running in constrained clusters.
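One way to make the defaults overridable, sketched with hypothetical env-variable names (a pytest CLI option would work the same way):

```python
import os

def default_container_resources() -> dict:
    """Resource defaults for the CPU tests, overridable via env vars.

    LLMD_CPU_REQUEST / LLMD_MEM_REQUEST / LLMD_MEM_LIMIT are hypothetical names,
    not existing options in this repo.
    """
    return {
        "requests": {
            "cpu": os.environ.get("LLMD_CPU_REQUEST", "100m"),
            "memory": os.environ.get("LLMD_MEM_REQUEST", "8Gi"),
        },
        "limits": {
            "memory": os.environ.get("LLMD_MEM_LIMIT", "10Gi"),
        },
    }
```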
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- tests/model_serving/model_server/llmd/conftest.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-05T10:05:17.642Z
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
Applied to files:
tests/model_serving/model_server/llmd/conftest.py
🧬 Code graph analysis (1)
tests/model_serving/model_server/llmd/conftest.py (4)
utilities/constants.py (1)
- Timeout (213-224)

utilities/infra.py (1)
- s3_endpoint_secret (282-348)

utilities/llmd_utils.py (2)
- create_gateway (33-114)
- create_llmisvc (118-262)

tests/conftest.py (7)
- admin_client (68-69)
- unprivileged_model_namespace (459-466)
- aws_access_key_id (124-131)
- aws_secret_access_key (135-142)
- models_s3_bucket_name (243-250)
- models_s3_bucket_region (254-261)
- models_s3_bucket_endpoint (265-272)
🪛 Ruff (0.12.2)
tests/model_serving/model_server/llmd/conftest.py
24-24: Unused function argument: admin_client
(ARG001)
🔇 Additional comments (1)
tests/model_serving/model_server/llmd/conftest.py (1)
102-104: Nice ordering control.

Forcing `llmd_gateway` via `request.getfixturevalue` is a clean way to ensure the gateway exists before the service.
Actionable comments posted: 0
♻️ Duplicate comments (8)
utilities/llmd_utils.py (8)
526-534: Parse JSON body from curl output and return structured payload.

Validators need a parsed payload; extract the last HTTP body and json.loads it.
Apply:
```diff
-    out = self.run_inference(
+    out = self.run_inference(
         model_name=model_name,
         inference_input=inference_input,
         use_default_query=use_default_query,
         insecure=insecure,
         token=token,
     )
-    return {"output": out}
+    # Extract last HTTP body (curl -i includes headers)
+    parts = re.split(r"\r?\n\r?\n", out)
+    body = parts[-1] if parts else out
+    try:
+        payload = json.loads(body)
+    except json.JSONDecodeError:
+        payload = None
+    return {"output": out, "payload": payload}
```
566-633: Authorized response validation should use parsed JSON payload.

Avoid indexing `res` with dynamic keys; validate against `res["payload"]` and support regex/exact match.

Apply:

```diff
 def _validate_authorized_response(
     res: Dict[str, Any],
     inference: LLMUserInference,
     inference_config: Dict[str, Any],
     inference_type: str,
     expected_response_text: Optional[str],
     use_default_query: bool,
     model_name: str,
 ) -> None:
-    """Validate response for authorized users."""
-
-    use_regex = False
+    """Validate response for authorized users."""
+    payload = res.get("payload")
+    if not isinstance(payload, dict):
+        raise InferenceResponseError("Failed to parse JSON payload from response.")
+
+    # Resolve expected text
+    use_regex = False
     if use_default_query:
         expected_response_text_config = inference_config.get("default_query_model", {})
         use_regex = expected_response_text_config.get("use_regex", False)
         if not expected_response_text_config:
             raise ValueError(f"Missing default_query_model config for inference {inference_config}")
         if inference_config.get("support_multi_default_queries"):
             query_config = expected_response_text_config.get(inference_type)
             if not query_config:
                 raise ValueError(f"Missing default_query_model config for inference type {inference_type}")
             expected_response_text = query_config.get("query_output", "")
             use_regex = query_config.get("use_regex", False)
         else:
             expected_response_text = expected_response_text_config.get("query_output")
             if not expected_response_text:
                 raise ValueError(f"Missing response text key for inference {inference_config}")
     if isinstance(expected_response_text, str):
-        expected_response_text = Template(template=expected_response_text).safe_substitute(model_name=model_name)
+        expected_response_text = Template(expected_response_text).safe_substitute(model_name=model_name)
     elif isinstance(expected_response_text, dict):
         response_output = expected_response_text.get("response_output")
         if response_output is not None:
-            expected_response_text = Template(template=response_output).safe_substitute(model_name=model_name)
-    if inference.inference_response_text_key_name:
-        if inference_type == inference.STREAMING:
-            if output := re.findall(
-                rf"{inference.inference_response_text_key_name}\": \"(.*)\"",
-                res[inference.inference_response_key_name],
-                re.MULTILINE,
-            ):
-                assert "".join(output) == expected_response_text, (
-                    f"Expected: {expected_response_text} does not match response: {output}"
-                )
-        elif inference_type == inference.INFER or use_regex:
-            formatted_res = json.dumps(res[inference.inference_response_text_key_name]).replace(" ", "")
-            if use_regex and expected_response_text is not None:
-                assert re.search(expected_response_text, formatted_res), (
-                    f"Expected: {expected_response_text} not found in: {formatted_res}"
-                )
-            else:
-                formatted_res = json.dumps(res[inference.inference_response_key_name]).replace(" ", "")
-                assert formatted_res == expected_response_text, (
-                    f"Expected: {expected_response_text} does not match output: {formatted_res}"
-                )
-        else:
-            response = res[inference.inference_response_key_name]
-            if isinstance(response, list):
-                response = response[0]
-
-            if isinstance(response, dict):
-                response_text = response[inference.inference_response_text_key_name]
-                assert response_text == expected_response_text, (
-                    f"Expected: {expected_response_text} does not match response: {response_text}"
-                )
-            else:
-                raise InferenceResponseError(
-                    "Inference response output does not match expected output format."
-                    f"Expected: {expected_response_text}.\nResponse: {res}"
-                )
-    else:
-        raise InferenceResponseError(f"Inference response output not found in response. Response: {res}")
+            expected_response_text = Template(response_output).safe_substitute(model_name=model_name)
+
+    # Extract text from OpenAI-style responses
+    response_text = None
+    if isinstance(payload.get("choices"), list) and payload["choices"]:
+        choice0 = payload["choices"][0]
+        if isinstance(choice0, dict):
+            if isinstance(choice0.get("message"), dict):
+                response_text = choice0["message"].get("content")
+            elif "text" in choice0:
+                response_text = choice0.get("text")
+
+    if response_text is None:
+        raise InferenceResponseError(f"Inference response output not found in payload. Payload: {payload}")
+
+    if use_regex and expected_response_text is not None:
+        assert re.search(str(expected_response_text), str(response_text)), (
+            f"Expected pattern not found. Pattern: {expected_response_text}"
+        )
+    elif expected_response_text is not None:
+        assert str(response_text) == str(expected_response_text), (
+            f"Expected: {expected_response_text} does not match: {response_text}"
+        )
```
180-189: Hard-coded S3 bucket blocks non-default envs. Build the URI from inputs.
Allow fully-qualified s3:// URIs and avoid forcing ods-ci-wisdom.
Apply:
- if storage_key and storage_path: - model_config = { - "uri": f"s3://ods-ci-wisdom/{storage_path}", - } + if storage_key and storage_path: + # Accept full s3:// URI or construct from "bucket/prefix" + uri = storage_path if str(storage_path).startswith("s3://") else f"s3://{storage_path}" + model_config = {"uri": uri}Optional: include the secret ref if supported by the CRD (e.g., model_config["storageKey"] = storage_key).
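The bucket fix above reduces to a tiny helper; a minimal sketch (the name build_s3_uri is illustrative, not from the PR):

```python
def build_s3_uri(storage_path: str) -> str:
    # Accept either a fully qualified s3:// URI or a "bucket/prefix"
    # string, instead of forcing a hard-coded bucket name.
    path = str(storage_path)
    return path if path.startswith("s3://") else f"s3://{path}"
```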
435-437: Template() is passed its template string by keyword.
Prefer the positional form; the keyword call only works because CPython happens to name the parameter template.
Apply:
- template = Template(template=template_str) + template = Template(template_str)
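For reference, string.Template substitution with the positional constructor behaves like this (the template string below is fabricated for illustration):

```python
from string import Template

# Pass the template string positionally; safe_substitute fills known
# placeholders and leaves unknown ones untouched instead of raising.
body = Template('{"model": "$model_name", "prompt": "$prompt"}').safe_substitute(
    model_name="opt125m"
)
```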
469-475: Curl header/body quoting bug + split args; payload injection risk.
- Do not strip quotes from the header; keep HTTPRequest.CONTENT_JSON as-is.
- Quote the JSON body via shlex.quote to survive single quotes in content.
- Add --fail-with-body to fail on non‑2xx while preserving body.
Apply:
- header = HTTPRequest.CONTENT_JSON.replace("-H ", "").replace("'", "") - cmd_exec = "curl -i -s" - cmd = f"{cmd_exec} -X POST -d '{body}' -H {header} -H 'Accept: application/json'" + header = HTTPRequest.CONTENT_JSON + cmd_exec = "curl -i -s --fail-with-body" + quoted_body = shlex.quote(body) + cmd = f"{cmd_exec} -X POST -d {quoted_body} {header} -H 'Accept: application/json'"
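The quoting suggestion boils down to letting shlex.quote wrap the body; a standalone sketch (the payload and flags are placeholders, not the PR's exact command):

```python
import json
import shlex

payload = {"prompt": "what's an LLM?"}  # note the embedded single quote
body = shlex.quote(json.dumps(payload))
cmd = f"curl -i -s --fail-with-body -X POST -d {body} -H 'Accept: application/json'"

# shlex.split round-trips the command, showing the body survives intact.
tokens = shlex.split(cmd)
```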
482-489: Wrong deployment_mode passed to get_ca_bundle() forces insecure mode.
Use “raw-deployment”; also avoid a blind except and log the reason.
Apply:
- ca_bundle = get_ca_bundle(client=client, deployment_mode="raw") + ca_bundle = get_ca_bundle(client=client, deployment_mode="raw-deployment") if ca_bundle: cmd += f" --cacert {ca_bundle}" else: cmd += " --insecure" - except Exception: - cmd += " --insecure" + except Exception as e: + LOGGER.warning(f"Falling back to --insecure (CA bundle unavailable): {e}") + cmd += " --insecure"
512-515: Bearer token leakage in the error path; redact before raising.
Sanitize the Authorization header in the command string.
Apply:
- if res: - return out - raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {cmd}") + if res: + return out + redacted_cmd = re.sub(r"(Authorization:\s*Bearer\s+)\S+", r"\1<redacted>", cmd) + raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {redacted_cmd}")
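The redaction pattern can be exercised in isolation; a sketch using the same regex the comment proposes:

```python
import re

def redact_bearer(cmd: str) -> str:
    # Keep the "Authorization: Bearer " prefix; replace the token itself.
    # Note: \S+ also consumes an adjacent closing quote, so the
    # quote-aware variant in a later review round is stricter.
    return re.sub(r"(Authorization:\s*Bearer\s+)\S+", r"\1<redacted>", cmd)
```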
589-595: More Template() keyword-argument calls.
Both calls pass template= as a keyword; switch to positional.
Apply:
- expected_response_text = Template(template=expected_response_text).safe_substitute(model_name=model_name) + expected_response_text = Template(expected_response_text).safe_substitute(model_name=model_name) ... - expected_response_text = Template(template=response_output).safe_substitute(model_name=model_name) + expected_response_text = Template(response_output).safe_substitute(model_name=model_name)
🧹 Nitpick comments (3)
utilities/llmd_utils.py (3)
71-85: Don't swallow broad exceptions when cleaning existing Gateway.
Catching Exception hides real failures (permissions, API errors). Catch the specific not-found exception (or check .exists via the API) and log others.
Apply:
- try: - existing_gateway = Gateway( + try: + existing_gateway = Gateway( client=client, name=name, namespace=namespace, api_group="gateway.networking.k8s.io", ) if existing_gateway.exists: LOGGER.info(f"Cleaning up existing Gateway {name} in namespace {namespace}") existing_gateway.delete(wait=True, timeout=Timeout.TIMEOUT_2MIN) - except Exception as e: + except Exception as e: LOGGER.debug(f"No existing Gateway to clean up: {e}")Consider narrowing the except to the specific “not found”/lookup error raised by ocp-resources.
295-302: Catching broad Exception hides discovery failures.Scope to the expected not-found error from get_services_by_isvc_label; log others at warning/error.
Apply:
- except Exception as e: + except Exception as e: LOGGER.warning(f"Could not get service for LLMInferenceService {llm_service.name}: {e}")Consider catching the specific ResourceNotFoundError raised by get_services_by_isvc_label.
536-555: Unused parameter and brittle auth checks.
- inference arg unused; prefix with _ or remove.
- Consider case-insensitive header match and structured checks once headers parsing is added.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
utilities/llmd_utils.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
utilities/llmd_utils.py (4)
utilities/certificates_utils.py (1)
get_ca_bundle(69-104)utilities/constants.py (2)
HTTPRequest(112-115)Timeout(213-224)utilities/exceptions.py (1)
InferenceResponseError(24-25)utilities/infra.py (1)
get_services_by_isvc_label(504-533)
🪛 Ruff (0.12.2)
utilities/llmd_utils.py
83-83: Do not catch blind exception: Exception
(BLE001)
189-189: Avoid specifying long messages outside the exception class
(TRY003)
195-195: Avoid specifying long messages outside the exception class
(TRY003)
206-206: Avoid specifying long messages outside the exception class
(TRY003)
298-298: Do not catch blind exception: Exception
(BLE001)
397-397: Avoid specifying long messages outside the exception class
(TRY003)
399-399: Avoid specifying long messages outside the exception class
(TRY003)
419-419: Avoid specifying long messages outside the exception class
(TRY003)
424-424: Avoid specifying long messages outside the exception class
(TRY003)
429-429: Avoid specifying long messages outside the exception class
(TRY003)
488-488: Do not catch blind exception: Exception
(BLE001)
515-515: Avoid specifying long messages outside the exception class
(TRY003)
536-536: Unused function argument: inference
(ARG001)
554-554: Avoid specifying long messages outside the exception class
(TRY003)
575-575: Avoid specifying long messages outside the exception class
(TRY003)
580-580: Avoid specifying long messages outside the exception class
(TRY003)
587-587: Avoid specifying long messages outside the exception class
(TRY003)
627-630: Avoid specifying long messages outside the exception class
(TRY003)
632-632: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
utilities/llmd_utils.py (1)
239-250: Confirm the LLMInferenceService ctor accepts label vs labels.
LLMInferenceService is imported from ocp_resources (not defined in this repo); the call currently passes label=labels (utilities/llmd_utils.py:239-250). Verify that the ocp_resources.LLMInferenceService init accepts label; if it expects labels, change the call to labels=labels to avoid a silent no-op.
pierDipi
left a comment
I don't get why we're adding CPU tests here while we could augment the existing tests that run on CI and will catch issues earlier in https://github.com/opendatahub-io/kserve/blob/release-v0.15/test/e2e/llmisvc/test_llm_inference_service.py
Signed-off-by: Milind Waykole <mwaykole@redhat.com>
for more information, see https://pre-commit.ci
Signed-off-by: Milind Waykole <mwaykole@redhat.com>
Signed-off-by: Milind Waykole <mwaykole@redhat.com>
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
utilities/llmd_utils.py (1)
579-655: Authorized response validation should use the parsed payload (OpenAI format).
Current logic indexes res with dynamic keys and will raise KeyError; use the payload.
def _validate_authorized_response( res: Dict[str, Any], inference: LLMUserInference, inference_config: Dict[str, Any], inference_type: str, expected_response_text: Optional[str], use_default_query: bool, model_name: str, ) -> None: - """Validate response for authorized users.""" - - use_regex = False - - if use_default_query: - expected_response_text_config = inference_config.get("default_query_model", {}) - use_regex = expected_response_text_config.get("use_regex", False) - - if not expected_response_text_config: - raise ValueError(f"Missing default_query_model config for inference {inference_config}") - - if inference_config.get("support_multi_default_queries"): - query_config = expected_response_text_config.get(inference_type) - if not query_config: - raise ValueError(f"Missing default_query_model config for inference type {inference_type}") - expected_response_text = query_config.get("query_output", "") - use_regex = query_config.get("use_regex", False) - else: - expected_response_text = expected_response_text_config.get("query_output") - - if not expected_response_text: - raise ValueError(f"Missing response text key for inference {inference_config}") - - if isinstance(expected_response_text, str): - expected_response_text = Template(template=expected_response_text).safe_substitute(model_name=model_name) - elif isinstance(expected_response_text, dict): - response_output = expected_response_text.get("response_output") - if response_output is not None: - expected_response_text = Template(template=response_output).safe_substitute(model_name=model_name) - if inference.inference_response_text_key_name: - if inference_type == inference.STREAMING: - if output := re.findall( - rf"{inference.inference_response_text_key_name}\": \"(.*)\"", - res[inference.inference_response_key_name], - re.MULTILINE, - ): - assert "".join(output) == expected_response_text, ( - f"Expected: {expected_response_text} does not match response: {output}" - ) - elif inference_type == 
inference.INFER or use_regex: - formatted_res = json.dumps(res[inference.inference_response_text_key_name]).replace(" ", "") - if use_regex and expected_response_text is not None: - assert re.search(expected_response_text, formatted_res), ( - f"Expected: {expected_response_text} not found in: {formatted_res}" - ) - else: - formatted_res = json.dumps(res[inference.inference_response_key_name]).replace(" ", "") - assert formatted_res == expected_response_text, ( - f"Expected: {expected_response_text} does not match output: {formatted_res}" - ) - else: - response = res[inference.inference_response_key_name] - if isinstance(response, list): - response = response[0] - - if isinstance(response, dict): - response_text = response[inference.inference_response_text_key_name] - assert response_text == expected_response_text, ( - f"Expected: {expected_response_text} does not match response: {response_text}" - ) - else: - raise InferenceResponseError( - "Inference response output does not match expected output format." - f"Expected: {expected_response_text}.\nResponse: {res}" - ) - else: - raise InferenceResponseError(f"Inference response output not found in response. 
Response: {res}") + """Validate response for authorized users (OpenAI Chat Completions).""" + payload = res.get("payload") + if not isinstance(payload, dict): + raise InferenceResponseError("Failed to parse JSON payload from response.") + + # Resolve expected text + use_regex = False + if use_default_query: + expected_response_text_config = inference_config.get("default_query_model", {}) + use_regex = expected_response_text_config.get("use_regex", False) + if not expected_response_text_config: + raise ValueError("Missing default_query_model config") + if inference_config.get("support_multi_default_queries"): + query_config = expected_response_text_config.get(inference_type) or {} + expected_response_text = query_config.get("query_output", "") + use_regex = query_config.get("use_regex", use_regex) + else: + expected_response_text = expected_response_text_config.get("query_output") + if not expected_response_text: + raise ValueError("Missing response text key") + if isinstance(expected_response_text, str): + expected_response_text = Template(expected_response_text).safe_substitute(model_name=model_name) + elif isinstance(expected_response_text, dict): + response_output = expected_response_text.get("response_output", "") + expected_response_text = Template(response_output).safe_substitute(model_name=model_name) + + # Validate payload shape and extract content + validate_llm_response_format(payload) + response_text = payload["choices"][0]["message"]["content"] + + if use_regex and expected_response_text is not None: + assert re.search(str(expected_response_text), response_text), ( + f"Expected pattern not found. Pattern: {expected_response_text}" + ) + else: + assert response_text == expected_response_text, ( + f"Expected: {expected_response_text} does not match: {response_text}" + )
🧹 Nitpick comments (1)
tests/model_serving/model_server/llmd/conftest.py (1)
23-26: Remove unused parameter to fix Ruff ARG001.-@pytest.fixture(scope="class") -def gateway_namespace(admin_client: DynamicClient) -> str: +@pytest.fixture(scope="class") +def gateway_namespace() -> str: return DEFAULT_GATEWAY_NAMESPACE
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
tests/model_serving/model_server/llmd/__init__.py(1 hunks)tests/model_serving/model_server/llmd/conftest.py(1 hunks)tests/model_serving/model_server/llmd/test_llmd_oci_cpu.py(1 hunks)tests/model_serving/model_server/llmd/test_llmd_s3.py(1 hunks)tests/model_serving/model_server/llmd/utils.py(1 hunks)utilities/llmd_utils.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/model_serving/model_server/llmd/init.py
- tests/model_serving/model_server/llmd/test_llmd_oci_cpu.py
🧰 Additional context used
🧬 Code graph analysis (4)
tests/model_serving/model_server/llmd/test_llmd_s3.py (5)
tests/model_serving/model_server/llmd/utils.py (3)
verify_llm_service_status(44-65)verify_gateway_status(20-41)verify_llmd_pods_not_restarted(68-106)utilities/constants.py (1)
Protocols(92-99)utilities/llmd_utils.py (1)
verify_inference_response_llmd(327-388)tests/conftest.py (1)
admin_client(68-69)tests/model_serving/model_server/llmd/conftest.py (2)
llmd_gateway(65-94)llmd_inference_service_s3(137-184)
tests/model_serving/model_server/llmd/utils.py (1)
utilities/exceptions.py (1)
PodContainersRestartError(109-110)
tests/model_serving/model_server/llmd/conftest.py (4)
utilities/constants.py (1)
Timeout(213-224)utilities/infra.py (1)
s3_endpoint_secret(282-348)utilities/llmd_utils.py (2)
create_gateway(33-136)create_llmisvc(140-284)tests/conftest.py (7)
admin_client(68-69)unprivileged_model_namespace(459-466)aws_access_key_id(124-131)aws_secret_access_key(135-142)models_s3_bucket_name(243-250)models_s3_bucket_region(254-261)models_s3_bucket_endpoint(265-272)
utilities/llmd_utils.py (4)
utilities/certificates_utils.py (1)
get_ca_bundle(69-104)utilities/constants.py (2)
HTTPRequest(112-115)Timeout(213-224)utilities/exceptions.py (1)
InferenceResponseError(24-25)utilities/infra.py (1)
get_services_by_isvc_label(504-533)
🪛 Ruff (0.13.1)
tests/model_serving/model_server/llmd/conftest.py
24-24: Unused function argument: admin_client
(ARG001)
utilities/llmd_utils.py
102-102: Do not catch blind exception: Exception
(BLE001)
211-211: Avoid specifying long messages outside the exception class
(TRY003)
217-217: Avoid specifying long messages outside the exception class
(TRY003)
228-228: Avoid specifying long messages outside the exception class
(TRY003)
320-320: Do not catch blind exception: Exception
(BLE001)
419-419: Avoid specifying long messages outside the exception class
(TRY003)
421-421: Avoid specifying long messages outside the exception class
(TRY003)
441-441: Avoid specifying long messages outside the exception class
(TRY003)
446-446: Avoid specifying long messages outside the exception class
(TRY003)
451-451: Avoid specifying long messages outside the exception class
(TRY003)
510-510: Do not catch blind exception: Exception
(BLE001)
537-537: Avoid specifying long messages outside the exception class
(TRY003)
558-558: Unused function argument: inference
(ARG001)
576-576: Avoid specifying long messages outside the exception class
(TRY003)
597-597: Avoid specifying long messages outside the exception class
(TRY003)
602-602: Avoid specifying long messages outside the exception class
(TRY003)
609-609: Avoid specifying long messages outside the exception class
(TRY003)
649-652: Avoid specifying long messages outside the exception class
(TRY003)
654-654: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (10)
tests/model_serving/model_server/llmd/conftest.py (3)
78-93: Guard request.param and avoid duplicate kwargs when creating gateway.- if isinstance(request.param, str): - gateway_class_name = request.param - kwargs = {} - else: - gateway_class_name = request.param.get("gateway_class_name", "openshift-default") - kwargs = {k: v for k, v in request.param.items() if k != "gateway_class_name"} + param = getattr(request, "param", None) + if isinstance(param, str): + gateway_class_name = param + kwargs = {} + elif isinstance(param, dict): + gateway_class_name = param.get("gateway_class_name", "openshift-default") + kwargs = {k: v for k, v in param.items() if k != "gateway_class_name"} + else: + gateway_class_name = "openshift-default" + kwargs = {} + wait_for_condition = kwargs.pop("wait_for_condition", True) + timeout = kwargs.pop("timeout", Timeout.TIMEOUT_5MIN) @@ - wait_for_condition=True, - timeout=Timeout.TIMEOUT_5MIN, + wait_for_condition=wait_for_condition, + timeout=timeout, teardown=False, # Don't delete gateway in teardown **kwargs,
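The guard logic in the suggested fixture rewrite is easier to see as a pure function; a minimal sketch (resolve_gateway_param is a hypothetical helper mirroring the fixture, not code from the PR):

```python
from typing import Any, Dict, Tuple

def resolve_gateway_param(param: Any) -> Tuple[str, Dict[str, Any]]:
    # request.param may be absent (None), a bare string, or a dict
    # of overrides; anything else falls back to defaults.
    if isinstance(param, str):
        return param, {}
    if isinstance(param, dict):
        kwargs = {k: v for k, v in param.items() if k != "gateway_class_name"}
        return param.get("gateway_class_name", "openshift-default"), kwargs
    return "openshift-default", {}
```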
103-133: Same robustness for llmd_inference_service: guard param and pop duplicates.- if isinstance(request.param, str): - name_suffix = request.param - kwargs = {} - else: - name_suffix = request.param.get("name_suffix", "basic") - kwargs = {k: v for k, v in request.param.items() if k != "name_suffix"} + param = getattr(request, "param", None) + if isinstance(param, str): + name_suffix = param + kwargs = {} + elif isinstance(param, dict): + name_suffix = param.get("name_suffix", "basic") + kwargs = {k: v for k, v in param.items() if k != "name_suffix"} + else: + name_suffix = "basic" + kwargs = {} - service_name = kwargs.get("name", f"llm-{name_suffix}") + service_name = kwargs.pop("name", f"llm-{name_suffix}") @@ - container_resources = kwargs.get( + container_resources = kwargs.pop( "container_resources", { "limits": {"cpu": "1", "memory": "10Gi"}, "requests": {"cpu": "100m", "memory": "8Gi"}, }, ) + storage_uri = kwargs.pop("storage_uri", VLLM_STORAGE_OCI) + container_image = kwargs.pop("container_image", VLLM_CPU_IMAGE) + wait = kwargs.pop("wait", True) + timeout = kwargs.pop("timeout", Timeout.TIMEOUT_15MIN) @@ - storage_uri=kwargs.get("storage_uri", VLLM_STORAGE_OCI), - container_image=kwargs.get("container_image", VLLM_CPU_IMAGE), + storage_uri=storage_uri, + container_image=container_image, container_resources=container_resources, - wait=True, - timeout=Timeout.TIMEOUT_15MIN, - **{k: v for k, v in kwargs.items() if k != "name"}, + wait=wait, + timeout=timeout, + **kwargs,
136-184: S3 fixture: same guard/pop pattern; build the full S3 path with the bucket.
Prevents crashes and ensures a valid S3 URI is formed.
- if isinstance(request.param, str): - name_suffix = request.param - kwargs = {"storage_path": DEFAULT_S3_STORAGE_PATH} - else: - name_suffix = request.param.get("name_suffix", "s3") - kwargs = {k: v for k, v in request.param.items() if k != "name_suffix"} + param = getattr(request, "param", None) + if isinstance(param, str): + name_suffix = param + kwargs = {"storage_path": DEFAULT_S3_STORAGE_PATH} + elif isinstance(param, dict): + name_suffix = param.get("name_suffix", "s3") + kwargs = {k: v for k, v in param.items() if k != "name_suffix"} + else: + name_suffix = "s3" + kwargs = {"storage_path": DEFAULT_S3_STORAGE_PATH} - service_name = kwargs.get("name", f"llm-{name_suffix}") + service_name = kwargs.pop("name", f"llm-{name_suffix}") @@ - if "storage_key" not in kwargs: - kwargs["storage_key"] = llmd_s3_secret.name - - if "storage_path" not in kwargs: - kwargs["storage_path"] = DEFAULT_S3_STORAGE_PATH + storage_key = kwargs.pop("storage_key", llmd_s3_secret.name) + storage_path = kwargs.pop("storage_path", DEFAULT_S3_STORAGE_PATH) + # Build "bucket/prefix" if only a prefix was provided + from tests.conftest import models_s3_bucket_name as _bucket_name # type: ignore + try: + bucket = _bucket_name # fixture callable imported above + except Exception: + bucket = None + if not str(storage_path).startswith("s3://"): + if bucket: + storage_path = f"{bucket.rstrip('/')}/{str(storage_path).lstrip('/')}" @@ - container_resources = kwargs.get( + container_resources = kwargs.pop( "container_resources", { "limits": {"cpu": "1", "memory": "10Gi"}, "requests": {"cpu": "100m", "memory": "8Gi"}, }, ) - with create_llmisvc( + container_image = kwargs.pop("container_image", VLLM_CPU_IMAGE) + wait = kwargs.pop("wait", True) + timeout = kwargs.pop("timeout", Timeout.TIMEOUT_15MIN) + with create_llmisvc( client=admin_client, name=service_name, namespace=unprivileged_model_namespace.name, - storage_key=kwargs.get("storage_key"), - storage_path=kwargs.get("storage_path"), - 
container_image=kwargs.get("container_image", VLLM_CPU_IMAGE), + storage_key=storage_key, + storage_path=storage_path, + container_image=container_image, container_resources=container_resources, service_account=llmd_s3_service_account.name, - wait=True, - timeout=Timeout.TIMEOUT_15MIN, - **{ - k: v - for k, v in kwargs.items() - if k not in ["name", "storage_key", "storage_path", "container_image", "container_resources"] - }, + wait=wait, + timeout=timeout, + **kwargs, ) as llm_service: yield llm_serviceutilities/llmd_utils.py (6)
491-494: Curl header formatting bug splits arguments. Keep quoted -H header.- header = HTTPRequest.CONTENT_JSON.replace("-H ", "").replace("'", "") + header = HTTPRequest.CONTENT_JSON cmd_exec = "curl -i -s" - cmd = f"{cmd_exec} -X POST -d '{body}' -H {header} -H 'Accept: application/json'" + cmd = f"{cmd_exec} -X POST -d '{body}' {header} -H 'Accept: application/json'"
505-510: Wrong deployment_mode forces insecure mode.
Use “raw-deployment” to let get_ca_bundle work.
- ca_bundle = get_ca_bundle(client=client, deployment_mode="raw") + ca_bundle = get_ca_bundle(client=client, deployment_mode="raw-deployment")
457-459: Fix Template constructor usage.
- template = Template(template=template_str)
+ template = Template(template_str)
202-209: Stop hard-coding the S3 bucket; accept a full URI or bucket/prefix.
The current value ignores the environment and will point to the wrong bucket.
- if storage_key and storage_path: - model_config = { - "uri": f"s3://ods-ci-wisdom/{storage_path}", - } + if storage_key and storage_path: + # Treat storage_path as "bucket/prefix" or a full "s3://bucket/prefix" + uri = storage_path if str(storage_path).startswith("s3://") else f"s3://{storage_path}" + model_config = {"uri": uri}
548-555: Parse the JSON body from curl output for validators.
Validators expect a parsed payload, not raw headers.
- out = self.run_inference( + out = self.run_inference( model_name=model_name, inference_input=inference_input, use_default_query=use_default_query, insecure=insecure, token=token, ) - return {"output": out} + body = out.split("\r\n\r\n")[-1] + try: + payload = json.loads(body) + except json.JSONDecodeError: + payload = None + return {"output": out, "payload": payload}
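The header/body split that the suggestion relies on can be sketched standalone (the sample response below is fabricated):

```python
import json
from typing import Any, Dict

def parse_curl_response(raw: str) -> Dict[str, Any]:
    # `curl -i` prefixes the body with a status line and headers,
    # separated by a blank line (\r\n\r\n); redirects can stack
    # several header blocks, so take the last segment as the body.
    body = raw.split("\r\n\r\n")[-1]
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None
    return {"output": raw, "payload": payload}
```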
534-538: Redact Bearer token from error to avoid leakage.- if res: - return out - raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {cmd}") + if res: + return out + redacted_cmd = re.sub(r'(Authorization:\s*Bearer\s+)\S+', r'\1<redacted>', cmd) + raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {redacted_cmd}")tests/model_serving/model_server/llmd/test_llmd_s3.py (1)
31-38: Confirmed: OPT125M_CPU_INFERENCE_CONFIG has chat_completions.http
OPT125M_CPU_INFERENCE_CONFIG includes a chat_completions -> http section, so get_runtime_config() will not raise.
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/model_serving/model_server/llmd/conftest.py (1)
24-25: Silence Ruff ARG001: unused fixture arg.
admin_client isn’t used. Rename it with a leading underscore to keep fixture ordering without linter noise.
- def gateway_namespace(admin_client: DynamicClient) -> str:
+ def gateway_namespace(_admin_client: DynamicClient) -> str:
utilities/llmd_utils.py (2)
103-105: Avoid a blind Exception catch.
Prefer a narrower exception (e.g., specific ocp-resources errors) or drop the try; Gateway.exists already guards existence.
559-566: Silence Ruff ARG001: unused param.
inference isn’t used in _validate_unauthorized_response; rename it with a leading underscore.
-def _validate_unauthorized_response(res: Dict[str, Any], token: Optional[str], inference: LLMUserInference) -> None: +def _validate_unauthorized_response(res: Dict[str, Any], token: Optional[str], _inference: LLMUserInference) -> None:
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
tests/model_serving/model_server/llmd/conftest.py(1 hunks)tests/model_serving/model_server/llmd/utils.py(1 hunks)utilities/llmd_utils.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/model_serving/model_server/llmd/utils.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/model_serving/model_server/llmd/conftest.py (4)
utilities/constants.py (1)
Timeout(213-224)utilities/infra.py (1)
s3_endpoint_secret(282-348)utilities/llmd_utils.py (2)
create_llmd_gateway(33-137)create_llmisvc(141-285)tests/conftest.py (7)
admin_client(68-69)unprivileged_model_namespace(459-466)aws_access_key_id(124-131)aws_secret_access_key(135-142)models_s3_bucket_name(243-250)models_s3_bucket_region(254-261)models_s3_bucket_endpoint(265-272)
utilities/llmd_utils.py (4)
utilities/certificates_utils.py (1)
get_ca_bundle(69-104)utilities/constants.py (2)
HTTPRequest(112-115)Timeout(213-224)utilities/exceptions.py (1)
InferenceResponseError(24-25)utilities/infra.py (1)
get_services_by_isvc_label(504-533)
🪛 Ruff (0.13.1)
tests/model_serving/model_server/llmd/conftest.py
24-24: Unused function argument: admin_client
(ARG001)
utilities/llmd_utils.py
103-103: Do not catch blind exception: Exception
(BLE001)
212-212: Avoid specifying long messages outside the exception class
(TRY003)
218-218: Avoid specifying long messages outside the exception class
(TRY003)
229-229: Avoid specifying long messages outside the exception class
(TRY003)
321-321: Do not catch blind exception: Exception
(BLE001)
420-420: Avoid specifying long messages outside the exception class
(TRY003)
422-422: Avoid specifying long messages outside the exception class
(TRY003)
442-442: Avoid specifying long messages outside the exception class
(TRY003)
447-447: Avoid specifying long messages outside the exception class
(TRY003)
452-452: Avoid specifying long messages outside the exception class
(TRY003)
511-511: Do not catch blind exception: Exception
(BLE001)
538-538: Avoid specifying long messages outside the exception class
(TRY003)
559-559: Unused function argument: inference
(ARG001)
577-577: Avoid specifying long messages outside the exception class
(TRY003)
598-598: Avoid specifying long messages outside the exception class
(TRY003)
603-603: Avoid specifying long messages outside the exception class
(TRY003)
610-610: Avoid specifying long messages outside the exception class
(TRY003)
650-653: Avoid specifying long messages outside the exception class
(TRY003)
655-655: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (11)
utilities/llmd_utils.py (8)
580-656: Validate against the parsed payload instead of the raw “output” blob.
Current logic indexes res as if it contained structured data; after parsing, use res["payload"] and handle streaming vs non-streaming.
def _validate_authorized_response( res: Dict[str, Any], inference: LLMUserInference, inference_config: Dict[str, Any], inference_type: str, expected_response_text: Optional[str], use_default_query: bool, model_name: str, ) -> None: - """Validate response for authorized users.""" - - use_regex = False + """Validate response for authorized users.""" + use_regex = False @@ - if inference.inference_response_text_key_name: - if inference_type == inference.STREAMING: - if output := re.findall( - rf"{inference.inference_response_text_key_name}\": \"(.*)\"", - res[inference.inference_response_key_name], - re.MULTILINE, - ): - assert "".join(output) == expected_response_text, ( - f"Expected: {expected_response_text} does not match response: {output}" - ) - elif inference_type == inference.INFER or use_regex: - formatted_res = json.dumps(res[inference.inference_response_text_key_name]).replace(" ", "") - if use_regex and expected_response_text is not None: - assert re.search(expected_response_text, formatted_res), ( - f"Expected: {expected_response_text} not found in: {formatted_res}" - ) - else: - formatted_res = json.dumps(res[inference.inference_response_key_name]).replace(" ", "") - assert formatted_res == expected_response_text, ( - f"Expected: {expected_response_text} does not match output: {formatted_res}" - ) - else: - response = res[inference.inference_response_key_name] - if isinstance(response, list): - response = response[0] - - if isinstance(response, dict): - response_text = response[inference.inference_response_text_key_name] - assert response_text == expected_response_text, ( - f"Expected: {expected_response_text} does not match response: {response_text}" - ) - else: - raise InferenceResponseError( - "Inference response output does not match expected output format." - f"Expected: {expected_response_text}.\nResponse: {res}" - ) - else: - raise InferenceResponseError(f"Inference response output not found in response. 
Response: {res}") + # Streaming: validate against raw output + if inference_type == inference.STREAMING: + key = inference.inference_response_text_key_name or "content" + if output := re.findall(rf'{re.escape(key)}\": \"(.*)\"', res.get("output", ""), re.MULTILINE): + assert "".join(output) == expected_response_text, ( + f"Expected: {expected_response_text} does not match response: {output}" + ) + else: + raise InferenceResponseError("Streaming response did not contain expected content.") + return + + # Non-streaming: use parsed payload + payload = res.get("payload") + if not isinstance(payload, dict): + raise InferenceResponseError("Failed to parse JSON payload from response.") + + # Default to OpenAI Chat Completions shape + try: + response_text = payload["choices"][0]["message"]["content"] + except Exception as exc: + raise InferenceResponseError(f"Unexpected payload shape: {payload}") from exc + + if use_regex and expected_response_text is not None: + assert re.search(str(expected_response_text), response_text), ( + f"Expected pattern not found. Pattern: {expected_response_text}" + ) + else: + assert response_text == expected_response_text, ( + f"Expected: {expected_response_text} does not match: {response_text}" + )
492-495: Curl header formatting bug; causes split args.
Don’t strip the -H flag and quotes from CONTENT_JSON; pass it verbatim.
- header = HTTPRequest.CONTENT_JSON.replace("-H ", "").replace("'", "") - cmd_exec = "curl -i -s" - cmd = f"{cmd_exec} -X POST -d '{body}' -H {header} -H 'Accept: application/json'" + header = HTTPRequest.CONTENT_JSON + cmd_exec = "curl -i -s" + cmd = f"{cmd_exec} -X POST -d '{body}' {header} -H 'Accept: application/json'"
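Why stripping the quotes splits the argument: once the surrounding quotes are gone, `shlex.split` tokenizes the header value on whitespace. A small sketch, assuming the header constant renders as curl's usual `-H 'Content-Type: application/json'`:

```python
import shlex

# Header passed verbatim, quotes intact: the value stays one argument.
intact = "curl -i -s -H 'Content-Type: application/json'"
tokens_intact = shlex.split(intact)
print(tokens_intact)
# -> ['curl', '-i', '-s', '-H', 'Content-Type: application/json']

# Header with "-H " and quotes stripped, then re-added without quoting:
# the value is split into two tokens and curl receives a broken header.
stripped = "curl -i -s -H Content-Type: application/json"
tokens_stripped = shlex.split(stripped)
print(tokens_stripped)
# -> ['curl', '-i', '-s', '-H', 'Content-Type:', 'application/json']
```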
505-509: Wrong deployment_mode for get_ca_bundle(); forces insecure.

Use "raw-deployment" as per certificates_utils.

```diff
-    ca_bundle = get_ca_bundle(client=client, deployment_mode="raw")
+    ca_bundle = get_ca_bundle(client=client, deployment_mode="raw-deployment")
```
535-539: Avoid Bearer token leakage in error message.

Redact Authorization header before raising.

```diff
-        res, out, err = run_command(command=shlex.split(cmd), verify_stderr=False, check=False)
-        if res:
-            return out
-        raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {cmd}")
+        res, out, err = run_command(command=shlex.split(cmd), verify_stderr=False, check=False)
+        if res:
+            return out
+        redacted_cmd = re.sub(r'(-H\s+[\'"]Authorization:\s*Bearer\s+)\S+([\'"])', r'\1<redacted>\2', cmd)
+        raise ValueError(f"Inference failed with error: {err}\nOutput: {out}\nCommand: {redacted_cmd}")
```
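The redaction regex from the suggestion can be checked in isolation; the command string and token value below are made up:

```python
import re

# Fabricated command containing a bearer token inside a quoted -H argument.
cmd = "curl -s -H 'Authorization: Bearer sha256~abc123' https://example.test/v1/models"

# Group 1 keeps everything up to and including "Bearer ", group 2 keeps the
# closing quote; the token itself is replaced with a placeholder.
redacted = re.sub(r'(-H\s+[\'"]Authorization:\s*Bearer\s+)\S+([\'"])', r'\1<redacted>\2', cmd)
print(redacted)
# -> curl -s -H 'Authorization: Bearer <redacted>' https://example.test/v1/models
```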
540-556: Parse JSON body from curl output and return payload.

Validators need structured payload; "-i" includes headers.

```diff
-        out = self.run_inference(
+        out = self.run_inference(
             model_name=model_name,
             inference_input=inference_input,
             use_default_query=use_default_query,
             insecure=insecure,
             token=token,
         )
-        return {"output": out}
+        # Extract last HTTP body (supports multiple responses/redirects)
+        parts = out.split("\r\n\r\n")
+        body = parts[-1] if parts else out
+        try:
+            payload = json.loads(body)
+        except json.JSONDecodeError:
+            payload = None
+        return {"output": out, "payload": payload}
```
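The body-extraction logic in the suggestion, exercised against a fabricated `curl -i` transcript (status line and headers, blank line, then the JSON body):

```python
import json

# Fabricated `curl -i` output: headers are separated from the body by \r\n\r\n.
out = (
    "HTTP/1.1 200 OK\r\n"
    "content-type: application/json\r\n"
    "\r\n"
    '{"choices": [{"message": {"content": "ok"}}]}'
)

# Take the last segment so redirects or 100-continue responses are skipped too.
parts = out.split("\r\n\r\n")
body = parts[-1] if parts else out
try:
    payload = json.loads(body)
except json.JSONDecodeError:
    payload = None
print(payload["choices"][0]["message"]["content"])  # -> ok
```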
457-460: Template() called with wrong kwarg; raises TypeError.

Use positional constructor.

```diff
-        template = Template(template=template_str)
-        body = template.safe_substitute(model_name=model_name)
+        template = Template(template_str)
+        body = template.safe_substitute(model_name=model_name)
```
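For reference, the substitution itself is straightforward with the positional form; this sketch assumes the stdlib `string.Template` (the code under review may import a different `Template` class), and the template string is invented:

```python
from string import Template

# Hypothetical request-body template using $-style placeholders.
template_str = '{"model": "$model_name", "prompt": "Hello"}'

# safe_substitute leaves unknown placeholders intact instead of raising.
body = Template(template_str).safe_substitute(model_name="opt-125m")
print(body)  # -> {"model": "opt-125m", "prompt": "Hello"}
```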
612-617: Fix Template() misuse in expected-text rendering.

Same constructor issue here.

```diff
-    if isinstance(expected_response_text, str):
-        expected_response_text = Template(template=expected_response_text).safe_substitute(model_name=model_name)
-    elif isinstance(expected_response_text, dict):
-        response_output = expected_response_text.get("response_output")
-        if response_output is not None:
-            expected_response_text = Template(template=response_output).safe_substitute(model_name=model_name)
+    if isinstance(expected_response_text, str):
+        expected_response_text = Template(expected_response_text).safe_substitute(model_name=model_name)
+    elif isinstance(expected_response_text, dict):
+        response_output = expected_response_text.get("response_output")
+        if response_output is not None:
+            expected_response_text = Template(response_output).safe_substitute(model_name=model_name)
```
203-206: Hard-coded S3 bucket breaks non-default envs; build URI from input.

Currently forces s3://ods-ci-wisdom, ignoring provided bucket/path.

```diff
-    if storage_key and storage_path:
-        model_config = {
-            "uri": f"s3://ods-ci-wisdom/{storage_path}",
-        }
+    if storage_key and storage_path:
+        # Accept full s3://... or bucket/prefix form
+        uri = storage_path if str(storage_path).startswith("s3://") else f"s3://{storage_path}"
+        model_config = {"uri": uri}
```

tests/model_serving/model_server/llmd/conftest.py (3)
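A standalone check of the URI normalization suggested for lines 203-206 above (accepting either a full `s3://` URI or a bucket/prefix string); bucket and path names here are invented:

```python
def build_s3_uri(storage_path: str) -> str:
    """Accept either a full s3:// URI or a bucket/prefix string."""
    path = str(storage_path)
    # Only prepend the scheme when it is missing, so full URIs pass through.
    return path if path.startswith("s3://") else f"s3://{path}"

print(build_s3_uri("my-bucket/models/opt-125m"))       # -> s3://my-bucket/models/opt-125m
print(build_s3_uri("s3://my-bucket/models/opt-125m"))  # -> s3://my-bucket/models/opt-125m
```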
79-95: Guard against missing request.param and avoid duplicate kwargs in create_llmd_gateway().

Unconditional access to request.param can raise AttributeError when not parametrized. Also, kwargs can duplicate explicit args (wait_for_condition/timeout), causing TypeError.

```diff
-    if isinstance(request.param, str):
-        gateway_class_name = request.param
-        kwargs = {}
-    else:
-        gateway_class_name = request.param.get("gateway_class_name", "openshift-default")
-        kwargs = {k: v for k, v in request.param.items() if k != "gateway_class_name"}
+    param = getattr(request, "param", None)
+    if isinstance(param, str):
+        gateway_class_name = param
+        kwargs = {}
+    elif isinstance(param, dict):
+        gateway_class_name = param.get("gateway_class_name", "openshift-default")
+        kwargs = {k: v for k, v in param.items() if k != "gateway_class_name"}
+    else:
+        gateway_class_name = "openshift-default"
+        kwargs = {}
+    wait_for_condition = kwargs.pop("wait_for_condition", True)
+    timeout = kwargs.pop("timeout", Timeout.TIMEOUT_5MIN)
@@
-    with create_llmd_gateway(
+    with create_llmd_gateway(
         client=admin_client,
         namespace=gateway_namespace,
         gateway_class_name=gateway_class_name,
-        wait_for_condition=True,
-        timeout=Timeout.TIMEOUT_5MIN,
+        wait_for_condition=wait_for_condition,
+        timeout=timeout,
         teardown=False,  # Don't delete gateway in teardown
-        **kwargs,
+        **kwargs,
     ) as gateway:
```
104-133: Same issues in llmd_inference_service: guard param and pop duplicates.

Potential AttributeError on request.param and duplicate named args (name, storage_uri, container_image, container_resources, wait, timeout).

```diff
-    if isinstance(request.param, str):
-        name_suffix = request.param
-        kwargs = {}
-    else:
-        name_suffix = request.param.get("name_suffix", "basic")
-        kwargs = {k: v for k, v in request.param.items() if k != "name_suffix"}
-
-    service_name = kwargs.get("name", f"llm-{name_suffix}")
+    param = getattr(request, "param", None)
+    if isinstance(param, str):
+        name_suffix = param
+        kwargs = {}
+    elif isinstance(param, dict):
+        name_suffix = param.get("name_suffix", "basic")
+        kwargs = {k: v for k, v in param.items() if k != "name_suffix"}
+    else:
+        name_suffix = "basic"
+        kwargs = {}
+
+    service_name = kwargs.pop("name", f"llm-{name_suffix}")
@@
-    container_resources = kwargs.get(
+    container_resources = kwargs.pop(
         "container_resources",
         {
             "limits": {"cpu": "1", "memory": "10Gi"},
             "requests": {"cpu": "100m", "memory": "8Gi"},
         },
     )
+    storage_uri = kwargs.pop("storage_uri", VLLM_STORAGE_OCI)
+    container_image = kwargs.pop("container_image", VLLM_CPU_IMAGE)
+    wait = kwargs.pop("wait", True)
+    timeout = kwargs.pop("timeout", Timeout.TIMEOUT_15MIN)
@@
-    with create_llmisvc(
+    with create_llmisvc(
         client=admin_client,
         name=service_name,
         namespace=unprivileged_model_namespace.name,
-        storage_uri=kwargs.get("storage_uri", VLLM_STORAGE_OCI),
-        container_image=kwargs.get("container_image", VLLM_CPU_IMAGE),
+        storage_uri=storage_uri,
+        container_image=container_image,
         container_resources=container_resources,
-        wait=True,
-        timeout=Timeout.TIMEOUT_15MIN,
-        **{k: v for k, v in kwargs.items() if k != "name"},
+        wait=wait,
+        timeout=timeout,
+        **kwargs,
     ) as llm_service:
```
145-185: Apply same robustness to S3 fixture and allow wait/timeout override.

Guard request.param and pop duplicate args (name, storage_key, storage_path, container_image, container_resources, wait, timeout).

```diff
-    if isinstance(request.param, str):
-        name_suffix = request.param
-        kwargs = {"storage_path": DEFAULT_S3_STORAGE_PATH}
-    else:
-        name_suffix = request.param.get("name_suffix", "s3")
-        kwargs = {k: v for k, v in request.param.items() if k != "name_suffix"}
-
-    service_name = kwargs.get("name", f"llm-{name_suffix}")
+    param = getattr(request, "param", None)
+    if isinstance(param, str):
+        name_suffix = param
+        kwargs = {"storage_path": DEFAULT_S3_STORAGE_PATH}
+    elif isinstance(param, dict):
+        name_suffix = param.get("name_suffix", "s3")
+        kwargs = {k: v for k, v in param.items() if k != "name_suffix"}
+    else:
+        name_suffix = "s3"
+        kwargs = {"storage_path": DEFAULT_S3_STORAGE_PATH}
+
+    service_name = kwargs.pop("name", f"llm-{name_suffix}")
@@
-    if "storage_key" not in kwargs:
-        kwargs["storage_key"] = llmd_s3_secret.name
-
-    if "storage_path" not in kwargs:
-        kwargs["storage_path"] = DEFAULT_S3_STORAGE_PATH
+    storage_key = kwargs.pop("storage_key", llmd_s3_secret.name)
+    storage_path = kwargs.pop("storage_path", DEFAULT_S3_STORAGE_PATH)
@@
-    container_resources = kwargs.get(
+    container_resources = kwargs.pop(
         "container_resources",
         {
             "limits": {"cpu": "1", "memory": "10Gi"},
             "requests": {"cpu": "100m", "memory": "8Gi"},
         },
     )
+    container_image = kwargs.pop("container_image", VLLM_CPU_IMAGE)
+    wait = kwargs.pop("wait", True)
+    timeout = kwargs.pop("timeout", Timeout.TIMEOUT_15MIN)
@@
-    with create_llmisvc(
+    with create_llmisvc(
         client=admin_client,
         name=service_name,
         namespace=unprivileged_model_namespace.name,
-        storage_key=kwargs.get("storage_key"),
-        storage_path=kwargs.get("storage_path"),
-        container_image=kwargs.get("container_image", VLLM_CPU_IMAGE),
+        storage_key=storage_key,
+        storage_path=storage_path,
+        container_image=container_image,
         container_resources=container_resources,
         service_account=llmd_s3_service_account.name,
-        wait=True,
-        timeout=Timeout.TIMEOUT_15MIN,
-        **{
-            k: v
-            for k, v in kwargs.items()
-            if k not in ["name", "storage_key", "storage_path", "container_image", "container_resources"]
-        },
+        wait=wait,
+        timeout=timeout,
+        **kwargs,
     ) as llm_service:
```
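The pattern common to all three fixture fixes is to guard an optional `request.param` with `getattr` and to `pop()` any keys that are also passed explicitly, so `**kwargs` can never duplicate a named argument. It can be sketched without pytest (names and defaults below are illustrative):

```python
from types import SimpleNamespace

def resolve_fixture_args(request):
    # Tolerate fixtures used without parametrization: request.param may be absent.
    param = getattr(request, "param", None)
    if isinstance(param, str):
        name_suffix, kwargs = param, {}
    elif isinstance(param, dict):
        name_suffix = param.get("name_suffix", "basic")
        kwargs = {k: v for k, v in param.items() if k != "name_suffix"}
    else:
        name_suffix, kwargs = "basic", {}
    # pop() removes keys that will be passed explicitly, so a later **kwargs
    # cannot collide with named arguments and raise TypeError.
    name = kwargs.pop("name", f"llm-{name_suffix}")
    wait = kwargs.pop("wait", True)
    return name, wait, kwargs

# Unparametrized request (no .param attribute):
default_args = resolve_fixture_args(SimpleNamespace())
print(default_args)  # -> ('llm-basic', True, {})

# Parametrized with keys that overlap the explicit arguments:
req = SimpleNamespace(param={"name_suffix": "s3", "name": "custom", "wait": False, "extra": 1})
override_args = resolve_fixture_args(req)
print(override_args)  # -> ('custom', False, {'extra': 1})
```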
Status of building tag latest: success. |
* Add LLMD model server tests and utilities
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* fix precommit
  Signed-off-by: Milind Waykole <mwaykole@redhat.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* fix precommit
  Signed-off-by: Milind Waykole <mwaykole@redhat.com>
* fix precommit
  Signed-off-by: Milind Waykole <mwaykole@redhat.com>
* fix precommit
  Signed-off-by: Milind Waykole <mwaykole@redhat.com>
* fix gateway thing
  Signed-off-by: Milind Waykole <mwaykole@redhat.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* fix gateway thing
  Signed-off-by: Milind Waykole <mwaykole@redhat.com>
* fix: break long line in utils.py to meet flake8 line length requirement
* fix gateway thing
  Signed-off-by: Milind Waykole <mwaykole@redhat.com>

---------

Signed-off-by: Milind Waykole <mwaykole@redhat.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Summary by CodeRabbit

- New Features
- Tests
- Documentation
- Chores