fix: update vllm cpu image #507
Conversation
Summary by CodeRabbit

📝 Walkthrough: Updates test fixtures: one replaces a VLLM runtime image SHA256 digest; another adds a 600-second timeout to Pod.wait_for_status calls in lmeval MinIO copy pod fixtures.

Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
/verified
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/model_explainability/lm_eval/conftest.py (1)
385-386: Good call adding an explicit timeout; suggest centralizing the value for consistency

600s is reasonable, but elsewhere in this file you use Timeout constants. Prefer a shared constant (either an existing Timeout.* or a local module constant) to avoid magic numbers and ease tuning across environments.
Apply this minimal change:

```diff
- pod.wait_for_status(status=Pod.Status.SUCCEEDED, timeout=600)
+ pod.wait_for_status(status=Pod.Status.SUCCEEDED, timeout=LMEVAL_COPY_POD_TIMEOUT)
```

And add near the other module-level constants:

```python
LMEVAL_COPY_POD_TIMEOUT: int = 600
```
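The suggested pattern can be sketched in isolation; `LMEVAL_COPY_POD_TIMEOUT` is the reviewer's proposed name, and `wait_for_copy_pod` is a hypothetical helper standing in for the fixture's call site, not code from the repository:

```python
# Module-level constant, as the reviewer suggests, so the timeout is
# defined once instead of appearing as a magic number at the call site.
LMEVAL_COPY_POD_TIMEOUT: int = 600


def wait_for_copy_pod(pod) -> None:
    # Hypothetical helper mirroring the fixture's call: the pod object is
    # expected to expose wait_for_status(status, timeout), as the real
    # openshift-python-wrapper Pod resource does.
    pod.wait_for_status(status=pod.Status.SUCCEEDED, timeout=LMEVAL_COPY_POD_TIMEOUT)
```

Tuning for a slow environment then means editing one line instead of hunting literals across fixtures.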
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- tests/model_explainability/conftest.py (1 hunks)
- tests/model_explainability/lm_eval/conftest.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.
🔇 Additional comments (1)
tests/model_explainability/conftest.py (1)
164-165: Unable to verify vLLM CPU image digest in this environment; please manually confirm pullability

The updated digest (sha256:ada6b3ba98829eb81ae4f89364d9b431c0222671eafb9a04aa16f31628536af2) looks correct, but registry tools aren't available here. To avoid CI or runtime failures, please run on a machine with appropriate tooling:

```shell
# Using skopeo (recommended)
skopeo inspect docker://quay.io/rh-aiservices-bu/vllm-cpu-openai-ubi9@sha256:ada6b3ba98829eb81ae4f89364d9b431c0222671eafb9a04aa16f31628536af2 | jq '.Digest, .Name'

# Or with podman
podman manifest inspect quay.io/rh-aiservices-bu/vllm-cpu-openai-ubi9@sha256:ada6b3ba98829eb81ae4f89364d9b431c0222671eafb9a04aa16f31628536af2 | jq '.schemaVersion, .mediaType'
```

Ensure each command returns the expected digest/schemaVersion to confirm the image is available in your target cluster.
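Before reaching for registry tooling, a quick local sanity check can catch a malformed digest string. This sketch only validates the `sha256:<64 hex chars>` format; it does not confirm the image is actually pullable, which still requires skopeo or podman against the registry:

```python
import re

# A valid sha256 image digest is the literal prefix "sha256:" followed by
# exactly 64 lowercase hexadecimal characters.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def is_valid_digest(digest: str) -> bool:
    # Format check only; pullability must be verified against the registry.
    return bool(DIGEST_RE.match(digest))
```

A truncated copy-paste (a common cause of `manifest unknown` errors) fails this check immediately, before any CI run is wasted.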
The following are automatically added/executed:
Available user actions:
Supported labels: /wip, /hold, /verified, /cherry-pick, /build-push-pr-image, /lgtm
/cherry-pick 2.23

Status of building tag latest: success.
Cherry pick action created PR #509 successfully 🎉! |
update vllm cpu image and increase LMEval copy pod timeout
Description
The previous image was a year old, and with the upcoming GuardrailsOrchestrator updates we need the latest one for our tests to work.
This PR also increases the timeout in one of the LMEval fixtures, which timed out in some test environments.
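The timeout behavior at issue, polling a pod's status until a deadline passes, can be sketched generically. This `wait_for_status` is a hypothetical stand-in illustrating the semantics, not the implementation used by the project's resource wrapper:

```python
import time


def wait_for_status(get_status, expected: str, timeout: float, interval: float = 1.0) -> None:
    # Poll get_status() until it returns the expected value or the deadline
    # passes; a larger timeout (e.g. 600s) tolerates slower test environments
    # at the cost of slower failure when the pod never reaches the status.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == expected:
            return
        time.sleep(interval)
    raise TimeoutError(f"status never reached {expected!r} within {timeout}s")
```

The trade-off is the usual one for such waits: the timeout must exceed the slowest legitimate environment, while still bounding how long a genuinely stuck pod can hold up the suite.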
How Has This Been Tested?
Running the tests against a working cluster
Merge criteria: