llmd health check #1284

Merged
mwaykole merged 9 commits into opendatahub-io:main from mwaykole:llmd-healthcheck
Apr 7, 2026

@mwaykole mwaykole commented Mar 24, 2026

Adds health checks for LLMD and KServe so that each test runs against a correct configuration.

@github-actions

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove it, comment /wip cancel on the PR.
  • To block merging of a PR, add /hold in a comment. To unblock merging, comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove, add /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified on the PR. To un-verify, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name> on the PR. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR is created and linked to the current PR.
  • To build and push an image to quay, add /build-push-pr-image in a comment. This creates an image with tag
    pr-<pr_number> in the quay repository. The image tag is deleted when the PR is merged or closed.
Supported labels

{'/lgtm', '/hold', '/cherry-pick', '/wip', '/verified', '/build-push-pr-image'}

Adds an automatic pre-test health gate for all KServe tests that
verifies odh-model-controller and kserve-controller-manager
deployments are healthy. Tests are skipped with a descriptive
reason if checks fail. Includes --skip-kserve-health-check CLI option.

Made-with: Cursor
Signed-off-by: Milind waykole <mwaykole@redhat.com>
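
The deployment check described in this commit message could look roughly like the sketch below. It is not the merged code: `DeploymentStatus`, `unhealthy_deployments`, and `skip_reason` are hypothetical names, and the replica counts are assumed to have been read from the cluster already.

```python
from dataclasses import dataclass


@dataclass
class DeploymentStatus:
    """Hypothetical stand-in for the replica counts of one Deployment."""
    name: str
    desired: int
    available: int


def unhealthy_deployments(statuses):
    """Return one readable failure string per deployment that is not fully available."""
    return [
        f"{s.name}: {s.available}/{s.desired} replicas available"
        for s in statuses
        if s.desired == 0 or s.available < s.desired
    ]


def skip_reason(component, failures):
    """Build a descriptive reason, or return None when every check passed."""
    if not failures:
        return None
    return f"{component} health check failed: " + "; ".join(failures)
```

A session-scoped autouse fixture could then call `pytest.skip(reason)` whenever `skip_reason(...)` returns a string, which would give the "descriptive reason" behaviour mentioned above.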
Adds an automatic pre-test health gate for all LLMD tests that
verifies cert-manager, authorino, RHCL operators, required
deployments, LeaderWorkerSetOperator, GatewayClass, and Kuadrant
CRs are healthy. Tests are skipped with a descriptive reason if
checks fail. Includes --skip-llmd-health-check CLI option and
wrapper resource classes for LeaderWorkerSetOperator and Kuadrant.

Made-with: Cursor
Signed-off-by: Milind waykole <mwaykole@redhat.com>
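
Several of the LLMD checks come down to reading a condition from a resource's `status.conditions` list, such as the DSC condition `KserveLLMInferenceServiceDependencies` named in the review walkthrough. A minimal sketch of that lookup, assuming the standard Kubernetes condition shape (`type` and `status` keys); the helper names are hypothetical and not taken from the PR:

```python
def condition_status(conditions, condition_type):
    """Return the status string of the named condition, or None if it is absent."""
    for cond in conditions or []:
        if cond.get("type") == condition_type:
            return cond.get("status")
    return None


def is_condition_true(conditions, condition_type):
    """Kubernetes reports condition status as the strings "True", "False" or "Unknown"."""
    return condition_status(conditions, condition_type) == "True"
```

Treating a missing condition as not healthy is a design choice of this sketch; the merged fixture may handle that case differently.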
@mwaykole mwaykole marked this pull request as ready for review April 7, 2026 10:13
@mwaykole mwaykole requested a review from a team as a code owner April 7, 2026 10:13
@mwaykole mwaykole enabled auto-merge (squash) April 7, 2026 10:15
@mwaykole mwaykole merged commit 64e4e96 into opendatahub-io:main Apr 7, 2026
9 of 10 checks passed

github-actions bot commented Apr 7, 2026

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.

coderabbitai bot commented Apr 7, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

The pull request adds pytest infrastructure for conditional health checks of KServe and LLMD services. Two new CLI flags (--skip-kserve-health-check and --skip-llmd-health-check) control whether health verification fixtures execute. New conftest files implement health gate logic validating component readiness via DSC status, operator conditions, and deployment availability. Two utility resource classes are introduced for Kuadrant and LeaderWorkerSetOperator.
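
For readers unfamiliar with the mechanism, registering the two flags in a root `conftest.py` might look like the sketch below. The flag names come from this PR; the help text and the `health_check_enabled` helper are assumptions, not the merged code.

```python
def pytest_addoption(parser):
    """Register the two opt-out flags (pytest calls this hook at startup)."""
    parser.addoption(
        "--skip-kserve-health-check",
        action="store_true",
        default=False,
        help="Skip the KServe pre-test health gate",  # assumed wording
    )
    parser.addoption(
        "--skip-llmd-health-check",
        action="store_true",
        default=False,
        help="Skip the LLMD pre-test health gate",  # assumed wording
    )


def health_check_enabled(config, flag):
    """Hypothetical helper: the gate runs unless its skip flag was passed."""
    return not config.getoption(flag)
```

With flags registered this way, a run such as `pytest --skip-llmd-health-check tests/model_serving/model_server/llmd` would bypass the LLMD gate while leaving the KServe gate active.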

Changes

| Cohort / File(s) | Summary |
|---|---|
| Pytest Configuration: conftest.py | Added two boolean CLI options to control KServe and LLMD health check execution via pytest_addoption. |
| KServe Health Infrastructure: tests/model_serving/model_server/kserve/conftest.py | Introduced health verification function and session-scoped autouse fixture that validates KServe component state, DSC readiness condition, and controller deployment availability. Bypasses checks if --skip-kserve-health-check flag is set or component_health marker is present. |
| LLMD Health Infrastructure: tests/model_serving/model_server/llmd/conftest.py | Implemented comprehensive health gate with operator CSV verification, DSC condition validation (KserveLLMInferenceServiceDependencies), deployment availability checks, and optional LeaderWorkerSetOperator and Kuadrant CR validation. Marks tests as xfail if checks fail unless --skip-llmd-health-check or component_health marker is used. |
| Resource Utility Classes: utilities/resources/kuadrant.py, utilities/resources/leader_worker_set_operator.py | Added Kuadrant (using ApiGroups.KUADRANT_IO) and LeaderWorkerSetOperator (using operator.openshift.io) classes extending base resource types. Both override to_dict() to ensure default empty spec dict when no definition source is provided. |
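
The `to_dict()` override described for the two resource classes can be illustrated with the sketch below. `BaseResource` is a simplified stand-in written for this example, not the wrapper library's real base class, and `"kuadrant.io"` is the assumed value of `ApiGroups.KUADRANT_IO`.

```python
class BaseResource:
    """Stand-in base class (an assumption); it does not read yaml_file."""

    def __init__(self, name, kind_dict=None, yaml_file=None):
        self.name = name
        self.kind_dict = kind_dict
        self.yaml_file = yaml_file
        self.res = {}

    def to_dict(self):
        # Use an explicit definition as-is, otherwise build a bare manifest.
        if self.kind_dict:
            self.res = dict(self.kind_dict)
        else:
            self.res = {"kind": type(self).__name__, "metadata": {"name": self.name}}


class _DefaultEmptySpec(BaseResource):
    """Hypothetical shared parent holding the override both classes use."""

    def to_dict(self):
        super().to_dict()
        # Only default the spec when no definition source was supplied.
        if not self.kind_dict and not self.yaml_file:
            self.res.setdefault("spec", {})


class Kuadrant(_DefaultEmptySpec):
    api_group = "kuadrant.io"  # assumed value of ApiGroups.KUADRANT_IO


class LeaderWorkerSetOperator(_DefaultEmptySpec):
    api_group = "operator.openshift.io"
```

The point of the default is that a CR created with only a name still serializes with `spec: {}`, while a caller-supplied definition is left untouched.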

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | Pull request lacks any description despite the template requiring summary, related issues, testing methodology, and additional requirements sections. | Add a comprehensive description following the template: include a brief summary of changes, link related issues/JIRA tickets, document testing approach (locally/Jenkins), and address additional requirements for test images and markers. |
| Title check | ❓ Inconclusive | The title 'llmd health check' is vague and generic, failing to convey meaningful information about the comprehensive changeset that adds health checks for both LLMD and KServe components. | Revise title to be more specific and descriptive, e.g., 'Add health check fixtures for LLMD and KServe test suites' to accurately reflect the broader scope of changes. |

