
feat: Create ODH distro image smoke test for Vertex AI#139

Closed
Artemon-line wants to merge 47 commits into main from RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI

Conversation


@Artemon-line Artemon-line commented Nov 25, 2025

What does this PR do?

Adds Vertex AI support to integration tests alongside vLLM. Tests run sequentially using a single Llama Stack instance configured with both providers.

Key Changes

  • Added Vertex AI provider support with conditional execution based on GCP secrets
  • Single Llama Stack deployment - one instance supports both vLLM and Vertex AI simultaneously
  • Sequential test execution - vLLM tests first, then Vertex AI tests
  • Optimized vLLM startup - starts early (non-blocking) and is validated before tests
  • GCP credentials handling - mounts credentials to container at /run/secrets/gcp-credentials (matching local Podman setup)
  • Early cleanup - vLLM container stopped after vLLM tests complete

Architecture

  • Deployment: Single Llama Stack instance with both vLLM and Vertex AI providers configured
  • Testing: Sequential execution - vLLM tests (validation, smoke, integration), then Vertex AI tests (integration)
  • Optimization: vLLM starts early in parallel with image build, validated before tests

Test Plan

Local Testing

  1. vLLM: Start the vLLM container, then run ./tests/run_integration_tests.sh with INFERENCE_MODEL="vllm-inference/Qwen/Qwen3-0.6B"
  2. Vertex AI: Set VERTEX_AI_PROJECT, run gcloud auth application-default login, then run ./tests/run_integration_tests.sh with INFERENCE_MODEL="vertexai/google/gemini-2.0-flash"

CI Testing

  1. Create PR, verify vLLM tests run first, then Vertex AI tests
  2. Verify Vertex AI tests are skipped with warning if secrets not configured

Required Secrets

For Vertex AI tests to run in CI:

  • VERTEX_AI_PROJECT – Target GCP project
  • GCP_WORKLOAD_IDENTITY_PROVIDER – Used for OIDC authentication via Workload Identity Federation

If Vertex AI secrets are not configured, Vertex AI tests are skipped with a warning.
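The skip-with-warning gate described above can be sketched as a small shell helper. This is illustrative only — the actual workflow expresses the gate in YAML `if:` conditions, and the function name `vertex_tests_enabled` is hypothetical:

```shell
# Hypothetical helper: decide whether Vertex AI tests should run,
# based on whether both required secrets are exposed as env vars.
vertex_tests_enabled() {
  if [ -n "${VERTEX_AI_PROJECT:-}" ] && [ -n "${GCP_WORKLOAD_IDENTITY_PROVIDER:-}" ]; then
    return 0
  fi
  echo "WARNING: Vertex AI secrets not configured; skipping Vertex AI tests" >&2
  return 1
}
```

A caller would simply wrap the Vertex AI test invocation in `if vertex_tests_enabled; then ...; fi`.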

Files Changed

  • .github/workflows/run-integration-tests.yml - Added Vertex AI test section, sequential execution
  • .github/actions/setup-llama-stack/action.yml - Updated to mount GCP credentials from GOOGLE_APPLICATION_CREDENTIALS
  • .github/actions/setup-vllm/action.yml - Made non-blocking (removed wait loop)
  • docs/live-tests-guide.md - Updated to reflect sequential execution and adding new providers

Summary by CodeRabbit

  • Chores

    • Added reusable CI actions to start services and wait for readiness; split start and readiness steps; streamlined container build-and-publish workflow with dispatch-aware tagging and gated execution paths.
  • Tests

    • Added a Run integration tests workflow covering multiple providers with conditional paths, improved readiness polling, artifact collection, and updated smoke/integration scripts to use direct model identifiers and assume external readiness.
  • Documentation

    • Added a live-tests guide and made the CI build badge clickable.


- Added a step to check for Vertex AI recordings in the integration tests workflow, allowing conditional execution of Vertex AI tests based on the presence of recordings.
- Updated the integration tests script to dynamically select the text model based on the environment, supporting both Vertex AI and vllm-inference providers.

This improves the testing process by enabling the use of recorded data for Vertex AI, ensuring more reliable and efficient test runs.
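The "dynamically select the text model based on the environment" step might look like the following sketch. The function name is illustrative; the model identifiers come from the PR's test plan:

```shell
# Hypothetical sketch: prefer the Vertex AI model when its project is
# configured, else fall back to the vLLM model used in CI.
select_inference_model() {
  if [ -n "${VERTEX_AI_PROJECT:-}" ]; then
    echo "vertexai/google/gemini-2.0-flash"
  else
    echo "vllm-inference/Qwen/Qwen3-0.6B"
  fi
}
```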

Signed-off-by: Artemy <ahladenk@redhat.com>
- Removed unnecessary blank lines in the live tests workflow YAML file for better readability.
- Updated the condition for the `create-pr` job to use `needs.live-tests.result` instead of `needs.live-tests.outcome`.
- Added instructions for setting up Podman for pre-commit hooks in the GitHub Actions testing guide.
- Included a section on fixing SELinux context for Fedora/RHEL users in the documentation.
- Enhanced the live tests guide to clarify the process of including recordings in pull requests.

These changes improve the clarity and functionality of the testing workflows and documentation.

Signed-off-by: Artemy <ahladenk@redhat.com>
- Removed redundant environment variable settings in the live tests workflow, replacing them with direct secret references for improved security.
- Simplified the Docker run command by using a fixed container name instead of an environment variable.
- Updated the local testing script to verify GCP authentication before proceeding, ensuring smoother execution.
- Enhanced the live tests guide by clarifying prerequisites and streamlining instructions for running tests with VLLM and Vertex AI.

These changes improve the clarity, security, and functionality of the testing workflows and documentation.

Signed-off-by: Artemy <ahladenk@redhat.com>
- Removed the optional environment variable setting for VERTEX_AI_LOCATION in the live tests guide to streamline instructions.
- Added a log statement in the local testing script to confirm the use of the Vertex AI provider and its model.

These changes enhance the clarity and usability of the documentation and scripts for running live tests.

Signed-off-by: [Your Name] <your.email@example.com>
Signed-off-by: Artemy <ahladenk@redhat.com>

coderabbitai bot commented Nov 25, 2025

Walkthrough

Adds a composite GitHub Action to start and wait for a Llama Stack container, separates vLLM startup from readiness polling, introduces a "Run integration tests" workflow, refactors the Red Hat container publish workflow to use workflow_run/workflow_dispatch branching, updates tests and docs, and converts the README build badge to a workflow link.

Changes

Cohort / File(s) — Summary

  • Composite action: setup Llama Stack (.github/actions/setup-llama-stack/action.yml)
    New composite action that builds conditional docker run args (inference/embedding models, vLLM/Vertex AI options), optionally mounts Vertex AI credentials, starts the container with host networking and port mapping, and polls http://127.0.0.1:8321/v1/health up to 60 attempts, with diagnostics and logs on timeout.
  • Composite action: vLLM start (.github/actions/setup-vllm/action.yml)
    Modified to start vLLM detached (non-blocking) and moved the health check into a separate step that polls the vLLM health endpoint (60 attempts, 2s interval), with explicit start/wait messages and docker logs on failure.
  • Workflows: integration & publish (.github/workflows/run-integration-tests.yml, .github/workflows/redhat-distro-container.yml)
    Added a new "Run integration tests" workflow (triggers: PR, branch pushes, dispatch, schedule) handling vLLM/Vertex AI paths, credential gating, image build for dispatch, smoke/integration tests, artifact collection, and cleanup. Refactored the Red Hat workflow to trigger on workflow_run or workflow_dispatch, split dispatch vs run logic (Containerfile generation, tag composition), and consolidated publishing.
  • Tests (tests/smoke.sh, tests/run_integration_tests.sh)
    tests/smoke.sh: removed internal startup orchestration, added explicit model presence checks, and updated the OpenAI inference payload to use INFERENCE_MODEL directly. tests/run_integration_tests.sh: pytest invocation updated to use the raw INFERENCE_MODEL (removed the vllm-inference/ prefix).
  • Docs & README (docs/live-tests-guide.md, README.md)
    Added docs/live-tests-guide.md with instructions for running live inference tests (local and CI, providers, secrets). Converted the README build badge to a clickable link to the workflow.
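Both composite actions follow the same poll-until-ready pattern (60 attempts, short interval). A generic version of that loop, with the probe command and limits as parameters, could look like this — a sketch, not the actions' actual script:

```shell
# Generic readiness poller in the spirit of the actions' health checks.
# Usage: wait_for_ready '<probe command>' [attempts] [interval_seconds]
wait_for_ready() {
  local cmd="$1" attempts="${2:-60}" interval="${3:-2}" i
  for ((i = 1; i <= attempts; i++)); do
    # Run the probe quietly; success means the service is ready.
    if eval "$cmd" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$interval"
  done
  echo "timed out after $attempts attempts" >&2
  return 1
}

# Example probe (matches the Llama Stack endpoint named in the summary):
#   wait_for_ready 'curl -fsS http://127.0.0.1:8321/v1/health' 60 2
```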

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub
    participant Runner as Actions Runner
    participant Docker as Docker Engine
    participant VLLM as vLLM
    participant Llama as Llama Stack
    participant Vertex as Vertex AI
    participant Tests as Test Runner

    Dev->>GH: push / PR / workflow_dispatch
    GH->>Runner: start workflow
    Runner->>Runner: prepare tools (buildx/qemu/etc.)
    alt vLLM path
        Runner->>Docker: docker run vLLM (detached)
        Runner->>VLLM: poll /v1/health (60 tries)
    end
    alt Vertex creds present
        Runner->>Vertex: authenticate / mount creds
    end
    alt dispatch builds image
        Runner->>Docker: build Llama Stack image
        Docker-->>Runner: image ready
    end
    Runner->>Docker: docker run Llama Stack (env, mounts, --net=host, ports)
    Runner->>Llama: poll /v1/health (up to 60 attempts)
    Llama-->>Runner: health OK
    Runner->>Tests: run smoke & integration tests (vLLM)
    alt Vertex creds present
        Runner->>Tests: run Vertex AI integration tests
    end
    Runner->>Runner: collect logs, upload artifacts, cleanup
    Runner-->>GH: workflow result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Review hotspots:
    • .github/actions/setup-llama-stack/action.yml — conditional docker argument construction, credential mount detection, health-poll loop and diagnostic logging.
    • .github/actions/setup-vllm/action.yml — detached startup, separated readiness step, timeout and docker logs handling.
    • .github/workflows/run-integration-tests.yml & .github/workflows/redhat-distro-container.yml — event branching, Containerfile generation, tag composition logic, publish gating.
    • tests/smoke.sh & tests/run_integration_tests.sh — ensure test expectations match external startup and updated model naming.

Poem

🐇 I snuck a tag, I nudged a run,
I polled the health until the sun.
vLLM hummed, Vertex waved hello,
Logs in paw, the pipelines flow. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 0.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The PR title focuses on Vertex AI smoke testing, but the changeset includes broader integration test infrastructure: vLLM optimization, sequential test orchestration, GCP credentials handling, and CI workflow restructuring.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 36dafa2 and 27f427f.

📒 Files selected for processing (1)
  • tests/smoke.sh (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/smoke.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (linux/amd64)
  • GitHub Check: Summary


@Artemon-line changed the title from "RHAIENG-1793: Create ODH distro image smoke test for Vertex AI" to "feat: Create ODH distro image smoke test for Vertex AI" on Nov 26, 2025
Signed-off-by: [Your Name] <your.email@example.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
- Added `shell: bash` to the `live-tests.yml` and `redhat-distro-container.yml` workflows to ensure consistent execution environment for scripts.

This change enhances the reliability of the workflows by explicitly defining the shell used for running commands.

Signed-off-by: [Your Name] <your.email@example.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
…tion into RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI

Signed-off-by: Artemy <ahladenk@redhat.com>
This change ensures that Vertex AI tests are only executed when recordings are present, streamlining the testing process.

Signed-off-by: Artemy <ahladenk@redhat.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
@ktdreyer ktdreyer left a comment


Please rebase this on main so that we can try it out!

I've posted some initial comments in the meantime.

mergify bot commented Dec 1, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @Artemon-line please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 1, 2025
@Artemon-line Artemon-line self-assigned this Dec 2, 2025
- Enhanced the README to include a clickable build badge for better visibility.
- Removed the obsolete `record-integration-tests.yml` workflow to streamline CI processes.
- Refactored the `redhat-distro-container.yml` workflow to consolidate testing steps and improve clarity.
- Updated the live tests guide to reflect changes in running tests with vLLM and Vertex AI, including clearer instructions and prerequisites.

These changes aim to simplify the CI/CD pipeline and improve documentation for running integration tests.

Signed-off-by: Artemy <ahladenk@redhat.com>
…tion into RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI
@mergify mergify bot removed the needs-rebase label Dec 2, 2025
mergify bot commented Dec 2, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @Artemon-line please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 2, 2025
…update integration tests workflow to streamline model inference testing
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/smoke.sh (1)

46-59: The hardcoded "green" expectation in test_model_openai_inference will fail for Vertex AI models with different response formats.

The function sends "What color is grass?" to $INFERENCE_MODEL (which includes vertexai/google/gemini-2.0-flash per the CI/CD workflow) and greps for "green". While both Qwen and Gemini will likely mention grass as green, response phrasing differs across models—Gemini might return "green color" or "the colour of grass is green" instead of containing the word "green" directly. For robust multi-model testing, either parameterize the expected response based on the model, or use a more generic success criterion like checking for non-empty completion or absence of error status.
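One way to implement the "more generic success criterion" suggested here is to accept any non-empty completion and reject explicit errors, rather than grepping for a model-specific word. This is a sketch assuming an OpenAI-compatible /chat/completions response shape; `inference_succeeded` is a hypothetical helper, not the smoke test's current code:

```shell
# Model-agnostic success check for an OpenAI-style chat completion body.
inference_succeeded() {
  local body="$1"
  # Reject responses that carry an explicit error field.
  if printf '%s' "$body" | grep -q '"error"'; then
    return 1
  fi
  # Require some non-empty "content" value in the response.
  printf '%s' "$body" | grep -Eq '"content"[[:space:]]*:[[:space:]]*"[^"]+"'
}
```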

♻️ Duplicate comments (1)
.github/workflows/run-integration-tests.yml (1)

146-158: Smoke test loop runs both vLLM and Vertex AI models, but inference test may fail for Vertex AI.

Lines 146–158 loop over both VLLM_INFERENCE_MODEL and VERTEX_AI_INFERENCE_MODEL, running smoke.sh for each. However, smoke.sh's test_model_openai_inference function (line 48 in smoke.sh) hardcodes a "What color is grass?" query and expects "green" in the response. This is specific to vLLM/Qwen and will likely fail for Vertex AI/Gemini. See related issue in smoke.sh review.

This issue originates from the smoke.sh file. Both files must be fixed together: smoke.sh must be updated to support multiple model types, and/or this workflow must conditionally run smoke tests only for compatible models.

🧹 Nitpick comments (1)
.github/workflows/run-integration-tests.yml (1)

134-140: Remove redundant health check; setup-llama-stack already validates readiness.

The "Verify deployment" step (lines 134–140) repeats the health check already performed by the setup-llama-stack action (line 122–132). Removing this duplicate step reduces noise and makes the workflow intent clearer.

-      - name: Verify deployment
-        if: github.event_name != 'workflow_dispatch'
-        shell: bash
-        run: |
-          echo "Verifying deployed Llama Stack instance..."
-          curl -fsS http://127.0.0.1:8321/v1/health || exit 1
-          echo "Deployment verified successfully!"
-
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f88ffc and 59db223.

📒 Files selected for processing (3)
  • .github/actions/setup-vllm/action.yml (2 hunks)
  • .github/workflows/run-integration-tests.yml (1 hunks)
  • tests/smoke.sh (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (4)
.github/actions/setup-vllm/action.yml (2)

6-22: Verify vLLM health endpoint exists and is reachable.

The "Start VLLM" step launches the container non-blocking with --net=host, enabling access to http://localhost:8000/health. Confirm that this health endpoint exists in the vLLM image (quay.io/higginsd/vllm-cpu:65393ee064-qwen3) and responds as expected before merging.


24-38: Good readiness validation pattern.

The "Wait for VLLM to be ready" step cleanly separates startup from readiness polling, with appropriate timeout (120 sec), diagnostic output on failure, and exit-code semantics. This is a solid approach for async startup validation.

tests/smoke.sh (1)

7-24: Verify exact health response format.

Line 16 performs a strict equality check on the health endpoint response: [ "$resp" != '{"status":"OK"}' ]. Confirm that the Llama Stack health endpoint consistently returns exactly this JSON format (no whitespace variations, no additional fields) to avoid false failures.
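If the exact body cannot be guaranteed, a tolerant match on the status field avoids false failures from whitespace variations or extra keys. A sketch (hypothetical helper, not the smoke test's current check):

```shell
# Tolerant alternative to strict string equality on the health body:
# match "status":"OK" regardless of whitespace or additional fields.
health_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"OK"'
}
```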

.github/workflows/run-integration-tests.yml (1)

85-99: Clarify workflow_dispatch build and test execution intent.

For workflow_dispatch events, the Containerfile is generated (lines 85–99) but the image is not built (line 102 condition is github.event_name != 'workflow_dispatch'). As a result, for workflow_dispatch:

  • Checkout, setup tools (uv, QEMU, Buildx), and Containerfile generation run.
  • No image build, no container startup, no tests run.

This differs from the schedule event, which generates the Containerfile and builds/tests the image. Clarify whether workflow_dispatch is intended to only generate/inspect the Containerfile, or whether the build and test steps should also run.

This was noted in a prior review comment: "If this step is triggered for workflow_dispatch when is the image built, given L100 excludes that trigger for the following build step?"

…p, removing unnecessary conditions, and cleaning up container resources post-tests
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59db223 and c408bc3.

📒 Files selected for processing (1)
  • .github/workflows/run-integration-tests.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/run-integration-tests.yml (3)

81-83: Verify setup-llama-stack handles vLLM readiness when startup is non-blocking.

The vLLM container is started non-blocking (line 81-83), then the Llama Stack container is started immediately after Google Cloud authentication (line 121-131). The setup-llama-stack action must wait for vLLM to be ready before deploying, since it requires the vLLM endpoint at http://localhost:8000/v1 (line 129).

Please confirm that the setup-llama-stack action (referenced at line 123) includes proper readiness checks for vLLM before deploying the Llama Stack container. Ensure there are no race conditions between vLLM startup completing and Llama Stack attempting to configure it.

Additionally, verify this behavior against the past review concern: "Why is vLLM no longer blocking? Are there scenarios we would want to test even if the vLLM setup fails?"

Also applies to: 121-131


121-131: Verify model parameter format matches setup-llama-stack action expectations.

The inference_model parameter includes a provider prefix (e.g., vllm-inference/Qwen/Qwen3-0.6B). Confirm that the setup-llama-stack action expects this format and that it's correctly used in the Llama Stack deployment configuration.

Additionally, verify that this parameter naming and format is consistent across:

  • Line 127: passed to setup-llama-stack as inference_model
  • Line 155: used in integration tests as INFERENCE_MODEL environment variable
  • Line 169: used in Vertex AI integration tests as INFERENCE_MODEL environment variable

If the formats differ, ensure proper transformation is applied in the action or test scripts.


113-119: Verify credentials file path consistency between auth and Llama Stack action.

The google-github-actions/auth action creates a credentials file when create_credentials_file: true is set, typically at ${GITHUB_WORKSPACE}/gha-creds-*.json. The setup-llama-stack action should use the GOOGLE_APPLICATION_CREDENTIALS environment variable that the auth action sets, or explicitly handle the credentials file path.

Confirm that the setup-llama-stack action correctly receives and uses the GCP credentials file. Per the PR objectives, credentials should be mounted at /run/secrets/gcp-credentials in the container. Verify the action handles this mount correctly and that the path is consistent with local Podman setup mentioned in test plan.

…nditionally and conditionally execute Vertex AI tests based on credential availability
…ining image tags based on event type, improving tagging logic for better version management
…tex-AI' of github.com:opendatahub-io/llama-stack-distribution into RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dff2183 and f936fae.

📒 Files selected for processing (1)
  • .github/workflows/redhat-distro-container.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (2)
.github/workflows/redhat-distro-container.yml (2)

93-105: Excellent implementation of the "Determine image tags" step—directly addresses the past review suggestion.

The tag determination logic is now cleanly extracted into its own step, making the tagging strategy explicit and maintainable. The regex pattern on line 101 correctly matches rhoai-v* branches using bash's [[ operator with extended pattern matching.

For workflow_dispatch events, the format source-{commit}-{sha} aligns with the IMAGE_TAG computed at line 47 for output purposes. For other events (workflow_run), conditional tagging correctly adds :latest for main and :{branch}-latest for release branches.
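The tag rules described here reduce to a small pure function. This is a sketch only — `compose_tags` is illustrative, omits the image repository prefix, and mirrors the behavior the review describes rather than the workflow's exact script:

```shell
# Tag suffix per the described rules:
#   workflow_dispatch       -> source-{llama_stack_commit}-{sha}
#   workflow_run on main    -> latest
#   workflow_run on rhoai-v*-> {branch}-latest
compose_tags() {
  local event="$1" branch="$2" lls_commit="$3" sha="$4"
  if [ "$event" = "workflow_dispatch" ]; then
    echo "source-${lls_commit}-${sha}"
  elif [ "$branch" = "main" ]; then
    echo "latest"
  elif [[ "$branch" == rhoai-v* ]]; then
    echo "${branch}-latest"
  fi
}
```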


51-55: Verify checkout ref correctness for workflow_run events.

The checkout ref at line 55 uses github.event.workflow_run.head_sha for workflow_run events. Confirm that this ref correctly corresponds to the exact commit that triggered the upstream "Run integration tests" workflow, rather than the current branch HEAD. This ensures the built container matches the version that passed tests.

Comment on lines +37 to +39
if: |
(github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success') ||
github.event_name == 'workflow_dispatch'

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -name "redhat-distro-container.yml" -type f

Repository: opendatahub-io/llama-stack-distribution

Length of output: 130


🌐 Web query:

GitHub workflow_run event head_sha availability when conclusion success

💡 Result:

Short answer: github.event.workflow_run.head_sha is usually present for workflow_run events that originate from a commit (push/PR), but it is not guaranteed (can be empty/null) and the event's conclusion field can also sometimes be missing. For reliable values fetch the workflow run via the Actions API (GET /repos/:owner/:repo/actions/runs/:run_id) and use that data (or the commit->pulls API) instead.

Key refs:

  • Official docs: workflow_run event payload / behavior. [1]
  • Community reports that github.event.workflow_run.conclusion can be empty and the recommended workaround is to call the API to read the run. [2]
  • Community examples showing use of github.event.workflow_run.head_sha but noting it’s not always sufficient for finding PR base/head; users query the API or search PRs by commit SHA. [3]

If you want, I can show a minimal workflow step that fetches the canonical head_sha and conclusion from the API (with octokit/request-action or gh).


🏁 Script executed:

cat -n .github/workflows/redhat-distro-container.yml | head -60

Repository: opendatahub-io/llama-stack-distribution

Length of output: 3227


Confirm that head_sha fallback behavior is intentional; GitHub does not guarantee head_sha in workflow_run events even when conclusion == 'success'.

The web search confirms that github.event.workflow_run.head_sha is not guaranteed to be present for workflow_run events, and the GitHub Actions API should be used for reliable data. This means the fallback to 'main' at line 44 could be triggered unexpectedly when head_sha is missing, rather than only in edge cases. Additionally, head_sha is used in multiple locations (lines 23, 44, 47, 55), so the fallback behavior should be clearly documented. Consider either:

  • Adding an explicit check to verify head_sha presence before proceeding
  • Using the GitHub Actions API to fetch the canonical run data
  • Documenting why the 'main' fallback is acceptable for your use case
🤖 Prompt for AI Agents
.github/workflows/redhat-distro-container.yml around lines 37 to 39: the
workflow currently assumes github.event.workflow_run.head_sha may be missing and
falls back to 'main' later, which can be triggered unexpectedly; update the job
to explicitly verify head_sha before using it (and fail/skip the job or require
a fallback policy), or call the GitHub Actions API to fetch the canonical
workflow_run.head_sha when github.event_name == 'workflow_run', and ensure every
place that uses head_sha (lines 23, 44, 47, 55) reads from the verified value;
alternatively, if the 'main' fallback is acceptable, add an explicit comment
documenting why that fallback is safe and add a guard that logs a warning when
head_sha is absent.
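The "verify head_sha before using it" option can be made concrete with a resolver that warns on fallback. A hypothetical helper — the workflow would call it once and export the result so every later step reads the same resolved value:

```shell
# Resolve the ref to build: use head_sha when present, otherwise warn
# and fall back to the provided default (e.g. "main" or github.ref).
resolve_head_sha() {
  local head_sha="$1" fallback="$2"
  if [ -n "$head_sha" ]; then
    echo "$head_sha"
  else
    echo "WARNING: workflow_run.head_sha missing; falling back to $fallback" >&2
    echo "$fallback"
  fi
}
```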


ktdreyer commented Dec 5, 2025

This PR is getting big. It implements multiple things besides vertex testing, and we're 52 comments in. If this is working, I propose we merge it and iterate, so that we make further reviews easier. Smaller changes are easier to review.

My suggestions for future improvements:

  1. .github/actions/setup-llama-stack/action.yml has a huge run: block. When we cram a lot of code into YAML, it's harder to read, debug, use shellcheck, etc. We could abstract that out into a main.sh file, following the pattern that upstream LLS uses (example).

  2. Both run-integration-tests.yml and redhat-distro-container.yml are building the LLS container. The integration test workflow builds the container image but doesn't push it to the registry. Then the publish workflow rebuilds the same image from scratch to push it. This means we're building the same image twice instead of testing the exact artifact we publish. Ideally we would build -> test -> tag and push that same tested artifact.

  3. It's unclear to me why the functional tests operate sequentially instead of running in parallel. I would like to explore alternatives to make this simpler and faster if possible.

Again, these are future suggestions for improvement. I'm approving this PR today as-is.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/redhat-distro-container.yml (1)

23-23: Inconsistent fallback behavior for head_sha across env and checkout steps; concurrency group may mis-assign workflow_run events if head_sha is null.

The workflow uses github.event.workflow_run.head_sha in multiple places with different fallbacks:

  • Line 23 (concurrency group): Falls back to github.ref if head_sha is missing; if it evaluates to empty, the concurrency group becomes {workflow}-workflow_run-, potentially grouping unrelated workflow_run events.
  • Line 44 (LLAMA_STACK_COMMIT_SHA): Falls back to 'main'
  • Line 47 (IMAGE_TAG): Falls back to github.sha
  • Line 55 (checkout ref): Falls back to github.ref

This inconsistency means that if head_sha is missing, the checked-out code, built commit SHA, and concurrency group may not align. The past review flagged that head_sha is not guaranteed by GitHub. Consider:

  1. Documenting the fallback policy: Add a comment explaining why these different fallbacks are acceptable and what behavior each represents.
  2. Adding explicit validation: Log a warning or fail-fast if head_sha is missing in workflow_run events, rather than silently falling back.
  3. Unifying fallbacks: If feasible, use consistent fallback values across all uses (e.g., always default to github.ref or always call the GitHub Actions API to fetch canonical run data).

Also applies to: 44-44, 47-47, 55-55

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f936fae and 8b9e361.

📒 Files selected for processing (1)
  • .github/workflows/redhat-distro-container.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/redhat-distro-container.yml (3)

37-39: Job conditional logic is clear and correct.

The conditional correctly gates the build-and-publish job to only run when tests pass (workflow_run + success) or on manual dispatch.


93-105: Tag determination logic is correct and addresses past review suggestion.

The step properly extracts tag logic into a separate, readable block as suggested in the past review. The bash logic correctly branches on event type and handles the main/rhoai-v* tagging rules.
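The branching described here can be sketched as a small shell function; `determine_tags` and its arguments are hypothetical names standing in for the workflow's actual variables:

```shell
#!/usr/bin/env bash
# Sketch of the tag-determination branching: workflow_dispatch builds get the
# requested source tag; other events get a commit tag plus branch-based extras.
determine_tags() {
  local event="$1" image="$2" branch="$3" short_sha="$4" source_tag="$5"
  local tags
  if [ "$event" = "workflow_dispatch" ]; then
    tags="${image}:${source_tag}"
  else
    tags="${image}:${short_sha}"
    case "$branch" in
      main)     tags="${tags},${image}:latest" ;;
      rhoai-v*) tags="${tags},${image}:${branch}" ;;
    esac
  fi
  printf '%s\n' "$tags"
}
```

In the workflow, the result would be written to `$GITHUB_ENV` as `IMAGE_TAGS` for the build step to consume.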


107-116: Build configuration is appropriate.

The use of docker/build-push-action, platforms constraint, and GitHub Actions Cache are standard and correct. The tags input references IMAGE_TAGS which is populated by the "Determine image tags" step.


@nathan-weinberg nathan-weinberg left a comment


Do you have any evidence these changes are functional?

Frankly, I think this PR is trying to do way too much - I'm not confident this will work on merge, and despite being labeled as a "test" addition you are doing some massive refactors here - while I'm not against it, this needs to be split up.

LLAMA_STACK_COMMIT_SHA: ${{ github.event.inputs.llama_stack_commit_sha || 'main' }}
strategy:
matrix:
platform: [linux/amd64] # TODO: enable other arch once all pip packages are available.

Why is this removed?

with:
python-version: 3.12
version: 0.7.6
enable-cache: false

Why?

context: .
file: distribution/Containerfile
platforms: ${{ matrix.platform }}
platforms: linux/amd64

Why have you removed the matrix?

name: Build, test, and publish Red Hat Distribution Containers
name: Build and publish Red Hat Distribution Containers

on:

There are several mentions of workflow_dispatch later in this file, but it's not exposed as an option here? Unless I am missing something?


- name: Output custom build information
if: contains(fromJSON('["workflow_dispatch", "schedule"]'), github.event_name)
if: github.event_name == 'workflow_dispatch'

Why only for this event?

with:
python-version: 3.12
version: 0.7.6
enable-cache: false

Why no cache?

Comment on lines +149 to +150
echo "Running smoke test for vLLM model: $VLLM_INFERENCE_MODEL"
INFERENCE_MODEL="$VLLM_INFERENCE_MODEL" ./tests/smoke.sh

Can't we just use the same env var names? This seems unnecessarily complex

# ============================================

- name: Gather logs and debugging information
if: always() && github.event_name != 'workflow_dispatch'

You're using always() but also a condition?

fi

- name: Upload logs as artifacts
if: always() && github.event_name != 'workflow_dispatch'

always and a condition?

retention-days: 7

- name: Cleanup containers
if: always() && github.event_name != 'workflow_dispatch'

always and a condition?

@nathan-weinberg

Also, 44 commits is crazy and needs to be squashed

@mergify

mergify bot commented Dec 8, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @Artemon-line please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 8, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
docs/live-tests-guide.md (1)

44-44: INFERENCE_MODEL format contradicts prior review feedback: clarify whether provider prefix should be included.

The documentation provides examples with provider prefixes (e.g., vllm-inference/Qwen/Qwen3-0.6B at line 44 and vertexai/google/gemini-2.0-flash at line 62), and line 69 explicitly states the script expects this format. However, a prior review comment flagged these exact examples as incorrect and recommended removing the provider prefix. This discrepancy needs resolution.

If the prefix should be removed, update all examples at lines 44, 62, 214, and 279 to match the intended format. If the prefix is correct, ensure this aligns with how distribution/run.yaml structures provider and model identifiers.

Also applies to: 62-62, 69-69, 214-214, 279-279

.github/workflows/redhat-distro-container.yml (1)

37-47: Confirm that head_sha fallback behavior is intentional and acceptable for your workflow reliability.

The workflow uses github.event.workflow_run.head_sha in multiple locations (lines 23, 44, 47, 55) with fallbacks to 'main' or github.sha. A prior review flagged that workflow_run event payloads do not guarantee head_sha is present, even when conclusion == 'success'.

If head_sha is missing and the fallback to 'main' is triggered, the workflow will build and publish an image from the main branch instead of the commit that triggered the test workflow. This could result in publishing an untested image to production.

Recommended actions:

  1. Explicit verification: Add a check to verify head_sha is present before proceeding; fail or warn if missing
  2. Use GitHub API: Call the Actions API to fetch the canonical workflow run data including a verified head_sha
  3. Document fallback policy: If the 'main' fallback is acceptable for your use case, add explicit comments explaining why and document when this fallback may be triggered
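Recommendation 2 (fetching canonical run data from the API) could look like the following sketch; the step id, the output name, and relying on the runner's preinstalled `gh` CLI are assumptions:

```yaml
# Hypothetical step: resolve head_sha from the Actions API rather than
# trusting the event payload, then consume it via steps.head.outputs.sha.
- name: Resolve head SHA via the Actions API
  id: head
  if: github.event_name == 'workflow_run'
  env:
    GH_TOKEN: ${{ github.token }}
  run: |
    sha=$(gh api "repos/${{ github.repository }}/actions/runs/${{ github.event.workflow_run.id }}" --jq '.head_sha')
    test -n "$sha"
    echo "sha=$sha" >> "$GITHUB_OUTPUT"
```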
🧹 Nitpick comments (2)
docs/live-tests-guide.md (2)

227-227: Hyphenate compound adjective.

Line 227 should read log-gathering instead of log gathering to form a compound adjective modifying "steps".


169-169: Avoid emphasis in place of headings (markdown style).

Lines 169 and 194 use emphasis (**Standard Pattern**, **Exception**) to mark section-like content. Consider converting these to proper markdown headings (#### Standard Pattern, #### Exception) for better document structure and accessibility.

Also applies to: 194-194

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 8b9e361 and 36dafa2.

📒 Files selected for processing (3)
  • .github/workflows/redhat-distro-container.yml (4 hunks)
  • .github/workflows/run-integration-tests.yml (1 hunks)
  • docs/live-tests-guide.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/live-tests-guide.md

[uncategorized] GITHUB — "The official name of this software platform is spelled with a capital 'H'": flagged at ~10, ~73 (twice), ~126, ~151, ~157, ~188, ~201, ~205, ~294, ~302, ~323, ~324, ~325, ~330, ~331, ~332, ~333, and ~334. Every hit is a reference to a `.github/workflows/...` path, so these are likely false positives on file paths rather than on the platform name.

[grammar] ~227 QB_NEW_EN_HYPHEN — "Use a hyphen to join words": "log gathering steps" should be "log-gathering steps".
🪛 markdownlint-cli2 (0.18.1)
docs/live-tests-guide.md

169, 194: Emphasis used instead of a heading (MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms. (1)
  • GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/run-integration-tests.yml (2)

144-193: Sequential testing with conditional guards is well-structured.

The workflow properly implements:

  • Conditional vLLM startup (line 89) only for PR/push events
  • Smoke tests with separate guards for vLLM (always) and Vertex AI (only when credentials present)
  • Sequential integration tests with appropriate skip logic
  • Clear separation between provider test phases

The prior critical issue about Vertex AI smoke test failures when credentials are missing has been properly resolved via conditional checks at lines 158–163.
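A minimal sketch of this conditional-guard pattern, mirroring the secret into a job-level env var so a step-level `if:` can test it (all names illustrative, not taken from the workflow):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    env:
      VERTEX_AI_PROJECT: ${{ secrets.VERTEX_AI_PROJECT }}
    steps:
      # Skipped automatically on forks or repos without the GCP secret configured.
      - name: Run Vertex AI integration tests
        if: env.VERTEX_AI_PROJECT != ''
        run: INFERENCE_MODEL="vertexai/google/gemini-2.0-flash" ./tests/run_integration_tests.sh
```

The env indirection is a common pattern for gating steps on secret availability: an unset secret expands to an empty string, which the condition can test.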


199-244: Cleanup and logging strategy is sound.

The use of always() combined with the github.event_name != 'workflow_dispatch' condition correctly ensures logs are gathered and containers cleaned up even on test failure, while avoiding unnecessary steps when containers were never started.
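The endorsed pattern is roughly the following sketch; the container names are illustrative:

```yaml
# Run even when earlier steps failed, but skip on manual dispatch runs
# where no containers were started.
- name: Cleanup containers
  if: always() && github.event_name != 'workflow_dispatch'
  run: |
    podman rm -f llama-stack vllm 2>/dev/null || true
```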

.github/workflows/redhat-distro-container.yml (1)

92-104: Image tag determination logic is clear and well-scoped.

The dynamic tag determination correctly differentiates between workflow_dispatch (source format) and workflow_run (commit-based tags with branch variants). Logic is readable and easy to maintain.

@mergify mergify bot removed the needs-rebase label Dec 9, 2025
@Artemon-line

If this is working, I propose we merge it and iterate, so that we make further reviews easier. Smaller changes are easier to review.

@Artemon-line

Agreed, closing this one and splitting everything into smaller PRs. Thanks for the review.

@leseb leseb deleted the RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI branch January 6, 2026 10:19
Successfully merging this pull request may close these issues.

RHAIENG-1793: add live CI tests for GCP Vertex

4 participants