
feat: Create ODH distro image smoke test for Vertex AI#139

Closed
Artemon-line wants to merge 47 commits into main from RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI

Conversation


@Artemon-line Artemon-line commented Nov 25, 2025

What does this PR do?

Adds Vertex AI support to integration tests alongside vLLM. Tests run sequentially using a single Llama Stack instance configured with both providers.

Key Changes

  • Added Vertex AI provider support with conditional execution based on GCP secrets
  • Single Llama Stack deployment - one instance supports both vLLM and Vertex AI simultaneously
  • Sequential test execution - vLLM tests first, then Vertex AI tests
  • Optimized vLLM startup - starts early (non-blocking) and is validated before tests
  • GCP credentials handling - mounts credentials to container at /run/secrets/gcp-credentials (matching local Podman setup)
  • Early cleanup - vLLM container stopped after vLLM tests complete

Architecture

  • Deployment: Single Llama Stack instance with both vLLM and Vertex AI providers configured
  • Testing: Sequential execution - vLLM tests (validation, smoke, integration), then Vertex AI tests (integration)
  • Optimization: vLLM starts early in parallel with image build, validated before tests

Test Plan

Local Testing

  1. vLLM: Start the vLLM container, then run ./tests/run_integration_tests.sh with INFERENCE_MODEL="vllm-inference/Qwen/Qwen3-0.6B"
  2. Vertex AI: Set VERTEX_AI_PROJECT, run gcloud auth application-default login, then run ./tests/run_integration_tests.sh with INFERENCE_MODEL="vertexai/google/gemini-2.0-flash"

CI Testing

  1. Create PR, verify vLLM tests run first, then Vertex AI tests
  2. Verify Vertex AI tests are skipped with warning if secrets not configured

Required Secrets

For Vertex AI tests to run in CI:

  • VERTEX_AI_PROJECT – Target GCP project
  • GCP_WORKLOAD_IDENTITY_PROVIDER – Used for OIDC authentication via Workload Identity Federation

If Vertex AI secrets are not configured, Vertex AI tests are skipped with a warning.
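The skip-with-warning gate described above can be sketched as a small shell helper. This is illustrative only — the actual workflow expresses the gate in YAML `if:` conditions, and the function name `vertex_tests_enabled` is hypothetical:

```shell
# Hypothetical helper: decide whether Vertex AI tests should run,
# based on whether both required secrets are exposed as env vars.
vertex_tests_enabled() {
  if [ -n "${VERTEX_AI_PROJECT:-}" ] && [ -n "${GCP_WORKLOAD_IDENTITY_PROVIDER:-}" ]; then
    return 0
  fi
  echo "WARNING: Vertex AI secrets not configured; skipping Vertex AI tests" >&2
  return 1
}
```

A caller would simply wrap the Vertex AI test invocation in `if vertex_tests_enabled; then ...; fi`.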

Files Changed

  • .github/workflows/run-integration-tests.yml - Added Vertex AI test section, sequential execution
  • .github/actions/setup-llama-stack/action.yml - Updated to mount GCP credentials from GOOGLE_APPLICATION_CREDENTIALS
  • .github/actions/setup-vllm/action.yml - Made non-blocking (removed wait loop)
  • docs/live-tests-guide.md - Updated to reflect sequential execution and adding new providers

Summary by CodeRabbit

  • Chores

    • Added reusable CI actions to start services and wait for readiness; split start and readiness steps; streamlined container build-and-publish workflow with dispatch-aware tagging and gated execution paths.
  • Tests

    • Added a Run integration tests workflow covering multiple providers with conditional paths, improved readiness polling, artifact collection, and updated smoke/integration scripts to use direct model identifiers and assume external readiness.
  • Documentation

    • Added a live-tests guide and made the CI build badge clickable.


- Added a step to check for Vertex AI recordings in the integration tests workflow, allowing conditional execution of Vertex AI tests based on the presence of recordings.
- Updated the integration tests script to dynamically select the text model based on the environment, supporting both Vertex AI and vllm-inference providers.

This improves the testing process by enabling the use of recorded data for Vertex AI, ensuring more reliable and efficient test runs.
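The "dynamically select the text model based on the environment" step might look like the following sketch. The function name is illustrative; the model identifiers come from the PR's test plan:

```shell
# Hypothetical sketch: prefer the Vertex AI model when its project is
# configured, else fall back to the vLLM model used in CI.
select_inference_model() {
  if [ -n "${VERTEX_AI_PROJECT:-}" ]; then
    echo "vertexai/google/gemini-2.0-flash"
  else
    echo "vllm-inference/Qwen/Qwen3-0.6B"
  fi
}
```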

Signed-off-by: Artemy <ahladenk@redhat.com>
- Removed unnecessary blank lines in the live tests workflow YAML file for better readability.
- Updated the condition for the `create-pr` job to use `needs.live-tests.result` instead of `needs.live-tests.outcome`.
- Added instructions for setting up Podman for pre-commit hooks in the GitHub Actions testing guide.
- Included a section on fixing SELinux context for Fedora/RHEL users in the documentation.
- Enhanced the live tests guide to clarify the process of including recordings in pull requests.

These changes improve the clarity and functionality of the testing workflows and documentation.

Signed-off-by: Artemy <ahladenk@redhat.com>
- Removed redundant environment variable settings in the live tests workflow, replacing them with direct secret references for improved security.
- Simplified the Docker run command by using a fixed container name instead of an environment variable.
- Updated the local testing script to verify GCP authentication before proceeding, ensuring smoother execution.
- Enhanced the live tests guide by clarifying prerequisites and streamlining instructions for running tests with VLLM and Vertex AI.

These changes improve the clarity, security, and functionality of the testing workflows and documentation.

Signed-off-by: Artemy <ahladenk@redhat.com>
- Removed the optional environment variable setting for VERTEX_AI_LOCATION in the live tests guide to streamline instructions.
- Added a log statement in the local testing script to confirm the use of the Vertex AI provider and its model.

These changes enhance the clarity and usability of the documentation and scripts for running live tests.

Signed-off-by: [Your Name] <your.email@example.com>
Signed-off-by: Artemy <ahladenk@redhat.com>

coderabbitai bot commented Nov 25, 2025

Walkthrough

Adds a composite GitHub Action to start and wait for a Llama Stack container, separates vLLM startup from readiness polling, introduces a "Run integration tests" workflow, refactors the Red Hat container publish workflow to use workflow_run/workflow_dispatch branching, updates tests and docs, and converts the README build badge to a workflow link.

Changes

Cohort / File(s) — Summary

  • Composite action: setup Llama Stack (.github/actions/setup-llama-stack/action.yml)
    New composite action that builds conditional docker run args (inference/embedding models, vLLM/Vertex AI options), optionally mounts Vertex AI credentials, starts the container with host networking and port mapping, and polls http://127.0.0.1:8321/v1/health up to 60 attempts, with diagnostics and logs on timeout.
  • Composite action: vLLM start (.github/actions/setup-vllm/action.yml)
    Modified to start vLLM detached (non-blocking) and moved the health check into a separate step that polls the vLLM health endpoint (60 attempts, 2s interval), with explicit start/wait messages and docker logs on failure.
  • Workflows: integration & publish (.github/workflows/run-integration-tests.yml, .github/workflows/redhat-distro-container.yml)
    Added a new "Run integration tests" workflow (triggers: PR, branch pushes, dispatch, schedule) handling vLLM/Vertex AI paths, credential gating, image build for dispatch, smoke/integration tests, artifact collection, and cleanup. Refactored the Red Hat workflow to trigger on workflow_run or workflow_dispatch, split dispatch vs run logic (Containerfile generation, tag composition), and consolidated publishing.
  • Tests (tests/smoke.sh, tests/run_integration_tests.sh)
    tests/smoke.sh: removed internal startup orchestration, added explicit model presence checks, and updated the OpenAI inference payload to use INFERENCE_MODEL directly. tests/run_integration_tests.sh: pytest invocation updated to use the raw INFERENCE_MODEL (removed the vllm-inference/ prefix).
  • Docs & README (docs/live-tests-guide.md, README.md)
    Added docs/live-tests-guide.md with instructions for running live inference tests (local and CI, providers, secrets). Converted the README build badge to a clickable link to the workflow.
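Both composite actions follow the same poll-until-ready pattern (60 attempts, short interval). A generic version of that loop, with the probe command and limits as parameters, could look like this — a sketch, not the actions' actual script:

```shell
# Generic readiness poller in the spirit of the actions' health checks.
# Usage: wait_for_ready '<probe command>' [attempts] [interval_seconds]
wait_for_ready() {
  local cmd="$1" attempts="${2:-60}" interval="${3:-2}" i
  for ((i = 1; i <= attempts; i++)); do
    # Run the probe quietly; success means the service is ready.
    if eval "$cmd" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$interval"
  done
  echo "timed out after $attempts attempts" >&2
  return 1
}

# Example probe (matches the Llama Stack endpoint named in the summary):
#   wait_for_ready 'curl -fsS http://127.0.0.1:8321/v1/health' 60 2
```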

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub
    participant Runner as Actions Runner
    participant Docker as Docker Engine
    participant VLLM as vLLM
    participant Llama as Llama Stack
    participant Vertex as Vertex AI
    participant Tests as Test Runner

    Dev->>GH: push / PR / workflow_dispatch
    GH->>Runner: start workflow
    Runner->>Runner: prepare tools (buildx/qemu/etc.)
    alt vLLM path
        Runner->>Docker: docker run vLLM (detached)
        Runner->>VLLM: poll /v1/health (60 tries)
    end
    alt Vertex creds present
        Runner->>Vertex: authenticate / mount creds
    end
    alt dispatch builds image
        Runner->>Docker: build Llama Stack image
        Docker-->>Runner: image ready
    end
    Runner->>Docker: docker run Llama Stack (env, mounts, --net=host, ports)
    Runner->>Llama: poll /v1/health (up to 60 attempts)
    Llama-->>Runner: health OK
    Runner->>Tests: run smoke & integration tests (vLLM)
    alt Vertex creds present
        Runner->>Tests: run Vertex AI integration tests
    end
    Runner->>Runner: collect logs, upload artifacts, cleanup
    Runner-->>GH: workflow result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Review hotspots:
    • .github/actions/setup-llama-stack/action.yml — conditional docker argument construction, credential mount detection, health-poll loop and diagnostic logging.
    • .github/actions/setup-vllm/action.yml — detached startup, separated readiness step, timeout and docker logs handling.
    • .github/workflows/run-integration-tests.yml & .github/workflows/redhat-distro-container.yml — event branching, Containerfile generation, tag composition logic, publish gating.
    • tests/smoke.sh & tests/run_integration_tests.sh — ensure test expectations match external startup and updated model naming.

Poem

🐇 I snuck a tag, I nudged a run,
I polled the health until the sun.
vLLM hummed, Vertex waved hello,
Logs in paw, the pipelines flow. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 0.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The PR title focuses on Vertex AI smoke testing, but the changeset includes broader integration test infrastructure: vLLM optimization, sequential test orchestration, GCP credentials handling, and CI workflow restructuring.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 36dafa2 and 27f427f.

📒 Files selected for processing (1)
  • tests/smoke.sh (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/smoke.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (linux/amd64)
  • GitHub Check: Summary


@Artemon-line changed the title from "RHAIENG-1793: Create ODH distro image smoke test for Vertex AI" to "feat: Create ODH distro image smoke test for Vertex AI" on Nov 26, 2025
Signed-off-by: [Your Name] <your.email@example.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
- Added `shell: bash` to the `live-tests.yml` and `redhat-distro-container.yml` workflows to ensure consistent execution environment for scripts.

This change enhances the reliability of the workflows by explicitly defining the shell used for running commands.

Signed-off-by: [Your Name] <your.email@example.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
…tion into RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI

Signed-off-by: Artemy <ahladenk@redhat.com>
This change ensures that Vertex AI tests are only executed when recordings are present, streamlining the testing process.

Signed-off-by: Artemy <ahladenk@redhat.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
@ktdreyer ktdreyer left a comment


Please rebase this on main so that we can try it out!

I've posted some initial comments in the meantime.

mergify bot commented Dec 1, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @Artemon-line please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 1, 2025
@Artemon-line Artemon-line self-assigned this Dec 2, 2025
- Enhanced the README to include a clickable build badge for better visibility.
- Removed the obsolete `record-integration-tests.yml` workflow to streamline CI processes.
- Refactored the `redhat-distro-container.yml` workflow to consolidate testing steps and improve clarity.
- Updated the live tests guide to reflect changes in running tests with vLLM and Vertex AI, including clearer instructions and prerequisites.

These changes aim to simplify the CI/CD pipeline and improve documentation for running integration tests.

Signed-off-by: Artemy <ahladenk@redhat.com>
…tion into RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI
@mergify mergify bot removed the needs-rebase label Dec 2, 2025
mergify bot commented Dec 2, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @Artemon-line please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 2, 2025
…update integration tests workflow to streamline model inference testing
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/smoke.sh (1)

46-59: The hardcoded "green" expectation in test_model_openai_inference will fail for Vertex AI models with different response formats.

The function sends "What color is grass?" to $INFERENCE_MODEL (which includes vertexai/google/gemini-2.0-flash per the CI/CD workflow) and greps for "green". While both Qwen and Gemini will likely mention grass as green, response phrasing differs across models—Gemini might return "green color" or "the colour of grass is green" instead of containing the word "green" directly. For robust multi-model testing, either parameterize the expected response based on the model, or use a more generic success criterion like checking for non-empty completion or absence of error status.
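One way to implement the "more generic success criterion" suggested here is to accept any non-empty completion and reject explicit errors, rather than grepping for a model-specific word. This is a sketch assuming an OpenAI-compatible /chat/completions response shape; `inference_succeeded` is a hypothetical helper, not the smoke test's current code:

```shell
# Model-agnostic success check for an OpenAI-style chat completion body.
inference_succeeded() {
  local body="$1"
  # Reject responses that carry an explicit error field.
  if printf '%s' "$body" | grep -q '"error"'; then
    return 1
  fi
  # Require some non-empty "content" value in the response.
  printf '%s' "$body" | grep -Eq '"content"[[:space:]]*:[[:space:]]*"[^"]+"'
}
```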

♻️ Duplicate comments (1)
.github/workflows/run-integration-tests.yml (1)

146-158: Smoke test loop runs both vLLM and Vertex AI models, but inference test may fail for Vertex AI.

Lines 146–158 loop over both VLLM_INFERENCE_MODEL and VERTEX_AI_INFERENCE_MODEL, running smoke.sh for each. However, smoke.sh's test_model_openai_inference function (line 48 in smoke.sh) hardcodes a "What color is grass?" query and expects "green" in the response. This is specific to vLLM/Qwen and will likely fail for Vertex AI/Gemini. See related issue in smoke.sh review.

This issue originates from the smoke.sh file. Both files must be fixed together: smoke.sh must be updated to support multiple model types, and/or this workflow must conditionally run smoke tests only for compatible models.

🧹 Nitpick comments (1)
.github/workflows/run-integration-tests.yml (1)

134-140: Remove redundant health check; setup-llama-stack already validates readiness.

The "Verify deployment" step (lines 134–140) repeats the health check already performed by the setup-llama-stack action (line 122–132). Removing this duplicate step reduces noise and makes the workflow intent clearer.

-      - name: Verify deployment
-        if: github.event_name != 'workflow_dispatch'
-        shell: bash
-        run: |
-          echo "Verifying deployed Llama Stack instance..."
-          curl -fsS http://127.0.0.1:8321/v1/health || exit 1
-          echo "Deployment verified successfully!"
-
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f88ffc and 59db223.

📒 Files selected for processing (3)
  • .github/actions/setup-vllm/action.yml (2 hunks)
  • .github/workflows/run-integration-tests.yml (1 hunks)
  • tests/smoke.sh (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (4)
.github/actions/setup-vllm/action.yml (2)

6-22: Verify vLLM health endpoint exists and is reachable.

The "Start VLLM" step launches the container non-blocking with --net=host, enabling access to http://localhost:8000/health. Confirm that this health endpoint exists in the vLLM image (quay.io/higginsd/vllm-cpu:65393ee064-qwen3) and responds as expected before merging.


24-38: Good readiness validation pattern.

The "Wait for VLLM to be ready" step cleanly separates startup from readiness polling, with appropriate timeout (120 sec), diagnostic output on failure, and exit-code semantics. This is a solid approach for async startup validation.

tests/smoke.sh (1)

7-24: Verify exact health response format.

Line 16 performs a strict equality check on the health endpoint response: [ "$resp" != '{"status":"OK"}' ]. Confirm that the Llama Stack health endpoint consistently returns exactly this JSON format (no whitespace variations, no additional fields) to avoid false failures.
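If the exact body cannot be guaranteed, a tolerant match on the status field avoids false failures from whitespace variations or extra keys. A sketch (hypothetical helper, not the smoke test's current check):

```shell
# Tolerant alternative to strict string equality on the health body:
# match "status":"OK" regardless of whitespace or additional fields.
health_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"OK"'
}
```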

.github/workflows/run-integration-tests.yml (1)

85-99: Clarify workflow_dispatch build and test execution intent.

For workflow_dispatch events, the Containerfile is generated (lines 85–99) but the image is not built (line 102 condition is github.event_name != 'workflow_dispatch'). As a result, for workflow_dispatch:

  • Checkout, setup tools (uv, QEMU, Buildx), and Containerfile generation run.
  • No image build, no container startup, no tests run.

This differs from the schedule event, which generates the Containerfile and builds/tests the image. Clarify whether workflow_dispatch is intended to only generate/inspect the Containerfile, or whether the build and test steps should also run.

This was noted in a prior review comment: "If this step is triggered for workflow_dispatch when is the image built, given L100 excludes that trigger for the following build step?"

…p, removing unnecessary conditions, and cleaning up container resources post-tests
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59db223 and c408bc3.

📒 Files selected for processing (1)
  • .github/workflows/run-integration-tests.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/run-integration-tests.yml (3)

81-83: Verify setup-llama-stack handles vLLM readiness when startup is non-blocking.

The vLLM container is started non-blocking (line 81-83), then the Llama Stack container is started immediately after Google Cloud authentication (line 121-131). The setup-llama-stack action must wait for vLLM to be ready before deploying, since it requires the vLLM endpoint at http://localhost:8000/v1 (line 129).

Please confirm that the setup-llama-stack action (referenced at line 123) includes proper readiness checks for vLLM before deploying the Llama Stack container. Ensure there are no race conditions between vLLM startup completing and Llama Stack attempting to configure it.

Additionally, verify this behavior against the past review concern: "Why is vLLM no longer blocking? Are there scenarios we would want to test even if the vLLM setup fails?"

Also applies to: 121-131


121-131: Verify model parameter format matches setup-llama-stack action expectations.

The inference_model parameter includes a provider prefix (e.g., vllm-inference/Qwen/Qwen3-0.6B). Confirm that the setup-llama-stack action expects this format and that it's correctly used in the Llama Stack deployment configuration.

Additionally, verify that this parameter naming and format is consistent across:

  • Line 127: passed to setup-llama-stack as inference_model
  • Line 155: used in integration tests as INFERENCE_MODEL environment variable
  • Line 169: used in Vertex AI integration tests as INFERENCE_MODEL environment variable

If the formats differ, ensure proper transformation is applied in the action or test scripts.


113-119: Verify credentials file path consistency between auth and Llama Stack action.

The google-github-actions/auth action creates a credentials file when create_credentials_file: true is set, typically at ${GITHUB_WORKSPACE}/gha-creds-*.json. The setup-llama-stack action should use the GOOGLE_APPLICATION_CREDENTIALS environment variable that the auth action sets, or explicitly handle the credentials file path.

Confirm that the setup-llama-stack action correctly receives and uses the GCP credentials file. Per the PR objectives, credentials should be mounted at /run/secrets/gcp-credentials in the container. Verify the action handles this mount correctly and that the path is consistent with local Podman setup mentioned in test plan.

…nditionally and conditionally execute Vertex AI tests based on credential availability
…ining image tags based on event type, improving tagging logic for better version management
…tex-AI' of github.com:opendatahub-io/llama-stack-distribution into RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dff2183 and f936fae.

📒 Files selected for processing (1)
  • .github/workflows/redhat-distro-container.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (2)
.github/workflows/redhat-distro-container.yml (2)

93-105: Excellent implementation of the "Determine image tags" step—directly addresses the past review suggestion.

The tag determination logic is now cleanly extracted into its own step, making the tagging strategy explicit and maintainable. The regex pattern on line 101 correctly matches rhoai-v* branches using bash's [[ operator with extended pattern matching.

For workflow_dispatch events, the format source-{commit}-{sha} aligns with the IMAGE_TAG computed at line 47 for output purposes. For other events (workflow_run), conditional tagging correctly adds :latest for main and :{branch}-latest for release branches.
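The tag rules described here reduce to a small pure function. This is a sketch only — `compose_tags` is illustrative, omits the image repository prefix, and mirrors the behavior the review describes rather than the workflow's exact script:

```shell
# Tag suffix per the described rules:
#   workflow_dispatch       -> source-{llama_stack_commit}-{sha}
#   workflow_run on main    -> latest
#   workflow_run on rhoai-v*-> {branch}-latest
compose_tags() {
  local event="$1" branch="$2" lls_commit="$3" sha="$4"
  if [ "$event" = "workflow_dispatch" ]; then
    echo "source-${lls_commit}-${sha}"
  elif [ "$branch" = "main" ]; then
    echo "latest"
  elif [[ "$branch" == rhoai-v* ]]; then
    echo "${branch}-latest"
  fi
}
```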


51-55: Verify checkout ref correctness for workflow_run events.

The checkout ref at line 55 uses github.event.workflow_run.head_sha for workflow_run events. Confirm that this ref correctly corresponds to the exact commit that triggered the upstream "Run integration tests" workflow, rather than the current branch HEAD. This ensures the built container matches the version that passed tests.

Comment on lines +37 to +39
if: |
(github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success') ||
github.event_name == 'workflow_dispatch'

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -name "redhat-distro-container.yml" -type f

Repository: opendatahub-io/llama-stack-distribution

Length of output: 130


🌐 Web query:

GitHub workflow_run event head_sha availability when conclusion success

💡 Result:

Short answer: github.event.workflow_run.head_sha is usually present for workflow_run events that originate from a commit (push/PR), but it is not guaranteed (can be empty/null) and the event's conclusion field can also sometimes be missing. For reliable values fetch the workflow run via the Actions API (GET /repos/:owner/:repo/actions/runs/:run_id) and use that data (or the commit->pulls API) instead.

Key refs:

  • Official docs: workflow_run event payload / behavior. [1]
  • Community reports that github.event.workflow_run.conclusion can be empty and the recommended workaround is to call the API to read the run. [2]
  • Community examples showing use of github.event.workflow_run.head_sha but noting it’s not always sufficient for finding PR base/head; users query the API or search PRs by commit SHA. [3]

If you want, I can show a minimal workflow step that fetches the canonical head_sha and conclusion from the API (with octokit/request-action or gh).


🏁 Script executed:

cat -n .github/workflows/redhat-distro-container.yml | head -60

Repository: opendatahub-io/llama-stack-distribution

Length of output: 3227


Confirm that head_sha fallback behavior is intentional; GitHub does not guarantee head_sha in workflow_run events even when conclusion == 'success'.

The web search confirms that github.event.workflow_run.head_sha is not guaranteed to be present for workflow_run events, and the GitHub Actions API should be used for reliable data. This means the fallback to 'main' at line 44 could be triggered unexpectedly when head_sha is missing, rather than only in edge cases. Additionally, head_sha is used in multiple locations (lines 23, 44, 47, 55), so the fallback behavior should be clearly documented. Consider either:

  • Adding an explicit check to verify head_sha presence before proceeding
  • Using the GitHub Actions API to fetch the canonical run data
  • Documenting why the 'main' fallback is acceptable for your use case
🤖 Prompt for AI Agents
.github/workflows/redhat-distro-container.yml around lines 37 to 39: the
workflow currently assumes github.event.workflow_run.head_sha may be missing and
falls back to 'main' later, which can be triggered unexpectedly; update the job
to explicitly verify head_sha before using it (and fail/skip the job or require
a fallback policy), or call the GitHub Actions API to fetch the canonical
workflow_run.head_sha when github.event_name == 'workflow_run', and ensure every
place that uses head_sha (lines 23, 44, 47, 55) reads from the verified value;
alternatively, if the 'main' fallback is acceptable, add an explicit comment
documenting why that fallback is safe and add a guard that logs a warning when
head_sha is absent.
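The "verify head_sha before using it" option can be made concrete with a resolver that warns on fallback. A hypothetical helper — the workflow would call it once and export the result so every later step reads the same resolved value:

```shell
# Resolve the ref to build: use head_sha when present, otherwise warn
# and fall back to the provided default (e.g. "main" or github.ref).
resolve_head_sha() {
  local head_sha="$1" fallback="$2"
  if [ -n "$head_sha" ]; then
    echo "$head_sha"
  else
    echo "WARNING: workflow_run.head_sha missing; falling back to $fallback" >&2
    echo "$fallback"
  fi
}
```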


ktdreyer commented Dec 5, 2025

This PR is getting big. It implements multiple things besides vertex testing, and we're 52 comments in. If this is working, I propose we merge it and iterate, so that we make further reviews easier. Smaller changes are easier to review.

My suggestions for future improvements:

  1. .github/actions/setup-llama-stack/action.yml has a huge run: block. When we cram a lot of code into YAML, it's harder to read, debug, use shellcheck, etc. We could abstract that out into a main.sh file, following the pattern that upstream LLS uses (example).

  2. Both run-integration-tests.yml and redhat-distro-container.yml are building the LLS container. The integration test workflow builds the container image but doesn't push it to the registry. Then the publish workflow rebuilds the same image from scratch to push it. This means we're building the same image twice instead of testing the exact artifact we publish. Ideally we would build -> test -> tag and push that same tested artifact.

  3. It's unclear to me why the functional tests operate sequentially instead of running in parallel. I would like to explore alternatives to make this simpler and faster if possible.

Again, these are future suggestions for improvement. I'm approving this PR today as-is.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/redhat-distro-container.yml (1)

23-23: Inconsistent fallback behavior for head_sha across env and checkout steps; concurrency group may mis-assign workflow_run events if head_sha is null.

The workflow uses github.event.workflow_run.head_sha in multiple places with different fallbacks:

  • Line 23 (concurrency group): Falls back to github.ref if head_sha is missing; if it evaluates to empty, the concurrency group becomes {workflow}-workflow_run-, potentially grouping unrelated workflow_run events.
  • Line 44 (LLAMA_STACK_COMMIT_SHA): Falls back to 'main'
  • Line 47 (IMAGE_TAG): Falls back to github.sha
  • Line 55 (checkout ref): Falls back to github.ref

This inconsistency means that if head_sha is missing, the checked-out code, built commit SHA, and concurrency group may not align. The past review flagged that head_sha is not guaranteed by GitHub. Consider:

  1. Documenting the fallback policy: Add a comment explaining why these different fallbacks are acceptable and what behavior each represents.
  2. Adding explicit validation: Log a warning or fail-fast if head_sha is missing in workflow_run events, rather than silently falling back.
  3. Unifying fallbacks: If feasible, use consistent fallback values across all uses (e.g., always default to github.ref or always call the GitHub Actions API to fetch canonical run data).

Also applies to: 44-44, 47-47, 55-55

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f936fae and 8b9e361.

📒 Files selected for processing (1)
  • .github/workflows/redhat-distro-container.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test
  • GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/redhat-distro-container.yml (3)

37-39: Job conditional logic is clear and correct.

The conditional correctly gates the build-and-publish job to only run when tests pass (workflow_run + success) or on manual dispatch.


93-105: Tag determination logic is correct and addresses past review suggestion.

The step properly extracts tag logic into a separate, readable block as suggested in the past review. The bash logic correctly branches on event type and handles the main/rhoai-v* tagging rules.
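The branching described here can be sketched as a small shell function; `determine_tags` and its arguments are hypothetical names standing in for the workflow's actual variables:

```shell
#!/usr/bin/env bash
# Sketch of the tag-determination branching: workflow_dispatch builds get the
# requested source tag; other events get a commit tag plus branch-based extras.
determine_tags() {
  local event="$1" image="$2" branch="$3" short_sha="$4" source_tag="$5"
  local tags
  if [ "$event" = "workflow_dispatch" ]; then
    tags="${image}:${source_tag}"
  else
    tags="${image}:${short_sha}"
    case "$branch" in
      main)     tags="${tags},${image}:latest" ;;
      rhoai-v*) tags="${tags},${image}:${branch}" ;;
    esac
  fi
  printf '%s\n' "$tags"
}
```

In the workflow, the result would be written to `$GITHUB_ENV` as `IMAGE_TAGS` for the build step to consume.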


107-116: Build configuration is appropriate.

The use of docker/build-push-action, platforms constraint, and GitHub Actions Cache are standard and correct. The tags input references IMAGE_TAGS which is populated by the "Determine image tags" step.


@nathan-weinberg nathan-weinberg left a comment


Do you have any evidence these changes are functional?

Frankly, I think this PR is trying to do way too much - I'm not confident this will work on merge, and despite being labeled as a "test" addition you are doing some massive refactors here - while I'm not against it, this needs to be split up.

LLAMA_STACK_COMMIT_SHA: ${{ github.event.inputs.llama_stack_commit_sha || 'main' }}
strategy:
matrix:
platform: [linux/amd64] # TODO: enable other arch once all pip packages are available.

Why is this removed?

with:
python-version: 3.12
version: 0.7.6
enable-cache: false

Why?

context: .
file: distribution/Containerfile
platforms: ${{ matrix.platform }}
platforms: linux/amd64

Why have you removed the matrix?

name: Build, test, and publish Red Hat Distribution Containers
name: Build and publish Red Hat Distribution Containers

on:

There are several mentions of workflow_dispatch later in this file, but it's not exposed as an option here? Unless I am missing something?


- name: Output custom build information
if: contains(fromJSON('["workflow_dispatch", "schedule"]'), github.event_name)
if: github.event_name == 'workflow_dispatch'

Why only for this event?

with:
python-version: 3.12
version: 0.7.6
enable-cache: false

Why no cache?

Comment on lines +149 to +150
echo "Running smoke test for vLLM model: $VLLM_INFERENCE_MODEL"
INFERENCE_MODEL="$VLLM_INFERENCE_MODEL" ./tests/smoke.sh

Can't we just use the same env var names? This seems unnecessarily complex

# ============================================

- name: Gather logs and debugging information
if: always() && github.event_name != 'workflow_dispatch'

You're using always() but also a condition?

fi

- name: Upload logs as artifacts
if: always() && github.event_name != 'workflow_dispatch'

always and a condition?

retention-days: 7

- name: Cleanup containers
if: always() && github.event_name != 'workflow_dispatch'

always and a condition?

@nathan-weinberg

Also, 44 commits is crazy and needs to be squashed

@mergify

mergify bot commented Dec 8, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @Artemon-line please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 8, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
docs/live-tests-guide.md (1)

44-44: INFERENCE_MODEL format contradicts prior review feedback: clarify whether provider prefix should be included.

The documentation provides examples with provider prefixes (e.g., vllm-inference/Qwen/Qwen3-0.6B at line 44 and vertexai/google/gemini-2.0-flash at line 62), and line 69 explicitly states the script expects this format. However, a prior review comment flagged these exact examples as incorrect and recommended removing the provider prefix. This discrepancy needs resolution.

If the prefix should be removed, update all examples at lines 44, 62, 214, and 279 to match the intended format. If the prefix is correct, ensure this aligns with how distribution/run.yaml structures provider and model identifiers.

Also applies to: 62-62, 69-69, 214-214, 279-279

.github/workflows/redhat-distro-container.yml (1)

37-47: Confirm that head_sha fallback behavior is intentional and acceptable for your workflow reliability.

The workflow uses github.event.workflow_run.head_sha in multiple locations (lines 23, 44, 47, 55) with fallbacks to 'main' or github.sha. A prior review flagged that workflow_run event payloads do not guarantee head_sha is present, even when conclusion == 'success'.

If head_sha is missing and the fallback to 'main' is triggered, the workflow will build and publish an image from the main branch instead of the commit that triggered the test workflow. This could result in publishing an untested image to production.

Recommended actions:

  1. Explicit verification: Add a check to verify head_sha is present before proceeding; fail or warn if missing
  2. Use GitHub API: Call the Actions API to fetch the canonical workflow run data including a verified head_sha
  3. Document fallback policy: If the 'main' fallback is acceptable for your use case, add explicit comments explaining why and document when this fallback may be triggered
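Recommendation 2 (fetching canonical run data from the API) could look like the following sketch; the step id, the output name, and relying on the runner's preinstalled `gh` CLI are assumptions:

```yaml
# Hypothetical step: resolve head_sha from the Actions API rather than
# trusting the event payload, then consume it via steps.head.outputs.sha.
- name: Resolve head SHA via the Actions API
  id: head
  if: github.event_name == 'workflow_run'
  env:
    GH_TOKEN: ${{ github.token }}
  run: |
    sha=$(gh api "repos/${{ github.repository }}/actions/runs/${{ github.event.workflow_run.id }}" --jq '.head_sha')
    test -n "$sha"
    echo "sha=$sha" >> "$GITHUB_OUTPUT"
```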
🧹 Nitpick comments (2)
docs/live-tests-guide.md (2)

227-227: Hyphenate compound adjective.

Line 227 should read log-gathering instead of log gathering to form a compound adjective modifying "steps".


169-169: Avoid emphasis in place of headings (markdown style).

Lines 169 and 194 use emphasis (**Standard Pattern**, **Exception**) to mark section-like content. Consider converting these to proper markdown headings (#### Standard Pattern, #### Exception) for better document structure and accessibility.

Also applies to: 194-194

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 8b9e361 and 36dafa2.

📒 Files selected for processing (3)
  • .github/workflows/redhat-distro-container.yml (4 hunks)
  • .github/workflows/run-integration-tests.yml (1 hunks)
  • docs/live-tests-guide.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/live-tests-guide.md

[uncategorized] GITHUB — "The official name of this software platform is spelled with a capital 'H'": flagged at ~10, ~73 (twice), ~126, ~151, ~157, ~188, ~201, ~205, ~294, ~302, ~323, ~324, ~325, ~330, ~331, ~332, ~333, and ~334. Every hit is a reference to a `.github/workflows/...` path, so these are likely false positives on file paths rather than on the platform name.

[grammar] ~227 QB_NEW_EN_HYPHEN — "Use a hyphen to join words": "log gathering steps" should be "log-gathering steps".
🪛 markdownlint-cli2 (0.18.1)
docs/live-tests-guide.md

169, 194: Emphasis used instead of a heading (MD036, no-emphasis-as-heading)

⏰ Context from checks skipped due to timeout of 90000ms. (1)
  • GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/run-integration-tests.yml (2)

144-193: Sequential testing with conditional guards is well-structured.

The workflow properly implements:

  • Conditional vLLM startup (line 89) only for PR/push events
  • Smoke tests with separate guards for vLLM (always) and Vertex AI (only when credentials present)
  • Sequential integration tests with appropriate skip logic
  • Clear separation between provider test phases

The prior critical issue about Vertex AI smoke test failures when credentials are missing has been properly resolved via conditional checks at lines 158–163.
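A minimal sketch of this conditional-guard pattern, mirroring the secret into a job-level env var so a step-level `if:` can test it (all names illustrative, not taken from the workflow):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    env:
      VERTEX_AI_PROJECT: ${{ secrets.VERTEX_AI_PROJECT }}
    steps:
      # Skipped automatically on forks or repos without the GCP secret configured.
      - name: Run Vertex AI integration tests
        if: env.VERTEX_AI_PROJECT != ''
        run: INFERENCE_MODEL="vertexai/google/gemini-2.0-flash" ./tests/run_integration_tests.sh
```

The env indirection is a common pattern for gating steps on secret availability: an unset secret expands to an empty string, which the condition can test.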


199-244: Cleanup and logging strategy is sound.

The use of always() combined with the github.event_name != 'workflow_dispatch' condition correctly ensures logs are gathered and containers cleaned up even on test failure, while avoiding unnecessary steps when containers were never started.
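The endorsed pattern is roughly the following sketch; the container names are illustrative:

```yaml
# Run even when earlier steps failed, but skip on manual dispatch runs
# where no containers were started.
- name: Cleanup containers
  if: always() && github.event_name != 'workflow_dispatch'
  run: |
    podman rm -f llama-stack vllm 2>/dev/null || true
```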

.github/workflows/redhat-distro-container.yml (1)

92-104: Image tag determination logic is clear and well-scoped.

The dynamic tag determination correctly differentiates between workflow_dispatch (source format) and workflow_run (commit-based tags with branch variants). Logic is readable and easy to maintain.

@mergify mergify bot removed the needs-rebase label Dec 9, 2025
@Artemon-line

If this is working, I propose we merge it and iterate, so that we make further reviews easier. Smaller changes are easier to review.

@Artemon-line

Agreed, closing this one and splitting everything into smaller PRs. Thanks for the review.

@leseb leseb deleted the RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Vertex-AI branch January 6, 2026 10:19
Successfully merging this pull request may close these issues.

RHAIENG-1793: add live CI tests for GCP Vertex

4 participants