Changes from 43 commits

Commits (47 total)
898c61a
feat: enhance integration testing workflow for Vertex AI
Artemon-line Nov 25, 2025
071e64f
chore: clean up live tests workflow and update documentation
Artemon-line Nov 25, 2025
5cd3c11
refactor: update live tests workflow and enhance documentation
Artemon-line Nov 25, 2025
b89d2f0
docs: update live tests guide and local script
Artemon-line Nov 25, 2025
c4bb419
chore: remove example secrets and refactor workflows
Artemon-line Nov 26, 2025
dbc2170
chore: remove unnecessary blank line in extract-llama-stack-info.sh
Artemon-line Nov 26, 2025
56d5e8f
chore: specify shell type in workflow files
Artemon-line Nov 26, 2025
366250d
chore: streamline live tests workflow by removing unnecessary steps
Artemon-line Nov 26, 2025
1aea53d
Merge branch 'main' of github.com:opendatahub-io/llama-stack-distribu…
Artemon-line Nov 26, 2025
e6c9304
fix: update Vertex AI recordings check in workflow
Artemon-line Nov 26, 2025
33f3e86
feat: add live CI tests workflow for Llama Stack
Artemon-line Nov 28, 2025
0f3303a
refactor: update CI workflows and improve live tests documentation
Artemon-line Dec 2, 2025
dc86754
Merge branch 'main' of github.com:opendatahub-io/llama-stack-distribu…
Artemon-line Dec 2, 2025
9479da7
chore: update Llama Stack version in README
Artemon-line Dec 2, 2025
bd4bddb
chore: remove obsolete Vertex secrets check from CI workflow
Artemon-line Dec 2, 2025
e09518c
chore: enhance CI workflow for Vertex AI and update live tests docume…
Artemon-line Dec 2, 2025
52f998d
chore: update CI workflows to streamline Red Hat distribution contain…
Artemon-line Dec 2, 2025
5ee8a6e
chore: update CI workflow for integration tests and remove obsolete s…
Artemon-line Dec 2, 2025
c050543
chore: update live tests guide and remove obsolete local test script
Artemon-line Dec 2, 2025
661ad57
chore: update Llama Stack version in README to 7c0bb39
Artemon-line Dec 2, 2025
8f1e9be
feat: add setup-llama-stack action for container management
Artemon-line Dec 2, 2025
470027e
chore: update llama_stack_provider_trustyai_fms version to 0.3.1 in b…
Artemon-line Dec 2, 2025
319c58e
chore: improve Llama Stack health check logic and update integration …
Artemon-line Dec 2, 2025
1a9391d
chore: refactor integration test workflow to use sequential execution…
Artemon-line Dec 3, 2025
a8872db
Merge branch 'main' of github.com:opendatahub-io/llama-stack-distribu…
Artemon-line Dec 3, 2025
13fc1ea
chore: update Python version in pre-commit configuration to 3.14; add…
Artemon-line Dec 3, 2025
04c3079
Merge branch 'main' into RHAIENG-1793-Create-ODH-distro-image-smoke-t…
Artemon-line Dec 3, 2025
00e6034
chore: enhance setup-llama-stack action with improved GCP credentials…
Artemon-line Dec 3, 2025
6bbc381
fix: remove unnecessary whitespace in setup-llama-stack action YAML file
Artemon-line Dec 3, 2025
1e51609
chore: simplify Cloud SDK setup in integration tests workflow by remo…
Artemon-line Dec 3, 2025
41c7913
chore: improve GCP credentials handling in setup-llama-stack action a…
Artemon-line Dec 3, 2025
4418ae4
chore: refactor integration tests workflow to utilize a matrix strate…
Artemon-line Dec 3, 2025
8ba39a8
chore: update integration tests workflow to run vLLM and Vertex AI te…
Artemon-line Dec 3, 2025
7274d14
chore: disable caching in Python setup for redhat-distro-container an…
Artemon-line Dec 3, 2025
6f88ffc
chore: update caching configuration in Python setup for redhat-distro…
Artemon-line Dec 3, 2025
59db223
chore: enhance setup-vllm action to wait for container readiness and …
Artemon-line Dec 4, 2025
c408bc3
chore: refine integration tests workflow by renaming VLLM startup ste…
Artemon-line Dec 4, 2025
7d587be
chore: remove Llama Stack container verification from smoke test scri…
Artemon-line Dec 4, 2025
dff2183
chore: update integration tests workflow to run vLLM smoke tests unco…
Artemon-line Dec 4, 2025
0fd2176
Merge branch 'main' into RHAIENG-1793-Create-ODH-distro-image-smoke-t…
Artemon-line Dec 4, 2025
c942051
chore: enhance redhat-distro-container workflow by dynamically determ…
Artemon-line Dec 5, 2025
f936fae
Merge branch 'RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Ver…
Artemon-line Dec 5, 2025
6d3ab30
Merge branch 'main' into RHAIENG-1793-Create-ODH-distro-image-smoke-t…
Artemon-line Dec 8, 2025
8b9e361
Merge branch 'main' into RHAIENG-1793-Create-ODH-distro-image-smoke-t…
Artemon-line Dec 8, 2025
1d12a3f
chore: update workflows to improve architecture support and streamlin…
Artemon-line Dec 9, 2025
36dafa2
Merge branch 'RHAIENG-1793-Create-ODH-distro-image-smoke-test-for-Ver…
Artemon-line Dec 9, 2025
27f427f
Merge branch 'main' into RHAIENG-1793-Create-ODH-distro-image-smoke-t…
Artemon-line Dec 9, 2025
126 changes: 126 additions & 0 deletions .github/actions/setup-llama-stack/action.yml
@@ -0,0 +1,126 @@
name: Setup Llama Stack
description: Start Llama Stack container and wait for it to be ready
inputs:
image_name:
description: 'Container image name'
required: true
image_tag:
description: 'Container image tag'
required: true
inference_model:
description: 'Inference model name'
required: true
embedding_model:
description: 'Embedding model name'
required: true
vllm_url:
description: 'VLLM URL (for vLLM provider)'
required: false
default: ''
vertex_ai_project:
description: 'Vertex AI project ID (for Vertex AI provider)'
required: false
default: ''
vertex_ai_location:
description: 'Vertex AI location (for Vertex AI provider)'
required: false
default: 'us-central1'
runs:
using: "composite"
steps:
- name: Start Llama Stack container
shell: bash
env:
IMAGE_NAME: ${{ inputs.image_name }}
IMAGE_TAG: ${{ inputs.image_tag }}
INFERENCE_MODEL: ${{ inputs.inference_model }}
EMBEDDING_MODEL: ${{ inputs.embedding_model }}
VLLM_URL: ${{ inputs.vllm_url }}
VERTEX_AI_PROJECT: ${{ inputs.vertex_ai_project }}
VERTEX_AI_LOCATION: ${{ inputs.vertex_ai_location }}
run: |
# Start llama stack container
# Build docker run command with conditional environment variables
DOCKER_ENV_ARGS=(
--env INFERENCE_MODEL="$INFERENCE_MODEL"
--env EMBEDDING_MODEL="$EMBEDDING_MODEL"
--env TRUSTYAI_LMEVAL_USE_K8S=False
)

# Add VLLM_URL only if defined and non-empty
if [ -n "$VLLM_URL" ]; then
DOCKER_ENV_ARGS+=(--env VLLM_URL="$VLLM_URL")
fi

# Add VERTEX_AI_PROJECT only if defined and non-empty
if [ -n "$VERTEX_AI_PROJECT" ]; then
DOCKER_ENV_ARGS+=(--env VERTEX_AI_PROJECT="$VERTEX_AI_PROJECT")
fi

# Add VERTEX_AI_LOCATION only if defined and non-empty
if [ -n "$VERTEX_AI_LOCATION" ]; then
DOCKER_ENV_ARGS+=(--env VERTEX_AI_LOCATION="$VERTEX_AI_LOCATION")
fi

# Mount GCP credentials if they exist (for Vertex AI)
# Mount to /run/secrets/gcp-credentials to match local Podman setup
# Use GOOGLE_APPLICATION_CREDENTIALS env var if set (from google-github-actions/auth with create_credentials_file: true)
# Otherwise fall back to standard location
DOCKER_VOLUME_ARGS=()
if [ -n "$GOOGLE_APPLICATION_CREDENTIALS" ] && [ -f "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
# Use the credentials file path from GOOGLE_APPLICATION_CREDENTIALS env var
GCP_CREDENTIALS_FILE="$GOOGLE_APPLICATION_CREDENTIALS"
elif [ -f "$HOME/.config/gcloud/application_default_credentials.json" ]; then
# Fall back to standard location
GCP_CREDENTIALS_FILE="$HOME/.config/gcloud/application_default_credentials.json"
else
GCP_CREDENTIALS_FILE=""
fi

if [ -n "$GCP_CREDENTIALS_FILE" ] && [ -f "$GCP_CREDENTIALS_FILE" ]; then
# Mount credentials file to /run/secrets/gcp-credentials (matching local Podman setup)
DOCKER_VOLUME_ARGS=(
-v "$GCP_CREDENTIALS_FILE:/run/secrets/gcp-credentials:ro"
)
DOCKER_ENV_ARGS+=(--env GOOGLE_APPLICATION_CREDENTIALS="/run/secrets/gcp-credentials")
echo "Mounting GCP credentials to /run/secrets/gcp-credentials for Vertex AI support"
echo "Credentials file: $GCP_CREDENTIALS_FILE"
else
echo "Warning: GCP credentials file not found"
echo "Checked GOOGLE_APPLICATION_CREDENTIALS: ${GOOGLE_APPLICATION_CREDENTIALS:-not set}"
echo "Checked standard location: $HOME/.config/gcloud/application_default_credentials.json"
echo "Vertex AI authentication may fail"
fi

docker run \
-d \
--pull=never \
--net=host \
-p 8321:8321 \
"${DOCKER_VOLUME_ARGS[@]}" \
"${DOCKER_ENV_ARGS[@]}" \
--name llama-stack \
"$IMAGE_NAME:$IMAGE_TAG"
echo "Started Llama Stack container..."

- name: Wait for Llama Stack to be ready
shell: bash
run: |
# Wait for llama stack to be ready by doing a health check
echo "Waiting for Llama Stack server..."
for i in {1..60}; do
echo "Attempt $i/60 to connect to Llama Stack..."
if resp=$(curl -fsS http://127.0.0.1:8321/v1/health 2>/dev/null); then
if [ "$resp" == '{"status":"OK"}' ]; then
echo "Llama Stack server is up!"
exit 0
fi
else
echo "Connection failed, retrying in 1 second..."
fi
sleep 1
done
echo "Llama Stack server failed to start after 60 attempts :("
echo "Container logs:"
docker logs llama-stack || true
exit 1
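The readiness step above polls `http://127.0.0.1:8321/v1/health` once per second until the response body is exactly `{"status":"OK"}`, then dumps container logs on timeout. For local debugging, the same retry pattern can be factored into a reusable function; the sketch below is hypothetical (the `wait_for_ready` name and the `probe_cmd`/`expected` parameterization are assumptions, not code from this PR):

```shell
# Hypothetical standalone version of the action's readiness loop.
# probe_cmd is run each attempt; success means its stdout equals `expected`.
wait_for_ready() {
  local probe_cmd="$1" expected="$2" attempts="${3:-60}" delay="${4:-1}"
  local i resp
  for ((i = 1; i <= attempts; i++)); do
    # Mirrors: resp=$(curl -fsS http://127.0.0.1:8321/v1/health)
    if resp=$($probe_cmd 2>/dev/null) && [ "$resp" = "$expected" ]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_ready 'curl -fsS http://127.0.0.1:8321/v1/health' '{"status":"OK"}'` reproduces the action's check against a locally running container.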
26 changes: 18 additions & 8 deletions .github/actions/setup-vllm/action.yml
@@ -1,12 +1,12 @@
name: Setup VLLM
description: Start VLLM
description: Start VLLM container and wait for it to be ready
runs:
using: "composite"
steps:
- name: Start VLLM
shell: bash
run: |
# Start vllm container
# Start vllm container in background (non-blocking)
docker run -d \
--name vllm \
--privileged=true \
@@ -19,10 +19,20 @@ runs:
--model /root/.cache/Qwen3-0.6B \
--served-model-name Qwen/Qwen3-0.6B \
--max-model-len 8192
echo "vLLM container started (waiting for it to be ready)..."

# Wait for vllm to be ready
echo "Waiting for vllm to be ready..."
timeout 900 bash -c 'until curl -fsS http://localhost:8000/health >/dev/null; do
echo "Waiting for vllm..."
sleep 5
done'
- name: Wait for VLLM to be ready
shell: bash
run: |
echo "Validating vLLM is ready..."
for i in {1..60}; do
if curl -fsS http://localhost:8000/health >/dev/null 2>&1; then
echo "vLLM is ready!"
exit 0
fi
echo "Waiting for vLLM... ($i/60)"
sleep 2
done
echo "vLLM failed to start after 120 seconds"
docker logs vllm || true
exit 1
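The setup-llama-stack action earlier in this diff builds `DOCKER_ENV_ARGS` by appending an `--env` flag only for each variable that is set and non-empty, so the container never receives empty provider settings. That pattern generalizes to a small helper; the sketch below is a hypothetical refactor (the `build_env_args` name and the indirect-expansion approach are assumptions, not code from this PR):

```shell
# Hypothetical generalization of the conditional DOCKER_ENV_ARGS pattern:
# emit `--env NAME=value` only for the named variables that are non-empty.
build_env_args() {
  local -a args=()
  local name val
  for name in "$@"; do
    val="${!name:-}"  # bash indirect expansion: value of the variable named $name
    if [ -n "$val" ]; then
      args+=(--env "$name=$val")
    fi
  done
  printf '%s\n' "${args[@]}"
}
```

`build_env_args VLLM_URL VERTEX_AI_PROJECT VERTEX_AI_LOCATION` prints one argument per line, which a caller could collect back into an array with `mapfile -t`; the action instead appends to the array inline, which avoids the subshell but repeats the `[ -n ... ]` check per variable.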