
ci: add multi-arch matrix with MaaS vLLM for ARM64 testing #287

Open
Artemon-line wants to merge 3 commits into main from
feat/maas-vllm-arm64-ci

Conversation


@Artemon-line commented Mar 20, 2026

Summary

  • Add matrix strategy for [amd64, arm64] with native GitHub runners (ubuntu-latest / ubuntu-24.04-arm)
  • Use MaaS vLLM endpoints for inference/embedding tests on origin PRs (both arches)
  • Keep local vllm-cpu fallback for fork PRs on amd64
  • Fork arm64 PRs are build-only (no runtime tests, MaaS secrets unavailable)
  • Add per-arch cache scoping, artifact names, and graceful cleanup
  • Add ARM64 support section to README documenting the fork PR coverage gap
  • Skip upstream-broken tests (test_openai_completion_logprobs, test_openai_embeddings_with_encoding_format_base64)

Supersedes #252
Implements RHAIENG-3509

How publish works

The publish step is gated on matrix.arch == 'amd64' so it runs only once (not duplicated across matrix legs). It does a fresh multi-arch build with platforms: linux/amd64,linux/arm64, leveraging GHA cache from both the amd64 and arm64 build steps, and pushes a single multi-arch manifest. Both architectures are published in one operation.
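
The arrangement described above can be sketched roughly as follows; the step name, event condition, cache scope names, and action version are assumptions for illustration, not taken from the actual workflow file:

```yaml
# Illustrative sketch only; the real steps in
# .github/workflows/redhat-distro-container.yml may differ.
jobs:
  build:
    strategy:
      matrix:
        arch: [amd64, arm64]
    steps:
      - name: Publish multi-arch image
        # Gated to the amd64 leg so the manifest is pushed exactly once.
        if: matrix.arch == 'amd64' && github.event_name == 'push'
        uses: docker/build-push-action@v6
        with:
          platforms: linux/amd64,linux/arm64   # one manifest, both arches
          push: true
          cache-from: |
            type=gha,scope=build-amd64
            type=gha,scope=build-arm64
```

Because both per-arch build steps populated the GHA cache earlier in the run, the "fresh" multi-arch build here is mostly cache hits rather than a full rebuild.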

Test plan

  • Verify CI runs on both amd64 and arm64 with MaaS endpoints (origin PR)
  • Verify smoke test passes with MaaS inference (llama-3-2-3b) and embedding (nomic-embed-text-v1-5)
  • Verify integration tests pass on both arches
  • Verify integration tests pass locally (vLLM: 21 passed, OpenAI: 40 passed, 0 failed)
  • Verify publish step runs only on amd64 (arm64 job shows it as "skipped") -- post-merge: triggers on push to main or workflow_dispatch
  • Verify fork PR behavior: amd64 uses local vllm-cpu, arm64 is build-only -- post-merge: requires a fork PR targeting main after the new workflow is on main (GitHub Actions runs workflow from the merge commit, so fork detection logic must be on the base branch first)

Skipped upstream tests

Test: test_openai_embeddings_with_encoding_format_base64
Reason: nomic-embed-text-v1-5 doesn't support the dimensions param (always returns 768)

Test: test_openai_completion_logprobs
Reason: Llama Stack /v1/completions validates logprobs as bool instead of int per the OpenAI spec


coderabbitai bot commented Mar 20, 2026

📝 Walkthrough


The PR adds multi-architecture CI for amd64 and arm64 using a matrix-driven GitHub Actions workflow, with per-arch runners, Buildx platform selection, and per-arch cache scopes. vLLM/Containerfile generation and publishing steps are gated to amd64; MaaS-based vLLM/embedding endpoints and tokens can override local vLLM settings via environment writes. Smoke and integration test scripts were updated to conditionally inject vLLM/embedding envs and to skip specific integration tests. README documents ARM64 support and differences in fork vs origin testing. Docker cleanup and artifact naming were made arch-aware.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Security Review

CWE-532 (Insertion of Sensitive Information into Log Files)

  • Observed: workflow writes MaaS tokens into GITHUB_ENV and may emit them during step execution. Mitigation: call ::add-mask:: for each token before writing to GITHUB_ENV and avoid echoing token values. Example fix:
    run: |
      echo "::add-mask::$MAAS_VLLM_API_TOKEN"
      echo "::add-mask::$MAAS_EMBEDDING_API_TOKEN"
      echo "VLLM_API_TOKEN=$MAAS_VLLM_API_TOKEN" >> $GITHUB_ENV
      echo "VLLM_EMBEDDING_API_TOKEN=$MAAS_EMBEDDING_API_TOKEN" >> $GITHUB_ENV

CWE-798 (Use of Hard-coded Credentials) / Incomplete validation of sensitive inputs

  • Observed: workflow can write empty or partial token variables (e.g., URL set without API token), and the smoke script may inject tokens even when embedding URL is unset. Mitigation: validate required pairs before exporting to GITHUB_ENV; fail-fast if a URL is set without its corresponding token when token is required. Add explicit guards:
    • if MAAS_VLLM_URL is set then require non-empty MAAS_VLLM_API_TOKEN or explicitly document safe anonymous usage.
    • only export VLLM_API_TOKEN when non-empty.
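
A minimal sketch of such guards, assuming a bash step; the helper name and messages are illustrative, and only the MAAS_* / VLLM_* variable names come from this PR:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical guard: fail when a MaaS URL is configured without its token.
require_pair() {
  name="$1"; url="$2"; token="$3"
  if [ -n "$url" ] && [ -z "$token" ]; then
    echo "error: $name URL is set but its API token is empty" >&2
    return 1
  fi
}

# Fail fast on inconsistent pairs before anything is written to GITHUB_ENV.
require_pair "MaaS vLLM" "${MAAS_VLLM_URL:-}" "${MAAS_VLLM_API_TOKEN:-}"
require_pair "MaaS embedding" "${MAAS_EMBEDDING_URL:-}" "${MAAS_EMBEDDING_API_TOKEN:-}"

# Export only complete pairs; never write empty token values.
if [ -n "${MAAS_VLLM_URL:-}" ]; then
  echo "VLLM_URL=${MAAS_VLLM_URL}" >> "${GITHUB_ENV:-/dev/null}"
  echo "VLLM_API_TOKEN=${MAAS_VLLM_API_TOKEN}" >> "${GITHUB_ENV:-/dev/null}"
fi
```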

CWE-200 (Exposure of Sensitive Information to an Unauthorized Actor)

  • Observed: fork PRs may run parts of the workflow where secrets could be injected. Mitigation: ensure workflow prevents secret use in forks by gating writes/uses of secrets with checks such as github.event.pull_request.head.repo.full_name == github.repository and only run secret-dependent steps on origin or on push to protected branches. Where secrets must not be used in forks, explicitly skip those steps.
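
The head-repo check mentioned above typically looks like this in a workflow step; the step name and run body are illustrative, not the PR's actual step:

```yaml
# Illustrative: run secret-dependent steps only for same-repo PRs or pushes,
# so fork PRs never evaluate the secrets expressions at all.
- name: Configure MaaS vLLM endpoints
  if: >-
    github.event_name == 'push' ||
    github.event.pull_request.head.repo.full_name == github.repository
  env:
    MAAS_VLLM_URL: ${{ secrets.MAAS_VLLM_URL }}
    MAAS_VLLM_API_TOKEN: ${{ secrets.MAAS_VLLM_API_TOKEN }}
  run: |
    echo "::add-mask::$MAAS_VLLM_API_TOKEN"
    echo "VLLM_URL=$MAAS_VLLM_URL" >> "$GITHUB_ENV"
    echo "VLLM_API_TOKEN=$MAAS_VLLM_API_TOKEN" >> "$GITHUB_ENV"
```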

Action items (must be implemented or documented):

  1. Add ::add-mask:: for every token before any use or write to GITHUB_ENV.
  2. Add validation steps to fail early if required token/URL pairs are inconsistent.
  3. Add explicit conditional guards preventing secrets from being written/used in forked PR contexts.
  4. Avoid echoing secret values anywhere in workflow logs; prefer writing to GITHUB_ENV only after masking.

No other immediate CWE/CVE items were identified in the diff; focus review on ensuring secrets handling and conditional gating are implemented exactly as described above.

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Title check: ✅ Passed. The title directly summarizes the main architectural change: adding a multi-arch CI matrix strategy with MaaS vLLM support for ARM64 testing, which aligns with the primary modifications across the workflow and supporting files.
Description check: ✅ Passed. Check skipped - CodeRabbit's high-level summary is enabled.

@coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/smoke.sh (1)

3-3: ⚠️ Potential issue | 🟡 Minor

Missing -e flag in shell options (CWE-755: Improper Handling of Exceptional Conditions).

Per coding guidelines, shell scripts should use set -euo pipefail. Without -e, commands can fail silently and the script continues, potentially masking errors.

Proposed fix
-set -uo pipefail
+set -euo pipefail
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/smoke.sh` at line 3, The script currently uses the shell options token
"set -uo pipefail" which omits the -e (exit-on-error) flag; update the options
token to include -e so it reads "set -euo pipefail" (i.e., modify the "set -uo
pipefail" line) to ensure the script exits immediately on any command failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/redhat-distro-container.yml:
- Around line 153-166: The run block "Configure MaaS vLLM endpoints" currently
interpolates secrets directly via ${{ env.MAAS_* }}, which can allow shell
metacharacter injection; instead expose those secrets via the job/step env and
write them into GITHUB_ENV using the shell environment variables (e.g.
MAAS_VLLM_URL, MAAS_VLLM_API_TOKEN, MAAS_EMBEDDING_URL,
MAAS_EMBEDDING_API_TOKEN) so the secret values are not expanded in the workflow
YAML; in the step, reference the runtime env vars and append each key/value to
$GITHUB_ENV using a safe, quoted write (e.g. printf or echo with quoting) for
VLLM_URL, VLLM_API_TOKEN, VLLM_EMBEDDING_URL, VLLM_EMBEDDING_API_TOKEN and the
fixed EMBEDDING/INFERENCE model entries to avoid any direct workflow
interpolation.

---

Outside diff comments:
In `@tests/smoke.sh`:
- Line 3: The script currently uses the shell options token "set -uo pipefail"
which omits the -e (exit-on-error) flag; update the options token to include -e
so it reads "set -euo pipefail" (i.e., modify the "set -uo pipefail" line) to
ensure the script exits immediately on any command failure.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 47b922af-9c83-45dd-bd6d-7d35500138ef

📥 Commits

Reviewing files that changed from the base of the PR and between 24c67ac and 4473929.

📒 Files selected for processing (3)
  • .github/workflows/redhat-distro-container.yml
  • README.md
  • tests/smoke.sh

@Artemon-line marked this pull request as draft March 20, 2026 16:41
Add matrix strategy for [amd64, arm64] with native GitHub runners.
Use MaaS vLLM endpoints for inference/embedding tests on origin PRs,
keeping local vllm-cpu fallback for fork PRs on amd64. Fork arm64
PRs are build-only (no runtime tests, since MaaS secrets unavailable).

Changes:
- Add matrix strategy with ubuntu-latest (amd64) and ubuntu-24.04-arm (arm64)
- Add MaaS vLLM/embedding endpoint secrets for remote inference
- Make smoke.sh env vars conditional (MaaS vs local vllm vs sentence-transformers)
- Gate publish, Slack, and custom build steps on amd64 only
- Add per-arch cache scoping and artifact names
- Add ARM64 support section to README with fork PR coverage gap note
- Map secrets to intermediate env vars in step-level env blocks (security)
- Fix Vertex AI model path (publishers/google/models/gemini-2.0-flash)
- Fix Vertex AI location (global)

Supersedes PR #252.
Implements RHAIENG-3509.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
@Artemon-line force-pushed the feat/maas-vllm-arm64-ci branch from 4d0ecc5 to e883da9 on March 23, 2026 11:19
- test_openai_embeddings_with_encoding_format_base64: nomic-embed-text-v1-5
  doesn't support dimensions param (always returns 768)
- test_openai_completion_logprobs: upstream /v1/completions validates
  logprobs as bool instead of int per OpenAI spec

Verified locally: vLLM 21 passed, OpenAI 40 passed, 0 failed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Artemy <ahladenk@redhat.com>
@Artemon-line force-pushed the feat/maas-vllm-arm64-ci branch from e883da9 to e705ffd on March 23, 2026 12:23
@Artemon-line marked this pull request as ready for review March 23, 2026 13:25
@Artemon-line requested review from a team and rhdedgar March 23, 2026 13:25
@coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/redhat-distro-container.yml (2)

108-123: ⚠️ Potential issue | 🟠 Major

Scheduled arm64 runs are validating the wrong source revision.

On schedule, Lines 186-200 still run smoke/integration on arm64 when MaaS is set, but Line 109 regenerates distribution/Containerfile only in the amd64 leg. The arm64 nightly lane therefore tests the checked-in Containerfile instead of main. Run this step on arm64 for schedule, or skip the scheduled arm64 leg entirely.

Suggested fix
-        if: matrix.arch == 'amd64' && contains(fromJSON('["workflow_dispatch", "schedule"]'), github.event_name)
+        if: github.event_name == 'schedule' || (matrix.arch == 'amd64' && github.event_name == 'workflow_dispatch')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/redhat-distro-container.yml around lines 108 - 123, The
step "Generate Containerfile to build an image from an arbitrary llama-stack
commit (workflow_dispatch/schedule)" is gated by if: matrix.arch == 'amd64' &&
contains(...), so scheduled arm64 runs use the checked-in Containerfile; remove
the arch restriction so the Containerfile regeneration (the python3
distribution/build.py and sed -i removal of the pip install line) runs for
schedule on arm64 too — change the step's if to only check the event (e.g., if:
contains(fromJSON('["workflow_dispatch","schedule"]'), github.event_name)) or
otherwise add logic to run the step for arm64 when github.event_name ==
'schedule'.

290-298: ⚠️ Potential issue | 🟠 Major

Major: Validate llama_stack_commit_sha input before interpolating in shell commands (CWE-94).

LLAMA_STACK_COMMIT_SHA originates from github.event.inputs.llama_stack_commit_sha (a user-supplied workflow_dispatch input) and is used at lines 294–298 within a run: block. Although the values are quoted, they lack explicit validation. An attacker can inject shell commands via input like main"; echo pwned". The job has access to secrets (OPENAI_API_KEY, MAAS_VLLM_API_TOKEN, etc.), making this exploitable. Validate the input format (reject anything except main or a valid 7–40 character hex commit SHA) before use.

Suggested fix
       - name: Output custom build information
         if: matrix.arch == 'amd64' && contains(fromJSON('["workflow_dispatch", "schedule"]'), github.event_name)
+        env:
+          _LLAMA_STACK_COMMIT_SHA: ${{ env.LLAMA_STACK_COMMIT_SHA }}
         run: |
+          if [[ "$_LLAMA_STACK_COMMIT_SHA" != "main" && ! "$_LLAMA_STACK_COMMIT_SHA" =~ ^[0-9a-fA-F]{7,40}$ ]]; then
+            echo "Invalid llama_stack_commit_sha: must be 'main' or a 7-40 character hex SHA" >&2
+            exit 1
+          fi
           echo "✅ Custom container image built successfully!"
-          echo "📦 Image: ${{ env.IMAGE_NAME }}:source-${{ env.LLAMA_STACK_COMMIT_SHA }}"
-          echo "🔗 Llama Stack commit: ${{ env.LLAMA_STACK_COMMIT_SHA }}"
+          echo "📦 Image: $IMAGE_NAME:source-$_LLAMA_STACK_COMMIT_SHA"
+          echo "🔗 Llama Stack commit: $_LLAMA_STACK_COMMIT_SHA"
           echo ""
           echo "You can pull this image using:"
-          echo "docker pull ${{ env.IMAGE_NAME }}:source-${{ env.LLAMA_STACK_COMMIT_SHA }}"
+          echo "docker pull $IMAGE_NAME:source-$_LLAMA_STACK_COMMIT_SHA"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/redhat-distro-container.yml around lines 290 - 298,
Validate and sanitize the user-supplied LLAMA_STACK_COMMIT_SHA
(github.event.inputs.llama_stack_commit_sha) before interpolating it into the
run block that echoes/pulls the image (where env.LLAMA_STACK_COMMIT_SHA and
env.IMAGE_NAME are used); add an earlier step that checks the input against a
whitelist (allow only the literal "main" or a 7–40 character hex SHA) and fail
the job if it doesn't match, or set a safe default, then export the validated
value into env.LLAMA_STACK_COMMIT_SHA for subsequent steps so the echo/docker
pull lines use a vetted value and cannot be used for shell injection.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/redhat-distro-container.yml:
- Around line 153-171: The workflow step unconditionally sets
EMBEDDING_PROVIDER/EMBEDDING_MODEL to vllm-embedding whenever MAAS_VLLM_URL is
present, causing mixed config if MAAS_EMBEDDING_URL is empty; update the step so
it validates MAAS_EMBEDDING_URL before writing VLLM_EMBEDDING_URL,
VLLM_EMBEDDING_API_TOKEN, EMBEDDING_PROVIDER, EMBEDDING_MODEL and
EMBEDDING_PROVIDER_MODEL_ID into GITHUB_ENV — i.e., only append those
embedding-related env entries when _MAAS_EMBEDDING_URL is non-empty (or else
emit a clear failure), while still setting VLLM_URL, VLLM_API_TOKEN and
VLLM_INFERENCE_MODEL when _MAAS_VLLM_URL exists.
- Around line 60-64: The MAAS secrets (MAAS_VLLM_URL, MAAS_VLLM_API_TOKEN,
MAAS_EMBEDDING_URL, MAAS_EMBEDDING_API_TOKEN) are currently defined at job-level
env and therefore exposed to every step; remove these four variables from the
job-level env block and instead add them only to the "Configure MaaS vLLM
endpoints" step by setting its step-level env to the corresponding secrets
(e.g., MAAS_VLLM_URL: ${{ secrets.MAAS_VLLM_URL }}, etc.); ensure no other steps
reference or inherit these vars so the secrets are only available during that
single step.
- Around line 275-281: The Slack notification is currently evaluated inside the
amd64 matrix leg (the "Set Slack notify status" / "Notify Slack" steps) so
arm64-only failures are missed; instead add a new single fan-in job (e.g.,
"notify-slack") that uses needs to depend on the full matrix job (the job that
produces steps.distribution-changed.outputs.changed) and remove the matrix.arch
gating from the notification logic; in that new job inspect needs.*.result (and
the distribution-changed output via
needs.<matrixJob>.outputs.distribution-changed) to detect any failed
architecture or a successful publish and then run the Slack notify step once
based on those aggregated results rather than using failure() inside a single
matrix leg.

In `@tests/run_integration_tests.sh`:
- Around line 66-68: Remove test_openai_embeddings_with_encoding_format_base64
from the unconditional SKIP_TESTS string and instead append that single test
name to SKIP_TESTS only when the run is actually using the MaaS embedding path;
update the logic around the SKIP_TESTS variable in
tests/run_integration_tests.sh to detect the MaaS run (e.g., check the existing
env/config flags used elsewhere in the script such as EMBEDDING_PROVIDER,
EMBEDDING_ENDPOINT, EMBEDDING_MODEL or any MAAS-specific boolean) and only add
"test_openai_embeddings_with_encoding_format_base64" to SKIP_TESTS in that
MaaS-specific branch so other lanes keep coverage.

---

Outside diff comments:
In @.github/workflows/redhat-distro-container.yml:
- Around line 108-123: The step "Generate Containerfile to build an image from
an arbitrary llama-stack commit (workflow_dispatch/schedule)" is gated by if:
matrix.arch == 'amd64' && contains(...), so scheduled arm64 runs use the
checked-in Containerfile; remove the arch restriction so the Containerfile
regeneration (the python3 distribution/build.py and sed -i removal of the pip
install line) runs for schedule on arm64 too — change the step's if to only
check the event (e.g., if:
contains(fromJSON('["workflow_dispatch","schedule"]'), github.event_name)) or
otherwise add logic to run the step for arm64 when github.event_name ==
'schedule'.
- Around line 290-298: Validate and sanitize the user-supplied
LLAMA_STACK_COMMIT_SHA (github.event.inputs.llama_stack_commit_sha) before
interpolating it into the run block that echoes/pulls the image (where
env.LLAMA_STACK_COMMIT_SHA and env.IMAGE_NAME are used); add an earlier step
that checks the input against a whitelist (allow only the literal "main" or a
7–40 character hex SHA) and fail the job if it doesn't match, or set a safe
default, then export the validated value into env.LLAMA_STACK_COMMIT_SHA for
subsequent steps so the echo/docker pull lines use a vetted value and cannot be
used for shell injection.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: f0604127-4448-46d2-86b9-134827a253ef

📥 Commits

Reviewing files that changed from the base of the PR and between 4473929 and e705ffd.

📒 Files selected for processing (4)
  • .github/workflows/redhat-distro-container.yml
  • README.md
  • tests/run_integration_tests.sh
  • tests/smoke.sh
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/smoke.sh

Comment on lines +60 to +64
# MaaS vLLM endpoints (used for both arches on origin PRs)
MAAS_VLLM_URL: ${{ secrets.MAAS_VLLM_URL }}
MAAS_VLLM_API_TOKEN: ${{ secrets.MAAS_VLLM_API_TOKEN }}
MAAS_EMBEDDING_URL: ${{ secrets.MAAS_EMBEDDING_URL }}
MAAS_EMBEDDING_API_TOKEN: ${{ secrets.MAAS_EMBEDDING_API_TOKEN }}
@coderabbitai bot commented Mar 23, 2026

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the workflow file
fd "redhat-distro-container.yml" -t f

Repository: opendatahub-io/llama-stack-distribution

Length of output: 65


🏁 Script executed:

# Get the file and examine the lines in question
if [ -f ".github/workflows/redhat-distro-container.yml" ]; then
  echo "=== Lines 50-75 (around the secrets segment) ===" && \
  sed -n '50,75p' .github/workflows/redhat-distro-container.yml && \
  echo -e "\n=== Lines 105-125 (around the custom-source segment) ===" && \
  sed -n '105,125p' .github/workflows/redhat-distro-container.yml && \
  echo -e "\n=== Full file structure overview ===" && \
  cat -n .github/workflows/redhat-distro-container.yml | head -200
fi

Repository: opendatahub-io/llama-stack-distribution

Length of output: 12304


CWE-668: Move MaaS secrets from job scope to step scope.

Lines 60-64 expose MAAS_VLLM_URL, MAAS_VLLM_API_TOKEN, MAAS_EMBEDDING_URL, and MAAS_EMBEDDING_API_TOKEN to every step in the job, including the custom-source build at lines 108-123 that runs pip install -e . from an external repository. An attacker controlling the cloned llama-stack repository can inject install hooks that exfiltrate these tokens before test execution. Remove the MaaS secrets from job-level env and inject them only in the "Configure MaaS vLLM endpoints" step (line 153).
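
The step-scoped alternative the comment proposes would look roughly like this (sketch only; the real step contents differ):

```yaml
# Illustrative: secrets injected only into the single step that needs them,
# instead of the job-level env block shared by every step.
- name: Configure MaaS vLLM endpoints
  env:
    MAAS_VLLM_URL: ${{ secrets.MAAS_VLLM_URL }}
    MAAS_VLLM_API_TOKEN: ${{ secrets.MAAS_VLLM_API_TOKEN }}
    MAAS_EMBEDDING_URL: ${{ secrets.MAAS_EMBEDDING_URL }}
    MAAS_EMBEDDING_API_TOKEN: ${{ secrets.MAAS_EMBEDDING_API_TOKEN }}
  run: |
    echo "VLLM_URL=$MAAS_VLLM_URL" >> "$GITHUB_ENV"
    echo "VLLM_API_TOKEN=$MAAS_VLLM_API_TOKEN" >> "$GITHUB_ENV"
```

Note that values written to GITHUB_ENV still reach later steps; the narrowing only limits which steps see the secrets before the configure step runs.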

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/redhat-distro-container.yml around lines 60 - 64, The MAAS
secrets (MAAS_VLLM_URL, MAAS_VLLM_API_TOKEN, MAAS_EMBEDDING_URL,
MAAS_EMBEDDING_API_TOKEN) are currently defined at job-level env and therefore
exposed to every step; remove these four variables from the job-level env block
and instead add them only to the "Configure MaaS vLLM endpoints" step by setting
its step-level env to the corresponding secrets (e.g., MAAS_VLLM_URL: ${{
secrets.MAAS_VLLM_URL }}, etc.); ensure no other steps reference or inherit
these vars so the secrets are only available during that single step.

@Artemon-line (author):

These aren't raw secrets in env; they're ${{ secrets.* }} mapped to env vars, which is the standard GitHub Actions pattern (same as VERTEX_AI_PROJECT and OPENAI_API_KEY on lines 56 and 59).

@coderabbitai bot:

@Artemon-line, you're correct that ${{ secrets.* }} in job-level env is the standard GitHub Actions pattern — my concern wasn't about the syntax, it was about scope (CWE-668): all job-level env vars, regardless of how they're populated, are visible to every step in the job, including the step that clones github.com/opendatahub-io/llama-stack and runs pip install -e .. A compromised or hijacked install hook in that external repo could read the process environment and exfiltrate any token present at that point.

That said, your consistency argument is fair — VERTEX_AI_PROJECT, GCP_WORKLOAD_IDENTITY_PROVIDER, and OPENAI_API_KEY are already at job-level with the same exposure, and the cloned repo is within your own org (opendatahub-io), which significantly reduces the supply-chain risk compared to a third-party dependency. Since the existing secrets already follow this pattern and the external repo is org-controlled, I'll defer to your judgment here.

@rhuss (Collaborator):

@Artemon-line are you sure this comment is a false positive? I mean, consistently insecure is still insecure. Just saying, these API tokens might be very valuable and interesting to people. I know the risk is probably not so large for this particular repo, but in general it's important to keep an eye on such honeypots.

@coderabbitai bot:

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@Artemon-line (author):

Thank you @rhuss , that’s a very valuable point.
I’d like to keep this PR as a PoC (Proof of Concept) for now, so I’ve added the do-not-merge and test labels.
My intention is to demonstrate how this vLLM-on-MaaS solution could work; however, some notable downsides include secret management and MaaS availability.

@rhuss (Collaborator):

thanks @Artemon-line, by no means do I want to block anything that is useful (and this PR is definitely helpful), I just want to remind about a general issue with API keys in CI. :old-man-yelling-at-the-cloud:

A contributor commented:

CodeRabbit is pointing at a best-practice here, and best-practices are certainly easier for humans (and agents) to reason about. It's worth considering.

 env:
   VERTEX_AI_PROJECT: ${{ secrets.VERTEX_AI_PROJECT }}
-  VERTEX_AI_LOCATION: us-central1
+  VERTEX_AI_LOCATION: global
@ktdreyer (Contributor):

This PR has a lot of changes in it. This is one example.

@Artemon-line (author):

Thank you @ktdreyer for pointing this out, I will address this in another PR.

@Artemon-line (author):

Changes addressed here: #289

@Artemon-line added the do-not-merge and testing labels Mar 24, 2026
@rhdedgar (Collaborator) commented:

The feature PoC looks good, and will unblock #252. The secret / token topic has already been discussed, but once that's addressed, I think this will really help. Thanks!
