feat: add vertexai inference provider (#33)
Conversation
Walkthrough

Adds Vertex AI as a remote inference provider in the distribution configs and docs, and adds the `google-cloud-aiplatform` dependency to the container image.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant App
    participant Router
    participant VertexProvider as Vertex AI Provider
    participant VertexAPI as Vertex AI API
    User->>App: Submit inference request
    App->>Router: Select inference provider
    alt VERTEX_AI_PROJECT set
        Router->>VertexProvider: Route request to Vertex AI
        Note right of VertexProvider: Container image includes google-cloud-aiplatform
        VertexProvider->>VertexAPI: Call Vertex AI (project/location)
        VertexAPI-->>VertexProvider: Return result
        VertexProvider-->>Router: Forward result
    else Not set
        Router->>App: Use other configured provider
    end
    Router-->>App: Return result
    App-->>User: Render output
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
distribution/build.yaml (1)
39-43: Source of truth: include Vertex AI deps here, not in the Containerfile.

To persist across regenerations, add the new packages to `additional_pip_packages` (and then revert the Containerfile changes).
Suggested addition (outside the changed hunk, in additional_pip_packages):
```yaml
additional_pip_packages:
  - aiosqlite
  - sqlalchemy[asyncio]
  - asyncpg
  - psycopg2-binary
  - google-cloud-aiplatform
  - litellm
```
🧹 Nitpick comments (3)
distribution/Containerfile (1)
17-17: Pin or bound `google-cloud-aiplatform` to avoid upstream breakage.

An unpinned `google-cloud-aiplatform` can pull breaking transitive updates (grpc, google-auth). After moving it to the generator, add a conservative upper bound (`<2`) or a known-good pin that matches the provider in llama-stack 0.2.x.
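A minimal sketch of the bound, assuming the dependency stays in the Containerfile for now (the exact line and lower-bound version here are illustrative, not taken from this repo):

```shell
# Containerfile: conservative bound instead of a floating latest
RUN pip install --no-cache-dir "google-cloud-aiplatform>=1.38,<2"
```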
distribution/run.yaml (2)
45-49: Add usage docs/env guidance for ADC.

Make it explicit that `GOOGLE_APPLICATION_CREDENTIALS` (or GCP Workload Identity) must be provided at runtime; otherwise calls will fail with 401/403. A brief comment in the README or deployment notes is sufficient.
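For example, a deployment note could sketch mounting a service-account key into the container. Image name, paths, and port below are placeholders, not taken from this repo:

```shell
podman run \
  -e VERTEX_AI_PROJECT=my-gcp-project \
  -e GOOGLE_APPLICATION_CREDENTIALS=/run/secrets/sa.json \
  -v ./sa.json:/run/secrets/sa.json:ro \
  -p 8321:8321 \
  my-distro-image:latest
```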
162-173: No model mapped to the new provider.

As-is, the vertexai provider won't be used unless a model entry points to it. Consider adding a sample model bound to `provider_id: vertexai` and gating it via env vars (similar to the vllm/milvus patterns).

Example (outside the changed hunk):

```yaml
models:
  - metadata: {}
    model_id: ${env.VERTEX_MODEL:=}
    provider_id: ${env.VERTEX_AI_PROJECT:+vertexai}
    provider_model_id: ${env.VERTEX_PROVIDER_MODEL_ID:=}  # e.g., publishers/google/models/gemini-1.5-pro
    model_type: llm
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- distribution/Containerfile (1 hunks)
- distribution/build.yaml (1 hunks)
- distribution/run.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-test-push (linux/amd64)
🔇 Additional comments (4)
distribution/Containerfile (1)
20-20: Verify the litellm/openai interplay.

litellm brings its own OpenAI dependency; you also install openai separately. Please confirm version compatibility to avoid resolver conflicts or shadowed imports at runtime.
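One way to spot a mismatch before baking the image is to compare what litellm declares for openai against the separately installed version. A hedged, stdlib-only sketch (package names are illustrative, nothing here is specific to this repo):

```python
# Hedged sketch: inspect what an installed distribution declares for a
# given dependency, using only the standard library.
from importlib import metadata

def declared_requirements(dist_name: str, dep_prefix: str) -> list[str]:
    """Requirement strings of dist_name that start with dep_prefix,
    or [] if the distribution is not installed."""
    try:
        reqs = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
    return [r for r in reqs if r.lower().startswith(dep_prefix.lower())]

# e.g. compare litellm's declared "openai" requirement against the
# separately installed openai before pinning both in the image
print(declared_requirements("litellm", "openai"))
```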
distribution/build.yaml (2)
10-10: LGTM: provider type added.

Adding `remote::vertexai` to the distribution spec looks good.
38-38: Python base mismatch: 3.11 here vs 3.12 in the Containerfile.

The spec references python-311, while the generated Containerfile uses python-312. Please align them to avoid subtle ABI/package issues.
distribution/run.yaml (1)
45-49: Confirm llama-stack 0.2.21 includes `remote::vertexai`.

If the provider landed after 0.2.21, initialization will fail. Please verify against the installed llama-stack version or bump accordingly.
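One possible check, assuming the `llama` CLI that ships with llama-stack exposes a provider listing in this form (the command shape is an assumption; verify against your installed version):

```shell
# Hedged: list registered inference providers and look for vertexai
llama stack list-providers inference | grep -i vertexai
```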
leseb
left a comment
holding this one for downstream first
Force-pushed: ae61a99 → b2dac1f (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (1)
distribution/run.yaml (1)
45-49: Gate the Vertex AI provider behind an env var to avoid startup failures without GCP creds.

Mirror the existing Milvus pattern so the provider is only registered when VERTEX_AI_PROJECT is set. This prevents init errors when ADC/project are absent.
Apply:

```diff
-  - provider_id: vertexai
+  - provider_id: ${env.VERTEX_AI_PROJECT:+vertexai}
     provider_type: remote::vertexai
     config:
       project: ${env.VERTEX_AI_PROJECT:=}
       location: ${env.VERTEX_AI_LOCATION:=us-central1}
```

Please verify boot succeeds both with and without VERTEX_AI_PROJECT set.
🧹 Nitpick comments (1)
distribution/run.yaml (1)
45-49: Optional: align provider_id naming with the others (e.g., a "-inference" suffix).

Most inference providers use a "-inference" suffix (e.g., vllm-inference, bedrock-inference). Consider renaming for consistency.
If you adopt the env-gated form above, change just the suffix:

```diff
-  - provider_id: ${env.VERTEX_AI_PROJECT:+vertexai}
+  - provider_id: ${env.VERTEX_AI_PROJECT:+vertexai-inference}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- distribution/Containerfile (1 hunks)
- distribution/README.md (1 hunks)
- distribution/build.yaml (1 hunks)
- distribution/run.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- distribution/README.md
- distribution/Containerfile
- distribution/build.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-test-push (linux/amd64)
Force-pushed: b2dac1f → efc7ed8 (Compare)
Force-pushed: d931c79 → 6869e23 (Compare)
Force-pushed: 6869e23 → 8c8522b (Compare)
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Force-pushed: 8c8522b → d2b6a0b (Compare)
chore: bump wheel release to get 0.3.5.1
What does this PR do?
Adds VertexAI provider to the midstream distro image
Test Plan
No testing at this time
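As a lightweight follow-up, a smoke check could confirm the provider registers once the image boots. The endpoint path and port below are assumptions based on llama-stack defaults, not verified against this distro:

```shell
# Hedged: query a locally running stack and look for the new provider
curl -s http://localhost:8321/v1/providers | grep -i vertexai
```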
Summary by CodeRabbit

- New Features
- Chores
- Documentation