
fix: use env to configure vLLM#49

Merged
mergify[bot] merged 1 commit into opendatahub-io:main from Elbehery:fix_vllm_provider_url
Sep 25, 2025

Conversation

@Elbehery Elbehery commented Sep 23, 2025

What does this PR do?

This PR allows running LLS without the vLLM provider by making the vLLM URL configurable through environment variables.

Currently, the default config in run.yaml requires vLLM unconditionally. This is not always correct: configuring a different provider does not require a running vLLM instance.

cc @leseb @derekhiggins

Summary by CodeRabbit

  • New Features

    • Conditional activation of the VLLM inference provider and related models based on environment variables for opt-in usage.
  • Bug Fixes

    • Avoids unintended connections to a localhost inference endpoint by removing hardcoded default URLs.
  • Chores

    • Simplified configuration defaults for inference and evaluation endpoints (empty unless set), and a fallback model ID to ensure predictable startup.

coderabbitai bot (Contributor) commented Sep 23, 2025

Walkthrough

The run configuration makes VLLM provider activation conditional on VLLM_URL, replaces hardcoded default URLs with empty defaults, and sets model_id to dummy when INFERENCE_MODEL is unset. The provider_id, the provider config url, and the TrustyAI LMEval base_url now depend on ${env.VLLM_URL}.

Changes

Cohort / File(s): Runtime configuration (distribution/run.yaml)
  • providers.inference.provider_id: vllm-inference → ${env.VLLM_URL:+vllm-inference}
  • providers.inference.config.url: ${env.VLLM_URL:=http://localhost:8000/v1} → ${env.VLLM_URL:=}
  • inference_store.eval.trustyai_lmeval.base_url: ${env.VLLM_URL:=http://localhost:8000/v1} → ${env.VLLM_URL:=}
  • models[*].provider_id (VLLM-backed): vllm-inference → ${env.VLLM_URL:+vllm-inference}
  • models[*].model_id: ${env.INFERENCE_MODEL} → ${env.INFERENCE_MODEL:=dummy}
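Pieced together from the summary above, the before/after shape of the affected run.yaml fragments is roughly the following (a sketch only; the exact keys, ordering, and indentation in the real file may differ):

```yaml
# Before (sketch): vLLM provider always active, localhost fallback URL
providers:
  inference:
    - provider_id: vllm-inference
      config:
        url: ${env.VLLM_URL:=http://localhost:8000/v1}
models:
  - metadata: {}
    model_id: ${env.INFERENCE_MODEL}
    provider_id: vllm-inference

# After (sketch): provider gated on VLLM_URL, empty default URL,
# and model_id falling back to "dummy"
providers:
  inference:
    - provider_id: ${env.VLLM_URL:+vllm-inference}
      config:
        url: ${env.VLLM_URL:=}
models:
  - metadata: {}
    model_id: ${env.INFERENCE_MODEL:=dummy}
    provider_id: ${env.VLLM_URL:+vllm-inference}
```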

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant App
  participant Config as Config Loader
  participant Env as Environment
  participant Prov as Provider Registry
  participant Eval as TrustyAI LMEval
  participant Models

  App->>Config: Load distribution/run.yaml
  Config->>Env: Read VLLM_URL, INFERENCE_MODEL
  alt VLLM_URL is set
    Note right of Prov #DFF2E1: provider_id resolves\nand URLs set
    Config-->>Prov: provider_id = vllm-inference
    Config-->>Prov: url = VLLM_URL
    Config-->>Eval: base_url = VLLM_URL
  else VLLM_URL is empty/undefined
    Note right of Prov #FFF3DE: provider_id omitted\nURLs default to empty
    Config-->>Prov: provider_id unset
    Config-->>Prov: url = ""
    Config-->>Eval: base_url = ""
  end
  alt INFERENCE_MODEL is set
    Config-->>Models: model_id = INFERENCE_MODEL
  else INFERENCE_MODEL unset
    Config-->>Models: model_id = dummy
  end
  App->>Prov: Initialize providers (if any)
  App->>Eval: Initialize eval client (with base_url if provided)
  App->>Models: Register models (using provider_id if present)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my ears at envs that hide,
A URL set? Then off we ride!
If not, I stash my vllm gear,
And name my model "dummy" here.
One var flips — configs cheer. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title "fix: use env to configure vLLM" is concise and accurately summarizes the primary change, switching vLLM configuration to environment variables, which matches the modifications in distribution/run.yaml.


@skamenan7 (Collaborator):
LGTM, thanks @Elbehery !

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ecc0eb and 70bcac6.

📒 Files selected for processing (2)
  • distribution/run.yaml (3 hunks)
  • tests/run_integration_tests.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.11.0)
tests/run_integration_tests.sh

[warning] 55-55: env is referenced but not assigned (for output from commands, use "$(env ...)" ).

(SC2154)

🪛 GitHub Actions: Pre-commit
tests/run_integration_tests.sh

[error] 55-55: ShellCheck failed. Exit code 1 due to issues in tests/run_integration_tests.sh. Warnings detected: SC2154 (env referenced but not assigned) and SC2086 (potential globbing; consider quoting). Example: --text-model=${env.VLLM_URL:+vllm-inference}/"$INFERENCE_MODEL" \ -> should be "${env.VLLM_URL:+vllm-inference}"/"$INFERENCE_MODEL" \
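The SC2086 half of this failure is ordinary shell quoting: an unquoted expansion is word-split (and glob-expanded) by the shell before the command sees it, so an argument containing whitespace falls apart. A minimal illustration with a hypothetical variable:

```shell
# SC2086 in a nutshell: unquoted expansions are word-split by the shell.
value="a b"                               # hypothetical value containing a space

unquoted_count=$(set -- $value; echo $#)  # unquoted: split into two words
quoted_count=$(set -- "$value"; echo $#)  # quoted: stays one word

echo "$unquoted_count $quoted_count"      # prints: 2 1
```

This is why the suggested fix quotes both the placeholder and `$INFERENCE_MODEL` in the `--text-model` argument.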

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (1)
distribution/run.yaml (1)

16-16: Env‑gated VLLM config looks consistent.

Conditional provider/model wiring and empty default URLs align with existing patterns (e.g., Bedrock). LGTM.

Also applies to: 19-19, 110-110, 179-179

@Elbehery Elbehery force-pushed the fix_vllm_provider_url branch 2 times, most recently from 3f97eea to 58ae505 Compare September 23, 2025 12:39
@@ -176,7 +176,7 @@ inference_store:
models:
- metadata: {}
model_id: ${env.INFERENCE_MODEL}
Collaborator:

llama_stack.core.stack.EnvVarError: Environment variable 'INFERENCE_MODEL' not set or empty at models[0].model_id. Use ${env.INFERENCE_MODEL:=default_value} to provide a default value, ${env.INFERENCE_MODEL:+value_if_set} to make the field conditional, or ensure the environment variable is set.

This still requires INFERENCE_MODEL is defined, maybe we should default it to dummy or something?

Collaborator Author:

fixed 👍🏽

ptal

@nathan-weinberg (Collaborator) left a comment:

What is the purpose of this change? I'm assuming it's so you can run the distribution without vLLM, but you haven't actually stated that anywhere.

@Elbehery Elbehery force-pushed the fix_vllm_provider_url branch 2 times, most recently from 396d941 to cd1ecbd Compare September 23, 2025 14:34
@Elbehery Elbehery force-pushed the fix_vllm_provider_url branch from cd1ecbd to 3bf862c Compare September 23, 2025 19:44
coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

🧹 Nitpick comments (1)
distribution/run.yaml (1)

109-111: TrustyAI LMEval base_url now empty by default — consider gating or keeping a safe fallback.

If eval is used without VLLM_URL, an empty base_url might cause runtime errors. Two options:

  • Gate the provider on VLLM_URL.
  • Keep the previous local fallback.

Option A (keep safe fallback; minimal change):

-        base_url: ${env.VLLM_URL:=}
+        base_url: ${env.VLLM_URL:=http://localhost:8000/v1}

Option B (gate provider; requires changing provider_id above, shown as a full snippet since it’s outside this hunk):

# at lines 105-107
provider_id: ${env.VLLM_URL:+trustyai_lmeval}
provider_type: remote::trustyai_lmeval
module: llama_stack_provider_lmeval==0.2.4

If you prefer the current empty default, please confirm that invoking eval flows with this provider present but base_url empty is handled gracefully.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd1ecbd and 3bf862c.

📒 Files selected for processing (1)
  • distribution/run.yaml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (3)
distribution/run.yaml (3)

16-16: Good: provider activation gated by VLLM_URL.

This aligns with the existing pattern used for other remotes (e.g., bedrock/openai) and prevents accidental activation.

Please confirm the templating engine treats empty values as “unset” for the :+ operator (i.e., does not create a provider with an empty provider_id). This matches the earlier EnvVarError guidance but worth double‑checking with a quick run without VLLM_URL set.


19-19: Empty default for vLLM url: verify behavior when unset.

With ${env.VLLM_URL:=} the URL will be empty when not provided. Assuming the provider is truly disabled by the conditional provider_id, this is fine. If not, this would lead to a misconfigured remote.

Run the stack with and without VLLM_URL to ensure no startup errors occur and that the vLLM provider only appears when VLLM_URL is set.


178-180: Models: sensible defaults; verify loader tolerance when VLLM is disabled.

  • model_id defaulting to dummy resolves the prior EnvVarError.
  • provider_id conditional on VLLM_URL mirrors the provider gating.

Please verify that when VLLM_URL is unset, the loader accepts the model entry with a missing/omitted provider_id (or skips the model) to avoid schema errors. If not, we can gate the entire model entry on VLLM_URL.

  - metadata: {}
-   model_id: ${env.INFERENCE_MODEL}
    provider_id: vllm-inference
+   model_id: ${env.INFERENCE_MODEL:=dummy}
Collaborator:

Seems strange that we should need this? Is this a bug upstream?

Collaborator:

+1 curious about the rationale for this change.

Collaborator Author:

this was requested above due to this error:

llama_stack.core.stack.EnvVarError: Environment variable 'INFERENCE_MODEL' not set or empty at models[0].model_id. Use ${env.INFERENCE_MODEL:=default_value} to provide a default value, ${env.INFERENCE_MODEL:+value_if_set} to make the field conditional, or ensure the environment variable is set.

@derekhiggins (Collaborator) commented Sep 24, 2025:

That error came from me testing this run.yaml without VLLM_URL and INFERENCE_MODEL
i.e. We are making VLLM_URL optional above, so we should also make INFERENCE_MODEL optional

iirc I think I also tried with a blank INFERENCE_MODEL but there was a different error (don't have it at the moment)
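For context, the `${env.VAR:=default}` and `${env.VAR:+value_if_set}` forms in run.yaml (described in the EnvVarError message above) appear to mirror POSIX shell parameter expansion, which behaves like this (hypothetical values, plain shell variables standing in for the `env.` keys):

```shell
# POSIX expansion semantics that the ${env....} templating mirrors.
unset VLLM_URL INFERENCE_MODEL

provider_id="${VLLM_URL:+vllm-inference}"      # unset -> expands to ""
model_id="${INFERENCE_MODEL:=dummy}"           # unset -> "dummy" (and assigns it)

VLLM_URL="http://vllm.example:8000/v1"
provider_id_set="${VLLM_URL:+vllm-inference}"  # set and non-empty -> "vllm-inference"

echo "[$provider_id] $model_id $provider_id_set"  # prints: [] dummy vllm-inference
```

So gating VLLM_URL with `:+` while leaving INFERENCE_MODEL bare is exactly the mismatch derekhiggins hit: the first variable became optional but the second still had no default.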

Collaborator:

I guess this is a function of the fact that the Models resource still needs the env var, even if the vLLM provider does not?

Collaborator:

I guess this is a function of the fact that the Models resource still needs the env var, even if the vLLM provider does not?

yup

Collaborator Author:

I proposed options above, would appreciate recommendation how to proceed

Collaborator:

I think once the env variable has a default we're fine.

Collaborator Author:

thanks for your follow up 🙏🏽

Collaborator:

yeah maybe keep it but throw a TODO on there to change later

@Elbehery (Collaborator Author):

What is the purpose of this change? I'm assuming it's so you can run the distribution without vLLM, but you haven't actually stated that anywhere.

updated the PR description 👍🏽

@mergify mergify bot merged commit 4b2d68f into opendatahub-io:main Sep 25, 2025
7 checks passed
@Elbehery Elbehery deleted the fix_vllm_provider_url branch September 25, 2025 21:12
5 participants