fix: use env to configure vLLM #49
Conversation
Walkthrough
The run configuration makes VLLM provider activation conditional on VLLM_URL, replaces hardcoded default URLs with empty defaults, and sets model_id to a default (`dummy`) when INFERENCE_MODEL is unset.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant App
    participant Config as Config Loader
    participant Env as Environment
    participant Prov as Provider Registry
    participant Eval as TrustyAI LMEval
    participant Models
    App->>Config: Load distribution/run.yaml
    Config->>Env: Read VLLM_URL, INFERENCE_MODEL
    alt VLLM_URL is set
        Note right of Prov #DFF2E1: provider_id resolves\nand URLs set
        Config-->>Prov: provider_id = vllm-inference
        Config-->>Prov: url = VLLM_URL
        Config-->>Eval: base_url = VLLM_URL
    else VLLM_URL is empty/undefined
        Note right of Prov #FFF3DE: provider_id omitted\nURLs default to empty
        Config-->>Prov: provider_id unset
        Config-->>Prov: url = ""
        Config-->>Eval: base_url = ""
    end
    alt INFERENCE_MODEL is set
        Config-->>Models: model_id = INFERENCE_MODEL
    else INFERENCE_MODEL unset
        Config-->>Models: model_id = dummy
    end
    App->>Prov: Initialize providers (if any)
    App->>Eval: Initialize eval client (with base_url if provided)
    App->>Models: Register models (using provider_id if present)
```

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
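The gating the walkthrough describes maps onto env-templated fields in run.yaml. A condensed sketch follows (surrounding structure abridged and partly assumed — e.g. the `remote::vllm` provider type name — showing only the env-gated fields this PR touches):

```yaml
providers:
  inference:
    - provider_id: ${env.VLLM_URL:+vllm-inference}  # resolves only when VLLM_URL is set and non-empty
      provider_type: remote::vllm                   # assumed type name for the remote vLLM provider
      config:
        url: ${env.VLLM_URL:=}                      # empty default replaces the old hardcoded URL
  eval:
    - provider_id: trustyai_lmeval
      provider_type: remote::trustyai_lmeval
      config:
        base_url: ${env.VLLM_URL:=}
models:
  - metadata: {}
    model_id: ${env.INFERENCE_MODEL:=dummy}         # "dummy" fallback avoids the EnvVarError
    provider_id: ${env.VLLM_URL:+vllm-inference}
```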
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
LGTM, thanks @Elbehery !
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- distribution/run.yaml (3 hunks)
- tests/run_integration_tests.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.11.0)
tests/run_integration_tests.sh
[warning] 55-55: env is referenced but not assigned (for output from commands, use "$(env ...)" ).
(SC2154)
🪛 GitHub Actions: Pre-commit
tests/run_integration_tests.sh
[error] 55-55: ShellCheck failed. Exit code 1 due to issues in tests/run_integration_tests.sh. Warnings detected: SC2154 (env referenced but not assigned) and SC2086 (potential globbing; consider quoting). Example: --text-model=${env.VLLM_URL:+vllm-inference}/"$INFERENCE_MODEL" \ -> should be "${env.VLLM_URL:+vllm-inference}"/"$INFERENCE_MODEL" \
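The pipeline failure combines two distinct ShellCheck findings. SC2154 fires because `${env.VLLM_URL:+vllm-inference}` is llama-stack config templating pasted into a shell script, where the shell instead sees a parameter expansion of an (unset) variable named `env`; SC2086 is the usual unquoted-expansion warning. In plain POSIX shell the `:+` operator behaves as sketched below (values are hypothetical, for illustration only):

```shell
#!/bin/sh
# Hypothetical values for illustration only.
INFERENCE_MODEL="llama-3"
VLLM_URL="http://localhost:8000"

# ${VAR:+word} expands to "word" only when VAR is set and non-empty;
# quoting the whole expansion avoids SC2086 (word splitting/globbing).
model_arg="${VLLM_URL:+vllm-inference}/${INFERENCE_MODEL}"
echo "$model_arg"    # vllm-inference/llama-3

unset VLLM_URL
model_arg="${VLLM_URL:+vllm-inference}/${INFERENCE_MODEL}"
echo "$model_arg"    # /llama-3
```

Note that in the actual script the `env.` prefix belongs to the config templating layer, not to the shell, so the shell-level fix is purely about quoting as the pre-commit hint shows.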
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (1)
distribution/run.yaml (1)
16-16: Env‑gated VLLM config looks consistent. Conditional provider/model wiring and empty default URLs align with existing patterns (e.g., Bedrock). LGTM.
Also applies to: 19-19, 110-110, 179-179
Force-pushed 3f97eea to 58ae505
distribution/run.yaml (outdated)

```diff
@@ -176,7 +176,7 @@ inference_store:
 models:
 - metadata: {}
   model_id: ${env.INFERENCE_MODEL}
```
llama_stack.core.stack.EnvVarError: Environment variable 'INFERENCE_MODEL' not set or empty at models[0].model_id. Use ${env.INFERENCE_MODEL:=default_value} to provide a default value, ${env.INFERENCE_MODEL:+value_if_set} to make the field conditional, or ensure the environment variable is set.
This still requires that INFERENCE_MODEL be defined; maybe we should default it to dummy or something?
nathan-weinberg left a comment
What is the purpose of this change? I'm assuming it's so you can run the distribution without vLLM, but you haven't actually stated that anywhere.
Force-pushed 396d941 to cd1ecbd
Signed-off-by: Mustafa Elbehery <[email protected]>

Relates to https://issues.redhat.com/browse/RHAIENG-1220
Force-pushed cd1ecbd to 3bf862c
Actionable comments posted: 0
🧹 Nitpick comments (1)
distribution/run.yaml (1)
109-111: TrustyAI LMEval base_url now empty by default — consider gating or keeping a safe fallback.

If eval is used without VLLM_URL, an empty `base_url` might cause runtime errors. Two options:

- Gate the provider on VLLM_URL.
- Keep the previous local fallback.
Option A (keep safe fallback; minimal change):

```diff
- base_url: ${env.VLLM_URL:=}
+ base_url: ${env.VLLM_URL:=http://localhost:8000/v1}
```

Option B (gate provider; requires changing provider_id above, shown as a full snippet since it's outside this hunk):

```yaml
# at lines 105-107
provider_id: ${env.VLLM_URL:+trustyai_lmeval}
provider_type: remote::trustyai_lmeval
module: llama_stack_provider_lmeval==0.2.4
```

If you prefer the current empty default, please confirm that invoking eval flows with this provider present but `base_url` empty is handled gracefully.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
distribution/run.yaml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (3)
distribution/run.yaml (3)
16-16: Good: provider activation gated by VLLM_URL.

This aligns with the existing pattern used for other remotes (e.g., bedrock/openai) and prevents accidental activation. Please confirm the templating engine treats empty values as "unset" for the `:+` operator (i.e., does not create a provider with an empty `provider_id`). This matches the earlier EnvVarError guidance but is worth double-checking with a quick run without `VLLM_URL` set.
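Assuming the `${env.VAR:+...}` / `${env.VAR:=...}` templating follows POSIX parameter-expansion semantics (worth confirming against llama-stack itself, as the comment asks), `:+` does treat an empty variable the same as an unset one, while `:=` substitutes and assigns the default in both cases. A quick shell sketch of those semantics:

```shell
#!/bin/sh
VLLM_URL=""                                   # set but empty

# :+ yields the alternate word only when set AND non-empty -> empty here
echo "gated: '${VLLM_URL:+vllm-inference}'"   # gated: ''

# := substitutes (and assigns) the default when unset or empty
echo "model: ${INFERENCE_MODEL:=dummy}"       # model: dummy
echo "model again: $INFERENCE_MODEL"          # model again: dummy
```

If the templating engine matches this behavior, an empty `VLLM_URL` cannot produce a provider with an empty `provider_id`.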
19-19: Empty default for vLLM url: verify behavior when unset.

With `${env.VLLM_URL:=}` the URL will be empty when not provided. Assuming the provider is truly disabled by the conditional `provider_id`, this is fine. If not, this would lead to a misconfigured remote. Run the stack with and without `VLLM_URL` to ensure no startup errors occur and that the vLLM provider only appears when `VLLM_URL` is set.
178-180: Models: sensible defaults; verify loader tolerance when VLLM is disabled.

`model_id` defaulting to `dummy` resolves the prior EnvVarError. `provider_id` conditional on `VLLM_URL` mirrors the provider gating. Please verify that when `VLLM_URL` is unset, the loader accepts the model entry with a missing/omitted `provider_id` (or skips the model) to avoid schema errors. If not, we can gate the entire model entry on `VLLM_URL`.
```diff
 - metadata: {}
-  model_id: ${env.INFERENCE_MODEL}
   provider_id: vllm-inference
+  model_id: ${env.INFERENCE_MODEL:=dummy}
```
Seems strange that we should need this? Is this a bug upstream?
+1 curious about the rationale for this change.
This was requested above due to this error:
llama_stack.core.stack.EnvVarError: Environment variable 'INFERENCE_MODEL' not set or empty at models[0].model_id. Use ${env.INFERENCE_MODEL:=default_value} to provide a default value, ${env.INFERENCE_MODEL:+value_if_set} to make the field conditional, or ensure the environment variable is set.
That error came from me testing this run.yaml without VLLM_URL and INFERENCE_MODEL.
i.e., we are making VLLM_URL optional above, so we should also make INFERENCE_MODEL optional.
IIRC I also tried with a blank INFERENCE_MODEL but there was a different error (don't have it at the moment).
I guess this is a function of the fact that the Models resource still needs the env var, even if the vLLM provider does not?
I guess this is a function of the fact that the Models resource still needs the env var, even if the vLLM provider does not?
yup
I proposed options above, would appreciate recommendation how to proceed
I think once the env variable has a default we're fine.
thanks for your follow up 🙏🏽
yeah maybe keep it but throw a TODO on there to change later
updated the PR description 👍🏽
What does this PR do?
This PR allows running LLS without the vLLM provider, by making the vLLM URL configurable through env vars.

Currently, the default config in run.yaml requires vLLM by default. This is not always correct: configuring a different provider does not require a running vLLM instance.

cc @leseb @derekhiggins
Summary by CodeRabbit
New Features
Bug Fixes
Chores