
fix: use env to configure vLLM#49

Merged
mergify[bot] merged 1 commit into opendatahub-io:main from Elbehery:fix_vllm_provider_url
Sep 25, 2025

Conversation

@Elbehery Elbehery commented Sep 23, 2025

What does this PR do?

This PR allows running LLS without the vLLM provider by making the vLLM URL configurable through environment variables.

Currently, the default config in run.yaml requires vLLM unconditionally. This is not always correct: configuring a different provider does not require a running vLLM instance.

cc @leseb @derekhiggins

Summary by CodeRabbit

  • New Features

    • Conditional activation of the VLLM inference provider and related models based on environment variables for opt-in usage.
  • Bug Fixes

    • Avoids unintended connections to a localhost inference endpoint by removing hardcoded default URLs.
  • Chores

    • Simplified configuration defaults for inference and evaluation endpoints (empty unless set), and a fallback model ID to ensure predictable startup.

coderabbitai bot (Contributor) commented Sep 23, 2025

Walkthrough

The run configuration makes VLLM provider activation conditional on VLLM_URL, replaces hardcoded default URLs with empty defaults, and sets model_id to dummy when INFERENCE_MODEL is unset. The provider_id, the provider config url, and the TrustyAI LMEval base_url now depend on ${env.VLLM_URL}.

Changes

Cohort / File(s): Runtime configuration (distribution/run.yaml)
  • providers.inference.provider_id: vllm-inference → ${env.VLLM_URL:+vllm-inference}
  • providers.inference.config.url: ${env.VLLM_URL:=http://localhost:8000/v1} → ${env.VLLM_URL:=}
  • inference_store.eval.trustyai_lmeval.base_url: ${env.VLLM_URL:=http://localhost:8000/v1} → ${env.VLLM_URL:=}
  • models[*].provider_id (VLLM-backed): vllm-inference → ${env.VLLM_URL:+vllm-inference}
  • models[*].model_id: ${env.INFERENCE_MODEL} → ${env.INFERENCE_MODEL:=dummy}
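Pieced together from the summary above, the before/after shape of the affected run.yaml fragments is roughly the following (a sketch only; the exact keys, ordering, and indentation in the real file may differ):

```yaml
# Before (sketch): vLLM provider always active, localhost fallback URL
providers:
  inference:
    - provider_id: vllm-inference
      config:
        url: ${env.VLLM_URL:=http://localhost:8000/v1}
models:
  - metadata: {}
    model_id: ${env.INFERENCE_MODEL}
    provider_id: vllm-inference

# After (sketch): provider gated on VLLM_URL, empty default URL,
# and model_id falling back to "dummy"
providers:
  inference:
    - provider_id: ${env.VLLM_URL:+vllm-inference}
      config:
        url: ${env.VLLM_URL:=}
models:
  - metadata: {}
    model_id: ${env.INFERENCE_MODEL:=dummy}
    provider_id: ${env.VLLM_URL:+vllm-inference}
```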

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant App
  participant Config as Config Loader
  participant Env as Environment
  participant Prov as Provider Registry
  participant Eval as TrustyAI LMEval
  participant Models

  App->>Config: Load distribution/run.yaml
  Config->>Env: Read VLLM_URL, INFERENCE_MODEL
  alt VLLM_URL is set
    Note right of Prov #DFF2E1: provider_id resolves\nand URLs set
    Config-->>Prov: provider_id = vllm-inference
    Config-->>Prov: url = VLLM_URL
    Config-->>Eval: base_url = VLLM_URL
  else VLLM_URL is empty/undefined
    Note right of Prov #FFF3DE: provider_id omitted\nURLs default to empty
    Config-->>Prov: provider_id unset
    Config-->>Prov: url = ""
    Config-->>Eval: base_url = ""
  end
  alt INFERENCE_MODEL is set
    Config-->>Models: model_id = INFERENCE_MODEL
  else INFERENCE_MODEL unset
    Config-->>Models: model_id = dummy
  end
  App->>Prov: Initialize providers (if any)
  App->>Eval: Initialize eval client (with base_url if provided)
  App->>Models: Register models (using provider_id if present)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my ears at envs that hide,
A URL set? Then off we ride!
If not, I stash my vllm gear,
And name my model "dummy" here.
One var flips — configs cheer. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title "fix: use env to configure vLLM" is concise and accurately summarizes the primary change, switching vLLM configuration to environment variables, which matches the modifications in distribution/run.yaml.


@skamenan7 (Collaborator):
LGTM, thanks @Elbehery !

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ecc0eb and 70bcac6.

📒 Files selected for processing (2)
  • distribution/run.yaml (3 hunks)
  • tests/run_integration_tests.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.11.0)
tests/run_integration_tests.sh

[warning] 55-55: env is referenced but not assigned (for output from commands, use "$(env ...)" ).

(SC2154)

🪛 GitHub Actions: Pre-commit
tests/run_integration_tests.sh

[error] 55-55: ShellCheck failed. Exit code 1 due to issues in tests/run_integration_tests.sh. Warnings detected: SC2154 (env referenced but not assigned) and SC2086 (potential globbing; consider quoting). Example: --text-model=${env.VLLM_URL:+vllm-inference}/"$INFERENCE_MODEL" \ -> should be "${env.VLLM_URL:+vllm-inference}"/"$INFERENCE_MODEL" \
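The SC2086 half of this failure is ordinary shell quoting: an unquoted expansion is word-split (and glob-expanded) by the shell before the command sees it, so an argument containing whitespace falls apart. A minimal illustration with a hypothetical variable:

```shell
# SC2086 in a nutshell: unquoted expansions are word-split by the shell.
value="a b"                               # hypothetical value containing a space

unquoted_count=$(set -- $value; echo $#)  # unquoted: split into two words
quoted_count=$(set -- "$value"; echo $#)  # quoted: stays one word

echo "$unquoted_count $quoted_count"      # prints: 2 1
```

This is why the suggested fix quotes both the placeholder and `$INFERENCE_MODEL` in the `--text-model` argument.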

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (1)
distribution/run.yaml (1)

16-16: Env‑gated VLLM config looks consistent.

Conditional provider/model wiring and empty default URLs align with existing patterns (e.g., Bedrock). LGTM.

Also applies to: 19-19, 110-110, 179-179

@Elbehery Elbehery force-pushed the fix_vllm_provider_url branch 2 times, most recently from 3f97eea to 58ae505 Compare September 23, 2025 12:39
@@ -176,7 +176,7 @@ inference_store:
models:
- metadata: {}
model_id: ${env.INFERENCE_MODEL}
Collaborator:

llama_stack.core.stack.EnvVarError: Environment variable 'INFERENCE_MODEL' not set or empty at models[0].model_id. Use ${env.INFERENCE_MODEL:=default_value} to provide a default value, ${env.INFERENCE_MODEL:+value_if_set} to make the field conditional, or ensure the environment variable is set.

This still requires INFERENCE_MODEL is defined, maybe we should default it to dummy or something?

Collaborator Author:

fixed 👍🏽

ptal

@nathan-weinberg (Collaborator) left a comment:

What is the purpose of this change? I'm assuming it's so you can run the distribution without vLLM, but you haven't actually stated that anywhere.

@Elbehery Elbehery force-pushed the fix_vllm_provider_url branch 2 times, most recently from 396d941 to cd1ecbd Compare September 23, 2025 14:34
@Elbehery Elbehery force-pushed the fix_vllm_provider_url branch from cd1ecbd to 3bf862c Compare September 23, 2025 19:44
coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

🧹 Nitpick comments (1)
distribution/run.yaml (1)

109-111: TrustyAI LMEval base_url now empty by default — consider gating or keeping a safe fallback.

If eval is used without VLLM_URL, an empty base_url might cause runtime errors. Two options:

  • Gate the provider on VLLM_URL.
  • Keep the previous local fallback.

Option A (keep safe fallback; minimal change):

-        base_url: ${env.VLLM_URL:=}
+        base_url: ${env.VLLM_URL:=http://localhost:8000/v1}

Option B (gate provider; requires changing provider_id above, shown as a full snippet since it’s outside this hunk):

# at lines 105-107
provider_id: ${env.VLLM_URL:+trustyai_lmeval}
provider_type: remote::trustyai_lmeval
module: llama_stack_provider_lmeval==0.2.4

If you prefer the current empty default, please confirm that invoking eval flows with this provider present but base_url empty is handled gracefully.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd1ecbd and 3bf862c.

📒 Files selected for processing (1)
  • distribution/run.yaml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (3)
distribution/run.yaml (3)

16-16: Good: provider activation gated by VLLM_URL.

This aligns with the existing pattern used for other remotes (e.g., bedrock/openai) and prevents accidental activation.

Please confirm the templating engine treats empty values as “unset” for the :+ operator (i.e., does not create a provider with an empty provider_id). This matches the earlier EnvVarError guidance but worth double‑checking with a quick run without VLLM_URL set.


19-19: Empty default for vLLM url: verify behavior when unset.

With ${env.VLLM_URL:=} the URL will be empty when not provided. Assuming the provider is truly disabled by the conditional provider_id, this is fine. If not, this would lead to a misconfigured remote.

Run the stack with and without VLLM_URL to ensure no startup errors occur and that the vLLM provider only appears when VLLM_URL is set.


178-180: Models: sensible defaults; verify loader tolerance when VLLM is disabled.

  • model_id defaulting to dummy resolves the prior EnvVarError.
  • provider_id conditional on VLLM_URL mirrors the provider gating.

Please verify that when VLLM_URL is unset, the loader accepts the model entry with a missing/omitted provider_id (or skips the model) to avoid schema errors. If not, we can gate the entire model entry on VLLM_URL.

  - metadata: {}
-   model_id: ${env.INFERENCE_MODEL}
    provider_id: vllm-inference
+   model_id: ${env.INFERENCE_MODEL:=dummy}
Collaborator:

Seems strange that we should need this? Is this a bug upstream?

Collaborator:

+1 curious about the rationale for this change.

Collaborator Author:

this was requested above due to this error:

llama_stack.core.stack.EnvVarError: Environment variable 'INFERENCE_MODEL' not set or empty at models[0].model_id. Use ${env.INFERENCE_MODEL:=default_value} to provide a default value, ${env.INFERENCE_MODEL:+value_if_set} to make the field conditional, or ensure the environment variable is set.

@derekhiggins (Collaborator) commented Sep 24, 2025:

That error came from me testing this run.yaml without VLLM_URL and INFERENCE_MODEL
i.e. We are making VLLM_URL optional above, so we should also make INFERENCE_MODEL optional

iirc I think I also tried with a blank INFERENCE_MODEL but there was a different error (don't have it at the moment)
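For context, the `${env.VAR:=default}` and `${env.VAR:+value_if_set}` forms in run.yaml (described in the EnvVarError message above) appear to mirror POSIX shell parameter expansion, which behaves like this (hypothetical values, plain shell variables standing in for the `env.` keys):

```shell
# POSIX expansion semantics that the ${env....} templating mirrors.
unset VLLM_URL INFERENCE_MODEL

provider_id="${VLLM_URL:+vllm-inference}"      # unset -> expands to ""
model_id="${INFERENCE_MODEL:=dummy}"           # unset -> "dummy" (and assigns it)

VLLM_URL="http://vllm.example:8000/v1"
provider_id_set="${VLLM_URL:+vllm-inference}"  # set and non-empty -> "vllm-inference"

echo "[$provider_id] $model_id $provider_id_set"  # prints: [] dummy vllm-inference
```

So gating VLLM_URL with `:+` while leaving INFERENCE_MODEL bare is exactly the mismatch derekhiggins hit: the first variable became optional but the second still had no default.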

Collaborator:

I guess this is a function of the fact that the Models resource still needs the env var, even if the vLLM provider does not?

Collaborator:

I guess this is a function of the fact that the Models resource still needs the env var, even if the vLLM provider does not?

yup

Collaborator Author:

I proposed options above, would appreciate recommendation how to proceed

Collaborator:

I think once the env variable has a default we're fine.

Collaborator Author:

thanks for your follow up 🙏🏽

Collaborator:

yeah maybe keep it but throw a TODO on there to change later

@Elbehery (Collaborator Author):

What is the purpose of this change? I'm assuming it's so you can run the distribution without vLLM, but you haven't actually stated that anywhere.

updated the PR description 👍🏽

@mergify mergify bot merged commit 4b2d68f into opendatahub-io:main Sep 25, 2025
7 checks passed
@Elbehery Elbehery deleted the fix_vllm_provider_url branch September 25, 2025 21:12
5 participants