Commit 4b2d68f
fix: use env to configure vLLM (#49)
# What does this PR do?

This PR allows running LLS without the `vLLM` provider and makes the `vLLM` URL configurable through environment variables. Currently, the default `run.yaml` config requires `vLLM` unconditionally. That is not always correct: once other providers are configured, `vLLM` is not needed, and using a different provider does _not_ require a running vLLM instance.

cc @leseb @derekhiggins

## Summary by CodeRabbit

- New Features
  - Conditional activation of the VLLM inference provider and related models based on environment variables, for opt-in usage.
- Bug Fixes
  - Avoids unintended connections to a localhost inference endpoint by removing hardcoded default URLs.
- Chores
  - Simplified configuration defaults for inference and evaluation endpoints (empty unless set), and a fallback model ID to ensure predictable startup.

Approved-by: nathan-weinberg
Approved-by: derekhiggins
2 parents 468e850 + 3bf862c commit 4b2d68f
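The "unintended localhost connections" fix can be illustrated with a small sketch. Llama Stack's real placeholder resolution is more involved; the toy `resolve` helper below is hypothetical and only mirrors the POSIX-shell-like `:=` (use the variable when set, else the default) semantics that the `run.yaml` placeholders suggest.

```python
import re

def resolve(template: str, env: dict) -> str:
    """Toy stand-in for llama-stack env substitution (hypothetical helper).

    ${env.VAR:=default} -> VAR's value if set, else the default
    ${env.VAR:+alt}     -> alt if VAR is set, else empty
    """
    def repl(m):
        var, op, word = m.groups()
        val = env.get(var)
        if op == ":=":
            return val if val else word   # default applies when unset/empty
        return word if val else ""        # ":+" alternate applies when set
    return re.sub(r"\$\{env\.(\w+)(:=|:\+)([^}]*)\}", repl, template)

# Before this commit: an unset VLLM_URL silently resolved to localhost.
old = "url: ${env.VLLM_URL:=http://localhost:8000/v1}"
assert resolve(old, {}) == "url: http://localhost:8000/v1"

# After: the URL stays empty unless the operator explicitly sets VLLM_URL.
new = "url: ${env.VLLM_URL:=}"
assert resolve(new, {}) == "url: "
assert resolve(new, {"VLLM_URL": "https://vllm.example.com/v1"}) \
    == "url: https://vllm.example.com/v1"
```

The URL `https://vllm.example.com/v1` is an illustrative placeholder, not a value from the config.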

File tree

1 file changed (+5 −5 lines)
distribution/run.yaml

Lines changed: 5 additions & 5 deletions
```diff
@@ -13,10 +13,10 @@ apis:
 - files
 providers:
   inference:
-  - provider_id: vllm-inference
+  - provider_id: ${env.VLLM_URL:+vllm-inference}
     provider_type: remote::vllm
     config:
-      url: ${env.VLLM_URL:=http://localhost:8000/v1}
+      url: ${env.VLLM_URL:=}
       max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
       api_token: ${env.VLLM_API_TOKEN:=fake}
       tls_verify: ${env.VLLM_TLS_VERIFY:=true}
@@ -107,7 +107,7 @@ providers:
     module: llama_stack_provider_lmeval==0.2.4
     config:
       use_k8s: ${env.TRUSTYAI_LMEVAL_USE_K8S:=true}
-      base_url: ${env.VLLM_URL:=http://localhost:8000/v1}
+      base_url: ${env.VLLM_URL:=}
   datasetio:
   - provider_id: huggingface
     provider_type: remote::huggingface
@@ -175,8 +175,8 @@ inference_store:
   db_path: /opt/app-root/src/.llama/distributions/rh/inference_store.db
 models:
 - metadata: {}
-  model_id: ${env.INFERENCE_MODEL}
-  provider_id: vllm-inference
+  model_id: ${env.INFERENCE_MODEL:=dummy}
+  provider_id: ${env.VLLM_URL:+vllm-inference}
   model_type: llm
 - metadata:
     embedding_dimension: 768
```
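The opt-in mechanics in the diff hinge on the `:+` operator: `${env.VLLM_URL:+vllm-inference}` resolves to `vllm-inference` only when `VLLM_URL` is set, and to an empty string otherwise. A minimal sketch, assuming the placeholders follow shell-style expansion rules (the `resolve` helper is a hypothetical toy, not llama-stack's actual resolver):

```python
import re

def resolve(template: str, env: dict) -> str:
    """Toy stand-in for llama-stack env substitution (hypothetical helper).

    ${env.VAR:=default} -> VAR's value if set, else the default
    ${env.VAR:+alt}     -> alt if VAR is set, else empty (conditional opt-in)
    """
    def repl(m):
        var, op, word = m.groups()
        val = env.get(var)
        if op == ":=":
            return val if val else word
        return word if val else ""
    return re.sub(r"\$\{env\.(\w+)(:=|:\+)([^}]*)\}", repl, template)

# With VLLM_URL unset, the provider id resolves to empty, so the vLLM
# provider is effectively switched off and the model id falls back to "dummy".
assert resolve("${env.VLLM_URL:+vllm-inference}", {}) == ""
assert resolve("${env.INFERENCE_MODEL:=dummy}", {}) == "dummy"

# Exporting VLLM_URL opts the provider (and its model binding) back in.
env = {"VLLM_URL": "https://vllm.example.com/v1", "INFERENCE_MODEL": "llama-3"}
assert resolve("${env.VLLM_URL:+vllm-inference}", env) == "vllm-inference"
assert resolve("${env.INFERENCE_MODEL:=dummy}", env) == "llama-3"
```

The example values (`https://vllm.example.com/v1`, `llama-3`) are illustrative; this is why an unset `VLLM_URL` no longer forces a vLLM dependency at startup.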
