Commit 9d4af4f

Fix cache folder in SLM deployment
Signed-off-by: Ricardo Noriega De Soto <rnoriega@redhat.com>
1 parent f8f8ccc commit 9d4af4f

1 file changed: 5 additions, 0 deletions

manifests/vllm-slm/base/deployment.yaml

Lines changed: 5 additions & 0 deletions
@@ -27,11 +27,16 @@ spec:
     - "Qwen/Qwen2.5-1.5B-Instruct"
     - "--max-model-len"
     - "4096"
+    - "--gpu-memory-utilization"
+    - "0.5"
+    - "--enforce-eager"
   ports:
     - name: api
       containerPort: 8000
       protocol: TCP
   env:
+    - name: HOME
+      value: "/tmp"
     - name: HF_HOME
       value: "/tmp/hf-cache"
     - name: HUGGINGFACE_HUB_CACHE
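
The commit message does not say why `HOME` is added alongside `HF_HOME`. One plausible reading is that some tools in the container still derive a cache path from `HOME` rather than from `HF_HOME`, and that the default home directory is not writable. The sketch below is a simplified model of that resolution order, not code from this repository: `resolve_cache_dirs` is a hypothetical helper, and the fallback of `/` for an unset `HOME` is an assumption about the container.

```python
import posixpath


def resolve_cache_dirs(env):
    """Simplified model of how cache directories follow environment variables.

    `env` is a plain dict standing in for the container environment.
    The "/" fallback for HOME is an assumption, not taken from the manifest.
    """
    home = env.get("HOME") or "/"
    # Generic tools commonly use $XDG_CACHE_HOME, falling back to ~/.cache.
    generic = env.get("XDG_CACHE_HOME") or posixpath.join(home, ".cache")
    # Hugging Face tooling prefers HF_HOME and otherwise nests under the generic cache.
    hf_home = env.get("HF_HOME") or posixpath.join(generic, "huggingface")
    return {"generic": generic, "hf_home": hf_home}


# Environment before the commit: only the Hugging Face cache is redirected.
before = resolve_cache_dirs({"HF_HOME": "/tmp/hf-cache"})
# Environment after the commit: HOME is also pointed at /tmp.
after = resolve_cache_dirs({"HOME": "/tmp", "HF_HOME": "/tmp/hf-cache"})

print(before["generic"], "->", after["generic"])
print(before["hf_home"], "->", after["hf_home"])
```

Under these assumptions, the Hugging Face path is the same before and after, while any cache that follows `HOME` moves under `/tmp`.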
