Commit e58b17d

Allow configurable modelArtifacts readOnly for PVC mounts
Add a modelArtifacts.readOnly value and wire it into PVC volume and volumeMount rendering so pvc+hf deployments can opt into writable mounts for Hugging Face cache metadata while keeping read-only defaults.

Signed-off-by: Kay Yan <kay.yan@daocloud.io>
1 parent 1198601 commit e58b17d

7 files changed

Lines changed: 31 additions & 9 deletions

charts/llm-d-modelservice/templates/_helpers.tpl

Lines changed: 5 additions & 4 deletions
@@ -373,7 +373,7 @@ Context is .Values.modelArtifacts
  - name: model-storage
  persistentVolumeClaim:
  claimName: {{ $claim }}
- readOnly: true
+ readOnly: {{ .readOnly }}
  {{- else if eq $protocol "oci" }}
  - name: model-storage
  image:
@@ -398,12 +398,13 @@ volumeMounts:
  {{- if .container.mountModelVolume }}
  - name: model-storage
  mountPath: {{ .Values.modelArtifacts.mountPath }}
- {{- /* enforce readOnly volumeMounts for OCI and PVCs */}}
+ {{- /* enforce readOnly volumeMounts for OCI and PVC variants */}}
  {{- $parsedArtifacts := regexSplit "://" .Values.modelArtifacts.uri -1 -}}
  {{- $protocol := first $parsedArtifacts -}}
- {{- $path := last $parsedArtifacts -}}
- {{- if or (eq $protocol "oci") (eq $protocol "pvc") }}
+ {{- if eq $protocol "oci" }}
  readOnly: true
+ {{- else if hasPrefix "pvc" $protocol }}
+ readOnly: {{ .Values.modelArtifacts.readOnly }}
  {{- end -}}
  {{- end }}
  {{- end }}
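
For reviewers, here is a sketch of how the two template branches above would be expected to render when a `pvc+hf://` URI is combined with `modelArtifacts.readOnly: false`. The claim name `pvc-name`, the container name `vllm`, and the surrounding pod-spec layout are illustrative assumptions; this is not output captured from `helm template`.

```yaml
# Hypothetical rendered excerpt; not captured from `helm template`.
# Assumed values:
#   modelArtifacts.uri: pvc+hf://pvc-name/path/to/hf_hub_cache/namespace/modelID
#   modelArtifacts.readOnly: false
spec:
  volumes:
    - name: model-storage
      persistentVolumeClaim:
        claimName: pvc-name   # assumed to be derived from the URI
        readOnly: false       # rendered from {{ .readOnly }}
  containers:
    - name: vllm              # hypothetical container with mountModelVolume: true
      volumeMounts:
        - name: model-storage
          mountPath: /model-cache
          readOnly: false     # rendered by the hasPrefix "pvc" branch
```

With the default `readOnly: true`, both fields would render as `true`, and `oci://` mounts stay read-only regardless of the value. The old condition compared `$protocol` to exactly `pvc`, so a `pvc+hf` URI would not have matched it, which is consistent with the regenerated example outputs in this commit gaining `readOnly: true`.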

charts/llm-d-modelservice/values.schema.json

Lines changed: 6 additions & 0 deletions
@@ -1955,6 +1955,12 @@
  "title": "mountPath",
  "type": "string"
  },
+ "readOnly": {
+ "default": true,
+ "description": "Whether model volume mounts should be read-only. Set to false for pvc+hf:// when Hugging Face cache writes are needed.",
+ "title": "readOnly",
+ "type": "boolean"
+ },
  "name": {
  "default": "random/model",
  "description": " Required",

charts/llm-d-modelservice/values.yaml

Lines changed: 3 additions & 0 deletions
@@ -76,6 +76,9 @@ modelArtifacts:
  authSecretName: ""
  # location where model volume will be mounted (used when mountModelVolume: true)
  mountPath: /model-cache
+ # Whether model volume mounts should be read-only.
+ # Set to false for pvc+hf:// when Hugging Face cache writes are needed.
+ readOnly: true

  # When true, a LeaderWorkerSet is used instead of a Deployment
  multinode: false
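
A minimal override sketch for the new value. The schema change in this commit declares `readOnly` as a boolean, so the unquoted form is the one to use. The quoted variant is shown commented out on the assumption that Helm's validation of values against `values.schema.json` would reject a string here.

```yaml
# Hypothetical user-supplied values file (for example, passed with -f).
modelArtifacts:
  readOnly: false        # boolean, matches "type": "boolean" in the schema
  # readOnly: "false"    # quoted string; expected to fail schema validation
```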

examples/output-dra.yaml

Lines changed: 1 addition & 0 deletions
@@ -108,6 +108,7 @@ spec:
  volumeMounts:
  - name: model-storage
  mountPath: /model-cache
+ readOnly: true
  ---
  # Source: llm-d-modelservice/templates/resource-claim-template.yaml
  apiVersion: resource.k8s.io/v1

examples/output-gaudi.yaml

Lines changed: 1 addition & 0 deletions
@@ -101,3 +101,4 @@ spec:
  volumeMounts:
  - name: model-storage
  mountPath: /model-cache
+ readOnly: true

examples/output-pvc-hf.yaml

Lines changed: 2 additions & 0 deletions
@@ -121,6 +121,7 @@ spec:
  volumeMounts:
  - name: model-storage
  mountPath: /model-cache
+ readOnly: true
  ---
  # Source: llm-d-modelservice/templates/prefill-deployment.yaml
  apiVersion: apps/v1
@@ -214,3 +215,4 @@ spec:
  volumeMounts:
  - name: model-storage
  mountPath: /model-cache
+ readOnly: true

examples/pvc/README.md

Lines changed: 13 additions & 5 deletions
@@ -86,8 +86,8 @@ Note that the path after the `<pvc-name>` is the path on the PVC which the downl
  Make sure that for the container of your interst in `prefill.containers` or `decode.containers`, there's a field called `mountModelVolume: true` ([see example](../values-pd.yaml#L87)) for the volume mounts to be created correctly.

  ### Behavior
- - A read-only PVC volume with the name `model-storage` is created for the deployment
- - A read-only volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true`
+ - A PVC volume with the name `model-storage` is created for the deployment (read-only by default)
+ - A volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true` (read-only by default)
  - `--model` arg for that container is set to `model-cache/<path/to/model>` where `mountModelVolume: true`

  ⚠️ You do **not** need to configure volumeMounts for containers where `mountModelVolume: true`. ModelService will automatically populate the pod specification and mount the model files.
@@ -96,7 +96,7 @@ However, if you want to add your own volume specifications, you may do so under

  💡 You may optionally set the `--served-model-name` in your container to be used for the OpenAI request, otherwise the request name must be a long string like `"model": "model-cache/<path/to/model>"`. Note that this argument is added automatically using the option `modelCommand: vllmServe` or `imageDefault`, using `routing.modelName` as the value to the `--served-model-name` argument.

- > For security purposes, a read-only volume is mounted to the pods to prevent a pod from deleting the model files in case another model service installation uses the same PVC. If you would like to write to the PVC, you should not do so through ModelService, but rather through your own pod like the download-model/pvc-debugger without the read-only restriction.
+ > For security, this chart mounts model volumes as read-only by default to prevent accidental deletion or mutation when a PVC is shared. If your workflow needs writes (for example, Hugging Face cache lock files), set `modelArtifacts.readOnly: false`.


  ## Use HF-downloaded models with PVCs
@@ -121,7 +121,15 @@ helm install pvc-hf-example llm-d-modelservice/llm-d-modelservice \
  Make sure that for the container of your interst in `prefill.containers` or `decode.containers`, there's a field called `mountModelVolume: true` ([see example](../values-pd.yaml#L87)) for the volume mounts to be created correctly.

  ### Behavior
- - A read-only PVC volume with the name `model-storage` is created for the deployment
- - A read-only volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true`
+ - A PVC volume with the name `model-storage` is created for the deployment (read-only by default)
+ - A volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true` (read-only by default)
  - `HF_HUB_CACHE` environment variable for that container is set to `model-cache/path/to/hf_hub_cache` where `mountModelVolume: true`
  - `--model` arugment is set to `facebook/opt-125m`
+
+ If your runtime needs to write Hugging Face cache metadata or lock files, set:
+
+ ```yaml
+ modelArtifacts:
+   uri: pvc+hf://pvc-name/path/to/hf_hub_cache/namespace/modelID
+   readOnly: false
+ ```
