Allow configurable modelArtifacts readOnly for PVC mounts
Add a modelArtifacts.readOnly value and wire it into PVC volume and volumeMount rendering so pvc+hf deployments can opt into writable mounts for Hugging Face cache metadata while keeping read-only defaults.
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
examples/pvc/README.md: 13 additions & 5 deletions
@@ -86,8 +86,8 @@ Note that the path after the `<pvc-name>` is the path on the PVC which the downl
 Make sure that for the container of your interst in `prefill.containers` or `decode.containers`, there's a field called `mountModelVolume: true` ([see example](../values-pd.yaml#L87)) for the volume mounts to be created correctly.
 
 ### Behavior
-- A read-only PVC volume with the name `model-storage` is created for the deployment
-- A read-only volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true`
+- A PVC volume with the name `model-storage` is created for the deployment (read-only by default)
+- A volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true` (read-only by default)
 - `--model` arg for that container is set to `model-cache/<path/to/model>` where `mountModelVolume: true`
 
 ⚠️ You do **not** need to configure volumeMounts for containers where `mountModelVolume: true`. ModelService will automatically populate the pod specification and mount the model files.
@@ -96,7 +96,7 @@ However, if you want to add your own volume specifications, you may do so under
 
 💡 You may optionally set the `--served-model-name` in your container to be used for the OpenAI request, otherwise the request name must be a long string like `"model": "model-cache/<path/to/model>"`. Note that this argument is added automatically using the option `modelCommand: vllmServe` or `imageDefault`, using `routing.modelName` as the value to the `--served-model-name` argument.
 
-> For security purposes, a read-only volume is mounted to the pods to prevent a pod from deleting the model files in case another model service installation uses the same PVC. If you would like to write to the PVC, you should not do so through ModelService, but rather through your own pod like the download-model/pvc-debugger without the read-only restriction.
+> For security, this chart mounts model volumes as read-only by default to prevent accidental deletion or mutation when a PVC is shared. If your workflow needs writes (for example, Hugging Face cache lock files), set `modelArtifacts.readOnly: false`.

 Make sure that for the container of your interst in `prefill.containers` or `decode.containers`, there's a field called `mountModelVolume: true` ([see example](../values-pd.yaml#L87)) for the volume mounts to be created correctly.
 
 ### Behavior
-- A read-only PVC volume with the name `model-storage` is created for the deployment
-- A read-only volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true`
+- A PVC volume with the name `model-storage` is created for the deployment (read-only by default)
+- A volumeMount with the mountPath: `model-cache` is created for each container where `mountModelVolume: true` (read-only by default)
 - `HF_HUB_CACHE` environment variable for that container is set to `model-cache/path/to/hf_hub_cache` where `mountModelVolume: true`
 - `--model` arugment is set to `facebook/opt-125m`
+
+If your runtime needs to write Hugging Face cache metadata or lock files, set:
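
The diff is cut off before the value it asks the reader to set. Going by the commit message and the updated security note, the override is presumably `modelArtifacts.readOnly: false`. A minimal sketch of such a values override, assuming `modelArtifacts` is a top-level key in the chart's values (the nesting is inferred from the dotted name, not shown in this diff):

```yaml
# Sketch of a values override; key placement is an assumption based on the
# dotted name `modelArtifacts.readOnly` used in the commit message.
modelArtifacts:
  # The chart default is read-only. Setting false opts the PVC volume and
  # its volumeMount into writable mode, e.g. for Hugging Face cache lock files.
  readOnly: false
```

Under the same assumption, the value could also be passed at install time with Helm's `--set modelArtifacts.readOnly=false`. Because a writable mount lets any pod sharing the PVC modify or delete model files, it seems best reserved for deployments that actually need cache writes.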