Commit 870bbff

docs: minor cleanups deployment README (#131)
- removes k8s section
- Switches to use kustomize instead of kubectl -k. The latter might be behind required version

Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>
1 parent 13d6da9 commit 870bbff
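
The commit message's rationale (the kustomize bundled with `kubectl -k` might be behind the required version) can be checked with a sketch like the one below. It assumes that `kubectl version --client -o json` reports a `kustomizeVersion` field, as recent kubectl releases do. The helper names are hypothetical, and the minimum version is left as a parameter because the commit does not state which version is required.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative and not part of the repository.

# Succeeds when dotted version $1 >= $2 (a leading "v" is ignored).
version_ge() {
  awk -v a="${1#v}" -v b="${2#v}" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    for (i = 1; i <= (n > m ? n : m); i++) {
      if ((x[i] + 0) > (y[i] + 0)) exit 0
      if ((x[i] + 0) < (y[i] + 0)) exit 1
    }
    exit 0
  }'
}

# Prints the kustomize version embedded in kubectl, or nothing if unavailable.
# Assumption: the JSON output contains a "kustomizeVersion" field.
embedded_kustomize_version() {
  command -v kubectl >/dev/null 2>&1 || return 0
  kubectl version --client -o json 2>/dev/null |
    sed -n 's/.*"kustomizeVersion": *"\([^"]*\)".*/\1/p'
}

# Prints the command to use for directory $1, given a minimum version $2.
choose_apply_cmd() {
  dir=$1
  required=$2
  embedded=$(embedded_kustomize_version)
  if [ -n "$embedded" ] && version_ge "$embedded" "$required"; then
    echo "kubectl apply -k $dir"
  else
    echo "kustomize build $dir | kubectl apply -f -"
  fi
}
```

For example, `choose_apply_cmd docs/samples/models/simulator/ "$REQUIRED"` prints the standalone `kustomize build` pipeline whenever the embedded version is missing or older than `$REQUIRED`, where `REQUIRED` is a placeholder for whatever minimum the project actually needs.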

1 file changed, +6 -28 lines changed

deployment/README.md

Lines changed: 6 additions & 28 deletions
@@ -86,12 +86,14 @@ kustomize build deployment/overlays/kubernetes | envsubst | kubectl apply -f -
 
 #### Simulator Model (CPU)
 ```bash
-kubectl apply -k deployment/samples/models/simulator/
+PROJECT_DIR=$(git rev-parse --show-toplevel)
+kustomize build ${PROJECT_DIR}/docs/samples/models/simulator/ | kubectl apply -f -
 ```
 
 #### Facebook OPT-125M Model (CPU)
 ```bash
-kubectl apply -k deployment/samples/models/facebook-opt-125m-cpu/
+PROJECT_DIR=$(git rev-parse --show-toplevel)
+kustomize build ${PROJECT_DIR}/docs/samples/models/facebook-opt-125m-cpu/ | kubectl apply -f -
 ```
 
 #### Qwen3 Model (GPU Required)
@@ -100,7 +102,8 @@ kubectl apply -k deployment/samples/models/facebook-opt-125m-cpu/
 > This model requires GPU nodes with `nvidia.com/gpu` resources available in your cluster.
 
 ```bash
-kubectl apply -k deployment/samples/models/qwen3/
+PROJECT_DIR=$(git rev-parse --show-toplevel)
+kustomize build ${PROJECT_DIR}/docs/samples/models/qwen3/ | kubectl apply -f -
 ```
 
 #### Verify Model Deployment
@@ -202,31 +205,6 @@ kubectl patch --local -f ${PROJECT_DIR}/deployment/base/policies/maas-auth-polic
 -o yaml | kubectl apply -f -
 
 ```
-
-### Kubernetes Configuration
-
-For standard Kubernetes clusters, ensure you have an Ingress controller installed (e.g., NGINX).
-
-Install NGINX Ingress Controller if not present:
-
-```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml
-```
-
-## Storage Initializer Configuration
-
-The KServe storage initializer requires sufficient resources for downloading large models. Default settings in `deployment/base/kserve/kserve-config-openshift.yaml`:
-
-- Memory Request: 4Gi
-- Memory Limit: 8Gi
-- CPU Request: 2
-- CPU Limit: 4
-
-To adjust for larger models:
-```bash
-kubectl edit configmap inferenceservice-config -n kserve
-```
-
 ## Testing the Deployment
 
 ### 1. Get Gateway Endpoint
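
The added lines repeat the same `PROJECT_DIR` lookup for each model and leave `${PROJECT_DIR}` unquoted, which would split the path if the checkout lives in a directory containing spaces. A minimal sketch of a quoted, reusable variant follows. The function names are hypothetical and the fallback to the current directory outside a git work tree is an assumption, not something the README does.

```shell
#!/bin/sh
# Sketch only: these helpers are illustrative and not part of the repository.

# Prints the repository root; falls back to the current directory when git
# is unavailable or we are outside a work tree (assumed fallback).
project_dir() {
  git rev-parse --show-toplevel 2>/dev/null || pwd
}

# Prints the kustomization directory for a sample model name ($1),
# using the docs/samples/models/ layout from the diff above.
sample_dir() {
  printf '%s/docs/samples/models/%s/\n' "$(project_dir)" "$1"
}

# Renders and applies each named sample; quoting keeps paths with spaces intact.
apply_samples() {
  for model in "$@"; do
    kustomize build "$(sample_dir "$model")" | kubectl apply -f -
  done
}
```

Calling `apply_samples simulator facebook-opt-125m-cpu` would then apply both CPU samples in one step, assuming `kustomize` and `kubectl` are on the `PATH`.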