Commit 480b2d6

fix: fix deployments comments
Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
1 parent 3ccba1f commit 480b2d6

4 files changed, +67 -47 lines changed

DEVELOPMENT.md

Lines changed: 59 additions & 39 deletions
@@ -19,7 +19,8 @@ Documentation for developing the inference scheduler.
1919

2020
## Kind Development Environment
2121

22-
> **WARNING**: This currently requires you to have manually built the vllm
22+
> [!WARNING]
23+
> This currently requires you to have manually built the vllm
2324
> simulator separately on your local system. In a future iteration this will
2425
> be handled automatically and will not be required. The tag for the simulator
2526
> currently needs to be `v0.1.0`.
@@ -116,46 +117,50 @@ kubectl rollout restart deployment food-review-endpoint-picker
116117

117118
## Kubernetes Development Environment
118119

119-
A Kubernetes (or OpenShift) cluster can be used for development and testing.
120-
There is a cluster-level infrastructure deployment that needs to be managed,
121-
and then development environments can be created on a per-namespace basis to
122-
enable sharing the cluster with multiple developers (or feel free to just use
123-
the `default` namespace if the cluster is private/personal).
120+
A Kubernetes cluster can be used for development and testing.
121+
The setup can be split in two:
122+
123+
- cluster-level infrastructure deployment (e.g., CRDs), and
124+
- deployment of development environments on a per-namespace basis
125+
126+
This enables cluster sharing by multiple developers. In case of private/personal
127+
clusters, the `default` namespace can be used directly.
124128

125129
### Setup - Infrastructure
126130

127-
> **WARNING**: In shared cluster situations you should probably not be
128-
> running this unless you're the cluster admin and you're _certain_ it's you
131+
> [!CAUTION]
132+
> In shared cluster situations you should probably not be
133+
> running this unless you're the cluster admin and you're _certain_ you're the one
129134
> that should be running this, as this can be disruptive to other developers
130135
> in the cluster.
131136
132137
The following will deploy all the infrastructure-level requirements (e.g. CRDs,
133-
Operators, etc) to support the namespace-level development environments:
138+
Operators, etc.) to support the namespace-level development environments:
134139

135140
Install GIE CRDs:
136141

137142
```bash
138-
VERSION=v0.3.0
139-
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/$VERSION/manifests.yaml
143+
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml
140144
```
141145

142-
Install Kgateway:
146+
Install kgateway:
143147
```bash
144148
KGTW_VERSION=v2.0.2
145149
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
146150
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
147151
```
148152

149-
For more details you can find in Gateway API inference [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/)
153+
For more details, see the Gateway API Inference Extension [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/).
150154

151155
### Setup - Developer Environment
152156

153-
> **WARNING**: This setup is currently very manual in regards to container
157+
> [!NOTE]
158+
> This setup is currently very manual with regard to container
154159
> images for the VLLM simulator and the EPP. It is expected that you build and
155160
> push images for both to your own private registry. In future iterations, we
156161
> will be providing automation around this to make it simpler.
157162
158-
To deploy a development environment to the cluster you'll need to explicitly
163+
To deploy a development environment to the cluster, you'll need to explicitly
159164
provide a namespace. This can be `default` if this is your personal cluster,
160165
but on a shared cluster you should pick something unique. For example:
161166

@@ -175,7 +180,8 @@ Set the default namespace for kubectl commands
175180
kubectl config set-context --current --namespace="${NAMESPACE}"
176181
```
177182

178-
> NOTE: If you are using OpenShift (oc CLI), use the following instead: `oc project "${NAMESPACE}"`
183+
> [!NOTE]
184+
> If you are using OpenShift (oc CLI), you can use the following instead: `oc project "${NAMESPACE}"`
179185
180186
- Set Hugging Face token variable:
181187

@@ -186,26 +192,26 @@ export HF_TOKEN="<HF_TOKEN>"
186192
Download the `llm-d-kv-cache-manager` repository (the installation script and Helm chart to install the vLLM environment):
187193

188194
```bash
189-
cd .. & git clone git@github.com:llm-d/llm-d-kv-cache-manager.git
195+
cd .. && git clone git@github.com:llm-d/llm-d-kv-cache-manager.git
190196
```
197+
191198
If you prefer to clone it into the `/tmp` directory, make sure to update the `VLLM_CHART_DIR` environment variable:
192199
`export VLLM_CHART_DIR=<tmp_dir>/llm-d-kv-cache-manager/vllm-setup-helm`
193200

194-
195-
196201
Once all this is set up, you can deploy the environment:
197202

198203
```bash
199204
make env-dev-kubernetes
200205
```
201206

202207
This will deploy the entire stack to whatever namespace you chose.
203-
**Note:** The model and images of each componet can be replaced. See [Environment Configuration](#environment-configuration) for model settings.
208+
> [!NOTE]
209+
> The model and images of each component can be replaced. See [Environment Configuration](#environment-configuration) for model settings.
204210
205-
You can test by exposing the inference `Gateway` via port-forward:
211+
You can test by exposing the `inference gateway` via port-forward:
206212

207213
```bash
208-
kubectl port-forward service/inference-gateway 8080:80
214+
kubectl port-forward service/inference-gateway 8080:80 -n "${NAMESPACE}"
209215
```
210216

211217
And making requests with `curl`:
@@ -215,6 +221,9 @@ curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: applicati
215221
-d '{"model":"meta-llama/Llama-3.1-8B-Instruct","prompt":"hi","max_tokens":10,"temperature":0}' | jq
216222
```
217223

224+
> [!NOTE]
225+
> If the response is empty or contains an error, jq may output a cryptic error. You can run the command without jq to debug raw responses.
226+
218227
#### Environment Configuration
219228

220229
**1. Setting the EPP image and tag:**
@@ -234,7 +243,7 @@ You can optionally set the vllm replicas:
234243
export VLLM_REPLICA_COUNT=2
235244
```
236245

237-
**3. Setting the model name and label:**
246+
**3. Setting the model name:**
238247

239248
You can replace the model name that will be used in the system.
240249

@@ -244,41 +253,36 @@ export MODEL_NAME="${MODEL_NAME:-mistralai/Mistral-7B-Instruct-v0.2}"
244253

245254
**4. Additional environment settings:**
246255

247-
More Setting of environment variables can be found in the `scripts/kubernetes-dev-env.sh`.
248-
249-
256+
More environment variable settings can be found in `scripts/kubernetes-dev-env.sh`.
250257

251258
#### Development Cycle
252259

253-
> **WARNING**: This is a very manual process at the moment. We expect to make
260+
> [!WARNING]
261+
> This is a very manual process at the moment. We expect to make
254262
> this more automated in future iterations.
255263
256264
Make your changes locally and commit them. Then select an image tag based on
257-
the `git` SHA:
265+
the `git` SHA and set your private registry:
258266

259267
```bash
260268
export EPP_TAG=$(git rev-parse HEAD)
269+
export IMAGE_REGISTRY="quay.io/my-id"
261270
```
262271

263-
Build the image:
272+
Build the image and tag it for your private registry:
264273

265274
```bash
266-
DEV_VERSION=$EPP_TAG make image-build
275+
make image-build
267276
```
268277

269-
Tag the image for your private registry and push it:
278+
Then push it:
270279

271280
```bash
272-
$CONTAINER_RUNTIME tag quay.io/llm-d/llm-d-gateway-api-inference-extension/epp:$TAG \
273-
<MY_REGISTRY>/<MY_IMAGE>:$EPP_TAG
274-
$CONTAINER_RUNTIME push <MY_REGISTRY>/<MY_IMAGE>:$EPP_TAG
281+
make image-push
275282
```
276283

277-
> **NOTE**: `$CONTAINER_RUNTIME` can be configured or replaced with whatever your
278-
> environment's standard container runtime is (e.g. `podman`, `docker`).
279-
280-
Then you can re-deploy the environment with the new changes (don't forget all
281-
the required env vars):
284+
You can now re-deploy the environment with your changes (don't forget all
285+
the required environment variables):
282286

283287
```bash
284288
make env-dev-kubernetes
@@ -299,3 +303,19 @@ If you also want to remove the namespace entirely, run:
299303
```sh
300304
kubectl delete namespace ${NAMESPACE}
301305
```
306+
307+
To uninstall the infrastructure deployment:
308+
Uninstall GIE CRDs:
309+
310+
```sh
311+
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml --ignore-not-found
312+
```
313+
314+
Uninstall kgateway:
315+
316+
```sh
317+
helm uninstall kgateway -n kgateway-system
318+
helm uninstall kgateway-crds -n kgateway-system
319+
```
320+
321+
For more details, see the Gateway API Inference Extension [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/).
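
The note added in this file about `jq` printing a cryptic error on empty or non-JSON responses can be worked around with a small wrapper. The following is only a sketch: `show_response` is a hypothetical helper name, not something this repository provides. It pretty-prints the body when `jq` is installed and accepts it, and otherwise prints the raw body unchanged.

```bash
# Hypothetical helper (not part of this repository): pretty-print stdin with
# jq when jq is installed and the body parses as JSON; otherwise print the raw
# body so the underlying error message stays visible.
show_response() {
  local body
  body="$(cat)"
  # `jq -e .` exits non-zero on parse errors, on empty input, and on a bare
  # null/false value, so all of those fall through to the raw branch.
  if command -v jq >/dev/null 2>&1 && printf '%s' "$body" | jq -e . >/dev/null 2>&1; then
    printf '%s\n' "$body" | jq .
  else
    printf '%s\n' "$body"
  fi
}
```

It would be used in place of the trailing `| jq` in the `curl` command, for example `curl -s ... | show_response`.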

Makefile

Lines changed: 1 addition & 1 deletion
@@ -301,7 +301,7 @@ clean-env-dev-kind: ## Cleanup kind setup (delete cluster $(KIND_CLUSTER_NA
301301

302302

303303
# Kubernetes Development Environment - Deploy
304-
# This target deploys the GIE stack in a specific namespace for development and testing.
304+
# This target deploys the inference scheduler stack in a specific namespace for development and testing.
305305
.PHONY: env-dev-kubernetes
306306
env-dev-kubernetes: check-kubectl check-kustomize check-envsubst
307307
IMAGE_REGISTRY=$(IMAGE_REGISTRY) ./scripts/kubernetes-dev-env.sh 2>&1

deploy/environments/dev/kubernetes-kgateway/kustomization.yaml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ resources:
99
- gateway-parameters.yaml
1010

1111
images:
12-
- name: quay.io/llm-d/gateway-api-inference-extension
12+
- name: ghcr.io/llm-d/gateway-api-inference-extension
1313
newName: ${EPP_IMAGE}
1414
newTag: ${EPP_TAG}
1515
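
The `${EPP_IMAGE}` and `${EPP_TAG}` placeholders in this kustomization are filled from environment variables that `scripts/kubernetes-dev-env.sh` exports with fallback values. A minimal sketch of that defaulting pattern follows, using only the defaults visible in this commit. The `unset` line is added here purely so the demo starts clean, and the explanation of the escaped quotes around `PD_ENABLED` is an assumption rather than something the commit states.

```bash
# Demo only: clear any values inherited from the calling shell.
unset EPP_IMAGE EPP_TAG PD_ENABLED

# `${VAR:-default}` keeps an existing non-empty value and otherwise falls
# back to the default (values copied from the script lines in this diff).
export EPP_IMAGE="${EPP_IMAGE:-llm-d-inference-scheduler}"
export EPP_TAG="${EPP_TAG:-v0.1.0}"

# The escaped quotes make the value itself contain double quotes. Assumption:
# this keeps the substituted value a string rather than a bare boolean.
export PD_ENABLED="\"${PD_ENABLED:-false}\""

echo "image=${EPP_IMAGE} tag=${EPP_TAG} pd=${PD_ENABLED}"
```

Exporting `EPP_TAG` (for example to a git SHA) before running the script overrides the fallback.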

scripts/kubernetes-dev-env.sh

Lines changed: 6 additions & 6 deletions
@@ -1,8 +1,8 @@
11
#!/bin/bash
22

3-
# This shell script deploys a Kubernetes or OpenShift cluster with an
3+
# This shell script deploys a Kubernetes cluster with a
44
# KGateway-based Gateway API implementation fully configured. It deploys the
5-
# vllm simulator, which it exposes with a Gateway -> HTTPRoute -> InferencePool.
5+
# vllm, which it exposes with a Gateway -> HTTPRoute -> InferencePool.
66
# The Gateway is configured with a filter for the ext_proc endpoint picker.
77

88
set -eux
@@ -73,7 +73,7 @@ export EPP_IMAGE="${EPP_IMAGE:-llm-d-inference-scheduler}"
7373
# EPP image tag
7474
export EPP_TAG="${EPP_TAG:-v0.1.0}"
7575

76-
# Whether Prompt/Document (P/D) mode is enabled for this deployment
76+
# Whether P/D mode is enabled for this deployment
7777
export PD_ENABLED="\"${PD_ENABLED:-false}\""
7878

7979
# Token length threshold to trigger P/D logic
@@ -123,7 +123,7 @@ export VLLM_DEPLOYMENT_NAME="${VLLM_HELM_RELEASE_NAME}-${MODEL_NAME_SAFE}"
123123

124124
kubectl create namespace ${NAMESPACE} 2>/dev/null || true
125125

126-
# Hack to deal with KGateways broken OpenShift support
126+
# Hack to better deal with kgateway on OpenShift
127127
export PROXY_UID=$(kubectl get namespace ${NAMESPACE} -o json | jq -e -r '.metadata.annotations["openshift.io/sa.scc.uid-range"]' | perl -F'/' -lane 'print $F[0]+1');
128128

129129
# Detect if the cluster is OpenShift by checking for the 'route.openshift.io' API group
@@ -140,7 +140,7 @@ if [[ "$CLEAN" == "true" ]]; then
140140
kustomize build deploy/environments/dev/kubernetes-kgateway | envsubst > temp_delet.yaml
141141
kustomize build deploy/environments/dev/kubernetes-kgateway | envsubst | kubectl -n "${NAMESPACE}" delete --ignore-not-found=true -f -
142142
# Delete vllm resources.
143-
helm uninstall vllm --namespace c3
143+
helm uninstall vllm --namespace ${NAMESPACE}
144144
exit 0
145145
fi
146146

@@ -156,7 +156,7 @@ fi
156156
# Run Helm upgrade/install vllm
157157
echo "INFO: Deploying vLLM Environment in namespace ${NAMESPACE}, ${POOL_NAME}"
158158
helm upgrade --install "$VLLM_HELM_RELEASE_NAME" "$VLLM_CHART_DIR" \
159-
--namespace c3 \
159+
--namespace="$NAMESPACE" \
160160
--set secret.create=true \
161161
--set secret.hfTokenValue="$HF_TOKEN2" \
162162
--set vllm.poolLabelValue="$POOL_NAME" \
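
The `PROXY_UID` line in this script reads the namespace's `openshift.io/sa.scc.uid-range` annotation, splits it on `/`, and adds one to the first field. The same arithmetic can be sketched in plain bash. `first_uid_plus_one` is a hypothetical name used only for illustration, and the sample range value is an assumed example of the `<start>/<size>` form that the split on `/` implies.

```bash
# Hypothetical helper mirroring the perl one-liner in the script: take the
# part before the first '/' of a "<start>/<size>" range and add one.
first_uid_plus_one() {
  local range="$1"
  # ${range%%/*} strips everything from the first '/' onward.
  echo $(( ${range%%/*} + 1 ))
}

first_uid_plus_one "1000680000/10000"   # prints 1000680001
```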
