1 parent 8e9301c, commit 501ff19
site/docs/capabilities/inference/httproute-inferencepool.md
@@ -40,7 +40,7 @@ kubectl wait --timeout=2m -n envoy-gateway-system deployment/envoy-gateway --for
 Deploy a sample inference backend that will serve as your inference endpoints:
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/sim-deployment.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/v1.0.1/config/manifests/vllm/sim-deployment.yaml
 ```
 
 This creates a simulated vLLM deployment with multiple replicas that can handle inference requests.