Description
kgateway version
2.1.0
Kubernetes Version
1.34
Describe the bug
I followed the getting started guide:
https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_3_4
I was able to work around a few issues along the way, but the final curl command now fails with a 500 Internal Server Error.
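For reference, localhost:8885 was reached through a local port-forward of the Gateway along these lines (a minimal sketch; the Service name inference-gateway and listener port 80 are assumptions based on the guide's defaults, not verified here):
# Expose the inference gateway locally on port 8885 (assumed Service name and port)
kubectl port-forward -n default svc/inference-gateway 8885:80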
curl -i http://localhost:8885/v1/completions -H 'Content-Type: application/json' -d '{
"model": "food-review-1",
"prompt": "Write as if you were a critic: San Francisco",
"max_tokens": 100,
"temperature": 0
}'
HTTP/1.1 500 Internal Server Error
content-type: text/plain
content-length: 34
date: Fri, 17 Oct 2025 13:24:34 GMT
The inference-gateway pod log shows:
2025-10-17T13:24:24.6980Z info request gateway=default/inference-gateway listener=http route=default/llm-route endpoint=10.244.0.148:8000 src.addr=127.0.0.1:56254 http.method=GET http.host=localhost http.path=/ http.version=HTTP/1.1 http.status=500 inferencepool.selected_endpoint=10.244.0.148:8000 error="ext_proc failed: no more responses" duration=111ms
2025-10-17T13:24:34.421354Z warn http::ext_proc error InvalidHeaderName(InvalidHeaderName)
2025-10-17T13:24:34.421472Z info request gateway=default/inference-gateway listener=http route=default/llm-route src.addr=127.0.0.1:36192 http.method=POST http.host=localhost http.path=/v1/completions http.version=HTTP/1.1 http.status=500 error="ext_proc failed: no more responses" duration=4ms
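The InvalidHeaderName warning comes from the ext_proc path, so a reasonable next step is to check the endpoint picker logs for the header it rejected; a sketch, assuming the EPP deployment created by the guide is named vllm-llama3-8b-instruct-epp:
# Tail the endpoint picker (ext_proc) logs around the time of the failing request
kubectl logs -n default deploy/vllm-llama3-8b-instruct-epp --tail=100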
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"gateway.networking.k8s.io/v1","kind":"HTTPRoute","metadata":{"annotations":{},"name":"llm-route","namespace":"default"},"spec":{"parentRefs":[{"group":"gateway.networking.k8s.io","kind":"Gateway","name":"inference-gateway"}],"rules":[{"backendRefs":[{"group":"inference.networking.k8s.io","kind":"InferencePool","name":"vllm-llama3-8b-instruct"}],"matches":[{"path":{"type":"PathPrefix","value":"/"}}],"timeouts":{"request":"300s"}}]}}
  creationTimestamp: "2025-10-17T13:23:33Z"
  generation: 1
  name: llm-route
  namespace: default
  resourceVersion: "8197404"
  uid: 87dcad6d-85ed-413b-8a7b-9fdb795a38dd
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: inference-gateway
  rules:
  - backendRefs:
    - group: inference.networking.k8s.io
      kind: InferencePool
      name: vllm-llama3-8b-instruct
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /
    timeouts:
      request: 300s
status:
  parents:
  - conditions:
    - lastTransitionTime: "2025-10-17T13:23:34Z"
      message: ""
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
    - lastTransitionTime: "2025-10-17T13:23:34Z"
      message: ""
      observedGeneration: 1
      reason: ResolvedRefs
      status: "True"
      type: ResolvedRefs
    controllerName: kgateway.dev/agentgateway
    parentRef:
      group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway
      namespace: default
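The route reports Accepted and ResolvedRefs, so the InferencePool status can also be checked to confirm it selected ready endpoints; a sketch, addressing the resource by its fully qualified plural name to avoid ambiguity with older API groups:
# Inspect the InferencePool referenced by the route (assumes the guide's pool name)
kubectl get inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct -n default -o yaml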
Expected Behavior
The curl command above returns a successful completion response instead of an HTTP 500.
Steps to reproduce the bug
Follow the getting started guide (https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_3_4), then run the Try it out steps (https://gateway-api-inference-extension.sigs.k8s.io/guides/#try-it-out).
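Note: to isolate whether the failure is in the gateway/ext_proc path rather than the model server, the backend can be exercised directly; a sketch, assuming the guide's vLLM deployment is named vllm-llama3-8b-instruct and serves on port 8000:
# Port-forward the vLLM backend directly, bypassing the gateway and ext_proc
kubectl port-forward -n default deploy/vllm-llama3-8b-instruct 8000:8000
# Repeat the same completion request against the backend
curl -i http://localhost:8000/v1/completions -H 'Content-Type: application/json' -d '{
"model": "food-review-1",
"prompt": "Write as if you were a critic: San Francisco",
"max_tokens": 100,
"temperature": 0
}'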
Additional Environment Detail
No response
Additional Context
No response