Can't get agentgateway (agw) as inference gateway working #12656

@linsun

Description

kgateway version

2.1.0

Kubernetes Version

1.34

Describe the bug

Follow the get started guide:
https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_3_4

I was able to work around a few issues I hit, but now the curl command from the guide fails with a 500:

curl -i http://localhost:8885/v1/completions -H 'Content-Type: application/json' -d '{
  "model": "food-review-1",
  "prompt": "Write as if you were a critic: San Francisco",
  "max_tokens": 100,
  "temperature": 0
}'
HTTP/1.1 500 Internal Server Error
content-type: text/plain
content-length: 34
date: Fri, 17 Oct 2025 13:24:34 GMT

The inference gateway pod log shows:

2025-10-17T13:24:24.6980Z	info	request gateway=default/inference-gateway listener=http route=default/llm-route endpoint=10.244.0.148:8000 src.addr=127.0.0.1:56254 http.method=GET http.host=localhost http.path=/ http.version=HTTP/1.1 http.status=500 inferencepool.selected_endpoint=10.244.0.148:8000 error="ext_proc failed: no more responses" duration=111ms
2025-10-17T13:24:34.421354Z	warn	http::ext_proc	error InvalidHeaderName(InvalidHeaderName)
2025-10-17T13:24:34.421472Z	info	request gateway=default/inference-gateway listener=http route=default/llm-route src.addr=127.0.0.1:36192 http.method=POST http.host=localhost http.path=/v1/completions http.version=HTTP/1.1 http.status=500 error="ext_proc failed: no more responses" duration=4ms
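For anyone triaging: the `ext_proc failed: no more responses` error indicates that the external processor stream (the InferencePool's endpoint picker, EPP) closed before sending a response, so its logs are the next place to look. A minimal diagnostic sketch; the `-epp` deployment name and the `default` namespace are assumptions based on the guide's manifests and may differ in other setups:

```shell
# The endpoint picker (EPP) implements the ext_proc service that the gateway
# calls; "no more responses" means that stream ended early on the EPP side.
# NOTE: deployment/namespace names below are assumed from the guide's manifests.
kubectl get pods -n default
kubectl logs deploy/vllm-llama3-8b-instruct-epp -n default --tail=50

# Confirm the model-server pod behind the selected endpoint (10.244.0.148:8000
# in the first log line) is actually running and serving:
kubectl get pods -n default -o wide | grep 10.244.0.148
```

These are read-only checks against a live cluster, so they are safe to run while reproducing the failing curl request.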
The HTTPRoute (Accepted and ResolvedRefs per its status):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"gateway.networking.k8s.io/v1","kind":"HTTPRoute","metadata":{"annotations":{},"name":"llm-route","namespace":"default"},"spec":{"parentRefs":[{"group":"gateway.networking.k8s.io","kind":"Gateway","name":"inference-gateway"}],"rules":[{"backendRefs":[{"group":"inference.networking.k8s.io","kind":"InferencePool","name":"vllm-llama3-8b-instruct"}],"matches":[{"path":{"type":"PathPrefix","value":"/"}}],"timeouts":{"request":"300s"}}]}}
  creationTimestamp: "2025-10-17T13:23:33Z"
  generation: 1
  name: llm-route
  namespace: default
  resourceVersion: "8197404"
  uid: 87dcad6d-85ed-413b-8a7b-9fdb795a38dd
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: inference-gateway
  rules:
  - backendRefs:
    - group: inference.networking.k8s.io
      kind: InferencePool
      name: vllm-llama3-8b-instruct
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /
    timeouts:
      request: 300s
status:
  parents:
  - conditions:
    - lastTransitionTime: "2025-10-17T13:23:34Z"
      message: ""
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
    - lastTransitionTime: "2025-10-17T13:23:34Z"
      message: ""
      observedGeneration: 1
      reason: ResolvedRefs
      status: "True"
      type: ResolvedRefs
    controllerName: kgateway.dev/agentgateway
    parentRef:
      group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway
      namespace: default
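Since the HTTPRoute reports both Accepted and ResolvedRefs, the failure is likely downstream of routing. The backing InferencePool and Gateway can be inspected similarly; resource names below are taken from the guide and are assumptions for other setups:

```shell
# NOTE: resource names assumed from the guide; adjust to your install.
# Verify the InferencePool exists and inspect its status/events:
kubectl get inferencepools.inference.networking.k8s.io -n default
kubectl describe inferencepool vllm-llama3-8b-instruct -n default

# Verify the Gateway itself is programmed and has an address:
kubectl get gateway inference-gateway -n default
```

If the InferencePool status looks healthy, the `InvalidHeaderName` warning in the ext_proc log is the remaining lead, since it appears immediately before the failed POST.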

Expected Behavior

The curl command above should succeed and return a completion response.

Steps to reproduce the bug

Follow the guide: https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_3_4 and try it out: https://gateway-api-inference-extension.sigs.k8s.io/guides/#try-it-out

Additional Environment Detail

No response

Additional Context

No response

Metadata

Labels

Area: Inference — Activities related to Gateway API Inference Extension support.
Priority: High — Required in next 3 months to make progress, bugs that affect multiple users, or very bad UX.
