Commit 1fbe6e1

fix: align benchmark LLM_D_RELEASE to main for GA InferencePool API
The benchmark workflow was using the default llm-d v0.3.0, which deploys inferencepool chart v1.0.1 with the alpha API group (inference.networking.x-k8s.io/v1alpha2). Istio 1.29+ with ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true expects the GA API group (inference.networking.k8s.io/v1) and does not configure ext_proc for the alpha InferencePool, causing the Gateway to return HTTP 500.

This aligns the benchmark with the e2e workflow, which already uses LLM_D_RELEASE=main (inferencepool chart v1.2.1 with GA API support).

Made-with: Cursor
1 parent a16f025 commit 1fbe6e1

1 file changed

Lines changed: 3 additions & 0 deletions

.github/workflows/ci-benchmark.yaml

@@ -545,6 +545,9 @@ jobs:
 INSTALL_GATEWAY_CTRLPLANE: "false"
 E2E_TESTS_ENABLED: "true"
 NAMESPACE_SCOPED: "false"
+# Use main branch of llm-d/llm-d for inferencepool chart v1.2.1 (GA API support);
+# v0.3.0 default uses alpha API which Istio 1.29+ does not route through ext_proc.
+LLM_D_RELEASE: main
 LLMD_NS: ${{ env.LLMD_NAMESPACE }}
 WVA_NS: ${{ env.WVA_NAMESPACE }}
 CONTROLLER_INSTANCE: ${{ env.WVA_NAMESPACE }}
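
The API group mismatch described in the commit message can be illustrated with a minimal sketch. It is not part of this commit or of llm-d or Istio: the helper names `api_group` and `is_ga_inference_pool` are hypothetical, and the only facts used are the two `apiVersion` strings quoted in the message. It shows how a pre-flight check might tell an alpha InferencePool manifest from a GA one before running the benchmark.

```python
# Hypothetical pre-flight helper; names are illustrative, not from llm-d or Istio.
# The two group strings are taken from the commit message above.
GA_GROUP = "inference.networking.k8s.io"       # served as .../v1 (GA)
ALPHA_GROUP = "inference.networking.x-k8s.io"  # served as .../v1alpha2 (alpha)


def api_group(api_version: str) -> str:
    """Return the group part of a Kubernetes apiVersion ("group/version").

    Core resources such as "v1" have no group, so an empty string is returned.
    """
    group, _, _version = api_version.rpartition("/")
    return group


def is_ga_inference_pool(api_version: str) -> bool:
    """True when the manifest uses the GA InferencePool API group."""
    return api_group(api_version) == GA_GROUP


if __name__ == "__main__":
    for version in (
        "inference.networking.x-k8s.io/v1alpha2",
        "inference.networking.k8s.io/v1",
    ):
        status = "GA" if is_ga_inference_pool(version) else "not GA"
        print(f"{version}: {status}")
```

A check like this only inspects the manifest string; whether a given Istio build actually configures ext_proc for a pool still has to be confirmed against the running cluster.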