fix: align benchmark LLM_D_RELEASE to main for GA InferencePool API

kahilam · kahilam · commit 1fbe6e10b91c · 2026-04-14T08:13:20.000-07:00
The benchmark workflow used the default llm-d release, v0.3.0, which
deploys inferencepool chart v1.0.1 with the alpha API group
(inference.networking.x-k8s.io/v1alpha2). Istio 1.29+ with
ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true expects the GA API group
(inference.networking.k8s.io/v1) and does not configure ext_proc for the
alpha InferencePool, so the Gateway returns HTTP 500.

Setting LLM_D_RELEASE=main aligns the benchmark with the e2e workflow,
which already uses that value (inferencepool chart v1.2.1 with GA API
support).
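
For anyone debugging the same HTTP 500, the API-group mismatch described
above can be checked with a small helper. This is a minimal sketch, not
part of this change: pool_api_track is a hypothetical function name, and
the kubectl commands in the comments assume cluster access and the
standard <plural>.<group> CRD naming.

```shell
# Hypothetical helper (not in this repository): map an InferencePool
# apiVersion string to the API track named in this commit message.
pool_api_track() {
  case "$1" in
    inference.networking.k8s.io/v1)         echo "ga" ;;
    inference.networking.x-k8s.io/v1alpha2) echo "alpha" ;;
    *)                                      echo "unknown" ;;
  esac
}

# Assumed usage against a live cluster: see which InferencePool CRDs exist.
#   kubectl get crd inferencepools.inference.networking.k8s.io
#   kubectl get crd inferencepools.inference.networking.x-k8s.io
```

If only the x-k8s.io CRD is present, the deployed chart predates the GA
API and the Istio behavior described above is the likely cause.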

Made-with: Cursor
diff --git a/.github/workflows/ci-benchmark.yaml b/.github/workflows/ci-benchmark.yaml
@@ -545,6 +545,9 @@ jobs:
           INSTALL_GATEWAY_CTRLPLANE: "false"
           E2E_TESTS_ENABLED: "true"
           NAMESPACE_SCOPED: "false"
+          # Use main branch of llm-d/llm-d for inferencepool chart v1.2.1 (GA API support);
+          # v0.3.0 default uses alpha API which Istio 1.29+ does not route through ext_proc.
+          LLM_D_RELEASE: main
           LLMD_NS: ${{ env.LLMD_NAMESPACE }}
           WVA_NS: ${{ env.WVA_NAMESPACE }}
           CONTROLLER_INSTANCE: ${{ env.WVA_NAMESPACE }}