Commit 1fbe6e1
fix: align benchmark LLM_D_RELEASE to main for GA InferencePool API
The benchmark workflow was using the default llm-d v0.3.0, which deploys
inferencepool chart v1.0.1 with the alpha API group
(inference.networking.x-k8s.io/v1alpha2). Istio 1.29+ with
ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true expects the GA API group
(inference.networking.k8s.io/v1) and does not configure ext_proc for the
alpha InferencePool, causing the Gateway to return HTTP 500.
This aligns the benchmark with the e2e workflow, which already uses
LLM_D_RELEASE=main (inferencepool chart v1.2.1 with GA API support).
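The alpha/GA mismatch described above can be illustrated with a minimal sketch. The helper name `inferencepool_api_stage` is hypothetical and is not part of llm-d, Istio, or the benchmark workflow; only the two API group strings are taken from this commit message.

```python
# Hypothetical helper for illustration only; not part of llm-d or Istio.
# The API group strings are the ones quoted in the commit message.
GA_GROUP = "inference.networking.k8s.io"
ALPHA_GROUP = "inference.networking.x-k8s.io"


def inferencepool_api_stage(api_version: str) -> str:
    """Classify an InferencePool apiVersion string as 'ga', 'alpha', or 'unknown'."""
    # Split "group/version" on the last slash; a bare version yields an empty group.
    group, _, version = api_version.rpartition("/")
    if group == GA_GROUP and version == "v1":
        return "ga"
    if group == ALPHA_GROUP:
        return "alpha"
    return "unknown"


if __name__ == "__main__":
    for api_version in (
        "inference.networking.x-k8s.io/v1alpha2",  # deployed by chart v1.0.1
        "inference.networking.k8s.io/v1",          # expected by Istio 1.29+
    ):
        print(api_version, "->", inferencepool_api_stage(api_version))
```

A check along these lines could flag a manifest whose InferencePool still uses the alpha group before it is deployed behind a Gateway that expects the GA group.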
Made-with: Cursor
1 parent a16f025 commit 1fbe6e1
1 file changed
Lines changed: 3 additions & 0 deletions
Diff body not captured: three lines added as new lines 548-550, after original line 547.