feat: Define provider specific gateway capabilities for llm-d #288
ericdbishop wants to merge 7 commits into
Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>
Pull request overview
Adds llm-d provider support for gateway/EPP customization by extending provider gateway capabilities so providers can override the controller-managed EPP image + plugin config, while keeping the controller responsible for the surrounding GAIE scaffolding (InferencePool + EPP resources).
Changes:
- Introduces `GatewayCapabilities.endpointPicker` / `EndpointPickerCapabilities` in the API + CRD to allow provider-specific EPP image and config overrides.
- Updates the gateway reconciler to distinguish “full pool delegation” (via `InferencePoolNamePattern`) from “EPP customization” (via `EndpointPicker`) and to apply EPP overrides during reconciliation/cleanup.
- Updates the llm-d provider to declare EPP overrides (image + default config) and wires vLLM prefix caching / eager flags, with corresponding tests.
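The distinction between the two paths can be sketched in Go as follows. This is a minimal illustration, not the reconciler code from this PR: the struct shapes are simplified, `gatewayMode` and `resolveMode` are hypothetical names, and the image reference is a placeholder. It assumes pool delegation takes precedence whenever `InferencePoolNamePattern` is non-empty.

```go
package main

import "fmt"

// EndpointPickerCapabilities is a simplified stand-in for the API type
// added in this PR; only the two override fields are modelled.
type EndpointPickerCapabilities struct {
	Image      string // empty: keep the controller's default EPP image
	ConfigData string // empty: keep the controller's default EPP config
}

// GatewayCapabilities is a simplified stand-in for the provider's
// declared gateway capabilities.
type GatewayCapabilities struct {
	InferencePoolNamePattern string
	EndpointPicker           *EndpointPickerCapabilities
}

// gatewayMode is a hypothetical helper type used only in this sketch.
type gatewayMode string

const (
	modeDefault        gatewayMode = "controller-default"
	modePoolDelegation gatewayMode = "pool-delegation"
	modeEPPOverride    gatewayMode = "epp-customization"
)

// resolveMode picks the reconciliation path. Pool delegation wins over
// EPP customization, so EndpointPicker is ignored when a pool name
// pattern is set (an assumption of this sketch).
func resolveMode(caps *GatewayCapabilities) gatewayMode {
	switch {
	case caps == nil:
		return modeDefault
	case caps.InferencePoolNamePattern != "":
		return modePoolDelegation
	case caps.EndpointPicker != nil:
		return modeEPPOverride
	default:
		return modeDefault
	}
}

func main() {
	llmd := &GatewayCapabilities{
		EndpointPicker: &EndpointPickerCapabilities{Image: "example.invalid/epp:dev"},
	}
	fmt.Println(resolveMode(llmd))
}
```

Modelling the choice as a single ordered switch keeps the precedence explicit in one place, which is the property the review comments further down ask the docs to spell out.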
Reviewed changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| providers/llmd/transformer.go | Adds vLLM arg emission for prefix caching and eager execution; minor label formatting. |
| providers/llmd/transformer_test.go | Adds unit test coverage for emitted prefix caching flags. |
| providers/llmd/status.go | Minor formatting adjustment. |
| providers/llmd/controller_test.go | Minor formatting cleanup in tests. |
| providers/llmd/config.go | Declares llm-d gateway capabilities with provider-supplied EPP image + default EndpointPickerConfig YAML. |
| providers/llmd/config_test.go | Adds assertions validating llm-d gateway capability fields (EPP override only, no pool delegation). |
| controller/internal/controller/gateway_reconciler.go | Adds EPP override plumbing and narrows “provider-managed pool” semantics to InferencePoolNamePattern != "". |
| controller/internal/controller/gateway_reconciler_test.go | Updates provider-managed cleanup test and adds tests for default vs provider-overridden EPP behavior. |
| controller/config/crd/bases/airunway.ai_inferenceproviderconfigs.yaml | Extends CRD schema to include gateway.endpointPicker fields. |
| controller/api/v1alpha1/zz_generated.deepcopy.go | Regenerates deep-copies for the new API types/fields. |
| controller/api/v1alpha1/inferenceproviderconfig_types.go | Adds EndpointPickerCapabilities and documents the two gateway extension paths. |
Files not reviewed (1)
- controller/api/v1alpha1/zz_generated.deepcopy.go: Language not supported
| t.Error("did not expect --enable-prefix-caching when EnablePrefixCaching=false") | ||
| } | ||
| } | ||
| } |
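For context, the behaviour this test asserts can be sketched as below. This is an illustration only, not the transformer code from the PR: `engineOptions`, `vllmArgs`, and the `EnforceEager` field are hypothetical names, and `--enforce-eager` is assumed to be the vLLM flag behind the "eager" option mentioned in the overview.

```go
package main

import "fmt"

// engineOptions is a hypothetical, trimmed-down view of the settings
// the transformer reads when building vLLM arguments.
type engineOptions struct {
	EnablePrefixCaching bool
	EnforceEager        bool // assumed field name for the eager-execution toggle
}

// vllmArgs emits a flag only when its option is enabled, so a false
// value results in the flag being absent rather than negated.
func vllmArgs(o engineOptions) []string {
	var args []string
	if o.EnablePrefixCaching {
		args = append(args, "--enable-prefix-caching")
	}
	if o.EnforceEager {
		args = append(args, "--enforce-eager")
	}
	return args
}

// hasFlag reports whether flag appears in args.
func hasFlag(args []string, flag string) bool {
	for _, a := range args {
		if a == flag {
			return true
		}
	}
	return false
}

func main() {
	args := vllmArgs(engineOptions{EnablePrefixCaching: true})
	fmt.Println(args, hasFlag(args, "--enable-prefix-caching"))
}
```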
| // TestGateway_EPP_OnlyImageOverride verifies image-only overrides leave the | ||
| // default ConfigMap in place (and vice versa via empty Image). | ||
| func TestGateway_EPP_OnlyImageOverride(t *testing.T) { |
| // The two extension points are independent. A provider may use either, both, | ||
| // or neither. EndpointPicker is ignored when ManagesInferencePool is true (the | ||
| // provider is then expected to manage the EPP itself). |
| // named pool, reads its EndpointPickerRef, and wires HTTPRoute/ReferenceGrant | ||
| // accordingly. The controller does not create an InferencePool or EPP itself. | ||
| // | ||
| // 2. Endpoint Picker customization. When EndpointPicker is set, the controller |
Clarify that the EPP is still managed by the controller when EndpointPicker is set.
| kind: EndpointPickerConfig | ||
| `, | ||
| ` | ||
| if overrides != nil && overrides.ConfigData != "" { |
Maybe add some debug logs for these overrides
| Some inference providers (e.g., NVIDIA Dynamo, llm-d) have native Gateway API Inference Extension support with their own InferencePool and Endpoint Picker (EPP). These providers deploy specialized EPPs with capabilities beyond the generic upstream EPP — for example, Dynamo's EPP uses **KV-cache-aware scoring** to route requests to endpoints with the highest KV cache hit probability. | ||
| When a provider declares gateway capabilities in its `InferenceProviderConfig`, the controller **delegates** InferencePool and/or EPP management to the provider instead of creating its own. | ||
| When a provider declares gateway capabilities in its `InferenceProviderConfig`, the controller adapts what it creates. Two extension points exist and can be used independently: |
Not independent since endpointPicker is ignored if managesInferencePool is set
| |---|---|---| | ||
| | `managesInferencePool` | Controller waits for the provider's InferencePool to exist, then uses it as the HTTPRoute backend. Skips `reconcileInferencePool()` and `labelModelPods()`. | Controller creates and owns the InferencePool (default behavior). | | ||
| | `managesEPP` | Controller does nothing. | Controller deploys the generic upstream EPP. | | ||
| | `managesInferencePool: true` | Controller waits for the provider's InferencePool to exist, then uses it as the HTTPRoute backend. Skips `reconcileInferencePool()`, `reconcileEPP()`, and `labelModelPods()`. | Controller creates and owns the InferencePool and the EPP (default behavior). | |
true here makes me think this is the default value. Is that correct? Maybe add another column for "default value"
Description
Adding gateway capabilities for the llm-d provider. Follow-up to #213. Currently scoped to a custom EPP config and EPP image for llm-d.
AI Prompt (Optional)
🤖 AI Prompt Used
AI Tool:
Copilot CLI with Claude Opus 4.7
Type of Change
Related Issues
Fixes #174
Changes Made
Testing
bun run test)
Checklist
bun run lint
Screenshots
Additional Notes