feat: Define provider specific gateway capabilities for llm-d by ericdbishop · Pull Request #288 · kaito-project/airunway

ericdbishop · 2026-05-19T19:19:49Z

Description

Adding gateway capabilities for the llm-d provider. Follow-up to #213. Currently scoped to a custom EPP config and EPP image for llm-d.

AI Prompt (Optional)

🤖 AI Prompt Used

Initial solution written with Copilot, prompt summary:

  Prompt 1 — Initial assessment request

  You set the scene: you're delegating InferencePool/EPP management to the llm-d provider via gateway capabilities, the same way it was done for Dynamo in PR #213. You'd added the Gateway field to llm-d's capabilities locally and wanted me to assess what else was needed. Asked me to make obvious changes directly but to flag anything I wasn't sure about for discussion.

  Prompt 2 — Design decisions

  After my list of 7 questions, you answered each:

   1. One EPP per ModelDeployment
   2. Name the constant LLMDSchedulerImage
   3. Ship a sensible default ConfigMap baked into the provider
   4. Wire --kv-events-config automatically (with your review afterward)
   5. Reuse RBAC where possible; no provider-specific code in gateway_reconciler.go — anything provider-specific belongs in providers/
   6. Research upstream whether enablePrefixCaching causes issues for llm-d
   7. Land in one PR, but flagged the key design tension: the existing GatewayCapabilities abstraction (pool name + namespace) fits Dynamo but not llm-d. llm-d doesn't actually need pool delegation — it just needs a custom EPP image. Asked me to code only up to a logical stopping point where we could reconsider the interface.

  Prompt 3 — Approval

  Acknowledged the EndpointPickerCapabilities struct isn't perfectly self-documenting for Dynamo's case but gave the go-ahead to proceed.

  Key constraints you established (worth remembering for future work)

   - No provider-specific branching in the gateway reconciler — providers express themselves through capability declarations
   - Reuse generic scaffolding wherever possible
   - Distinguish "full delegation" (Dynamo) from "EPP customization" (llm-d) as separate, independent extension points
   - One EPP per ModelDeployment
   - Research upstream behavior before wiring flags that might break things
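
The distinction between "full delegation" and "EPP customization" can be sketched in Go roughly as follows. This is a minimal illustration, not the PR's code: the types are simplified local stand-ins, the field layout is assumed from the review discussion, and `gatewayMode` and the example pattern and image values are hypothetical.

```go
package main

import "fmt"

// EndpointPickerCapabilities is a simplified stand-in for the PR's type.
// The field names are assumptions based on the review discussion.
type EndpointPickerCapabilities struct {
	Image      string // custom EPP image; empty means "use the controller default"
	ConfigData string // EndpointPickerConfig YAML; empty means "use the default"
}

// GatewayCapabilities is a simplified stand-in with the two extension points.
type GatewayCapabilities struct {
	InferencePoolNamePattern string                      // non-empty: provider owns pool and EPP
	EndpointPicker           *EndpointPickerCapabilities // non-nil: controller-managed EPP is customized
}

// gatewayMode (hypothetical helper) reports which path applies.
// Pool delegation takes precedence, so EndpointPicker is ignored in that case.
func gatewayMode(caps *GatewayCapabilities) string {
	switch {
	case caps == nil:
		return "default"
	case caps.InferencePoolNamePattern != "":
		return "full-delegation"
	case caps.EndpointPicker != nil:
		return "epp-customization"
	default:
		return "default"
	}
}

func main() {
	// Placeholder values, not the real Dynamo pattern or llm-d image.
	dynamo := &GatewayCapabilities{InferencePoolNamePattern: "{name}-pool"}
	llmd := &GatewayCapabilities{
		EndpointPicker: &EndpointPickerCapabilities{Image: "example.invalid/llm-d-scheduler:dev"},
	}
	fmt.Println(gatewayMode(nil), gatewayMode(dynamo), gatewayMode(llmd))
	// prints: default full-delegation epp-customization
}
```

Keeping the decision in one function that reads only capability declarations is one way to honor the "no provider-specific branching in the reconciler" constraint.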

AI Tool:
Copilot CLI with Claude Opus 4.7

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to change)
📚 Documentation update
🎨 UI/UX improvement
♻️ Refactoring (no functional changes)
🧪 Test update
🔧 Build/CI configuration

Related Issues

Fixes #174

Changes Made

Testing

Unit tests pass (bun run test)
Manual testing performed
Tested with a Kubernetes cluster

Checklist

My code follows the project's style guidelines
I have run bun run lint
I have added tests that prove my fix/feature works
New and existing unit tests pass locally
I have updated documentation if needed
My changes generate no new warnings

Screenshots

Additional Notes

Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>

Copilot

Pull request overview

Adds llm-d provider support for gateway/EPP customization by extending provider gateway capabilities so providers can override the controller-managed EPP image + plugin config, while keeping the controller responsible for the surrounding GAIE scaffolding (InferencePool + EPP resources).

Changes:

Introduces GatewayCapabilities.endpointPicker / EndpointPickerCapabilities in the API + CRD to allow provider-specific EPP image and config overrides.
Updates the gateway reconciler to distinguish “full pool delegation” (via InferencePoolNamePattern) from “EPP customization” (via EndpointPicker) and to apply EPP overrides during reconciliation/cleanup.
Updates the llm-d provider to declare EPP overrides (image + default config) and wires vLLM prefix caching / eager flags, with corresponding tests.
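
As a rough illustration of the EPP override behavior described above, the sketch below merges provider overrides onto controller defaults field by field, so an image-only override leaves the default config in place and vice versa. The type names, `resolveEPP`, and the image strings are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// eppOverrides is a hypothetical stand-in for provider-declared EPP overrides.
type eppOverrides struct {
	Image      string
	ConfigData string
}

// eppSettings is what the controller would actually deploy.
type eppSettings struct {
	Image      string
	ConfigData string
}

// resolveEPP starts from the controller defaults and applies each override
// independently; empty override fields keep the default value.
func resolveEPP(defaults eppSettings, o *eppOverrides) eppSettings {
	out := defaults
	if o == nil {
		return out
	}
	if o.Image != "" {
		out.Image = o.Image
	}
	if o.ConfigData != "" {
		out.ConfigData = o.ConfigData
	}
	return out
}

func main() {
	// Placeholder defaults; the real image and config live in the controller.
	defaults := eppSettings{
		Image:      "example.invalid/generic-epp:dev",
		ConfigData: "kind: EndpointPickerConfig\n",
	}
	got := resolveEPP(defaults, &eppOverrides{Image: "example.invalid/llm-d-epp:dev"})
	fmt.Printf("image=%s defaultConfigKept=%t\n", got.Image, got.ConfigData == defaults.ConfigData)
}
```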

Reviewed changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| providers/llmd/transformer.go | Adds vLLM arg emission for prefix caching and eager execution; minor label formatting. |
| providers/llmd/transformer_test.go | Adds unit test coverage for emitted prefix caching flags. |
| providers/llmd/status.go | Minor formatting adjustment. |
| providers/llmd/controller_test.go | Minor formatting cleanup in tests. |
| providers/llmd/config.go | Declares llm-d gateway capabilities with provider-supplied EPP image + default EndpointPickerConfig YAML. |
| providers/llmd/config_test.go | Adds assertions validating llm-d gateway capability fields (EPP override only, no pool delegation). |
| controller/internal/controller/gateway_reconciler.go | Adds EPP override plumbing and narrows “provider-managed pool” semantics to `InferencePoolNamePattern != ""`. |
| controller/internal/controller/gateway_reconciler_test.go | Updates provider-managed cleanup test and adds tests for default vs provider-overridden EPP behavior. |
| controller/config/crd/bases/airunway.ai_inferenceproviderconfigs.yaml | Extends CRD schema to include `gateway.endpointPicker` fields. |
| controller/api/v1alpha1/zz_generated.deepcopy.go | Regenerates deep-copies for the new API types/fields. |
| controller/api/v1alpha1/inferenceproviderconfig_types.go | Adds `EndpointPickerCapabilities` and documents the two gateway extension paths. |

Files not reviewed (1)

controller/api/v1alpha1/zz_generated.deepcopy.go: Language not supported

Copilot

Pull request overview

Copilot reviewed 10 out of 12 changed files in this pull request and generated 1 comment.

Files not reviewed (1)

controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

+			t.Error("did not expect --enable-prefix-caching when EnablePrefixCaching=false")
+		}
+	}
+}
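
For context on what this test fragment checks, here is a minimal sketch of conditional vLLM flag emission. The `engineOptions` type and the `vllmArgs` and `hasFlag` helpers are hypothetical; `--enable-prefix-caching` appears in the test above, and `--enforce-eager` is assumed to be the eager-execution flag the PR wires.

```go
package main

import "fmt"

// engineOptions is a hypothetical stand-in for the relevant deployment settings.
type engineOptions struct {
	EnablePrefixCaching bool
	EnforceEager        bool
}

// vllmArgs emits a flag only when its option is enabled, so a false value
// results in the flag being absent rather than explicitly negated.
func vllmArgs(o engineOptions) []string {
	var args []string
	if o.EnablePrefixCaching {
		args = append(args, "--enable-prefix-caching")
	}
	if o.EnforceEager {
		args = append(args, "--enforce-eager")
	}
	return args
}

// hasFlag reports whether flag is present in args.
func hasFlag(args []string, flag string) bool {
	for _, a := range args {
		if a == flag {
			return true
		}
	}
	return false
}

func main() {
	off := vllmArgs(engineOptions{EnablePrefixCaching: false})
	on := vllmArgs(engineOptions{EnablePrefixCaching: true})
	fmt.Println(hasFlag(off, "--enable-prefix-caching"), hasFlag(on, "--enable-prefix-caching"))
	// prints: false true
}
```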


Copilot

Pull request overview

Copilot reviewed 10 out of 12 changed files in this pull request and generated 1 comment.

Files not reviewed (1)

controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

Copilot

Pull request overview

Copilot reviewed 12 out of 14 changed files in this pull request and generated 2 comments.

Files not reviewed (1)

controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

+// TestGateway_EPP_OnlyImageOverride verifies image-only overrides leave the
+// default ConfigMap in place (and vice versa via empty Image).
+func TestGateway_EPP_OnlyImageOverride(t *testing.T) {


jaellio · 2026-06-15T21:12:11Z

+// The two extension points are independent. A provider may use either, both,
+// or neither. EndpointPicker is ignored when ManagesInferencePool is true (the
+// provider is then expected to manage the EPP itself).


Agree with this feedback^

jaellio · 2026-06-15T21:13:46Z

+//     named pool, reads its EndpointPickerRef, and wires HTTPRoute/ReferenceGrant
+//     accordingly. The controller does not create an InferencePool or EPP itself.
+//
+//  2. Endpoint Picker customization. When EndpointPicker is set, the controller


Clarify that the EPP is still managed by the controller when EndpointPicker is set.

jaellio · 2026-06-15T21:29:35Z

 kind: EndpointPickerConfig
-`,
+`
+		if overrides != nil && overrides.ConfigData != "" {


Maybe add some debug logs for these overrides

jaellio · 2026-06-15T21:32:17Z

 Some inference providers (e.g., NVIDIA Dynamo, llm-d) have native Gateway API Inference Extension support with their own InferencePool and Endpoint Picker (EPP). These providers deploy specialized EPPs with capabilities beyond the generic upstream EPP — for example, Dynamo's EPP uses **KV-cache-aware scoring** to route requests to endpoints with the highest KV cache hit probability.

-When a provider declares gateway capabilities in its `InferenceProviderConfig`, the controller **delegates** InferencePool and/or EPP management to the provider instead of creating its own.
+When a provider declares gateway capabilities in its `InferenceProviderConfig`, the controller adapts what it creates. Two extension points exist and can be used independently:


Not independent since endpointPicker is ignored if managesInferencePool is set

jaellio · 2026-06-15T21:34:19Z

 |---|---|---|
-| `managesInferencePool` | Controller waits for the provider's InferencePool to exist, then uses it as the HTTPRoute backend. Skips `reconcileInferencePool()` and `labelModelPods()`. | Controller creates and owns the InferencePool (default behavior). |
-| `managesEPP` | Controller does nothing. | Controller deploys the generic upstream EPP. |
+| `managesInferencePool: true` | Controller waits for the provider's InferencePool to exist, then uses it as the HTTPRoute backend. Skips `reconcileInferencePool()`, `reconcileEPP()`, and `labelModelPods()`. | Controller creates and owns the InferencePool and the EPP (default behavior). |


true here makes me think this is the default value. Is that correct? Maybe add another column for "default value"

feat: Define provider specific gateway capabilities for llm-d

98e5be6

Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>

Copilot AI review requested due to automatic review settings May 19, 2026 19:19

Copilot started reviewing on behalf of ericdbishop May 19, 2026 19:20

Copilot AI reviewed May 19, 2026

Comment thread controller/api/v1alpha1/inferenceproviderconfig_types.go Outdated

ericdbishop added 2 commits June 11, 2026 12:16

Merge branch 'main' into llmd-gateway-capabilities

295d0c3

Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>

make gen

3279c27

Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>

Copilot AI review requested due to automatic review settings June 15, 2026 14:57

Copilot AI reviewed Jun 15, 2026

Comment thread providers/llmd/transformer_test.go

fix image for llm-d epp

e245877

Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>

ericdbishop marked this pull request as ready for review June 15, 2026 19:54

ericdbishop requested a review from a team as a code owner June 15, 2026 19:54

Copilot AI review requested due to automatic review settings June 15, 2026 19:54

Merge branch 'main' into llmd-gateway-capabilities

7ca0552

Copilot AI reviewed Jun 15, 2026

Comment thread providers/llmd/transformer.go

ericdbishop added 2 commits June 15, 2026 16:53

fix image for llm-d epp

c7ef236

Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>

docs update

990e3a6

Signed-off-by: Eric Bishop <ericbish.dev@gmail.com>

Copilot AI review requested due to automatic review settings June 15, 2026 20:53

Copilot AI reviewed Jun 15, 2026

jaellio reviewed Jun 15, 2026

feat: Define provider specific gateway capabilities for llm-d #288
ericdbishop wants to merge 7 commits into kaito-project:main from ericdbishop:llmd-gateway-capabilities

3 participants
