Conversation

@capri-xiyue
Contributor

What type of PR is this?

What this PR does / why we need it:
see #1778

Which issue(s) this PR fixes:

Fixes #1778

Does this PR introduce a user-facing change?:

NONE

@netlify

netlify bot commented Nov 5, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 92d36bd
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/690bebaaef32910008007bf7
😎 Deploy Preview https://deploy-preview-1821--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 5, 2025
@kfswain
Collaborator

kfswain commented Nov 5, 2025

Thanks @capri-xiyue! Can we make this a distinct helm chart? Since we call this current chart inferencePool it would be odd to have a mode in that chart that doesn't even use inferencePool.

@capri-xiyue
Contributor Author

capri-xiyue commented Nov 5, 2025

@kfswain At this stage, EPP standalone mode still uses InferencePool, which is why I didn't put it in a distinct chart. Later, if we decide to have a standalone EPP without any k8s CRD, I will refactor it into a distinct helm chart.

To clarify, this PR just removes the Gateway API dependency; the InferencePool API is still used.

@capri-xiyue
Contributor Author

/assign @ahg-g

@kfswain
Collaborator

kfswain commented Nov 5, 2025

That may lead to confusing UX when this is deployed in a cluster that has a GW controller, since the controller will attempt to reconcile the InferencePool and integrate it into the GW system. Is modifying EPP to just accept a selector not a viable path forward?

@capri-xiyue
Contributor Author

capri-xiyue commented Nov 5, 2025

I talked with @ahg-g earlier; modifying EPP to just accept a selector needs further discussion. He therefore suggested that I first finalize EPP with Envoy proxy, deployed through the helm chart.

I'm curious now: what happens when an InferencePool is deployed in a cluster with two GW controllers (for example, kgateway and Istio)? Will it cause issues here? Initially I thought each GW controller would be able to handle this case.

@ahg-g
Contributor

ahg-g commented Nov 5, 2025

Yeah, my suggestion is to take a gradual approach. A gateway controller should not care about an InferencePool that is not referenced by an HTTPRoute.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: capri-xiyue
Once this PR has been reviewed and has the lgtm label, please ask for approval from ahg-g. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@capri-xiyue capri-xiyue requested a review from ahg-g November 6, 2025 00:28
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 6, 2025
@capri-xiyue
Copy link
Contributor Author

capri-xiyue commented Nov 6, 2025

As an update, I'm now working on another PR that modifies EPP to just accept a selector, and I will refactor the helm chart into a distinct one, since no InferencePool is needed in that PR. The EPP refactor will probably take a little while, as fixing a bunch of unit tests takes time. Will let you know when I send the PR. @kfswain

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Standalone EPP - Proxy Replacement Investigation

4 participants