@Gregory-Pereira Gregory-Pereira commented Dec 13, 2025

What type of PR is this?
/kind cleanup
/kind feature

What this PR does / why we need it:

Enables use of the existing InferenceObjective CR.

Does this PR introduce a user-facing change?:
NONE. This simply exposes the InferenceObjective for the InferencePool in the Helm charts.

netlify bot commented Dec 13, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | db76251 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6940304a6ff8670008660d30 |
| 😎 Deploy Preview | https://deploy-preview-1995--gateway-api-inference-extension.netlify.app |

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Gregory-Pereira
Once this PR has been reviewed and has the lgtm label, please assign danehans for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 13, 2025
@k8s-ci-robot

Hi @Gregory-Pereira. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 13, 2025

shmuelk commented Dec 14, 2025

This PR looks ok, but somehow I think it's missing something.

It is creating a single InferenceObjective with a name that matches the Helm Release Name.

As I understand it, the InferenceObjective is referenced by the x-gateway-inference-objective header sent with the request, so it is a request-related thing. I would expect to be able to create several InferenceObjectives, each with a different name and a different priority.
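To illustrate the request-level nature of the header mentioned above, here is a minimal Go sketch of a client tagging a request with an objective name. Only the header name comes from this thread; the gateway URL, the JSON body, and the objective name `high-priority` are hypothetical placeholders, and `newObjectiveRequest` is an illustrative helper, not part of any project API.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// objectiveHeader is the header name quoted in the review comment.
const objectiveHeader = "x-gateway-inference-objective"

// newObjectiveRequest builds a POST request tagged with an InferenceObjective
// name. The URL, body, and objective name are placeholders chosen by the caller.
func newObjectiveRequest(url, objective string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	// Each request names the objective it should be served under.
	req.Header.Set(objectiveHeader, objective)
	return req, nil
}

func main() {
	// Hypothetical endpoint and objective name, for illustration only.
	req, err := newObjectiveRequest(
		"http://gateway.example/v1/completions",
		"high-priority",
		[]byte(`{"prompt":"hi"}`),
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get(objectiveHeader))
}
```

Because the objective is chosen per request, two clients of the same pool could send different objective names, which is why a single chart-created objective may be too limiting.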

@Gregory-Pereira

Good point. I will update the implementation so that users can define all the InferenceObjectives they want to associate with the InferencePool.

Signed-off-by: greg pereira <[email protected]>
@nirrozenbaum

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 15, 2025
@Gregory-Pereira Gregory-Pereira force-pushed the add-optional-inference-objective branch from ca76a99 to ff97818 Compare December 15, 2025 15:12

ahg-g commented Dec 15, 2025

Can you please discuss the motivation for this? I see some value, but infObjs are resources that will be created/updated/deleted after the infPool is created, meaning new objectives will likely be added or deleted later.

@Gregory-Pereira Gregory-Pereira force-pushed the add-optional-inference-objective branch from e000098 to 6751dd6 Compare December 15, 2025 15:57
… over inference objectives

Signed-off-by: greg pereira <[email protected]>
@Gregory-Pereira Gregory-Pereira force-pushed the add-optional-inference-objective branch from 6751dd6 to db76251 Compare December 15, 2025 15:59
@Gregory-Pereira

> Can you please discuss the motivation for this? I see some value, but infObjs are resources that will be created/updated/deleted after the infPool is created, meaning new objectives will likely be added or deleted later.

I saw the value in automating their creation and deletion: this way they are created and cleaned up together with the Helm chart. That does not prevent others from adding more out of band. I started on this in preparation for the Flow Control integration work, with an LLM-D guide in mind that could showcase it.

  priority: {{ .priority }}
  poolRef:
    group: {{ .Values.inferenceExtension.apiVersion }}
    name: {{ .name }}

another miss?

Suggested change
- name: {{ .name }}
+ name: {{ .Release.Name }}

@ahg-g
Copy link
Contributor

ahg-g commented Dec 15, 2025

ok, I can see value in cases where for the most part the objectives are known in advance and mostly static

kind: InferenceObjective
metadata:
  name: {{ .name }}
  namespace: {{ $.Release.Namespace }}

OOC, why do we need the $ here? Shouldn't we use the following instead?

Suggested change
- namespace: {{ $.Release.Namespace }}
+ namespace: {{ .Release.Namespace }}
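On the `$` question: in Go's `text/template`, `$` is set to the data passed to `Execute`, while `.` is rebound to the current element inside a `range` block. The sketch below demonstrates that rule with plain Go structs. The `root` and `item` types are hypothetical stand-ins; Helm's real root context is map-based, so the exact failure mode there may differ, and this assumes the snippet under review sits inside a `range` loop (which the use of `.name` and `.priority` suggests but the thread does not show).

```go
package main

import (
	"fmt"
	"io"
	"text/template"
)

// item is a stand-in for one list entry; note that it has no Namespace field.
type item struct{ Name string }

// root is a stand-in for the top-level template data.
type root struct {
	Namespace string
	Items     []item
}

// execErr parses src and executes it against a fixed root value,
// returning any parse or execution error.
func execErr(src string) error {
	tmpl, err := template.New("t").Parse(src)
	if err != nil {
		return err
	}
	data := root{Namespace: "default", Items: []item{{Name: "a"}}}
	return tmpl.Execute(io.Discard, data)
}

func main() {
	// Inside range, "." is the current item, so .Namespace cannot be resolved.
	dotFails := execErr(`{{ range .Items }}{{ .Namespace }}{{ end }}`) != nil
	// "$" still refers to the root value passed to Execute.
	dollarWorks := execErr(`{{ range .Items }}{{ $.Namespace }}{{ end }}`) == nil
	fmt.Println("dot fails inside range:", dotFails)
	fmt.Println("dollar works inside range:", dollarWorks)
}
```

Both lines report `true`. Under the same assumption, any reference to release-level values from inside the loop would need the `$` prefix to reach the root context.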

| `inferenceExtension.sidecar.volumeMounts` | List of volume mounts for the sidecar container. Optional. |
| `inferenceExtension.sidecar.volumes` | List of volumes for the sidecar container. Optional. |
| `inferenceExtension.sidecar.configMapData` | Custom key-value pairs to be included in a ConfigMap created for the sidecar container. Only used when `inferenceExtension.sidecar.enabled` is `true`. Optional. |
| `inferenceObjectives` | A list of names and priorities to create InferenceObjectives from that will be assigned to the inference pool |

Recommend documenting that this is for the case where the objectives are known in advance and mostly static, and that the user can still add/update/delete objectives later.

# maxRequestsPerConnection: 256000


# Optional: Define multiple InferenceObjectives for this InferencePool.

kfswain commented Dec 15, 2025

Agreed with the other comments here. As long as we communicate clearly that there isn't a need to correlate the infObjectives at Pool creation, this all seems reasonable to me
