[v0.1 API Review] Cleaning up optional fields/clearer wording #185

Open
wants to merge 10 commits into
base: main

Collaborator

@kfswain kfswain commented Jan 10, 2025

This PR responds to #167 and is intended to address these comments.

CC: @smarterclayton

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 10, 2025
@kfswain kfswain changed the title [v0.1 API Review] Cleaning up doc wording [v0.1 API Review] Cleaning up optional fields/clearer wording Jan 13, 2025
@kfswain kfswain mentioned this pull request Jan 13, 2025
Contributor

@danehans danehans left a comment

A few nits, otherwise /lgtm.

api/v1alpha1/inferencemodel_types.go (outdated, resolved)
api/v1alpha1/inferencemodel_types.go (outdated, resolved)
api/v1alpha1/inferencemodel_types.go (outdated, resolved)
api/v1alpha1/inferencepool_types.go (outdated, resolved)
docs/proposals/002-api-proposal/proposal.md (resolved)
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans, kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 13, 2025
Member

@robscott robscott left a comment

Thanks @kfswain! One more nit, otherwise LGTM

docs/proposals/002-api-proposal/proposal.md (outdated, resolved)
api/v1alpha1/inferencemodel_types.go (outdated, resolved)
// Criticality impacts how traffic is handled in resource constrained situations. It handles this by
// queuing or rejecting requests of lower criticality. InferenceModels of an equivalent Criticality will
// fairly share resources over throughput of tokens. In the future, the metric used to calculate fairness,
// and the proportionality of fairness will be configurable.
Contributor

Now that we don't default the value in the API, the implementation needs to default it, so I created #198 to clearly document that the default value will be sheddable in the reference implementation.

Contributor

Slack thread summary: add back the default Criticality option under the name Standard, and document that it is expected to be the default even though the API doesn't enforce it.

Contributor

@ahg-g ahg-g Jan 17, 2025

Can we document here that implementations should default this value to Standard when not set?
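
For illustration, a minimal sketch of what implementation-side defaulting to Standard could look like. It assumes the field is an optional pointer so that "unset" can be told apart from an explicit value. The trimmed-down `InferenceModelSpec`, the `Critical` constant, and the helper `EffectiveCriticality` are hypothetical names for this example and are not taken from the reference implementation.

```go
package main

import "fmt"

// Criticality is a string-typed enum, as discussed in this thread.
// Standard and Sheddable are named in the thread; Critical is an
// assumption added for illustration.
type Criticality string

const (
	Critical  Criticality = "Critical"
	Standard  Criticality = "Standard"
	Sheddable Criticality = "Sheddable"
)

// InferenceModelSpec is a hypothetical, trimmed-down spec. The field is a
// pointer so that an unset value (nil) is distinguishable from any
// explicitly chosen value.
type InferenceModelSpec struct {
	Criticality *Criticality `json:"criticality,omitempty"`
}

// EffectiveCriticality returns the value an implementation would act on.
// The API does not default the field, so the implementation falls back to
// Standard when it is unset.
func EffectiveCriticality(spec InferenceModelSpec) Criticality {
	if spec.Criticality == nil {
		return Standard
	}
	return *spec.Criticality
}

func main() {
	var unset InferenceModelSpec

	c := Sheddable
	explicit := InferenceModelSpec{Criticality: &c}

	fmt.Println(EffectiveCriticality(unset))
	fmt.Println(EffectiveCriticality(explicit))
}
```

Keeping the fallback in one helper means every code path that reads the field sees the same default, which matches the intent of documenting the default rather than enforcing it in the API.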

api/v1alpha1/inferencemodel_types.go (outdated, resolved)

netlify bot commented Jan 15, 2025

Deploy Preview for gateway-api-inference-extension ready!

🔨 Latest commit: 6fe810d
🔍 Latest deploy log: https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67897d8221f8b900088ecd47
😎 Deploy Preview: https://deploy-preview-185--gateway-api-inference-extension.netlify.app

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
5 participants