[v0.1 API Review] Cleaning up optional fields/clearer wording #185
A few nits, otherwise /lgtm.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: danehans, kfswain
Thanks @kfswain! One more nit, otherwise LGTM
// Criticality impacts how traffic is handled in resource constrained situations. It handles this by
// queuing or rejecting requests of lower criticality. InferenceModels of an equivalent Criticality will
// fairly share resources over throughput of tokens. In the future, the metric used to calculate fairness,
// and the proportionality of fairness will be configurable.
Now that we don't default the value in the API, the implementation needs to default it, so I created #198 to clearly document that the default value will be sheddable in the reference implementation.
Slack thread summary: Add back the default Criticality option, named Standard, and document that this is expected to be the default even though the API doesn't enforce it.
Can we document here that implementations should default this value to Standard when not set?
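For illustration only, here is a minimal sketch of what implementation-side defaulting could look like when the API leaves the field optional and unset. The type and function names (`InferenceModelSpec`, `EffectiveCriticality`) and the constant set are simplified assumptions for this comment, not the actual API types in this PR.

```go
package main

import "fmt"

// Criticality is a hypothetical stand-in for the API's criticality type.
type Criticality string

const (
	// Standard is the value the thread proposes implementations treat as the default.
	Standard Criticality = "Standard"
	// Sheddable is the lower-priority value mentioned in the thread.
	Sheddable Criticality = "Sheddable"
)

// InferenceModelSpec is a simplified, hypothetical spec with an optional field.
type InferenceModelSpec struct {
	// Criticality is a pointer so that "not set" (nil) is distinguishable
	// from any explicit value; the API itself applies no default.
	Criticality *Criticality
}

// EffectiveCriticality returns the value an implementation would act on:
// the explicit value if present, otherwise Standard.
func EffectiveCriticality(spec InferenceModelSpec) Criticality {
	if spec.Criticality == nil {
		return Standard
	}
	return *spec.Criticality
}

func main() {
	unset := InferenceModelSpec{}
	s := Sheddable
	set := InferenceModelSpec{Criticality: &s}

	fmt.Println(EffectiveCriticality(unset))
	fmt.Println(EffectiveCriticality(set))
}
```

Keeping the field as a pointer means the stored object still records that the user did not choose a value, so the documented default lives in the implementation and not in the API schema.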
✅ Deploy Preview for gateway-api-inference-extension ready!
This is in response to #167, and intends to address these comments.
CC: @smarterclayton