[WIP] Groundwork to support OpenAI API endpoints that vLLM supports #526
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: kfswain
For these passthrough API endpoints, users should set up policies that tell Envoy not to send the request to EPP at all. It's better to short-circuit in Envoy.
var (
	// PassthroughEndpoints are informational endpoints that do not have a model param,
	// and do NOT run inference, so can be passed to any underlying model server at random.
	PassthroughEndpoints map[string]bool = map[string]bool{
where is this used?
It's not currently.
I just threw this together to have a hard example of what I was thinking.
I should have had a WIP prefix on this PR; I do now. Apologies.
This PR sets up Envoy to share request path attributes with EPP, and creates a set of maps to determine which routes we allow and which are just passthrough.
Optionally, we can decide that EPP should do no route enforcement, in which case we only map the RoutableEndpoints and assume any other route is just passthrough.
The route list was made from an intersection of the endpoints in:
&&
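To make the two options in the description concrete, here is a minimal sketch of how the maps might be consulted. It is not code from this PR: `RouteKind`, `ClassifyPath`, and the `enforce` flag are hypothetical names, and the map entries are illustrative paths only, not the PR's actual route list.

```go
package main

import "fmt"

// RouteKind is a hypothetical type describing how EPP would treat a request path.
type RouteKind int

const (
	Routable    RouteKind = iota // carries a model param; EPP picks an endpoint
	Passthrough                  // informational; any model server can answer
	Rejected                     // unknown path, only possible when enforcing
)

// Illustrative entries only (assumption); the real list lives in the PR.
var RoutableEndpoints = map[string]bool{
	"/v1/completions":      true,
	"/v1/chat/completions": true,
}

// Illustrative entries only (assumption).
var PassthroughEndpoints = map[string]bool{
	"/v1/models": true,
	"/health":    true,
}

// ClassifyPath is a hypothetical helper. With enforce=true, paths found in
// neither map are rejected. With enforce=false, only RoutableEndpoints is
// authoritative and every other path is treated as passthrough.
func ClassifyPath(path string, enforce bool) RouteKind {
	switch {
	case RoutableEndpoints[path]:
		return Routable
	case PassthroughEndpoints[path] || !enforce:
		return Passthrough
	default:
		return Rejected
	}
}

func main() {
	for _, p := range []string{"/v1/chat/completions", "/v1/models", "/unknown"} {
		fmt.Println(p, ClassifyPath(p, true), ClassifyPath(p, false))
	}
}
```

With `enforce` set to false, the `PassthroughEndpoints` map becomes redundant, which matches the "only map the RoutableEndpoints" option. If Envoy short-circuits passthrough paths as suggested in review, EPP would never see them and neither branch for unknown paths would be reached in practice.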