-
Notifications
You must be signed in to change notification settings - Fork 207
BBR pluggable framework proposal #1964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
BBR pluggable framework proposal #1964
Conversation
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: davidbreitgand The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @davidbreitgand. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
4ffe090 to
af49267
Compare
af49267 to
377259c
Compare
377259c to
cd109fd
Compare
cd109fd to
9b24a69
Compare
|
@srampal , @elevran , @nirrozenbaum could you please review? Initial implementation: PR 1981 |
|
|
||
| - Avoid monolithic architecture | ||
| - Mimic pluggability and configurability of the scheduling subsystem without coupling between the two | ||
| - Enable organizing plugins into a topology for ordered and concurrent execution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider grouping goals into sections (e.g., splitting goals into required and optional, generalizing into categories).
We might be able to support ordered and concurrent execution at a later stage, when a concrete requirement is defined.
|
|
||
| ### Overview | ||
|
|
||
| There is an embedded `BBRPlugin` interface building on the `Plugin` interface adopted from EPP. This interface should be implemented by any BBR plugin. Each pluigin is identified by its `TypedName` (adopted from EPP), where `TypedName().Type` gives the string representing the type of the plugin and `TypedName().Name()` returns the string representing the plugins implementation. BBR is refactored to implement the registered factory pattern. To that end, a `PluginRegistry` interface and its implementation are added to register `BBRPlugin` factories and concrete implementations created by the factories. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| There is an embedded `BBRPlugin` interface building on the `Plugin` interface adopted from EPP. This interface should be implemented by any BBR plugin. Each pluigin is identified by its `TypedName` (adopted from EPP), where `TypedName().Type` gives the string representing the type of the plugin and `TypedName().Name()` returns the string representing the plugins implementation. BBR is refactored to implement the registered factory pattern. To that end, a `PluginRegistry` interface and its implementation are added to register `BBRPlugin` factories and concrete implementations created by the factories. | |
| There is an embedded `BBRPlugin` interface building on the `Plugin` interface adopted from EPP. This interface should be implemented by any BBR plugin. Each plugin is identified by its `TypedName` (adopted from EPP), where `TypedName().Type` gives the string representing the type of the plugin and `TypedName().Name()` returns the string representing the plugin's instance name. BBR is refactored to implement the registered factory pattern. To that end, a `PluginRegistry` interface and its implementation are added to register `BBRPlugin` factories and concrete implementations created by the factories. |
nit: In EPP factories are a concrete implementation and not an interface. What is the reason for the factory abstraction?
| ### Overview | ||
|
|
||
| There is an embedded `BBRPlugin` interface building on the `Plugin` interface adopted from EPP. This interface should be implemented by any BBR plugin. Each pluigin is identified by its `TypedName` (adopted from EPP), where `TypedName().Type` gives the string representing the type of the plugin and `TypedName().Name()` returns the string representing the plugins implementation. BBR is refactored to implement the registered factory pattern. To that end, a `PluginRegistry` interface and its implementation are added to register `BBRPlugin` factories and concrete implementations created by the factories. | ||
| In addition, a `PluginsChain` interface is defined to define an order of plugin executions. In the future, `PluginsChain` will be replaced by `PluginsDAG` to allow for more complex topological order and concurrency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the first implementation it might be worthwhile to treat it as a single plugin solely for "body based routing" functionality. Chains and DAGs might be desireable once the BBR changes to support body based processing and not just routing.
In EPP, we had initially supported a single profile and multiple profiles (e.g., for P/D disagg) were supported by wrapping multiple scheduler instances behind a single instance implementation.
| RequiresFullParsing() bool | ||
|
|
||
| // Execute runs the plugin logic on the request body. | ||
| // A plugin's imnplementation logic CAN mutate the body of the message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // A plugin's imnplementation logic CAN mutate the body of the message. | |
| // A plugin's implementation logic CAN mutate the body of the message. |
| // which, say, stands for "select a best model for this request at minimal cost" | ||
| // A plugin implementation of "semantic-model-selector" sets X-Gateway-Model-Name to any valid | ||
| // model name from the inventory of the backend models and also mutates the body accordingly | ||
| // In contrast, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sentence cut off?
| String() string //human readable string for logging | ||
| } | ||
|
|
||
| // PluginsChain is used to define a specific order of execution of the BBRPlugin instances stored in the registry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest moving this to future (once the need arises, will have concrete requirements for chain or DAG, parallel or sequential execution, etc.).
| ```go | ||
|
|
||
| const ( | ||
| //A deafult plugin implementation of this plugin type will always be configured for request plugins chain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| //A deafult plugin implementation of this plugin type will always be configured for request plugins chain | |
| // A default plugin implementation of this plugin type will always be configured for request plugins chain. |
nit: please close comment lines with . when the next comment starts a new sentence (upper case...)
| const ( | ||
| //A deafult plugin implementation of this plugin type will always be configured for request plugins chain | ||
| //Even though BBRPlugin type is not (yet) a K8s resource, it's logically akin to `kind` | ||
| //MUST start wit an upper case letter, use CamelNotation, only aplhanumericals after the first letter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| //MUST start wit an upper case letter, use CamelNotation, only aplhanumericals after the first letter | |
| // MUST start with an upper case letter, use CamelNotation, only aplhanumericals after the first letter |
|
|
||
| const ( | ||
| //A deafult plugin implementation of this plugin type will always be configured for request plugins chain | ||
| //Even though BBRPlugin type is not (yet) a K8s resource, it's logically akin to `kind` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a limitation we have in EPP. Perhaps delay it until a more formal k8s Plugin type is defined in some AI gateway work stream?
| ) | ||
| ``` | ||
|
|
||
| ### Current BBR reimplementation as BBRPlugin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we need to have (most of) the code duplicated. Suffice to say the current implementation can be made compliant with the proposed interface.
| headers map[string]string, | ||
| mutatedBodyBytes []byte, | ||
| err error, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: prefer single line if possible (headers map[string]string, mutatedBodyBytes []byte, err error).
also, avoid named returns unless it's required to disambiguate.
| // placeholder for BBRPlugin constructors | ||
| type PluginFactoryFunc func() bbrplugins.BBRPlugin //concrete constructors are assigned to this type | ||
|
|
||
| // PluginRegistry defines operations for managing plugin factories and plugin instances |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a rather wide interface. Consider removing functions or splitting into separate and indepenent interfaces.
| type PluginRegistry interface { | ||
| RegisterFactory(typeKey string, factory PluginFactoryFunc) error //constructors | ||
| RegisterPlugin(plugin bbrplugins.BBRPlugin) error //registers a plugin instance (the instance MUST be created via the factory first) | ||
| GetFactory(typeKey string) (PluginFactoryFunc, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider from the caller's point of view - which methods are needed in the abstractions. It would produce a more minimal interface (e.g., register-factory, create-instance, get-instance may be sufficient).
| The pluggable framework will be implemented iteratively over several phases. | ||
|
|
||
| 1. Introduce `BBRPlugin` `MetadataExtractor`, interface, registry, plugins chain, sample plugin implementation (`SimpleModelExtraction`) and its factory. Plugin configuration will be implemented via environment variables set in helm chart | ||
| 1. Introduce a second plugin interface, `ModelSelector` and sample plugin implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: for the initial task (various ways to do body based routing), a ModelSelector interface would have been sufficient. Why do we need multiple interfaces and what's the delta between interface in (1) and in (2)?
|
|
||
| 1. Introduce `BBRPlugin` `MetadataExtractor`, interface, registry, plugins chain, sample plugin implementation (`SimpleModelExtraction`) and its factory. Plugin configuration will be implemented via environment variables set in helm chart | ||
| 1. Introduce a second plugin interface, `ModelSelector` and sample plugin implementation | ||
| 1. Introduce shared struct (shared among the plugins of a plugins chain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that does what?
| 1. Introduce `BBRPlugin` `MetadataExtractor`, interface, registry, plugins chain, sample plugin implementation (`SimpleModelExtraction`) and its factory. Plugin configuration will be implemented via environment variables set in helm chart | ||
| 1. Introduce a second plugin interface, `ModelSelector` and sample plugin implementation | ||
| 1. Introduce shared struct (shared among the plugins of a plugins chain) | ||
| 1. Introduce an interface for guardrail plugin, introduce simple reference implementation, experiment with plugins chains on request and response messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggets making this out of scope for the implementation focusing on model selection in BBR and not other features.
|
|
||
| ## Open Questions | ||
|
|
||
| 1. More elaborate shared memory architecture for the best performance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Consider mentioning performance numbers (e.g. for multiple naive parsing?) or other use cases to motivate this
|
|
||
| // PluginsChain is used to define a specific order of execution of the BBRPlugin instances stored in the registry | ||
| // The BBRPlugin instances | ||
| type PluginsChain interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: wide interface... which of these are needed by the caller?
Perhaps only support append initially and run to execute the chain?
Also, shouldn't Chain also implement Plugin?
| ParseChatCompletion(data []byte) (openai.ChatCompletionNewParams, error) //parses the bytes slice into an appropriate openai-go struct | ||
| ParseCompletion(data []byte) (openai.CompletionNewParams, error) //likewise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why needed (as opposed to just Run)?
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #1963This is a design proposal PR, it does not introduce code changes.
Does this PR introduce a user-facing change?: