[Inference API] Add Custom Model support to Inference API #124299
Conversation
Pinging @elastic/ml-core (Team:ML)
Thank you @Huaixinww, I love the idea of
@davidkyle This feature is very useful for our AlibabaCloud users. We will continue improving the feature, and we can discuss how to bring it into the Elasticsearch community.
@Huaixinww and @weizijun, the ML team at Elasticsearch loves this idea and considers it a core feature for the Inference API. Going forward we would like to take on this work in a new PR. I've opted to create a new PR because CI does not run automatically against PRs from external contributors, and that would slow the development process. I've opened #125679, which contains all the commits from this PR plus some fixes for the build system. We will review the PR and add any missing tests, etc. With your permission we would like to make some minor changes:
Would become
With these changes the embedding example from the PR description would look like this:
@davidkyle Our design pattern is from OpenAPI. I think your three changes are okay, and both methods are fine, no problem.
I messed up Dave's PR so opened a new one here: #127939
Add Custom Model support to Inference API.
You can use this inference API to invoke any model that is served over HTTP.
Inference Model Creation:
Supported task_type values:
Parameter Description
- `secret_parameters`: secret parameters such as `api_key` can be defined here.
- `query_string` (optional): the HTTP request's query parameters.
- `headers` (optional): the HTTP request's header parameters.
- `request.format`: only `string` is supported for now.
- `request.content`: the body structure of the request; pass in the string-escaped result of the JSON-format HTTP request body.
- `response.json_parser`: we need to parse the returned response into an object that Elasticsearch can recognize (`TextEmbeddingFloatResults`, `SparseEmbeddingResults`, `RankedDocsResults`, `ChatCompletionResults`), so we use JsonPath syntax to extract the necessary content from the response. (For the `text_embedding` task type, we need a `List<List<Float>>` object; the same applies to the other types.) Different task types take different `json_parser` parameters.
- `task_settings.parameters`: due to limitations of the inference framework, if the model requires more parameters to be configured, they can be set in `task_settings.parameters`. These parameters can be placed in the request body as placeholders and are replaced with the configured values when the request is constructed.

Testing
We use the Alibaba Cloud AI Search Model as an example. Please replace the value of
`secret_parameters.api_key`
with your own api_key.

text_embedding
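To illustrate what the `text_embedding` json_parser conceptually does, here is a minimal Python sketch. This is not the Elasticsearch implementation; it hand-walks a path equivalent to `$.result.embeddings[*].embedding`, and the response field names are assumptions for the example:

```python
def extract_embeddings(response: dict) -> list[list[float]]:
    """Extract a List[List[float]], the shape the text_embedding task needs,
    by walking the path $.result.embeddings[*].embedding by hand."""
    return [item["embedding"] for item in response["result"]["embeddings"]]


# A hypothetical provider response shaped like an embedding API's output
# (field names are illustrative assumptions).
response = {
    "result": {
        "embeddings": [
            {"index": 0, "embedding": [0.1, 0.2, 0.3]},
            {"index": 1, "embedding": [0.4, 0.5, 0.6]},
        ]
    }
}

print(extract_embeddings(response))  # [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
```

The real parser evaluates a user-supplied JsonPath expression rather than a hard-coded path, but the extracted shape is the same.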
sparse_embedding
rerank
completion
In the completion module, we demonstrate how to use the
`task_settings.parameters`
parameter for more flexible parameter configuration. For the interface definition of the Alibaba Cloud AI Search completion API, please refer to the official documentation: alibaba cloud ai search completion api doc
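The placeholder substitution that `task_settings.parameters` enables can be sketched in a few lines of Python. The function name and `${name}` placeholder syntax are assumptions for illustration; the actual implementation lives in the PR:

```python
import json


def render_request_body(template: str, parameters: dict) -> str:
    """Replace ${name} placeholders in a request.content template with
    JSON-encoded values from task_settings.parameters."""
    body = template
    for name, value in parameters.items():
        body = body.replace("${" + name + "}", json.dumps(value))
    return body


template = '{"model": "ops-qwen-turbo", "temperature": ${temperature}}'
params = {"temperature": 0.8}
print(render_request_body(template, params))
# {"model": "ops-qwen-turbo", "temperature": 0.8}
```

JSON-encoding each value keeps strings quoted and numbers bare, so the rendered body stays valid JSON.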
custom
We use query-analyze as an example.
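Since the custom task type's request and response shapes are user-defined, invoking it is an ordinary inference call. A hypothetical query-analyze invocation might look like this (the endpoint id and input are illustrative, not taken from the PR):

```json
POST _inference/custom/my_query_analyze
{
  "input": "wireless noise-cancelling headphones"
}
```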