WIP
Taking the ideas and commits from #124299
Notable changes from initial PR:
path and method nesting
url field
query_string and converted it to a list of tuples, to leverage
description and version as they weren't used
sparse_result and value fields
response.error_parser to indicate the location to find the error message field
path
path field to tell it where to find that nested map
format field that specifies how the response is structured (ELSER's structure is an array of maps, where the key is the token ID and the value is the weight; this parser expects the map to have a token ID field and a weight field)
Add Custom Model support to Inference API.
You can use this Inference API to invoke models that support the HTTP format.
Inference Endpoint Creation:
Endpoint creation
Support task_type
Parameter Description
secret_parameters: Secret parameters such as api_key can be defined here.
headers (optional): HTTP header parameters.
request.content: The body of the request. It must be passed as the string-escaped form of the JSON HTTP request body.
NOTE: Unfortunately, if we aren't using Kibana, the content string needs to be a single line.
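One way to produce a single-line, string-escaped value for request.content is to serialize the body twice. This is only a sketch: the helper name escape_request_body and the sample body fields are hypothetical and are not taken from this PR.

```python
import json


def escape_request_body(body: dict) -> str:
    """Return `body` as single-line JSON, escaped for embedding in a JSON string."""
    # First pass: compact JSON with no newlines or extra spaces.
    compact = json.dumps(body, separators=(",", ":"))
    # Second pass: escape quotes and backslashes, then drop the outer quotes
    # so the result can be pasted between the quotes of a string value.
    return json.dumps(compact)[1:-1]


# Hypothetical request body, for illustration only.
body = {"input": ["hello", "world"], "model": "example-model"}
escaped = escape_request_body(body)
print(escaped)
```

The printed string can be pasted between the quotes of the content value in the endpoint definition; parsing that value back as JSON yields the original body.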
response.json_parser: We need to parse the returned response into an object that Elasticsearch can recognize (TextEmbeddingFloatResults, SparseEmbeddingResults, RankedDocsResults, ChatCompletionResults). Therefore, we use JSONPath syntax to extract the necessary content from the response. (For the text_embedding type, we need a List<List<Float>> object. The same applies to other types.)
Different task types have different json_parser parameters.
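To illustrate how a JSONPath expression can turn a provider response into the list-of-lists shape needed for text_embedding, here is a minimal sketch. It implements only a tiny JSONPath subset ($, .field, [index], [*]) and is not the parser used by this PR. The response shape and the path $.data[*].embedding are assumptions chosen for the example.

```python
import re
from typing import Any

# Matches either `.field` or `[index]` / `[*]`.
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\*|\d+)\]")


def extract(document: Any, path: str) -> Any:
    """Evaluate a tiny JSONPath subset against parsed JSON."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    rest, pos, tokens = path[1:], 0, []
    while pos < len(rest):
        match = _TOKEN.match(rest, pos)
        if match is None:
            raise ValueError(f"unsupported path segment at offset {pos + 1}")
        tokens.append(match)
        pos = match.end()

    current, used_wildcard = [document], False
    for match in tokens:
        field, index = match.group(1), match.group(2)
        if field is not None:
            current = [node[field] for node in current]
        elif index == "*":
            # Fan out over every element of each matched array.
            current = [item for node in current for item in node]
            used_wildcard = True
        else:
            current = [node[int(index)] for node in current]
    # A wildcard path yields a list of matches; otherwise a single value.
    return current if used_wildcard else current[0]


# Hypothetical provider response, for illustration only.
response = {"data": [{"embedding": [0.1, 0.2]}, {"embedding": [0.3, 0.4]}]}
embeddings = extract(response, "$.data[*].embedding")
```

Here embeddings is a list with one inner list of floats per input, which corresponds to the List<List<Float>> shape described above.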
response.error_parser: Since each third-party service can have its own error response format, the user needs to give us the location from which to retrieve the base error message. For example, OpenAI's error structure is documented here: https://platform.openai.com/docs/api-reference/realtime-server-events/error. We'd want to extract the message field. An example of that might look like:
task_settings.parameters: Due to the limitations of the inference framework, if the model requires more parameters to be configured, they can be set in task_settings.parameters. These parameters can be placed in the request.body as placeholders and are replaced with the configured values when the request is constructed.
Testing
🚧 In progress
Jon Testing
OpenAI
Text Embedding
Cohere
Rerank
Azure OpenAI
Alibaba Testing
We use the Alibaba Cloud AI Search model as an example.
Please replace the value of secret_parameters.api_key with your api_key.
text_embedding
sparse_embedding
rerank