Follow-up to #8515.
Currently, to use an embedding model, the user has to create a new type in the schema with the correct annotations before being allowed to use it. This is cumbersome and potentially error-prone.
Instead, let users specify embedding models using a model URI:
```
module default {
  type Astronomy {
    content: str;
    deferred index ext::ai::index(embedding_model := 'openai:newmodel')
      on (.content);
  }
};
```
When migrating, if a new AI index is added, check whether the embedding_model value is a URI of the form provider_name:model_name. If it is, look up the model with the following priority (a sketch of this lookup follows the list):
- AI extension and user-defined models
- A reference JSON file fetched from the GitHub repo
  - from the master branch or from a release branch?
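A minimal sketch of that lookup order, assuming the surrounding migration machinery hands us the models already declared in the schema and the parsed reference file; the function and parameter names here are illustrative, not an existing API:

```python
from __future__ import annotations

from typing import Optional


def parse_model_uri(value: str) -> Optional[tuple[str, str]]:
    """Split 'provider_name:model_name' into its two parts, or return None."""
    provider, sep, model = value.partition(":")
    if not (sep and provider and model):
        return None
    return provider, model


def resolve_embedding_model(
    embedding_model: str,
    schema_models: dict[str, dict],  # models already declared in the schema
    reference: dict,                 # parsed reference JSON (format shown below)
) -> dict:
    """Resolve an embedding_model value using the priority order above."""
    parsed = parse_model_uri(embedding_model)
    if parsed is None:
        # Not a URI: keep today's behaviour and treat the value as the
        # name of an existing (user-defined) model type.
        return schema_models[embedding_model]

    provider, model = parsed
    # 1. AI extension and user-defined models already in the schema.
    if model in schema_models:
        return schema_models[model]
    # 2. Entry from the reference JSON file fetched from the GitHub repo.
    entry = reference.get("embedding_models", {}).get(model)
    if entry is not None and entry.get("model_provider") == provider:
        return entry
    # No match anywhere: surface an error to the user.
    raise LookupError(f"unknown embedding model URI: {embedding_model!r}")
```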
If the model comes from the reference JSON file, generate a new type in the resulting DDL. For example:
```
create abstract type Newmodel
  extending ext::ai::EmbeddingModel
{
  alter annotation ext::ai::model_name := "newmodel";
  alter annotation ext::ai::model_provider := "builtin::openai";
  alter annotation ext::ai::embedding_model_max_input_tokens := "8191";
  alter annotation ext::ai::embedding_model_max_batch_tokens := "8191";
  alter annotation ext::ai::embedding_model_max_output_dimensions := "3072";
  alter annotation ext::ai::embedding_model_supports_shortening := "true";
};
```
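One possible shape for that generation step, as a sketch only: render the DDL above from a single reference-file entry (the entry layout matches the JSON format below; the function name and exact plumbing are hypothetical). Annotation values are quoted because annotations are strings in the DDL.

```python
def embedding_model_ddl(type_name: str, entry: dict, provider_ref: str) -> str:
    """Render `create abstract type ...` DDL for one reference-file entry."""
    annotations = {
        "model_name": entry["model_name"],
        "model_provider": provider_ref,  # e.g. "builtin::openai"
        "embedding_model_max_input_tokens": entry["max_input_tokens"],
        "embedding_model_max_batch_tokens": entry["max_batch_tokens"],
        "embedding_model_max_output_dimensions": entry["max_output_dimensions"],
        "embedding_model_supports_shortening": str(entry["supports_shortening"]).lower(),
    }
    lines = [
        f"create abstract type {type_name}",
        "  extending ext::ai::EmbeddingModel",
        "{",
    ]
    for name, value in annotations.items():
        # All annotation values are rendered as strings, hence the quoting.
        lines.append(f'  alter annotation ext::ai::{name} := "{value}";')
    lines.append("};")
    return "\n".join(lines)
```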
If no model is found, or if there was an error fetching the JSON file, raise an error.
The JSON file should be structured to cover the currently available types (ProviderConfig, EmbeddingModel, and TextGenerationModel), but also leave room for any future model types.
```
{
  "providers": {
    "openai": {
      "name": "builtin::openai",
      ...
    }
  },
  "embedding_models": {
    "text-embedding-3-small": {
      "model_name": "text-embedding-3-small",
      "model_provider": "openai",
      "max_input_tokens": 8191,
      "max_batch_tokens": 8191,
      "max_output_dimensions": 1536,
      "supports_shortening": true
    },
    ...
  },
  "text_generation_models": {
    "gpt-3.5-turbo": {
      "model_name": "gpt-3.5-turbo",
      "model_provider": "openai",
      "context_window": 16385
    },
    ...
  }
}
```
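A sketch of loading that file into typed records, assuming the field names mirror the JSON above; the URL is a placeholder, since whether the file is fetched from the master branch, a release branch, or a geldata.com redirect is still open:

```python
import json
import urllib.request
from dataclasses import dataclass

# Placeholder location; the real URL (GitHub raw file or a geldata.com
# redirect) is an open question in this proposal.
REFERENCE_URL = "https://example.com/ai-models.json"


@dataclass
class EmbeddingModel:
    model_name: str
    model_provider: str
    max_input_tokens: int
    max_batch_tokens: int
    max_output_dimensions: int
    supports_shortening: bool


@dataclass
class TextGenerationModel:
    model_name: str
    model_provider: str
    context_window: int


def load_reference(url: str = REFERENCE_URL) -> tuple[dict, dict]:
    """Fetch and parse the reference file; any failure propagates as an error."""
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    embeddings = {
        name: EmbeddingModel(**entry)
        for name, entry in data["embedding_models"].items()
    }
    text_gen = {
        name: TextGenerationModel(**entry)
        for name, entry in data["text_generation_models"].items()
    }
    return embeddings, text_gen
```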
Future work:
- As an additional step after checking the reference JSON, try to run discovery for unknown models by querying the providers directly.
TODO:
- Support URI model name (Implement URI lookup for AI embedding models. #8860)
- Create model type when present in reference file, but not in schema (Implement URI lookup for AI embedding models. #8860)
- Fetch reference file from GitHub if possible
- Add redirect in geldata.com