
Implement model selection via a URI string for embedding models #8740

@dnwpark

Description


Followup to #8515

Currently, to use an embedding model, the user must create a new type in the schema with the correct annotations before being able to use it. This is cumbersome and potentially error-prone.

Instead, let users specify embedding models using a model URI:

module default {
  type Astronomy {
    content: str;
    deferred index ext::ai::index(embedding_model := 'openai:newmodel')
      on (.content);
  }
};

When migrating, if a new AI index is added, check whether the embedding_model is a URI of the form provider_name:model_name. If it is, look up the model with the following priority:

  1. AI extension and user defined models
  2. Fetch a reference json file from the github repo
  • From master branch or from release branch?
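The URI parsing and two-step lookup described above can be sketched roughly as follows. This is an illustrative sketch only; the function names, the registry shape, and the error types are assumptions, not the actual implementation.

```python
def parse_model_uri(uri: str):
    """Split 'provider_name:model_name' into its parts, or return None
    if the string is not in URI form (e.g. a plain type name)."""
    provider, sep, model = uri.partition(":")
    if not sep or not provider or not model:
        return None
    return provider, model


def resolve_model(uri: str, schema_models: dict, fetch_reference):
    """Resolve an embedding_model URI using the priority described above.

    schema_models: models already defined by the extension or the user,
    keyed by (provider, model). fetch_reference: callable returning the
    parsed reference json. Both are hypothetical interfaces.
    """
    parsed = parse_model_uri(uri)
    if parsed is None:
        return None  # not a URI; fall back to existing type-name handling
    provider, model = parsed

    # 1. AI extension and user-defined models take priority.
    if (provider, model) in schema_models:
        return schema_models[(provider, model)]

    # 2. Otherwise consult the reference json fetched from the github repo;
    #    a fetch failure is an error, per the behavior described below.
    try:
        reference = fetch_reference()
    except OSError as e:
        raise RuntimeError(f"could not fetch reference model file: {e}")

    entry = reference.get("embedding_models", {}).get(model)
    if entry is None or entry.get("model_provider") != provider:
        raise LookupError(f"unknown embedding model: {uri!r}")
    return entry
```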

If the model comes from the reference json file, generate a new type in the resulting DDL. For example:

create abstract type Newmodel
    extending ext::ai::EmbeddingModel
{
    alter annotation ext::ai::model_name := "newmodel";
    alter annotation ext::ai::model_provider := "builtin::openai";
    alter annotation ext::ai::embedding_model_max_input_tokens := "8191";
    alter annotation ext::ai::embedding_model_max_batch_tokens := "8191";
    alter annotation ext::ai::embedding_model_max_output_dimensions := "3072";
    alter annotation ext::ai::embedding_model_supports_shortening := "true";
};

If no model is found, or if there was an error fetching the json file, raise an error.
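Rendering the DDL above from a reference-json entry is mostly string templating. A minimal sketch, assuming the json key names from the example below and a naming rule that simply capitalizes the model name (the helper name and formatting details are illustrative):

```python
def embedding_model_ddl(entry: dict) -> str:
    """Render the 'create abstract type' DDL for one embedding-model
    entry from the reference json."""
    model = entry["model_name"]
    # e.g. "newmodel" -> "Newmodel"; assumed naming rule, not confirmed.
    type_name = model.replace("-", "_").capitalize()
    lines = [
        f"create abstract type {type_name}",
        "    extending ext::ai::EmbeddingModel",
        "{",
        f'    alter annotation ext::ai::model_name := "{model}";',
        f'    alter annotation ext::ai::model_provider := '
        f'"builtin::{entry["model_provider"]}";',
        f'    alter annotation ext::ai::embedding_model_max_input_tokens := '
        f'"{entry["max_input_tokens"]}";',
        f'    alter annotation ext::ai::embedding_model_max_batch_tokens := '
        f'"{entry["max_batch_tokens"]}";',
        f'    alter annotation ext::ai::embedding_model_max_output_dimensions := '
        f'"{entry["max_output_dimensions"]}";',
        # Annotation values are strings, so booleans become "true"/"false".
        f'    alter annotation ext::ai::embedding_model_supports_shortening := '
        f'"{str(entry["supports_shortening"]).lower()}";',
        "};",
    ]
    return "\n".join(lines)
```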

The json file should be structured to cover the currently available types (ProviderConfig, EmbeddingModel, and TextGenerationModel), but also to accommodate any future model kinds.

{
  "providers": {
    "openai": {
      "name": "builtin::openai",
      ...
    }
  },
  "embedding_models": {
    "text-embedding-3-small": {
      "model_name": "text-embedding-3-small",
      "model_provider": "openai",
      "max_input_tokens": 8191,
      "max_batch_tokens": 8191,
      "max_output_dimensions": 1536,
      "supports_shortening": true
    },
    ...
  },
  "text_generation_models": {
    "gpt-3.5-turbo": {
      "model_name": "gpt-3.5-turbo",
      "model_provider": "openai",
      "context_window": 16385
    },
    ...
  }
}
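Since the server will be consuming this file over the network, a shape check before use would catch a malformed or partially updated reference file. A hypothetical validator for the embedding-model section, using the key names from the example above:

```python
# Required keys per embedding-model entry, per the example json above.
REQUIRED_EMBEDDING_KEYS = {
    "model_name", "model_provider", "max_input_tokens",
    "max_batch_tokens", "max_output_dimensions", "supports_shortening",
}


def validate_reference(doc: dict) -> list:
    """Return a list of problems found in the reference json;
    an empty list means the file looks well-formed."""
    problems = []
    providers = doc.get("providers", {})
    for name, entry in doc.get("embedding_models", {}).items():
        missing = REQUIRED_EMBEDDING_KEYS - entry.keys()
        if missing:
            problems.append(f"{name}: missing keys {sorted(missing)}")
        # Each model must point at a provider defined in the same file.
        elif entry["model_provider"] not in providers:
            problems.append(
                f"{name}: unknown provider {entry['model_provider']!r}")
    return problems
```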

Future work:

  • As an additional step after checking the reference json, try to run discovery for unknown models by querying the providers directly.

TODO:
