feat: custom OpenAI-compatible embedding providers #229
Open
long2ice wants to merge 2 commits into
Extend the custom-provider mechanism (tashfeenahmed#117/tashfeenahmed#212) to embeddings so any OpenAI-compatible /embeddings endpoint (self-hosted vLLM/Ollama/LM Studio or a remote gateway) can be added and routed.
Server:
- embedding_models gains a key_id column binding a custom row to its endpoint's api_keys row (migrateEmbeddingsV2Custom); built-ins stay NULL
- embeddings service resolves a custom row's key + base_url and calls the endpoint via the OpenAI-style adapter; adds probeEmbeddingDimensions
- POST /api/embeddings/custom: probes the endpoint to auto-detect the vector dimension, then upserts; joining an existing family requires a matching dimension and lands at the back of the chain
- DELETE /api/embeddings/custom/:id and DELETE /api/models/custom/:id remove a single custom model while keeping the shared endpoint key
- adding a model to an existing endpoint updates key/label independently and only when supplied, so a blank key no longer clobbers the shared one
Client:
- Keys page: one merged 'Add a custom model' form with a Chat/Embedding toggle; embedding mode auto-detects dimension and suggests a family by model id (overridable)
- Embeddings and Models pages: per-row delete for custom providers
Tests: custom embedding routing/failover, dimension probe + family join, blank-key reuse, and single custom-model deletion.
What
Extends the custom-provider mechanism (#117/#212) to embeddings, so any OpenAI-compatible /embeddings endpoint (self-hosted vLLM / Ollama / LM Studio, or a remote gateway) can be added and routed. It also unifies how custom chat and embedding models are managed.
Why
Custom providers previously worked only for chat models (models table + chat fallback chain). Embeddings had no custom support at all: callProvider's switch(platform) covered only the 7 built-in platforms. This adds parity.
Changes
Server
- embedding_models gains a key_id column binding a custom row to its endpoint's api_keys row (migrateEmbeddingsV2Custom); built-ins stay NULL and resolve by platform as before.
- The embeddings service resolves a custom row's key and base_url and calls the endpoint via the existing OpenAI-style adapter; adds probeEmbeddingDimensions.
- POST /api/embeddings/custom: probes the endpoint to auto-detect the vector dimension, then upserts. Joining an existing family requires a matching dimension (a family is one vector space) and lands at the back of that family's chain.
- DELETE /api/embeddings/custom/:id and DELETE /api/models/custom/:id: remove a single custom model while keeping the shared endpoint key.
- Adding a model to an existing endpoint updates key and label independently, and only when supplied, so a blank key no longer clobbers the shared one.
Client
- Keys page: one merged 'Add a custom model' form with a Chat/Embedding toggle; embedding mode auto-detects the dimension and suggests a family by model id (overridable).
- Embeddings and Models pages: per-row delete for custom providers.
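The dimension probe listed under Server could look roughly like the sketch below. It is not the PR's probeEmbeddingDimensions: the function names, the probe text, and the injected `send` helper are all hypothetical, and the only thing taken as given is the OpenAI-style response layout (`data[0].embedding`).

```typescript
// Assumed response layout of an OpenAI-compatible /embeddings endpoint.
interface EmbeddingsResponse {
  data: { embedding: number[] }[];
}

interface ProbeInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical transport: sends the request and resolves to the parsed JSON body.
type Send = (url: string, init: ProbeInit) => Promise<unknown>;

// Builds a one-item embeddings request for the probe.
function buildProbeRequest(baseUrl: string, model: string, apiKey?: string): { url: string; init: ProbeInit } {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Self-hosted servers often need no key, so the header is set only when one is supplied.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/embeddings`, // tolerate a trailing slash
    init: { method: "POST", headers, body: JSON.stringify({ model, input: "dimension probe" }) },
  };
}

// Reads the vector length from a parsed response and rejects malformed payloads.
function dimensionFromResponse(json: unknown): number {
  const data = (json as Partial<EmbeddingsResponse> | null)?.data;
  const vector = Array.isArray(data) ? data[0]?.embedding : undefined;
  if (!Array.isArray(vector) || vector.length === 0) {
    throw new Error("endpoint did not return an embedding vector");
  }
  return vector.length;
}

// Sends the probe through the injected transport and returns the detected dimension.
async function probeDimension(baseUrl: string, model: string, send: Send, apiKey?: string): Promise<number> {
  const { url, init } = buildProbeRequest(baseUrl, model, apiKey);
  return dimensionFromResponse(await send(url, init));
}
```

Injecting `send` keeps the parsing logic testable without a live endpoint; in a server it would typically wrap `fetch` and throw on a non-2xx status before parsing.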
Routing semantics
Embeddings still only fail over within a family (same model, another provider) — vectors from different models are incompatible. A custom endpoint either becomes its own family (default) or joins an existing one for cross-provider redundancy when dimensions match.
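As a minimal sketch of these rules, assuming a simplified row shape (the real embedding_models schema is not shown here, and `EmbeddingRow`, `planJoin`, and `failoverChain` are hypothetical names), the dimension guard and the within-family chain might be expressed as:

```typescript
// Hypothetical row shape for illustration only.
interface EmbeddingRow {
  id: string;
  family: string;
  dimension: number;
  priority: number; // lower values are tried first within a family
}

// Returns the row to store when a probed custom model creates or joins a family.
function planJoin(rows: EmbeddingRow[], id: string, family: string, probedDim: number): EmbeddingRow {
  const members = rows.filter((r) => r.family === family);
  if (members.length === 0) {
    // No such family yet: the custom endpoint becomes its own family.
    return { id, family, dimension: probedDim, priority: 0 };
  }
  // A family is one vector space, so the probed dimension must match.
  const expected = members[0].dimension;
  if (expected !== probedDim) {
    throw new Error(`family ${family} uses ${expected} dimensions, endpoint returned ${probedDim}`);
  }
  // Joining rows land at the back of that family's chain.
  const last = Math.max(...members.map((r) => r.priority));
  return { id, family, dimension: probedDim, priority: last + 1 };
}

// Failover candidates: rows of the same family only, in priority order.
function failoverChain(rows: EmbeddingRow[], family: string): string[] {
  return rows
    .filter((r) => r.family === family)
    .sort((a, b) => a.priority - b.priority)
    .map((r) => r.id);
}
```

Rows from other families never appear in the chain, which is what keeps a failed request from being answered with vectors from an incompatible model.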
Tests
345 passing. New coverage: custom embedding routing + within-family failover, dimension probe + family-join dimension guard, blank-key reuse of an existing endpoint key, and single custom-model deletion (keeps the shared key; 404 on built-ins).