feat(config): add support for remote embedding services via config.toml #4284
Description
This pull request introduces support for remote embedding models in `config.toml`, enabling users to delegate embedding generation to external servers. This is particularly valuable for environments without a GPU or without the ability to run `llama-server` locally.

What's Changed
- Added support for an `[embedding]` section with `type = "remote"` and `endpoint` fields in `config.toml` (a sketch follows this list).
- Updated `embedding::create()` and `model::load_embedding()` to support `ModelConfig::Http` (remote models).
- Prevents Tabby from launching the local `llama-server` process when using a remote embedding service.
- Keeps compatibility with local embeddings (no breaking changes).
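For reference, a minimal sketch of the new section in `config.toml` (the endpoint URL is a placeholder; point it at wherever your embedding service listens):

```toml
# Hypothetical example: delegate embedding generation to an external,
# OpenAI-compatible server instead of a locally spawned llama-server.
[embedding]
type = "remote"
endpoint = "http://localhost:8080"
```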
Example Usage
Run an embedding service like:
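The exact command depends on your setup; as one illustration, llama.cpp's `llama-server` can serve embeddings from another machine or container (the model file, port, and embedding flag spelling are placeholders and vary by version):

```bash
# Hypothetical example: an OpenAI-compatible embedding server on port 8080.
# Any server exposing /v1/embeddings works; this uses llama.cpp's llama-server
# (some versions spell the flag --embeddings).
llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --embedding --port 8080
```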
Then launch Tabby:
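With the remote endpoint configured, Tabby starts normally and should no longer spawn its own embedding process:

```bash
# Tabby picks up the [embedding] section from config.toml on startup.
tabby serve
```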
Motivation
Currently, Tabby always attempts to launch its internal `llama-server` binary, which fails on machines without a compatible GPU or the required CUDA libraries. This PR introduces flexibility and portability, enabling Tabby to run in lightweight environments with minimal dependencies.

How to Test
1. Start Tabby with a valid `config.toml` that includes a remote embedding config.
2. Verify that Tabby does not launch a local `llama-server` process.
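To check the remote endpoint independently of Tabby, an OpenAI-style request can be sent directly (the URL, model name, and payload here are illustrative):

```bash
# Hypothetical smoke test: the server should answer with a JSON embedding vector.
curl -s http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "hello world", "model": "nomic-embed-text"}'
```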
Known Limitations

- This does not disable internal embeddings when `[embedding]` is omitted (default behavior).
- The remote server must follow the expected API (e.g., `/v1/embeddings` in OpenAI-compatible format); a response sketch follows below.
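For concreteness, an OpenAI-compatible `/v1/embeddings` response has roughly this shape (all values are illustrative):

```json
{
  "object": "list",
  "data": [
    { "object": "embedding", "index": 0, "embedding": [0.0123, -0.0456, 0.0789] }
  ],
  "model": "nomic-embed-text",
  "usage": { "prompt_tokens": 2, "total_tokens": 2 }
}
```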
Request for Review
Would love feedback on:
- Integration approach
- Potential edge cases to test
- Any docs you'd like me to include
Let me know if you'd like me to add a sample embedding server (Python FastAPI) or documentation PR as a follow-up!