How about adding an online endpoint in Semantic? #310
Replies: 2 comments 2 replies
Thanks for raising this. A CPU-only VM with local Ollama is the worst-case path for first-time indexing, so that's a fair complaint.

One thing worth knowing first: the Ollama adapter already honors

For an actual non-Ollama remote endpoint: I'm open to it, but only as one OpenAI-compatible adapter, not a per-vendor lineup. Gemini, OpenAI, vLLM, and llama.cpp's server all expose

Two constraints I'd put on it:

Happy to take a PR shaped like that, or I can open a tracking issue if you'd rather discuss specifics first.
Got it, fair enough: if local Ollama itself is the slow part, then a remote Ollama daemon doesn't actually help you. OpenAI-compatible adapter it is. Tracked in #324 with the design I outlined above (one adapter, env-var driven, Ollama stays the default, vector store rejects dim-mismatch on rebuild). I'd take a PR for it if you want to pick it up; happy to review against the acceptance list in the issue. Otherwise it'll land when I get to it.
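For anyone picking this up, here is a rough Python sketch of how the four design points could fit together. It is only an illustration under assumptions: the environment variable names (`EMBED_BASE_URL`, `EMBED_MODEL`, `EMBED_API_KEY`), the default model name, and every class and function below are hypothetical and not taken from the project or from #324. The one external fact used is that Ollama listens on `http://localhost:11434` by default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class EmbeddingConfig:
    backend: str            # "ollama" (default) or "openai-compatible"
    base_url: str
    model: str
    api_key: Optional[str]


def load_embedding_config(env: Mapping[str, str] = os.environ) -> EmbeddingConfig:
    """Choose the embedding backend from env vars (all variable names are hypothetical)."""
    base_url = env.get("EMBED_BASE_URL", "").strip()
    if not base_url:
        # No remote endpoint configured, so Ollama stays the default.
        model = env.get("EMBED_MODEL", "nomic-embed-text")  # assumed default model
        return EmbeddingConfig("ollama", "http://localhost:11434", model, None)
    model = env.get("EMBED_MODEL", "").strip()
    if not model:
        raise ValueError("EMBED_MODEL must be set when EMBED_BASE_URL is set")
    return EmbeddingConfig(
        "openai-compatible", base_url.rstrip("/"), model, env.get("EMBED_API_KEY")
    )


class DimensionMismatchError(ValueError):
    """Raised when a new vector does not match the dimension already in the store."""


def check_dimension(stored_dim: Optional[int], vector: Sequence[float]) -> int:
    """Return the vector's dimension, rejecting it if the store already uses another one."""
    if stored_dim is not None and len(vector) != stored_dim:
        raise DimensionMismatchError(
            f"index was built with dim={stored_dim}, got dim={len(vector)}; "
            "rebuild the index after switching embedding models"
        )
    return len(vector)
```

The point of the dimension check is that switching from a local model to a remote one usually changes the vector size, so mixing old and new vectors in one index should fail loudly instead of silently returning bad matches.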
On some virtual machines, running an embedding model on the CPU alone can be very slow. I think adding a remote embedding model endpoint would be a good idea, and Google happens to offer free embedding models.