How about adding an online endpoint in Semantic? #310

kabaka9527 · 2026-05-06T06:22:22Z

On some virtual machines, running an embedding model on the CPU alone can be very slow. I think adding support for a remote embedding model endpoint would be a good idea, and Google happens to offer free embedding models.

esengine · 2026-05-06T08:35:57Z

Maintainer

Thanks for raising this — CPU-only VM with local Ollama is the worst-case path for first-time indexing, fair complaint.

One thing worth knowing first: the Ollama adapter already honors OLLAMA_URL (env var). If you have any non-VM box reachable on the network — a desktop, a friend's machine, a small VPS — run ollama serve there and point Reasonix at the URL. Same model, same vectors, same code path, just off the slow CPU. For most "my VM is too slow" cases that's the cheapest fix and stays free + private.
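As a rough illustration of the OLLAMA_URL route, the sketch below resolves the daemon address from the environment and builds an embedding request for Ollama's `/api/embed` endpoint. The helper names, the localhost fallback, and the model name `nomic-embed-text` are assumptions made for the example and are not taken from Reasonix's code.

```python
import json
import os
import urllib.request

# Assumed fallback: the address a local `ollama serve` normally listens on.
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def ollama_base_url(env=None):
    """Return the Ollama base URL, preferring OLLAMA_URL when it is set."""
    env = os.environ if env is None else env
    return (env.get("OLLAMA_URL") or DEFAULT_OLLAMA_URL).rstrip("/")


def build_embed_request(texts, model="nomic-embed-text", env=None):
    """Build (but do not send) a POST to Ollama's /api/embed endpoint."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        ollama_base_url(env) + "/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, model="nomic-embed-text", env=None):
    """Send the request and return one vector per input text."""
    with urllib.request.urlopen(build_embed_request(texts, model, env)) as resp:
        return json.load(resp)["embeddings"]
```

With `OLLAMA_URL` pointing at a faster machine on the network, the calling code stays the same and only the host doing the embedding work changes.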

For an actual non-Ollama remote endpoint: I'm open to it, but only as one OpenAI-compatible adapter, not a per-vendor lineup. Gemini, OpenAI, vLLM, and llama.cpp's server all expose POST /v1/embeddings with the same shape — one adapter covers them all via REASONIX_EMBED_PROVIDER / REASONIX_EMBED_BASE_URL / REASONIX_EMBED_API_KEY env vars. Ollama stays the default.
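A minimal sketch of what such a single adapter could look like, assuming the three env vars named above. Everything else is hypothetical: the function names, the `REASONIX_EMBED_MODEL` variable, the `"ollama"` default value, and the convention that the base URL already ends in the version prefix (for example `/v1`). The request and response shapes follow the OpenAI embeddings API.

```python
import json
import os
import urllib.request


def embed_config(env=None):
    """Read adapter settings from the environment."""
    env = os.environ if env is None else env
    return {
        # Assumed default value; the thread only says Ollama stays the default.
        "provider": env.get("REASONIX_EMBED_PROVIDER", "ollama"),
        "base_url": env.get("REASONIX_EMBED_BASE_URL", "").rstrip("/"),
        "api_key": env.get("REASONIX_EMBED_API_KEY", ""),
        # Hypothetical variable: the thread does not say how the model is chosen.
        "model": env.get("REASONIX_EMBED_MODEL", ""),
    }


def build_request(cfg, texts):
    """Build (but do not send) a POST to <base_url>/embeddings."""
    headers = {"Content-Type": "application/json"}
    if cfg["api_key"]:
        headers["Authorization"] = "Bearer " + cfg["api_key"]
    body = json.dumps({"model": cfg["model"], "input": texts}).encode("utf-8")
    return urllib.request.Request(
        cfg["base_url"] + "/embeddings", data=body, headers=headers, method="POST"
    )


def parse_response(payload):
    """Return vectors in input order from an OpenAI-style response body."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, env=None):
    """Send the request and return one vector per input text."""
    cfg = embed_config(env)
    with urllib.request.urlopen(build_request(cfg, texts)) as resp:
        return parse_response(json.load(resp))
```

Because the only vendor-specific parts are the base URL, the key, and the model name, switching between hosted and self-hosted servers becomes a configuration change and not a code change.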

Two constraints I'd put on it:

- Docs won't lean on "free tier" framing: free tiers move on someone else's schedule, and the project's positioning shouldn't depend on that.
- The vector store has to refuse mixing embeddings from different models (different dimensions silently break similarity). Worth handling at the adapter boundary, not as a separate concern.
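The second constraint could be sketched roughly as follows: the store remembers which model and dimension its first vector came from and rejects anything that differs. `VectorStore`, `EmbeddingMismatchError`, and the `add` signature are hypothetical names for illustration, not Reasonix's actual store.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class EmbeddingMismatchError(ValueError):
    """Raised when a vector comes from a different model or has a different size."""


@dataclass
class VectorStore:
    model: Optional[str] = None  # model that produced the stored vectors
    dim: Optional[int] = None    # dimension shared by all stored vectors
    vectors: Dict[str, List[float]] = field(default_factory=dict)

    def add(self, key: str, vector: List[float], model: str) -> None:
        """Store a vector, refusing to mix models or dimensions."""
        if self.model is None:
            # First insert fixes the model and dimension for this store.
            self.model, self.dim = model, len(vector)
        elif model != self.model or len(vector) != self.dim:
            raise EmbeddingMismatchError(
                f"store holds {self.model} ({self.dim} dims), "
                f"got {model} ({len(vector)} dims); rebuild the index first"
            )
        self.vectors[key] = list(vector)
```

Checking the model name as well as the dimension matters because two different models can emit vectors of the same length that are still not comparable.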

Happy to take a PR shaped like that, or I can open a tracking issue if you'd rather discuss specifics first.

1 reply

kabaka9527 May 6, 2026
Author

A universal OpenAI-compatible adapter would work for me. On my device, at least, Ollama itself doesn't perform very well.

esengine · 2026-05-06T10:23:08Z

Maintainer

Got it — fair enough, if local Ollama itself is the slow part then a remote Ollama daemon doesn't actually help you. OpenAI-compatible adapter it is.

Tracked in #324 with the design I outlined above (one adapter, env-var driven, Ollama stays the default, vector store rejects dim-mismatch on rebuild). I'd take a PR for it if you want to pick it up — happy to review against the acceptance list in the issue. Otherwise it'll land when I get to it.

1 reply

kabaka9527 May 6, 2026
Author

I'll pick it up once I've finished what I'm working on.
