Question: Enabling LLM-Based Semantic Search on a Self-Hosted OLS4 Instance
Hello, and thank you so much for your continued support — especially to @haideriqbal and the team who were incredibly helpful in resolving my earlier setup issue (#1192).
I now have a working self-hosted OLS4 instance (https://simpathic.services/ols4) serving a single ontology, and I'm very happy with it overall.
### What I'd like to achieve
The EBI production instance has a wonderful LLM-powered semantic search feature (using `llama-embed-nemotron`). This is the feature I'd most benefit from in my own deployment, as my users need to search a single specialised ontology using natural language rather than exact term matching.
### My constraints
My server is CPU-only, so running `llama-embed-nemotron-8b` (an 8B-parameter model) locally is not practical. I will only ever load a single ontology of ~1,000–2,000 classes, so the embedding index would be quite small.
### My questions
- Is the embedding/semantic search feature available at all for self-hosted instances? I can see there is an `embeddings/` directory and a `k8chart-embed-service/` in the repository, but I couldn't find any documentation on how to wire this up outside of the EBI Kubernetes environment.
- Is there a configuration variable (e.g. in `docker-compose.yml`, `application.properties`, or as an environment variable) that tells the OLS4 backend where to find an embedding service? If so, what is the expected API contract for that service?
- Would a lighter embedding model work as a drop-in replacement? For a CPU-only server with a small ontology, something like `nomic-embed-text` (via Ollama) or a `sentence-transformers` model would be feasible. Would OLS4 accept any OpenAI-compatible `/v1/embeddings` endpoint, or is it tightly coupled to the specific nemotron model/dimensionality?
- If the above is not currently possible, would this be a reasonable feature request, i.e. making the embed service URL configurable for community deployments?
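To make the second and third questions concrete, this is the kind of exchange I have in mind: the standard OpenAI-compatible `/v1/embeddings` request/response shape, followed by the cosine-similarity lookup a search backend would do with the returned vector. The term IDs, labels, and vectors are made up for illustration, and I don't know whether OLS4's embed service actually speaks this contract; that is exactly what I'm asking:

```python
import json
import math

# Request body in the OpenAI-compatible /v1/embeddings shape
# (e.g. as served by Ollama). Model name is an assumption.
request_body = json.dumps({
    "model": "nomic-embed-text",
    "input": ["inherited metabolic disorder"],
})

# Mock response in the shape OpenAI-compatible servers return;
# real vectors would have hundreds of dimensions.
mock_response = {
    "object": "list",
    "data": [{"object": "embedding", "index": 0,
              "embedding": [0.12, -0.56, 0.33]}],
    "model": "nomic-embed-text",
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Tiny in-memory index of pre-computed class embeddings
# (hypothetical IDs and vectors), ranked against the query vector.
query_vec = mock_response["data"][0]["embedding"]
index = {
    "HYPO:0001": [0.11, -0.50, 0.30],   # semantically close to the query
    "HYPO:0002": [-0.40, 0.22, -0.10],  # unrelated term
}
best = max(index, key=lambda iri: cosine(query_vec, index[iri]))
```

If the backend is coupled only to this request/response shape (and a configurable vector dimensionality), swapping in a lighter model should be a matter of configuration rather than code changes.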
I'm very willing to do the legwork on my end; I just need to understand the connection points. Any guidance, even just a pointer to the relevant source files, would be enormously appreciated.
Thank you again for building and maintaining such a wonderful tool!
Mark