Question: Enabling LLM-Based Semantic Search on a Self-Hosted OLS4 Instance #1238

@markwilkinson

Description

Hello, and thank you so much for your continued support — especially to @haideriqbal and the team who were incredibly helpful in resolving my earlier setup issue (#1192).

I now have a working self-hosted OLS4 instance (https://simpathic.services/ols4) serving a single ontology, and I'm very happy with it overall.

What I'd like to achieve

The EBI production instance has a wonderful LLM-powered semantic search feature (using llama-embed-nemotron). This is the feature I'd most benefit from in my own deployment, as my users need to search a single specialised ontology using natural language rather than exact term matching.

My constraints

My server is CPU-only, so running llama-embed-nemotron-8b (an 8B-parameter model) locally is not practical. I will only ever load a single ontology of ~1,000–2,000 classes, so the embedding index would be quite small.

My questions

  1. Is the embedding/semantic search feature available at all for self-hosted instances? I can see there is an embeddings/ directory and a k8chart-embed-service/ in the repository, but I couldn't find any documentation on how to wire this up outside of the EBI Kubernetes environment.

  2. Is there a configuration variable (e.g. in docker-compose.yml, application.properties, or as an environment variable) that tells the OLS4 backend where to find an embedding service? If so, what is the expected API contract for that service?

  3. Would a lighter embedding model work as a drop-in replacement? For a CPU-only server with a small ontology, something like nomic-embed-text (via Ollama) or a sentence-transformers model would be feasible. Would OLS4 accept any OpenAI-compatible /v1/embeddings endpoint, or is it tightly coupled to the specific nemotron model/dimensionality?

  4. If the above is not currently possible, would this be a reasonable feature request — i.e., making the embed service URL configurable for community deployments?
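To make question 3 concrete, this is the kind of client-side call I have in mind against an OpenAI-compatible embeddings endpoint. The URL, port, and model name below are my assumptions for a local Ollama setup, not anything OLS4 currently reads:

```python
import json
import math
import urllib.request

# Hypothetical endpoint: a local Ollama server exposing the
# OpenAI-compatible embeddings API. These values are my assumptions,
# not existing OLS4 configuration.
EMBED_URL = "http://localhost:11434/v1/embeddings"
MODEL = "nomic-embed-text"

def embed(texts):
    """POST a batch of strings to an OpenAI-compatible /v1/embeddings endpoint."""
    payload = json.dumps({"model": MODEL, "input": texts}).encode()
    req = urllib.request.Request(
        EMBED_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style response shape: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in body["data"]]

def cosine(a, b):
    """Cosine similarity, for ranking ontology class labels against a query."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

With only ~1,000–2,000 classes, brute-force cosine ranking over a cached embedding matrix should be more than fast enough on CPU, so no vector database would be needed on my side.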

I'm very willing to do the legwork on my end; I just need to understand the connection points. Any guidance, even just a pointer to the relevant source files, would be enormously appreciated.
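In case it helps frame the feature request in question 4, this is the shape of the knob I'm imagining in docker-compose.yml. Every name here is entirely hypothetical, invented purely to illustrate the idea, not an existing OLS4 option:

```yaml
# Hypothetical sketch only -- these variables do not exist in OLS4 today.
services:
  ols4-backend:
    environment:
      OLS_EMBEDDING_SERVICE_URL: "http://embed-service:8080/v1/embeddings"
      OLS_EMBEDDING_MODEL: "nomic-embed-text"
```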

Thank you again for building and maintaining such a wonderful tool!

Mark
