Question: Enabling LLM-Based Semantic Search on a Self-Hosted OLS4 Instance #1238

@markwilkinson

Description

Hello, and thank you so much for your continued support — especially to @haideriqbal and the team who were incredibly helpful in resolving my earlier setup issue (#1192).

I now have a working self-hosted OLS4 instance (https://simpathic.services/ols4) serving a single ontology, and I'm very happy with it overall.

What I'd like to achieve

The EBI production instance has a wonderful LLM-powered semantic search feature (using llama-embed-nemotron). This is the feature I'd most benefit from in my own deployment, as my users need to search a single specialised ontology using natural language rather than exact term matching.

My constraints

My server is CPU-only, so running llama-embed-nemotron-8b (an 8B-parameter model) locally is not practical. I will only ever load a single ontology of ~1,000–2,000 classes, so the embedding index would be quite small.

My questions

  1. Is the embedding/semantic search feature available at all for self-hosted instances? I can see there is an embeddings/ directory and a k8chart-embed-service/ in the repository, but I couldn't find any documentation on how to wire this up outside of the EBI Kubernetes environment.

  2. Is there a configuration variable (e.g. in docker-compose.yml, application.properties, or as an environment variable) that tells the OLS4 backend where to find an embedding service? If so, what is the expected API contract for that service?

  3. Would a lighter embedding model work as a drop-in replacement? For a CPU-only server with a small ontology, something like nomic-embed-text (via Ollama) or a sentence-transformers model would be feasible. Would OLS4 accept any OpenAI-compatible /v1/embeddings endpoint, or is it tightly coupled to the specific nemotron model/dimensionality?

  4. If the above is not currently possible, would this be a reasonable feature request — i.e., making the embed service URL configurable for community deployments?
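To make question 3 concrete, this is the kind of client-side call I have in mind against an OpenAI-compatible embeddings endpoint. The URL, port, and model name below are my assumptions for a local Ollama setup, not anything OLS4 currently reads:

```python
import json
import math
import urllib.request

# Hypothetical endpoint: a local Ollama server exposing the
# OpenAI-compatible embeddings API. These values are my assumptions,
# not existing OLS4 configuration.
EMBED_URL = "http://localhost:11434/v1/embeddings"
MODEL = "nomic-embed-text"

def embed(texts):
    """POST a batch of strings to an OpenAI-compatible /v1/embeddings endpoint."""
    payload = json.dumps({"model": MODEL, "input": texts}).encode()
    req = urllib.request.Request(
        EMBED_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style response shape: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in body["data"]]

def cosine(a, b):
    """Cosine similarity, for ranking ontology class labels against a query."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

With only ~1,000–2,000 classes, brute-force cosine ranking over a cached embedding matrix should be more than fast enough on CPU, so no vector database would be needed on my side.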

I'm very willing to do the legwork on my end; I just need to understand the connection points. Any guidance, even just a pointer to the relevant source files, would be enormously appreciated.
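In case it helps frame the feature request in question 4, this is the shape of the knob I'm imagining in docker-compose.yml. Every name here is entirely hypothetical, invented purely to illustrate the idea, not an existing OLS4 option:

```yaml
# Hypothetical sketch only -- these variables do not exist in OLS4 today.
services:
  ols4-backend:
    environment:
      OLS_EMBEDDING_SERVICE_URL: "http://embed-service:8080/v1/embeddings"
      OLS_EMBEDDING_MODEL: "nomic-embed-text"
```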

Thank you again for building and maintaining such a wonderful tool!

Mark
