Setup and configure OpenSearch
Most information for the Unipept API is loaded into memory by the executable itself. However, the protein metadata is too large to fit comfortably into memory and is therefore stored in an OpenSearch instance that is automatically accessed by the Unipept API. This guide explains how OpenSearch can be installed and configured.
First, install OpenSearch from its apt repository on the server. This guide is based on the official installation guide, slightly tweaked to allow local access without a password (this is safe since the server cannot be accessed from the outside) and to move all data files to a different drive.
Install dependencies
sudo apt-get update && sudo apt-get -y install lsb-release ca-certificates curl gnupg2
Import the public GPG key
curl -o- https://artifacts.opensearch.org/publickeys/opensearch-release.pgp | sudo gpg --dearmor --batch --yes -o /usr/share/keyrings/opensearch-release-keyring
Create APT repository
echo "deb [signed-by=/usr/share/keyrings/opensearch-release-keyring] https://artifacts.opensearch.org/releases/bundle/opensearch/3.x/apt stable main" | sudo tee /etc/apt/sources.list.d/opensearch-3.x.list
Install OpenSearch
You will probably see an error that the root password has not been set, and that OpenSearch cannot be started because of that. This error can be ignored and will be fixed later.
sudo apt-get update
sudo apt-get install -y opensearch
Remove the security plugin (which removes the requirement for an admin password)
sudo rm -rf /usr/share/opensearch/plugins/opensearch-security
Start OpenSearch
sudo systemctl start opensearch
Make sure that OpenSearch automatically starts with the server
sudo systemctl enable opensearch
Test OpenSearch to see if it's running correctly
curl -X GET http://localhost:9200
This should return something like this if OpenSearch is running correctly:
{
  "name" : "snowball",
  "cluster_name" : "opensearch",
  "cluster_uuid" : "7T2vRFIQSX-ebxXYq76Rhw",
  "version" : {
    "distribution" : "opensearch",
    "number" : "3.1.0",
    "build_type" : "deb",
    "build_hash" : "8ff7c6ee924a49f0f59f80a6e1c73073c8904214",
    "build_date" : "2025-06-21T08:05:38.757936053Z",
    "build_snapshot" : false,
    "lucene_version" : "10.2.1",
    "minimum_wire_compatibility_version" : "2.19.0",
    "minimum_index_compatibility_version" : "2.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}
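For scripted checks, the version number can be pulled out of this response. The following is a minimal sketch and not part of the official guide: the helper name `opensearch_version` is hypothetical, and it assumes `grep -o` and `sed` are available instead of a JSON parser such as `jq`.

```shell
#!/bin/sh
# Hypothetical helper: reads the JSON response of GET / on stdin and prints
# the value of the first "number" field (the OpenSearch version).
opensearch_version() {
    grep -o '"number"[[:space:]]*:[[:space:]]*"[^"]*"' \
        | head -n 1 \
        | sed 's/.*"\([^"]*\)"$/\1/'
}

# Example (needs a running instance, so it is left commented out):
# curl -s http://localhost:9200 | opensearch_version
```

An empty result suggests that the response was not the expected JSON, for example because OpenSearch has not finished starting yet.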
In the case of Unipept, it is typically required to change the default data location of OpenSearch (since our servers have a small boot drive, but a large data drive). This section of the guide explains how OpenSearch can be configured to use another drive for its data storage.
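Because the steps below delete the old data directory after copying it, a cautious variant could first confirm that both directory trees are identical. This is a minimal sketch under that assumption: the helper name `dirs_match` is hypothetical, and `diff` compares file names and contents only, not ownership or permissions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if both directory trees contain the
# same files with the same contents.
dirs_match() {
    diff -rq "$1" "$2" > /dev/null
}

# Possible use between the rsync step and the removal step, as root and
# with OpenSearch stopped (left commented out here):
# dirs_match /var/lib/opensearch /mnt/ssd/opensearch-data && echo "copy verified"
```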
Temporarily stop OpenSearch
sudo systemctl stop opensearch
Make a new data directory on another drive (/mnt/ssd in this case)
sudo mkdir -p /mnt/ssd/opensearch-data
Set the owner of the new data dir
sudo chown opensearch:opensearch /mnt/ssd/opensearch-data
Move data from the old location to the new one
sudo rsync -av /var/lib/opensearch/ /mnt/ssd/opensearch-data/
The old data can be removed now
sudo rm -rf /var/lib/opensearch
Set file permissions for the new data dir
sudo chmod 750 /mnt/ssd/opensearch-data
Configure OpenSearch for the new data dir
Open the configuration file.
sudo nano /etc/opensearch/opensearch.yml
Update this file by changing the following value:
path.data: /mnt/ssd/opensearch-data
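The same change can be made without an interactive editor. This is a minimal sketch, assuming the file contains at most one uncommented `path.data:` line and that the directory name contains no `|` or `&` characters; `set_path_data` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: set path.data in an opensearch.yml-style file.
# Replaces an existing uncommented path.data line, or appends one if the
# file has none. Commented-out lines are left untouched.
set_path_data() {
    config_file=$1
    data_dir=$2
    if grep -q '^[[:space:]]*path\.data:' "$config_file"; then
        tmp=$(mktemp)
        # Write through a temp file and copy back with cat, so the owner and
        # mode of the original config file are preserved.
        sed "s|^[[:space:]]*path\.data:.*|path.data: ${data_dir}|" "$config_file" > "$tmp" \
            && cat "$tmp" > "$config_file"
        status=$?
        rm -f "$tmp"
        return "$status"
    else
        printf 'path.data: %s\n' "$data_dir" >> "$config_file"
    fi
}

# Possible use from a root shell (left commented out here):
# set_path_data /etc/opensearch/opensearch.yml /mnt/ssd/opensearch-data
```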
Start OpenSearch again
sudo systemctl start opensearch
By default, OpenSearch is configured to use at most 1 GiB of JVM heap memory. For our use case, this is far too low. This section explains how the memory limit of OpenSearch can be increased.
Open the jvm.options configuration file
sudo nano /etc/opensearch/jvm.options
Update the memory config values
Change the values -Xms1g and -Xmx1g to something higher (40 GiB in this case).
-Xms40g
-Xmx40g
Restart OpenSearch
sudo systemctl restart opensearch
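The heap edit above can also be scripted. This is a minimal sketch, assuming the file contains uncommented `-Xms` and `-Xmx` lines as in the default `jvm.options`; `set_heap_size` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: set both the initial (-Xms) and maximum (-Xmx) heap
# size in a jvm.options-style file. Only uncommented lines that consist of
# exactly one of these flags are rewritten.
set_heap_size() {
    jvm_file=$1
    heap=$2
    tmp=$(mktemp)
    # Write through a temp file and copy back with cat, so the owner and
    # mode of the original file are preserved.
    sed -e "s/^-Xms[0-9]*[kKmMgG]*\$/-Xms${heap}/" \
        -e "s/^-Xmx[0-9]*[kKmMgG]*\$/-Xmx${heap}/" \
        "$jvm_file" > "$tmp" && cat "$tmp" > "$jvm_file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Possible use from a root shell, followed by a restart (left commented out):
# set_heap_size /etc/opensearch/jvm.options 40g
# systemctl restart opensearch
```

Both flags are set to the same value, matching the manual edit described in this section.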