This guide explains how to set up external databases for CLP instead of using the Docker Compose managed databases. If the host(s) on which you're running CLP are ephemeral, you should use external databases for metadata storage, and object storage for CLP's archives and streams; this will ensure data is persisted even if a host is replaced.
:::{warning}
The CLP Docker Compose project includes MariaDB/MongoDB databases by default. This guide is only for users who want to customize their deployment by using their own database servers or cloud-managed databases (e.g., AWS RDS, Azure Database).
:::
CLP requires two databases:
- MariaDB/MySQL - for storing metadata about archives, files, and jobs.
- MongoDB - for caching query results.
CLP is compatible with any MariaDB or MySQL database. The instructions below use Ubuntu as an example, but you can use any compatible database installation or cloud-managed service.
1. Install MariaDB server:

   ```bash
   sudo apt update
   sudo apt install mariadb-server
   ```

2. Connect to MariaDB as root:

   ```bash
   sudo mysql
   ```

3. Create the CLP database:

   ```sql
   CREATE DATABASE `clp-db`;
   ```

4. Create a user for CLP (replace `<password>` with a secure password):

   ```sql
   CREATE USER 'clp-user'@'%' IDENTIFIED BY '<password>';
   ```

   :::{note}
   The `'%'` allows connections from any host. For better security, replace `'%'` with the specific hostname or IP address from which CLP will connect (e.g., `'clp-user'@'192.168.1.10'`).
   :::

5. Grant privileges to the user:

   ```sql
   GRANT ALL PRIVILEGES ON `clp-db`.* TO 'clp-user'@'%';
   FLUSH PRIVILEGES;
   ```

6. Exit the MariaDB shell:

   ```sql
   EXIT;
   ```
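The database, user, and grant statements above can also be generated from variables, which makes it easier to restrict the user to a single client host as the note suggests. The following is a minimal sketch and not part of CLP: the function name `clp_db_setup_sql` is hypothetical, and the `IF NOT EXISTS` clauses are an addition so that the statements can be re-run safely.

```bash
# Hypothetical helper: print the setup SQL for a database, user, client host, and password.
clp_db_setup_sql() {
    db="$1"; user="$2"; host="$3"; password="$4"

    # Refuse characters that would need escaping inside the quoted SQL literals.
    case "$db$user$host$password" in
        *\'*|*\`*|*\\*)
            echo "unsupported character in argument" >&2
            return 1
            ;;
    esac

    # IF NOT EXISTS is an addition to the statements in this guide.
    cat <<EOF
CREATE DATABASE IF NOT EXISTS \`$db\`;
CREATE USER IF NOT EXISTS '$user'@'$host' IDENTIFIED BY '$password';
GRANT ALL PRIVILEGES ON \`$db\`.* TO '$user'@'$host';
FLUSH PRIVILEGES;
EOF
}
```

For example, `clp_db_setup_sql clp-db clp-user 192.168.1.10 '<password>' | sudo mysql` would apply the statements on the database host, limiting `clp-user` to connections from `192.168.1.10`.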
If CLP components will connect from a different host, you need to configure MariaDB to accept remote connections:
1. Edit the MariaDB configuration file:

   ```bash
   sudo nano /etc/mysql/mariadb.conf.d/50-server.cnf
   ```

2. Find the `bind-address` line and change it to allow connections from all interfaces:

   ```ini
   bind-address = 0.0.0.0
   ```

3. Restart MariaDB:

   ```bash
   sudo systemctl restart mariadb
   ```
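After editing the file, it can help to confirm which `bind-address` value is actually set, since a commented-out line is easy to mistake for the active one. The sketch below assumes a hypothetical helper name, `mariadb_bind_address`, and only inspects the one file it is given; other option files on the system may still override the value.

```bash
# Hypothetical helper: print the last uncommented bind-address value in a config file.
mariadb_bind_address() {
    awk -F= '
        # Lines starting with "#" do not match, so commented-out settings are ignored.
        /^[ \t]*bind-address[ \t]*=/ { gsub(/[ \t]/, "", $2); value = $2 }
        END { if (value != "") print value }
    ' "$1"
}
```

Running `mariadb_bind_address /etc/mysql/mariadb.conf.d/50-server.cnf` should print `0.0.0.0` once the change above is in place.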
You can verify the MariaDB connection by running:

```bash
mysql -h <mariadb-hostname-or-ip> -u clp-user -p clp-db
```

When using AWS RDS:
1. Create a MariaDB or MySQL RDS instance in the AWS Console.

2. Note the endpoint hostname and port (the default is `3306`).

3. Create the database and user using a MySQL client:

   ```bash
   mysql -h <rds-endpoint> -u admin -p
   ```

   Then follow steps 3-5 (create the database, create the user, and grant privileges) from Installing MariaDB on Ubuntu.

4. Ensure the RDS security group allows inbound connections on port 3306 from your CLP hosts.
CLP is compatible with any MongoDB database. For installation instructions, see the MongoDB installation documentation.
:::{warning}
Running an external MongoDB on the same host as CLP (i.e., using `localhost` or `127.0.0.1` as
the `results_cache` host) is not supported. CLP's `results-cache-indices-creator` initializes a
MongoDB replica set using the configured hostname, which MongoDB must be able to resolve to itself;
`localhost` from inside a Docker container does not resolve to the host machine.

Instead, do one of the following:

- Keep `results_cache` in the `bundled` list (recommended for single-host deployments).
- Use a truly remote MongoDB instance and specify its hostname or IP.
- If you must use a same-host MongoDB, set `results_cache.host` in `clp-config.yaml` to the
  host's non-loopback IP address (e.g., `192.168.1.10`) and ensure MongoDB is bound to that address.
:::
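The restriction in the warning above can be checked before starting CLP. The following is a small sketch, with a hypothetical function name `is_loopback_host`, that flags the loopback values the warning describes; it only matches the literal forms `localhost`, `127.x.x.x`, and `::1`, and does not resolve other hostnames.

```bash
# Hypothetical helper: return 0 if the host is a loopback name or address, 1 otherwise.
is_loopback_host() {
    case "$1" in
        localhost|127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: the value intended for results_cache.host (illustrative address).
host="192.168.1.10"
if is_loopback_host "$host"; then
    echo "results_cache host '$host' is a loopback address; use a non-loopback IP" >&2
fi
```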
MongoDB automatically creates databases and collections when first accessed, so no manual database
creation is needed. CLP will create the necessary database and collections (`clp-query-results` by
default) when it first connects.
If CLP components will connect from a different host:
1. Edit the MongoDB configuration file:

   ```bash
   sudo nano /etc/mongod.conf
   ```

2. Find the `net.bindIp` setting and change it to allow connections from all interfaces:

   ```yaml
   net:
     port: 27017
     bindIp: 0.0.0.0
   ```

3. Restart MongoDB:

   ```bash
   sudo systemctl restart mongod
   ```
:::{warning}
For production deployments, it's highly recommended to enable authentication and SSL/TLS for MongoDB. See the MongoDB security documentation for details.
:::
You can verify the MongoDB connection by running:

```bash
mongosh "mongodb://<mongodb-hostname-or-ip>:27017/clp-query-results"
```

When using AWS DocumentDB or MongoDB Atlas:
- Create a cluster in the AWS Console or MongoDB Atlas.
- Note the connection string/endpoint provided.
- Ensure the security group or IP access list allows connections from your CLP hosts.
- Use the provided connection string when configuring CLP (see below).
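Because the CLP configuration takes a host and a port rather than a full connection string, it may be convenient to split a provider's URI into those two values. The sketch below is an illustration only: `mongo_uri_host_port` is a hypothetical name, and it handles just plain single-host `mongodb://` URIs. SRV-style (`mongodb+srv://`) strings, multi-host strings, and bracketed IPv6 addresses are not handled.

```bash
# Hypothetical helper: print "host port" for a plain mongodb://host[:port][/db] URI.
mongo_uri_host_port() {
    rest="${1#mongodb://}"
    if [ "$rest" = "$1" ]; then
        echo "expected a mongodb:// URI" >&2
        return 1
    fi
    rest="${rest%%[/?]*}"   # drop the database name and any options
    rest="${rest##*@}"      # drop any user:password@ prefix
    case "$rest" in
        *,*) echo "multiple hosts are not handled" >&2; return 1 ;;
        *:*) echo "${rest%%:*} ${rest##*:}" ;;
        *)   echo "$rest 27017" ;;   # 27017 is MongoDB's default port
    esac
}
```

For instance, `mongo_uri_host_port "mongodb://<mongodb-hostname-or-ip>:27017/clp-query-results"` prints the host and port separated by a space.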
After setting up your external databases, configure CLP to use them:
1. Edit `etc/clp-config.yaml` to specify which services are bundled (managed by the `clp-package` Docker Compose project):

   ```yaml
   # Remove "database" and "results_cache" from this list to use external instances
   bundled:
     # - "database"
     - "queue"
     - "redis"
     # - "results_cache"
   ```

2. Configure the connection details for your external databases in `etc/clp-config.yaml`:

   ```yaml
   database:
     host: "<mariadb-hostname-or-ip>"
     port: <mariadb-port>
   results_cache:
     host: "<mongodb-hostname-or-ip>"
     port: <mongodb-port>
   ```

3. Set the credentials in `etc/credentials.yaml`:

   ```yaml
   database:
     username: "clp-user"
     password: "<your-mariadb-password>"
   ```
:::{note}
When using external databases in a multi-host deployment, you do not need to start the
database and results-cache Docker Compose services. Skip those services when following the
multi-host deployment guide. However, you still need to run the database
initialization jobs (`db-table-creator` and `results-cache-indices-creator`).
:::
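A quick way to confirm that the external databases are no longer bundled is to list the uncommented entries under `bundled:`. This is a rough sketch rather than a real YAML parser: `bundled_services` is a hypothetical name, and it assumes `bundled:` is a top-level key with one list item per line, as in the configuration shown in this guide.

```bash
# Hypothetical helper: print the uncommented entries of the top-level "bundled:" list.
bundled_services() {
    awk '
        /^bundled:/ { in_list = 1; next }
        in_list && /^[^ \t#-]/ { in_list = 0 }     # the next top-level key ends the list
        in_list && /^[ \t]*-[ \t]*/ {
            sub(/^[ \t]*-[ \t]*/, "")              # strip the list marker
            sub(/[ \t]*#.*$/, "")                  # strip any trailing comment
            gsub(/"/, "")                          # strip quotes
            print
        }
    ' "$1"
}
```

Running `bundled_services etc/clp-config.yaml` should print neither `database` nor `results_cache` when both are external.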