A complete guide to deploying MLflow on a local k3s cluster.
- k3s installed and running (`sudo k3s server` or installed as a service)
- `kubectl` configured to point to the k3s cluster (automatic with k3s)
- `helm` v3+ installed
- An AWS S3 bucket (or S3-compatible storage) for artifacts
On k3s, `kubectl` is available via `sudo k3s kubectl`. To use `kubectl` directly:

```bash
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
# Or adjust permissions so non-root users can read it:
sudo chmod 644 /etc/rancher/k3s/k3s.yaml
```

Tip: Add `export KUBECONFIG=/etc/rancher/k3s/k3s.yaml` to your `~/.bashrc` or `~/.zshrc` to make it persistent.
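The export and the permission fix can be combined defensively; a sketch (the fallback message is illustrative, not k3s output):

```shell
# Prefer the k3s kubeconfig when it exists and is readable; otherwise
# kubectl falls back to the default ~/.kube/config.
if [ -r /etc/rancher/k3s/k3s.yaml ]; then
  export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
fi
echo "Using kubeconfig: ${KUBECONFIG:-$HOME/.kube/config (default)}"
```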
```bash
cp .env.example .env
```

Edit `.env` and fill in the variables from the Common section:
| Variable | Description | Example |
|---|---|---|
| PORT | MLflow server port | 8080 |
| BACKEND_STORE_URI | PostgreSQL URI | postgresql://mlflow:password@mlflow-db-postgresql:5432/mlflow_db |
| ARTIFACT_ROOT | S3 path | s3://my-bucket/mlflow-artifacts/ |
| AWS_ACCESS_KEY_ID | AWS key | AKIA... |
| AWS_SECRET_ACCESS_KEY | AWS secret | wJal... |
| POSTGRES_USER | PostgreSQL user | mlflow |
| POSTGRES_PASSWORD | PostgreSQL password | (strong password) |
| POSTGRES_DB | Database name | mlflow_db |
| POSTGRES_ADMIN_PASSWORD | Admin password | (strong password) |
| MLFLOW_TRACKING_URI | Local tracking URI | http://localhost:5000 |
Note: The authentication variables (the "Scaleway only" section in `.env.example`) are not required in local mode. You can safely ignore them.
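Putting the table together, a filled-in `.env` might look like this sketch (every value is illustrative; substitute your own bucket, keys, and passwords):

```bash
PORT=8080
# The password embedded in BACKEND_STORE_URI must match POSTGRES_PASSWORD.
BACKEND_STORE_URI=postgresql://mlflow:change-me@mlflow-db-postgresql:5432/mlflow_db
ARTIFACT_ROOT=s3://my-bucket/mlflow-artifacts/
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=wJal...
POSTGRES_USER=mlflow
POSTGRES_PASSWORD=change-me
POSTGRES_DB=mlflow_db
POSTGRES_ADMIN_PASSWORD=change-me-too
MLFLOW_TRACKING_URI=http://localhost:5000
```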
```bash
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
```

```bash
source .env
helm install mlflow-db bitnami/postgresql -f values-postgresql.yaml \
  --set auth.username=$POSTGRES_USER \
  --set auth.password=$POSTGRES_PASSWORD \
  --set auth.database=$POSTGRES_DB \
  --set auth.postgresPassword=$POSTGRES_ADMIN_PASSWORD
```

```bash
kubectl get pods
```

Wait until `mlflow-db-postgresql-0` shows a `Running` status.
```bash
kubectl logs mlflow-db-postgresql-0
```

The message `database system is ready to accept connections` confirms that PostgreSQL is up and running.
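Rather than re-running `kubectl get pods` by hand, `kubectl wait --for=condition=ready pod/mlflow-db-postgresql-0 --timeout=300s` blocks until the pod is Ready. The same idea as a generic retry loop, sketched with a stubbed check (`true` stands in for the real probe):

```shell
# Retry a check command until it succeeds or attempts run out.
wait_for() {
  attempts=$1; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then return 1; fi
    sleep 1
  done
}

# Stubbed here with `true`; against the cluster the check could be e.g.
#   kubectl get pod mlflow-db-postgresql-0 -o jsonpath='{.status.phase}' | grep -qx Running
wait_for 30 true && echo "ready"   # → ready
```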
```bash
kubectl create secret generic mlflow-env-variables --from-env-file=.env
```

```bash
kubectl apply -f k8s/common/mlflow_deployment.yaml
kubectl apply -f k8s/local/mlflow_service.yaml
```

```bash
kubectl get pods -l app=mlflow-dashboard
kubectl get svc mlflow-service
```

Wait until both pods show a `Running` status.
```bash
kubectl port-forward svc/mlflow-service 5000:5000
```

Verify the connection is working:

```bash
curl http://localhost:5000/health
```

Then open http://localhost:5000 in your browser.
The service is of type `NodePort`. Retrieve the assigned port:

```bash
kubectl get svc mlflow-service
```

The `PORT(S)` column displays `5000:3xxxx/TCP`. Access the UI at `http://NODE_IP:3xxxx`. On k3s, `NODE_IP` is usually `127.0.0.1` or the machine's IP address.
Note: If you use NodePort instead of port-forward, update `MLFLOW_TRACKING_URI` in your `.env` file to match the NodePort URL (e.g. `http://127.0.0.1:3xxxx`) instead of `http://localhost:5000`.
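For scripting, the NodePort can be pulled out of that column; a sketch using a made-up sample value (`31234` is not the real port):

```shell
# Parse the NodePort out of a "PORT:NODEPORT/PROTO" string, as shown in
# the PORT(S) column of `kubectl get svc`.
ports="5000:31234/TCP"    # sample value; against a live cluster you could
                          # skip parsing entirely with:
                          #   kubectl get svc mlflow-service -o jsonpath='{.spec.ports[0].nodePort}'
nodeport=${ports#*:}      # strip the leading "5000:"
nodeport=${nodeport%/*}   # strip the trailing "/TCP"
echo "http://127.0.0.1:${nodeport}"   # → http://127.0.0.1:31234
```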
In a first terminal, start the port-forward:

```bash
kubectl port-forward svc/mlflow-service 5000:5000
```

In a second terminal, install the Python dependencies:

```bash
uv sync
```

Then run the training script:

```bash
uv run python train.py --n_estimators 100 --min_samples_split 2
```

Note: `train.py` calls `load_dotenv()` via `python-dotenv`, so `MLFLOW_TRACKING_URI=http://localhost:5000` from your `.env` file is loaded automatically without any manual `export`.
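For tools that do not load `.env` themselves, the variables can be exported from the shell instead. A self-contained sketch (it writes a throwaway file under `/tmp` purely for illustration):

```shell
# Create a stand-in .env and export everything it assigns.
cat > /tmp/demo.env <<'EOF'
MLFLOW_TRACKING_URI=http://localhost:5000
EOF

set -a               # auto-export every variable assigned while sourcing
. /tmp/demo.env
set +a
echo "$MLFLOW_TRACKING_URI"   # → http://localhost:5000
```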
Check the results in the MLflow UI (http://localhost:5000): experiment, runs, metrics, and registered model.
To update the environment variables after editing `.env`, recreate the secret and restart the deployment:

```bash
kubectl delete secret mlflow-env-variables
kubectl create secret generic mlflow-env-variables --from-env-file=.env
kubectl rollout restart deployment mlflow-deployment
```

To apply changes to the PostgreSQL release (e.g. after editing `values-postgresql.yaml`):

```bash
source .env
helm upgrade mlflow-db bitnami/postgresql -f values-postgresql.yaml \
  --set auth.username=$POSTGRES_USER \
  --set auth.password=$POSTGRES_PASSWORD \
  --set auth.database=$POSTGRES_DB \
  --set auth.postgresPassword=$POSTGRES_ADMIN_PASSWORD
```

To tear everything down:

```bash
kubectl delete -f k8s/local/mlflow_service.yaml
kubectl delete -f k8s/common/mlflow_deployment.yaml
kubectl delete secret mlflow-env-variables
helm uninstall mlflow-db
```

To also delete the PostgreSQL data volume:

```bash
kubectl delete pvc data-mlflow-db-postgresql-0
```

Warning: Deleting the PVC permanently destroys all PostgreSQL data. This action is irreversible.
If the PostgreSQL pod fails to start:

```bash
kubectl describe pod mlflow-db-postgresql-0
kubectl logs mlflow-db-postgresql-0
kubectl get pvc
```

Common causes: PVC stuck in `Pending` (StorageClass not available), incorrect credentials.

If the MLflow pods fail to start:

```bash
kubectl logs -l app=mlflow-dashboard --tail=100
kubectl describe pod -l app=mlflow-dashboard
```

Common causes: PostgreSQL not yet Ready, incorrect `BACKEND_STORE_URI`, invalid AWS credentials.

If the service does not respond:

```bash
kubectl get endpoints mlflow-service
```

If the endpoints are empty, the MLflow pods are not Ready.