slurm-cloud-integration

Background

The slurm-cloud-integration project contains Dockerfiles, configuration files, and deployment content designed to enable the prototyping and delivery of capabilities that integrate the Kubernetes and Slurm HPC ecosystems.

The slurm-jupyter-docker and slurm-single-node Dockerfiles are based upon the excellent work by Rodrigo Ancavil.

slurm-single-node: full stack, single-node Slurm in Docker

The slurm-single-node Dockerfile delivers an image that enables integration testing with a full Slurm stack with one worker (slurmd) node. This Dockerfile is based upon an excellent example written by Lennart Landsmeer.

The slurm-single-node Docker image is built from the project root directory as follows:

docker build -f src/docker/slurm-single-node -t hokiegeek2/slurm-single-node:$VERSION .
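Because the image tag is built from $VERSION, an unset variable yields the incomplete reference hokiegeek2/slurm-single-node: and the build is rejected. The following is a minimal sketch, not part of the repository, of a guard that prints the build command only when VERSION is set. The function name print_build_cmd is hypothetical.

```shell
# Hypothetical helper (not part of the repository): print the docker build
# command for a Dockerfile under src/docker, refusing to proceed when
# VERSION is unset or empty.
print_build_cmd() {
    local dockerfile="$1" image="$2"
    if [ -z "${VERSION:-}" ]; then
        echo "VERSION must be set, e.g. export VERSION=0.1.0" >&2
        return 1
    fi
    printf 'docker build -f src/docker/%s -t hokiegeek2/%s:%s .\n' \
        "$dockerfile" "$image" "$VERSION"
}
```

With VERSION exported, print_build_cmd slurm-single-node slurm-single-node prints the command shown above with the tag filled in, so it can be reviewed before being run.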

To simply run the slurm-single-node docker container, execute the following command:

docker run -it --rm --network=host hokiegeek2/slurm-single-node

In order to perform any integration testing with applications outside of the slurm-single-node container, the munge.key used by the external app must be mounted into the Docker container. To mount a munge.key and start the slurm-single-node container, execute the following command:

docker run -it --rm --network=host -v $PWD/munge.key:/tmp/munge.key hokiegeek2/slurm-single-node
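For a throwaway test setup where no munge.key exists yet, a key can be generated locally and then shared by both sides. The following is a minimal sketch, assuming a key of 1024 random bytes is acceptable to the MUNGE version in use. The function name make_test_munge_key is hypothetical, and the resulting key is intended for testing only.

```shell
# Hypothetical helper: write 1024 random bytes to the given path
# (default: ./munge.key) and make the file read-only for its owner.
make_test_munge_key() {
    local path="${1:-munge.key}"
    # Refuse to overwrite an existing key by accident.
    if [ -e "$path" ]; then
        echo "refusing to overwrite existing $path" >&2
        return 1
    fi
    # The subshell keeps the restrictive umask from leaking to the caller.
    ( umask 077 && head -c 1024 /dev/urandom > "$path" ) || return 1
    chmod 400 "$path"
}
```

Running make_test_munge_key in the directory used for the docker run command above produces a munge.key that can be mounted as shown and also supplied to the external application.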

Successful startup of slurm-single-node looks like this:

slurm-jupyterlab on k8s

The slurm-jupyter-docker Dockerfile and slurm-jupyter Helm chart enable deployment of the awesome NERSC jupyterlab-slurm application to Kubernetes.

The slurm-jupyter Docker image is built from the project root directory as follows:

docker build -f src/docker/slurm-jupyter-docker -t hokiegeek2/slurm-jupyter:$VERSION .

The command sequence to start slurm-jupyterlab is contained within the start-slurm-jupyter.sh file and is as follows:

#!/bin/bash

# copy munge.key, set ownership and permissions, and move to config dir
sudo cp /tmp/munge/munge.key /tmp/munge.key
sudo mv /tmp/munge.key /etc/munge/munge.key
sudo chown munge:munge /etc/munge/munge.key
sudo chmod 400 /etc/munge/munge.key

# start munge authentication service
sudo service munge start

jupyter lab --no-browser --allow-root --ip=0.0.0.0 --NotebookApp.token='' \
            --NotebookApp.password=''

tail -f /dev/null

Note the munge.key handling section, which installs the munge.key passed in at container startup into /etc/munge. Specifically, the munge.key file must be owned by the munge user and its permissions must be 400 (read-only for the owner).
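The ownership and permission requirements can be verified before the munge service is started. The following is a minimal sketch, not part of start-slurm-jupyter.sh, that assumes GNU stat is available in the image. The function name check_munge_key is hypothetical.

```shell
# Hypothetical helper: succeed only if the key file exists, has mode 400,
# and is owned by the expected user (default: munge).
check_munge_key() {
    local path="$1" want_owner="${2:-munge}"
    local mode owner
    [ -f "$path" ] || { echo "missing key: $path" >&2; return 1; }
    mode=$(stat -c '%a' "$path") || return 1    # GNU stat: octal mode
    owner=$(stat -c '%U' "$path") || return 1   # GNU stat: owner name
    if [ "$mode" != "400" ]; then
        echo "bad mode $mode on $path (expected 400)" >&2
        return 1
    fi
    if [ "$owner" != "$want_owner" ]; then
        echo "bad owner $owner on $path (expected $want_owner)" >&2
        return 1
    fi
}
```

A call such as check_munge_key /etc/munge/munge.key placed just before the service start would stop the script early with a readable message instead of leaving the cause to be found in the munge logs.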

Deploying slurm-jupyterlab to Kubernetes

Preparing for slurm-jupyterlab Deployment

The munge.key configured for slurmctld needs to be added as a secret, which is accomplished as follows:

# Add secret encapsulating munge.key
kubectl create secret generic slurm-munge-key --from-file=/tmp/munge.key -n slurm-integration

# Confirm secret was created
kubectl get secret -n slurm-integration
NAME                                         TYPE                                  DATA   AGE
slurm-munge-key                              Opaque                                1      18d

Importantly, as with the slurmd workers, the munge.key MUST be the same munge.key used by the munge service running on the slurmctld node.
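Since a key mismatch only surfaces later as an authentication failure, it can help to compare the two copies up front. The following is a minimal sketch, assuming both copies of the key are available as local files. The function name same_munge_key is hypothetical.

```shell
# Hypothetical helper: exit status 0 if the two key files are
# byte-identical, 1 if they differ, 2 if either file cannot be read.
same_munge_key() {
    local a="$1" b="$2"
    [ -r "$a" ] && [ -r "$b" ] || { echo "cannot read $a or $b" >&2; return 2; }
    if cmp -s "$a" "$b"; then
        echo "munge keys match"
    else
        echo "munge keys differ" >&2
        return 1
    fi
}
```

To check the copy held in the Kubernetes secret, it can first be decoded to a temporary file, for example with kubectl get secret slurm-munge-key -n slurm-integration -o jsonpath='{.data.munge\.key}' | base64 -d > /tmp/secret.key. This assumes the secret's data key is munge.key, the basename of the file passed to --from-file.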

Deploying slurm-jupyterlab is done via the slurm-jupyter Docker image and the slurm-jupyter Helm chart.

The helm command is executed as follows from the project root directory:

helm install -n slurm-integration slurm-jupyter-server deployment/charts/slurm-jupyter/

In addition to the Helm chart artifacts, the slurm-jupyterlab k8s deployment requires the same munge.key used in the Slurm cluster that slurm-jupyterlab will connect to. The munge.key is used to create a Kubernetes secret that is mounted in the pod. If the secret was not already created during preparation, the kubectl command is as follows:

kubectl create secret generic slurm-munge-key --from-file=munge.key -n slurm-integration

The configuration logic for loading the k8s munge.key secret is in the slurm-jupyter Helm template.

Successful deployment of slurm-jupyterlab looks like this:

Confirm connectivity to slurm via the following commands:

# generic cluster info including slurmd node names 
sinfo

# specific info and statuses for each slurmd node
scontrol show nodes
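For scripted checks, the node states can be read in a more machine-friendly form. The following is a minimal sketch, assuming sinfo -N -h -o "%N %t" prints one "node state" pair per line and treating only idle, mix, and alloc as healthy. The function name unhealthy_nodes is hypothetical.

```shell
# Hypothetical helper: read "node state" lines on stdin and print every
# node whose state is not exactly idle, mix, or alloc. States carrying a
# suffix such as "idle*" (node not responding) are therefore reported too.
# Exit status is 1 if any such node is found, 0 otherwise.
unhealthy_nodes() {
    awk '
        NF >= 2 && $2 !~ /^(idle|mix|alloc)$/ { print $1, $2; bad = 1 }
        END { exit bad + 0 }
    '
}

# Intended use where the Slurm client tools can reach slurmctld:
#   sinfo -N -h -o "%N %t" | unhealthy_nodes
```

An empty output with exit status 0 indicates that every listed node reported one of the three accepted states.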

Integration testing of slurm-jupyterlab on k8s with slurm-single-node

Integration testing of slurm-jupyterlab on k8s with slurm-single-node involves running the slurm-single-node Docker image. The docker run command is as follows:

docker run -it --rm --network=host -v $PWD/munge.key:/tmp/munge.key hokiegeek2/slurm-single-node:$VERSION

Passing the munge.key into the Docker container is an extremely important detail. The munge.key used by Slurm, whether in the slurm-single-node Docker container or on a bare-metal Slurm cluster, must be the same munge.key used in the slurm-jupyterlab deployment on k8s. If not, authentication from slurm-jupyterlab on k8s to the Slurm cluster will fail with the following message:

Using the test.slurm job, a successful job execution will look as follows in slurm-jupyterlab via the terminal...

...as well as this in slurm queue manager:

...and finally this in slurm:
