https://zero-to-jupyterhub.readthedocs.io/en/latest/index.html
https://zero-to-jupyterhub.readthedocs.io/en/latest/jupyterhub/installation.html
It is easier to create a host VM in the cloud.
All work in the cloud can be done from that VM.
You can install kubectl, Helm, Docker and everything else on this VM without touching your own local machine.
How to create VM: https://mcs.mail.ru/help/ru_RU/create-vm/vm-quick-create
How to connect: https://mcs.mail.ru/help/ru_RU/vm-connect/vm-connect-nix
Steps:
- Create a VM
- Connect to the VM over SSH
- Perform all the steps described below from this VM
- Enjoy the cloud :)
Instruction: https://mcs.mail.ru/help/ru_RU/k8s-start/create-k8s
Kubernetes as a Service: https://mcs.mail.ru/app/services/containers/add/
https://mcs.mail.ru/help/ru_RU/k8s-start/connect-k8s
https://kubernetes.io/ru/docs/tasks/tools/install-kubectl/
curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
export KUBECONFIG=/replace_with_path/to_your_kubeconfig.yaml
alias k=kubectl
source <(kubectl completion bash)
complete -F __start_kubectl k
https://helm.sh/docs/intro/install/
curl https://raw.githubusercontent.com/helm/helm/HEAD/scripts/get-helm-3 | bash
(Optional) Install Docker if you want to build your own images for JupyterHub and log into a Docker registry
https://docs.docker.com/engine/install/ubuntu/
https://docs.docker.com/engine/reference/commandline/login/
https://ropenscilabs.github.io/r-docker-tutorial/04-Dockerhub.html
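Before moving on, it can help to confirm that the tools installed above are actually on PATH. The sketch below is only an illustration: `missing_tools` is a hypothetical helper name, not part of kubectl, Helm or Docker, and it checks nothing beyond whether each command can be found.

```shell
# Print the name of every tool from the argument list that is not on PATH.
# Prints nothing when all of them are found.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (Docker is optional in this guide):
#   missing_tools kubectl helm docker
```

Empty output means every listed tool was found; any name that is printed still needs to be installed.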
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
We also need to mark one of the storage classes as default for the installation to succeed
kubectl get storageclass
kubectl patch storageclass csi-ceph-ssd-dp1-retain -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Warning: this config is for demo use only! NOT A PRODUCTION SOLUTION
nano config_basic.yaml
#paste this to config_basic.yaml
singleuser:
  defaultUrl: "/lab"
  storage:
    dynamic:
      storageClass: csi-ceph-ssd-dp1-retain
hub:
  config:
    Authenticator:
      admin_users:
        - admin
      allowed_users:
        - your_another_non_admin_user
    #DummyAuthenticator not for production
    DummyAuthenticator:
      password: insertyourpasswordhereMVeP2VXfr
    JupyterHub:
      authenticator_class: dummy
helm upgrade --cleanup-on-fail \
--install defaultinstall jupyterhub/jupyterhub \
--namespace jupyterhub \
--create-namespace \
--version=1.1.3 \
--values config_basic.yaml \
--timeout 20m0s
To access JupyterHub we need to find its external IP
kubectl get services -n jupyterhub
Look for the Service of type LoadBalancer, then find its external IP.
You can access JupyterHub by entering this external IP in a browser.
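The lookup described above can also be scripted. The following is a minimal sketch that assumes the default table layout of `kubectl get services` (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, ...); `lb_external_ips` is a hypothetical helper name introduced here for illustration.

```shell
# Read `kubectl get services` table output on stdin and print the
# EXTERNAL-IP column of every row whose TYPE column is LoadBalancer.
# The header row (NR == 1) is skipped.
lb_external_ips() {
  awk 'NR > 1 && $2 == "LoadBalancer" { print $4 }'
}

# Usage:
#   kubectl get services -n jupyterhub | lb_external_ips
```

While the cloud load balancer is still being provisioned, the column shows `<pending>` instead of an address, so the command may need to be repeated after a short wait.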
For debugging and troubleshooting
kubectl get pods -n jupyterhub
kubectl get events -n jupyterhub
kubectl describe pod <pod-name> -n jupyterhub
kubectl logs <pod-name> -n jupyterhub
Let's add some security measures
nano config_lb.yaml
#paste this to config_lb.yaml
singleuser:
  defaultUrl: "/lab"
  storage:
    dynamic:
      #you could use different storage classes
      #get storage classes with: kubectl get storageclasses.storage.k8s.io
      storageClass: csi-ceph-ssd-dp1-retain
hub:
  config:
    Authenticator:
      admin_users:
        - admin
      allowed_users:
        - your_another_non_admin_user
    #DummyAuthenticator not for production
    DummyAuthenticator:
      password: insertyourpasswordhereMVeP2VXfr
    JupyterHub:
      authenticator_class: dummy
proxy:
  service:
    #if you set loadBalancerSourceRanges, JupyterHub is reachable only from the IP ranges listed here
    #you can list several IP ranges
    #https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/security.html#restricting-load-balancer-access
    loadBalancerSourceRanges:
      - PLACE_YOUR_IP_HERE
      - PLACE_ANOTHER_YOUR_IP_HERE_OR_REMOVE_THIS_LINE
      #EXAMPLE
      # - 91.74.148.161/32
helm upgrade --cleanup-on-fail \
defaultinstall jupyterhub/jupyterhub \
--namespace jupyterhub \
--version=1.1.3 \
--values config_lb.yaml \
--timeout 20m0s
You can check the versions of the Helm chart and JupyterHub here:
https://jupyterhub.github.io/helm-chart/
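The same information is available from the command line once the chart repo has been added, via `helm search repo jupyterhub/jupyterhub --versions`. As a rough sketch, the hypothetical helper below (`app_version_for_chart` is a name made up for this example) reads that output and prints the APP VERSION listed for a given chart version, assuming the default column order NAME, CHART VERSION, APP VERSION, DESCRIPTION.

```shell
# Read `helm search repo ... --versions` output on stdin.
# Print the APP VERSION of every row whose CHART VERSION equals $1.
# Exit status is 0 if at least one row matched, 1 otherwise.
app_version_for_chart() {
  awk -v want="$1" '
    NR > 1 && $2 == want { print $3; found = 1 }
    END { exit !found }
  '
}

# Usage:
#   helm search repo jupyterhub/jupyterhub --versions | app_version_for_chart 1.1.3
```

A non-zero exit status means the requested chart version was not listed, which is worth checking before passing it to `--version`.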
We can also enable HTTPS and integrate JupyterHub with GitHub for authentication
Read more here:
https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/security.html#https
https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/authentication.html#github
https://docs.github.com/en/organizations/collaborating-with-groups-in-organizations/creating-a-new-organization-from-scratch
You can find working examples of config.yaml in the config directory of this repo.
See config_https_github.yaml
You could read more about oneAPI:
https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html
https://medium.com/intel-analytics-software/save-time-and-money-with-intel-extension-for-scikit-learn-33627425ae4
You need to install Docker if you want to build your own image; otherwise you can use our image: mcscloud/jupyter-ds-intel-mcs:v2
sudo docker login --username=YOUR_DOCKERHUB_USER_NAME
#sudo docker login --username=mcscloud
Additional instructions for Docker Hub
https://jsta.github.io/r-docker-tutorial/04-Dockerhub.html
#make separate dir
mkdir ~/intel_based_docker_image && cd ~/intel_based_docker_image
#then create Dockerfile
nano Dockerfile
#paste this to Dockerfile
FROM jupyter/datascience-notebook:hub-1.4.2
RUN pip install --no-cache-dir nbgitpuller
#More information about nbgitpuller
#https://github.com/jupyterhub/nbgitpuller
#Install git extension
RUN pip install --no-cache-dir jupyterlab-git
#Install Intel part (--yes keeps conda from waiting for confirmation during the build)
RUN conda install --yes -c conda-forge scikit-learn-intelex
Let's build a custom image with the Intel ML package
export YOUR_DOCKER_REPO=
#example export YOUR_DOCKER_REPO=mcscloud
sudo docker build -t jupyter-ds-intel-mcs .
sudo docker images
#find image id and copy it
sudo docker tag YOUR_IMAGE_ID $YOUR_DOCKER_REPO/jupyter-ds-intel-mcs:v2
sudo docker push $YOUR_DOCKER_REPO/jupyter-ds-intel-mcs:v2
If you want to test the Intel libraries for ML, you need more resources
As you can see below, we add CPU and memory requirements under the singleuser section of the config
nano config_intel.yaml
#paste this to config_intel.yaml
singleuser:
  defaultUrl: "/lab"
  storage:
    dynamic:
      #you could use different storage classes
      #get storage classes with: kubectl get storageclasses.storage.k8s.io
      storageClass: csi-ceph-ssd-dp1-retain
  cpu:
    limit: 3
    guarantee: 2
  memory:
    limit: 3G
    guarantee: 512M
  # Defines the default image
  image:
    name: jupyter/minimal-notebook
    tag: hub-1.4.2
  profileList:
    - display_name: "Minimal environment"
      description: "To avoid too much bells and whistles: Python."
      default: true
    - display_name: "Tensorflow"
      description: "If you want the additional bells and whistles: Python, R, and Julia."
      kubespawner_override:
        image: jupyter/tensorflow-notebook:hub-1.4.2
    - display_name: "Spark environment"
      description: "The Jupyter Stacks spark image!"
      kubespawner_override:
        image: jupyter/all-spark-notebook:hub-1.4.2
    - display_name: "JupyterLab with Intel libraries"
      description: "Use some Intel optimizations"
      kubespawner_override:
        image: PLACE_YOUR_DOCKER_REPO_OR_USE_mcscloud/jupyter-ds-intel-mcs:v2
        # image: mcscloud/jupyter-ds-intel-mcs:v2
hub:
  config:
    Authenticator:
      admin_users:
        - admin
      allowed_users:
        - your_another_non_admin_user
    #DummyAuthenticator not for production
    DummyAuthenticator:
      password: insertyourpasswordhereMVeP2VXfr
    JupyterHub:
      authenticator_class: dummy
proxy:
  service:
    #if you set loadBalancerSourceRanges, JupyterHub is reachable only from the IP ranges listed here
    #you can list several IP ranges
    #https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/security.html#restricting-load-balancer-access
    loadBalancerSourceRanges:
      - PLACE_YOUR_IP_HERE
      - PLACE_ANOTHER_YOUR_IP_HERE_OR_REMOVE_THIS_LINE
      #EXAMPLE
      # - 91.74.148.161/32
helm upgrade --cleanup-on-fail \
defaultinstall jupyterhub/jupyterhub \
--namespace jupyterhub \
--version=1.1.3 \
--values config_intel.yaml \
--timeout 20m0s
To test the Intel library, clone the repo https://github.com/intel/scikit-learn-intelex using the Git extension already installed in Jupyter
You can find the Git extension in the left sidebar
After cloning the repo you can find the test notebooks in scikit-learn-intelex/examples/notebooks/
To use JupyterHub, enter the external IP of the proxy-public service into a browser.
kubectl get service -n jupyterhub
If you have any questions you can ask me here:
Telegram @volinski
Email [email protected]