Summary
A BinderHub isn't the most useful platform for a research community - it generally lacks persistence (although you can rectify this by deploying the persistent_binderhub helm chart) and configuring safe git push/pull to any kind of repository is clunky.
2i2c and other parties have put a lot of work into the zero-to-jupyterhub-k8s (z2jh) helm chart to make JupyterHub a useful platform for research purposes.
Must-Have Features
- Azure AD authentication, so everyone can log in with their turing.ac.uk account
- Mapping specific user groups onto specific profile lists (i.e. machine types)
- Secure GitHub pull/push access to public/private repos both in the Turing GitHub org and users' own personal accounts
- Ability to sync folders to the Jupyter home dir server
- A hub environment image where all the packages users will need to use are installed
  - 2i2c has a template repository for automatically building and pushing a docker image using the repo2docker-action: https://github.com/2i2c-org/hub-user-image-template
    - Note: this example suggests pushing to quay.io, but a lot of other container registries are supported too, so we could push to the turinginst org on Docker Hub, for example
  - We probably want to start with an image like Pangeo's and then start crowd-sourcing a list of packages people actually want and provide them in our own image
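As a rough sketch of what the first two must-haves might look like at the helm chart level, the Python below builds a values fragment for the z2jh chart and filters machine profiles by group. The `hub.config` keys follow the Azure AD section of the z2jh authentication docs. The hostname, group name, profile sizes and the `allowed_groups` field are placeholders invented for illustration, and how group membership actually reaches the spawner is left open.

```python
import json


def azure_ad_values(client_id, client_secret, tenant_id, hub_host):
    """Return a z2jh helm values fragment that enables Azure AD login.

    The client ID and secret are the ones Turing IT would provide for the app.
    """
    return {
        "hub": {
            "config": {
                "JupyterHub": {"authenticator_class": "azuread"},
                "AzureAdOAuthenticator": {
                    "client_id": client_id,
                    "client_secret": client_secret,
                    "tenant_id": tenant_id,
                    "oauth_callback_url": f"https://{hub_host}/hub/oauth_callback",
                },
            }
        }
    }


# Hypothetical machine profiles. "allowed_groups" is our own bookkeeping key,
# not a KubeSpawner option, so it is stripped before profiles are returned.
PROFILES = [
    {
        "display_name": "Small (2 CPU, 4G RAM)",
        "allowed_groups": None,  # None means every logged-in user
        "kubespawner_override": {"cpu_limit": 2, "mem_limit": "4G"},
    },
    {
        "display_name": "Large (8 CPU, 32G RAM)",
        "allowed_groups": {"large-machines"},  # placeholder group name
        "kubespawner_override": {"cpu_limit": 8, "mem_limit": "32G"},
    },
]


def profiles_for_groups(user_groups, profiles=PROFILES):
    """Return the profiles a user with the given groups is allowed to pick."""
    visible = []
    for profile in profiles:
        allowed = profile["allowed_groups"]
        if allowed is None or allowed & set(user_groups):
            visible.append(
                {k: v for k, v in profile.items() if k != "allowed_groups"}
            )
    return visible


if __name__ == "__main__":
    values = azure_ad_values(
        "<client-id>", "<client-secret>", "<tenant-id>", "hub.example.org"
    )
    # JSON is valid YAML, so this output can be pasted into a values file.
    print(json.dumps(values, indent=2))
    print([p["display_name"] for p in profiles_for_groups({"large-machines"})])
```

KubeSpawner's `profile_list` can be a callable as well as a static list, so a function along the lines of `profiles_for_groups` could be wired in there once we know where group membership comes from.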
Nice-to-Have Features
Future Features
These features are still under active development and are not yet recommended for production deployment. But when they are, they'll be super awesome!
- Real-Time Collaboration in JupyterLab
- Accessing private data based in the cloud
- Dynamic environment generation using repo2docker directly from JupyterHub
  - 2i2c will be working on this feature sometime next year
TODOs
- Tear down Hub23 the BinderHub
- Deploy Hub23 the JupyterHub
Tearing down the BinderHub
- Documentation to achieve this is available here and here
- Hub23 and the Binder Federation deployment share the same cluster, so we may not want to totally delete the cluster... Or maybe we do and want to redeploy into the uksouth location while we have the opportunity. Up to you!
Clearing out the repo
Files we won't need (or will be creating new ones for JupyterHub):
- Everything under the deploy/ folder
- Everything under the hub23-chart/ folder (we should create a new local helm chart that has ingress-nginx, grafana and prometheus dependencies for monitoring)
- .az-pipelines/cd.yaml will need rewriting and temporarily disabling until we have the new deployment set up
- Disable .github/workflows/bump-helm-version.yaml
  - If we end up with a new local helm chart, we can reenable this
- We should probably rewrite the admin docs under the docs/ folder as we go too
Setting up the new JupyterHub
Next Steps
- Start enabling the features listed at the top of this issue!
  - Some will require editing config at the helm chart level; others just need the right packages installed in the hub environment and some docs
- Start promoting the hub in the Turing community, what it can do, and why it's useful
- Teach people how to use nbgitpuller to sync git repositories to the JupyterHub
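For the nbgitpuller teaching material, a small helper that generates shareable links might be handy. This is a minimal sketch assuming the hub runs JupyterLab and nbgitpuller is installed in the user image; the hub hostname and repository below are made-up placeholders.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub_url, repo_url, branch="main", open_path=None):
    """Build a link that pulls repo_url into the user's home dir on the hub."""
    # nbgitpuller clones into a folder named after the repo by default.
    repo_name = repo_url.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]

    # urlpath controls what opens after the pull: here, JupyterLab's file
    # browser at the repo folder, or at a specific file inside it.
    target = repo_name if open_path is None else f"{repo_name}/{open_path}"
    params = {
        "repo": repo_url,
        "branch": branch,
        "urlpath": f"lab/tree/{target}",
    }
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{urlencode(params)}"


if __name__ == "__main__":
    print(
        nbgitpuller_link(
            "https://hub.example.org",
            "https://github.com/example-org/example-repo",
        )
    )
```

Anyone who clicks such a link is asked to log in, gets their server started if needed, and lands in the freshly synced folder.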
Azure AD authentication
- Docs: https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/authentication.html#azure-active-directory
- Turing IT will need to create a relevant app and provide the client ID and secret for that app
Setting up the new JupyterHub
- basehub for this: https://github.com/2i2c-org/infrastructure/tree/master/helm-charts/basehub
- .az-pipelines/cd.yaml to get it running again, or...
- .github/workflows/bump-helm-version.yaml for new local helm chart