Description
What's the Goal?
I am trying to figure out how to allow users to SSH into Notebook Pods from their laptop. The benefit of this is supporting tools like Remote VSCode and JetBrains Gateway (for PyCharm) with the resources (e.g. GPUs) of the Pod.
The main issue is how to expose the Notebook Pod via SSH on the Istio Ingress Gateway.
What's the Problem?
SSH runs over plain TCP, so the gateway can't do the hostname/HTTP-path routing that we use for the web-based UIs of the Notebooks. The naive approach is to have the Istio Ingress Gateway listen on a unique port for each Notebook (which is obviously neither scalable nor secure).
In my mind there are only TWO ways to make this work:
- Use a "jump box" service (with a single IP/port) that listens for SSH connections and routes each incoming session to a specific Notebook Pod based on the SSH key used to authenticate:
  - This could be implemented by setting the `command` option in `authorized_keys` to another `ssh -t username@<WORKSPACE_NAME>.<NAMESPACE_NAME>.svc.cluster.local` command (see idea here).
  - Or there might be a pre-made open-source SSH-routing tool for this exact use case.
  - I am not sure what kind of hardening is required on the jump box, but we need to consider things like:
    - disabling SSH tunneling
    - ensuring only traffic from the Istio Gateway gets to it (not from Pods inside the mesh)
    - using `fail2ban` to stop brute forcing
    - regular/automatic rotation of SSH keys
- Use some kind of SD-WAN VPN like Tailscale (which can be self-hosted with open-source components), Cloudflare Tunnel, or ngrok:
  - We would run the service on both the laptop and the Notebook Pod, giving the Notebook Pod a special IP address that the laptop can use to access it.
  - This is slightly problematic because it will not be a direct connection from the user to the Pod (and it will probably be slower because traffic might have to be relayed).
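To make the jump-box idea a bit more concrete, here is a minimal sketch of how the per-user `authorized_keys` lines could be generated. It is only an illustration under several assumptions: the function name, the example user/workspace/namespace, and the placeholder key are all hypothetical, and it assumes the jump box itself holds credentials that the Notebook Pod's sshd accepts. `restrict` and `pty` are standard OpenSSH `authorized_keys` options (`restrict` turns off forwarding and PTY allocation, `pty` re-enables the terminal that `ssh -t` needs).

```python
import re

# Kubernetes names used here must be valid DNS labels; validating them also
# keeps shell metacharacters out of the forced command.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
USERNAME = re.compile(r"[a-z_][a-z0-9_-]*")


def authorized_keys_entry(public_key: str, user: str, workspace: str, namespace: str) -> str:
    """Build one authorized_keys line that forces a hop to a single Notebook Pod.

    Hypothetical helper: the routing decision is simply "which key logged in".
    """
    if not USERNAME.fullmatch(user):
        raise ValueError(f"invalid username: {user!r}")
    for label in (workspace, namespace):
        if not DNS_LABEL.fullmatch(label):
            raise ValueError(f"invalid DNS label: {label!r}")

    key = public_key.strip()
    if not key or "\n" in key or "\r" in key:
        raise ValueError("public key must be a single non-empty line")

    target = f"{user}@{workspace}.{namespace}.svc.cluster.local"
    # restrict: disable port/agent/X11 forwarding and PTY allocation.
    # pty:      re-allow a terminal, because the forced command uses `ssh -t`.
    # command:  run this instead of whatever the client asked for.
    options = ["restrict", "pty", f'command="ssh -t {target}"']
    return f"{','.join(options)} {key}"


if __name__ == "__main__":
    # Placeholder key material, not a real key.
    print(authorized_keys_entry("ssh-ed25519 AAAAEXAMPLE alice@laptop", "alice", "nb1", "team-a"))
    # restrict,pty,command="ssh -t alice@nb1.team-a.svc.cluster.local" ssh-ed25519 AAAAEXAMPLE alice@laptop
```

One caveat with this sketch: a forced command replaces whatever command the client requests, so tools that run their own remote commands over SSH may need a smarter wrapper than a bare `ssh -t`.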
Other Notes
While it is technically possible to use `kubectl port-forward` on the laptop to expose any port that the Notebook Pod is listening on (e.g. the SSH port), I am not sure this is desirable at scale because it requires all users to have the `pod/exec` RBAC on the profile namespace, which is very privileged.
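For reference, the port-forward workaround boils down to two commands on the laptop, sketched below as argument lists. The helper name, the pod name, and the local port are hypothetical, and the sketch assumes an sshd is already listening on port 22 inside the Notebook Pod.

```python
def port_forward_commands(namespace: str, pod: str, user: str,
                          local_port: int = 2222, pod_ssh_port: int = 22):
    """Return (forward, connect) argv lists for the kubectl workaround.

    Assumption: the Notebook Pod runs an SSH server on `pod_ssh_port`.
    """
    # Terminal 1: tunnel localhost:<local_port> to the Pod's SSH port.
    forward = [
        "kubectl", "port-forward",
        "--namespace", namespace,
        f"pod/{pod}",
        f"{local_port}:{pod_ssh_port}",
    ]
    # Terminal 2: SSH through the tunnel.
    connect = ["ssh", "-p", str(local_port), f"{user}@localhost"]
    return forward, connect


if __name__ == "__main__":
    forward, connect = port_forward_commands("team-a", "my-notebook-0", "alice")
    print(" ".join(forward))
    # kubectl port-forward --namespace team-a pod/my-notebook-0 2222:22
    print(" ".join(connect))
    # ssh -p 2222 alice@localhost
```

The lists can be passed to `subprocess.Popen` and `subprocess.run` respectively, but every user still needs their own cluster credentials for this to work, which is the scaling concern described above.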
Final Thoughts
There are lots of security considerations to allowing remote SSH access, especially for the people who put Kubeflow on the public internet (NOT advised).
I am interested to hear people's ideas for how we can do this safely.