Support wss:// websocket URLs for api-managed SSH proxying #5016

@funkypenguin

Description

In our environment, we use Istio Ingress Gateways rather than nginx. We've successfully exposed the API server over HTTPS (https://skypilot.mydomain.com) and can launch clusters.

When trying to SSH to a launched cluster, we get this error:

(base) root@2df1a7da2034:/# ssh sky-6ba6-root
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/websockets/asyncio/client.py", line 543, in __await_impl__
    await self.connection.handshake(
  File "/opt/conda/lib/python3.10/site-packages/websockets/asyncio/client.py", line 114, in handshake
    raise self.protocol.handshake_exc
  File "/opt/conda/lib/python3.10/site-packages/websockets/client.py", line 325, in parse
    self.process_response(response)
  File "/opt/conda/lib/python3.10/site-packages/websockets/client.py", line 142, in process_response
    raise InvalidStatus(response)
websockets.exceptions.InvalidStatus: server rejected WebSocket connection: HTTP 301

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/sky/templates/websocket_proxy.py", line 64, in <module>
    asyncio.run(main(websocket_url))
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.10/site-packages/sky/templates/websocket_proxy.py", line 14, in main
    async with websockets.connect(url, ping_interval=None) as websocket:
  File "/opt/conda/lib/python3.10/site-packages/websockets/asyncio/client.py", line 587, in __aenter__
    return await self
  File "/opt/conda/lib/python3.10/site-packages/websockets/asyncio/client.py", line 559, in __await_impl__
    uri_or_exc = self.process_redirect(exc)
  File "/opt/conda/lib/python3.10/site-packages/websockets/asyncio/client.py", line 494, in process_redirect
    new_ws_uri = parse_uri(new_uri)
  File "/opt/conda/lib/python3.10/site-packages/websockets/uri.py", line 77, in parse_uri
    raise InvalidURI(uri, "scheme isn't ws or wss")
websockets.exceptions.InvalidURI: https://skypilot.mydomain.com/kubernetes-pod-ssh-proxy?cluster_name=sky-6ba6-root isn't a valid URI: scheme isn't ws or wss
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535

I believe what's happening is that https://github.com/skypilot-org/skypilot/blob/master/sky/templates/websocket_proxy.py is trying to connect to ws://skypilot.mydomain.com and receiving a 301 redirect from Istio to https://skypilot.mydomain.com. The websockets client then tries to follow the redirect and fails, because the redirect target uses the https scheme rather than ws or wss.

As a proof of concept, I've worked around this issue by changing https://github.com/skypilot-org/skypilot/blob/master/sky/templates/websocket_proxy.py from:

    websocket_url = (f'ws://{server_url}/kubernetes-pod-ssh-proxy'
                     f'?cluster_name={sys.argv[2]}')

To:

    websocket_url = (f'wss://{server_url}/kubernetes-pod-ssh-proxy'
                     f'?cluster_name={sys.argv[2]}')

With that change, the SSH connection works correctly. So I know that WebSockets through Istio Ingress Gateways work, as long as the client connects with the wss:// scheme in the first place.

My request: alter the SSH proxy command to include the WebSocket scheme inline, inferred from the URL given to sky api login (https:// would map to wss://, http:// to ws://). This would allow both ws:// and wss:// to be used without requiring the ingress in front of skypilot-api to redirect ws:// to wss://.
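
The scheme inference could look roughly like the sketch below. This is only an illustration, not the actual SkyPilot code: the helper name build_websocket_url is hypothetical, and it assumes the proxy script is handed the full API server URL including its http:// or https:// scheme, which the current script does not appear to receive.

```python
from urllib.parse import urlencode, urlparse


def build_websocket_url(server_url: str, cluster_name: str) -> str:
    """Hypothetical helper: derive the ws/wss proxy URL from the API server URL.

    Assumes server_url carries its scheme, e.g. 'https://skypilot.mydomain.com'.
    """
    parsed = urlparse(server_url)
    if parsed.scheme not in ('http', 'https'):
        raise ValueError(f'Unsupported API server URL scheme: {server_url!r}')
    # https maps to wss (TLS), plain http maps to ws.
    ws_scheme = 'wss' if parsed.scheme == 'https' else 'ws'
    # Keep any path prefix the server is exposed under, minus a trailing slash.
    base_path = parsed.path.rstrip('/')
    query = urlencode({'cluster_name': cluster_name})
    return (f'{ws_scheme}://{parsed.netloc}{base_path}'
            f'/kubernetes-pod-ssh-proxy?{query}')
```

With this, an API server reached over https:// would be dialled with wss:// directly, so no redirect from the ingress would be needed, while a plain http:// endpoint would keep using ws://.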

Thank you :)
