Description
If you start the scheduler and accidentally open the TCP comm port in a browser instead of the dashboard port, you get a confusing pickle message in the browser and an exception in the scheduler.

$ dask scheduler
2024-10-24 12:36:30,908 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 12:36:31,259 - distributed.scheduler - INFO - State start
2024-10-24 12:36:31,262 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 12:36:31,263 - distributed.scheduler - INFO - Scheduler at: tcp://10.51.100.43:8786
2024-10-24 12:36:31,263 - distributed.scheduler - INFO - dashboard at: http://10.51.100.43:8787/status
2024-10-24 12:36:31,263 - distributed.scheduler - INFO - Registering Worker plugin shuffle
2024-10-24 12:36:34,608 - tornado.application - ERROR - Exception in callback functools.partial(<function TCPServer._handle_connection.<locals>.<lambda> at 0x7f26e4194ae0>, <Task finished name='Task-254' coro=<BaseTCPListener._handle_stream() done, defined at /home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py:654> exception=MemoryError((6073139484287059271,), dtype('uint8'))>)
Traceback (most recent call last):
File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/ioloop.py", line 750, in _run_callback
ret = callback()
^^^^^^^^^^
File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/tcpserver.py", line 387, in <lambda>
gen.convert_yielded(future), lambda f: f.result()
^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 666, in _handle_stream
await self.on_connection(comm)
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 288, in on_connection
return await super().on_connection(comm, handshake_overrides)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 267, in on_connection
handshake = await comm.read()
^^^^^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 227, in read
frames_nosplit = await read_bytes_rw(stream, frames_nosplit_nbytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 359, in read_bytes_rw
buf = host_array(n)
^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/protocol/utils.py", line 29, in host_array
return numpy.empty((n,), dtype="u1").data
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 5.27 EiB for an array with shape (6073139484287059271,) and data type uint8
2024-10-24 12:36:34,649 - tornado.application - ERROR - Exception in callback functools.partial(<function TCPServer._handle_connection.<locals>.<lambda> at 0x7f26e4194cc0>, <Task finished name='Task-257' coro=<BaseTCPListener._handle_stream() done, defined at /home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py:654> exception=MemoryError((8530211521808319815,), dtype('uint8'))>)
Traceback (most recent call last):
File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/ioloop.py", line 750, in _run_callback
ret = callback()
^^^^^^^^^^
File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/tcpserver.py", line 387, in <lambda>
gen.convert_yielded(future), lambda f: f.result()
^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 666, in _handle_stream
await self.on_connection(comm)
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 288, in on_connection
return await super().on_connection(comm, handshake_overrides)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 267, in on_connection
handshake = await comm.read()
^^^^^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 227, in read
frames_nosplit = await read_bytes_rw(stream, frames_nosplit_nbytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 359, in read_bytes_rw
buf = host_array(n)
^^^^^^^^^^^^^
File "/home/jtomlinson/Projects/dask/distributed/distributed/protocol/utils.py", line 29, in host_array
return numpy.empty((n,), dtype="u1").data
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 7.40 EiB for an array with shape (8530211521808319815,) and data type uint8
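A note on where those allocation sizes seem to come from. The traceback shows `comm.read()` passing a length (`frames_nosplit_nbytes`) to `host_array(n)`. If that length is taken from the first eight bytes on the wire as an unsigned 64-bit integer (an assumption based on the traceback, with little-endian byte order assumed as on x86), then the start of a browser's HTTP request decodes to exactly the numbers in the log. The second request path is a guess: any path starting with `/fav`, such as `/favicon.ico`, would give the same value.

```python
import struct

# First eight bytes a browser would send for two typical requests.
# The favicon path is an assumption; only the "/fav" prefix matters here.
first_request = b"GET / HTTP/1.1\r\n"[:8]             # b"GET / HT"
second_request = b"GET /favicon.ico HTTP/1.1\r\n"[:8]  # b"GET /fav"

# Interpret those bytes as a little-endian unsigned 64-bit length.
(n_first,) = struct.unpack("<Q", first_request)
(n_second,) = struct.unpack("<Q", second_request)

EIB = 2**60  # bytes per exbibyte

print(n_first, f"{n_first / EIB:.2f} EiB")    # 6073139484287059271 5.27 EiB
print(n_second, f"{n_second / EIB:.2f} EiB")  # 8530211521808319815 7.40 EiB
```

So the scheduler is most likely treating the ASCII text `GET / HT` as a frame length and trying to allocate a buffer of that size.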
This is somewhat expected, since that port is not meant to be opened in a browser. But it would be nice if it failed more gracefully.
It would be interesting to see whether we can detect an HTTP connection and behave differently. For example, we could show a clearer error in the browser and avoid the code path that raises the exception in the scheduler.
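As a rough sketch of what detection could look like: check whether the first bytes of a new connection look like an HTTP request line, and if so reply with a short plain-text HTTP error instead of parsing them as a frame header. The names `looks_like_http` and `http_error_response` are hypothetical and not part of distributed; how this would hook into the TCP comm's read path is left open.

```python
# Hypothetical helpers for illustration only; nothing here is distributed's API.

# Common HTTP methods followed by a space. Each is at most 8 bytes long,
# so an 8-byte prefix of the stream is enough to match against them.
HTTP_METHOD_PREFIXES = (
    b"GET ",
    b"HEAD ",
    b"POST ",
    b"PUT ",
    b"DELETE ",
    b"OPTIONS ",
    b"PATCH ",
)


def looks_like_http(prefix: bytes) -> bool:
    """Return True if the first bytes of a stream resemble an HTTP request line."""
    return prefix.startswith(HTTP_METHOD_PREFIXES)


def http_error_response(dashboard_url: str = "") -> bytes:
    """Build a minimal HTTP/1.1 400 response explaining the mistake."""
    body = "This port speaks the Dask TCP comm protocol, not HTTP.\n"
    if dashboard_url:
        body += f"The dashboard is at {dashboard_url}\n"
    payload = body.encode("utf-8")
    head = (
        "HTTP/1.1 400 Bad Request\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + payload


# Usage: decide what to do with the first eight bytes of a connection.
prefix = b"GET / HT"
if looks_like_http(prefix):
    response = http_error_response("http://10.51.100.43:8787/status")
    # A real server would write `response` to the socket and close it.
```

A simpler complementary guard, also just an idea, would be to reject any declared frame length above some sane maximum before allocating, which would turn the `MemoryError` into a clean connection close even for non-HTTP garbage.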