Opening the TCP comm port in a browser results in an exception #8905

Open
@jacobtomlinson

If you start the scheduler and accidentally open the TCP comm port in a browser instead of the dashboard port, you get a confusing pickle message in the browser and an exception in the scheduler.

[Screenshot: the pickle message shown in the browser]
$ dask scheduler
2024-10-24 12:36:30,908 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 12:36:31,259 - distributed.scheduler - INFO - State start
2024-10-24 12:36:31,262 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 12:36:31,263 - distributed.scheduler - INFO -   Scheduler at:   tcp://10.51.100.43:8786
2024-10-24 12:36:31,263 - distributed.scheduler - INFO -   dashboard at:  http://10.51.100.43:8787/status
2024-10-24 12:36:31,263 - distributed.scheduler - INFO - Registering Worker plugin shuffle
2024-10-24 12:36:34,608 - tornado.application - ERROR - Exception in callback functools.partial(<function TCPServer._handle_connection.<locals>.<lambda> at 0x7f26e4194ae0>, <Task finished name='Task-254' coro=<BaseTCPListener._handle_stream() done, defined at /home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py:654> exception=MemoryError((6073139484287059271,), dtype('uint8'))>)
Traceback (most recent call last):
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/ioloop.py", line 750, in _run_callback
    ret = callback()
          ^^^^^^^^^^
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/tcpserver.py", line 387, in <lambda>
    gen.convert_yielded(future), lambda f: f.result()
                                           ^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 666, in _handle_stream
    await self.on_connection(comm)
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 288, in on_connection
    return await super().on_connection(comm, handshake_overrides)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 267, in on_connection
    handshake = await comm.read()
                ^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 227, in read
    frames_nosplit = await read_bytes_rw(stream, frames_nosplit_nbytes)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 359, in read_bytes_rw
    buf = host_array(n)
          ^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/protocol/utils.py", line 29, in host_array
    return numpy.empty((n,), dtype="u1").data
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 5.27 EiB for an array with shape (6073139484287059271,) and data type uint8
2024-10-24 12:36:34,649 - tornado.application - ERROR - Exception in callback functools.partial(<function TCPServer._handle_connection.<locals>.<lambda> at 0x7f26e4194cc0>, <Task finished name='Task-257' coro=<BaseTCPListener._handle_stream() done, defined at /home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py:654> exception=MemoryError((8530211521808319815,), dtype('uint8'))>)
Traceback (most recent call last):
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/ioloop.py", line 750, in _run_callback
    ret = callback()
          ^^^^^^^^^^
  File "/home/jtomlinson/miniconda3/envs/dask/lib/python3.11/site-packages/tornado/tcpserver.py", line 387, in <lambda>
    gen.convert_yielded(future), lambda f: f.result()
                                           ^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 666, in _handle_stream
    await self.on_connection(comm)
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 288, in on_connection
    return await super().on_connection(comm, handshake_overrides)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/core.py", line 267, in on_connection
    handshake = await comm.read()
                ^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 227, in read
    frames_nosplit = await read_bytes_rw(stream, frames_nosplit_nbytes)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/comm/tcp.py", line 359, in read_bytes_rw
    buf = host_array(n)
          ^^^^^^^^^^^^^
  File "/home/jtomlinson/Projects/dask/distributed/distributed/protocol/utils.py", line 29, in host_array
    return numpy.empty((n,), dtype="u1").data
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 7.40 EiB for an array with shape (8530211521808319815,) and data type uint8

This is somewhat expected, because you shouldn't open that port in a browser, but it would be nice if things failed more gracefully.
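For what it's worth, the allocation sizes in the traceback look like the start of the browser's HTTP request being read as a frame length. The sketch below assumes the length prefix is an unsigned 64-bit little-endian integer; `as_length_prefix` is a hypothetical helper for illustration, not a function in `distributed`, and the second request being for `/favicon.ico` is a guess that happens to match the number in the log.

```python
import struct


def as_length_prefix(first_bytes: bytes) -> int:
    """Interpret the first 8 bytes of a stream as a little-endian uint64."""
    (n,) = struct.unpack("<Q", first_bytes[:8])
    return n


# What a browser would typically send first when pointed at the comm port.
root_request = as_length_prefix(b"GET / HTTP/1.1\r\n")
# Assumed follow-up request; only the first 8 bytes ("GET /fav") matter.
favicon_request = as_length_prefix(b"GET /favicon.ico HTTP/1.1\r\n")

print(root_request)     # 6073139484287059271
print(favicon_request)  # 8530211521808319815
print(f"{root_request / 2**60:.2f} EiB")     # 5.27 EiB
print(f"{favicon_request / 2**60:.2f} EiB")  # 7.40 EiB
```

Both values match the array shapes reported in the two `_ArrayMemoryError` messages above, which suggests the scheduler is trying to allocate a buffer sized by the ASCII bytes `GET / HT` and `GET /fav`.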

It would be interesting to see if we can detect an HTTP connection and behave differently. For example, we could try to show a better error in the browser and avoid going down the code path that raises the exception in the scheduler.
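One possible shape for that detection, as a rough sketch only: inspect the first bytes read from the stream and, if they look like an HTTP request line, reply with a short plain-text error instead of treating them as a frame length. The names `looks_like_http` and `plain_text_response` are hypothetical and nothing here is existing `distributed` API; where this check would hook into the comm handshake is left open.

```python
# HTTP request methods, each followed by the space that starts the request line.
HTTP_METHODS = (
    b"GET ", b"HEAD ", b"POST ", b"PUT ", b"DELETE ",
    b"OPTIONS ", b"PATCH ", b"CONNECT ", b"TRACE ",
)


def looks_like_http(prefix: bytes) -> bool:
    """Return True if the first bytes of a stream resemble an HTTP request line."""
    return prefix.startswith(HTTP_METHODS)


def plain_text_response(message: str, status: str = "400 Bad Request") -> bytes:
    """Build a minimal HTTP/1.1 response carrying a plain-text explanation."""
    body = message.encode("utf-8")
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Example: the 8 bytes a browser sends first versus a small, realistic frame length.
first_bytes = b"GET / HT"
if looks_like_http(first_bytes):
    reply = plain_text_response(
        "This is the scheduler's TCP comm port, not the dashboard.\n"
    )
    print(reply.decode("utf-8"))

print(looks_like_http((2).to_bytes(8, "little")))  # False
```

A real length prefix that begins with one of these ASCII method names would correspond to a multi-exabyte message, so a false positive seems unlikely in practice, though that is an assumption rather than something verified against the protocol.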
