gather(direct=True) may fall back to direct=False

The current implementation for

-  `Client(...).gather(..., direct=True)` 
- `Client(direct_to_workers=True).gather(...)`
- `get_client().gather(...)` (worker clients)
 
is that, if *anything* bad happens on the direct connection to the workers, the keys that failed to be fetched directly are fetched through the scheduler `(direct=False)` instead.

I believe that the intent of this design is to build resilience when the user specifies direct=True while in fact they have a firewall preventing comms.

First, I think even the original intent is not a good idea to begin with - most firewalls will just silently drop the packets, thus causing the user to wait ~30 seconds for timeout. If the workflow takes e.g. ~2 minutes to compute, the user may not realise there's a substantial slowdown.

Second, there are many cases where a key will fail to be gathered on the first try - notably, the who_has dict can fall out of sync with the scheduler, or a worker's GIL can be temporarily clogged. This may cause a workflow that gathers huge amounts of data from workers to client to run fine *most times* and then, once in a while, kill off the scheduler due to memory pressure. This is particularly true when the client mounts vastly more RAM than the scheduler - which is not an uncommon configuration, particularly when worker_clients are involved.

I suggest we drop this feature completely. If workers are permanently unreachable with direct=True due to network configuration, gather(direct=True) should just not work. Note that the default is direct=False.

CC @fjetter @hendrikmakait 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

gather(direct=True) may fall back to direct=False #7993

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

gather(direct=True) may fall back to direct=False #7993

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions