Queuing does not prevent root task overproduction unless you have enough tasks #7273

Open
@gjoseph92

Description

Queuing #6614 is meant to prevent root task overproduction #5555. And it's shown to be very effective at doing so: #7128.

However, due to the heuristic for what counts as a "root-ish" task, queuing will only stop root task overproduction when you have more than total_nthreads * 2 root tasks.

Overproduction can occur any time there are more than total_nthreads root tasks. So in the middle case (more than total_nthreads but at most total_nthreads * 2 root tasks), queuing won't kick in and the worker-saturation value won't be respected.

This would be confusing behavior for users. If you make your problem size smaller, or make your cluster bigger (two things you'd expect to reduce per-worker memory usage), you may cross an opaque magic threshold at which your workload suddenly uses up to 2x more memory.
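To make the gap concrete, here is a minimal sketch of the threshold mismatch. The `is_rootish` helper below is a simplified stand-in that only models the group-size check from the heuristic (the dependency-count conditions are omitted); it is not the actual `SchedulerState` implementation.

```python
def is_rootish(group_size: int, total_nthreads: int) -> bool:
    # Simplified: queuing only engages when the task group is larger
    # than 2x the total thread count (dependency checks omitted).
    return group_size > total_nthreads * 2


total_nthreads = 100

# 150 root tasks on 100 threads: overproduction is possible (150 > 100)...
can_overproduce = 150 > total_nthreads

# ...but the heuristic does not classify the group as root-ish
# (150 is not > 200), so queuing never kicks in.
queued = is_rootish(150, total_nthreads)

print(can_overproduce, queued)  # True False
```

With the proposed change (`group_size > total_nthreads`), the same 150-task group would be classified as root-ish and queued.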

EDIT:

To be clear, I propose a two-character change to fix this. Just drop the * 2 part:

diff --git a/distributed/scheduler.py b/distributed/scheduler.py
index b99e3f19..df20e807 100644
--- a/distributed/scheduler.py
+++ b/distributed/scheduler.py
@@ -3033,7 +3033,7 @@ class SchedulerState:
         tg = ts.group
         # TODO short-circuit to True if `not ts.dependencies`?
         return (
-            len(tg) > self.total_nthreads * 2
+            len(tg) > self.total_nthreads
             and len(tg.dependencies) < 5
             and sum(map(len, tg.dependencies)) < 5
         )

The * 2 is a number @mrocklin and I just made up back in #4967; there wasn't any benchmarking or empirical reason for it. Simply requiring more tasks than nthreads is more logical and easier to justify.
