Description
Corollary to #6038. In that issue, I described a situation where workers thought a key (which most tasks depended on) had 82 replicas, but in reality it only had 1.
This issue is about the fact that `ReduceReplicas` maybe shouldn't try to delete copies of that critical key so aggressively.
```
*  *  *  *  *  *
 \  \  \ /  /  /
      x    y
```
In this case `x` and `y` are going to be reused by every task, so they will end up having replicas on most workers. Constantly deleting them is inefficient: as soon as you delete a replica, the next task that wants to run on that worker is going to have to transfer it back again.
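To make the graph shape concrete, here's a minimal sketch of how a graph like the one above might be built with `dask.delayed` (the `load`/`process` functions are illustrative, not from the issue):

```python
import dask

@dask.delayed
def load(name):
    return name  # stand-in for an expensive-to-transfer dependency

@dask.delayed
def process(a, b, i):
    return (a, b, i)  # each "*" task needs both x and y as inputs

x = load("x")
y = load("y")
stars = [process(x, y, i) for i in range(6)]
# Every worker that runs one of the "*" tasks must hold a replica of
# both x and y, so replicas naturally spread across the cluster.
dask.compute(*stars)
```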
(Of course, once most of the `*` tasks are done, then you should start reducing replicas. But while the cluster is fully saturated with `*` tasks, there's no benefit to doing this.)
I'm not sure what metric to use for this. Ideas explored in #4967, #5325, #5326 could be interesting here.
Really, this issue is just about how to calculate a smarter target for this `desired_replicas` count automatically, based on the task's `waiters`, the number of current workers, etc.
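As a rough sketch of what such a heuristic could look like (this is hypothetical, not the actual AMM code; the only pieces taken from the issue are `desired_replicas`, `waiters`, and the worker count):

```python
from types import SimpleNamespace

def desired_replicas(ts, n_workers: int) -> int:
    """Hypothetical heuristic: while tasks are still waiting on this key,
    tolerate up to one replica per worker that could run them; once
    nothing is waiting, fall back to a single replica.

    ``ts`` is assumed to expose a ``waiters`` set, as the scheduler-side
    TaskState does.
    """
    n_waiters = len(ts.waiters)
    if n_waiters == 0:
        return 1  # fan-out is done; safe to reduce aggressively
    # While many tasks still depend on the key, one replica per worker
    # that is likely to run a waiter avoids needless re-transfers.
    return min(n_waiters, n_workers)

# Toy usage: a key with 6 waiting "*" tasks on a 4-worker cluster
print(desired_replicas(SimpleNamespace(waiters={f"star-{i}" for i in range(6)}), 4))  # -> 4
print(desired_replicas(SimpleNamespace(waiters=set()), 4))  # -> 1
```

With something along these lines, `ReduceReplicas` would leave `x` and `y` alone while the `*` tasks are still running, then collapse them back to a single replica once their waiters drain.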