Recommendation Requested / Observed mismatch #8119

Open
@Vesyrak

Description

When the adaptive core receives a target value that recommends a scale-down, it always appears to pick the first worker, as determined here:

not_yet_arrived = requested - observed

This is because requested and observed contain completely different kinds of data.
requested holds the worker names, which the cluster assigns as an incrementing integer.

new_worker_name = self._new_worker_name(self._i)

However, the observed names, i.e. the names the scheduler receives from the workers, appear to be the workers' addresses.
Because the two sets are compared directly and never overlap, the adaptive core concludes that it is still waiting for every requested worker, and treats them as not-yet-arrived workers that are safe to kill. This is counterproductive: the adaptive algorithm ends up killing workers based on ordering rather than on idleness.
[Image: Screenshot 2023-08-21 at 10 25 37]
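
The mismatch can be reproduced with plain Python sets. The following is only a minimal sketch, not the code in distributed: the helper names `not_yet_arrived` and `pick_to_close`, the worker addresses, and the way candidates are taken from the pending set are all assumptions made for illustration.

```python
from itertools import islice


def not_yet_arrived(requested: set, observed: set) -> set:
    """Workers that were requested but have not been observed yet."""
    return requested - observed


def pick_to_close(requested: set, observed: set, n_to_remove: int) -> set:
    """Hypothetical helper: take up to n_to_remove workers from the pending set."""
    pending = not_yet_arrived(requested, observed)
    return set(islice(pending, n_to_remove))


# Names as the cluster hands them out: incrementing integers (illustrative values).
requested = {0, 1, 2}

# Names as reported back: worker addresses (made-up addresses).
observed = {"tcp://10.0.0.1:40001", "tcp://10.0.0.2:40002", "tcp://10.0.0.3:40003"}

# The sets never overlap, so every requested worker looks like it has not arrived.
pending = not_yet_arrived(requested, observed)
print(pending == requested)  # True

# A scale-down by one therefore picks from the requested names,
# regardless of which worker is actually idle.
to_close = pick_to_close(requested, observed, 1)
print(to_close <= requested)  # True

# For contrast: if both sets used the same identifiers, nothing would be pending.
observed_by_name = {0, 1, 2}
print(not_yet_arrived(requested, observed_by_name))  # set()
```

In the last case the pending set is empty, so a scale-down would have to fall back on some other criterion, such as idleness, to choose which workers to close.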

Environment:

  • Dask version: 2023.3.0
  • Python version: 3.11
  • Operating System: Ubuntu 22.04 (docker)
  • Install method (conda, pip, source): pip

Labels: adaptive, bug