Repeated calls to memory_color take around 12% of CPU time of scheduler #8763

Open
@jonded94

Description

Describe the issue:

A follow-up to #8761.

After fixing the above issue in #8762, the next big CPU consumer on a scheduler with many workers (>2000) is the repeated calls to _cluster_memory_color, more specifically _memory_color.

def _cluster_memory_color(self) -> str:

As far as I can see, this is about coloring the memory bar of a specific worker depending on whether its memory usage is deemed "good", "almost full" or "full".

Again, a speedscope profile (this was captured without the fix from PR 8762):

[speedscope flame graph screenshot]

speedscope.json

Is this something that could be solved by binning the memory load & size (surely the coloring doesn't have to be so exact that it has to be based on exact byte counts) and caching the result of this memory coloring process too?

Surely one doesn't have to recalculate which color a worker process with, for example, 1024/4096 MiB RAM should have hundreds of times per second, especially since the coloring result doesn't change at all.
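To illustrate the proposed binning-plus-caching idea, here is a minimal sketch. The bin count, thresholds, and color names are assumptions for illustration, not distributed's actual values; the point is only that quantizing the used/total ratio makes the color computation memoizable, so byte-level fluctuations reuse a cached result instead of recomputing:

```python
from functools import lru_cache

BINS = 100  # ~1% resolution; more than enough precision for a dashboard color


@lru_cache(maxsize=BINS + 1)
def _memory_color_binned(ratio_bin: int) -> str:
    # Hypothetical thresholds mimicking "good" / "almost full" / "full";
    # the real thresholds and colors in distributed may differ.
    ratio = ratio_bin / BINS
    if ratio >= 0.95:
        return "red"     # full
    if ratio >= 0.70:
        return "orange"  # almost full
    return "blue"        # good


def memory_color(used: int, total: int) -> str:
    """Quantize the memory ratio to a bin, then look up the cached color."""
    if total <= 0:
        return "blue"
    ratio_bin = min(BINS, int(used / total * BINS))
    return _memory_color_binned(ratio_bin)
```

With at most BINS + 1 distinct inputs, every call after warm-up is a dictionary lookup, regardless of how many workers report slightly different byte counts.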

Environment:

  • Dask version: 2024.7.0
  • Python version: 3.10
  • Operating System: Linux, Debian
  • Install method (conda, pip, source): poetry / pip
