Scale testing #23

Open
@JoshKarpel

Description

@bbockelm reports two possible issues observed by the Coffea team: problems when scaling past ~50 workers with TLS, and workers being killed during automatic scale-down while still holding useful results in memory. We should investigate both on our setup and see if we can reproduce them.

I'm also interested in testing overall stability during large or long-running calculations by manually killing workers and checking whether Dask recovers gracefully, as it claims to.
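
As a possible starting point for the scale-down question, here is a minimal local sketch. It is not a reproduction of the reported problem: it assumes a small threads-only `LocalCluster` with no TLS and far fewer than 50 workers, and it retires a worker gracefully through `Client.retire_workers` rather than going through our real auto-scaling path. The names `square` and `results_survive_retirement` are hypothetical helpers made up for this example.

```python
from dask.distributed import Client, LocalCluster


def square(x):
    return x * x


def results_survive_retirement(n_workers=3, n_tasks=30):
    """Retire one worker that holds results, then check they can still be gathered."""
    # Assumption: an in-process, threads-only cluster stands in for the real deployment.
    with LocalCluster(
        n_workers=n_workers,
        threads_per_worker=1,
        processes=False,
        dashboard_address=None,  # no dashboard needed for this check
    ) as cluster, Client(cluster) as client:
        futures = client.map(square, range(n_tasks))
        expected = [square(i) for i in range(n_tasks)]

        # Gathering once makes sure every result is finished and sitting in worker memory.
        assert client.gather(futures) == expected

        # Pick any worker that currently holds at least one of the results.
        who_has = client.who_has(futures)
        victim = next(next(iter(addrs)) for addrs in who_has.values() if addrs)

        # Graceful scale-down: ask the scheduler to retire and close that worker.
        client.retire_workers(workers=[victim], close_workers=True)

        # We still hold the futures, so the results should remain retrievable.
        return client.gather(futures) == expected
```

Calling `results_survive_retirement()` should return `True` if results held by the retired worker are still available afterwards. Abruptly killing a worker (the second test described above) would need process-based workers and is not covered by this sketch.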

Labels: bug (Something isn't working)
