Several tests in test_shuffle.py are very flaky.
If I change .github/workflows/tests.yaml as follows, so that each environment runs the tests 20 times (10 for ci1 plus 10 for not ci1):
pytest distributed/shuffle/tests/test_shuffle.py --count=10 --runslow \
--leaks=...
I get the following failure counts:
| test | n. failures |
|---|---|
| distributed/shuffle/tests/test_shuffle.py::test_clean_after_close | 1 |
| distributed/shuffle/tests/test_shuffle.py::test_closed_input_only_worker_during_transfer | 1 |
| distributed/shuffle/tests/test_shuffle.py::test_closed_worker_during_transfer | 29 |
| distributed/shuffle/tests/test_shuffle.py::test_crashed_worker_during_transfer | 6 |
| distributed/shuffle/tests/test_shuffle.py::test_restarting_during_transfer_raises_killed_worker | 38 |
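
For anyone who wants to reproduce a tally like this from downloaded CI logs, here is a minimal sketch. It is not necessarily how the numbers above were collected. It assumes the logs contain pytest's short summary lines of the form `FAILED <node id> - <message>`, and that each repeated run carries a trailing parametrization suffix such as `[3-10]` on the node id. The helper name `count_failures` and the sample log are made up for illustration.

```python
import re
from collections import Counter

# Assumption: failures appear as pytest short-summary lines, e.g.
#   FAILED path/to/test_file.py::test_name[3-10] - AssertionError
# search() is used so that a timestamp prefix added by the CI log is tolerated.
FAILED_LINE = re.compile(r"FAILED\s+(\S+::\S+)")

# Trailing "[...]" suffix on the node id (assumed to identify the repetition).
PARAM_SUFFIX = re.compile(r"\[[^\]]*\]$")


def count_failures(log_text: str) -> Counter:
    """Count FAILED summary lines per test, ignoring the trailing [..] suffix."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_LINE.search(line)
        if match:
            # Note: this also merges any other parametrized variants of a test.
            node_id = PARAM_SUFFIX.sub("", match.group(1))
            counts[node_id] += 1
    return counts


# Hypothetical log excerpt, for illustration only.
sample_log = """\
FAILED tests/test_example.py::test_a[1-10] - TimeoutError
FAILED tests/test_example.py::test_a[7-10] - TimeoutError
FAILED tests/test_example.py::test_b[2-10] - AssertionError
"""

for node_id, n_failures in sorted(count_failures(sample_log).items()):
    print(f"{node_id} | {n_failures} |")
```

Concatenating the logs of all jobs before passing them to `count_failures` would give one total per test across environments.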
Additionally, test_crashed_worker_during_transfer deadlocks on Windows in a way that's irrecoverable, causing the whole test suite to be killed by the code at lines 155 to 162 in eb297b3.
Logs: https://github.com/crusaderky/distributed/actions/runs/5761255813