Description
Discovered/verified in #814
Basically apparently it can happen that we generate the correct report but override it with an outdated one. Discovered running the parallel tests part of the features introduced in #841 (currently tagged wip to avoid flakies, remove the wip tag if you want to test this out). It doesn't happen reliably. Might happen 3 times in a row or not at all for 20 runs (for me at least).
I gathered some examples here: #814 (comment)
What you can see there is something like this:
Coverage report generated for (2/8) to /home/tobi/github/simplecov/tmp/aruba/project/coverage. 23 / 33 LOC (69.7%) covered.
Coverage report generated for (1/8), (2/8), (3/8) to /home/tobi/github/simplecov/tmp/aruba/project/coverage. 37 / 43 LOC (86.05%) covered.
Coverage report generated for (1/8), (2/8), (3/8), (4/8) to /home/tobi/github/simplecov/tmp/aruba/project/coverage. 42 / 47 LOC (89.36%) covered.
Coverage report generated for (1/8), (2/8) to /home/tobi/github/simplecov/tmp/aruba/project/coverage. 32 / 39 LOC (82.05%) covered.
Notice that the second too last has the right number of loc and coverage expected in the test and also seems to take everything into account (unsure why it says n/8 though, guess: I could run 8 in parallel but parallel tests only spawned 4 as there are only 4 test files) but is seemingly later overridden by an outdated report.
If possible we should of course find and fix that race condition, and then remove the wip tags to always run these tests!