Description
Many lock-free implementations use some randomness as an easy way to load balance without coordination. This may be a source of nondeterminism and DSCheck does not like it. For example, say we have a test that begins with two threads wanting to put two elements into random slots in an array.
Run:
- Thread A draws slot 3 and puts item there
- Thread B draws slot 3 and fails
- Thread B retries; draws slot 5 and puts item there
..
..
more steps
If DSCheck finds some races after step 3, it will keep rerunning the sequence 1-3 to explore events of interest that occur further down the sequence. But on the consecutive runs, thread B may succeed on the first try, yielding a different state. In the positive case, B will become disabled before its schedule and DSCheck crashes explicitly. In the more scary one, we may give a positive result without actually having explored all the interesting interleavings.
It's an easy mistake to make and a pretty difficult one to find. Perhaps, DSCheck should choose some random seed, and then reinitialize OCaml's prng to this value at the beginning of every run?
cc @lyrm, who's observed this issue in pratice, with dscheck crashing uncontrollably on ocaml-multicore/saturn#65
Activity