Description
While Weave currently does a very good job of restricting shared writable state to channels, and specifically to the trySend and tryRecv routines, we need tooling and tests to detect races and concurrency heisenbugs.
Unfortunately, this is apparently an NP-hard problem. To ease debugging, we need to reliably trigger a bug so we can confirm it is fixed. However, thread interleaving is non-deterministic, and we can't ask people to use our own deterministic fork of Windows, Linux or Mac.
Also, allocating and freeing memory willy-nilly from many threads might overwhelm the allocator in use, or trigger bugs in it (if we use the Nim allocator), so for testing it is probably better to never free memory, which avoids allocator bugs and slowness (b8ac8d6).
There are a few approaches that can be taken, with varying degrees of impracticality.
Sanitizers
This category only requires recompiling with extra flags, sometimes just --debugger:native.
While they can detect the presence of bugs, I don't think they can prove the absence of one.
valgrind --tool=helgrind build/mybinary
http://valgrind.org/docs/manual/hg-manual.html
POSIX-only. With --debugger:native it reports the Nim lines that are potentially racy.
It slows down the code a lot.
There is also some noise on memset and memcpy (not sure whether it still happens if you never free memory).
LLVM ThreadSanitizer
- https://clang.llvm.org/docs/ThreadSanitizer.html
- https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual
- slides: https://llvm.org/devmtg/2012-11/Serebryany_TSan-MSan.pdf
Compile with Clang, passing -fsanitize=thread
Libraries
Libraries can exhaustively test all possible thread interleavings for a given test scenario, say an MPSC queue with 2 producers and 1 consumer. We do, however, have to assume that being bug-free for 2 producers means being bug-free for N (which has been proved for some MPSC queue implementations).
Important: the library used should ideally support the C++11 memory model and/or ensure correctness on architectures with a weak memory model (i.e. everything that is not x86, such as ARM, PowerPC, MIPS).
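To make "exhaustively test all interleavings" concrete, here is a minimal Python sketch of the idea. It is a toy model, not how Relacy, CHESS or Landslide are implemented: each simulated thread performs a non-atomic increment split into two steps (load, then store), the helper names `interleavings` and `run` are made up for illustration, and the model assumes sequential consistency, so it says nothing about weak memory models.

```python
from itertools import permutations

def interleavings(steps_per_thread):
    """Return every schedule: a tuple of thread ids in which thread i appears
    steps_per_thread[i] times. The k-th occurrence of an id is that thread's
    k-th step, so per-thread program order is always preserved."""
    ids = [tid for tid, n in enumerate(steps_per_thread) for _ in range(n)]
    # permutations() treats equal ids as distinct elements, hence the set().
    return sorted(set(permutations(ids)))

def run(schedule):
    """Replay one schedule of a two-step (load, store) increment per thread
    and return the final value of the shared counter."""
    shared = 0
    loaded = {}  # per-thread "register" holding the value it loaded
    for tid in schedule:
        if tid not in loaded:
            loaded[tid] = shared        # step 1: load the shared counter
        else:
            shared = loaded[tid] + 1    # step 2: store a possibly stale value + 1
    return shared

schedules = interleavings([2, 2])       # 2 threads, 2 steps each
lost = [s for s in schedules if run(s) != 2]
print(f"{len(lost)} of {len(schedules)} schedules lose an update")
for s in lost:
    print("  failing schedule:", s)
```

Each failing schedule is a reproducible recipe for the bug, which is exactly what non-deterministic OS scheduling does not give us. The number of schedules grows combinatorially with threads and steps, which is why real tools prune the search.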
Relacy Race Detector
- https://github.com/dvyukov/relacy
- http://www.1024cores.net/home/relacy-race-detector
- http://www.1024cores.net/home/relacy-race-detector/rrd-introduction
We can switch Nim atomics to Relacy atomics via templates and recompile.
Chess
Microsoft, Windows-only
- https://channel9.msdn.com/Shows/Going+Deep/CHESS-An-Automated-Concurrency-Testing-Tool
- https://www.microsoft.com/en-us/download/details.aspx?id=52619&from=https%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fdownloads%2Fb23f8dc3-bb73-498f-bd85-1de121672e69%2F
- Paper: Finding and Reproducing Heisenbugs in Concurrent Programs
- Publications: https://www.microsoft.com/en-us/research/project/chess-find-and-reproduce-heisenbugs-in-concurrent-programs/#!publications
- https://nebelwelt.net/teaching/15-510-SE/slides/16-testing3-chess.pdf
Landslide
https://github.com/bblum/landslide
https://www.pdl.cmu.edu/Landslide/index.shtml
See the extensive PhD thesis listed under Resources at the bottom.
Do-it-yourself
The blog post for MultithreadedTC, a verifier for the JVM, explains in depth how it is architected: an internal metronome clock syncs all threads, and then at synchronization points it tests all combinations of thread interleavings.
http://www.cs.umd.edu/projects/PL/multithreadedtc/overview.html
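As a rough illustration of the metronome idea, the Python sketch below uses a shared tick counter to force real threads into one specific interleaving, so that a lost-update bug triggers on every run. The `Metronome` class and `run_lost_update` function are hypothetical names for this sketch. It also simplifies the design: as far as I understand, MultithreadedTC advances its clock automatically once all threads are blocked, whereas here the threads advance it explicitly.

```python
import threading

class Metronome:
    """Shared tick counter; threads block until a given tick is reached."""

    def __init__(self):
        self._tick = 0
        self._cond = threading.Condition()

    def wait_for_tick(self, n):
        # Block the calling thread until the clock has reached tick n.
        with self._cond:
            self._cond.wait_for(lambda: self._tick >= n)

    def advance(self):
        # Move the clock forward by one tick and wake all waiting threads.
        with self._cond:
            self._tick += 1
            self._cond.notify_all()

def run_lost_update():
    """Force the interleaving: A loads, B increments, A stores a stale value."""
    clock = Metronome()
    shared = {"counter": 0}

    def thread_a():
        tmp = shared["counter"]       # tick 0: load the counter (0)
        clock.advance()               # tick 1: let B run
        clock.wait_for_tick(2)        # wait until B has finished
        shared["counter"] = tmp + 1   # store the stale value + 1

    def thread_b():
        clock.wait_for_tick(1)        # run only after A has loaded
        shared["counter"] += 1        # full increment in between A's steps
        clock.advance()               # tick 2: let A continue

    threads = [threading.Thread(target=f) for f in (thread_a, thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]

print(run_lost_update())  # 1 instead of 2: B's update is lost on every run
```

Because the ticks fully order the steps, the outcome does not depend on which thread the OS happens to start first.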
Model Checking and Formal verification
This is heavyweight and requires either using a foreign language or adding a lot of annotations and constraints to the source code. In exchange, it provides mathematical guarantees of correctness:
VCC
Annotate C code and it will be passed to Z3
- http://moskal.me/pdf/tphol2009.pdf
- https://www.microsoft.com/en-us/research/project/vcc-a-verifier-for-concurrent-c/
Iris
TLA+
Spin
Resources
- PhD Thesis, Ben Blum: Practical Concurrency Testing, or: How I Learned to Stop Worrying and Love the Exponential Explosion
  http://reports-archive.adm.cs.cmu.edu/anon/2018/CMU-CS-18-128.pdf