Description
For a while now, sv-benchmarks runs have been showing a number of spurious `SEGMENTATION FAULT` results, which occur on different tasks in every run. In our latest run, it was 157/31498 tasks (0.5%). These make BenchExec diff tables particularly annoying to look at: one has to filter out those statuses from both columns to focus on the actual differences.
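For reference, the manual filtering amounts to roughly the following sketch. It assumes the two runs have already been extracted into `(task, old status, new status)` tuples; that layout and the name `real_differences` are made up for illustration and are not part of BenchExec.

```python
SEGFAULT = "SEGMENTATION FAULT"


def real_differences(rows):
    """Keep rows whose statuses differ and where neither run segfaulted.

    `rows` is a hypothetical list of (task, old_status, new_status) tuples,
    e.g. copied out of a diff table by hand; this is not a BenchExec API.
    """
    return [
        (task, old, new)
        for task, old, new in rows
        if old != new and SEGFAULT not in (old, new)
    ]


# Made-up example rows, only to show the shape of the input.
rows = [
    ("a.yml", "true", "true"),
    ("b.yml", "true", "SEGMENTATION FAULT"),
    ("c.yml", "SEGMENTATION FAULT", "true"),
    ("d.yml", "true", "unknown"),
]
print(real_differences(rows))
```

Because the segfaults hit different tasks each time, the filter has to check both columns, not just the newer run.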
In the SV-COMP 2024 results we had zero of them, so something must have changed since then. AFAIK, segfaults in OCaml can only be due to misuse of `Obj` or bad C stubs (which we use for float domains). I opened #1371 at one point but convinced myself that our stubs were actually fine. Maybe they are not after all? But the stubs were the same at SV-COMP 2024, so the same issues should have shown up then as well.
Their spurious nature also makes them extremely difficult to debug. Since it's not any specific task, one cannot just bisect and run that task under `gdb`. Rather, a full sv-benchmarks run needs to be done at every bisect step. And that only gives the first problematic commit at best; there's still no way to get the full backtrace from `gdb` while running Goblint under BenchExec.