Description
For many bugs it is often possible to narrow things down to one (or more) test runs which succeed, and one (or more) which fail.
Sometimes the cause of the difference is in the input data, sometimes in the (version of) code used, and in some cases in the environment the code is executing in.
The next step is then to figure out exactly why this causes a difference in behavior. In dataflow/FBP that usually means following the data until it starts diverging. The tricky/tedious part is usually distinguishing the insignificant differences from the significant ones.
There might be timestamps, payload sizes etc. that continuously vary but are not of interest. In other cases there are slight differences which are functionally equivalent: the existence or value of some keys in an object might not matter at all, or the addition/removal of some elements of an array (but not others).
In general we'd need tooling that reduces the differences enough to be analyzed and understood easily/quickly by the developer. It should allow progressively tightening what is shown as one narrows down the possibilities.
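
To make the idea a bit more concrete, here is a minimal sketch (in Python) of the kind of normalize-then-diff step such tooling could perform. The event structure and the ignore rules are assumptions for illustration, not an actual flowtrace schema; the point is that the ignore set could be tightened or loosened progressively as the developer narrows things down.

```python
# Minimal sketch: normalize away insignificant fields, then find where two runs diverge.
# The event structure ("time", "timestamp", ...) and the ignore rules are assumptions
# for illustration, not an actual flowtrace schema.
import json

IGNORED_KEYS = {"time", "timestamp"}  # fields that continuously vary but are not of interest

def normalize(event, ignored=IGNORED_KEYS):
    """Strip insignificant fields so only meaningful differences remain."""
    if isinstance(event, dict):
        return {k: normalize(v, ignored) for k, v in event.items() if k not in ignored}
    if isinstance(event, list):
        return [normalize(v, ignored) for v in event]
    return event

def first_divergence(passing, failing, ignored=IGNORED_KEYS):
    """Return the index and event pair where the two runs start to diverge, or None."""
    for i, (a, b) in enumerate(zip(passing, failing)):
        if normalize(a, ignored) != normalize(b, ignored):
            return i, a, b
    return None

# Usage: load two (hypothetical) lists of trace events and report where they diverge
with open("passing.json") as f:
    passing = json.load(f)
with open("failing.json") as f:
    failing = json.load(f)

result = first_divergence(passing, failing)
if result:
    i, a, b = result
    print("Runs diverge at event", i)
    print("passing:", json.dumps(normalize(a), indent=2))
    print("failing:", json.dumps(normalize(b), indent=2))
else:
    print("No significant differences found")
```
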
If one does not know whether the difference is in the input or in the code/graphs, tooling to help figure that out may be useful. For diffing the graphs, some integration with the (yet to be developed) fbp-diff may be useful. Alternatively, one could require the user to combine the [proposed](https://github.com/flowbased/flowtrace/issue/13) `flowtrace-extract --graph` with fbp-spec for that.
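
As a rough illustration of the graph side, something like the sketch below could check whether two extracted graphs differ at all before digging into the data. It assumes the graphs are FBP JSON files with `processes` and `connections` sections (e.g. as produced by a `flowtrace-extract --graph` step); a real fbp-diff would of course report the differences in a much smarter way.

```python
# Rough sketch: decide whether two extracted graphs differ, to tell code/graph
# differences apart from input differences. Assumes FBP JSON graph files with
# "processes" and "connections" sections; not a substitute for a real fbp-diff.
import json

def load_graph(path):
    with open(path) as f:
        return json.load(f)

def graph_delta(a, b):
    """Summarize process/connection differences between two graphs."""
    procs_a = set(a.get("processes", {}))
    procs_b = set(b.get("processes", {}))
    conns_a = {json.dumps(c, sort_keys=True) for c in a.get("connections", [])}
    conns_b = {json.dumps(c, sort_keys=True) for c in b.get("connections", [])}
    return {
        "processes_only_in_passing": sorted(procs_a - procs_b),
        "processes_only_in_failing": sorted(procs_b - procs_a),
        "connections_only_in_passing": sorted(conns_a - conns_b),
        "connections_only_in_failing": sorted(conns_b - conns_a),
    }

delta = graph_delta(load_graph("passing.graph.json"), load_graph("failing.graph.json"))
if any(delta.values()):
    print("Graphs differ:", json.dumps(delta, indent=2))
else:
    print("Graphs are identical; difference is likely in input data or environment")
```
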