An easy way to check for statistically significant difference between benchmarks #786

Currently, we have an optional `WelchTTestPValueColumn`, which helps you verify that there is a statistically significant difference between benchmarks. However, it doesn't work well with the default run strategy because that strategy typically doesn't perform enough iterations. Users have to choose a sufficient number of iterations manually. Thus, such checks are possible, but the user experience is not good enough. We can do the following:

* Introduce an additional property in `AccuracyMode`. Let's call it `StopCriterion` (let me know if you have better ideas about naming). It will contain the logic that decides when we have enough iterations.
* Currently, this logic is hardcoded inside `EngineTargetStage`. Let's move it to a class called `StdErrStopCriterion`.
* We can introduce `WelchStopCriterion`, which will perform additional iterations until we are sure that there are enough for Welch's two-sample t-test. (Bonus: users will be able to write their own criteria.)
* `StopCriterion` should be able to affect `IOrderProvider.GetExecutionOrder` and ask it to run baseline benchmarks first.
* `EngineTargetStage.RunAuto` should get additional information about benchmarks, such as the `IsBaseline` value. The non-baseline benchmarks should get all measurements from the baseline benchmark in the corresponding group.
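
To make the proposal concrete, here is a rough, language-agnostic sketch of the two criteria, written in Python. It is not BenchmarkDotNet code: the `should_stop` method, the `run_auto` helper, and every threshold (`max_relative_error`, `alpha`, `min_iterations`, `max_iterations`) are assumptions made up for illustration. The Welch criterion also takes a shortcut: it compares the t statistic against a normal-distribution critical value instead of computing a real p-value with Welch-Satterthwaite degrees of freedom, which is only a reasonable approximation for larger samples.

```python
import math
from statistics import NormalDist, mean, stdev, variance


class StdErrStopCriterion:
    """Hypothetical criterion: stop once the standard error is small relative to the mean."""

    def __init__(self, max_relative_error=0.02, min_iterations=3, max_iterations=100):
        self.max_relative_error = max_relative_error
        self.min_iterations = min_iterations
        self.max_iterations = max_iterations

    def should_stop(self, measurements, baseline=None):
        n = len(measurements)
        if n < self.min_iterations:
            return False
        if n >= self.max_iterations:
            return True
        std_err = stdev(measurements) / math.sqrt(n)
        return std_err <= self.max_relative_error * abs(mean(measurements))


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    pooled = variance(a) / len(a) + variance(b) / len(b)
    diff = mean(a) - mean(b)
    if pooled == 0:
        # Both samples are constant: any difference in means is unambiguous.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled)


class WelchStopCriterion:
    """Hypothetical criterion: keep iterating until the difference from the
    baseline is significant, or until the iteration budget runs out."""

    def __init__(self, alpha=0.05, min_iterations=3, max_iterations=100):
        # Normal approximation of the two-sided critical value (assumption).
        self.critical = NormalDist().inv_cdf(1 - alpha / 2)
        self.min_iterations = min_iterations
        self.max_iterations = max_iterations
        # The baseline itself has nothing to compare against.
        self.fallback = StdErrStopCriterion(
            min_iterations=min_iterations, max_iterations=max_iterations)

    def should_stop(self, measurements, baseline=None):
        if baseline is None:
            return self.fallback.should_stop(measurements)
        n = len(measurements)
        if n < self.min_iterations:
            return False
        if n >= self.max_iterations:
            return True
        return abs(welch_t(measurements, baseline)) >= self.critical


def run_auto(measure, criterion, baseline=None):
    """Collect measurements by calling measure() until the criterion is satisfied."""
    measurements = []
    while not criterion.should_stop(measurements, baseline):
        measurements.append(measure())
    return measurements
```

In this sketch the baseline benchmark runs first with `baseline=None`, and its measurements are then passed to `run_auto` for every other benchmark in the same group, which is why the execution order matters.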

Original request: https://twitter.com/AnthonyLloyd123/status/1005388154046644230
