Description
I noticed that when running the gold models like SIR multiple times, they can take anywhere from 100 seconds to 150 seconds on my machine with the develop branch. That's a pretty big variance! For the performance benchmarks we report back via Jenkins, I think it would be a good idea to run each model 15-20 times and then report the mean and variance of the benchmark for each model.
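As a rough sketch of what per-model statistics could look like (the `benchmark` helper and its interface below are hypothetical, not existing code in compare-hash.py):

```python
import statistics
import subprocess
import time


def benchmark(cmd, runs=15):
    """Run `cmd` `runs` times; return (mean, sd) of wall-clock seconds.

    A minimal sketch: real benchmarking would likely reuse the timing
    the performance tests already collect rather than re-timing here.
    """
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raise if the model run fails
        times.append(time.perf_counter() - start)
    return statistics.mean(times), statistics.stdev(times)
```

Reporting the standard deviation alongside the mean would make it clear whether a slowdown between two hashes is a real regression or just run-to-run noise.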
@serban-nicusor-toptal can you point me to the code that Jenkins uses to kick off the performance tests? I can also modify the compare-hash.py file to report the mean and sd. If I remember right, you had something like this in the works?
Current Version:
v2.20.0