Support for generating identical indices during benchmarking

### Is your feature request related to a problem? Please describe

I often want to run OSB against two clusters, one with a search-related change and one without, to see how the change affects search latency. However, the results are often noisy, and it can be hard to tell whether a difference comes from my change or from how the data happened to be indexed. For example, the exact segment topology can significantly affect sort query performance.

It would be really useful if OSB had an optional, easy way to ensure the index is exactly the same on every run.

### Describe the solution you'd like

The simplest solution might be to set a random seed through a workload flag such as `--rng-seed`. However, I don't know enough about the indexing flow to be sure this would actually produce identical indices every time.
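To make the seed idea concrete, here is a minimal Python sketch of what a seed can pin down on the client side. It is not OSB code: `plan_bulk_batches` is a hypothetical helper, and whether OSB makes any random choice like this during indexing is an assumption. It also does not address server-side factors such as refresh and merge timing, which a client-side seed would not control.

```python
import random


def plan_bulk_batches(doc_ids, batch_size, seed):
    """Hypothetical helper: shuffle document IDs with a seeded RNG,
    then split them into fixed-size bulk batches."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    shuffled = list(doc_ids)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


# The same seed yields the same batch plan on every run.
first = plan_bulk_batches(range(10), batch_size=4, seed=42)
second = plan_bulk_batches(range(10), batch_size=4, seed=42)
print(first == second)  # True
```

Even with identical batches, two runs may still produce different segments if the cluster flushes or merges at different moments, which is why a seed alone might not be sufficient.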

### Describe alternatives you've considered

If an OSB random seed isn't enough to ensure identical indices, another option would be to host one "standard" snapshot for each dataset somewhere. Then, if the user specifies some flag, OSB would download this snapshot and restore it to the cluster instead of running the usual indexing operation.

To avoid hosting too many snapshots, this could be limited to the current OpenSearch version, and perhaps to only the most important workloads, such as nyc_taxis, big5, and http_logs.
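As a rough sketch of the snapshot alternative, the Python below builds the two REST calls involved: registering a snapshot repository and restoring an index from it. The repository name `osb_standard`, the snapshot name `nyc_taxis_v1`, the cluster URL, and the `/mnt/snapshots` location are all hypothetical, and the `fs` repository type is only a stand-in; a hosted snapshot would likely need a different repository type. Nothing here reflects an existing OSB feature.

```python
import json
import urllib.request


def build_restore_requests(base_url, repo, snapshot, indices, location):
    """Build (but do not send) the requests to register a snapshot
    repository and restore the given indices from a snapshot."""
    headers = {"Content-Type": "application/json"}
    register = urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo}",
        data=json.dumps({"type": "fs", "settings": {"location": location}}).encode(),
        headers=headers,
        method="PUT",
    )
    restore = urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo}/{snapshot}/_restore?wait_for_completion=true",
        data=json.dumps({"indices": indices, "include_global_state": False}).encode(),
        headers=headers,
        method="POST",
    )
    return [register, restore]


# All names below are placeholders for illustration.
requests_to_send = build_restore_requests(
    base_url="http://localhost:9200",
    repo="osb_standard",
    snapshot="nyc_taxis_v1",
    indices="nyc_taxis",
    location="/mnt/snapshots",
)
for req in requests_to_send:
    print(req.get_method(), req.full_url)
    # To execute against a real cluster: urllib.request.urlopen(req)
```

Because a restored snapshot carries its segment files with it, both clusters would start from the same segment layout, which is the property the seed approach cannot guarantee on its own.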

### Additional context

_No response_

Support for generating identical indices during benchmarking #1002
