We need to migrate the simulated data out of this repo and into the benchmarking repo, where it should live as its own standalone dataset. This repo then becomes a user of the benchmark data rather than its home. This project should be one attempt at the benchmark, not the benchmark itself, and it can serve as an example of how to work on the benchmark. Separating the two will make both easier for external contributors to use.