The repo does not have a top-level entry-point or orchestration script to run the pipelines and recreate the dataset. Could one be added?