Combine datasets to one region or province

marliekeverweij edited this page Jun 19, 2019 · 11 revisions
  1. Check out the dataset-amalgamator branch of ETLocal (the script that combines the regions is located there)
  2. Run in your terminal: python3 app/services/dataset_combiner.py geo_id=<geo_id> name=<dataset_name> migration_name=<migration_name> dataset_ids=<id1,id2,id3...> (separate the ids with commas only, no spaces)

For example, to update the Groningen-Drenthe region, the command looks like this: python3 app/services/dataset_combiner.py geo_id=RGGD01 name=Groningen-Drenthe migration_name=gd_migrate2 dataset_ids=15054,15056
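
Since a stray space in dataset_ids is an easy mistake to make, the command can also be assembled programmatically. The following is a minimal sketch, not part of ETLocal: the helper name build_combiner_command is hypothetical, and only the script path and the four key=value arguments come from the command shown above. Whether the script accepts values containing spaces is not stated here, so the sketch conservatively rejects them.

```python
import shlex

# Path taken from the command above; run from the ETLocal repository root.
SCRIPT = "app/services/dataset_combiner.py"


def build_combiner_command(geo_id, name, migration_name, dataset_ids):
    """Hypothetical helper: build the argument list for the combiner script."""
    # Convert every id to int so typos such as "15054 " or "1505a" fail early.
    ids = [int(i) for i in dataset_ids]
    if not ids:
        raise ValueError("at least one dataset id is required")

    # Assumption: values must not contain whitespace (conservative check).
    for label, value in (("geo_id", geo_id), ("name", name),
                         ("migration_name", migration_name)):
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{label} must be non-empty and contain no spaces")

    return [
        "python3",
        SCRIPT,
        f"geo_id={geo_id}",
        f"name={name}",
        f"migration_name={migration_name}",
        # Commas only, no spaces, as the script expects.
        "dataset_ids=" + ",".join(str(i) for i in ids),
    ]


command = build_combiner_command(
    "RGGD01", "Groningen-Drenthe", "gd_migrate2", [15054, 15056]
)
# Print the command so it can be reviewed before running it in the terminal.
print(shlex.join(command))
```

The printed line can be pasted into the terminal, or the list can be passed to subprocess.run(command, check=True) from the repository root.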

  3. Check that a new folder for this migration has been created in the etlocal/db/migrate folder
  4. Check the commits.yml: does it list all the datasets you wanted to combine? (you can also correct the spelling there if necessary)
  5. If everything is fine: create a new branch from the master branch (not from the dataset-amalgamator branch)
  6. Run in your terminal: rake db:migrate
  7. Commit the new files (including schema.rb, excluding app/services/dataset_combiner/*) and create a PR.

Please note that this is an experimental feature not intended for merging with the master branch!