Training Explorer has data coming from different sources (see data_model), which may involve updating our Postgres databases. If you are doing a migration for the ETPL data, then see this guide instead.
To create a new migration, run:

```sh
npm --prefix=backend run db-migrate create [migration-name] -- --sql-file
```

When you want to add a DB migration that is a seed operation (that is, inserting data from a CSV), there's a specific process for this:
- Make sure that the CSV source file is in the `backend/data` directory.
- Ensure that it does not have any leading/trailing newlines.
- Run the above DB migrate command to create the migration scripts in `backend/migrations`. I recommend the name to be `seed-[description]`.
- Run the `csvInserter` script to populate the migration file with INSERT statements generated from the CSV:

```sh
node backend/data/csvInserter.js csvFilename.csv tablenameToInsertInto backend/migrations/sqls/seed-migration-name.sql
```

Assuming that you want a different seed for testing vs. real life, then:
- Create a CSV in `/backend/data` with matching structure, and test data.
- Duplicate the `.sql` migration file and rename it to end with `-TEST.sql`.
- Run the same node command above, with the test CSV filename and the test SQL migration filename.
- Edit the corresponding `.js` file for the migration by replacing this line:

```js
exports.up/down = function(db) {
  var filePath = path.join(__dirname, 'sqls', 'filename.sql');
```

with this instead:

```js
exports.up/down = function(db) {
  const fileName = process.env.NODE_ENV === 'test' ? 'filename-TEST.sql' : 'filename.sql';
  var filePath = path.join(__dirname, 'sqls', fileName);
```

## Troubleshooting
- If you are trying to run the tests and get an error that looks like `RangeError [ERR_CHILD_PROCESS_STDIO_MAXBUFFER]: stdout maxBuffer length exceeded`, this implies that it is running the real migrations, not the test migrations, and that you forgot to add the `.js` modification above.
- If you are trying to run a migration and get an error that looks like `ifError got unwanted exception: INSERT has more target columns than expressions`, this implies that there was an empty line at the end of your CSV, so your migration full of INSERT statements has a broken INSERT with `null` in it at the end. Remove the empty line from the CSV and the broken INSERT from the migration, and it should work.
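To make the seed mechanics concrete, here is a minimal sketch of how a CSV becomes a file full of INSERT statements. This is an illustrative stand-in, not the actual `csvInserter.js` (the function name, quoting, and CSV handling are simplified assumptions); note how trimming the input guards against the trailing-empty-line problem described above:

```javascript
// Hypothetical sketch of the CSV-to-INSERT idea (not the real csvInserter.js).
// Assumes a simple CSV: header row first, comma-separated, no embedded commas.
function csvToInserts(csvText, tableName) {
  // Trimming avoids the broken trailing INSERT described in Troubleshooting.
  const lines = csvText.trim().split('\n');
  const columns = lines[0].split(',');
  return lines.slice(1).map((line) => {
    // Quote each value and escape single quotes for SQL.
    const values = line.split(',').map((v) => `'${v.replace(/'/g, "''")}'`);
    return `INSERT INTO ${tableName} (${columns.join(', ')}) VALUES (${values.join(', ')});`;
  });
}
```

For example, `csvToInserts('id,name\n1,Alpha\n', 'programs')` yields a single statement inserting `('1', 'Alpha')` into `programs`.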
When you want to add a DB migration that is a seed update operation (that is, replacing data in a table with newer, fresher data from a CSV), here is what to do:

IMPORTANT: if we're updating the `etpl` table, follow the ETPL table seed guide instead.
- Create a database migration using the same script as above in the root `d4ad` folder. Use the `update-*` pattern for the migration name. For example, for the `etpl` table, you would run `npm --prefix=backend run db-migrate create update-etpl -- --sql-file`. This will automatically create the up and down SQL files and the JS file, all prefixed with a timestamp.
- Make sure that both the OLD (previous) CSV and the NEW (about-to-be-inserted) CSV are in the `backend/data` folder.
- Run the following script to find the changed rows between old and new, and then delete and re-insert only those rows. This script will fill the generated SQL files with the SQL commands that do this. In the command, update `oldFilename`, `newFilename`, `tablenameToInsertInto`, `upFileName`, and `downFileName` accordingly:

```sh
./backend/data/create_update_migrations.sh oldFilename.csv newFilename.csv tablenameToInsertInto backend/migrations/sqls/upFileName.sql backend/migrations/sqls/downFileName.sql
```

- We need to make sure the test migrations are accurate. Create new up and down files with the same names but with `-TEST.sql` added.
  - If your operation is just an update, leave a comment in BOTH files saying "intentionally left blank".
  - If your operation adds new columns, your down file should be "intentionally left blank", but your up file should delete from and re-insert the newly restructured test data.
- As mentioned in the "Seeding" section above, modify the `.js` file for the migration to conditionally load the `-TEST.sql` up and down files.
- Add, commit, and push the requisite files (the up and down SQL files, the up and down TEST SQL files, the updated JS migration file, and the updated CSV file). Continuous deployment will automatically run the script that finds this new migration and executes it, thus updating the Postgres database with the commands outlined in the migrations.
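Conceptually, the update flow above boils down to a set difference between the two CSVs. The sketch below illustrates that idea only; it is not the actual `create_update_migrations.sh` (the function name and row comparison are assumptions). Rows in `removed` become DELETEs and rows in `added` become INSERTs in the up file, with the roles reversed in the down file:

```javascript
// Hypothetical sketch of the "changed rows" diff (not the real script).
// Treats each non-header CSV line as an opaque row string.
function changedRows(oldCsv, newCsv) {
  const oldRows = new Set(oldCsv.trim().split('\n').slice(1));
  const newRows = new Set(newCsv.trim().split('\n').slice(1));
  return {
    // Rows present before but gone now: DELETE these in the up migration.
    removed: [...oldRows].filter((row) => !newRows.has(row)),
    // Rows that are new or modified: INSERT these in the up migration.
    added: [...newRows].filter((row) => !oldRows.has(row)),
  };
}
```

Note that a modified row shows up in both `removed` and `added` (its old version is deleted and its new version inserted), which matches the delete-and-re-insert behavior described above.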