
Commit 61445c5

[REF] migrations (#1225)
* add table
* fix add_tables revision
* modify workflow and compose readme for working with migrations
* drop tables if existing database
* delete database
* point to the correct database
* fix the conflicting table creation
* fix more table creation
* fix more migration issues
* fix store migration history
* fix migrations for data ingestion
1 parent 371d69e commit 61445c5

File tree

10 files changed: +904 −521 lines changed


.github/workflows/workflow.yml

Lines changed: 6 additions & 16 deletions
```diff
@@ -473,39 +473,29 @@ jobs:
 
           docker compose exec -T \
             compose_pgsql17 \
-            psql -U postgres -c "create database test_db"
+            psql -U postgres -c "CREATE DATABASE test_db"
       -
         name: Create Store Database
         run: |
           cd store
 
           docker compose exec -T \
             store-pgsql17 \
-            psql -U postgres -c "create database test_db"
+            psql -U postgres -c "CREATE DATABASE test_db"
 
           docker compose exec -T \
             store-pgsql17 \
             psql -U postgres -d test_db -c "CREATE EXTENSION IF NOT EXISTS vector;"
       -
-        name: Initialize Compose Database
+        name: Apply Compose migrations
         run: |
           cd compose
-          docker compose exec -T compose \
-            bash -c \
-            "flask db merge heads && \
-            flask db stamp head && \
-            flask db migrate && \
-            flask db upgrade"
+          docker compose exec -T compose flask db upgrade
       -
-        name: Initialize Store Database
+        name: Apply Store migrations
        run: |
           cd store
-          docker compose exec -T neurostore \
-            bash -c \
-            "flask db merge heads && \
-            flask db stamp head && \
-            flask db migrate && \
-            flask db upgrade"
+          docker compose exec -T neurostore flask db upgrade
       -
         name: Ingest data into Store
         run: |
```

compose/backend/README.md

Lines changed: 34 additions & 16 deletions
````diff
@@ -27,39 +27,57 @@ The server should now be running at http://localhost:81
 
 Create the database for compose:
 
-    docker-compose exec compose_pgsql psql -U postgres -c "create database compose"
+    docker-compose exec compose_pgsql17 psql -U postgres -c "create database compose"
 
-Next, migrate and upgrade the database migrations.
+Next, apply the existing migrations (they are the canonical schema definition):
 
-    docker-compose exec compose \
-        bash -c \
-        "flask db merge heads && \
-        flask db stamp head && \
-        flask db migrate && \
-        flask db upgrade"
-
-**Note**: `flask db merge heads` is not strictly necessary
-unless you have multiple schema versions that are not from the same history
-(e.g., multiple files in the `versions` directory).
-However, `flask db merge heads` makes the migration more robust
-when there are multiple versions from different histories.
+    docker-compose exec compose flask db upgrade
 
 
 ## Maintaining docker image and db
 If you make a change to compose, you should be able to simply restart the server.
 
     docker-compose restart compose
 
-If you need to upgrade the db after changing any models:
+If you change any models, generate a new Alembic migration and migrate the database (commit the generated revision file so it becomes the new source of truth):
 
     docker-compose exec compose flask db migrate
     docker-compose exec compose flask db upgrade
 
 
+## Database migrations
+
+The migrations stored in `backend/migrations` are the **only** source of truth for the schema—avoid merging heads, stamping, or manually altering the history. Always move the database forward (or rebuild from scratch) by applying the tracked revisions.
+
+### Applying migrations after pulling a branch
+
+Any time you start the backend or pull the latest changes, bring the database to the expected state with:
+
+```sh
+docker-compose exec compose flask db upgrade
+```
+
+`upgrade` is idempotent, so rerunning it is harmless; it only applies migrations that have not been run yet.
+
+### Resetting the database when switching branches
+
+Because each branch might change the schema independently, recreate the database before starting work on a different branch so that Alembic can replay only the migrations that exist on that branch.
+
+```sh
+docker-compose stop compose
+docker-compose exec compose_pgsql17 psql -U postgres -c "DROP DATABASE IF EXISTS compose;"
+docker-compose exec compose_pgsql17 psql -U postgres -c "CREATE DATABASE compose;"
+docker-compose start compose
+docker-compose exec compose flask db upgrade
+```
+
+If you're using the legacy Postgres container, replace `compose_pgsql17` with `compose_pgsql` in the commands above.
+
+
 ## Running tests
 To run tests, after starting services, create a test database:
 
-    docker-compose exec compose_pgsql psql -U postgres -c "create database test_db"
+    docker-compose exec compose_pgsql17 psql -U postgres -c "create database test_db"
 
 **NOTE**: This command will ask you for the postgres password which is defined
 in the `.env` file.
````
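The README's claim that `flask db upgrade` is idempotent can be illustrated with a toy version-table sketch. This uses only stdlib `sqlite3` and a hypothetical revision chain (`rev_aaa`, `rev_bbb` are made-up IDs, not the project's real revisions); Alembic's actual bookkeeping is more involved, but the mechanism is the same: record the last applied revision and apply only what follows it.

```python
import sqlite3

# Hypothetical ordered revision chain; the real one lives in backend/migrations/versions.
REVISIONS = ["rev_aaa", "rev_bbb"]

def upgrade(conn):
    """Apply only revisions not yet recorded, mimicking `flask db upgrade`."""
    conn.execute("CREATE TABLE IF NOT EXISTS alembic_version (version_num TEXT PRIMARY KEY)")
    row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    # Resume after the recorded revision, or from the start on a fresh database.
    start = REVISIONS.index(row[0]) + 1 if row else 0
    applied = REVISIONS[start:]
    if applied:
        # ...each revision's upgrade() would run here...
        conn.execute("DELETE FROM alembic_version")
        conn.execute("INSERT INTO alembic_version VALUES (?)", (applied[-1],))
    return applied

conn = sqlite3.connect(":memory:")
first = upgrade(conn)   # applies every revision on a fresh database
second = upgrade(conn)  # nothing left to apply, so this is a no-op
```

Rerunning `upgrade` never reapplies a recorded revision, which is why the README can recommend running it after every pull.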

compose/backend/migrations/versions/6297073f6fcd_.py

Lines changed: 30 additions & 22 deletions
```diff
@@ -19,31 +19,39 @@
 
 def upgrade():
     # ### commands auto generated by Alembic - please adjust! ###
-    op.create_table('conditions',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('name', sa.Text(), nullable=True),
-        sa.Column('description', sa.Text(), nullable=True),
-        sa.PrimaryKeyConstraint('id')
-    )
-    op.create_table('specification_conditions',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('weight', sa.Float(), nullable=True),
-        sa.Column('specification_id', sa.Text(), nullable=False),
-        sa.Column('condition_id', sa.Text(), nullable=False),
-        sa.ForeignKeyConstraint(['condition_id'], ['conditions.id'], ),
-        sa.ForeignKeyConstraint(['specification_id'], ['specifications.id'], ),
-        sa.PrimaryKeyConstraint('id', 'specification_id', 'condition_id')
-    )
-    op.create_index(op.f('ix_specification_conditions_condition_id'), 'specification_conditions', ['condition_id'], unique=False)
-    op.create_index(op.f('ix_specification_conditions_specification_id'), 'specification_conditions', ['specification_id'], unique=False)
+    bind = op.get_bind()
+    inspector = sa.inspect(bind)
+
+    if not inspector.has_table('conditions'):
+        op.create_table('conditions',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('name', sa.Text(), nullable=True),
+            sa.Column('description', sa.Text(), nullable=True),
+            sa.PrimaryKeyConstraint('id')
+        )
+
+    if not inspector.has_table('specification_conditions'):
+        op.create_table('specification_conditions',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('weight', sa.Float(), nullable=True),
+            sa.Column('specification_id', sa.Text(), nullable=False),
+            sa.Column('condition_id', sa.Text(), nullable=False),
+            sa.ForeignKeyConstraint(['condition_id'], ['conditions.id'], ),
+            sa.ForeignKeyConstraint(['specification_id'], ['specifications.id'], ),
+            sa.PrimaryKeyConstraint('id', 'specification_id', 'condition_id')
+        )
+        op.create_index(op.f('ix_specification_conditions_condition_id'), 'specification_conditions', ['condition_id'], unique=False)
+        op.create_index(op.f('ix_specification_conditions_specification_id'), 'specification_conditions', ['specification_id'], unique=False)
     op.drop_column('specifications', 'contrast')
     op.add_column('studyset_references', sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True))
     op.add_column('studyset_references', sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True))
-    op.add_column('studysets', sa.Column('version', sa.Text(), nullable=True))
+    existing_cols = {col["name"] for col in inspector.get_columns('studysets')} if inspector.has_table('studysets') else set()
+    if 'version' not in existing_cols and inspector.has_table('studysets'):
+        op.add_column('studysets', sa.Column('version', sa.Text(), nullable=True))
     # ### end Alembic commands ###
 
 
```
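The guarded `create_table` pattern in this revision can be sketched outside Alembic. A minimal stdlib `sqlite3` illustration (the table shape here is simplified, not the project's real `conditions` model): checking for the table first makes the migration safe to re-run against a database where the table already exists.

```python
import sqlite3

def has_table(conn, name):
    """SQLite stand-in for SQLAlchemy's inspector.has_table()."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None

def create_conditions_if_missing(conn):
    """Guarded creation, mirroring `if not inspector.has_table(...)`."""
    if has_table(conn, "conditions"):
        return False  # table pre-exists: skip instead of raising
    conn.execute("CREATE TABLE conditions (id TEXT PRIMARY KEY, name TEXT)")
    return True

conn = sqlite3.connect(":memory:")
created = create_conditions_if_missing(conn)        # creates the table
created_again = create_conditions_if_missing(conn)  # guard skips it
```

Without the guard, replaying the migration against a database that already holds the table would abort with a "table already exists" error, which is exactly what this commit works around.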

compose/backend/migrations/versions/76c16b00a2bc_.py

Lines changed: 114 additions & 89 deletions
```diff
@@ -19,95 +19,120 @@
 
 def upgrade():
     # ### commands auto generated by Alembic - please adjust! ###
-    op.create_table('annotation_references',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.PrimaryKeyConstraint('id')
-    )
-    op.create_table('roles',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('name', sa.Text(), nullable=True),
-        sa.Column('description', sa.Text(), nullable=True),
-        sa.PrimaryKeyConstraint('id'),
-        sa.UniqueConstraint('name')
-    )
-    op.create_table('studyset_references',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.PrimaryKeyConstraint('id')
-    )
-    op.create_table('users',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('active', sa.Boolean(), nullable=True),
-        sa.Column('name', sa.Text(), nullable=True),
-        sa.Column('external_id', sa.Text(), nullable=True),
-        sa.PrimaryKeyConstraint('id'),
-        sa.UniqueConstraint('external_id')
-    )
-    op.create_table('roles_users',
-        sa.Column('user_id', sa.Text(), nullable=True),
-        sa.Column('role_id', sa.Text(), nullable=True),
-        sa.ForeignKeyConstraint(['role_id'], ['roles.id'], ),
-        sa.ForeignKeyConstraint(['user_id'], ['users.id'], )
-    )
-    op.create_table('specifications',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('type', sa.Text(), nullable=True),
-        sa.Column('estimator', sa.JSON(), nullable=True),
-        sa.Column('filter', sa.Text(), nullable=True),
-        sa.Column('contrast', sa.JSON(), nullable=True),
-        sa.Column('corrector', sa.JSON(), nullable=True),
-        sa.Column('public', sa.Boolean(), nullable=True),
-        sa.Column('user_id', sa.Text(), nullable=True),
-        sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
-        sa.PrimaryKeyConstraint('id')
-    )
-    op.create_table('studysets',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('snapshot', sa.JSON(), nullable=True),
-        sa.Column('public', sa.Boolean(), nullable=True),
-        sa.Column('user_id', sa.Text(), nullable=True),
-        sa.Column('neurostore_id', sa.Text(), nullable=True),
-        sa.ForeignKeyConstraint(['neurostore_id'], ['studyset_references.id'], ),
-        sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
-        sa.PrimaryKeyConstraint('id')
-    )
-    op.create_table('annotations',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('snapshot', sa.JSON(), nullable=True),
-        sa.Column('public', sa.Boolean(), nullable=True),
-        sa.Column('user_id', sa.Text(), nullable=True),
-        sa.Column('neurostore_id', sa.Text(), nullable=True),
-        sa.Column('internal_studyset_id', sa.Text(), nullable=True),
-        sa.ForeignKeyConstraint(['internal_studyset_id'], ['studysets.id'], ),
-        sa.ForeignKeyConstraint(['neurostore_id'], ['annotation_references.id'], ),
-        sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
-        sa.PrimaryKeyConstraint('id')
-    )
-    op.create_table('meta_analyses',
-        sa.Column('id', sa.Text(), nullable=False),
-        sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
-        sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
-        sa.Column('name', sa.Text(), nullable=True),
-        sa.Column('description', sa.Text(), nullable=True),
-        sa.Column('specification_id', sa.Text(), nullable=True),
-        sa.Column('internal_studyset_id', sa.Text(), nullable=True),
-        sa.Column('internal_annotation_id', sa.Text(), nullable=True),
-        sa.Column('user_id', sa.Text(), nullable=True),
-        sa.ForeignKeyConstraint(['internal_annotation_id'], ['annotations.id'], ),
-        sa.ForeignKeyConstraint(['internal_studyset_id'], ['studysets.id'], ),
-        sa.ForeignKeyConstraint(['specification_id'], ['specifications.id'], ),
-        sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
-        sa.PrimaryKeyConstraint('id')
-    )
+    bind = op.get_bind()
+    inspector = sa.inspect(bind)
+
+    tables = {
+        'annotation_references': lambda: op.create_table('annotation_references',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.PrimaryKeyConstraint('id')
+        ),
+        'roles': lambda: op.create_table('roles',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('name', sa.Text(), nullable=True),
+            sa.Column('description', sa.Text(), nullable=True),
+            sa.PrimaryKeyConstraint('id'),
+            sa.UniqueConstraint('name')
+        ),
+        'studyset_references': lambda: op.create_table('studyset_references',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.PrimaryKeyConstraint('id')
+        ),
+        'users': lambda: op.create_table('users',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('active', sa.Boolean(), nullable=True),
+            sa.Column('name', sa.Text(), nullable=True),
+            sa.Column('external_id', sa.Text(), nullable=True),
+            sa.PrimaryKeyConstraint('id'),
+            sa.UniqueConstraint('external_id')
+        ),
+        'roles_users': lambda: op.create_table('roles_users',
+            sa.Column('user_id', sa.Text(), nullable=True),
+            sa.Column('role_id', sa.Text(), nullable=True),
+            sa.ForeignKeyConstraint(['role_id'], ['roles.id'], ),
+            sa.ForeignKeyConstraint(['user_id'], ['users.id'], )
+        ),
+        'specifications': lambda: op.create_table('specifications',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('type', sa.Text(), nullable=True),
+            sa.Column('estimator', sa.JSON(), nullable=True),
+            sa.Column('filter', sa.Text(), nullable=True),
+            sa.Column('contrast', sa.JSON(), nullable=True),
+            sa.Column('corrector', sa.JSON(), nullable=True),
+            sa.Column('public', sa.Boolean(), nullable=True),
+            sa.Column('user_id', sa.Text(), nullable=True),
+            sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
+            sa.PrimaryKeyConstraint('id')
+        ),
+        'studysets': lambda: op.create_table('studysets',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('snapshot', sa.JSON(), nullable=True),
+            sa.Column('public', sa.Boolean(), nullable=True),
+            sa.Column('user_id', sa.Text(), nullable=True),
+            sa.Column('neurostore_id', sa.Text(), nullable=True),
+            sa.ForeignKeyConstraint(['neurostore_id'], ['studyset_references.id'], ),
+            sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
+            sa.PrimaryKeyConstraint('id')
+        ),
+        'annotations': lambda: op.create_table('annotations',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('snapshot', sa.JSON(), nullable=True),
+            sa.Column('public', sa.Boolean(), nullable=True),
+            sa.Column('user_id', sa.Text(), nullable=True),
+            sa.Column('neurostore_id', sa.Text(), nullable=True),
+            sa.Column('internal_studyset_id', sa.Text(), nullable=True),
+            sa.ForeignKeyConstraint(['internal_studyset_id'], ['studysets.id'], ),
+            sa.ForeignKeyConstraint(['neurostore_id'], ['annotation_references.id'], ),
+            sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
+            sa.PrimaryKeyConstraint('id')
+        ),
+        'meta_analyses': lambda: op.create_table('meta_analyses',
+            sa.Column('id', sa.Text(), nullable=False),
+            sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.text('now()'), nullable=True),
+            sa.Column('updated_at', sa.DateTime(timezone=True), nullable=True),
+            sa.Column('name', sa.Text(), nullable=True),
+            sa.Column('description', sa.Text(), nullable=True),
+            sa.Column('specification_id', sa.Text(), nullable=True),
+            sa.Column('internal_studyset_id', sa.Text(), nullable=True),
+            sa.Column('internal_annotation_id', sa.Text(), nullable=True),
+            sa.Column('user_id', sa.Text(), nullable=True),
+            sa.ForeignKeyConstraint(['internal_annotation_id'], ['annotations.id'], ),
+            sa.ForeignKeyConstraint(['internal_studyset_id'], ['studysets.id'], ),
+            sa.ForeignKeyConstraint(['specification_id'], ['specifications.id'], ),
+            sa.ForeignKeyConstraint(['user_id'], ['users.external_id'], ),
+            sa.PrimaryKeyConstraint('id')
+        ),
+    }
+
+    for table_name, creator in tables.items():
+        if not inspector.has_table(table_name):
+            creator()
+
+    # Backfill columns on existing tables to match current models.
+    studyset_cols = set(col["name"] for col in inspector.get_columns("studysets")) if inspector.has_table("studysets") else set()
+    if inspector.has_table("studysets"):
+        if "snapshot" not in studyset_cols:
+            op.add_column("studysets", sa.Column("snapshot", sa.JSON(), nullable=True))
+        if "version" not in studyset_cols:
+            op.add_column("studysets", sa.Column("version", sa.Text(), nullable=True))
+
+    annotation_cols = set(col["name"] for col in inspector.get_columns("annotations")) if inspector.has_table("annotations") else set()
+    if inspector.has_table("annotations"):
+        if "snapshot" not in annotation_cols:
+            op.add_column("annotations", sa.Column("snapshot", sa.JSON(), nullable=True))
+        if "cached_studyset_id" not in annotation_cols:
+            op.add_column("annotations", sa.Column("cached_studyset_id", sa.Text(), nullable=True))
+            op.create_foreign_key(None, "annotations", "studysets", ["cached_studyset_id"], ["id"])
     # ### end Alembic commands ###
 
 
```
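The column backfills in this revision use the same existence-check idea as the table guards, but via `inspector.get_columns`. A minimal stdlib `sqlite3` sketch (hypothetical simplified `studysets` schema) of adding a column only when it is missing:

```python
import sqlite3

def column_names(conn, table):
    """Set of column names via PRAGMA, akin to inspector.get_columns()."""
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

def add_column_if_missing(conn, table, column, ddl_type="TEXT"):
    """Backfill a column only when absent, mirroring the migration's guard."""
    if column in column_names(conn, table):
        return False  # already present: nothing to do
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE studysets (id TEXT PRIMARY KEY)")
added = add_column_if_missing(conn, "studysets", "version")        # column added
added_again = add_column_if_missing(conn, "studysets", "version")  # guard skips it
```

This is why the revision can run against both a freshly created database (where `create_table` supplies every column) and a pre-existing one (where only the missing columns are added).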