Commit 5812013

Add service for database schema migration. Extend length of note.value field (#353)
The reason to use VERY_LONG is that it makes more sense than an arbitrary long length that supports the current use case. VERY_LONG is based on the limit up to which data can be stored in-row, so it has meaning to the database and will not change (unless the database settings are changed). --------- Co-authored-by: taniya-das <[email protected]>
1 parent 9049328 commit 5812013

10 files changed (+304, -2 lines changed)

Diff for: README.md

+1
@@ -291,4 +291,5 @@ To create a new release,
 - Check which services currently work (before the update). It's a sanity check for if a service _doesn't_ work later.
 - Update the code on the server by checking out the release
 - Merge configurations as necessary
+- Make sure the latest database migrations are applied: see ["Schema Migrations"](alembic/README.md#update-the-database)
 9. Notify everyone (e.g., in the API channel in Slack).

Diff for: alembic/Dockerfile

+5
@@ -0,0 +1,5 @@
+FROM aiod_metadata_catalogue
+RUN python -m pip install alembic
+ENV PYTHONPATH="$PYTHONPATH:/app"
+WORKDIR /alembic
+ENTRYPOINT ["alembic", "upgrade", "head"]

Diff for: alembic/README.md

+32
@@ -0,0 +1,32 @@
+# Database Schema Migrations
+
+We use [Alembic](https://alembic.sqlalchemy.org/en/latest/tutorial.html#running-our-first-migration) to automate database schema migrations
+(e.g., adding a table, altering a column, and so on).
+Please refer to the Alembic documentation for more information.
+
+## Usage
+Commands below assume that the root directory of the project is your current working directory.
+
+Build the image with:
+```commandline
+docker build -f alembic/Dockerfile . -t aiod-migration
+```
+
+With the sqlserver container running, you can migrate to the latest schema with:
+
+```commandline
+docker run -v $(pwd)/alembic:/alembic:ro -v $(pwd)/src:/app -it --network aiod_default aiod-migration
+```
+Make sure that the specified `--network` is the docker network that has the `sqlserver` container.
+The alembic directory is mounted to ensure the latest migrations are available;
+the src directory is mounted so the migration scripts can use classes and variables defined in the project.
+
+## Update the Database
+> [!Caution]
+> Database migrations may be irreversible. Always make sure there is a backup of the old database.
+
+Following the usage commands above, on a new release we should run alembic to ensure the latest schema changes are applied.
+The default entrypoint of the container upgrades the database to the latest schema.
+
+## TODO
+- set up support for auto-generating migration scripts: https://alembic.sqlalchemy.org/en/latest/autogenerate.html
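
The build and run commands from the usage section above can also be assembled from a script, for example as part of a release checklist. The following is only a minimal sketch: `migration_commands` is a hypothetical helper that is not part of this commit, and the image name, mount points and default network are simply copied from the README text. It prints the commands rather than executing them.

```python
import shlex
from pathlib import Path
from typing import List


def migration_commands(project_root: Path, network: str = "aiod_default") -> List[List[str]]:
    """Return the docker build and run commands from the README as argv lists.

    Hypothetical helper: names and defaults mirror the README, nothing more.
    """
    root = project_root.resolve()
    build = ["docker", "build", "-f", "alembic/Dockerfile", ".", "-t", "aiod-migration"]
    run = [
        "docker", "run",
        # Migration scripts are mounted read-only so the latest ones are available.
        "-v", f"{root / 'alembic'}:/alembic:ro",
        # Project sources are mounted so the scripts can import project code.
        "-v", f"{root / 'src'}:/app",
        "-it",
        "--network", network,
        "aiod-migration",
    ]
    return [build, run]


if __name__ == "__main__":
    # Print the commands for review; run them from the project root.
    for command in migration_commands(Path.cwd()):
        print(shlex.join(command))
```

Passing a different `network` value covers the case where the `sqlserver` container lives on another docker network.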

Diff for: alembic/alembic.ini

+116
@@ -0,0 +1,116 @@
+# A generic, single database configuration.
+
+[alembic]
+# path to migration scripts
+# Use forward slashes (/) also on windows to provide an os agnostic path
+script_location = alembic
+
+# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
+# Uncomment the line below if you want the files to be prepended with date and time
+# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
+# for all available tokens
+# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
+
+# sys.path path, will be prepended to sys.path if present.
+# defaults to the current working directory.
+prepend_sys_path = .
+
+# timezone to use when rendering the date within the migration file
+# as well as the filename.
+# If specified, requires the python>=3.9 or backports.zoneinfo library.
+# Any required deps can be installed by adding `alembic[tz]` to the pip requirements
+# string value is passed to ZoneInfo()
+# leave blank for localtime
+# timezone =
+
+# max length of characters to apply to the "slug" field
+# truncate_slug_length = 40
+
+# set to 'true' to run the environment during
+# the 'revision' command, regardless of autogenerate
+# revision_environment = false
+
+# set to 'true' to allow .pyc and .pyo files without
+# a source .py file to be detected as revisions in the
+# versions/ directory
+# sourceless = false
+
+# version location specification; This defaults
+# to alembic/versions. When using multiple version
+# directories, initial revisions must be specified with --version-path.
+# The path separator used here should be the separator specified by "version_path_separator" below.
+# version_locations = %(here)s/bar:%(here)s/bat:alembic/versions
+
+# version path separator; As mentioned above, this is the character used to split
+# version_locations. The default within new alembic.ini files is "os", which uses os.pathsep.
+# If this key is omitted entirely, it falls back to the legacy behavior of splitting on spaces and/or commas.
+# Valid values for version_path_separator are:
+#
+# version_path_separator = :
+# version_path_separator = ;
+# version_path_separator = space
+version_path_separator = os  # Use os.pathsep. Default configuration used for new projects.
+
+# set to 'true' to search source files recursively
+# in each "version_locations" directory
+# new in Alembic version 1.10
+# recursive_version_locations = false
+
+# the output encoding used when revision files
+# are written from script.py.mako
+# output_encoding = utf-8
+
+sqlalchemy.url = ''
+
+
+[post_write_hooks]
+# post_write_hooks defines scripts or Python functions that are run
+# on newly generated revision scripts. See the documentation for further
+# detail and examples
+
+# format using "black" - use the console_scripts runner, against the "black" entrypoint
+# hooks = black
+# black.type = console_scripts
+# black.entrypoint = black
+# black.options = -l 79 REVISION_SCRIPT_FILENAME
+
+# lint with attempts to fix using "ruff" - use the exec runner, execute a binary
+# hooks = ruff
+# ruff.type = exec
+# ruff.executable = %(here)s/.venv/bin/ruff
+# ruff.options = --fix REVISION_SCRIPT_FILENAME
+
+# Logging configuration
+[loggers]
+keys = root,sqlalchemy,alembic
+
+[handlers]
+keys = console
+
+[formatters]
+keys = generic
+
+[logger_root]
+level = WARN
+handlers = console
+qualname =
+
+[logger_sqlalchemy]
+level = WARN
+handlers =
+qualname = sqlalchemy.engine
+
+[logger_alembic]
+level = INFO
+handlers =
+qualname = alembic
+
+[handler_console]
+class = StreamHandler
+args = (sys.stderr,)
+level = NOTSET
+formatter = generic
+
+[formatter_generic]
+format = %(levelname)-5.5s [%(name)s] %(message)s
+datefmt = %H:%M:%S

Diff for: alembic/alembic/README

+1
@@ -0,0 +1 @@
+Generic single-database configuration.

Diff for: alembic/alembic/env.py

+72
@@ -0,0 +1,72 @@
+from logging.config import fileConfig
+
+from alembic import context
+
+# Assumes /src is in the Python path, so we can re-use logic for constructing db connections
+from database.session import db_url
+from database.session import EngineSingleton
+
+# this is the Alembic Config object, which provides
+# access to the values within the .ini file in use.
+config = context.config
+
+# Interpret the config file for Python logging.
+# This line sets up loggers basically.
+if config.config_file_name is not None:
+    fileConfig(config.config_file_name)
+
+# add your model's MetaData object here
+# for 'autogenerate' support
+# from myapp import mymodel
+# target_metadata = mymodel.Base.metadata
+target_metadata = None
+
+# other values from the config, defined by the needs of env.py,
+# can be acquired:
+# my_important_option = config.get_main_option("my_important_option")
+# ... etc.
+
+
+def run_migrations_offline() -> None:
+    """Run migrations in 'offline' mode.
+
+    This configures the context with just a URL
+    and not an Engine, though an Engine is acceptable
+    here as well.  By skipping the Engine creation
+    we don't even need a DBAPI to be available.
+
+    Calls to context.execute() here emit the given string to the
+    script output.
+
+    """
+    url = db_url()
+    context.configure(
+        url=url,
+        target_metadata=target_metadata,
+        literal_binds=True,
+        dialect_opts={"paramstyle": "named"},
+    )
+
+    with context.begin_transaction():
+        context.run_migrations()
+
+
+def run_migrations_online() -> None:
+    """Run migrations in 'online' mode.
+
+    In this scenario we need to create an Engine
+    and associate a connection with the context.
+
+    """
+    connectable = EngineSingleton().engine
+    with connectable.connect() as connection:
+        context.configure(connection=connection, target_metadata=target_metadata)
+
+        with context.begin_transaction():
+            context.run_migrations()
+
+
+if context.is_offline_mode():
+    run_migrations_offline()
+else:
+    run_migrations_online()

Diff for: alembic/alembic/script.py.mako

+26
@@ -0,0 +1,26 @@
+"""${message}
+
+Revision ID: ${up_revision}
+Revises: ${down_revision | comma,n}
+Create Date: ${create_date}
+
+"""
+from typing import Sequence, Union
+
+from alembic import op
+import sqlalchemy as sa
+${imports if imports else ""}
+
+# revision identifiers, used by Alembic.
+revision: str = ${repr(up_revision)}
+down_revision: Union[str, None] = ${repr(down_revision)}
+branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
+depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}
+
+
+def upgrade() -> None:
+    ${upgrades if upgrades else "pass"}
+
+
+def downgrade() -> None:
+    ${downgrades if downgrades else "pass"}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
+"""Extend max length of text in note
+
+Revision ID: 0a23b40cc09c
+Revises:
+Create Date: 2024-08-29 11:37:20.827291
+
+"""
+from typing import Sequence, Union
+
+from alembic import op
+from sqlalchemy import String
+
+from database.model.field_length import VERY_LONG
+
+# revision identifiers, used by Alembic.
+revision: str = "0a23b40cc09c"
+down_revision: Union[str, None] = None
+branch_labels: Union[str, Sequence[str], None] = None
+depends_on: Union[str, Sequence[str], None] = None
+
+
+def upgrade() -> None:
+    # All models that derive from AIResourceBase
+    for table in [
+        "news",
+        "team",
+        "person",
+        "organisation",
+        "event",
+        "project",
+        "service",
+        "dataset",
+        "case_study",
+        "publication",
+        "computational_asset",
+        "ml_model",
+        "experiment",
+        "educational_resource",
+    ]:
+        op.alter_column(
+            f"note_{table}",
+            "value",
+            type_=String(VERY_LONG),
+        )
+
+
+def downgrade() -> None:
+    pass
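
The migration above widens `value` on every `note_<table>` table and leaves `downgrade()` as a no-op. As a rough sketch of how one table list could describe both directions, the snippet below builds a plain "plan" of column changes without touching a database. `note_column_changes` is a hypothetical helper that is not part of this commit; the constants are copied from `field_length.py`. In a real migration, each entry would presumably be passed on to `op.alter_column(..., type_=String(length))`.

```python
from typing import Dict, Iterator, List, Union

# Values copied from src/database/model/field_length.py in this commit.
LONG = 1800
VERY_LONG = 8000

# Tables listed in upgrade(); each one has a matching note_<table> table.
RESOURCE_TABLES: List[str] = [
    "news", "team", "person", "organisation", "event", "project", "service",
    "dataset", "case_study", "publication", "computational_asset", "ml_model",
    "experiment", "educational_resource",
]


def note_column_changes(length: int) -> Iterator[Dict[str, Union[str, int]]]:
    """Yield, per note table, a description of the column change to apply.

    Hypothetical helper for illustration; it performs no database work.
    """
    for table in RESOURCE_TABLES:
        yield {"table_name": f"note_{table}", "column_name": "value", "length": length}


# What upgrade() applies, and what a possible downgrade() would revert to.
upgrade_plan = list(note_column_changes(VERY_LONG))
downgrade_plan = list(note_column_changes(LONG))
```

Note that shrinking a column back to `LONG` may fail or truncate data once rows hold values longer than 1800 characters, so a real downgrade would need to account for that.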

Diff for: src/database/model/ai_resource/note.py

+2-2
@@ -3,15 +3,15 @@
 from sqlalchemy import Column, Integer, ForeignKey
 from sqlmodel import Field, SQLModel

-from database.model.field_length import LONG
+from database.model.field_length import VERY_LONG


 class NoteBase(SQLModel):
     value: str = Field(
         index=False,
         unique=False,
         description="The string value",
-        max_length=LONG,
+        max_length=VERY_LONG,
         schema_extra={"example": "A brief record of points or ideas about this AI resource."},
     )

Diff for: src/database/model/field_length.py

+1
@@ -7,4 +7,5 @@
 SHORT = 64
 NORMAL = 256
 LONG = 1800  # an A4s full of text
+VERY_LONG = 8000  # Cut off for out-of-row storage
 MAX_TEXT = 65535  # max length for Mysql text
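
To illustrate how the length constants relate to each other, here is a small sketch that reports the smallest constant able to hold a given text. `smallest_fitting_length` is a hypothetical helper that is not part of the project; the values are copied from the diff above, and the check counts characters, which is an assumption about how the limits are applied.

```python
from typing import Optional

# Values copied from src/database/model/field_length.py after this commit.
FIELD_LENGTHS = {
    "SHORT": 64,
    "NORMAL": 256,
    "LONG": 1800,
    "VERY_LONG": 8000,
    "MAX_TEXT": 65535,
}


def smallest_fitting_length(text: str) -> Optional[str]:
    """Return the name of the smallest constant that can hold `text`, or None."""
    for name, limit in sorted(FIELD_LENGTHS.items(), key=lambda item: item[1]):
        if len(text) <= limit:
            return name
    return None


# A 2000-character note no longer fits LONG, which is the case this commit addresses.
print(smallest_fitting_length("x" * 2000))
```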
