Transition value JSON from old to new flat format #539
Draft
suvayu wants to merge 115 commits into master from WIP-data-transition
Changes from 16 commits
Commits
| Commit | Message | Author |
|---|---|---|
| c5bbc0e | move compatibility scripts to their own folder | |
| f114d29 | add parameter_value reencoding | |
| 304b257 | add alembic migration script | |
| 3ab46be | spinedb_api/compat/: remove unused methods | suvayu |
| 0872ab0 | spinedb_api/compat/: fix type hints | suvayu |
| 8bda858 | spinedb_api/compat/: remove PEP 723 metadata | suvayu |
| 06db77a | spinedb_api/compat/: fix input type | |
| 0c7dd8e | add developer's documentation for data transition | |
| a7f2140 | alembic: blacken reencode_parameter_values.py | suvayu |
| 463767d | alembic: partly handle type column in reencode_parameter_values.py | suvayu |
| 3533c2c | models.py: remove bytes, since pa.UnionArray is now supported | suvayu |
| 7baf441 | models.py: remove last remaining bits for bytes support | suvayu |
| c5a41a2 | models.py: fix TimePattern | suvayu |
| acff451 | models.py: add 'any_array' to support nullable mixed types | suvayu |
| 4841a6f | data_transition: fix missing import TimePattern | suvayu |
| 287851d | data_transition: alternate implementation | suvayu |
| 8ce8086 | alembic: improve data migration script | |
| 7c2e269 | models.py: remove last remaining bits for bytes support | suvayu |
| 1c6001d | models.py: fix 'any_array' | suvayu |
| 47d79a5 | models.py: cleanup type aliases | suvayu |
| fa25066 | models.py: use pydantic dataclasses | suvayu |
| 811e263 | compat: factor out some array creation to encode.py (WIP) | suvayu |
| 8d480ac | compat: make warning in make_columns more prominent | suvayu |
| 80749bd | data-transition: use relativedelta instead of pandas.DateOffset | suvayu |
| 57b78c4 | models.py: model duration with relativedelta from dateutil | suvayu |
| 4b8ea09 | compat/encode.py: change sentinel implementation | suvayu |
| ae1f30f | compat/encode.py: fix column to array conversion for all types | suvayu |
| 4c8a990 | compat/encode.py: drop all conversion from dataframe | suvayu |
| 4b11c43 | pyproject.toml: upgrade pyarrow version, explicitly require pydantic | suvayu |
| 102bde2 | compat: rename module for clarity | suvayu |
| b375ce8 | README_dev.md: update renamed module | suvayu |
| 0ac0896 | compat: move duration parsing to own module for reusability | suvayu |
| 7d6053e | models.py: minor fix to import | suvayu |
| 4e767cc | models.py: cleaner implementation for time-pattern | suvayu |
| 5af2dd2 | data_transition: fix old -> new format conversions | suvayu |
| e4ffc8a | data_transition: fix module import formatting | suvayu |
| 51a33da | compat/converters.py: converters for duration types | suvayu |
| 02131e5 | models.py: function to convert dict to array dataclasses | suvayu |
| bfa64ea | arrow_value.py: function to convert array dataclasses to pyarrow | suvayu |
| 4f95a91 | {arrow_values,models}.py: convert any array to union array | suvayu |
| 21122be | Merge branch 'master' into WIP-data-transition | soininen |
| 145faf1 | compat/encode.py: bugfix, variables weren't named correctly | suvayu |
| 9115bc4 | compat: fix formatter doc string, and minor formatting fix | suvayu |
| ee03a13 | {models,compat/encode}.py: fix conversion from pandas.Timestamp | suvayu |
| 759b563 | to_database() for new array JSON | soininen |
| 6936c61 | Add durations as acceptable values for value column | soininen |
| 5d95640 | Make durations and datetimes work in run end encoded and dictionary a… | soininen |
| d45575e | Fix using pd.TimeStamps or datetimes in IndexArray and the like | soininen |
| c31bbdf | Apply new value JSON migration to all relevant tables | soininen |
| e56e44f | Fix Alembic migration | soininen |
| b70569d | Fix Alembic migration | soininen |
| b153481 | Merge branch 'master' into WIP-data-transition | soininen |
| 3a9d259 | Add to_arrow() method to ParameterValue and its subclasses | soininen |
| 136f762 | Always store new tabular JSON in database | soininen |
| 3fab999 | Implement migration to tabular JSON while keeping values backwards co… | soininen |
| 2de12c8 | Fix Alembic migration | soininen |
| e9f8c99 | Fix arrow_value.from_database() | soininen |
| 8c1603d | data_transition.py: remove experimentation converter | suvayu |
| 90c2d29 | models.py: remove PEP 723 script, now depends on spinedb_api | suvayu |
| 54338d1 | models.py: one import per line | suvayu |
| aeedea7 | converters.py: refactor relativedelta <-> JSON duration conversion | suvayu |
| 765a729 | models.py: fix type annotations | suvayu |
| d3ef8f6 | models.py: remove custom validation & conversion | suvayu |
| 1172e89 | models.py: change to schema generation using TypeAdapter (recommended) | suvayu |
| 023130d | models.py: refactor (de)serialisation | suvayu |
| cfc61c6 | data_transition.py: use models.TimePeriod to wrap time-patterns | suvayu |
| d2dc941 | data_transition.py: don't name date-time/duration as "value" | suvayu |
| d946df8 | models.py: fix metadata handling after rebase | suvayu |
| 2c696f2 | models.py: allow time_period in any array for consistency | suvayu |
| 81de0e4 | models.py: add missing mode in to_json | suvayu |
| 4ce80aa | Merge branch 'master' into WIP-data-transition | soininen |
| ebf335a | models.py: fix typeddict, any array does not have value_type | suvayu |
| 97d93da | models.py: remove time_period from any array | suvayu |
| 6f81265 | models.py: fix json serialization call | suvayu |
| 143f367 | compat/converters.py: add pa.MonthDayNano -> duration/relativedelta | suvayu |
| 01fbd13 | compat/converters.py: bug fix pa.MonthDayNano -> intermediate dict | suvayu |
| 6e4a3e9 | Allow null values in index arrays, fix unit tests | soininen |
| 89f3808 | Remove unused test_data_transition.py | soininen |
| 78002e6 | Merge branch 'master' into WIP-data-transition | soininen |
| 1ad08e5 | value_support.py: backwards compat fix load_db_value signature | suvayu |
| 117edde | Improve parameter value compatibility & refactoring | soininen |
| deaa13a | Merge branch 'master' into WIP-data-transition | soininen |
| 482c322 | Bump DB server version to 9 | soininen |
| 67537c2 | Merge branch 'master' into WIP-data-transition | soininen |
| 31588a0 | Fix GAMS version check when current work directory is read-only | soininen |
| a6085da | Revert "Fix GAMS version check when current work directory is read-only" | soininen |
| d3cb47b | Fix Alembic migration | soininen |
| 2bb0da9 | Merge branch 'master' into WIP-data-transition | soininen |
| fbeda3e | Merge branch 'master' into WIP-data-transition | soininen |
| 24eb6d0 | Fix unit tests for Alembic migrations | soininen |
| faa8a51 | Merge branch 'master' into WIP-data-transition | soininen |
| cfeac47 | Store leaf TimeSeries indices to last column when expanding Maps. | soininen |
| 4159c5d | Rework Map.from_arrow() to handle more corner-cases | soininen |
| cfd4475 | Add support for converting Map's leafs from/to Arrays | soininen |
| 567e095 | Merge branch 'master' into WIP-data-transition | soininen |
| 1a9ea66 | models.py: refactor model_validators | suvayu |
| 0bdb23b | models.py: support conversions from pyarrow | suvayu |
| eb148aa | models.py: fix typo | suvayu |
| e822879 | arrow_value.py: type hints & consolidate error handling for metadata | suvayu |
| 922762b | parameter_value.py: refactor conversion from pyarrow to use models | suvayu |
| a189ab5 | {parameter_value,incomplete_values}.py: update JSON serialisation | suvayu |
| 1d801b0 | tests: fix parameter_value tests after refactor | suvayu |
| 8049d96 | models.py: handle default type for empty arrays cleanly | suvayu |
| f54cec6 | models.py: fix array - index array logic | suvayu |
| 965aa7e | models.py: cleaner from_* function api | suvayu |
| 6f17e75 | models.py: better error msg | suvayu |
| 3611b0b | models.py: add utilities for any_array checking & schema generation | suvayu |
| a27b45f | compat/converters.py: raise error on bad duration string input | suvayu |
| 6e18e1d | compat/converters.py: complete duration intermediate representation | suvayu |
| 6951277 | test_models.py: add tests | suvayu |
| fd8efaa | test_compat_converters.py: add tests | suvayu |
| 2e964b3 | spinedb_api/compat/data_transition.py: remove unused module | suvayu |
| 79ec818 | models.py: complete type_map for null, reorder for legibility | suvayu |
| 3ad53f6 | compat/encode.py: hormanise encoders w/ other utilities in models.py | suvayu |
| 5cd944d | test_compat_encode.py: add tests | suvayu |
@@ -0,0 +1,24 @@

# Developing Data Transition

## Testing `alembic` Migration

1. Edit `./spinedb_api/alembic.ini` and point `sqlalchemy.url` to a (copy of a) SQLite test database. ⚠️ Its data will be altered by the migration script.
1. Edit `./spinedb_api/alembic/versions/a973ab537da2_reencode_parameter_values.py` and temporarily change
   ```python
   new_value = transition_data(old_value)
   ```
   to
   ```python
   new_value = b'prepend_me ' + old_value
   ```
1. Within the `./spinedb_api` folder, execute
   ```bash
   alembic upgrade head
   ```
1. Open your SQLite test database in a database editor and check for changed `parameter_value`s.

## Developing the Data Transition Module

1. Edit `./spinedb_api/compat/reencode_for_data_transition.py` for development.
1. In a Python REPL, call its function `transition_data(old_json_bytes)` and check that it produces the correct output for our test cases.
1. Once this works, revert the changes to `./spinedb_api/alembic/versions/a973ab537da2_reencode_parameter_values.py` and test the above `alembic` migration again.
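
Step 4 of "Testing `alembic` Migration" can also be checked from Python instead of a database editor. The sketch below is an illustration only and not part of this PR: it builds a throwaway in-memory SQLite table that merely mimics the `id`, `value`, and `type` columns of `parameter_value`, applies the temporary `b'prepend_me ' + old_value` change by hand, and compares a snapshot taken before with one taken after. The helper names `snapshot` and `changed_ids`, the simplified schema, and the placeholder payloads are all assumptions; against a real test database, the two snapshots would be taken before and after running `alembic upgrade head`.

```python
import sqlite3


def snapshot(conn: sqlite3.Connection) -> dict[int, bytes]:
    """Return {id: value} for every row of parameter_value."""
    rows = conn.execute("SELECT id, value FROM parameter_value").fetchall()
    return {row_id: value for row_id, value in rows}


def changed_ids(before: dict[int, bytes], after: dict[int, bytes]) -> list[int]:
    """List the ids whose value differs between two snapshots."""
    return sorted(i for i in before if before[i] != after.get(i))


# Throwaway stand-in for a test database (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parameter_value (id INTEGER PRIMARY KEY, value BLOB, type TEXT)")
conn.executemany(
    "INSERT INTO parameter_value (id, value, type) VALUES (?, ?, ?)",
    [
        (1, b"placeholder duration payload", "duration"),
        (2, b"2.5", "float"),
    ],
)

before = snapshot(conn)

# Emulate the temporary edit from step 2, for one convertible type only.
convertible_rows = conn.execute(
    "SELECT id, value FROM parameter_value WHERE type = ?", ("duration",)
).fetchall()
for row_id, old_value in convertible_rows:
    conn.execute(
        "UPDATE parameter_value SET value = ? WHERE id = ?",
        (b"prepend_me " + old_value, row_id),
    )

after = snapshot(conn)
print(changed_ids(before, after))  # → [1]
```

Comparing snapshots keyed by `id` makes it easy to confirm that only rows of a convertible type were touched.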
60 changes: 60 additions & 0 deletions
spinedb_api/alembic/versions/a973ab537da2_reencode_parameter_values.py
@@ -0,0 +1,60 @@

```python
"""reencode parameter_values

Revision ID: a973ab537da2
Revises: 91f1f55aa972
Create Date: 2025-05-21 12:49:16.861670

"""

from alembic import op
import sqlalchemy as sa
from spinedb_api.compat.reencode_for_data_transition import transition_data


# revision identifiers, used by Alembic.
revision = "a973ab537da2"
down_revision = "91f1f55aa972"
branch_labels = None
depends_on = None


def upgrade():
    # Reflect table definition
    conn = op.get_bind()
    metadata = sa.MetaData()
    metadata.bind = conn

    # Define a lightweight representation of the table
    my_table = sa.Table(
        "parameter_value",
        metadata,
        sa.Column("id", sa.Integer, primary_key=True),
        sa.Column("value", sa.BINARY),
        sa.Column("type", sa.String),  # TODO do we need the type?
    )

    # Read current data
    results = conn.execute(sa.select(my_table.c.id, my_table.c.value, my_table.c.type)).fetchall()

    # NOTE: maybe this should be derived from `models.ValueTypeNames`,
    # but we don't want non-JSON values like integer, number, boolean;
    # also some names differ by `-` <-> `_`
    convertible = ("date_time", "duration", "time_pattern", "time_series", "array", "map")

    # Apply transformation
    for row in results:
        if row.type not in convertible:
            continue
        old_value = row.value
        new_value = transition_data(old_value)

        # FIXME:
        # - `type` also needs translation; from the `convertible`
        #   list, ("time_series", "array", "map") -> "table"
        # - can the insertions be queued?
        # Update the row
        conn.execute(my_table.update().where(my_table.c.id == row.id).values(value=new_value))


def downgrade():
    pass
```

soininen marked the review conversation on the row-update line as resolved (outdated).
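
The `FIXME` in `upgrade()` leaves two points open: translating `type` for rows whose value becomes tabular, and queuing the updates rather than executing one statement per row. A minimal sketch of both follows, using only the standard-library `sqlite3` module as a stand-in for the Alembic connection. The names `translate_type` and `fake_transition` and the simplified schema are assumptions made for illustration. Only the `("time_series", "array", "map") -> "table"` mapping comes from the comment itself; the other type names are left unchanged here because the source does not say how they should be handled.

```python
import sqlite3

# From the FIXME comment: these old type names collapse into "table".
TABLE_TYPES = ("time_series", "array", "map")
# Same members as `convertible` in the migration script.
CONVERTIBLE = ("date_time", "duration", "time_pattern") + TABLE_TYPES


def translate_type(old_type: str) -> str:
    """Map an old type name to its new name (assumption: others stay as-is)."""
    return "table" if old_type in TABLE_TYPES else old_type


def fake_transition(old_value: bytes) -> bytes:
    """Hypothetical placeholder for transition_data(); not the real converter."""
    return b"new:" + old_value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parameter_value (id INTEGER PRIMARY KEY, value BLOB, type TEXT)")
conn.executemany(
    "INSERT INTO parameter_value (id, value, type) VALUES (?, ?, ?)",
    [(1, b"a", "map"), (2, b"b", "duration"), (3, b"c", "float")],
)

# Queue one parameter tuple per convertible row instead of updating inside the loop.
rows = conn.execute("SELECT id, value, type FROM parameter_value").fetchall()
queued = [
    (fake_transition(value), translate_type(type_), row_id)
    for row_id, value, type_ in rows
    if type_ in CONVERTIBLE
]

# Send all queued updates in a single executemany() call.
conn.executemany("UPDATE parameter_value SET value = ?, type = ? WHERE id = ?", queued)

result = conn.execute("SELECT id, value, type FROM parameter_value ORDER BY id").fetchall()
print(result)  # → [(1, b'new:a', 'table'), (2, b'new:b', 'duration'), (3, b'c', 'float')]
```

In SQLAlchemy, passing a list of parameter dictionaries to `Connection.execute()` runs a statement in the same executemany style, so the queuing idea should carry over to `conn` in the migration, though that variant is untested here.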