Add schema/metadata migrations #24

@mferrera

This is just a sketch. It will require some refinement, discussion, and design.

We need to be able to migrate metadata from older schema versions to newer ones. We should add a new block, e.g. schema_migration, that logs each migration. We should also add one function per version step that upgrades metadata from version n to n+1. This means every schema version should have a corresponding migration function, where one applies.

Sketch:

$schema: ..
version: "0.9.0"
source: "fmu"

schema_migration:
  original_version: "0.8.0"
  migrations:  # placeholder key name, so the list sits under a mapping key
    - timestamp: 2025-....
      migration_tool:
        name: fmu-dataio
        version: 2.5.0
      from_version: "0.8.0"
      to_version: "0.9.0"
      changes:
        - field: "data.standard_result"
          action: "renamed"
          previous_name: "data.product"
          required_context: false  # If FMU runtime is required

tracklog:
  - ...
    event: created
    ...
  - datetime: 2025..
    user:
      id: system
    ...
    event: "schema_migrated"
    migration_details:  # is this block necessary if a schema_migration block exists?
      from_version: "0.8.0"
      to_version: "0.9.0"
      migration_id: <uuid>
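
To make the "renamed" change in the sketch concrete, here is a minimal Python sketch of how a single rename could be applied to a metadata dictionary while producing the matching changes entry. The function name rename_field and the dotted-path convention are assumptions for illustration; nothing like this exists in fmu-dataio yet as far as this issue is concerned.

```python
from __future__ import annotations

from copy import deepcopy
from typing import Any, Dict, Optional, Tuple


def rename_field(
    metadata: Dict[str, Any], old_path: str, new_path: str
) -> Tuple[Dict[str, Any], Optional[Dict[str, Any]]]:
    """Move the value at dotted old_path to dotted new_path (hypothetical helper).

    Returns a modified copy plus a change record shaped like the sketch's
    "changes" entries. The record is None when old_path is not present.
    """
    updated = deepcopy(metadata)  # never mutate the caller's metadata

    # Walk down to the parent of the old key.
    *parents, old_key = old_path.split(".")
    node: Any = updated
    for key in parents:
        node = node.get(key) if isinstance(node, dict) else None
        if node is None:
            return updated, None
    if not isinstance(node, dict) or old_key not in node:
        return updated, None
    value = node.pop(old_key)

    # Create intermediate mappings for the new location if needed.
    *new_parents, new_key = new_path.split(".")
    target = updated
    for key in new_parents:
        target = target.setdefault(key, {})
    target[new_key] = value

    change = {
        "field": new_path,
        "action": "renamed",
        "previous_name": old_path,
        "required_context": False,
    }
    return updated, change
```

Calling rename_field(metadata, "data.product", "data.standard_result") would return the upgraded copy and a record that could be appended to the changes list of the migration block.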

In dataio, we will need to establish functions that perform these migrations and validate them. Sumo can then use them in ETL pipelines.

from fmu.dataio.migrations import migrate_schema

new_schema = migrate_schema(existing_schema, data_object, "0.9.0")
# Apply all schema migrations between the existing schema version and 0.9.0
# Each version change has its own upgrade function. It may in some cases do nothing
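
One possible shape for the chain of n to n+1 upgrades is a small registry keyed by source version. The sketch below is an assumption, not a proposed final API: migrate_metadata, register, and MIGRATIONS are hypothetical names, the data_object argument from the call above is left out for brevity, and the 0.7.0 step is invented only to show a no-op migration.

```python
from __future__ import annotations

from copy import deepcopy
from typing import Any, Callable, Dict, Tuple

Metadata = Dict[str, Any]
Upgrade = Callable[[Metadata], Metadata]

# Hypothetical registry: source version -> (target version, upgrade function).
MIGRATIONS: Dict[str, Tuple[str, Upgrade]] = {}


def register(from_version: str, to_version: str) -> Callable[[Upgrade], Upgrade]:
    """Decorator that registers one single-step upgrade function."""

    def decorator(func: Upgrade) -> Upgrade:
        MIGRATIONS[from_version] = (to_version, func)
        return func

    return decorator


@register("0.7.0", "0.8.0")  # invented step, only to illustrate a no-op
def _upgrade_0_7_0(metadata: Metadata) -> Metadata:
    return metadata


@register("0.8.0", "0.9.0")
def _upgrade_0_8_0(metadata: Metadata) -> Metadata:
    # The rename from the sketch: data.product -> data.standard_result.
    data = metadata.get("data", {})
    if "product" in data:
        data["standard_result"] = data.pop("product")
    return metadata


def migrate_metadata(metadata: Metadata, target_version: str) -> Metadata:
    """Apply each registered step in order until target_version is reached."""
    current = deepcopy(metadata)  # leave the input untouched
    while current["version"] != target_version:
        step = MIGRATIONS.get(current["version"])
        if step is None:
            raise ValueError(
                f"no migration path from {current['version']} to {target_version}"
            )
        next_version, upgrade = step
        current = upgrade(current)
        current["version"] = next_version
    return current
```

Because every step only knows about its own pair of versions, adding a new schema version means adding one function, and a request for an unreachable target fails loudly instead of returning partially migrated metadata.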

Open questions

What about metadata values only possible to get during FMU experiment runtime?

  • We probably need to establish default values for these cases, with a way to flag that a value was generated by a migration. This is a form of "optionality" that adds some difficulty on the consumer end, but no more than maintaining separate logic for each schema version (i.e., a consumer should be able to handle such fields programmatically regardless of version).
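
As a rough illustration of the flagged-defaults idea, the Python sketch below fills in missing fields and reports which ones it had to generate, so the list can be stored alongside the migration record. The helper name fill_runtime_defaults and the field fmu.example_runtime_field are hypothetical; which fields need defaults, and where the flag should live, are still open.

```python
from __future__ import annotations

from copy import deepcopy
from typing import Any, Dict, List, Tuple


def fill_runtime_defaults(
    metadata: Dict[str, Any], defaults: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Set a default for every dotted path in defaults that is missing.

    Returns a modified copy and the list of paths whose values were
    generated here rather than taken from the original metadata.
    """
    updated = deepcopy(metadata)
    generated: List[str] = []
    for path, default in defaults.items():
        *parents, leaf = path.split(".")
        node = updated
        for key in parents:
            node = node.setdefault(key, {})  # create missing parent mappings
        if leaf not in node:
            node[leaf] = default
            generated.append(path)
    return updated, generated
```

A consumer could then check whether a path appears in the generated list before trusting its value, which keeps the handling the same for every schema version.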
