Move away from SQLite as core data persistence layer #5208

@jdangerx

Overview

  • To re-materialize individual assets with new schemas, we needed some way of managing schema changes without blowing away the whole database, so we adopted Alembic.
  • Alembic migrations are confusing and painful when dealing with many schema changes at once.
  • Alembic migrations don't include CheckConstraints, which means that in nightly and local builds SQLite isn't checking things like "this string matches this regex". Those constraints are nominally being checked in Pandera right now; see "Field constraints missing from Dagster SQLite outputs" #4169 for details.
  • The only checks that live solely in SQLite, then, are the foreign key (FK) and primary key (PK) checks. If we can replicate those checks elsewhere, we can drop both Alembic and SQLite.
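
To make "replicate those checks elsewhere" concrete, here is a minimal sketch of PK and FK checks done outside the database, using only the Python standard library. It is an illustration under assumptions, not the project's implementation: the function names, the `plants`/`generators` tables, and their columns are all hypothetical, and a real version would more likely run against dataframes or as dbt tests.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Mapping, Sequence

Row = Mapping[str, object]


def primary_key_violations(rows: Iterable[Row], key: Sequence[str]) -> list[tuple]:
    """Return key tuples that appear more than once or contain a null (None)."""
    keys = [tuple(row[col] for col in key) for row in rows]
    counts = Counter(keys)
    bad = {k for k in keys if counts[k] > 1 or any(v is None for v in k)}
    return sorted(bad, key=repr)


def foreign_key_violations(
    child_rows: Iterable[Row],
    child_cols: Sequence[str],
    parent_rows: Iterable[Row],
    parent_cols: Sequence[str],
) -> list[tuple]:
    """Return child key tuples that have no matching parent key.

    Child rows with a null in any key column are skipped, which mirrors
    how SQLite treats NULLs in foreign key columns.
    """
    parent_keys = {tuple(row[col] for col in parent_cols) for row in parent_rows}
    missing = set()
    for row in child_rows:
        child_key = tuple(row[col] for col in child_cols)
        if any(v is None for v in child_key):
            continue
        if child_key not in parent_keys:
            missing.add(child_key)
    return sorted(missing, key=repr)


# Hypothetical example tables, represented as lists of dict rows.
plants = [
    {"plant_id": 1, "name": "A"},
    {"plant_id": 2, "name": "B"},
    {"plant_id": 2, "name": "B (duplicate)"},
]
generators = [
    {"plant_id": 1, "generator_id": "G1"},
    {"plant_id": 3, "generator_id": "G1"},  # no plant with plant_id == 3
    {"plant_id": None, "generator_id": "G2"},  # null FK, skipped
]

pk_bad = primary_key_violations(plants, ["plant_id"])
fk_bad = foreign_key_violations(generators, ["plant_id"], plants, ["plant_id"])
print(pk_bad)  # [(2,)]
print(fk_bad)  # [(3,)]
```

Both functions return the offending keys rather than raising, so a caller can decide whether a violation should fail a build or just be reported.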

Success criteria

  • We no longer need Alembic to manage migrations.
  • We still have foreign key and primary key checks on our data.

Out of scope, but good follow-up:

  • we confirm that Pandera is successfully checking all CheckConstraints OR
  • we port CheckConstraints to dbt and confirm their enforcement there

Next steps

Labels

  • data-validation: Issues related to checking whether data meets our quality expectations.
  • developer experience: Things that make the developers' lives easier, but don't necessarily directly improve the data.
  • pandera: Issues related to our use of the Pandera dataframe schemas and validations.

Projects

Status
Epic
