Move away from SQLite as core data persistence layer #5208
Open
Labels
data-validation: Issues related to checking whether data meets our quality expectations.
developer experience: Things that make the developers' lives easier, but don't necessarily directly improve the data.
pandera: Issues related to our use of the Pandera dataframe schemas and validations.
Epic
Overview
CheckConstraints, which means that in nightly/local builds we aren't checking e.g. "this string matches this regex" in SQLite. Though they are nominally being checked in Pandera rn - see "Field constraints missing from Dagster SQLite outputs" #4169 for details.
Success criteria
Out of scope, but good follow-up:
dbt and confirm their enforcement there
Next steps
dbt validations as data_tests on each table.
foreign_keys generic data test, because there doesn't appear to be one online that supports composite FKs.
primary_key generic data test that just glues together existing 'unique columns' and 'not-null' checks.
PUDL_PACKAGE.
schema.human.yml (human-owned tests) and schema.machine.yml (everything auto-generated from PUDL_PACKAGE) and then have dbt_helper merge them into an actual schema.yml
schema.yml files #5268
DbtTable.from_table_name method so they are machine generated.
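To make the intent of the primary_key and foreign_keys checks concrete, here is a rough sketch of the logic in plain Python. It is not the dbt generic test itself (that would presumably be written as SQL/Jinja), and the function names, column names, and sample rows are all hypothetical. It also assumes that child rows with a null in any FK column are skipped rather than flagged, which is how SQL foreign keys behave by default.

```python
from typing import Iterable, Mapping, Sequence

Row = Mapping[str, object]


def primary_key_violations(rows: Iterable[Row], key_cols: Sequence[str]) -> list[tuple]:
    """Glue 'not-null' and 'unique' together: return every offending key tuple."""
    seen: set[tuple] = set()
    bad: list[tuple] = []
    for row in rows:
        key = tuple(row.get(col) for col in key_cols)
        # A key fails if any column is null or the full tuple was already seen.
        if any(value is None for value in key) or key in seen:
            bad.append(key)
        seen.add(key)
    return bad


def foreign_key_violations(
    child_rows: Iterable[Row],
    fk_cols: Sequence[str],
    parent_rows: Iterable[Row],
    pk_cols: Sequence[str],
) -> list[tuple]:
    """Composite FK check: child key tuples with no matching parent key tuple."""
    parent_keys = {tuple(row.get(col) for col in pk_cols) for row in parent_rows}
    bad: list[tuple] = []
    for row in child_rows:
        key = tuple(row.get(col) for col in fk_cols)
        # Assumption: a null in any FK column means the row is not checked.
        if any(value is None for value in key):
            continue
        if key not in parent_keys:
            bad.append(key)
    return bad


# Hypothetical sample data, not real PUDL tables.
plants = [
    {"plant_id": 1, "report_year": 2020},
    {"plant_id": 1, "report_year": 2021},
]
generators = [
    {"plant_id": 1, "report_year": 2020, "generator_id": "a"},
    {"plant_id": 2, "report_year": 2020, "generator_id": "b"},
    {"plant_id": None, "report_year": 2020, "generator_id": "c"},
]
key = ["plant_id", "report_year"]

pk_problems = primary_key_violations(plants, key)
fk_problems = foreign_key_violations(generators, key, plants, key)
```

The FK check compares whole key tuples rather than one column at a time, which is the part that single-column relationship tests can't express. A dbt version would likely do the same thing as an anti-join between the child and parent tables.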