Skip to content

Commit 9378de8

Browse files
authored
fix(writer): align ducklake_column / ducklake_data_file schema with the DuckLake spec (#116)
* fix(writer): align ducklake_column / ducklake_data_file with the spec `SqliteMetadataWriter` was producing a `ducklake_column` table that lacked `parent_column`, `initial_default`, `default_value`, `default_value_type`, and `default_value_dialect`. The reader's SQL_GET_TABLE_COLUMNS query (added in PR #89) projects `parent_column`, so any catalog written by `SqliteMetadataWriter` was unreadable through the metadata provider, panicking with "no such column: parent_column". The bug shipped because CI ran `cargo test` without `write-sqlite` features, and both `tests/sql_write_tests.rs` and `tests/sqlite_metadata_provider_test.rs` are gated behind features that CI didn't enable — so the broken-catalog path compiled to a zero-test binary in CI and was invisible. Three changes: 1. `metadata_writer_sqlite.rs::SQL_CREATE_SCHEMA` adds the five missing `ducklake_column` columns, matching the upstream `ducklake_metadata_manager.cpp` schema definition. We don't populate the four `*_default*` columns today (we don't support nested types or column defaults yet), but the columns must exist so the reader query doesn't error. 2. `tests/sqlite_metadata_provider_test.rs::init_schema` is brought in line with the same spec. It also gains `row_id_start`, `record_count`, and `mapping_id` on `ducklake_data_file` — these are projected by SQL_GET_DATA_FILES since the row-lineage work landed in PR #115 and any sqlite-fixture test that exercises the data-file read path would have hit "no such column" without them. 3. `.github/workflows/ci.yml` enables `write-sqlite` on the test step so `sql_write_tests` and `sqlite_metadata_provider_test` actually run. Without this, the next analogous schema drift will ship invisibly again. Verified by running `cargo test --features "skip-tests-with-docker write-sqlite"` locally — 228 unit tests + every integration suite green, including the 7 previously-broken `sql_write_tests` and 18 `sqlite_metadata_provider_test` cases. Out of scope (separately broken with the same root cause, no CI coverage today): the encryption, postgres, and mysql test fixtures have analogous spec gaps. Worth a follow-up once CI for those backends is set up. * docs(ci): reword the write-sqlite comment so the intent isn't ambiguous The earlier wording read like "enable write-sqlite so the file runs" which could be parsed as either "we previously didn't run them and now we are" (correct) or "we want to skip running them" (wrong). Rephrased to make clear we're adding features so the tests become visible to cargo, not the other way round. * docs: trim the new comments to just the non-obvious WHY Three comment blocks added earlier in this branch were verbose, duplicated reasoning across files, or carried a PR reference that will rot. Trimmed each to the smallest signal a future reader needs: * CI: 6 lines → 3. Keep the "without this, cargo compiles to a zero-test binary" hazard; drop the cfg-syntax tutorial. * metadata_writer_sqlite.rs: 8 lines → 5. Spell out who reads each group of columns (our reader vs DuckDB) without re-explaining the same thing twice. * sqlite_metadata_provider_test.rs: 7 lines across two blocks → 1. Replace the duplicated "why columns exist" lecture with a single pointer to the writer file. Also drops the "since PR #115" reference per the don't-cite-PR-numbers convention — the linkage is now in the writer file's comment.
1 parent 83e3972 commit 9378de8

3 files changed

Lines changed: 24 additions & 2 deletions

File tree

.github/workflows/ci.yml

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -83,11 +83,14 @@ jobs:
8383
run: cargo clippy --all-targets --all-features --no-deps -- -D warnings
8484

8585
- name: Run tests
86+
# `write-sqlite` pulls in the writer and sqlite-provider tests,
87+
# which are `#![cfg]`-gated; without it cargo compiles them to
88+
# zero-test binaries.
8689
run: |
8790
if [[ "$RUNNER_OS" == "macOS" ]]; then
88-
cargo test --features skip-tests-with-docker
91+
cargo test --features "skip-tests-with-docker write-sqlite"
8992
else
90-
cargo test
93+
cargo test --features "write-sqlite"
9194
fi
9295
9396
- name: Check documentation

src/metadata_writer_sqlite.rs

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,16 @@ CREATE TABLE IF NOT EXISTS ducklake_column (
5050
column_type VARCHAR NOT NULL,
5151
column_order INTEGER NOT NULL,
5252
nulls_allowed BOOLEAN DEFAULT 1,
53+
-- Mirror the upstream DuckLake spec (ducklake_metadata_manager.cpp):
54+
-- `parent_column` is projected by our reader's SQL_GET_TABLE_COLUMNS;
55+
-- the four `*default*` columns are projected by DuckDB when it reads
56+
-- catalogs we produce. We leave them NULL — no nested-type or
57+
-- column-default writes yet.
58+
initial_default VARCHAR,
59+
default_value VARCHAR,
60+
parent_column INTEGER,
61+
default_value_type VARCHAR,
62+
default_value_dialect VARCHAR,
5363
begin_snapshot INTEGER NOT NULL,
5464
end_snapshot INTEGER
5565
);

tests/sqlite_metadata_provider_test.rs

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,7 @@ async fn init_schema(pool: &SqlitePool) -> anyhow::Result<()> {
6868
.execute(pool)
6969
.await?;
7070

71+
// Schema must match SQL_CREATE_SCHEMA in metadata_writer_sqlite.rs.
7172
sqlx::query(
7273
"CREATE TABLE IF NOT EXISTS ducklake_column (
7374
column_id INTEGER PRIMARY KEY,
@@ -76,6 +77,11 @@ async fn init_schema(pool: &SqlitePool) -> anyhow::Result<()> {
7677
column_type TEXT NOT NULL,
7778
column_order INTEGER NOT NULL,
7879
nulls_allowed INTEGER,
80+
initial_default TEXT,
81+
default_value TEXT,
82+
parent_column INTEGER,
83+
default_value_type TEXT,
84+
default_value_dialect TEXT,
7985
begin_snapshot INTEGER NOT NULL DEFAULT 1,
8086
end_snapshot INTEGER,
8187
FOREIGN KEY (table_id) REFERENCES ducklake_table(table_id)
@@ -93,6 +99,9 @@ async fn init_schema(pool: &SqlitePool) -> anyhow::Result<()> {
9399
file_size_bytes INTEGER NOT NULL,
94100
footer_size INTEGER,
95101
encryption_key TEXT,
102+
record_count INTEGER,
103+
row_id_start INTEGER,
104+
mapping_id INTEGER,
96105
begin_snapshot INTEGER NOT NULL DEFAULT 1,
97106
end_snapshot INTEGER,
98107
FOREIGN KEY (table_id) REFERENCES ducklake_table(table_id)

0 commit comments

Comments
 (0)