Fix incorrect data when flushing inlined deletes with sorting#1166
Open
Alex-Monahan wants to merge 5 commits into
Open
Fix incorrect data when flushing inlined deletes with sorting#1166Alex-Monahan wants to merge 5 commits into
Alex-Monahan wants to merge 5 commits into
Conversation
…kdb#999) Issue: ducklake_flush_inlined_data marks the wrong rows as deleted when the table's SET SORTED BY clause uses an expression (e.g. (id + 0) ASC) and inlined deletes from prior UPDATEs are pending. The data file is written sorted by the user expression, but the deletes-position query falls back to ordering by row_id, begin_snapshot only -- so positions diverge from the file's order and the wrong rows get marked deleted. Two minimal cooperating fixes: 1. BuildSortOrderSQL (src/storage/ducklake_sort_data.cpp): replace the non-simple-column short-circuit with parse -> optionally rewrite ColumnRef leaves via the rename map -> ToString. Non-bare-column expressions now land in the deletes-position query's ORDER BY, matching the file's sort. Parser::ParseExpressionList rejects malformed input; ParsedExpression::ToString is the canonical re-serializer. 2. Tiebreakers in InsertSort (src/functions/ducklake_compaction_functions.* + src/functions/ducklake_flush_inlined_data.cpp): InsertSort gains an optional add_tiebreakers parameter. When true, scans the underlying LogicalGet's column_ids for COLUMN_IDENTIFIER_ROW_ID and COLUMN_IDENTIFIER_SNAPSHOT_ID symbolic identifiers, builds BoundColumnRefExpressions at the corresponding plan binding positions, and appends them as ASC NULLS LAST tiebreakers. Position-by-scan (not by latest_table.PhysicalColumnCount()) handles ADD COLUMN in the same transaction as the flush. The deletes-position query is tightened to use explicit ASC NULLS LAST for the tiebreakers so the user's default_null_order setting cannot desync them from the file. The flush call site passes add_tiebreakers=true; compaction passes nothing (default false). Out of scope (intentionally): DuckLake-stored macros referenced in sort expressions. With this change such expressions are emitted verbatim into the deletes-position query and the metadata connection's binder errors at flush time with "Catalog Error: Scalar Function with name X does not exist!". This is strictly better than v1.5's silent wrong delete positions, but it is a behavior change for the existing test data_inlining_flush_sorted_macro_expression_transaction.test (the only v1.5 test exercising this path with no UPDATEs, where the silent short-circuit produced "passing" results coincidentally). That test is updated to expect the clear error and roll back. Proper support for macros in sort expressions would require cross-ClientContext catalog visibility plumbing and is a separate concern. New tests: - data_inlining_flush_sorted_expression_deletes_999.test - data_inlining_flush_sorted_expression_deletes_999_transaction.test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR fixes #999 .
However, there is a previously silently failing condition that this now causes to error, so it is a FIXME for a follow up PR. It will fail if a single transaction contains:
CREATE MACROfor a DuckLake macroSET SORTED BYthat uses that macroThis is because the connection to the metadata DB does not have the user's DuckLake transaction context where that macro exists.
I wanted to split it apart since the PR was growing in scope and I think I need some help on how to cleanly handle within-transaction DuckLake macros in this case. This PR at least removes the silent corruption issue and throws an error in the very specific case above instead of a silent failure.