Context
This is one ticket in a series carrying forward the #12 foundation work. Read #12 first for repo context.
DataFusion natively plans INSERT against a TableProvider, but it does not natively plan DELETE, UPDATE, or MERGE. The fork adds a custom QueryPlanner that intercepts DataFusion's logical DML plans for DuckLake tables and routes them to DuckLake-specific execution plans. This is the integration point — every other DML ticket (#17 DELETE, #18 UPDATE, #19 MERGE) depends on it.
Reference branch
ducklake-features/integration:
src/query_planner.rs — the planner. Read it end-to-end; it's ~600 LOC and not bigger than necessary.
Inline #[cfg(test)] tests in the same file cover the routing logic (planner rejects unsupported plan shapes with explicit errors instead of silently emptying filters — a deliberate safety choice flagged in the audit).
Scope
Port src/query_planner.rs.
The planner intercepts only LogicalPlan::Dml(Delete | Update). INSERT continues to use DataFusion's native path via TableProvider::insert_into. MERGE is implemented as a custom logical extension node (see MERGE physical execution (INSERT/UPDATE/DELETE atomic) #19).
For DELETE/UPDATE, the planner extracts:
the target DuckLakeTable (downcast via TableProvider::as_any)
the filter expression (rejecting plans where DataFusion rewrote the filter through joins/subqueries — emit a clear error rather than silently dropping the predicate)
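for UPDATE, the SET expressions, identified by positional matching against the target schema (see audit note below)
Route the plan to DeleteExec (DELETE physical execution (MOR delete files) #17) or UpdateExec (UPDATE physical execution (MOR delete + insert) #18).
Register the planner via SessionStateBuilder::with_query_planner and expose a helper, e.g. DuckLakeQueryPlanner::register(&mut state).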
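The routing and extraction rules above can be sketched without any DataFusion dependency. This is a minimal illustration using stand-in types: `WriteOp`, `Table`, `Input`, `Route`, `PlanError`, and `route_dml` are hypothetical names invented for this sketch, not DataFusion's or the fork's actual API. It also assumes that DML against non-DuckLake tables falls through to native planning, which the ticket implies but does not state.

```rust
// Stand-in types for illustration only; none of these are DataFusion's
// or the fork's real definitions.

#[derive(Debug, Clone, Copy, PartialEq)]
enum WriteOp {
    Insert,
    Delete,
    Update,
}

#[derive(Debug, PartialEq)]
enum Table {
    DuckLake(String),
    Other(String),
}

/// Simplified shape of the plan underneath a DML node.
#[derive(Debug, PartialEq)]
enum Input {
    /// A scan of the target table, optionally under a single filter.
    FilteredScan { predicate: Option<String> },
    /// The filter was rewritten through a join or subquery.
    Rewritten,
}

#[derive(Debug, PartialEq)]
enum Route {
    /// Leave the plan to the default (native) planning path.
    Native,
    Delete { table: String, predicate: Option<String> },
    Update { table: String, predicate: Option<String> },
}

#[derive(Debug, PartialEq)]
struct PlanError(String);

fn route_dml(op: WriteOp, target: &Table, input: &Input) -> Result<Route, PlanError> {
    // INSERT stays on the native path; so do non-DuckLake targets (assumption).
    let table = match (op, target) {
        (WriteOp::Insert, _) | (_, Table::Other(_)) => return Ok(Route::Native),
        (_, Table::DuckLake(name)) => name.clone(),
    };

    // Refuse rewritten filters instead of silently planning with no predicate.
    let predicate = match input {
        Input::FilteredScan { predicate } => predicate.clone(),
        Input::Rewritten => {
            return Err(PlanError(format!(
                "{:?} on `{}`: filter was rewritten through a join or subquery",
                op, table
            )))
        }
    };

    match op {
        WriteOp::Delete => Ok(Route::Delete { table, predicate }),
        WriteOp::Update => Ok(Route::Update { table, predicate }),
        WriteOp::Insert => Ok(Route::Native), // already returned above
    }
}
```

In the real planner the same decision would be made on LogicalPlan::Dml and a downcast TableProvider. The property worth preserving is that the rewritten-filter case yields an error, never a route with an empty predicate.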
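Acceptance criteria
src/query_planner.rs compiles standalone (mock the exec types if DELETE physical execution (MOR delete files) #17 / UPDATE physical execution (MOR delete + insert) #18 are not yet ported)
Explicit-error behavior preserved: a DELETE/UPDATE plan whose filter has been rewritten through a JOIN or subquery returns a planner error, not an empty filter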
No duckdb crate imports
Documented integration example in the crate's top-level doc-comment showing how to register the planner
Out of scope
The DELETE/UPDATE/MERGE physical execs themselves (separate tickets)
DDL planning (CREATE TABLE etc. continues to flow through DataFusion's native path)
Notes
Audit concern to address before merging: in the fork's implementation, UPDATE detection uses positional matching: projection_exprs[i].name == schema.fields()[i].name(). This depends on DataFusion never reordering projections. The audit flagged this as fragile but not currently broken. Add a runtime assertion that the names match by index and fail loudly with a planner error if they don't — that way a future DataFusion behavior change becomes a clear bug report rather than silent data corruption.
The audit verdict on this file was "solid" — the planner's rejection of join/subquery-rewritten filters is called out as a "genuinely thoughtful safety check."
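The runtime assertion requested in the positional-matching audit note could look roughly like the sketch below. It works on plain name slices rather than DataFusion's expression and schema types; `check_positional_match` and `PlanError` are hypothetical names for illustration. Treating a length mismatch as an error is an added assumption, since the note only mentions comparing names by index.

```rust
// Illustrative only: plain string slices stand in for the projection
// expressions and the target schema's fields.

#[derive(Debug, PartialEq)]
struct PlanError(String);

/// Verifies that projection names line up with schema field names by index.
/// Returns a planner-style error describing the first mismatch found.
fn check_positional_match(
    projection_names: &[&str],
    schema_fields: &[&str],
) -> Result<(), PlanError> {
    // Assumption: a differing column count is also a planning error.
    if projection_names.len() != schema_fields.len() {
        return Err(PlanError(format!(
            "UPDATE projection has {} expressions but the target schema has {} fields",
            projection_names.len(),
            schema_fields.len()
        )));
    }

    for (i, (proj, field)) in projection_names.iter().zip(schema_fields).enumerate() {
        if proj != field {
            return Err(PlanError(format!(
                "UPDATE projection column {} is `{}` but schema field {} is `{}`; \
                 projection order no longer matches the target schema",
                i, proj, i, field
            )));
        }
    }
    Ok(())
}
```

Running a check like this before the SET expressions are extracted means a reordered projection surfaces as a planner error at plan time, before any rows are written.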