Skip to content

[branch-53] Fix DELETE/UPDATE filter extraction when predicates are pushed down into TableScan (#19884)#20898

Open
alamb wants to merge 1 commit intoapache:branch-53from
alamb:alamb/backport_19884_branch53
Open

[branch-53] Fix DELETE/UPDATE filter extraction when predicates are pushed down into TableScan (#19884)#20898
alamb wants to merge 1 commit intoapache:branch-53from
alamb:alamb/backport_19884_branch53

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Mar 12, 2026

…nto TableScan (apache#19884)

## Which issue does this PR close?

* Closes apache#19840.

## Rationale for this change

Custom `TableProvider` implementations (notably those that support
filter pushdown) expect the DML engine to pass `WHERE` predicates into
`TableProvider::delete_from` / `TableProvider::update` so they can apply
targeted changes.

However, when the optimizer pushes predicates down into
`TableScan.filters`, the logical plan may no longer contain a top-level
`Filter` node. The existing DML planning code only collected predicates
from `Filter` nodes, so `delete_from` / `update` could receive an empty
filter list and unintentionally apply the operation to all rows.

This PR fixes filter extraction to also consider pushdown predicates
stored in `TableScan.filters`, while avoiding incorrect behavior in
multi-table contexts.

## What changes are included in this PR?

* Extend DML predicate extraction to collect filters from both:

  * `LogicalPlan::Filter` nodes (splitting `AND` conjunctions), and
  * `LogicalPlan::TableScan.filters` for the **target** table.
* Add target scoping for filter extraction:

* Determine the set of allowed table references for the target table
(including simple aliases via `SubqueryAlias`).
* For `Filter` nodes, retain only predicates whose column references
belong to the target table (or are unqualified).
* For `TableScan.filters`, only read filters from the target table scan
to prevent accidentally pulling predicates from source tables.
* Strip column qualifiers before passing expressions to the provider so
filters evaluate against the provider schema.
* Deduplicate extracted predicates (needed when partial/inexact pushdown
causes the same predicate to appear in both a `Filter` node and
`TableScan.filters`).
* Add comprehensive tests for DELETE/UPDATE covering:

  * exact pushdown (filters live in `TableScan.filters`),
  * compound predicates,
  * mixed locations due to `Inexact` pushdown and per-filter decisions,
  * qualifier stripping behavior.
* Temporarily reject `UPDATE ... FROM` during planning with a clear
`not_impl` error and align sqllogictest expectations.

* This avoids incorrect semantics caused by qualifier stripping /
predicate selection in multi-table update scenarios.

## Are these changes tested?

Yes.

* Added new unit tests in
`datafusion/core/tests/custom_sources_cases/dml_planning.rs` that
validate:

* pushed-down filters are extracted from `TableScan` and passed through
to providers,
  * compound predicates are preserved (not over-deduped),
  * mixed pushdown/residual cases still deliver the full predicate set,
  * qualifiers are stripped from passed filters.
* Updated sqllogictest coverage for `UPDATE ... FROM` to assert the
feature is currently not implemented.

## Are there any user-facing changes?

* **Bug fix:** `DELETE ... WHERE ...` and `UPDATE ... WHERE ...` against
custom `TableProvider`s that use filter pushdown will now correctly pass
predicates to the provider, preventing accidental full-table operations.
* **Behavior change (planning error message / support level):** `UPDATE
... FROM` now consistently returns `This feature is not implemented:
UPDATE ... FROM is not supported` (until the underlying issue is fixed).

## LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated
content has been manually reviewed and tested.
@github-actions github-actions bot added sql SQL Planner core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Mar 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants