Skip to content

Comments

refactor: split pushdown and residual predicates in scan plan#583

Open
belveryin wants to merge 10 commits intomainfrom
split-pushdown-residual
Open

refactor: split pushdown and residual predicates in scan plan#583
belveryin wants to merge 10 commits intomainfrom
split-pushdown-residual

Conversation

@belveryin
Copy link
Collaborator

@belveryin belveryin commented Feb 6, 2026

Summary

  • keep source pushdown (pushdown_predicate) separate from post-merge filtering (residual_predicate)
  • compute residual specifically for non-SST sources so only non-pushdownable clauses are evaluated in PackageStream
  • remove post-merge evaluation of pushdown_predicate; evaluate residual only
  • apply early merge limit whenever residual is empty

Tests

  • cargo test
  • cargo clippy -- -D warnings
  • targeted scan tests added/updated in src/db/tests/core/scan.rs

@belveryin belveryin requested a review from ethe February 6, 2026 15:09
Base automatically changed from prune-memtable-keyrange to main February 12, 2026 09:14
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StartsWith range pruning is not sound for non-SST scans. next_prefix_string can return None on Unicode boundary cases (when char+1 is not a valid scalar), which degrades pruning to a lower-bound-only key range. In non-SST paths, StartsWith is treated as fully pushdown (residual = None), so false positives are never re-filtered.

}

fn split_predicate_for_non_sst(predicate: &Expr, key_schema: &SchemaRef) -> NonSstPredicateSplit {
if key_schema.fields().len() != 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Composite-key + Expr::True incorrectly forces residual filtering and disables early merge limit. split_predicate_for_non_sst returns residual: Some(predicate.clone()) for multi-column keys before handling Expr::True, which propagates to needs_post_filter = true and turns off early merge limit.

@belveryin belveryin requested a review from ethe February 12, 2026 15:32
Copy link
Member

@ethe ethe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StartsWith can be silently dropped for non-string primary keys on non-SSTscans.
In split_predicate_for_non_sst_inner, Expr::StartsWith is treated as fullpushdown when the column matches and next_prefix_string(prefix) exists (src/db/scan.rs:1215), but this path does not verify the key type.However, key-range derivation only supports StartsWith for string keys (src/query/scan.rs:523). For non-string keys, no key range is produced, and becauseresidual is None (src/db/scan.rs:1232, consumed at src/db/scan.rs:226),the predicate is not evaluated at all. This is a correctness issue

@belveryin
Copy link
Collaborator Author

belveryin commented Feb 18, 2026

StartsWith can be silently dropped for non-string primary keys on non-SSTscans. In split_predicate_for_non_sst_inner, Expr::StartsWith is treated as fullpushdown when the column matches and next_prefix_string(prefix) exists (src/db/scan.rs:1215), but this path does not verify the key type.However, key-range derivation only supports StartsWith for string keys (src/query/scan.rs:523). For non-string keys, no key range is produced, and becauseresidual is None (src/db/scan.rs:1232, consumed at src/db/scan.rs:226),the predicate is not evaluated at all. This is a correctness issue

Good catch! Though the phrasing is not completely accurate: this path wasn’t silently dropping the predicate because residual evaluation was still preserving it end-to-end.

This uncovered a real inconsistency, though: non-SST StartsWith pushdown did not check primary-key type while key-range derivation is string-only. The last commit makes non-SST StartsWith type-aware and added a regression test, so non-string PKs keep StartsWith as residual instead of being treated as full pushdown.

@belveryin belveryin requested a review from ethe February 18, 2026 12:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants