Open
Labels
XL - Extra Large: System architecture overhaul, adding support for new platforms, large-scale dependency updates.
Description
Goal
Implement RFC 0010’s two-phase read path so scans push predicates, projection, limit, and PK ordering down to memtables and Parquet SSTs. Deliver a reusable ScanPlan with pruning/RowSets and an execution stream that filters before projection, prunes row-groups/pages, enforces limits early, and preserves PK order.
Current state to build on:
- ScanPlan/projection_with_predicate exist; MergeStream/PackageStream already do PK-ordered MVCC merge + limit; SstableScan handles MVCC/delete sidecar.
- Missing: RowSet abstraction, SST-level pruning (sst_entries currently includes every SST), row-group/page pruning, Parquet RowFilter pushdown, bloom filters.
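To make the missing RowSet piece concrete, here is a minimal sketch of one possible shape for it. This is an assumption for discussion, not the definition in RFC 0010: the `RowSet` name comes from the issue, but the representation (sorted, non-overlapping half-open row ranges) and every method name below are hypothetical.

```rust
/// Hypothetical RowSet: sorted, non-overlapping, half-open row ranges
/// `[start, end)` within a single source (memtable or SST).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RowSet {
    ranges: Vec<(u64, u64)>,
}

impl RowSet {
    /// Selects every row of a source that holds `num_rows` rows.
    pub fn all(num_rows: u64) -> Self {
        let ranges = if num_rows == 0 { Vec::new() } else { vec![(0, num_rows)] };
        RowSet { ranges }
    }

    /// Builds a normalized set: drops empty ranges, sorts, and merges
    /// ranges that overlap or touch.
    pub fn from_ranges(mut ranges: Vec<(u64, u64)>) -> Self {
        ranges.retain(|&(start, end)| start < end);
        ranges.sort_unstable();
        let mut merged: Vec<(u64, u64)> = Vec::with_capacity(ranges.len());
        for (start, end) in ranges {
            if let Some(last) = merged.last_mut() {
                if start <= last.1 {
                    last.1 = last.1.max(end);
                    continue;
                }
            }
            merged.push((start, end));
        }
        RowSet { ranges: merged }
    }

    /// Rows selected by both sets, e.g. combining a key-bounds selection
    /// with a page-level min/max selection.
    pub fn intersect(&self, other: &RowSet) -> RowSet {
        let (mut i, mut j) = (0, 0);
        let mut out = Vec::new();
        while i < self.ranges.len() && j < other.ranges.len() {
            let (a_start, a_end) = self.ranges[i];
            let (b_start, b_end) = other.ranges[j];
            let start = a_start.max(b_start);
            let end = a_end.min(b_end);
            if start < end {
                out.push((start, end));
            }
            // Advance whichever range finishes first.
            if a_end <= b_end { i += 1 } else { j += 1 }
        }
        RowSet { ranges: out }
    }

    /// Number of selected rows.
    pub fn len(&self) -> u64 {
        self.ranges.iter().map(|&(start, end)| end - start).sum()
    }

    /// True when no rows are selected, so the source can be skipped.
    pub fn is_empty(&self) -> bool {
        self.ranges.is_empty()
    }

    /// The normalized ranges in ascending row order.
    pub fn ranges(&self) -> &[(u64, u64)] {
        &self.ranges
    }
}
```

Keeping the ranges sorted means a consumer that walks them in order visits rows in ascending position, which is compatible with the PK-order contract as long as rows inside each source are already PK-sorted.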
Acceptance criteria:
- Planner builds a RowSet per source, extends scan schema with predicate columns, and filters SST/memtable inputs by key bounds and commit_ts stats.
- Parquet scans fetch metadata async, prune row-groups/pages by min/max (and read_ts), and push a predicate-derived RowFilter; only projected columns are read.
- Execution preserves PK-ascending order through pruning, applies residuals (if any), and enforces limit early (stop once satisfied).
- Predicates that reference missing columns fail with a clear error; NULL semantics match the RFC.
- Tests cover SST/memtable pruning, row-group/page skip, projection+predicate, MVCC correctness, early limit, PK order contract, and missing-column errors.
- Bloom filters are either implemented as a stretch (write+read) or explicitly out of scope.
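The pruning and early-limit criteria above can be illustrated with a small dependency-free sketch. It does not use the Parquet reader API; `RowGroupStats`, `ScanBounds`, `prune_row_groups`, and `scan_with_limit` are hypothetical names, keys are simplified to `u64`, and it assumes a row is visible only when its commit_ts is less than or equal to read_ts.

```rust
/// Hypothetical per-row-group statistics, as they might be read from
/// Parquet metadata (keys simplified to u64 for the sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RowGroupStats {
    pub index: usize,
    pub min_key: u64,
    pub max_key: u64,
    pub min_commit_ts: u64,
}

/// Hypothetical scan bounds: inclusive key range plus the MVCC read timestamp.
pub struct ScanBounds {
    pub lower: u64,
    pub upper: u64,
    pub read_ts: u64,
}

/// Returns the indices of row groups that may contain visible, in-range rows.
/// A group is skipped when its key range misses the bounds, or when even its
/// oldest row was committed after `read_ts` (assumed visibility rule).
pub fn prune_row_groups(groups: &[RowGroupStats], bounds: &ScanBounds) -> Vec<usize> {
    groups
        .iter()
        .filter(|g| g.max_key >= bounds.lower && g.min_key <= bounds.upper)
        .filter(|g| g.min_commit_ts <= bounds.read_ts)
        .map(|g| g.index)
        .collect()
}

/// Applies a residual predicate and a limit to a PK-ascending row source.
/// The iterator chain is lazy, so the source is not pulled once `limit`
/// rows have been produced, and input order is preserved.
pub fn scan_with_limit<I, P>(rows: I, predicate: P, limit: usize) -> Vec<(u64, i64)>
where
    I: IntoIterator<Item = (u64, i64)>,
    P: Fn(&(u64, i64)) -> bool,
{
    rows.into_iter()
        .filter(|row| predicate(row))
        .take(limit)
        .collect()
}
```

Because pruning only drops whole row groups and filtering only drops rows, neither step reorders anything, so a PK-ascending input stays PK-ascending. A real implementation would work on Arrow record batches rather than tuples; the tuple form is used here only to keep the sketch self-contained.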
Stories:
- Planner & pruning foundation
- Parquet pushdown execution
- Validation & tests
- Stretch: Bloom filters