Summary
Support aggregate-stats-based scan optimization: when COUNT/MIN/MAX aggregates can be answered from Parquet row-group/file statistics, skip the normal row scan for eligible files and fall back to a row scan when stats are missing or unsupported. This reduces scan cost for simple aggregate queries while preserving correctness through schema/type checks and the row-scan fallback.
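The per-file decision described above can be illustrated with a minimal, self-contained sketch. It is not the code in this PR series: `FileStats`, `AggResult`, `from_stats`, `scan_rows`, and `aggregate` are hypothetical names, and the sketch assumes a single non-null `i64` column where each file carries optional row-count, min, and max statistics.

```rust
/// Hypothetical per-file statistics; `None` means the statistic is missing.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FileStats {
    row_count: Option<u64>,
    min: Option<i64>,
    max: Option<i64>,
}

/// Partial or final result for COUNT/MIN/MAX over one column.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AggResult {
    count: u64,
    min: Option<i64>,
    max: Option<i64>,
}

/// Stand-in for the normal row scan of one file (hypothetical).
fn scan_rows(rows: &[i64]) -> AggResult {
    AggResult {
        count: rows.len() as u64,
        min: rows.iter().copied().min(),
        max: rows.iter().copied().max(),
    }
}

/// Answer from statistics only when every needed statistic is present.
/// Returning `None` tells the caller to fall back to a row scan.
fn from_stats(stats: &FileStats) -> Option<AggResult> {
    let count = stats.row_count?;
    if count == 0 {
        // An empty file contributes nothing to MIN/MAX.
        return Some(AggResult { count: 0, min: None, max: None });
    }
    Some(AggResult { count, min: Some(stats.min?), max: Some(stats.max?) })
}

/// Combine two partial results.
fn merge(a: AggResult, b: AggResult) -> AggResult {
    AggResult {
        count: a.count + b.count,
        min: match (a.min, b.min) {
            (Some(x), Some(y)) => Some(x.min(y)),
            (x, y) => x.or(y),
        },
        max: match (a.max, b.max) {
            (Some(x), Some(y)) => Some(x.max(y)),
            (x, y) => x.or(y),
        },
    }
}

/// Aggregate over all files; also report how many files needed a row scan.
fn aggregate(files: &[(FileStats, Vec<i64>)]) -> (AggResult, usize) {
    let mut total = AggResult { count: 0, min: None, max: None };
    let mut row_scans = 0;
    for (stats, rows) in files {
        let partial = match from_stats(stats) {
            Some(p) => p, // eligible: skip the row scan
            None => {
                row_scans += 1; // stats missing: fall back
                scan_rows(rows)
            }
        };
        total = merge(total, partial);
    }
    (total, row_scans)
}

fn main() {
    let files = vec![
        // Complete stats: answered without touching rows.
        (FileStats { row_count: Some(3), min: Some(3), max: Some(7) }, vec![3, 7, 5]),
        // Missing min: this file falls back to a row scan.
        (FileStats { row_count: Some(2), min: None, max: Some(10) }, vec![-2, 10]),
    ];
    let (total, row_scans) = aggregate(&files);
    println!("{:?}, files row-scanned: {}", total, row_scans);
}
```

The fallback is decided per file, so one file with missing statistics does not force a full row scan of the others. A real implementation would also need the schema/type checks mentioned above and separate handling of nullable columns, both of which this sketch omits.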
Related PRs
Component Breakdown
| Component | Description | Status |
| --- | --- | --- |
| Foundation types / helpers | Adds store-api stats DTOs, common-query stats candidate evaluation, and table row-group pruning stats helpers | 🔄 |
| Scanner runtime integration | Adds `RegionScanner::scan_stats`, stats-aware scanner properties/request wiring, runtime decision state, and mito2 stats stream production | 🔜 |
| Optimizer rewrite | Adds `AggrStats` optimizer rule and `StatsScanExec` to rewrite eligible aggregate scans to stats-backed execution | 🔜 |
| Validation coverage | Adds sqlness, integration tests, and partition/filter test adaptations for end-to-end behavior | 🔜 |