Add EnsureRequirements: merged EnforceDistribution + EnforceSorting with idempotent pushdown_sorts#21976

Draft
zhuqi-lucas wants to merge 1 commit into apache:main from zhuqi-lucas:ensure-requirements

Conversation


@zhuqi-lucas zhuqi-lucas commented May 1, 2026

Summary

Replace the separate EnforceDistribution and EnforceSorting optimizer rules with a single EnsureRequirements rule in the default optimizer chain. Fix pushdown_sorts to be distribution-aware and fix EnforceDistribution fetch preservation (#14150), making the composition idempotent.

Epic: #21973

Problem

EnforceDistribution and EnforceSorting run as separate rules, but sorting and distribution are coupled through SortExec.preserve_partitioning. This caused:

  1. Production 502 errors: pushdown_sorts set preserve_partitioning=true on multi-partition input without inserting SortPreservingMergeExec, violating SinglePartition requirements from GlobalLimitExec → SanityCheckPlan failure.

  2. Non-idempotent composition: Running the rules multiple times produced different (sometimes invalid) plans.

  3. Lost fetch values (#14150): EnforceDistribution dropped fetch from SortPreservingMergeExec/CoalescePartitionsExec when stripping and re-adding distribution operators.
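
The first failure mode can be illustrated with a toy plan model. This is only a sketch: the `Plan` enum, `output_partitions`, and `sanity_check` below are hypothetical stand-ins, not DataFusion's `ExecutionPlan` or `SanityCheckPlan` APIs, and the partition rules are simplified assumptions.

```rust
// Toy model only: these types are hypothetical and are not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Leaf scan producing `partitions` output partitions.
    Scan { partitions: usize },
    /// Sort; with `preserve_partitioning` each partition is sorted independently.
    Sort { preserve_partitioning: bool, input: Box<Plan> },
    /// Merges sorted partitions into a single sorted partition.
    SortPreservingMerge { input: Box<Plan> },
    /// Limit that requires a single input partition.
    GlobalLimit { input: Box<Plan> },
}

/// Number of output partitions under the simplified rules of this toy model.
fn output_partitions(plan: &Plan) -> usize {
    match plan {
        Plan::Scan { partitions } => *partitions,
        Plan::Sort { preserve_partitioning, input } => {
            if *preserve_partitioning {
                output_partitions(input)
            } else {
                1
            }
        }
        Plan::SortPreservingMerge { .. } | Plan::GlobalLimit { .. } => 1,
    }
}

/// Rejects plans where a GlobalLimit sits on a multi-partition input.
fn sanity_check(plan: &Plan) -> Result<(), String> {
    match plan {
        Plan::Scan { .. } => Ok(()),
        Plan::Sort { input, .. } | Plan::SortPreservingMerge { input } => sanity_check(input),
        Plan::GlobalLimit { input } => {
            let n = output_partitions(input);
            if n != 1 {
                return Err(format!(
                    "GlobalLimit requires SinglePartition, got {} partitions",
                    n
                ));
            }
            sanity_check(input)
        }
    }
}
```

In this model, `GlobalLimit` over `Sort { preserve_partitioning: true }` over a 4-partition scan fails the check, while the same plan with a `SortPreservingMerge` between the limit and the sort passes.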

DataFusion was the only major query engine with separate rules — Spark (EnsureRequirements) and Presto/Trino (AddExchanges) handle both in a single rule.

Changes

1. EnsureRequirements rule (new, replaces default chain)

  • Composes EnforceDistribution::optimize() + EnforceSorting::optimize() in a single rule
  • Replaces Arc::new(EnforceDistribution) + Arc::new(EnforceSorting) in the default optimizer chain
  • 53 comprehensive tests covering all known bug topologies + idempotency verification
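
The composition described above can be sketched as follows. This is a minimal illustration, not the PR's code: the `Rule` trait, `ToyPlan`, `AddIfMissing`, and `Composed` are hypothetical, whereas DataFusion's real rules implement `PhysicalOptimizerRule` over `Arc<dyn ExecutionPlan>`.

```rust
/// Toy plan (hypothetical): a top-down list of operator names.
type ToyPlan = Vec<&'static str>;

/// Hypothetical stand-in for an optimizer rule interface.
trait Rule {
    fn name(&self) -> &str;
    fn optimize(&self, plan: ToyPlan) -> Result<ToyPlan, String>;
}

/// Toy rule: puts the named operator on top of the plan unless it is already present.
struct AddIfMissing(&'static str);

impl Rule for AddIfMissing {
    fn name(&self) -> &str {
        self.0
    }
    fn optimize(&self, mut plan: ToyPlan) -> Result<ToyPlan, String> {
        if !plan.contains(&self.0) {
            plan.insert(0, self.0);
        }
        Ok(plan)
    }
}

/// Runs `first`, then `second`, and exposes the pair as one rule.
struct Composed<A, B> {
    first: A,
    second: B,
}

impl<A: Rule, B: Rule> Rule for Composed<A, B> {
    fn name(&self) -> &str {
        "EnsureRequirements"
    }
    fn optimize(&self, plan: ToyPlan) -> Result<ToyPlan, String> {
        // An error from the first step short-circuits the second.
        let plan = self.first.optimize(plan)?;
        self.second.optimize(plan)
    }
}
```

With `Composed { first: AddIfMissing("Repartition"), second: AddIfMissing("Sort") }`, the optimizer chain sees a single rule name, which matches the EXPLAIN VERBOSE change noted for explain.slt.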

2. Distribution-aware pushdown_sorts (sort_pushdown.rs)

  • Add distribution_requirement: Distribution field to ParentRequirements
  • New add_sort_above_with_distribution() in utils.rs — inserts SortPreservingMergeExec when parent requires SinglePartition and input has multiple partitions
  • Switch both add_sort_above call sites to distribution-aware variant
  • Propagate distribution through recursion with stronger_distribution() helper
  • Reset distribution below partition-merging nodes (SPM, single-partition outputs)
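
A rough sketch of the distribution-aware insertion, under stated assumptions: the function names `stronger_distribution` and `add_sort_above_with_distribution` come from this PR, but the signatures, bodies, and the two-variant `Distribution` enum below are hypothetical simplifications (hash partitioning is omitted, for example).

```rust
// Toy model only: simplified, hypothetical versions of the helpers named in this PR.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Distribution {
    Unspecified,
    SinglePartition,
}

/// Keeps the stricter of two requirements (assumed order: SinglePartition > Unspecified).
fn stronger_distribution(a: Distribution, b: Distribution) -> Distribution {
    if a == Distribution::SinglePartition || b == Distribution::SinglePartition {
        Distribution::SinglePartition
    } else {
        Distribution::Unspecified
    }
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { partitions: usize },
    Sort { preserve_partitioning: bool, input: Box<Plan> },
    SortPreservingMerge { input: Box<Plan> },
}

fn output_partitions(plan: &Plan) -> usize {
    match plan {
        Plan::Scan { partitions } => *partitions,
        Plan::Sort { preserve_partitioning: true, input } => output_partitions(input),
        Plan::Sort { .. } | Plan::SortPreservingMerge { .. } => 1,
    }
}

/// Adds a sort above `input`; when the parent needs a single partition and the
/// input has several, also adds a merge so the requirement still holds.
fn add_sort_above_with_distribution(input: Plan, required: Distribution) -> Plan {
    let multi = output_partitions(&input) > 1;
    let sort = Plan::Sort {
        preserve_partitioning: multi,
        input: Box::new(input),
    };
    if multi && required == Distribution::SinglePartition {
        Plan::SortPreservingMerge { input: Box::new(sort) }
    } else {
        sort
    }
}
```

The point of the sketch is that the merge is inserted in the same step that sets `preserve_partitioning`, so no later pass has to repair the plan.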

3. Fix EnforceDistribution fetch preservation (#14150)

  • remove_dist_changing_operators() now saves fetch from removed SPM/Coalesce
  • add_merge_on_top() re-applies saved fetch to re-created operators
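
The strip and re-add cycle can be sketched on a toy plan. The function names mirror the ones in this PR, but the signatures and bodies are hypothetical; in particular, keeping the smallest `fetch` when several merge operators are stacked, and always re-adding a `SortPreservingMerge`, are assumptions of this sketch rather than confirmed behavior.

```rust
// Toy model only: hypothetical types, not DataFusion's operators.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { partitions: usize },
    SortPreservingMerge { fetch: Option<usize>, input: Box<Plan> },
    CoalescePartitions { fetch: Option<usize>, input: Box<Plan> },
}

/// Strips merge operators from the top of the plan and returns the remaining
/// plan together with the fetch that the removed operators carried.
fn remove_dist_changing_operators(plan: Plan) -> (Plan, Option<usize>) {
    match plan {
        Plan::SortPreservingMerge { fetch, input } | Plan::CoalescePartitions { fetch, input } => {
            let (rest, inner) = remove_dist_changing_operators(*input);
            // Assumption: with stacked operators, the tightest limit wins.
            let saved = match (fetch, inner) {
                (Some(a), Some(b)) => Some(a.min(b)),
                (a, b) => a.or(b),
            };
            (rest, saved)
        }
        other => (other, None),
    }
}

/// Re-creates a merge on top of `input`, carrying over the saved fetch.
fn add_merge_on_top(input: Plan, fetch: Option<usize>) -> Plan {
    Plan::SortPreservingMerge { fetch, input: Box::new(input) }
}
```

Stripping a `SortPreservingMerge` with `fetch: Some(5)` and re-adding it through these two functions reproduces the original plan, which is the round-trip property the #14150 fix targets.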

4. Updated SLT

  • explain.slt: Two optimizer rule names (EnforceDistribution, EnforceSorting) become one (EnsureRequirements) in EXPLAIN VERBOSE output

Testing

| Suite | Result |
|-------|--------|
| EnsureRequirements (new) | 53 passed |
| enforce_sorting (existing) | 124 passed, 0 regressions |
| enforce_distribution (existing) | 66 passed, 0 regressions |
| SLT (465 files) | 1 pre-existing failure only |
| Total | 243 unit + 464 SLT, 0 new failures |

Idempotency coverage

Scenarios verified:

  • Multi-partition sort + limit (2, 4, 8, 16, 32, 64 partitions)
  • Union with mixed partition counts
  • Projection over multi-partition
  • HashJoin (Partitioned)
  • SortMergeJoin
  • Window function partitioning + ordering
  • Aggregate (Partial + FinalPartitioned)
  • Nested sort + limit
  • Hash repartition + sort
  • CoalescePartitions + sort (parallelize_sorts)
  • SPM → Sort → multi-partition
  • PR #53 scenario (OutputRequirementExec + SinglePartition)
  • PR #54 scenario (ProjectionExec + multi-partition)
  • #14150 scenario (fetch preservation across passes)
  • All partition counts 1-64 sweep
  • Triple optimization convergence
  • 10x consecutive optimization stability
  • EnforceDistribution::optimize twice (sort+limit)
  • EnforceDistribution::optimize twice (HashJoin)

Architecture

EnsureRequirements::optimize(plan)
  Step 1: EnforceDistribution::optimize(plan)
    - Join key reordering (top-down)
    - Distribution enforcement (bottom-up)
    - Fetch preservation on SPM/Coalesce removal (#14150 fix)
  Step 2: EnforceSorting::optimize(plan)
    - ensure_sorting (bottom-up)
    - parallelize_sorts (bottom-up)
    - replace_with_order_preserving_variants (bottom-up)
    - pushdown_sorts (top-down, distribution-aware)
    - replace_with_partial_sort (bottom-up)

Idempotent because:

  • pushdown_sorts now carries distribution_requirement and uses add_sort_above_with_distribution
  • EnforceDistribution preserves fetch across strip/re-add cycles
  • Running EnsureRequirements twice produces identical plans (verified by 53 tests)
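
The idempotency property itself can be checked with a small harness. The sketch below is hypothetical and independent of DataFusion: `is_stable`, `toy_rule`, and `always_wrap` are illustrative names, and the toy rules only stand in for a real optimizer rule.

```rust
// Toy model only: hypothetical plan and rules used to illustrate the check.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { partitions: usize },
    Merge { input: Box<Plan> },
}

/// Idempotent toy rule: merges a multi-partition scan, leaves anything else alone.
fn toy_rule(plan: Plan) -> Plan {
    match plan {
        Plan::Scan { partitions } if partitions > 1 => Plan::Merge {
            input: Box::new(Plan::Scan { partitions }),
        },
        other => other,
    }
}

/// Non-idempotent toy rule: wraps the plan in a new Merge on every pass.
fn always_wrap(plan: Plan) -> Plan {
    Plan::Merge { input: Box::new(plan) }
}

/// Applies `rule` `passes` times and reports whether every pass after the
/// first left the plan unchanged. With `passes <= 1` the result is trivially true.
fn is_stable<F: Fn(Plan) -> Plan>(rule: F, plan: Plan, passes: usize) -> bool {
    let mut current = rule(plan);
    for _ in 1..passes {
        let next = rule(current.clone());
        if next != current {
            return false;
        }
        current = next;
    }
    true
}
```

Sweeping `is_stable(toy_rule, Plan::Scan { partitions: n }, 10)` over `n` in `1..=64` mirrors the shape of the partition-count sweep and the 10x stability check listed under idempotency coverage, while `always_wrap` shows what a failing rule looks like.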

Next Steps (future PRs)

  • Gradually implement pushdown_sorts optimizations in the bottom-up ensure_sorting pass
  • Eliminate pushdown_sorts top-down pass entirely
  • Single-pass architecture (one transform_up for both distribution + sorting, like Spark)

Closes: #14150

@github-actions github-actions Bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) and removed sqllogictest SQL Logic Tests (.slt) labels May 1, 2026
@zhuqi-lucas zhuqi-lucas force-pushed the ensure-requirements branch 4 times, most recently from 50ce304 to dfb1043 Compare May 2, 2026 04:58
…ceSorting

## Summary

Replace the separate `EnforceDistribution` and `EnforceSorting` optimizer rules
with a single `EnsureRequirements` rule in the default optimizer chain. This makes
the composition idempotent by fixing distribution-awareness in `pushdown_sorts`
and fetch preservation in `EnforceDistribution`.

## Problem

`EnforceDistribution` and `EnforceSorting` are coupled through
`SortExec.preserve_partitioning` but run as independent rules. This caused:

1. **Production 502 errors**: `pushdown_sorts` set `preserve_partitioning=true`
   without `SortPreservingMergeExec`, violating `SinglePartition` requirements
   from `GlobalLimitExec` → `SanityCheckPlan` failure.

2. **Non-idempotent composition**: Running the rules multiple times produced
   different (sometimes invalid) plans.

3. **Lost fetch values** (apache#14150): `EnforceDistribution` dropped `fetch` from
   `SortPreservingMergeExec` when stripping and re-adding distribution operators.

DataFusion was the only major engine with separate rules — Spark (`EnsureRequirements`)
and Presto (`AddExchanges`) use a single rule.

## Changes

### `EnsureRequirements` rule (new)
- Composes `EnforceDistribution::optimize()` + `EnforceSorting::optimize()`
- Replaces both rules in the default optimizer chain
- 53 comprehensive tests including idempotency verification

### Distribution-aware `pushdown_sorts` (fix)
- Add `distribution_requirement` field to `ParentRequirements`
- New `add_sort_above_with_distribution()` inserts `SortPreservingMergeExec`
  when parent requires `SinglePartition` and input has multiple partitions
- Propagate distribution through recursion with `stronger_distribution()`
- Reset distribution below partition-merging nodes (SPM, single-partition outputs)

### Fix `EnforceDistribution` fetch preservation (apache#14150)
- `remove_dist_changing_operators()` now saves fetch from removed SPM/Coalesce
- `add_merge_on_top()` re-applies saved fetch to new operators

## Testing

| Suite | Result |
|-------|--------|
| EnsureRequirements (new) | 53 passed |
| enforce_sorting (existing) | 124 passed, 0 regressions |
| enforce_distribution (existing) | 66 passed, 0 regressions |
| SLT (465 files) | 1 pre-existing failure only |
| **Total** | **243 unit + 464 SLT, 0 new failures** |

Idempotency verified:
- All partition counts 1-64
- Triple + 10x consecutive optimization passes
- SortMergeJoin, HashJoin, Window, Aggregate topologies
- PR apache#53/apache#54 regression scenarios
- apache#14150 fetch preservation across passes

Closes: apache#14150
Part of: apache#21973
@zhuqi-lucas zhuqi-lucas force-pushed the ensure-requirements branch from dfb1043 to ba7e30e Compare May 2, 2026 07:41
@github-actions github-actions Bot added the core Core DataFusion crate label May 2, 2026

Labels

core (Core DataFusion crate), optimizer (Optimizer rules), sqllogictest (SQL Logic Tests (.slt))

Development

Successfully merging this pull request may close these issues.

Bug: applying multiple times EnforceDistribution generates invalid plan
