Skip to content

Feature: Add delimiter-style correlated join with outer-to-inner runtime filtering #19790

@forsaken628

Description

@forsaken628

Summary

Improve execution of decorrelated correlated subqueries by introducing a delimiter-style plan that uses the outer query keys to restrict the inner/subquery scan.

Databend rewrites correlated subqueries into joins. In many cases, the inner side only needs rows whose correlated keys appear in the outer side. If we can materialize the outer
key domain and apply it to the inner scan, we can avoid scanning and processing irrelevant inner rows.

This is similar to the delimiter join idea used by systems such as DuckDB.

Problem

Today, after decorrelation, correlated subqueries are represented as normal joins, for example:

LeftSemi(outer, inner)
LeftAnti(outer, inner)
RightMark(outer, inner)
LeftSingle(outer, inner)

The physical HashJoin currently treats:

left child = probe
right child = build

Runtime filters are generated from the build side and applied to the probe side. For correlated subqueries, this often means the runtime filter is generated from the inner side
and applied to the outer side.

The more useful direction is usually the opposite:

outer correlated keys -> filter inner scan

Rows in the inner side whose correlated keys do not appear in the outer side cannot affect the correlated subquery result.

Proposal

Introduce explicit logical support for correlated-subquery execution instead of relying on late physical-plan pattern matching.

A possible representation is either a new logical operator:

DelimJoin / CorrelatedJoin

or a Join with correlation metadata:

  Join {
      from_correlated_subquery: true,
      correlation: Some(CorrelationInfo),
  }

The metadata should describe:

outer keys
inner keys
null semantics
whether the correlation filter is optional or required

Two Execution Modes

There should be two separate modes.

  1. Best-effort outer-to-inner runtime filter

Keep the normal decorrelated plan unchanged.

The optimizer records correlation metadata, and the physical planner tries to generate a runtime filter from the outer keys to the inner scan.

If the runtime filter cannot be built, the plan still runs as a normal decorrelated join.

This mode is safe because runtime filter is only an optimization.

  1. Required delimiter-style plan

Create a separate logical alternative where the outer key domain is a required input for the inner scan.

In this mode, some aggregates introduced by decorrelation may be removed, because the delimiter key domain already provides the needed deduplication/filtering effect.

This plan must not silently fall back to a normal join after the aggregate has been removed. If the required delimiter filter cannot be preserved or implemented, the optimizer
must choose the normal decorrelated plan instead.

Why This Must Be Logical

Databend still has a logical SExpr tree after CBO. The physical builder later interprets this logical tree.

Therefore this optimization cannot be only a late physical runtime-filter trick. The logical plan must carry the correlation execution property.

This is especially important for aggregate removal. Aggregates in Databend are tied to distributed execution:

partial/final aggregate split
shuffle/exchange requirement
aggregate state schema
statistics and required properties

Once an aggregate-free correlated plan is chosen, the physical planner cannot cheaply add the aggregate back as a fallback.

So the optimizer should keep two logical alternatives:

normal decorrelated plan with aggregate
delimiter-style plan without aggregate

The second alternative is only valid if the required delimiter filter can be guaranteed.

Correctness Requirements

The optimization must preserve correlated subquery semantics.

Important constraints:

runtime/delimiter filters must have no false negatives
Bloom false positives are acceptable
NULL-aware semantics for IN / ANY / NOT IN / mark joins must be preserved
residual non-equi predicates must still be evaluated
if key lineage is unclear, best-effort metadata can be dropped
required delimiter metadata must not be silently dropped

Expected Benefits

reduce inner-side scan work
reduce join input size
avoid unnecessary aggregate/shuffle for eligible correlated plans
improve performance for selective outer key domains
provide a safe fallback through the normal decorrelated plan

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions