refactor: make Recursive CTE execution more streaming-oriented #19545
KKould wants to merge 16 commits into databendlabs:main
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
This PR was originally intended to address #18237, in which Databend could not execute a Sudoku query.
However, the surface-level error reported in that issue has already been fixed by #19212.
While working on this problem, this PR added support for UnionAll and RecursiveCteScan in the SubqueryDecorrelatorOptimizer within flatten_plan. During testing with deeply recursive cases such as Sudoku, it was discovered that the current implementation of RecursiveCteScan performs a large amount of intermediate materialization. In scenarios with deep recursion, these intermediate results accumulate rapidly, causing query memory usage to grow dramatically, which eventually prevents the query from completing and may even lead to a system crash.
Therefore, this PR prioritizes a refactoring of RecursiveCteScan. The goal is to make its execution more streaming-oriented, reducing unnecessary intermediate materialization and repeated execution during the query process. This helps control memory usage and enables deeply recursive queries like Sudoku to run more reliably.
Key Fixes
Previously, rCTE internal table names were only unique within a query. This caused duplicated recursive CTE branches produced by decorrelation to run independently on different internal tables.
With this change, the name is stable per (query, logical rCTE identity) pair, so duplicated branches of the same recursive CTE share the same internal tables instead of executing separately.
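To make the naming change concrete, here is a minimal Rust sketch. The function name, the `__rcte_` prefix, and the parameter types are assumptions for illustration only; they are not taken from the PR and may differ from Databend's actual naming scheme.

```rust
/// Hypothetical helper (not Databend's real API): derive an internal table
/// name from the query id and the logical rCTE id only. Because no
/// per-occurrence counter is involved, every duplicated branch of the same
/// recursive CTE computes the same name.
fn internal_table_name(query_id: &str, logical_recursive_cte_id: u64) -> String {
    format!("__rcte_{}_{}", query_id, logical_recursive_cte_id)
}
```

Two branches produced by decorrelation that carry the same logical id would resolve to one internal table, while a different recursive CTE in the same query would still get its own.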
A logical_recursive_cte_id is introduced and passed through the logical plan → physical plan → execution layer.
This allows the executor to determine that multiple operators belong to the same recursive CTE, instead of inferring it indirectly from table_name or alias.
The runtime id was previously stored only within a single QueryContext.
It is now promoted to shared state, allowing subqueries or child contexts to retrieve the same runtime id for the same logical rCTE.
This is essential for enabling reuse in correlated subqueries.
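The shared-state idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `QueryCtx`, `SharedRcteState`, `child`, and `runtime_id_for` are hypothetical names, and the real `QueryContext` is far richer.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical state shared by a query context and all of its children.
#[derive(Default)]
struct SharedRcteState {
    next_runtime_id: u64,
    /// Maps logical_recursive_cte_id -> runtime id.
    runtime_ids: HashMap<u64, u64>,
}

/// Simplified stand-in for a query context (assumption, not Databend's type).
#[derive(Clone, Default)]
struct QueryCtx {
    shared: Arc<Mutex<SharedRcteState>>,
}

impl QueryCtx {
    /// A child context (e.g. for a subquery) reuses the parent's shared state.
    fn child(&self) -> QueryCtx {
        QueryCtx { shared: Arc::clone(&self.shared) }
    }

    /// Return the runtime id for a logical rCTE, allocating it on first use.
    fn runtime_id_for(&self, logical_recursive_cte_id: u64) -> u64 {
        let mut state = self.shared.lock().unwrap();
        if let Some(id) = state.runtime_ids.get(&logical_recursive_cte_id) {
            return *id;
        }
        let id = state.next_runtime_id;
        state.next_runtime_id += 1;
        state.runtime_ids.insert(logical_recursive_cte_id, id);
        id
    }
}
```

Because the map lives behind an `Arc`, a correlated subquery running in a child context sees the same runtime id as its parent for the same logical rCTE.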
Execution Layer Improvements
RecursiveCteScan now operates through reader registration and block fetching, instead of being tightly coupled to a single execution instance.
The memory table has been changed from a flat Vec to a generation/frontier + reader cursor model.
This explicitly tracks:
Instead of concatenating an entire iteration and emitting it at once, the scan now pulls data block by block via take_one_block(reader_id).
Tips:
Example:
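The following is not the example from the PR description; it is a hedged Rust sketch of what a generation/frontier table with reader cursors could look like. Only the name `take_one_block(reader_id)` comes from the text above; its signature, the `Block` type, and every other name here (`RecursiveCteTable`, `register_reader`, `append`, `advance_generation`) are assumptions.

```rust
use std::collections::HashMap;

/// Stand-in for a data block (assumption; the real type is a columnar block).
type Block = Vec<i64>;

#[derive(Default)]
struct RecursiveCteTable {
    generation: u64,
    /// Blocks readable in the current generation.
    frontier: Vec<Block>,
    /// Blocks produced by the recursive step for the next generation.
    pending: Vec<Block>,
    /// Reader id -> index of the next frontier block to hand out.
    cursors: HashMap<usize, usize>,
    next_reader_id: usize,
}

impl RecursiveCteTable {
    /// Register a scan as a reader and return its id.
    fn register_reader(&mut self) -> usize {
        let id = self.next_reader_id;
        self.next_reader_id += 1;
        self.cursors.insert(id, 0);
        id
    }

    /// Buffer a block for the next generation.
    fn append(&mut self, block: Block) {
        self.pending.push(block);
    }

    /// Hand out one frontier block to the given reader, or None when the
    /// reader is unknown or has consumed the whole frontier.
    fn take_one_block(&mut self, reader_id: usize) -> Option<Block> {
        let cursor = self.cursors.get_mut(&reader_id)?;
        let block = self.frontier.get(*cursor)?.clone();
        *cursor += 1;
        Some(block)
    }

    /// Promote pending blocks to the new frontier and reset all cursors.
    /// Returns false when the new frontier is empty (recursion is done).
    fn advance_generation(&mut self) -> bool {
        self.frontier = std::mem::take(&mut self.pending);
        self.generation += 1;
        for cursor in self.cursors.values_mut() {
            *cursor = 0;
        }
        !self.frontier.is_empty()
    }

    fn generation(&self) -> u64 {
        self.generation
    }
}
```

In this sketch each reader keeps its own cursor, so several scans of the same recursive CTE can consume the frontier independently, one block at a time, and the previous generation is dropped as soon as the next one is promoted.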
Tests
Type of change