Add more window function plumbing and implement rank() function #7388
Open
jussisaurio wants to merge 5 commits into
Merging this PR will not alter performance
Introduces a plan-layer Frame type that captures a window function's effective frame after per-function coercion. Built-in window functions get their frame from WindowFunc::coerced_frame(), which mirrors SQLite's coercion table in window.c:700-708. Aggregate window functions inherit the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The Frame is recorded on each WindowFunction at translate time. No emit machinery reads it yet; this is the foundation for the frame-cursor emission rewrite that follows.
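To make the coercion idea concrete, here is a minimal Rust sketch of a frame type and a per-function coercion. It is not the PR's code: the names FrameKind, Bound, Frame::DEFAULT and the catch-all Aggregate variant are assumptions made for illustration. Only the row_number and rank coercions and the aggregate default come from the commit messages in this PR.

```rust
/// Hypothetical frame unit; the PR's actual Frame type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum FrameKind {
    Rows,
    Range,
}

/// Hypothetical frame bound; only the two bounds mentioned in the PR are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Bound {
    UnboundedPreceding,
    CurrentRow,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Frame {
    kind: FrameKind,
    start: Bound,
    end: Bound,
}

impl Frame {
    /// The default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
    const DEFAULT: Frame = Frame {
        kind: FrameKind::Range,
        start: Bound::UnboundedPreceding,
        end: Bound::CurrentRow,
    };
}

/// Illustrative subset of window functions (Aggregate stands in for sum/count/avg).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WindowFunc {
    RowNumber,
    Rank,
    Aggregate,
}

impl WindowFunc {
    /// Returns the frame the function effectively runs with.
    fn coerced_frame(self) -> Frame {
        match self {
            // row_number coerces to ROWS UNBOUNDED..CURRENT.
            WindowFunc::RowNumber => Frame { kind: FrameKind::Rows, ..Frame::DEFAULT },
            // rank coerces to RANGE UNBOUNDED..CURRENT.
            WindowFunc::Rank => Frame { kind: FrameKind::Range, ..Frame::DEFAULT },
            // Aggregate window functions inherit the default frame in this sketch.
            WindowFunc::Aggregate => Frame::DEFAULT,
        }
    }
}
```

Deriving Eq and Hash on the frame lets it participate in equality checks or map keys later, which is convenient if frames are compared at plan time.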
Renames the per-window buffer cursors from buffer_read / buffer_write to csr_current / csr_write so they line up with SQLite's sqlite3WindowCodeStep naming (window.c around line 2786). No behaviour change. Additional frame-position cursors (csr_start / csr_end in SQLite's numbering) will land alongside the emit code that consumes them.
A single OVER clause can host functions whose coerced frames disagree (e.g. row_number coerces to ROWS UNBOUNDED..CURRENT, rank to RANGE UNBOUNDED..CURRENT). SQLite's emit pipeline asserts that every function attached to one window step has the same frame (window.c:1679), and lets mixed-frame queries compile to a separate ephemeral-table pass per frame.

Move the Frame field off WindowFunction and onto Window. Two windows are now equivalent only when both their OVER clause and their coerced frame agree, so the planner allocates one Window per (partition, order, frame) triple. The existing nested-subquery rewrite in prepare_window_subquery then produces one ephemeral-table pass per Window, the same shape as SQLite.

For named windows referenced via Over::Name, the WINDOW-clause stub is created with a placeholder frame and overwritten by the first function that attaches; subsequent functions with conflicting frames clone the named window's partition/order into a fresh entry carrying their own frame.

No emit-side change yet; the emit body still treats each Window's frame as the implicit UNBOUNDED..CURRENT it always was. The split is the planner-side pre-condition for the upcoming xStep/xInverse dispatch.
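The grouping rule above can be sketched as follows. This is an illustrative model only: the Window fields, the string-based partition/order keys, the two-variant Frame enum, and the attach helper are all hypothetical and much simpler than a real planner's data structures.

```rust
/// Hypothetical, simplified frame: only the two coerced frames named in the PR.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Frame {
    RowsUnboundedToCurrent,
    RangeUnboundedToCurrent,
}

/// Hypothetical plan-layer window: one entry per (partition, order, frame) triple.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Window {
    partition_by: Vec<String>,
    order_by: Vec<String>,
    frame: Frame,
    functions: Vec<String>,
}

/// Attaches `func` to a window that matches on partition, order AND frame,
/// allocating a fresh window when none matches. Returns the window's index.
fn attach(
    windows: &mut Vec<Window>,
    partition_by: &[&str],
    order_by: &[&str],
    frame: Frame,
    func: &str,
) -> usize {
    let partition_by: Vec<String> = partition_by.iter().map(|s| s.to_string()).collect();
    let order_by: Vec<String> = order_by.iter().map(|s| s.to_string()).collect();

    // Equivalence requires the OVER clause and the coerced frame to agree.
    let existing = windows.iter().position(|w| {
        w.partition_by == partition_by && w.order_by == order_by && w.frame == frame
    });
    if let Some(idx) = existing {
        windows[idx].functions.push(func.to_string());
        return idx;
    }

    // Same partition/order but a different frame gets its own window,
    // and therefore its own ephemeral-table pass.
    windows.push(Window {
        partition_by,
        order_by,
        frame,
        functions: vec![func.to_string()],
    });
    windows.len() - 1
}
```

Under these assumptions, attaching row_number and rank to the same OVER clause yields two windows, while a second function whose coerced frame matches rank's reuses rank's window.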
…uffered-row pattern
Window functions fall into two categories with respect to when their
AggStep fires: aggregate-like functions (sum/count/avg as windows, and
the upcoming rank-family functions) count source rows as they arrive
from the input subquery and capture a single value per peer-group
flush; row_number and the upcoming positional functions (lag, lead,
first/last/nth_value) step per buffered row at flush time because
their value changes within a peer group.
Express this as a WindowFunc::steps_per_source_row() classifier and
split the two emit sites accordingly:
- emit_aggregation_step (called per source row) handles aggregate
  window funcs and the rank-family;
- emit_return_buffered_rows pre-loop captures AggValue once per
  flush for both categories above, and the per-row loop body steps
  and reads only the per-buffered-row category.
No behavioral change: the rank-family branch is currently dead because
none of those functions are wired in name resolution yet. row_number
continues to emit via the per-buffered-row path. Mirrors how SQLite
gates its xStep emission per function family rather than running a
runtime peer-detection flag.
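A minimal sketch of such a classifier, assuming a small illustrative subset of functions. The enum variants and the split_by_step_site helper are hypothetical; only the steps_per_source_row name and the two categories come from the commit message above.

```rust
/// Illustrative subset of window functions; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WindowFunc {
    Sum,
    Count,
    Avg,
    Rank,
    RowNumber,
}

impl WindowFunc {
    /// true: step once per source row, read one value per peer-group flush.
    /// false: step per buffered row at flush time.
    fn steps_per_source_row(self) -> bool {
        match self {
            WindowFunc::Sum | WindowFunc::Count | WindowFunc::Avg | WindowFunc::Rank => true,
            WindowFunc::RowNumber => false,
        }
    }
}

/// Hypothetical helper: splits functions by the emit site that steps them.
/// The first vector goes to the per-source-row site, the second to the
/// per-buffered-row loop body.
fn split_by_step_site(funcs: &[WindowFunc]) -> (Vec<WindowFunc>, Vec<WindowFunc>) {
    funcs
        .iter()
        .copied()
        .partition(|f| f.steps_per_source_row())
}
```

Because the match is exhaustive, adding a new variant forces a decision about which emit site steps it, so the classification is made at compile time rather than by a runtime flag.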
rank() assigns the same value to all rows in a peer group (rows with equal ORDER BY values) and skips ranks by group size, so [a,a,b,c,c] ordered ascending becomes [1,1,3,4,4].

Wire 'rank' name resolution to WindowFunc::Rank and add step/value arms in op_window_step / op_window_value. The state machine mirrors SQLite's rankStepFunc / rankValueFunc: payload[0] is the current rank value (cleared by AggValue, latched by the next step), payload[1] is the rows-seen counter (always increments). The 'latch on zero' trick lets every peer in a group read the same rank without an explicit peer-detection signal: AggValue's clear is what tells the next step 'you start a new peer group'.

Verification:
- New tests/window/rank.sqltest covers peers/no-peers, partition, no-ORDER-BY, NULL handling, empty input, multi-column ORDER BY, DESC, COLLATE NOCASE peer detection, partition reset with repeated values, alongside row_number, and two rank windows with different orderings.
- Updates two cases in tests/window/memory.sqltest that previously expected 'no such function: rank': one becomes a passing test, the other now waits on dense_rank().
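The 'latch on zero' state machine can be sketched in plain Rust as below. This is a standalone model, not the VDBE code: RankState and rank_sorted are hypothetical names, the input is assumed to be already sorted within a single partition, and peer groups are detected by comparing adjacent values to stand in for the peer-group flush.

```rust
/// Models the two payload slots: rank (payload[0]) and rows_seen (payload[1]).
#[derive(Debug, Default)]
struct RankState {
    rank: i64,
    rows_seen: i64,
}

impl RankState {
    /// Called once per source row: always count the row, and latch the
    /// rank only if the previous value read cleared it.
    fn step(&mut self) {
        self.rows_seen += 1;
        if self.rank == 0 {
            self.rank = self.rows_seen;
        }
    }

    /// Called once per peer-group flush: return the latched rank and clear
    /// it, which tells the next step that a new peer group starts.
    fn value(&mut self) -> i64 {
        let rank = self.rank;
        self.rank = 0;
        rank
    }
}

/// Ranks an already-sorted slice, flushing whenever the value changes.
fn rank_sorted<T: PartialEq>(sorted: &[T]) -> Vec<i64> {
    let mut state = RankState::default();
    let mut out = Vec::with_capacity(sorted.len());
    let mut buffered = 0usize; // rows waiting in the current peer group

    for (i, row) in sorted.iter().enumerate() {
        // A change in value closes the previous peer group.
        if i > 0 && sorted[i - 1] != *row {
            let rank = state.value();
            out.extend(std::iter::repeat(rank).take(buffered));
            buffered = 0;
        }
        state.step();
        buffered += 1;
    }

    // Flush the final peer group, if any.
    if buffered > 0 {
        let rank = state.value();
        out.extend(std::iter::repeat(rank).take(buffered));
    }
    out
}
```

Running rank_sorted over ["a", "a", "b", "c", "c"] reproduces the [1, 1, 3, 4, 4] example: the second "a" and the second "c" find the rank already latched, so they only advance the rows-seen counter.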
Add some new window-specific plumbing and implement rank() just so that we have some feature work here instead of just enablers.