Add more window function plumbing and implement rank() function #7388

Open
jussisaurio wants to merge 5 commits into main from window-plumbing-2-and-rank-function

@jussisaurio jussisaurio commented Jun 5, 2026

Collaborator

Add some new window-specific plumbing and implement rank() just so that we have some feature work here instead of just enablers.

codspeed-hq Bot commented Jun 8, 2026

Merging this PR will not alter performance

✅ 638 untouched benchmarks
⏩ 105 skipped benchmarks [1]


Comparing window-plumbing-2-and-rank-function (651e34a) with main (c54ea4f)

Footnotes

  1. 105 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived in CodSpeed to remove them from the performance reports.

@jussisaurio jussisaurio force-pushed the window-plumbing-2-and-rank-function branch 3 times, most recently from 109cbda to 57105c9 on June 8, 2026 12:11

Introduces a plan-layer Frame type that captures a window function's
effective frame after per-function coercion. Built-in window functions
get their frame from WindowFunc::coerced_frame(), which mirrors
SQLite's coercion table in window.c:700-708. Aggregate window functions
inherit the default RANGE UNBOUNDED PRECEDING TO CURRENT ROW.

The Frame is recorded on each WindowFunction at translate time. No
emit machinery reads it yet — this is foundation for the frame-cursor
emission rewrite that follows.
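The per-function coercion described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `FrameMode`, `FrameBound`, the field names, and `Frame::DEFAULT` are hypothetical, and only the two coercions this PR mentions elsewhere (row_number to ROWS, rank to RANGE) are modeled.

```rust
/// How frame boundaries are measured (hypothetical names for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameMode {
    Rows,
    Range,
}

/// Only the two bounds needed for UNBOUNDED PRECEDING .. CURRENT ROW.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameBound {
    UnboundedPreceding,
    CurrentRow,
}

/// Plan-layer frame: the effective frame after per-function coercion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Frame {
    pub mode: FrameMode,
    pub start: FrameBound,
    pub end: FrameBound,
}

impl Frame {
    /// The default frame that aggregate window functions inherit.
    pub const DEFAULT: Frame = Frame {
        mode: FrameMode::Range,
        start: FrameBound::UnboundedPreceding,
        end: FrameBound::CurrentRow,
    };
}

/// Built-in window functions covered by this sketch (assumed subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WindowFunc {
    RowNumber,
    Rank,
}

impl WindowFunc {
    /// Returns the frame this function is coerced to, regardless of
    /// what the user wrote in the OVER clause.
    pub fn coerced_frame(self) -> Frame {
        match self {
            // row_number counts physical rows, so it is pinned to ROWS.
            WindowFunc::RowNumber => Frame {
                mode: FrameMode::Rows,
                ..Frame::DEFAULT
            },
            // rank works on peer groups, so it keeps RANGE.
            WindowFunc::Rank => Frame::DEFAULT,
        }
    }
}
```

With these definitions, `WindowFunc::Rank.coerced_frame()` equals `Frame::DEFAULT`, while `WindowFunc::RowNumber.coerced_frame()` differs only in its mode.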

Renames the per-window buffer cursors from buffer_read / buffer_write
to csr_current / csr_write so they line up with SQLite's
sqlite3WindowCodeStep naming (window.c around line 2786). No behaviour
change.

Additional frame-position cursors (csr_start / csr_end in SQLite's
numbering) will land alongside the emit code that consumes them.

A single OVER clause can host functions whose coerced frames disagree
(e.g. row_number coerces to ROWS UNBOUNDED..CURRENT, rank to
RANGE UNBOUNDED..CURRENT). SQLite's emit pipeline asserts that every
function attached to one window step has the same frame
(window.c:1679), and lets mixed-frame queries compile to a separate
ephemeral-table pass per frame.

Move the Frame field off WindowFunction and onto Window. Two windows
are now equivalent only when both their OVER clause and their coerced
frame agree, so the planner allocates one Window per (partition,
order, frame) triple. The existing nested-subquery rewrite in
prepare_window_subquery then produces one ephemeral-table pass per
Window — same shape as SQLite.

For named windows referenced via Over::Name, the WINDOW-clause stub
is created with a placeholder frame and overwritten by the first
function that attaches; subsequent functions with conflicting frames
clone the named window's partition/order into a fresh entry carrying
their own frame.

No emit-side change yet; the emit body still treats each Window's
frame as the implicit UNBOUNDED..CURRENT it always was. The split is
the planner-side pre-condition for the upcoming xStep/xInverse
dispatch.
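To illustrate the (partition, order, frame) grouping, here is a hedged find-or-create sketch. The `Window` fields, the string-based column lists, and the `attach` helper are all hypothetical simplifications, and the named-window placeholder handling is not modeled.

```rust
/// Coerced frame, reduced to the two variants discussed in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Frame {
    RowsUnboundedToCurrent,
    RangeUnboundedToCurrent,
}

/// One planner window: functions here share partition, order and frame.
/// Columns and functions are plain strings to keep the sketch small.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Window {
    pub partition_by: Vec<String>,
    pub order_by: Vec<String>,
    pub frame: Frame,
    pub functions: Vec<String>,
}

/// Attaches `func` to the window matching the full triple, creating a
/// new window when none matches. Returns the index of the window used.
pub fn attach(
    windows: &mut Vec<Window>,
    partition_by: &[&str],
    order_by: &[&str],
    frame: Frame,
    func: &str,
) -> usize {
    let partition_by: Vec<String> = partition_by.iter().map(|s| s.to_string()).collect();
    let order_by: Vec<String> = order_by.iter().map(|s| s.to_string()).collect();

    // Two windows are equivalent only if OVER clause AND frame agree.
    let existing = windows.iter().position(|w| {
        w.partition_by == partition_by && w.order_by == order_by && w.frame == frame
    });
    if let Some(idx) = existing {
        windows[idx].functions.push(func.to_string());
        return idx;
    }

    windows.push(Window {
        partition_by,
        order_by,
        frame,
        functions: vec![func.to_string()],
    });
    windows.len() - 1
}
```

Under this sketch, attaching row_number and rank with the same `PARTITION BY` and `ORDER BY` yields two windows, because their frames differ, while a second RANGE function over the same clause joins the window rank already created.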

@jussisaurio jussisaurio force-pushed the window-plumbing-2-and-rank-function branch 2 times, most recently from d06b5af to 965ee56 on June 8, 2026 13:13

…uffered-row pattern

Window functions fall into two categories with respect to when their
AggStep fires: aggregate-like functions (sum/count/avg as windows, and
the upcoming rank-family functions) count source rows as they arrive
from the input subquery and capture a single value per peer-group
flush; row_number and the upcoming positional functions (lag, lead,
first/last/nth_value) step per buffered row at flush time because
their value changes within a peer group.

Express this as a WindowFunc::steps_per_source_row() classifier and
split the two emit sites accordingly:

  - emit_aggregation_step (called per source row) handles aggregate
    window funcs and the rank-family;
  - emit_return_buffered_rows pre-loop captures AggValue once per
    flush for both categories above, and the per-row loop body steps
    + reads only the per-buffered-row category.

No behavioral change: the rank-family branch is currently dead because
none of those functions are wired in name resolution yet. row_number
continues to emit via the per-buffered-row path. Mirrors how SQLite
gates its xStep emission per function family rather than running a
runtime peer-detection flag.
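A minimal sketch of the classifier described above. The variant list and the `split_emit_sites` helper are assumptions for illustration; in particular, the real code may represent aggregate window functions separately from `WindowFunc`.

```rust
/// Window functions covered by this sketch (assumed subset; aggregates
/// are folded into the same enum purely for brevity).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WindowFunc {
    Sum,
    Count,
    Avg,
    Rank,
    RowNumber,
}

impl WindowFunc {
    /// true  => step once per source row as it arrives from the input;
    /// false => step once per buffered row at flush time.
    pub fn steps_per_source_row(self) -> bool {
        match self {
            // Aggregate-like and rank-family: one value per peer group.
            WindowFunc::Sum | WindowFunc::Count | WindowFunc::Avg | WindowFunc::Rank => true,
            // The value changes within a peer group.
            WindowFunc::RowNumber => false,
        }
    }
}

/// Hypothetical helper: splits functions between the two emit sites,
/// returning (per-source-row, per-buffered-row).
pub fn split_emit_sites(funcs: &[WindowFunc]) -> (Vec<WindowFunc>, Vec<WindowFunc>) {
    funcs.iter().copied().partition(|f| f.steps_per_source_row())
}
```

Because the decision is a static property of the function, it can be made while emitting bytecode, with no runtime flag needed.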

rank() assigns the same value to all rows in a peer group (rows with
equal ORDER BY values) and skips ranks by group size, so [a,a,b,c,c]
ordered ascending becomes [1,1,3,4,4].

Wire 'rank' name resolution to WindowFunc::Rank and add step/value
arms in op_window_step / op_window_value. The state machine mirrors
SQLite's rankStepFunc / rankValueFunc: payload[0] is the current rank
value (cleared by AggValue, latched by the next step), payload[1] is
the rows-seen counter (always increments). The 'latch on zero' trick
lets every peer in a group read the same rank without an explicit
peer-detection signal — AggValue's clear is what tells the next step
'you start a new peer group'.
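The "latch on zero" state machine can be sketched in isolation as follows. `RankState` and the `rank_sorted` driver are hypothetical: the driver detects peer-group ends by looking ahead in already-sorted input, which only approximates the buffered-row flush in the real emit code.

```rust
/// Two-slot state, following the payload layout described above.
#[derive(Debug, Default)]
pub struct RankState {
    /// payload[0]: current rank (0 means "not latched yet").
    /// payload[1]: rows seen so far.
    payload: [i64; 2],
}

impl RankState {
    /// Called once per source row.
    pub fn step(&mut self) {
        self.payload[1] += 1; // always increments
        if self.payload[0] == 0 {
            // First row after a value() call: latch the new rank.
            self.payload[0] = self.payload[1];
        }
    }

    /// Called once per peer-group flush; reads and clears the rank.
    pub fn value(&mut self) -> i64 {
        let rank = self.payload[0];
        self.payload[0] = 0; // tells the next step a new group starts
        rank
    }
}

/// Hypothetical driver: ranks rows already sorted by the ORDER BY key.
pub fn rank_sorted<T: PartialEq>(sorted: &[T]) -> Vec<i64> {
    let mut state = RankState::default();
    let mut out = Vec::with_capacity(sorted.len());
    let mut buffered = 0usize;
    for (i, row) in sorted.iter().enumerate() {
        state.step();
        buffered += 1;
        // The peer group ends when the next row differs or input ends.
        let group_ends = sorted.get(i + 1).map_or(true, |next| next != row);
        if group_ends {
            let rank = state.value();
            // Every buffered peer receives the same captured value.
            out.extend(std::iter::repeat(rank).take(buffered));
            buffered = 0;
        }
    }
    out
}
```

Calling `rank_sorted(&["a", "a", "b", "c", "c"])` returns `[1, 1, 3, 4, 4]`, matching the example in the commit message: the second `a` and second `c` step the counter but leave the latched rank untouched.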

Verification:
- New tests/window/rank.sqltest covers peers/no-peers, partition,
  no-ORDER-BY, NULL handling, empty input, multi-column ORDER BY,
  DESC, COLLATE NOCASE peer detection, partition reset with
  repeated values, alongside row_number, and two rank windows
  with different orderings.
- Updates two cases in tests/window/memory.sqltest that previously
  expected 'no such function: rank' — one becomes a passing test,
  the other now waits on dense_rank().

@jussisaurio jussisaurio force-pushed the window-plumbing-2-and-rank-function branch from 965ee56 to 651e34a on June 8, 2026 14:05
@jussisaurio jussisaurio marked this pull request as ready for review June 8, 2026 14:07
@jussisaurio jussisaurio changed the title from "wip: window plumbing 2 and rank function" to "Add more window function plumbing and implement rank() function" Jun 8, 2026