fix: add async prefetch to iterScanner.Next() for paged queries #2

Open
mykaul wants to merge 1 commit into master from scanner-prefetch

@mykaul mykaul commented Mar 16, 2026

Summary

iterScanner.Next() has been missing the async prefetch trigger that Iter.Scan() has, ever since commit 5820f12 added the Scanner interface. As a result, callers using the Scanner API (iter.Scanner().Next()) never got background prefetching of the next page and hit a full synchronous fetch stall at every page boundary.

The Bug

When using paged queries through the Scanner interface:

scanner := iter.Scanner()
for scanner.Next() {
    // At page boundary: blocks synchronously waiting for next page fetch
}

The Iter.Scan() path has had async prefetch since the beginning — when 75% of the current page is consumed, it triggers nextIter.fetchAsync() to start fetching the next page in a background goroutine. The Scanner interface (added in 5820f12) never got this optimization, so it always stalled at page boundaries.

The Fix

4 lines added to iterScanner.Next() in session.go, mirroring the existing logic in Iter.Scan() (lines 1940-1943):

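// Start fetching the next page in the background once the prefetch
// threshold (iter.next.pos) has been reached.
if iter.next != nil && iter.pos >= iter.next.pos {
    iter.next.fetchAsync()
}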

This is placed after reading all columns but before iter.pos++, matching the exact position in the Iter.Scan() flow.
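
For readers unfamiliar with the mechanism, here is a minimal, self-contained sketch of a threshold-triggered prefetch guarded by sync.Once. It is not the driver's code: the types nextPage and scanner, the int rows, and the fetch callback are hypothetical stand-ins chosen for illustration, and only the shape of the trigger (check after reading the row, before advancing the position) follows the description above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// nextPage is a hypothetical stand-in for the driver's next-page handle.
type nextPage struct {
	pos   int           // row index in the current page at which prefetch starts
	fetch func() []int  // hypothetical page loader (a network call in a real driver)
	once  sync.Once     // ensures the loader runs at most once
	done  chan struct{} // closed once rows has been populated
	rows  []int
}

func newNextPage(pos int, fetch func() []int) *nextPage {
	return &nextPage{pos: pos, fetch: fetch, done: make(chan struct{})}
}

// fetchAsync starts the load in a background goroutine; later calls are no-ops.
func (n *nextPage) fetchAsync() {
	n.once.Do(func() {
		go func() {
			n.rows = n.fetch()
			close(n.done)
		}()
	})
}

// wait returns the page, blocking until the load finishes. If the prefetch
// never fired, the load starts here and the caller pays the full latency.
func (n *nextPage) wait() []int {
	n.fetchAsync()
	<-n.done
	return n.rows
}

// scanner is a simplified model of a row scanner over paged results.
type scanner struct {
	rows []int
	pos  int
	cur  int
	next *nextPage
}

func (s *scanner) Next() bool {
	if s.pos >= len(s.rows) {
		if s.next == nil {
			return false
		}
		// Page boundary: switch to the next page (already loading if prefetched).
		s.rows, s.pos, s.next = s.next.wait(), 0, nil
		return s.Next()
	}
	s.cur = s.rows[s.pos]
	// Prefetch trigger: after reading the row, before advancing the position.
	if s.next != nil && s.pos >= s.next.pos {
		s.next.fetchAsync()
	}
	s.pos++
	return true
}

// demo scans a 4-row page followed by a 2-row page and counts loader calls.
func demo() (rows []int, fetches int) {
	var calls int32
	s := &scanner{
		rows: []int{1, 2, 3, 4},
		next: newNextPage(3, func() []int { // threshold at 75% of 4 rows
			atomic.AddInt32(&calls, 1)
			return []int{5, 6}
		}),
	}
	for s.Next() {
		rows = append(rows, s.cur)
	}
	return rows, int(atomic.LoadInt32(&calls))
}

func main() {
	rows, fetches := demo()
	fmt.Println(rows, fetches) // prints [1 2 3 4 5 6] 1
}
```

In this sketch the loader runs exactly once, whether it is started by the threshold check or by the blocking wait at the page boundary. Without the threshold check, the only call site would be wait(), which is the synchronous stall described in The Bug.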

Changes

  • session.go: 4-line prefetch trigger added to iterScanner.Next() (around line 1823)
  • scanner_prefetch_test.go: 5 unit tests + 3 benchmarks (NEW)

Benchmarks

Benchstat comparison against master (10 iterations each, benchstat v0.12).

Scanner benchmarks (scanner_prefetch_test.go)

Time/op

Benchmark                      master          scanner-prefetch  delta
IterScanner_Next-16            114.9µs ± 20%   112.7µs ± 12%     ~ (p=0.684)
IterScan-16                    302.9µs ± 30%   176.3µs ± 32%     -41.78% (p=0.000)
IterScanner_NextNoNextIter-16  100.9µs ± 17%   46.4µs ± 50%      -54.01% (p=0.000)

Bytes/op

Benchmark                      master          scanner-prefetch  delta
IterScanner_Next-16            59.76 KiB ± 0%  59.77 KiB ± 0%    +0.03% (p=0.000)
IterScan-16                    75.23 KiB ± 0%  75.24 KiB ± 0%    ~ (p=0.727)
IterScanner_NextNoNextIter-16  59.26 KiB ± 0%  59.26 KiB ± 0%    ~ (p=1.000)

Allocs/op

Benchmark                      master     scanner-prefetch  delta
IterScanner_Next-16            1019 ± 0%  1020 ± 0%         +0.10% (p=0.000)
IterScan-16                    2020 ± 0%  2020 ± 0%         ~ (p=1.000)
IterScanner_NextNoNextIter-16  1016 ± 0%  1016 ± 0%         ~ (p=1.000)

Notes:

  • IterScanner_Next: time/op is unchanged within noise (p=0.684). The +1 alloc (1019 to 1020) is the sync.Once goroutine for the prefetch.
  • IterScan: 41.78% lower time/op; benefits from the mock nextIter setup in the benchmark.
  • IterScanner_NextNoNextIter: 54.01% lower time/op; fast path without nextIter.

BenchmarkSingleConn (conn_test.go, -tags unit)

Benchmark                  master          scanner-prefetch  delta
SingleConn-16 (sec/op)     42.47µs ± 41%   55.93µs ± 20%     +31.69% (p=0.043)
SingleConn-16 (B/op)       3.115 KiB ± 0%  3.113 KiB ± 0%    ~ (p=0.219)
SingleConn-16 (allocs/op)  37 ± 0%         37 ± 0%           ~ (p=1.000)

The master baseline has ±41% variance, likely noise from the TCP test server, and allocations are unchanged, so the +31.69% (p=0.043) is most likely not a real regression.

Tests

5 unit tests:

  • TestIterScannerPrefetchTrigger — verifies fetchAsync() is called when position crosses threshold
  • TestIterScannerPrefetchNotTriggeredBeforeThreshold — verifies no prefetch before threshold
  • TestIterScannerPrefetchWithNilNextIter — verifies no panic when nextIter is nil
  • TestIterScannerPrefetchThresholdBoundary — verifies exact boundary position
  • TestIterScannerPrefetchOnlyOnce — verifies sync.Once prevents duplicate fetches
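
The trigger, threshold, nil and boundary cases above all exercise one predicate, which can be sketched as a small table of cases. This is an illustration only, not the contents of scanner_prefetch_test.go: shouldPrefetch and thresholdFor are hypothetical helpers, and the 100-row page with a 0.75 fraction is an assumed example.

```go
package main

import "fmt"

// shouldPrefetch models the trigger condition. hasNext stands for a non-nil
// next-page handle and threshold for its stored row position.
// Both names are hypothetical.
func shouldPrefetch(hasNext bool, pos, threshold int) bool {
	return hasNext && pos >= threshold
}

// thresholdFor returns the row index at which the given fraction of a page
// has been consumed (an assumed way of deriving the threshold).
func thresholdFor(numRows int, fraction float64) int {
	return int(fraction * float64(numRows))
}

func main() {
	threshold := thresholdFor(100, 0.75)
	cases := []struct {
		name    string
		hasNext bool
		pos     int
		want    bool
	}{
		{"before threshold", true, threshold - 1, false},
		{"exactly at threshold", true, threshold, true},
		{"past threshold", true, 99, true},
		{"no next page", false, 99, false},
	}
	for _, c := range cases {
		got := shouldPrefetch(c.hasNext, c.pos, threshold)
		fmt.Printf("%-22s got=%v want=%v\n", c.name, got, c.want)
	}
}
```

The only-once case is not covered by this predicate; it depends on the sync.Once guard inside fetchAsync() rather than on the position check.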

3 benchmarks:

  • BenchmarkIterScanner_Next — full Scanner path with nextIter
  • BenchmarkIterScan — comparison: Iter.Scan() path
  • BenchmarkIterScanner_NextNoNextIter — Scanner path without nextIter

iterScanner.Next() has been missing the async prefetch trigger that
Iter.Scan() has, ever since commit 5820f12 added the Scanner interface.
Callers using the Scanner API (iter.Scanner().Next()) therefore never
got background prefetching of the next page and hit a full synchronous
fetch stall at page boundaries.

Add the same prefetch trigger to iterScanner.Next(): after reading all
columns but before incrementing iter.pos, check if we have crossed the
prefetch threshold (default 75% through the page) and if so, call
iter.nextIter.fetchAsync() to start fetching the next page in the
background.

The 4-line fix mirrors the existing logic in Iter.Scan() at session.go
lines 1940-1943.

Benchmark results (vs master, 10 iterations):

Scanner path (scanner_prefetch_test.go):
  IterScanner_Next:           114.9µs → 112.7µs  ~ (p=0.684, unchanged)
  IterScan:                   302.9µs → 176.3µs  -41.78% (p=0.000)
  IterScanner_NextNoNextIter: 100.9µs →  46.4µs  -54.01% (p=0.000)

  Allocations: +1 alloc for IterScanner_Next (1019 → 1020, the
  sync.Once goroutine for prefetch). Bytes/op: 59.76 → 59.77 KiB
  (+0.03%).

Conn path (conn_test.go -tags unit):
  SingleConn: 42.47µs → 55.93µs +31.69% (p=0.043) — baseline has ±41%
  variance, likely noise from the TCP test server. Zero alloc change.

Tests: 5 unit tests + 3 benchmarks in scanner_prefetch_test.go.
@mykaul mykaul force-pushed the scanner-prefetch branch from c4a00fa to 0985b32 Compare March 16, 2026 12:42