Skip to content

scan: refresh parameter state and rescan on parameter changes (PARAM_EXTERN + PARAM_EXEC)#587

Merged
burmecia merged 2 commits intomainfrom
bnjjj/param_fingerprint_rescan
Mar 12, 2026
Merged

scan: refresh parameter state and rescan on parameter changes (PARAM_EXTERN + PARAM_EXEC)#587
burmecia merged 2 commits intomainfrom
bnjjj/param_fingerprint_rescan

Conversation

@bnjjj
Copy link
Copy Markdown
Contributor

@bnjjj bnjjj commented Mar 9, 2026

What kind of change does this PR introduce?

Bug/Feature

What is the current behavior?

supabase-wrappers currently updates qual.value from runtime params, but parameter state is not consistently tracked for FDW consumers and rescans:

  1. PARAM_EXTERN values were not written into param.eval_value.
  2. PARAM_EXEC wrote param.eval_value only on successful non-null eval and did not always clear stale previous values.
  3. ReScanForeignScan always called re_scan() and never re-ran begin_scan() when parameter values changed.

This is problematic for FDWs that:

  • rely on param.eval_value for SQL pushdown decisions, and/or
  • buffer/materialize rows in begin_scan.

For parameterized queries, parameter values can change between rescans (especially with correlated/LATERAL execution). If wrappers only rewinds local FDW state, the FDW may serve stale rows from the previous parameter value.

Example queries affected

  1. Correlated/LATERAL (PARAM_EXEC) rescan correctness
SELECT o.id, f.note
FROM (VALUES (1), (2)) AS o(id)
LEFT JOIN LATERAL (
  SELECT note
  FROM orders_fdw
  WHERE orders_fdw.id = o.id
  ORDER BY note
  LIMIT 1
) AS f ON true;

Expected:

row for o.id=1 uses param 1
row for o.id=2 uses param 2

Before this patch, a materializing FDW could reuse buffered rows from first parameter and return incorrect repeated results.

  1. Prepared statement (PARAM_EXTERN) pushdown visibility
PREPARE q(int) AS
SELECT id, note
FROM orders_fdw
WHERE id = $1;

FDWs reading param.eval_value had no consistent runtime extern value to push down from wrappers.

  1. NULL transition safety
EXECUTE q(2);
EXECUTE q(NULL);

Without clearing runtime parameter state, FDW-side logic could accidentally observe stale non-null values.

What is the new behavior?

My main goal was to not introduce big breaking changes.

  1. Always refresh param.eval_value in parameter assignment:
    • PARAM_EXTERN: set Some(Value::Cell(...)) when available, otherwise None
    • PARAM_EXEC: set Some(...) on success, otherwise None
    • no stale carry-over
  2. Add compute_param_fingerprint to scan state:
    • hashes qual/param identity + current param.eval_value
    • used to detect actual parameter changes between scans
  3. Change rescan behavior:
    • On BeginForeignScan: assign params + compute initial fingerprint
    • On ReScanForeignScan:
      • recompute fingerprint
      • if changed: call begin_scan() (refresh remote/materialized results)
      • if unchanged: call re_scan() (cheap rewind)

Additional context

compute_param_fingerprint intentionally hashes param.eval_value (runtime truth), not qual.value, because from my understanding qual.value can contain placeholder/stale values for parameterized quals.

I don't know if it could break existing behavior, you'll probably better than me to know this so if you think this fix is unnecessary or wrong feel free to tell me.

… rescan

Signed-off-by: Benjamin <5719034+bnjjj@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 9, 2026 09:41
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 9, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced scan parameter handling to prevent crashes when dealing with null or out-of-range parameter lists.
  • Performance

    • Optimized rescan operations through parameter change detection, reducing unnecessary scan re-initialization when parameters remain unchanged.

Walkthrough

This pull request introduces a parameter fingerprinting mechanism to the foreign data wrapper scan logic. A new param_fingerprint field is added to FdwState to track parameter state across scan iterations. During initial scan setup in begin_foreign_scan, a fingerprint is computed and stored. During rescans in re_scan_foreign_scan, the current fingerprint is compared with the stored one; if they differ, the scan is reinitialized, otherwise a standard rescan is performed. Additionally, the parameter assignment flow is extended to track current_value for each parameter, and minor null-safety guards are added to PARAM_EXTERN handling.

Sequence Diagram(s)

sequenceDiagram
    participant Executor as Query Executor
    participant FdwState as FdwState
    participant ParamAssign as Parameter Assignment
    participant FingerprintCalc as Fingerprint Computation
    participant ScanOp as Scan Operation

    Executor->>FdwState: begin_foreign_scan()
    FdwState->>ParamAssign: assign_parameter_value()
    ParamAssign->>ParamAssign: track current_value
    ParamAssign->>FdwState: update param.eval_value
    FdwState->>FingerprintCalc: compute_param_fingerprint()
    FingerprintCalc->>FingerprintCalc: hash(field, operator, kind, id, type_oid, eval_value)
    FingerprintCalc->>FdwState: return fingerprint hash
    FdwState->>FdwState: store param_fingerprint
    FdwState->>ScanOp: initialize scan

    Executor->>FdwState: re_scan_foreign_scan()
    FdwState->>FingerprintCalc: compute_param_fingerprint()
    FingerprintCalc->>FdwState: return current fingerprint
    FdwState->>FdwState: compare with stored param_fingerprint
    alt Fingerprint Changed
        FdwState->>FdwState: reinitialize via begin_scan()
    else Fingerprint Unchanged
        FdwState->>ScanOp: perform standard rescan()
    end
Loading

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes three related bugs in the supabase-wrappers scan infrastructure around how runtime parameter values (PARAM_EXTERN from prepared statements and PARAM_EXEC from LATERAL/correlated queries) are tracked and propagated to FDW implementations.

Changes:

  • PARAM_EXTERN values are now written into param.eval_value, and both PARAM_EXTERN and PARAM_EXEC now correctly set eval_value to None when the parameter is NULL or unavailable (stale values no longer persist).
  • A compute_param_fingerprint function and param_fingerprint field are introduced to track whether parameter values have changed between rescans.
  • re_scan_foreign_scan now calls begin_scan() instead of re_scan() when the fingerprint changes, so FDWs that buffer data in begin_scan will re-execute their remote query with updated parameter values.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

type_oid: param.type_oid,
eval_value: match param.eval_value.lock() {
Ok(value) => {
format!("{:?}", *value)
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The eval_value field in ParamFingerPrint is computed using format!("{:?}", *value), which relies on the Debug representation of Option<Value> and in turn Value and Cell. For floating-point variants (Cell::F32, Cell::F64), the Debug representation may expose representation quirks (e.g., NaN and -NaN would both hash differently even if logically equivalent). More importantly, if the Debug output of Value or Cell changes in the future, it would silently change fingerprint behavior.

A more robust approach would be to implement Hash directly on Value and Cell, or to compare eval_value contents for equality rather than relying on serialized Debug strings.

Suggested change
format!("{:?}", *value)
// Derive a stable fingerprint from the parameter value itself
let mut value_hasher = DefaultHasher::new();
(*value).hash(&mut value_hasher);
value_hasher.finish().to_string()

Copilot uses AI. Check for mistakes.
}

#[derive(Hash, PartialEq, Eq)]
struct ParamFingerPrint<'a> {
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The struct is named ParamFingerPrint (with capital P in "Print"), but Rust naming conventions for structs use PascalCase without internal capitalization distinctions for compound words — it should be ParamFingerprint (lowercase 'p' in 'print'). All other structs in this file (FdwState, ExprEval, etc.) follow standard PascalCase without gratuitous internal capitalization.

Copilot uses AI. Check for mistakes.
if !fdw_state.is_null() {
let mut state = PgBox::<FdwState<E, W>>::from_pg(fdw_state);
let result = state.re_scan();
assign_paramenter_value(node, &mut state);
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new caller in re_scan_foreign_scan at this line still uses the typo-containing name assign_paramenter_value (should be assign_parameter_value). Since the PR adds this new call site, it would be appropriate to also rename the function to fix the pre-existing spelling error paramenterparameter.

Copilot uses AI. Check for mistakes.
Comment on lines +555 to +556
state.param_fingerprint = next_fingerprint;
state.begin_scan()
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When re_scan_foreign_scan detects a fingerprint change and calls begin_scan() (line 556), the previous scan session's resources have never been cleaned up via end_scan(). FDW implementations such as ClickHouseFdw, BigQueryFdw, IcebergFdw, etc., typically allocate connections, cursors, or buffers in begin_scan and release them in end_scan. Calling begin_scan() again without a preceding end_scan() may cause connection/resource leaks, or even double initialization bugs.

To fix this, end_scan() should be called on the state before invoking begin_scan() when the fingerprint has changed.

Suggested change
state.param_fingerprint = next_fingerprint;
state.begin_scan()
// The parameters have changed; clean up the previous scan session
// before starting a new one to avoid leaking resources.
let end_result = state.end_scan();
if end_result.is_err() {
end_result
} else {
state.param_fingerprint = next_fingerprint;
state.begin_scan()
}

Copilot uses AI. Check for mistakes.
fn compute_param_fingerprint<E: Into<ErrorReport>, W: ForeignDataWrapper<E>>(
state: &FdwState<E, W>,
) -> u64 {
let mut hasher = DefaultHasher::new();
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::collections::hash_map::DefaultHasher is explicitly documented as not being stable: its hashing algorithm may change between Rust versions, and its output is also subject to per-process hash randomization (HashDoS protection). While collision probability is low for this use case (same-process comparison only), the implementation is specifically called out in Rust docs as "not guaranteed" for stability.

A more robust choice for a fingerprint would be using std::hash::BuildHasher with a stable algorithm (e.g., via the ahash or fnv crates), or simply comparing the eval_value contents directly (equality check without hashing), since the number of params per qual is expected to be small. Using a direct equality comparison would also eliminate the (very small but nonzero) probability of hash collisions causing a missed rescan.

Copilot uses AI. Check for mistakes.
Comment on lines 359 to 384
@@ -370,21 +379,63 @@ unsafe fn assign_paramenter_value<E: Into<ErrorReport>, W: ForeignDataWrapper<E>
) && let Some(cell) =
Cell::from_polymorphic_datum(datum, isnull, param.type_oid)
{
let mut eval_value = param
.eval_value
.lock()
.expect("param.eval_value should be locked");
*eval_value = Some(Value::Cell(cell.clone()));
qual.value = Value::Cell(cell);
qual.value = Value::Cell(cell.clone());
current_value = Some(Value::Cell(cell));
}
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PARAM_EXTERN: when the parameter is SQL NULL (i.e., p.isnull == true), Cell::from_polymorphic_datum returns None, which means eval_value is correctly set to None, but qual.value is NOT updated — it retains its previous value (or the initial dummy Cell::I64(0) set at planning time in qual.rs:238). This stale qual.value will be passed to begin_scan() with an incorrect value if the parameter is NULL.

The same issue applies to PARAM_EXEC when the evaluation yields NULL (line 379–384): qual.value is not reset to reflect the NULL state.

For correctness, when a parameter evaluates to NULL, qual.value should also be updated to reflect the null state (e.g., by clearing or setting to a sentinel value).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@supabase-wrappers/src/scan.rs`:
- Around line 554-557: The code currently calls state.begin_scan() when
parameters change (comparing next_fingerprint to state.param_fingerprint), which
can re-enter begin_scan() on a live FDW and leak or corrupt resources; instead
detect the parameter change and invoke the proper restart/cleanup flow: if the
FDW implements re_scan(), call state.re_scan() (or the trait-equivalent) to let
the FDW refresh cursors/buffers; otherwise call state.end_scan() to tear down
existing state and then call state.begin_scan() to reinitialize cleanly; update
the branch that compares next_fingerprint and state.param_fingerprint to use
re_scan()/end_scan()+begin_scan() rather than directly calling begin_scan() on a
live instance.
- Around line 359-364: When Cell::from_polymorphic_datum(p.value, p.isnull,
p.ptype) returns None you must clear the stored predicate value so stale
predicates aren't used; update the branch around Cell::from_polymorphic_datum to
set qual.value to a cleared/null value and set current_value (and
param.eval_value where applicable) to None when the datum is NULL/missing,
mirroring the non-NULL path but with clearing semantics; ensure the same fix is
applied to the other similar block (the 375-394 region) that currently only
writes qual.value on the Some(cell) path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: eb5d267d-8cd3-4915-8dc1-33c5d146ca3e

📥 Commits

Reviewing files that changed from the base of the PR and between f56b281 and 8f64c7b.

📒 Files selected for processing (1)
  • supabase-wrappers/src/scan.rs

Comment on lines +359 to +364
if let Some(cell) =
Cell::from_polymorphic_datum(p.value, p.isnull, p.ptype)
{
qual.value = Value::Cell(cell.clone());
current_value = Some(Value::Cell(cell));
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Clear qual.value on NULL / missing parameter updates.

Line 393 resets param.eval_value to None, but Lines 362 and 382 only overwrite qual.value on the Some(cell) path. After a non-NULL → NULL/missing transition, FDWs that read qual.value still see the previous predicate, so the NULL-rescan case remains stale.

Also applies to: 375-394

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase-wrappers/src/scan.rs` around lines 359 - 364, When
Cell::from_polymorphic_datum(p.value, p.isnull, p.ptype) returns None you must
clear the stored predicate value so stale predicates aren't used; update the
branch around Cell::from_polymorphic_datum to set qual.value to a cleared/null
value and set current_value (and param.eval_value where applicable) to None when
the datum is NULL/missing, mirroring the non-NULL path but with clearing
semantics; ensure the same fix is applied to the other similar block (the
375-394 region) that currently only writes qual.value on the Some(cell) path.

Comment on lines +554 to +557
let result = if next_fingerprint != state.param_fingerprint {
state.param_fingerprint = next_fingerprint;
state.begin_scan()
} else {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't re-enter begin_scan() on a live FDW instance.

Line 556 turns begin_scan() into a rescan path without any teardown. The trait only defines re_scan()/end_scan() for restart/cleanup, so existing FDWs that allocate cursors or materialized buffers in begin_scan() can leak or append stale state when parameters change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase-wrappers/src/scan.rs` around lines 554 - 557, The code currently
calls state.begin_scan() when parameters change (comparing next_fingerprint to
state.param_fingerprint), which can re-enter begin_scan() on a live FDW and leak
or corrupt resources; instead detect the parameter change and invoke the proper
restart/cleanup flow: if the FDW implements re_scan(), call state.re_scan() (or
the trait-equivalent) to let the FDW refresh cursors/buffers; otherwise call
state.end_scan() to tear down existing state and then call state.begin_scan() to
reinitialize cleanly; update the branch that compares next_fingerprint and
state.param_fingerprint to use re_scan()/end_scan()+begin_scan() rather than
directly calling begin_scan() on a live instance.

@burmecia burmecia added the core label Mar 12, 2026
@burmecia burmecia merged commit 96a0eab into main Mar 12, 2026
7 checks passed
@burmecia burmecia deleted the bnjjj/param_fingerprint_rescan branch March 12, 2026 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants