Conversation
… rescan Signed-off-by: Benjamin <5719034+bnjjj@users.noreply.github.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis pull request introduces a parameter fingerprinting mechanism to the foreign data wrapper scan logic. A new Sequence Diagram(s)sequenceDiagram
participant Executor as Query Executor
participant FdwState as FdwState
participant ParamAssign as Parameter Assignment
participant FingerprintCalc as Fingerprint Computation
participant ScanOp as Scan Operation
Executor->>FdwState: begin_foreign_scan()
FdwState->>ParamAssign: assign_parameter_value()
ParamAssign->>ParamAssign: track current_value
ParamAssign->>FdwState: update param.eval_value
FdwState->>FingerprintCalc: compute_param_fingerprint()
FingerprintCalc->>FingerprintCalc: hash(field, operator, kind, id, type_oid, eval_value)
FingerprintCalc->>FdwState: return fingerprint hash
FdwState->>FdwState: store param_fingerprint
FdwState->>ScanOp: initialize scan
Executor->>FdwState: re_scan_foreign_scan()
FdwState->>FingerprintCalc: compute_param_fingerprint()
FingerprintCalc->>FdwState: return current fingerprint
FdwState->>FdwState: compare with stored param_fingerprint
alt Fingerprint Changed
FdwState->>FdwState: reinitialize via begin_scan()
else Fingerprint Unchanged
FdwState->>ScanOp: perform standard rescan()
end
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR fixes three related bugs in the supabase-wrappers scan infrastructure around how runtime parameter values (PARAM_EXTERN from prepared statements and PARAM_EXEC from LATERAL/correlated queries) are tracked and propagated to FDW implementations.
Changes:
PARAM_EXTERNvalues are now written intoparam.eval_value, and bothPARAM_EXTERNandPARAM_EXECnow correctly seteval_valuetoNonewhen the parameter is NULL or unavailable (stale values no longer persist).- A
compute_param_fingerprintfunction andparam_fingerprintfield are introduced to track whether parameter values have changed between rescans. re_scan_foreign_scannow callsbegin_scan()instead ofre_scan()when the fingerprint changes, so FDWs that buffer data inbegin_scanwill re-execute their remote query with updated parameter values.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
supabase-wrappers/src/scan.rs
Outdated
| type_oid: param.type_oid, | ||
| eval_value: match param.eval_value.lock() { | ||
| Ok(value) => { | ||
| format!("{:?}", *value) |
There was a problem hiding this comment.
The eval_value field in ParamFingerPrint is computed using format!("{:?}", *value), which relies on the Debug representation of Option<Value> and in turn Value and Cell. For floating-point variants (Cell::F32, Cell::F64), the Debug representation may expose representation quirks (e.g., NaN and -NaN would both hash differently even if logically equivalent). More importantly, if the Debug output of Value or Cell changes in the future, it would silently change fingerprint behavior.
A more robust approach would be to implement Hash directly on Value and Cell, or to compare eval_value contents for equality rather than relying on serialized Debug strings.
| format!("{:?}", *value) | |
| // Derive a stable fingerprint from the parameter value itself | |
| let mut value_hasher = DefaultHasher::new(); | |
| (*value).hash(&mut value_hasher); | |
| value_hasher.finish().to_string() |
supabase-wrappers/src/scan.rs
Outdated
| } | ||
|
|
||
| #[derive(Hash, PartialEq, Eq)] | ||
| struct ParamFingerPrint<'a> { |
There was a problem hiding this comment.
The struct is named ParamFingerPrint (with capital P in "Print"), but Rust naming conventions for structs use PascalCase without internal capitalization distinctions for compound words — it should be ParamFingerprint (lowercase 'p' in 'print'). All other structs in this file (FdwState, ExprEval, etc.) follow standard PascalCase without gratuitous internal capitalization.
supabase-wrappers/src/scan.rs
Outdated
| if !fdw_state.is_null() { | ||
| let mut state = PgBox::<FdwState<E, W>>::from_pg(fdw_state); | ||
| let result = state.re_scan(); | ||
| assign_paramenter_value(node, &mut state); |
There was a problem hiding this comment.
The new caller in re_scan_foreign_scan at this line still uses the typo-containing name assign_paramenter_value (should be assign_parameter_value). Since the PR adds this new call site, it would be appropriate to also rename the function to fix the pre-existing spelling error paramenter → parameter.
| state.param_fingerprint = next_fingerprint; | ||
| state.begin_scan() |
There was a problem hiding this comment.
When re_scan_foreign_scan detects a fingerprint change and calls begin_scan() (line 556), the previous scan session's resources have never been cleaned up via end_scan(). FDW implementations such as ClickHouseFdw, BigQueryFdw, IcebergFdw, etc., typically allocate connections, cursors, or buffers in begin_scan and release them in end_scan. Calling begin_scan() again without a preceding end_scan() may cause connection/resource leaks, or even double initialization bugs.
To fix this, end_scan() should be called on the state before invoking begin_scan() when the fingerprint has changed.
| state.param_fingerprint = next_fingerprint; | |
| state.begin_scan() | |
| // The parameters have changed; clean up the previous scan session | |
| // before starting a new one to avoid leaking resources. | |
| let end_result = state.end_scan(); | |
| if end_result.is_err() { | |
| end_result | |
| } else { | |
| state.param_fingerprint = next_fingerprint; | |
| state.begin_scan() | |
| } |
supabase-wrappers/src/scan.rs
Outdated
| fn compute_param_fingerprint<E: Into<ErrorReport>, W: ForeignDataWrapper<E>>( | ||
| state: &FdwState<E, W>, | ||
| ) -> u64 { | ||
| let mut hasher = DefaultHasher::new(); |
There was a problem hiding this comment.
std::collections::hash_map::DefaultHasher is explicitly documented as not being stable: its hashing algorithm may change between Rust versions, and its output is also subject to per-process hash randomization (HashDoS protection). While collision probability is low for this use case (same-process comparison only), the implementation is specifically called out in Rust docs as "not guaranteed" for stability.
A more robust choice for a fingerprint would be using std::hash::BuildHasher with a stable algorithm (e.g., via the ahash or fnv crates), or simply comparing the eval_value contents directly (equality check without hashing), since the number of params per qual is expected to be small. Using a direct equality comparison would also eliminate the (very small but nonzero) probability of hash collisions causing a missed rescan.
| @@ -370,21 +379,63 @@ unsafe fn assign_paramenter_value<E: Into<ErrorReport>, W: ForeignDataWrapper<E> | |||
| ) && let Some(cell) = | |||
| Cell::from_polymorphic_datum(datum, isnull, param.type_oid) | |||
| { | |||
| let mut eval_value = param | |||
| .eval_value | |||
| .lock() | |||
| .expect("param.eval_value should be locked"); | |||
| *eval_value = Some(Value::Cell(cell.clone())); | |||
| qual.value = Value::Cell(cell); | |||
| qual.value = Value::Cell(cell.clone()); | |||
| current_value = Some(Value::Cell(cell)); | |||
| } | |||
There was a problem hiding this comment.
For PARAM_EXTERN: when the parameter is SQL NULL (i.e., p.isnull == true), Cell::from_polymorphic_datum returns None, which means eval_value is correctly set to None, but qual.value is NOT updated — it retains its previous value (or the initial dummy Cell::I64(0) set at planning time in qual.rs:238). This stale qual.value will be passed to begin_scan() with an incorrect value if the parameter is NULL.
The same issue applies to PARAM_EXEC when the evaluation yields NULL (line 379–384): qual.value is not reset to reflect the NULL state.
For correctness, when a parameter evaluates to NULL, qual.value should also be updated to reflect the null state (e.g., by clearing or setting to a sentinel value).
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@supabase-wrappers/src/scan.rs`:
- Around line 554-557: The code currently calls state.begin_scan() when
parameters change (comparing next_fingerprint to state.param_fingerprint), which
can re-enter begin_scan() on a live FDW and leak or corrupt resources; instead
detect the parameter change and invoke the proper restart/cleanup flow: if the
FDW implements re_scan(), call state.re_scan() (or the trait-equivalent) to let
the FDW refresh cursors/buffers; otherwise call state.end_scan() to tear down
existing state and then call state.begin_scan() to reinitialize cleanly; update
the branch that compares next_fingerprint and state.param_fingerprint to use
re_scan()/end_scan()+begin_scan() rather than directly calling begin_scan() on a
live instance.
- Around line 359-364: When Cell::from_polymorphic_datum(p.value, p.isnull,
p.ptype) returns None you must clear the stored predicate value so stale
predicates aren't used; update the branch around Cell::from_polymorphic_datum to
set qual.value to a cleared/null value and set current_value (and
param.eval_value where applicable) to None when the datum is NULL/missing,
mirroring the non-NULL path but with clearing semantics; ensure the same fix is
applied to the other similar block (the 375-394 region) that currently only
writes qual.value on the Some(cell) path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: eb5d267d-8cd3-4915-8dc1-33c5d146ca3e
📒 Files selected for processing (1)
supabase-wrappers/src/scan.rs
| if let Some(cell) = | ||
| Cell::from_polymorphic_datum(p.value, p.isnull, p.ptype) | ||
| { | ||
| qual.value = Value::Cell(cell.clone()); | ||
| current_value = Some(Value::Cell(cell)); | ||
| } |
There was a problem hiding this comment.
Clear qual.value on NULL / missing parameter updates.
Line 393 resets param.eval_value to None, but Lines 362 and 382 only overwrite qual.value on the Some(cell) path. After a non-NULL → NULL/missing transition, FDWs that read qual.value still see the previous predicate, so the NULL-rescan case remains stale.
Also applies to: 375-394
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@supabase-wrappers/src/scan.rs` around lines 359 - 364, When
Cell::from_polymorphic_datum(p.value, p.isnull, p.ptype) returns None you must
clear the stored predicate value so stale predicates aren't used; update the
branch around Cell::from_polymorphic_datum to set qual.value to a cleared/null
value and set current_value (and param.eval_value where applicable) to None when
the datum is NULL/missing, mirroring the non-NULL path but with clearing
semantics; ensure the same fix is applied to the other similar block (the
375-394 region) that currently only writes qual.value on the Some(cell) path.
| let result = if next_fingerprint != state.param_fingerprint { | ||
| state.param_fingerprint = next_fingerprint; | ||
| state.begin_scan() | ||
| } else { |
There was a problem hiding this comment.
Don't re-enter begin_scan() on a live FDW instance.
Line 556 turns begin_scan() into a rescan path without any teardown. The trait only defines re_scan()/end_scan() for restart/cleanup, so existing FDWs that allocate cursors or materialized buffers in begin_scan() can leak or append stale state when parameters change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@supabase-wrappers/src/scan.rs` around lines 554 - 557, The code currently
calls state.begin_scan() when parameters change (comparing next_fingerprint to
state.param_fingerprint), which can re-enter begin_scan() on a live FDW and leak
or corrupt resources; instead detect the parameter change and invoke the proper
restart/cleanup flow: if the FDW implements re_scan(), call state.re_scan() (or
the trait-equivalent) to let the FDW refresh cursors/buffers; otherwise call
state.end_scan() to tear down existing state and then call state.begin_scan() to
reinitialize cleanly; update the branch that compares next_fingerprint and
state.param_fingerprint to use re_scan()/end_scan()+begin_scan() rather than
directly calling begin_scan() on a live instance.
…ing during rescan
What kind of change does this PR introduce?
Bug/Feature
What is the current behavior?
supabase-wrapperscurrently updatesqual.valuefrom runtime params, but parameter state is not consistently tracked for FDW consumers and rescans:PARAM_EXTERNvalues were not written into param.eval_value.PARAM_EXECwroteparam.eval_valueonly on successful non-null eval and did not always clear stale previous values.ReScanForeignScanalways calledre_scan()and never re-ranbegin_scan()when parameter values changed.This is problematic for FDWs that:
For parameterized queries, parameter values can change between rescans (especially with correlated/LATERAL execution). If wrappers only rewinds local FDW state, the FDW may serve stale rows from the previous parameter value.
Example queries affected
Expected:
Before this patch, a materializing FDW could reuse buffered rows from first parameter and return incorrect repeated results.
FDWs reading
param.eval_valuehad no consistent runtime extern value to push down from wrappers.Without clearing runtime parameter state, FDW-side logic could accidentally observe stale non-null values.
What is the new behavior?
My main goal was to not introduce big breaking changes.
Additional context
compute_param_fingerprintintentionally hashesparam.eval_value(runtime truth), notqual.value, because from my understandingqual.valuecan contain placeholder/stale values for parameterized quals.I don't know if it could break existing behavior, you'll probably better than me to know this so if you think this fix is unnecessary or wrong feel free to tell me.