Skip to content
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
91 changes: 75 additions & 16 deletions supabase-wrappers/src/scan.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@ use pgrx::{
prelude::*,
};
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::marker::PhantomData;

use pgrx::pg_sys::panic::ErrorReport;
Expand Down Expand Up @@ -51,6 +53,8 @@ struct FdwState<E: Into<ErrorReport>, W: ForeignDataWrapper<E>> {
values: Vec<Datum>,
nulls: Vec<bool>,
row: Row,
// fingerprint of current parameter values to detect rescan changes
param_fingerprint: u64,
_phantom: PhantomData<E>,
}

Expand All @@ -67,6 +71,7 @@ impl<E: Into<ErrorReport>, W: ForeignDataWrapper<E>> FdwState<E, W> {
values: Vec::new(),
nulls: Vec::new(),
row: Row::new(),
param_fingerprint: 0,
_phantom: PhantomData,
}
}
Expand Down Expand Up @@ -341,19 +346,23 @@ unsafe fn assign_paramenter_value<E: Into<ErrorReport>, W: ForeignDataWrapper<E>
// assign parameter value to qual
for qual in &mut state.quals.iter_mut() {
if let Some(param) = &mut qual.param {
let mut current_value: Option<Value> = None;
match param.kind {
ParamKind::PARAM_EXTERN => {
// get parameter list in execution state
let plist_info = (*estate).es_param_list_info;
if plist_info.is_null() {
continue;
}
let params_cnt = (*plist_info).numParams as usize;
let plist = (*plist_info).params.as_slice(params_cnt);
let p: pg_sys::ParamExternData = plist[param.id - 1];
if let Some(cell) = Cell::from_polymorphic_datum(p.value, p.isnull, p.ptype)
{
qual.value = Value::Cell(cell);
if !plist_info.is_null() {
let params_cnt = (*plist_info).numParams as usize;
if param.id > 0 && param.id <= params_cnt {
let plist = (*plist_info).params.as_slice(params_cnt);
let p: pg_sys::ParamExternData = plist[param.id - 1];
if let Some(cell) =
Cell::from_polymorphic_datum(p.value, p.isnull, p.ptype)
{
qual.value = Value::Cell(cell.clone());
current_value = Some(Value::Cell(cell));
}
Comment on lines +357 to +362
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Clear qual.value on NULL / missing parameter updates.

Line 393 resets param.eval_value to None, but Lines 362 and 382 only overwrite qual.value on the Some(cell) path. After a non-NULL → NULL/missing transition, FDWs that read qual.value still see the previous predicate, so the NULL-rescan case remains stale.

Also applies to: 375-394

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase-wrappers/src/scan.rs` around lines 359 - 364, When
Cell::from_polymorphic_datum(p.value, p.isnull, p.ptype) returns None you must
clear the stored predicate value so stale predicates aren't used; update the
branch around Cell::from_polymorphic_datum to set qual.value to a cleared/null
value and set current_value (and param.eval_value where applicable) to None when
the datum is NULL/missing, mirroring the non-NULL path but with clearing
semantics; ensure the same fix is applied to the other similar block (the
375-394 region) that currently only writes qual.value on the Some(cell) path.

}
}
}
ParamKind::PARAM_EXEC => {
Expand All @@ -370,21 +379,63 @@ unsafe fn assign_paramenter_value<E: Into<ErrorReport>, W: ForeignDataWrapper<E>
) && let Some(cell) =
Cell::from_polymorphic_datum(datum, isnull, param.type_oid)
{
let mut eval_value = param
.eval_value
.lock()
.expect("param.eval_value should be locked");
*eval_value = Some(Value::Cell(cell.clone()));
qual.value = Value::Cell(cell);
qual.value = Value::Cell(cell.clone());
current_value = Some(Value::Cell(cell));
}
Comment on lines 357 to 382
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PARAM_EXTERN: when the parameter is SQL NULL (i.e., p.isnull == true), Cell::from_polymorphic_datum returns None, which means eval_value is correctly set to None, but qual.value is NOT updated — it retains its previous value (or the initial dummy Cell::I64(0) set at planning time in qual.rs:238). This stale qual.value will be passed to begin_scan() with an incorrect value if the parameter is NULL.

The same issue applies to PARAM_EXEC when the evaluation yields NULL (line 379–384): qual.value is not reset to reflect the NULL state.

For correctness, when a parameter evaluates to NULL, qual.value should also be updated to reflect the null state (e.g., by clearing or setting to a sentinel value).

Copilot uses AI. Check for mistakes.
}
_ => {}
}

let mut eval_value = param
.eval_value
.lock()
.expect("param.eval_value should be locked");
*eval_value = current_value;
}
}
}
}

#[derive(Hash, PartialEq, Eq)]
struct ParamFingerPrint<'a> {
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The struct is named ParamFingerPrint (with capital P in "Print"), but Rust naming conventions for structs use PascalCase without internal capitalization distinctions for compound words — it should be ParamFingerprint (lowercase 'p' in 'print'). All other structs in this file (FdwState, ExprEval, etc.) follow standard PascalCase without gratuitous internal capitalization.

Copilot uses AI. Check for mistakes.
field: &'a str,
operator: &'a str,
use_or: bool,
kind: u32,
id: usize,
type_oid: Oid,
eval_value: String,
}

fn compute_param_fingerprint<E: Into<ErrorReport>, W: ForeignDataWrapper<E>>(
state: &FdwState<E, W>,
) -> u64 {
let mut hasher = DefaultHasher::new();
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::collections::hash_map::DefaultHasher is explicitly documented as not being stable: its hashing algorithm may change between Rust versions, and its output is also subject to per-process hash randomization (HashDoS protection). While collision probability is low for this use case (same-process comparison only), the implementation is specifically called out in Rust docs as "not guaranteed" for stability.

A more robust choice for a fingerprint would be using std::hash::BuildHasher with a stable algorithm (e.g., via the ahash or fnv crates), or simply comparing the eval_value contents directly (equality check without hashing), since the number of params per qual is expected to be small. Using a direct equality comparison would also eliminate the (very small but nonzero) probability of hash collisions causing a missed rescan.

Copilot uses AI. Check for mistakes.

for qual in &state.quals {
let Some(param) = &qual.param else {
continue;
};
let param_finger_print = ParamFingerPrint {
field: &qual.field,
operator: &qual.operator,
use_or: qual.use_or,
kind: param.kind,
id: param.id,
type_oid: param.type_oid,
eval_value: match param.eval_value.lock() {
Ok(value) => {
format!("{:?}", *value)
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The eval_value field in ParamFingerPrint is computed using format!("{:?}", *value), which relies on the Debug representation of Option<Value> and in turn Value and Cell. For floating-point variants (Cell::F32, Cell::F64), the Debug representation may expose representation quirks (e.g., NaN and -NaN would both hash differently even if logically equivalent). More importantly, if the Debug output of Value or Cell changes in the future, it would silently change fingerprint behavior.

A more robust approach would be to implement Hash directly on Value and Cell, or to compare eval_value contents for equality rather than relying on serialized Debug strings.

Suggested change
format!("{:?}", *value)
// Derive a stable fingerprint from the parameter value itself
let mut value_hasher = DefaultHasher::new();
(*value).hash(&mut value_hasher);
value_hasher.finish().to_string()

Copilot uses AI. Check for mistakes.
}
Err(_) => "param_eval_lock_error".to_string(),
},
};

param_finger_print.hash(&mut hasher);
}
hasher.finish()
}

#[pg_guard]
pub(super) extern "C-unwind" fn begin_foreign_scan<
E: Into<ErrorReport>,
Expand All @@ -402,6 +453,7 @@ pub(super) extern "C-unwind" fn begin_foreign_scan<

// assign parameter values to qual
assign_paramenter_value(node, &mut state);
state.param_fingerprint = compute_param_fingerprint(&state);

// begin scan if it is not EXPLAIN statement
if eflags & pg_sys::EXEC_FLAG_EXPLAIN_ONLY as c_int <= 0 {
Expand Down Expand Up @@ -497,7 +549,14 @@ pub(super) extern "C-unwind" fn re_scan_foreign_scan<
let fdw_state = (*node).fdw_state as *mut FdwState<E, W>;
if !fdw_state.is_null() {
let mut state = PgBox::<FdwState<E, W>>::from_pg(fdw_state);
let result = state.re_scan();
assign_paramenter_value(node, &mut state);
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new caller in re_scan_foreign_scan at this line still uses the typo-containing name assign_paramenter_value (should be assign_parameter_value). Since the PR adds this new call site, it would be appropriate to also rename the function to fix the pre-existing spelling error paramenterparameter.

Copilot uses AI. Check for mistakes.
let next_fingerprint = compute_param_fingerprint(&state);
let result = if next_fingerprint != state.param_fingerprint {
state.param_fingerprint = next_fingerprint;
state.begin_scan()
Comment on lines +541 to +544
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When re_scan_foreign_scan detects a fingerprint change and calls begin_scan() (line 556), the previous scan session's resources have never been cleaned up via end_scan(). FDW implementations such as ClickHouseFdw, BigQueryFdw, IcebergFdw, etc., typically allocate connections, cursors, or buffers in begin_scan and release them in end_scan. Calling begin_scan() again without a preceding end_scan() may cause connection/resource leaks, or even double initialization bugs.

To fix this, end_scan() should be called on the state before invoking begin_scan() when the fingerprint has changed.

Suggested change
state.param_fingerprint = next_fingerprint;
state.begin_scan()
// The parameters have changed; clean up the previous scan session
// before starting a new one to avoid leaking resources.
let end_result = state.end_scan();
if end_result.is_err() {
end_result
} else {
state.param_fingerprint = next_fingerprint;
state.begin_scan()
}

Copilot uses AI. Check for mistakes.
} else {
Comment on lines +540 to +545
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't re-enter begin_scan() on a live FDW instance.

Line 556 turns begin_scan() into a rescan path without any teardown. The trait only defines re_scan()/end_scan() for restart/cleanup, so existing FDWs that allocate cursors or materialized buffers in begin_scan() can leak or append stale state when parameters change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase-wrappers/src/scan.rs` around lines 554 - 557, The code currently
calls state.begin_scan() when parameters change (comparing next_fingerprint to
state.param_fingerprint), which can re-enter begin_scan() on a live FDW and leak
or corrupt resources; instead detect the parameter change and invoke the proper
restart/cleanup flow: if the FDW implements re_scan(), call state.re_scan() (or
the trait-equivalent) to let the FDW refresh cursors/buffers; otherwise call
state.end_scan() to tear down existing state and then call state.begin_scan() to
reinitialize cleanly; update the branch that compares next_fingerprint and
state.param_fingerprint to use re_scan()/end_scan()+begin_scan() rather than
directly calling begin_scan() on a live instance.

state.re_scan()
};
if result.is_err() {
drop_fdw_state(state.as_ptr());
(*node).fdw_state = ptr::null::<FdwState<E, W>>() as _;
Expand Down
Loading