Context
When profiling or running EXPLAIN on queries that hit remote databases via Quack, DuckDB's query planner currently abstracts the entire remote operation into a single, shallow QUACK_QUERY node.
Because of this abstraction, we lose visibility into the deeper operators that run locally on the Quack side, their actual and estimated cardinalities, and the storage statistics gathered there. We only see high-level projections and a single-row approximation.
Current EXPLAIN behavior example:
┌─────────────────────────────────┐
│         STREAMING_LIMIT         │
└─────────────────────────────────┘
                 │
┌─────────────────────────────────┐
│           QUACK_QUERY           │
├─────────────────────────────────┤
│             Server:             │
│      quack:127.0.0.1:9494       │
│                                 │
│          Projections:           │
│               id                │
│           created_at            │
└─────────────────────────────────┘
The Problem
DuckDB itself supports incredibly rich execution plan and profiling outputs, including:
- EXPLAIN: Shows estimated cardinalities and structural query plans.
- EXPLAIN ANALYZE: Runs the query and returns actual cardinalities, timing per operator, and breakdown metrics (like rows scanned).
- Profiling Modes: Native support for JSON/HTML profiling outputs.
Because the Quack path is limited to a shallow QUACK_QUERY node, it is difficult to optimize queries, debug performance bottlenecks, or understand how data is processed, filtered, or aggregated on the remote instance before it streams back.
Proposed Solution / Feature Request
We would love to see the Quack extension pass down or reconstruct a more detailed execution plan inside DuckDB. Ideally, the QUACK_QUERY node could either:
- Nest the remote plan: Expand the QUACK_QUERY node to reflect the underlying operators (scans, filters, projections) being executed remotely.
- Expose rich metrics on ANALYZE: Allow EXPLAIN ANALYZE to pull back and append metrics from the remote engine (e.g., remote execution time, exact rows scanned, memory usage).
This would bring the Quack remote protocol debugging experience much closer to native DuckDB profiling.
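
To make the two options above more concrete, here is a rough Python sketch of what nesting could look like on a tree-shaped plan. Everything in it is an assumption made for illustration: the dictionary layout (`name`, `rows`, `time_ms`, `children`), the helper names `nest_remote_plan` and `render`, the remote operators, and the metric values are all invented. None of it reflects the actual Quack protocol or DuckDB's internal plan format.

```python
from copy import deepcopy

# Hypothetical local plan, mirroring the EXPLAIN output shown earlier.
# The dict layout is an assumption for this sketch only.
local_plan = {
    "name": "STREAMING_LIMIT",
    "children": [
        {
            "name": "QUACK_QUERY",
            "extra_info": {
                "Server": "quack:127.0.0.1:9494",
                "Projections": ["id", "created_at"],
            },
            "children": [],
        }
    ],
}

# Hypothetical plan reported back by the remote instance.
# Operator names and metric values are invented sample data.
remote_plan = {
    "name": "PROJECTION",
    "rows": 10,
    "time_ms": 0.1,
    "children": [
        {
            "name": "FILTER",
            "rows": 10,
            "time_ms": 0.4,
            "children": [
                {"name": "SEQ_SCAN", "rows": 1000, "time_ms": 2.5, "children": []}
            ],
        }
    ],
}


def nest_remote_plan(plan, remote, node_name="QUACK_QUERY"):
    """Return a copy of `plan` with `remote` attached under each `node_name` node."""
    node = dict(plan)
    # Recurse into the existing children first, so the grafted remote
    # subtree is not itself searched for further matches.
    children = [
        nest_remote_plan(child, remote, node_name)
        for child in plan.get("children", [])
    ]
    if plan["name"] == node_name:
        children.append(deepcopy(remote))
    node["children"] = children
    return node


def render(plan, depth=0):
    """Render the plan as indented lines, appending metrics where a node has them."""
    metrics = []
    if "rows" in plan:
        metrics.append(f"rows={plan['rows']}")
    if "time_ms" in plan:
        metrics.append(f"time_ms={plan['time_ms']}")
    suffix = f" ({', '.join(metrics)})" if metrics else ""
    lines = ["  " * depth + plan["name"] + suffix]
    for child in plan.get("children", []):
        lines.extend(render(child, depth + 1))
    return lines


merged = nest_remote_plan(local_plan, remote_plan)
print("\n".join(render(merged)))
```

In this sketch the remote operators appear as ordinary children of QUACK_QUERY, so the existing tree renderer would need no special casing, and remote metrics are simply extra fields on those nodes. Whether the remote side can report its plan in a form that allows this is an open question for the extension.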