fix(duckdb): cast query_arrow results to projected_schema by ewgenius · Pull Request #652 · datafusion-contrib/datafusion-table-providers

ewgenius · 2026-05-21T02:49:51Z

DuckDB's query_arrow ignored the projected_schema parameter, returning batches with DuckDB's native types (e.g. Timestamp(µs)) even when the caller expected different types (e.g. Timestamp(ns)).

This caused schema mismatches for downstream operators (SortExec, RowConverter) that get pushed below SchemaCastScanExec in partitioned execution plans.

Changes

- Cast result batches to projected_schema in the DuckDB query_arrow output stream when types differ
- Add shared cast_batch_to_schema utility in util/arrow.rs for reuse by other Arrow-native connectors (ADBC, ODBC)
- Revert "fix(duckdb): use actual DuckDB schema for read provider" (#650), which is no longer needed
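
For readers unfamiliar with the idea, here is a minimal, dependency-free sketch of casting a batch to a projected schema. It is not the code from this PR: the real utility works on Arrow record batches, whereas `Column`, `DataType`, `TimeUnit`, `cast_column`, and `cast_batch_to_schema_sketch` below are hypothetical stand-ins chosen only to illustrate the µs→ns case described above.

```rust
// Hypothetical model types; the real utility operates on Arrow RecordBatches.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeUnit {
    Micros,
    Nanos,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataType {
    Timestamp(TimeUnit),
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Column {
    Timestamp(TimeUnit, Vec<Option<i64>>),
    Int64(Vec<Option<i64>>),
}

impl Column {
    fn data_type(&self) -> DataType {
        match self {
            Column::Timestamp(unit, _) => DataType::Timestamp(*unit),
            Column::Int64(_) => DataType::Int64,
        }
    }
}

// Cast one column to the target type. Only the µs -> ns case is modeled.
fn cast_column(col: &Column, target: DataType) -> Result<Column, String> {
    match (col, target) {
        // Types already agree: return the column unchanged.
        (c, t) if c.data_type() == t => Ok(c.clone()),
        // Timestamp(µs) -> Timestamp(ns): multiply by 1000, keep nulls.
        (Column::Timestamp(TimeUnit::Micros, values), DataType::Timestamp(TimeUnit::Nanos)) => {
            let out = values
                .iter()
                .map(|v| match v {
                    Some(us) => us
                        .checked_mul(1000)
                        .map(Some)
                        .ok_or_else(|| format!("timestamp {us} overflows i64 nanoseconds")),
                    None => Ok(None),
                })
                .collect::<Result<Vec<_>, String>>()?;
            Ok(Column::Timestamp(TimeUnit::Nanos, out))
        }
        (c, t) => Err(format!("unsupported cast {:?} -> {:?}", c.data_type(), t)),
    }
}

// Cast every column of a batch to the projected schema, skipping the work
// entirely when the batch already has the expected types.
fn cast_batch_to_schema_sketch(
    batch: &[Column],
    projected: &[DataType],
) -> Result<Vec<Column>, String> {
    if batch.len() != projected.len() {
        return Err(format!(
            "column count mismatch: batch has {}, schema has {}",
            batch.len(),
            projected.len()
        ));
    }
    if batch.iter().zip(projected).all(|(c, t)| c.data_type() == *t) {
        return Ok(batch.to_vec());
    }
    batch
        .iter()
        .zip(projected)
        .map(|(c, t)| cast_column(c, *t))
        .collect()
}
```

The fast path mirrors the "when types differ" condition in the first bullet: batches that already match the projected schema pass through untouched.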

…ESTAMPTZ columns (#10947)

* test: Add failing tests for monotonic cast ordering propagation in SchemaCastScanExec
  Add tests that verify SchemaCastScanExec should propagate ordering through monotonic casts (temporal→temporal, numeric widening) and should return maintains_input_order=false for non-monotonic casts. These tests currently fail, demonstrating the RowConverter schema mismatch bug when using ORDER BY on partitioned DuckDB-accelerated tables with TIMESTAMPTZ columns.
* fix: Propagate ordering through monotonic casts in SchemaCastScanExec
  Add is_order_preserving_cast() helper that identifies monotonic type casts (temporal→temporal, numeric→numeric) following DataFusion's CastExpr convention. Update equivalence_properties to propagate input ordering when the sort-key column cast is monotonic, and update maintains_input_order() to return false only when a non-monotonic cast exists. This fixes the 'RowConverter column schema mismatch' and 'does not satisfy order requirements' errors when using ORDER BY on partitioned DuckDB-accelerated tables with TIMESTAMPTZ columns (Timestamp µs→ns cast).
* fix formatting
* update datafusion-table-providers, to include datafusion schema fix
* fix: Tighten is_order_preserving_cast to whitelist safe numeric widenings
  Address review comments:
  - Restrict numeric casts to a known-safe monotonic whitelist instead of allowing all numeric→numeric (signed↔unsigned can reorder).
  - Trim comments for clarity.
  - Update stale comment above ordering propagation logic.
  - Add comprehensive unit tests for is_order_preserving_cast and is_numeric_widening covering all positive and negative cases.
  - Add test for sort-key-unchanged-but-other-column-cast edge case.
* test: Update retention test to expect Timestamp(µs) from DuckDB accelerator
  DuckDB stores timestamps in microsecond precision. With the table-providers fix, DuckSqlExec now correctly reports its actual µs schema instead of claiming ns. Update the test assertions accordingly.
* Move is_order_preserving_cast and is_numeric_widening to arrow_tools::schema
* Fix clippy::unnested_or_patterns in is_numeric_widening
* remove unused import
* Skip DuckDB nullability assertion in test_schema_preservation
  DuckDB does not preserve NOT NULL field metadata when returning Arrow results (all scanned columns are reported as nullable). See: duckdb/duckdb#4629
* Fix test
* fix lint
* Advertise engine schema from partitions in PartitionTableProvider
  Add test to verify schema reflects engine type downgrades (e.g. Timestamp ns→µs)
* linting
* Revert partitioned table schema override (handled by DTP schema cast at read time)
* Update datafusion-table-providers to 846d4de245e919bf3c3c1729c85f50a3564d7949
  Include datafusion-contrib/datafusion-table-providers#652
* cleanup
* cleanup

---------

Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com>
Co-authored-by: Jeadie <jeadie@users.noreply.github.com>
Co-authored-by: jeadie <jack@spice.ai>
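
The commit message above describes is_order_preserving_cast and is_numeric_widening only in prose. The following is a rough, dependency-free sketch of what such a whitelist check could look like. The `Ty` enum, the `_sketch` function names, and the exact set of whitelisted pairs are assumptions made for illustration; the actual helpers operate on Arrow data types and may accept a different set of casts.

```rust
// Hypothetical, simplified stand-in for Arrow's DataType.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    Int8,
    Int16,
    Int32,
    Int64,
    UInt8,
    UInt16,
    UInt32,
    UInt64,
    Float32,
    Float64,
    TimestampMicros,
    TimestampNanos,
    Date32,
    Utf8,
}

fn is_temporal(t: Ty) -> bool {
    matches!(t, Ty::TimestampMicros | Ty::TimestampNanos | Ty::Date32)
}

// Whitelist of numeric casts assumed to be monotonic: the target type can
// represent every source value, so relative order cannot change.
fn is_numeric_widening_sketch(from: Ty, to: Ty) -> bool {
    use Ty::*;
    matches!(
        (from, to),
        (Int8, Int16 | Int32 | Int64)
            | (Int16, Int32 | Int64)
            | (Int32, Int64)
            | (UInt8, UInt16 | UInt32 | UInt64 | Int16 | Int32 | Int64)
            | (UInt16, UInt32 | UInt64 | Int32 | Int64)
            | (UInt32, UInt64 | Int64)
            | (Float32, Float64)
    )
}

// A cast preserves ordering if it is a no-op, temporal -> temporal, or a
// whitelisted numeric widening. Everything else is treated as unsafe, e.g.
// Int32 -> UInt32, where a wrapping conversion of -1 (`-1i32 as u32`)
// yields 4294967295 and would sort after 1.
fn is_order_preserving_cast_sketch(from: Ty, to: Ty) -> bool {
    from == to
        || (is_temporal(from) && is_temporal(to))
        || is_numeric_widening_sketch(from, to)
}
```

Under this sketch, a plan node wrapping a Timestamp µs→ns cast on the sort key could keep advertising its input ordering, while a signed→unsigned cast would force it to report that ordering is not maintained.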

ewgenius changed the base branch from main to spiceai-52 May 21, 2026 02:50

ewgenius self-assigned this May 21, 2026

ewgenius added the bug Something isn't working label May 21, 2026

ewgenius changed the title ~~Evgenii/0521/duckdb cast to projected schema~~ fix(duckdb): cast query_arrow results to projected_schema May 21, 2026

ewgenius marked this pull request as ready for review May 21, 2026 02:51

ewgenius requested a review from phillipleblanc May 21, 2026 02:53

Revert "fix(duckdb): use actual DuckDB schema for read provider (#650)"

21d6f81

This reverts commit 040aa83.

phillipleblanc approved these changes May 21, 2026

ewgenius enabled auto-merge (squash) May 21, 2026 03:48

ewgenius merged commit 846d4de into spiceai-52 May 21, 2026
12 checks passed

ewgenius deleted the evgenii/0521/duckdb-cast-to-projected-schema branch May 21, 2026 04:03

ewgenius added a commit to spiceai/spiceai that referenced this pull request May 21, 2026

Update datafusion-table-providers to

2459a00

846d4de245e919bf3c3c1729c85f50a3564d7949 Include datafusion-contrib/datafusion-table-providers#652

ewgenius added a commit to spiceai/spiceai that referenced this pull request May 21, 2026

Update datafusion-table-providers to

d5038f5

846d4de245e919bf3c3c1729c85f50a3564d7949 Include datafusion-contrib/datafusion-table-providers#652

ewgenius mentioned this pull request May 21, 2026

fix: ORDER BY fails on partitioned DuckDB-accelerated tables with TIMESTAMPTZ columns spiceai/spiceai#10947

Merged

ewgenius merged 2 commits into spiceai-52 from evgenii/0521/duckdb-cast-to-projected-schema


2 participants
