
feat(duckdb): support version v1.5.3 #12009

Open
JonAnCla wants to merge 4 commits into ibis-project:main from JonAnCla:duckdb-1.5.3-compatibility


@JonAnCla

Contributor

This PR fixes issues with using duckdb 1.5.3 with ibis; see the related issue #12008. Note that one issue in duckdb geospatial (duckdb/duckdb-spatial#818) needs fixing before all tests will pass with 1.5.3.

Full disclosure: to make this PR I used copilot to run tests and then make changes. I guided it to refine its approach to fixing in some places and also had it check compatibility on duckdb 1.0 and 0.10.3 (minimum version ibis supports).

The changes below are copilot's summary with links to relevant files. References to 1.4.x are for duckdb 1.4.x and earlier.

I've left the duckdb==1.4.4 lock so that CI can show that these changes pass on current locked duckdb version. Once that is proven we can bump to 1.5.3 to run CI again.

Please let me know if any reservations about using copilot for PRs like this.

Thanks!

Description of changes

DuckDB 1.5.3 compatibility

Fixes the ibis DuckDB backend to work correctly with 1.5.3
without version-gated xfail markers in tests.


ibis/backends/duckdb/__init__.py

to_pyarrow_batches — replaced fetch_record_batch() with a cross-version branch:

  • DuckDB 1.5.x: fetch_record_batch() was removed (ABI breakage); now uses rel.to_arrow_reader(batch_size=chunk_size)
  • DuckDB 1.4.x: uses rel.fetch_arrow_reader(chunk_size) (positional arg only)
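
The branch in the two bullets above can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `open_arrow_reader` is a hypothetical helper name, and it only assumes that `rel` exposes one of the two methods named in the bullets.

```python
def open_arrow_reader(rel, chunk_size=1_000_000):
    """Return an Arrow batch reader from a DuckDB-relation-like object.

    Hypothetical helper for illustration only; the method names and the
    argument styles are the ones listed in the PR description.
    """
    if hasattr(rel, "to_arrow_reader"):
        # Newer relation API (DuckDB 1.5.x per the PR): keyword argument.
        return rel.to_arrow_reader(batch_size=chunk_size)
    # Older relation API (DuckDB 1.4.x per the PR): positional argument only.
    return rel.fetch_arrow_reader(chunk_size)
```

Probing with `hasattr` instead of comparing version strings keeps the branch tied to the capability that actually matters, which is presumably why the description uses the same check for `to_pyarrow` and `execute`.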

to_pyarrow and execute: both branch on hasattr(rel, "to_arrow_reader").

_to_duckdb_relation — added a DuckDB 1.4.x geometry fixup. In 1.4.x,
to_arrow_table() returns DuckDB's internal 20-byte binary format for GEOMETRY
columns, not WKB — Shapely cannot parse this. When geometry columns are present and
not hasattr(duckdb.DuckDBPyRelation, "to_arrow_reader"), wraps each geometry column
with ST_ASWKB(col) AS col. The check is at class level (no throwaway relation created)
and the wrapping query is built using sqlglot expressions, not string assembly.


ibis/backends/sql/compilers/duckdb.py

to_sqlglot: reverted to a clean passthrough. A previous version injected
ST_ASWKB here, which mutated the user-visible SQL returned by con.compile(),
affected to_geo/to_parquet/to_csv output as a consequence, and made SQL snapshots
version-dependent. The wrapping now lives exclusively in _to_duckdb_relation.

visit_ArrayStringJoin — new override: in DuckDB 1.5.x, array_to_string([], ',')
returns '' instead of NULL. Fix: IF(len(arg) > 0, array_to_string(arg, sep), NULL).
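
To make the intended semantics of the guarded expression concrete, here is a pure-Python model of it. This is not ibis or DuckDB code: `array_string_join` is a hypothetical stand-in, `None` plays the role of SQL NULL, and the sketch assumes the array elements are non-null strings.

```python
def array_string_join(arr, sep):
    """Model of IF(len(arg) > 0, array_to_string(arg, sep), NULL)."""
    if arr is None:
        # A NULL array makes the condition NULL, so the NULL branch is taken.
        return None
    if len(arr) > 0:
        return sep.join(arr)
    # Empty array: return NULL on every DuckDB version instead of ''.
    return None
```

With the guard in place, the empty-array result no longer depends on which DuckDB version evaluates `array_to_string`.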


ibis/backends/sql/datatypes.py

_from_sqlglot_GEOMETRY and _from_sqlglot_GEOGRAPHY — DuckDB 1.5.x reports
SRID-qualified geometry types as GEOMETRY('EPSG:2263') (string positional arg) rather
than an integer SRID. The previous _geotypes[arg.this.this] lookup raised KeyError.
Now falls back gracefully and parses EPSG:xxxx strings as SRIDs.
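
A rough sketch of what parsing both SRID spellings could look like is shown below. `parse_srid` is a hypothetical helper written for this note, not the function in `datatypes.py`; it assumes the argument arrives either as an integer or as a string such as `EPSG:2263`, and it returns `None` for anything it does not recognize instead of raising.

```python
import re

# Accepts "2263" as well as "EPSG:2263" (case-insensitive prefix).
_SRID_PATTERN = re.compile(r"^(?:EPSG:)?(\d+)$", re.IGNORECASE)


def parse_srid(arg):
    """Best-effort SRID extraction; returns None when the value is unrecognized."""
    if isinstance(arg, bool):
        return None  # bool is a subclass of int, but is never a valid SRID
    if isinstance(arg, int):
        return arg
    if isinstance(arg, str):
        # Tolerate surrounding whitespace and SQL-style quotes.
        text = arg.strip().strip("'\"")
        match = _SRID_PATTERN.match(text)
        if match:
            return int(match.group(1))
    return None
```

Returning `None` for unknown input mirrors the "falls back gracefully" behavior described above, in contrast to the dictionary lookup that raised KeyError.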


ibis/backends/tests/test_numeric.py

test_numeric_literal / duckdb-decimal-big — the error for DECIMAL(76, 38)
changed from DuckDBParserException (1.4.x) to DuckDBBinderException (1.5.x).
Updated to raises=(DuckDBParserException, DuckDBBinderException).


ibis/backends/duckdb/tests/test_client.py

test_pyarrow_batches_chunk_size: chunk_size=-1 now raises TypeError eagerly
at call time rather than on the first next(). Test updated accordingly.
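
The difference between failing on the first next() and failing at call time can be illustrated with plain Python generators. This is a self-contained analogy, not the ibis implementation: both functions and their explicit `chunk_size` check are hypothetical, and in the real backend the error comes from DuckDB rather than from a check like this.

```python
def lazy_batches(rows, chunk_size):
    # Generator function: none of this body runs until the first next().
    if not isinstance(chunk_size, int) or chunk_size <= 0:
        raise TypeError(f"invalid chunk_size: {chunk_size!r}")
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def eager_batches(rows, chunk_size):
    # Regular function: the check runs at call time, before any iteration.
    if not isinstance(chunk_size, int) or chunk_size <= 0:
        raise TypeError(f"invalid chunk_size: {chunk_size!r}")

    def _generate():
        for start in range(0, len(rows), chunk_size):
            yield rows[start:start + chunk_size]

    return _generate()
```

A test written for the lazy shape has to wrap `next(...)` in its raises block, while a test for the eager shape wraps the call itself, which is the kind of adjustment described above.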


ibis/backends/duckdb/tests/test_geospatial.py

test_geospatial_buffer — replaced @pytest.mark.xfail_version (fired for any
DuckDB version when shapely>=2.1.0) with a conditional @pytest.mark.xfail scoped to
DuckDB <1.5 + shapely>=2.1.0 only.
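
The scoping condition can be sketched as a small predicate. Both function names here are hypothetical, and the version parser is a deliberately simple stand-in that reads only the leading numeric components of each version string; the actual test may compute the condition differently.

```python
def version_tuple(version):
    """Parse leading numeric components, e.g. '1.4.4' -> (1, 4, 4)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component (e.g. 'dev0')
        parts.append(int(digits))
    return tuple(parts)


def buffer_test_should_xfail(duckdb_version, shapely_version):
    """True only for DuckDB < 1.5 combined with shapely >= 2.1."""
    return (
        version_tuple(duckdb_version) < (1, 5)
        and version_tuple(shapely_version) >= (2, 1)
    )
```

A boolean like this can be passed as the condition argument of `pytest.mark.xfail`, together with a `reason`, so the marker only fires for the affected version combination.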

Selafin — added raises=duckdb.Error to no_roundtrip (failure mode changed in
1.5.x).

MapML and GeoRSS — crash in DuckDB 1.5.3 with
basic_string: construction from null is not valid (upstream:
duckdb/duckdb-spatial#818). Left as plain param(...) entries
with # NOTE: comments and the upstream issue link rather than xfail, since they pass
on 1.4.4.

PMTiles: no_roundtrip(reason="row counts differ", raises=AssertionError).


ibis/backends/duckdb/tests/snapshots/test_geospatial/test_literal_geospatial_explicit/expr0/out.sql and expr1/out.sql

Both snapshots simplified from a SELECT * REPLACE (ST_ASWKB(...)) FROM (...) wrapper
back to plain SELECT ST_GEOMFROMTEXT('POINT (1 0)') AS "p", matching the reverted
to_sqlglot.

@github-actions (bot) added the tests, duckdb, and sql labels on May 29, 2026
@JonAnCla

Contributor Author

tests on current duckdb (1.4.4) passed in CI https://github.com/ibis-project/ibis/actions/runs/26638631127/job/78505291131?pr=12009

@github-actions (bot) added the dependencies label on May 29, 2026
@JonAnCla

Contributor Author

I've pushed lockfile change to use duckdb==1.5.3

Two unit tests fail: https://github.com/ibis-project/ibis/actions/runs/26641555600/job/78515786365?pr=12009

These are due to duckdb/duckdb-spatial#818

@deepyaman changed the title from "feat: support duckdb 1.5.3" to "feat(duckdb): support version v1.5.3" on Jun 2, 2026
@deepyaman

Collaborator

> Please let me know if any reservations about using copilot for PRs like this.

As per https://ibis-project.org/contribute/06_automated_code_and_ai.html, I think it would be better to rewrite the PR description in your own words (or just move the Copilot stuff to a comment, since it's almost excessive detail for a normal PR description). But, since you've clearly looked into the changes yourself, I think it's fair to say that the requisite human effort is there.

> Two unit tests fail: https://github.com/ibis-project/ibis/actions/runs/26641555600/job/78515786365?pr=12009

Can you xfail them with a comment? Ideally, they will be fixed upstream before the next Ibis release.

@jc-5s

jc-5s commented Jun 4, 2026

Contributor

Thanks @deepyaman for taking the time to review. I appreciate your comments on the AI point. I'll clean up the PR description.

Re: the failing tests - I'm hesitant to mark them as xfail, because if the duckdb changes aren't addressed before the next release, that would break functionality for some users. So I think we should either wait for the duckdb changes and then merge, or merge now and leave the tests as is, so that another release is not made until either the issue is fixed or we accept that this will be a (very small) breaking change?

@deepyaman

Collaborator

> So I think we should either wait for the duckdb changes and merge

This sounds fine! It leaves time to revisit if and when we start hearing about the next Ibis release, in case we want to find a workaround at that point.


Labels

dependencies (Issues or PRs related to dependencies), duckdb (The DuckDB backend), sql (Backends that generate SQL), tests (Issues or PRs related to tests)
