Skip to content

fix(duckdb): support materializing enum types to pyarrow #11214

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

NickCrews
Copy link
Contributor

@NickCrews NickCrews commented May 14, 2025

See the added test, and the comment.

I'm not sure if this approach, of pa_table.cast(ibis_table.schema().to_pyarrow()), is too broad.
Instead of casting the entire pyarrow table to the expected schema, we could look for the specific case of
when the ibis type is string and the pyarrow type is dictionary<values=string, indices=[intlike]>), and just convert those columns. For instance, this approach could be masking some other semantic mistake that we are making. I assume that pyarrow is smart enough to not actually perform computation unless the cast is really required, so I don't think this should be computationally wasteful.

This also makes it much more explicit that to get to pandas, we first convert to pyarrow, then to pandas, further cementing pyarrow as our first-class citizen, before pandas. I think I remember seeing this as a desire in a different issue/comment. It certainly makes it easier to reason about consistencies between to_pyarrow() and to_pandas()

@github-actions github-actions bot added the duckdb The DuckDB backend label May 14, 2025
@NickCrews NickCrews force-pushed the duckdb-enum-to-pyarrow branch from 3ca1f4d to 3cec096 Compare May 14, 2025 19:55
@github-actions github-actions bot added the tests Issues or PRs related to tests label May 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duckdb The DuckDB backend tests Issues or PRs related to tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant