
Fix View::to_polars and add LargeUTF8 Arrow dictionary support #2924

Merged
texodus merged 1 commit into master from polars-fix on Feb 14, 2025

Conversation

@texodus
Member

@texodus texodus commented Feb 14, 2025

This PR fixes the Python View::to_polars function, which previously did not work, and adds a test for it.

While attempting to implement transitive testing (perspective -> polars -> perspective), I realized that Polars stores dictionary columns in the uncommon LargeUTF8 dictionary format, which caused segfaults in our u32-indexed dictionary reading code. As a result, this PR also adds support for Arrow/PyArrow LargeUTF8 dictionary columns.
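
For readers unfamiliar with the distinction: in the Arrow format, a Utf8 array stores 32-bit offsets into its character buffer, while a LargeUTF8 array stores 64-bit offsets. The sketch below is a standard-library-only illustration of what happens when a reader that expects one width is handed the other. It is not Perspective's actual reading code, and the helper names (`encode_strings`, `decode_strings`) are hypothetical. In Python the mismatch only yields garbage values; in native code the same mistake can plausibly read out of bounds, which is one way a segfault like the one described above could arise.

```python
import struct


def encode_strings(values, offset_fmt):
    """Pack strings into Arrow-style (offsets, data) buffers.

    offset_fmt is a struct code: "i" for Utf8 (int32 offsets),
    "q" for LargeUTF8 (int64 offsets). Hypothetical helper.
    """
    encoded = [v.encode("utf-8") for v in values]
    offsets = [0]
    for chunk in encoded:
        offsets.append(offsets[-1] + len(chunk))
    offsets_buf = struct.pack(f"<{len(offsets)}{offset_fmt}", *offsets)
    return offsets_buf, b"".join(encoded)


def decode_strings(offsets_buf, data, offset_fmt):
    """Decode an (offsets, data) pair, trusting the caller's offset width."""
    width = struct.calcsize("<" + offset_fmt)
    count = len(offsets_buf) // width
    offsets = struct.unpack(f"<{count}{offset_fmt}", offsets_buf)
    return [data[a:b].decode("utf-8") for a, b in zip(offsets, offsets[1:])]


# A dictionary-encoded column: small integer indices into a list of
# unique strings, here with LargeUTF8-style (int64) offsets.
dictionary = ["apple", "banana", "cherry"]
indices = [2, 0, 0, 1]
large_offsets, data = encode_strings(dictionary, "q")

# Reading with the matching width recovers the dictionary.
correct = decode_strings(large_offsets, data, "q")
column = [correct[i] for i in indices]

# Reading the same buffer as int32 offsets interleaves zero-valued
# high words with the real offsets and produces wrong entries.
misread = decode_strings(large_offsets, data, "i")
```

Here `column` materializes to `["cherry", "apple", "apple", "banana"]`, while `misread` no longer matches `dictionary` in either length or content, so the indices would point at the wrong strings or past the end of the list.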

Signed-off-by: Andrew Stein <steinlink@gmail.com>
@texodus texodus added the enhancement Feature requests or improvements label Feb 14, 2025
@texodus texodus merged commit 6f4be68 into master Feb 14, 2025
14 checks passed
@texodus texodus deleted the polars-fix branch February 14, 2025 16:54
