Describe the enhancement requested
Python 3.10.16 (main, Dec 3 2024, 17:27:57) [Clang 16.0.0 (clang-1600.0.26.4)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.31.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import pyarrow as pa
   ...: import uuid
   ...:
   ...: arr_table = pa.Table.from_pydict(
   ...:     {
   ...:         "uuid": [
   ...:             uuid.UUID("00000000-0000-0000-0000-000000000000").bytes,
   ...:             uuid.UUID("11111111-1111-1111-1111-111111111111").bytes,
   ...:         ],
   ...:     },
   ...:     schema=pa.schema(
   ...:         [
   ...:             pa.field("uuid", pa.uuid(), nullable=False),
   ...:         ]
   ...:     ),
   ...: )
   ...:
   ...: arr_table.group_by('uuid').aggregate([])
---------------------------------------------------------------------------
ArrowNotImplementedError Traceback (most recent call last)
Cell In[1], line 18
2 import uuid
4 arr_table = pa.Table.from_pydict(
5 {
6 "uuid": [
(...)
15 ),
16 )
---> 18 arr_table.group_by('uuid').aggregate([])
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/table.pxi:6560, in pyarrow.lib.TableGroupBy.aggregate()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/acero.py:410, in _group_by(table, aggregates, keys, use_threads)
404 def _group_by(table, aggregates, keys, use_threads=True):
406 decl = Declaration.from_sequence([
407 Declaration("table_source", TableSourceNodeOptions(table)),
408 Declaration("aggregate", AggregateNodeOptions(aggregates, keys=keys))
409 ])
--> 410 return decl.to_table(use_threads=use_threads)
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/_acero.pyx:590, in pyarrow._acero.Declaration.to_table()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:155, in pyarrow.lib.pyarrow_internal_check_status()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:92, in pyarrow.lib.check_status()
ArrowNotImplementedError: Keys of type extension<arrow.uuid>
Looking at the stack trace, I think we need to change something here. The UUID is just a fixed-width column under the hood, so I think we can reuse that logic.
Thoughts from the Arrow maintainers?
Component(s)
Python