[BUG] ValueError in cudf-polars experimental group_by.agg with no aggregations #18276

Closed
@TomAugspurger

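Describe the bug

In the cudf-polars experimental tests, group_by(...).agg() with no aggregations raises a ValueError (surfaced as polars.exceptions.ComputeError) instead of returning the distinct group keys.
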
Steps/Code to reproduce bug

In python/cudf_polars/tests/experimental/test_groupby.py:

def test_groupby_agg_empty(df: pl.LazyFrame, engine: pl.GPUEngine) -> None:
    q = df.group_by("y").agg()
    assert_gpu_result_equal(q, engine=engine, check_row_order=False)

That fails with

____________________________ test_groupby_agg_empty ____________________________
Traceback (most recent call last):
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 341, in from_call
    result: Optional[TResult] = func()
                                ^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 262, in <lambda>
    lambda: ihook(item=item, **kwds), when=when, reraise=reraise
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 182, in _multicall
    return outcome.get_result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_result.py", line 100, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 177, in pytest_runtest_call
    raise e
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 169, in pytest_runtest_call
    item.runtest()
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/python.py", line 1792, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/python.py", line 194, in pytest_pyfunc_call
    result = testfunction(**testargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/cudf/python/cudf_polars/tests/experimental/test_groupby.py", line 133, in test_groupby_agg_empty
    assert_gpu_result_equal(q, engine=engine, check_row_order=False)
  File "/home/coder/cudf/python/cudf_polars/cudf_polars/testing/asserts.py", line 99, in assert_gpu_result_equal
    got = lazydf.collect(**final_cudf_collect_kwargs, engine=engine)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/polars/lazyframe/frame.py", line 2066, in collect
    return wrap_df(ldf.collect(callback))
                   ^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: ValueError: not enough values to unpack (expected 3, got 0)
=========================== short test summary info ============================
FAILED tests/experimental/test_groupby.py::test_groupby_agg_empty - polars.exceptions.ComputeError: ValueError: not enough values to unpack (expected 3, got 0)
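
The message "not enough values to unpack (expected 3, got 0)" is what Python raises when a three-way tuple unpacking receives an empty iterable, for example when transposing an empty list with zip(*...). The sketch below reproduces that message in plain Python and shows one possible guard. It is only an illustration: `split_requests` and `split_requests_safe` are hypothetical names, and the cudf-polars code that actually raises may look different.

```python
def split_requests(requests):
    """Transpose a list of 3-tuples into three tuples (hypothetical helper)."""
    # With an empty list, zip(*requests) yields nothing, so the
    # three-way unpacking below raises ValueError.
    first, second, third = zip(*requests)
    return first, second, third


def split_requests_safe(requests):
    """Same as split_requests, but return three empty tuples for empty input."""
    if not requests:
        return (), (), ()
    return split_requests(requests)


try:
    split_requests([])
except ValueError as exc:
    message = str(exc)
    print(message)
```

If the failure does come from a pattern like this, special-casing the empty aggregation list before unpacking would be one way to avoid it.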

Expected behavior

In [4]: df.group_by("y").agg().collect()
Out[4]: 
shape: (3, 1)
┌─────┐
│ y   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
│ 3   │
│ 1   │
└─────┘
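
With no aggregations, the expected output is one row per distinct value of the key column, in no guaranteed order (hence check_row_order=False in the test). A plain-Python sketch of that semantics follows. The `rows` data and the `group_keys` helper are made up for illustration, since the contents of the `df` fixture are not shown here.

```python
def group_keys(rows, key):
    """Return the distinct values of `key`, as a group-by with no aggregations would."""
    # dict.fromkeys de-duplicates while keeping first-seen order; a real
    # engine may return the keys in any order.
    return list(dict.fromkeys(row[key] for row in rows))


# Hypothetical input data, not the fixture used in the test suite.
rows = [
    {"x": 10, "y": 2},
    {"x": 20, "y": 3},
    {"x": 30, "y": 1},
    {"x": 40, "y": 2},
]
keys = group_keys(rows, "y")
print(sorted(keys))
```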

Environment overview

  • Environment location: Bare-metal
  • Method of cuDF install: from source

Labels: bug (Something isn't working), cudf-polars (Issues specific to cudf-polars)