[BUG] ValueError in cudf-polars experimental group_by.agg with no aggregations #18276

Open
@TomAugspurger

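Describe the bug

With the cudf-polars experimental executor, calling group_by(...).agg() with no aggregation expressions fails with "ValueError: not enough values to unpack (expected 3, got 0)" instead of returning the unique group keys.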

Steps/Code to reproduce bug

Add the following test to cudf_polars/tests/experimental/test_groupby.py:

def test_groupby_agg_empty(df: pl.LazyFrame, engine: pl.GPUEngine) -> None:
    q = df.group_by("y").agg()
    assert_gpu_result_equal(q, engine=engine, check_row_order=False)

That fails with:

____________________________ test_groupby_agg_empty ____________________________
Traceback (most recent call last):
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 341, in from_call
    result: Optional[TResult] = func()
                                ^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 262, in <lambda>
    lambda: ihook(item=item, **kwds), when=when, reraise=reraise
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 182, in _multicall
    return outcome.get_result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_result.py", line 100, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 177, in pytest_runtest_call
    raise e
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/runner.py", line 169, in pytest_runtest_call
    item.runtest()
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/python.py", line 1792, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/_pytest/python.py", line 194, in pytest_pyfunc_call
    result = testfunction(**testargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/cudf/python/cudf_polars/tests/experimental/test_groupby.py", line 133, in test_groupby_agg_empty
    assert_gpu_result_equal(q, engine=engine, check_row_order=False)
  File "/home/coder/cudf/python/cudf_polars/cudf_polars/testing/asserts.py", line 99, in assert_gpu_result_equal
    got = lazydf.collect(**final_cudf_collect_kwargs, engine=engine)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/coder/.conda/envs/rapids/lib/python3.12/site-packages/polars/lazyframe/frame.py", line 2066, in collect
    return wrap_df(ldf.collect(callback))
                   ^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: ValueError: not enough values to unpack (expected 3, got 0)
=========================== short test summary info ============================
FAILED tests/experimental/test_groupby.py::test_groupby_agg_empty - polars.exceptions.ComputeError: ValueError: not enough values to unpack (expected 3, got 0)
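
The traceback above does not show where inside cudf-polars the ValueError is raised, so the following is only a sketch of one common way this exact message arises in plain Python: transposing a list of 3-tuples with zip(*...) and unpacking into three names when the list is empty. The names decompose, split_stages_unguarded, and split_stages_guarded are hypothetical and are not taken from the cudf-polars source.

```python
def decompose(expr: str) -> tuple[str, str, str]:
    """Hypothetical helper: split one aggregation into three stages."""
    return (f"select({expr})", f"partial({expr})", f"final({expr})")


def split_stages_unguarded(exprs: list[str]):
    # Transpose a list of 3-tuples into three parallel tuples.
    # With an empty list, zip() yields nothing and the unpack fails.
    selection, partial, final = zip(*(decompose(e) for e in exprs))
    return selection, partial, final


def split_stages_guarded(exprs: list[str]):
    # Handle the "no aggregations" case explicitly before unpacking.
    if not exprs:
        return (), (), ()
    return split_stages_unguarded(exprs)


message = None
try:
    split_stages_unguarded([])
except ValueError as err:
    message = str(err)

print(message)
print(split_stages_guarded([]))
```

If the failure does come from a pattern like this, an explicit empty-input branch (as in split_stages_guarded) would let the group-by fall through to returning only the distinct keys. That is an assumption to verify against the actual code path.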

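Expected behavior

The query should return one row per distinct group key, as it does when collected without the GPU engine: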

In [4]: df.group_by("y").agg().collect()
Out[4]: 
shape: (3, 1)
┌─────┐
│ y   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
│ 3   │
│ 1   │
└─────┘

Environment overview

  • Environment location: Bare-metal
  • Method of cuDF install: from source


Labels

  • bug: Something isn't working
  • cudf.polars: Issues specific to cudf.polars