Refactor groupby to rely less on storing keys as Index objects #12037

@shwina

#11792 introduces the ability to group on list columns. In the future, we can expect to support grouping by other types that pandas does not support, such as structs.

In #6932, we made the decision not to support creating an Index with elements of type list.

Unfortunately, our groupby internals rely heavily on being able to store the key columns of a groupby as an Index. In particular, the internal _Grouping.keys method is heavily used.

We should rely less on storing keys as Index objects, which will make it much easier to support grouping by lists and structs.
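As a rough illustration of the direction proposed above, here is a minimal pure-Python sketch of grouping keys held as plain columns rather than as an Index. It is not cuDF's actual implementation: `GroupingKeys`, `group_indices`, and `_hashable` are hypothetical names invented for this example, and Python lists and dicts stand in for list and struct columns.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class GroupingKeys:
    """Hypothetical container: key columns stored as plain columns, not an Index."""

    names: list[str]
    columns: list[list[Any]]  # one list of values per key column

    def rows(self) -> list[tuple]:
        # Transpose the columns into one tuple of key values per row.
        return list(zip(*self.columns))


def _hashable(value: Any) -> Any:
    # Nested lists (list keys) and dicts (struct keys) are not hashable,
    # so convert them to tuples before using them as dictionary keys.
    if isinstance(value, list):
        return tuple(_hashable(v) for v in value)
    if isinstance(value, dict):
        return tuple((k, _hashable(v)) for k, v in sorted(value.items()))
    return value


def group_indices(keys: GroupingKeys) -> dict[tuple, list[int]]:
    # Map each distinct key row to the positions of the rows that carry it.
    groups: dict[tuple, list[int]] = {}
    for i, row in enumerate(keys.rows()):
        groups.setdefault(tuple(_hashable(v) for v in row), []).append(i)
    return groups


# Group by a single list-typed key column.
list_keys = GroupingKeys(names=["k"], columns=[[[1, 2], [3], [1, 2]]])
list_groups = group_indices(list_keys)

# Group by a single struct-like key column.
struct_keys = GroupingKeys(names=["s"], columns=[[{"a": 1}, {"a": 2}, {"a": 1}]])
struct_groups = group_indices(struct_keys)
```

Because the keys never pass through an Index here, nothing in the grouping step depends on whether an Index can hold list or struct elements.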

Metadata

Labels: 0 - Backlog (In queue waiting for assignment), Python (Affects Python cuDF API.), feature request (New feature or request)
