separate pl.list() and pl.concat_list #17307

Closed as not planned
@mcrumiller

Description

Edit: I just noticed that the 1.0 documentation has more detail than the current 0.20 documentation: it clarifies that non-list dtypes are cast to lists prior to concatenation. My proposal still stands.

Edit 2: found #8510 which seems to be the same issue/request. I'll wait for @stinodego's feedback before closing.


The description of pl.concat_list is:

Horizontally concatenate columns into a single list column.

This is confusing, as discussed in #17294, since the name might imply concatenating existing lists into a single list. This is the current behavior on lists:

import polars as pl
df = pl.DataFrame({
    "a": [[1]],    # pl.List(pl.Int64)
    "b": [[2]],
})

df.select(pl.concat_list("a", "b"))
# shape: (1, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2]    │  <--lists concatenated together
# └───────────┘

However, concat_list also combines the values of non-list columns into a list:

import polars as pl
df = pl.DataFrame({
    "a": [1],      # pl.Int64
    "b": [2],
})

df.select(pl.concat_list("a", "b"))
# shape: (1, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2]    │  <--columns concatenated together
# └───────────┘

Note that the result is identical in both cases, even though the input dtypes differ. Instead, I propose that we have two separate functions:

  • pl.list(a, b, ...) which creates a new pl.List column out of the expressions a, b, .... The dtypes must have a common supertype.
  • pl.concat_list(a, b, ...) where a, b, ... must all be pl.List columns, and they are concatenated into a single list. The inner dtypes must have a common supertype.
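To make the proposed split concrete, here is a minimal sketch of the two semantics for a single row, written in plain Python rather than Polars. The names make_list_row and concat_list_row are hypothetical and exist only for illustration; neither is a Polars function, and supertype casting is not modeled.

```python
from typing import Any


def make_list_row(*values: Any) -> list[Any]:
    """Row-level model of the proposed pl.list: wrap the values as they are."""
    return list(values)


def concat_list_row(*values: Any) -> list[Any]:
    """Row-level model of the proposed strict pl.concat_list.

    Every input must already be a list; the lists are joined end to end.
    """
    out: list[Any] = []
    for value in values:
        if not isinstance(value, list):
            raise TypeError(f"expected a list, got {type(value).__name__}")
        out.extend(value)
    return out


print(make_list_row(1, 2))        # [1, 2]
print(make_list_row([1], [2]))    # [[1], [2]]
print(concat_list_row([1], [2]))  # [1, 2]
```

Under this model the two functions only agree when the inputs differ in dtype: scalars passed to make_list_row and lists passed to concat_list_row both give [1, 2], while lists passed to make_list_row give a nested list and scalars passed to concat_list_row raise a TypeError.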

Metadata

Labels

enhancement: New feature or an improvement of an existing feature
