feat: add `list.sort` #3356

raisadz · 2025-12-15T13:52:16Z

Description

I tried some workarounds for the pyarrow edge cases we discussed in #3332 (comment) (None, empty lists, and lists containing only None elements) using pc.sort_indices and pc.replace_with_mask, but the latter doesn't seem to work for list types. There is an open issue, apache/arrow#48060; once it is resolved, this approach should work for pyarrow, pandas and modin.

Also, I opened a couple of issues in sqlframe and ibis related to this PR:

eakmanrq/sqlframe#559
eakmanrq/sqlframe#560
ibis-project/ibis#11735

What type of PR is this? (check all applicable)

Related issues

Related issue #<issue number>
Closes #<issue number>

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

dangotbanned · 2025-12-17T12:26:58Z

@raisadz I still need to loop back to it and clean things up, but I think list.sort will be possible for pyarrow 🙂

feat(expr-ir): Add list.sort #3359

Here's a rough equivalent in polars:

Outdated

Note: I've hidden this because of (#3356 (comment))

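import polars as pl


def list_sort(
    native: pl.Series, *, descending: bool = False, nulls_last: bool = False
) -> pl.Series:
    # Attach a row index so the original row order can be restored at the end.
    idx, name = "index", native.name
    indexed = native.to_frame().with_row_index(name=idx)

    # Only lists with more than one element need sorting;
    # null, empty and single-element lists pass through unchanged.
    len_gt_1 = pl.col(name).list.len() > 1
    valid = indexed.filter(len_gt_1)
    invalid = indexed.filter(pl.col(name).is_null() | ~len_gt_1)

    # Explode to one element per row, sort within each original row, then regroup.
    exploded = valid.explode(name, empty_as_null=False, keep_nulls=False)
    valid_finished = (
        exploded.sort(idx, name, descending=[False, descending], nulls_last=nulls_last)
        .group_by(idx, maintain_order=True)
        .agg(name)
    )
    # Recombine both parts and restore the original row order.
    return pl.concat([valid_finished, invalid], how="vertical").sort(idx).get_column(name)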

And then against the test suite (https://github.com/narwhals-dev/narwhals/pull/3356/files#diff-24b32e940c35026188c24063b47ae536444a402c28879bdf7fa4853c7c5f5a0b)

data = {"a": [[3, 2, 2, 4, -10, None, None], [-1], None, [None, None, None], []]}
ser = pl.DataFrame(data).to_series()

print(list_sort(ser, descending=True, nulls_last=True).to_list())
print(list_sort(ser, descending=True, nulls_last=False).to_list())
print(list_sort(ser, descending=False, nulls_last=True).to_list())
print(list_sort(ser, descending=False, nulls_last=False).to_list())

[[4, 3, 2, 2, -10, None, None], [-1], None, [None, None, None], []]
[[None, None, 4, 3, 2, 2, -10], [-1], None, [None, None, None], []]
[[-10, 2, 2, 3, 4, None, None], [-1], None, [None, None, None], []]
[[None, None, -10, 2, 2, 3, 4], [-1], None, [None, None, None], []]
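
The four outputs above pin down the intended semantics for null lists, empty lists and lists of only None. As a backend-independent cross-check, a plain-Python reference could look like the sketch below. The name `list_sort_reference` is hypothetical and is not part of narwhals or this PR.

```python
from __future__ import annotations


def list_sort_reference(
    rows: list[list[int | None] | None],
    *,
    descending: bool = False,
    nulls_last: bool = False,
) -> list[list[int | None] | None]:
    # Hypothetical reference implementation, for illustration only.
    out: list[list[int | None] | None] = []
    for row in rows:
        if row is None:
            # A null list stays null.
            out.append(None)
            continue
        nulls = [v for v in row if v is None]
        values = sorted((v for v in row if v is not None), reverse=descending)
        # Null placement is independent of the sort direction.
        out.append(values + nulls if nulls_last else nulls + values)
    return out


rows = [[3, 2, 2, 4, -10, None, None], [-1], None, [None, None, None], []]
print(list_sort_reference(rows, descending=True, nulls_last=True))
```

Keeping null placement separate from the sort direction mirrors the four combinations exercised above.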

narwhals/_arrow/utils.py

MarcoGorelli

thanks for working on this! just some comments

MarcoGorelli · 2025-12-23T17:43:06Z

tests/expr_and_series/list/sort_test.py

+def test_sort_expr(request: pytest.FixtureRequest, constructor: Constructor) -> None:
+    if any(backend in str(constructor) for backend in ("dask", "cudf")):
+        request.applymarker(pytest.mark.xfail)
+    if "sqlframe" in str(constructor):
+        # https://github.com/eakmanrq/sqlframe/issues/559
+        # https://github.com/eakmanrq/sqlframe/issues/560
+        request.applymarker(pytest.mark.xfail)
+    if "polars" in str(constructor) and POLARS_VERSION < (0, 20, 5):
+        pytest.skip()
+    if "pandas" in str(constructor):
+        if PANDAS_VERSION < (2, 2):
+            pytest.skip()
+        pytest.importorskip("pyarrow")
+    result = nw.from_native(constructor(data)).select(
+        nw.col("a").cast(nw.List(nw.Int32())).list.sort()
+    )
+    assert_equal_data(result, {"a": expected_asc_nulls_first})


maybe we can skip this test if the below already tests all possibilities?

MarcoGorelli · 2025-12-23T17:43:09Z

narwhals/_pandas_like/series_list.py

+        )
+        result_native = type(self.native)(
+            result_arr, dtype=out_dtype, index=self.native.index, name=self.native.name
+        )


@FBruzzesi factored some repeated logic here out using _apply_pyarrow_compute_func, is it possible to use that here?

MarcoGorelli · 2025-12-23T17:43:19Z

tests/expr_and_series/list/sort_test.py

+    assert_equal_data(result, {"a": expected})
+
+
+def test_sort_series(


raisadz added 5 commits December 15, 2025 12:57

feat: add list_sort

97af862

Merge remote-tracking branch 'upstream/main' into feat/list-sort

ab3c88e

add not_implemented to pyarrow and pandas

7c2f1f3

skip old polars as nulls_last arg was not implemented

42cfcd5

skip old polars again

4402e98

raisadz marked this pull request as ready for review December 15, 2025 14:38

raisadz changed the title ~~feat: add list_sort~~ feat: add list.sort Dec 15, 2025

FBruzzesi added the enhancement label Dec 15, 2025

update pyarrow issue link

5ce0cf9

dangotbanned mentioned this pull request Dec 17, 2025

feat(RFC): A richer Expr IR #2572

Draft

dangotbanned added a commit that referenced this pull request Dec 17, 2025

test: Port tests from (#3356)

6402103

dangotbanned mentioned this pull request Dec 17, 2025

feat(expr-ir): Add list.sort #3359

Merged

raisadz added 4 commits December 18, 2025 15:42

implement pyarrow sort

f1308bb

Merge remote-tracking branch 'upstream/main' into feat/list-sort

7b0bc82

use arange from utils

ea59694

skip old pandas

58b86ad

dangotbanned reviewed Dec 18, 2025

narwhals/_arrow/utils.py (resolved)

raisadz added 3 commits December 19, 2025 15:29

change var names

aa16798

fix typing

5343e0f

Merge remote-tracking branch 'upstream/main' into feat/list-sort

d0a06a2

MarcoGorelli reviewed Dec 23, 2025

