What happened?
The Polars backend's array join operation (ArrayStringJoin) does not behave as documented.
When the array is empty, the documented result is NULL, but the Polars backend returns an empty string (see ID 3 below).
import ibis
from ibis import _

ibis.set_backend("polars")

table = ibis.memtable(
    [
        {"id": 1, "arr": ["a", "b", "c"], "expected": "a|b|c"},
        {"id": 2, "arr": None, "expected": None},
        {"id": 3, "arr": [], "expected": None},
        {"id": 4, "arr": ["b", None], "expected": "b"},
    ]
)
# Join each array with "|" and compare against the expected column
table.mutate(
    result=_["arr"].join("|"),
).preview()
┏━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┓
┃ id ┃ arr ┃ expected ┃ result ┃
┡━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━┩
│ int64 │ array<string> │ string │ string │
├───────┼────────────────────┼──────────┼────────┤
│ 1 │ ['a', 'b', ... +1] │ a|b|c │ a|b|c │
│ 2 │ NULL │ NULL │ NULL │
│ 3 │ [] │ NULL │ ~ │ <-- issue here
│ 4 │ ['b', None] │ b │ b │
└───────┴────────────────────┴──────────┴────────┘
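For reference, the semantics implied by the expected column can be written as a small plain-Python function. This is only a sketch of the four cases listed in this report; the name expected_join is hypothetical and is not part of ibis or Polars.

```python
from typing import Optional, Sequence


def expected_join(
    arr: Optional[Sequence[Optional[str]]], sep: str
) -> Optional[str]:
    """Reference semantics taken from the `expected` column of this report."""
    # A NULL array or an empty array should both produce NULL.
    if arr is None or len(arr) == 0:
        return None
    # NULL elements are skipped rather than propagated (row 4).
    # Note: an array containing only NULLs is not covered by the report;
    # this sketch would return an empty string for it.
    return sep.join(item for item in arr if item is not None)


rows = [["a", "b", "c"], None, [], ["b", None]]
results = [expected_join(arr, "|") for arr in rows]
print(results)
```

The empty-array case (ID 3) is the only one where the Polars backend currently disagrees with this reference.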
Proposed change for code:
@translate.register(ops.ArrayStringJoin)
def array_string_join(op, **kw):
    arg = translate(op.arg, **kw)
    sep = _literal_value(op.sep)
    return (
        pl.when(arg.list.len() > 0)
        .then(arg.list.join(sep))
        .otherwise(None)
    )
What version of ibis are you using?
10.1.0
What backend(s) are you using, if any?
polars
Activity
cpcloud commented on Feb 27, 2025
Thanks for the issue, this seems like a reasonable change in implementation!
cpcloud commented on Feb 27, 2025
@GLeurquin Are you interested in making a PR with that change?
GLeurquin commented on Feb 28, 2025
Here is a PR: #10913
However, I don't quite understand how to add this example to the test suite to prevent regressions in the future. If you can point me to where it should go, I can update the PR.