merge_sorted on categorical column panics in revmap when printing result #21952

Open
@magbak

Description

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

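import polars as pl

pl.string_cache.enable_string_cache()  # both frames share one global category mapping
df1 = pl.DataFrame({
    "a": ["a", "b", "c"],
}).cast(pl.Categorical(ordering="lexical"))
df2 = pl.DataFrame({
    "a": ["a", "b", "d"],  # "d" does not occur in df1
}).cast(pl.Categorical(ordering="lexical"))

df = df1.merge_sorted(df2, key="a")  # returns without error
print(df)  # panics here, while the result is being formatted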

Log output

thread '<unnamed>' panicked at crates/polars-core/src/chunked_array/logical/categorical/revmap.rs:99:42:
called `Option::unwrap()` on a `None` value
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/home/mag/repos/my_repo/testit.py", line 12, in <module>
    print(df)
  File "/home/mag/miniconda3/envs/myenv/lib/python3.12/site-packages/polars/dataframe/frame.py", line 1188, in __str__
    return self._df.as_str()
           ^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: called `Option::unwrap()` on a `None` value

Issue description

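Printing the result of merge_sorted on lexically ordered Categorical columns panics in revmap.rs; the merge_sorted call itself returns without error. This is likely a recent regression, introduced after 0.45.1 (Rust).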

Expected behavior

shape: (6, 1)
┌─────┐
│ a   │
│ --- │
│ cat │
╞═════╡
│ a   │
│ a   │
│ b   │
│ b   │
│ c   │
│ d   │
└─────┘
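
For reference, the expected rows are just the ordinary two-way merge of the two sorted inputs. The sketch below reproduces that ordering in pure Python with `heapq.merge`; it is only an illustration of the expected semantics, not how Polars implements `merge_sorted`, and the helper name `merge_sorted_reference` is made up for this example.

```python
import heapq


def merge_sorted_reference(left, right):
    """Merge two already-sorted lists of strings into one sorted list.

    Hypothetical helper for illustration only; it compares plain Python
    strings, which matches lexical ordering for these ASCII values.
    """
    return list(heapq.merge(left, right))


if __name__ == "__main__":
    # Same values as df1 and df2 in the reproducible example.
    for value in merge_sorted_reference(["a", "b", "c"], ["a", "b", "d"]):
        print(value)
```

Comparing the printed values against the table above gives a quick check of the expected output that does not depend on the Categorical code path.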

Installed versions

--------Version info---------
Polars:              1.26.0
Index type:          UInt32
Platform:            Linux-6.11.0-19-generic-x86_64-with-glibc2.39
Python:              3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 16:31:09) [GCC 11.2.0]
LTS CPU:             False

----Optional dependencies----
Azure CLI            2.70.0
adbc_driver_manager  <not installed>
altair               5.5.0
azure.identity       1.21.0
boto3                <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
google.auth          2.38.0
great_tables         <not installed>
matplotlib           <not installed>
numpy                1.26.4
openpyxl             3.1.5
pandas               2.2.3
polars_cloud         <not installed>
pyarrow              16.1.0
pydantic             2.10.5
pyiceberg            <not installed>
sqlalchemy           2.0.39
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>

Labels: bug, needs triage, python
