Commit c4a8503

Use Narwhals to support more DataFrame types
Remove support for Python 3.7
1 parent a3dfd99 commit c4a8503

15 files changed

+399
-141
lines changed

.github/workflows/continuous-integration.yml

Lines changed: 14 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -40,20 +40,22 @@ jobs:
4040
strategy:
4141
fail-fast: false
4242
matrix:
43-
python-version: [3.7, 3.8, 3.9, "3.10", "3.11", "3.12", "3.13"]
43+
python-version: [3.8, 3.9, "3.10", "3.11", "3.12", "3.13"]
4444
pandas-version: [latest]
4545
numpy-version: [latest]
4646
include:
47-
- python-version: 3.7
48-
pandas-version: '<1.0'
4947
- python-version: 3.9
5048
pandas-version: '<2.0'
5149
numpy-version: '<2.0'
5250
- python-version: "3.13"
5351
pandas-version: pre
5452
polars: true
53+
- python-version: "3.13"
54+
uninstall_narwhals: true
5555
- python-version: "3.13"
5656
uninstall_jinja2: true
57+
- python-version: "3.13"
58+
modin: true
5759
runs-on: ubuntu-20.04
5860
steps:
5961
- name: Checkout
@@ -85,10 +87,17 @@ jobs:
8587

8688
- name: Install polars
8789
if: matrix.polars
88-
run: pip install -e .[polars]
90+
run: pip install polars
91+
92+
- name: Install modin
93+
if: matrix.modin
94+
run: pip install modin[all]
95+
96+
- name: Uninstall narwhals
97+
if: matrix.uninstall_narwhals
98+
run: pip uninstall narwhals -y
8999

90100
- name: Install shiny
91-
if: matrix.python-version != '3.7'
92101
run: pip install "shiny>=1.0"
93102

94103
- name: Uninstall jinja2

docs/_toc.yml

Lines changed: 1 addition & 0 deletions
@@ -32,3 +32,4 @@ parts:
3232
chapters:
3333
- file: sample_dataframes
3434
- file: polars_dataframes
35+
- file: modin_dataframes

docs/changelog.md

Lines changed: 7 additions & 0 deletions
@@ -1,6 +1,13 @@
11
ITables ChangeLog
22
=================
33

4+
2.3.0-dev
5+
---------
6+
7+
**Added**
8+
- In addition to Pandas and Polars, ITables now supports Modin DataFrames ([#325](https://github.com/mwouts/itables/issues/325)). Under the hood, we use [Narwhals](https://github.com/narwhals-dev/narwhals) to handle the different types of DataFrames. Thanks to [Dea María Léon](https://github.com/DeaMariaLeon) and to [Marco Gorelli](https://github.com/MarcoGorelli) for making this work, and for developing Narwhals too!
9+
10+
411
2.2.4 (2024-12-07)
512
------------------
613

docs/modin_dataframes.md

Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
1+
---
2+
jupytext:
3+
formats: md:myst
4+
notebook_metadata_filter: -jupytext.text_representation.jupytext_version
5+
text_representation:
6+
extension: .md
7+
format_name: myst
8+
format_version: 0.13
9+
kernelspec:
10+
display_name: itables
11+
language: python
12+
name: itables
13+
---
14+
15+
# Modin DataFrames
16+
17+
In this notebook we make sure that our test dataframes are displayed nicely with the default `itables` settings.
18+
19+
```{code-cell}
20+
from itables import init_notebook_mode, show
21+
from itables.sample_dfs import get_dict_of_test_modin_dfs
22+
23+
dict_of_test_dfs = get_dict_of_test_modin_dfs()
24+
init_notebook_mode(all_interactive=True)
25+
```
26+
27+
## empty
28+
29+
```{code-cell}
30+
show(dict_of_test_dfs["empty"])
31+
```
32+
33+
## No rows
34+
35+
```{code-cell}
36+
show(dict_of_test_dfs["no_rows"])
37+
```
38+
39+
## No rows one column
40+
41+
```{code-cell}
42+
show(dict_of_test_dfs["no_rows_one_column"])
43+
```
44+
45+
## No columns
46+
47+
```{code-cell}
48+
show(dict_of_test_dfs["no_columns"])
49+
```
50+
51+
## No columns one row
52+
53+
```{code-cell}
54+
show(dict_of_test_dfs["no_columns_one_row"])
55+
```
56+
57+
## bool
58+
59+
```{code-cell}
60+
show(dict_of_test_dfs["bool"])
61+
```
62+
63+
## Nullable boolean
64+
65+
```{code-cell}
66+
show(dict_of_test_dfs["nullable_boolean"])
67+
```
68+
69+
## int
70+
71+
```{code-cell}
72+
show(dict_of_test_dfs["int"])
73+
```
74+
75+
## Nullable integer
76+
77+
```{code-cell}
78+
show(dict_of_test_dfs["nullable_int"])
79+
```
80+
81+
## float
82+
83+
```{code-cell}
84+
show(dict_of_test_dfs["float"])
85+
```
86+
87+
## str
88+
89+
```{code-cell}
90+
show(dict_of_test_dfs["str"])
91+
```
92+
93+
## time
94+
95+
```{code-cell}
96+
show(dict_of_test_dfs["time"])
97+
```
98+
99+
## object
100+
101+
```{code-cell}
102+
show(dict_of_test_dfs["object"])
103+
```
104+
105+
## ordered_categories
106+
107+
```{code-cell}
108+
show(dict_of_test_dfs["ordered_categories"])
109+
```
110+
111+
## ordered_categories_in_multiindex
112+
113+
```{code-cell}
114+
show(dict_of_test_dfs["ordered_categories_in_multiindex"])
115+
```
116+
117+
## countries
118+
119+
```{code-cell}
120+
:tags: [full-width]
121+
122+
show(dict_of_test_dfs["countries"])
123+
```
124+
125+
## capital
126+
127+
```{code-cell}
128+
show(dict_of_test_dfs["capital"])
129+
```
130+
131+
## int_float_str
132+
133+
```{code-cell}
134+
show(dict_of_test_dfs["int_float_str"])
135+
```
136+
137+
## wide
138+
139+
```{code-cell}
140+
:tags: [full-width]
141+
142+
show(dict_of_test_dfs["wide"], maxBytes=100000, maxColumns=100, scrollX=True)
143+
```
144+
145+
## long_column_names
146+
147+
```{code-cell}
148+
:tags: [full-width]
149+
150+
show(dict_of_test_dfs["long_column_names"], scrollX=True)
151+
```
152+
153+
## named_column_index
154+
155+
```{code-cell}
156+
show(dict_of_test_dfs["named_column_index"])
157+
```
158+
159+
## big_integers
160+
161+
```{code-cell}
162+
show(dict_of_test_dfs["big_integers"])
163+
```

docs/polars_dataframes.md

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ dataframes are displayed nicely with the default `itables` settings.
1919

2020
```{code-cell}
2121
from itables import init_notebook_mode, show
22-
from itables.sample_dfs import get_dict_of_test_dfs
22+
from itables.sample_dfs import get_dict_of_test_polars_dfs
2323
24-
dict_of_test_dfs = get_dict_of_test_dfs(polars=True)
24+
dict_of_test_dfs = get_dict_of_test_polars_dfs()
2525
init_notebook_mode(all_interactive=True)
2626
```
2727

pyproject.toml

Lines changed: 2 additions & 3 deletions
@@ -18,16 +18,15 @@ classifiers = [
1818
"Intended Audience :: Science/Research",
1919
"Programming Language :: Python",
2020
"Programming Language :: Python :: 3",
21-
"Programming Language :: Python :: 3.7",
2221
"Programming Language :: Python :: 3.8",
2322
"Programming Language :: Python :: 3.9",
2423
"Programming Language :: Python :: 3.10",
2524
"Programming Language :: Python :: 3.11",
2625
"Programming Language :: Python :: 3.12",
2726
"Programming Language :: Python :: 3.13",
2827
]
29-
requires-python = ">= 3.7"
30-
dependencies = ["IPython", "pandas", "numpy"]
28+
requires-python = ">= 3.8"
29+
dependencies = ["IPython", "pandas", "numpy", "narwhals>=1.18.3"]
3130
dynamic = ["version"]
3231

3332
[project.optional-dependencies]

src/itables/datatables_format.py

Lines changed: 8 additions & 13 deletions
@@ -6,12 +6,6 @@
66
import pandas as pd
77
import pandas.io.formats.format as fmt
88

9-
try:
10-
import polars as pl
11-
except ImportError:
12-
pl = None
13-
14-
159
JS_MAX_SAFE_INTEGER = 2**53 - 1
1610
JS_MIN_SAFE_INTEGER = -(2**53 - 1)
1711

@@ -91,8 +85,7 @@ def datatables_rows(df, count=None, warn_on_unexpected_types=False, pure_json=Fa
9185
assert missing_columns > 0
9286
empty_columns = [[None] * len(df)] * missing_columns
9387

94-
try:
95-
# Pandas DataFrame
88+
if isinstance(df, pd.DataFrame):
9689
data = list(
9790
zip(
9891
*(empty_columns + [_format_column(x, pure_json) for _, x in df.items()])
@@ -108,17 +101,19 @@ def datatables_rows(df, count=None, warn_on_unexpected_types=False, pure_json=Fa
108101
cls=generate_encoder(warn_on_unexpected_types),
109102
allow_nan=not pure_json,
110103
)
111-
except AttributeError:
112-
# Polars DataFrame
104+
else:
105+
# Polars, Modin, or other
106+
import narwhals as nw
107+
108+
df = nw.from_native(df)
113109
data = df.rows()
114-
import polars as pl
115110

116111
has_bigints = any(
117112
(
118-
x.dtype == pl.Int64
113+
x.dtype == nw.Int64
119114
and ((x > JS_MAX_SAFE_INTEGER).any() or (x < JS_MIN_SAFE_INTEGER).any())
120115
)
121-
or (x.dtype == pl.UInt64 and (x > JS_MAX_SAFE_INTEGER).any())
116+
or (x.dtype == nw.UInt64 and (x > JS_MAX_SAFE_INTEGER).any())
122117
for x in (df[col] for col in df.columns)
123118
)
124119
js = json.dumps(data, cls=generate_encoder(False), allow_nan=not pure_json)
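
The `has_bigints` check in this hunk guards against integers that JavaScript cannot represent exactly. As a minimal, library-free sketch of the same range test (the helper name `has_unsafe_integers` is hypothetical and not part of `itables`; the actual code applies the comparison to Narwhals `Int64` and `UInt64` columns rather than to Python lists):

```python
# Same bounds as in src/itables/datatables_format.py
JS_MAX_SAFE_INTEGER = 2**53 - 1
JS_MIN_SAFE_INTEGER = -(2**53 - 1)


def has_unsafe_integers(values):
    """Return True if any integer lies outside JavaScript's safe integer range.

    Hypothetical helper for illustration only; `values` is any iterable of ints.
    """
    return any(
        v > JS_MAX_SAFE_INTEGER or v < JS_MIN_SAFE_INTEGER
        for v in values
    )


if __name__ == "__main__":
    print(has_unsafe_integers([1, 2, 3]))
    print(has_unsafe_integers([2**53]))
```

Values at the bounds themselves (`2**53 - 1` and its negative) are still safe; only values strictly beyond them trigger the check.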

src/itables/downsample.py

Lines changed: 21 additions & 25 deletions
@@ -2,15 +2,13 @@
22

33
import pandas as pd
44

5-
from .datatables_format import _isetitem
6-
75

86
def nbytes(df):
9-
try:
7+
if isinstance(df, pd.DataFrame):
108
return sum(x.values.nbytes for _, x in df.items())
11-
except AttributeError:
12-
# Polars DataFrame
13-
return df.estimated_size()
9+
10+
# Narwhals
11+
return df.estimated_size()
1412

1513

1614
def as_nbytes(mem):
@@ -97,31 +95,36 @@ def _downsample(df, max_rows=0, max_columns=0, max_bytes=0, target_aspect_ratio=
9795
second_half = max_rows // 2
9896
first_half = max_rows - second_half
9997
if second_half:
100-
try:
98+
if isinstance(df, pd.DataFrame):
10199
df = pd.concat((df.iloc[:first_half], df.iloc[-second_half:]))
102-
except AttributeError:
103-
df = df.head(first_half).vstack(df.tail(second_half))
100+
else:
101+
from narwhals import concat
102+
103+
df = concat([df.head(first_half), df.tail(second_half)], how="vertical")
104104
else:
105-
try:
105+
if isinstance(df, pd.DataFrame):
106106
df = df.iloc[:first_half]
107-
except AttributeError:
107+
else:
108108
df = df.head(first_half)
109109

110110
if len(df.columns) > max_columns > 0:
111111
second_half = max_columns // 2
112112
first_half = max_columns - second_half
113113
if second_half:
114-
try:
114+
if isinstance(df, pd.DataFrame):
115115
df = pd.concat(
116116
(df.iloc[:, :first_half], df.iloc[:, -second_half:]), axis=1
117117
)
118-
except AttributeError:
119-
df = df[df.columns[:first_half]].hstack(df[df.columns[-second_half:]])
118+
else:
119+
first_and_last_columns = (
120+
df.columns[:first_half] + df.columns[-second_half:]
121+
)
122+
df = df.select(first_and_last_columns)
120123
else:
121-
try:
124+
if isinstance(df, pd.DataFrame):
122125
df = df.iloc[:, :first_half]
123-
except AttributeError:
124-
df = df[df.columns[:first_half]]
126+
else:
127+
df = df.select(df.columns[:first_half])
125128

126129
df_nbytes = nbytes(df)
127130
if df_nbytes > max_bytes > 0:
@@ -144,13 +147,6 @@ def _downsample(df, max_rows=0, max_columns=0, max_bytes=0, target_aspect_ratio=
144147
)
145148

146149
# max_bytes is smaller than the average size of one cell
147-
try:
148-
df = df.iloc[:1, :1]
149-
_isetitem(df, 0, ["..."])
150-
except AttributeError:
151-
import polars as pl # noqa
152-
153-
df = pl.DataFrame({df.columns[0]: ["..."]})
154-
return df
150+
return pd.DataFrame({df.columns[0]: ["..."]})
155151

156152
return df
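
The row and column truncation in `_downsample` keeps the first `first_half` and the last `second_half` items. A minimal sketch of that split on a plain Python list, assuming a hypothetical helper `keep_first_and_last` (not part of `itables`), may help show why the code branches on `second_half`: a slice such as `items[-0:]` returns the whole list rather than nothing.

```python
def keep_first_and_last(items, max_items):
    """Keep at most `max_items` entries, taken from both ends of `items`.

    Hypothetical helper for illustration; it mirrors the first_half /
    second_half arithmetic of `_downsample` on a plain list.
    """
    # Same guard as `len(df.columns) > max_columns > 0`: zero means "no limit"
    if not len(items) > max_items > 0:
        return list(items)

    second_half = max_items // 2
    first_half = max_items - second_half

    if second_half:
        return list(items[:first_half]) + list(items[-second_half:])

    # second_half == 0 (max_items == 1): items[-0:] would be the full list,
    # so only the head is kept
    return list(items[:first_half])


if __name__ == "__main__":
    print(keep_first_and_last(list(range(10)), 5))
```

When `max_items` is odd, the extra item goes to the head, matching `first_half = max_rows - second_half` in the diff.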
