Commit 6f94fde

Merge pull request #7 from mwouts/version_0.2
Version 0.2
2 parents 1470e14 + 1be3114 commit 6f94fde

14 files changed: +575 -1280 lines changed

CHANGELOG.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+0.2.0 (2019-11-20)
+------------------
+
+Added
+=====
+- Large tables are downsampled (#2)
+
+Changed
+=======
+- Javascript code moved to Javascript files
+
+Fixed
+=====
+- Tables with many columns are now well rendered (#5)
+
+
+0.1.0 (2019-04-23)
+------------------
+
+Initial release

HISTORY.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

README.md

Lines changed: 23 additions & 14 deletions
@@ -19,7 +19,6 @@ df

 You don't see any table above? Please either open the [HTML export](https://mwouts.github.io/itables/) of this notebook, or run this README on [Binder](https://mybinder.org/v2/gh/mwouts/itables/master?filepath=README.md)!

-
 # Quick start

 Install the package with
@@ -129,16 +128,10 @@ show(

 ## Column width

-FIXME: This does not appear to be working...
-
-```python
-show(df, columnDefs=[{"width": "200px", "target": 3}])
-```
-
-But in some cases - a table with many column like the one below, we can use the `width` parameter...
+For tables that are larger than the notebook, the `columnDefs` argument allows to specify the desired width. If you wish you can also change the default in `itables.options`.

 ```python
-show(x.to_frame().T, columnDefs=[{"width": "80px", "targets": "_all"}])
+show(x.to_frame().T, columnDefs=[{"width": "120px", "targets": "_all"}])
 ```

 ## HTML in cells
@@ -169,22 +162,37 @@ Not currently implemented. May be made available at a later stage using the [sel
 Not currently implemented. May be made available at a later stage thanks to the [buttons](https://datatables.net/extensions/buttons/) extension for datatable.


-## Large table support
+## Downsampling

-`itables` will not display dataframes that are larger than `maxBytes`, which is equal to 1MB by default. Truncate the dataframe with `df.head()`, or set the `maxBytes` parameter or option to an other value to display the dataframe. Or deactivate the limit with `maxBytes=0`.
+When the data in a table is larger than `maxBytes`, which is equal to 64KB by default, `itables` will display only a subset of the table - one that fits into `maxBytes`. If you wish, you can deactivate the limit with `maxBytes=0`, change the value of `maxBytes`, or similarly set a limit on the number of rows (`maxRows`, defaults to 0) or columns (`maxColumns`, defaults to `pd.get_option('display.max_columns')`).

 Note that datatables support [server-side processing](https://datatables.net/examples/data_sources/server_side). At a later stage we may implement support for larger tables using this feature.

 ```python
-df = wb.get_indicators()
+df = wb.get_indicators().head(500)
+opt.maxBytes = 10000
 df.values.nbytes
 ```

 ```python
-opt.maxBytes = 1000000
 df
 ```

+To show the table in full, we can modify the value of `maxBytes` either locally:
+
+```python
+show(df, maxBytes=0)
+```
+
+or globally:
+
+```python
+opt.maxBytes = 2**20
+df
+```
+
+The `maxRows` and `maxColumns` arguments work similarly.
+
 # References

 ## DataTables
@@ -195,7 +203,8 @@ df

 ## Alternatives

-ITables is not a Jupyter widget, which means that it does not allows you to **edit** the content of the dataframe.
+ITables uses basic Javascript, and because of this it will only work in Jupyter Notebook, not in JupyterLab. It is not a Jupyter widget, which means that it does not allows you to **edit** the content of the dataframe.
+
 If you are looking for Jupyter widgets, have a look at
 - [QGrid](https://github.com/quantopian/qgrid) by Quantopian
 - [IPyaggrid](https://dgothrek.gitlab.io/ipyaggrid/) by Louis Raison and Olivier Borderies
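
As a rough illustration of the Downsampling section in the README diff above, the sketch below estimates how many rows and columns survive a given `maxBytes`. It follows the halving loop that this commit adds in `itables/downsample.py`, but it works on plain integers and assumes every cell takes the same number of bytes, so a single pass is enough. `target_shape` is a hypothetical helper name, not part of itables, and the 4-column, 8-byte table in the usage line is an assumed example rather than the actual shape of `wb.get_indicators()`.

```python
def target_shape(n_rows, n_columns, nbytes, max_bytes):
    """Estimate the (rows, columns) kept for a byte budget.

    Returns None when even a single cell would exceed the budget.
    Assumes a uniform cell size (an assumption of this sketch).
    """
    # A limit of 0 means "no limit", as described in the README
    if max_bytes <= 0 or nbytes <= max_bytes:
        return n_rows, n_columns

    # Number of cells that fit in the budget
    max_product = n_rows * n_columns / (float(nbytes) / max_bytes)

    # Halve the rows, then the columns, until the cell count fits
    while max_product >= 1:
        n_rows = max(n_rows // 2, 1)
        if n_rows * n_columns <= max_product:
            return n_rows, n_columns
        n_columns = max(n_columns // 2, 1)
        if n_rows * n_columns <= max_product:
            return n_rows, n_columns

    return None


# Assumed example: 500 rows x 4 columns of 8-byte values, budget of 10000 bytes
print(target_shape(500, 4, 500 * 4 * 8, 10000))
```

With these assumed numbers the table holds 16000 bytes, so one halving of the rows (500 to 250) is enough to fit the 10000-byte budget and the columns are left untouched.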

index.html

Lines changed: 275 additions & 1163 deletions
Large diffs are not rendered by default.

itables/downsample.py

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+import pandas as pd
+import logging
+
+logging.basicConfig()
+logger = logging.getLogger(__name__)
+
+
+def downsample(df, max_rows=0, max_columns=0, max_bytes=0):
+    """Return a subset of the dataframe that fits the limits"""
+    org_rows, org_columns, org_bytes = len(df.index), len(df.columns), df.values.nbytes
+    df = _downsample(df, max_rows=max_rows, max_columns=max_columns, max_bytes=max_bytes)
+
+    if len(df.index) < org_rows or len(df.columns) < org_columns:
+        reasons = []
+        if org_rows > max_rows > 0:
+            reasons.append('maxRows={}'.format(max_rows))
+        if org_columns > max_columns > 0:
+            reasons.append('maxColumns={}'.format(max_columns))
+        if org_bytes > max_bytes > 0:
+            reasons.append('nbytes={}>{}=maxBytes'.format(org_bytes, max_bytes))
+
+        logger.warning('showing {}x{} of {}x{} as {}. See https://mwouts.github.io/itables/#downsampling'.format(
+            len(df.index), len(df.columns), org_rows, org_columns, ' and '.join(reasons)))
+
+    return df
+
+
+def _downsample(df, max_rows=0, max_columns=0, max_bytes=0):
+    """Implementation of downsample - may be called recursively"""
+    if len(df.index) > max_rows > 0:
+        second_half = max_rows // 2
+        first_half = max_rows - second_half
+        if second_half:
+            df = pd.concat((df.iloc[:first_half], df.iloc[-second_half:]))
+        else:
+            df = df.iloc[:first_half]
+
+    if len(df.columns) > max_columns > 0:
+        second_half = max_columns // 2
+        first_half = max_columns - second_half
+        if second_half:
+            df = pd.concat((df.iloc[:, :first_half], df.iloc[:, -second_half:]), axis=1)
+        else:
+            df = df.iloc[:, :first_half]
+
+    if df.values.nbytes > max_bytes > 0:
+        max_rows = len(df.index)
+        max_columns = len(df.columns)
+
+        # we want to decrease max_rows * max_columns by df.values.nbytes / max_bytes
+        max_product = max_rows * max_columns / (float(df.values.nbytes) / max_bytes)
+
+        while max_product >= 1:
+            max_rows = max(max_rows // 2, 1)
+            if max_rows * max_columns <= max_product:
+                return _downsample(df, max_rows, max_columns, max_bytes)
+
+            max_columns = max(max_columns // 2, 1)
+            if max_rows * max_columns <= max_product:
+                return _downsample(df, max_rows, max_columns, max_bytes)
+
+        # max_product < 1.0:
+        df = df.iloc[:1, :1]
+        df.iloc[0, 0] = '...'
+        return df
+
+    return df
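
The row and column truncation in `_downsample` keeps the start and the end of the table rather than only the head. A minimal sketch of that split on a plain list may make the `if second_half:` guard easier to read: a slice such as `items[-0:]` returns the whole sequence, so a limit of 1 needs its own branch. `head_tail` is a hypothetical name used only for this illustration.

```python
def head_tail(items, limit):
    """Keep at most `limit` items: the first ceil(limit / 2) and the last floor(limit / 2)."""
    # A limit of 0 means "no limit"; short inputs are returned unchanged
    if limit <= 0 or len(items) <= limit:
        return list(items)

    second_half = limit // 2
    first_half = limit - second_half
    if second_half:
        return list(items[:first_half]) + list(items[-second_half:])

    # limit == 1: items[-0:] would return everything, hence the separate branch
    return list(items[:first_half])


rows = list(range(10))
print(head_tail(rows, 5))  # [0, 1, 2, 8, 9]
print(head_tail(rows, 1))  # [0]
```

When the limit is odd, the extra element goes to the head, which matches `first_half = max_rows - second_half` in the diff.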

itables/javascript.py

Lines changed: 46 additions & 92 deletions
@@ -1,59 +1,86 @@
 """HTML/js representation of Pandas dataframes"""

+import os
+import io
 import re
 import uuid
 import json
-import warnings
+import logging
 import numpy as np
 import pandas as pd
 import pandas.io.formats.format as fmt
 from IPython.core.display import display, Javascript, HTML
 import itables.options as opt
+from .downsample import downsample
+
+logging.basicConfig()
+logger = logging.getLogger(__name__)

 try:
     unicode # Python 2
 except NameError:
     unicode = str # Python 3


+def read_package_file(*path):
+    current_path = os.path.dirname(__file__)
+    with io.open(os.path.join(current_path, *path), encoding='utf-8') as fp:
+        return fp.read()
+
+
 def load_datatables():
     """Load the datatables.net library, and the corresponding css"""
-    display(Javascript("""require.config({
-    paths: {
-        datatables: 'https://cdn.datatables.net/1.10.19/js/jquery.dataTables.min',
-    }
-});
+    load_datatables_js = read_package_file('javascript', 'load_datatables_connected.js')
+    eval_functions_js = read_package_file('javascript', 'eval_functions.js')
+    load_datatables_js += "\n$('head').append(`<script>\n" + eval_functions_js + "\n</` + 'script>');"
+
+    display(Javascript(load_datatables_js))
+
+
+def _formatted_values(df):
+    """Return the table content as a list of lists for DataTables"""
+    formatted_df = df.copy()
+    for col in formatted_df:
+        x = formatted_df[col]
+        if x.dtype.kind in ['b', 'i', 's']:
+            continue
+
+        if x.dtype.kind == 'O':
+            formatted_df[col] = formatted_df[col].astype(unicode)
+            continue

-$('head').append('<link rel="stylesheet" type="text/css" \
-    href = "https://cdn.datatables.net/1.10.19/css/jquery.dataTables.min.css" > ');
+        formatted_df[col] = np.array(fmt.format_array(x.values, None))
+        if x.dtype.kind == 'f':
+            try:
+                formatted_df[col] = formatted_df[col].astype(np.float)
+            except ValueError:
+                pass

-$('head').append('<style> table td { text-overflow: ellipsis; overflow: hidden; } </style>');
-    """))
+    return formatted_df.values.tolist()


 def _datatables_repr_(df=None, tableId=None, **kwargs):
     """Return the HTML/javascript representation of the table"""

     # Default options
     for option in dir(opt):
-        if not option in kwargs and not option.startswith("__"):
+        if option not in kwargs and not option.startswith("__"):
             kwargs[option] = getattr(opt, option)

     # These options are used here, not in DataTable
     classes = kwargs.pop('classes')
     showIndex = kwargs.pop('showIndex')
-    maxBytes = kwargs.pop('maxBytes')
+    maxBytes = kwargs.pop('maxBytes', 0)
+    maxRows = kwargs.pop('maxRows', 0)
+    maxColumns = kwargs.pop('maxColumns', pd.get_option('display.max_columns'))

     if isinstance(df, (np.ndarray, np.generic)):
         df = pd.DataFrame(df)

     if isinstance(df, pd.Series):
         df = df.to_frame()

-    if df.values.nbytes > maxBytes > 0:
-        raise ValueError('The dataframe has size {}, larger than the limit {}\n'.format(df.values.nbytes, maxBytes) +
-                         'Please print a smaller dataframe, or enlarge or remove the limit:\n'
-                         'import itables.options as opt; opt.maxBytes=0')
+    df = downsample(df, max_rows=maxRows, max_columns=maxColumns, max_bytes=maxBytes)

     # Do not show the page menu when the table has fewer rows than min length menu
     if 'paging' not in kwargs and len(df.index) <= kwargs.get('lengthMenu', [10])[0]:
@@ -77,87 +104,14 @@ def _datatables_repr_(df=None, tableId=None, **kwargs):
         thead = thead.replace('<th></th>', '', 1)
     html_table = '<table id="' + tableId + '" class="' + classes + '"><thead>' + thead + '</thead></table>'

-    # Table content as 'data' for DataTable
-    formatted_df = df.reset_index() if showIndex else df.copy()
-    for col in formatted_df:
-        x = formatted_df[col]
-        if x.dtype.kind in ['b', 'i', 's']:
-            continue
-
-        if x.dtype.kind == 'O':
-            formatted_df[col] = formatted_df[col].astype(unicode)
-            continue
-
-        formatted_df[col] = np.array(fmt.format_array(x.values, None))
-        if x.dtype.kind == 'f':
-            try:
-                formatted_df[col] = formatted_df[col].astype(np.float)
-            except ValueError:
-                pass
-
-    kwargs['data'] = formatted_df.values.tolist()
+    kwargs['data'] = _formatted_values(df.reset_index() if showIndex else df)

     try:
         dt_args = json.dumps(kwargs)
         return """<div>""" + html_table + """
 <script type="text/javascript">
     require(["datatables"], function (datatables) {
-        $(document).ready(function () {
-            function eval_functions(map_or_text) {
-                if (typeof map_or_text === "string") {
-                    if (map_or_text.startsWith("function")) {
-                        try {
-                            // Note: parenthesis are required around the whole expression for eval to return a value!
-                            // See https://stackoverflow.com/a/7399078/911298.
-                            //
-                            // eval("local_fun = " + map_or_text) would fail because local_fun is not declared
-                            // (using var, let or const would work, but it would only be declared in the local scope
-                            // and therefore the value could not be retrieved).
-                            const func = eval(`(${map_or_text})`);
-                            if (typeof func !== "function") {
-                                // Note: backquotes are super convenient!
-                                // https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
-                                console.error(
-                                    `Evaluated expression "${map_or_text}" is not a function (type is ${typeof func})`
-                                );
-                                return map_or_text;
-                            }
-                            // Return the function
-                            return func;
-                        } catch (e) {
-                            // Make sure to print the error with a second argument to console.error().
-                            console.error(`itables was not able to parse "${map_or_text}"`, e);
-                        }
-                    }
-                } else if (typeof map_or_text === "object") {
-                    if (map_or_text instanceof Array) {
-                        // Note: "var" is now superseded by "let" and "const".
-                        // https://medium.com/javascript-scene/javascript-es6-var-let-or-const-ba58b8dcde75
-                        const result = [];
-                        // Note: "for of" is the best way to iterate through an iterable.
-                        // https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for...of
-                        for (const item of map_or_text) {
-                            result.push(eval_functions(item));
-                        }
-                        return result;
-
-                        // Alternatively, more functional approach in one line:
-                        // return map_or_text.map(eval_functions);
-                    } else {
-                        const result = {};
-                        // Object.keys() is safer than "for in" because otherwise you might have keys
-                        // that aren't defined in the object itself.
-                        //
-                        // See https://stackoverflow.com/a/684692/911298.
-                        for (const item of Object.keys(map_or_text)) {
-                            result[item] = eval_functions(map_or_text[item]);
-                        }
-                        return result;
-                    }
-                }
-
-                return map_or_text;
-            }
+        $(document).ready(function () {
             var dt_args = """ + dt_args + """;
             dt_args = eval_functions(dt_args);
             table = $('#""" + tableId + """').DataTable(dt_args);
@@ -167,7 +121,7 @@ def _datatables_repr_(df=None, tableId=None, **kwargs):
 </div>
 """
     except TypeError as error:
-        warnings.warn(str(error))
+        logger.error(str(error))
         return ''
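
The option handling in `_datatables_repr_` first copies every public attribute of `itables.options` into `kwargs`, then pops the options that are consumed on the Python side, now with fallbacks for `maxBytes` and `maxRows`. The sketch below reproduces that pattern with a stand-in namespace. `fake_opt`, its values for `classes` and `showIndex`, and the helper name `merge_defaults` are assumptions made for illustration; the real contents of `itables.options` are not shown in this diff.

```python
from types import SimpleNamespace

# Hypothetical stand-in for itables.options (values assumed, except the 64KB
# maxBytes default mentioned in the README)
fake_opt = SimpleNamespace(classes="display", showIndex=True, maxBytes=2 ** 16)


def merge_defaults(options, **kwargs):
    """Fill kwargs with every public attribute of `options` the caller did not pass."""
    for option in dir(options):
        if option not in kwargs and not option.startswith("__"):
            kwargs[option] = getattr(options, option)
    return kwargs


# The caller overrides maxBytes and passes one DataTables option through
kwargs = merge_defaults(fake_opt, maxBytes=0, paging=False)

# Options used on the Python side are popped so they are not sent to DataTables
max_bytes = kwargs.pop("maxBytes", 0)
max_rows = kwargs.pop("maxRows", 0)  # absent from fake_opt, so the fallback applies
classes = kwargs.pop("classes")
show_index = kwargs.pop("showIndex")

print(max_bytes, max_rows, kwargs)  # 0 0 {'paging': False}
```

Giving `pop` a fallback, as the commit does for `maxBytes`, `maxRows` and `maxColumns`, avoids a `KeyError` when the options module does not define the attribute.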