Description
I work with lots of tabular datasets in the 50-500MB range. For exploratory data analysis the Perspective Viewer is really powerful and unique. Unfortunately, sending the full dataset to the client is slow and often fails because of a maximum websocket message size imposed by Panel, JupyterHub or Kubernetes. You can increase these limits, but only to some extent, and doing so can be outside the control or capability of a data scientist.
I'm increasingly seeing this problem, and I'm not the only one (Discourse #6804). It's actually a very common problem in Finance and Trading, where I work. Currently Excel supports larger tables than we do with Perspective. I believe we should enable users to work with larger files in Perspective than Excel can handle.
Perspective was actually built to support large tabular data via virtualization. See regular-table and Perspective. But our implementation only uses the perspective-viewer web component, not the advanced client-server virtualization architecture that Perspective supports.
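To make the idea of client-server virtualization concrete, here is a minimal, library-free sketch. It is not the Perspective API: `ViewportRequest` and `TableServer` are hypothetical names invented for illustration, and a real implementation would also push sorting, filtering and aggregation to the server. The point is only that the full table stays on the server and each message carries just the rows currently visible.

```python
from dataclasses import dataclass


@dataclass
class ViewportRequest:
    """Hypothetical client message: the row window currently on screen."""
    start_row: int
    end_row: int  # exclusive


class TableServer:
    """Hypothetical server-side holder of the full dataset."""

    def __init__(self, rows):
        self._rows = rows  # the full table never leaves the server

    def num_rows(self):
        return len(self._rows)

    def fetch(self, request):
        # Clamp the window to the valid range and return only that slice.
        start = max(0, request.start_row)
        end = min(len(self._rows), request.end_row)
        return self._rows[start:end]


# 100,000 rows live on the server ...
server = TableServer([{"id": i, "value": i * 2} for i in range(100_000)])

# ... but a client showing 20 rows only receives 20 rows per message.
window = server.fetch(ViewportRequest(start_row=500, end_row=520))
print(len(window), window[0])
```

With this pattern the size of each websocket message depends on the viewport, not on the dataset, which is why the message size limits mentioned above would stop being a problem.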
A Panel user actually showcased how to use the client-server virtualization in Discourse #6430. But it is only a proof of concept and complicated to use.
Please note that the client-server virtualization architecture seems similar to Mosaic; Mosaic is just built on DuckDB. There is a request to add Mosaic in FR #7358.
Discussion
The Tabulator widget provides a kind of virtualization via the pagination parameter ("local" or "remote"). We could support a similar parameter with Perspective, making it really easy for users. On the other hand, there is power in exposing more of the underlying Perspective API, like the PerspectiveManager, and in hosting tables once but using them across sessions and users. I also think the Panel Perspective pane would be more useful if it implemented the Jupyter Perspective Widget API and capabilities. See PyCon Italy 2024 and PerspectiveWidget Implementation for inspiration.
Today Panel can run on both Tornado and FastAPI servers. The solution should work in both environments. Personally I want to migrate to FastAPI deployments if that is possible.
It should also just work in Pyodide, because that is where much of the functionality will be showcased.
Cannot use JupyterWidget
Unfortunately, using the Jupyter Widget is not a viable workaround:
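import pandas as pd
import panel as pn
from perspective.widget import PerspectiveWidget

# Load ipywidgets support so Panel can render Jupyter widgets
pn.extension("ipywidgets")

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["A", "B", "C"])

# Wrap the Jupyter PerspectiveWidget in Panel's IPyWidget pane
p = PerspectiveWidget(df)
pn.pane.IPyWidget(p).servable()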