@tokoko
Contributor

@tokoko tokoko commented Oct 23, 2025

Adds minimal setup for a narwhals-compliant dataframe (lazyframe).

@tokoko
Contributor Author

tokoko commented Oct 23, 2025

  • I previously tried implementing subframe, which was supposed to be a lighter-weight ibis clone, but mimicking ibis can be tricky at times and I'm not sure it would be all that beneficial either.
  • Then I did another draft PR here (feat: dataframe api #70), trying to build out our own flavor of dataframe api (Bryce also has a similar project, substrait-dataframe). The problem with both is that devising your own dataframe api can get incredibly creative at times, and the more creative we are, the fewer people will actually use anything we do.

This is yet another attempt to bring a dataframe api to substrait-python, trying to combine the best of both previous approaches. The PR adds a DataFrame class that is compliant with the narwhals LazyFrame API. narwhals is a lightweight dataframe api layer that specifies the api and handles function routing. It's basically odbc (or should I say adbc) for dataframes: just the protocol and nothing else. That makes building a narwhals-compliant dataframe a lot simpler.

Another upside of this approach is that we technically don't have to depend on narwhals at all: the api can be used either natively (import substrait.dataframe as sdf, with sdf as the namespace) or with narwhals (import narwhals as nw).

From an implementation perspective, the dataframe should also be very lightweight glue code that delegates all the heavy lifting to the plan and extended expression builders from substrait.builders.
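
To make the "glue code" idea concrete, here is a rough, self-contained sketch of the pattern. It is not the code in this PR: read_table, project, limit and ToyLazyFrame are hypothetical stand-ins invented for illustration, and the plan is a plain dict rather than a real Substrait plan. The real builders in substrait.builders have their own signatures, which this sketch does not try to reproduce. The method names select and head only loosely mirror lazyframe-style methods.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical stand-ins for plan builders (NOT the substrait.builders API).
# Each builder returns a small dict describing one relational operation.
def read_table(name: str, columns: List[str]) -> Dict[str, Any]:
    return {"rel": "read", "table": name, "columns": list(columns)}

def project(child: Dict[str, Any], columns: List[str]) -> Dict[str, Any]:
    # Validate against the child's output columns, then wrap the child plan.
    missing = [c for c in columns if c not in child["columns"]]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return {"rel": "project", "input": child, "columns": list(columns)}

def limit(child: Dict[str, Any], count: int) -> Dict[str, Any]:
    return {"rel": "fetch", "input": child, "count": count,
            "columns": list(child["columns"])}

@dataclass(frozen=True)
class ToyLazyFrame:
    """Thin wrapper: holds a plan and delegates every operation to a builder."""
    plan: Dict[str, Any]

    @property
    def columns(self) -> List[str]:
        return list(self.plan["columns"])

    def select(self, *columns: str) -> "ToyLazyFrame":
        # No logic of its own; the builder does the work.
        return ToyLazyFrame(project(self.plan, list(columns)))

    def head(self, n: int) -> "ToyLazyFrame":
        return ToyLazyFrame(limit(self.plan, n))

# Usage: each call returns a new frame wrapping a larger plan; nothing executes.
frame = ToyLazyFrame(read_table("orders", ["id", "amount", "status"]))
result = frame.select("id", "amount").head(10)
```

The point of the sketch is only the shape: the frame class carries no planning logic, so every method stays a one-line call into a builder.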

@MarcoGorelli

thanks for your interest!

we're revisiting the extensions/plugins mechanism, and my hope is that in a few months it will be significantly easier to build and test these - if it feels hard and confusing at the moment, there is light at the end of the tunnel!
