feat: Add python FFI interface and expose sqlite and duckdb table providers #264

Merged: 21 commits into datafusion-contrib:main on Mar 24, 2025

@crystalxyz (Contributor) commented Mar 16, 2025

Expose table providers in Python via FFI

Summary

  1. Created a python subcrate for Python-related development
  2. Reorganized Cargo dependencies and moved the shared ones (arrow, datafusion, duckdb) into the workspace Cargo.toml
  3. Implemented sqlite and duckdb table provider interfaces and tested them in examples
  4. Updated duckdbconn.rs to start a new tokio runtime when triggered from PyO3
  5. Updated the GitHub CI/CD test command
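Item 4 follows a common pattern: code that normally runs inside an async runtime has to start one itself when it is entered from a plain synchronous caller such as Python. The actual change lives in Rust (tokio) and is not reproduced here. The sketch below is only a Python asyncio analogy of that pattern, and `run_blocking` and `fetch_answer` are hypothetical names made up for illustration.

```python
import asyncio
import concurrent.futures


async def fetch_answer():
    # Stand-in for an async database call (hypothetical).
    await asyncio.sleep(0)
    return 42


def run_blocking(coro_factory):
    """Run an async function to completion from synchronous code.

    If no event loop is running in this thread, start a fresh one.
    If a loop is already running, run the coroutine on a new loop in a
    worker thread instead of trying to nest loops.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No ambient loop: the caller is plain synchronous code.
        return asyncio.run(coro_factory())

    # A loop is already running here; blocking on it would deadlock,
    # so give the coroutine its own loop on a separate thread.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro_factory()).result()
```

Calling `run_blocking(fetch_answer)` works both from a plain script and from inside a coroutine, which is the property the Rust change needs when the same connection code is reached from either a tokio task or a PyO3 call.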

Testing

  • Verified that all existing Rust tests pass
  • Verified that the Python examples run correctly with the following commands:
cd python
maturin develop
cd examples
python3 duckdb_demo.py

Follow-up work

  • Improve duckdb enum interface (ongoing)

The rest is left for future work: #279

  • Python ODBC table provider + example
  • Python Postgres table provider + example
  • Python MySQL table provider + example
  • Integration testing
  • Finalize the Python interface design, e.g. new_memory() vs. new(":memory:")
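To make the last open question concrete, here is a minimal sketch of the two candidate shapes side by side. `SqliteTableFactory` is a hypothetical stand-in built on the standard-library `sqlite3` module, not the class added in this PR; it only illustrates a dedicated `new_memory()` constructor versus passing the `":memory:"` path to the regular constructor.

```python
import sqlite3


class SqliteTableFactory:
    """Hypothetical stand-in used only to compare the two interface shapes."""

    def __init__(self, path: str):
        # Option A: one constructor; ":memory:" is a special path value.
        self.path = path
        self._conn = sqlite3.connect(path)

    @classmethod
    def new_memory(cls) -> "SqliteTableFactory":
        # Option B: a dedicated named constructor for in-memory databases.
        return cls(":memory:")

    def tables(self) -> list[str]:
        # List user tables in the connected database, sorted by name.
        rows = self._conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]


# Both spellings produce an equivalent in-memory factory.
by_path = SqliteTableFactory(":memory:")
by_name = SqliteTableFactory.new_memory()
```

The named constructor is easier to discover and avoids a magic string, while the path form mirrors what SQLite and DuckDB users already type. Nothing stops an interface from offering both, as the sketch does.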

@timsaucer (Contributor) left a comment

Overall, this looks very nice. I suggest we merge this in but do not yet publish to PyPI. There should be no breaking changes to the Rust code, so anyone who depends on it should be okay. I think we will need to bump the minor version number if you haven't done so (I didn't check).

@phillipleblanc Would you be comfortable if we merged in partially complete work on the Python side so we can have a series of smaller PRs until we're ready to publish?

@phillipleblanc (Collaborator) commented

Yes, that works for me - just mark this PR as ready to review and tag me when you are ready to merge. Thanks for working on this!

@crystalxyz crystalxyz marked this pull request as ready for review March 23, 2025 02:26
@crystalxyz (Contributor, Author) commented

@timsaucer @phillipleblanc This PR is ready for review!

@timsaucer (Contributor) left a comment

I have just a couple of small suggestions. This feels like a huge leap forward in making these table providers accessible. I do recommend updating the title of this PR.

Also, DataFusion 47, whenever it is released, will add FFI catalog providers. Some of these interfaces might work well to support that too, but that is future work. For now, this is excellent. Well done.

@crystalxyz changed the title from "[Draft] Try to add python table provider interface" to "Try to add python table provider interface" on Mar 23, 2025
@crystalxyz changed the title from "Try to add python table provider interface" to "feat: Add python table provider interface for sqlite and duckdb" on Mar 23, 2025
@crystalxyz changed the title from "feat: Add python table provider interface for sqlite and duckdb" to "feat: Add python FFI interface and expose sqlite and duckdb table providers" on Mar 23, 2025
@phillipleblanc phillipleblanc merged commit 5390f37 into datafusion-contrib:main Mar 24, 2025
3 checks passed
@phillipleblanc (Collaborator) commented

Thank you @CrystalZhou0529 and @timsaucer!
