Epic: Python bindings with native async support #555

@ethe

Summary

Provide first-class Python bindings for Tonbo, leveraging fusio's executor abstraction to deliver native asyncio integration. Python users get the full Tonbo experience (Arrow-native, S3-ready, MVCC time travel) with idiomatic async/await syntax and zero-copy PyArrow interop.

Motivation

Why Python?

  • Data community: Python dominates data science, ML, and analytics
  • Agent frameworks: LangChain, AutoGPT, and CrewAI are Python-first
  • Adoption multiplier: Python bindings could expand the potential user base by roughly 10x
  • Manifesto alignment: "Agent execution substrate" needs Python support

Why Native Async?

Most Rust→Python bindings use blocking wrappers:

# Typical blocking approach (bad)
result = db.scan()  # blocks Python event loop

With fusio's executor abstraction, we can implement a Python asyncio executor:

# Native async (good)
result = await db.scan()  # yields to Python event loop

This enables:

  • Non-blocking I/O in async Python applications
  • Integration with aiohttp, FastAPI, asyncio frameworks
  • Proper backpressure and cancellation
  • No thread pool overhead for I/O-bound operations
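
To make the difference concrete, here is a minimal sketch that measures how long a concurrent task is starved while a scan is in flight. Nothing here touches Tonbo: `blocking_scan` and `native_scan` are hypothetical stand-ins built on `time.sleep` and `asyncio.sleep`, and the 0.1 s duration is arbitrary.

```python
import asyncio
import time


async def heartbeat(ticks: list) -> None:
    """Record a timestamp on each tick; stalls if the loop is blocked."""
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


def blocking_scan() -> str:
    # Stand-in for a blocking binding call: holds the event loop thread.
    time.sleep(0.1)
    return "rows"


async def native_scan() -> str:
    # Stand-in for a natively async call: suspends and lets other tasks run.
    await asyncio.sleep(0.1)
    return "rows"


async def max_gap(use_blocking: bool) -> float:
    """Return the longest pause between heartbeat ticks during one scan."""
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    if use_blocking:
        blocking_scan()
    else:
        await native_scan()
    await hb
    return max(b - a for a, b in zip(ticks, ticks[1:]))


blocking_gap = asyncio.run(max_gap(use_blocking=True))
async_gap = asyncio.run(max_gap(use_blocking=False))
print(f"blocking: {blocking_gap:.3f}s, native async: {async_gap:.3f}s")
```

With the blocking stand-in, the heartbeat cannot tick until the scan returns, so the largest gap is about the full scan duration. With the awaited stand-in, the heartbeat keeps ticking at its own pace.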

Goals

  1. PyArrow-native: Ingest/query via pyarrow.RecordBatch with zero-copy where possible
  2. Native asyncio: Real async/await, not thread-pool-wrapped blocking
  3. Full API coverage: Transactions, snapshots, time travel, scans
  4. S3/Object storage: Same backends as Rust API
  5. pip installable: pip install tonbo with pre-built wheels
  6. Type hints: Full typing for IDE support
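
As a rough illustration of goals 2 and 6, the sketch below mocks a typed, async-only surface in pure Python, following the call shape used in the Design diagram (`open`, `ingest`, `scan().filter(...).collect()`). Everything in it is an assumption made for illustration: `FakeDB`, `FakeScan`, `open_db`, the `memory://demo` URI, and the use of plain dicts in place of `pyarrow.RecordBatch` are hypothetical and say nothing about the final binding API.

```python
import asyncio
from typing import Any, Callable

Row = dict[str, Any]  # stand-in for data that would arrive as a RecordBatch


class FakeScan:
    """Hypothetical scan builder: filter() is sync, collect() is awaited."""

    def __init__(self, rows: list[Row]) -> None:
        self._rows = rows
        self._predicates: list[Callable[[Row], bool]] = []

    def filter(self, predicate: Callable[[Row], bool]) -> "FakeScan":
        self._predicates.append(predicate)
        return self

    async def collect(self) -> list[Row]:
        await asyncio.sleep(0)  # placeholder for real async I/O
        return [r for r in self._rows if all(p(r) for p in self._predicates)]


class FakeDB:
    """Hypothetical in-memory database usable with `async with`."""

    def __init__(self) -> None:
        self._rows: list[Row] = []

    async def __aenter__(self) -> "FakeDB":
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        self._rows.clear()  # placeholder for closing the database

    async def ingest(self, batch: list[Row]) -> None:
        await asyncio.sleep(0)  # placeholder for real async I/O
        self._rows.extend(batch)

    def scan(self) -> FakeScan:
        return FakeScan(list(self._rows))


def open_db(uri: str) -> FakeDB:
    # The URI is ignored here; a real binding would pick a storage backend.
    return FakeDB()


async def main() -> list[Row]:
    async with open_db("memory://demo") as db:
        await db.ingest([{"id": 1, "v": 10}, {"id": 2, "v": 25}])
        return await db.scan().filter(lambda r: r["v"] > 15).collect()


rows = asyncio.run(main())
print(rows)  # → [{'id': 2, 'v': 25}]
```

Because every method is annotated, an IDE or type checker can flag a missing `await` on `collect()` or a wrongly typed predicate before the code runs.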

Non-Goals

  • Pandas-first API (users can convert PyArrow ↔ Pandas themselves)
  • SQLAlchemy/ORM integration (future consideration)
  • Synchronous API (async-only for MVP; sync wrapper can come later)
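
For the deferred sync wrapper, one possible shape is a thin facade that runs each coroutine to completion on a private event loop. This is only a sketch under assumptions: `AsyncDB` is a hypothetical stand-in for the async-only binding and `SyncDB` is an illustrative name, neither taken from Tonbo.

```python
import asyncio


class AsyncDB:
    """Hypothetical stand-in for the async-only binding, not the real API."""

    def __init__(self) -> None:
        self._rows = []

    async def ingest(self, rows) -> None:
        await asyncio.sleep(0)  # placeholder for real async I/O
        self._rows.extend(rows)

    async def scan(self):
        await asyncio.sleep(0)  # placeholder for real async I/O
        return list(self._rows)


class SyncDB:
    """Blocking facade that drives each coroutine on a private event loop."""

    def __init__(self, inner: AsyncDB) -> None:
        self._inner = inner
        self._loop = asyncio.new_event_loop()

    def ingest(self, rows) -> None:
        self._loop.run_until_complete(self._inner.ingest(rows))

    def scan(self):
        return self._loop.run_until_complete(self._inner.scan())

    def close(self) -> None:
        self._loop.close()


db = SyncDB(AsyncDB())
db.ingest([1, 2, 3])
rows = db.scan()
db.close()
print(rows)  # → [1, 2, 3]
```

A facade like this only works from code that is not already running inside an event loop, since `run_until_complete` raises `RuntimeError` when another loop is running in the same thread. That limitation is one reason to keep the MVP async-only.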

Design

┌─────────────────────────────────────────────────────────────┐
│                     Python User Code                        │
│   async with tonbo.open("s3://bucket/db") as db:            │
│       await db.ingest(batch)                                │
│       results = await db.scan().filter(...).collect()       │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    tonbo-python (PyO3)                      │
│  ┌────────────┐  ┌───────────────┐  ┌─────────────────────┐ │
│  │ PyDB       │  │ PyTransaction │  │ PyScanBuilder       │ │
│  │ PySnapshot │  │ PyCheckpoint  │  │ PyRecordBatchStream │ │
│  └────────────┘  └───────────────┘  └─────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                   AsyncioExecutor (fusio)                   │
│  ┌─────────────────────────────────────────────────────┐    │
│  │ impl Executor for AsyncioExecutor                   │    │
│  │   - spawn() → Python asyncio.create_task()          │    │
│  │   - Futures bridge Rust ↔ Python                    │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Tonbo Core (Rust)                      │
│            DB<FS, AsyncioExecutor> - unchanged              │
└─────────────────────────────────────────────────────────────┘
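
The executor layer in the diagram can be modelled in pure Python to show the control flow: `spawn()` wraps a pollable future in a coroutine, hands it to `asyncio.create_task()`, and re-polls whenever the future's waker fires. This is a sketch under assumptions, not the fusio or PyO3 implementation. `Countdown` is a hypothetical stand-in for a Rust future, and the `poll(wake)` signature is invented for illustration.

```python
import asyncio


class Countdown:
    """Stand-in for a Rust future: pending for `n` polls, then ready."""

    def __init__(self, n: int) -> None:
        self.remaining = n
        self.polls = 0

    def poll(self, wake):
        self.polls += 1
        if self.remaining == 0:
            return True, "done"
        self.remaining -= 1
        wake()  # a real future would call this when its I/O completes
        return False, None


class AsyncioExecutor:
    """Hypothetical Python-side model of the executor in the diagram."""

    def spawn(self, fut) -> asyncio.Task:
        # spawn() maps onto asyncio.create_task(), as in the diagram.
        return asyncio.create_task(self._drive(fut))

    async def _drive(self, fut):
        loop = asyncio.get_running_loop()
        while True:
            woken = asyncio.Event()
            # The waker may fire from another thread, so hop back onto the loop.
            ready, value = fut.poll(lambda: loop.call_soon_threadsafe(woken.set))
            if ready:
                return value
            await woken.wait()  # yield to the loop until the future is woken


async def main():
    fut = Countdown(3)
    result = await AsyncioExecutor().spawn(fut)
    return result, fut.polls


result, polls = asyncio.run(main())
print(result, polls)  # → done 4
```

Because the driver is an ordinary asyncio task, cancelling it from Python stops further polling, which is how cancellation could propagate across the boundary in a design like this.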
