Open
Description
Overview
We do not yet have a DuckDB source connector. Normally, DuckDB databases are local files and not very useful as sources, but they can now also be remote (e.g. MotherDuck), and they can act as a pass-through for other data sources (e.g. #30 and the Hugging Face Datasets).
Technical spec
You would write a new source connector which can connect to a (remote) DuckDB dataset or database, and emit records from DuckDB, allowing Airbyte users to send these to any Airbyte destination.
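As a rough illustration of the "connect and emit records" part, here is a minimal sketch using the `duckdb` Python package. The helper names (`open_duckdb`, `read_table`, `rows_to_records`) are hypothetical and not part of the CDK, and the `md:` prefix for MotherDuck databases is an assumption about how the remote case would be addressed. A real connector would wrap each record in an Airbyte record message.

```python
from typing import Any, Dict, Iterator, List, Sequence


def rows_to_records(columns: Sequence[str], rows: Sequence[Sequence[Any]]) -> List[Dict[str, Any]]:
    """Pair each row tuple with the column names to form record dicts."""
    return [dict(zip(columns, row)) for row in rows]


def read_table(con: Any, table: str, batch_size: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield every row of `table` as a dict, fetching in batches."""
    # Quote the identifier so unusual table names do not break the query.
    quoted = '"' + table.replace('"', '""') + '"'
    cur = con.execute(f"SELECT * FROM {quoted}")
    columns = [d[0] for d in cur.description]
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield from rows_to_records(columns, rows)


def open_duckdb(path: str) -> Any:
    """Open a DuckDB database.

    `path` may be a local file, or (assumption) a MotherDuck
    connection string such as "md:my_db".
    """
    import duckdb  # imported lazily so the helpers above load without it

    return duckdb.connect(path)
```

Usage would look something like `for record in read_table(open_duckdb("local.duckdb"), "my_table"): ...`, where the file and table names are placeholders.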
Notes:
- We do have a DuckDB Destination and a PyAirbyte Cache and SQLProcessor.
- It is not obvious how (or if) incremental processing should be handled for DuckDB sources. Whoever picks up this task should plan to propose a path forward for this during development.
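To make the incremental question more concrete, here is one possible approach, sketched only as a starting point for discussion: a user-chosen cursor column, with the last seen value kept as state. This is not a decided design. The function names (`quote_ident`, `read_since`) are hypothetical, and the sketch assumes the cursor column is non-null and only ever increases.

```python
from typing import Any, Dict, List, Optional, Tuple


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def read_since(
    con: Any, table: str, cursor_field: str, last_value: Optional[Any] = None
) -> Tuple[List[Dict[str, Any]], Optional[Any]]:
    """Return rows whose cursor value is greater than `last_value`,
    together with the new cursor value to store as state."""
    sql = f"SELECT * FROM {quote_ident(table)}"
    params: List[Any] = []
    if last_value is not None:
        sql += f" WHERE {quote_ident(cursor_field)} > ?"
        params.append(last_value)
    sql += f" ORDER BY {quote_ident(cursor_field)}"

    cur = con.execute(sql, params) if params else con.execute(sql)
    columns = [d[0] for d in cur.description]
    records = [dict(zip(columns, row)) for row in cur.fetchall()]

    # Assumes the cursor column is never NULL; otherwise the last row
    # after sorting may not carry a usable cursor value.
    new_value = records[-1][cursor_field] if records else last_value
    return records, new_value
```

A cursor column will not exist for every DuckDB table (for example, pass-through datasets), so full refresh would likely remain the fallback.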
Definition of Done
- You would build a new "DuckDB" source in Python (reusing code if helpful).
- If primary keys exist, they should be registered in the catalog.
- You should use the CDK as much as possible.
- The connector should pass integration tests and acceptance tests.
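For the primary-key item above, a minimal sketch of how discovery might work: DuckDB exposes constraints through the `duckdb_constraints()` table function, and an Airbyte stream can carry them in `source_defined_primary_key`. The helper names (`find_primary_keys`, `build_stream`) are hypothetical, the stream is built as a plain dict rather than with CDK models, and the mapping from DuckDB column types to JSON schema types is assumed to happen elsewhere.

```python
from typing import Any, Dict, List


def find_primary_keys(con: Any) -> Dict[str, List[str]]:
    """Map each table name to its primary-key column names.

    Note: keyed by table name only, so tables with the same name in
    different schemas would collide in this simplified version.
    """
    rows = con.execute(
        "SELECT table_name, constraint_column_names "
        "FROM duckdb_constraints() "
        "WHERE constraint_type = 'PRIMARY KEY'"
    ).fetchall()
    return {table: list(columns) for table, columns in rows}


def build_stream(name: str, json_types: Dict[str, str], primary_key: List[str]) -> Dict[str, Any]:
    """Build a dict shaped like an Airbyte stream for the catalog.

    `json_types` maps column name -> JSON schema type (e.g. "integer").
    """
    stream: Dict[str, Any] = {
        "name": name,
        "json_schema": {
            "type": "object",
            "properties": {col: {"type": ["null", t]} for col, t in json_types.items()},
        },
        "supported_sync_modes": ["full_refresh"],
    }
    if primary_key:
        # Each key column is expressed as a single-element field path.
        stream["source_defined_primary_key"] = [[col] for col in primary_key]
    return stream
```

Tables without a primary key simply omit the field, which leaves key selection to the user at configuration time.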
Metadata
Status
Not Started