Ducklake with multiple remote users #291
-
There is no "central DuckDB instance" by default. Each client node that runs a JVM with the DuckDB JDBC driver has its own DuckDB instance running in the same Java process, and that instance accesses the Postgres catalog and reads Parquet files from object storage. That said, it is possible to set up a "central DuckDB instance" that clients use to access a DuckLake instance. For an out-of-the-box setup, GizmoSQL may be convenient: it uses the Arrow Flight SQL protocol, which keeps overhead low. For a more flexible setup, you can run your own Java service on a single node that encapsulates all access to the DuckLake instance and exposes it to clients over Arrow Flight SQL (implemented with Java libraries) or any other protocol.
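To make the default (no central instance) case concrete, here is a rough sketch of what each client JVM would do. It assumes the DuckDB JDBC driver is on the classpath and that the `ducklake` and `postgres` extensions can be installed. The host, database name, bucket path, and alias are made-up placeholders, and object-storage credentials are left out entirely.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

class DuckLakeClientSketch {

    // Builds the ATTACH statement for a DuckLake whose catalog lives in Postgres.
    // No escaping is done: the arguments are assumed to be trusted configuration values.
    static String attachStatement(String pgConnInfo, String dataPath, String alias) {
        return "ATTACH 'ducklake:postgres:" + pgConnInfo + "' AS " + alias
                + " (DATA_PATH '" + dataPath + "')";
    }

    public static void main(String[] args) {
        // Hypothetical connection details, for illustration only.
        String attach = attachStatement(
                "dbname=ducklake_catalog host=pg.example.internal",
                "s3://example-bucket/lake/",
                "lake");

        // "jdbc:duckdb:" opens an in-memory DuckDB instance inside this JVM,
        // so every client process gets its own engine.
        try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
             Statement st = conn.createStatement()) {
            st.execute("INSTALL ducklake");
            st.execute("INSTALL postgres");
            st.execute(attach);
            st.execute("USE lake");
            // From here on, queries run locally against the shared catalog and Parquet files.
        } catch (SQLException e) {
            System.err.println("Could not attach DuckLake: " + e.getMessage());
        }
    }
}
```

With a central service (GizmoSQL or your own), only that one process would run something like this, and clients would talk to it over the network instead.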
-
Thanks for the explanation. So if I want clients to be able to update or insert data, they would need write access to the path where the Parquet files are stored. This seems risky to me: any client could damage the whole database by deleting Parquet files.
-
Good question, @supergrobi23. Knowing that DuckDB has no central server concept, I was wondering what the point of a data lake is when each consumer needs a powerful machine to run queries.
-
I am not sure I understand the setup for a DuckLake with multiple remote users. My understanding is that I would need a Postgres database as the catalog and some storage for the Parquet files. But how do the remote clients access the data? Let's say I have multiple users who want to connect via JDBC. Does every user run a local instance of DuckDB that connects to Postgres and then reads data from the shared storage with the Parquet files? Or is there a central DuckDB instance?