What: I would like to host the SQLite metadata file on S3, alongside the data.
I would like this to work:
duckdb.sql("""
    ATTACH 'ducklake:sqlite:s3://some-bucket/ducklake/metadata.sqlite' AS s3_ducklake
        (DATA_PATH 's3://some-bucket/ducklake/data');
    USE s3_ducklake;
""")
Currently, DuckDB interprets everything after "...sqlite:" as a local file path.
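For comparison, here is a minimal sketch of the variant that should be accepted today if the behavior described above holds: the SQLite metadata file sits on local disk and only DATA_PATH points at S3. The helper name build_attach_sql is hypothetical and exists only for this illustration. The sketch only builds the SQL string; actually running it through duckdb.sql() would additionally need the ducklake and sqlite extensions plus S3 credentials, which I am assuming here rather than showing.

```python
def build_attach_sql(metadata_path: str, data_path: str, alias: str) -> str:
    """Build a DuckLake ATTACH statement that uses a SQLite metadata catalog.

    Hypothetical helper for illustration only; it is not part of DuckDB.
    """
    def quote(value: str) -> str:
        # Escape single quotes so the value is a valid SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return (
        f"ATTACH {quote('ducklake:sqlite:' + metadata_path)} AS {alias} "
        f"(DATA_PATH {quote(data_path)}); USE {alias};"
    )


# Local metadata file (treated as a local path), data files on S3.
local_sql = build_attach_sql(
    "metadata.sqlite",
    "s3://some-bucket/ducklake/data",
    "s3_ducklake",
)
print(local_sql)
```

The request in this post is for the first argument to also accept an s3:// URL, so that no local copy of the metadata file is needed.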
I am using the Python API and am new to DuckDB. DuckLake is my first interaction with DuckDB (despite speaking to the founder at PyCon last year :D).
Why: I work in a simulation team (mostly non-software engineers), and we use Plotly Dash dashboards to visualize our simulation results. Versioning the datasets for these dashboards is a pain, and it occurred to me to use DuckDB for this. To read the data, the dashboards need access to both the data and the metadata. The simplest way for us to provide that is to store the SQLite database in the S3 bucket, as that is the only storage available in our architecture at the moment. DuckLake would be a game changer here: it would let us manage these datasets in a very simple architecture, without a dedicated data team.
Questions:
Is this already possible, and am I just holding DuckLake/DuckDB wrong?
If not, would this be a desirable architecture?
If yes, can this be implemented (maybe with a sponsorship)?