-
Adding existing Parquet files is not supported yet, but I'm currently working on it. The main issue in the current design is that we need field IDs to be present in the files, which prevents you from adding arbitrary Parquet files. I'm extending our support for column mappings so that arbitrary Parquet files can be added without needing to rewrite them.
Files and their metadata are removed automatically once they are no longer required, either because all snapshots that refer to them have expired or because their contents have been moved into other files as part of a merge operation.
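To make the removal rule above concrete, here is a rough toy model in Python. It is not DuckLake's implementation or API: the function `removable_files`, its parameters, and the file names are all hypothetical, and it only assumes the rule as stated (a file can go once every snapshot referring to it has expired, or once its contents have been merged into other files).

```python
from typing import Dict, Iterable, Set


def removable_files(
    snapshot_files: Dict[int, Set[str]],
    expired: Iterable[int],
    merged: Iterable[str] = (),
) -> Set[str]:
    """Toy model (hypothetical, not DuckLake code) of the removal rule.

    snapshot_files maps a snapshot id to the data files it refers to.
    expired lists the snapshot ids that have been expired.
    merged lists files whose contents were moved into other files.
    """
    expired_ids = set(expired)

    # Every file known to any snapshot, expired or not.
    all_files: Set[str] = set()
    for files in snapshot_files.values():
        all_files |= files

    # Files still referenced by at least one non-expired snapshot.
    live: Set[str] = set()
    for snap_id, files in snapshot_files.items():
        if snap_id not in expired_ids:
            live |= files

    # Removable: no live snapshot refers to it, or it was merged away.
    return (all_files - live) | (set(merged) & all_files)


# Hypothetical example data.
snapshots = {
    1: {"a.parquet"},
    2: {"a.parquet", "b.parquet"},
    3: {"b.parquet", "c.parquet"},
}
print(sorted(removable_files(snapshots, expired=[1, 2])))
print(sorted(removable_files(snapshots, expired=[1, 2], merged=["b.parquet"])))
```

In this sketch, expiring snapshots 1 and 2 frees only `a.parquet`, since snapshot 3 still refers to `b.parquet`; marking `b.parquet` as merged makes it removable as well.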
-
Hi, congrats on releasing DuckLake!
We have a bunch of immutable Parquet files that we would like to manage through DuckLake. These files get created and deleted, but never modified. When inserting into DuckLake, new Parquet files get created. This can be expensive in some scenarios.
Is there a way to import existing files so that the metadata from the Parquet files is recorded in the process? Similarly, is there a way to remove a given file and its related metadata? Is this a valid use case for DuckLake?
I could not find anything in the available documentation. Any pointers are appreciated, thanks!