S3 multipart upload of Parquet objects?! #326
javafanboy
started this conversation in
General
Replies: 1 comment
-
DuckLake uses DuckDB's infrastructure for communicating with external file systems like S3, so all operations that are supported by DuckDB are also supported by DuckLake.
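If DuckDB's httpfs uploader settings carry over to DuckLake writes (an assumption here; the reply only states that DuckDB's file-system layer is reused), part behavior can already be influenced indirectly, since DuckDB derives the multipart part size from the configured maximum file size and part count. A sketch:

```sql
-- DuckDB httpfs S3 uploader settings (names from DuckDB's S3 API docs);
-- whether DuckLake-initiated writes honor them is an assumption.
SET s3_uploader_max_filesize = '800GB';      -- largest object the uploader will write
SET s3_uploader_max_parts_per_file = 10000;  -- S3's hard limit on parts per upload
SET s3_uploader_thread_limit = 50;           -- concurrent part uploads

-- The effective part size is roughly max_filesize / max_parts_per_file,
-- so lowering max_filesize (or the part count) changes part granularity.
```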
-
When creating large Parquet objects, it would be very nice if the DuckLake extension used multipart upload to S3 and allowed configuring the part size, as well as a lower size threshold below which multipart upload is not used. I believe DuckDB supports multipart upload to S3, but I did not find any information on whether DuckLake leverages this...
Controlling the part size is important: too many (i.e., too small) parts will not yield any speed improvement (it can even be slower), but will increase cost due to a larger-than-needed number of S3 PUT requests...
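To illustrate the cost concern: S3 bills each uploaded part as a separate PUT-class request, so for a given object size the part size directly sets the request count. A rough calculation for a hypothetical 10 GiB Parquet file:

```sql
-- Number of UploadPart requests for a 10 GiB object (10 * 1073741824 bytes)
-- at two illustrative part sizes; sizes are assumptions for the example.
SELECT
    ceil(10.0 * 1073741824 / (8   * 1048576)) AS puts_with_8_mib_parts,   -- 1280 requests
    ceil(10.0 * 1073741824 / (128 * 1048576)) AS puts_with_128_mib_parts; -- 80 requests
```

Note that S3 itself bounds the configuration space: parts must be at least 5 MiB (except the last) and an upload may have at most 10,000 parts.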