Rewriting history in ducklake #276
c-herrewijn started this conversation in Ideas
Background
Consider the following scenarios:
True deletion can be obtained by expiring past snapshots (in combination with `ducklake_cleanup_old_files()`). The drawback of that method is that you then lose all time travel options in the affected time window.
Recreating the affected parquet files via a user-created script that replaces some data with anonymized values might be an option, but that is cumbersome, requires a deeper understanding of the ducklake format, and risks corrupting the data lake.
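For reference, the snapshot-expiry route could look roughly like the sketch below. The catalog name `my_ducklake` and the one-week cutoff are assumptions chosen for illustration, and the exact parameters should be checked against the DuckLake documentation for the version in use.

```sql
-- Assumption: the DuckLake catalog is attached under the name 'my_ducklake'.
-- Expire every snapshot older than the cutoff.
-- Time travel to any of the expired snapshots is no longer possible afterwards.
CALL ducklake_expire_snapshots('my_ducklake', older_than => now() - INTERVAL '1 week');

-- Physically delete the data files that are no longer referenced by any remaining snapshot.
CALL ducklake_cleanup_old_files('my_ducklake', cleanup_all => true);
```

This removes the data for good, but only at snapshot granularity: everything in the expired window goes, not just the rows that needed to disappear.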
Suggestion
Would logic to 'rewrite history' be a useful addition to the ducklake specification / implementation?
This would likely mean alternative versions of UPDATE, DELETE, and DROP statements to achieve "delete as if it never existed", or "update as if it always was like this".