feat: add support for blind appends #3890
ACTION NEEDED delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
👋 Thanks for starting the PR in draft to get the discussion going! Blind appends are ... tricky 😄
For your specific situation I am curious if setting the [...]
There's not really anything in the protocol that states what the rules should [...]
When I think of a blind append, I think about something that doesn't [...]
@rtyler Yes, my tables are already marked as append only, but I still run into the conflict bug. The offending code is in the conflict checker: if there is no read predicate specified, then all the files that were read in the snapshot are returned. However, in the case of a blind append, while the table state was read, it has no actual impact on what was written. Additionally, even if a table is not AppendOnly by property, I'd still say that marking commits as blind appends when they are is a decent practice.
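To make the behavior described in this comment concrete, here is a minimal sketch of a read set that falls back to the whole snapshot when no read predicate is given. `ReadInfo`, `files_read`, and `conflicts_with_removes` are hypothetical names for illustration only and are not the delta-rs types or functions.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of what a transaction read (not the delta-rs types).
struct ReadInfo<'a> {
    /// Files visible in the snapshot the writer started from.
    snapshot_files: &'a [&'a str],
    /// Files matching the read predicate, if one was supplied.
    predicate_files: Option<&'a [&'a str]>,
}

/// With no read predicate, every file in the snapshot counts as "read".
fn files_read<'a>(info: &ReadInfo<'a>) -> HashSet<&'a str> {
    match info.predicate_files {
        Some(files) => files.iter().copied().collect(),
        None => info.snapshot_files.iter().copied().collect(),
    }
}

/// A concurrent commit that removed any file we "read" is reported as a conflict.
fn conflicts_with_removes(info: &ReadInfo<'_>, concurrently_removed: &[&str]) -> bool {
    let read = files_read(info);
    concurrently_removed.iter().any(|f| read.contains(f))
}
```

Under this sketch, an append that supplies no predicate conflicts with any concurrent compaction that removes a file from the snapshot, even though the append never used that file's contents.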
For what it's worth, this PR has entirely solved my problem: I can now run concurrent appends and compactions as a background tokio task without any concurrency issues.
I opened an issue in the delta-spark repository
Signed-off-by: Abhi Agarwal <[email protected]>
96c2408 to 20ee5c6
Description
Consider the following case:
This will fail with ConcurrentDeleteRead, since by default the conflict checker assumes that the write depended on all the files in the table at a given state. However, if it is a true blind append, the writer never depended on the state of the table.
Instead, this PR marks a write commit as a blind append if it is in mode Append and there are no remove actions.
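As a rough sketch of the heuristic stated above (Append mode and no remove actions), assuming simplified stand-in types: `SaveMode`, `Action`, and `is_blind_append` below are hypothetical and do not mirror the actual delta-rs definitions or this PR's diff.

```rust
/// Hypothetical save modes for illustration (not delta-rs's own enum).
#[allow(dead_code)]
#[derive(PartialEq)]
enum SaveMode {
    Append,
    Overwrite,
}

/// Hypothetical commit actions, reduced to the two that matter here.
#[allow(dead_code)]
enum Action {
    Add(String),
    Remove(String),
}

/// The heuristic from the description: Append mode and no remove actions.
fn is_blind_append(mode: &SaveMode, actions: &[Action]) -> bool {
    *mode == SaveMode::Append
        && !actions.iter().any(|a| matches!(a, Action::Remove(_)))
}
```

A commit that only adds files in Append mode would be flagged, while an overwrite or any commit carrying a remove action would not.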
Marked as draft because I'd like to write tests and confirm this is the correct behavior.
Related Issue(s)
Likely closes #2700
Documentation
Reference here: https://books.japila.pl/delta-lake-internals/OptimisticTransactionImpl/?h=blind+app#isBlindAppend
Planning on diving into the delta-spark code when I have the time to ensure this is the correct heuristic.