Fix false transaction conflicts on commit retry#3
Open
fuziontech wants to merge 3 commits into
Open
Conversation
…ion_changes GetTransactionChanges() was computed once before the retry loop, but CommitChanges (via GetNewTableInfo) and WriteSnapshotChanges (via AddTableChanges) mutate it in-place by adding committed (remapped) table IDs. When the first commit attempt fails (e.g., duplicate snapshot_id race), these stale committed IDs persist into the retry. On retry, CheckForConflicts compares the stale IDs against the successfully-committed transaction's changes. Because both transactions derived their IDs from the same next_catalog_id counter, the IDs collide and produce a false conflict (e.g., "insert into table while another transaction altered it") even when the transactions operated on entirely different tables. Fix: move GetTransactionChanges() inside the retry loop so it is recomputed fresh each iteration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous tests used pre-existing tables with fixed IDs. Since existing table IDs don't change across retries, they could never collide with another transaction's committed IDs — the tests would pass even without the fix. The new tests use concurrent CTAS + ALTER operations. Both transactions start from the same snapshot (same next_catalog_id), so they derive identical committed IDs for their new tables. When the first attempt fails, WriteSnapshotChanges has already added the stale committed ID to tables_inserted_into. On retry, this collides with the other transaction's altered_tables entry (same ID), triggering the false conflict that this PR fixes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The COMMENT ON must be on the first committer so its altered_tables entry collides with the retrying transaction's stale tables_inserted_into. Previously it was on con2 (the retrier), meaning con1's altered_tables was empty and no false conflict could trigger. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Root cause:
GetTransactionChanges()is computed once before the retry loop inFlushChanges(), butCommitChanges(viaGetNewTableInfo) andWriteSnapshotChanges(viaAddTableChanges) mutate it in-place by adding committed (remapped) table IDs. When the first commit attempt fails (duplicatesnapshot_idrace), these stale committed IDs persist into the retry.False conflict mechanism: On retry,
CheckForConflictscompares the stale IDs against the other transaction's committed changes. Both transactions derived their committed IDs from the samenext_catalog_idcounter, so the IDs collide — producing a false conflict like "insert into table while another transaction altered it" even when operating on entirely different tables.Example scenario:
next_catalog_id=2577025771→WriteSnapshotChangesmutatestables_inserted_into += {25771}→ Commit fails (Tx B committed first)altered_tables: {25771}(fromCOMMENT ON TABLE)CheckForConflictsseestables_inserted_into: {25771}(stale) vsaltered_tables: {25771}(Tx B) → false conflictFix: Move
GetTransactionChanges()inside the retry loop so it is recomputed fresh each iterationTest plan
transaction_retry_stale_changes.testcovering:🤖 Generated with Claude Code