Fix potential data race: Lock all KV maps during `apply_changes`, even those which are read-only
#6866
Taking another look at the still-failing tests in #6616, I managed to parse one of the consistent error stacks and I think this is the right fix.
Explanation
A pseudocode summary of this `apply_changes` function, which is taking a Tx-owned collection of local maybe-modified views over maps and applying them to the local persistent KV if they're still valid:

Most of the `if`s in this are about checking for writes. If we have a read-only transaction, then it can have no conflicts*, and we can skip the `prepare()` step entirely. Since we're not calling `prepare()` to read from the map, and `commit()` won't affect the map, we avoid locking these read-only views.

But there's a bug in this thinking - there's a difference between having a read-only view/changeset (`change.has_writes()`) and a read-only transaction (`has_writes==false`, `all change.has_writes() == false`). Writing transactions may contain read-only views! And in those transactions, we still need to run the `prepare()` check for the read-only views (my write to table A could depend on reading `foo` from table B, so we need to confirm that B still contains `foo`).

We previously did call `prepare()` for all changes, but without locking the read-only changes. That means we're reading the underlying Map memory without a lock, and racing with any other transactions which may be writing to the same Map.

This PR simply makes the `lock()` (and `unlock()`) unconditional - we lock all maps. There may be a marginally narrower use of this where we continue to avoid the locks for read-only transactions, but we can't know that until we've iterated over all changes, and there's extra complication from the `track_read_version` flag. I think this would be a minor optimisation, not worth the complexity.

* If you run a read-only transaction, and read a bunch of state at seqno=15, then even if the KV has been updated to store to seqno=20 by the time you get here, you have no problem - your transaction reports that it read at seqno=15, and that state is still valid.
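For readers without the diff open, the shape of the fix can be sketched roughly as below. This is a simplified illustration, not the real implementation: the `Map` and `Change` types, their fields, and the version-comparison conflict check are hypothetical stand-ins, and only the names `apply_changes`, `has_writes()`, `prepare()`, `commit()`, `lock()` and `unlock()` come from the description above. It also assumes each transaction holds at most one change per map, and that all transactions visit maps in the same order, so that taking several locks cannot deadlock.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a persistent KV map (not the real type).
struct Map
{
  std::mutex mtx;
  std::map<std::string, std::string> state;
  std::size_t version = 0; // bumped whenever a write is committed
};

// Hypothetical per-transaction view over a single Map.
struct Change
{
  Map* map;
  std::size_t read_version; // map version this transaction observed
  std::map<std::string, std::string> writes;

  bool has_writes() const
  {
    return !writes.empty();
  }

  // Reads shared map state, so the caller must hold map->mtx.
  bool prepare() const
  {
    return map->version == read_version;
  }

  // Mutates shared map state, so the caller must hold map->mtx.
  void commit()
  {
    if (!has_writes())
    {
      return; // a read-only view has nothing to apply
    }
    for (const auto& [k, v] : writes)
    {
      map->state[k] = v;
    }
    ++map->version;
  }
};

// Returns true if every view was still valid and the writes were applied.
bool apply_changes(std::vector<Change>& changes)
{
  // Lock every map unconditionally, including read-only views:
  // prepare() below reads the shared state of each of them.
  for (auto& c : changes)
  {
    c.map->mtx.lock();
  }

  bool ok = true;
  for (const auto& c : changes)
  {
    if (!c.prepare())
    {
      ok = false; // something this transaction read has since changed
      break;
    }
  }

  if (ok)
  {
    for (auto& c : changes)
    {
      c.commit();
    }
  }

  for (auto& c : changes)
  {
    c.map->mtx.unlock();
  }
  return ok;
}
```

In this sketch, a writing transaction with a read-only view over table B now holds B's mutex while `prepare()` inspects it, so it is ordered against any other transaction committing to B.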
Traces
Dumping some TSAN trace walk-through here for posterity.
Full trace is here, but 1400 lines of verbose call-stack: tsan_trace.txt
There are 5 data-race warnings. The last 4 are very similar, the first looks a little different but I think has the same root cause. Trimming to the relevant bits of the second:
Mutexes M0, M3, M4, and M5 are mutexes within each Map (created within `create_change_set` when that Map was first accessed). Thread T2 is going through the signature commit path, so also holds store/sig related mutexes M1 and M2.

We have a data race accessing some `champ::SubNodes` fields, both under the call-stack of this `apply_changes` function. Both threads hold some locks, but none of the same, so they race. The `commit()` side must have some writes, as it's creating new entries, so it should have locked all of the maps it's handling, and be strictly ordered against any nearby `prepare()`s from other transactions? But not for a map it considered read-only!