-
Notifications
You must be signed in to change notification settings - Fork 7.3k
ZOOKEEPER-4925: Fix data loss due to propagation of discontinuous committedLog #2254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…mittedLog There are two variants of `ZooKeeperServer::processTxn`. Those two variants diverge significantly since ZOOKEEPER-3484. `processTxn(Request request)` pops outstanding change from `outstandingChanges` and adds txn to `committedLog` for follower to sync in addition to what `processTxn(TxnHeader hdr, Record txn)` does. The `Learner` uses `processTxn(TxnHeader hdr, Record txn)` to commit txn to memory after ZOOKEEPER-4394, which means it leaves `committedLog` untouched in `SYNCHRONIZATION` phase. This way, a stale follower will have hole in its `committedLog` after joining cluster. The stale follower will propagate the in memory hole to other stale nodes after becoming leader. This causes data loss. The test case fails on master and 3.9.3, and passes on 3.9.2. So only 3.9.3 is affected. This commit drops `processTxn(TxnHeader hdr, Record txn)` as `processTxn(Request request)` is capable in `SYNCHRONIZATION` phase too. Also, this commit rejects discontinuous proposals in `syncWithLeader` and `committedLog`, so to avoid possible data loss. Refs: ZOOKEEPER-4925, ZOOKEEPER-4394, ZOOKEEPER-3484
zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumSyncTest.java
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java
Outdated
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java
Outdated
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java
Outdated
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZKDatabase.java
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZKDatabase.java
Show resolved
Hide resolved
The changes for fixing the data loss issue looks good. It's a very critical issue. Thanks for fixing it. The changes for blocking proposal and commit requests being out of order looks good in general. We need to make sure it doesn't introduce any unexpected issue. |
Hi @li4wang, thank for for reviewing this! I have pushed a new commit. Please take another to look and let me know whether it solves your concerns. |
@cnauroth @anmolnar @eolivelli @tisonkun Would you like to take another look ? I plan to merge this in days to not block 3.9.4. |
There are two variants of
ZooKeeperServer::processTxn
. Those two variants diverge significantly since ZOOKEEPER-3484.processTxn(Request request)
pops outstanding change fromoutstandingChanges
and adds txn tocommittedLog
for follower to sync in addition to whatprocessTxn(TxnHeader hdr, Record txn)
does. TheLearner
usesprocessTxn(TxnHeader hdr, Record txn)
to commit txn to memory after ZOOKEEPER-4394, which means it leavescommittedLog
untouched inSYNCHRONIZATION
phase.This way, a stale follower will have hole in its
committedLog
after joining cluster. The stale follower will propagate the in memory hole to other stale nodes after becoming leader. This causes data loss.The test case fails on master and 3.9.3, and passes on 3.9.2. So only 3.9.3 is affected.
This commit drops
processTxn(TxnHeader hdr, Record txn)
asprocessTxn(Request request)
is capable inSYNCHRONIZATION
phase too.Also, this commit rejects discontinuous proposals in
syncWithLeader
andcommittedLog
, so to avoid possible data loss.Refs: ZOOKEEPER-4925, ZOOKEEPER-4394, ZOOKEEPER-3484