Skip to content

fix: prevent lost iterations in CircularKey concurrent worker race#49

Open
ShivamPaliwal1 wants to merge 1 commit into
couchbaselabs:mainfrom
ShivamPaliwal1:fix/circular-key-concurrent-iteration-race
Open

fix: prevent lost iterations in CircularKey concurrent worker race#49
ShivamPaliwal1 wants to merge 1 commit into
couchbaselabs:mainfrom
ShivamPaliwal1:fix/circular-key-concurrent-iteration-race

Conversation

@ShivamPaliwal1
Copy link
Copy Markdown
Contributor

Two workers sharing a DocumentGenerator could both detect end-of-pass (updateItr >= update_e) before either called resetUpdate(), causing double-decrements of the plain-int iterations counter in CircularKey.

Each race event silently dropped one full pass, producing fewer total mutations than expected (observed: 54204 vs 60000 ep_dcp_items_sent in CDC history retention tests).

Add double-checked locking (synchronized on this.keys) in has_next_read(), has_next_update(), and has_next_expiry() so only one worker triggers checkIterations() + reset per pass-end event.

The fast unsynchronized path is preserved for normal in-pass execution, keeping full worker concurrency for deduplication-sensitive tests.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a concurrency race when multiple workers share a single DocumentGenerator using CircularKey, ensuring only one worker performs the end-of-pass iteration check and counter reset.

Changes:

  • Added double-checked locking (synchronized (this.keys)) around end-of-pass handling in has_next_read(), has_next_update(), and has_next_expiry().
  • Ensured these methods return true immediately after a successful iteration rollover + reset, preserving progress across passes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if (this.ws.dr.expiryItr.get() < this.ws.dr.expiry_e)
return true;
try {
if ((boolean)this.iterationsMethod.invoke(this.keys, null)) {
return true;
try {
if ((boolean)this.iterationsMethod.invoke(this.keys)) {
this.resetRead();
Comment on lines +238 to +241
if ((boolean)this.iterationsMethod.invoke(this.keys)) {
this.resetUpdate();
this.ws.mutated += 1;
return true;
Two workers sharing a DocumentGenerator could both detect end-of-pass
(updateItr >= update_e) before either called resetUpdate(), causing
double-decrements of the plain-int iterations counter in CircularKey.

Each race event silently dropped one full pass, producing fewer total
mutations than expected (observed: 54204 vs 60000 ep_dcp_items_sent
in CDC history retention tests).

Add double-checked locking (synchronized on this.keys) in
has_next_read(), has_next_update(), and has_next_expiry() so only one
worker triggers checkIterations() + reset per pass-end event.

The fast unsynchronized path is preserved for normal in-pass execution,
keeping full worker concurrency for deduplication-sensitive tests.
@ShivamPaliwal1 ShivamPaliwal1 force-pushed the fix/circular-key-concurrent-iteration-race branch from 020aa86 to c140886 Compare May 12, 2026 11:12
@bkumaran bkumaran requested a review from Copilot May 12, 2026 12:22
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment on lines +233 to 252
synchronized (this.keys) {
if (this.ws.dr.updateItr.get() < this.ws.dr.update_e)
return true;
if (this.keyInstance.getSimpleName().equals(CircularKey.class.getSimpleName())) {
try {
if ((boolean)this.iterationsMethod.invoke(this.keys)) {
this.resetUpdate();
this.ws.mutated += 1;
return true;
}
} catch (IllegalAccessException | IllegalArgumentException | InvocationTargetException e1) {
e1.printStackTrace();
}
} catch (IllegalAccessException | IllegalArgumentException | InvocationTargetException e1) {
e1.printStackTrace();
}
}
if (TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis())-startTime<ws.mutation_timeout) {
this.resetUpdate();
this.ws.mutated += 1;
return true;
if (TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis())-startTime<ws.mutation_timeout) {
this.resetUpdate();
this.ws.mutated += 1;
return true;
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants