Skip to content

fix(coprocessor): dependence chain update during reorg in case of collisions (#1768)#1899

Merged
mergify[bot] merged 1 commit intomainfrom
antoniu/backport-missing-pr
Feb 6, 2026
Merged

fix(coprocessor): dependence chain update during reorg in case of collisions (#1768)#1899
mergify[bot] merged 1 commit intomainfrom
antoniu/backport-missing-pr

Conversation

@antoniupop
Copy link
Copy Markdown
Collaborator

Backport from release/v0.10.x

  • fix(coprocessor): improve dependence chain update in HL during reorg in case of collisions

  • fix(coprocessor): allow TFHE worker to lock earliest dependence chain on no progress

  • chore(coprocessor): add index for pending dependence chains where dependence count is not zero

  • fix(coprocessor): add threshold for ignoring dependency_count and make dependent list unique

@antoniupop antoniupop requested a review from a team as a code owner February 4, 2026 09:52
@cla-bot cla-bot Bot added the cla-signed label Feb 4, 2026
@Eikix
Copy link
Copy Markdown
Contributor

Eikix commented Feb 4, 2026

@claude /pr-review

@claude
Copy link
Copy Markdown

claude Bot commented Feb 4, 2026

Claude finished @Eikix's task in 3m 51s —— View job


[AUTOMATED] PR Review: fix(coprocessor): dependence chain update during reorg in case of collisions

Summary

This is a backport PR that introduces 4 related fixes for handling dependence chains during blockchain reorganizations:

  1. Improved SQL UPSERT - Uses LEAST() for timestamps and merges dependent lists with DISTINCT
  2. New acquire_early_lock() method - Fallback lock acquisition that ignores dependency_count
  3. New database index - Partial index for faster lock acquisition queries
  4. New threshold parameter - dcid_ignore_dependency_count_threshold for controlling fallback behavior

Issues Found

🔴 Medium: earliest_schedule_order may be initialized from non-allowed work item

File: coprocessor/fhevm-engine/tfhe-worker/src/tfhe_worker.rs
Lines: 425, 496-500

Description:
The variable earliest_schedule_order is initialized at line 425 from the_work.first().unwrap().schedule_order. However, the_work contains ALL computations from selected transactions (including those where is_allowed = false), not just allowed ones.

The loop at lines 496-500 only updates earliest_schedule_order when BOTH conditions are met:

if w.schedule_order < earliest_schedule_order && w.is_allowed {
    earliest_schedule_order = w.schedule_order;
}

Problem: If the first item in the_work has is_allowed = false, and all allowed items have a later schedule_order, then earliest_schedule_order will never be updated because no allowed item will satisfy schedule_order < earliest_schedule_order.

This contradicts the intent stated in the comment (lines 497-499): "Only account for allowed to avoid case of reorg where trivial encrypts will be in collision."

Suggested Fix:
Initialize earliest_schedule_order to PrimitiveDateTime::MAX and let the loop find the actual earliest allowed item:

let mut earliest_schedule_order = PrimitiveDateTime::MAX;
// ... later in the loop:
if w.is_allowed && w.schedule_order < earliest_schedule_order {
    earliest_schedule_order = w.schedule_order;
}

Fix this →

Confidence: 90/100


Nits (Non-blocking)

📝 Missing tests for new acquire_early_lock() method

File: coprocessor/fhevm-engine/tfhe-worker/src/dependence_chain.rs
Lines: 220-284

The new acquire_early_lock() function lacks dedicated test coverage. The existing acquire_next_lock() has tests in src/tests/dependence_chain.rs, but no tests cover the new fallback mechanism. Given this is critical for reorg handling, tests would help ensure correctness.

Confidence: 95/100


📝 CLI argument documentation could be more descriptive

File: coprocessor/fhevm-engine/tfhe-worker/src/daemon_cli.rs
Lines: 124-126

The new dcid_ignore_dependency_count_threshold parameter has minimal documentation. Consider explaining:

  • What happens when the threshold is reached
  • The relationship to dcid_max_no_progress_cycles
  • Why this is needed (reorg handling)

Confidence: 85/100


What Looks Good ✅

  • SQL LEAST() logic - Correctly preserves earliest timestamp during updates
  • SQL DISTINCT merge - Properly deduplicates dependents array
  • New partial index - Matches the query pattern in acquire_early_lock()
  • Lock acquisition logic - FOR UPDATE SKIP LOCKED prevents race conditions
  • Threshold-based fallback - Sound design for handling incorrect dependency counts due to reorgs
  • Error handling - Properly propagates errors with ? operator

Review Decision

COMMENT - One medium-severity issue found that should be addressed. The earliest_schedule_order initialization bug could cause the same reorg collision issues the PR aims to fix.


@Eikix
Copy link
Copy Markdown
Contributor

Eikix commented Feb 4, 2026

Not sure claude's review is relevant, since this is already merged code.
If you choose to disregard, I can approve

@mergify
Copy link
Copy Markdown

mergify Bot commented Feb 4, 2026

🧪 CI Insights

Here's what we observed from your CI run for d84c5a0.

🟢 All jobs passed!

But CI Insights is watching 👀

Eikix
Eikix previously approved these changes Feb 5, 2026
Copy link
Copy Markdown
Contributor

@Eikix Eikix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backport check only: patch-id matches e2918f4 on release/0.10.x (PR #1768). Single commit, no extra changes. LGTM.

@antoniupop antoniupop force-pushed the antoniu/backport-missing-pr branch from 414a201 to c9b7d0b Compare February 5, 2026 08:42
@antoniupop
Copy link
Copy Markdown
Collaborator Author

@Mergifyio queue

@mergify
Copy link
Copy Markdown

mergify Bot commented Feb 5, 2026

queue

✅ The pull request has been merged automatically

Details

The pull request has been merged automatically at 74e9474

@mergify
Copy link
Copy Markdown

mergify Bot commented Feb 5, 2026

Merge Queue Status

✅ The pull request has been merged at d84c5a0

This pull request spent 2 hours 37 minutes 24 seconds in the queue, including 1 hour 36 minutes 31 seconds running CI.
The checks were run on draft #1937.

Required conditions to merge
  • #approved-reviews-by >= 1 [🛡 GitHub branch protection]
  • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]
  • #review-threads-unresolved = 0 [🛡 GitHub branch protection]
  • branch-protection-review-decision = APPROVED [🛡 GitHub branch protection]
  • check-success = run-e2e-tests / fhevm-e2e-test
  • any of [🛡 GitHub branch protection]:
    • check-success = common-pull-request/lint (bpr)
    • check-neutral = common-pull-request/lint (bpr)
    • check-skipped = common-pull-request/lint (bpr)
  • any of [🛡 GitHub branch protection]:
    • check-skipped = coprocessor-cargo-listener-tests/cargo-tests (bpr)
    • check-neutral = coprocessor-cargo-listener-tests/cargo-tests (bpr)
    • check-success = coprocessor-cargo-listener-tests/cargo-tests (bpr)
  • any of [🛡 GitHub branch protection]:
    • check-success = coprocessor-cargo-test/cargo-tests (bpr)
    • check-neutral = coprocessor-cargo-test/cargo-tests (bpr)
    • check-skipped = coprocessor-cargo-test/cargo-tests (bpr)
  • any of [🛡 GitHub branch protection]:
    • check-success = coprocessor-dependency-analysis/dependencies-check (bpr)
    • check-neutral = coprocessor-dependency-analysis/dependencies-check (bpr)
    • check-skipped = coprocessor-dependency-analysis/dependencies-check (bpr)
  • any of [🛡 GitHub branch protection]:
    • check-skipped = gateway-contracts-deployment-tests/sc-deploy (bpr)
    • check-neutral = gateway-contracts-deployment-tests/sc-deploy (bpr)
    • check-success = gateway-contracts-deployment-tests/sc-deploy (bpr)
  • any of [🛡 GitHub branch protection]:
    • check-success = kms-connector-tests/test-connector (bpr)
    • check-neutral = kms-connector-tests/test-connector (bpr)
    • check-skipped = kms-connector-tests/test-connector (bpr)

@antoniupop antoniupop force-pushed the antoniu/backport-missing-pr branch 2 times, most recently from c6d6d62 to b3a3f24 Compare February 5, 2026 16:33
@antoniupop
Copy link
Copy Markdown
Collaborator Author

@Mergifyio queue

@mergify
Copy link
Copy Markdown

mergify Bot commented Feb 6, 2026

queue

☑️ Command queue ignored because it is already running from a previous command.

…e of collisions (#1768)

* fix(coprocessor): improve dependence chain update in HL during reorg in case of collisions

* fix(coprocessor): allow TFHE worker to lock earliest dependence chain on no progress

* chore(coprocessor): add index for pending dependence chains where dependence count is not zero

* fix(coprocessor): add threshold for ignoring dependency_count and make dependent list unique
@antoniupop antoniupop force-pushed the antoniu/backport-missing-pr branch from b6e1410 to d84c5a0 Compare February 6, 2026 12:59
@antoniupop antoniupop requested a review from Eikix February 6, 2026 12:59
Copy link
Copy Markdown
Contributor

@Eikix Eikix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

mergify Bot added a commit that referenced this pull request Feb 6, 2026
mergify Bot added a commit that referenced this pull request Feb 6, 2026
@mergify mergify Bot merged commit 74e9474 into main Feb 6, 2026
62 checks passed
@mergify mergify Bot deleted the antoniu/backport-missing-pr branch February 6, 2026 17:13
@mergify mergify Bot removed the queued label Feb 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants