fix(coprocessor): use block hash to identify transaction #2098

Open
rudy-6-4 wants to merge 1 commit into main from rudy/fix/tx_id_with_block_hash

Conversation

@rudy-6-4
Contributor

No description provided.

@rudy-6-4 rudy-6-4 requested a review from a team as a code owner March 11, 2026 17:10
@cla-bot cla-bot bot added the cla-signed label Mar 11, 2026
@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch 2 times, most recently from 9bdd318 to dd87cb8 on March 11, 2026 17:13
@mergify

mergify bot commented Mar 11, 2026

🧪 CI Insights

Here's what we observed from your CI run for ba3a8bb.

🟢 All jobs passed!

But CI Insights is watching 👀

@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch 3 times, most recently from fd5809b to 707a3db on March 13, 2026 15:52
@github-actions

github-actions bot commented Mar 13, 2026

Coprocessor Changed Lines Coverage

Coverage of added/modified lines in coprocessor: 82.0%

Per-file breakdown

Diff Coverage

Diff: origin/main...HEAD, staged and unstaged changes

  • coprocessor/fhevm-engine/host-listener/src/database/dependence_chains.rs (100%)
  • coprocessor/fhevm-engine/host-listener/src/database/ingest.rs (100%)
  • coprocessor/fhevm-engine/host-listener/src/database/tfhe_event_propagate.rs (100%)
  • coprocessor/fhevm-engine/scheduler/src/dfg.rs (60.0%): Missing lines 289,386,434,438,457,560,576,583
  • coprocessor/fhevm-engine/scheduler/src/dfg/scheduler.rs (81.8%): Missing lines 304,336
  • coprocessor/fhevm-engine/scheduler/src/dfg/types.rs (100%)
  • coprocessor/fhevm-engine/tfhe-worker/src/tfhe_worker.rs (100%)

Summary

  • Total: 58 lines
  • Missing: 10 lines
  • Coverage: 82%

coprocessor/fhevm-engine/scheduler/src/dfg.rs

  285     }
  286 }
  287 impl std::fmt::Debug for ComponentNode {
  288     fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
! 289         let _ = writeln!(f, "Transaction: [{:?}]", self.transaction);
  290         let _ = writeln!(
  291             f,
  292             "{:?}",
  293             daggy::petgraph::dot::Dot::with_config(self.graph.graph.graph(), &[])

  382                 let cons = self
  383                     .graph
  384                     .node_weight(*consumer)
  385                     .ok_or(SchedulerError::DataflowGraphError)?;
! 386                 error!(target: "scheduler", { producer_tx = ?prod.transaction.clone(), consumer_tx = ?cons.transaction.clone()},
  387 		       "Unexpected cycle in same-transaction dependence");
  388                 return Err(SchedulerError::CyclicDependence.into());
  389             }
  390         }

  430                         .graph
  431                         .node_weight_mut(*idx)
  432                         .ok_or(SchedulerError::DataflowGraphError)?;
  433                     tx.is_uncomputable = true;
! 434                     error!(target: "scheduler", { transaction = ?tx.transaction.clone() },
  435 		       "Transaction is part of a dependence cycle");
  436                     for (_, op) in tx.graph.graph.node_references() {
  437                         self.results.push(DFGTxResult {
! 438                             transaction: tx.transaction.clone(),
  439                             handle: op.result_handle.to_vec(),
  440                             compressed_ct: Err(SchedulerError::CyclicDependence.into()),
  441                         });
  442                     }

  453                 let cons = self
  454                     .graph
  455                     .node_weight(*consumer)
  456                     .ok_or(SchedulerError::DataflowGraphError)?;
! 457                 error!(target: "scheduler", { producer_tx = ?prod.transaction.clone(), consumer_tx = ?cons.transaction.clone() },
  458 		       "Dependence cycle when adding dependence - initial cycle detection failed");
  459                 return Err(SchedulerError::CyclicDependence.into());
  460             }
  461         }

  556 
  557             // Add error results for all operations in this transaction
  558             for (_idx, op) in tx_node.graph.graph.node_references() {
  559                 self.results.push(DFGTxResult {
! 560                     transaction: tx_node.transaction.clone(),
  561                     handle: op.result_handle.to_vec(),
  562                     compressed_ct: Err(SchedulerError::MissingInputs.into()),
  563                 });
  564             }

  572     }
  573     pub fn get_results(&mut self) -> Vec<DFGTxResult> {
  574         std::mem::take(&mut self.results)
  575     }
! 576     pub fn get_intermediate_handles(&mut self) -> Vec<(Handle, Transaction)> {
  577         let mut res = vec![];
  578         for tx in self.graph.node_weights_mut() {
  579             if !tx.is_uncomputable {
  580                 res.append(

  579             if !tx.is_uncomputable {
  580                 res.append(
  581                     &mut (std::mem::take(&mut tx.intermediate_handles))
  582                         .into_iter()
! 583                         .map(|h| (h, tx.transaction.clone()))
  584                         .collect::<Vec<_>>(),
  585                 );
  586             }
  587         }

coprocessor/fhevm-engine/scheduler/src/dfg/scheduler.rs

  300         // this transaction and possibly more downstream.
  301         for (h, i) in tx_inputs.iter_mut() {
  302             if i.is_none() {
  303                 let Some(Ok(ct)) = res.get(h) else {
! 304                     warn!(target: "scheduler", {transaction = ?tid},
  305 		       "Missing input to compute transaction - skipping");
  306                     for nidx in dfg.graph.node_identifiers() {
  307                         let Some(node) = dfg.graph.node_weight_mut(nidx) else {
  308                             error!(target: "scheduler", {index = ?nidx.index() }, "Wrong dataflow graph index");

  332         .entered();
  333         let started_at = std::time::Instant::now();
  334 
  335         let Ok(ts) = daggy::petgraph::algo::toposort(&dfg.graph, None) else {
! 336             error!(target: "scheduler", {transaction = ?tid },
  337 		       "Cyclical dependence error in transaction");
  338             for nidx in dfg.graph.node_identifiers() {
  339                 let Some(node) = dfg.graph.node_weight_mut(nidx) else {
  340                     error!(target: "scheduler", {index = ?nidx.index() }, "Wrong dataflow graph index");

@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch from 707a3db to 60650da on March 13, 2026 16:23
@rudy-6-4 rudy-6-4 requested a review from a team as a code owner March 13, 2026 16:23
@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch from 60650da to b582cca on March 13, 2026 17:08
@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 16, 2026

Claude finished @rudy-6-4's request. View run


@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch from b582cca to 9f0a2e9 on March 17, 2026 12:47
@rudy-6-4
Contributor Author

@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch 4 times, most recently from 1b633ec to b3736cc on March 18, 2026 13:55
@zama-ai zama-ai deleted a comment from claude bot Mar 18, 2026
@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch from b3736cc to 586b47e on March 18, 2026 14:26
@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 18, 2026

Run was cancelled before completion. View run

@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 18, 2026

Claude finished @rudy-6-4's request. View run


@claude

claude bot commented Mar 18, 2026

The review has been posted successfully. Here's a summary of what was reviewed and found:

Review Posted ✅

PR #2098: fix(coprocessor): use block hash to identify transaction

Key Finding

ON CONFLICT clause inconsistency — The INSERT into computations now includes block_hash, and all SELECT/UPDATE queries filter on (transaction_id, block_hash), but the ON CONFLICT clause and its backing unique index still only use (output_handle, transaction_id). This could cause issues in chain reorg scenarios where the same transaction appears in different blocks — the new block's computation would be silently dropped, and subsequent queries (which now filter by block_hash) would never find the existing row. A clarification question was posed to the author.

Clean Areas

  • Migration files are well-structured with concurrent index creation
  • Transaction struct correctly implements Hash, PartialEq, Eq
  • DFG/scheduler code consistently updated throughout
  • Error handling improved (previously discarded results now checked)
  • Test helpers properly updated
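The composite identity the review praises can be sketched as follows. This is a minimal, hypothetical sketch, not the PR's actual code: the field names `transaction_id` and `block_hash` come from the review, but the byte-vector types and the grouping logic are assumptions for illustration.

```rust
use std::collections::HashMap;

// Hypothetical sketch: deriving Hash/PartialEq/Eq over both fields makes two
// transactions with the same id but different block hashes distinct keys.
#[derive(Debug, Clone, Hash, PartialEq, Eq)]
struct Transaction {
    transaction_id: Vec<u8>,
    block_hash: Vec<u8>,
}

fn distinct_groups() -> usize {
    let tx_block_a = Transaction { transaction_id: vec![0xAB], block_hash: vec![0x01] };
    let tx_block_b = Transaction { transaction_id: vec![0xAB], block_hash: vec![0x02] };
    // Same transaction_id, different block_hash: must not compare equal,
    // otherwise a reorg'd copy would collapse into the original and could
    // create the dependence cycles this PR fixes.
    assert_ne!(tx_block_a, tx_block_b);

    let mut groups: HashMap<Transaction, Vec<&'static str>> = HashMap::new();
    groups.entry(tx_block_a).or_default().push("op in block A");
    groups.entry(tx_block_b).or_default().push("op in block B");
    groups.len()
}

fn main() {
    println!("distinct groups: {}", distinct_groups());
}
```

With a single-field key, both entries would land in one group; the composite key keeps the reorg'd copies apart.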

Nit

Redundant PartialEq<Transaction> for &Transaction impl noted.

@claude

claude bot commented Mar 18, 2026

Run was cancelled before completion.View run

@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 18, 2026

Claude finished @rudy-6-4's request. View run


claude bot left a comment


[AUTOMATED]

Code Review Summary

This PR changes transaction identification in the coprocessor from using transaction_id alone to a composite key (transaction_id, block_hash). The change is well-structured — the new Transaction struct cleanly encapsulates the composite identity, and the SQL queries, scheduler, and worker are all consistently updated.

Overall: Sound architectural change with one blocking database migration issue and a couple of minor nits.

Key Observations

  • ✅ Good: The new Transaction type provides a clean abstraction for the composite key
  • ✅ Good: set_computation_error now checks rows_affected() == 0 and warns — this is a meaningful improvement over silently ignoring the result
  • ✅ Good: Concurrent index creation avoids locking the table during migration
  • ✅ Good: Tests and helpers consistently updated with block_hash: ...::ZERO
  • ⚠️ Blocking: Old unique index not dropped (see inline comment)
  • ℹ️ sqlx-cli version bump from 0.7.2 → 0.8.6 is bundled in — presumably needed for the new query patterns


-- For next release, we should remove the default values for block_hash and block_number.
-- ALTER TABLE computations ALTER COLUMN block_hash DROP DEFAULT;
-- ALTER TABLE computations ALTER COLUMN block_number DROP DEFAULT;


[AUTOMATED] Bug (Confidence: 95/100): The existing unique index idx_computations_no_tenant ON computations (output_handle, transaction_id) (created in migration 20260128095635_remove_tenants.sql, line 88) is not dropped by any of the new migrations.

This means:

  1. The old 2-column unique constraint (output_handle, transaction_id) is still enforced
  2. You cannot insert two rows with the same (output_handle, transaction_id) but different block_hash values — the old index will reject them
  3. The ON CONFLICT (output_handle, transaction_id, block_hash) DO NOTHING clause targets the new 3-column unique index, so a conflict on the old 2-column index would raise an error rather than being silently handled

This defeats the purpose of the PR — the whole point of adding block_hash to the key is to allow the same (output_handle, transaction_id) to exist across different blocks (e.g., during reorgs).

Suggested fix: Add a migration step before creating the new index:

DROP INDEX IF EXISTS idx_computations_no_tenant;
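Putting the suggested step together with the new index, the migration could look roughly like this. Only `idx_computations_no_tenant` and the column names are taken from the review; the new index name `idx_computations_tx_block` is illustrative, not from the PR:

```sql
-- Drop the old 2-column unique index so (output_handle, transaction_id)
-- may repeat across different block hashes after a reorg.
DROP INDEX IF EXISTS idx_computations_no_tenant;

-- Recreate uniqueness over the 3-column composite key. CONCURRENTLY avoids
-- holding an exclusive lock on computations while the index builds (note it
-- cannot run inside a transaction block).
CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS idx_computations_tx_block
    ON computations (output_handle, transaction_id, block_hash);
```

Ordering matters: the drop must land before (or with) the insert path that relies on `ON CONFLICT (output_handle, transaction_id, block_hash)`, or the old index will keep rejecting reorg duplicates.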

.node_weight(*consumer)
.ok_or(SchedulerError::DataflowGraphError)?;
error!(target: "scheduler", { producer_id = ?hex::encode(prod.transaction_id.clone()), consumer_id = ?hex::encode(cons.transaction_id.clone()) },
error!(target: "scheduler", { producer = ?prod.transaction.clone(), consumer_id = ?cons.transaction.clone() },


[AUTOMATED] Nit (Confidence: 85/100): Inconsistent structured log field naming — one field is producer and the other is consumer_id:

error!(... { producer = ?prod.transaction.clone(), consumer_id = ?cons.transaction.clone() }, ...)

Since the values are now Transaction structs (not IDs), consider using consistent naming like producer / consumer or producer_tx / consumer_tx. The _id suffix on only one of them is misleading. This also applies to the similar log line earlier in the same function.

@claude

claude bot commented Mar 18, 2026

The review was successfully posted. Let me provide a summary.


Review Posted Successfully ✅

The review has been posted to PR #2098 with the following findings:

🔴 Critical Issue (1)

Missing DROP INDEX for old unique constraint (Confidence: 95/100)

  • File: 20260309161441_computations_block_hash.sql

  • The existing unique index idx_computations_no_tenant ON computations (output_handle, transaction_id) is never dropped by the new migrations. This means:

    1. The old 2-column uniqueness is still enforced
    2. Inserting rows with the same (output_handle, transaction_id) but different block_hash will fail against the old index
    3. The ON CONFLICT (output_handle, transaction_id, block_hash) clause only targets the new 3-column index, so old-index violations raise errors instead of being handled

    This fundamentally defeats the purpose of the PR.

💛 Nit (1)

Inconsistent log field naming in dfg.rs: producer vs consumer_id in the same log line (the _id suffix is misleading now that these are Transaction structs, not raw IDs).

✅ Positive Observations

  • Clean Transaction struct abstraction
  • Improved error handling in set_computation_error (now checks rows_affected)
  • Proper concurrent index creation to avoid table locks
  • Consistent test updates across all helpers

@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch from fa0cfd7 to c88a87a on March 27, 2026 15:05
@rudy-6-4 rudy-6-4 requested a review from a team as a code owner March 27, 2026 15:05
@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch 2 times, most recently from 9be5ded to bc4b51d on March 27, 2026 15:24
@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 27, 2026

Claude timed out while processing the request. — View run

@rudy-6-4
Contributor Author

@claude /pr-review

1 similar comment
@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 30, 2026

Claude finished @rudy-6-4's request. View run

claude bot previously approved these changes Mar 30, 2026

claude bot left a comment


[AUTOMATED]

Code Review

Summary

This PR correctly addresses a critical bug where blockchain reorganizations (reorgs) could cause transaction ID collisions, leading to circular dependency cycles in the computation scheduler. The fix adds block_hash as part of the composite key (output_handle, transaction_id, block_hash) to uniquely identify computations.

What was reviewed

  • Bug analysis (2 independent passes): No runtime bugs found. All SQL queries, Rust type changes, and data flows are internally consistent.
  • Guideline compliance: Code follows project conventions (Angular commit format, Rust patterns).
  • Error handling: No silent failures introduced. The addition of rows_affected() check in set_computation_error is a positive improvement over the previous let _ = query!(...) pattern.

Changes verified

  • DB migration adds block_hash to PK and unique index: ✅ Correct
  • Transaction struct with (transaction_id, block_hash): ✅ Well-designed
  • SQL SELECT, INSERT, UPDATE queries include block_hash: ✅ Consistent
  • ON CONFLICT clause updated to composite key: ✅ Correct
  • HashMap grouping uses Transaction as key: ✅ Hash/Eq properly derived
  • All scheduler/DFG type signatures updated: ✅ Complete
  • Test helpers updated with block_hash: ZERO: ✅ Appropriate
  • sqlx-cli version bump (0.7.2 → 0.8.6)
  • CI logs now include db-migration container: ✅ Nice addition

Nits (non-blocking)

  • Nit: Commit message body typos: "butkeeping" → "but keeping", "sametx" → "same tx" in the commit body. Minor, but worth noting since PRs are squash-merged.
  • Nit: Redundant PartialEq impl — The manual impl PartialEq<Transaction> for &Transaction in types.rs is redundant with the derived PartialEq on Transaction. Rust's auto-deref handles &Transaction == Transaction comparisons already. Not harmful, but could be removed for clarity.
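The redundancy noted in the second nit can be seen with a minimal sketch. This is hypothetical illustration code, not the PR's types.rs; field names follow the review, the types are assumptions:

```rust
// Derived PartialEq on the owned type is enough: the standard library's
// blanket impl `impl PartialEq<&B> for &A where A: PartialEq<B>` already
// covers reference-to-reference comparisons, so a hand-written
// `impl PartialEq<Transaction> for &Transaction` adds nothing here.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Transaction {
    transaction_id: Vec<u8>,
    block_hash: Vec<u8>,
}

fn refs_compare_equal() -> bool {
    let a = Transaction { transaction_id: vec![1], block_hash: vec![2] };
    let b = a.clone();
    let (ra, rb) = (&a, &b);
    // Both forms compile and agree with only the derived impl in scope.
    ra == rb && a == b
}

fn main() {
    println!("equal: {}", refs_compare_equal());
}
```

If the codebase needs mixed `&Transaction == Transaction` comparisons somewhere, a deref (`*ra == b`) covers that case too without the extra impl.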

Verdict

Clean, well-scoped fix. The composite key approach is the right solution for disambiguating transactions across reorgs. All data paths (ingest → DB → scheduler → worker → DB update) are consistently updated. No bugs or security issues found.

@claude

claude bot commented Mar 30, 2026

The review has been posted successfully. Here's a summary of what was done:

PR Review Complete ✅

PR #2098: fix(coprocessor): use block hash to identify transaction

Review Process

  • Pre-validation: PR is open and non-trivial (19 files, 172+97 lines) → proceeded with review
  • 4 parallel review agents ran:
    • 🔍 Bug Hunter 1 (opus): No bugs found
    • 🔍 Bug Hunter 2 (opus): No bugs found — confirmed all SQL queries, type changes, and data flows are consistent
    • 📏 Guideline Compliance (sonnet): Minor commit message typos found (nit)
    • ⚠️ Error Handling Auditor (sonnet): Raised concerns about migration-period data mismatches, but these were invalidated by a validation agent that traced the full data flow
  • 1 validation agent (opus) confirmed that the migration-period concern is not a real issue since block_hash values are read from and written back to the same rows

Verdict: APPROVED

Clean, well-scoped fix with consistent changes across DB schema, Rust types, SQL queries, and the scheduler/worker pipeline. Two minor nits noted (commit body typos and a redundant PartialEq impl) — non-blocking.

Review posted at: #2098 (review)

@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch 2 times, most recently from 22a3d3a to e032e6b on March 30, 2026 15:39
@rudy-6-4
Contributor Author

@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Mar 30, 2026

Claude finished @rudy-6-4's request. View run


claude bot left a comment


[AUTOMATED]

Code Review Summary

This PR correctly adds block_hash to the composite key identifying computations, enabling proper handling of blockchain reorgs. The changes are thorough and consistent across the DB migration, SQL queries, and Rust code.

What was reviewed

  • Database migration for adding block_hash to the primary key and unique index
  • SQL query updates (INSERT, SELECT, UPDATE) across host-listener and tfhe-worker
  • New Transaction struct in scheduler types and its propagation through the DFG
  • Error handling improvements in set_computation_error
  • Test helper updates
  • sqlx-cli version bump and CI logging addition

Findings

1 low-severity observation, 1 nit — neither blocking.

Overall the PR is well-structured with consistent changes across all layers. The rows_affected() == 0 warning in set_computation_error is a nice defensive improvement.

@@ -0,0 +1,26 @@
ALTER TABLE computations
ADD COLUMN IF NOT EXISTS block_hash BYTEA NOT NULL DEFAULT '\x00'::BYTEA,
ADD COLUMN IF NOT EXISTS block_number BIGINT NOT NULL DEFAULT 0;


[AUTOMATED] Nit: ADD COLUMN IF NOT EXISTS block_number is a no-op here

The previous migration 20260319120000_add_block_number_for_state_revert.sql already added block_number BIGINT NULL DEFAULT NULL to computations. Since IF NOT EXISTS is used, this line silently does nothing — the column retains its original NULL DEFAULT NULL constraint rather than becoming NOT NULL DEFAULT 0 as written here.

This is harmless because:

  • The INSERT in tfhe_event_propagate.rs always provides an explicit block_number value
  • The tfhe-worker SELECT query doesn't read block_number

But it may be misleading to future readers who expect block_number to be NOT NULL. Consider either:

  • Removing the block_number part from this migration (since it's already handled)
  • Or adding an explicit ALTER TABLE computations ALTER COLUMN block_number SET NOT NULL, ALTER COLUMN block_number SET DEFAULT 0; if the NOT NULL constraint is desired

Confidence: 95/100

@claude

claude bot commented Mar 30, 2026

The review has been posted successfully. Here's a summary of the review:

PR Review: fix(coprocessor): use block hash to identify transaction (#2098)

Review Posted ✅

The review was posted as a COMMENT (non-blocking) with inline comments on 2 files.

Summary

This PR adds block_hash to the composite key identifying computations in the database to handle blockchain reorgs correctly. The changes span 18 files across the DB migration, SQL queries, scheduler types, and test helpers.

No bugs found. The changes are thorough and consistent. Two non-blocking nits were noted:

  1. Migration block_number no-op (20260327161441_computations_block_hash.sql:3): The ADD COLUMN IF NOT EXISTS block_number BIGINT NOT NULL DEFAULT 0 line is a no-op because a previous migration already added block_number as BIGINT NULL DEFAULT NULL. The column retains its original nullable constraint. This is harmless but potentially misleading.

  2. Inconsistent log field naming (scheduler/src/dfg.rs:457): One error log uses producer while a similar log on line 386 uses producer_id — minor inconsistency for log searching.

@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch 2 times, most recently from 945e257 to 2b30338 on April 3, 2026 14:17
fixes a bug where 2 transactions are swapped in a reorg
but keeping the same tx id and being dependent on each other
the 4 apparent tx are merged to 2 tx and create a cycle
@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/tx_id_with_block_hash branch from 2b30338 to f11fe50 on April 3, 2026 15:30
@rudy-6-4
Contributor Author

rudy-6-4 commented Apr 3, 2026

@claude /pr-review

@claude

claude bot commented Apr 3, 2026

Claude timed out while processing the request. — View run
