Nonce collision when executing parallel contract writes from same account #415

@chrisli30

Problem

When a workflow has multiple contract write nodes executing in parallel (e.g., from a branch with multiple paths), they fetch the same nonce simultaneously, causing a nonce collision. Only one transaction succeeds; the others hang waiting for confirmation until the 3-minute timeout.

Evidence from Logs

Task ID: 01K73WAEHKWWV2RX4H4MKAVF2F

Both approval transactions used the same nonce:

  • First approval (approve_token1_permit2): nonce 58, UserOp hash 0x60b464495163376f48327e006f538698877f5be4c285b610fd2d42ebede25190 - STUCK
  • Second approval (approve_token1_swaprouter): nonce 58, UserOp hash 0xc04c9bde00b1a6a45b631e7cbbf488ba8a64fb9463ce96142dcece2e5c3c7fb3 - SUCCEEDED

Root Cause

The contract write processor fetches nonces without coordination. When multiple contract writes execute in parallel:

  1. Both fetch current nonce at the same time → both get nonce 58
  2. Both build and submit UserOps with nonce 58
  3. Only one gets included on-chain (whichever the bundler processes first)
  4. The other becomes permanently invalid (can never confirm)
  5. Workflow hangs waiting for invalid transaction (until 3-minute timeout)
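
The race in steps 1 and 2 can be reproduced in isolation. The following is a minimal sketch, not the code in `vm_runner_contract_write.go`: `fakeChain`, `currentNonce`, and `fetchInParallel` are hypothetical names standing in for the real nonce lookup, and the starting nonce of 58 is taken from the log above.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeChain is a hypothetical stand-in for the on-chain nonce source.
// It is not the project's real client.
type fakeChain struct {
	mu    sync.Mutex
	nonce uint64
}

// currentNonce returns the account's next expected nonce without reserving it,
// which is the uncoordinated read described in the root cause.
func (c *fakeChain) currentNonce() uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.nonce
}

// fetchInParallel simulates n contract write nodes that each read the nonce
// before any of them has submitted a UserOp.
func fetchInParallel(c *fakeChain, n int) []uint64 {
	got := make([]uint64, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			got[i] = c.currentNonce()
		}(i)
	}
	wg.Wait()
	return got
}

func main() {
	chain := &fakeChain{nonce: 58}
	// Nothing advances the nonce between reads, so every node sees 58.
	fmt.Println(fetchInParallel(chain, 2)) // prints [58 58]
}
```

Because reading the nonce does not reserve it, every parallel node receives the same value regardless of goroutine ordering.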

Impact

  • Workflows with parallel contract writes will hang for 3 minutes before timing out
  • Client never receives response during this time (blocking requests)
  • Gas is wasted on the failed transaction
  • Poor user experience

Potential Solutions

  1. Sequential execution: Detect when multiple contract writes from the same sender are in parallel and force sequential execution
  2. Nonce coordination: Implement a nonce manager that allocates sequential nonces to parallel transactions (58, 59, 60...)
  3. Batch transactions: Combine multiple contract calls into a single multicall transaction
  4. Better error handling: Detect nonce collision and fail fast instead of waiting for timeout
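
Option 2 could look roughly like the sketch below. It assumes an in-process manager keyed by sender address; `NonceManager`, `Reserve`, `Reset`, and the injected `fetch` function are hypothetical names for illustration, not existing APIs in this repository.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// NonceManager hands out strictly increasing nonces per sender.
// Hypothetical type for illustration; not part of the existing codebase.
type NonceManager struct {
	mu   sync.Mutex
	next map[string]uint64
	// fetch reads the sender's current on-chain nonce (injected for the sketch).
	fetch func(sender string) (uint64, error)
}

func NewNonceManager(fetch func(string) (uint64, error)) *NonceManager {
	return &NonceManager{next: make(map[string]uint64), fetch: fetch}
}

// Reserve returns a nonce that no other caller will receive for this sender.
// The lock is held across fetch for simplicity; a real implementation would
// likely lock per sender instead of globally.
func (m *NonceManager) Reserve(sender string) (uint64, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	onChain, err := m.fetch(sender)
	if err != nil {
		return 0, err
	}
	n := m.next[sender]
	if onChain > n {
		n = onChain // chain is ahead of our local view (or first use)
	}
	m.next[sender] = n + 1
	return n, nil
}

// Reset drops the local counter so the next Reserve starts from the chain value,
// e.g. after a reserved nonce was never submitted.
func (m *NonceManager) Reset(sender string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.next, sender)
}

func main() {
	mgr := NewNonceManager(func(string) (uint64, error) { return 58, nil })
	nonces := make([]uint64, 2)
	var wg sync.WaitGroup
	for i := range nonces {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			nonces[i], _ = mgr.Reserve("0xSmartWallet")
		}(i)
	}
	wg.Wait()
	sort.Slice(nonces, func(a, b int) bool { return nonces[a] < nonces[b] })
	fmt.Println(nonces) // prints [58 59]
}
```

One caveat with this approach: if a reserved nonce is never submitted, later nonces leave a gap and may not be includable, so the caller would need something like `Reset` on failure. The sketch also only coordinates writes within a single process.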

Steps to Reproduce

  1. Create a workflow with a branch node
  2. Add multiple contract write nodes on different branch paths (all using the same smart wallet)
  3. Trigger the workflow
  4. Observe that only one contract write succeeds, others hang

Related Code

  • core/taskengine/vm_runner_contract_write.go: UserOp nonce fetching
  • core/taskengine/vm.go: Parallel node execution in Kahn scheduler

Labels: bug
