Conversation
There was a problem hiding this comment.
Pull request overview
Adds a non-blocking “try set busy” capability to the node processing status handler and wires it into block processing paths to prevent concurrent/re-entrant block processor calls, introducing a dedicated ErrBlockProcessorBusy for callers (e.g., sync) to handle gracefully.
Changes:
- Extend
common.ProcessStatusHandlerwithTrySetBusy(reason) booland implement it in enabled/disabled/stub handlers. - Guard shard/meta/base block processing entrypoints with
TrySetBusy, returningprocess.ErrBlockProcessorBusywhen already busy. - Update sync failure handling to ignore
ErrBlockProcessorBusy(no rollback/tracking) and adjust affected tests/stubs.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
common/interface.go |
Adds TrySetBusy to ProcessStatusHandler interface. |
statusHandler/processStatusHandler.go |
Implements TrySetBusy with mutex-protected state transition and logging. |
common/disabled/processStatusHandler.go |
Adds no-op TrySetBusy implementation (always true). |
testscommon/processStatusHandlerStub.go |
Extends stub to support TrySetBusyCalled. |
process/errors.go |
Introduces ErrBlockProcessorBusy. |
process/block/shardblock.go |
Uses TrySetBusy in ProcessBlock, CreateBlock, and (legacy) CommitBlock. |
process/block/shardblockProposal.go |
Uses TrySetBusy in ProcessBlockProposal. |
process/block/metablock.go |
Uses TrySetBusy in ProcessBlock, CreateBlock, and (legacy) CommitBlock. |
process/block/metablockProposal.go |
Uses TrySetBusy in ProcessBlockProposal. |
process/block/baseProcess.go |
Uses TrySetBusy in ProcessScheduledBlock; adjusts last executed root hash selection for v3. |
process/sync/baseSync.go |
Treats ErrBlockProcessorBusy as a non-actionable sync failure (no rollback/tracking). |
process/block/*_test.go |
Updates tests to hook TrySetBusyCalled instead of SetBusyCalled where needed. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| defer psh.mutStatus.Unlock() | ||
|
|
||
| if !psh.isIdle { | ||
| log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason) | ||
| return false | ||
| } | ||
|
|
||
| log.Debug("processStatusHandler.TrySetBusy", "reason", reason) | ||
| psh.isIdle = false |
There was a problem hiding this comment.
TrySetBusy logs while holding mutStatus. This unnecessarily extends the critical section and can block other goroutines calling IsIdle/SetIdle/SetBusy. Consider updating isIdle under the lock, unlocking, then logging based on the outcome (including the already-busy case).
| defer psh.mutStatus.Unlock() | |
| if !psh.isIdle { | |
| log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason) | |
| return false | |
| } | |
| log.Debug("processStatusHandler.TrySetBusy", "reason", reason) | |
| psh.isIdle = false | |
| if !psh.isIdle { | |
| psh.mutStatus.Unlock() | |
| log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason) | |
| return false | |
| } | |
| psh.isIdle = false | |
| psh.mutStatus.Unlock() | |
| log.Debug("processStatusHandler.TrySetBusy", "reason", reason) |
| // TrySetBusy will atomically check if idle and set the internal state to "busy". | ||
| // Returns true if the state was successfully set to busy, false if already busy. | ||
| func (psh *processStatusHandler) TrySetBusy(reason string) bool { | ||
| psh.mutStatus.Lock() | ||
| defer psh.mutStatus.Unlock() | ||
|
|
||
| if !psh.isIdle { | ||
| log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason) | ||
| return false | ||
| } | ||
|
|
||
| log.Debug("processStatusHandler.TrySetBusy", "reason", reason) | ||
| psh.isIdle = false | ||
| return true | ||
| } |
There was a problem hiding this comment.
The new TrySetBusy behavior (idle->busy succeeds, busy->false fails) is not covered by unit tests. Since this method is now used as a concurrency gate, please add tests in statusHandler/processStatusHandler_test.go to validate state transitions and return values (including concurrent callers).
| if !sp.processStatusHandler.TrySetBusy("shardProcessor.ProcessBlockProposal") { | ||
| return nil, process.ErrBlockProcessorBusy | ||
| } | ||
| defer sp.processStatusHandler.SetIdle() | ||
|
|
There was a problem hiding this comment.
ProcessBlockProposal sets the processor back to idle at the end of this method, but the async execution flow calls CommitBlockProposalState afterwards and that method is not guarded by TrySetBusy/SetIdle. This allows another goroutine to acquire busy and start processing between proposal execution and state commit, which can break the intended mutual exclusion. Consider extending the busy window across CommitBlockProposalState/RevertBlockProposalState (or adding a similar TrySetBusy guard there).
There was a problem hiding this comment.
this is only possible on pre-Supernova (execution triggered from Consensus or Sync flow), on Supernova (where CommitBlockProposalState/RevertBlockProposalState) are called, headersExecutor is the only caller, and calls sequentially sequentially.
| if !mp.processStatusHandler.TrySetBusy("metaProcessor.ProcessBlockProposal") { | ||
| return nil, process.ErrBlockProcessorBusy | ||
| } | ||
| defer mp.processStatusHandler.SetIdle() | ||
|
|
There was a problem hiding this comment.
Same issue as in shardProcessor: ProcessBlockProposal releases busy before CommitBlockProposalState runs (and CommitBlockProposalState is not guarded), so another goroutine can start processing in between and corrupt shared processor/account state. Consider guarding commit/revert proposal state as well or otherwise holding the busy lock across the full proposal+commit sequence.
There was a problem hiding this comment.
not possible on Supernova, single worker for execution, and the mentioned unprotected methods are only called on Supernova.
| func (boot *baseBootstrap) doJobOnSyncBlockFail(bodyHandler data.BodyHandler, headerHandler data.HeaderHandler, err error) { | ||
| if errors.Is(err, process.ErrBlockProcessorBusy) { | ||
| // block processor is busy with another call (e.g. consensus processing the same block); | ||
| // no processing started, nothing to track or roll back - just retry on next sync iteration | ||
| return | ||
| } |
There was a problem hiding this comment.
doJobOnSyncBlockFail now special-cases ErrBlockProcessorBusy, but syncBlocks still treats this as a SyncBlock error and will log it and wait up to sleepTimeOnFail (400ms) before the next attempt. If the intent is a fast retry when the processor is busy, consider handling ErrBlockProcessorBusy earlier (e.g., in syncBlock/syncBlocks) to avoid the failure backoff path and reduce noisy logs/delays.
There was a problem hiding this comment.
accepted, no longer possible on Supernova.
| if !sp.processStatusHandler.TrySetBusy("shardProcessor.ProcessBlock") { | ||
| return process.ErrBlockProcessorBusy | ||
| } | ||
| defer sp.processStatusHandler.SetIdle() |
There was a problem hiding this comment.
This new early-return branch (ErrBlockProcessorBusy) is not covered by tests in this package. Please add a unit test that sets TrySetBusy to return false and asserts ProcessBlock returns ErrBlockProcessorBusy (and does not call SetIdle).
| if !mp.processStatusHandler.TrySetBusy("metaProcessor.ProcessBlock") { | ||
| return process.ErrBlockProcessorBusy | ||
| } | ||
| defer mp.processStatusHandler.SetIdle() |
There was a problem hiding this comment.
This new early-return branch (ErrBlockProcessorBusy) is not covered by tests. Please add a unit test that makes TrySetBusy return false and asserts ProcessBlock returns ErrBlockProcessorBusy.
| func (bp *baseProcessor) ProcessScheduledBlock(headerHandler data.HeaderHandler, bodyHandler data.BodyHandler, haveTime func() time.Duration) error { | ||
| var err error | ||
| bp.processStatusHandler.SetBusy("baseProcessor.ProcessScheduledBlock") | ||
| if !bp.processStatusHandler.TrySetBusy("baseProcessor.ProcessScheduledBlock") { | ||
| return process.ErrBlockProcessorBusy | ||
| } | ||
| defer func() { |
There was a problem hiding this comment.
ProcessScheduledBlock can now return ErrBlockProcessorBusy when TrySetBusy fails, but there is no test covering this new branch. Please add a unit test (similar to existing busy/idle sequence tests) asserting ErrBlockProcessorBusy is returned when TrySetBusyCalled returns false.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## feat/supernova-async-exec #7815 +/- ##
=============================================================
- Coverage 77.58% 77.55% -0.03%
=============================================================
Files 882 882
Lines 123941 123963 +22
=============================================================
- Hits 96155 96136 -19
- Misses 21416 21444 +28
- Partials 6370 6383 +13 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…x-chain-go into status-handler-try-set-busy
Reasoning behind the pull request
Proposed changes
Testing procedure
Pre-requisites
Based on the Contributing Guidelines the PR author and the reviewers must check the following requirements are met:
featbranch created?featbranch merging, do all satellite projects have a proper tag insidego.mod?