Skip to content

add trySetBusy#7815

Open
AdoAdoAdo wants to merge 5 commits intofeat/supernova-async-execfrom
status-handler-try-set-busy
Open

add trySetBusy#7815
AdoAdoAdo wants to merge 5 commits intofeat/supernova-async-execfrom
status-handler-try-set-busy

Conversation

@AdoAdoAdo
Copy link
Copy Markdown
Contributor

Reasoning behind the pull request

Proposed changes

Testing procedure

Pre-requisites

Based on the Contributing Guidelines the PR author and the reviewers must check the following requirements are met:

  • was the PR targeted to the correct branch?
  • if this is a larger feature that probably needs more than one PR, is there a feat branch created?
  • if this is a feat branch merging, do all satellite projects have a proper tag inside go.mod?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a non-blocking “try set busy” capability to the node processing status handler and wires it into block processing paths to prevent concurrent/re-entrant block processor calls, introducing a dedicated ErrBlockProcessorBusy for callers (e.g., sync) to handle gracefully.

Changes:

  • Extend common.ProcessStatusHandler with TrySetBusy(reason) bool and implement it in enabled/disabled/stub handlers.
  • Guard shard/meta/base block processing entrypoints with TrySetBusy, returning process.ErrBlockProcessorBusy when already busy.
  • Update sync failure handling to ignore ErrBlockProcessorBusy (no rollback/tracking) and adjust affected tests/stubs.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
common/interface.go Adds TrySetBusy to ProcessStatusHandler interface.
statusHandler/processStatusHandler.go Implements TrySetBusy with mutex-protected state transition and logging.
common/disabled/processStatusHandler.go Adds no-op TrySetBusy implementation (always true).
testscommon/processStatusHandlerStub.go Extends stub to support TrySetBusyCalled.
process/errors.go Introduces ErrBlockProcessorBusy.
process/block/shardblock.go Uses TrySetBusy in ProcessBlock, CreateBlock, and (legacy) CommitBlock.
process/block/shardblockProposal.go Uses TrySetBusy in ProcessBlockProposal.
process/block/metablock.go Uses TrySetBusy in ProcessBlock, CreateBlock, and (legacy) CommitBlock.
process/block/metablockProposal.go Uses TrySetBusy in ProcessBlockProposal.
process/block/baseProcess.go Uses TrySetBusy in ProcessScheduledBlock; adjusts last executed root hash selection for v3.
process/sync/baseSync.go Treats ErrBlockProcessorBusy as a non-actionable sync failure (no rollback/tracking).
process/block/*_test.go Updates tests to hook TrySetBusyCalled instead of SetBusyCalled where needed.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +36 to +44
defer psh.mutStatus.Unlock()

if !psh.isIdle {
log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason)
return false
}

log.Debug("processStatusHandler.TrySetBusy", "reason", reason)
psh.isIdle = false
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TrySetBusy logs while holding mutStatus. This unnecessarily extends the critical section and can block other goroutines calling IsIdle/SetIdle/SetBusy. Consider updating isIdle under the lock, unlocking, then logging based on the outcome (including the already-busy case).

Suggested change
defer psh.mutStatus.Unlock()
if !psh.isIdle {
log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason)
return false
}
log.Debug("processStatusHandler.TrySetBusy", "reason", reason)
psh.isIdle = false
if !psh.isIdle {
psh.mutStatus.Unlock()
log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason)
return false
}
psh.isIdle = false
psh.mutStatus.Unlock()
log.Debug("processStatusHandler.TrySetBusy", "reason", reason)

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +46
// TrySetBusy will atomically check if idle and set the internal state to "busy".
// Returns true if the state was successfully set to busy, false if already busy.
func (psh *processStatusHandler) TrySetBusy(reason string) bool {
psh.mutStatus.Lock()
defer psh.mutStatus.Unlock()

if !psh.isIdle {
log.Debug("processStatusHandler.TrySetBusy: already busy", "reason", reason)
return false
}

log.Debug("processStatusHandler.TrySetBusy", "reason", reason)
psh.isIdle = false
return true
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new TrySetBusy behavior (idle->busy succeeds, busy->false fails) is not covered by unit tests. Since this method is now used as a concurrency gate, please add tests in statusHandler/processStatusHandler_test.go to validate state transitions and return values (including concurrent callers).

Copilot uses AI. Check for mistakes.
Comment on lines +289 to 293
if !sp.processStatusHandler.TrySetBusy("shardProcessor.ProcessBlockProposal") {
return nil, process.ErrBlockProcessorBusy
}
defer sp.processStatusHandler.SetIdle()

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProcessBlockProposal sets the processor back to idle at the end of this method, but the async execution flow calls CommitBlockProposalState afterwards and that method is not guarded by TrySetBusy/SetIdle. This allows another goroutine to acquire busy and start processing between proposal execution and state commit, which can break the intended mutual exclusion. Consider extending the busy window across CommitBlockProposalState/RevertBlockProposalState (or adding a similar TrySetBusy guard there).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only possible on pre-Supernova (execution triggered from Consensus or Sync flow), on Supernova (where CommitBlockProposalState/RevertBlockProposalState) are called, headersExecutor is the only caller, and calls sequentially sequentially.

Comment on lines +311 to 315
if !mp.processStatusHandler.TrySetBusy("metaProcessor.ProcessBlockProposal") {
return nil, process.ErrBlockProcessorBusy
}
defer mp.processStatusHandler.SetIdle()

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as in shardProcessor: ProcessBlockProposal releases busy before CommitBlockProposalState runs (and CommitBlockProposalState is not guarded), so another goroutine can start processing in between and corrupt shared processor/account state. Consider guarding commit/revert proposal state as well or otherwise holding the busy lock across the full proposal+commit sequence.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not possible on Supernova, single worker for execution, and the mentioned unprotected methods are only called on Supernova.

Comment on lines 779 to +784
func (boot *baseBootstrap) doJobOnSyncBlockFail(bodyHandler data.BodyHandler, headerHandler data.HeaderHandler, err error) {
if errors.Is(err, process.ErrBlockProcessorBusy) {
// block processor is busy with another call (e.g. consensus processing the same block);
// no processing started, nothing to track or roll back - just retry on next sync iteration
return
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doJobOnSyncBlockFail now special-cases ErrBlockProcessorBusy, but syncBlocks still treats this as a SyncBlock error and will log it and wait up to sleepTimeOnFail (400ms) before the next attempt. If the intent is a fast retry when the processor is busy, consider handling ErrBlockProcessorBusy earlier (e.g., in syncBlock/syncBlocks) to avoid the failure backoff path and reduce noisy logs/delays.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted, no longer possible on Supernova.

Comment on lines +100 to 103
if !sp.processStatusHandler.TrySetBusy("shardProcessor.ProcessBlock") {
return process.ErrBlockProcessorBusy
}
defer sp.processStatusHandler.SetIdle()
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new early-return branch (ErrBlockProcessorBusy) is not covered by tests in this package. Please add a unit test that sets TrySetBusy to return false and asserts ProcessBlock returns ErrBlockProcessorBusy (and does not call SetIdle).

Copilot uses AI. Check for mistakes.
Comment on lines +161 to 164
if !mp.processStatusHandler.TrySetBusy("metaProcessor.ProcessBlock") {
return process.ErrBlockProcessorBusy
}
defer mp.processStatusHandler.SetIdle()
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new early-return branch (ErrBlockProcessorBusy) is not covered by tests. Please add a unit test that makes TrySetBusy return false and asserts ProcessBlock returns ErrBlockProcessorBusy.

Copilot uses AI. Check for mistakes.
Comment on lines 2581 to 2586
func (bp *baseProcessor) ProcessScheduledBlock(headerHandler data.HeaderHandler, bodyHandler data.BodyHandler, haveTime func() time.Duration) error {
var err error
bp.processStatusHandler.SetBusy("baseProcessor.ProcessScheduledBlock")
if !bp.processStatusHandler.TrySetBusy("baseProcessor.ProcessScheduledBlock") {
return process.ErrBlockProcessorBusy
}
defer func() {
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProcessScheduledBlock can now return ErrBlockProcessorBusy when TrySetBusy fails, but there is no test covering this new branch. Please add a unit test (similar to existing busy/idle sequence tests) asserting ErrBlockProcessorBusy is returned when TrySetBusyCalled returns false.

Copilot uses AI. Check for mistakes.
sstanculeanu
sstanculeanu previously approved these changes Mar 31, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 6.25000% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.55%. Comparing base (3f8d81a) to head (5aa64ab).
⚠️ Report is 5 commits behind head on feat/supernova-async-exec.

Files with missing lines Patch % Lines
statusHandler/processStatusHandler.go 0.00% 9 Missing ⚠️
process/block/metablock.go 0.00% 3 Missing and 3 partials ⚠️
process/block/shardblock.go 0.00% 3 Missing and 3 partials ⚠️
process/block/baseProcess.go 40.00% 2 Missing and 1 partial ⚠️
process/block/metablockProposal.go 0.00% 1 Missing and 1 partial ⚠️
process/block/shardblockProposal.go 0.00% 1 Missing and 1 partial ⚠️
process/sync/baseSync.go 0.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@                      Coverage Diff                      @@
##           feat/supernova-async-exec    #7815      +/-   ##
=============================================================
- Coverage                      77.58%   77.55%   -0.03%     
=============================================================
  Files                            882      882              
  Lines                         123941   123963      +22     
=============================================================
- Hits                           96155    96136      -19     
- Misses                         21416    21444      +28     
- Partials                        6370     6383      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@raduchis raduchis self-requested a review April 2, 2026 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants