Skip to content

Commit f568d6f

Browse files
Giulio2002SharovBot
andauthored
[SharovBot] fix(stagedsync): add mutex to Progress.LogCommitments to fix DATA RACE (#19501)
**[SharovBot]** ## Problem The `All tests (with -race)` CI job has been failing on `main` since #19486 merged. The race detector reports: ``` WARNING: DATA RACE Read at 0x...: (*Progress).LogCommitments() exec3_metrics.go:741 Previous write at 0x...: (*Progress).LogCommitments() exec3_metrics.go:742 WARNING: DATA RACE Read at 0x...: updateExecDomainMetrics() exec3_metrics.go:267 Previous write at 0x...: updateExecDomainMetrics() exec3_metrics.go:291 ``` Both races are concurrent calls to `Progress.LogCommitments()` from: - **Goroutine A** (`func1.2`): the commit-logger goroutine spawned inside `parallelExecutor.exec` — triggered when the `commitProgress` channel closes (`!ok` case at `exec3_parallel.go:350`) - **Goroutine B** (outer `exec()`): at `exec3_parallel.go:470`, after `pe.wait()` returns, for the final summary log call ## Root Cause The `<-LogCommitmentsDone` synchronisation at `exec3_parallel.go:408` only runs on the **normal commit-completion path**. If `func1` exits early via `ctx.Done()` (at line 443), it never reaches line 408 — leaving the commit-logger goroutine (`func1.2`) still running. `pe.wait()` returns (tracking `execLoopGroup`, which `func1` is part of), the outer `exec()` proceeds to its `LogCommitments` call — and both goroutines simultaneously mutate `p.prevCommitTime`, `p.prevDomainMetrics`, etc. **Why did #19486 expose this?** It made BAL validation optional, so more Amsterdam-fork blocks complete successfully instead of aborting. `TestExecutionSpecBlockchainDevnet` runs short-lived test contexts; with more successful commits the race window is hit consistently. ## Fix Add `sync.Mutex` to `Progress` and lock it for the duration of `LogCommitments()`. All `p.prev*` field reads and writes happen inside this one function, so a single lock site covers both reported races. ```go type Progress struct { mu sync.Mutex // protects prev* fields from concurrent LogCommitments calls // ... } func (p *Progress) LogCommitments(...) { p.mu.Lock() defer p.mu.Unlock() // ... } ``` ## Verification `go build ./execution/stagedsync/...` passes. The fix unblocks `All tests (with -race)` on `main`. Co-authored-by: SharovBot <sharovbot@erigon.ci>
1 parent 14154ba commit f568d6f

File tree

1 file changed

+8
-0
lines changed

1 file changed

+8
-0
lines changed

execution/stagedsync/exec3_metrics.go

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -455,6 +455,11 @@ func NewProgress(initialBlockNum, initialTxNum, commitThreshold uint64, updateMe
455455
}
456456

457457
type Progress struct {
458+
// mu protects all prev* fields accessed concurrently from the commit-logger
459+
// goroutine (func1.2 inside parallelExecutor.exec) and the outer exec()
460+
// function after pe.wait() returns. The commit-logger goroutine may still
461+
// be running when the outer exec calls LogCommitments for the final summary.
462+
mu sync.Mutex
458463
initialTime time.Time
459464
initialTxNum uint64
460465
initialBlockNum uint64
@@ -725,6 +730,9 @@ func (p *Progress) LogExecution(rs *state.StateV3, ex executor) {
725730
}
726731

727732
func (p *Progress) LogCommitments(rs *state.StateV3, ex executor, commitStart time.Time, stepsInDb float64, lastProgress commitment.CommitProgress) {
733+
p.mu.Lock()
734+
defer p.mu.Unlock()
735+
728736
var te *txExecutor
729737
var suffix string
730738

0 commit comments

Comments
 (0)