Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 18 additions & 8 deletions service/worker/batcher/workflow_v2_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -150,23 +150,28 @@ func TestBatchWorkflowV2_TuneSignal(t *testing.T) {
var mu sync.Mutex
var capturedParams []BatchParams

// firstActivityDone is kept open while the test runs, which prevents the
// first activity mock goroutine from returning before the workflow has had
// a chance to process the tune signal. The Cadence test framework delivers
// CanceledError to the activity future independently of the goroutine, so
// blocking here does not prevent the workflow from completing. The channel
// is closed by t.Cleanup once the assertions are done.
// firstActivityDone gates the first activity mock goroutine so it does not
// return before the workflow has had a chance to process the tune signal.
// The Cadence test framework delivers CanceledError to the activity future
// independently of the goroutine, so blocking here does not prevent the
// workflow from completing.
firstActivityDone := make(chan struct{})
t.Cleanup(func() { close(firstActivityDone) })

// activityWG tracks running activity mock goroutines so we can wait for
// them to finish before reading capturedParams, avoiding a data race.
var activityWG sync.WaitGroup

env.OnActivity(batchActivityV2Name, mock.Anything, mock.Anything).
Return(func(_ context.Context, p BatchParams) (HeartBeatDetails, error) {
activityWG.Add(1)
defer activityWG.Done()
Comment on lines +166 to +167
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Bug: WaitGroup.Add(1) called inside goroutine instead of before it

At line 166, activityWG.Add(1) is called inside the activity callback goroutine rather than before it starts. Per the sync.WaitGroup documentation, "calls with a positive delta that occur when the counter is zero must happen before a Wait." If the Go scheduler delays the goroutine start, activityWG.Wait() (line 204) could observe a zero counter and return before Add(1) has executed, reading capturedParams while the goroutine is still writing to it.

In practice this is very unlikely because the Cadence test framework invokes the second activity synchronously during ExecuteWorkflow (which completes before Wait is called), and the first activity must have started to block on the channel. Still, it violates the documented contract and could theoretically cause the race detector to fire under extreme scheduling.

A cleaner approach: add both counts eagerly before the mock is invoked, or use activityWG.Add(1) inside a wrapper that is called synchronously by the framework before spawning the goroutine.

Suggested fix:

// Before registering the mock, add the expected count:
activityWG.Add(2) // we expect exactly 2 invocations

env.OnActivity(batchActivityV2Name, mock.Anything, mock.Anything).
    Return(func(_ context.Context, p BatchParams) (HeartBeatDetails, error) {
        defer activityWG.Done()
        // ... rest unchanged

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion


mu.Lock()
capturedParams = append(capturedParams, p)
n := len(capturedParams)
mu.Unlock()
if n == 1 {
// Block until the test is done. This prevents the future from
// Block until the workflow is done. This prevents the future from
// resolving with nil before the workflow processes the tune signal,
// which would cause the workflow to take the early-exit path.
<-firstActivityDone
Expand All @@ -193,6 +198,11 @@ func TestBatchWorkflowV2_TuneSignal(t *testing.T) {
assert.Equal(t, 8, result.SuccessCount)
assert.Equal(t, 3, result.CurrentPage)

// Release the first activity goroutine and wait for all activity goroutines
// to finish before reading capturedParams.
close(firstActivityDone)
activityWG.Wait()

mu.Lock()
Copy link
Copy Markdown
Member

@Shaddoll Shaddoll Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all activities have finished, we don't need this lock, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the final read-side lock. We still need the mutex while the two activity goroutines append into capturedParams, but once activityWG.Wait() returns the post-wait copy can read the slice directly.

captured := make([]BatchParams, len(capturedParams))
copy(captured, capturedParams)
Expand Down
Loading