Skip to content

Fix forwardingStreamHandler errgroup race#317

Merged
EdSchouten merged 1 commit intobuildbarn:mainfrom
meroton:fix-flaky-grpc-test
Feb 4, 2026
Merged

Fix forwardingStreamHandler errgroup race#317
EdSchouten merged 1 commit intobuildbarn:mainfrom
meroton:fix-flaky-grpc-test

Conversation

@moroten
Copy link
Contributor

@moroten moroten commented Feb 3, 2026

The commit Add generic gRPC stream forwarding (#306), introduced a flakiness in the grpc_test due to calling errgroup.Go(func() error { return err }) when an error had occurred in a go routine that was asynchrounous to the errgroup.Wait() call. This could lead to errgroup.Go() being called just when errgroup.Wait() was about to return, which is not allowed.

@aspect-workflows
Copy link

aspect-workflows bot commented Feb 3, 2026

Test

2 test targets passed

Targets
//pkg/blobstore/sharding/integration:integration_test [k8-fastbuild]                 85ms
//pkg/grpc:grpc_test [k8-fastbuild]                                                  100ms

Total test execution time was 185ms. 27 tests (93.1%) were fully cached saving 6s.

@moroten moroten force-pushed the fix-flaky-grpc-test branch from a84a20e to a374a9c Compare February 4, 2026 14:11
@moroten
Copy link
Contributor Author

moroten commented Feb 4, 2026

This PR now adds program.Group.GoAsync to generalize the support needed to remove the errgroup.Go(func() { return err }) lines inside asynchronous go routines. Does that look alright?

@EdSchouten
Copy link
Member

I don't know. I'd rather not make program.Group fat. Stuff should just clean up their own goroutines. What are your thoughts on doing something like this instead?

var propagatedError atomic.Pointer[error]
ctxWithCancel, cancel := context.WithCancel(ctx)
defer cancel()

x, err := NewStream()
...

If we want to propagate errors from the background goroutine, we do this:

propagatedError.Store(err)
cancel()

@moroten
Copy link
Contributor Author

moroten commented Feb 4, 2026

Okay, I resolved it with context.WithCancelCause.

@moroten moroten force-pushed the fix-flaky-grpc-test branch from a374a9c to a38a942 Compare February 4, 2026 14:41
@EdSchouten
Copy link
Member

Completely forgot about CancelCauseFunc. Nice!

@moroten moroten force-pushed the fix-flaky-grpc-test branch from a38a942 to 136b202 Compare February 4, 2026 14:43
Commit 5b5db75,
"Add generic gRPC stream forwarding (buildbarn#306)",
introduced a flakiness in the grpc_test due to calling
`errgroup.Go(func() error { return err })` when an error had occurred in
a go routine that was asynchrounous to the errgroup.Wait() call. This
could lead to `errgroup.Go()` being called just when `errgroup.Wait()`
was about to return, which is not allowed.
@moroten moroten force-pushed the fix-flaky-grpc-test branch from 136b202 to c76188c Compare February 4, 2026 14:48
@EdSchouten EdSchouten merged commit bebcf08 into buildbarn:main Feb 4, 2026
3 checks passed
@moroten moroten deleted the fix-flaky-grpc-test branch February 4, 2026 20:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments