Fix flaky NoReconnectionToGatewayNotReturnedByManager test #9942
Merged
ReubenBond merged 1 commit into dotnet:main from Feb 19, 2026
The test used a 1-second response timeout, which could cause legitimate grain calls to the real gateway to time out spuriously on slow CI machines, leading to timeoutCount == 2 instead of the expected 1.
- Increase response timeout from 1s to 3s (still below the 5s OpenConnectionTimeout, so fake-gateway calls still produce TimeoutException)
- Change timeoutCount assertion from exact equality to >= 1, since the core assertion is connectionCount == 1 (verifying no reconnection)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR fixes a flaky test NoReconnectionToGatewayNotReturnedByManager that was failing intermittently on slow CI machines. The test verifies that Orleans clients don't attempt to reconnect to gateways that have been removed from the gateway list.
Changes:
- Increased response timeout from 1s to 3s to prevent spurious timeouts on slow CI machines while still being below the 5s OpenConnectionTimeout
- Changed timeout count assertion from exact equality (== 1) to minimum threshold (>= 1) to tolerate edge-case performance variations
- Improved inline documentation explaining the timeout choice
Problem
The test Tester.GatewayConnectionTests.NoReconnectionToGatewayNotReturnedByManager is flaky. It fails with:
Root Cause
The test sets `ResponseTimeout` to only 1 second, but the default `OpenConnectionTimeout` is 5 seconds. On slow CI machines, legitimate grain calls to the real gateway can exceed 1 second due to grain activation overhead, causing a spurious `TimeoutException` that inflates `timeoutCount` to 2.
The `connectionCount == 1` assertion (line 159) always passes, confirming that only one TCP connection was accepted by the fake gateway; the extra timeout comes from a slow real-gateway call, not a reconnection.
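The timing argument above can be sketched as a small standalone model. This is an illustration only, not the test's code and not Orleans code: `TimeoutModel`, `TimesOut`, and the 1.5-second latency are hypothetical. Only the 1s and 3s response timeouts and the 5s `OpenConnectionTimeout` default come from this PR, and the sketch assumes a call routed to the fake gateway cannot complete before the connection attempt gives up.

```csharp
using System;

// Hypothetical model of the race described above; not Orleans code.
public static class TimeoutModel
{
    // A call surfaces a timeout when no response arrives within the response timeout.
    public static bool TimesOut(TimeSpan timeToResponse, TimeSpan responseTimeout)
        => timeToResponse > responseTimeout;

    public static void Main()
    {
        var openConnectionTimeout = TimeSpan.FromSeconds(5); // default stated in the PR
        var slowRealCall = TimeSpan.FromSeconds(1.5);        // assumed latency on a slow CI machine

        foreach (var responseTimeout in new[] { TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(3) })
        {
            // Assumption: a fake-gateway call gets no answer before openConnectionTimeout elapses.
            bool fakeGatewayTimesOut = TimesOut(openConnectionTimeout, responseTimeout);
            bool realGatewayTimesOut = TimesOut(slowRealCall, responseTimeout);

            Console.WriteLine(
                $"ResponseTimeout={responseTimeout.TotalSeconds}s " +
                $"fake-gateway timeout={fakeGatewayTimesOut} " +
                $"real-gateway timeout={realGatewayTimesOut}");
        }
    }
}
```

Under these assumed numbers, the 1s setting times out both the fake-gateway call and the slow real-gateway call (two timeouts), while the 3s setting times out only the fake-gateway call. Relaxing the assertion to >= 1 covers the remaining case where a real-gateway call is slower than 3s.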
Fix