Pull request overview
Adjusts unit test thread synchronization to reduce intermittent failures (“flaps”) by allowing more time for worker threads to complete before assertions run.
Changes:
- Increased thread Join timeouts from 1s to 10s in three NUID-related tests.
- Aimed to reduce nondeterministic failures under slower CI conditions.
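The Join timeout change can be illustrated with a minimal sketch. The helper name `RunAndJoin` is hypothetical and does not come from the NUID tests; it only shows that `Thread.Join(TimeSpan)` returns a bool a test can assert on.

```csharp
using System;
using System.Threading;

public static class JoinWithTimeoutSketch
{
    // Hypothetical helper: runs `work` on a new thread and waits up to `timeout`.
    // Returns true if the worker finished before the timeout elapsed.
    public static bool RunAndJoin(Action work, TimeSpan timeout)
    {
        var worker = new Thread(() => work());
        worker.Start();

        // Thread.Join(TimeSpan) returns as soon as the thread ends,
        // or returns false once the timeout has elapsed.
        return worker.Join(timeout);
    }
}
```

Because Join returns as soon as the worker exits, raising the limit from 1s to 10s should not slow down a passing run; it only widens the margin before a slow runner is reported as a failure.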
This was referenced Mar 10, 2026
Add MockServer ready signal and use ConnectRetryAsync for mock server tests to avoid race between acceptance loop startup and client connect. Add explicit ConnectRetryAsync to Request_reply_many_test_max_count. Increase Watcher_timeout_reconnect timeout from 30s to 60s to allow margin for idle timeout detection on slow CI. Skip first ping from max RTT assertion in SlowConsumer test since it catches the publish burst tail. Increase CommandTimeout in Buffer_high_pressure_pub_test from 5s to 30s for the 195MB high-pressure workload.
Apply the same MockServer ready signal and ConnectRetryAsync pattern to ProtocolParserSizeCheckTest and PingCancellationTest to fix connection race on net481 under CI load.
Use ConnectRetryAsync for real server PingCancellation tests. Add MockServer ready wait, ConnectRetryAsync, and increased CommandTimeout to SendBufferTest to handle 8MB publish under CI load on net481.
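The ready-signal-plus-retry pattern described in these commits can be sketched roughly as follows. `ReadySignalServer` and `ConnectWithRetryAsync` are hypothetical stand-ins written for illustration; the repository's actual MockServer and ConnectRetryAsync are not shown in this conversation and may work differently.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a mock server; not the repository's MockServer.
public sealed class ReadySignalServer : IDisposable
{
    private readonly TcpListener _listener = new TcpListener(IPAddress.Loopback, 0);
    private readonly TaskCompletionSource<int> _ready =
        new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();

    // Completes with the bound port once the listener has started.
    public Task<int> Ready => _ready.Task;

    public void Start() => _ = Task.Run(AcceptLoopAsync);

    private async Task AcceptLoopAsync()
    {
        try
        {
            _listener.Start();
            _ready.TrySetResult(((IPEndPoint)_listener.LocalEndpoint).Port);
            while (!_cts.IsCancellationRequested)
            {
                // A real mock would speak the protocol here; this sketch just closes.
                using (var client = await _listener.AcceptTcpClientAsync()) { }
            }
        }
        catch (Exception ex)
        {
            // Reached on shutdown; also unblocks waiters if startup itself failed.
            _ready.TrySetException(ex);
        }
        finally
        {
            _listener.Stop();
        }
    }

    public void Dispose()
    {
        _cts.Cancel();
        _listener.Stop();
    }
}

public static class ConnectRetrySketch
{
    // Hypothetical retry helper; the PR's ConnectRetryAsync may differ.
    public static async Task<TcpClient> ConnectWithRetryAsync(int port, int attempts, TimeSpan delay)
    {
        for (var attempt = 1; ; attempt++)
        {
            var client = new TcpClient();
            try
            {
                await client.ConnectAsync(IPAddress.Loopback, port);
                return client;
            }
            catch (SocketException)
            {
                client.Dispose();
                if (attempt >= attempts) throw;
                await Task.Delay(delay);
            }
        }
    }
}
```

A test would call `Start()`, await `Ready` to obtain the port, and only then connect. The ready wait removes the startup race, and the retry covers any remaining transient connection failures under CI load.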
FakeServer only accepts one client, so ConnectRetryAsync consumes the single accept slot on the first failed attempt, making all subsequent retries fail. Use increased ConnectTimeout (10s) with plain ConnectAsync instead.
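To make the single-accept constraint concrete, here is a minimal sketch under stated assumptions. `StartSingleAccept` and `ConnectOnceAsync` are hypothetical names, and FakeServer's real implementation is not shown here. The sketch serves exactly one connection and pairs it with a single connection attempt bounded by a longer timeout, in place of several short retries.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class SingleAcceptSketch
{
    // Hypothetical single-accept server: serves one connection, then stops listening.
    public static (int Port, Task Served) StartSingleAccept()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        var port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var served = Task.Run(async () =>
        {
            try
            {
                // The only accept slot; any later client finds nothing listening.
                using (var client = await listener.AcceptTcpClientAsync()) { }
            }
            finally
            {
                listener.Stop();
            }
        });
        return (port, served);
    }

    // One connection attempt bounded by a timeout (assumed 10s in the commit above).
    public static async Task<TcpClient> ConnectOnceAsync(int port, TimeSpan timeout)
    {
        var client = new TcpClient();
        var connect = client.ConnectAsync(IPAddress.Loopback, port);
        if (await Task.WhenAny(connect, Task.Delay(timeout)) != connect)
        {
            client.Dispose();
            throw new TimeoutException("Connect did not complete within " + timeout + ".");
        }

        try
        {
            await connect; // observe any connection error
        }
        catch
        {
            client.Dispose();
            throw;
        }
        return client;
    }
}
```

With a server like this, a retry loop that abandons its first attempt after the connection was already accepted has used up the only slot, so every later attempt fails. A single attempt with a generous timeout avoids that.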
Add ready wait and ConnectTimeout/ConnectRetryAsync to the two tests that were missed in the previous pass (Valid_msg_still_works and Valid_hmsg_still_works).
This was referenced Apr 16, 2026
Fix flaky CI test failures across Windows and Linux runners. The main issues were mock/fake server connection races (acceptance loop not ready before client connects), tight timeouts under CI load, and assertion sensitivity to transient latency spikes.
- MockServer: add Ready task signaling when acceptance loop starts, use ConnectRetryAsync for tests using it
- FakeServer: add Ready task, use increased ConnectTimeout (single-accept server can't use retry)
- Request_reply_many_test_max_count: add explicit ConnectRetryAsync
- Watcher_timeout_reconnect: increase timeout from 30s to 60s for idle timeout detection margin
- SlowConsumer: skip first ping from max RTT assertion (catches publish burst tail)
- Buffer_high_pressure_pub_test: increase CommandTimeout from 5s to 30s for 195MB workload
- PingCancellationTest: use ConnectRetryAsync for mock server tests, ConnectRetryAsync for real server tests
- SendBufferTest: add ready wait, ConnectRetryAsync, increased CommandTimeout
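The SlowConsumer change (skipping the first ping in the max RTT assertion) could look something like the sketch below. `MaxAfterWarmup` is a hypothetical helper, not code from the test itself, and the sample values in the note that follows are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RttAssertionSketch
{
    // Hypothetical helper: largest RTT after discarding the first `warmup` samples.
    public static TimeSpan MaxAfterWarmup(IReadOnlyList<TimeSpan> rtts, int warmup = 1)
    {
        if (rtts.Count <= warmup)
            throw new ArgumentException("Need at least one sample after the warm-up.", nameof(rtts));

        // The first ping can still be queued behind the tail of the publish burst,
        // so it is excluded from the bound being asserted.
        return rtts.Skip(warmup).Max();
    }
}
```

For samples of 900 ms, 20 ms, and 35 ms, the helper returns 35 ms, so an assertion on the maximum would no longer trip on the first, inflated measurement.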