Skip to content

Fix test flaps#1083

Merged
mtmk merged 8 commits intomainfrom
Fix-NuidTests-flap
Apr 10, 2026
Merged

Fix test flaps#1083
mtmk merged 8 commits intomainfrom
Fix-NuidTests-flap

Conversation

@mtmk
Copy link
Copy Markdown
Member

@mtmk mtmk commented Mar 4, 2026

Fix flaky CI test failures across Windows and Linux runners. The main issues were mock/fake server connection races (acceptance loop not ready before client connects), tight timeouts under CI load, and assertion sensitivity to transient latency spikes.

  • MockServer: add Ready task signaling when acceptance loop starts, use ConnectRetryAsync for tests using it
  • FakeServer: add Ready task, use increased ConnectTimeout (single-accept server can't use retry)
  • Request_reply_many_test_max_count: add explicit ConnectRetryAsync
  • Watcher_timeout_reconnect: increase timeout from 30s to 60s for idle timeout detection margin
  • SlowConsumer: skip first ping from max RTT assertion (catches publish burst tail)
  • Buffer_high_pressure_pub_test: increase CommandTimeout from 5s to 30s for 195MB workload
  • PingCancellationTest: use ConnectRetryAsync for mock server tests, ConnectRetryAsync for real server tests
  • SendBufferTest: add ready wait, ConnectRetryAsync, increased CommandTimeout

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adjusts unit test thread synchronization to reduce intermittent failures (“flaps”) by allowing more time for worker threads to complete before assertions run.

Changes:

  • Increased thread Join timeouts from 1s to 10s in three NUID-related tests.
  • Aimed to reduce nondeterministic failures under slower CI conditions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/NATS.Client.CoreUnit.Tests/NuidTests.cs Outdated
Comment thread tests/NATS.Client.CoreUnit.Tests/NuidTests.cs Outdated
Comment thread tests/NATS.Client.CoreUnit.Tests/NuidTests.cs Outdated
@mtmk mtmk requested a review from scottf March 4, 2026 13:37
@mtmk mtmk changed the title Fix NuidTests flap Fix test flaps Mar 5, 2026
@mtmk mtmk marked this pull request as draft March 9, 2026 13:39
mtmk added 6 commits April 9, 2026 11:55
Add MockServer ready signal and use ConnectRetryAsync for mock server
tests to avoid race between acceptance loop startup and client connect.
Add explicit ConnectRetryAsync to Request_reply_many_test_max_count.
Increase Watcher_timeout_reconnect timeout from 30s to 60s to allow
margin for idle timeout detection on slow CI. Skip first ping from max
RTT assertion in SlowConsumer test since it catches the publish burst
tail. Increase CommandTimeout in Buffer_high_pressure_pub_test from 5s
to 30s for the 195MB high-pressure workload.
Apply the same MockServer ready signal and ConnectRetryAsync pattern
to ProtocolParserSizeCheckTest and PingCancellationTest to fix
connection race on net481 under CI load.
Use ConnectRetryAsync for real server PingCancellation tests. Add
MockServer ready wait, ConnectRetryAsync, and increased CommandTimeout
to SendBufferTest to handle 8MB publish under CI load on net481.
FakeServer only accepts one client, so ConnectRetryAsync consumes the
single accept slot on the first failed attempt, making all subsequent
retries fail. Use increased ConnectTimeout (10s) with plain ConnectAsync
instead.
Add ready wait and ConnectTimeout/ConnectRetryAsync to the two tests
that were missed in the previous pass (Valid_msg_still_works and
Valid_hmsg_still_works).
@mtmk mtmk marked this pull request as ready for review April 9, 2026 15:57
Copy link
Copy Markdown
Contributor

@scottf scottf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mtmk mtmk merged commit ae10764 into main Apr 10, 2026
49 of 50 checks passed
@mtmk mtmk deleted the Fix-NuidTests-flap branch April 10, 2026 08:08
This was referenced Apr 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants