test: Improve resilience of MySQL, MSSQL, and NServiceBus integration tests #3656

Open
tippmar-nr wants to merge 2 commits into main from test/integration-test-flake-fixes
@tippmar-nr tippmar-nr commented Jun 24, 2026

Member

Addresses four sources of transient or timing-dependent integration-test failures observed in CI.

MySQL

Stored-procedure setup occasionally failed on transient errors and on re-runs where the procedure/table already existed. Added a shared MySqlRetryHelper that retries transient CreateProcedure failures and makes drops idempotent; wired into MySqlExerciser and MySqlConnectorExerciser.
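The retry-and-idempotent-drop idea above can be sketched as follows. This is a minimal illustration, not the MySqlRetryHelper added in this PR: the class name `RetrySketch`, both method names, the attempt count, the delay, and the caller-supplied `isTransient` predicate are all assumptions made for the example.

```csharp
using System;
using System.Threading;

// Hypothetical sketch; names and defaults are illustrative, not the PR's helper.
public static class RetrySketch
{
    // Runs the action and retries while isTransient(ex) returns true,
    // up to maxAttempts total attempts. Returns the attempt that succeeded.
    public static int ExecuteWithRetry(Action action, Func<Exception, bool> isTransient,
        int maxAttempts = 3, int delayMs = 500)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return attempt;
            }
            // The filter lets non-transient errors, and the final failed attempt, propagate.
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                Thread.Sleep(delayMs);
            }
        }
    }

    // IF EXISTS turns the drop into a no-op when the procedure is already gone,
    // so a re-run does not fail on leftover or missing state.
    // Assumes procedureName is a trusted, test-owned identifier.
    public static string IdempotentDropProcedure(string procedureName)
    {
        return "DROP PROCEDURE IF EXISTS `" + procedureName + "`";
    }
}
```

Passing the transient check in as a predicate keeps the sketch independent of any particular MySQL client library; a real helper would likely inspect the driver's exception type or error number instead.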

MSSQL truncation

MsSqlTruncationTests opted into a BaselinePayloadBytes size-envelope check that summed all collector payloads across a non-deterministic number of harvest cycles, producing flaky ±5% failures. Truncation correctness is already proven deterministically by the existing length == 4096 and EndsWith("...") assertions, so the envelope check and its companion Wait 5000 padding command are removed. The BaselinePayloadBytes mechanism remains in the shared fixture for possible future use.

NServiceBus

NsbSendTests asserts on a transaction trace (TryGetTransactionSample) but never waited for the trace harvest: it sped up only the metrics harvest and left the transaction-traces cycle at the 60s default. When the trace harvest had not yet fired, Assert.NotNull(transactionSample) failed. Added ConfigureFasterTransactionTracesHarvestCycle(10) and a WaitForLogLine(TransactionSampleLogLineRegex) so the read is deterministic.
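The "wait for the harvest log line before asserting" pattern can be illustrated with a small polling loop. This is a sketch only and does not reproduce the fixture's WaitForLogLine; `LogWaitSketch`, `WaitForLine`, the `readLines` delegate, and the poll interval are hypothetical names and values chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.RegularExpressions;
using System.Threading;

// Hypothetical sketch of polling a log until a line matches; not the fixture's API.
public static class LogWaitSketch
{
    // Re-reads the log via readLines until a line matches the pattern,
    // or throws TimeoutException once the timeout has elapsed.
    public static Match WaitForLine(Func<IEnumerable<string>> readLines, Regex pattern,
        TimeSpan timeout, int pollMs = 100)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            foreach (var line in readLines())
            {
                var match = pattern.Match(line);
                if (match.Success) return match;
            }

            if (clock.Elapsed >= timeout)
                throw new TimeoutException("No log line matched '" + pattern + "' within " + timeout + ".");

            Thread.Sleep(pollMs);
        }
    }
}
```

Blocking on the log line ties the assertion to the event it depends on (the harvest having fired) rather than to a fixed sleep, which is why shortening the harvest cycle and waiting for the line work together.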

Agent-log read race (harness)

WaitForLogLines reads the agent log while the agent writes it. A momentary incompatible share mode on the writer side made the reader's FileStream open throw a transient IOException ("being used by another process"), which the wait loop did not retry. The loop aborted the exercise on its first iteration, so no data was harvested and all of the test's metric assertions failed (seen on PostgresSqlExecuteScalarAsyncTestsCoreLatest). AgentLogFile.GetFileLines now retries the file open briefly (10 attempts, 100ms apart) so a read that races a write no longer fails the test. This fix applies to every test that reads the agent log.
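A retry around the file open, as described above, might look like the sketch below. It is not the code in AgentLogFile.GetFileLines: `LogReadSketch` and its method names are hypothetical, and only the 10 attempts and 100ms delay are taken from the description.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Hypothetical sketch; not the actual AgentLogFile implementation.
public static class LogReadSketch
{
    // Opens the file for reading while allowing another process to keep writing it.
    // Retries on IOException (which also covers a file that does not exist yet),
    // and lets the exception propagate after the final attempt.
    public static FileStream OpenForSharedRead(string path, int maxAttempts = 10, int delayMs = 100)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
            }
            catch (IOException) when (attempt < maxAttempts)
            {
                Thread.Sleep(delayMs);
            }
        }
    }

    // Reads every line currently in the file using the retried open.
    public static List<string> ReadLines(string path)
    {
        var lines = new List<string>();
        using (var stream = OpenForSharedRead(path))
        using (var reader = new StreamReader(stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

With these values the worst case adds about 0.9 seconds (nine sleeps of 100ms) before the exception surfaces, which is small next to a harvest cycle.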

Testing

  • NsbSendTestsFW48 passed 3/3 consecutive local runs after the NServiceBus fix.
  • PostgresSqlExecuteScalarAsyncTestsCoreLatest passed 2/2 local runs after the harness fix, confirming no regression to normal log reading. The share-mode race is not deterministically reproducible locally, so the harness retry was verified for non-regression rather than by forcing the original failure.
  • UnboundedIntegrationTests builds clean.

… tests

- MySQL: retry transient CreateProcedure failures and make DROP idempotent
  via a shared MySqlRetryHelper.
- MsSqlTruncation: drop the non-deterministic BaselinePayloadBytes envelope
  check (truncation is already proven by the length/ellipsis assertions);
  keep the fixture mechanism for future use.
- NsbSend: wait for the transaction_sample_data harvest and speed up the
  transaction-traces cycle so the trace-sample assertion is deterministic.
@tippmar-nr tippmar-nr marked this pull request as ready for review June 24, 2026 19:46
@tippmar-nr tippmar-nr requested a review from a team as a code owner June 24, 2026 19:46
WaitForLogLines reads the agent log while the agent writes it. A momentary
incompatible share mode on the writer side made the reader's FileStream open
throw a transient IOException, which the wait loop did not retry, aborting the
exercise on its first iteration. Retry the open briefly in
AgentLogFile.GetFileLines so a read that races a write no longer fails the test.
@tippmar-nr tippmar-nr marked this pull request as draft June 24, 2026 21:14
@tippmar-nr tippmar-nr marked this pull request as ready for review June 24, 2026 21:15
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.99%. Comparing base (8f8928d) to head (615fd63).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3656   +/-   ##
=======================================
  Coverage   81.99%   81.99%           
=======================================
  Files         511      511           
  Lines       34724    34724           
  Branches     4134     4134           
=======================================
+ Hits        28471    28473    +2     
+ Misses       5277     5276    -1     
+ Partials      976      975    -1     
Flag       Coverage Δ
Agent      82.94% <ø> (+<0.01%) ⬆️
Profiler   72.23% <ø> (ø)

Flags with carried forward coverage won't be shown.
see 1 file with indirect coverage changes
