[#31522] tserver: Fix VerifyPgClientServiceCleanupQueue on macOS #31523
ellabaron-code wants to merge 1 commit into
Conversation
The test opens 30 user connections and asserts that PgClientService has exactly 31 sessions (30 + 1 yb_ash bgworker). On macOS the count was 32.

The extra session is the internal libpq backend spawned by TriggerRelcacheInitConnection. The first user Connect() triggers it because no per-DB relcache init file exists on a fresh cluster (relcache.c:6742). The internal backend exits promptly, but PgClientService waits a hardcoded delay after the backend's RPC channel shuts down before calling getsid(pid) to confirm the process is gone and removing the session. That delay was 250ms on macOS vs. 50ms on Linux release, and the test asserts ~230ms after the libpq close, so on macOS the session is still in sessions_ when the assertion runs.

The 250ms macOS branch was added in commit a6c5ea1 to fix PgSharedMemTest.ConnectionShutdown. That test still passes with the unified 50ms delay, so remove the macOS-specific branch.
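The timing argument above can be sketched as a small model. This is an illustration only, not the PgClientService code: the function names are hypothetical, and the model assumes the session is dropped exactly when the delay expires, ignoring the getsid(pid) check.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::milliseconds;

// Simplified model (assumption): the session of an exited backend stays in
// sessions_ until cleanup_delay has elapsed since its RPC channel shut down.
// The real code additionally calls getsid(pid) to confirm the process is gone.
constexpr bool SessionStillRegistered(milliseconds since_shutdown,
                                      milliseconds cleanup_delay) {
  return since_shutdown < cleanup_delay;
}

// Sessions the test would observe: the user connections, the yb_ash bgworker,
// and the relcache-init backend if it has not been cleaned up yet.
constexpr int ObservedSessions(int user_connections,
                               milliseconds since_shutdown,
                               milliseconds cleanup_delay) {
  return user_connections + 1 +
         (SessionStillRegistered(since_shutdown, cleanup_delay) ? 1 : 0);
}

// Hypothetical helper replaying the two scenarios from the description.
inline void CheckScenarios() {
  const milliseconds assert_time(230);  // assertion runs ~230ms after libpq close
  assert(ObservedSessions(30, assert_time, milliseconds(250)) == 32);  // old macOS delay
  assert(ObservedSessions(30, assert_time, milliseconds(50)) == 31);   // unified delay
}
```

With the 250ms delay the assertion lands inside the cleanup window and sees 32 sessions; with 50ms the window has already closed and the count is 31.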
Code Review
This pull request removes the platform-specific delay for macOS in the ListenConnectionShutdown callback within PgClientServiceImpl, standardizing the delay using RegularBuildVsSanitizers(50ms, 1000ms) across all platforms. I have no feedback to provide.
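A rough sketch of what unifying the delay means, under stated assumptions: PickByBuild, Platform, BuildKind, DelayBefore, and DelayAfter are hypothetical names for illustration, and the shape of the old macOS branch is assumed rather than copied from the source. The real RegularBuildVsSanitizers is called with only the two values, so it determines the build kind by other means.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::milliseconds;

enum class Platform { kLinux, kMacOS };
enum class BuildKind { kRegular, kSanitizer };

// Hypothetical stand-in for RegularBuildVsSanitizers: picks a value by build
// kind. Here the build kind is an explicit parameter so both cases can be shown.
template <class T>
constexpr T PickByBuild(BuildKind build, T regular, T sanitizer) {
  return build == BuildKind::kSanitizer ? sanitizer : regular;
}

// Before (assumed shape): macOS took its own 250ms branch.
constexpr milliseconds DelayBefore(Platform platform, BuildKind build) {
  return platform == Platform::kMacOS
             ? milliseconds(250)
             : PickByBuild(build, milliseconds(50), milliseconds(1000));
}

// After: every platform uses the same build-dependent value.
constexpr milliseconds DelayAfter(Platform /*platform*/, BuildKind build) {
  return PickByBuild(build, milliseconds(50), milliseconds(1000));
}

// Hypothetical helper comparing the two selections.
inline void CheckDelays() {
  assert(DelayBefore(Platform::kMacOS, BuildKind::kRegular) == milliseconds(250));
  assert(DelayAfter(Platform::kMacOS, BuildKind::kRegular) == milliseconds(50));
  assert(DelayAfter(Platform::kLinux, BuildKind::kSanitizer) == milliseconds(1000));
}
```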
trigger jenkins
Jenkins build has been triggered. Results will be posted once it completes. CSI JenkinsBot
✅ Jenkins build for commit Passed: 12 🔨 DB Build/Test Job Summary
JenkinsBot
The PR summary does not explain why it fixes this test.
Also, please note that a6c5ea1 was fixing the test in Jenkins, where we used overloaded machines to run tests. As a result, we now run only a minimal subset of tests on Mac in Jenkins.
The WaitFor at line 261 isn't where this fails; the failure is the immediate ASSERT_EQ at line 258, right after the connection loop.
Timeline from the test log:
08:41:20.845679 - relcache init connection request to "yugabyte" starts (fires once on first Connect to a fresh DB)
I'd prefer to keep the fix as-is. Reliable macOS test runs matter to us because we do most of our development on macOS, and aligning the delay with what every other platform already uses removes a long-standing source of flake without affecting anything else. Happy to add a code comment explaining the timing if that helps.
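To make the distinction between a one-shot assertion and a polling wait concrete, here is a minimal sketch. WaitUntil and FakeSessions are hypothetical and are not the test's actual helpers; the fake counter simply reports 32 for its first two reads and 31 afterwards.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical polling helper, loosely analogous to a WaitFor-style utility.
inline bool WaitUntil(const std::function<bool()>& condition,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds poll_interval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  for (;;) {
    if (condition()) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(poll_interval);
  }
}

// Fake session counter (assumption): reports 32 sessions for the first two
// reads, then 31 once the lingering session has been "cleaned up".
struct FakeSessions {
  int reads = 0;
  int Count() { return ++reads <= 2 ? 32 : 31; }
};

inline void CompareImmediateAndPolling() {
  FakeSessions immediate;
  assert(immediate.Count() != 31);  // a one-shot check sees the extra session

  FakeSessions polled;
  const bool settled = WaitUntil([&polled] { return polled.Count() == 31; },
                                 std::chrono::milliseconds(1000),
                                 std::chrono::milliseconds(1));
  assert(settled);  // a polling check tolerates the cleanup delay
  (void)settled;    // avoids an unused-variable warning when NDEBUG is set
}
```

A one-shot check is sensitive to how long cleanup takes, which is why the length of the delay decides whether the immediate assertion passes.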
Summary
Removes the macOS-specific 250ms branch in PgClientService's session cleanup delay so all platforms use 50ms. The macOS delay caused VerifyPgClientServiceCleanupQueue to fail because the internal libpq backend spawned by TriggerRelcacheInitConnection on the first Connect() was still in sessions_ when the assertion ran (~230ms after libpq close).
The 250ms macOS branch was added in a6c5ea1 to fix PgSharedMemTest.ConnectionShutdown, which still passes with the unified 50ms delay.
Fixes #31522.
Test plan
./yb_build.sh release --cxx-test pg_client-test --gtest_filter PgClientTest.VerifyPgClientServiceCleanupQueue passes.
PgSharedMemTest.ConnectionShutdown still passes with the unified 50ms delay (regression check for the original a6c5ea1 motivation).