KAFKA-17999: Fix flaky DynamicConnectionQuotaTest testDynamicConnectionQuota #21354
+17
−0
Summary
Fix flaky `DynamicConnectionQuotaTest#testDynamicConnectionQuota` by waiting until the broker's `max.connections.per.ip.overrides` is actually applied in `ConnectionQuotas` before running the second verification.

Background
`testDynamicConnectionQuota` dynamically updates `max.connections.per.ip.overrides` from 5 to 7 and immediately starts `verifyMaxConnections(7, ...)`.
In practice, the config update is propagated asynchronously: the broker `SocketServer` reconfigure is triggered, but there is a short window in which the data-plane acceptor still evaluates new connections against the old override state (i.e., 5). This can cause one of the "prefill" sockets in the second verification to be rejected under max=5, breaking the test's assumptions and leading to intermittent failures.

Changes
- Add a helper to `ConnectionQuotas` to check whether an override for a given IP is present.
- In `DynamicConnectionQuotaTest`, after altering `max.connections.per.ip.overrides`, wait until the broker's `ConnectionQuotas` observes 127.0.0.1 -> 7 before executing the second `verifyMaxConnections(...)`.

Reproduction / Verification
- Reproduced the failure of `testDynamicConnectionQuota` when the override was not yet applied during the second verification.
- Test failure ratio in my local environment:

[Figure: flaky / success diagram]
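
For reviewers who want the idea in isolation, the wait described under Changes can be sketched roughly as below. This is a self-contained Java illustration, not the code in this PR: `FakeConnectionQuotas`, `hasOverride`, and `waitUntil` are hypothetical names, and the background task only simulates the asynchronous reconfigure described under Background. The real test would presumably rely on Kafka's existing polling utilities (for example `TestUtils.waitUntilTrue`) rather than a hand-rolled loop.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class OverrideWaitSketch {

    // Hypothetical stand-in for the broker's ConnectionQuotas (not Kafka's real class).
    public static final class FakeConnectionQuotas {
        private final Map<String, Integer> overrides = new ConcurrentHashMap<>();

        public void updateOverride(String ip, int max) {
            overrides.put(ip, max);
        }

        // Hypothetical accessor: true once the given IP maps to the expected limit.
        public boolean hasOverride(String ip, int expectedMax) {
            Integer current = overrides.get(ip);
            return current != null && current == expectedMax;
        }
    }

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        FakeConnectionQuotas quotas = new FakeConnectionQuotas();
        quotas.updateOverride("127.0.0.1", 5);

        // Simulate the asynchronous propagation of the new override (5 -> 7).
        CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            quotas.updateOverride("127.0.0.1", 7);
        });

        // Only proceed to the second verification once the new limit is visible.
        boolean applied = waitUntil(() -> quotas.hasOverride("127.0.0.1", 7), 5_000, 10);
        System.out.println("override applied: " + applied);
    }
}
```

Gating on the observed state rather than sleeping for a fixed interval keeps the test fast when propagation is quick and still bounded when it is slow.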