Retry operational CASE setup after responder BUSY#72824
Draft
feasel0 wants to merge 3 commits into
Draft
Conversation
Co-authored-by: Copilot <copilot@github.com>
for more information, see https://pre-commit.ci
Contributor
|
Warning Gemini encountered an error creating the review. You can try again by commenting |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #72824 +/- ##
==========================================
+ Coverage 56.62% 56.66% +0.03%
==========================================
Files 1642 1642
Lines 113154 113159 +5
Branches 13235 13225 -10
==========================================
+ Hits 64079 64121 +42
+ Misses 49075 49038 -37 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
PR #72824: Size comparison from d9d3108 to 3908735 Full report (22 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, efr32, nrfconnect, psoc6, qpg, realtek, stm32)
|
|
PR #72824: Size comparison from d9d3108 to a92e88c Full report (33 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, efr32, esp32, nrfconnect, psoc6, qpg, realtek, stm32, telink)
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The problem
A Python
ReadAttribute()first callsGetConnectedDevice()before sending the Interaction Model read. The normal controller path requests only a small operational CASE setup attempt budget, and that budget is decremented as soon as operational discovery/setup begins.If the responder then sends a Secure Channel
BUSYstatus with a requested wait time,OperationalSessionSetup::OnResponderBusy()records the delay, butOnSessionEstablishmentError()only schedules a retry whenmRemainingAttempts > 0. With the attempt budget already exhausted, the stack surfacesCHIP_ERROR_BUSYback to the caller instead of honoring the peer's "wait, then retry" response.This appears as an intermittent read failure under stress: session/subscription recovery or other operational traffic creates timing pressure, the device responds
BUSY, and a plain one-shot read fails while trying to reacquire a CASE session. The next attempt typically succeeds once the device is no longer contended.There was a TODO comment in
ScheduleSessionSetupReattempt()about whetherBUSYshould restore retry budget, so this issue was clearly contemplated by the original author.The fix
Update
OperationalSessionSetup::OnResponderBusy()so that when a responderBUSYarrives after the normal attempt budget is exhausted, the setup object grants one bounded retry. BecauseOnResponderBusy()runs beforeOnSessionEstablishmentError()for the sameBUSY, the restored budget allows the existing retry path to schedule the peer-requested wait instead of failing.The retry is intentionally limited to one extra grant per setup object, tracked by a new
mGrantedBusyRetryAfterAttemptExhaustionflag, so this honors the responder's explicit retry request without creating an unbounded retry loop. Larger existing retry budgets are left untouched, and the normal busy-delay handling continues to use the responder-requested delay when scheduling the retry.Testing
Added
TestOperationalSessionSetupBusy.cppand wired it intosrc/app/tests/BUILD.gn.The new unit tests cover:
BUSYarrives after the attempt budget is exhaustedBUSYretry only once per setup object, even across multipleBUSYresponsesRelated issues
Fixes #33830