bluetooth-fw/nimble: fix discovery stop race causing KernelBG hang #1274
Open
gmarull wants to merge 1 commit into
`bt_driver_gatt_stop_discovery()` could block forever on
`xSemaphoreTake(s_discovery_stopped, portMAX_DELAY)` when a discovery
completed naturally between the in-progress check and the flag set.
KernelBG would then miss its watchdog and the watch reset into PRF.
Race sequence:
1. KernelBG enters stop, reads `s_discovery_in_progress = true`.
2. NimBLE host task fires the last discovery callback. Its top check
   sees `s_stop_discovery_requested = false`, so it takes the natural
   completion path, sets `s_discovery_in_progress = false`, and
   returns without giving the semaphore.
3. KernelBG resumes, sets `s_stop_discovery_requested = true`, and
   blocks on the semaphore forever.
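
The sequence above can be replayed deterministically in a small single-threaded model. This is only a sketch, not the driver code: the `DiscoveryModel` struct and every function name are hypothetical, the semaphore is reduced to a counter, and the stop branch of the callback (giving the semaphore when the stop flag is already set) is an assumption about the pre-fix code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three pieces of shared state. */
typedef struct {
  bool in_progress;     /* models s_discovery_in_progress */
  bool stop_requested;  /* models s_stop_discovery_requested */
  int stopped_sem;      /* models s_discovery_stopped (0 = empty, 1 = given) */
} DiscoveryModel;

/* Step 1: KernelBG reads the in-progress flag at the top of stop. */
static bool kernelbg_check_in_progress(const DiscoveryModel *m) {
  return m->in_progress;
}

/* Step 2: the last discovery callback runs on the NimBLE host task. */
static void host_last_callback(DiscoveryModel *m) {
  assert(m->in_progress);  /* callbacks only fire for a running discovery */
  if (m->stop_requested) {
    /* Assumed stop path: finish and wake the waiter. */
    m->in_progress = false;
    m->stopped_sem = 1;
    return;
  }
  /* Natural completion path: the semaphore is not given. */
  m->in_progress = false;
}

/* Step 3: KernelBG sets the stop flag, then waits on the semaphore.
 * Returns true if the wait could never be satisfied. */
static bool kernelbg_request_and_wait(DiscoveryModel *m) {
  m->stop_requested = true;
  if (m->stopped_sem > 0) {
    m->stopped_sem = 0;
    return false;
  }
  return true;  /* nobody is left to give: portMAX_DELAY never returns */
}

/* Replays steps 1-3 in order; returns true if KernelBG ends up stuck. */
bool replay_race(void) {
  DiscoveryModel m = { .in_progress = true, .stop_requested = false, .stopped_sem = 0 };
  if (!kernelbg_check_in_progress(&m)) {
    return false;  /* nothing to stop */
  }
  host_last_callback(&m);
  return kernelbg_request_and_wait(&m);
}
```

In this model `replay_race()` returns `true`: the callback has already cleared the in-progress flag by the time the stop flag is set, so no later termination point exists to give the semaphore.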
Fix by setting the stop flag before checking in-progress and by giving
the semaphore from every discovery termination point (new
`prv_signal_discovery_done()` helper) whenever a stop is pending. Drain
any stale signal at the start of stop. Also reset the stop flag and set
in-progress before issuing `ble_gattc_disc_all_svcs()` in
`bt_driver_gatt_start_discovery_range()` so callbacks on the NimBLE host
task cannot observe stale flags and silently abort the new discovery,
and free the context on failure.
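
The reordered stop path can be checked against the same kind of single-threaded model. Again a sketch under assumptions: `DiscoveryModel`, `signal_discovery_done()` and `stop_completes()` are hypothetical names, the semaphore is a plain counter, and the helper here also clears the in-progress flag, which is a modeling simplification rather than a claim about what `prv_signal_discovery_done()` does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three pieces of shared state. */
typedef struct {
  bool in_progress;     /* models s_discovery_in_progress */
  bool stop_requested;  /* models s_stop_discovery_requested */
  int stopped_sem;      /* models s_discovery_stopped (0 = empty, 1 = given) */
} DiscoveryModel;

/* Stands in for a discovery termination point: marks the discovery as
 * finished and gives the semaphore only when a stop is pending. */
static void signal_discovery_done(DiscoveryModel *m) {
  assert(m->in_progress);
  m->in_progress = false;
  if (m->stop_requested) {
    m->stopped_sem = 1;
  }
}

/* Runs the fixed stop sequence with the final callback interleaved at one
 * of three points:
 *   0 = before stop begins
 *   1 = after the stop flag is set, before the in-progress check
 *   2 = after the in-progress check, while KernelBG is waiting
 * Returns true if stop returns, false if it would block forever. */
bool stop_completes(int callback_point) {
  DiscoveryModel m = { .in_progress = true, .stop_requested = false, .stopped_sem = 0 };

  if (callback_point == 0) signal_discovery_done(&m);

  m.stopped_sem = 0;        /* drain any stale signal */
  m.stop_requested = true;  /* set the flag first ... */

  if (callback_point == 1) signal_discovery_done(&m);

  if (!m.in_progress) {
    return true;            /* ... then check: nothing left to wait for */
  }

  if (callback_point == 2) signal_discovery_done(&m);

  if (m.stopped_sem == 0) {
    return false;           /* the take would never be satisfied */
  }
  m.stopped_sem = 0;        /* the take succeeds */
  return true;
}
```

In this model all three interleavings return `true`. At point 1 the semaphore is given but never taken, which is the stale signal that the drain at the start of the next stop is meant to clear.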
Fixes FIRM-1895
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Gerard Marull-Paretas <gerard@teslabs.com>
jplexer approved these changes May 12, 2026
Member, Author:
@sjp4 pls try

Member:
I tried - got some more crashes (see https://linear.app/core-dev/issue/MOB-6961/crashlooped-to-prf) - I think it was on that version but not 100% sure