fix(fcm): recover connection status after settings reload (#183) #184
Merged
) (#1120)

Issue #183: changing settings (which reloads the config entry) left the FCM connection status stuck on "Disconnected" until a full Home Assistant restart. The submitted diagnostics were self-contradictory: the receiver snapshot was healthy/STARTED while coordinator.fcm_status was degraded ("Push transport not ready; continuing with cached data").

Root cause (empirically reproduced, hypothesis H2): two receiver stores diverge across a refcount->0 release. _pop_fcm_receiver empties the per-entry dict fcm_receivers (read by is_push_ready via the provider getter), but _sync_legacy_fcm_receiver_alias only dropped the legacy fcm_receiver singleton (read by diagnostics) when it held the wrong type. A healthy, right-type receiver therefore survived as the singleton while the per-entry store was empty: is_push_ready() resolved nothing and returned False (rendered "Disconnected"), while diagnostics still reported the receiver as healthy/connected.

Fix: treat fcm_receivers as the single source of truth whenever it exists. When the per-entry dict holds no valid receiver (e.g. after the release path empties it on reload), drop the legacy alias as well. The legacy migration path (dict absent entirely, used by _get_fcm_receivers) is preserved unchanged.

This also makes the reload re-acquire reliably repopulate the store: with no stale singleton to reuse, _async_acquire_shared_fcm falls through to creating a fresh receiver that re-registers the per-entry store and providers, so is_push_ready recovers without a restart.

Tests:
- test_issue_183_reload_fcm_desync.py: alias cleared when dict empties, readiness and diagnostics stay consistent, runtime re-sync heals, and the absent-dict migration path is preserved.
- test_fcm_receiver.test_reload_reacquire_repopulates_receiver_store: full acquire -> release@0 -> re-acquire cycle on the real paths.
- test_coordinator_status: status stays non-CONNECTED while readiness is false and recovers on the next cycle once it is true.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
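The status behaviour that test_coordinator_status covers can be pictured with a small sketch. This is not the integration's coordinator: FcmState, FcmStatus, and poll_status are hypothetical names invented for illustration. Only the reason string and the "non-CONNECTED while readiness is false" rule come from the commit message.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class FcmState(Enum):
    """Hypothetical status values; the real enum may differ."""
    CONNECTED = "connected"
    DEGRADED = "degraded"


@dataclass
class FcmStatus:
    state: FcmState
    reason: Optional[str] = None


def poll_status(is_push_ready: Callable[[], bool]) -> FcmStatus:
    """Derive the status for one poll cycle from the readiness check alone."""
    if is_push_ready():
        return FcmStatus(FcmState.CONNECTED)
    # Reason text quoted from the issue report.
    return FcmStatus(
        FcmState.DEGRADED,
        "Push transport not ready; continuing with cached data",
    )


# Readiness is false during the first cycle and true during the next one.
readiness = {"ready": False}
first = poll_status(lambda: readiness["ready"])
readiness["ready"] = True
second = poll_status(lambda: readiness["ready"])
```

Because the status is recomputed from readiness on every cycle, it heals on its own once is_push_ready() starts returning True again. The bug was that readiness never recovered, not that the status logic was wrong.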
Summary
Fixes a regression where modifying the integration's settings (which triggers a config-entry reload) left the connection status stuck on Disconnected until Home Assistant was fully restarted. Reloading the integration alone did not recover it.
Fixes #183
Root cause
The FCM receiver is tracked in two places:
- fcm_receivers, the per-entry dict (the source of truth that api.is_push_ready() resolves through the domain provider), and
- fcm_receiver, the legacy singleton (read by diagnostics and a few fallback paths).
On a reload, the shared FCM receiver is released when the refcount drops to zero, which empties the per-entry fcm_receivers dict. However, _sync_legacy_fcm_receiver_alias only cleared the legacy singleton when it held the wrong type, so a healthy singleton survived the release. The result was a desynchronized state:
- diagnostics, reading the surviving singleton, reported the receiver as healthy (matching ref_count: 0 + start_count: 5 in the issue's diagnostic dump),
- is_push_ready() resolved the now-empty per-entry dict and returned False,
- the connection status (derived from is_push_ready()) showed Disconnected, and
- the coordinator reported "Push transport not ready; continuing with cached data", verbatim the fcm_status.reason from the report.
Because the stale singleton was reused on re-acquire, the per-entry store was never repopulated and the provider was never re-registered, so the condition persisted across reloads and only a full HA restart cleared it.
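The divergence can be reproduced in a toy model. The sketch below is an illustration under stated assumptions, not the integration's code: Receiver, DomainData, and the three helper functions are hypothetical stand-ins, and only the two store names (fcm_receivers, fcm_receiver) and the wrong-type rule are taken from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Receiver:
    """Hypothetical stand-in for the shared FCM receiver."""
    state: str = "STARTED"


@dataclass
class DomainData:
    """Toy model of the two stores; only the attribute names come from the PR."""
    fcm_receivers: Dict[str, Receiver] = field(default_factory=dict)  # per-entry store
    fcm_receiver: Any = None  # legacy singleton alias


def sync_alias_before_fix(data: DomainData) -> None:
    """Pre-fix rule as described: drop the alias only if it holds the wrong type."""
    if data.fcm_receiver is not None and not isinstance(data.fcm_receiver, Receiver):
        data.fcm_receiver = None


def is_push_ready(data: DomainData, entry_id: str) -> bool:
    """Readiness resolves through the per-entry store only."""
    return data.fcm_receivers.get(entry_id) is not None


def diagnostics_snapshot(data: DomainData) -> str:
    """Diagnostics read the legacy singleton."""
    return data.fcm_receiver.state if data.fcm_receiver is not None else "ABSENT"


receiver = Receiver()
data = DomainData(fcm_receivers={"entry-1": receiver}, fcm_receiver=receiver)

# Reload: the refcount reaches zero, the per-entry store is emptied,
# then the alias is synced with the old rule.
data.fcm_receivers.pop("entry-1")
sync_alias_before_fix(data)

print(is_push_ready(data, "entry-1"))  # False
print(diagnostics_snapshot(data))      # STARTED
```

The two reads disagree because they consult different stores, and nothing in the old sync rule reconciles them once the per-entry dict is empty.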
Fix
_sync_legacy_fcm_receiver_alias now treats the per-entry fcm_receivers dict as the single source of truth whenever it exists: if the dict is present but empty (post-release), the legacy singleton alias is cleared as well. This forces the next re-acquire down the fresh-receiver path, which rebuilds both the per-entry store and the provider registration, so is_push_ready() recovers without an HA restart and diagnostics stay consistent.
The legacy migration path (fcm_receivers entirely absent) is unchanged and remains guarded by a test.
Tests
- Regression tests reproduce the desync across the is_push_ready() / domain-provider / diagnostics paths, and assert recovery after the fix.
- Coordinator status tests confirm that fcm_status heals in the next poll cycle once readiness is correct.
- ruff, ruff format, mypy --strict clean; full suite green.
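The sync rule and the acquire -> release -> re-acquire cycle described above can be sketched as follows. This is a minimal model assuming a plain dict as the domain store; Receiver, sync_alias, acquire, release, and the "refcount" key are hypothetical names, and the real _sync_legacy_fcm_receiver_alias and _async_acquire_shared_fcm may differ in detail.

```python
from typing import Any, Dict


class Receiver:
    """Hypothetical stand-in for the shared FCM receiver."""


def sync_alias(data: Dict[str, Any]) -> None:
    """Post-fix rule: a present per-entry dict is the single source of truth."""
    alias = data.get("fcm_receiver")
    if alias is not None and not isinstance(alias, Receiver):
        data.pop("fcm_receiver", None)  # earlier rule: wrong type
        return
    receivers = data.get("fcm_receivers")
    if receivers is None:
        return  # legacy migration path: dict absent entirely, alias untouched
    if not any(isinstance(r, Receiver) for r in receivers.values()):
        data.pop("fcm_receiver", None)  # new rule: no valid receiver left


def acquire(data: Dict[str, Any], entry_id: str) -> Receiver:
    """Reuse the singleton if present, else create and register a fresh one."""
    receiver = data.get("fcm_receiver")
    if receiver is None:
        receiver = Receiver()
        data["fcm_receiver"] = receiver
        # Only the fresh path registers the per-entry store.
        data.setdefault("fcm_receivers", {})[entry_id] = receiver
    data["refcount"] = data.get("refcount", 0) + 1
    return receiver


def release(data: Dict[str, Any], entry_id: str) -> None:
    """Drop one reference; at zero, empty the per-entry store and sync the alias."""
    data["refcount"] -= 1
    if data["refcount"] == 0:
        data["fcm_receivers"].pop(entry_id, None)
        sync_alias(data)


def is_push_ready(data: Dict[str, Any], entry_id: str) -> bool:
    return data.get("fcm_receivers", {}).get(entry_id) is not None


# Reload cycle: acquire -> release at refcount 0 -> re-acquire.
data: Dict[str, Any] = {}
first = acquire(data, "entry-1")
release(data, "entry-1")
alias_after_release = data.get("fcm_receiver")
second = acquire(data, "entry-1")
ready_after_reload = is_push_ready(data, "entry-1")

# Absent-dict migration path: the alias must survive a sync.
legacy: Dict[str, Any] = {"fcm_receiver": Receiver()}
sync_alias(legacy)
```

With the alias cleared on release, the re-acquire cannot reuse a stale singleton, so it always takes the branch that repopulates fcm_receivers. Replacing sync_alias with a wrong-type-only check in this model leaves ready_after_reload False, which is the reported symptom.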