Treat Home Assistant state_reported keepalives as liveness signals #364

Merged
tomquist merged 4 commits into develop from claude/investigate-issue-363-W4Sh0
May 16, 2026

@tomquist tomquist commented May 16, 2026

https://claude.ai/code/session_01LwA9fBRZcLjRi7fEU4fHyw

Summary by CodeRabbit

  • Bug Fixes

    • Fixed Home Assistant sensors being falsely reported as stale when values remain unchanged (e.g., solar production at night). The system now better handles entity freshness tracking and includes a fallback mechanism to verify sensor status.
  • Tests

    • Expanded test coverage for stale sensor detection and freshness validation scenarios.

claude added 2 commits May 16, 2026 07:48
HA's subscribe_entities feed omits the state field from compressed diffs
when a sensor is reported with an unchanged value, sending only an
updated timestamp. The previous code skipped those events entirely, so
the per-entity update_time never refreshed and the 60 s staleness check
fired falsely for sensors whose value is legitimately constant (e.g.
solar production on an unloaded phase, an empty production sensor at
night). Recognize bare ``lu``/``lc`` diffs as keepalives so the
staleness check only trips when the websocket has truly gone silent.

Fixes #363.
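
The keepalive rule in the commit message above can be sketched as follows. This is a minimal illustration, not the code from the PR: `classify_diff` is a hypothetical helper, and the diff shape (a `"+"` mapping with the short keys `s` for state, `lu` for last_updated and `lc` for last_changed) is an assumption about Home Assistant's compressed subscribe_entities format.

```python
from typing import Any

# Key names assumed from HA's compressed-state format: "s" = state,
# "lu" = last_updated, "lc" = last_changed, "+" = added/changed fields.
STATE, LAST_UPDATED, LAST_CHANGED = "s", "lu", "lc"


def classify_diff(diff: dict[str, Any]) -> str:
    """Return "state", "keepalive", or "ignore" for one entity's diff."""
    additions = diff.get("+")
    if not isinstance(additions, dict):
        return "ignore"
    if STATE in additions:
        return "state"  # a new value arrived; the caller parses it
    if LAST_UPDATED in additions or LAST_CHANGED in additions:
        return "keepalive"  # timestamp only: refresh liveness, keep the value
    return "ignore"


print(classify_diff({"+": {"s": "12.5", "lu": 1778917680.0}}))
print(classify_diff({"+": {"lu": 1778917690.0}}))
print(classify_diff({"+": {"a": {"unit_of_measurement": "W"}}}))
```

A caller would refresh the per-entity update time on both "state" and "keepalive", but only overwrite the cached value on "state".
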
…ensors

HA's subscribe_entities only forwards EVENT_STATE_CHANGED; sensors whose
value doesn't change (e.g. solar production on an unloaded phase) fire
EVENT_STATE_REPORTED instead and never reach our websocket, so the
per-entity push timer drifts past the 60 s staleness threshold even
though HA itself is current.

When local push silence crosses the threshold, fan out parallel
GET /api/states/{entity} requests bounded by a 1 s total wall-clock
budget (so a battery's UDP request never stalls). HA's response includes
last_reported, which is mutated on every state write including same-value
reports — use that as the authoritative freshness signal. If HA's own
last_reported is also stale, leave the local cache untouched and let the
existing staleness check raise.

Also recognize bare lu/lc diffs from subscribe_entities (state unchanged
but attributes changed) as liveness keepalives, sparing the REST round
trip in that narrower case.

Fixes #363.
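
The bounded fan-out described in this commit message could look roughly like the sketch below. It is not the PR's `_refresh_stale_via_rest()`: `refresh_with_budget` and `fake_fetch` are hypothetical names, and the fetch coroutine stands in for the real GET /api/states/{entity} request. The point it illustrates is that `asyncio.wait` with a timeout caps the total wall-clock cost regardless of how many requests are slow.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def refresh_with_budget(
    entity_ids: list[str],
    fetch: Callable[[str], Awaitable[None]],
    budget_seconds: float = 1.0,
) -> set[str]:
    """Run fetch() for every entity in parallel; return the ids that finished."""
    if not entity_ids:
        return set()
    tasks = {asyncio.ensure_future(fetch(eid)): eid for eid in entity_ids}
    done, pending = await asyncio.wait(list(tasks), timeout=budget_seconds)
    for task in pending:
        task.cancel()  # give up on slow requests instead of blocking the caller
    # Let cancelled tasks unwind before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return {tasks[t] for t in done if not t.cancelled() and t.exception() is None}


async def demo() -> tuple[set[str], float]:
    async def fake_fetch(entity_id: str) -> None:
        # Hypothetical stand-in for GET /api/states/{entity_id}.
        await asyncio.sleep(5.0 if entity_id == "sensor.slow" else 0.01)

    start = time.monotonic()
    finished = await refresh_with_budget(
        ["sensor.fast", "sensor.slow"], fake_fetch, budget_seconds=0.2
    )
    return finished, time.monotonic() - start


finished, elapsed = asyncio.run(demo())
print(finished, round(elapsed, 2))
```

Entities whose request did not finish inside the budget simply keep their old timestamp, so the existing staleness check still raises for them.
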

coderabbitai Bot commented May 16, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 85b2695d-5c99-4683-956f-351332119978

📥 Commits

Reviewing files that changed from the base of the PR and between e878701 and 6f5ee90.

📒 Files selected for processing (2)
  • src/astrameter/powermeter/homeassistant.py
  • src/astrameter/powermeter/homeassistant_test.py

Walkthrough

This PR fixes false stale-state errors in Home Assistant sensor integration by detecting websocket keepalives and adding a REST fallback. Sensors with constant values no longer incorrectly report as stale. The fix includes compressed-diff keepalive recognition, a bounded parallel REST refresh mechanism, and comprehensive test coverage.

Changes

Home Assistant stale state detection fix via websocket keepalives and REST fallback

| Layer / File(s) | Summary |
| --- | --- |
| Websocket keepalive detection (src/astrameter/powermeter/homeassistant.py, src/astrameter/powermeter/homeassistant_test.py) | New constants _HA_LU, _HA_LC, REST_REFRESH_TIMEOUT_SECONDS and datetime imports enable recognition of compressed-diff keepalives (lu/lc updates without state values). Compressed-diff handler treats keepalives as staleness refreshes via new _mark_entity_alive() helper. Websocket loop reconnection comment clarifies stale-until-fresh semantics. Tests verify state_reported event refresh behavior and message signaling on keepalives. |
| REST staleness fallback mechanism (src/astrameter/powermeter/homeassistant.py, src/astrameter/powermeter/homeassistant_test.py) | New _build_rest_state_url() generates entity state endpoint URLs. _refresh_stale_via_rest() implements bounded parallel polling: identifies locally stale entities, issues parallel GET /api/states/{entity_id} requests with wall-clock timeout, parses HA's last_reported/last_updated, validates age against max_state_age_seconds, and conditionally updates cache only if HA's report is fresh. Integration into get_powermeter_watts() triggers refresh before returning cached values. Extensive test harness with fake HTTP session/response objects verifies refresh validation, timeout bounds, error handling, and URL scheme/path_prefix correctness. |
| User-facing documentation (CHANGELOG.md) | Fixed entry documents that Home Assistant sensors no longer falsely report as stale when values remain unchanged (issue #363). |
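
The freshness validation summarized for the REST fallback can be illustrated with a small standalone sketch. `ha_report_age` is a hypothetical helper, not a function from the PR; it only assumes that the REST response carries ISO 8601 strings under `last_reported` or `last_updated`, as the walkthrough states.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def ha_report_age(data: dict[str, Any], now: datetime) -> Optional[float]:
    """Seconds since HA last saw a report, or None if it cannot be determined."""
    reported_iso = data.get("last_reported") or data.get("last_updated")
    if not isinstance(reported_iso, str):
        return None
    try:
        reported = datetime.fromisoformat(reported_iso)
    except ValueError:
        return None
    if reported.tzinfo is None:
        reported = reported.replace(tzinfo=timezone.utc)  # assume UTC if naive
    # Clamp to zero so small clock skew never yields a negative age.
    return max(0.0, (now - reported).total_seconds())


now = datetime(2026, 5, 16, 8, 0, 0, tzinfo=timezone.utc)
fresh = {"state": "0.0", "last_reported": "2026-05-16T07:59:30+00:00"}
print(ha_report_age(fresh, now))            # 30.0
print(ha_report_age({"state": "0.0"}, now))  # None
```

A caller would accept the REST snapshot only when the returned age is not None and does not exceed max_state_age_seconds.
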

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 31.71% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main technical change: treating Home Assistant state_reported keepalives as liveness signals to prevent false staleness errors. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@tomquist tomquist marked this pull request as ready for review May 16, 2026 08:50
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/astrameter/powermeter/homeassistant.py`:
- Around line 327-376: The REST handler currently writes the fetched value and
sets _entity_update_time to self._clock(), which discards Home Assistant's
reported timestamp and can overwrite newer websocket updates; in
_apply_rest_state use the parsed reported_dt (from "last_reported" or
"last_updated") as the source-of-truth timestamp and only apply the REST
snapshot if reported_dt is within _max_state_age_seconds and is strictly newer
than any existing cached timestamp for that entity (compare reported_dt to the
existing _entity_update_time converted to a compatible timezone-aware datetime
or store timestamps consistently), then set _entity_values[entity_id] and
_entity_update_time[entity_id] from reported_dt (not self._clock()), and finally
call _check_entities_ready() and _message_event.set() as before; reference
symbols: _apply_rest_state, _entity_values, _entity_update_time,
_max_state_age_seconds, _clock, _check_entities_ready, _message_event.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0d9eb536-14e8-40d0-9ba1-11b3c5461945

📥 Commits

Reviewing files that changed from the base of the PR and between 15f62db and e878701.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • src/astrameter/powermeter/homeassistant.py
  • src/astrameter/powermeter/homeassistant_test.py

Comment on lines +327 to +376
    async def _fetch_rest_state(self, entity_id: str) -> None:
        assert self._session is not None
        url = self._build_rest_state_url(entity_id)
        headers = {"Authorization": f"Bearer {self.access_token}"}
        try:
            async with self._session.get(url, headers=headers) as resp:
                if resp.status != 200:
                    logger.debug(
                        "Home Assistant REST refresh for %s: HTTP %s",
                        entity_id,
                        resp.status,
                    )
                    return
                data = await resp.json()
        except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
            logger.debug(
                "Home Assistant REST refresh for %s failed: %s", entity_id, exc
            )
            return
        if isinstance(data, dict):
            self._apply_rest_state(entity_id, data)

    def _apply_rest_state(self, entity_id: str, data: dict[str, Any]) -> None:
        state_val = data.get("state")
        if state_val in (None, "unknown", "unavailable"):
            return
        try:
            value = float(state_val)  # type: ignore[arg-type]
        except (ValueError, TypeError):
            return
        # Trust HA's ``last_reported`` (mutated on every state write).
        # If HA itself hasn't seen an update within the staleness window,
        # don't refresh local cache — let the staleness check raise.
        if self._max_state_age_seconds > 0:
            reported_iso = data.get("last_reported") or data.get("last_updated")
            if not isinstance(reported_iso, str):
                return
            try:
                reported_dt = datetime.fromisoformat(reported_iso)
            except ValueError:
                return
            if reported_dt.tzinfo is None:
                reported_dt = reported_dt.replace(tzinfo=timezone.utc)
            ha_age = (datetime.now(timezone.utc) - reported_dt).total_seconds()
            if ha_age > self._max_state_age_seconds:
                return
        self._entity_values[entity_id] = value
        self._entity_update_time[entity_id] = self._clock()
        self._check_entities_ready()
        self._message_event.set()
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Apply REST refreshes using HA’s actual age, and only if the cache is still older than the request.

Lines 373-374 currently reset _entity_update_time to self._clock() after any fresh REST response. That discards HA’s reported age, so a value that was already almost stale can get another full freshness window after one REST hit. The same path also unconditionally overwrites the cache after async I/O, so a websocket update that lands while the GET is in flight can be replaced by an older REST snapshot.

Suggested direction
-    async def _fetch_rest_state(self, entity_id: str) -> None:
+    async def _fetch_rest_state(self, entity_id: str) -> None:
         assert self._session is not None
+        request_started = self._clock()
         url = self._build_rest_state_url(entity_id)
         headers = {"Authorization": f"Bearer {self.access_token}"}
         try:
             async with self._session.get(url, headers=headers) as resp:
                 if resp.status != 200:
@@
-        if isinstance(data, dict):
-            self._apply_rest_state(entity_id, data)
+        if isinstance(data, dict):
+            self._apply_rest_state(entity_id, data, request_started)

-    def _apply_rest_state(self, entity_id: str, data: dict[str, Any]) -> None:
+    def _apply_rest_state(
+        self, entity_id: str, data: dict[str, Any], request_started: float
+    ) -> None:
         state_val = data.get("state")
         if state_val in (None, "unknown", "unavailable"):
             return
         try:
             value = float(state_val)  # type: ignore[arg-type]
         except (ValueError, TypeError):
             return
+        current_update = self._entity_update_time.get(entity_id)
+        if current_update is not None and current_update > request_started:
+            return
+
+        ha_age = 0.0
         if self._max_state_age_seconds > 0:
             reported_iso = data.get("last_reported") or data.get("last_updated")
             if not isinstance(reported_iso, str):
                 return
             try:
                 reported_dt = datetime.fromisoformat(reported_iso)
             except ValueError:
                 return
             if reported_dt.tzinfo is None:
                 reported_dt = reported_dt.replace(tzinfo=timezone.utc)
-            ha_age = (datetime.now(timezone.utc) - reported_dt).total_seconds()
+            ha_age = max(
+                0.0, (datetime.now(timezone.utc) - reported_dt).total_seconds()
+            )
             if ha_age > self._max_state_age_seconds:
                 return
         self._entity_values[entity_id] = value
-        self._entity_update_time[entity_id] = self._clock()
+        self._entity_update_time[entity_id] = self._clock() - ha_age
         self._check_entities_ready()
         self._message_event.set()
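
To make the effect of back-dating concrete, here is a small worked example of the arithmetic. It is only an illustration of the reviewer's suggestion, assuming a 60 s threshold and a plain float clock; `seconds_until_stale` is a hypothetical helper, not part of the PR.

```python
MAX_STATE_AGE = 60.0  # seconds, the staleness threshold described in this PR


def seconds_until_stale(update_time: float, now: float) -> float:
    """How long a cache entry stamped at update_time stays fresh after now."""
    return MAX_STATE_AGE - (now - update_time)


now = 1000.0    # local clock reading when the REST response is applied
ha_age = 55.0   # HA last saw a report 55 s ago

reset_to_now = now          # original behaviour: stamp with the local clock
back_dated = now - ha_age   # suggested behaviour: subtract HA's reported age

print(seconds_until_stale(reset_to_now, now))  # 60.0
print(seconds_until_stale(back_dated, now))    # 5.0
```

With the reset, a reading that is already 55 s old is trusted for another 60 s, so it can be served at up to 115 s of real age; with back-dating it expires 5 s later, when HA's own report crosses the threshold.
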
Suggested change
    async def _fetch_rest_state(self, entity_id: str) -> None:
        assert self._session is not None
        request_started = self._clock()
        url = self._build_rest_state_url(entity_id)
        headers = {"Authorization": f"Bearer {self.access_token}"}
        try:
            async with self._session.get(url, headers=headers) as resp:
                if resp.status != 200:
                    logger.debug(
                        "Home Assistant REST refresh for %s: HTTP %s",
                        entity_id,
                        resp.status,
                    )
                    return
                data = await resp.json()
        except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
            logger.debug(
                "Home Assistant REST refresh for %s failed: %s", entity_id, exc
            )
            return
        if isinstance(data, dict):
            self._apply_rest_state(entity_id, data, request_started)

    def _apply_rest_state(
        self, entity_id: str, data: dict[str, Any], request_started: float
    ) -> None:
        state_val = data.get("state")
        if state_val in (None, "unknown", "unavailable"):
            return
        try:
            value = float(state_val)  # type: ignore[arg-type]
        except (ValueError, TypeError):
            return
        current_update = self._entity_update_time.get(entity_id)
        if current_update is not None and current_update > request_started:
            return
        ha_age = 0.0
        # Trust HA's ``last_reported`` (mutated on every state write).
        # If HA itself hasn't seen an update within the staleness window,
        # don't refresh local cache — let the staleness check raise.
        if self._max_state_age_seconds > 0:
            reported_iso = data.get("last_reported") or data.get("last_updated")
            if not isinstance(reported_iso, str):
                return
            try:
                reported_dt = datetime.fromisoformat(reported_iso)
            except ValueError:
                return
            if reported_dt.tzinfo is None:
                reported_dt = reported_dt.replace(tzinfo=timezone.utc)
            ha_age = max(
                0.0, (datetime.now(timezone.utc) - reported_dt).total_seconds()
            )
            if ha_age > self._max_state_age_seconds:
                return
        self._entity_values[entity_id] = value
        self._entity_update_time[entity_id] = self._clock() - ha_age
        self._check_entities_ready()
        self._message_event.set()
If a state_changed push arrives while a REST staleness fallback is in
flight for the same entity, the WS value is at least as fresh as
anything the REST round-trip can return. Snapshot the local update
timestamp before the await and skip applying the REST response if it
changed, so the push wins instead of being clobbered by a stale REST
read.
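
The snapshot-and-compare guard described in this commit message can be sketched in isolation. This is a simplified model, not the merged code: the `Cache` class, its manual `clock` counter and the `slow_fetch` stand-in for the REST round trip are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


class Cache:
    def __init__(self) -> None:
        self.values: dict[str, float] = {}
        self.update_time: dict[str, float] = {}
        self.clock = 0.0  # hypothetical manual clock, advanced on every write

    def apply_push(self, entity_id: str, value: float) -> None:
        """Record a websocket push."""
        self.clock += 1.0
        self.values[entity_id] = value
        self.update_time[entity_id] = self.clock

    async def refresh_via_rest(
        self, entity_id: str, fetch: Callable[[str], Awaitable[float]]
    ) -> bool:
        before = self.update_time.get(entity_id)  # snapshot before the await
        value = await fetch(entity_id)
        if self.update_time.get(entity_id) != before:
            return False  # a push landed while the request was in flight; it wins
        self.clock += 1.0
        self.values[entity_id] = value
        self.update_time[entity_id] = self.clock
        return True


async def demo() -> tuple[bool, float]:
    cache = Cache()
    cache.apply_push("sensor.power", 100.0)

    async def slow_fetch(entity_id: str) -> float:
        await asyncio.sleep(0.05)
        return 100.0  # older snapshot returned by the REST read

    task = asyncio.create_task(cache.refresh_via_rest("sensor.power", slow_fetch))
    await asyncio.sleep(0.01)                # let the REST request start
    cache.apply_push("sensor.power", 250.0)  # websocket push arrives mid-flight
    applied = await task
    return applied, cache.values["sensor.power"]


print(asyncio.run(demo()))  # (False, 250.0)
```

Because the comparison happens after the only await point, no lock is needed in a single-threaded event loop; the check and the write run without interruption.
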
@tomquist tomquist merged commit 847d9cd into develop May 16, 2026
13 checks passed
@tomquist tomquist deleted the claude/investigate-issue-363-W4Sh0 branch May 16, 2026 09:08