Ensure all mqtt messages report state to core #140001

elupus · 2025-03-06T21:18:27Z

Proposed change

Core will filter the state of entities and report state changed events in case of changed data and state reported event if no data was changed.

This allow helpers like derivate and integrate to work correctly with constant entities as well as things like last_reported.

Type of change

Dependency upgrade
Bugfix (non-breaking change which fixes an issue)
New integration (thank you!)
New feature (which adds functionality to an existing integration)
Deprecation (breaking change to happen in the future)
Breaking change (fix/feature causing existing functionality to break)
Code quality improvements to existing code or addition of tests

Additional information

This PR fixes or closes issue: fixes #
This PR is related to issue: MQTT Message Not Updating last_reported #121978 and Derivative integration gives incorrect values when source does not change #83496
Link to documentation pull request:
Link to developer documentation pull request:
Link to frontend pull request:

Checklist

The code change is tested and works locally.
Local tests pass. Your PR cannot be merged unless tests pass
There is no commented out code in this PR.
I have followed the development checklist
I have followed the perfect PR recommendations
The code has been formatted using Ruff (ruff format homeassistant tests)
Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

Documentation added/updated for www.home-assistant.io

If the code communicates with devices, web services, or third-party tools:

The manifest file has all fields filled out correctly.
Updated and included derived files by running: python3 -m script.hassfest.
New or updated dependencies have been added to requirements_all.txt.
Updated by running python3 -m script.gen_requirements_all.
For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.

To help with the load of incoming pull requests:

I have reviewed two other open pull requests in this repository.

Core will filter the state of entities and report state changed events in case of changed data and state reported event if no data was changed. This allow helpers like derivate and integrate to work correctly with constant entities as well as things like last_reported.

home-assistant · 2025-03-06T21:18:35Z

Hey there @emontnemery, @jbouwh, @bdraco, mind taking a look at this pull request as it has been labeled with an integration (mqtt) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of mqtt can trigger bot actions by commenting:

@home-assistant close Closes the pull request.
@home-assistant rename Awesome new title Renames the pull request.
@home-assistant reopen Reopen the pull request.
@home-assistant unassign mqtt Removes the current integration label and assignees on the pull request, add the integration domain after the command.
@home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the pull request.
@home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the pull request.

elupus · 2025-03-06T21:20:30Z

This was originally introduced in #100438 but has since been reworked a bit. Was any perf measurements done during that pull?

frenck · 2025-03-06T21:21:14Z

@elupus Please update the PR title, we do not use conventional commits.

Thanks 👍

../Frenck

jbouwh · 2025-03-06T21:25:46Z

I'd like to have a closer look at this change before it is merged.

homeassistant/components/mqtt/entity.py

home-assistant · 2025-03-06T21:41:04Z

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

jbouwh · 2025-03-06T21:52:17Z

When a payload arrives an is processed by the entity platform the attribute filter is used to detect if any attribute was changed. If not, the method returns. Adding a state write would cause additional overhead. To add it back where it was in the past would require a major refactoring. It was never intended to always do a state write an all entities when a message comes in.

elupus · 2025-03-06T22:11:25Z

There is an open issue with a test still. Seems the event entity relied on this behaviour in test_handling_retained_event_payloads which i can't fully figure out. Sort of seem it ignored messages that did not change the event data. But sort of did it in the background.

I do understand that this is a performance tradeoff. I kinda do like last_reported info and being able to rely on reported events. I suppose it could be opt-in, but those type of config choices are hard for anybody to get right.

bdraco

While getting last_reported is nice, there are so many power sensors (mostly from Z2M) on the market that flood the state machine with data we would effectively be forcing those users to buy better hardware or move to a different solution. I think this should be opt-in.

erkr · 2025-03-07T08:32:48Z

BDraco's concern is a valid one. Maybe wise to rate limit 'last_reported' updates to something like max 1 per second per device

jbouwh · 2025-03-07T10:09:24Z

Maybe wise to rate limit 'last_reported' updates to something like max 1 per second per device

Not sure that will be enough, as the MQTT protocol has some overhead in continuesly reporting states and we do not want to write states when they are not changes. For sensors that need consequent writing we have force_update, but even using that I would would not recomment to use.

erkr · 2025-03-07T10:45:44Z

Maybe wise to rate limit 'last_reported' updates to something like max 1 per second per device
... force_update, but even using that I would would not recomment to use.

Basically, I'm fine to use something like force_update for specific sensors only (in case of performance impact ).
But currently force_update does more to my understanding. It also updated last_updated and last_changed, even when both the state and attributes didn't alter. That has the side effect it unnecessarily triggers state event used in automations and templates.

From performance perspective I seriously doubt that not updating last_reported solves the performance impact due power devices and quality sensors spamming MQTT:

The main hit is in MQTT broker itself and the MQTT integration has to register/receive all topic updates anyway.
Second hit is the MQTT integration that compares the content to decide to update the entity state (last_updated and/or last_changed)
Only leaving out the last hit;, updating last_reported in case state was not changed, is probably not significant
If it is still noticeable on the total penalty (a given), a rate limit is what I proposed

jbouwh · 2025-03-07T11:10:38Z

Another note to make is that a lot of MQTT platforms use value templates. In many cases, there is no active update, when the template returns empty. This PR is not addressing this. E.g.:

core/homeassistant/components/mqtt/climate.py

Lines 720 to 722 in 2985f08

    
           if not preset_mode: 
        
               _LOGGER.debug("Ignoring empty preset_mode from '%s'", msg.topic) 
        
               return

This pattern is seen at many places. Payloads contain details that are to be ignored as they are not in the context of the entity.
The attribute filter tackled that.

jbouwh · 2025-03-07T12:34:20Z

Openened: home-assistant/home-assistant.io#37856

I think we if we want to enable state writes it should be an opt in instead.

bdraco · 2025-03-07T17:44:43Z

From performance perspective I seriously doubt that not updating last_reported solves the performance impact due power devices and quality sensors spamming MQTT

Think about all the objects, syscalls (time and others), dict merges that need to be created to fire EVENT_STATE_REPORTED.
Next think about all the downstream consumers of EVENT_STATE_REPORTED.
Than add in the cost of putting it in the recorder queue.
Than add in the cost of the database update.

And that's far from everything that has to happen to fire that event.

While we have done a great job optimizing all that, very little of it has a native code implementation besides generating the ulid (ulid-transform) and hashing (fnv-hash-fast) in the recorder. The rest is implemented mainly in Python code, which is hard to optimize without rewriting most core objects and event paths into a better-performing language. While that is possible, it would make the core more challenging to maintain. Even if we move all the expensive object creation to something like Cython, the objects still have to be passed back to Python, limiting optimizing for this path.

erkr · 2025-03-07T18:45:05Z

@bdraco
You are definitely more knowledgeable. So your option counts.
My remark was a relative one: Frequent topic updates will put a lot of extra stress on the MQTT broker (add-on on the same system in my case) and bog down handling of all the related updates by HA MQTT integration. That part is doing expensive JSON or string parsing.
I was assuming the state_reported will remain in the shadow of that.

elupus · 2025-03-08T08:25:18Z

If we want to make it optional it should be opt out of state reported not opt in. If we have a feature that by default large integrations ignore, nothing can rely on it.

We could maybe default it to on, but on upgrade default it to off. Ie new users get it on by default. Old users stay on off untill they enable.

We could also remember all unchanged entities and trigger write on those with a delay, but that get complicated fast.

bdraco · 2025-03-08T08:51:37Z

If we want to make it optional it should be opt out of state reported not opt in. If we have a feature that by default large integrations ignore, nothing can rely on it.

That's already the case, as what constitutes a reported state will vary widely between each integration. A typical pattern for integrations and libs is suppressing duplicate data when nothing changes. Suppose I think about the integrations I use that do this. In that case, it includes: nearly all passive Bluetooth devices, anything that relies on Bluetooth RSSI updates, Bluetooth advertisements that don't change (some of that happens at the BlueZ level), HomeKit accessory protocol updates, and ESPHome does it on device for various sensor reads as well as the integration. That's far from an exhaustive list, and only the most common ones I can think of off the top of my head.

elupus · 2025-03-08T09:26:42Z

Im leaning towards abandoning this change. jbouwh might have something better when it comes to opting in in the works.

However on a tangent. Should there be a capabilities flag or something to indicate if an entity can report last reported? We could indicate in gui based on that and helpers could take it into account. What do you guys think?

emontnemery · 2025-03-10T17:54:20Z

Not suppressing state changes is today mainly of interest to sensor helpers to ensure correct calculations since we don't show the last_reported timestamp in frontend.
I think we should thus do this only for numerical mqtt sensors, not for all mqtt platforms. A flag to opt-in as suggested before by @bdraco and @jbouwh could make sense, but it seems like almost too much detail.

We could also make this an entity option, configurable by the user via the more details dialog.

there are so many power sensors (mostly from Z2M) on the market that flood the state machine with data we would effectively be forcing those users to buy better hardware or move to a different solution

What does flood mean in this case?

emontnemery · 2025-03-11T08:30:31Z

@bdraco You mention this work needs to be done:

Think about all the objects, syscalls (time and others), dict merges that need to be created to fire EVENT_STATE_REPORTED.

I agree with this concern, although we try to do this as lazily as possible, for example only creating some objects if there are subscribers to EVENT_STATE_REPORTED etc.

Next think about all the downstream consumers of EVENT_STATE_REPORTED.

We have implemented a lot of restrictions on subscribing to EVENT_STATE_REPORTED:

EVENT_STATE_REPORTED is not included when subscribing to all events
When subscribing to EVENT_STATE_REPORTED a filter function needs to be supplied (typically to subscribe to EVENT_STATE_REPORTED only for a single entity
It's not allowed to use EVENT_STATE_REPORTED to trigger automations or trigger based template entities

Add in the cost of putting it in the recorder queue.
Add in the cost of the database update.

True, but as you also mention, this is very highly optimized. For starters, STATE_REPORTED events are not even recorded. The recorder instead bumps the last_reported timestamp of the last known state on state changes:

core/homeassistant/components/recorder/core.py

Lines 1105 to 1107 in 3675322

    
           states_manager.update_pending_last_reported( 
        
               old_state_id, old_state.last_reported_timestamp 
        
           )

This means there is at worst a doubling of database accesses (if every state change needs the previous state's last_reported timestamp to be updated), no matter how many redundant state writes an entity causes.

I don't mean the concerns are not valid, I just want to point out that we have spent a lot of effort on minimizing the impact of spammy entities.

bdraco · 2025-03-11T22:43:41Z

Think about all the objects, syscalls (time and others), dict merges that need to be created to fire EVENT_STATE_REPORTED.

I agree with this concern, although we try to do this as lazily as possible, for example only creating some objects if there are subscribers to EVENT_STATE_REPORTED etc.

I'm more concerned about this part of the update cycle than the other parts that we spent all the time optimizing.

Specifically __async_calculate_state, _async_write_ha_state and async_set_internal are almost always the top hitters in the profiles I get from MQTT users.

I don't have a good way to optimize those without invasive changes (ie Cython, Rust or something similar).

The database update is still mildly concerning as well because updates are so much more expensive than inserts since they have to lookup old data as well.

bdraco · 2025-03-11T22:56:52Z

For some perspective I took a non-scientific look at a what would happen if the duplicate state writes suppression was removed from the most common push integrations I use:

Passive Bluetooth entities - additional ~700 state writes per minute
HomeKit Controller - additional ~1200 state writes per minute
Unifi Protect - additional ~7200 state writes per minute (I have multiple cameras on a busy street that have many sensors which fire ~1600 events per minute)

Additionally I have unifi (network) integration disabled because it generates far too many state writes already and produces a noticeable lag on my Blue when it's turned on with a network size of ~300 devices.

bdraco · 2025-03-11T23:53:52Z

There is a great discussion in #140318 (comment) that is relevant to this

The problem with these types of changes is that we really don’t know what the stability and performance impact on users will be, as we have no telemetry data. As a result, we have to rely on past experiences and data that users have manually provided when they encountered issues which is inherently a small sample size. This makes it difficult to make an informed decision, as there will always be users who end up being pushed over the edge as we try to gather more data. Without telemetry, it’s hard to know how many people will be negatively impacted, making it tough to make a well-informed choice. Ultimately, we have to rely on the assumption that we didn’t overwhelm too many users with each change, and our users find out the hard way when we’re wrong.

jbouwh · 2025-03-30T16:47:18Z

I think this PR should be closed as well

bdraco · 2025-03-30T17:03:55Z

I agree. We aren't ready for this yet. We should focus on the areas Erik identified in #140318 (comment) first

We should investigate the feasibility of a fast happy path for state writes implemented in C or Rust (although I realize that's not likely to be implemented any time soon)

We should add some kind of dashboard tracking entities with many state writes so the user can disable them

We should decrease the database retention period of sensors (that doesn't solve the CPU consumption caused by spammy sensors, but it improves the disk usage)

elupus requested review from emontnemery, jbouwh and bdraco as code owners March 6, 2025 21:18

home-assistant bot added bugfix cla-signed core has-tests integration: mqtt small-pr PRs with less than 30 lines. labels Mar 6, 2025

home-assistant bot assigned emontnemery, jbouwh and bdraco Mar 6, 2025

home-assistant bot added the Quality Scale: platinum label Mar 6, 2025

elupus changed the title ~~fix(mqtt): ensure all messages report state to core~~ Ensure all mqtt messages report state to core Mar 6, 2025

jbouwh requested changes Mar 6, 2025

View reviewed changes

homeassistant/components/mqtt/entity.py Show resolved Hide resolved

home-assistant bot marked this pull request as draft March 6, 2025 21:41

Remove no unused attribute parameter

05dcdde

Ensure we skip automatic write for event

48ad62a

elupus marked this pull request as ready for review March 6, 2025 22:33

home-assistant bot requested a review from jbouwh March 6, 2025 22:33

bdraco requested changes Mar 6, 2025

View reviewed changes

home-assistant bot marked this pull request as draft March 6, 2025 23:08

jbouwh removed the small-pr PRs with less than 30 lines. label Mar 7, 2025

erkr mentioned this pull request Mar 7, 2025

MQTT last_reported not updated #139989

Closed

Merge branch 'dev' into mqtt/fix_report

c0f23a9

jbouwh mentioned this pull request Mar 10, 2025

Allow to disable state write suppressing for MQTT sensors #140318

Closed

19 tasks

frenck closed this Mar 30, 2025

github-actions bot locked and limited conversation to collaborators Mar 31, 2025

Uh oh!

Ensure all mqtt messages report state to core #140001

Ensure all mqtt messages report state to core #140001

Uh oh!

Conversation

elupus commented Mar 6, 2025

Proposed change

Type of change

Additional information

Checklist

Uh oh!

home-assistant bot commented Mar 6, 2025

Uh oh!

elupus commented Mar 6, 2025

Uh oh!

frenck commented Mar 6, 2025

Uh oh!

jbouwh commented Mar 6, 2025

Uh oh!

Uh oh!

home-assistant bot commented Mar 6, 2025

Uh oh!

jbouwh commented Mar 6, 2025

Uh oh!

elupus commented Mar 6, 2025

Uh oh!

bdraco left a comment

Choose a reason for hiding this comment

Uh oh!

erkr commented Mar 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbouwh commented Mar 7, 2025

Uh oh!

erkr commented Mar 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbouwh commented Mar 7, 2025

Uh oh!

jbouwh commented Mar 7, 2025

Uh oh!

bdraco commented Mar 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

erkr commented Mar 7, 2025

Uh oh!

elupus commented Mar 8, 2025

Uh oh!

bdraco commented Mar 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

elupus commented Mar 8, 2025

Uh oh!

emontnemery commented Mar 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

emontnemery commented Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bdraco commented Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bdraco commented Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bdraco commented Mar 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbouwh commented Mar 30, 2025

Uh oh!

bdraco commented Mar 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

erkr commented Mar 7, 2025 •

edited

Loading

erkr commented Mar 7, 2025 •

edited

Loading

bdraco commented Mar 7, 2025 •

edited

Loading

bdraco commented Mar 8, 2025 •

edited

Loading

emontnemery commented Mar 10, 2025 •

edited

Loading

emontnemery commented Mar 11, 2025 •

edited

Loading

bdraco commented Mar 11, 2025 •

edited

Loading

bdraco commented Mar 11, 2025 •

edited

Loading

bdraco commented Mar 11, 2025 •

edited

Loading

bdraco commented Mar 30, 2025 •

edited

Loading