ESPHome 2025.5.0 S3 Box 3 reboots on end of voice assistant announcement #159

Since updating to ESPHome 2025.5.0, the S3 Box 3 reboots at the end of every conversation, at the point where it stops the speaker mixer.

```
[01:13:50][D][voice_assistant:626]: Speech recognised as: "Hi."
[01:13:50][D][voice_assistant:456]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[01:13:50][D][voice_assistant:598]: Event Type: 5
[01:13:50][D][voice_assistant:631]: Intent started
[01:13:50][D][i2s_audio.microphone:467]: Task is stopping, attempting to unload the I2S audio driver
[01:13:50][D][i2s_audio.microphone:472]: Task is finished, freeing resources
[01:13:50][D][voice_assistant:598]: Event Type: 6
[01:13:50][D][voice_assistant:598]: Event Type: 7
[01:13:50][D][voice_assistant:656]: Response: "Hello from Home Assistant."
[01:13:50][D][voice_assistant:598]: Event Type: 8
[01:13:50][D][voice_assistant:678]: Response URL: "http://192.168.210.124:8123/api/tts_proxy/N8NZ5AQTlsBHZM689FoQhQ.flac"
[01:13:50][D][voice_assistant:456]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[01:13:50][D][voice_assistant:463]: Desired state set to STREAMING_RESPONSE
[01:13:50][D][media_player:074]: 'Media Player' - Setting
[01:13:50][D][media_player:081]:   Media URL: http://192.168.210.124:8123/api/tts_proxy/N8NZ5AQTlsBHZM689FoQhQ.flac
[01:13:50][D][media_player:087]:  Announcement: yes
[01:13:50][D][speaker_media_player:408]: State changed to ANNOUNCING
[01:13:50][D][voice_assistant:598]: Event Type: 2
[01:13:50][D][voice_assistant:697]: Assist Pipeline ended
[01:13:50][D][sensor:093]: 'S3 Temperature': Sending state 21.09534 °C with 2 decimals of accuracy
[01:13:50][D][sensor:093]: 'S3 Humidity': Sending state 60.02083 % with 2 decimals of accuracy
[01:13:50][D][ring_buffer:034][ann_read]: Created ring buffer with size 1000000
[01:13:50][D][speaker_media_player.pipeline:114]: Reading FLAC file type
[01:13:50][D][speaker_media_player.pipeline:124]: Decoded audio has 1 channels, 48000 Hz sample rate, and 16 bits per sample
[01:13:50][D][ring_buffer:034]: Created ring buffer with size 9600
[01:13:50][D][speaker_mixer:312]: Starting speaker mixer
[01:13:50][D][speaker_mixer:320]: Started speaker mixer
[01:13:50][D][ring_buffer:034][speaker_task]: Created ring buffer with size 48000
[01:13:50][D][i2s_audio.speaker:117]: Starting Speaker
[01:13:50][D][i2s_audio.speaker:122]: Started Speaker
[01:13:52][D][speaker_media_player:408]: State changed to IDLE
[01:13:52][D][voice_assistant:329]: Announcement finished playing
[01:13:52][D][voice_assistant:456]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[01:13:52][D][voice_assistant:463]: Desired state set to RESPONSE_FINISHED
[01:13:52][D][micro_wake_word:360]: Starting wake word detection
[01:13:52][D][micro_wake_word:378]: State changed from STOPPED to STARTING
[01:13:52][D][voice_assistant:456]: State changed from RESPONSE_FINISHED to IDLE
[01:13:52][D][voice_assistant:463]: Desired state set to IDLE
[01:13:52][D][ring_buffer:034][mww]: Created ring buffer with size 3840
[01:13:52][D][micro_wake_word:261]: Inference task has started, attempting to allocate memory for buffers
[01:13:52][D][micro_wake_word:266]: Inference task is running
[01:13:52][D][micro_wake_word:378]: State changed from STARTING to DETECTING_WAKE_WORD
[01:13:52][D][i2s_audio.microphone:455]: Task has started, attempting to setup I2S audio driver
[01:13:52][D][speaker_mixer:325]: Stopping speaker mixer
WARNING esp32-s3-box-3-2e8c34 @ 192.168.210.49: Connection error occurred: [Errno 104] Connection reset by peer
INFO Processing unexpected disconnect from ESPHome API for esp32-s3-box-3-2e8c34 @ 192.168.210.49
WARNING Disconnected from API
INFO Successfully connected to esp32-s3-box-3-2e8c34 @ 192.168.210.49 in 0.009s
INFO Successful handshake with esp32-s3-box-3-2e8c34 @ 192.168.210.49 in 0.165s
[01:14:39][D][sensor:093]: 'WiFi db': Sending state -48.00000 dBm with 0 decimals of accuracy
[01:14:39][D][sensor:093]: 'WiFi Signal': Sending state 100.00000 % with 0 decimals of accuracy
```
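To make it easier to see where the device drops off, here is a minimal sketch that finds the last device log line before the client reports the disconnect. The function name `last_line_before_disconnect` and the regular expression are my own assumptions for illustration, based only on the log format shown above; they are not part of ESPHome.

```python
import re

# Device log lines look like "[01:13:52][D][speaker_mixer:325]: message",
# optionally with a task tag such as "[mww]" before the colon.
DEVICE_LINE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\]\[\w\]\[([\w.]+):\d+\](?:\[\w+\])?: (.*)$"
)

# Client-side messages that indicate the API connection dropped.
DISCONNECT_MARKERS = ("Connection reset by peer", "Disconnected from API")


def last_line_before_disconnect(log_text):
    """Return (time, component, message) of the last device log line
    seen before the first disconnect message, or None if there is none."""
    last = None
    for line in log_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            last = match.groups()
        elif any(marker in line for marker in DISCONNECT_MARKERS):
            return last
    return None


sample = """\
[01:13:52][D][micro_wake_word:378]: State changed from STARTING to DETECTING_WAKE_WORD
[01:13:52][D][i2s_audio.microphone:455]: Task has started, attempting to setup I2S audio driver
[01:13:52][D][speaker_mixer:325]: Stopping speaker mixer
WARNING esp32-s3-box-3-2e8c34 @ 192.168.210.49: Connection error occurred: [Errno 104] Connection reset by peer
WARNING Disconnected from API
"""

result = last_line_before_disconnect(sample)
print(result)
```

On the excerpt above this points at the `speaker_mixer` "Stopping speaker mixer" line, which is why I think the reboot happens while the mixer is being stopped.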

I suspect this is related to the new features released in 2025.5.0:
[https://www.esphome.io/changelog/2025.5.0.html](https://www.esphome.io/changelog/2025.5.0.html)

I also noticed that there is a new i2s_audio driver. I have tested the legacy mode as well, and the issue still reproduces:
[https://github.com/esphome/esphome/pull/8703](https://github.com/esphome/esphome/pull/8703)

I can see there are pending updates to the ESPHome wake-word-voice-assistants repository to support the new features, which I suspect would resolve this:
[https://github.com/esphome/wake-word-voice-assistants/pull/110](https://github.com/esphome/wake-word-voice-assistants/pull/110)

It looks like a refactor of the way the voice assistant is configured, since you can no longer assume the order of an interaction is wake word -> STT -> TTS -> done.

I have been trying to change how it handles the end of the announcement, but I haven't been able to mitigate this yet. I think the media_player and voice_assistant code may need to be updated, similar to the pending update from ESPHome. I will wait until they release their changes.

I wanted to raise this because I expect others will start reporting it soon; there appears to be a breaking change in this release.

