Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Playing a Mp3 File with a voice Command and a automation ====> ESP Crashes #36

Open
pimp1310 opened this issue May 4, 2024 · 15 comments

Comments

@pimp1310
Copy link

pimp1310 commented May 4, 2024

Hello,

When i try to play a mp3 file with a automation thats reacts to a spoken command, the esp crashes.

When i send it over Services it works well without crashes. When i trigger the command in the automation manually it works well too.
I think its a timing problem? because the assist is active?

the command is the same, in the automation and when i use services

service: media_player.play_media target: entity_id: media_player.v_assist_kitchen_esp32nr168_s3_dev_media_player data: media_content_type: music media_content_id: http://live-icy.gss.dr.dk/A/A12H.mp3

here is the Crash and reboot when triggered over the animation and the spoken command.

`23:26:08][D][esp_adf_pipeline:302]: State changed from STOPPING to UNINITIALIZED
[23:26:08][D][voice_assistant:439]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[23:26:08][D][voice_assistant:563]: Event Type: 4
[23:26:08][D][voice_assistant:591]: Speech recognised as: " Schalte das Radio an."
[23:26:08][D][voice_assistant:563]: Event Type: 5
[23:26:08][D][voice_assistant:596]: Intent started
[23:26:08][D][media_player:059]: 's3-dev_media_player' - Setting
[23:26:08][D][media_player:066]: Media URL: http://live-icy.gss.dr.dk/A/A12H.mp3
[23:26:08][D][adf_media_player:030]: Got control call in state 1
[23:26:08][D][esp_adf_pipeline:050]: Starting request, current state UNINITIALIZED
[23:26:08][D][esp-idf:000]: I (573009) MP3_DECODER: MP3 init

[23:26:08][D][esp-idf:000]: I (573021) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[23:26:08][D][i2s_audio:072]: Installing driver : yes
[23:26:08][D][esp_adf_pipeline:358]: pipeline tag 0, http
[23:26:08][D][esp_adf_pipeline:358]: pipeline tag 1, decoder
[23:26:08][D][esp_adf_pipeline:358]: pipeline tag 2, i2s_out
[23:26:08][D][esp-idf:000]: I (573052) AUDIO_PIPELINE: link el->rb, el:0x3d8263a4, tag:http, rb:0x3d8268b0

[23:26:08][D][esp-idf:000]: I (573060) AUDIO_PIPELINE: link el->rb, el:0x3d82654c, tag:decoder, rb:0x3d8278f0

[23:26:08][D][esp_adf_pipeline:370]: Setting up event listener.
[23:26:08][D][esp_adf_pipeline:302]: State changed from UNINITIALIZED to PREPARING
[23:26:08][I][adf_media_player:135]: got new pipeline state: 1
[23:26:08][D][adf_i2s_out:127]: Set final i2s settings: 22050
[23:26:08][D][esp-idf:000]: I (573095) AUDIO_THREAD: The http task allocate stack on external memory

[23:26:08][D][esp-idf:000]: I (573101) AUDIO_ELEMENT: [http-0x3d8263a4] Element task created

[23:26:08][D][esp-idf:000]: I (573112) AUDIO_THREAD: The decoder task allocate stack on external memory

[23:26:08][D][esp-idf:000]: I (573121) AUDIO_ELEMENT: [decoder-0x3d82654c] Element task created

[23:26:09][D][esp-idf:000]: I (573123) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[23:26:09][D][esp-idf:000]: I (573132) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[23:26:09][D][esp_aud:000]:
ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x7f25b25d7c10>
transport: <_SelectorSocketTransport fd=6 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
File "/usr/lib/python3.11/asyncio/selector_events.py", line 1009, in _read_ready__data_received
self._protocol.data_received(data)
File "aioesphomeapi/_frame_helper/noise.py", line 136, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
File "/usr/local/lib/python3.11/dist-packages/noise/state.py", line 74, in decrypt_with_ad
plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/noise/backends/default/ciphers.py", line 13, in decrypt
return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/chacha20poly1305_reuseable/init.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
File "src/chacha20poly1305_reuseable/init.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
File "src/chacha20poly1305_reuseable/init.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
File "src/chacha20poly1305_reuseable/init.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING v-assist-kitchen-esp32nr168 @ 192.168.0.157: Connection error occurred: v-assist-kitchen-esp32nr168 @ 192.168.0.157: Invalid encryption key: received_name=v-assist-kitchen-esp32nr168
INFO Processing unexpected disconnect from ESPHome API for v-assist-kitchen-esp32nr168 @ 192.168.0.157
WARNING Disconnected from API
INFO Successfully connected to v-assist-kitchen-esp32nr168 @ 192.168.0.157 in 0.009s
INFO Successful handshake with v-assist-kitchen-esp32nr168 @ 192.168.0.157 in 0.118s
[23:26:09][D][esp-idf:000]: I (573327) HTTP_CLIENT: Body received in fetch header state, 0x3fccbd99, 1152

[23:26:09][D][esp-idf:000]: I (573331) HTTP_STREAM: total_bytes=0

[23:26:12][D][esp_adf_pipeline:400]: Called deinit_all
[23:26:14][D][esp-idf:000]: W (578286) AUDIO_ELEMENT: [http-0x3d8263a4] Element task destroy timeout[2000]

[23:26:16][D][esp-idf:000]: W (580288) AUDIO_ELEMENT: [decoder-0x3d82654c] Element task destroy timeout[2000]

[23:26:16][D][esp-idf:000]: W (580290) AUDIO_ELEMENT: [i2s_out] Element has not create when AUDIO_ELEMENT_TERMINATE

[23:26:16][D][esp-idf:000]: I (580292) AUDIO_PIPELINE: audio_pipeline_unlinked

WARNING v-assist-kitchen-esp32nr168 @ 192.168.0.157: Connection error occurred: [Errno 104] Connection reset by peer
INFO Processing unexpected disconnect from ESPHome API for v-assist-kitchen-esp32nr168 @ 192.168.0.157
WARNING Disconnected from API
INFO Successfully connected to v-assist-kitchen-esp32nr168 @ 192.168.0.157 in 0.009s
INFO Successful handshake with v-assist-kitchen-esp32nr168 @ 192.168.0.157 in 0.099s

`

@gnumpi
Copy link
Owner

gnumpi commented May 7, 2024

I also noticed some issues when trying to trigger a mp3 stream from a voice command. Actually there was an issue of simultaneously trying to play the tts response message and the requested audio stream. I re-wrote some of the code to prevent this. You can try if this also fixes your issue:

- source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: dev-next
    components: [ adf_pipeline, i2s_audio ]

@pimp1310
Copy link
Author

pimp1310 commented May 7, 2024

I also noticed some issues when trying to trigger a mp3 stream from a voice command. Actually there was an issue of simultaneously trying to play the tts response message and the requested audio stream. I re-wrote some of the code to prevent this. You can try if this also fixes your issue:

- source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: dev-next
    components: [ adf_pipeline, i2s_audio ]

i give it a try, now when i start a mp3 stream over voice he says something unreconizeable (i think he says "done" but my language is german) but plays nothing.
The Freeze or reboot is gone but he plays nothing, when send the same command when the assist is in idle it works he plays the file.

when i trigger the play command when assist is in idle the LOG says something like

got new pipeline state: 6, while in MP state PLAYING

but when i trigger it over voice there is nothing like this.

here is the log

`
[20:24:37][D][esp_adf_pipeline:444]: [ADFMicrophone] Pipeline changed from DESTROYING to UNINITIALIZED. (REQ: 1)
[20:24:37][D][esp_adf_pipeline:444]: [ADFMicrophone] Pipeline changed from UNINITIALIZED to UNINITIALIZED. (REQ: 1)
[20:24:38][D][voice_assistant:563]: Event Type: 4
[20:24:38][D][voice_assistant:591]: Speech recognised as: " Schalte das Radio an."
[20:24:38][D][voice_assistant:563]: Event Type: 5
[20:24:38][D][voice_assistant:596]: Intent started
[20:24:38][D][media_player:059]: 's3-dev_media_player' - Setting
[20:24:38][D][media_player:066]: Media URL: http://live-icy.gss.dr.dk/A/A12H.mp3
[20:24:38][D][esp_audio_sources:058]: Set new uri: http://live-icy.gss.dr.dk/A/A12H.mp3
[20:24:38][D][adf_media_player:047]: Got control call in state IDLE
[20:24:38][D][esp_adf_pipeline:060]: Starting request, current state UNINITIALIZED
[20:24:38][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from UNINITIALIZED to INITIALIZING. (REQ: 0)
[20:24:38][I][adf_media_player:167]: got new pipeline state: 1, while in MP state IDLE
[20:24:38][I][adf_media_player:226]: current mp state: IDLE
[20:24:38][I][adf_media_player:227]: anouncement: false
[20:24:38][I][adf_media_player:228]: play_intent: false
[20:24:38][I][adf_media_player:229]: current_uri_: yes
[20:24:38][D][voice_assistant:563]: Event Type: 6
[20:24:38][D][esp-idf:000]: I (263807) MP3_DECODER: MP3 init

[20:24:38][D][esp-idf:000]: I (263816) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=4

[20:24:38][D][i2s_audio:072]: Installing driver : yes
[20:24:38][D][esp_adf_pipeline:493]: pipeline tag 0, http
[20:24:38][D][esp_adf_pipeline:493]: pipeline tag 1, decoder
[20:24:38][D][esp_adf_pipeline:493]: pipeline tag 2, i2s_out
[20:24:38][D][esp-idf:000]: I (263844) AUDIO_PIPELINE: link el->rb, el:0x3d8263a4, tag:http, rb:0x3d832940

[20:24:38][D][esp-idf:000]: I (263846) AUDIO_PIPELINE: link el->rb, el:0x3d8325dc, tag:decoder, rb:0x3d833980

[20:24:38][D][esp_adf_pipeline:503]: Setting up event listener.
[20:24:38][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from INITIALIZING to CREATED. (REQ: 0)
[20:24:38][I][adf_media_player:167]: got new pipeline state: 2, while in MP state IDLE
[20:24:38][I][adf_media_player:226]: current mp state: IDLE
[20:24:38][I][adf_media_player:227]: anouncement: false
[20:24:38][I][adf_media_player:228]: play_intent: false
[20:24:38][I][adf_media_player:229]: current_uri_: yes
[20:24:38][D][voice_assistant:563]: Event Type: 7
[20:24:38][D][voice_assistant:619]: Response: "Done"
[20:24:38][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from CREATED to PREPARING. (REQ: 0)
[20:24:38][I][adf_media_player:167]: got new pipeline state: 3, while in MP state IDLE
[20:24:38][D][adf_i2s_out:127]: Set final i2s settings: 22050
[20:24:38][I][adf_media_player:226]: current mp state: PLAYING
[20:24:38][I][adf_media_player:227]: anouncement: false
[20:24:38][I][adf_media_player:228]: play_intent: false
[20:24:38][I][adf_media_player:229]: current_uri_: yes
[20:24:38][D][voice_assistant:563]: Event Type: 8
[20:24:38][D][voice_assistant:639]: Response URL: "http://192.168.0.73:8123/api/tts_proxy/e9b450d14bc2363d292c84f17cfad5cfbd58a458_de-de_1bd7a6bba9_tts.piper_2.mp3"
[20:24:38][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[20:24:38][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[20:24:38][D][esp_audio_sources:063]: Prepare elements called (initial_call)!
[20:24:38][D][esp_audio_sources:097]: Use fixed settings: no
[20:24:38][D][esp_audio_sources:098]: Streamer status: 1
[20:24:38][D][esp_audio_sources:099]: decoder status: 1
[20:24:38][D][adf_audio_element:108]: Preparing [http]...
[20:24:38][D][esp-idf:000]: I (264018) AUDIO_THREAD: The http task allocate stack on external memory

[20:24:38][D][esp-idf:000]: I (264028) AUDIO_ELEMENT: [http-0x3d8263a4] Element task created

[20:24:38][D][adf_audio_element:108]: Preparing [decoder]...
[20:24:38][D][esp-idf:000]: I (264039) AUDIO_THREAD: The decoder task allocate stack on external memory

[20:24:39][D][esp-idf:000]: I (264049) AUDIO_ELEMENT: [decoder-0x3d8325dc] Element task created

[20:24:39][D][media_player:059]: 's3-dev_media_player' - Setting
[20:24:39][D][media_player:066]: Media URL: http://192.168.0.73:8123/api/tts_proxy/e9b450d14bc2363d292c84f17cfad5cfbd58a458_de-de_1bd7a6bba9_tts.piper_2.mp3
[20:24:39][D][esp_audio_sources:058]: Set new uri: http://192.168.0.73:8123/api/tts_proxy/e9b450d14bc2363d292c84f17cfad5cfbd58a458_de-de_1bd7a6bba9_tts.piper_2.mp3
[20:24:39][D][adf_media_player:047]: Got control call in state PLAYING
[20:24:39][D][esp_adf_pipeline:070]: Called 'stop' while in PREPARING state.
[20:24:39][D][voice_assistant:563]: Event Type: 2
[20:24:39][D][voice_assistant:653]: Assist Pipeline ended
[20:24:39][D][adf_audio_element:108]: Preparing [i2s_out]...
[20:24:39][D][esp-idf:000]: I (264121) AUDIO_ELEMENT: [i2s_out-0x3d8327a8] Element task created

[20:24:39][D][light:036]: 'Status led' Setting:
[20:24:39][D][light:047]: State: OFF
[20:24:39][D][adf_audio_element:165]: Resuming [http]...
[20:24:39][D][adf_audio_element:172]: [http] Sending resume command.
[20:24:39][D][adf_audio_element:165]: Resuming [decoder]...
[20:24:39][D][adf_audio_element:172]: [decoder] Sending resume command.
[20:24:39][D][esp-idf:000]: I (264161) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[20:24:39][D][esp-idf:000]: I (264173) MP3_DECODER: MP3 opened

[20:24:39][D][esp-idf:000]: I (264348) HTTP_CLIENT: Body received in fetch header state, 0x3fccbae1, 1671

[20:24:39][D][esp-idf:000]: I (264351) HTTP_STREAM: total_bytes=0

[20:24:39][I][HTTPStreamReader:192]: [ * ] Receive music info from mp3 decoder, sample_rates=48000, bits=16, ch=2
[20:24:39][D][adf_i2s_out:114]: update i2s clk settings: rate:48000 bits:16 ch:2
[20:24:39][D][adf_i2s_out:127]: Set final i2s settings: 48000
[20:24:39][D][adf_audio_element:108]: Preparing [http]...
[20:24:39][D][adf_audio_element:108]: Preparing [decoder]...
[20:24:39][D][esp-idf:000]: W (264380) AUDIO_ELEMENT: OUT-[decoder] AEL_IO_ABORT

[20:24:39][D][esp-idf:000]: W (264385) MP3_DECODER: output aborted -3

[20:24:39][I][adf_media_player:226]: current mp state: PLAYING
[20:24:39][I][adf_media_player:227]: anouncement: false
[20:24:39][I][adf_media_player:228]: play_intent: false
[20:24:39][I][adf_media_player:229]: current_uri_: yes
[20:24:39][D][esp_audio_sources:063]: Prepare elements called (initial_call)!
[20:24:39][D][esp_audio_sources:097]: Use fixed settings: no
[20:24:39][D][esp_audio_sources:098]: Streamer status: 5
[20:24:39][D][esp_audio_sources:099]: decoder status: 5
[20:24:39][D][adf_audio_element:108]: Preparing [http]...
[20:24:39][D][adf_audio_element:108]: Preparing [decoder]...
[20:24:39][D][adf_audio_element:108]: Preparing [i2s_out]...
[20:24:39][D][adf_audio_element:165]: Resuming [http]...
[20:24:39][D][adf_audio_element:172]: [http] Sending resume command.
[20:24:39][D][adf_audio_element:165]: Resuming [decoder]...
[20:24:39][D][adf_audio_element:172]: [decoder] Sending resume command.
[20:24:39][D][adf_audio_element:191]: [http] Checking State, got 79
[20:24:39][D][adf_audio_element:191]: [decoder] Checking State, got 15
[20:24:39][D][adf_audio_element:191]: [http] Checking State, got 79
[20:24:39][D][adf_audio_element:191]: [decoder] Checking State, got 79
[20:24:39][D][esp-idf:000]: I (264496) HTTP_STREAM: total_bytes=8994

[20:24:39][I][HTTPStreamReader:192]: [ * ] Receive music info from mp3 decoder, sample_rates=22050, bits=16, ch=1
[20:24:39][D][adf_i2s_out:114]: update i2s clk settings: rate:22050 bits:16 ch:1
[20:24:39][D][adf_i2s_out:127]: Set final i2s settings: 22050
[20:24:39][D][adf_audio_element:108]: Preparing [http]...
[20:24:39][D][esp-idf:000]:
ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x7f1452b17c10>
transport: <_SelectorSocketTransport fd=6 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
File "/usr/lib/python3.11/asyncio/selector_events.py", line 1009, in _read_ready__data_received
self._protocol.data_received(data)
File "aioesphomeapi/_frame_helper/noise.py", line 136, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
File "/usr/local/lib/python3.11/dist-packages/noise/state.py", line 74, in decrypt_with_ad
plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/noise/backends/default/ciphers.py", line 13, in decrypt
return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/chacha20poly1305_reuseable/init.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
File "src/chacha20poly1305_reuseable/init.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
File "src/chacha20poly1305_reuseable/init.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
File "src/chacha20poly1305_reuseable/init.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING v-assist-kitchen-esp32nr168 @ 192.168.0.157: Connection error occurred: v-assist-kitchen-esp32nr168 @ 192.168.0.157: Invalid encryption key: received_name=v-assist-kitchen-esp32nr168
INFO Processing unexpected disconnect from ESPHome API for v-assist-kitchen-esp32nr168 @ 192.168.0.157
WARNING Disconnected from API
INFO Successfully connected to v-assist-kitchen-esp32nr168 @ 192.168.0.157 in 0.026s
INFO Successful handshake with v-assist-kitchen-esp32nr168 @ 192.168.0.157 in 0.111s
[20:24:39][D][esp-idf:000]: W (264716) HTTP_STREAM: No more data,errno:0, total_bytes:8994, rlen = 0

[20:24:39][D][esp-idf:000]: I (264719) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[20:24:39][I][esp_adf_pipeline:135]: [ http ] status: 15
[20:24:39][I][esp_adf_pipeline:138]: current state: RUNNING
[20:24:39][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from RUNNING to FINISHING. (REQ: 0)
[20:24:39][I][adf_media_player:167]: got new pipeline state: 7, while in MP state PLAYING
[20:24:39][I][adf_media_player:226]: current mp state: PLAYING
[20:24:39][I][adf_media_player:227]: anouncement: false
[20:24:39][I][adf_media_player:228]: play_intent: false
[20:24:39][I][adf_media_player:229]: current_uri_: yes
[20:24:40][D][esp-idf:000]: I (265087) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[20:24:40][D][esp-idf:000]: I (265550) MP3_DECODER: Closed

[20:24:40][I][esp_adf_pipeline:135]: [ decoder ] status: 15
[20:24:40][I][esp_adf_pipeline:138]: current state: FINISHING
[20:24:40][D][esp-idf:000]: I (265665) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[20:24:40][I][esp_adf_pipeline:135]: [ i2s_out ] status: 15
[20:24:40][I][esp_adf_pipeline:138]: current state: FINISHING
[20:24:40][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from FINISHING to STOPPED. (REQ: 1)
[20:24:40][I][adf_media_player:167]: got new pipeline state: 4, while in MP state PLAYING
[20:24:40][I][adf_media_player:226]: current mp state: IDLE
[20:24:40][I][adf_media_player:227]: anouncement: false
[20:24:40][I][adf_media_player:228]: play_intent: false
[20:24:40][I][adf_media_player:229]: current_uri_: yes
[20:24:40][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from STOPPED to DESTROYING. (REQ: 1)
[20:24:40][I][adf_media_player:167]: got new pipeline state: 11, while in MP state IDLE
[20:24:40][I][adf_media_player:226]: current mp state: IDLE
[20:24:40][I][adf_media_player:227]: anouncement: false
[20:24:40][I][adf_media_player:228]: play_intent: false
[20:24:40][I][adf_media_player:229]: current_uri_: yes
[20:24:40][D][esp_adf_pipeline:514]: Called deinit_all
[20:24:40][D][esp-idf:000]: I (265920) AUDIO_PIPELINE: audio_pipeline_unlinked

[20:24:40][D][esp-idf:000]: W (265922) AUDIO_ELEMENT: [http] Element has not create when AUDIO_ELEMENT_TERMINATE

[20:24:40][D][esp-idf:000]: W (265930) AUDIO_ELEMENT: [decoder] Element has not create when AUDIO_ELEMENT_TERMINATE

[20:24:40][D][esp-idf:000]: W (265950) AUDIO_ELEMENT: [i2s_out] Element has not create when AUDIO_ELEMENT_TERMINATE

[20:24:40][D][esp-idf:000]: I (265961) I2S: DMA queue destroyed

[20:24:40][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from DESTROYING to UNINITIALIZED. (REQ: 1)
[20:24:40][I][adf_media_player:167]: got new pipeline state: 0, while in MP state IDLE
[20:24:40][I][adf_media_player:226]: current mp state: IDLE
[20:24:40][I][adf_media_player:227]: anouncement: false
[20:24:40][I][adf_media_player:228]: play_intent: false
[20:24:40][I][adf_media_player:229]: current_uri_: yes
[20:24:40][D][esp_adf_pipeline:444]: [MediaPlayer] Pipeline changed from UNINITIALIZED to UNINITIALIZED. (REQ: 1)
[20:24:40][I][adf_media_player:167]: got new pipeline state: 0, while in MP state IDLE
[20:24:40][I][adf_media_player:226]: current mp state: IDLE
[20:24:40][I][adf_media_player:227]: anouncement: false
[20:24:40][I][adf_media_player:228]: play_intent: false
[20:24:40][I][adf_media_player:229]: current_uri_: yes
[20:24:42][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to IDLE
[20:24:42][D][voice_assistant:445]: Desired state set to IDLE
[20:24:42][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[20:24:43][D][micro_wake_word:115]: Starting Microphone
[20:24:43][D][esp_adf_pipeline:060]: Starting request, current state UNINITIALIZED
[20:24:43][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[20:24:43][D][esp_adf_pipeline:444]: [ADFMicrophone] Pipeline changed from UNINITIALIZED to INITIALIZING. (REQ: 0)
[20:24:43][D][esp-idf:000]: I (268926) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[20:24:43][D][i2s_audio:072]: Installing driver : yes
[20:24:43][D][esp_adf_pipeline:493]: pipeline tag 0, i2s_in
[20:24:43][D][esp_adf_pipeline:493]: pipeline tag 1, pcm_reader
[20:24:43][D][esp-idf:000]: I (268935) AUDIO_PIPELINE: link el->rb, el:0x3d8263e0, tag:i2s_in, rb:0x3d832754

[20:24:43][D][esp_adf_pipeline:503]: Setting up event listener.
[20:24:43][D][esp_adf_pipeline:444]: [ADFMicrophone] Pipeline changed from INITIALIZING to CREATED. (REQ: 0)
[20:24:43][D][esp_adf_pipeline:444]: [ADFMicrophone] Pipeline changed from CREATED to PREPARING. (REQ: 0)
[20:24:43][D][adf_audio_element:108]: Preparing [i2s_in]...
[20:24:43][D][esp-idf:000]: I (268968) AUDIO_ELEMENT: [i2s_in-0x3d8263e0] Element task created

[20:24:43][D][adf_audio_element:108]: Preparing [pcm_reader]...
[20:24:43][D][esp-idf:000]: I (268981) AUDIO_ELEMENT: [pcm_reader-0x3d83260c] Element task created

[20:24:43][D][esp_adf_pipeline:345]: wait for preparation, done
[20:24:43][D][esp_adf_pipeline:444]: [ADFMicrophone] Pipeline changed from PREPARING to STARTING. (REQ: 0)
[20:24:43][D][adf_audio_element:165]: Resuming [i2s_in]...
[20:24:43][D][adf_audio_element:172]: [i2s_in] Sending resume command.
[20:24:43][D][esp-idf:000]: I (269019) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[20:24:43][D][adf_audio_element:191]: [i2s_in] Checking State, got 78
[20:24:43][I][esp_adf_pipeline:135]: [ i2s_in ] status: 12
[20:24:44][D][esp_adf_pipeline:444]: [ADFMicrophone] Pipeline changed from STARTING to RUNNING. (REQ: 0)
[20:24:44][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[20:25:17][I][ota:117]: Boot seems successful, resetting boot loop counter.
[20:25:17][D][esp32.preferences:114]: Saving 1 preferences to flash...
[20:25:17][D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed`

@pimp1310
Copy link
Author

pimp1310 commented May 9, 2024

@gnumpi

have you seen my feedback to the new version in "dev-next" above?

@gnumpi
Copy link
Owner

gnumpi commented May 9, 2024

I am sorry, but playing music from voice commands is not supported until esphome/esphome#6691 got merged.

@pimp1310
Copy link
Author

I am sorry, but playing music from voice commands is not supported until esphome/esphome#6691 got merged.

Okay, I'm not familiar with this, what are the chances of it being merged?

and when is something like this usually merged?

Thank You

@gnumpi
Copy link
Owner

gnumpi commented May 12, 2024

I am sorry, but playing music from voice commands is not supported until esphome/esphome#6691 got merged.

Okay, I'm not familiar with this, what are the chances of it being merged?

and when is something like this usually merged?

Thank You

Actually, I don't know. There are quite a few PR for esphome, so the maintainers need to have their priorities and order of checking them. As this PR is mainly related to the ADF-pipeline which is only an external component for now, I guess it's not on their highest priority.
But, in the meantime I created a version without the API changes needed for sending the announcement flag from HA, and hence, which can be included as external source in your configuration:

Maybe you want to give it a try:

external_components:
   - source: 
       type: git
       url: https://github.com/gnumpi/esphome
       ref: adf-mods

    components: [ api, media_player, speaker, voice_assistant ]
  
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: dev-next

    components: [ adf_pipeline, i2s_audio ]

@s00500
Copy link

s00500 commented Jun 12, 2024

I also found a pretty reliable way to make it crash: basically playing a short mp3 on wake word detect and then starting VA.

Even with delay and wait for done I can not make this work anyway in a reliable way

I am using the onju voice config basically, with these modifications:

external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      #ref: d957129e0f37d622dd6e0efb2400bf2c9582a35c # main (high pitch issue)
      ref: a8f66cbcb9ad96c71904225d0ee0e89641223247 #dev-next
    components: [adf_pipeline, i2s_audio]
  - source:
      type: git
      url: https://github.com/gnumpi/esphome
      ref: 2044c7e4d4506b840dd2978d2fce5e455d304627 # adf-mods
    components: [api, media_player, speaker, voice_assistant]

micro_wake_word:
  model: okay_nabu
  probability_cutoff: 0.5
  on_wake_word_detected:
    # - if:
    #     condition: media_player.is_playing
    #     then:
    #       - media_player.pause
    - logger.log: "WAKEWORD!#################################"
    - media_player.play_media:
        id: onju_out
        media_url: "http://192.168.0.244:8000/beep.mp3"
        #media_url: "https://homeserver.bachi:8123/local/beep/beep.mp3"
    - delay: 250ms
    - wait_until:
        media_player.is_idle: onju_out
    - delay: 250ms
    - logger.log: "DONE playing #####################"

    - voice_assistant.start:
        wake_word: !lambda return wake_word;

@gnumpi
Copy link
Owner

gnumpi commented Jun 12, 2024

Hey, thx for reporting your issue! Please, don't use the adf-mods anymore. All the necessary parts are already merged into the current ESPHome version. Can you confirm that your issue still exists with the latest dev-next commit and without the second external component in your config?

@s00500
Copy link

s00500 commented Jun 12, 2024

Hi @gnumpi

Yes it does, a8f66cb is the latest dev-next commit

I only added the adf-mods after starting to play arround so the issue exists in anycase... unfortunatly it is also not super easy to get the logs from the onju, as it just looses them when crashing when using network logging

But the issue is fairly consistent

@gnumpi
Copy link
Owner

gnumpi commented Jun 12, 2024

ok, let me try to reproduce the error with my hardware then ...

@gnumpi
Copy link
Owner

gnumpi commented Jun 12, 2024

unfortunately I couldn't reproduce your error, it is just working fine here. I don't have an Onju though, but I am running an ESP32-S3 with similar settings, and also in duplex mode.

@s00500
Copy link

s00500 commented Jun 17, 2024

Ok, it always works fine on the first try, then somehow hangs up... seems like it is not crashing anymore with my current config though...

Here is my config and the logs of 3 attempts... notice I use encryption...

Configuration

# Wohnzimmer
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  fast_connect: true

ota:
  password: XXX

substitutions:
  name: "onju-voice-1"
  friendly_name: "Nabu 1"
  project_version: "1.0.0"
  device_description: "Onju Voice Satellite with ESPHome software and microWakeWord"
external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: d957129e0f37d622dd6e0efb2400bf2c9582a35c # main (high pitch issue)
      #ref: a8f66cbcb9ad96c71904225d0ee0e89641223247 #dev-next
    components: [adf_pipeline, i2s_audio]
  # - source:
  #     type: git
  #     url: https://github.com/gnumpi/esphome
  #     ref: 2044c7e4d4506b840dd2978d2fce5e455d304627 # adf-mods
  #   components: [api, media_player, speaker, voice_assistant]

esphome:
  name: "${name}"
  friendly_name: "${friendly_name}"
  comment: "${device_description}"
  name_add_mac_suffix: false
  project:
    name: tetele.onju_voice_satellite
    version: "${project_version}"
  min_version: 2024.3.0
  platformio_options:
    board_build.flash_mode: dio
    build_flags: "-DBOARD_HAS_PSRAM"
    board_build.arduino.memory_type: qio_opi
  on_boot:
    then:
      - light.turn_on:
          id: top_led
          effect: slow_pulse
          red: 100%
          green: 60%
          blue: 0%
      - wait_until:
          condition:
            wifi.connected:
      - light.turn_on:
          id: top_led
          effect: pulse
          red: 0%
          green: 100%
          blue: 0%
      - wait_until:
          condition:
            api.connected:
      - light.turn_on:
          id: top_led
          effect: none
          red: 0%
          green: 100%
          blue: 0%
      - delay: 1s
      - script.execute: reset_led

dashboard_import:
  package_import_url: github://tetele/onju-voice-satellite/esphome/onju-voice-microwakeword.yaml@main

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
    # version: recommended
    version: 4.4.6
    sdkconfig_options:
      # need to set a s3 compatible board for the adf-sdk to compile
      # board specific code is not used though
      CONFIG_ESP32_S3_BOX_BOARD: "y"

psram:
  mode: octal
  speed: 80MHz

# Enable logging
logger:

api:
  encryption:
    key: XXX
  services:
    - service: start_va
      then:
        - voice_assistant.start
    - service: stop_va
      then:
        - voice_assistant.stop
    - service: notification_on
      then:
        - script.execute: turn_on_notification
    - service: notification_clear
      then:
        - script.execute: clear_notification

globals:
  - id: thresh_percent
    type: float
    initial_value: "0.03"
    restore_value: false
  - id: touch_calibration_values_left
    type: uint32_t[5]
    restore_value: false
  - id: touch_calibration_values_center
    type: uint32_t[5]
    restore_value: false
  - id: touch_calibration_values_right
    type: uint32_t[5]
    restore_value: false
  - id: notification
    type: bool
    restore_value: false

interval:
  - interval: 1s
    then:
      - script.execute:
          id: calibrate_touch
          button: 0
      - script.execute:
          id: calibrate_touch
          button: 1
      - script.execute:
          id: calibrate_touch
          button: 2

i2s_audio:
  - id: i2s_shared
    i2s_lrclk_pin: GPIO13
    i2s_bclk_pin: GPIO18
    access_mode: duplex

adf_pipeline:
  - platform: i2s_audio
    type: audio_out
    id: adf_i2s_out
    i2s_audio_id: i2s_shared
    i2s_dout_pin: GPIO12
    sample_rate: 16000
    adf_alc: true
    bits_per_sample: 32bit
    fixed_settings: true
    #channel: left

  - platform: i2s_audio
    type: audio_in
    id: adf_i2s_in
    i2s_audio_id: i2s_shared
    i2s_din_pin: GPIO17
    #channel: left
    channel: right
    pdm: false
    sample_rate: 16000
    bits_per_sample: 32bit
    fixed_settings: true

microphone:
  - platform: adf_pipeline
    id: onju_microphone
    keep_pipeline_alive: true
    gain_log2: 3
    pipeline:
      - adf_i2s_in
      - self

media_player:
  - platform: adf_pipeline
    id: onju_out
    name: None
    internal: false
    keep_pipeline_alive: true
    pipeline:
      - self
      - resampler
      - adf_i2s_out
    on_state:
      then:
        - lambda: |-
            static float old_volume = -1;
            float new_volume = id(onju_out).volume;
            if(abs(new_volume-old_volume) > 0.0001) {
              if(old_volume != -1) {
                id(show_volume)->execute();
              }
            }
            old_volume = new_volume;

micro_wake_word:
  model: okay_nabu
  probability_cutoff: 0.5
  on_wake_word_detected:
    # - if:
    #     condition: media_player.is_playing
    #     then:
    #       - media_player.pause
    - logger.log: "WAKEWORD!#################################"
    - media_player.play_media:
        id: onju_out
        media_url: "http://192.168.0.244:8000/beep.mp3"
        #media_url: "https://homeserver.bachi:8123/local/beep/beep.mp3"
    - delay: 250ms
    - wait_until:
        media_player.is_idle: onju_out
    - delay: 250ms
    - logger.log: "DONE playing #####################"
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
    - logger.log: "Started VA #####################"

voice_assistant:
  id: va
  microphone: onju_microphone
  media_player: onju_out
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: top_led
        blue: 100%
        red: 100%
        green: 100%
        brightness: 100%
        effect: listening
  on_stt_vad_end:
    - light.turn_on:
        id: top_led
        blue: 100%
        red: 0%
        green: 20%
        brightness: 70%
        effect: processing
  on_tts_end:
    - light.turn_on:
        id: top_led
        blue: 0%
        red: 20%
        green: 100%
        effect: speaking
  on_end:
    - delay: 200ms
    - wait_until:
        not:
          media_player.is_playing: onju_out
    - script.execute: reset_led
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - delay: 200ms
          - micro_wake_word.start
  on_client_connected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - micro_wake_word.start:
  on_client_disconnected:
    - if:
        condition:
          and:
            - switch.is_on: use_wake_word
            - binary_sensor.is_off: mute_switch
        then:
          - voice_assistant.stop:
          - micro_wake_word.stop:
  on_error:
    - light.turn_on:
        id: top_led
        blue: 0%
        red: 100%
        green: 0%
        effect: none
    - delay: 1s
    - script.execute: reset_led

number:
  - platform: template
    name: "Touch threshold percentage"
    id: touch_threshold_percentage
    update_interval: never
    entity_category: config
    initial_value: 0.75
    min_value: 0.25
    max_value: 5
    step: 0.05
    optimistic: true
    on_value:
      then:
        - lambda: !lambda |-
            id(thresh_percent) = 0.01 * x;

esp32_touch:
  setup_mode: false
  sleep_duration: 2ms
  measurement_duration: 800us
  low_voltage_reference: 0.8V
  high_voltage_reference: 2.4V

  filter_mode: IIR_16
  debounce_count: 2
  noise_threshold: 0
  jitter_step: 0
  smooth_mode: IIR_2

  denoise_grade: BIT8
  denoise_cap_level: L0

binary_sensor:
  - platform: esp32_touch
    id: volume_down
    pin: GPIO4
    threshold: 539000
    on_press:
      then:
        - light.turn_on: left_led
        - script.execute:
            id: set_volume
            volume: -0.05
        - delay: 750ms
        - while:
            condition:
              binary_sensor.is_on: volume_down
            then:
              - script.execute:
                  id: set_volume
                  volume: -0.05
              - delay: 150ms
    on_release:
      then:
        - light.turn_off: left_led

  - platform: esp32_touch
    id: volume_up
    pin: GPIO2
    threshold: 580000
    on_press:
      then:
        - light.turn_on: right_led
        - script.execute:
            id: set_volume
            volume: 0.05
        - delay: 750ms
        - while:
            condition:
              binary_sensor.is_on: volume_up
            then:
              - script.execute:
                  id: set_volume
                  volume: 0.05
              - delay: 150ms
    on_release:
      then:
        - light.turn_off: right_led

  - platform: esp32_touch
    id: action
    pin: GPIO3
    threshold: 751000
    on_click:
      - if:
          condition:
            or:
              - switch.is_off: use_wake_word
              - binary_sensor.is_on: mute_switch
          then:
            - logger.log:
                tag: "action_click"
                format: "Voice assistant is running: %s"
                args: ['id(va).is_running() ? "yes" : "no"']
            - if:
                condition: media_player.is_playing
                then:
                  - media_player.stop
            - if:
                condition: voice_assistant.is_running
                then:
                  - voice_assistant.stop:
                else:
                  - voice_assistant.start:
          else:
            - logger.log:
                tag: "action_click"
                format: "Voice assistant was running with wake word detection enabled. Starting continuously"
            - if:
                condition: media_player.is_playing
                then:
                  - media_player.stop
            - voice_assistant.stop
            - delay: 1s
            - script.execute: reset_led
            - script.wait: reset_led
            - voice_assistant.start_continuous:

  - platform: gpio
    id: mute_switch
    pin:
      number: GPIO38
      mode: INPUT_PULLUP
    name: Disable wake word
    on_press:
      - script.execute: turn_off_wake_word
    on_release:
      - script.execute: turn_on_wake_word

light:
  - platform: esp32_rmt_led_strip
    id: leds
    pin: GPIO11
    chipset: SK6812
    num_leds: 6
    rgb_order: grb
    rmt_channel: 0
    default_transition_length: 0s
    gamma_correct: 2.8
  - platform: partition
    id: left_led
    segments:
      - id: leds
        from: 0
        to: 0
    default_transition_length: 100ms
  - platform: partition
    id: top_led
    segments:
      - id: leds
        from: 1
        to: 4
    default_transition_length: 100ms
    effects:
      - pulse:
          name: pulse
          transition_length: 250ms
          update_interval: 250ms
      - pulse:
          name: slow_pulse
          transition_length: 1s
          update_interval: 2s
      - addressable_lambda:
          name: show_volume
          update_interval: 50ms
          lambda: |-
            int int_volume = int(id(onju_out).volume * 100.0f * it.size());
            int full_leds = int_volume / 100;
            int last_brightness = int_volume % 100;
            int i = 0;
            for(; i < full_leds; i++) {
              it[i] = Color::WHITE;
            }
            if(i < 4) {
              it[i++] = Color(64, 64, 64).fade_to_white(last_brightness*256/100);
            }
            for(; i < it.size(); i++) {
              it[i] = Color(64, 64, 64);
            }
      - addressable_twinkle:
          name: listening_ww
          twinkle_probability: 1%
      - addressable_twinkle:
          name: listening
          twinkle_probability: 45%
      - addressable_scan:
          name: processing
          move_interval: 80ms
      - addressable_flicker:
          name: speaking
          intensity: 35%
  - platform: partition
    id: right_led
    segments:
      - id: leds
        from: 5
        to: 5
    default_transition_length: 100ms

script:
  - id: reset_led
    then:
      - if:
          condition:
            - lambda: return id(notification);
          then:
            - light.turn_on:
                id: top_led
                blue: 100%
                red: 100%
                green: 0%
                brightness: 100%
                effect: slow_pulse
          else:
            - if:
                condition:
                  and:
                    - switch.is_on: use_wake_word
                    - binary_sensor.is_off: mute_switch
                then:
                  - light.turn_on:
                      id: top_led
                      blue: 100%
                      red: 100%
                      green: 0%
                      brightness: 60%
                      effect: listening_ww
                else:
                  - light.turn_off: top_led

  - id: turn_on_notification
    then:
      - lambda: id(notification) = true;
      - script.execute: reset_led

  - id: clear_notification
    then:
      - lambda: id(notification) = false;
      - script.execute: reset_led

  - id: set_volume
    mode: restart
    parameters:
      volume: float
    then:
      - media_player.volume_set:
          id: onju_out
          volume: !lambda return clamp(id(onju_out).volume+volume, 0.0f, 1.0f);

  - id: show_volume
    mode: restart
    then:
      - light.turn_on:
          id: top_led
          effect: show_volume
      - delay: 1s
      - script.execute: reset_led

  - id: turn_on_wake_word
    then:
      - if:
          condition:
            and:
              - binary_sensor.is_off: mute_switch
              - switch.is_on: use_wake_word
          then:
            - micro_wake_word.start
            - script.execute: reset_led
          else:
            - logger.log:
                tag: "turn_on_wake_word"
                format: "Trying to start listening for wake word, but %s"
                args:
                  [
                    'id(mute_switch).state ? "mute switch is on" : "use wake word toggle is off"',
                  ]
                level: "INFO"

  - id: turn_off_wake_word
    then:
      - micro_wake_word.stop
      - script.execute: reset_led

  - id: calibrate_touch
    parameters:
      button: int
    then:
      - lambda: |-
          static uint8_t thresh_indices[3] = {0, 0, 0};
          static uint32_t sums[3] = {0, 0, 0};
          static uint8_t qsizes[3] = {0, 0, 0};
          static uint16_t consecutive_anomalies_per_button[3] = {0, 0, 0};

          uint32_t newval;
          uint32_t* calibration_values;
          switch(button) {
            case 0:
              newval = id(volume_down).get_value();
              calibration_values = id(touch_calibration_values_left);
              break;
            case 1:
              newval = id(action).get_value();
              calibration_values = id(touch_calibration_values_center);
              break;
            case 2:
              newval = id(volume_up).get_value();
              calibration_values = id(touch_calibration_values_right);
              break;
            default:
              ESP_LOGE("touch_calibration", "Invalid button ID (%d)", button);
              return;
          }

          if(newval == 0) return;

          //ESP_LOGD("touch_calibration", "[%d] qsize %d, sum %d, thresh_index %d, consecutive_anomalies %d", button, qsizes[button], sums[button], thresh_indices[button], consecutive_anomalies_per_button[button]);
          //ESP_LOGD("touch_calibration", "[%d] New value is %d", button, newval);

          if(qsizes[button] == 5) {
            float avg = float(sums[button])/float(qsizes[button]);
            if((fabs(float(newval)-avg)/avg) > id(thresh_percent)) {
              consecutive_anomalies_per_button[button]++;
              //ESP_LOGD("touch_calibration", "[%d] %d anomalies detected.", button, consecutive_anomalies_per_button[button]);
              if(consecutive_anomalies_per_button[button] < 10)
                return;
            } 
          }

          //ESP_LOGD("touch_calibration", "[%d] Resetting consecutive anomalies counter.", button);
          consecutive_anomalies_per_button[button] = 0;


          if(qsizes[button] == 5) {
            //ESP_LOGD("touch_calibration", "[%d] Queue full, removing %d.", button, id(touch_calibration_values)[thresh_indices[button]]);
            sums[button] -= (uint32_t) *(calibration_values+thresh_indices[button]);// id(touch_calibration_values)[thresh_indices[button]];
            qsizes[button]--;
          }
          *(calibration_values+thresh_indices[button]) = newval;
          sums[button] += newval;
          qsizes[button]++;
          thresh_indices[button] = (thresh_indices[button] + 1) % 5;

          //ESP_LOGD("touch_calibration", "[%d] Average value is %d", button, sums[button]/qsizes[button]);
          uint32_t newthresh = uint32_t((sums[button]/qsizes[button]) * (1.0 + id(thresh_percent)));
          //ESP_LOGD("touch_calibration", "[%d] Setting threshold %d", button, newthresh);

          switch(button) {
            case 0:
              id(volume_down).set_threshold(newthresh);
              break;
            case 1:
              id(action).set_threshold(newthresh);
              break;
            case 2:
              id(volume_up).set_threshold(newthresh);
              break;
            default:
              ESP_LOGE("touch_calibration", "Invalid button ID (%d)", button);
              return;
          }

switch:
  - platform: template
    name: Use Wake Word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - script.execute: turn_on_wake_word
    on_turn_off:
      - script.execute: turn_off_wake_word
  - platform: gpio
    id: dac_mute
    restore_mode: ALWAYS_OFF
    pin:
      number: GPIO21
      inverted: True

Logs

INFO Upload took 8.35 seconds, waiting for result...
INFO OTA successful
INFO Successfully uploaded program.
INFO Starting log output from onju-voice-1.local using esphome API
INFO Successfully connected to onju-voice-1 @ 192.168.1.181 in 8.051s
INFO Successful handshake with onju-voice-1 @ 192.168.1.181 in 0.078s
[23:10:36][I][app:100]: ESPHome version 2024.5.5 compiled on Jun 17 2024, 23:10:03
[23:10:36][I][app:102]: Project tetele.onju_voice_satellite version 1.0.0
[23:10:36][C][wifi:580]: WiFi:
[23:10:36][C][wifi:408]:   Local MAC: 80:65:99:DD:E9:54
[23:10:36][C][wifi:413]:   SSID: 'BachiNetIoT'
[23:10:36][C][wifi:416]:   IP Address: 192.168.1.181
[23:10:36][C][wifi:420]:   BSSID: 7E:8A:20:81:D7:D6
[23:10:36][C][wifi:421]:   Hostname: 'onju-voice-1'
[23:10:36][C][wifi:423]:   Signal strength: -65 dB ▂▄▆█
[23:10:36][C][wifi:427]:   Channel: 6
[23:10:36][C][wifi:428]:   Subnet: 255.255.254.0
[23:10:36][C][wifi:429]:   Gateway: 192.168.0.254
[23:10:36][C][wifi:430]:   DNS1: 192.168.0.245
[23:10:36][C][wifi:431]:   DNS2: 0.0.0.0
[23:10:36][C][logger:185]: Logger:
[23:10:36][C][logger:186]:   Level: DEBUG
[23:10:36][C][logger:188]:   Log Baud Rate: 115200
[23:10:36][C][logger:189]:   Hardware UART: USB_SERIAL_JTAG
[23:10:36][C][template.number:050]: Template Number 'Touch threshold percentage'
[23:10:36][C][template.number:051]:   Optimistic: YES
[23:10:36][C][template.number:052]:   Update Interval: never
[23:10:36][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[23:10:36][C][esp32_rmt_led_strip:176]:   Pin: 11
[23:10:36][C][esp32_rmt_led_strip:177]:   Channel: 0
[23:10:36][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[23:10:36][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[23:10:36][C][esp32_rmt_led_strip:204]:   Number of LEDs: 6
[23:10:36][C][switch.gpio:068]: GPIO Switch 'dac_mute'
[23:10:36][C][switch.gpio:091]:   Restore Mode: always OFF
[23:10:36][C][switch.gpio:031]:   Pin: GPIO21
[23:10:36][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Disable wake word'
[23:10:36][C][gpio.binary_sensor:016]:   Pin: GPIO38
[23:10:36][C][light:103]: Light 'leds'
[23:10:36][C][light:105]:   Default Transition Length: 0.0s
[23:10:36][C][light:106]:   Gamma Correct: 2.80
[23:10:36][C][light:103]: Light 'left_led'
[23:10:36][C][light:105]:   Default Transition Length: 0.1s
[23:10:36][C][light:106]:   Gamma Correct: 2.80
[23:10:36][C][light:103]: Light 'top_led'
[23:10:36][C][light:105]:   Default Transition Length: 0.1s
[23:10:36][C][light:106]:   Gamma Correct: 2.80
[23:10:36][C][light:103]: Light 'right_led'
[23:10:36][C][light:105]:   Default Transition Length: 0.1s
[23:10:36][C][light:106]:   Gamma Correct: 2.80
[23:10:36][C][template.switch:068]: Template Switch 'Use Wake Word'
[23:10:36][C][template.switch:091]:   Restore Mode: restore defaults to ON
[23:10:36][C][template.switch:057]:   Optimistic: YES
[23:10:36][C][psram:020]: PSRAM:
[23:10:36][C][psram:021]:   Available: YES
[23:10:36][C][psram:024]:   Size: 8191 KB
[23:10:36][C][i2s_audio:028]: I2SController:
[23:10:36][C][i2s_audio:029]:   AccessMode: duplex
[23:10:36][C][i2s_audio:030]:   Port: 0
[23:10:36][C][i2s_audio:032]:   Reader registered.
[23:10:36][C][i2s_audio:035]:   Writer registered.
[23:10:36][C][i2s_audio:138]: I2S-Writer (Fixed-CFG):
[23:10:36][C][i2s_audio:140]:   sample-rate: 16000 bits_per_sample: 32
[23:10:36][C][i2s_audio:141]:   channel_fmt: 0 channels: 2
[23:10:36][C][i2s_audio:142]:   use_apll: no, use_pdm: no
[23:10:36][C][i2s_audio:135]: I2S-Reader (Fixed-CFG):
[23:10:36][C][i2s_audio:140]:   sample-rate: 16000 bits_per_sample: 32
[23:10:36][C][i2s_audio:141]:   channel_fmt: 3 channels: 1
[23:10:36][C][i2s_audio:142]:   use_apll: no, use_pdm: no
[23:10:36][C][esp32_touch:073]: Config for ESP32 Touch Hub:
[23:10:36][C][esp32_touch:074]:   Meas cycle: 0.80ms
[23:10:36][C][esp32_touch:075]:   Sleep cycle: 2.00ms
[23:10:36][C][esp32_touch:095]:   Low Voltage Reference: 0.8V
[23:10:36][C][esp32_touch:115]:   High Voltage Reference: 2.4V
[23:10:36][C][esp32_touch:135]:   Voltage Attenuation: 0V
[23:10:36][C][esp32_touch:169]:   Filter mode: IIR_16
[23:10:36][C][esp32_touch:170]:   Debounce count: 2
[23:10:36][C][esp32_touch:171]:   Noise threshold coefficient: 0
[23:10:36][C][esp32_touch:172]:   Jitter filter step size: 0
[23:10:36][C][esp32_touch:191]:   Smooth level: IIR_2
[23:10:36][C][esp32_touch:213]:   Denoise grade: BIT8
[23:10:36][C][esp32_touch:245]:   Denoise capacitance level: L0
[23:10:36][C][esp32_touch:260]:   Touch Pad 'volume_down'
[23:10:36][C][esp32_touch:261]:     Pad: T4
[23:10:36][C][esp32_touch:262]:     Threshold: 532029
[23:10:36][C][esp32_touch:260]:   Touch Pad 'volume_up'
[23:10:36][C][esp32_touch:261]:     Pad: T2
[23:10:36][C][esp32_touch:262]:     Threshold: 583257
[23:10:36][C][esp32_touch:260]:   Touch Pad 'action'
[23:10:36][C][esp32_touch:261]:     Pad: T3
[23:10:36][C][esp32_touch:262]:     Threshold: 730922
[23:10:37][C][mdns:115]: mDNS:
[23:10:37][C][mdns:116]:   Hostname: onju-voice-1
[23:10:37][C][ota:096]: Over-The-Air Updates:
[23:10:37][C][ota:097]:   Address: onju-voice-1.local:3232
[23:10:37][C][ota:100]:   Using Password.
[23:10:37][C][ota:103]:   OTA version: 2.
[23:10:37][C][api:139]: API Server:
[23:10:37][C][api:140]:   Address: onju-voice-1.local:6053
[23:10:37][C][api:142]:   Using noise encryption: YES
[23:10:37][C][micro_wake_word:057]: microWakeWord:
[23:10:37][C][micro_wake_word:058]:   Wake Word: okay nabu
[23:10:37][C][micro_wake_word:059]:   Probability cutoff: 0.500
[23:10:37][C][micro_wake_word:060]:   Sliding window size: 10
[23:10:37][C][esp_adf_pipeline.microphone:020]: ADF-Microphone
[23:10:37][C][adf_media_player:016]: ESP-ADF-MediaPlayer:
[23:10:37][C][adf_media_player:018]:   Number of ASPComponents: 3
[23:10:49][D][micro_wake_word:362]: Wake word sliding average probability is 0.549 and most recent probability is 1.000
[23:10:49][D][micro_wake_word:128]: Wake Word Detected
[23:10:49][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[23:10:49][D][micro_wake_word:134]: Stopping Microphone
[23:10:49][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[23:10:49][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[23:10:49][D][esp-idf:000][i2s_in]: W (21419) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:10:49][D][esp-idf:000][i2s_in]: W (21421) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:10:49][D][esp-idf:000][i2s_in]: W (21422) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:10:49][D][esp-idf:000][i2s_in]: W (21425) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:10:49][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[23:10:49][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[23:10:49][D][main:679]: WAKEWORD!#################################
[23:10:49][D][media_player:061]: 'Nabu 1' - Setting
[23:10:49][D][media_player:068]:   Media URL: http://192.168.0.244:8000/beep.mp3
[23:10:49][D][adf_media_player:030]: Got control call in state 1
[23:10:49][D][esp_adf_pipeline:050]: Starting request, current state UNINITIALIZED
[23:10:49][D][esp-idf:000]: I (21450) MP3_DECODER: MP3 init

[23:10:49][D][esp_adf_pipeline:358]: pipeline tag 0, http
[23:10:49][D][esp_adf_pipeline:358]: pipeline tag 1, decoder
[23:10:49][D][esp_adf_pipeline:358]: pipeline tag 2, resampler
[23:10:49][D][esp_adf_pipeline:358]: pipeline tag 3, i2s_out
[23:10:49][D][esp-idf:000]: I (21458) AUDIO_PIPELINE: link el->rb, el:0x3d8188e0, tag:http, rb:0x3d818f20

[23:10:49][D][esp-idf:000]: I (21460) AUDIO_PIPELINE: link el->rb, el:0x3d818a84, tag:decoder, rb:0x3d819f60

[23:10:49][D][esp-idf:000]: I (21461) AUDIO_PIPELINE: link el->rb, el:0x3d818c20, tag:resampler, rb:0x3d81afa0

[23:10:49][D][esp_adf_pipeline:370]: Setting up event listener.
[23:10:49][D][esp_adf_pipeline:302]: State changed from UNINITIALIZED to PREPARING
[23:10:49][I][adf_media_player:135]: got new pipeline state: 1
[23:10:49][D][adf_i2s_out:127]: Set final i2s settings: 16000
[23:10:49][D][esp_audio_processors:079]: New settings: SRC: rate: 16000, ch: 2 DST: rate: 16000, ch: 2 
[23:10:49][D][esp-idf:000]: I (21473) AUDIO_THREAD: The http task allocate stack on external memory

[23:10:49][D][esp-idf:000]: I (21476) AUDIO_ELEMENT: [http-0x3d8188e0] Element task created

[23:10:49][D][esp-idf:000]: I (21478) AUDIO_THREAD: The decoder task allocate stack on external memory

[23:10:49][D][esp-idf:000]: I (21480) AUDIO_ELEMENT: [decoder-0x3d818a84] Element task created

[23:10:49][D][esp-idf:000][http]: I (21481) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[23:10:49][D][esp-idf:000][decoder]: I (21484) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[23:10:49][D][esp-idf:000][decoder]: 
ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x110153480>
transport: <_SelectorSocketTransport fd=10 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/[email protected]/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/selector_events.py", line 1013, in _read_ready__data_received
    self._protocol.data_received(data)
  File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
  File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
  File "/opt/homebrew/lib/python3.11/site-packages/noise/state.py", line 74, in decrypt_with_ad
    plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/noise/backends/default/ciphers.py", line 13, in decrypt
    return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
  File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING onju-voice-1 @ 192.168.1.181: Connection error occurred: onju-voice-1 @ 192.168.1.181: Invalid encryption key: received_name=onju-voice-1
INFO Processing unexpected disconnect from ESPHome API for onju-voice-1 @ 192.168.1.181
WARNING Disconnected from API
INFO Successfully connected to onju-voice-1 @ 192.168.1.181 in 0.178s
INFO Successful handshake with onju-voice-1 @ 192.168.1.181 in 0.067s
[23:10:49][D][esp-idf:000][decoder]: I (22027) MP3_DECODER: Closed

[23:10:49][D][esp-idf:000][resampler]: I (22123) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2

[23:10:49][D][esp-idf:000][i2s_out]: I (22170) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[23:10:49][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[23:10:49][I][adf_media_player:135]: got new pipeline state: 5
[23:10:50][D][main:698]: DONE playing #####################
[23:10:50][D][voice_assistant:502]: State changed from IDLE to START_MICROPHONE
[23:10:50][D][voice_assistant:508]: Desired state set to START_PIPELINE
[23:10:50][D][main:223]: Started VA #####################
[23:10:50][D][voice_assistant:220]: Starting Microphone
[23:10:50][D][ring_buffer:024]: Created ring buffer with size 16384
[23:10:50][D][esp_adf_pipeline:050]: Starting request, current state STOPPED
[23:10:50][D][esp_adf_pipeline:302]: State changed from STOPPED to PREPARING
[23:10:50][D][voice_assistant:502]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[23:10:50][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[23:10:50][D][esp-idf:000]: I (22585) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8394727 Bytes, Inter:154916 Bytes, Dram:154916 Bytes


[23:10:50][D][esp-idf:000][i2s_in]: I (22587) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[23:10:50][D][esp-idf:000]: I (22589) AUDIO_PIPELINE: Pipeline started

[23:10:50][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 14
[23:10:50][I][esp_adf_pipeline:214]: [ i2s_in ] status: 14
[23:10:50][I][esp_adf_pipeline:214]: [ i2s_in ] status: 12
[23:10:50][D][esp_adf_pipeline:131]: Check element [i2s_in] status, 3
[23:10:50][D][esp_adf_pipeline:131]: Check element [pcm_reader] status, 3
[23:10:50][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[23:10:50][D][voice_assistant:502]: State changed from STARTING_MICROPHONE to START_PIPELINE
[23:10:50][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 12
[23:10:50][D][voice_assistant:274]: Requesting start...
[23:10:50][D][voice_assistant:502]: State changed from START_PIPELINE to STARTING_PIPELINE
[23:10:50][D][voice_assistant:523]: Client started, streaming microphone
[23:10:50][D][voice_assistant:502]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[23:10:50][D][voice_assistant:508]: Desired state set to STREAMING_MICROPHONE
[23:10:50][D][voice_assistant:625]: Event Type: 1
[23:10:50][D][voice_assistant:628]: Assist Pipeline running
[23:10:50][D][voice_assistant:625]: Event Type: 3
[23:10:50][D][voice_assistant:639]: STT started
[23:10:50][D][light:036]: 'top_led' Setting:
[23:10:50][D][light:051]:   Brightness: 100%
[23:10:50][D][light:059]:   Red: 100%, Green: 100%, Blue: 100%
[23:10:50][D][light:109]:   Effect: 'listening'
[23:10:50][D][voice_assistant:625]: Event Type: 11
[23:10:50][D][voice_assistant:779]: Starting STT by VAD
[23:10:51][D][voice_assistant:625]: Event Type: 12
[23:10:51][D][voice_assistant:783]: STT by VAD end
[23:10:51][D][voice_assistant:502]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[23:10:51][D][voice_assistant:508]: Desired state set to AWAITING_RESPONSE
[23:10:51][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[23:10:51][D][voice_assistant:502]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[23:10:51][D][light:036]: 'top_led' Setting:
[23:10:51][D][light:051]:   Brightness: 70%
[23:10:51][D][light:059]:   Red: 0%, Green: 20%, Blue: 100%
[23:10:51][D][light:109]:   Effect: 'processing'
[23:10:51][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[23:10:51][D][voice_assistant:502]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[23:10:51][D][voice_assistant:625]: Event Type: 4
[23:10:51][D][voice_assistant:653]: Speech recognised as: "What's the time?"
[23:10:51][D][voice_assistant:625]: Event Type: 5
[23:10:51][D][voice_assistant:658]: Intent started
[23:10:51][D][voice_assistant:625]: Event Type: 6
[23:10:51][D][voice_assistant:625]: Event Type: 7
[23:10:51][D][voice_assistant:681]: Response: "The Time is now 23 10"
[23:10:51][D][voice_assistant:625]: Event Type: 8
[23:10:51][D][voice_assistant:701]: Response URL: "https://d5rs1tb0sih7razcsyc2qidzeetslehm.ui.nabu.casa/api/tts_proxy/f6ae2bd6e2dff1268cad6a62117d6e10d78853f8_en-au_f556b0cc52_tts.home_assistant_cloud.mp3"
[23:10:51][D][voice_assistant:502]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[23:10:51][D][voice_assistant:508]: Desired state set to STREAMING_RESPONSE
[23:10:51][D][media_player:061]: 'Nabu 1' - Setting
[23:10:51][D][media_player:068]:   Media URL: https://d5rs1tb0sih7razcsyc2qidzeetslehm.ui.nabu.casa/api/tts_proxy/f6ae2bd6e2dff1268cad6a62117d6e10d78853f8_en-au_f556b0cc52_tts.home_assistant_cloud.mp3
[23:10:51][D][media_player:074]:  Announcement: yes
[23:10:51][D][adf_media_player:030]: Got control call in state 1
[23:10:51][D][esp_adf_pipeline:050]: Starting request, current state STOPPED
[23:10:51][D][esp_adf_pipeline:302]: State changed from STOPPED to PREPARING
[23:10:51][I][adf_media_player:135]: got new pipeline state: 1
[23:10:51][D][adf_i2s_out:127]: Set final i2s settings: 16000
[23:10:52][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 16000, ch: 2 
[23:10:52][D][light:036]: 'top_led' Setting:
[23:10:52][D][light:059]:   Red: 20%, Green: 100%, Blue: 0%
[23:10:52][D][light:109]:   Effect: 'speaking'
[23:10:52][D][voice_assistant:625]: Event Type: 2
[23:10:52][D][voice_assistant:715]: Assist Pipeline ended
[23:10:52][D][esp-idf:000][http]: I (24423) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[23:10:52][D][esp-idf:000][decoder]: I (24425) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x1102b41a0>
transport: <_SelectorSocketTransport fd=10 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/[email protected]/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/selector_events.py", line 1013, in _read_ready__data_received
    self._protocol.data_received(data)
  File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
  File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
  File "/opt/homebrew/lib/python3.11/site-packages/noise/state.py", line 74, in decrypt_with_ad
    plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/noise/backends/default/ciphers.py", line 13, in decrypt
    return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
  File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING onju-voice-1 @ 192.168.1.181: Connection error occurred: onju-voice-1 @ 192.168.1.181: Invalid encryption key: received_name=onju-voice-1
INFO Processing unexpected disconnect from ESPHome API for onju-voice-1 @ 192.168.1.181
WARNING Disconnected from API
INFO Successfully connected to onju-voice-1 @ 192.168.1.181 in 0.122s
INFO Successful handshake with onju-voice-1 @ 192.168.1.181 in 0.056s
[23:10:53][D][esp-idf:000][http]: I (25404) HTTP_STREAM: total_bytes=19862

[23:10:53][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=24000, bits=16, ch=1
[23:10:53][D][adf_i2s_out:127]: Set final i2s settings: 16000
[23:10:53][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 16000, ch: 2 
[23:10:53][D][esp-idf:000][decoder]: I (25608) MP3_DECODER: Closed

[23:10:53][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[23:10:53][I][adf_media_player:135]: got new pipeline state: 2
[23:10:53][D][adf_i2s_out:127]: Set final i2s settings: 16000
[23:10:53][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 16000, ch: 2 
[23:10:53][D][esp-idf:000]: I (25622) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8402539 Bytes, Inter:162848 Bytes, Dram:162848 Bytes


[23:10:53][D][esp-idf:000][http]: I (25625) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[23:10:53][D][esp-idf:000][decoder]: I (25627) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[23:10:53][D][esp-idf:000][decoder]: I (25629) MP3_DECODER: MP3 opened

[23:10:53][D][esp-idf:000][resampler]: I (25668) RSP_FILTER: sample rate of source data : 24000, channel of source data : 1, sample rate of destination data : 16000, channel of destination data : 2

[23:10:53][I][esp_adf_pipeline:214]: [ resampler ] status: 12
[23:10:53][D][esp_adf_pipeline:131]: Check element [http] status, 2
[23:10:54][D][esp-idf:000][http]: I (26632) HTTP_STREAM: total_bytes=19862

[23:10:54][I][esp_adf_pipeline:214]: [ http ] status: 12
[23:10:54][D][esp_adf_pipeline:131]: Check element [http] status, 3
[23:10:54][D][esp_adf_pipeline:131]: Check element [decoder] status, 2
[23:10:54][I][esp_adf_pipeline:214]: [ decoder ] status: 12
[23:10:54][D][esp_adf_pipeline:131]: Check element [http] status, 3
[23:10:54][D][esp_adf_pipeline:131]: Check element [decoder] status, 3
[23:10:54][D][esp_adf_pipeline:131]: Check element [resampler] status, 3
[23:10:54][D][esp_adf_pipeline:131]: Check element [i2s_out] status, 3
[23:10:54][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[23:10:54][I][adf_media_player:135]: got new pipeline state: 3
[23:10:54][D][adf_i2s_out:127]: Set final i2s settings: 16000
[23:10:54][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 16000, ch: 2 
[23:10:54][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=24000, bits=16, ch=1
[23:10:54][D][adf_i2s_out:127]: Set final i2s settings: 16000
[23:10:54][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 16000, ch: 2 
[23:10:56][D][esp-idf:000][http]: W (28781) HTTP_STREAM: No more data,errno:0, total_bytes:19862, rlen = 0

[23:10:56][D][esp-idf:000][http]: I (28784) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0

[23:10:56][I][esp_adf_pipeline:214]: [ http ] status: 15
[23:10:56][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[23:10:56][I][adf_media_player:135]: got new pipeline state: 4
[23:10:57][D][esp-idf:000][decoder]: I (29468) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2

[23:10:57][D][esp-idf:000][decoder]: I (29834) MP3_DECODER: Closed

[23:10:57][D][esp-idf:000][resampler]: I (29930) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2

[23:10:57][D][esp-idf:000][i2s_out]: I (29978) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[23:10:57][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[23:10:57][I][adf_media_player:135]: got new pipeline state: 5
[23:10:57][D][light:036]: 'top_led' Setting:
[23:10:57][D][light:051]:   Brightness: 60%
[23:10:57][D][light:059]:   Red: 100%, Green: 0%, Blue: 100%
[23:10:57][D][light:109]:   Effect: 'listening_ww'
[23:10:57][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[23:10:57][D][micro_wake_word:115]: Starting Microphone
[23:10:57][D][esp_adf_pipeline:050]: Starting request, current state STOPPED
[23:10:57][D][esp_adf_pipeline:302]: State changed from STOPPED to PREPARING
[23:10:57][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[23:10:57][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[23:10:57][D][esp-idf:000]: I (30328) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8398703 Bytes, Inter:159012 Bytes, Dram:159012 Bytes


[23:10:57][D][esp-idf:000][i2s_in]: I (30331) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1

[23:10:57][D][esp-idf:000]: I (30333) AUDIO_PIPELINE: Pipeline started

[23:10:57][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 14
[23:10:57][I][esp_adf_pipeline:214]: [ i2s_in ] status: 14
[23:10:57][I][esp_adf_pipeline:214]: [ i2s_in ] status: 12
[23:10:57][D][esp_adf_pipeline:131]: Check element [i2s_in] status, 3
[23:10:57][D][esp_adf_pipeline:131]: Check element [pcm_reader] status, 3
[23:10:57][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[23:10:57][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[23:10:57][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 12
[23:11:06][D][micro_wake_word:362]: Wake word sliding average probability is 0.505 and most recent probability is 0.980
[23:11:06][D][micro_wake_word:128]: Wake Word Detected
[23:11:06][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[23:11:06][D][micro_wake_word:134]: Stopping Microphone
[23:11:06][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[23:11:06][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[23:11:06][D][esp-idf:000][i2s_in]: W (38937) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:11:06][D][esp-idf:000][i2s_in]: W (38939) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:11:06][D][esp-idf:000][i2s_in]: W (38942) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:11:06][D][esp-idf:000][i2s_in]: W (38945) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT

[23:11:06][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[23:11:06][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[23:11:06][D][main:679]: WAKEWORD!#################################
[23:11:06][D][media_player:061]: 'Nabu 1' - Setting
[23:11:06][D][media_player:068]:   Media URL: http://192.168.0.244:8000/beep.mp3
[23:11:06][D][adf_media_player:030]: Got control call in state 1
[23:11:06][D][esp_adf_pipeline:050]: Starting request, current state STOPPED
[23:11:06][D][esp_adf_pipeline:302]: State changed from STOPPED to PREPARING
[23:11:06][I][adf_media_player:135]: got new pipeline state: 1
[23:11:06][D][adf_i2s_out:127]: Set final i2s settings: 16000
[23:11:06][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 16000, ch: 2 
[23:11:06][D][esp-idf:000][http]: I (38966) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1

[23:11:06][D][esp-idf:000][decoder]: I (38969) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1

[23:11:06][D][esp_aud:000][decoder]: 
ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x1102b4460>
transport: <_SelectorSocketTransport fd=10 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/[email protected]/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/selector_events.py", line 1013, in _read_ready__data_received
    self._protocol.data_received(data)
  File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
  File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
  File "/opt/homebrew/lib/python3.11/site-packages/noise/state.py", line 74, in decrypt_with_ad
    plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/noise/backends/default/ciphers.py", line 13, in decrypt
    return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
  File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
  File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING onju-voice-1 @ 192.168.1.181: Connection error occurred: onju-voice-1 @ 192.168.1.181: Invalid encryption key: received_name=onju-voice-1
INFO Processing unexpected disconnect from ESPHome API for onju-voice-1 @ 192.168.1.181
WARNING Disconnected from API
INFO Successfully connected to onju-voice-1 @ 192.168.1.181 in 0.192s
INFO Successful handshake with onju-voice-1 @ 192.168.1.181 in 0.055s
[23:11:07][D][esp-idf:000][decoder]: I (39498) MP3_DECODER: Closed

[23:11:07][D][esp-idf:000][resampler]: I (39593) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2

[23:11:07][D][esp-idf:000][i2s_out]: I (39641) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2

[23:11:07][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[23:11:07][I][adf_media_player:135]: got new pipeline state: 5
[23:11:07][D][main:698]: DONE playing #####################
[23:11:07][D][main:223]: Started VA #####################

@s00500
Copy link

s00500 commented Jun 17, 2024

On the second try it still plays the beep but then does not start the VA pipeline anymore

@s00500
Copy link

s00500 commented Jun 23, 2024

upgraded to 2024.6 now... still feels like any media_player interaction I perform in the on_wakeword part of the config crashes it....

@s00500
Copy link

s00500 commented Aug 11, 2024

Seems this is fine now after testing with dev-next and ESPHome 2024.7.3 !

Works good for me at least now :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants