@shura-v Thanks for the PR, this looks useful. I just want to make sure I understand the usage: is this intended to tap into what is actually being scheduled to go out to the playerNode? If so, it makes sense to be able to get these samples as they are played rather than as they are generated. It also sounds like the suppression would let you manage playback with your own audio system. Is that intended? |
|
@ZachNagengast The main goal is playback-synchronized lip sync. SpeechCallback reflects generation time, which can drift from audible playback if audio is produced ahead of output. This callback is intended to reflect the actual playback path so consumers can stay aligned with what was really played.
I would not frame this as a first-class custom audio backend API. The intent here is playback-aligned synchronization, with silent test coverage as a practical addition. |
TTSKit currently exposes generation-time audio via SpeechCallback, which is great for streaming UX but not sufficient for features that need to stay synchronized with actual playback.
For example, playback-reactive visualizers or lip sync can drift because audio chunks may be generated and queued ahead of real output.
It would be useful to expose an optional playback-aligned callback from AudioOutput that fires from the actual playback path and provides the chunk being played, rather than the chunk being generated or enqueued.
As a bonus, the test path can now suppress audible playback, which makes playback-callback integration tests much less surprising.
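
To make the distinction concrete, here is a rough, self-contained sketch of a generation-time callback versus a playback-aligned one. It is not TTSKit's API: `SketchAudioOutput`, `AudioChunk`, `onGenerated`, `onPlayback`, `suppressAudio`, and `playNext()` are all hypothetical names, and the real playback path (an audio engine and player node) is replaced by a plain in-memory queue.

```swift
// Hypothetical sketch only: none of these names come from TTSKit.
struct AudioChunk {
    let index: Int
    let samples: [Float]
}

final class SketchAudioOutput {
    /// Fires when a chunk is enqueued, i.e. at generation time.
    var onGenerated: ((AudioChunk) -> Void)?
    /// Fires when a chunk is handed to the (simulated) playback path.
    var onPlayback: ((AudioChunk) -> Void)?
    /// When true, chunks are not rendered audibly, but callbacks still fire.
    var suppressAudio = false

    private var queue: [AudioChunk] = []
    /// Counts chunks that would have been audibly rendered.
    private(set) var renderedChunkCount = 0

    func enqueue(_ chunk: AudioChunk) {
        queue.append(chunk)
        onGenerated?(chunk)
    }

    /// Simulates the output device consuming the next queued chunk.
    /// Returns false when nothing is queued.
    @discardableResult
    func playNext() -> Bool {
        guard !queue.isEmpty else { return false }
        let chunk = queue.removeFirst()
        if !suppressAudio {
            renderedChunkCount += 1  // stand-in for real audio rendering
        }
        onPlayback?(chunk)
        return true
    }
}

// Usage: three chunks are generated ahead of output, but only one has played.
let output = SketchAudioOutput()
output.suppressAudio = true  // silent, as a test might configure it

var generated: [Int] = []
var played: [Int] = []
output.onGenerated = { generated.append($0.index) }
output.onPlayback = { played.append($0.index) }

for i in 0..<3 {
    output.enqueue(AudioChunk(index: i, samples: [0, 0]))
}
output.playNext()
```

After this runs, `generated` holds all three chunk indices while `played` holds only the first, which is the gap a lip-sync or visualizer consumer would see if it relied on generation time alone. With `suppressAudio` set, the playback callback still fires even though nothing is rendered.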