@twitchard
Collaborator

  • New --instant-mode Flag: Introduced an --instant-mode flag (config: tts.instantMode), which can be used with streaming (--streaming) and single generation (--num-generations=1) for potentially faster TTS synthesis.
  • Improved Streaming Playback: Audio playback during streaming (--play all or --play first with --streaming) now pipes audio data directly to the detected audio player's standard input (stdin). This enables lower latency playback as audio chunks are received, without needing to write temporary snippet files to disk first.
  • One file per generation, not one file per chunk: When using streaming mode, the tool now saves one consolidated audio file per generation (e.g., output_gen123.wav) instead of one file per snippet (e.g., output_gen123.0.wav, output_gen123.1.wav).
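
For reviewers who want a mental model of the last two points, here is a minimal Python sketch of the pattern, not the code in this PR. It assumes the chunks are consecutive byte segments of a single audio stream (if each chunk were a standalone WAV with its own header, plain concatenation would not produce a valid file). The function name `stream_generation` and its parameters are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Iterable, Optional, Sequence


def stream_generation(
    chunks: Iterable[bytes],
    out_path: Path,
    player_cmd: Optional[Sequence[str]] = None,
) -> int:
    """Save all chunks to one file and, optionally, pipe them to a player.

    Hypothetical helper for illustration. Returns the number of bytes saved.
    """
    # Start the player with a pipe on stdin so chunks can be fed as they arrive.
    player = (
        subprocess.Popen(list(player_cmd), stdin=subprocess.PIPE)
        if player_cmd
        else None
    )
    playing = player is not None
    total = 0
    try:
        # One consolidated file per generation, appended chunk by chunk.
        with out_path.open("wb") as out:
            for chunk in chunks:
                out.write(chunk)
                total += len(chunk)
                if playing and player.stdin is not None:
                    try:
                        player.stdin.write(chunk)
                        player.stdin.flush()  # hand the chunk over right away
                    except BrokenPipeError:
                        # The player exited early; keep saving to disk.
                        playing = False
    finally:
        if player is not None:
            if player.stdin is not None:
                try:
                    player.stdin.close()  # signals end of stream to the player
                except BrokenPipeError:
                    pass
            player.wait()
    return total
```

As a usage illustration, passing `player_cmd=["ffplay", "-nodisp", "-autoexit", "-"]` would play from stdin with ffplay, assuming it is installed; which player the tool actually detects is not stated here. Leaving `player_cmd` unset only writes the consolidated file.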

@twitchard twitchard merged commit afb2b0c into main Apr 18, 2025
1 check passed
@twitchard twitchard deleted the twitchard/improve-streaming branch April 18, 2025 18:35