This issue tracks support in the AES67 daemon for real-time transcription of audio streams using OpenAI's Whisper, integrated through Whisper.cpp, a high-performance C/C++ implementation of Whisper inference.
The transcription feature enables speech-to-text conversion of the audio received by the daemon's configured Sinks with good robustness and accuracy, making it a valuable addition for multimedia and broadcast applications.
The audio transcription feature has been integrated with a multi-threaded architecture so that performance remains robust in multi-sink setups.
See the asr-whisper branch.
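
To illustrate the kind of multi-threaded, per-sink design described above, here is a minimal sketch. It is not taken from the asr-whisper branch: `SinkTranscriber`, `AudioChunk`, and `TranscribeFn` are hypothetical names, and the `TranscribeFn` callback merely stands in for whatever Whisper.cpp inference call the daemon actually makes. The assumption is one worker thread per Sink, so the audio path only enqueues samples and slow inference on one Sink cannot stall another.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical types: none of these names come from the daemon or Whisper.cpp.
using AudioChunk = std::vector<float>;  // mono PCM samples
using TranscribeFn = std::function<std::string(const AudioChunk&)>;

// One worker per sink: the audio path only enqueues, inference runs on the worker.
class SinkTranscriber {
public:
    SinkTranscriber(int sink_id, TranscribeFn fn)
        : sink_id_(sink_id), transcribe_(std::move(fn)),
          worker_([this] { run(); }) {}

    ~SinkTranscriber() { stop(); }

    SinkTranscriber(const SinkTranscriber&) = delete;
    SinkTranscriber& operator=(const SinkTranscriber&) = delete;

    // Called from the audio path: cheap, never runs inference.
    void push(AudioChunk chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return;  // chunks pushed after stop() are dropped
            queue_.push_back(std::move(chunk));
        }
        cv_.notify_one();
    }

    // Lets the worker drain queued chunks, then joins it. Safe to call twice.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

    // Snapshot of the text produced so far, in arrival order.
    std::vector<std::string> transcript() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return results_;
    }

    int sink_id() const { return sink_id_; }

private:
    void run() {
        for (;;) {
            AudioChunk chunk;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and fully drained
                chunk = std::move(queue_.front());
                queue_.pop_front();
            }
            std::string text = transcribe_(chunk);  // slow step, outside the lock
            std::lock_guard<std::mutex> lock(mutex_);
            results_.push_back(std::move(text));
        }
    }

    const int sink_id_;
    TranscribeFn transcribe_;
    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<AudioChunk> queue_;
    std::vector<std::string> results_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

In this sketch, each configured Sink would own one `SinkTranscriber`; the receive path calls `push()` with decoded samples and a status or API layer polls `transcript()`. A real integration would also need to bound the queue and resample audio to the rate the model expects, which is left out here.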