Hi,
Thanks for sharing your TTS project — great work.
I have a few technical questions to better evaluate it for production use:
What is the typical RTF (real-time factor) you observe in practice?
What is the average GPU and/or CPU usage during inference (and on which hardware, if possible)?
Is the model designed or well-suited for batch inference (multiple utterances per forward pass)?
Additionally, I’d like to know:
Is there any existing support, or a roadmap, for training or fine-tuning the model on French data?
If not, would the current architecture and training pipeline allow it with custom datasets?
Thanks in advance for your time and insights.
Best regards,