Hi,
First of all, congrats on this new version! It is very easy to use and also fast (fast enough on a laptop CPU).
Regarding the speaker_reference (the voice to mimic), what do you advise for its duration and format? Are there best practices you recommend?
For example, CoquiTTS used to recommend a 6-second, 22 kHz WAV clip.
So far I haven't been able to get cloning results as good as with CoquiTTS: the output voice trembles a little and seems to lack harmonics. I used the simple GUI for these tests.
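In case it helps anyone compare reference clips, here is a minimal sketch for checking a WAV file against the 6-second, 22 kHz guideline mentioned above. It uses only Python's standard `wave` module. The names `describe_wav` and `write_silence` are made up for illustration, and "22 kHz" is assumed to mean 22050 Hz; nothing here comes from this project's API.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        duration = wf.getnframes() / rate
    return rate, channels, duration


def write_silence(path, seconds=6.0, rate=22050):
    """Write a mono 16-bit silent WAV, only so describe_wav has a file to read.

    Hypothetical helper for the demo; a real reference clip would be speech.
    """
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))


if __name__ == "__main__":
    write_silence("reference_demo.wav")
    rate, channels, duration = describe_wav("reference_demo.wav")
    print(f"{rate} Hz, {channels} channel(s), {duration:.2f} s")
```

This only reports the file's properties; it does not say whether those properties are what this model actually expects, which is the open question.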
Kind regards from France